Database Principles: Fundamentals of Design, Implementation, and Management [3 ed.] 9781473768062



English Pages [965] Year 2020

Table of contents :
Contents
Part I: Database Systems
Chapter 1: The Database Approach
Chapter 2: Data Models
Chapter 3: Relational Model Characteristics
Chapter 4: Relational Algebra and Calculus
Part II: Design Concepts
Chapter 5: Data Modelling with Entity Relationship Diagrams
Chapter 6: Data Modelling Advanced Concepts
Chapter 7: Normalising Database Designs
Part III: Database Programming
Chapter 8: Beginning Structured Query Language
Chapter 9: Procedural Language SQL and Advanced SQL
Part IV: Database Design
Chapter 10: Database Development Process
Chapter 11: Conceptual, Logical, and Physical Database Design
Part V: Database Transactions and Performance Tuning
Chapter 12: Managing Transactions and Concurrency
Chapter 13: Managing Database and SQL Performance
Part VI: Database Management
Chapter 14: Distributed Databases
Chapter 15: Databases for Business Intelligence
Chapter 16: Big Data and NoSQL
Chapter 17: Database Connectivity and Web Technologies
Glossary
Index

Australia • Brazil • Mexico • South Africa • Singapore • United Kingdom • United States

Copyright 2020 Cengage Learning. All Rights Reserved. May not be copied, scanned, or duplicated, in whole or in part. Due to electronic rights, some third party content may be suppressed from the eBook and/or eChapter(s). Editorial review has deemed that any suppressed content does not materially affect the overall learning experience. Cengage Learning reserves the right to remove additional content at any time if subsequent rights restrictions require it.

This is an electronic version of the print textbook. Due to electronic rights restrictions, some third party content may be suppressed. Editorial review has deemed that any suppressed content does not materially affect the overall learning experience. The publisher reserves the right to remove content from this title at any time if subsequent rights restrictions require it. For valuable information on pricing, previous editions, changes to current editions, and alternate formats, please visit www.cengage.com/highered to search by ISBN#, author, title, or keyword for materials in your areas of interest.

Important Notice: Media content referenced within the product description or the product text may not be available in the eBook version.


Database Principles: Fundamentals of Design, Implementation, and Management, Third Edition

US Authors: Carlos Coronel, Steven Morris
Adapters: Keeley Crockett, Craig Blewett
Publisher: Marinda Louw
Marketing Manager: Anna Reading
Senior Content Project Manager: Sue Povey
Manufacturing Manager: Eyvett Davis
Typesetter: SPi-Global
Cover Designer: Simon Levy Associates
Cover Image(s): Vijay Kumar/Getty Images

© 2020 Cengage Learning EMEA

Adapted from Database Systems: Design, Implementation, & Management, 13th Edition, by Carlos Coronel, Steven Morris. Copyright © Cengage Learning, Inc., 2019. All Rights Reserved.

ALL RIGHTS RESERVED. No part of this work may be reproduced, transmitted, stored, distributed or used in any form or by any means, electronic, mechanical, photocopying, recording or otherwise, without the prior written permission of Cengage Learning or under license in the U.K. from the Copyright Licensing Agency Ltd.

The Author(s) and the Adapter(s) have asserted the right under the Copyright Designs and Patents Act 1988 to be identified as Author(s) and Adapter(s) of this Work.

For product information and technology assistance, contact us at [email protected]

For permission to use material from this text or product and for permission queries, email [email protected]

British Library Cataloguing-in-Publication Data
A catalogue record for this book is available from the British Library.

ISBN: 978-1-4737-6804-8

Cengage Learning, EMEA
Cheriton House, North Way
Andover, Hampshire, SP10 5BE
United Kingdom


Cengage Learning is a leading provider of customized learning solutions with employees residing in nearly 40 different countries and sales in more than 125 countries around the world. Find your local representative at: www.cengage.co.uk.

Cengage Learning products are represented in Canada by Nelson Education, Ltd.

To learn more about Cengage platforms and services, register or access your online learning solution, or purchase materials for your course, visit www.cengage.com.

Printed in China at RR Donnelley
Print Number: 01
Print Year: 2020


Brief Contents

Part I: Database Systems
Chapter 1: The Database Approach
Chapter 2: Data Models
Chapter 3: Relational Model Characteristics
Chapter 4: Relational Algebra and Calculus

Part II: Design Concepts
Chapter 5: Data Modelling with Entity Relationship Diagrams
Chapter 6: Data Modelling Advanced Concepts
Chapter 7: Normalising Database Designs

Part III: Database Programming
Chapter 8: Beginning Structured Query Language
Chapter 9: Procedural Language SQL and Advanced SQL

Part IV: Database Design
Chapter 10: Database Development Process
Chapter 11: Conceptual, Logical, and Physical Database Design

Part V: Database Transactions and Performance Tuning
Chapter 12: Managing Transactions and Concurrency
Chapter 13: Managing Database and SQL Performance

Part VI: Database Management
Chapter 14: Distributed Databases
Chapter 15: Databases for Business Intelligence
Chapter 16: Big Data and NoSQL
Chapter 17: Database Connectivity and Web Technologies

Glossary
Index

Appendices (Available online)
Appendix A: Designing Databases with Visio Professional: A Tutorial
Appendix B: The University Lab: Conceptual, Logical, and Physical Database Design
Appendix C: Global Tickets Ltd: Conceptual, Logical, and Physical Database Design
Appendix D: Converting an ER Model into a Database Structure
Appendix E: Comparison of ER Modelling Notations
Appendix F: Client/Server Systems
Appendix G: Object-Orientated Databases
Appendix H: Databases in e-Commerce
Appendix I: The Hierarchical Database Model
Appendix J: The Network Database Model
Appendix K: Database Administration
Appendix L: Data Warehouse Implementation Factors
Appendix M: Creating a New Database Using Oracle 12c
Appendix N: A Guide to Using SQL Developer with Oracle 12c
Appendix O: Building a Simple Object-Relational Database Using Oracle Objects
Appendix P: Microsoft Access Tutorial
Appendix Q: Working with MongoDB
Appendix R: Working with Neo4j


Contents

Preface
Changes to the Third Edition
Acknowledgements
About the Authors
Walk Through Tour
Dedication
Teaching and Learning Support Resources

Part I: Database Systems
Business Vignette: The Relational Revolution: An Historical Journey

Chapter 1: The Database Approach
Preview
1.1 Data vs information
1.2 Introducing the database and the DBMS
1.3 Why database design is important
1.4 Historical roots: files and data processing
1.5 Problems with file system data management
1.6 Database systems
1.7 Preparing for your database professional career
Summary
Key terms
Further reading
Review questions
Problems

Chapter 2: Data Models
Preview
2.1 The importance of data models
2.2 Data model basic building blocks
2.3 Business rules
2.4 The evolution of data models
2.5 Degrees of data abstraction
Summary
Key terms
Further reading
Review questions
Problems


Chapter 3: Relational Model Characteristics
Preview
3.1 A logical view of data
3.2 Keys
3.3 Integrity rules
3.4 The data dictionary and the system catalogue
3.5 Relationships within the relational database
3.6 Data redundancy revisited
3.7 Indexes
3.8 Codd's relational database rules
Summary
Key terms
Further reading
Review questions
Problems

Chapter 4: Relational Algebra and Calculus
Preview
4.1 Relational operators
4.2 Joins
4.3 Constructing queries using relational algebraic expressions
4.4 Relational calculus
Summary
Key terms
Further reading
Review questions
Problems

Part II: Design Concepts
Business Vignette: Using Data to Improve the Lives of Children and Women

Chapter 5: Data Modelling with Entity Relationship Diagrams
Preview
5.1 The entity relationship (ER) model
5.2 Developing an ER diagram
5.3 Database design challenges: conflicting goals
Summary
Key terms
Further reading
Review questions
Problems

Chapter 6: Data Modelling Advanced Concepts
Preview
6.1 The extended entity relationship model
6.2 Entity clustering

6.3 Entity integrity: selecting primary keys
6.4 Design cases: learning flexible database design
6.5 Data modelling checklist
Summary
Key terms
Further reading
Review questions
Problems
Case studies

Chapter 7: Normalising Database Designs
Preview
7.1 Database tables and normalisation
7.2 The need for normalisation
7.3 The normalisation process
7.4 Improving the design
7.5 Surrogate key considerations
7.6 Higher-level normal forms
7.7 Normalisation and database design
7.8 Denormalisation
Summary
Key terms
Further reading
Review questions
Problems

Part III: Database Programming
Business Vignette: Open Source Databases

Chapter 8: Beginning Structured Query Language
Preview
8.1 Introduction to SQL
8.2 Data definition commands
8.3 Data manipulation commands
8.4 Select queries
8.5 Advanced data definition commands
8.6 Advanced select queries
8.7 Virtual tables: creating a view
8.8 Joining database tables
Summary
Key terms
Further reading
Review questions
Problems


Chapter 9: Procedural Language SQL and Advanced SQL
Preview
9.1 Relational set operators
9.2 SQL join operators
9.3 Subqueries and correlated queries
9.4 SQL functions
9.5 Oracle sequences
9.6 Updatable views
9.7 Procedural SQL
9.8 Embedded SQL
Summary
Key terms
Further reading
Review questions
Problems
Case

Part IV: Database Design
Business Vignette: EM-DAT: The International Disaster Database for Disaster Preparedness

Chapter 10: Database Development Process
Preview
10.1 The information system
10.2 The systems development life cycle (SDLC)
10.3 The database life cycle (DBLC)
10.4 Database design strategies
10.5 Centralised vs decentralised design
10.6 Database administration
Summary
Key terms
Further reading
Review questions
Problems

Chapter 11: Conceptual, Logical, and Physical Database Design
Preview
11.1 Conceptual design
11.2 Logical database design
11.3 Physical database design
Summary
Key terms
Further reading
Review questions
Problems


Part V: Database Transactions and Performance Tuning
Business Vignette: From Data Warehouse to Data Lake

Chapter 12: Managing Transactions and Concurrency
Preview
12.1 What is a transaction?
12.2 Concurrency control
12.3 Concurrency control with locking methods
12.4 Concurrency control with time stamping methods
12.5 Concurrency control with optimistic methods
12.6 ANSI levels of transaction isolation
12.7 Database recovery management
Summary
Key terms
Further reading
Review questions
Problems

Chapter 13: Managing Database and SQL Performance
Preview
13.1 Database performance-tuning concepts
13.2 Query processing
13.3 Indexes and query optimisation
13.4 Optimiser choices
13.5 SQL performance tuning
13.6 Query formulation
13.7 DBMS performance tuning
13.8 Query optimisation example
Summary
Key terms
Further reading
Review questions
Problems

Part VI: Database Management
Business Vignette: The Facebook–Cambridge Analytica Data Scandal and the GDPR

Chapter 14: Distributed Databases
Preview
14.1 The evolution of distributed database management systems
14.2 DDBMS advantages and disadvantages
14.3 Distributed processing and distributed databases

14.4 Characteristics of distributed database management systems
14.5 DDBMS components
14.6 Levels of data and process distribution
14.7 Distributed database transparency features
14.8 Distribution transparency
14.9 Transaction transparency
14.10 Performance and failure transparency
14.11 Distributed database design
14.12 The CAP theorem
14.13 Database security
14.14 Distributed databases within the cloud
14.15 C.J. Date's 12 commandments for distributed databases
Summary
Key terms
Further reading
Review questions
Problems

Chapter 15: Databases for Business Intelligence
Preview
15.1 The need for data analysis
15.2 Business intelligence
15.3 Decision support data
15.4 The data warehouse
15.5 Star schemas
15.6 Data analytics
15.7 Online analytical processing
15.8 SQL analytic functions
15.9 Data visualisation
Summary
Key terms
Further reading
Review questions
Problems

Chapter 16: Big Data and NoSQL
Preview
16.1 Big data
16.2 Hadoop
16.3 NoSQL databases
16.4 NewSQL databases
16.5 Working with document databases using MongoDB
16.6 Working with graph databases using Neo4j
Summary
Key terms
Review questions


Chapter 17: Database Connectivity and Web Technologies
Preview
17.1 Database connectivity
17.2 Database internet connectivity
17.3 Extensible markup language (XML)
17.4 Cloud computing services
17.5 The semantic web
Summary
Key terms
Further reading
Review questions
Problems

Glossary
Index

Appendices (Available online)
Appendix A: Designing Databases with Visio Professional: A Tutorial
Appendix B: The University Lab: Conceptual, Logical, and Physical Database Design
Appendix C: Global Tickets Ltd: Conceptual, Logical, and Physical Database Design
Appendix D: Converting an ER Model into a Database Structure
Appendix E: Comparison of ER Modelling Notations
Appendix F: Client/Server Systems
Appendix G: Object-Orientated Databases
Appendix H: Databases in e-Commerce
Appendix I: The Hierarchical Database Model
Appendix J: The Network Database Model
Appendix K: Database Administration
Appendix L: Data Warehouse Implementation Factors
Appendix M: Creating a New Database Using Oracle 12c
Appendix N: A Guide to Using SQL Developer with Oracle 12c
Appendix O: Building a Simple Object-Relational Database Using Oracle Objects
Appendix P: Microsoft Access Tutorial
Appendix Q: Working with MongoDB
Appendix R: Working with Neo4j


Preface

We are excited to introduce the third edition of Database Principles: Fundamentals of Design, Implementation, and Management. This edition provides a solid and practical foundation for the design, implementation and management of database systems. This foundation is built on the notion that, while databases are very practical things, their successful creation depends on understanding the important concepts that define them.

This edition is suitable for a first course in databases at undergraduate level and will also provide essential material for conversion postgraduate courses. It is an ideal text not only for those studying database management systems in the context of computer science, but also for those on courses in the areas of business technology, data science and data analytics.

The Approach: A Continued Emphasis on Design

As the title suggests, Database Principles: Fundamentals of Design, Implementation, and Management covers three broad aspects of database systems. However, for several important reasons, special attention is given to database design.

The availability of excellent database software enables even database-inexperienced people to create databases and database applications. Unfortunately, the "create without design" approach usually paves the way to a number of database disasters. In our experience, many, if not most, database system failures are traceable to poor design and cannot be solved with the help of even the best programmers and managers. Nor is better DBMS software likely to overcome problems created or magnified by poor design. Using an analogy, even the best bricklayers and carpenters can't create a good building from a bad blueprint.

Most difficult problems of database system management seem to be triggered by poorly designed databases. It hardly seems worthwhile to use scarce resources to develop excellent and extensive database system management skills in order to exercise them on crises induced by poorly designed databases.

Design provides an excellent means of communication. Clients are more likely to get what they need when database system design is approached carefully and thoughtfully. In fact, clients may discover how their organisations really function once a good database design is completed.

Familiarity with database design techniques promotes understanding of current database technologies. For example, because data warehouses derive much of their data from operational databases, data warehouse concepts, structures, and procedures make more sense when the operational database's structure and implementation are understood.

Because the practical aspects of database design are stressed, we have covered design concepts and procedures in detail, making sure that the numerous end-of-chapter problems are sufficiently challenging that students can develop real and useful design skills. We also make sure that students understand the potential and actual conflicts between database design elegance, information requirements, and transaction processing speed. For example, it makes little sense to design databases that meet design elegance standards while they fail to meet end-user information requirements. Therefore, we explore the use of carefully defined trade-offs to ensure that the databases are capable of meeting end-user requirements while conforming to high design standards.

In keeping with the second edition, this third edition retains the use of UML (Unified Modelling Language) notation for entity relationship data modelling. Continual development by the Object Management Group has led to UML becoming an International Standard (ISO/IEC 19505-1 and 19505-2), which is continually reviewed; as of 2017, UML 2.5.1 is available. However, organisations still use the Chen and Crow's Foot notations to maintain legacy systems, so it is important to ensure that familiarity with both of these approaches to data modelling is maintained. Appendix E, Comparison of ER Modelling Notations, contains coverage of both notations.


Changes to the Third Edition

In this third edition, we have added some new features and updated coverage to strengthen the already strong coverage of database design and technology. Here are just a few of the highlights:

To support the continued growth of Big Data and its use, we have added a new chapter, Chapter 16: Big Data and NoSQL. The chapter focuses on the characteristics of Big Data and the technologies that have been developed to support it, including Hadoop, NoSQL databases and MongoDB.

Coverage of data visualisation tools and techniques has been expanded in greater depth in Chapter 15, Databases for Business Intelligence.

New coverage of MongoDB, with hands-on exercises for querying MongoDB databases (Appendix Q).

An additional appendix containing coverage of Neo4j, with hands-on exercises for querying graph databases (Appendix R).

New and updated Business Vignettes to provide topical discussion points in the classroom.


Acknowledgements

The

publisher

feedback Emilia

on the

second

Mwim,

UNISA

Patricia Judy

acknowledges

Casper

Essop, Jakeman,

Andy

Davies,

Mick Ridley, Ray

Turner,

Mark

For this

Lecturer

Oxford

I

coverage

of relational

Last,

College

of Essex University of

Glamorgan

to

say

a special

School

of

Computing,

of experience

within

thanks Maths

the

to and

Pamela Digital

database

field

Quick,

who

Technology have

previously

at

been

worked

Manchester

very

valuable,

Metropolitan specifically

I have

Louw.

been lucky

Marinda

to

provided

work

fantastic

with a very support

patient,

in

supportive

answering

all

and

It

review

2020 has

Cengage deemed

Learning. that

any

All suppressed

Rights

not least,

Reserved. content

has

been

with you.

certainly

thank

you

to

my family

(my

ohana)

for

your

patience

and

support.

January

Copyright

the

professional

my emails.

Keeley

Editorial

as a

algebra.

working and

State

of Bradford

like

edition,

Marinda

a pleasure

invaluable

Greenwich

Brookes

would

Free

Regional

University

in the

third

provided

College

University

Her years

On this

who

of Technology

of the

of

Blackburn

University.

Publisher,

University

Peterborough

McPhee,

edition,

lecturers,

of Pretoria

University

University

Green,

Duncan

Senior

Central

University

Chris

following

UNISA

Macdonald,

Ismael

of the

editions:

University

Wessels,

Theo

contribution

and third

Alexander,

van Biljon,

the


Crockett 2020


About the Authors

Carlos

Coronel

Tennessee

is

State

Administrator,

courses

in

Web development,

Steven

completed

Morris Design

and

Dr Keeley

and

and

from

Rule Induction data

in

dialogue

such

Women

as

in

be a STEM

Dr Craig in

South

the

Activated

is the such is

founder

also

has

Cengage deemed

Learning. that

any

technology

literacy,

All

Rights

does

May not

fields

at

as

a

and

Middle

Database

has

taught

data communications

not materially

be

copied, affect

been

which

at the

of

has

Women

in

Masters

explored

the

His PhD, in education (ACT)

companies

teaching

speaker

who is

digital

scanned, the

overall

or

of

resulted to teaching

of numerous

using

articles

undertaking

many

is

and IEEE also

proud

to

schools.

and

Artificial

Technology

Intelligence

in the

to

development

with technology. books

running,

his innovative

for its

and journal

Systems

application

the

and natural

committee,

of Information

author

She leads

Keeley

in rural

Fuzzy

presence

IEEE

in

systems

systems,

papers

roles.

with technology,

entitled

database

international

science

Mathematics

1998 of

Leadership

approach

and is the

systems,

in

in the

other

technology,

model, a unique

has published

students.

conference

many

area

Systems

a BSc Degree (Hons)

fuzzy

volunteer

in computer

in the

Steven

field

a strong

Engineering

among

outreach

the

intelligence,

an active

PL/SQL,

of Computing,

postgraduate

established

She is

School

She gained

machine learning

and

He has taught

and

University.

within

artificial

University. SQL

journals.

in the

teaching

using

and teaching His

State

over 125 refereed

IEEE

Auburn

University.

subcommittee

acclaimed

Reserved. content

Labs

Specialist,

Advanced

of several

Intelligence

Profiling

database

changing

with

undergraduate

Lab,

years.

of

multiple

various

and

PhD from

boards

has

both

of the

Teaching

our rapidly

suppressed

to

management.

an internationally

in

She

been researching 25

and

and a PhD in the field

with a passion for

over

in

Computer

Technology

Middle Tennessee

review

and journals.

Classroom

as computer

education

2020

for

Business

experience and

Programming

She has published

member

transaction

of

development,

Metropolitan

intelligence

has

Africa

database

1993,

conferences a

of

and

Computational

Psychological

Ambassador

Blewett

in

Domains.

systems.

being

years

Science

of MIS at

20 years

Computational

of

on the

Research

Adaptive

major international

roles

review

Data

Intelligence

into

College

Manager

Database

Manchester

UMIST in

for

the

design

Bachelor

serves

at

engineering

language

his

a Reader

from

Computational

research

database

and Principles

is

25 Web

Development,

Technology

Computation

over

for

levels.

currently

Crockett

Digital

and

and

Design,

articles,

Director

He has

and graduate

many

Lab

Administrator,

undergraduate

Analysis

Copyright

the

University.

Network

Database

Editorial

currently

covering

and

active

approaches

to

topics

living.

help

of Craig

He

change

world.


WALK-THROUGH TOUR

[Sample pages: the Part I Business Vignette (The Relational Revolution: An Historical Journey) and the opening page of Chapter 1, The Database Approach, with its learning objectives and Preview]

Business Vignettes illustrate the part topics with a genuine scenario and show how the subject integrates with the real world.

Chapter Previews set the scene for the chapter and provide an overview of the chapter's contents.

[Sample pages: page 20 of Chapter 1 (Section 1.5.3 on data redundancy, with a Note on field naming conventions) and the opening page of Chapter 3, Relational Model Characteristics, with its learning objectives, Preview and an Online Content box stating that Appendices A to P are available on the online platform accompanying the book]

Learning Objectives appear at the start of each chapter to help you monitor your understanding and progress through each chapter. Each chapter also ends with a summary section that recaps the key content for revision purposes.

Online Content boxes draw attention to relevant material on the online platform for this book.

Notes highlight important facts about the concepts introduced in the chapter.

[Sample pages: end-of-chapter Summary and Key Terms sections from pages 64 and 154 of Part I, including Table 2.3 (levels of data abstraction) and Table 4.1 (relational operators)]

Summary: Each chapter ends with a comprehensive summary that provides a thorough recap of the issues in each chapter, helping you to assess your understanding and revise key content.

Key Terms are listed at the end of the chapter and explained in full in a Glossary at the end of the book, enabling you to find explanations of key terms quickly.

[Sample pages: pages 31 and 32 of Chapter 1, The Database Approach, showing the end-of-chapter Key Terms, Further Reading, Review Questions and Problems, including Figures P1.1 and P1.2 (the file structures used in the problems)]

Further Reading allows you to explore the subject further, and acts as a starting point for projects and assignments.

Problems become progressively more complex as students draw on the lessons learnt from the completion of preceding problems.

Review Questions help reinforce and test your knowledge and understanding, and provide a basis for group discussions and activities.

DEDICATION

To my son, Kona, of whom I am so proud, keep following your dreams.

To Craig, my best friend and patient husband. Thank you for supporting my crazy busy life; without you nothing would be possible.

In memory of my father, Frank Crockett, who inspired me to be the person I am today.

To my mother, Norma Crockett, who is the angel in my life. Thank you for always being there for me.

To my mother- and father-in-law Jackie and Bill Smith who have provided me with much love and support.

In memory of Leslie Crockett, a true gentleman and much-loved uncle.

To my family and friends, all of you whom have painted rainbows in my life. Much love and aloha to all.

Keeley Crockett

Teaching & Learning Support Resources

Cengage's peer-reviewed content for higher and further education courses is accompanied by a range of digital teaching and learning support resources. The resources are carefully tailored to the specific needs of the instructor, student and the course. Examples of the kind of resources provided include:

A password-protected area for instructors with, for example, a test bank, PowerPoint slides and an instructor's manual.

An open-access area for students including, for example, online appendices, useful weblinks and glossary terms.

Lecturers: to discover the dedicated teaching digital support resources accompanying this textbook please register here for access: cengage.com/dashboard/#login

Students: to discover the dedicated learning digital support resources accompanying this textbook, please search for Database Principles: Fundamentals of Design, Implementation, and Management, Third Edition on: cengage.com

BE UNSTOPPABLE! Learn more at cengage.com
DATABASE PRINCIPLES

PART I

DATABASE SYSTEMS

1 The Database Approach
2 Data Models
3 Relational Model Characteristics
4 Relational Algebra and Calculus

BUSINESS VIGNETTE

THE RELATIONAL REVOLUTION: AN HISTORICAL JOURNEY

Until the late 1970s, databases stored large amounts of data in structures that were inflexible and difficult to navigate. Programmers needed to know what clients wanted to do with the data before the database was designed. Adding or changing the way the data were stored or analysed was time-consuming and expensive.

In 1970, Edgar 'Ted' Codd, a mathematician employed by IBM, published a groundbreaking article entitled 'A Relational Model of Data for Large Shared Data Banks'. At the time, nobody realised that Codd's theories would spark a technological revolution on par with the development of personal computers and the internet. Don Chamberlin, co-inventor of SQL, the most popular database query language today, explains: 'There was this guy Ted Codd who had some kind of strange mathematical notation, but nobody took it very seriously.' Then Ted Codd organised a symposium, and Chamberlin listened as Codd reduced complicated five-page programs to one line. 'And I said, "Wow",' Chamberlin recalls.

The symposium convinced IBM to fund System R, a research project that built a prototype of a relational database, which would eventually lead to the creation of SQL and DB2. IBM, however, kept System R on the back burner for a number of years, which turned out to be a crucial decision, because the company had a vested interest in IMS, a reliable, high-end database system that had been released in 1968.

At about the same time as System R started up, two professors from the University of California at Berkeley, who had read Codd's work, established a similar project called Ingres. The competition between the two tight-knit groups fuelled a series of papers. Unaware of the market potential of this research, IBM allowed its staff to publish these papers. Among those reading the papers was Larry Ellison, who had just founded a small company called Software Development Laboratories. Recruiting programmers from System R and Ingres, and securing funding from the CIA and the Navy, Ellison was able to market the first SQL-based relational database in 1979, well before IBM. By 1983, the company (Software Development Laboratories) had released a portable version of the database, had grossed over 13 910 000 annually, and had changed its name to Oracle.


Spurred on by competition, IBM finally released SQL/DS, its first relational database, in 1980.1 In 2008, a group of leading database researchers met in Berkeley and issued a report declaring that the industry had reached an exciting turning point and was on the verge of another database revolution.2

In 2010, Oracle acquired MySQL as part of its acquisition of Sun. It has since maintained the free open-source MySQL Community Edition while providing several versions (Standard Edition, Enterprise Edition and Cluster Edition) for commercial customers. In 2019, the release of MySQL Document Store brought together the SQL and the NoSQL languages, enabling developers to link SQL relational tables to schema-less NoSQL databases.3 Oracle's latest offering is Oracle Database 19c, where the c represents cloud; new versions now come out every year.

In our historical journey, we must also mention PostgreSQL, developed in 1986 as part of the POSTGRES project at the University of California at Berkeley. PostgreSQL4 is a free, open source, object-relational database that extends the traditional SQL language by allowing creation of new datatypes and functions, and the ability to write code in different programming languages. It is a strong competitor to MySQL, given that it has had over 33 years of active development.

Analysts, journalists and business leaders continually see new developments with data acquisition and its management, such as the explosion of unstructured data, the growing importance of business intelligence, and the emergence of cloud technologies, which may require the development of new database models. Although traditional relational databases meet rigorous standards for data integrity and consistency, they do not scale unstructured data as well as new database models such as NoSQL. NoSQL is also known as a non-relational database, which allows the storage and retrieval of unstructured data using a dynamic schema. A key question asked by database developers today is whether they need a NoSQL database or an SQL database for their application. For example, Twitter and Facebook, which do not require high levels of data consistency and integrity, have adopted NoSQL databases. In 2019, businesses are opting for SQL and NoSQL multiple database combinations, which suggests that one size does not fit all. As of March 2019, the most popular database management systems worldwide were Oracle, MySQL, Microsoft SQL and PostgreSQL.5

So, what is the future? Disruptive database technologies are required for business to remain competitive and the key is real-time data. Alternative database models such as cloud database platforms, which have the capability for real-time data analytics, are for certain. Big data has a role to play as additional data sources must be processed using data pipelines, all in accordance with the new General Data Protection Regulation (GDPR) data regulations. The relational model will survive, but it will also adapt at unprecedented speed.

1 IBM and Oracle Trade Barbs over Databases, https://phys.org/news/2007-05-ibm-oracle-barbs-databases.html
2 Rakesh Agrawal et al., The Claremont Report on Database Research, http://db.cs.berkeley.edu/claremont/claremontreport08.pdf
3 MySQL Editions, www.mysql.com/products/
4 PostgreSQL, www.postgresql.org/about/
5 Top 10 Databases for 2019, The Database Journal, www.databasejournal.com/features/oracle/slideshows/top-10-2019-databases.html


CHAPTER 1 The Database Approach

IN THIS CHAPTER, YOU WILL LEARN:

• The difference between data and information
• What a database is, what the different types of databases are, and why they are valuable assets for decision making
• The importance of database design
• How modern databases evolved from file systems
• About flaws in file system data management
• What the database system's main components are and how a database system differs from a file system
• The main functions of a database management system (DBMS)
• The role of open source database systems
• The importance of data governance and data quality

PREVIEW

Good decisions require good information, which is derived from raw facts known as data. Data are likely to be managed most efficiently when they are stored in a database. In this chapter, you learn what a database is, what it does and why it yields better results than other data management methods. You will also learn about different types of databases and why database design is so important.

Databases evolved from computer file systems. Although file system data management is now largely outmoded, understanding the characteristics of file systems is important because they are the source of serious data management limitations. In this chapter, you will also learn how the database system approach helps eliminate most of the shortcomings of file system data management.


1.1 DATA VS INFORMATION

To understand what drives database design, you need to understand the difference between data and information. Data are raw facts. The word raw indicates that the facts have not yet been processed to reveal their meaning. For example, suppose that you want to know what the users of a computer lab think of its services. Typically, you would begin by surveying users to assess the computer lab's performance. Figure 1.1, Panel (a), shows the Web survey form that enables users to respond to your questions. When the survey form has been completed, the form's raw data are saved to a data repository, such as the one shown in Figure 1.1, Panel (b). Although you now have the facts in hand, they are not particularly useful in this format because reading page after page of zeros and ones is not likely to provide much insight. Therefore, you transform the raw data into a data summary like the one shown in Figure 1.1, Panel (c). It is now possible to get quick answers to questions such as, 'What is the composition of our lab's customer base?' In this case, you can quickly determine that most of your customers are second-year and first-year undergraduates (38 per cent and 32 per cent). And, because graphics can enhance your ability to extract meaning from data quickly, you show the data summary bar graph in Figure 1.1, Panel (d).

FIGURE 1.1 Transforming raw data into information
(a) initial survey screen; (b) raw data; (c) information in summary format; (d) information in graphic format
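The step from raw facts to a data summary described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the records, the field name `student_class` and the helper `summarise` are hypothetical and do not come from the survey shown in Figure 1.1.

```python
from collections import Counter

# Hypothetical raw survey data: one record per respondent, stored exactly
# as captured. Field names and values are illustrative assumptions.
raw_responses = [
    {"student_class": "Undergraduate year 1", "satisfied": "yes"},
    {"student_class": "Undergraduate year 2", "satisfied": "no"},
    {"student_class": "Undergraduate year 1", "satisfied": "yes"},
    {"student_class": "Postgraduate", "satisfied": "yes"},
    {"student_class": "Undergraduate year 3", "satisfied": "no"},
]


def summarise(responses, field):
    """Turn raw records into information: the percentage share of each category."""
    counts = Counter(record[field] for record in responses)
    total = sum(counts.values())
    return {category: round(100 * n / total, 1) for category, n in counts.items()}


if __name__ == "__main__":
    # Print a small summary table, one category per line.
    summary = summarise(raw_responses, "student_class")
    for category in sorted(summary):
        print(f"{category:<22} {summary[category]:5.1f} per cent")
```

The raw list answers no question on its own; the summary dictionary is what lets a reader see the composition of the customer base at a glance.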

Information is the result of processing raw data to reveal its meaning. Data processing may be as simple as organising data to reveal patterns or as complex as making forecasts or drawing inferences using statistical modelling. Such information can then be used as the foundation for decision making. For example, the data summary for each question on the survey form can point out the lab's strengths and weaknesses, helping you to make informed decisions to better meet the needs of lab customers.

Raw data must be properly formatted for storage, processing and presentation. For example, in Figure 1.1, Panel (c), the student classification is formatted to show the results based on the classifications undergraduates years 1 to 3, postgraduates and other. The respondents' yes/no responses may need to be converted to a Y/N format for data storage. More complex formatting is required when working with complex data types such as sounds, videos or images.

In this information age, production of accurate, relevant and timely information is the key to good decision making. In turn, good decision making is the key to business survival in a global market. We are now said to be entering the knowledge age.6 Data are the foundation of information, which is the bedrock of knowledge, that is, the body of information and facts about a specific subject. Knowledge implies familiarity, awareness and understanding of information as it applies to an environment. A key characteristic of knowledge is that new knowledge can be derived from old knowledge.

Let's summarise some key points:

• Data constitute the building blocks of information.
• Information is produced by processing data.
• Information is used to reveal the meaning of data.
• Accurate, relevant and timely information is the key to good decision making.
• Good decision making is the key to organisational survival in a global environment.

Timely and useful information requires accurate data. Such data must be generated properly, and they must be stored in a format that is easy to access and process. And, like any basic resource, the data environment must be managed carefully. Data management is a discipline that focuses on the proper generation, storage and retrieval of data. Given the crucial role that data play, it should not surprise you that data management is a core activity for any business, government agency, service organisation or charity.

6 Peter Drucker coined the phrase 'knowledge worker' in 1959 in his book Landmarks of Tomorrow. In 1994, Ms Esther Dyson, Mr George Gilder, Dr George Keyworth and Dr Alvin Toffler introduced the concept of the knowledge age.

1.1.1 Data Quality and Data Governance

The quality of the data within the database is essential if the organisation is to make accurate short- and long-term business decisions. Data must be fit for purpose and this often means that it can be used to develop new strategies which aim to increase the income generation of an organisation. Data quality can be examined at a number of different levels, including:

• Accuracy: Is the data accurate and has it been obtained from a verifiable source?
• Completeness: Is the required data being stored?
• Timeliness: Is the data updated frequently in order to meet the business requirements?
• Relevance: Is the data relevant to the organisation?

• Uniqueness: Is the data unique, without redundancy?
• Unambiguous: Is the meaning of the data clear?

The above list is not exhaustive.

Data governance is the term used to describe a strategy or methodology defined by an organisation to safeguard data quality. Most countries will have their own laws on the collecting, storage and processing of data, which organisations must adhere to. One of the major changes in Europe was the General Data Protection Regulation (GDPR), which became a legal requirement for all organisations from 25 May 2018. For example, Article 22 of the GDPR includes the right of an individual not to be subject to automated decision making, which includes profiling, unless explicit consent was given. Individuals who are subject to automated decision making have the right to ask for an explanation of how the decision is reached. South Africa has the Protection of Personal Information Act (POPIA), which was signed into law in 2013. POPIA promotes the protection of personal information by public and private bodies.

Each organisation will have its own data governance strategy, which will involve the development and implementation of a series of policies and procedures for managing the availability, usability, quality, integrity and security of data. For example, the strategy defines who owns the data and who is authorised to create, update and delete records. Creating a data governance strategy is a complex and time-consuming task and will involve many people working at different levels within the organisation. Once the strategy has been developed and put into operation, it should be regularly measured and monitored to ensure that the policies and procedures are being followed. Once in place, the strategy requires continual monitoring to ensure that it is still relevant and up to date for the purpose of the organisation. Data quality tools, such as data profiling, are often used as part of the monitoring process to keep track of the quality of data over time.

Master Data Management (MDM) is a component of a data governance strategy and provides the technological foundation for implementation of the strategy. MDM ensures that data is consistent across all systems within an organisation, which will allow consistent auditing and accurate reporting.

1.2 INTRODUCING THE DATABASE AND THE DBMS

Efficient data management typically requires the use of a computer database. A database is a shared, integrated computer structure that stores a collection of:

• end-user data, or raw facts of interest to the end user
• metadata, or data about data, through which the end-user data are integrated and managed.

The metadata provide a description of the data characteristics and the set of relationships that link the data found within the database. In a sense, a database resembles a very well-organised electronic filing cabinet in which powerful software, known as a database management system, helps manage the cabinet's contents. A database management system (DBMS) is a collection of programs that manages the database structure and controls access to the data stored in the database.

1.2.1 Role and Advantages of the DBMS

The DBMS serves as the intermediary between the user and the database, as Figure 1.2 illustrates. The DBMS receives all application requests and translates them into the complex operations required to fulfil those requests. The DBMS hides much of the database's internal complexity from the application programs and users. The application program might be written by a programmer using a programming language such as Python, Visual Basic, C++ or Java, or it might be created through a DBMS utility program.

FIGURE 1.2 The DBMS manages the interaction between the end user and the database
[Diagram: end users send application requests to the DBMS (database management system) and receive data in return; the DBMS manages the database structure, which holds the metadata and the end-user data (Customers, Invoices, Products).]
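The two kinds of content a database stores, end-user data and metadata, and the way an application reaches both only through the DBMS, can be sketched with Python's built-in `sqlite3` module. This is a minimal illustration that assumes SQLite as the DBMS; the `customer` table, its columns and the sample rows are hypothetical.

```python
import sqlite3

# An in-memory SQLite database stands in for the database managed by the DBMS.
conn = sqlite3.connect(":memory:")

# Defining the structure creates metadata: the names and types of the data elements.
conn.execute(
    "CREATE TABLE customer ("
    " cus_id INTEGER PRIMARY KEY,"
    " cus_name TEXT NOT NULL,"
    " cus_balance REAL NOT NULL)"
)

# End-user data: the raw facts of interest to the end user (hypothetical rows).
conn.executemany(
    "INSERT INTO customer (cus_id, cus_name, cus_balance) VALUES (?, ?, ?)",
    [(1, "Customer A", 5200.0), (2, "Customer B", 310.5)],
)

# An application request for end-user data; the DBMS works out how to fetch it.
end_user_data = conn.execute(
    "SELECT cus_name, cus_balance FROM customer ORDER BY cus_id"
).fetchall()

# A request for metadata (data about data): column names and declared types.
# PRAGMA table_info returns rows of (cid, name, type, notnull, dflt_value, pk).
metadata = [(row[1], row[2]) for row in conn.execute("PRAGMA table_info(customer)")]

if __name__ == "__main__":
    print(end_user_data)
    print(metadata)
```

The application never opens the underlying storage itself: every request, whether for data or for its description, is expressed to the DBMS, which hides how the rows are physically stored.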

Having a DBMS between the end user's applications and the database offers some important advantages. First, the DBMS enables the data in the database to be shared among multiple applications or users. Second, the DBMS integrates the many different users' views of the data into a single all-encompassing data repository.

Because data are the crucial raw material from which information is derived, you need a good way of managing such data. As you will discover in this book, the DBMS helps make data management more efficient and effective. In particular, a DBMS provides advantages such as:

• Improved data sharing. The DBMS helps create an environment in which end users have better access to more and better-managed data. Such access makes it possible for end users to respond quickly to changes in their environment.
• Better data integration. Wider access to well-managed data promotes an integrated view of the organisation's operations and a clearer view of the big picture. It becomes much easier to see how actions in one segment of the company affect other segments.
• Minimised data inconsistency. Data inconsistency exists when different versions of the same data appear in different places. For example, data inconsistency exists when a company's sales department stores a sales representative's name as Thobile Cele and the company's personnel department stores that same person's name as Bathobile M. Cele, or when the company's regional sales office shows the price of product X as R390.00 in South African currency and its national sales office shows the same product's price as R350.00. The probability of data inconsistency is greatly reduced in a properly designed database.
• Improved data access. The DBMS makes it possible to produce quick answers to ad hoc queries. From a database perspective, a query is a specific request for data manipulation (for example, to read or update the data) issued to the DBMS. Simply put, a query is a question and an ad hoc query is a spur-of-the-moment question. The DBMS sends back an answer (called the query result set) to the application. For example, end users, when dealing with large amounts of sales data, might want quick answers to questions (ad hoc queries) such as:
  - What was the volume of sales by product during the past six months?
  - What is the sales bonus figure for each of our salespeople during the past three months?
  - How many of our customers have credit balances of R5 000 (or 3 000) or more?
• Improved decision making. Better-managed data and improved data access make it possible to generate better-quality information, on which better decisions are based.
• Increased end-user productivity. The availability of data, combined with the tools that transform data into usable information, empowers end users to make quick, informed decisions that can be the difference between success and failure in the global economy.

The advantages of using a DBMS are not limited to the few just listed. In fact, you will discover many more advantages as you learn more about the technical details of databases and their proper design.
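To make the idea of an ad hoc query and its result set concrete, the sketch below issues two spur-of-the-moment questions of the kind listed above. It is an illustration only, assuming SQLite (through Python's built-in `sqlite3` module) as the DBMS; the tables, column names, sample rows and the helper `ad_hoc` are hypothetical.

```python
import sqlite3

# Hypothetical sales data held in an in-memory SQLite database.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE customer (cus_id INTEGER PRIMARY KEY, cus_name TEXT, cus_balance REAL);
    CREATE TABLE sale (sale_id INTEGER PRIMARY KEY, product TEXT, amount REAL);
    INSERT INTO customer VALUES (1, 'Customer A', 5200.00);
    INSERT INTO customer VALUES (2, 'Customer B', 310.50);
    INSERT INTO customer VALUES (3, 'Customer C', 5000.00);
    INSERT INTO sale VALUES (1, 'X', 390.00);
    INSERT INTO sale VALUES (2, 'X', 390.00);
    INSERT INTO sale VALUES (3, 'Y', 120.00);
    """
)


def ad_hoc(sql, params=()):
    """Send a query to the DBMS and return its answer, the query result set."""
    return conn.execute(sql, params).fetchall()


# How many of our customers have credit balances of 5 000 or more?
big_balances = ad_hoc(
    "SELECT COUNT(*) FROM customer WHERE cus_balance >= ?", (5000,)
)

# What was the volume of sales by product?
sales_by_product = ad_hoc(
    "SELECT product, SUM(amount) FROM sale GROUP BY product ORDER BY product"
)

if __name__ == "__main__":
    print(big_balances)
    print(sales_by_product)
```

Neither question had to be anticipated when the tables were designed; the DBMS translates each request into the operations needed to answer it.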

1.2.2 Types of Databases

A DBMS can support many different types of databases. Databases can be classified according to the number of users supported, where the data are located, the type of data stored, the intended data usage and the degree to which the data are structured.

The number of users determines whether the database is classified as single-user or multi-user. A single-user database supports only one user at a time. In other words, if user A is using the database, users B and C must wait until user A is done. A single-user database that runs on a personal computer is called a desktop database. In contrast, a multi-user database supports multiple users at the same time. When the multi-user database supports a relatively small number of users (usually fewer than 50) or a specific department within an organisation, it is called a workgroup database. When the database is used by the entire organisation and supports many users (more than 50, usually hundreds) across many departments, the database is known as an enterprise database.

Location might also be used to classify the database. For example, a database that supports data located at a single site is called a centralised database. A database that supports data distributed across several different sites is called a distributed database. The extent to which a database can be distributed, and the way in which such distribution is managed, is addressed in detail in Chapter 14, Distributed Databases.

The most popular way of classifying databases today, however, is based on how they will be used and on the time sensitivity of the information gathered from them. For example, transactions such as product or service sales, payments and supply purchases reflect critical day-to-day operations. Such transactions must be recorded accurately and immediately. A database that is designed primarily to support a company's day-to-day operations is classified as an operational database, also referred to as an online transaction processing (OLTP), transactional or production database.

Typically, analytical databases comprise two main components: a data warehouse and an online analytical processing (OLAP) front end. The data warehouse is a specialised database that stores data in a format optimised for decision support. The data warehouse contains historical data obtained from the operational databases as well as data from other external sources. Online analytical processing is a set of tools that work together to provide an advanced data analysis environment for retrieving, processing and modelling data from the data warehouse. In recent times, this area of database application has grown in importance and usage, to the point that it has evolved into its own discipline: business intelligence. The term business intelligence describes a comprehensive approach to capturing and processing business data with the purpose of generating information to support business decision making. (See Chapter 15, Databases for Business Intelligence.)

Databases can also be classified to reflect the degree to which the data are structured. Unstructured data are data that exist in their original (raw) state, that is, in the format in which they were collected. Therefore, unstructured data exist in a format that does not lend itself to the processing that yields information. Structured data are the result of taking unstructured data and formatting (structuring) such data to facilitate storage, use and the generation of information. You apply structure (format) based on the type of processing that you intend to perform on the data. Some data might not be ready (unstructured) for some types of processing, but they might be ready (structured) for other types of processing. For example, the data value 37890 might refer to a postal code, a sales value or a product code. If this value represents a postal code or a product code and is stored as text, you cannot perform mathematical computations with it. On the other hand, if this value represents a sales transaction, it must be formatted as numeric.

To illustrate the structure concept further, imagine a stack of printed paper invoices. If you merely want to store these invoices as images for future retrieval and display, you can scan them and save them in a graphic format. On the other hand, if you want to derive information such as monthly totals and average sales, such graphic storage would not be useful. Instead, you could store the invoice data in a (structured) spreadsheet format so that you can perform the requisite computations. Actually, most data you encounter are best classified as semi-structured. Semi-structured data are data that have already been processed to some extent. For example, if you look at a typical Web page, the data are presented to you in a prearranged format to convey some information.

The database types mentioned thus far focus on the storage and management of highly structured data. However, corporations are not limited to the use of structured data. They also use semi-structured and unstructured data. Just think of the valuable information that can be found on company emails, memos, documents such as procedures and rules, Web pages, etc. Unstructured and semi-structured data storage and management needs are being addressed through a new generation of databases known as XML databases. Extensible Markup Language (XML) is a special language used to represent and manipulate data elements in a textual format. An XML database supports the storage and management of semi-structured XML data. XML databases will be discussed in more detail in Chapter 17, Database Connectivity and Web Technologies.

Analytical databases focus primarily on storing historical data and business metrics used exclusively for tactical or strategic decision making. Such analysis typically requires extensive data massaging (data manipulation) to produce information on which to base pricing decisions, sales forecasts, market positioning and so on. Analytical databases allow the end user to perform advanced analysis of business data using sophisticated tools.

In contrast, a data warehouse focuses primarily on storing data used to generate information required to make tactical or strategic decisions. Such decisions typically require extensive data massaging (data manipulation) to extract information to formulate pricing decisions, sales forecasts, market positioning and so on. Most decision support data are based on historical data obtained from operational databases. Additionally, the data warehouse can store data derived from many sources. To make it easier to retrieve such data, the data warehouse structure is quite different from that of an operational or transactional database. The design, implementation and use of data warehouses are covered in detail in Chapter 15, Databases for Business Intelligence.

Table 1.1 compares features of several well-known database management systems.

12

PART I

Database

Systems

TABLE 1.1

Types of databases

1 Product

Number

Data Location

Of Users

XML

Data Usage

Multi-user

Single User

workgroup

X

X

X3

X

X

X

X

X

X

X

X3

X

X

X

X

X

X

X

MySQL

X

X

X

X

X

X

X

X

Oracle

X3

X

X

X

X

X

X

X

MS

enterprise

Centralised

Distributed

Operational

X

Analytical

X

X

Access MS SQL Server IBM

DB2

RDBMS

All the

database

commercial

DBMS, its system

applications

the

for

purpose,

any to

The there

MySQL look

general

main benefit of open source

define

and the

Perl

blocks for

the

most popular

PostgreSQL8

media

such

Over the

and

widespread

term

NoSQL9


PostGres

9

NoSQL


any

not

SQL) is

based

Available:


develop

the

source the

the

which

provided buy

by actual

database

database

database

will then

and

system

be released

for

A disadvantage

of open

Twitter grow

and

new breed

this

LinkedIn

exponentially

new

generally

on the traditional

relational

source

capture

is

a new

database

the

system

vast

as they

software

is

DBMS building

such

as

stick to the

organisations

is that it

to

does

not

systems.

as the

use

basis for the new

Social

media refers

interactions.

amounts

the

of

to

Websites

of data

about

specialised

end

database

has grown in sophistication

known

as

generation

model.

products

and

human

database

database

describe

basic

companies

on

and require

of

to

MySQL

and analysed.

of specialised

type

used

Web server,

commercial

always

LAMP

provides

technologies

anytime,

However,

The term

DBMS products

smaller

by large-scale

anywhere,

stack

management

vendor

ideal

product itself.

software.

Apache

software

database

required

and use the

You

a

NoSQL

database.

of database will learn

The

management

more about

NoSQL

NoSQL.

www.postgresql.org/ http://nosql-database.org/


to

www.mysql.com/

Available: Available:

data

years, this

16 Big Data and

mysql.com

8

can

Linux,

this

of data are being stored

enable

Currently,

only

namely:

Together

makes them

durability

Instagram,

These

usage.

is

are

order

distribute

of the

World Wide Web and internet-based

that

past few

(Not

that

Chapter

users

support

use than large-scale This

great amounts

Facebook,

consumers.

systems.

7

generation,

Google,

and

systems

of the

ongoing

open source

quickly.

and

mobile technologies

as

users

Typically,

applications

Withthe emergence

Web and

choice,

any improvements,

software,

languages.

principles.

functionality

and

source

are easier to

database

the robust

social

is that

make

in

MySQL7 is an open

of their

software is that it is free to acquire development

open

websites.

database-centred

provide

and

MySQL)

a company

maintenance.

The idea

code

development

developing

fundamental

develop

in the

PHP/Python

MySQL and basic

and

1.1 (except

from

modify a database

product.

source

Table

public.

will be costs involved

used to

in

DBMS

and

in

investment

support

build

at the

shown

a significant

and ongoing users to

actual

the

systems

and require

which allows

improve

back

management

vendors

does


NOTE
Most of the database design, implementation and management issues addressed in this book are based on production (transactional) databases. The focus on production databases is based on two considerations. First, production databases are the databases most frequently encountered in common activities such as enrolling in a class, registering a car, buying a product or making a bank deposit or withdrawal. Second, data warehouse databases derive most of their data from production databases, and if production databases are poorly designed, the data warehouse databases based on them will lose their reliability and value as well.

1.3 WHY DATABASE DESIGN IS IMPORTANT

Database design refers to the activities that focus on the design of the database structure that will be used to store and manage end-user data. A good database, that is, a database that meets all user requirements, does not just happen; its structure must be designed carefully. In fact, database design is such a crucial aspect of working with databases that most of this book is dedicated to the development of good database design techniques. Even a good DBMS will perform poorly with a badly designed database.

Proper database design requires the designer to identify precisely the database's expected use. Designing a transactional database emphasises accurate and consistent data and operational speed. The design of a data warehouse database recognises the use of historical and aggregated data. Designing a database to be used in a centralised, single-user environment requires a different approach from that used in the design of a distributed, multi-user database. This book emphasises the design of transactional, centralised, single-user and multi-user databases. Chapters 14 and 15 also examine critical issues confronting the designer of distributed and data warehouse databases.

A well-designed database facilitates data management and generates accurate and valuable information. A poorly designed database is likely to become a breeding ground for difficult-to-trace errors that may lead to bad decision making, and bad decision making can lead to the failure of an organisation. Database design is simply too important to be left to luck. That's why university students study database design, why organisations of all types and sizes send personnel to database design seminars, and why database design consultants often make an excellent living.

1.4 HISTORICAL ROOTS: FILES AND DATA PROCESSING

Understanding what a database is, what it does and the proper way to use it can be clarified by considering what a database is not. A brief explanation of the evolution of file system data processing can be helpful in understanding the data access limitations that databases attempt to overcome. Understanding these limitations is relevant to database designers and developers because database technologies do not make these problems magically disappear; database technologies simply make it easier to create solutions that avoid these problems. Creating database designs that avoid the pitfalls of earlier systems requires that the designer understands these problems and how to avoid them; otherwise, the database technologies are no better (and are potentially even worse!) than the technologies and techniques they have replaced.


1.4.1 Manual File Systems

To be successful, an organisation must develop systems for handling core business tasks. Historically, such systems were often manual, paper-and-pencil systems. The papers within these systems were organised to facilitate the expected use of the data. Typically, this was accomplished through a system of file folders and filing cabinets. As long as a collection of data was relatively small and an organisation's business users had few reporting requirements, the manual system served its role well as a data repository. However, as organisations grew and as reporting requirements became more complex, keeping track of data in a manual file system became more difficult. Therefore, companies looked to computer technology for help.

1.4.2 Computerised File Systems

Generating reports from manual file systems was slow and cumbersome. In fact, some business managers faced government-imposed reporting requirements that led to weeks of intensive effort each quarter, even when a well-designed manual system was used. Therefore, a data processing (DP) specialist was hired to create a computer-based system that would track data and produce required reports. Initially, the computer files within the file system were similar to the manual files. A simple example of a customer data file for a small insurance company is shown in Figure 1.3. (You will discover later that the file structure shown in Figure 1.3, although typically found in early file systems, is unsatisfactory for a database.)

The description of computer files requires a specialised vocabulary. Every discipline develops its own terminology to enable its practitioners to communicate clearly. The basic file vocabulary shown in Table 1.2 will help you to understand subsequent discussions more easily.

Online Content
The databases used in the chapters are available on the online platform accompanying this book. Throughout the book, Online Content boxes highlight material related to chapter content located on the online platform. Please see the prelims for details on how to access these useful resources.

TABLE 1.2 Basic file terminology

Term      Definition

Data      Raw facts, such as a telephone number, a birth date, a customer name and a year-to-date (YTD) sales value. Data have little meaning unless they have been organised in some logical manner. The smallest piece of data that can be recognised by the computer is a single character, such as the letter A, the number 5 or a symbol such as /. A single character requires 1 byte of computer storage.

Field     A character or group of characters (alphabetic or numeric) that has a specific meaning. A field is used to define and store data.

Record    A logically connected set of one or more fields that describes a person, place or thing. For example, the fields that constitute a record for a customer named J. D. Rudd might consist of J. D. Rudd's name, address, phone number, date of birth, credit limit and unpaid balance.

File      A collection of related records. For example, a file might contain data about vendors of ROBCOR Company, or a file might contain the records for the students currently enrolled at Gigantic University.

FIGURE 1.3 Contents of the CUSTOMER file

[Figure: a table of ten customer records with the fields C_NAME, C_PHONE, C_ADDRESS, C_POSTCODE, A_NAME, A_PHONE, TP, AMT and REN.]

C_NAME = Customer name
C_PHONE = Customer phone
C_ADDRESS = Customer address
C_POSTCODE = Customer postcode
A_NAME = Agent name
A_PHONE = Agent phone
TP = Insurance type
AMT = Insurance policy amount, in thousands of euro
REN = Insurance renewal date

Using the proper file terminology given in Table 1.2, you can identify the file components shown in Figure 1.3. The CUSTOMER file shown in Figure 1.3 contains ten records. Each record is composed of nine fields: C_NAME, C_PHONE, C_ADDRESS, C_POSTCODE, A_NAME, A_PHONE, TP, AMT and REN. The ten records are stored in a named file. Because the file in Figure 1.3 contains customer data, its filename is CUSTOMER.
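The file, record and field vocabulary can be made concrete with a short sketch. The Python below is only an illustration, not part of the original file system: the class name `CustomerRecord` is hypothetical, and the two sample records hold placeholder values invented for the example rather than rows taken from Figure 1.3. Only the nine field names come from the text.

```python
from dataclasses import dataclass, fields


@dataclass
class CustomerRecord:
    """One record: a logically connected set of fields describing one customer."""
    c_name: str      # customer name
    c_phone: str     # customer phone
    c_address: str   # customer address
    c_postcode: str  # customer postcode
    a_name: str      # agent name
    a_phone: str     # agent phone
    tp: str          # insurance type
    amt: float       # insurance policy amount, in thousands of euro
    ren: str         # insurance renewal date, kept as text in this sketch


# A file is a collection of related records.
# The values below are placeholders invented for this sketch.
customer_file = [
    CustomerRecord("Customer One", "000-0001", "1 Sample Street", "0001",
                   "Agent One", "000-0101", "T1", 100.00, "05-Apr-2018"),
    CustomerRecord("Customer Two", "000-0002", "2 Sample Street", "0002",
                   "Agent Two", "000-0102", "S2", 150.00, "29-Jan-2018"),
]

# Each field is a named piece of data inside the record.
field_names = [f.name.upper() for f in fields(CustomerRecord)]

print(f"{len(customer_file)} records, {len(field_names)} fields each")
print(", ".join(field_names))
```

In this sketch the list plays the role of the named file, each `CustomerRecord` instance is one record, and each attribute is one field.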


When business users wanted data from the computerised file, they sent requests for the data to the DP specialist. For each request, the DP specialist had to create programs to retrieve the data from the file, manipulate it in whatever manner the user had requested and present it as a printed report. If a request was for a report that had been run previously, the DP specialist could rerun the existing program and provide the printed results to the user. As other business users saw the new and innovative ways in which customer data were being reported, they wanted to be able to view their data in similar fashions. This generated more requests for the DP specialist to create more computerised files of other business data, which in turn meant that more data management programs had to be created, and more requests for reports. For example, the sales department at the insurance company created a file named SALES, which helped track daily sales efforts. The sales department's success was so obvious that the personnel department manager demanded access to the DP specialist to automate payroll processing and other personnel functions. Consequently, the DP specialist was asked to create the AGENT file shown in Figure 1.4. The data in the AGENT file were used to do electronic fund transfers (EFTs), keep track of taxes paid and summarise insurance coverage, among other tasks.

FIGURE 1.4 Contents of the AGENT file

A_NAME           A_PHONE         A_ADDRESS                              POSTCODE  HIRED        YTD_PAY    YTD_IT    YTD_NI    YTD_SLS     DEP
Alex B. Alby     0161-228-1249   Deken Van Erpstraat 20, Best           5492      01-Nov-2001  20 806.00  5 201.00  1 664.00  103 963.00  3
Nkita F. Brown   27-21-410-7100  West Quay Road, Waterfront, Cape Town  8002      23-May-2004  25 230.00  6 308.00  2 018.00  108 844.00  0
Menzi T. Ndlovu  0181-123-5589   452 Elm St., Parkview, Johannesburg    2193      15-Jun-2003  18 169.00  4 542.00  1 453.00   99 548.00  2

A_NAME = Agent name
A_PHONE = Agent phone
A_ADDRESS = Agent address
POSTCODE = Agent postcode
HIRED = Agent date of hire
YTD_PAY = Year-to-date pay
YTD_IT = Year-to-date income tax paid
YTD_NI = Year-to-date national insurance paid
YTD_SLS = Year-to-date sales
DEP = Number of dependents

As the number of files increased, a small file system, like the one shown in Figure 1.5, evolved. Each file in the system used its own application programs to store, retrieve and modify data. And each file was owned by the individual or the department that commissioned its creation.

As the file system grew, the demand for the DP specialist's programming skills grew even faster, and the DP specialist was authorised to hire additional programmers. The size of the file system also required a larger, more complex computer. The new computer and the additional programming staff caused the DP specialist to spend less time programming and more time managing technical and human resources. Therefore, the DP specialist's job evolved into that of a data processing (DP) manager, who supervised a DP department. In spite of these organisational changes, however, the DP department's primary activity remained programming, and the DP manager inevitably spent much time as a supervising senior programmer and program troubleshooter.

FIGURE 1.5 A simple file system

[Diagram: the Sales department works with the CUSTOMER file and the SALES file, and the Personnel department works with the AGENT file. Each department has its own file management programs and file report programs.]

1.5 PROBLEMS WITH FILE SYSTEM DATA MANAGEMENT

The file system method of organising and managing data was a definite improvement on a manual system and served a useful purpose in data management for over two decades, a very long timespan in the computer era. Nonetheless, many problems and limitations became evident in this approach. A critique of the file system method serves two major purposes:

• Understanding the shortcomings of the file system enables you to understand the development of modern databases.

• Many of the problems are not unique to file systems. Failure to understand such problems is likely to lead to their duplication in a database environment, even though database technology makes it easy to avoid them.

The following problems severely challenge the types of information that can be created from the data as well as the accuracy of the information:

• Lengthy development times. The first and most glaring problem with the file system approach is that even the simplest data-retrieval task requires extensive programming. With the older file systems, programmers had to specify what must be done and how to do it. As you will learn in upcoming chapters, modern databases use a non-procedural data manipulation language that allows the user to specify what must be done without specifying how.

• Difficulty in getting quick answers. The need to write programs to produce even the simplest reports makes ad hoc queries impossible. DP specialists who work with mature file systems often receive numerous requests for new reports. They are often forced to say that the report will be ready next week or even next month. If you need the information now, getting it next week or next month will not serve your information needs.

• Complex system administration. System administration becomes more difficult as the number of files in the system expands. Even a simple file system with a few files requires creating and maintaining several file management programs. Each file must have its own file management programs that allow the user to add, modify and delete records; to list the file contents; and to generate reports. Because ad hoc queries are not possible, the file reporting programs can multiply quickly. The problem is compounded by the fact that each department in the organisation owns its data by creating its own files.

• Lack of security and limited data sharing. Another fault of a file system data repository is a lack of security and limited data sharing. Data sharing and security are closely related. Sharing data among multiple geographically dispersed users introduces a lot of security risks. In terms of creating data management and reporting programs, security and data-sharing features are difficult to program and consequently are often omitted from a file system environment. Such features include effective password protection, the ability to lock out parts of files or parts of the system itself, and other measures designed to safeguard data confidentiality. Even when an attempt is made to improve system and data security, the security devices tend to be limited in scope and effectiveness.

• Extensive programming. Making changes to an existing file structure can be difficult in a file system environment. For example, changing just one field in the original CUSTOMER file would require a program that:

1 Reads a record from the original file.
2 Transforms the original data to conform to the new structure's storage requirements.
3 Writes the transformed data into the new file structure.
4 Repeats the preceding steps for each record in the original file.

In fact, any change to a file structure, no matter how minor, forces modifications in all of the programs that use the data in that file. Modifications are likely to produce errors (bugs), and additional time is spent using a debugging process to find those errors. Those limitations, in turn, lead to problems of structural and data dependence.

1.5.1 Structural and Data Dependence

A file system exhibits structural dependence; that is, access to a file is dependent on its structure. For example, adding a customer date-of-birth field to the CUSTOMER file shown in Figure 1.3 would require the steps described in the previous section. Given this change, none of the previous programs will work with the new CUSTOMER file structure. Therefore, all of the file system programs must be modified to conform to the new file structure. In short, because the file system application programs are affected by change in the file structure, they exhibit structural dependence. Conversely, structural independence exists when it is possible to make changes in the file structure without affecting the application program's ability to access the data.

Even changes in file data characteristics, such as changing a field from integer to decimal, require changes in all programs that access the file. Because all data access programs are subject to change when any of the file's data storage characteristics change (that is, changing the data type), the file system is said to exhibit data dependence. Conversely, data independence exists when it is possible to make changes in the data storage characteristics without affecting the program's ability to access the data.

The practical significance of data dependence is the difference between the logical data format (how the human being views the data) and the physical data format (how the computer sees the data). Any program that accesses a file system's file must tell the computer not only what to do, but also how to do it. Each program must contain lines that specify the opening of a specific file type, its record specification and its field definitions. Data dependence makes the file system extremely cumbersome from a programming and data management point of view.
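Structural dependence, and the read, transform, write, repeat conversion program listed under "Extensive programming", can be sketched in a few lines. The Python below is a minimal illustration under stated assumptions: the fixed-width layouts (`OLD_LAYOUT`, `NEW_LAYOUT`), the field widths, the helper names and the sample customers are all hypothetical and are not taken from the CUSTOMER file in Figure 1.3.

```python
import struct

# Hypothetical fixed-width layouts: the old record holds a 20-byte name and a
# 12-byte phone; the new layout adds an 11-byte date-of-birth field.
OLD_LAYOUT = struct.Struct("20s12s")
NEW_LAYOUT = struct.Struct("20s12s11s")


def pack_text(value: str, width: int) -> bytes:
    """Encode text and pad (or cut) it to exactly `width` bytes."""
    return value.encode("ascii").ljust(width)[:width]


def read_old_records(blob: bytes):
    """A report program that hard-codes the old layout (structural dependence)."""
    for offset in range(0, len(blob), OLD_LAYOUT.size):
        name, phone = OLD_LAYOUT.unpack_from(blob, offset)
        yield name.decode("ascii").rstrip(), phone.decode("ascii").rstrip()


def convert(blob: bytes, default_dob: str = "") -> bytes:
    """Conversion program following the four steps in the text."""
    out = bytearray()
    for offset in range(0, len(blob), OLD_LAYOUT.size):   # 4. repeat per record
        name, phone = OLD_LAYOUT.unpack_from(blob, offset)  # 1. read a record
        new_record = NEW_LAYOUT.pack(                        # 2. transform it
            name, phone, pack_text(default_dob, 11))
        out += new_record                                    # 3. write it
    return bytes(out)


# Placeholder customers invented for this sketch.
old_file = b"".join(
    OLD_LAYOUT.pack(pack_text(name, 20), pack_text(phone, 12))
    for name, phone in [("Customer One", "555-0101"), ("Customer Two", "555-0102")]
)
new_file = convert(old_file)

print(list(read_old_records(old_file)))
try:
    list(read_old_records(new_file))
except struct.error as exc:
    print("old program fails on the new structure:", exc)
```

The old report program never changed, yet it can no longer read the file: record boundaries moved when the field was added, so every program that hard-codes the layout has to be rewritten. That is the sense in which access to the file depends on its structure.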

1.5.2 Field Definitions and Naming Conventions

At first glance, the CUSTOMER file shown in Figure 1.3 appears to have served its purpose well: requested reports could usually be generated. But suppose you want to create a customer phone directory based on the data stored in the CUSTOMER file. Storing the customer name as a single field turns out to be a liability because the directory must break up the field contents to list the last names, first names and initials in alphabetical order. Or suppose you want to get a customer listing by area code. Including the area code in the phone number field is inefficient. Similarly, producing a listing of customers by city is a more difficult task than is necessary. From the user's point of view, a much better (more flexible) record definition would be one that anticipates reporting requirements by breaking up fields into their component parts. Thus, the CUSTOMER file's fields might be listed as shown in Table 1.3.

TABLE 1.3 Sample customer file fields

Field          Contents                                Sample entry
CUS_LNAME      Customer last name                      Ramas
CUS_FNAME      Customer first name                     Alfred
CUS_INITIAL    Customer initial                        A
CUS_AREACODE   Customer area code                      1615
CUS_PHONE      Customer phone                          0161-234-5678
CUS_ADDRESS    Customer street address or box number   123 Green Meadow Lane
CUS_CITY       Customer city                           East London
CUS_COUNTY     Customer county/district                Eastern Cape
CUS_POSTCODE   Customer postcode                       3001

Selecting proper field names is also important. For example, make sure that the field names are reasonably descriptive. In examining the file structure shown in Figure 1.3, it is not obvious that the field name REN represents the customer's insurance renewal date. Using the field name CUS_RENEW_DATE would be better for two reasons. First, the prefix CUS can be used as an indicator of the field's origin, which is the CUSTOMER file. Therefore, you know that the field in question yields a CUSTOMER property. Second, the RENEW_DATE portion of the field name is more descriptive of the field's contents. With proper naming conventions, the file structure becomes self-documenting. That is, by simply looking at the field names, you can determine which files the fields belong to and what information the fields are likely to contain.

Some software packages place restrictions on the length of field names, so it is wise to be as descriptive as possible within those restrictions. In addition, very long field names make it difficult to fit more than a few fields on a page, thus making output spacing a problem. For example, the field name CUSTOMER_INSURANCE_RENEWAL_DATE, while being self-documenting, is less desirable than CUS_RENEW_DATE.

Another problem in Figure 1.3's CUSTOMER file is the difficulty of finding desired data efficiently. The CUSTOMER file currently does not have a unique record identifier. For example, it is possible to have several customers named James G. Khumalo. Consequently, the addition of a CUS_ACCOUNT field that contains a unique customer account number would be appropriate.
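The two points made in this section, that a single name field is awkward to sort and that records need a unique identifier, can be illustrated with a short sketch. The Python below is a hedged example only: the helper `naive_last_name`, the customer "Maria Van Der Merwe" and the account numbers are assumptions made for illustration, while the field names follow Table 1.3 and the CUS_ACCOUNT suggestion in the text.

```python
from operator import itemgetter


def naive_last_name(full_name: str) -> str:
    """Guess the last name from a single name field: take the final word."""
    return full_name.split()[-1]


# Names stored as one field, as in the original CUSTOMER file layout.
# "Maria Van Der Merwe" is a hypothetical customer with a multi-word surname.
single_field_names = ["Alfred A. Ramas", "James G. Khumalo", "Maria Van Der Merwe"]

# The directory program must break the field apart, and the guess is fragile:
# "Van Der Merwe" is filed under "Merwe".
guessed = sorted(single_field_names, key=naive_last_name)

# The same customers with the name broken into component fields, plus a
# unique account number (values invented for this sketch).
customers = [
    {"CUS_ACCOUNT": 1001, "CUS_LNAME": "Ramas", "CUS_FNAME": "Alfred", "CUS_INITIAL": "A"},
    {"CUS_ACCOUNT": 1002, "CUS_LNAME": "Khumalo", "CUS_FNAME": "James", "CUS_INITIAL": "G"},
    {"CUS_ACCOUNT": 1003, "CUS_LNAME": "Khumalo", "CUS_FNAME": "James", "CUS_INITIAL": "G"},
    {"CUS_ACCOUNT": 1004, "CUS_LNAME": "Van Der Merwe", "CUS_FNAME": "Maria", "CUS_INITIAL": ""},
]

# Sorting needs no parsing, and the two customers who share a name remain
# distinguishable by account number.
directory = sorted(customers, key=itemgetter("CUS_LNAME", "CUS_FNAME", "CUS_ACCOUNT"))

print(guessed)
for row in directory:
    print(row["CUS_ACCOUNT"], row["CUS_LNAME"], row["CUS_FNAME"], row["CUS_INITIAL"])
```

With component fields the sort key is simply a list of field names, which is the flexibility the section argues for.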

The criticisms of field definitions and naming conventions shown in the file structure of Figure 1.3 are not unique to file systems. Because such conventions will prove to be important later, they are introduced early. You will revisit field definitions and naming conventions when you learn about database design in Chapter 5, Data Modelling with Entity Relationship Diagrams, and Chapter 6, Data Modelling Advanced Concepts; and when you learn about database implementation issues in Chapter 11, Conceptual, Logical, and Physical Database Design. Regardless of the data environment, the design (whether it involves a file system or a database) must always reflect the designer's documentation needs and the end user's reporting and processing requirements. Both types of needs are best served by adhering to proper field definitions and naming conventions.

Online Content
Appendices A to R are available on the online platform accompanying this book.

NOTE
No naming convention can fit all requirements for all systems. Some words or phrases are reserved for the DBMS's internal use. For example, the name ORDER generates an error in some DBMSs. Similarly, your DBMS might interpret a hyphen (-) as a command to subtract. Therefore, the field CUS-NAME would be interpreted as a command to subtract the NAME field from the CUS field. Because neither field exists, you would get an error message. On the other hand, CUS_NAME would work fine because it uses an underscore.

1.5.3 Data Redundancy

The file system's structure and lack of security make it difficult to combine data from multiple sources. The organisational structure promotes the storage of the same basic data in different locations. Database professionals use the term islands of information for such scattered data locations. As it is unlikely that data stored in different locations will always be updated consistently, the islands of information often contain different versions of the same data. For example, in Figures 1.3 and 1.4, the agent names and phone numbers occur in both the CUSTOMER and the AGENT files. You need only one correct copy of the agent names and phone numbers. Having them occur in more than one place produces data redundancy. Data redundancy exists when the same data are stored unnecessarily at different places.

Uncontrolled data redundancy sets the stage for:

• Poor data security. Having multiple copies of data increases the chances of a copy of the data being susceptible to unauthorised access.

• Data inconsistency. Data inconsistency exists when different and conflicting versions of the same data appear in different places. For example, suppose you change an agent's phone number or address in the AGENT file. If you forget to make corresponding changes in the CUSTOMER file, the files contain different data for the same agent. Reports will yield inconsistent results, depending on which version of the data is used.

CHAPTER

1 The

Database

Approach

21

1
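How redundancy leads to inconsistency can be sketched in a few lines. The example assumes SQLite through Python's sqlite3 module; the field names (AGENT_CODE, AGENT_PHONE, CUS_NAME) and all values are invented for illustration and are not the contents of Figures 1.3 and 1.4.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE AGENT (AGENT_CODE INTEGER PRIMARY KEY, AGENT_NAME TEXT, AGENT_PHONE TEXT);
    -- File-system style CUSTOMER: the agent's phone is copied into every customer row.
    CREATE TABLE CUSTOMER (CUS_NAME TEXT, AGENT_CODE INTEGER, AGENT_PHONE TEXT);
    INSERT INTO AGENT VALUES (501, 'Agent A', '000-1111');
    INSERT INTO CUSTOMER VALUES ('Customer 1', 501, '000-1111');
    INSERT INTO CUSTOMER VALUES ('Customer 2', 501, '000-1111');
""")

# The phone number is changed in AGENT, but the copies in CUSTOMER are forgotten.
conn.execute("UPDATE AGENT SET AGENT_PHONE = '000-2222' WHERE AGENT_CODE = 501")

# Collect every version of the phone number now stored for agent 501.
versions = {
    row[0]
    for row in conn.execute(
        "SELECT AGENT_PHONE FROM AGENT WHERE AGENT_CODE = 501 "
        "UNION SELECT AGENT_PHONE FROM CUSTOMER WHERE AGENT_CODE = 501"
    )
}
print("versions of the same phone number:", sorted(versions))

# If only the AGENT copy is trusted, a join always returns the single current value.
joined = conn.execute(
    "SELECT C.CUS_NAME, A.AGENT_PHONE "
    "FROM CUSTOMER C JOIN AGENT A ON A.AGENT_CODE = C.AGENT_CODE "
    "ORDER BY C.CUS_NAME"
).fetchall()
print(joined)
```

The first query finds two conflicting versions of one fact; the join shows that keeping a single copy and looking it up removes the conflict.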

NOTE Data that display data inconsistency are also referred to as data that lack data integrity. Data integrity is defined as the condition in which all of the data in the database are consistent with the real-world events and conditions. In other words, data integrity means that:
• Data are accurate; there are no data inconsistencies.
• Data are verifiable; the data will always yield consistent results.

Data entry errors. Data entry errors are more likely to occur when complex entries (such as 12-digit phone numbers) are made in several different files and/or recur frequently in one or more files. In fact, the CUSTOMER file shown in Figure 1.3 contains just such an entry error: the third record in the CUSTOMER file has a transposed digit in the agent's phone number (27-12-410-7100 rather than 27-21-410-1700). It is possible to enter a non-existent sales agent's name and phone number into the CUSTOMER file, but customers are not likely to be impressed if the insurance agency supplies the name and phone number of an agent who does not exist. And should the personnel manager allow a non-existent agent to accrue bonuses and benefits? In fact, a data entry error such as an incorrectly spelled name or an incorrect phone number yields the same kind of data integrity problems.

Data anomalies. The dictionary defines anomaly as an abnormality. Ideally, a field value change should be made in only a single place. Data redundancy, however, fosters an abnormal condition by forcing field value changes in many different locations. Look at the CUSTOMER file in Figure 1.3. If agent Nikita F. Brown decides to get married and move, the agent name, address and phone number are likely to change. Instead of making just a single name and/or phone/address change in a single file (AGENT), you also must make the change each time that agent's name, phone number and address occur in the CUSTOMER file. You could be faced with the prospect of making hundreds of corrections, one for each of the customers served by that agent! The same problem occurs when an agent decides to quit. Each customer served by that agent must be assigned a new agent. Any change in any field value must be correctly made in many places to maintain data integrity. A data anomaly develops when all of the required changes in the redundant data are not made successfully. The data anomalies found in Figure 1.3 are commonly defined as follows:
• Update anomalies. If agent Nikita F. Brown has a new phone number, that number must be entered in each of the CUSTOMER file records in which Ms Brown's phone number is shown. In this case, only three changes must be made. In a large file system, such changes might occur in hundreds or even thousands of records. Clearly, the potential for data inconsistencies is great.
• Insertion anomalies. If only the CUSTOMER file existed, to add a new agent, you would also add a dummy customer data entry to reflect the new agent's addition. Again, the potential for creating data inconsistencies would be great.
• Deletion anomalies. If you delete the customers Amy B. O'Brian, Saajidah Maharaj and Olette K. Snyman, you will also delete Menzi T. Ndlovu's agent data. Clearly, this is not desirable.

1.6 DATABASE SYSTEMS

The problems inherent in file systems make using a database system very desirable. Unlike the traditional file system, in which several files (such as the customer master file and the product file) were stored separately, the database consists of logically related data stored in a single logical data repository.

(The "logical" label reflects the fact that, although the data repository appears to be a single unit to the end user, its contents may actually be physically distributed among multiple data storage facilities and/or locations.) Since the database's data repository is a single logical unit, the database represents a major change in the way end-user data are stored, accessed and managed. The database's DBMS, shown in Figure 1.6, provides numerous advantages over file system management, shown in Figure 1.5, by making it possible to eliminate most of the file system's data inconsistency, data anomaly, data dependency and structural dependency problems. Better yet, the current generation of DBMS software stores not only the data structures, but also the relationships between those structures and the access paths to those structures, all in a central location. The current generation of DBMS software also takes care of defining, storing and managing all required access paths to those components.

Remember that the DBMS is just one of several crucial components of a database system. The DBMS may even be referred to as the database system's heart. However, just as it takes more than a heart to make a human being function, it takes more than a DBMS to make a database system function. In the sections that follow, you'll learn what a database system is, what its components are and how the DBMS fits into the database system picture.

FIGURE 1.6 Contrasting database and file systems
[Figure: in the database system, the Personnel, Sales and Accounting departments reach a single database (Employees, Customers, Sales, Inventory, Accounts) through the DBMS; in the file system, each department works with its own separate files.]

1.6.1 The Database System Environment

The term database system refers to an organisation of components that define and regulate the collection, storage, management and use of data within a database environment. From a general management point of view, the database system is composed of the five major parts shown in Figure 1.7: hardware, software, people, procedures and data.

Let's take a closer look at the five components shown in Figure 1.7:

Hardware. Hardware refers to all of the system's physical devices; for example, computers (microcomputers, workstations, mainframes and servers), storage devices, printers, network devices (hubs, switches, routers and fibre optics) and other devices (automated teller machines, ID readers, etc.).

FIGURE 1.7 The database system environment
[Figure: diagram relating the people (system administrator, database administrator, database designer, analysts, programmers and end users), the procedures and standards, the hardware, the application programs, the DBMS utilities, the DBMS and the data.]

Software. Although the most readily identified software is the DBMS itself, to make the database system function fully, three types of software are needed: operating system software, DBMS software, and application programs and utilities:
• Operating system software manages all hardware components and makes it possible for all other software to run on the computers. Examples of operating system software include Microsoft Windows, Linux, Mac OS, UNIX and MVS.
• DBMS software manages the database within the database system. Some examples of DBMS software include Microsoft Access and SQL Server, Oracle Corporation's Oracle and IBM's DB2.
• Application programs and utility software are used to access and manipulate data in the DBMS and to manage the computer environment in which data access and manipulation take place. Application programs are most commonly used to access data found within the database, and to generate reports, tabulations and other information to facilitate decision making. Utilities are the software tools used to help manage the database system's computer components. For example, all of the major DBMS vendors now provide graphical user interfaces (GUIs) to help create database structures, control database access and monitor database operations.

People. This component includes all users of the database system. On the basis of primary job functions, five types of users can be identified in a database system: systems administrators, database administrators, database designers, systems analysts and programmers, and end users.

Each user type, described below, performs both unique and complementary functions:
• Systems administrators oversee the database system's general operations.
• Database administrators, also known as DBAs, manage the DBMS and ensure that the database is functioning properly.
• Database designers design the database structure. They are, in effect, the database architects. If the database design is poor, even the best application programmers and the most dedicated DBAs cannot produce a useful database environment. Because organisations strive to optimise their data resources, the database designer's job description has expanded to cover new dimensions and growing responsibilities.
• Systems analysts and programmers design and implement the application programs. They design and create the data entry screens, reports and procedures through which end users access and manipulate the database's data.
• End users are the people who use the application programs to run the organisation's daily operations. For example, sales clerks, supervisors, managers and directors are all classified as end users. High-level end users employ the information obtained from the database to make tactical and strategic business decisions.

Online Content The DBA's role is sufficiently important to warrant a detailed exploration in Appendix K, Database Administration, available on the online platform accompanying this book.

Procedures. Procedures are the instructions and rules that govern the design and use of the database system. Procedures are a critical, although occasionally forgotten, component of the system. Procedures play an important role in a company because they enforce the standards by which business is conducted within the organisation and with customers. Procedures are also used to ensure that there is an organised way to monitor and audit both the data that enter the database and the information that is generated through the use of that data.

Data. The word data covers the collection of facts stored in the database. Since data are the raw material from which information is generated, the determination of which data are to be entered into the database and how those data are to be organised is a vital part of the database designer's job.

A database system adds a new dimension to an organisation's management structure. Just how complex this managerial structure is depends on the organisation's size, its functions and its corporate culture. Therefore, database systems can be created and managed at different levels of complexity and with varying adherence to precise standards. For example, compare a local gym membership system with an insurance company's claims system. The gym membership system may be managed by two people, the hardware used is probably a single microcomputer, the procedures are probably simple and the data volume tends to be low. The insurance claims system is likely to have at least one systems administrator, several full-time DBAs and many designers and programmers; the hardware probably includes several mainframes at multiple locations; the procedures are likely to be numerous, complex and rigorous; and the data volume tends to be high.

In addition to the different levels of database system complexity, managers must also take another important fact into account: database solutions must be cost-effective as well as tactically and strategically effective. Producing a million-rand solution to a thousand-rand problem is hardly an example of good database system selection or of good database design and management. Finally, the database technology already in use is likely to affect the selection of a database system.

1.6.2 DBMS Functions

A DBMS performs several important functions that guarantee the integrity and consistency of the data in the database. Most of those functions are transparent to end users, and most can be achieved only through the use of a DBMS. They include data dictionary management, data storage management, data transformation and presentation, security management, multi-user access control, backup and recovery management, data integrity management, database access languages and application programming interfaces, and database communication interfaces.

Data dictionary management. The DBMS stores definitions of the data elements and their relationships (metadata) in a data dictionary. In turn, all programs that access the data in the database work through the DBMS. The DBMS uses the data dictionary to look up the required data component structures and relationships, thus relieving you from having to code such complex relationships in each program. Additionally, any changes made in a database structure are automatically recorded in the data dictionary, thereby freeing you from having to modify all of the programs that access the changed structure. In other words, the DBMS provides data abstraction and it removes structural and data dependency from the system. (You will learn more about how the DBMS provides data abstraction in Chapter 2, Data Models.) For example, Figure 1.8 shows how Oracle's SQL Developer tool presents the data definition for the CUSTOMER table.

FIGURE 1.8 Illustrating metadata with Oracle's SQL Developer
[Figure: screenshot of the CUSTOMER table definition, with the metadata highlighted.]
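The idea of a data dictionary can be sketched without Oracle's tools. The example below assumes SQLite through Python's sqlite3 module, where the catalogue table sqlite_master and the PRAGMA table_info command play the role of the data dictionary; the CUSTOMER columns shown are invented for illustration and are not those of Figure 1.8.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE CUSTOMER ("
    " CUS_CODE INTEGER PRIMARY KEY,"
    " CUS_LNAME TEXT NOT NULL,"
    " CUS_PHONE TEXT)"
)

# SQLite records the table definition in its own catalogue, sqlite_master.
definition = conn.execute(
    "SELECT sql FROM sqlite_master WHERE type = 'table' AND name = 'CUSTOMER'"
).fetchone()[0]
print(definition)

# PRAGMA table_info returns one row per column:
# (cid, name, type, notnull, dflt_value, pk)
columns = conn.execute("PRAGMA table_info(CUSTOMER)").fetchall()
column_names = [col[1] for col in columns]

for cid, name, col_type, notnull, default, pk in columns:
    print(name, col_type, "NOT NULL" if notnull else "", "PK" if pk else "")
```

A program can therefore ask the DBMS what a table looks like rather than hard-coding the structure, which is the structural independence the text describes.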

Data storage management. The DBMS creates and manages the complex structures required for data storage, thus relieving you of the difficult task of defining and programming the physical data characteristics. A modern DBMS provides storage not only for the data, but also for related data entry forms or screen definitions, report definitions, data validation rules, procedural code, structures to handle video and picture formats, etc. Data storage management is also important for database performance tuning. Performance tuning relates to the activities that make the database perform more efficiently in terms of storage and access speed. Although the user sees the database as a single data storage unit, the DBMS actually stores the database in multiple physical data files (see Figure 1.9). Such data files may even be stored on different storage media. Therefore, the DBMS doesn't have to wait for one disk request to finish before the next one starts. In other words, the DBMS can fulfil database requests concurrently. Data storage management and performance tuning issues are addressed in Chapter 13, Managing Database and SQL Performance.

FIGURE 1.9 Illustrating data storage management with Oracle
[Figure: Database Name: PRODORA. The PRODORA database is actually stored in six physical datafiles, organised into six logical tablespaces, located on the E: drive of the database server computer. The Oracle Enterprise Manager Express interface also shows the amount of space used by each of the datafiles. The Oracle Enterprise Manager Express GUI shows the data storage management characteristics for the PRODORA database.]

Data transformation and presentation. The DBMS transforms entered data to conform to required data structures. The DBMS relieves you of the chore of making a distinction between the logical data format and the physical data format. That is, the DBMS formats the physically retrieved data to make it conform to the user's logical expectations. For example, imagine an enterprise database used by a multinational company. An end user in South Africa would expect to enter data such as 11 July 2020 as 11/07/2020. In contrast, the same date would be entered in the United States as 07/11/2020. Regardless of the data presentation format, the DBMS must manage the date in the proper format for each country.

Security management. The DBMS creates a security system that enforces user security and data privacy. Security rules determine which users can access the database, which data items each user can access, and which data operations (read, add, delete or modify) the user can perform. This is especially important in multi-user database systems where many users access the database simultaneously. All database users may be authenticated to the DBMS through a username and password or through biometric authentication such as a fingerprint scan. The DBMS uses this information to assign access privileges to various database components, such as queries and reports.
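The three questions a security rule answers (which user, which data item, which operation) can be sketched as a simple lookup. This is a toy model in plain Python, not how any particular DBMS stores its privileges; the user names, data items and the is_allowed function are all hypothetical.

```python
# Hypothetical security rules: user -> data item -> permitted operations.
SECURITY_RULES = {
    "clerk": {"CUSTOMER": {"read", "add"}},
    "dba": {
        "CUSTOMER": {"read", "add", "delete", "modify"},
        "AGENT": {"read", "add", "delete", "modify"},
    },
}


def is_allowed(user, item, operation):
    """Return True only if a rule grants this user the operation on the item."""
    return operation in SECURITY_RULES.get(user, {}).get(item, set())


print(is_allowed("clerk", "CUSTOMER", "read"))
print(is_allowed("clerk", "CUSTOMER", "delete"))
print(is_allowed("guest", "AGENT", "read"))
```

Anything not explicitly granted is refused, so an unknown user or an unlisted data item is denied by default. A real DBMS applies such rules only after the user has been authenticated.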

Online Content Appendix K, Database Administration, examines data security and privacy issues in greater detail. It is available on the online platform accompanying this book.

Multi-user access control. To provide data integrity and data consistency, the DBMS uses sophisticated algorithms to ensure that multiple users can access the database concurrently without compromising the integrity of the database. Chapter 12, Managing Transactions and Concurrency, covers the details of multi-user access control.

Backup and recovery management. The DBMS provides backup and data recovery to ensure data safety and integrity. Current DBMS systems provide special utilities that allow the DBA to perform routine and special backup and restore procedures. Recovery management deals with the recovery of the database after a failure, such as a bad sector in the disk or a power failure. Such capability is critical to preserving the database's integrity. Appendix K, Database Administration (see online platform), covers backup and recovery issues.

Data integrity management. The DBMS promotes and enforces integrity rules, thus minimising data redundancy and maximising data consistency. The data relationships stored in the data dictionary are used to enforce data integrity. Ensuring data integrity is especially important in transactional database systems. Data integrity and transaction management issues are addressed in Chapter 8, Beginning Structured Query Language, and Chapter 12, Managing Transactions and Concurrency.

Database access languages and application programming interfaces. The DBMS provides data access through a query language. A query language is a non-procedural language, one that lets the user specify what must be done without having to specify how it is to be done. Structured Query Language (SQL) is the de facto query language and data access standard supported by the majority of DBMS vendors. Chapter 8, Beginning Structured Query Language, and Chapter 9, Procedural Language SQL and Advanced SQL, address the use of SQL. The DBMS also provides application programming interfaces to procedural languages such as COBOL, C, Java, Visual Basic.NET and C#. In addition, the DBMS provides administrative utilities used by the DBA and the database designer to create, implement, monitor and maintain the database.

Database communication interfaces. Current-generation DBMSs accept end-user requests via multiple, different network environments. For example, the DBMS might provide access to the database via the internet through the use of Web browsers such as Chrome, Firefox or Edge. In this environment, communications can be accomplished in several ways:
• End users can generate answers to queries by filling in screen forms through their preferred Web browser.
• The DBMS can automatically publish predefined reports on a website.
• The DBMS can connect to third-party systems to distribute information via email or other productivity applications.

Database communication interfaces are examined in greater detail in Chapter 14, Distributed Databases, in Chapter 17, Database Connectivity and Web Technologies, and in Appendix H, Databases in e-Commerce (see online platform).
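Two of the DBMS functions described in this section, integrity rule enforcement and non-procedural data access, can be sketched together. The example assumes SQLite through Python's sqlite3 module; the table layout and values are invented for illustration, and SQLite only enforces foreign keys once the foreign_keys pragma is switched on.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled
conn.executescript("""
    CREATE TABLE AGENT (
        AGENT_CODE INTEGER PRIMARY KEY,
        AGENT_NAME TEXT NOT NULL
    );
    CREATE TABLE CUSTOMER (
        CUS_CODE INTEGER PRIMARY KEY,
        CUS_NAME TEXT NOT NULL,
        AGENT_CODE INTEGER NOT NULL REFERENCES AGENT (AGENT_CODE)
    );
    INSERT INTO AGENT VALUES (501, 'Agent A');
    INSERT INTO CUSTOMER VALUES (10010, 'Customer 1', 501);
""")

# Integrity rule: a customer cannot be assigned to a non-existent agent.
try:
    conn.execute("INSERT INTO CUSTOMER VALUES (10011, 'Customer 2', 999)")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True
print("insert with unknown agent rejected:", rejected)

# Non-procedural access: the query states what is wanted, not how to find it.
names = [
    row[0]
    for row in conn.execute(
        "SELECT C.CUS_NAME "
        "FROM CUSTOMER C JOIN AGENT A ON A.AGENT_CODE = C.AGENT_CODE "
        "WHERE A.AGENT_NAME = 'Agent A'"
    )
]
print(names)
```

The rule is declared once, in the table definition, and the DBMS applies it to every program that inserts customers; no application has to repeat the check.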

1.6.3 Managing the Database System: A Shift in Focus

The introduction of a database system provides a framework in which strict procedures and standards can be enforced. Consequently, the role of the human component changes from an emphasis on programming to a focus on the broader aspects of managing the organisation's data resources and on the administration of the complex database software itself.

The database system makes it possible to tackle far more sophisticated uses of the data resources as long as the database is designed accordingly. The kinds of data structures created within the database and the extent of the relationships among them play a powerful role in determining the effectiveness of the database system.

Although the database system yields considerable advantages over previous data management approaches, database systems do impose significant overheads. For example:

Increased costs. Database systems require sophisticated hardware and software and highly skilled personnel. The cost of maintaining the hardware, software and personnel required to operate and manage a database system can be substantial.

Management complexity. Database systems interface with many different technologies and have a significant impact on a company's resources and culture. The changes introduced by the adoption of a database system must be properly managed to ensure that they help advance the company's objectives. Given the fact that database systems hold crucial company data that are accessed from multiple sources, security issues must be assessed constantly.

System maintenance. To maximise the efficiency of the database system, you must keep your system current. Therefore, you must perform frequent updates and apply the latest patches and security measures to all components. Since database technology advances rapidly, personnel training costs tend to be significant.

Vendor dependence. Given the heavy investment in technology and personnel training, companies may be reluctant to change database vendors. As a consequence, vendors are less likely to offer pricing point advantages to existing customers and those customers may be limited in their choice of database system components.

1.7 PREPARING FOR YOUR DATABASE PROFESSIONAL CAREER

In this chapter, you were introduced to the concepts of data, information, databases and DBMSs. You also learnt that, regardless of what type of database you use (OLTP or OLAP), or what type of database environment you are working in (for example, Oracle, Microsoft or IBM), the success of a database system greatly depends on how well the database structure is designed.

Throughout this book, you will learn the building blocks that lay the foundation for your career as a database professional. Understanding these building blocks and developing the skills to use them effectively will prepare you to work with databases at many different levels within an organisation. A small sample of such career opportunities is shown in Table 1.4.

As you also learnt in this chapter, database technologies are constantly evolving to address new challenges such as large databases, semi-structured and unstructured data, increasing processing speed and lowering costs. While database technologies can change quickly, the fundamental concepts and skills do not. It is our goal that, after you learn the database essentials in this book, you will be ready to apply your knowledge and skills to work with traditional OLTP and OLAP systems as well as cutting-edge, complex database technologies such as:

Very large databases (VLDB). Many vendors are addressing the need for databases that support large amounts of data, usually in the petabyte range. (A petabyte is more than 1 000 terabytes.) VLDB vendors include Oracle Exadata, IBM's Netezza, Greenplum, HP's Vertica and Teradata. VLDB are now being overtaken in market interest by Big Data databases.

Big Data databases. Products such as Cassandra (Facebook) and Bigtable (Google) are using columnar database technologies to support the needs of database applications that manage large amounts of non-tabular data. See more about this topic in Chapter 2.

In-memory databases. Most major database vendors also offer some type of in-memory database support to address the need for faster database processing. In-memory databases store most of their data in primary memory (RAM) rather than in slower secondary storage (hard disks). In-memory databases include IBM's solidDB and Oracle's TimesTen.

Cloud databases. Companies can now use cloud database services to add database systems to their environment quickly, while simultaneously lowering the total cost of ownership of a new DBMS. A cloud database offers all the advantages of a local DBMS, but instead of residing within your organisation's network infrastructure, it resides on the internet. See more about this topic in Chapter 14.

TABLE 1.4 Database career opportunities

Job Title | Description | Sample Skills Required
Database developer | Creates and maintains database-based applications | Programming, database fundamentals, SQL
Database designer | Designs and maintains databases | Systems design, database design, SQL
Database administrator | Manages and maintains DBMS and databases | Database fundamentals, SQL, vendor courses
Database analyst | Develops databases for decision support reporting | SQL, query optimisation, data warehouses, data lakes
Database architect | Designs and implements database environments (conceptual, logical and physical) | DBMS fundamentals, data modelling, SQL, hardware knowledge, etc.
Database consultant | Helps companies leverage database technologies to improve business processes and achieve specific goals | Database fundamentals, data modelling, database design, SQL, DBMS, hardware, vendor-specific technologies, etc.
Database security officer | Implements security policies for data administration | DBMS fundamentals, database administration, SQL, data security technologies, etc.
Cloud Computing Data Architect | Designs and implements the infrastructure for next-generation cloud database systems | Internet technologies, cloud storage technologies, data security, performance tuning, large databases, etc.
Data Scientist | Analyses large amounts of varied data to generate insights, relationships and predictable behaviours | Data analysis, statistics, advanced mathematics, SQL, programming, data mining, machine learning, data visualisation

Weaddress some of these topics in this book, but not all no single book can cover the entire realm of database technologies. This books primary focus is to help you learn database fundamentals, develop your database design skills and master your SQL skills so you will have a head start in becoming a successful database professional. However, you first need to learn about the tools at your disposal. In the

next chapter,

influence

Copyright Editorial

review

2020 has

Cengage deemed

Learning. that

you

willlearn

different

approaches

to

data

management

and how these

approaches

your designs.

any

All suppressed

Rights

Reserved. content

does

May not

not materially

be

copied, affect

scanned, the

overall

or

duplicated, learning

in experience.

whole

or in Cengage

part.

Due Learning

to

electronic reserves

rights, the

right

some to

third remove

party additional

content

may content

be

suppressed at

any

time

from if

the

subsequent

eBook rights

and/or restrictions

eChapter(s). require

it.

Summary

Data are raw facts. Information is the result of processing data to reveal its meaning. Accurate, relevant and timely information is the key to good decision making, and good decision making is the key to organisational survival in a global environment.

Data are usually stored in a database. To implement a database and to manage its contents, you need a database management system (DBMS). The DBMS serves as the intermediary between the user and the database. The database contains the data you have collected and data about data, known as metadata.

Database design defines the database structure. A well-designed database facilitates data management and generates accurate and valuable information. A poorly designed database can lead to bad decision making, and bad decision making can lead to the failure of an organisation.

Databases evolved from manual and then computerised file systems. In a file system, data are stored in independent files, each requiring its own data management programs. Although this method of data management is largely outmoded, understanding its characteristics makes database design easier to understand. Awareness of the problems of file systems can help you avoid similar problems with DBMSs.

Some limitations of file system data management are that it requires extensive programming, system administration can be complex and difficult, making changes to existing structures is difficult, and security features are likely to be inadequate. Also, independent files tend to contain redundant data, leading to problems of structural and data dependency.

Database management systems were developed to address the file system's inherent weaknesses. Rather than depositing data within independent files, a DBMS presents the database to the end user as a single data repository. This arrangement promotes data sharing, thus eliminating the potential problem of islands of information. In addition, the DBMS enforces data integrity, eliminates redundancy and promotes data security.

Open source DBMS software allows users to develop the database system for any purpose, look at the source code and make any improvements, which will then be released back to the general public. Open source DBMSs such as MySQL are currently free to acquire and use, making them ideal for smaller companies and organisations to develop database-centred applications quickly.

Key Terms

ad hoc query
analytical database
business intelligence
centralised database
data
data anomaly
data dependence
data dictionary
data governance
data inconsistency
data independence
data integrity
data management
data processing (DP) manager
data processing (DP) specialist
data quality
data redundancy
data warehouse
database
database design
database management system (DBMS)
database system
desktop database
distributed database
enterprise database
Extensible Markup Language (XML)
field
file
information
islands of information
knowledge
logical data format
metadata
multi-user database
NoSQL
online analytical processing (OLAP)
online transaction processing (OLTP)
open source
operational database
performance tuning
physical data format
production database
query
query language
query result set
record
semi-structured
single-user database
social media
structural dependence
structural independence
Structured Query Language (SQL)
transactional database
workgroup database
XML database

Further Reading

Codd, E.F. The Capabilities of Relational Database Management Systems. IBM Research Report, RJ3132, 1981.
Date, C.J. The Database Relational Model: A Retrospective Review and Analysis: a Historical Account and Assessment of E.F. Codd's Contribution to the Field of Database Technology. Addison-Wesley, 2000.
Date, C.J. An Introduction to Database Systems, 8th edition. Addison-Wesley, 2003.
Date, C.J. Date on Database: Writings 2000-2006. Apress, 2006.

Review Questions

Online Content: Answers to selected Review Questions and Problems for this chapter are available on the online platform accompanying this book.

1 Discuss each of the following terms:
  a data
  b field
  c record
  d file
2 What is data redundancy and which characteristics of the file system can lead to it?
3 Discuss the lack of data independence in file systems.
4 What is a DBMS, and what are its functions?
5 What is structural independence, and why is it important?
6 Explain the difference between data and information.
7 What is the role of a DBMS, and what are its advantages?
8 List and describe the different types of databases.
9 What are the main components of a database system?
10 What is metadata?
11 Explain why database design is important.
12 What are the potential costs of implementing a database system?
13 Use examples to compare and contrast structured and unstructured data. Which type is more prevalent in a typical business environment?
14 What are the six levels on which the quality of data can be examined?
15 Explain what is meant by data governance.

Problems

Online Content: The file structures you see in this problem set are simulated in a Microsoft Access database named Ch01_Problems, available on the online platform for this book.

Given the file structure shown in Figure P1.1, answer Problems 1-4.

1 How many records does the file contain, and how many fields are there per record?
2 What problem would you encounter if you wanted to produce a listing by city? How would you solve this problem by altering the file structure?

FIGURE P1.1 The file structure for Problems 1-4

PROJECT_CODE | PROJECT_MANAGER | MANAGER_PHONE | MANAGER_ADDRESS | PROJECT_BID_PRICE

21-5Z | Holly B. Naidu | 33-5-59200506 | 180 Boulevard Dr, Phoenix, 64700 | 13 179 975.00
25-2D | Jane D. Grant | 0181-898-9909 | 218 Clark Blvd., London, NW3 TRY | 9 787 037.00
25-5A | Menzi F. Zulu | 0181-227-1245 | 124 River Dr., Durban, 4001 | 25 458 005.00
25-9T | Holly B. Naidu | 33-5-59200506 | 180 Boulevard Dr, Phoenix, 64700 | 16 887 181.00
27-4Q | Menzi F. Zulu | 0181-227-1245 | 124 River Dr., Durban, 4001 | 8 078 124.00
29-2D | Holly B. Naidu | 33-5-59200506 | 180 Boulevard Dr, Phoenix, 64700 | 20 014 885.00
31-7P | William K. Moor | 39-064885889 | Via Valgia Silvilla 23, Roma, 00179 | 44 516 677.00

3 If you wanted to produce a listing of the file contents by last name, area code, city, county or postal code, how would you alter the file structure?
4 What data redundancies do you detect, and how could those redundancies lead to anomalies?

FIGURE P1.2 The file structure for Problems 5-8

PROJ_NUM | PROJ_NAME | EMP_NUM | EMP_NAME | JOB_CODE | JOB_CHG_HOUR | PROJ_HOURS | EMP_PHONE
1 | Hurricane | 101 | John D. Dlamini | EE | 65.00 | 13.3 | 31-20-6226060
1 | Hurricane | 105 | David F. Schwann | CT | 40.00 | 16.2 | 0191-234-1123
1 | Hurricane | 110 | Anne R. Ramoras | CT | 40.00 | 14.3 | 34-934412463
2 | Coast | 101 | John D. Dlamini | EE | 65.00 | 19.8 | 31-20-6226060
2 | Coast | 108 | June H. Ndlovu | EE | 65.00 | 17.5 | 0161-554-7812
3 | Satellite | 110 | Anne R. Ramoras | CT | 42.00 | 11.6 | 34-934412463
3 | Satellite | 105 | David F. Schwann | CT | 6.00 | 23.4 | 0191-234-1123
3 | Satelite | 123 | Mary D. Chen | EE | 65.00 | 19.1 | 0181-233-5432
3 | Satellite | 112 | Allecia R. Smith | BE | 65.00 | 20.7 | 0181-678-6879

5 Identify and discuss the serious data redundancy problems exhibited by the file structure shown in Figure P1.2.
6 Looking at the EMP_NAME and EMP_PHONE contents in Figure P1.2, which change(s) would you recommend?
7 Identify the different data sources in the file you examined in Problem 5.
8 Given your answer to Problem 7, which new files should you create to help eliminate the data redundancies found in the file shown in Figure P1.2?

FIGURE P1.3 The file structure for Problems 9-10

BUILDING_CODE | ROOM_CODE | TEACHER_LNAME | TEACHER_FNAME | TEACHER_INITIAL | DAYS_TIME
KOM | 204E | Mbhato | Horace | G | MWF 8:00-8:50
KOM | 123 | Adam | Maria | L | MWF 8:00-8:50
LDB | 504 | Patroski | Donald | J | TTh 1:00-2:15
KOM | 34 | Hawkins | Anne | W | MWF 10:00-10:50
JKP | 225B | Risell | James | | TTh 9:00-10:15
LDB | 301 | Robertson | Jeanette | P | TTh 9:00-10:15
KOM | 204E | Adam | Maria | I | MWF 9:00-9:50
LDB | 504 | Mbhato | Horace | G | TTh 1:00-2:15
KOM | 34 | Adam | Maria | L | MWF 11:00-11:50
LDB | 504 | Patroski | Donald | J | MWF 2:00-2:50

9 Identify and discuss the serious data redundancy problems exhibited by the file structure shown in Figure P1.3. (The file is meant to be used as a teacher class assignment schedule. One of the many problems with data redundancy is the likely occurrence of data inconsistencies: two different initials have been entered for the teacher named Maria Adam.)
10 Given the file structure shown in Figure P1.3, which problem(s) might you encounter if building KOM were deleted?

Chapter 2 Data Models

In this chapter, you will learn:
Why data models are important
About the basic data-modelling building blocks
What business rules are and how they influence database design
How the major data models evolved
How data models can be classified by level of abstraction

Preview

This chapter examines data modelling. Data modelling is the first step in the database design journey, serving as a bridge between real-world objects and the database that resides in the computer.

One of the most pressing problems in database design is that designers, programmers and end users see data in different ways. Consequently, different views of the same data can lead to database designs that do not reflect an organisation's actual operation, failing to meet end-user needs and data efficiency requirements. To avoid such failures, database designers must obtain a precise description of the nature of the data and of the many uses of that data within the organisation. Communication among database designers, programmers and end users should be as free of ambiguities as possible. Data modelling clarifies such communications by reducing the complexities of database design to more easily understood abstractions that define entities and the relations among them.

First, you will learn what some of the basic data-modelling concepts are and how current data models developed from earlier models. Tracing the development of those database models will help you understand the database design and implementation issues that are addressed in the rest of this book. Second, you will be introduced to the entity relationship diagram (ERD) as the data modelling technique used to draw data models. There are a number of different ER model notations, such as Chen and Crow's Foot notation. Next, you will be introduced briefly to the unified modelling language (UML) notation for the object-oriented data model; UML is the emerging industry standard. Whilst traditional data models are still used in legacy systems, you will also learn about the emerging NoSQL data model, which is being used to manage very large social media data sets efficiently and effectively. Finally, you will learn how varying degrees of data abstraction help reconcile different views of the same data.

2.1 The Importance of Data Models

Traditionally, database designers relied on good judgement to help them develop a good design. Unfortunately, good judgement is often in the eye of the beholder, and it often develops after much trial and error. Fortunately, data models (relatively simple representations, usually graphical, of more complex real-world data structures), bolstered by powerful database design tools, have made it possible to diminish the potential for errors in database design substantially. In general terms, a model is an abstraction of a more complex real-world object or event. A model's main function is to help you understand the complexities of the real-world environment. Within the database environment, a data model represents data structures and their characteristics, relationships, constraints and transformations.

Note
The terms data model and database model are often used interchangeably. In this book, the term database model will be used to refer to the implementation of a data model in a specific database system.

Data models can facilitate interaction among the designer, the applications programmer and the end user. A well-developed data model can even foster improved understanding of the organisation for which the database design is developed. This important aspect of data modelling was summed up neatly by a client whose reaction was as follows: 'I created this business, I worked with this business for years, and this is the first time I've really understood how all the pieces really fit together.'

The importance of data modelling cannot be overstated. Data constitute the most basic information units employed by a system. Applications are created to manage data and to help transform data into information. But data are viewed in different ways by different people. For example, contrast the (data) view of a company manager with that of a company clerk. Although the manager and the clerk both work for the same company, the manager is more likely to have an enterprise-wide view of company data than the clerk.

Even different managers view data differently. For example, a company director is likely to take a universal view of the data because he or she must be able to tie the company's divisions to a common (database) vision. A purchasing manager in the same company is likely to have a more restricted view of the data, as is the company's inventory manager. In effect, each department manager works with a subset of the company's data. The inventory manager is more concerned about inventory levels, while the purchasing manager is more concerned about the cost of items and about personal/business relationships with the suppliers of those items.

Applications programmers have yet another view of data, being more concerned with data location, formatting and specific reporting requirements. Basically, applications programmers translate company policies and procedures from a variety of sources into appropriate interfaces, reports and query screens.

The different users and producers of data and information often reflect the 'blindfolded people and the elephant' analogy: the blindfolded person who felt the elephant's trunk had quite a different view of the elephant from those who felt the elephant's leg or tail. What is needed is the ability to see the whole elephant. Similarly, a house is not a random collection of rooms; if someone is going to build a house, he or she should first have the overall view that is provided by blueprints. Likewise, a sound data environment requires an overall database blueprint based on an appropriate data model.

When a good database blueprint is available, it does not matter that an applications programmer's view of the data is different from that of the manager and/or the end user. Conversely, when a good database blueprint is not available, problems are likely to ensue. For instance, an inventory management program or an order entry system may not fit into the overall set of operational data requirements, thereby costing the company thousands (or even millions).

Keep in mind that a house blueprint is an abstraction; you cannot live in the blueprint. Similarly, the data model is an abstraction; you cannot draw the required data out of the data model. Just as you are not likely to build a good house without a blueprint, you are equally unlikely to create a good database without first selecting an appropriate data model.

2.2 Data Model Basic Building Blocks

The basic building blocks of all data models are entities, attributes, relationships and constraints. An entity is anything (a person, a place, a thing or an event) about which data are to be collected and stored. An entity represents a particular type of object in the real world. Entities may be physical objects, such as customers or products; but entities may also be abstractions, such as flight routes or musical concerts.

An attribute is a characteristic of an entity. For example, a CUSTOMER entity would be described by attributes such as customer last name, customer first name, customer phone, customer address and customer credit limit. Attributes are the equivalent of fields in file systems.

A relationship describes an association among entities. For example, a relationship exists between customers and agents that can be described as follows: an agent can serve many customers, and each customer may be served by one agent. Data models use three types of relationships: one-to-many, many-to-many and one-to-one. Database designers usually use the shorthand notations 1:*, *:* and 1:1, respectively. The following examples illustrate the distinctions among the three:

One-to-many (1:*) relationship. A painter paints many different paintings, but each one of them is painted by only one painter. Thus, the painter (the one) is related to the paintings (the many). Therefore, database designers label the relationship PAINTER paints PAINTING as 1:*. (Note that entity names are often capitalised as a convention so they are easily distinguished.) Similarly, a customer (the one) may generate many invoices, but each invoice (the many) is generated by only a single customer. The CUSTOMER generates INVOICE relationship would also be labelled 1:*.

Many-to-many (*:*) relationship. An employee may learn many job skills, and each job skill may be learnt by many employees. Database designers label the relationship EMPLOYEE learns SKILL as *:*. Similarly, a student can take many classes and each class can be taken by many students, thus yielding the *:* relationship label for the relationship expressed by STUDENT takes CLASS.

One-to-one (1:1) relationship. A retail company's management structure may require that each of its stores be managed by a single employee. In turn, each store manager, who is an employee, manages only a single store. Therefore, the relationship EMPLOYEE manages STORE is labelled 1:1.
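The three relationship types can be made concrete with a small sketch. The Python below is only an illustration under stated assumptions: the class names mirror the entities in the text, but the attribute names (`title`, `name`, `city`, `skills`, `manages`) and all sample values are hypothetical, and a real design would express these relationships in a DBMS rather than in application objects.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Painting:
    title: str                                   # attribute of PAINTING


@dataclass
class Painter:
    name: str                                    # attribute of PAINTER
    # 1:* : one painter holds a list of many paintings
    paintings: list = field(default_factory=list)


@dataclass
class Store:
    city: str


@dataclass
class Employee:
    name: str
    # *:* : an employee has many skills; the same skill may appear
    # in the skill sets of many employees
    skills: set = field(default_factory=set)
    # 1:1 : an employee manages at most one store
    manages: Optional[Store] = None


# PAINTER paints PAINTING (1:*)
painter = Painter("Asha")
painter.paintings.append(Painting("Harbour at dawn"))
painter.paintings.append(Painting("Market day"))

# EMPLOYEE learns SKILL (*:*)
ann = Employee("Ann", skills={"SQL", "Modelling"})
raj = Employee("Raj", skills={"SQL"})
learners_of_sql = sorted(e.name for e in (ann, raj) if "SQL" in e.skills)

# EMPLOYEE manages STORE (1:1)
ann.manages = Store("Leeds")

print(len(painter.paintings), learners_of_sql, ann.manages.city)
```

In this sketch the "many" side of a 1:* relationship is a collection held by the "one" side, while the *:* relationship only becomes visible when you look across several employees.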

The preceding discussion identified each relationship in both directions; that is, relationships are bidirectional:

One CUSTOMER can generate many INVOICEs.
Each of the many INVOICEs is generated by only one CUSTOMER.

A constraint is a restriction placed on the data. Constraints are important because they help to ensure data integrity. Constraints are normally expressed in the form of rules; for example:

An employee's salary must have values that are between 6 000 and 350 000.
A student's grade must be between 0 and 100.
Each class must have one and only one teacher.
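As a rough illustration of how such rules might be enforced, the sketch below declares the three constraints in SQLite through Python's standard `sqlite3` module. The table and column names (`employee`, `enrolment`, `class`, `salary`, `grade`, `teacher_id`) are assumptions made for this example and do not come from the text; the "one and only one teacher" rule is approximated by a single mandatory teacher column.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE employee (
    emp_id INTEGER PRIMARY KEY,
    -- salary must lie between 6 000 and 350 000
    salary REAL NOT NULL CHECK (salary BETWEEN 6000 AND 350000)
);
CREATE TABLE enrolment (
    student_id INTEGER NOT NULL,
    -- grade must lie between 0 and 100
    grade INTEGER NOT NULL CHECK (grade BETWEEN 0 AND 100)
);
CREATE TABLE class (
    class_id INTEGER PRIMARY KEY,
    -- exactly one teacher: a single column that may not be empty
    teacher_id INTEGER NOT NULL
);
""")


def try_insert(sql, params):
    """Return True if the row is accepted, False if a constraint rejects it."""
    try:
        conn.execute(sql, params)
        return True
    except sqlite3.IntegrityError:
        return False


ok_salary = try_insert("INSERT INTO employee (salary) VALUES (?)", (42000,))
bad_salary = try_insert("INSERT INTO employee (salary) VALUES (?)", (500,))
bad_grade = try_insert("INSERT INTO enrolment VALUES (?, ?)", (1, 101))
no_teacher = try_insert("INSERT INTO class (teacher_id) VALUES (?)", (None,))

print(ok_salary, bad_salary, bad_grade, no_teacher)
```

Declaring the rules in the table definitions means every application that writes to the database is subject to the same restrictions.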

How do you properly identify entities, attributes, relationships and constraints? The first step is to clearly identify the business rules for the problem domain you are modelling.

2.3 Business Rules

When database designers go about selecting or determining the entities, attributes and relationships that will be used to build a data model, they may start by gaining a thorough understanding of what types of data are in an organisation, how the data are used and in which time frames they are used. But such data and information do not, by themselves, yield the required understanding of the total business. From a database point of view, the collection of data becomes meaningful only when it reflects properly defined business rules. A business rule is a brief, precise and unambiguous description of a policy, procedure or principle within a specific organisation. In a sense, business rules are misnamed: they apply to any organisation, large or small, such as a business, a government unit, a religious group or a research laboratory, that stores and uses data to generate information.

Business rules, derived from a detailed description of an organisation's operations, help to create and enforce actions within that organisation's environment. Business rules must be rendered in writing and updated to reflect any change in the organisation's operational environment.

Properly written business rules are used to define entities, attributes, relationships and constraints. Any time you see relationship statements such as 'an agent can serve many customers, and each customer may be served by one agent', you are seeing business rules at work. You will see the application of business rules throughout this book, especially in the chapters devoted to data modelling and database design.

To be effective, business rules must be easy to understand and widely disseminated to ensure that every person in the organisation shares a common interpretation of the rules. Business rules describe, in simple language, the main and distinguishing characteristics of the data as viewed by the company. Examples of business rules are as follows:

A customer may generate many invoices.
An invoice is generated by only one customer.
A training session cannot be scheduled for fewer than ten employees or for more than 30 employees.

Note that those business rules establish entities, relationships and constraints. For example, the first two business rules establish two entities, CUSTOMER and INVOICE, and a 1:* relationship between those two entities. The third business rule establishes a constraint: no fewer than ten people and no more than 30 people; two entities, EMPLOYEE and TRAINING; and a relationship between EMPLOYEE and TRAINING.
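To show how a written business rule might end up as something a program can check, here is a minimal Python sketch of the three example rules. The function names `can_schedule` and `add_invoice`, the identifiers used as sample data, and the choice to ignore duplicate registrations are all assumptions made for illustration; they are not part of the text.

```python
MIN_ATTENDEES = 10   # "fewer than ten employees" is not allowed
MAX_ATTENDEES = 30   # "more than 30 employees" is not allowed


def can_schedule(employee_ids):
    """Return True when a training session satisfies the attendee rule."""
    count = len(set(employee_ids))          # count each employee once
    return MIN_ATTENDEES <= count <= MAX_ATTENDEES


# invoice number -> customer number: each invoice maps to exactly one
# customer, while one customer may appear against many invoices (1:*)
invoice_owner = {}


def add_invoice(invoice_no, customer_no):
    """Record an invoice, refusing to attach it to a second customer."""
    existing = invoice_owner.get(invoice_no)
    if existing is not None and existing != customer_no:
        raise ValueError("an invoice is generated by only one customer")
    invoice_owner[invoice_no] = customer_no


add_invoice("INV-1", "C-10")
add_invoice("INV-2", "C-10")                # same customer, second invoice

print(can_schedule(range(9)), can_schedule(range(10)), can_schedule(range(31)))
```

The first two rules become a structural choice (a mapping from invoice to a single customer), whereas the third becomes a range check, which mirrors the distinction the text draws between relationships and constraints.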

2.3.1 Discovering

Editorial

establish

two

Cengage deemed

of business

documentation, more

direct

Learning. that

any

All suppressed

Business rules

source

Rights

Reserved. content

rules

such

does

are company

as a companys

of business

May not

not materially

be

copied, affect

rules

scanned, the

overall

or

is

duplicated, learning

managers, procedures,

direct

in experience.

whole

policy

interviews

or in Cengage

part.

makers,

standards

Due Learning

with

to

electronic reserves

end

rights, the

right

department

or operations

some to

users.

third remove

party additional

managers manuals.

Unfortunately,

content

may content

be

because

suppressed at

any

time

and

A faster

from if

the

subsequent

eBook rights

and/or restrictions

eChapter(s). require

it

perceptions differ, end users sometimes are a less reliable source when it comes to specifying business rules. For example, a maintenance department mechanic may believe that any mechanic can initiate a maintenance procedure, when actually only mechanics with inspection authorisation can perform such a task. Such a distinction may seem trivial, but it can have major legal consequences. Although end users are crucial contributors to the development of business rules, it pays to verify end-user perceptions. Too often, interviews with several people who perform the same job yield very different perceptions of what the job components are. While such a discovery may point to management problems, that general diagnosis does not help the database designer. The database designer's job is to reconcile such differences and verify the results of the reconciliation to ensure that the business rules are appropriate and accurate.

The process of identifying and documenting business rules is essential to database design for several reasons:

They help standardise the company's view of data.
They can be a communications tool between users and designers.
They allow the designer to understand the nature, role and scope of the data.
They allow the designer to understand business processes.
They allow the designer to develop appropriate relationship participation rules and constraints and to create an accurate data model.

Of course, not all business rules can be modelled. For example, a business rule that specifies that 'no pilot can fly more than ten hours within any 24-hour period' cannot be modelled. However, such a business rule can be enforced by application software.

2.3.2 Translating Business Rules into Data Model Components

Business rules set the stage for the proper identification of entities, attributes, relationships and constraints. In the real world, names are used to identify objects. If the business environment wants to keep track of the objects, there will be specific business rules for them. As a general rule, a noun in a business rule will translate into an entity in the model, and a verb (active or passive) associating nouns will translate into a relationship among the entities. For example, the business rule 'a customer may generate many invoices' contains two nouns (customer and invoices) and a verb (generate) that associates the nouns. From this business rule, you could deduce that:

Customer and invoice are objects of interest for the environment and should be represented by their respective entities.
There is a 'generate' relationship between customer and invoice.

To properly identify the type of relationship, you should consider that relationships are bidirectional; that is, they go both ways. For example, the business rule 'a customer may generate many invoices' is complemented by the business rule 'an invoice is generated by only one customer'. In that case, the relationship is one-to-many (1:*). Customer is the '1' side, and invoice is the 'many' side.

As a general rule, to properly identify the relationship type, you should ask two questions:

How many instances of B are related to one instance of A?
How many instances of A are related to one instance of B?

For example, you could identify the relationship between student and class by asking two questions:

How many classes can one student enrol in? Answer: Many classes.
How many students can enrol in one class? Answer: Many students.
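The two-question test can be sketched as a small function. This is only an illustration: the function name `relationship_type` and the answer vocabulary ("one" or "many") are assumptions made for this sketch, not part of the text.

```python
def relationship_type(b_per_one_a: str, a_per_one_b: str) -> str:
    """Classify a relationship between entities A and B.

    b_per_one_a: answer to "How many instances of B are related to one
                 instance of A?" ("one" or "many").
    a_per_one_b: answer to "How many instances of A are related to one
                 instance of B?" ("one" or "many").
    Returns the connectivity written as "A side:B side", e.g. "1:*".
    """
    symbol = {"one": "1", "many": "*"}
    if b_per_one_a not in symbol or a_per_one_b not in symbol:
        raise ValueError("answers must be 'one' or 'many'")
    # The A side is labelled with how many A's one B can have,
    # the B side with how many B's one A can have.
    return f"{symbol[a_per_one_b]}:{symbol[b_per_one_a]}"


# A = customer, B = invoice: a customer may generate many invoices,
# and an invoice is generated by only one customer.
print(relationship_type("many", "one"))

# A = student, B = class: many classes per student, many students per class.
print(relationship_type("many", "many"))
```

The customer and invoice answers give a one-to-many connectivity with customer on the 1 side, while the student and class answers give many-to-many.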

Therefore, the relationship between student and class is many-to-many (*:*). As you proceed through this book, you will have many opportunities to determine the relationships between entities and soon the process will become second nature.

2.4 The Evolution of Data Models

The quest for better data management has led to several different models that attempt to resolve the file system's critical shortcomings. These models represent schools of thought as to what a database is, what it should do, the types of structures that it should employ, and the technology that would be used to implement these structures. This section gives an overview of the major data models in roughly chronological order. You will discover that many of the new database concepts and structures bear a remarkable resemblance to some of the old data model concepts and structures. Table 2.1 traces the evolution of the major data models.

Table 2.1 Evolution of major data models

Generation | Time | Data Model | Examples | Comments
First | 1960s-1970s | File system | VMS/VSAM | Used mainly on IBM mainframe systems. Managed records, not relationships
Second | 1970s | Hierarchical and network | IMS, ADABAS, IDS-II | Early database systems. Navigational access
Third | Mid-1970s | Relational | DB2, Oracle, MS SQL Server, MySQL | Conceptual simplicity. Entity relationship (ER) modelling and support for relational data modelling
Fourth | Mid-1980s | Object-oriented, Object/relational (O/R) | Versant, Objectivity/DB, DB2 UDB, Oracle 11g | Object/relational supports object data types. Star Schema support for data warehousing. Web databases become common
Fifth | Mid-1990s | XML, Hybrid DBMS | dbXML, Tamino, DB2 UDB, Oracle 11g, MS SQL Server | Unstructured data support. O/R model supports XML documents. Hybrid DBMS adds object front end to relational databases. Support large databases (terabyte size)
Emerging Models: NoSQL | Late 2000s to present | Key-value store, Column store | SimpleDB (Amazon), Bigtable (Google), Cassandra (Apache) | Distributed, highly scalable. High performance, fault tolerant. Very large storage (petabytes). Suited for sparse data. Proprietary API

Online Content The hierarchical and network models are largely of historical interest, yet they

do still

technical

2

on the

contain

some

details of those two accompanying

model.

However,

focuses

online given the

on that

The hierarchical manufacturing

model (Each

as the

parent

can

is the

devoted

presence

rocket

that

an upside-down of a file

of the segment

children,

to improve

Appendices

to the

professionals.

The

I and J, respectively,

object-orientated model,

(OO)

most of the

tree.

directly

but

each

database

that

The

book

with the

The schema is the conceptual

only

is

organisation

contains

allows

not

levels,

a higher

child.

and its

basic or

layer

The hierarchical

children

segments.

parent.)

of records

a record

used

today,

are still

more effectively than

a database

as a collection

model

models

hierarchy,

data relationships

model

generally

network

a parent

one

The

structure

Within the

and to impose

database

of data for complex

1969.

which is called the

complex

network

model

emerged

type.

between has

moon in

hierarchical

beneath it,

performance

the

on the

record

child

the network model,

manage large amounts

landed

systems

database

hierarchical

network

concepts

database

of the relational

(1:*) relationships

user perceives

the

While the

database

current

Models

Apollo

by

many

model,

unlike

parent.

market

Gis

model was created to represent

model, the

However,

as the

equivalent

parent

have

hierarchical

network

such

a set of one-to-many

The network the

dominant

is represented

A segment

depicts

Appendix

model was developed in the 1960s to

structure

is perceived

that interest

models are discussed in detail in platform.

and network

projects,

segments.

and features

model.

2.4.1 hierarchical

logical

elements

used

in

to

the by

standard.

1:* relationships.

have

more than

definitions

modern

In the

of

data

one

standard

models:

of the entire database as viewed by the database

administrator.

The subschema actually

A data and is

A schema

to

desired

language

work

with the

needs

grew

model became

and

programs


that

Large

the

programs that

database.

which data can be managed

to define the


databases

The lack to

of ad hoc

produce any

applications

the

change

database.

replaced

were required,

query capability

even the simplest

structural

data from

were largely

and

by the

put heavy

reports. database

Because

of the

relational

pressure

Although the

in the

the

could

still

produce

disadvantages

data

on

existing

of the

model in the

1980s.

Model

Shared

of the

sophisticated

drew

they

model was introduced

Data for

Communications

deemed

more

models,

both users and designers.

Editorial

by the application

within

(DDL) enables the database administrator

data independence,

network

The relational

1

data

database.

code required

2.4.2 the relational of

the

defines the environment in

cumbersome.

the

limited

all application

hierarchical

Model

(DML)

data in the

and

too

to generate provided

in

from

data definition language

programmers databases

information

components.

As information

havoc

the

manipulation used

schema

network

defines the portion of the database seen

produce

by E.F. Codd (of IBM) in 1970 in his landmark

Databanks.1

To use an analogy,

ACM,

not materially

be

pp. 377-387,

copied, affect

scanned, the

overall

or

duplicated, learning

The relational

model

the relational

model produced

June

in experience.

whole

represented

a

paper A

major

Relational

breakthrough

an automatic

for

transmission

1970.


database set the In

to replace stage

for

1970,

the standard

a genuine

Codds

simplicity

was

to implement

work

bought

was

expense

efficiency.

Better

desktop

and laptop

computers,

relational

mainframe

ingenious

yet, the

cost

overhead;

preceded

it. Its

conceptual

a fraction

software

software

of

provided

The relational

computers

computer

of computers

costing

database

relational

that

but impractical.

of computer

model. Fortunately,

system

other

databases

Data

Models

41

simplicity

revolution.

considered

at the

the relational

sophisticated

transmission

database

2

power

rapidly

such

Oracle,

the

power

ancestors

as

conceptual

lacked

power

2

as did operating

as their

mainframe

by vendors

time

grew exponentially,

diminished what their

models

at that

did,

grew.

can run

DB2, Informix,

Today

relatively

Ingres

and

vendors.

note The relational in

Chapter

relational

The

database

model

3, Relational

Model

model is

relational

system

hierarchical

model

Arguably

easier

the

relational relational

database

as

data in a way that Each

table

relations,

a

the

in the

implemented

to

in addition

Relational

the

a more detailed

Algebra

discussions

a

performs

in

sophisticated

same

and

Calculus.

relational

basic

discussion In fact,

most of the remaining

functions

to a host of other functions

of the

RDBMS

RDBMS

manages

of tables

in

database

provided

that

the

chapters.

by the

make the relational

is its

all of the

which

data

ability

to

physical

are

hide

the

details,

stored

and

complexities

while the

can

of the

user

manipulate

sees and

the

query

and logical.

consisting

each

through

RDBMS

advantage The

CUSTOMER

4,

to introduce

and implement.

a collection

matrix,

are related

contained

is

seems intuitive

is

example,

Chapter basis for

The

user.

designed

and in

understand

the

is

as the

will serve

DBMS systems, to

chapter

Characteristics,

model

most important

model from

in this

that it

(rDBMS).

and network

database

For

so important

database

management

presented

of a series

other

through

table

in

the

Figure

of row/column sharing

2.1

intersections.

of a field

might

which

contain

Tables,

is

a sales

common

agents

also

to

both

number

called entities.

that

is

also

AGENT table.

Online Content This chapter's databases can be found on the accompanying online platform Figure

for this

The common or her data is

sales are

Kubu

link

For example,

in the

between

agent

stored

even

in

though

the

table.

because

other,

minimum

level

you of

for

can easily

associate

Copyright review

2020 has

are stored

you

Dunne,

redundancy

and

CUSTOMER

enables

you to

tables

shown

in

for

can

the

data

to

eliminate

one table

most

that

tables

Bhengani.

between

and the

determine

CUSTOMER

Kubu

the

in

easily

sales

the tables

Dunnes

agent

is

which

model

redundancies

501,

are independent

The relational

of the

to his

representative

customer

AGENT_CODE

Although

tables.

match the customer

provides

commonly

found

a in

systems. The relationship

Editorial

data

AGENT_CODE

controlled

AGENT

and AGENT tables

customer

customer

of the

Ch02_InsureCo.

For example,

type

(1:1,

1:*

or *:*) is

depicted in Figure 2.2. A relational the

contents

named

CUSTOMER

AGENT tables

of each

the

database

the

another

Bhengani,

matches the

file

book.

2.1 are found

attributes

Cengage deemed

Learning. that

any

within

All suppressed

Rights

those

Reserved. content

does

entities

May not

not materially

be

copied, affect

often

shown

in

a relational

diagram is a representation and the

scanned, the

overall

or

relationships

duplicated, learning

in experience.

whole

or in Cengage

between

part.

Due Learning

to

electronic reserves

schema,

an

example

of the relational those

rights, the

right

of

databases

which

is

entities,

entities.


FIgure Database

2

Systems

2.1

linking

name:

relational

Ch02_InsureCo

AGeNT_

Table

AGeNT_LNAMe

tables name:

AGENT

(first

AGeNT_FNAMe

six attributes)

AGeNT_iNiTiAL

AGeNT_

CODe

AGeNT_PHONe

AreACODe

501

Bhengani

Kubu

B

0161

228-1249

502

Mbaso

Lethiwe

F

0181

882-1244

503

Okon

John

T

0181

123-5589

Link through

Table name:

CUSTOMER

CUS_

CUS_

CUS_

CODe

LNAMe

FNAMe

10010

Ramas

Alfred

10011

Dunne

Leona

10012

Du Toit

10013

Pieterse

10014

Orlando

10015

OBrian

Amy

10016

Brown

James

10017

CUS_

CUS_reNew_

AGeNT_

AreACODe

PHONe

DATe

CODe

A

0181

844-2573

05-Apr-2018

502

K

0161

894-1238

16-Jun-2018

501

0181

894-2285

29-Jan-2018

502

0181

894-2180

14-Oct-2019

502

0181

222-1672

28-Dec-2019

501

B

0161

442-3381

22-Sep-2019

503

G

0181

297-1228

25-Mar-2018

502

0181

290-2556

17-Jul-2019

503

iNiTiAL

W

Jaco

F

Myron

George

Padayachee

10019

CUS_

CUS_

Maelene

Williams

10018

AGENT_CODE

Moloi

Vinaya

G

0161

382-7185

03-Dec-2019

501

Mlilo

K

0181

297-3809

14-Mar-2019

503

In Figure 2.2, the relational diagram shows the connecting fields (in this case, AGENT_CODE) and the relationship type, 1:*. In this example, the CUSTOMER represents the many side because an AGENT can have many CUSTOMERs. The AGENT represents the 1 side because each CUSTOMER has only one AGENT.
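The link through AGENT_CODE can be tried out with a minimal sketch using Python's built-in sqlite3 module. The table and column names follow Figure 2.1, but only a subset of the columns and three of the customer rows are loaded, and the helper names `build_insureco`, `agent_of` and `customers_of` are hypothetical names introduced for this illustration.

```python
import sqlite3


def build_insureco() -> sqlite3.Connection:
    """Create an in-memory extract of the AGENT and CUSTOMER tables."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")  # make SQLite enforce the link
    conn.executescript("""
        CREATE TABLE AGENT (
            AGENT_CODE  INTEGER PRIMARY KEY,
            AGENT_LNAME TEXT NOT NULL,
            AGENT_FNAME TEXT NOT NULL
        );
        CREATE TABLE CUSTOMER (
            CUS_CODE   INTEGER PRIMARY KEY,
            CUS_LNAME  TEXT NOT NULL,
            -- the common attribute: each CUSTOMER row names exactly one AGENT
            AGENT_CODE INTEGER NOT NULL REFERENCES AGENT (AGENT_CODE)
        );
    """)
    conn.executemany(
        "INSERT INTO AGENT VALUES (?, ?, ?)",
        [(501, "Bhengani", "Kubu"), (502, "Mbaso", "Lethiwe"), (503, "Okon", "John")],
    )
    conn.executemany(
        "INSERT INTO CUSTOMER VALUES (?, ?, ?)",
        [(10010, "Ramas", 502), (10011, "Dunne", 501), (10012, "Du Toit", 502)],
    )
    return conn


def agent_of(conn: sqlite3.Connection, cus_lname: str):
    """Return (first name, last name) of the agent serving a customer."""
    return conn.execute(
        """SELECT a.AGENT_FNAME, a.AGENT_LNAME
           FROM CUSTOMER c JOIN AGENT a ON a.AGENT_CODE = c.AGENT_CODE
           WHERE c.CUS_LNAME = ?""",
        (cus_lname,),
    ).fetchone()


def customers_of(conn: sqlite3.Connection, agent_code: int):
    """Return the last names of all customers of one agent (the many side)."""
    rows = conn.execute(
        "SELECT CUS_LNAME FROM CUSTOMER WHERE AGENT_CODE = ? ORDER BY CUS_LNAME",
        (agent_code,),
    )
    return [lname for (lname,) in rows]


conn = build_insureco()
print(agent_of(conn, "Dunne"))
print(customers_of(conn, 502))
```

Matching the AGENT_CODE value stored in a CUSTOMER row against the AGENT table finds the single agent for that customer, while filtering CUSTOMER on one AGENT_CODE value returns the many customers of that agent.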

A relational table stores a collection of related entities. In this respect, the relational database table resembles a file. However, there is one crucial difference between a table and a file: a table yields complete data and structural independence because it is a purely logical structure. How the data are physically stored in the database is of no concern to the user or the designer; the perception is what counts. And this property of the relational database model, explored in depth in the next chapter, became the source of a real database revolution.

Another reason for the relational database model's rise to dominance is its powerful and flexible query language. Relational algebra, which was defined by Codd in 1971, was the basis for many relational query languages and will be introduced in more detail in Chapter 4, Relational Algebra and Calculus. For most relational database software, the query language used is known as Structured Query Language (SQL). SQL is a fourth-generation language (4GL) that allows the user to specify what must be done without specifying how it must be done. The RDBMS uses SQL to translate user queries into instructions for retrieving the requested data. SQL makes it possible to retrieve data with far less effort than any other database or file environment.
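The difference between saying what is wanted and saying how to get it can be seen in a short sketch. It assumes Python's built-in sqlite3 module and a hypothetical three-row extract of the CUSTOMER table; the variable names are illustrative only.

```python
import sqlite3

# Hypothetical extract of CUSTOMER: (CUS_LNAME, AGENT_CODE).
ROWS = [("Ramas", 502), ("Dunne", 501), ("Du Toit", 502)]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE CUSTOMER (CUS_LNAME TEXT, AGENT_CODE INTEGER)")
conn.executemany("INSERT INTO CUSTOMER VALUES (?, ?)", ROWS)

# Declarative: the query states WHAT is wanted (customers of agent 502,
# in name order). How rows are located and sorted is left to the SQL engine.
declarative = [
    lname
    for (lname,) in conn.execute(
        "SELECT CUS_LNAME FROM CUSTOMER WHERE AGENT_CODE = ? ORDER BY CUS_LNAME",
        (502,),
    )
]

# Procedural contrast: the program spells out HOW, record by record,
# much as a file-based application would have to.
procedural = []
for lname, agent_code in ROWS:
    if agent_code == 502:
        procedural.append(lname)
procedural.sort()

print(declarative)
print(procedural)
```

Both approaches return the same customers, but only the second one has to change if the way the rows are stored or scanned changes.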


Figure 2.2 Relational diagram: a relational class diagram

From an end-user perspective, any SQL-based relational database application involves three parts: a user interface, a set of tables stored in the database and the SQL engine. Each of these parts is explained below:

The end-user interface. Basically, the interface allows the end user to interact with the data (by auto-generating SQL code). Each interface is a product of the software vendor's idea of meaningful interaction with the data. You can also design your own customised interface with the help of application generators that are now standard in the database software arena.

A collection of tables stored in the database. In a relational database, all data are perceived to be stored in tables. The tables simply present the data to the end user in a way that is easy to understand. Each table is independent from another. Rows in different tables are related, based on common values in common attributes.

SQL engine. Largely hidden from the end user, the SQL engine executes all queries or data requests. Keep in mind that the SQL engine is part of the DBMS software. The end user uses SQL to create table structures and to perform data access and table maintenance. The SQL engine translates all of those requests into the instructions necessary to perform such tasks largely behind the scenes and without the end user's knowledge. Hence, it's said that SQL is a declarative language that tells what must be done but not how it must be done. (You will learn more about the SQL engine in Chapter 13, Managing Database and SQL Performance.)

Because the RDBMS performs the behind-the-scenes tasks, it is not necessary to focus on the physical aspects of the database. Instead, the chapters that follow will concentrate on the logical portion of the relational database and its design. Furthermore, SQL is covered in detail in Chapter 8, Beginning Structured Query Language, and in Chapter 9, Procedural Language SQL and Advanced SQL.

2.4.3 The Entity Relationship Model

The conceptual simplicity of relational database technology triggered the demand for RDBMSs. In turn, the rapidly increasing transaction and information requirements created the need for more complex

database (For

implementation

example, Complex

2

model

features

that

graphically

activities

would

require

describe

them

a

widely

accepted

Chen first

and their relationships the

relational

the foundation

model

diagram

One of the

between

relationships over

among

data

When the

were illustrated:

were represented database

designers

including

one of the

1976;

that

notation

it

quickly

results.

network

prefer to

models,

it

Although

the

still lacked

the

use a graphical

(er)

was the

graphical

became

popular

database

model

representations

structures

tool in

which

model, or erM,

representation

because

and

of their

ERM

has

of entities

it complemented

combined

was and

how it

between

entities

was

(1:M),

to

provide

model,

most common

more

versions

using

simple

ERD,

of

the

Chens

which uses the

modelling

notation

such

(1:1).

notation

as n

Relationships line.

were

Foot

style

of relationships

relationship

Crows

Chen also

notation

types

and one-to-one

entities

data

Chens

three

through

versions

model,

in the

a relationship.

(M:N)

entities

graphical

of the

to

achieved

related

components. between

ER data

debate

were introduced,

many-to-many

to the

of the

a large

was different

model components

connected

this

This fuelled

originally

model database

made a distinction

early releases

own.

an entity

one-to-many

to

was that it clearly

However in the

basic data

adopted

successful

a kennel.)

database design. ER models are normally represented in an entity

Chens

by a diamond

design tools.

building

modelling.

The relational

attributes

associations

many.

database

than

Because it is easier to examine

designers

model in

structure

them.

have

what exactly

for representing

to indicate

to

yield and

design tool.

which uses graphical

of Peter

to

activities

Thus, the entity relationship

data

concepts.

(erD),

strengths

and the relationships

community

ER data

for tightly structured

relationship

allowed

for

the

more effective

design

hierarchical

database

standard

in a database

database

the

database

in text,

need for

detailed simplicity

over

are pictured.

introduced

the

more

conceptual

make it an effective

than to

Peter

creating

requires

was a vast improvement

entities and their relationships become

thus

a skyscraper

design

relational

structures,

building

Whilst

developed,

notation.

note One of the

more recent

Foot notation James such

the

was originally

Martin. In as n

symbol

of

legacy

UML,

invented

many

many

of Peter

used the

you

Chen. side

organisations

that

produce

larger

entity

UML

standard.

online

with

C. Finkelstein,


and

is

Foot

the

from is

simple

the

Crows

This is

modelling

and

notation

three-pronged

a general

shift

towards

particularly

but are vital to the Foot

The

by Clive Finkelstein2

derived there

true

organisation.

in

It is

notations.

Modelling Language (UML) has been used

diagrams

notation

will therefore

model.

of using the

notation.

Crows

of the Unified class

Foot

is

have

been

emerging

be used to

developed

as the

as a part

industry

data

of the

modelling

model ERDs using relational

concepts.

More in-depth coverage of the Crow's Foot notation is provided in

E, Comparison

of ER

An Introduction

Addison-Wesley,

Foot

and software

Chens

method,

UML notation

Crows

made popular

Although

Crows

hardware

Although

design

book the

Crows

use the

both

as the

were used instead

relationship.

component models.

Content

Appendix

2

relationship

object-orientated

In this

still

known

and later

symbols

The label

on obsolete

are familiar

is

Everest

of the

today

More recently the class diagram to

notations

graphical

by

many

which are running

important

Chens

by Gordon

Foot notation,

to represent

systems

therefore

Crows

to indicate

used

use

versions

Modelling

to Information

Notations,

available

Engineering:

From

on the

Strategic

online

platform.

Planning

to Information

Systems.

1989.


note UML is and

an object-orientated

published

common

and

as

set

databases. model

the

Rather,

website

based

and

The

a language The

and

that

OMG

is

of an

Object

effort

(symbols

UML is

describes

Management

headed

and

constructs)

a set of diagrams

which

OMG

for

the

software

includes

to

for

that

More details

which

data are to

)

2

a

design

developing

can be used to

consortium

UML.

(OMG

develop

analysis,

or procedure

and symbols

not-for-profit

computing,

Group

by the

not a methodology

an international

object

by the

result

notations

mind that

of distributed

on the following

Earlier in this

collected

is

area

the

that can

is

setting

be found

on

www.uml.org/

The ER model is

box.

UML is

sponsored

UML is

Keep in

graphically.

in the

language

1997. diagrams

of systems.

a system

Entity.

in

of object-orientated

modeling

standards

modelling

a standard

stored.

name

generally

in

was defined

is represented

entity,

a noun,

is

capital

letters

and is

or EMPLOYEE

relational

an entity

An entity

of the

written

PAINTERS,

chapter,

components:

rather

model, an entity is

as an entity instance

ERD

in the

centre

written

of the

singular

Usually,

a relational

occurrence

about

by a rectangle,

in the

EMPLOYEES.

mapped to

or entity

in the

written

than

as anything

table.

also

rectangle. form:

known The

as an entity

entity

PAINTER

when applying

be

name

rather

the

than

ERD to the

Each row in the relational

table is

known

in the ER model.

note A collection

of like

entities

is known

Figure 2.3 as a collection depicts

entity

conform

Each

entity

example, a first

sets.

to that

is

name.

entity

Data

can

describe

written

connects

two

entities.

be illustrated:

next to

line. paints

2.3 shows

connectivities.

in the

Copyright Editorial

review

2020 has

ERD

Cengage deemed

Learning. that

any

many

some

All

examine

box.)

Reserved. content

entity

AGENT file in

speaking,

set,

the

and this

ERD

book

will

components.

characteristics

as an employee

does

May

basic (1:*)

number,

Diagrams,

data

of the

data.

of the

entity.

For

a last

name

and

explains

how

Most relationships

model, three

many-to-many

(*:*)

are represented

attributes

describe

of relationships

one-to-one

(1:1).

ERD

(The connectivities

by a relationship

an active

companys

types and

of relationships.

of the relationship,

ERDs that basic

use the UML

or vertically. just

not

among

Relationships

each

the

horizontally

Rights

of the

or passive

DEPARTMENTs

line

verb, is

has

that

written

on the

many EMPLOYEEs;

PAINTINGs.

basic

are immaterial;

suppressed

for

and its

Relationship

to label the types

The name

For example,

As you

may be presented

can think

set. Technically

as a substitute ERD

particular

such

Entity

Within the

one-to-many

entity

entities.

a PAINTER

Figure

each

related

relationship

entity

describes

associations

modellers use the term connectivity are

that

you

AGENT entity

any

attributes with

For example,

ERD.)

between

data

Modelling

use

discussing

will have

Relationships

associations among

5,

set. in the

designers when

by a set of attributes

(Chapter

Relationships.

ERD

practice

EMPLOYEE

in the

as an entity

agents (entities)

Unfortunately,

established

described

the

are included

of three

not materially

be

affect

scanned, the

overall

ERD in

to read

or

Figure

The location

remember

copied,

UML notation

duplicated, learning

in experience.

2.3,

and the

to illustrate note that order in

a 1:* relationship

whole

or in Cengage

part.

Due Learning

to

electronic reserves

rights, right

some to

the

the

third remove

relationships

entities

which the

from

the

these

1

party additional

content

and relationships

entities

are

side to the

may content

and

be

suppressed at

any

time

presented

*

from if

side.

the

Figure 2.3 The basic UML ERD

A One-to-Many (1..*) Relationship: A PAINTER can paint many PAINTINGs; each PAINTING is painted by one PAINTER.
PAINTER (1..1) paints (0..*) PAINTING

A Many-to-Many (*..*) Relationship: An EMPLOYEE can learn many SKILLs; each SKILL can be learned by many EMPLOYEEs.
EMPLOYEE (0..*) learns (0..*) SKILL

A One-to-One (1..1) Relationship: An EMPLOYEE manages one STORE; each STORE is managed by one EMPLOYEE.
EMPLOYEE (1) manages (1) STORE

You should be aware that, typically, the UML class diagram was developed to model object classes and their associations. Because an object class is a collection of similar objects, a class is the equivalent of an entity set in the ER model. Likewise, an association is similar to a relationship, where the degree of participation of an entity in a relationship is often referred to as multiplicities. The only major difference between a UML class and an ER entity is that a blank box is left in the drawing of the UML class to add the names of methods, which are required when developing object-orientated systems. However, from a data modelling perspective this does not affect the structure of the data, and you will use the UML notation to represent relational concepts only. Chapter 5, Data Modelling with Entity Relationship Diagrams, will introduce the concepts of both Crow's Foot notation and the Class Diagram notation in more detail.

Most database modelling tools let you select the UML model diagram option. Microsoft Visio Professional software was used to generate the UML class diagrams you will see in subsequent chapters.

Note Many-to-many (*:*) relationships exist at a conceptual level, and you should know how to recognise them. However, you will learn in Chapter 3, Relational Model Characteristics, that *:* relationships are not appropriate in a relational model.

Online Content For a more detailed description of the Chen, Crow's Foot and other ER model notation systems, see Appendix E, Comparison of ER Model Notations, available on the online platform.


Note: For the purposes of illustration, Figure 2.6 shows alternative Crow's Foot models of the UML ERDs in Figure 2.4.

Figure 2.4 The basic Crow's Foot ERD

As you examine the basic Crow's Foot ERD in Figure 2.4, note that the 1 is represented by a short line segment and the * is represented by the three-pronged Crow's Foot. As with UML, the entities and relationships may be presented horizontally or vertically and the order is again unimportant.

Its exceptional visual simplicity makes the ER model the dominant database modelling and design tool. Nevertheless, the search for better data modelling tools continues as the data environment continues to evolve.

2.4.4 The Object-Orientated (OO) Model

Increasingly complex real-world problems demonstrated a need for a data model that more closely represented the real world. In the object-orientated data model (OODM), both data and their relationships are contained in a single structure known as an object. In turn, the OODM is the basis for the object-orientated database management system (OODBMS).


Online Content: This chapter introduces only basic OO concepts. You'll have a chance to examine object-orientated concepts and principles in detail in Appendix G, Object-Oriented Databases, which can be found on the online platform.

An OODM reflects a different way to define and use entities. Like the relational model's entity, an object is described by its factual content. But quite unlike an entity, an object includes information about relationships between the facts within the object, as well as information about its relationships with other objects. Therefore, the facts within the object are given greater meaning. The OODM is said to be a semantic data model because semantic indicates meaning.

Subsequent OODM development has allowed an object to also contain all operations that can be performed on it, such as changing its data values, finding a specific data value and printing data values. As objects include data, various types of relationships and operational procedures, the object becomes self-contained, thus making the object, at least potentially, a basic building block for autonomous structures.

The OO data model is based on the following components:

An object is an abstraction of a real-world entity. In general terms, an object may be considered equivalent to an ER model's entity. More precisely, an object represents only one individual occurrence of an entity. (The object's semantic content is defined through several of the items in this list.)

Attributes describe the properties of an object. For example, a PERSON object includes the attributes Name, ID Number and Date of Birth.

Objects that share similar characteristics are grouped in classes. A class is a collection of similar objects with shared structure (attributes) and behaviour (methods). In a general sense, a class resembles the ER model's entity set. However, a class is different from an entity set in that it contains a set of procedures known as methods. A class's method represents a real-world action such as finding a selected PERSON's name, changing a PERSON's name or printing a PERSON's address. In other words, methods are the equivalent of procedures in traditional programming languages. In OO terms, methods define an object's behaviour.

Classes are organised in a class hierarchy. The class hierarchy resembles an upside-down tree in which each class has only one parent. For example, the CUSTOMER class and the EMPLOYEE class share a parent PERSON class. (Note the similarity to the hierarchical data model in this respect.)

Inheritance is the ability of an object within the class hierarchy to inherit the attributes and methods of the classes above it. For example, two classes, CUSTOMER and EMPLOYEE, can be created as subclasses from the class PERSON. In this case, CUSTOMER and EMPLOYEE will inherit all attributes and methods from PERSON.

To illustrate the difference in the OO data model and the ER model, examine the representations of a simple invoicing problem shown in Figure 2.5. (Some OO data models have variants in their graphical representation; the object representations here do not include methods.)

As you examine Figure 2.5, note that:

The OO data model represents an object as a box; all of the object's attributes and relationships to other objects are included within the object box. The object representation of the INVOICE includes all related objects within the same object box. Note that the connectivities (1:1 and 1:*) indicate the relationship of the related objects to the INVOICE. For example, the 1:1 next to the CUSTOMER object indicates that each INVOICE is related to only one CUSTOMER. The 1:* next to the LINE object indicates that each INVOICE must contain at least one LINE but can also contain many LINEs.

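Figure 2.5 A comparison of the OO model and the ER model

[Figure: two panels, "OO data model" and "ER model". The OO panel shows an INVOICE object box holding the attributes INV_NUMBER, INV_DATE, INV_SHIP_DATE and INV_TOTAL together with the related objects CUSTOMER (marked 1) and LINE (marked *).]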

The ER model uses three separate entities and two relationships to represent an invoice transaction. As customers can buy more than one item at a time, each invoice references one or more lines, one item per line. And because invoices are generated by customers, the data modelling requirements include a customer entity and a relationship between the customer and the invoice.

The OODM advances influenced many areas, from system modelling to programming. (Most contemporary programming languages have adopted OO concepts, including Java, Ruby, Perl, C# and Visual Studio.) The added semantics of the OODM allowed for a richer representation of complex objects. This in turn enabled applications to support increasingly complex objects in innovative ways.
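The class hierarchy and inheritance described in this section can be sketched in a few lines of Python. This is only an illustrative sketch: the PERSON, CUSTOMER and EMPLOYEE classes and the Name, ID Number and Date of Birth attributes come from the text, while the method names, the sample values and the `credit_limit` and `job_title` attributes are assumptions introduced for the example.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Person:
    """Parent class holding the attributes named in the text."""
    name: str
    id_number: str
    date_of_birth: date

    def change_name(self, new_name: str) -> None:
        """Method: changes one of the object's data values."""
        self.name = new_name

    def describe(self) -> str:
        """Method: returns a printable summary of the object."""
        return f"{type(self).__name__}: {self.name} ({self.id_number})"


@dataclass
class Customer(Person):
    """Subclass: inherits every attribute and method of Person."""
    credit_limit: float = 0.0  # hypothetical extra attribute


@dataclass
class Employee(Person):
    """Subclass: inherits every attribute and method of Person."""
    job_title: str = "unassigned"  # hypothetical extra attribute


# Each object is one individual occurrence of its class.
customer = Customer("Ana Silva", "C-1001", date(1990, 5, 17), credit_limit=500.0)
employee = Employee("Ben Okoro", "E-2002", date(1985, 1, 3), job_title="Driver")

customer.change_name("Ana Costa")  # inherited method
summary = customer.describe()      # inherited method
```

Neither subclass defines `change_name` or `describe`; both behaviours are inherited from the parent class, which is the point the text makes about CUSTOMER and EMPLOYEE inheriting from PERSON.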

Online Content: A useful comparison between the OO and ER model components can be found in Table G.3, located in Appendix G, Object-Orientated Databases, available on the online platform for this book.

It is important to note that not all data models are created equal; some data models are better suited than others to some tasks. For example, conceptual models are better suited to high-level data modelling, while implementation models are better for managing stored data for implementation purposes. The entity relationship model is an example of a conceptual model, while the hierarchical and network models are examples of implementation models. At the same time, some models, such as the relational model and the OODM, could be used as both conceptual and implementation models.


2.4.5 Other Models

Facing the demand to support more complex data representations, the relational model's main vendors evolved the model further and created the extended relational data model (ERDM). The ERDM adds many of the OO model's features within the inherently simpler relational database structure. The ERDM gave birth to a new generation of relational databases that support OO features such as objects (encapsulated data and methods), extensible data types based on classes and inheritance. That's why a DBMS based on the ERDM is often described as an object relational database management system (ORDBMS).

Today, most relational database products can be classified as object relational, and they represent the dominant market share of OLTP and OLAP database applications. The success of the ORDBMS can be attributed to the model's conceptual simplicity, data integrity, easy-to-use query language, high transaction performance, high availability, security, scalability and expandability. In contrast, the OODBMS is popular in niche markets such as computer-aided drawing/computer-aided manufacturing (CAD/CAM), geographic information systems (GIS), telecommunications and multimedia, which require support for more complex objects.

From the start, the OO and relational data models were developed in response to different problems. The OO data model was created to address very specific engineering needs, not the wide-ranging needs of general data management tasks. The relational model was created with a focus on better data management based on a sound mathematical foundation. Given its focus on a smaller set of problem areas, it is not surprising that the OO market has not grown as rapidly as the relational data model market. However, large DBMS vendors such as Oracle readily promote their once relational DBMS now as object relational, with each new release adding new functionality. This gives organisations more choice and flexibility in the design and development of new database applications and in the integration with existing OO applications.

The use of complex objects received a boost with the internet revolution. When organisations integrated their business models with the internet, they realised its potential to access, distribute and exchange critical business information. This resulted in the widespread adoption of the internet as a business communication tool. Within this environment, Extensible Markup Language (XML) emerged as the de facto standard for the efficient and effective exchange of structured, semi-structured and unstructured data. Organisations that use XML data soon realised that they needed to manage large amounts of unstructured data such as word-processing documents, Web pages, emails and diagrams. To address this need, XML databases emerged to manage unstructured data within a native XML format. (See Chapter 17, Database Connectivity and Web Technologies.) At the same time, ORDBMSs added support for XML-based documents within their relational data structure. Due to its robust foundation in broadly applicable principles, the relational model is easily extended to include new classes of capabilities, such as objects and XML.

Modelling spatial data for use in applications such as route optimisation (an ambulance finding the quickest route to a patient) or urban planning requires yet another type of data model. Spatial data comprises objects such as cities or forests that exist in a multi-dimensional space. Storing such data in a relational database would simply take up too much space and queries would be too long and complex to manage. A spatial database management system (SDBMS) is a database system with additional capabilities for handling spatial data. An SDBMS includes spatial data types (SDTs) in its data model and query language, for example the ability to model objects (forests, cities or rivers) in space using types such as POINT, LINE and REGION. The POINT data type refers to the object's centre point in the multi-dimensional space, the LINE data type is used to represent connections in multi-dimensional space, e.g. rivers or roads, and the REGION data type is a representation of an extent, e.g. a lake in a 2-D space. In addition, an SDBMS supports spatial indexing, allowing the fast retrieval of objects in a specific area, and efficient algorithms for supporting spatial joins. SDBMSs are often used to support GIS applications, one of the most popular today being Google Earth.
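The POINT, LINE and REGION types can be made concrete with a small Python sketch. It is not the data model of any particular SDBMS: the class layout, the rectangular simplification of REGION, the city names and the coordinates are all assumptions chosen for illustration, and the area query is a plain scan rather than the spatial index a real system would use.

```python
from dataclasses import dataclass
from math import hypot
from typing import Tuple


@dataclass(frozen=True)
class Point:
    """POINT: a single location in 2-D space."""
    x: float
    y: float


@dataclass(frozen=True)
class Line:
    """LINE: a connection such as a road or river, as an ordered run of points."""
    points: Tuple[Point, ...]

    def length(self) -> float:
        # Sum the straight-line distance between consecutive points.
        return sum(
            hypot(b.x - a.x, b.y - a.y)
            for a, b in zip(self.points, self.points[1:])
        )


@dataclass(frozen=True)
class Region:
    """REGION: an extent, simplified here to an axis-aligned rectangle."""
    min_corner: Point
    max_corner: Point

    def contains(self, p: Point) -> bool:
        return (
            self.min_corner.x <= p.x <= self.max_corner.x
            and self.min_corner.y <= p.y <= self.max_corner.y
        )


# Hypothetical objects: three cities, one road and one search area.
cities = {"Alpha": Point(1, 1), "Beta": Point(5, 5), "Gamma": Point(2, 3)}
road = Line((Point(0, 0), Point(3, 4), Point(3, 10)))
search_area = Region(Point(0, 0), Point(3, 4))

# "Retrieve the objects in a specific area", done here by checking every city.
cities_in_area = sorted(
    name for name, location in cities.items() if search_area.contains(location)
)
road_length = road.length()
```

A spatial index exists precisely to avoid the full scan used for `cities_in_area`; the sketch only shows what the question being answered looks like.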


Although relational and object relational databases address most current data processing needs, a new generation of databases has emerged to address some very specific challenges found in some organisations.

2.4.6 Emerging Data Models: Big Data and NoSQL

Deriving usable business information from the mountains of Web data that organisations have accumulated over the years has become an imperative need. Web data in the form of browsing patterns, purchasing histories, customer preferences, behaviour patterns and social media data from sources such as Facebook, Twitter and LinkedIn have inundated organisations with combinations of structured and unstructured data. According to many studies, the rapid pace of data growth is the top challenge for organisations,3 with system performance and scalability as the next biggest challenges. Today's information technology (IT) managers are constantly balancing the need to manage this rapidly growing data with shrinking budgets. The need to manage and leverage all these converging trends (rapid data growth, performance, scalability and lower costs) has triggered a phenomenon called Big Data. Big Data refers to a movement to find new and better ways to manage large amounts of Web-generated data and derive business insight from it, while simultaneously providing high performance and scalability at a reasonable cost. (You will learn about Big Data and NoSQL in more detail in Chapter 16, Big Data and NoSQL.)

The problem is that the relational approach does not always match the needs of organisations with Big Data challenges:

It is not always possible to fit unstructured, social media data into the conventional relational structure of rows and columns.

Adding millions of rows of multiformat (structured and non-structured) data on a daily basis will inevitably lead to the need for more storage, processing power and sophisticated data analysis tools that may not be available in the relational environment. Generally speaking, the type of high-volume implementations required in the RDBMS environment for the Big Data problem comes with a hefty price tag for expanding hardware, storage and software licences.

Data analysis based on OLAP tools has proven to be very successful in relational environments with highly structured data. However, mining for usable data in the vast amounts of unstructured data collected from Web sources requires a different approach.

There is no one-size-fits-all cure to data management needs (although many established database vendors will probably try to sell you on the idea). For some organisations, creating a highly scalable, fault-tolerant infrastructure for Big Data analysis could prove to be a matter of business survival. The business world has many examples of companies that leverage technology to gain a competitive advantage, and others that miss it. Just ask yourself how the business landscape would be different if:

MySpace had responded to Facebook's challenge in time.

Blockbuster had reacted to the Netflix business model sooner.

Barnes & Noble had developed a viable internet strategy before Amazon.

Therefore, it is not surprising that some organisations are turning to NoSQL databases to mine the wealth of information hidden in mountains of Web data and gain a competitive advantage.

3 See www.gartner.com/en/newsroom/press-releases/2019-02-18-gartner-identifies-top-10-data-and-analytics-technolo, Gartner Identifies Top 10 Data and Analytics Technology Trends for 2019, February 2019.

Note: Does this mean that relational databases don't have a place in organisations with Big Data challenges? No, relational databases remain the preferred and dominant databases to support most day-to-day transactions and structured data analytics needs. Each DBMS technology has its areas of application, and the best approach is to use the best tool for the job. In September 2019, relational databases were still significantly the most dominant DBMS technology in businesses.

2.4.7 NoSQL Databases

Every time you search for a product on Amazon, send messages to friends on Facebook, watch a video on YouTube or search for directions on Google Maps, you are using a NoSQL database. As with any new technology, the term NoSQL can be loosely applied to many different types of technologies. However, this chapter uses NoSQL to refer to a new generation of databases that address the specific challenges of the Big Data era and have the following general characteristics:

Not based on the relational model, hence the name NoSQL.

Supports distributed database architectures.

Provides high scalability, high availability and fault tolerance.

Supports very large amounts of sparse data.

Geared towards performance rather than transaction consistency.

Let's examine these characteristics in more detail.

NoSQL databases are not based on the relational model. In fact, there is no standard NoSQL data model. To the contrary, many different data models are grouped under the NoSQL umbrella, from document databases to graph stores, column stores and key-value stores. It is still too early to know which, if any, of these data models will survive and grow to become a dominant force in the database arena. However, the early success of products such as Amazon's SimpleDB, Google's Bigtable and Apache's Cassandra points to the key-value stores and column stores as the early leaders. The word stores indicates that these data models permanently store data in secondary storage, just like any other database. This added emphasis comes from the fact that these data models originated from programming languages (such as LISP), in which in-memory arrays of values are used to hold data.

The key-value data model is based on a structure composed of two data elements: a key and a value, in which every key has a corresponding value or set of values. The key-value data model is also referred to as the attribute-value or associative data model. To better understand the key-value model, look at the simple example in Figure 2.6.

Figure 2.6 shows the example of a small truck-driving company called Trucks-R-Us. Each of the three drivers has one or more certifications and other general information. Using this example, we can draw the following important points:

In the relational model, every row represents a single entity occurrence and every column represents an attribute of the entity occurrence. Each column has a defined data type.

In the key-value data model, each row represents one attribute of one entity instance. The key column points to an attribute and the value column contains the actual value for the attribute.

Figure 2.6 A simple key-value representation

[Figure: Trucks-R-Us driver data shown twice, once stored using the traditional relational model and once stored using the key-value model (the sample rows include Driver 2732). In the relational model, each row represents one entity instance, each column represents one attribute of the entity, and the values in a column are of the same data type. In the key-value model, each row represents one attribute/value of one entity instance, the key column could represent any entity's attribute, and the values in the value column could be of any data type, so the column is generally assigned a long string data type. SOURCE: Course Technology/Cengage Learning]

The data type of the value column is generally a long string to accommodate the variety of actual data types of the values placed in the column.

To add a new entity attribute in the relational model, you need to modify the table definition. To add a new attribute in the key-value store, you add a row to the key-value store, which is why it is said to be schema-less.

NoSQL databases do not store or enforce relationships among entities. The programmer is required to manage the relationships in the program code. Furthermore, all data and integrity validations must be done in the program code (although some implementations have been expanded to support metadata).

NoSQL databases use their own native application programming interface (API) with simple data access commands, such as put, read and delete. Because there is no declarative SQL-like syntax to retrieve data, the program code must take care of retrieving related data in the correct way.

Indexing and searches can be difficult. Because the value column in the key-value data model could contain many different data types, it is often difficult to create indexes on the data. At the same time, searches can become very complex.

As a matter of fact, you could use the key-value structure as a general data modelling technique when attributes are numerous but actual data values are scarce. The key-value data model is not exclusive to NoSQL databases; key-value data structures could reside inside a relational database. However, because of the problems with maintaining relationships and integrity within the data, and the increased complexity of even simple queries, key-value structures would be a poor design for most structured business data.
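The contrast between the two layouts, and the put, read and delete style of access, can be sketched in Python. This is a toy illustration rather than the API of any NoSQL product: the function names mirror the commands mentioned in the text, driver 2732 is taken from Figure 2.6, and every attribute name and value is an assumption made up for the example.

```python
# Relational style: one row per driver, one fixed column per attribute.
relational_columns = ("driver_id", "first_name", "last_name")
relational_rows = [(2732, "Kim", "Lee")]  # hypothetical values

# Key-value style: one row per attribute of one entity instance.
# Every value is stored as a string, whatever its real data type.
key_value_rows = [
    (2732, "first_name", "Kim"),
    (2732, "last_name", "Lee"),
]


def put(store, entity_id, key, value):
    """Add one attribute/value pair; no table definition has to change."""
    store.append((entity_id, key, str(value)))


def read(store, entity_id):
    """Collect every attribute stored for one entity into a dictionary."""
    return {key: value for stored_id, key, value in store if stored_id == entity_id}


def delete(store, entity_id, key):
    """Remove one attribute of one entity."""
    store[:] = [row for row in store if (row[0], row[1]) != (entity_id, key)]


# A new attribute is just a new row, which is what "schema-less" means here.
put(key_value_rows, 2732, "certification", "Hazmat")
put(key_value_rows, 2732, "hire_year", 2015)

driver = read(key_value_rows, 2732)
hire_year = int(driver["hire_year"])  # the program, not the store, restores the type
```

Adding `certification` to the relational layout would mean changing `relational_columns` and every existing row, whereas the key-value layout only gains a row. The price is visible in the last line: type conversion and any validation are left to the program code.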

Several NoSQL database implementations, such as Google's Bigtable and Apache's Cassandra, have extended the key-value data model to group multiple key-value sets into column families or column stores. In addition, such implementations support features such as versioning using a date/time stamp. For example, Bigtable stores data in the syntax of [row, column, time, value], where row, column and value are string data types and time is a date/time data type. The key used to access the data is composed of (row, column, time), where time can be left blank to indicate the most recent stored value.

NoSQL supports distributed database architectures. One of the big advantages of NoSQL databases is that they generally use a distributed architecture. In fact, several of them (Cassandra, Bigtable) are designed to use low-cost commodity servers to form a complex network of distributed database nodes. Remember that several NoSQL databases originated in the research labs of some of the most successful Web companies, and most started on very small budgets!

NoSQL supports very large amounts of sparse data. NoSQL databases can handle very high volumes of data. In particular, they are suited for sparse data, that is, for cases in which the number of attributes is very large but the number of actual data instances is low. Using the preceding example, drivers can take any certification exam, but they are not required to take all. In this case, if there are three drivers and three possible certificates for each driver, there will be nine possible data points. In practice, however, there are only four data instances. Now extrapolate this example to the case of a clinic with 15 000 patients and more than 500 possible tests, remembering that each patient can take a few tests but is not required to take all.

NoSQL provides high scalability, high availability and fault tolerance. True to its Web origins, NoSQL databases are designed to support Web operations, such as the ability to add capacity in the form of nodes to the distributed database when the demand is high, and to do it transparently and without downtime. Fault tolerance means that, if one of the nodes in the distributed database fails, it will keep operating as normal.

Most NoSQL databases are geared towards performance rather than transaction consistency. One of the biggest problems with very large distributed databases is enforcing data consistency. Distributed databases automatically make copies of data elements at multiple nodes to ensure high availability and fault tolerance. If the node with the requested data goes down, the request can be served from any other node with a copy of the data. However, what happens if the network goes down during a data update? In a relational database, transaction updates are guaranteed to be consistent or the transaction is rolled back. (See Chapter 12, Managing Transactions and Concurrency, to learn more about this topic.) NoSQL databases sacrifice consistency to attain high levels of performance. Some NoSQL databases provide a feature called eventual consistency, which means that updates to the database will propagate through the system and eventually all data copies will be consistent. With eventual consistency, data are not guaranteed to be consistent across all copies of the data immediately after an update.

NoSQL is one of the hottest items in database technologies today. But, as you learnt in Chapter 1, it is only one of many emerging trends in data management. Whichever database technology you use, you need to be able to select the best tool for the job by understanding the advantages and disadvantages of each technology. The following section briefly summarises the evolution of data models and provides some pros and cons of each.
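The [row, column, time, value] layout described earlier in this section, in which a blank time means "the most recent stored value", can be sketched as a small in-memory structure. This is not Bigtable's or Cassandra's actual API: the `VersionedStore` class, its method names, the row and column labels and the sample values are all hypothetical, and the sketch ignores distribution, persistence and consistency entirely.

```python
from datetime import datetime


class VersionedStore:
    """Toy store keyed by (row, column, time), holding one value per version."""

    def __init__(self):
        # Maps (row, column) to a dictionary of {timestamp: value}.
        self._cells = {}

    def put(self, row, column, time, value):
        """Store a value under the full (row, column, time) key."""
        self._cells.setdefault((row, column), {})[time] = value

    def get(self, row, column, time=None):
        """Return the value at `time`, or the most recent one if time is blank."""
        versions = self._cells[(row, column)]
        if time is None:
            time = max(versions)  # latest timestamp wins
        return versions[time]


store = VersionedStore()
store.put("driver:2732", "certification", datetime(2019, 1, 10), "Class B")
store.put("driver:2732", "certification", datetime(2020, 6, 1), "Class A")

latest = store.get("driver:2732", "certification")
earlier = store.get("driver:2732", "certification", datetime(2019, 1, 10))
```

Because older versions are kept rather than overwritten, a reader can ask either for the current value or for the value as of a known timestamp, which is the versioning feature the text attributes to these implementations.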

2.4.8 Data Models: a summary The

evolution

complex Figure

order

be

of data

widely

model

semantic

2020 has

to

A data

model

Cengage deemed

Learning. that

been

driven

by the

of the

search

for

most commonly

new

ways

of

recognised

modelling

data

increasingly

models is shown

in

any

models,

some

All suppressed

than

must represent

Rights

semantics

common

of conceptual

database. the

real

characteristics

that

data

models

must have

the

real

to the

It

does

May not

not materially

be

copied, affect

scanned, the

overall

world

models

or

duplicated, learning

does

not

simplicity

without

make sense

to

compromising

have

a data

the

model

that

is

more

world. as closely data

while data representation

Reserved. content

are some

degree

of the

conceptualise

more

there

accepted:

must show

data behaviour,

review

always

A summary

completeness

difficult

by adding

Copyright

has

data.

evolution to

A data

Editorial

DBMSs

2.7.

In the in

of

real-world

as possible.

representation.

constitutes

in experience.

whole

or in Cengage

part.

the

Due Learning

to

electronic reserves

static

rights, the

right

This

goal is

more

easily

(Semantics

concern

aspect

of the real-world

some to

third remove

party additional

content

may content

be

the

suppressed at

any

time

dynamic

scenario.)

from if

realised

the

subsequent

eBook rights

and/or restrictions

eChapter(s). require

it

Chapter

Representation consistency

of the real-world and integrity

FIgure

2.7

transformations

characteristics

(behaviour)

of any

data

must be in

compliance

2

Data

Models

55

with the

model.

the evolution of data models 2

Semantics in Data

Comments

Model

least

1960

Difficult

Hierarchical

to represent

(hierarchical Structural 1969

Network

1970

Relational

level

No ad

hoc

Access

path

dependency

queries

(record-at-a-time

predefined

Conceptual

access)

(navigational

simplicity

access)

(structural

independence)

Provides ad hoc queries (SQL) Set-oriented

1976

M:N relationships

only)

Entity Relationship

Easy to

understand

Limited

to

(no

1983

access (more

conceptual

semantics) modeling

implementation

component)

Internet is born

Semantic

1978

More semantics Support

in

data

complex

Inheritance

1990

1985

for

(class

model

objects hierarchy)

Behaviour Extended

Object-Oriented

(O/R

most

Relational

Unstructured

DBMS)

XML

Addresses

2009 Big

Data

Big

data

data

Data problem

Lesssemantics in data model

NoSQL

Based on schema-less

key-value

Best suited for large

sparse

SOURCE: Course Technology/Cengage Learning

Each new data model capitalised on the shortcomings of previous models. The network model replaced the hierarchical model because the former made it much easier to represent complex (many-to-many) relationships. In turn, the relational model offered several advantages over the hierarchical and network models through its simpler data representation, superior data independence and easy-to-use query language; it also emerged as the dominant data model for business applications. The OO data model introduced support for complex data within a rich semantic framework. The ERDM added many OO features to the relational model and allowed it to maintain strong market share within the business environment. In recent years, the Big Data phenomenon has stimulated the development of alternative ways to model, store and manage data that represents a break with traditional data management.

It is important to note that not all data models are created equal; some data models are better suited than others for some tasks. For example, conceptual models are better suited to high-level data modelling, while implementation models are better for managing stored data for implementation purposes. The entity relationship model is an example of a conceptual model, while the hierarchical and network models are examples of implementation models. At the same time, some models, such as the relational model and the OODM, could also be used as both conceptual and implementation models. Table 2.2 summarises the advantages and disadvantages of the various database models.


TABLE 2.2 Advantages and disadvantages of various database models

Hierarchical
Data independence: Yes. Structural independence: No.
Advantages
1. It promotes data sharing.
2. Parent/child relationship promotes conceptual simplicity.
3. Database security is provided and enforced by DBMS.
4. Parent/child relationship promotes data integrity.
5. It is efficient with 1:M relationships.
Disadvantages
1. Complex implementation requires knowledge of physical data storage characteristics.
2. Navigational system yields complex application development, management and use; requires knowledge of hierarchical path.
3. Changes in structure require changes in all application programs.
4. There are implementation limitations (no multiparent or M:N relationships).
5. There is no data definition or data manipulation language in the DBMS.
6. There is a lack of standards.

Network
Data independence: Yes. Structural independence: No.
Advantages
1. Conceptual simplicity is at least equal to that of the hierarchical model.
2. It handles more relationship types, such as M:N and multiparent.
3. Data access is more flexible than in hierarchical and file system models.
4. Data owner/member relationship promotes data integrity.
5. There is conformance to standards.
6. It includes data definition language (DDL) and data manipulation language (DML) in DBMS.
Disadvantages
1. System complexity limits efficiency; it is still a navigational system.
2. Navigational system yields complex implementation, application development and management.
3. Structural changes require changes in all application programs.

Relational
Data independence: Yes. Structural independence: Yes.
Advantages
1. Structural independence is promoted by the use of independent tables. Changes in a table's structure do not affect data access or application programs.
2. Tabular view substantially improves conceptual simplicity, thereby promoting easier database design, implementation, management and use.
3. Ad hoc query capability is based on SQL.
4. Powerful RDBMS isolates the end user from physical-level details and improves implementation and management simplicity.
Disadvantages
1. The RDBMS requires substantial hardware and system software overhead.
2. Conceptual simplicity gives relatively untrained people the tools to use a good system poorly and, if unchecked, it may produce the same data anomalies found in file systems.
3. It may promote islands of information problems as individuals and departments can easily develop their own applications.

Entity Relationship
Data independence: Yes. Structural independence: Yes.
Advantages
1. Visual modelling yields exceptional conceptual simplicity.
2. Visual representation makes it an effective communication tool.
3. It is integrated with the dominant relational model.
Disadvantages
1. There is limited constraint representation.
2. There is limited relationship representation.
3. There is no data manipulation language.
4. Loss of information content occurs when attributes are removed from entities to avoid crowded displays. (This limitation has been addressed in subsequent graphical versions.)

Object-Orientated
Data independence: Yes. Structural independence: Yes.
Advantages
1. Semantic content is added.
2. Visual representation includes semantic content.
3. Inheritance promotes data integrity.
Disadvantages
1. Slow development of standards caused vendors to supply their own enhancements, thus eliminating a widely accepted standard.
2. It is a complex navigational system.
3. There is a steep learning curve.
4. High system overhead slows transactions.

NoSQL
Data independence: Yes. Structural independence: Yes.
Advantages
1. High scalability, availability and fault tolerance are provided.
2. It uses low-cost commodity hardware.
3. It supports Big Data.
4. Key-value model improves storage efficiency.
Disadvantages
1. Complex programming is required.
2. There is no relationship support; relationships are handled only by application code.
3. There is no transaction integrity support.
4. In terms of data consistency, it provides an eventually consistent model.

2.5 Degrees of Data Abstraction

In the early 1970s, the American National Standards Institute (ANSI) Standards Planning and Requirements Committee (SPARC) defined a framework for data modelling based on degrees of data abstraction. To illustrate the meaning of data abstraction, consider the example of automotive design. A car designer begins by drawing the concept of the car that is to be produced. Next, engineers design the details that help transfer the basic concept into a structure that can be produced. Finally, the engineering drawings are translated into production specifications to be used on the factory floor. As you can see, the process of producing the car begins at a high level of abstraction and proceeds to an ever-increasing level of detail. The factory floor process cannot proceed unless the engineering details are properly specified, and the engineering details cannot exist without the basic conceptual framework created by the designer.

Designing a usable database follows the same basic process. That is, a database designer starts with an abstract view of the overall data environment and adds details as the design comes closer to implementation. Using levels of abstraction can also be very helpful in integrating multiple (and sometimes conflicting) views of data as seen at different levels of an organisation.

The ANSI/SPARC architecture (as it is often referred to) defines three levels of data abstraction: external, conceptual and internal. You can use this framework to better understand database models, as shown in Figure 2.8. In the figure, the ANSI/SPARC framework has been expanded with the addition of a physical model to address physical-level implementation details of the internal model explicitly.

FIGURE 2.8 Data abstraction levels

[Figure: end-user views feed the external models; the external models are integrated into the conceptual model (the designer's view); the conceptual model maps to the internal model (the DBMS view), separated from it by logical independence; the internal model maps to the physical model, separated from it by physical independence. A degree-of-abstraction scale (high, medium, low) places the ER, object-orientated, relational, network and hierarchical models alongside their characteristics, which range from hardware-independent and software-independent (high), through hardware-independent but software-dependent (medium), to hardware-dependent and software-dependent (low).]


2.5.1 The External Model

The external model is the end users' view of the data environment. The term end users refers to people who use the application programs to manipulate the data and generate information. End users usually operate in an environment in which an application has a specific business unit focus. Companies are generally divided into several business units, such as sales, finance and marketing. Each business unit is subject to specific constraints and requirements, and each one uses a data subset of the overall data in the organisation. Therefore, end users working within those business units view their data subsets as separate from or external to those of other units within the organisation.

As data is being modelled, ER diagrams will be used to represent the external views. A specific representation of an external view is known as an external schema. To illustrate the external model's view, examine the data environment of Tiny University. Figure 2.9 (a) and (b) presents the external schemas for two Tiny University business units: student registration and class scheduling. Each external schema includes the appropriate entities, relationships, processes and constraints imposed by the business unit. Also note that, although the application views are isolated from each other, each view shares a common entity with the other view. For example, the registration and scheduling external schemas share the entities CLASS and COURSE.

Note the entity relationships represented in Figure 2.9. For example:

• A LECTURER may teach many CLASSes, and each CLASS is taught by only one LECTURER; that is, there is a 1:* relationship between LECTURER and CLASS.

• A CLASS may ENROL many students, and each student may ENROL in many CLASSes, thus creating a *:* relationship between STUDENT and CLASS. (You will learn about the precise nature of the ENROL entity in Chapter 5, Data Modelling with Entity Relationship Diagrams.)

• Each COURSE may generate many CLASSes, but each CLASS references a single COURSE. For example, there may be several classes (sections) of a database course having a course code of CIS-420. One of those classes may be offered on Mondays, Wednesdays and Fridays from 8:00 a.m. to 8:50 a.m., another may be offered on Mondays, Wednesdays and Fridays from 1:00 p.m. to 1:50 p.m., while a third may be offered on Thursdays from 6:00 p.m. to 8:40 p.m. Yet all three classes have the course code CIS-420.

• Finally, a CLASS requires one ROOM, but a ROOM may be scheduled for many CLASSes; that is, each classroom may be used for several classes: one at 9:00 a.m., one at 11:00 a.m., and one at 1:00 p.m., for example. In other words, there is a 1:* relationship between ROOM and CLASS.

The use of external views representing subsets of the database has some important advantages:

• It makes it easy to identify specific data required to support each business unit's operations.

• It makes the designer's job easy by providing feedback about the model's adequacy. Specifically, the model can be checked to ensure that it supports all processes as defined by their external models, as well as all operational requirements and constraints.

• It helps to ensure security constraints in the database design. Damaging an entire database is more difficult when each business unit works with only a subset of data.

• It makes application program development much simpler.
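One way to picture an external schema as a subset of the overall data is with relational views. The sketch below is an illustration only, not the book's implementation. It uses Python's built-in sqlite3 module and a cut-down version of the Tiny University data (only COURSE, ROOM and CLASS); the column names, view names, room code and sample rows are assumptions invented for the example.

```python
import sqlite3

# In-memory database; every column name and sample value here is an
# illustrative assumption, not taken from the book's figures.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE COURSE (COURSE_CODE TEXT PRIMARY KEY, COURSE_TITLE TEXT);
    CREATE TABLE ROOM   (ROOM_CODE TEXT PRIMARY KEY);
    CREATE TABLE CLASS  (
        CLASS_CODE  INTEGER PRIMARY KEY,
        COURSE_CODE TEXT NOT NULL REFERENCES COURSE(COURSE_CODE),
        ROOM_CODE   TEXT NOT NULL REFERENCES ROOM(ROOM_CODE),
        CLASS_TIME  TEXT
    );

    -- Registration sees classes and their courses, but not rooms.
    CREATE VIEW REGISTRATION_VIEW AS
        SELECT c.CLASS_CODE, c.COURSE_CODE, k.COURSE_TITLE, c.CLASS_TIME
        FROM CLASS c JOIN COURSE k ON k.COURSE_CODE = c.COURSE_CODE;

    -- Scheduling sees classes and the rooms they occupy.
    CREATE VIEW SCHEDULING_VIEW AS
        SELECT c.CLASS_CODE, c.COURSE_CODE, c.ROOM_CODE, c.CLASS_TIME
        FROM CLASS c;

    INSERT INTO COURSE VALUES ('CIS-420', 'Database course');
    INSERT INTO ROOM VALUES ('R101');
    INSERT INTO CLASS VALUES (1, 'CIS-420', 'R101', 'MWF 8:00-8:50');
    INSERT INTO CLASS VALUES (2, 'CIS-420', 'R101', 'MWF 13:00-13:50');
    INSERT INTO CLASS VALUES (3, 'CIS-420', 'R101', 'Th 18:00-20:40');
""")


def view_columns(name):
    """Return the column names exposed by a table or view."""
    cur = conn.execute(f"SELECT * FROM {name} LIMIT 0")
    return [d[0] for d in cur.description]


registration_cols = view_columns("REGISTRATION_VIEW")
scheduling_cols = view_columns("SCHEDULING_VIEW")

# Three classes (sections) share the single course code CIS-420.
class_count = conn.execute(
    "SELECT COUNT(*) FROM REGISTRATION_VIEW WHERE COURSE_CODE = 'CIS-420'"
).fetchone()[0]
```

Both views share the CLASS and COURSE data, yet each business unit works only with the columns relevant to it, which mirrors the security and simplicity advantages listed above.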

FIGURE 2.9 External models for Tiny University

[Figure (a) Student registration: STUDENT (1..1) enrols_in ENROL (1..6); ENROL (1..35) is_taken_by CLASS (1..1); COURSE (1..1) generates CLASS (1..*). Notes: a student may take up to six classes per registration; a class is limited to 35 students.]

[Figure (b) Class scheduling: ROOM (1..1) is_used_for CLASS (1..*); COURSE (1..1) generates CLASS (1..*); LECTURER (1..1) teaches CLASS (1..3). Notes: a room may be used to teach many classes; each class is taught in only one room; each class is taught by one lecturer; a lecturer may teach up to three classes.]

2.5.2 The Conceptual Model

Having identified the external views, a conceptual model is used, graphically represented by an ERD (Figure 2.10), to integrate all external views into a single view. The conceptual model represents a global view of the entire database. It is a representation of data as viewed by the entire organisation. That is, the conceptual model integrates all external views (entities, relationships, constraints and processes) into a single global view of the entire data in the enterprise, known as a conceptual schema. The conceptual schema is the basis for the identification and high-level description of the main data objects (avoiding any database model specific details).

The most widely used conceptual model is the ER model. Remember that the ER model is illustrated with the help of the ERD, which is, in effect, the basic database blueprint. The ERD is used to graphically represent the conceptual schema.

The conceptual model yields some very important advantages. First, it provides a relatively easily understood bird's-eye (macro-level) view of the data environment. For example, you can get a summary of Tiny University's data environment by examining the conceptual model presented in Figure 2.10.

Second, the conceptual model is independent of both software and hardware. Software independence means that the model does not depend on the DBMS software used to implement the model. Hardware independence means that the model does not depend on the hardware used in the implementation of the model. Therefore, changes in either the hardware or the DBMS software will have no effect on the database design at the conceptual level. Generally, the term logical design is used to refer to the task of creating a conceptual data model that could be implemented in any DBMS.

FIGURE 2.10 Conceptual model for Tiny University

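The integration of external views into one conceptual schema can be pictured, in a very reduced form, as a set union. The sketch below is only an illustration in Python: it merges the entity names of the two external schemas of Figure 2.9 and ignores the relationships, constraints and processes that a real conceptual schema would also have to reconcile. The variable names are assumptions made for the example.

```python
# Entities that appear in the two external schemas of Figure 2.9.
registration = {"STUDENT", "ENROL", "CLASS", "COURSE"}
scheduling = {"ROOM", "CLASS", "COURSE", "LECTURER"}

# The conceptual schema integrates every external view into one global view.
conceptual = registration | scheduling

# Entities present in both views are the points where the views connect.
shared = registration & scheduling

print(sorted(conceptual))
print(sorted(shared))
```

The union contains the six entities of the Tiny University model, and the intersection contains CLASS and COURSE, the two entities that the registration and scheduling views have in common.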

2.5.3 The Internal Model

Once a specific DBMS has been selected, the internal model maps the conceptual model to the DBMS. The internal model is the representation of the database as seen by the DBMS. In other words, the internal model requires the designer to match the conceptual model's characteristics and constraints to those of the selected implementation model. An internal schema depicts a specific representation of an internal model, using the database constructs supported by the chosen database.

Since this book focuses on the relational model, a relational database was chosen to implement the internal model. Therefore, the internal schema should map the conceptual model to the relational model constructs. In particular, the entities in the conceptual model are mapped to tables in the relational model. Likewise, since a relational database has been selected, the internal schema is expressed using SQL, the standard language for relational databases. In the case of the conceptual model for Tiny University depicted in Figure 2.10, the internal model was implemented by creating the tables LECTURER, COURSE, CLASS, STUDENT, ENROL and ROOM. A simplified version of the internal model for Tiny University is shown in Figures 2.11 (a) and (b).

The development of a detailed internal model is especially important to database designers who work with hierarchical or network models because those models require very precise specification of data storage location and data access paths. In contrast, the relational model requires less detail in its internal model because most RDBMSs handle data access path definition transparently; that is, the designer need not be aware of the data access path details.
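The mapping from conceptual entities to relational tables can be sketched in SQL. The example below is a minimal illustration, not the schema of Figure 2.11: it runs the SQL through Python's built-in sqlite3 module, the six table names come from the text, and every column name and sample row is an assumption. Using ENROL to link STUDENT and CLASS is a common way to implement a *:* relationship, assumed here for the purpose of the sketch.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when asked

# Each entity of the conceptual model becomes one table. Column names are
# assumptions for illustration; the source names only the tables.
conn.executescript("""
    CREATE TABLE LECTURER (LECT_ID INTEGER PRIMARY KEY, LECT_NAME TEXT);
    CREATE TABLE COURSE   (COURSE_CODE TEXT PRIMARY KEY, COURSE_TITLE TEXT);
    CREATE TABLE ROOM     (ROOM_CODE TEXT PRIMARY KEY);
    CREATE TABLE STUDENT  (STU_ID INTEGER PRIMARY KEY, STU_NAME TEXT);

    -- 1:* relationships: the "many" side (CLASS) carries the foreign keys.
    CREATE TABLE CLASS (
        CLASS_CODE  INTEGER PRIMARY KEY,
        COURSE_CODE TEXT    NOT NULL REFERENCES COURSE(COURSE_CODE),
        LECT_ID     INTEGER NOT NULL REFERENCES LECTURER(LECT_ID),
        ROOM_CODE   TEXT    NOT NULL REFERENCES ROOM(ROOM_CODE)
    );

    -- The *:* relationship between STUDENT and CLASS goes through ENROL.
    CREATE TABLE ENROL (
        STU_ID     INTEGER NOT NULL REFERENCES STUDENT(STU_ID),
        CLASS_CODE INTEGER NOT NULL REFERENCES CLASS(CLASS_CODE),
        PRIMARY KEY (STU_ID, CLASS_CODE)
    );
""")

conn.execute("INSERT INTO LECTURER VALUES (1, 'Lecturer A')")
conn.execute("INSERT INTO COURSE VALUES ('CIS-420', 'Database course')")
conn.execute("INSERT INTO ROOM VALUES ('R101')")
conn.execute("INSERT INTO STUDENT VALUES (100, 'Student X')")
conn.execute("INSERT INTO CLASS VALUES (1, 'CIS-420', 1, 'R101')")
conn.execute("INSERT INTO ENROL VALUES (100, 1)")

# The query states what is wanted; the DBMS chooses the access path.
rows = conn.execute("""
    SELECT s.STU_NAME, c.COURSE_CODE, l.LECT_NAME
    FROM ENROL e
    JOIN STUDENT  s ON s.STU_ID = e.STU_ID
    JOIN CLASS    c ON c.CLASS_CODE = e.CLASS_CODE
    JOIN LECTURER l ON l.LECT_ID = c.LECT_ID
""").fetchall()

tables = [r[0] for r in conn.execute(
    "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name")]
```

Note that the query names tables and columns only. Nothing in it says how the rows are to be located, which is the sense in which an RDBMS handles access paths transparently.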

Nevertheless, even relational database software usually requires data storage location specification, especially in a mainframe environment. For example, DB2 requires that the data storage group, the location of the database within the storage group, and the location of the tables within the database be specified.

Because the internal model depends on specific database software, it is said to be software-dependent. Therefore, a change in the DBMS software requires that the internal model be changed to fit the characteristics and requirements of the implementation database model. When you can change the internal model without affecting the conceptual model, you have logical independence. However, the internal model is still hardware-independent, because it is unaffected by the choice of the computer on which the software is installed. Therefore, a change in storage devices or even a change in operating systems will not affect the internal model.

2.5.4 The Physical Model

The physical model operates at the lowest level of abstraction, describing the way data are saved on storage media such as disks or tapes. The physical model requires the definition of both the physical storage devices and the (physical) access methods required to reach the data within those storage devices, making it both software- and hardware-dependent. The storage structures used are dependent on the software (DBMS, operating system) and on the type of storage devices that the computer can handle. The precision required in the physical model's definition demands that database designers who work at this level have a detailed knowledge of the hardware and software used to implement the database design.

Early data models forced the database designer to take the details of the physical model's data storage requirements into account. However, the now-dominant relational model is aimed largely at the logical rather than the physical level; therefore, it does not require the physical-level details common to its predecessors.

FIGURE 2.11 An internal model for Tiny University

Although the relational model does not require the designer to be concerned about the data's physical storage characteristics, the implementation of a relational model may require physical-level fine-tuning for increased performance. Fine-tuning is especially important when very large databases are installed in a mainframe environment. Yet even such performance fine-tuning at the physical level does not require knowledge of physical data storage characteristics.

As noted earlier, the physical model is dependent on the DBMS, file level access methods and types of hardware storage devices supported by the operating system. When you can change the physical model without affecting the internal model, you have physical independence. Therefore, a change in storage devices or methods and even a change in operating system will not affect the internal model.

A summary of the levels of data abstraction is given in Table 2.3.

TABLE 2.3 Levels of data abstraction

The degree of abstraction decreases from high (external model) to low (physical model).

Model | Focus | Independent of
External | End-user views | Hardware and software
Conceptual | Global view of data (independent of database model) | Hardware and software
Internal | Specific database model | Hardware
Physical | Storage and access methods | Neither hardware nor software
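Physical independence, as described in Section 2.5.4, can be illustrated with a small experiment. The sketch below is an assumption-laden illustration using Python's built-in sqlite3 module: adding an index stands in for physical-level fine-tuning, the table layout is cut down to two invented columns, and the course code ACCT-211 is a hypothetical value added only so that the filter has something to exclude.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE CLASS (CLASS_CODE INTEGER PRIMARY KEY, COURSE_CODE TEXT)"
)
conn.executemany(
    "INSERT INTO CLASS VALUES (?, ?)",
    [(1, "CIS-420"), (2, "CIS-420"), (3, "CIS-420"), (4, "ACCT-211")],
)

QUERY = "SELECT CLASS_CODE FROM CLASS WHERE COURSE_CODE = ? ORDER BY CLASS_CODE"

before = conn.execute(QUERY, ("CIS-420",)).fetchall()

# Physical-level tuning: add an index (an access method). The table
# definition and the query text are left untouched.
conn.execute("CREATE INDEX CLASS_COURSE_IDX ON CLASS (COURSE_CODE)")

after = conn.execute(QUERY, ("CIS-420",)).fetchall()

print(before == after)
```

The same query text returns the same rows before and after the index is created. Only the way the DBMS reaches the data may change, which is the point of separating the physical model from the internal model.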

Summary

• A data model is a (relatively) simple abstraction of a complex real-world data environment. Database designers use data models to communicate with applications programmers and end users. The basic data-modelling components are entities, attributes, relationships and constraints. Business rules are used to identify and define the basic modelling components within a specific real-world environment.

• The hierarchical and network data models were early data models that are no longer used, but some of the concepts are found in current data models.

• The relational model is the current database implementation standard. In the relational model, the end user perceives the data as being stored in tables. Tables are related to each other by means of common values in common attributes. The entity relationship (ER) model is a popular graphical tool for data modelling that complements the relational model. The ER model allows database designers to visually present different views of the data as seen by database designers, programmers and end users and to integrate the data into a common framework.

• The object-orientated data model (OODM) uses objects as the basic modelling structure. An object resembles an entity in that it includes the facts that define it. But unlike an entity, the object also includes information about relationships between the facts as well as relationships with other objects, thus giving its data more meaning.

• The relational model has adopted many object-orientated (OO) extensions to become the extended relational data model (ERDM). At this point, the OODM is largely used in specialised engineering and scientific applications, while the ERDM is primarily geared to business applications. Although the most likely future scenario is an increasing merger of OODM and ERDM technologies, both are overshadowed by the need to develop internet access strategies for databases.

• NoSQL databases are a new generation of databases that do not use the relational model and are geared to support the very specific needs of Big Data organisations. NoSQL databases offer distributed data stores that provide high scalability, availability and fault tolerance by sacrificing data consistency and shifting the burden of maintaining relationships and data integrity to the program code.

• Data modelling requirements are a function of different data views (global vs local) and the level of data abstraction. The American National Standards Institute Standards Planning and Requirements Committee (ANSI/SPARC) describes three levels of data abstraction: external, conceptual and internal. There is also a fourth level of data abstraction (the physical level). This lowest level of data abstraction is concerned exclusively with physical storage methods.

Key Terms

American National Standards Institute (ANSI)
attribute
Big Data
business rule
class
class diagram
class hierarchy
conceptual model
conceptual schema
connectivity
constraint
Crow's Foot notation
data definition language (DDL)
data manipulation language (DML)
data models
entity
entity instance
entity occurrence
entity relationship (ER) model (ERM)
entity relationship diagram (ERD)
entity set
extended relational data model (ERDM)
external model
external schema
hardware independence
hierarchical model
inheritance
internal model
internal schema
logical design
logical independence
many-to-many (*:*) relationship
method
network model
NoSQL
object
object relational database management system (ORDBMS)
object-orientated data model (OODM)
object-orientated database management system (OODBMS)
one-to-many (1:*) relationship
one-to-one (1:1) relationship
physical independence
physical model
relational database management system (RDBMS)
relational diagram
relational model
relations
relationship
schema
semantic data model
software independence
subschema
table
Unified Modelling Language (UML)

Further Reading

Blaha, M. and Premerlani, W. Object-Oriented Modelling and Design for Database Applications. Prentice Hall, 1998.

Chen, P. The entity-relationship model: towards a unified view of data, ACM Transactions on Database Systems, 1(1), 1976.

Codd, E.F. A relational model of data for large shared databanks, Communications of the ACM, pp. 377-387, 1970.

Codd, E.F. A database sublanguage founded on relational calculus, Proceedings of the ACM SIGFIDET Conference on Data Description, Access and Control, pp. 35-68, 1971.

Codd, E.F. The Relational Model for Database Management, Version 2. Addison-Wesley, 1990.

Lausen, G. and Vossen, G. Models and Languages of Object Orientated Databases. Addison-Wesley, 1998.

ORACLE, Oracle NoSQL Database Documentation, [online] 2019. Available: https://docs.oracle.com/en/database/other-databases/nosql-database/index.html

Thalheim, B. Entity-Relationship Modelling: Foundations of Database Technology. Springer, 2000.

Online Content: Answers to selected Review Questions and Problems for this chapter can be found on the online platform for this book.

Review Questions

1 Discuss the importance of data modelling.

2 What is a business rule, and what is its purpose in data modelling?

3 How would you translate business rules into data model components?

4 Describe the basic features of the relational data model and discuss their importance to the end user and the designer.

5 Explain how the entity relationship (ER) model helped produce a more structured relational database design environment.

6 Use the scenario described by 'A customer can make many payments, but each payment is made by only one customer' as the basis for an entity relationship diagram (ERD) presentation. Show your answer using UML class diagram notation.

7 Why is an object said to have greater semantic content than an entity?

8 What is the difference between an object and a class in the object-orientated data model (OODM)?

9 How would you model Question 6 with an OODM? (Use Figure 2.7 as your guide.)

10 What is an ERDM, and what role does it play in the modern (production) database environment?

11 What is a relationship, and which three types of relationships exist?

12 Give an example of each of the three types of relationships.

13 What is a table, and what role does it play in the relational model?

14 What is a relational diagram? Give an example.

15 What is connectivity? Draw ERDs to illustrate connectivity.

16 Describe the Big Data phenomenon.

17 What is sparse data? Give an example.

18 Define and describe the basic characteristics of a NoSQL database.

19 Describe the key-value data model.

20 Using the example of a medical clinic with patients and tests, provide a simple representation of how to model this example using the relational model and how it would be represented using the key-value modelling technique.

21 What is logical independence?

22 What is physical independence?

PROBLEMS

Use the contents of Figure 2.3 on p. 46 to work Problems 1-5.

1 Write the business rule(s) that govern the relationship between AGENT and CUSTOMER.

2 Given the business rule(s) you wrote in Problem 1, create a basic UML class ERD.

3 If the relationship between AGENT and CUSTOMER were implemented in a hierarchical model, what would the hierarchical structure look like? Label the structure fully, identifying the root segment and the Level 1 segment.

4 If the relationship between AGENT and CUSTOMER were implemented in a network model, what would the network model look like? (Identify the record types and the set.)

5 Using the ERD you drew in Problem 2, create the equivalent OO model. (Use Figure 2.7 on p. 55 as your guide.)

Using Figure P2.1 as your guide, answer Problem 6. The DealCo class ERD shows the initial entities and attributes for DealCo stores, located in two regions of the country.

FIGURE P2.1 The DealCo class ERD

6 Identify each relationship type and write all of the business rules.

Using Figure P2.2 as your guide, answer Problems 7-9. The Tiny University class ERD shows the initial entities and attributes for Tiny University.

7 Identify each relationship type and write all of the business rules.

8 A hospital patient receives medications that have been ordered by a particular doctor. Because the patient often receives several medications per day, there is a 1:* relationship between PATIENT and ORDER. Similarly, each order can include several medications, creating a 1:* relationship between ORDER and MEDICATION.
a Identify the business rules for PATIENT, ORDER and MEDICATION.
b Create an ERD that depicts a relational database model to capture these business rules.

9 United Broke Artists (UBA) is a broker for not-so-famous artists. UBA maintains a small database to track painters, paintings and galleries. A painting is created by a particular artist and then exhibited in a particular gallery. A gallery can exhibit many paintings, but each painting can be exhibited in only one gallery. Similarly, a painting is created by a single painter, but each painter can create many paintings. Using PAINTER, PAINTING and GALLERY, in terms of a relational database:
a Which tables would you create, and what would the table components be?
b How might the (independent) tables be related to one another?
c Draw the complete ERD.

FIGURE P2.2 The Tiny University class ERD

10 Using the ERD from Problem 9, create the relational schema. (Create an appropriate collection of attributes for each of the entities. Make sure you use the appropriate naming conventions to name the attributes.)

11 Describe the relationships (identify the business rules) depicted in the ERD shown in Figure P2.3.

12 Convert the ERD from Problem 11 into a UML class diagram.

13 Describe the relationships shown in the ERD in Figure P2.4.

14 Create a UML ERD for each of the following descriptions. (Note: The word many merely means more than one in the database modelling environment.)
a Each of the MegaCo Corporation's divisions is composed of many departments. Each of those departments has many employees assigned to it, but each employee works for only one department. Each department is managed by one employee, and each of those managers can manage only one department at a time.
b During a period of time, a customer can rent many DVDs from the BigVid store. Each of the BigVid's DVDs can be rented to many customers during that period of time.
c An airliner can be assigned to fly many flights, but each flight is flown by only one airliner.
d The KwikTite Corporation operates many factories. Each factory is located in a region. Each region can be home to many of KwikTite's factories. Each factory employs many employees, but each of those employees is employed by only one factory.
e An employee may have earned many degrees, and each degree may have been earned by many employees.


FIGURE P2.3 The Crow's Foot ERD for Problem 11
(Entities shown: LECTURER, CLASS, STUDENT. Relationships shown: Teaches, Advises.)

FIGURE P2.4 The UML ERD for Problem 13

NOTE
Many-to-many (*:*) relationships exist at a conceptual level, and you should know how to recognise them. However, you will learn in Chapter 3, Relational Model Characteristics, that *:* relationships are not appropriate in a relational model.

CHAPTER 3 Relational Model Characteristics

IN THIS CHAPTER, YOU WILL LEARN:
That the relational database model takes a logical view of data
That the relational model's basic components are relations implemented through tables in a relational DBMS
How relations are organised in tables composed of rows (tuples) and columns (attributes)
Key terminology used in describing relations
About the role of the data dictionary, and the system catalogue
How data redundancy is handled in the relational database model
Why indexing is important

PREVIEW
In Chapter 2, Data Models, you learnt that the relational data model's structural and data independence allow you to examine the model's logical structure without considering the physical aspects of data storage and retrieval. You also learnt that the ERM may be used to depict entities and their relationships graphically through an ERD. In this chapter, you will learn some important details about the relational model's logical structure and more about how the ERD can be used to design a relational database.

You will learn how the relational database's basic data components fit into a logical construct known as a table. You will discover that one important reason for the relational database model's simplicity is that its tables can be treated as logical rather than physical units. You will also learn how the independent tables within the database can be related to one another.

After learning about tables, their components and their relationships, you will be introduced to the basic concepts that shape the design of tables. Because the table is such an integral part of relational database design, you will also learn the characteristics of well-designed and poorly designed tables.

Finally, you will be introduced to some basic concepts that will become your gateway to the next few chapters. For example, you will examine different kinds of relationships and the way those relationships might be handled in the relational database environment.


NOTE
The relational model, introduced by E.F. Codd in 1970, is based on predicate logic and set theory. Predicate logic, used extensively in mathematics, provides a framework in which an assertion (statement of fact) can be verified as either true or false. For example, suppose that a student with a student ID of 12345678 is named Cela Nkosi. This assertion can easily be demonstrated to be true or false.

Set theory is a mathematical science that deals with sets, or groups of things, and is used as the basis for data manipulation in the relational model. For example, assume that set A contains three numbers, 16, 24 and 77. This set is represented as A(16, 24, 77). Furthermore, set B contains four numbers, 44, 77, 90 and 11, and so is represented as B(44, 77, 90, 11). Given this information, you can conclude that the intersection of A and B yields a result set with a single number, 77. This result can be expressed as A ∩ B = 77. In other words, A and B share a common value, 77.

Based on these concepts, the relational model has three well-defined components:
A logical data structure represented by the relational table, where data are stored (Sections 3.1, 3.2 and 3.4).
A set of integrity rules to enforce that the data are and remain consistent over time (Sections 3.3, 3.5, 3.6 and 3.7).
A set of operations that define how data are manipulated (Chapter 4, Relational Algebra and Calculus).
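The set-theory example in the note can be tried directly in code. The following is a minimal sketch in Python (a language chosen here purely for illustration; the variable names are assumptions, not part of the text), using the built-in set type to compute the intersection of A and B.

```python
# Sets A and B as given in the note on set theory.
A = {16, 24, 77}
B = {44, 77, 90, 11}

# The & operator returns the intersection: the values present in both sets.
common = A & B

print(common)
```

The result is itself a set, which mirrors the relational idea that operations on relations produce relations.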


3.1 A LOGICAL VIEW OF DATA

In Chapter 1, The Database Approach, you learnt that a database stores and manages both data and metadata. You also learnt that the DBMS manages and controls access to the data and the database structure. Such an arrangement, placing the DBMS between the application and the database, eliminates most of the file system's inherent limitations. The result of such flexibility, however, is a far more complex physical structure. In fact, the database structures required by both the hierarchical and network database models often become complicated enough to diminish efficient database design. The relational data model changed all of that by allowing the designer to focus on the logical representation of the data and their relationships, rather than on the physical storage details. To use an automotive analogy, the relational database uses an automatic transmission to relieve you of the need to manipulate clutch pedals and gear levers. In short, the relational model enables you to view data logically rather than physically.

The practical significance of taking the logical view is that it serves as a reminder of the simple file concept of data storage. Although the use of a table, quite unlike that of a file, has the advantages of structural and data independence, a table does resemble a file from a conceptual point of view. Since you can think of related records as being stored in independent tables, the relational database model is much easier to understand than its hierarchical and network database predecessors. Greater logical simplicity tends to yield simpler and more effective database design methodologies.

As the table plays such a prominent role in the relational model, it deserves a closer look. Therefore, our discussion begins with an exploration of the details of table structure and contents.

NOTE
Relational database terminology is very precise. Unfortunately, file system terminology sometimes creeps into the database environment. Thus, rows are sometimes referred to as records and columns are sometimes labelled as fields. Occasionally, tables are labelled files. Technically speaking, this substitution of terms is not always appropriate; the database table is a logical rather than a physical concept, and the terms file, record and field describe physical concepts. Nevertheless, as long as you recognise that the table is actually a logical rather than a physical construct, you may (at the conceptual level) think of table rows as records and of table columns as fields. In fact, many database software vendors still use this familiar file system terminology.

3.1.1 Tables and Their Characteristics

The logical view of the relational database is facilitated by the creation of data relationships based on a logical construct known as a table. A table is perceived as a two-dimensional structure composed of rows and columns. As far as the table's user is concerned, a table contains a group of related entities, that is, an entity set; for that reason, the terms entity set and table are often used interchangeably. A table is also called a relation because the relational model's creator, E.F. Codd, used the term relation as a synonym for table. You can think of a table as a persistent relation, that is, a relation whose contents can be permanently saved for future use. Within the relational model, columns of tables are referred to as attributes and rows of tables are known as tuples.

NOTE
The concept of a relation is modelled on a mathematical construct and therefore must follow a certain restricted set of rules. For example, every relation within the database must have a distinct name. In mathematics, a relation is formally defined as:

Given a number of sets D1, D2, ..., Dn (which are not necessarily distinct), R is a relation on these n sets if it is a set of tuples each of which has its first element from D1, second element from D2 and so on.

Let's examine this formal definition with an example. Assume we have two sets (n = 2), where one represents the last names (STU_LNAME) and the other the department codes (DEPT_CODE) of students enrolled.

STU_LNAME = {Ndlovu, Smithson, Le Roux, Ismail}
DEPT_CODE = {BIOL, CIS, EDU}

Then a relation R can be defined over the sets STU_LNAME and DEPT_CODE as:

R = {(Ndlovu, BIOL), (Smithson, CIS), (Le Roux, EDU), (Ismail, EDU)}

So, as you can see, a relation is simply a set of ordered pairs.

Table 3.1 shows the properties that a table must conform to in order to be a relation.

TABLE 3.1 Properties of a relation
1 A table is perceived as a two-dimensional structure composed of rows and columns.
2 Each table row (tuple) represents a single entity occurrence within the entity set and must be distinct. Duplicate rows are not allowed in a relation.
3 Each table column represents an attribute, and each column has a distinct name.
4 Each cell or column/row intersection in a relation should contain only an atomic value, that is, a single data value. Multiple values are not allowed in the cells of a relation.
5 All values in a column must conform to the same data format. For example, if the attribute is assigned an integer data format, all values in the column representing that attribute must be integers.
6 Each column has a specific range of values known as the attribute domain.
7 The order of rows and columns is immaterial to the DBMS.
8 Each table must have an attribute or a combination of attributes that uniquely identifies each row.

Figure 3.1 shows two tables: LECTURER and COURSE. The table LECTURER conforms to all of the rules listed in Table 3.1 and hence is a relation. The table COURSE, however, is not a relation because the column COURSE_NAME contains multiple values. For example, CRS_CODE CIS-420 is associated with three COURSE_NAME values: Database Design and Implementation, Introduction to Databases, and Data Modelling: An Introduction.
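The formal definition of a relation as a set of tuples drawn from sets D1, D2, ..., Dn can be checked mechanically. The sketch below, in Python, uses the STU_LNAME and DEPT_CODE example; the helper name is_relation_on is hypothetical and introduced only for illustration. It tests whether every tuple takes its first element from the first set and its second element from the second set, which is the same as asking whether R is a subset of the Cartesian product of the two sets.

```python
from itertools import product

# The two sets (n = 2) from the example.
STU_LNAME = {"Ndlovu", "Smithson", "Le Roux", "Ismail"}
DEPT_CODE = {"BIOL", "CIS", "EDU"}

# The relation R defined over STU_LNAME and DEPT_CODE.
R = {
    ("Ndlovu", "BIOL"),
    ("Smithson", "CIS"),
    ("Le Roux", "EDU"),
    ("Ismail", "EDU"),
}


def is_relation_on(tuples, *domains):
    """Hypothetical helper: True if each tuple draws its i-th element
    from the i-th domain, i.e. the tuples are a subset of D1 x D2 x ... x Dn."""
    return set(tuples) <= set(product(*domains))


print(is_relation_on(R, STU_LNAME, DEPT_CODE))
# "ART" is not in DEPT_CODE, so this set of tuples is not a relation on the two sets.
print(is_relation_on({("Ndlovu", "ART")}, STU_LNAME, DEPT_CODE))
```

Because R is stored as a Python set, duplicate tuples cannot occur, which matches the rule that duplicate rows are not allowed in a relation.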

FIGURE 3.1(a) The relation LECTURER

Table name: LECTURER

EMP_NUM  LECTURER_OFFICE  LECTURER_EXTENSION  LECTURER_HIGH_DEGREE
103      DRE 156          6783                PhD
104      DRE 102          5561                MA
105      KLR 229D         8665                PhD
106      KLR 126          3899                PhD
110      AAK 160          3412                PhD
114      KLR 211          4436                PhD
155      AAK 201          4440                PhD

FIGURE 3.1(b) The non-relational table COURSE

Table name: COURSE

CRS_CODE  COURSE_NAME
CIS-220   Introduction to Computer Science
          Assembly Language Programming
CIS-420   Database Design and Implementation
          Introduction to Databases
          Data Modelling: An Introduction
QM-261    Intro. to Statistics
          Statistical Applications

Applying the concepts of relations to database models allows us to define a relational schema for each entity. A relational schema is a textual representation of the database tables, where each table is described by its name followed by the list of its attributes in parentheses.

NOTE
A relational schema R can be formally defined as R = {a1, a2, ..., an} where a1...an is a set of attributes belonging to the relation.

For example, consider the database table LECTURER in Figure 3.1. The relational schema for LECTURER can be written as:

LECTURER(EMP_NUM, LECTURER_OFFICE, LECTURER_EXTENSION, LECTURER_HIGH_DEGREE)

3.1.2 Attributes and Domains

Each attribute is a named column within the relational table and draws its values from a domain. A domain is the set of possible values for this attribute. For example, an attribute called STU_CLASS, which stores the student's classification whilst at university, may have the following domain {UG1, UG2, UG3, PG, Other}, which means that STU_CLASS can only have one of these values within the database. The domain of values for an attribute should contain only atomic values and any one value should not be divisible into components. In addition, no attributes with more than one value are allowed. (These are often referred to as multi-valued attributes.) For example, the value of STU_CLASS could not be UG1 and UG2 at the same time. Each domain is also defined by its data type, for example, character string, number, date, etc.

The fundamental principle of the relational model is that relating different entities to one another is achieved by comparisons of their values. A pair of attribute values can only be meaningfully compared if their values are drawn from the same domains. For example, the columns STU_POSTCODE and LECT_POSTCODE may be in two different relational tables, but would share the common domain of all postal codes and could be compared. In contrast, it would be nonsense to try to match the attribute STU_NAME with STU_CLASS, even though the domains are defined by the same data type (character string).
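The idea that an attribute draws exactly one atomic value from its domain can be illustrated with a short check. This is a minimal sketch in Python; the function name in_domain is an assumption made for illustration, and a real DBMS would enforce the same rule through its own constraint mechanisms.

```python
# Domain of the STU_CLASS attribute, as given in the text.
STU_CLASS_DOMAIN = {"UG1", "UG2", "UG3", "PG", "Other"}


def in_domain(value, domain):
    """Hypothetical helper: a value is acceptable only if it is
    a single member of the attribute's domain."""
    return value in domain


print(in_domain("UG2", STU_CLASS_DOMAIN))           # a value from the domain
print(in_domain("UG4", STU_CLASS_DOMAIN))           # not in the domain
print(in_domain(("UG1", "UG2"), STU_CLASS_DOMAIN))  # two values at once: not atomic
```

The third call models a multi-valued attribute: the pair (UG1, UG2) is not itself a member of the domain, so it is rejected.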

3.1.3 Degree and Cardinality

Degree and cardinality are two important properties of the relational model. A relation with N columns and M rows is said to be of degree N and cardinality M. The degree of a relation is the number of its attributes and the cardinality of a relation is the number of its tuples. The product of a relation's degree and cardinality is the number of attribute values it contains. Figure 3.2 shows the relational table DEPARTMENT with a degree of 4 and a cardinality of 4. The product of the degree and cardinality of DEPARTMENT is 16 (4 * 4) and, as you can see in Figure 3.2, it contains 16 attribute values.

FIGURE 3.2 Degree and cardinality of the DEPARTMENT relation

Table name: DEPARTMENT
Cardinality = 4
Degree = 4

DEPT_CODE  DEPT_NAME               DEPT_ADDRESS      DEPT_EXTENSION
ACCT       Accounting              KLR 211, Box 52   3119
ART        Fine Arts               BBG 185, Box 128  2278
BIOL       Biology                 AAK 230, Box 415  4117
CIS        Computer Info. Systems  KLR 333, Box 56   3245
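Degree and cardinality for the DEPARTMENT relation can be computed directly from its rows. The following Python sketch assumes the table is held as a tuple of attribute names plus a list of row tuples; that representation and the variable names are illustrative choices, not something the text prescribes.

```python
# Attribute names (columns) of the DEPARTMENT relation.
ATTRIBUTES = ("DEPT_CODE", "DEPT_NAME", "DEPT_ADDRESS", "DEPT_EXTENSION")

# Tuples (rows) of the DEPARTMENT relation.
DEPARTMENT = [
    ("ACCT", "Accounting", "KLR 211, Box 52", "3119"),
    ("ART", "Fine Arts", "BBG 185, Box 128", "2278"),
    ("BIOL", "Biology", "AAK 230, Box 415", "4117"),
    ("CIS", "Computer Info. Systems", "KLR 333, Box 56", "3245"),
]

degree = len(ATTRIBUTES)       # number of attributes
cardinality = len(DEPARTMENT)  # number of tuples

# Counting the stored values one by one gives the same figure as degree * cardinality.
value_count = sum(len(row) for row in DEPARTMENT)

print(degree, cardinality, value_count)
```

Adding a row changes the cardinality but not the degree; adding a column changes the degree but not the cardinality.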

NOTE
The word relation, also known as a dataset in Microsoft Access, is based on the mathematical set theory from which Codd derived his model. Since the relational model uses attribute values to establish relationships among tables, many database users incorrectly assume that the term relation refers to such relationships. Many then incorrectly conclude that only the relational model permits the use of relationships.

3.1.4 Summary of Relational Characteristics

You will discover that the table view of data makes it easy to spot and define entity relationships, thereby greatly simplifying the task of database design. The tables shown in Figure 3.3 illustrate how the properties of a relation listed in Table 3.1 can be applied to a database table.

FIGURE 3.3 STUDENT table attribute values

Database name: Ch03_TinyUniversity
Table name: STUDENT

STU_NUM  STU_LNAME  STU_FNAME  STU_INIT  STU_DOB      STU_HRS  STU_CLASS  STU_GPA  STU_TRANSFER  DEPT_CODE  STU_PHONE  LECT_NUM
321452   Ndlovu     Amehlo     C         12-Feb-1999  42       UG3        2.84     No            BIOL       2134       205
324257   Smithson   Anne       K         15-Nov-2000  81       UG2        3.27     Yes           CIS        2256       222
324258   Le Roux    Dan                  23-Aug-2000  36       UG3        2.26     Yes           ACCT       2256       228
324269   Oblonski   Walter     H         16-Sep-1996  66       UG2        3.09     No            CIS        2114       222
324273   Smith      John       D         30-Dec-1998  102      PG         2.11     Yes           ENGL       2231       199
324274   Katinga    Raphael    P         21-Oct-1999  114      PG         3.15     No            ACCT       2267       228
324291   Ismail     Hemalika   T         08-Apr-1999  120      PG         3.87     No            EDU        2267       311
324299   Smith      John       B         30-Nov-2001  15       UG1        2.92     No            ACCT       2315       230

STU_DOB = Student date of birth
STU_HRS = Credit hours earned
STU_CLASS = Student classification
STU_GPA = Grade point average
STU_PHONE = 4-digit campus phone extension
LECT_NUM = Number of the lecturer who is the student's advisor

Using the STUDENT table shown in Figure 3.3, you can draw the following conclusions corresponding to the points in Table 3.1:

1 The STUDENT table shown in Figure 3.3 is perceived to be a two-dimensional structure composed of eight rows (tuples) and twelve columns (attributes). The cardinality of STUDENT is therefore 8 and the degree is 12. You can also describe the table as being composed of eight records and twelve fields.

2 Each row in the STUDENT table describes a single entity occurrence within the entity set. (The entity set is represented by the STUDENT table.) Note that the row (entity or record) defined by STU_NUM = 321452 defines the characteristics (attributes or fields) of a student named Amehlo C. Ndlovu. For example, row 3 in Figure 3.3 describes a student named Dan Le Roux. Similarly, row 4 describes a student named Walter H. Oblonski. Given the table contents, the STUDENT entity set includes eight distinct entities (rows).

3 Each column represents an attribute, and each column has a distinct name.

4 All of the values in a column match the entity's characteristics. For example, the grade point average (STU_GPA) column contains only STU_GPA entries for each of the table rows. Data must be classified according to their format and function. Although various DBMSs can support different data types, most support at least the following:
a Numeric. Numeric data are data on which you can perform meaningful arithmetic procedures. For example, in Figure 3.3, STU_HRS and STU_GPA are numeric attributes. On the other hand, STU_PHONE is not a numeric attribute because adding or subtracting phone numbers does not yield an arithmetically meaningful result.
b Character. Character data, also known as text data or string data, can contain any character symbol not intended for mathematical manipulation. In Figure 3.3, for example, STU_LNAME, STU_FNAME, STU_INIT, STU_CLASS and STU_PHONE are character, text or string attributes.
c Date. Date attributes contain calendar dates stored in a special format known as the Julian date format. In Figure 3.3, STU_DOB is a date attribute.
d Logical. Logical data can have only a true or false (yes or no) condition. For example, is a student a university transfer? In Figure 3.3, the STU_TRANSFER attribute uses a logical data format. Most, but not all, relational database software packages support the logical data format. Microsoft Access uses the label Yes/No data type to indicate a logical data type, whereas Oracle uses a data type known as Boolean, which can have values TRUE, FALSE and NULL.

5 The column's range of permissible values is known as its domain. Because the STU_GPA values are limited to the range 0-4, inclusive, the domain is [0,4].

6 The order of rows and columns is immaterial to the user.

7 Each table must have a primary key. In general terms, the primary key (PK) is an attribute (or a combination of attributes) that uniquely identifies any given row. In this case, STU_NUM (the student number) is the primary key. Using the data presented in Figure 3.3, observe that a student's last name (STU_LNAME) would not be a good primary key because it is possible to find several students whose last name is Smith. Even the combination of the last name and first name (STU_FNAME) would not be an appropriate primary key because, as Figure 3.3 shows, it is quite possible to find more than one student named John Smith.
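The argument about which attributes could serve as the primary key of STUDENT can be tested against the Figure 3.3 data. The Python sketch below keeps only three of the twelve columns for brevity; the helper name is_unique is hypothetical. Note the limits of such a test: duplicates in sample data prove that an attribute cannot be a key, but the absence of duplicates only shows that the attribute is unique so far, not that it must always be.

```python
# STU_NUM, STU_LNAME and STU_FNAME for the eight rows of Figure 3.3.
STUDENTS = [
    {"STU_NUM": 321452, "STU_LNAME": "Ndlovu", "STU_FNAME": "Amehlo"},
    {"STU_NUM": 324257, "STU_LNAME": "Smithson", "STU_FNAME": "Anne"},
    {"STU_NUM": 324258, "STU_LNAME": "Le Roux", "STU_FNAME": "Dan"},
    {"STU_NUM": 324269, "STU_LNAME": "Oblonski", "STU_FNAME": "Walter"},
    {"STU_NUM": 324273, "STU_LNAME": "Smith", "STU_FNAME": "John"},
    {"STU_NUM": 324274, "STU_LNAME": "Katinga", "STU_FNAME": "Raphael"},
    {"STU_NUM": 324291, "STU_LNAME": "Ismail", "STU_FNAME": "Hemalika"},
    {"STU_NUM": 324299, "STU_LNAME": "Smith", "STU_FNAME": "John"},
]


def is_unique(rows, attrs):
    """Hypothetical helper: True if no two rows share the same
    combination of values for the given attributes."""
    seen = set()
    for row in rows:
        key = tuple(row[a] for a in attrs)
        if key in seen:
            return False
        seen.add(key)
    return True


print(is_unique(STUDENTS, ["STU_NUM"]))                 # every student number differs
print(is_unique(STUDENTS, ["STU_LNAME"]))               # two students named Smith
print(is_unique(STUDENTS, ["STU_LNAME", "STU_FNAME"]))  # two students named John Smith
```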

Online Content
All of the databases used to illustrate the material in this chapter are found on the online platform for this book. The database names used in the folder match the database names used in the figures. For example, the source of the tables shown in Figure 3.3 is the 'Ch03_TinyUniversity' database.

3.2 KEYS

A key consists of one or more attributes that determine other attributes. (For example, an invoice number identifies all of the invoice attributes, such as the invoice date and the customer name.) One type of key, the primary key, has already been introduced. Given the structure of the STUDENT table shown in Figure 3.3, defining and describing the primary key seems simple enough. However, because the primary key plays such an important role in the relational environment, we will examine the primary key's properties more carefully. There are several other kinds of keys that warrant attention. In this section, you will also become acquainted with superkeys, candidate keys and secondary keys.

The key's role is based on a concept known as determination. In the context of a database table, the statement 'A determines B' indicates that if you know the value of attribute A, you can look up (determine) the value of attribute B. For example, knowing the STU_NUM in the STUDENT table (see Figure 3.3) means that you are able to look up (determine) that student's last name, grade point average, phone number and so on. The shorthand notation for 'A determines B' is A → B. If A determines B, C and D, you write A → B, C, D. Therefore, using the attributes of the STUDENT table in Figure 3.3, you can represent the statement 'STU_NUM determines STU_LNAME' by writing:

STU_NUM → STU_LNAME

In fact, the STU_NUM value in the STUDENT table determines all of the student's attribute values. For example, you can write:

STU_NUM → STU_LNAME, STU_FNAME, STU_INIT

and

STU_NUM → STU_LNAME, STU_FNAME, STU_INIT, STU_DOB, STU_TRANSFER

In contrast, STU_NUM is not determined by STU_LNAME because it is quite possible for several students to have the last name Smith.

The principle of determination is very important because it is used in the definition of a central relational database concept known as functional dependence. The term functional dependence can be defined most easily this way: the attribute B is functionally dependent on A if A determines B. More precisely:

The attribute B is functionally dependent on the attribute A if each value in column A determines one and only one value in column B.

Using the contents of the STUDENT table in Figure 3.3, it is appropriate to say that STU_PHONE is functionally dependent on STU_NUM. For example, the STU_NUM value 321452 determines the STU_PHONE value 2134. On the other hand, STU_NUM is not functionally dependent on STU_PHONE

CHAPTER

because

the

STU_PHONE

(Apparently,

some

STU_LNAME

value

because

one

The functional

occur

a phone.)

But the

student

definition

more than

with two

Similarly,

STU_NUM

may have

dependence

values

2267 is associated

share

Smith.

more than

attribute

value

students

the

value

the last

is

value

not functionally

name

a table.

values:

STU_NUM

Model

324274

324273

Characteristics

and 324291.

determines

dependent

79

on

the

STU_LNAME

Smith.

can be generalised

once in

STU_NUM

3 Relational

to cover the

Functional

case in

dependence

which the determining

can then

be defined

this

way:1

Attribute A determines attribute B (that is, B is functionally dependent on A) if all of the rows in the table that agree in value for attribute A also agree in value for attribute B.

Be careful when defining the dependency's direction. For example, Tiny University determines its student classification based on hours completed; these are shown in Table 3.2.

TABLE 3.2 Student classification

Hours completed    Classification
Fewer than 30      UG1
30-59              UG2
60-89              UG3
90 or more         PG

Therefore, you can write:

STU_HRS → STU_CLASS

However, the specific number of hours is not dependent on the classification. It is quite possible to find a third-year undergraduate (UG3) with 62 completed hours or one with 84 completed hours. In other words, the classification (STU_CLASS) does not determine one and only one value for completed hours (STU_HRS).
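The row-agreement definition of functional dependence can be checked mechanically against a set of rows. The following is a minimal sketch, not a method taken from the text: the function name `determines` and the four sample rows are hypothetical, and a check of this kind can only refute a dependency for the data at hand, never prove it for every possible table state.

```python
def determines(rows, lhs, rhs):
    """Return True if rows that agree on the lhs attributes also agree on rhs."""
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in lhs)
        value = tuple(row[a] for a in rhs)
        # setdefault stores the first value seen for this key and returns it
        if seen.setdefault(key, value) != value:
            return False
    return True


# Hypothetical sample rows (not the contents of Figure 3.3)
students = [
    {"STU_NUM": 1, "STU_HRS": 62, "STU_CLASS": "UG3"},
    {"STU_NUM": 2, "STU_HRS": 84, "STU_CLASS": "UG3"},
    {"STU_NUM": 3, "STU_HRS": 25, "STU_CLASS": "UG1"},
    {"STU_NUM": 4, "STU_HRS": 62, "STU_CLASS": "UG3"},
]

hrs_to_class = determines(students, ["STU_HRS"], ["STU_CLASS"])
class_to_hrs = determines(students, ["STU_CLASS"], ["STU_HRS"])
print(hrs_to_class, class_to_hrs)
```

In this sample, STU_HRS determines STU_CLASS, but STU_CLASS does not determine STU_HRS because two UG3 rows carry different hour totals, which mirrors the direction argument made for Table 3.2.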

Keep in mind that it might take more than a single attribute to define functional dependence; that is, a key may be composed of more than one attribute. Such a multi-attribute key is known as a composite key. Any attribute that is part of a key is known as a key attribute. For instance, in the STUDENT table, the student's last name would not be sufficient to serve as a key. On the other hand, the combination of last name, first name, initial and home phone is very likely to produce unique matches for the remaining attributes. For example, you can write:

STU_LNAME, STU_FNAME, STU_INIT, STU_PHONE → STU_HRS, STU_CLASS

or

STU_LNAME, STU_FNAME, STU_INIT, STU_PHONE → STU_HRS, STU_CLASS, STU_GPA

or

STU_LNAME, STU_FNAME, STU_INIT, STU_PHONE → STU_HRS, STU_CLASS, STU_GPA, STU_DOB

[1] ISO-ANSI Working Draft Database Language/SQL Foundation (SQL3), Part 2, 29 August, 1994. This source was provided through the courtesy of Dr David Hatherly.

Given the possible existence of a composite key, the notion of functional dependence can be further refined by specifying full functional dependence:

If the attribute (B) is functionally dependent on a composite key (A) but not on any subset of that composite key, the attribute (B) is fully functionally dependent on (A).

Within the broad key classification, several specialised keys can be defined. For example, a superkey is any key that uniquely identifies each row. In short, the superkey functionally determines all of the row's attributes. In the STUDENT table, the superkey could be any of the following:

STU_NUM
STU_NUM, STU_LNAME
STU_NUM, STU_LNAME, STU_INIT

In fact, STU_NUM, with or without additional attributes, can be a superkey even when the additional attributes are redundant.

A candidate key can be described as a superkey without redundancies, that is, a minimal superkey. Using this distinction, note that the composite key

STU_NUM, STU_LNAME

is a superkey, but it is not a candidate key because STU_NUM by itself is a candidate key!
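The difference between a superkey and a candidate key can be illustrated with a short sketch. The function names `is_superkey` and `is_candidate_key` and the three sample rows are hypothetical. Uniqueness in sample data only suggests that an attribute set could serve as a key; a real key must hold for every legal state of the table.

```python
from itertools import combinations


def is_superkey(rows, attrs):
    """True if no two rows share the same values for attrs."""
    keys = [tuple(row[a] for a in attrs) for row in rows]
    return len(set(keys)) == len(keys)


def is_candidate_key(rows, attrs):
    """True if attrs is a superkey and no proper subset is also a superkey."""
    if not is_superkey(rows, attrs):
        return False
    for size in range(1, len(attrs)):
        for subset in combinations(attrs, size):
            if is_superkey(rows, subset):
                return False  # a smaller superkey exists, so attrs is redundant
    return True


# Hypothetical sample rows (not the contents of Figure 3.3)
students = [
    {"STU_NUM": 1001, "STU_LNAME": "Smith", "STU_INIT": "D"},
    {"STU_NUM": 1002, "STU_LNAME": "Smith", "STU_INIT": "P"},
    {"STU_NUM": 1003, "STU_LNAME": "Jones", "STU_INIT": "K"},
]

print(is_superkey(students, ["STU_NUM", "STU_LNAME"]))       # superkey with a redundant attribute
print(is_candidate_key(students, ["STU_NUM", "STU_LNAME"]))  # not minimal
print(is_candidate_key(students, ["STU_NUM"]))               # minimal superkey
```

The composite STU_NUM, STU_LNAME passes the superkey test but fails the candidate key test, because STU_NUM alone already identifies each row.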

The combination

STU_LNAME, STU_FNAME, STU_INIT, STU_PHONE

might also be a candidate key, as long as you discount the possibility that two students share the same last name, first name, initial and phone number.

If the student's ID number had been included as one of the attributes in the STUDENT table in Figure 3.3, perhaps named STU_ID, both it and STU_NUM would have been candidate keys because either would uniquely identify each student. In that case, the selection of STU_NUM as the primary key would be driven by the designer's choice or by end-user requirements. In short, a primary key is the candidate key chosen to be the unique row identifier. Note, incidentally, that a primary key is a superkey as well as a candidate key.

Within a table, each primary key value must be unique to ensure that each row is uniquely identified by the primary key. In that case, the table is said to exhibit entity integrity. To maintain entity integrity, a null value (that is, no data entry at all) is not permitted in the primary key.

NOTE A null does not mean a zero or a space. Pressing the keyboard's space bar creates a blank (or a space). A null is created when you press the keyboard's Enter key without making a prior entry of any kind. In other words, a null is no value at all.
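The distinction between a null, a zero and a blank can be observed directly in an RDBMS. The sketch below uses Python's standard sqlite3 module as a stand-in for any SQL database; the EMPLOYEE columns EMP_NUM and EMP_BONUS and the three rows are hypothetical, and other products may differ in detail.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE EMPLOYEE ("
    "EMP_NUM INTEGER PRIMARY KEY, EMP_INITIAL TEXT, EMP_BONUS REAL)"
)
conn.executemany(
    "INSERT INTO EMPLOYEE VALUES (?, ?, ?)",
    [
        (1, "A", 100.0),
        (2, None, None),  # None is stored as a null: no value at all
        (3, " ", 0.0),    # a blank and a zero are real values, not nulls
    ],
)

total_rows, counted_initials, counted_bonus, avg_bonus = conn.execute(
    "SELECT COUNT(*), COUNT(EMP_INITIAL), COUNT(EMP_BONUS), AVG(EMP_BONUS) "
    "FROM EMPLOYEE"
).fetchone()

# Comparing with '= NULL' never matches; IS NULL is the test for a null
equals_null = conn.execute(
    "SELECT COUNT(*) FROM EMPLOYEE WHERE EMP_INITIAL = NULL"
).fetchone()[0]
is_null = conn.execute(
    "SELECT COUNT(*) FROM EMPLOYEE WHERE EMP_INITIAL IS NULL"
).fetchone()[0]

print(total_rows, counted_initials, counted_bonus, avg_bonus, equals_null, is_null)
```

Here the aggregate functions skip the null row, so the average bonus is computed over two rows rather than three. This is one concrete way in which nulls can change the result of functions such as COUNT and AVERAGE.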

Nulls can never be part of a primary key, and they should be avoided to the greatest extent possible in other attributes, too. There are rare cases in which nulls cannot be reasonably avoided when you are working with non-key attributes. For example, one of an EMPLOYEE table's attributes is likely to be the EMP_INITIAL. However, some employees do not have a middle initial. Therefore, some of the EMP_INITIAL values may be null. You will also discover later in this section that there may be situations in which a null exists because of the nature of the relationship between two entities. In any case, even if nulls cannot always be avoided, they must be used sparingly. In fact, the existence of nulls in a table is often an indication of poor database design.

Nulls, if used improperly, can create problems because they have many different meanings. For example, a null can represent:

An unknown attribute value.
A known, but missing, attribute value.
A not applicable condition.

Depending on the sophistication of the application development software, nulls can create problems when functions such as COUNT, AVERAGE and SUM are used. In addition, nulls can create logical problems when relational tables are linked.

Controlled redundancy makes the relational database work. Tables within the database share common attributes that enable the tables to be linked together. For example, note that the PRODUCT and VENDOR tables in Figure 3.4 share a common attribute named VEND_CODE. And note that the PRODUCT table's VEND_CODE value 232 occurs more than once, as does the VEND_CODE value 235. Because the PRODUCT table is related to the VENDOR table through these VEND_CODE values, the multiple occurrence of the values is required to make the 1:* relationship between VENDOR and PRODUCT work. Each VEND_CODE value in the VENDOR table is unique: the VENDOR is the '1' side in the VENDOR-PRODUCT relationship. But any given VEND_CODE value from the VENDOR table may occur more than once in the PRODUCT table, thus providing evidence that PRODUCT is the '*' side of the VENDOR-PRODUCT relationship. In database terms, the multiple occurrences of the VEND_CODE values in the PRODUCT table are not redundant because they are required to make the relationship work. You should recall from Chapter 2, Data Models, that data redundancy exists only when there is unnecessary duplication of attribute values.

As you examine Figure 3.4, note that the VEND_CODE value in one table can be used to point to the corresponding value in the other table. For example, the VEND_CODE value 235 in the PRODUCT table points to vendor Henry Ortozo in the VENDOR table. Consequently, you discover that the product 'Houselite chain saw, 16 cm bar' is delivered by Henry Ortozo and that he can be contacted by calling 0181-899-3425. The same connection can be made for the product 'Steel tape, 12 m length' in the PRODUCT table.

Remember the naming convention: the prefix PROD was used in Figure 3.4 to indicate that the attributes 'belong' to the PRODUCT table. Therefore, the prefix VEND in the PRODUCT table's VEND_CODE indicates that VEND_CODE points to some other table in the database. In this case, the VEND prefix is used to point to the VENDOR table in the database.

As defined in section 3.1.1, a relational database can also be represented by a relational schema, in which the primary key attribute(s) is (are) underlined. You will see such schemas in Chapter 7, Normalising Database Designs. For example, the relational schema for Figure 3.4 would be shown as:

VENDOR (VEND_CODE, VEND_CONTACT, VEND_AREACODE, VEND_PHONE)
PRODUCT (PROD_CODE, PROD_DESCRIPT, PROD_PRICE, PROD_ON_HAND, VEND_CODE*)

FIGURE 3.4 An example of a simple relational database

Database name: Ch03_SaleCo

Table name: PRODUCT
Primary key: PROD_CODE
Foreign key: VEND_CODE

PROD_CODE    PROD_DESCRIPT                     PROD_PRICE   PROD_ON_HAND   VEND_CODE
001278-AB    Claw hammer                       10.23        23             232
123-21UUY    Houselite chain saw, 16 cm bar    150.09       4              235
QER-34256    Sledge hammer, 16 kg head         14.72        6              231
SRE-657UG    Rat-tail file                     2.36         15             232
ZZX/3245Q    Steel tape, 12 m length           5.36         8              235

Link: VEND_CODE

Table name: VENDOR
Primary key: VEND_CODE
Foreign key: none

VEND_CODE    VEND_CONTACT          VEND_AREACODE   VEND_PHONE
230          Shelly K. Smithson    7325            555-1234
231          James Johnson         0181            123-4536
232          Khaya Sibiya          7325            224-2134
233          Lindiwe Molefe        0113            342-6567
234          Nijan Pillay          0181            123-3324
235          Henry Ortozo          0181            899-3425
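The way a shared VEND_CODE value links a product to its vendor can be sketched as a join. This example assumes Python's standard sqlite3 module as the database engine and loads only a subset of rows as read from Figure 3.4; the alias names P and V and the reduced column list are choices made for the illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE VENDOR (
        VEND_CODE     INTEGER PRIMARY KEY,
        VEND_CONTACT  TEXT,
        VEND_AREACODE TEXT,
        VEND_PHONE    TEXT
    );
    CREATE TABLE PRODUCT (
        PROD_CODE     TEXT PRIMARY KEY,
        PROD_DESCRIPT TEXT,
        VEND_CODE     INTEGER REFERENCES VENDOR (VEND_CODE)
    );
    INSERT INTO VENDOR VALUES
        (232, 'Khaya Sibiya', '7325', '224-2134'),
        (235, 'Henry Ortozo', '0181', '899-3425');
    INSERT INTO PRODUCT VALUES
        ('123-21UUY', 'Houselite chain saw, 16 cm bar', 235),
        ('SRE-657UG', 'Rat-tail file', 232),
        ('ZZX/3245Q', 'Steel tape, 12 m length', 235);
    """
)

# Follow the VEND_CODE value 235 from PRODUCT to the matching VENDOR row
rows = conn.execute(
    """
    SELECT P.PROD_CODE, V.VEND_CONTACT, V.VEND_AREACODE || '-' || V.VEND_PHONE
    FROM PRODUCT P
    JOIN VENDOR V ON P.VEND_CODE = V.VEND_CODE
    WHERE V.VEND_CODE = 235
    ORDER BY P.PROD_CODE
    """
).fetchall()

for row in rows:
    print(row)
```

Both products with VEND_CODE 235 resolve to the same single vendor row, which is the controlled redundancy the text describes: the value repeats on the '*' side and is unique on the '1' side.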

The link between the PRODUCT and VENDOR tables in Figure 3.4 can also be represented by the relational diagram shown in Figure 3.5. In this case, the link is indicated by the line that connects the VENDOR and PRODUCT tables.

FIGURE 3.5 The UML entity relationship diagram for the Ch03_SaleCo database

The relationship line in Figure 3.5 is created when two tables share an attribute with common values. More specifically, the primary key of one table (VENDOR) appears as the foreign key in a related table (PRODUCT). A foreign key (FK) is an attribute whose values match the primary key values in the related table. For example, in Figure 3.5, the VEND_CODE is the primary key in the VENDOR table and it occurs as a foreign key in the PRODUCT table. Because the VENDOR table is not linked to a third table, the VENDOR table shown in Figure 3.4 does not contain a foreign key.

If the foreign key contains either matching values or nulls, the table(s) that make(s) use of that foreign key is (are) said to exhibit referential integrity. In other words, referential integrity means that, if the foreign key contains a value, that value refers to an existing valid tuple (row) in another relation. Note that referential integrity is maintained between the PRODUCT and VENDOR tables shown in Figure 3.4.

Finally, a secondary key is defined as a key that is used strictly for data retrieval purposes. Suppose customer data are stored in a CUSTOMER table in which the customer number is the primary key. Do you suppose that most customers will remember their number? Data retrieval for a customer can be facilitated when the customer's last name and phone number are used. In that case, the primary key is the customer number; the secondary key is the combination of the customer's last name and phone number. Keep in mind that a secondary key does not necessarily yield a unique outcome. For example, a customer's last name and home telephone number could yield several matches if several members of the Smith family were living at a residence and sharing a phone line. Similarly, a search on the combination of last name and postal code could yield dozens of matches.

A secondary key's effectiveness in narrowing down a search depends on how restrictive that secondary key is. For instance, although the secondary key CUS_CITY is legitimate from a database point of view, the attribute values New York or Paris are not likely to produce a usable return unless you want to examine millions of possible matches. (Of course, CUS_CITY is a better secondary key than CUS_COUNTRY.)
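A secondary key is often supported in practice by a non-unique index, although the text does not prescribe any particular mechanism. The sketch below assumes Python's standard sqlite3 module; the column names CUS_NUM, CUS_LNAME and CUS_PHONE, the index name and the sample rows are all hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE CUSTOMER (
        CUS_NUM   INTEGER PRIMARY KEY,  -- primary key: unique row identifier
        CUS_LNAME TEXT,
        CUS_PHONE TEXT
    );
    -- Non-unique index that supports retrieval by last name and phone
    CREATE INDEX CUS_NAME_PHONE_NDX ON CUSTOMER (CUS_LNAME, CUS_PHONE);
    INSERT INTO CUSTOMER VALUES
        (1, 'Smith', '555-0100'),
        (2, 'Smith', '555-0100'),
        (3, 'Jones', '555-0199');
    """
)

# Retrieval through the secondary key may return more than one row
matches = conn.execute(
    "SELECT CUS_NUM FROM CUSTOMER "
    "WHERE CUS_LNAME = ? AND CUS_PHONE = ? ORDER BY CUS_NUM",
    ("Smith", "555-0100"),
).fetchall()
print(matches)
```

Two customers share the same last name and phone number here, so the lookup narrows the search without producing a unique outcome, as the text cautions.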

Table 3.3 summarises the different relational database keys.

TABLE 3.3 Relational database keys

Key type         Definition
Superkey         An attribute (or combination of attributes) that uniquely identifies each row in a table.
Candidate key    A minimal superkey. A superkey that does not contain a subset of attributes that is itself a superkey.
Primary key      A candidate key selected to uniquely identify all other attribute values in any given row. Cannot contain null entries.
Secondary key    An attribute (or combination of attributes) used strictly for data retrieval purposes.
Foreign key      An attribute (or combination of attributes) in one table whose values must either match the primary key in another table or be null.

3.3 INTEGRITY RULES

Relational database integrity rules are very important to good database design. Many (but by no means all) RDBMSs enforce integrity rules automatically. However, it is much safer to make sure that your application design conforms to the entity and referential integrity rules mentioned in this chapter. Those rules are summarised in Table 3.4. The integrity rules summarised in Table 3.4 are illustrated in Figure 3.6.

TABLE 3.4 Integrity rules

Entity integrity
Requirement: All primary key entries are unique, and no part of a primary key may be null.
Purpose: Each row will have a unique identity, and foreign key values can properly reference primary key values.
Example: No invoice can have a duplicate number, nor can it be null. In short, all invoices are uniquely identified by their invoice number.

Referential integrity
Requirement: A foreign key may have either a null entry (as long as it is not a part of its table's primary key) or an entry that matches the primary key value in a table to which it is related. (Every non-null foreign key value must reference an existing primary key value.)
Purpose: It is possible for an attribute NOT to have a corresponding value, but it will be impossible to have an invalid entry. The enforcement of the referential integrity rule makes it impossible to delete a row in one table whose primary key has mandatory matching foreign key values in another table.
Example: A customer might not yet have an assigned sales representative (number), but it will be impossible to have an invalid sales representative (number).

Note the following features of Figure 3.6:

1 Entity integrity. The CUSTOMER table's primary key is CUS_CODE. The CUSTOMER primary key column has no null entries, and all entries are unique. Similarly, the AGENT table's primary key is AGENT_CODE, and this primary key column also is free of null entries.

2 Referential integrity. The CUSTOMER table contains a foreign key AGENT_CODE, which links entries in the CUSTOMER table to the AGENT table. The CUS_CODE row that is identified by the (primary key) number 10013 contains a null entry in its AGENT_CODE foreign key, because Mr Jaco Pieterse does not yet have a sales representative assigned to him. The remaining AGENT_CODE entries in the CUSTOMER table all match the AGENT_CODE entries in the AGENT table.

To avoid nulls, some designers use special codes, known as flags, to indicate the absence of some value. Using Figure 3.6 as an example, the code -99 could be used as the AGENT_CODE entry of the fourth row of the CUSTOMER table to indicate that customer Jaco Pieterse does not yet have an agent assigned to him. If such a flag is used, the AGENT table must contain a dummy row with an AGENT_CODE value of -99. Thus, the AGENT table's first record might contain the values shown in Table 3.5.

TABLE 3.5 A dummy variable value used as a flag

AGENT_CODE   AGENT_AREACODE   AGENT_PHONE   AGENT_LNAME   AGENT_YTD_SALES
-99          0000             000-0000      None          0.00

Chapter 5, Data Modelling with Entity Relationship Diagrams, discusses several ways in which nulls may be handled.
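The entity and referential integrity rules can be watched in action with a small experiment. The sketch below assumes Python's standard sqlite3 module, in which foreign key enforcement must be switched on per connection with PRAGMA foreign_keys; other RDBMSs enforce it differently. The helper name `attempt` and the customer row 10020 are hypothetical, and the tables carry only a few of the columns shown in Figure 3.6.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite does not enforce FKs by default
conn.executescript(
    """
    CREATE TABLE AGENT (
        AGENT_CODE  INTEGER PRIMARY KEY,
        AGENT_LNAME TEXT
    );
    CREATE TABLE CUSTOMER (
        CUS_CODE   INTEGER PRIMARY KEY,
        CUS_LNAME  TEXT,
        AGENT_CODE INTEGER REFERENCES AGENT (AGENT_CODE)
    );
    INSERT INTO AGENT VALUES (501, 'Bhengani'), (502, 'Mbaso');
    """
)


def attempt(sql, params=()):
    """Run one statement and report whether an integrity rule rejected it."""
    try:
        conn.execute(sql, params)
        return "ok"
    except sqlite3.IntegrityError:
        return "rejected"


insert = "INSERT INTO CUSTOMER VALUES (?, ?, ?)"
valid_fk = attempt(insert, (10010, "Ramas", 502))        # matches an AGENT row
null_fk = attempt(insert, (10013, "Pieterse", None))     # null FK is permitted
invalid_fk = attempt(insert, (10020, "Example", 999))    # no such agent
duplicate_pk = attempt(insert, (10010, "Duplicate", 501))  # repeats a primary key
delete_parent = attempt("DELETE FROM AGENT WHERE AGENT_CODE = 502")  # still referenced

print(valid_fk, null_fk, invalid_fk, duplicate_pk, delete_parent)
```

The null foreign key is accepted while the non-matching one is refused, and the referenced agent row cannot be deleted, which corresponds to the Requirement and Purpose entries for referential integrity in Table 3.4.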

FIGURE 3.6 An illustration of integrity rules

Database name: Ch03_InsureCo

Table name: CUSTOMER
Primary key: CUS_CODE
Foreign key: AGENT_CODE

CUS_CODE   CUS_LNAME    CUS_FNAME   CUS_INITIAL   CUS_AREACODE   CUS_PHONE   CUS_RENEW_DATE   AGENT_CODE
10010      Ramas        Alfred      A             0181           844-2573    12-Mar-19        502
10011      Dunne        Leona       K             0161           894-1238    23-May-18        501
10012      Du Toit      Marlene     W             0181           894-2285    05-Jan-19        502
10013      Pieterse     Jaco        F             0181           894-2180    20-Sep-19
10014      Orlando      Myron                     0181           222-1672    04-Dec-18        501
10015      O'Brian      Amy         B             0161           442-3381    29-Aug-19        503
10016      Brown        James       G             0181           297-1228    01-Mar-19        502
10017      Williams     George                    0181           290-2556    23-Jun-19        503
10018      Padayachee   Vinaya      G             1061           382-7185    09-Nov-19        501
10019      Moloi        Mlilo       K             0181           297-3809    18-Feb-19        503

Table name: AGENT
Primary key: AGENT_CODE
Foreign key: none

AGENT_CODE   AGENT_LNAME   AGENT_AREACODE   AGENT_PHONE   AGENT_YTD_SLS
501          Bhengani      0161             228-1249      1 371 008.46
502          Mbaso         0181             882-1244      3 923 932.59
503          Okon          0181             123-5589      2 444 244.52

Other integrity rules that can be enforced in the relational model are the NOT NULL and UNIQUE constraints. The NOT NULL constraint can be placed on a column to ensure that every row in the table has a value for that column. The UNIQUE constraint is a restriction placed on a column to ensure that no duplicate values exist for that column.
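The effect of these two constraints can be sketched as follows, again assuming Python's standard sqlite3 module as the engine. Declaring AGENT_LNAME as NOT NULL and AGENT_PHONE as UNIQUE is an assumption made for the example, not a design taken from the text, and the helper name `attempt` is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE AGENT ("
    "AGENT_CODE INTEGER PRIMARY KEY, "
    "AGENT_LNAME TEXT NOT NULL, "  # every row must supply a last name
    "AGENT_PHONE TEXT UNIQUE)"     # no two rows may share a phone number
)


def attempt(values):
    """Try to insert one AGENT row and report whether a constraint rejected it."""
    try:
        conn.execute("INSERT INTO AGENT VALUES (?, ?, ?)", values)
        return "ok"
    except sqlite3.IntegrityError:
        return "rejected"


first = attempt((501, "Bhengani", "228-1249"))
missing_name = attempt((502, None, "882-1244"))     # violates NOT NULL
duplicate_phone = attempt((503, "Okon", "228-1249"))  # violates UNIQUE

print(first, missing_name, duplicate_phone)
```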

3.4 THE DATA DICTIONARY AND THE SYSTEM CATALOGUE

The data dictionary provides a detailed accounting of all tables found within the user/designer-created database. Thus, the data dictionary contains at least all of the attribute names and characteristics for each table in the system. In short, the data dictionary contains metadata: data about data. Using the small database presented in Figure 3.6, you might picture its data dictionary as shown in Table 3.6.


TABLE 3.6 A sample data dictionary

Table Name   Attribute Name    Contents                          Type           Format         Domain              Required   PK or FK   FK Referenced Table
CUSTOMER     CUS_CODE          Customer account code             CHAR(5)        99999          10000-99999         Y          PK
             CUS_LNAME         Customer last name                VARCHAR2(20)   Xxxxxxxx                           Y
             CUS_FNAME         Customer first name               VARCHAR2(20)   Xxxxxxxx                           Y
             CUS_INITIAL       Customer initial                  CHAR(1)        X
             CUS_RENEW_DATE    Customer insurance renewal date   DATE           dd-mmm-yyyy
             AGENT_CODE        Agent code                        CHAR(3)        999            100-999                        FK         AGENT
AGENT        AGENT_CODE        Agent code                        CHAR(3)        999                                Y          PK
             AGENT_AREACODE    Agent area code                   CHAR(4)        999                                Y
             AGENT_PHONE       Agent telephone number            CHAR(14)       999-9999                           Y
             AGENT_LNAME       Agent last name                   VARCHAR2(20)   Xxxxxxxx                           Y
             AGENT_YTD_SLS     Agent year-to-date sales          NUMBER(9,2)    9 999 999.99   0.00-9 999 999.99   Y

FK = Foreign key
PK = Primary key
CHAR = Fixed character length data (1-255 characters)
VARCHAR2 = Variable character length data (1-4 000 characters)
NUMBER = Numeric data (NUMBER(9,2) is used to specify numbers with two decimal places and up to nine digits, including the decimal places. Some RDBMSs permit the use of a MONEY or CURRENCY data type.)

NOTE Telephone area codes are always composed of digits 0-9. Because area codes are not used arithmetically, they are most efficiently stored as character data. Also, the area codes are always composed of a maximum of four digits. Therefore, the area code data type is defined as CHAR(4). On the other hand, names do not conform to a standard length. Therefore, the customer first names are defined as VARCHAR2(20), thus indicating that up to 20 characters may be used to store the names. Character data are shown as left-justified.

NOTE The data dictionary in Table 3.6 is an example of the human view of the entities, attributes and relationships. The purpose of this data dictionary is to ensure that all members of database design and implementation teams use the same table and attribute names and characteristics. The DBMS's internally stored data dictionary contains additional information about relationship types, entity and referential integrity checks and enforcement, and index types and components. This additional information is generated during the database implementation stage.
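Internally stored metadata of this kind can usually be queried like ordinary data. As one illustration, the sketch below uses Python's standard sqlite3 module, where the catalogue is exposed through the sqlite_master table and the table_info pragma; other RDBMSs expose their catalogues under different names. The two-column AGENT table is a cut-down, hypothetical version of the one in Table 3.6.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE AGENT ("
    "AGENT_CODE CHAR(3) PRIMARY KEY, "
    "AGENT_LNAME VARCHAR(20) NOT NULL)"
)

# sqlite_master holds one row per table, index, view and trigger
table_names = [
    row[0]
    for row in conn.execute("SELECT name FROM sqlite_master WHERE type = 'table'")
]

# table_info returns (cid, name, type, notnull, dflt_value, pk) for each column
columns = [
    (name, col_type, bool(notnull), bool(pk))
    for _cid, name, col_type, notnull, _default, pk in conn.execute(
        "PRAGMA table_info(AGENT)"
    )
]

print(table_names)
for column in columns:
    print(column)
```

The output lists each attribute name with its declared type, whether it is required and whether it belongs to the primary key, which is much of what the hand-written dictionary in Table 3.6 records.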

The data dictionary is sometimes described as 'the database designer's database' because it records the design decisions about tables and their structures.

Like the data dictionary, the system catalogue contains metadata. The system catalogue can be described as a detailed system data dictionary that describes all objects within the database, including data about table names, the table's creator and creation date, the number of columns in each table, the data type corresponding to each column, index filenames, index creators, authorised users and access privileges. Since the system catalogue contains all required data dictionary information, the terms system catalogue and data dictionary are often used interchangeably. In fact, current relational database software generally provides only a system catalogue, from which the designer's data dictionary information may be derived. The system catalogue is actually a system-created database whose tables store the user/designer-created database characteristics and contents. Therefore, the system catalogue tables can be queried just like any user/designer-created table.

In effect, the system catalogue automatically produces database documentation. As new tables are added to the database, that documentation also allows the RDBMS to check for and eliminate homonyms and synonyms. In general terms, homonyms are similar-sounding words with different meanings, such as sun and son, or identically spelled words with different meanings, such as fair (meaning just) and fair (meaning festival). In a database context, the word homonym indicates the use of the same attribute name to label different attributes. For example, you might use C_NAME to label a customer name attribute in a CUSTOMER table and also use C_NAME to label a consultant name attribute in a CONSULTANT table. To lessen confusion, you should avoid database homonyms; the data dictionary is very useful in this regard.

In a database context, a synonym is the opposite of a homonym and indicates the use of different names to describe the same attribute. For example, car and auto refer to the same object. Synonyms must be avoided. You will discover why using synonyms is a bad idea when you work through Problem 33 at the end of the chapter.

3.5 RELATIONSHIPS WITHIN THE RELATIONAL DATABASE

You already know that relationships are classified as one-to-one (1:1), one-to-many (1:*), and many-to-many (*:*). This section explores those relationships further, to help you apply them properly when you start developing database designs, focusing on the following points:

The 1:* relationship is the relational modelling ideal. Therefore, this relationship type should be the norm in any relational database design.

The 1:1 relationship should be rare in any relational database design.

*:* relationships cannot be implemented as such in the relational model. Later in this section, you will see how any *:* relationship can be changed into two 1:* relationships.

NOTE The UML class diagram represents relationships as associations among objects and can use the multiplicity element to represent *:* relationships directly. However, you will also learn how an association class is used to represent a *:* association between two classes in Chapter 5, Data Modelling with Entity Relationship Diagrams.

3.5.1 The 1:* Relationship

The 1:* relationship is the relational database norm. To see how such a relationship is modelled and implemented, consider the PAINTER paints PAINTING example that was used in Chapter 2. Compare the data model in Figure 3.7 with its implementation in Figure 3.8.

FIGURE 3.7 The 1:* relationship between PAINTER and PAINTING

As you examine the PAINTER and PAINTING table contents in Figure 3.8, note the following features:

Each painting is painted by one and only one painter, but each painter could have painted many paintings. Note that painter 123 (Onele P. Najeke) has three paintings stored in the PAINTING table.

There is only one row in the PAINTER table for any given row in the PAINTING table, but there may be many rows in the PAINTING table for any given row in the PAINTER table.

FIGURE 3.8 The implemented 1:* relationship between PAINTER and PAINTING

Database name: Ch03_Museum

Table name: PAINTER
Primary key: PAINTER_NUM
Foreign key: none

PAINTER_NUM  PAINTER_LNAME  PAINTER_FNAME  PAINTER_INITIAL
123          Najeke         Onele          P
126          Itero          Julio          G

Table name: PAINTING
Primary key: PAINTING_NUM
Foreign key: PAINTER_NUM

PAINTING_NUM  PAINTING_TITLE            PAINTER_NUM
1338          Dawn Thunder              123
1339          Vanilla Roses To Nowhere  123
1340          Tired Flounders           126
1341          Hasty Exit                123
1342          Plastic Paradise          126

As we are using the UML notation, it is worth pointing out some of the different terminology that you may see when representing relationships amongst entities. In UML, relationships are also known as associations among entities. Associations have several characteristics:

Association name. Each association has a name. Normally, the name of the association is written over the association line. In the example shown in Figure 3.7, the association name paints is displayed.

Association direction. Associations also have a direction, represented by an arrow (→) pointing to the direction in which the relationship flows. In this example the arrow is pointing towards the PAINTING entity.

Role name. The participating entities in the relationship can alternatively have role names instead of an association name. A role name expresses the role played by a given entity (class) in the relationship. Figure 3.7 does not show role names for the two entities, as the association name is shown instead. The role names would be paints and is_painted_by. For example: a PAINTER paints a PAINTING, and each PAINTING is_painted_by a PAINTER. As we are concentrating in this book on modelling relational concepts, we shall not use role names in any relationships.

Multiplicity. Multiplicity refers to the number of instances of one entity (class) that are associated with one instance of a related entity (class). Multiplicity in the UML model provides the same information as the connectivity, cardinality and relationship participation constructs in the ER model. For example: one (and only one) PAINTER generates one to many PAINTINGs, and one PAINTING belongs to one and only one PAINTER.

NOTE
The one-to-many (1:*) relationship is easily implemented in the relational model by putting the primary key of the 1 side in the table of the many side as a foreign key.

The 1:* relationship is found in any database environment. Students in a typical college or university will discover that each COURSE can generate many CLASSes but that each CLASS refers to only one COURSE. For example, an Accounting II course might yield two classes: one offered on Mondays, Wednesdays and Fridays (MWF) from 10:00 a.m. to 10:50 a.m. and one offered on Thursdays (Th) from 6:00 p.m. to 8:40 p.m. Therefore, the 1:* relationship between COURSE and CLASS might be described this way:

Each COURSE can have many CLASSes, but each CLASS references only one COURSE.

There will be only one row in the COURSE table for any given row in the CLASS table, but there can be many rows in the CLASS table for any given row in the COURSE table.

Figure 3.9 maps the ERM (Entity Relationship Model) for the 1:* relationship between COURSE and CLASS.

FIGURE 3.9 The 1:* relationship between COURSE and CLASS

The 1:* relationship between COURSE and CLASS is further illustrated in Figure 3.10.

FIGURE 3.10 The implemented 1:* relationship between COURSE and CLASS

Database name: Ch03_TinyUniversity

Table name: COURSE
Primary key: CRS_CODE
Foreign key: none

CRS_CODE  DEPT_CODE  CRS_DESCRIPTION                     CRS_CREDIT
ACCT-211  ACCT       Accounting I                        3
ACCT-212  ACCT       Accounting II                       3
CIS-220   CIS        Introduction to Computer Science    3
CIS-420   CIS        Database Design and Implementation  4
QM-261    CIS        Introduction to Statistics          3
QM-362    CIS        Statistical Applications            4

Table name: CLASS
Primary key: CLASS_CODE
Foreign key: CRS_CODE

CLASS_CODE  CRS_CODE  CLASS_SECTION  CLASS_TIME            CLASS_ROOM  LECT_NUM
10012       ACCT-211  1              MWF 8:00-8:50 a.m.    BUS311      105
10013       ACCT-211  2              MWF 9:00-9:50 a.m.    BUS200      105
10014       ACCT-211  3              TTh 2:30-3:45 p.m.    BUS252      342
10015       ACCT-212  1              MWF 10:00-10:50 a.m.  BUS311      301
10016       ACCT-212  2              Th 6:00-8:40 p.m.     BUS252      301
10017       CIS-220   1              MWF 9:00-9:50 a.m.    KLR209      228
10018       CIS-220   2              MWF 9:00-9:50 a.m.    KLR211      114
10019       CIS-220   3              MWF 10:00-10:50 a.m.  KLR209      228
10020       CIS-420   1              W 6:00-8:40 p.m.      KLR209      162
10021       QM-261    1              MWF 8:00-8:50 a.m.    KLR200      114
10022       QM-261    2              TTh 1:00-2:15 p.m.    KLR200      114
10023       QM-362    1              MWF 11:00-11:50 a.m.  KLR200      162
10024       QM-362    2              TTh 2:30-3:45 p.m.    KLR200      162

Using Figure 3.10, take a minute to review some important terminology. Note that CLASS_CODE in the CLASS table uniquely identifies each row. Therefore, CLASS_CODE has been chosen to be the primary key. However, the combination CRS_CODE and CLASS_SECTION will also uniquely identify each row in the CLASS table. In other words, the composite key composed of CRS_CODE and CLASS_SECTION is a candidate key. Any candidate key must have the not null and unique constraints enforced. (You will see how this is done when you learn SQL in Chapter 8.)

For example, note in Figure 3.8 that the PAINTER table's primary key, PAINTER_NUM, is included in the PAINTING table as a foreign key. Similarly, in Figure 3.10, the COURSE table's primary key, CRS_CODE, is included in the CLASS table as a foreign key.
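The key terminology above can be tried out in a few lines. The following is a minimal sketch, assuming Python's standard sqlite3 module and an in-memory database rather than the book's own Ch03_TinyUniversity file; only a subset of the Figure 3.10 columns is used, and the helper name rejected and the CLASS codes 10098 and 10099 are hypothetical. It declares CLASS_CODE as the primary key, the CRS_CODE and CLASS_SECTION combination as a candidate key (not null and unique), and CRS_CODE as the foreign key on the many side.

```python
import sqlite3

# In-memory database; table and column names follow Figure 3.10.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves FK checks off by default

conn.executescript("""
CREATE TABLE COURSE (
    CRS_CODE        TEXT PRIMARY KEY,
    CRS_DESCRIPTION TEXT NOT NULL
);
CREATE TABLE CLASS (
    CLASS_CODE    INTEGER PRIMARY KEY,                        -- chosen primary key
    CRS_CODE      TEXT NOT NULL REFERENCES COURSE(CRS_CODE),  -- foreign key (many side)
    CLASS_SECTION INTEGER NOT NULL,
    UNIQUE (CRS_CODE, CLASS_SECTION)                          -- candidate key
);
""")

conn.execute("INSERT INTO COURSE VALUES ('ACCT-212', 'Accounting II')")
conn.executemany("INSERT INTO CLASS VALUES (?, ?, ?)",
                 [(10015, "ACCT-212", 1), (10016, "ACCT-212", 2)])


def rejected(sql, params):
    """Return True if the DBMS refuses the statement on integrity grounds."""
    try:
        conn.execute(sql, params)
        return False
    except sqlite3.IntegrityError:
        return True


insert_class = "INSERT INTO CLASS VALUES (?, ?, ?)"
# Same course and section as class 10016: violates the candidate key.
duplicate_section = rejected(insert_class, (10099, "ACCT-212", 2))
# Course code with no matching COURSE row: violates the foreign key.
orphan_class = rejected(insert_class, (10098, "NO-SUCH", 1))

classes_per_course = conn.execute(
    "SELECT COUNT(*) FROM CLASS WHERE CRS_CODE = 'ACCT-212'").fetchone()[0]
print(duplicate_section, orphan_class, classes_per_course)
```

Both rejected inserts leave the table unchanged, so the one COURSE row still corresponds to exactly two CLASS rows. Other DBMSs enforce foreign keys without the PRAGMA step; that line is specific to SQLite.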

3.5.2 The 1:1 Relationship

As the 1:1 label implies, in this relationship, one entity can be related to only one other entity, and vice versa. For example, one department chair (a lecturer) can chair only one department and one department can have only one department chair. The entities LECTURER and DEPARTMENT thus exhibit a 1:1 relationship. (You might argue that not all lecturers chair a department and lecturers cannot be required to chair a department. That is, the relationship between the two entities is optional. However, at this stage of the discussion, you should focus your attention on the basic 1:1 relationship. Optional relationships will be addressed in Chapter 5.) The basic 1:1 relationship is modelled in Figure 3.11, and its implementation is shown in Figure 3.12.

FIGURE 3.11 The 1:1 relationship between LECTURER and DEPARTMENT

As you examine the tables in Figure 3.12, note that there are several important features:

Each lecturer is a Tiny University employee. Therefore, the lecturer identification is through the EMP_NUM. (However, note that not all employees are lecturers; there's another optional relationship.)

The 1:1 LECTURER chairs DEPARTMENT relationship is implemented by having the EMP_NUM as a foreign key in the DEPARTMENT table. Note that the 1:1 relationship is treated as a special case of the 1:* relationship in which the many side is restricted to a single occurrence. In this case, DEPARTMENT contains the EMP_NUM as a foreign key to indicate that it is the department that has a chair.

Also note that the LECTURER table contains the DEPT_CODE foreign key to implement the 1:* DEPARTMENT employs LECTURER relationship. This is a good example of how two entities can participate in two (or even more) relationships simultaneously.
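One way to picture "a 1:* relationship in which the many side is restricted to a single occurrence" is a foreign key that also carries a unique constraint. The sketch below is an illustration under that assumption, not the book's own implementation; it uses Python's standard sqlite3 module, a reduced set of columns, and three lecturer rows taken from Figure 3.12. The variable names second_chair_allowed and acct_staff are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # enable FK enforcement in SQLite
conn.executescript("""
CREATE TABLE DEPARTMENT (
    DEPT_CODE TEXT PRIMARY KEY,
    DEPT_NAME TEXT NOT NULL,
    -- 1:1 chairs: a foreign key that is also UNIQUE, so a lecturer can
    -- appear as chair at most once. NULL means "no chair recorded yet".
    EMP_NUM   INTEGER UNIQUE REFERENCES LECTURER(EMP_NUM)
);
CREATE TABLE LECTURER (
    EMP_NUM   INTEGER PRIMARY KEY,
    -- 1:* employs: many lecturers may share one DEPT_CODE.
    DEPT_CODE TEXT NOT NULL REFERENCES DEPARTMENT(DEPT_CODE)
);
""")

# The two tables reference each other, so departments are loaded first
# without a chair, then lecturers, then the chairs are filled in.
conn.executemany("INSERT INTO DEPARTMENT VALUES (?, ?, NULL)",
                 [("ACCT", "Accounting"), ("CIS", "Computer Info. Systems")])
conn.executemany("INSERT INTO LECTURER VALUES (?, ?)",
                 [(105, "ACCT"), (114, "ACCT"), (209, "CIS")])
conn.execute("UPDATE DEPARTMENT SET EMP_NUM = 114 WHERE DEPT_CODE = 'ACCT'")
conn.execute("UPDATE DEPARTMENT SET EMP_NUM = 209 WHERE DEPT_CODE = 'CIS'")

# Lecturer 114 already chairs ACCT; making 114 chair CIS as well must fail.
try:
    conn.execute("UPDATE DEPARTMENT SET EMP_NUM = 114 WHERE DEPT_CODE = 'CIS'")
    second_chair_allowed = True
except sqlite3.IntegrityError:
    second_chair_allowed = False

acct_staff = conn.execute(
    "SELECT COUNT(*) FROM LECTURER WHERE DEPT_CODE = 'ACCT'").fetchone()[0]
print(second_chair_allowed, acct_staff)
```

The same pair of tables therefore carries both relationships at once: the unique EMP_NUM in DEPARTMENT for chairs, and the repeatable DEPT_CODE in LECTURER for employs.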

Online Content
If you open the 'Ch03_TinyUniversity' database available on the online platform accompanying this book you'll see that the STUDENT and CLASS entities still use LECT_NUM as their foreign key. LECT_NUM and EMP_NUM are labels for the same attribute, which is an example of the use of synonyms or different names for the same attribute.

FIGURE 3.12 The implemented 1:1 relationship between LECTURER and DEPARTMENT

Database name: Ch03_TinyUniversity

Table name: LECTURER
Primary key: EMP_NUM
Foreign key: DEPT_CODE

EMP_NUM  DEPT_CODE  LECT_OFFICE  LECT_EXTENSION  LECT_HIGH_DEGREE
103      HIST       DRE 156      6783            PhD
104      ENG        DRE 102      5561            MA
105      ACCT       KLR 229D     8665            PhD
106      MKT/MGT    KLR 126      3899            PhD
110      BIOL       AAK 160      3412            PhD
114      ACCT       KLR 211      4436            PhD
155      MATH       AAK 201      4440            PhD
160      ENG        DRE 102      2248            PhD
162      CIS        KLR 203E     2359            PhD
191      MKT/MGT    KLR 409B     4016            DBA
195      PSYCH      AAK 297      3550            PhD
209      CIS        KLR 333      3421            PhD
228      CIS        KLR 300      3000            PhD
297      MATH       AAK 194      1145            PhD
299      ECON/FIN   KLR 284      2851            PhD
301      ACCT       KLR 244      4683            PhD
335      ENG        DRE 208      2000            PhD
342      SOC        BBG 208      5514            PhD
387      BIOL       AAK 230      8665            PhD
401      HIST       DRE 156      6783            MA
425      ECON/FIN   KLR 284      2851            MBA
435      ART        BBG 185      2278            PhD

The 1:* DEPARTMENT employs LECTURER relationship is implemented through the placement of the DEPT_CODE foreign key in the LECTURER table.

The 1:1 LECTURER chairs DEPARTMENT relationship is implemented through the placement of the EMP_NUM foreign key in the DEPARTMENT table.

Table name: DEPARTMENT
Primary key: DEPT_CODE
Foreign key: EMP_NUM

DEPT_CODE  DEPT_NAME               SCHOOL_CODE  EMP_NUM  DEPT_ADDRESS      DEPT_EXTENSION
ACCT       Accounting              BUS          114      KLR 211, Box 52   3119
ART        Fine Arts               A&SCI        435      BBG 185, Box 128  2278
BIOL       Biology                 A&SCI        387      AAK 230, Box 415  4117
CIS        Computer Info. Systems  BUS          209      KLR 333, Box 56   3245
ECON/FIN   Economics/Finance       BUS          299      KLR 284, Box 63   3126
ENG        English                 A&SCI        160      DRE 102, Box 223  1004
HIST       History                 A&SCI        103      DRE 156, Box 284  1867
MATH       Mathematics             A&SCI        297      AAK 194, Box 422  4234
MKT/MGT    Marketing/Management    BUS          106      KLR 126, Box 55   3342
PSYCH      Psychology              A&SCI        195      AAK 297, Box 438  4110
SOC        Sociology               A&SCI        342      BBG 208, Box 132  2008

The preceding LECTURER chairs DEPARTMENT example illustrates a proper 1:1 relationship. In fact, the use of a 1:1 relationship ensures that two entity sets are not placed in the same table when they should not be. However, the existence of a 1:1 relationship sometimes means that the entity components were not defined properly. It could indicate that the two entities actually belong in the same table!

As rare as 1:1 relationships should be, certain conditions absolutely require their use. For example, suppose you manage the database for a company that employs pilots, accountants, mechanics, clerks, salespeople, service personnel and more. Pilots have many attributes that the other employees don't have, such as licences, medical certificates, flight experience records, dates of flight proficiency checks and proof of required periodic medical checks. If you put all of the pilot-specific attributes in the EMPLOYEE table, you will have several nulls in that table for all employees who are not pilots. To avoid the proliferation of nulls, it is better to split the pilot attributes into a separate table (PILOT) that is linked to the EMPLOYEE table in a 1:1 relationship. Since pilots have many attributes that are shared by all employees, such as name, date of birth and date of first employment, those attributes would be stored in the EMPLOYEE table.

Online Content
If you look at the 'Ch03_AviaCo' database on the online platform for this book, you will see the implementation of the 1:1 PILOT to EMPLOYEE relationship. This type of relationship will be examined in detail in Chapter 6, Data Modelling Advanced Concepts.
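The EMPLOYEE and PILOT split can be sketched as follows. This is an illustration only, not the Ch03_AviaCo design: the column names EMP_LNAME and PIL_LICENCE and all three sample rows are hypothetical. The sketch assumes Python's standard sqlite3 module and makes PILOT's primary key double as its foreign key to EMPLOYEE, which is one common way to hold a 1:1 link.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # enable FK enforcement in SQLite
conn.executescript("""
CREATE TABLE EMPLOYEE (
    EMP_NUM   INTEGER PRIMARY KEY,
    EMP_LNAME TEXT NOT NULL              -- shared by all employees
);
CREATE TABLE PILOT (
    -- Primary key and foreign key at once: at most one PILOT row
    -- per EMPLOYEE row, and none for employees who are not pilots.
    EMP_NUM     INTEGER PRIMARY KEY REFERENCES EMPLOYEE(EMP_NUM),
    PIL_LICENCE TEXT NOT NULL            -- pilot-specific attribute
);
""")

# Hypothetical sample rows: one pilot and two non-pilots.
conn.executemany("INSERT INTO EMPLOYEE VALUES (?, ?)",
                 [(1, "Pilot A"), (2, "Clerk B"), (3, "Mechanic C")])
conn.execute("INSERT INTO PILOT VALUES (1, 'ATP')")

# Nulls appear only in the query result, not in any stored table.
rows = conn.execute("""
    SELECT E.EMP_NUM, E.EMP_LNAME, P.PIL_LICENCE
    FROM EMPLOYEE AS E
    LEFT JOIN PILOT AS P ON P.EMP_NUM = E.EMP_NUM
    ORDER BY E.EMP_NUM
""").fetchall()
pilot_rows = conn.execute("SELECT COUNT(*) FROM PILOT").fetchone()[0]
print(rows, pilot_rows)
```

Had PIL_LICENCE lived in EMPLOYEE, two of the three stored rows would carry a null; with the split, PILOT stores a single row and EMPLOYEE stores no nulls.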

3.5.3 The *:* Relationship

A many-to-many (*:*) relationship is a more troublesome proposition in the relational environment. Traditionally in data modelling the *:* relationship can be implemented by breaking it up to produce a set of 1:* relationships.

To explore the many-to-many (*:*) relationship, consider a rather typical college environment in which each STUDENT can take many CLASSes and each CLASS can contain many STUDENTs. The ERD model in Figure 3.13 shows this *:* relationship.

FIGURE 3.13 The *:* relationship between STUDENT and CLASS

Note the features of the ERD in Figure 3.13:

Each CLASS can have many STUDENTs, and each STUDENT can take many CLASSes.

There can be many rows in the CLASS table for any given row in the STUDENT table, and there can be many rows in the STUDENT table for any given row in the CLASS table.

To examine the *:* relationship more closely, imagine a small university with two students, each of whom takes three classes. Table 3.7 shows the enrolment data for the two students.

TABLE 3.7 Sample student enrolment data

Student's Last Name  Selected Classes
Ndlovu               Accounting 1, ACCT-211, code 10014
                     Intro to Computer Science, CIS-220, code 10018
                     Intro to Statistics, QM-261, code 10021
Smithson             Accounting 1, ACCT-211, code 10014
                     Intro to Computer Science, CIS-220, code 10018
                     Intro to Statistics, QM-261, code 10021

Although the *:* relationship is logically reflected in Figure 3.13, it should not be implemented as shown in Figure 3.14 for two good reasons:

The tables create many redundancies. For example, note that the STU_NUM values occur many times in the STUDENT table. In a real-world situation, additional student attributes such as address, classification, major and home phone would also be contained in the STUDENT table, and each of those attribute values would be repeated in each of the records shown here. Similarly, the CLASS table contains many duplications: each student taking the class generates a CLASS record. The problem would be even worse if the CLASS table included such attributes as credit hours and course description. Those redundancies lead to the anomalies discussed in Chapter 1.

Given the structure and contents of the two tables, the relational operations become very complex and are likely to lead to system efficiency errors and output errors.

FIGURE 3.14 The *:* relationship between STUDENT and CLASS

Database name: Ch03_CollegeTry

Table name: STUDENT
Primary key: STU_NUM
Foreign key: none

STU_NUM  STU_LNAME  CLASS_CODE
321452   Ndlovu     10014
321452   Ndlovu     10018
321452   Ndlovu     10021
324257   Smithson   10014
324257   Smithson   10018
324257   Smithson   10021

Table name: CLASS
Primary key: CLASS_CODE
Foreign key: STU_NUM

CLASS_CODE  STU_NUM  CRS_CODE  CLASS_SECTION  CLASS_TIME          CLASS_ROOM  PROF_NUM
10014       321452   ACCT-211  3              TTh 2:30-3:45 p.m.  BUS252      342
10014       324257   ACCT-211  3              TTh 2:30-3:45 p.m.  BUS252      342
10018       321452   CIS-220   2              MWF 9:00-9:50 a.m.  KLR211      114
10018       324257   CIS-220   2              MWF 9:00-9:50 a.m.  KLR211      114
10021       321452   QM-261    1              MWF 8:00-8:50 a.m.  KLR200      114
10021       324257   QM-261    1              MWF 8:00-8:50 a.m.  KLR200      114

Fortunately, the problems inherent in the many-to-many (*:*) relationship can easily be avoided by creating a composite entity or bridge entity. Because such a table is used to link the tables that originally were related in a *:* relationship, the composite entity structure includes as foreign keys at least the primary keys of the tables that are to be linked. The database designer has two main options when defining a composite table's primary key: use the combination of those foreign keys or create a new primary key.

NOTE
In UML class diagrams, the multiplicity element can represent *:* relationships directly. Instead of using a composite entity, an association class is used to represent the association between two entities. We will explore the concept of an association class further in Chapter 5, Data Modelling with Entity Relationship Diagrams.

Remember that each entity in the ERD is represented by a table. Therefore, you can create the composite ENROL table shown in Figure 3.15 to link the tables CLASS and STUDENT. In this example, the ENROL table's primary key is the combination of its foreign keys CLASS_CODE and STU_NUM. But the designer could have decided to create a single-attribute new primary key such as ENROL_LINE, using a different line value to identify each ENROL table row uniquely. (Microsoft Access users might use the Autonumber data type to generate such line values automatically.)

FIGURE 3.15 Converting the *:* relationship into two 1:* relationships

Database name: Ch03_CollegeTry2

Table name: STUDENT
Primary key: STU_NUM
Foreign key: none

STU_NUM  STU_LNAME
321452   Ndlovu
324257   Smithson

Table name: ENROL
Primary key: CLASS_CODE + STU_NUM
Foreign keys: CLASS_CODE, STU_NUM

CLASS_CODE  STU_NUM  ENROLL_GRADE
10014       321452   C
10014       324257   B
10018       321452   A
10018       324257   B
10021       321452   C
10021       324257   C

Table name: CLASS
Primary key: CLASS_CODE
Foreign key: CRS_CODE

CLASS_CODE  CRS_CODE  CLASS_SECTION  CLASS_TIME          CLASS_ROOM  PROF_NUM
10014       ACCT-211  3              TTh 2:30-3:45 p.m.  BUS252      342
10018       CIS-220   2              MWF 9:00-9:50 a.m.  KLR211      114
10021       QM-261    1              MWF 8:00-8:50 a.m.  KLR200      114

Because the ENROL table in Figure 3.15 links two tables, STUDENT and CLASS, it is also called a linking table. In other words, a linking table is the implementation of a composite entity.
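The linking table can be exercised directly. The following is a minimal sketch, assuming Python's standard sqlite3 module and an in-memory database; it loads the STUDENT, CLASS and ENROL rows of Figure 3.15 (CLASS is cut down to two columns for brevity) and then walks the two 1:* relationships in both directions. The student number 999999 and the result variable names are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # enable FK enforcement in SQLite
conn.executescript("""
CREATE TABLE STUDENT (STU_NUM INTEGER PRIMARY KEY, STU_LNAME TEXT NOT NULL);
CREATE TABLE CLASS   (CLASS_CODE INTEGER PRIMARY KEY, CRS_CODE TEXT NOT NULL);
CREATE TABLE ENROL (
    CLASS_CODE   INTEGER NOT NULL REFERENCES CLASS(CLASS_CODE),
    STU_NUM      INTEGER NOT NULL REFERENCES STUDENT(STU_NUM),
    ENROLL_GRADE TEXT,
    PRIMARY KEY (CLASS_CODE, STU_NUM)    -- combination of the foreign keys
);
""")
conn.executemany("INSERT INTO STUDENT VALUES (?, ?)",
                 [(321452, "Ndlovu"), (324257, "Smithson")])
conn.executemany("INSERT INTO CLASS VALUES (?, ?)",
                 [(10014, "ACCT-211"), (10018, "CIS-220"), (10021, "QM-261")])
conn.executemany("INSERT INTO ENROL VALUES (?, ?, ?)", [
    (10014, 321452, "C"), (10014, 324257, "B"),
    (10018, 321452, "A"), (10018, 324257, "B"),
    (10021, 321452, "C"), (10021, 324257, "C"),
])

# One student, many classes (STUDENT 1:* ENROL, then on to CLASS).
ndlovu_classes = [row[0] for row in conn.execute("""
    SELECT C.CRS_CODE
    FROM ENROL AS E JOIN CLASS AS C ON C.CLASS_CODE = E.CLASS_CODE
    WHERE E.STU_NUM = 321452
    ORDER BY C.CLASS_CODE""")]

# One class, many students (CLASS 1:* ENROL, then on to STUDENT).
students_in_10018 = [row[0] for row in conn.execute("""
    SELECT S.STU_LNAME
    FROM ENROL AS E JOIN STUDENT AS S ON S.STU_NUM = E.STU_NUM
    WHERE E.CLASS_CODE = 10018
    ORDER BY S.STU_NUM""")]

# An enrolment for a student who does not exist must be refused.
try:
    conn.execute("INSERT INTO ENROL VALUES (10014, 999999, 'A')")
    orphan_accepted = True
except sqlite3.IntegrityError:
    orphan_accepted = False

print(ndlovu_classes, students_in_10018, orphan_accepted)
```

STUDENT and CLASS each hold one row per entity, while the repeated key values live only in ENROL, where the foreign key declarations keep them consistent.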

NOTE
In addition to the linking attributes, the composite ENROL table can also contain such relevant attributes as the grade earned in the course. In fact, a composite table can contain any number of attributes that the designer wants to track. Keep in mind that the composite entity, although it is implemented as an actual table, is conceptually a logical entity that was created as a means to an end: to eliminate the potential for multiple redundancies in the original *:* relationship.

The linking table (ENROL) shown in Figure 3.15 yields the required *:* to 1:* conversion. Observe that the composite entity represented by the ENROL table must contain at least the primary keys of the CLASS and STUDENT tables (CLASS_CODE and STU_NUM, respectively) for which it serves as a connector. Also note that the STUDENT and CLASS tables now contain only one row per entity. The linking ENROL table contains multiple occurrences of the foreign key values, but those controlled redundancies are incapable of producing anomalies as long as referential integrity is enforced. Additional attributes may be assigned as needed. In this case, ENROL_GRADE is selected to satisfy a reporting requirement. Also note that the ENROL table's primary key consists of the two attributes CLASS_CODE and STU_NUM, because both the class code and the student number are needed to define a particular student's grade. Naturally, the conversion is reflected in the ERM, too. The revised relationship is shown in Figure 3.16.

FIGURE 3.16 Changing the *:* relationship to two 1:* relationships

As you examine Figure 3.16, note that the composite entity named ENROL represents the linking table between STUDENT and CLASS.

The 1:* relationship between COURSE and CLASS was first illustrated in Figure 3.9 and Figure 3.10. With the help of this relationship, you can increase the amount of available information, even as you control the database's redundancies. Thus, Figure 3.16 can be expanded to include the 1:* relationship between COURSE and CLASS shown in Figure 3.17. Note that the model is able to handle multiple sections of a CLASS while controlling redundancies by making sure that all of the COURSE data common to each CLASS are kept in the COURSE table.

FIGURE 3.17 The expanded entity relationship model

[Class diagram: COURSE (1..1) has (1..*) CLASS; CLASS (1..1) shows_in (1..*) ENROL; STUDENT (1..1) registers (1..*) ENROL.]

The ERD will be examined in greater detail in Chapter 5 to show you how it is used to design more complex databases. The ERD will also be used as the basis for the development and implementation of a realistic database design in Appendices B and C (see the online platform for this book).

3.6 DATA REDUNDANCY REVISITED

In Chapter 1 you learnt that data redundancy leads to data anomalies. Those anomalies can destroy the effectiveness of the database. You also learnt that the relational database makes it possible to control data redundancies by using common attributes that are shared by tables, called foreign keys.

The proper use of foreign keys is crucial to exercising data redundancy control. However, it is worth emphasising that, in the strictest sense, the use of foreign keys does not eliminate data redundancies, because foreign key values can be repeated many times. Nevertheless, the proper use of foreign keys minimises data redundancies, thus minimising the chance that destructive data anomalies will develop.

NOTE
The real test of redundancy is not how many copies of a given attribute are stored, but whether the elimination of an attribute will eliminate information. Therefore, if you delete an attribute and the original information can still be generated through relational algebra, the inclusion of that attribute would be redundant. Given that view of redundancy, proper foreign keys are clearly not redundant in spite of their multiple occurrences in a table. However, even when you use this less restrictive view of redundancy, keep in mind that controlled redundancies are often designed as part of the system to ensure transaction speed and/or information requirements. Exclusive reliance on relational algebra to produce required information may lead to elegant designs that fail the test of practicality.

You will learn in Chapter 5 that database designers must reconcile three often contradictory requirements: design elegance, processing speed and information requirements. And you will learn in Chapter 15, Databases for Business Intelligence, that proper data warehousing design requires carefully defined and controlled data redundancies to function properly. Regardless of how you describe data redundancies, the potential for damage is limited by proper implementation and careful control.

As important as data redundancy control is, there are times when the level of data redundancy must actually be increased to make the database serve crucial information purposes. You will learn about such redundancies in Chapter 15. There are also times when data redundancies seem to exist to preserve the historical accuracy of data. For example, consider a small invoicing system. The system includes the CUSTOMER, who may buy one or more PRODUCTs, thus generating an INVOICE. Because a customer may buy more than one product at a time, an invoice may contain several invoice LINEs, each providing details about the purchased product. The PRODUCT table should contain the product price to provide a consistent pricing input for each product that appears on the invoice. The tables that are part of such a system are shown in Figure 3.18. The system's ERD is shown in Figure 3.19.

CHAPTER

FIGURE 3.18 Database

Asmall invoicing

name:

3 Relational

Model

Characteristics

system Table

Ch03_SaleCo

name:

Foreign

Primary key: CUS_CODE

CUSTOMER

key: none

CUS_CODE

CUS_LNAME

CUS_FNAME

CUS_INITIAL

CUS_AREACODE

CUS_PHONE

10010

Ramas

Alfred

A

0181

844-2573

10011

Dunne

Leona

K

0161

894-1238

10012

Du Toit

0181

894-2285

10013

Pieterse

0181

894-2180

10014

Orlando

0181

222-1672

10015

OBrian

Amy

B

0161

442-3381

10016

Brown

James

G

0181

297-1228

0181

290-2556

10017

Marlene

George

Moloi

10019

Table

F

Myron

Padayachee

10018

W

Jaco

Williams

99

Vinaya

G

0161

382-7185

Mlilo

K

0181

297-3809

3

name: INVOICE

Foreign

Primary key: INV_NUMBER INV_NUMBER

key: CUS_CODE

CUS_CODE

INV_DATE

1001

10014

08-Dec-19

1002

10011

08-Dec-19

1003

10012

08-Dec-19

1004

10011

09-Dec-19

Table name: LINE
Primary key: INV_NUMBER + LINE_NUMBER
Foreign key: INV_NUMBER, PROD_CODE

INV_NUMBER  LINE_NUMBER  PROD_CODE  LINE_UNITS  LINE_PRICE
1001        1            123-21UUY  1           150.09
1001        2            SRE-657UG  3           2.36
1002        1            QER-34256  2           14.72
1003        1            ZZX/3245Q  1           5.36
1003        2            SRE-657UG  1           2.36
1003        3            001278-AB  1           10.23
1004        1            001278-AB  1           10.23
1004        2            SRE-657UG  2           2.36

Table name: PRODUCT
Primary key: PROD_CODE
Foreign key: none

PROD_CODE  PROD_DESCRIPT                   PROD_PRICE  PROD_ON_HAND  VEND_CODE
001278-AB  Claw hammer                     10.23       23            232
123-21UUY  Houselite chain saw, 16 cm bar  150.09      4             235
QER-34256  Sledge hammer, 16 kg head       14.72       6             231
SRE-657UG  Rat-tail file                   2.36        15            232
ZZX/3245Q  Steel tape, 12 m length         5.36        8             235

FIGURE 3.19 The Class ERD for the invoicing system
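The way the four tables of Figure 3.18 link together can be checked with a small script. What follows is a minimal sketch, assuming Python's built-in sqlite3 module as a stand-in DBMS; only the columns needed for the trace are loaded, the helper names billed_total and catalogue_total are hypothetical, and the price of 175.00 is an invented value used purely for illustration. It traces customer 10014 through INVOICE and LINE to PRODUCT, totals invoice 1001 from LINE_PRICE, and then shows what a later change to PROD_PRICE would do to a total calculated from the PRODUCT table instead.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite checks foreign keys only when asked
conn.executescript("""
CREATE TABLE CUSTOMER (CUS_CODE INTEGER PRIMARY KEY, CUS_LNAME TEXT);
CREATE TABLE PRODUCT  (PROD_CODE TEXT PRIMARY KEY, PROD_DESCRIPT TEXT, PROD_PRICE REAL);
CREATE TABLE INVOICE  (INV_NUMBER INTEGER PRIMARY KEY,
                       CUS_CODE INTEGER REFERENCES CUSTOMER (CUS_CODE),
                       INV_DATE TEXT);
CREATE TABLE LINE     (INV_NUMBER INTEGER REFERENCES INVOICE (INV_NUMBER),
                       LINE_NUMBER INTEGER,
                       PROD_CODE TEXT REFERENCES PRODUCT (PROD_CODE),
                       LINE_UNITS INTEGER,
                       LINE_PRICE REAL,
                       PRIMARY KEY (INV_NUMBER, LINE_NUMBER));
""")

# Data copied from Figure 3.18 (only the customers that appear on an invoice).
conn.executemany("INSERT INTO CUSTOMER VALUES (?, ?)",
                 [(10011, "Dunne"), (10012, "Du Toit"), (10014, "Orlando")])
conn.executemany("INSERT INTO PRODUCT VALUES (?, ?, ?)", [
    ("001278-AB", "Claw hammer", 10.23),
    ("123-21UUY", "Houselite chain saw, 16 cm bar", 150.09),
    ("QER-34256", "Sledge hammer, 16 kg head", 14.72),
    ("SRE-657UG", "Rat-tail file", 2.36),
    ("ZZX/3245Q", "Steel tape, 12 m length", 5.36),
])
conn.executemany("INSERT INTO INVOICE VALUES (?, ?, ?)", [
    (1001, 10014, "08-Dec-19"), (1002, 10011, "08-Dec-19"),
    (1003, 10012, "08-Dec-19"), (1004, 10011, "09-Dec-19"),
])
conn.executemany("INSERT INTO LINE VALUES (?, ?, ?, ?, ?)", [
    (1001, 1, "123-21UUY", 1, 150.09), (1001, 2, "SRE-657UG", 3, 2.36),
    (1002, 1, "QER-34256", 2, 14.72),  (1003, 1, "ZZX/3245Q", 1, 5.36),
    (1003, 2, "SRE-657UG", 1, 2.36),   (1003, 3, "001278-AB", 1, 10.23),
    (1004, 1, "001278-AB", 1, 10.23),  (1004, 2, "SRE-657UG", 2, 2.36),
])

# Trace CUSTOMER -> INVOICE -> LINE -> PRODUCT for customer 10014.
trace = conn.execute("""
    SELECT C.CUS_LNAME, I.INV_NUMBER, L.LINE_NUMBER, P.PROD_DESCRIPT, L.LINE_UNITS
    FROM CUSTOMER C
    JOIN INVOICE I ON I.CUS_CODE = C.CUS_CODE
    JOIN LINE L    ON L.INV_NUMBER = I.INV_NUMBER
    JOIN PRODUCT P ON P.PROD_CODE = L.PROD_CODE
    WHERE C.CUS_CODE = 10014
    ORDER BY I.INV_NUMBER, L.LINE_NUMBER
""").fetchall()


def billed_total(inv_number):
    """Invoice total (before taxes) from the price stored on each LINE row."""
    (total,) = conn.execute(
        "SELECT SUM(LINE_UNITS * LINE_PRICE) FROM LINE WHERE INV_NUMBER = ?",
        (inv_number,)).fetchone()
    return round(total, 2)


def catalogue_total(inv_number):
    """Same invoice, but priced from whatever PRODUCT.PROD_PRICE holds now."""
    (total,) = conn.execute("""
        SELECT SUM(L.LINE_UNITS * P.PROD_PRICE)
        FROM LINE L JOIN PRODUCT P ON P.PROD_CODE = L.PROD_CODE
        WHERE L.INV_NUMBER = ?""", (inv_number,)).fetchone()
    return round(total, 2)


total_at_sale = billed_total(1001)

# Hypothetical later price change for the chain saw (175.00 is an invented value).
conn.execute("UPDATE PRODUCT SET PROD_PRICE = 175.00 WHERE PROD_CODE = '123-21UUY'")

print(trace)
print(total_at_sale, billed_total(1001), catalogue_total(1001))
```

In this sketch the total taken from LINE is the same before and after the price change, whereas the total recalculated from PRODUCT drifts with the catalogue price. That is the practical reason for keeping a copy of the price on each invoice line.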

As you examine the tables in the invoicing system in Figure 3.18 and the relationships depicted in Figure 3.19, note that you can keep track of typical sales information. For example, by tracing the relationships among the four tables, you discover that customer 10014 (Myron Orlando) bought two items on 8 December 2019 that were written to invoice number 1001: one Houselite chain saw with a 16 cm bar and three rat-tail files. (Note: Trace the CUS_CODE number 10014 in the CUSTOMER table to the matching CUS_CODE value in the INVOICE table. Next, take the INV_NUMBER 1001 and trace it to the first two rows in the LINE table; then match the two PROD_CODE values in LINE with the PROD_CODE values in PRODUCT.) Application software will be used to write the correct bill by multiplying each invoice line item's LINE_UNITS by its LINE_PRICE, adding the results, applying appropriate taxes, etc. Later, other application software might use the same technique to write sales reports that track and compare sales by week, month or year.

As you examine the sales transactions in Figure 3.18, you might reasonably suppose that the product price billed to the customer is derived from the PRODUCT table because that's where the product

data are stored. But why does that same product price occur again in the LINE table? Isn't that a data redundancy? It certainly appears to be. But this time, the apparent redundancy is crucial to the system's success. Copying the product price from the PRODUCT table to the LINE table maintains the historical accuracy of the transactions. Suppose, for instance, that you fail to write the LINE_PRICE in the LINE table and that you use the PROD_PRICE from the PRODUCT table to calculate the sales revenue. Now suppose that the PROD_PRICE in the PRODUCT table changes. This price change will be properly reflected in all subsequent sales revenue calculations. Unfortunately, the calculations of past sales revenues will also reflect the new product price, which was not in effect when the transaction took place! As a result, the revenue calculations for all past transactions will be incorrect, thus eliminating the possibility of making proper sales comparisons over time. On the other hand, if the price data are copied from the PRODUCT table and stored with the transaction in the LINE table, that price will always accurately reflect the transaction that took place at that time. You will discover that such planned redundancies are common in good database design.

Finally, you might wonder why the LINE_NUMBER attribute was used in the LINE table in Figure 3.18. Wouldn't the combination of INV_NUMBER and PROD_CODE be a sufficient composite primary key and, therefore, isn't the LINE_NUMBER redundant? Yes, the LINE_NUMBER is redundant, but this redundancy is quite commonly created by invoicing software that generates such line numbers automatically. In this case, the redundancy is not necessary. But given its automatic generation, the redundancy is not a source of anomalies. The inclusion of LINE_NUMBER also adds another benefit: the order of the retrieved invoicing data will always match the order in which the data were entered. If product codes are used as part of the primary key, indexing will arrange those product codes as soon as the invoice is completed and the data are stored. You can imagine the potential confusion when a customer calls and says, 'The second item on my invoice has an incorrect price', and you are looking at an invoice whose lines show a different order from those on the customer's copy!

3.7 INDEXES

Suppose you want to locate a particular book in a library. Does it make sense to look through every book in the library until you find the one you want? Of course not; you use the library's catalogue, which is indexed by title, topic and author. The index (in either a manual or a computer system) points you to the book's location, thereby making retrieval of the book a quick and simple matter. An index is an orderly arrangement used to access rows in a table logically.

Or suppose you want to find a topic, such as ER model, in this book. Does it make sense to read through every page until you stumble across the topic? Of course not; it is much simpler to go to the book's index, look up the phrase ER model, and read the references that point you to the appropriate page(s). In each case, an index is used to locate a needed item quickly.

Indexes in the relational database environment work like the indexes described in the preceding paragraphs. From a conceptual point of view, an index is composed of an index key and a set of pointers. The index key is, in effect, the index's reference point. More formally, an index is an ordered arrangement of keys and pointers. Each key points to the location of the data identified by the key.

For example, suppose you want to look up all of the paintings created by a given painter in the Ch03_Museum database depicted in Figure 3.8. Without an index, you must read each row in the PAINTING table and see if the PAINTER_NUM matches the requested painter. However, if you index the PAINTING table and use the index key PAINTER_NUM, you merely need to look up the appropriate PAINTER_NUM in the index and find the matching pointers. Conceptually speaking, the index would resemble the presentation in Figure 3.20.

FIGURE 3.20 Components of an index

PAINTING table index

PAINTER_NUM (index key)  Pointers to the PAINTING table rows
123                      1, 2, 4
126                      3, 5

SOURCE: Course Technology/Cengage Learning

As you examine Figure 3.20, note that the first PAINTER_NUM index key value (123) is found in records 1, 2 and 4 of the PAINTING table. The second PAINTER_NUM index key value (126) is found in records 3 and 5 of the PAINTING table.
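The index in Figure 3.20 can be mimicked with an ordinary dictionary. The following is only a conceptual sketch in Python, not how a DBMS stores an index internally; the PAINTING table is reduced to the row numbers and PAINTER_NUM values shown in the figure, and the function names scan and build_index are hypothetical.

```python
from collections import defaultdict

# PAINTING table reduced to what Figure 3.20 shows: row number -> PAINTER_NUM.
painting_rows = {1: 123, 2: 123, 3: 126, 4: 123, 5: 126}


def scan(rows, painter_num):
    """Without an index: read every row and compare its PAINTER_NUM."""
    return [row for row, painter in rows.items() if painter == painter_num]


def build_index(rows):
    """Build an ordered arrangement of index keys, each with its row pointers."""
    pointers = defaultdict(list)
    for row in sorted(rows):
        pointers[rows[row]].append(row)
    return dict(sorted(pointers.items()))


painter_index = build_index(painting_rows)

print(painter_index)                # index keys 123 and 126 with their pointers
print(painter_index[126])           # one lookup instead of reading every row
print(scan(painting_rows, 126))     # same rows, found by a full scan
```

Both approaches return the same rows; the difference is that the scan touches all five rows, while the index lookup goes straight to the pointer list for the requested key.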

DBMSs use indexes for many different purposes. You just learnt that an index can be used to retrieve data more efficiently. But indexes can also be used by a DBMS to retrieve data ordered by a specific attribute or attributes. For example, creating an index on a customer's last name will allow you to retrieve the customer data alphabetically ordered by the customer's last name. Also, an index key can be composed of one or more attributes. For example, in Figure 3.18, you can create an index on VEND_CODE and PROD_CODE to retrieve all rows in the PRODUCT table ordered by vendor and, within vendor, ordered by product.

Indexes play an important role in DBMSs for the implementation of primary keys. When you define a table's primary key, the DBMS automatically creates a unique index on the primary key column(s) you declared. For example, in Figure 3.18, when you declare CUS_CODE to be the primary key of the CUSTOMER table, the DBMS automatically creates a unique index on that attribute. A unique index, as its name implies, is an index in which the index key can have only one pointer value (row) associated with it. (The index in Figure 3.20 is not a unique index because the PAINTER_NUM has multiple pointer values associated with it. For example, painter number 123 points to three rows (1, 2 and 4) in the PAINTING table.)

Indexes are crucial in speeding up data access. They can be used to facilitate searching, sorting and even joining tables. The improvement in data access speed occurs because an index is an ordered set of values that contains the index key and pointers. A table can have many indexes, but each index is associated with only one table. The index key can have multiple attributes (composite index). Creating an index is easy. You will learn in Chapter 8 that a simple SQL command will produce any required index.
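The points above about unique and composite indexes can be tried out on the PRODUCT table of Figure 3.18. The sketch below assumes Python's built-in sqlite3 module as the DBMS; the index name PROD_VEND_NDX and the duplicate row are hypothetical, and the name SQLite gives to the index it creates for the primary key is specific to SQLite. Other DBMSs behave similarly but differ in detail.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE PRODUCT (
        PROD_CODE     TEXT PRIMARY KEY,   -- the DBMS backs this with a unique index
        PROD_DESCRIPT TEXT,
        PROD_PRICE    REAL,
        PROD_ON_HAND  INTEGER,
        VEND_CODE     INTEGER
    )
""")
conn.executemany("INSERT INTO PRODUCT VALUES (?, ?, ?, ?, ?)", [
    ("001278-AB", "Claw hammer", 10.23, 23, 232),
    ("123-21UUY", "Houselite chain saw, 16 cm bar", 150.09, 4, 235),
    ("QER-34256", "Sledge hammer, 16 kg head", 14.72, 6, 231),
    ("SRE-657UG", "Rat-tail file", 2.36, 15, 232),
    ("ZZX/3245Q", "Steel tape, 12 m length", 5.36, 8, 235),
])

# A composite (multi-attribute) index; PROD_VEND_NDX is a made-up name.
conn.execute("CREATE INDEX PROD_VEND_NDX ON PRODUCT (VEND_CODE, PROD_CODE)")

# List the indexes on PRODUCT as {index name: is it unique?}.
indexes = {row[1]: bool(row[2])
           for row in conn.execute("PRAGMA index_list('PRODUCT')")}

# Rows ordered by vendor and, within vendor, by product.
ordered = [row[0] for row in conn.execute(
    "SELECT PROD_CODE FROM PRODUCT ORDER BY VEND_CODE, PROD_CODE")]

# The unique index on the primary key rejects a second row with the same key.
try:
    conn.execute("INSERT INTO PRODUCT VALUES ('SRE-657UG', 'Duplicate', 1.0, 1, 232)")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

print(indexes)
print(ordered)
print(duplicate_rejected)
```

The listing of indexes should show two entries: the non-unique composite index created explicitly, and a unique index that the DBMS created on its own when the primary key was declared.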

NOTE
You will learn more about how indexes can be applied to improve data access and retrieval in Chapter 11, Conceptual, Logical, and Physical Database Design.

3.8 CODD'S RELATIONAL DATABASE RULES

In 1985, Dr E.F. Codd published a list of 12 rules to define a relational database system.2 The reason Dr Codd published the list was his concern that many vendors were marketing products as 'relational' even though those products did not meet minimum relational standards. Dr Codd's list, shown in Table 3.8, serves as a frame of reference for what a truly relational database should be. Bear in mind that even the dominant database vendors do not fully support all 12 rules.

TABLE 3.8 Dr Codd's 12 relational database rules

Rule  Rule Name  Description

1  Information  All information in a relational database must be logically represented as column values in rows within tables.

2  Guaranteed Access  Every value in a table is guaranteed to be accessible through a combination of table name, primary key value and column name.

3  Systematic Treatment of Nulls  Nulls must be represented and treated in a systematic way, independent of data type.

4  Dynamic Online Catalogue Based on the Relational Model  The metadata must be stored and managed as ordinary data, that is, in tables within the database. Such data must be available to authorised users, using the standard database relational language.

5  Comprehensive Data Sub-language  The relational database may support many languages. However, it must support one well-defined, declarative language with support for data definition, view definition, data manipulation (interactive and by program), integrity constraints, authorisation and transaction management (begin, commit and rollback).

6  View Updating  Any view that is theoretically updatable must be updatable through the system.

7  High-Level Insert, Update and Delete  The database must support set-level inserts, updates and deletes.

8  Physical Data Independence  Application programs and ad hoc facilities are logically unaffected when physical access methods or storage structures are changed.

9  Logical Data Independence  Application programs and ad hoc facilities are logically unaffected when changes are made to the table structures that preserve the original table values (changing order of column or inserting columns).

10  Integrity Independence  All relational integrity constraints must be definable in the relational language and stored in the system catalogue, not at the application level.

11  Distribution Independence  The end users and application programs are unaware of and unaffected by the data location (distributed vs local databases).

12  Non-Subversion  If the system supports low-level access to the data, there must not be a way to bypass the integrity rules of the database.

Rule Zero  All preceding rules are based on the notion that, in order for a database to be considered relational, it must use its relational facilities exclusively to manage the database.

2 Codd, E.F., 'Is Your DBMS Really Relational?' and 'Does Your DBMS Run by the Rules?', Computerworld, 14 October and 21 October, 1985.
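A few of the rules in Table 3.8 can be observed directly in a running DBMS. The sketch below uses Python's built-in sqlite3 module as an example system and makes no claim about how fully SQLite satisfies Codd's rules; the three CUSTOMER rows come from Figure 3.18, and the area code change is a hypothetical update used only to show a set-level operation.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE CUSTOMER (
                    CUS_CODE INTEGER PRIMARY KEY,
                    CUS_LNAME TEXT,
                    CUS_AREACODE TEXT)""")
conn.executemany("INSERT INTO CUSTOMER VALUES (?, ?, ?)", [
    (10010, "Ramas", "0181"),
    (10011, "Dunne", "0161"),
    (10014, "Orlando", "0181"),
])

# Rule 2 (Guaranteed Access): table name + primary key value + column name
# is enough to reach one specific value.
(lname,) = conn.execute(
    "SELECT CUS_LNAME FROM CUSTOMER WHERE CUS_CODE = ?", (10014,)).fetchone()

# Rule 4 (Catalogue): the metadata is itself queried as a table, with the
# same language used for ordinary data (sqlite_master is SQLite's catalogue).
catalogue = [row[0] for row in conn.execute(
    "SELECT name FROM sqlite_master WHERE type = 'table'")]

# Rule 7 (High-Level Update): one statement changes a whole set of rows.
# The new area code is a hypothetical value for illustration only.
cursor = conn.execute(
    "UPDATE CUSTOMER SET CUS_AREACODE = '0161' WHERE CUS_AREACODE = '0181'")
rows_changed = cursor.rowcount

print(lname, catalogue, rows_changed)
```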

SUMMARY

Tables are the basic building blocks of a relational database. A grouping of related entities, known as an entity set, is stored in a table. Conceptually speaking, the relational table is composed of intersecting rows (tuples) and columns. Each row represents a single entity, and each column represents the characteristics (attributes) of the entities.

Keys are central to the use of relational tables. Keys define functional dependencies; that is, other attributes are dependent on the key and can, therefore, be found if the key value is known. A key can be classified as a superkey, a candidate key, a primary key, a secondary key or a foreign key.

Each table row must have a primary key. The primary key is an attribute or a combination of attributes that uniquely identifies all remaining attributes found in any given row. Because a primary key must be unique, no null values are allowed if entity integrity is to be maintained.

Although the tables are independent, they can be linked by common attributes. Thus, the primary key of one table can appear as the foreign key in another table to which it is linked. Referential integrity dictates that the foreign key must contain values that match the primary key in the related table or must contain nulls.

Once you know the relational database basics, you can concentrate on design. Good design begins by identifying appropriate entities and attributes, and the relationships among the entities. Those relationships (1:1, 1:* and *:*) can be represented using ERDs. The use of ERDs allows you to create and evaluate simple logical design. The 1:* relationships are most easily incorporated in a good design; you just have to make sure that the primary key of the 1 is included in the table of the many.
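The two integrity rules summarised above can be watched in action. This is a minimal sketch, assuming Python's built-in sqlite3 module (which enforces foreign keys only after the PRAGMA shown); the helper is_rejected and the values 9001, 9002 and 99999 are hypothetical, while the CUSTOMER and INVOICE rows come from Figure 3.18.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # required for SQLite to check foreign keys
conn.executescript("""
CREATE TABLE CUSTOMER (CUS_CODE INTEGER PRIMARY KEY, CUS_LNAME TEXT);
CREATE TABLE INVOICE  (INV_NUMBER INTEGER PRIMARY KEY,
                       CUS_CODE INTEGER REFERENCES CUSTOMER (CUS_CODE));
INSERT INTO CUSTOMER VALUES (10011, 'Dunne'), (10014, 'Orlando');
INSERT INTO INVOICE  VALUES (1001, 10014), (1002, 10011);
""")


def is_rejected(sql):
    """Return True if the DBMS refuses the statement on integrity grounds."""
    try:
        conn.execute(sql)
        return False
    except sqlite3.IntegrityError:
        return True


# Entity integrity: a second row with an existing primary key value is refused.
duplicate_key = is_rejected("INSERT INTO CUSTOMER VALUES (10014, 'Someone')")

# Referential integrity: a foreign key must match an existing primary key ...
unmatched_fk = is_rejected("INSERT INTO INVOICE VALUES (9001, 99999)")

# ... or be null, which is accepted.
null_fk = is_rejected("INSERT INTO INVOICE VALUES (9002, NULL)")

print(duplicate_key, unmatched_fk, null_fk)
```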

KEY TERMS

associations
association class
attribute domain
bridge entity
candidate key
cardinality
composite entity
composite key
data dictionary
determination
domain
entity integrity
flags
foreign key (FK)
full functional dependence
functional dependence
homonyms
index
index key
key
key attribute
linking table
multiplicity
null
predicate logic
primary key (PK)
referential integrity
relation
relational schema
secondary key
superkey
synonym
system catalogue
tuple
unique index


Online Content
All of the databases used in the questions and problems are available on the online platform accompanying this book. The database names used in the folder match the database names used in the figures. For example, the source of the tables shown in Figure Q3.1 is the 'Ch03_CollegeQue' database. Answers to selected Review Questions and Problems for this chapter are also available on the online platform.

REVIEW QUESTIONS

1 What is the difference between a database and a table?

2 What does it mean to say that a database displays both entity integrity and referential integrity?

3 Why are entity integrity and referential integrity important in a database?

4 What can a NULL value represent?

5 What is the domain of an attribute?

6 Create the basic ERD using UML notation for the database shown in Figure Q3.1.

FIGURE Q3.1 The Ch03_CollegeQue database tables

Database name: Ch03_CollegeQue

Table name: STUDENT

STU_CODE  LECT_CODE
100278
128569    2
512272    4
531235    2
531268
553427    1

Table name: LECTURER

LECT_CODE  DEPT_CODE
1          2
2          6
3          6
4          4

7 Create the basic ERD using UML notation for the database shown in Figure Q3.2.

FIGURE Q3.2 The Ch03_TravelQue database tables

Database name: Ch03_TravelQue

Table name: CUSTOMER

CUS_CODE  CUS_LNAME  CUS_EMAIL          CUS_MOBILE
24563     GARNETT    [email protected]  08703345671
24565     MWBAU      [email protected]  08734566664

Table name: BOOKING

BOOKING_NO  PACKAGE_ID  BOOK_TOTAL_COST  BOOK_PAID  BOOK_DEP_DATE
24563       9910001     956.00           Y          06-Jan-19
24565       9910001     895.00           N          07-Sep-19
24563       9910003     3056.00          N          05-Oct-19

Table name: PACKAGE_HOLIDAY

PACKAGE_ID  PACK_DESTINATION  PACK_OPERATOR   PACK_DURATION
9910001     Spain             Riveria Travel  7
9910002     USA               Mouse Holidays  14
9910003     Australia         Wallaby Tours   21

8 Suppose you have the ERD shown in Figure Q3.3. How would you convert this model into an ERD that displays only 1:* relationships? (Make sure you create the revised ERD.)

FIGURE Q3.3 The UML Class ERD for question 8

DRIVER  1..* ---- 1..*  TRUCK

During some time interval, a DRIVER can drive many TRUCKs and any TRUCK can be driven by many DRIVERs.

9 What are homonyms and synonyms, and why should they be avoided in database design?

10 How would you implement a 1:* relationship in a database composed of two tables? Give an example.

11 Identify and describe the components of the table shown in Figure Q3.4, using correct terminology. Use your knowledge of naming conventions to identify the table's probable foreign key(s).

FIGURE Q3.4 The Ch03_NoComp database EMPLOYEE table

Database name: Ch03_NoComp

Table name: EMPLOYEE

EMP_NUM  EMP_LNAME   EMP_INITIAL  EMP_FNAME  DEPT_CODE  JOB_CODE
11234    Friedman    K            Robert     MKTG       12
11238    Zulu        D            Cela       MKTG       12
11241    Fontein                  Juliette   INFS       5
11242    Theron      J            Emma       ENGR       9
11245    Smithson    B            Bernard    INFS       6
11248    Washington  G            Oleta      ENGR       8
11256    McBride                  Randall    ENGR       8
11257    Mazibuko    D            Fikile     MKTG       14
11258    Smith       W            William    MKTG       14
11260    Ratula      A            Katrina    INFS       5

12 Suppose you are using the database composed of the two tables shown in Figure Q3.5.

a Identify the primary keys.

b Identify the foreign keys.

c Create the ERM.

FIGURE Q3.5 The Ch03_Theatre database tables

Database name: Ch03_Theatre

Table name: DIRECTOR

DIR_NUM  DIR_LNAME   DIR_DOB
100      Broadway    12-Jan-75
101      Hollywoody  18-Nov-63
102      Goofy       21-Jun-72

Table name: PLAY

PLAY_CODE  PLAY_NAME                       DIR_NUM
1001       Cat On a Cold, Bare Roof        102
1002       Hold the Mayo, Pass the Bread   101
1003       I Never Promised You Coffee     102
1004       Silly Putty Goes To Washington  100
1005       See No Sound, Hear No Sight     101
1006       Starstruck in Biloxi            102
1007       Stranger In Parrot Ice          101

d Suppose you wanted quick lookup capability to get a listing of all plays directed by a given director. Which table would be the basis for the INDEX table, and what would be the index key?

e What would be the conceptual view of the INDEX table that is described in Part d? Depict the contents of the conceptual INDEX table.

13 Suppose you are using the database to enable a museum to find the location of artefacts around the world. The database is composed of the three tables shown in Figure Q3.6.

a Identify the primary keys.

b Identify the foreign keys.

c Create the ERM.

FIGURE Q3.6 The Artefact Museum database

Database name: Artefact Museum Database

Table name: ARTEFACT

ARTEFACT_TRACK_ID  ARTEFACT_DESCRIPTION                 ARTEFACT_AGE  ARTEFACT_VALUE  ARTEFCAT_LOCATION_ID
10034              Greywacke Statue Tribute to Isis     664-525 BC    6000000         78343
10039              The Golden Rhinoceros of Mapungubwe  1075-1220     12100000        56432
10056              Pinner Qing Dynasty Vase             18th Century  85900000        23412
19002              Rosetta Stone                        181 BC                        23412

Table name: LOCATION

ARTEFACT_LOCATION_ID  ARTEFACT_COUNTRY
78343                 FRANCE
56432                 USA
23412                 LONDON

d Suppose the museum database was to be expanded to include details of a curator that could be contacted to request to see an artefact. The details that need to be stored are CURATOR_NO, CURATOR_NAME and CURATOR_CONTACT. A curator may be responsible for more than one location. Modify your ERM to include this information.

PROBLEMS

Use the database shown in Figure P3.1 to work Problems 1-6. Note that the database is composed of four tables that reflect these relationships:

An EMPLOYEE has only one JOB_CODE, but a JOB_CODE can be held by many EMPLOYEEs.

An EMPLOYEE can participate in many PLANs, and any PLAN can be assigned to many EMPLOYEEs.

Note also that the *:* relationship has been broken down into two 1:* relationships for which the BENEFIT table serves as the composite or bridge entity.

FIGURE P3.1 The Ch03_BeneCo database tables

Database name: Ch03_BeneCo

Table name: EMPLOYEE

EMP_CODE  EMP_LNAME  JOB_CODE
14        Rudell     2
15        Arendse    1
16        Ruellardo  1
17        Smith      3
20        Smith      2

Table name: JOB

JOB_CODE  JOB_DESCRIPTION
1         Clerical
2         Technical
3         Managerial

Table name: BENEFIT

EMP_CODE  PLAN_CODE
15        2
15        3
16        1
17        1
17        3
17        4
20        3

Table name: PLAN

PLAN_CODE  PLAN_DESCRIPTION
1          Term life
2          Stock purchase
3          Long-term disability
4          Dental

1 For each table in the database, identify the primary key and the foreign key(s). If a table does not have a foreign key, write None in the space provided.

Table     Primary Key  Foreign Key(s)
EMPLOYEE
BENEFIT
JOB
PLAN

2 Create the ERD using UML notation to show the relationship between EMPLOYEE and JOB.

3 Do the tables exhibit entity integrity? Answer yes or no; then explain your answer.

Table     Entity Integrity  Explanation
EMPLOYEE
BENEFIT
JOB
PLAN

4 Do the tables exhibit referential integrity? Answer yes or no; then explain your answer. Write NA (not applicable) if the table does not have a foreign key.

Table     Referential Integrity  Explanation
EMPLOYEE
BENEFIT
JOB
PLAN

5 Create the ERD using Crow's Foot notation to show the relationships among EMPLOYEE, BENEFIT, JOB and PLAN.

6 Create the ERD using UML class diagram notation to show the relationships among EMPLOYEE, BENEFIT, JOB and PLAN.

Use the database shown in Figure P3.2 to answer Problems 7-13.

FIGURE P3.2 The Ch03_StoreCo database tables

Database name: Ch03_StoreCo

Table name: EMPLOYEE

EMP_CODE  EMP_TITLE  EMP_LNAME   EMP_FNAME  EMP_INITIAL  EMP_DOB    STORE_CODE
1         Mr         Govender    Adimoolam  W            21-May-70  3
2         Ms         Ratula      Nancy                   09-Feb-75  2
3         Ms         Greenboro   Lottie     R            02-Oct-67  4
4         Mrs        Rumpersfro  Jennie     S            01-Jun-77  5
5         Mr         Smith       Robert     L            23-Nov-65  3
6         Mr         Renselaer   Cary       A            25-Dec-71  1
7         Mr         Ogallo      Roberto    S            31-Jul-68  3
8         Ms         Van Blerk   Elandri    I            10-Sep-74  1
9         Mr         Eindsmar    Jack       W            19-Apr-61  2
10        Mrs        Jones       Rose       R            06-Mar-72  4
11        Mr         Broderick   Tom                     21-Oct-78  3
12        Mr         Washington  Alan       Y            08-Sep-80  2
13        Mr         Smith       Peter      N            25-Aug-70  3
14        Ms         Smith       Sherry     H            25-May-72  4
15        Mr         Olenko      Howard     U            24-May-70  5
16        Mr         Archialo    Barry      V            03-Sep-66  5
17        Ms         Grimaldo    Jeanine    K            12-Nov-76  4
18        Mr         Rosenberg   Andrew     D            24-Jan-77  4
19        Mr         Bophela     Ingwe      F            03-Oct-74  4
20        Mr         Mckee       Robert     S            06-Mar-76  1
21        Ms         Baumann     Jennifer   A            11-Dec-80  3

Table name: STORE

STORE_CODE  STORE_NAME         STORE_YTD_SALES  REGION_CODE  EMP_CODE
1           Access Junction    792 730.05       2            8
2           Database Corner    1 123 370.04     2            12
3           Tuple Charge       779 558.74       1            7
4           Attribute Alley    746 209.16       2            3
5           Primary Key Point  2 314 777.78     1            15

Table name: REGION

REGION_CODE  REGION_DESCRIPT
1            East
2            West

7 For each table, identify the primary key and the foreign key(s). If a table does not have a foreign key, write None in the space provided.

Table     Primary Key  Foreign Key(s)
EMPLOYEE
STORE
REGION

8 Do the tables exhibit entity integrity? Answer yes or no; then explain your answer.

Table     Entity Integrity  Explanation
EMPLOYEE
STORE
REGION

9 Do the tables exhibit referential integrity? Answer yes or no; then explain your answer. Write NA (not applicable) if the table does not have a foreign key.

Table     Referential Integrity  Explanation
EMPLOYEE
STORE
REGION

10 Describe the type(s) of relationship(s) between STORE and REGION.

11 Create the ERD using UML notation to show the relationship between STORE and REGION.

112

PART I

Database

12

Systems

Describe the type(s) of relationship(s) many

13

employees,

Create the

one

of

ERD using

whom

between EMPLOYEE and STORE. (Hint: Each store employs

manages

the

store.)

UML notation to show the relationships

among EMPLOYEE,

STORE and

REGION.

Use the

database

shown in Figure

P3.3 to

answer

Problems

14-18.

3

FIGURE P3.3 Database

name:

The Ch03_CheapCo database tables Ch03_CheapCo

Table name:

PRODUCT

Foreign

VEND_CODE

key:

PROD_

Primary key: PROD_CODE

PROD_DESCRIPTION

CODE

PROD_ON_

PROD_

VEND_

DATE

HAND

PRICE

CODE

12-WW/P2

18 cm power saw blade

07-Apr-16

12

10.94

123

1QQ23-55

6 cm wood screw,

19-Mar-16

123

13.55

123

231-78-W

PVC pipe, 8 cm, 2.44

07-Dec-15

45

17.01

121

33564/U

Rat-tail

08-Mar-16

18

10.94

123

AR/3/TYR

Cordless

136.33

121

DT-34-WW

Philips

118.40

123

EE3-67/W

Sledge

ER-56/DF

Houselite

file,

100 m

0.5 cm, fine

drill,

0.6 cm

screwdriver

29-Nov-15 20-Dec-15

pack

hammer,

8

7 kg

chain saw, 40 cm

11

25-Feb-16

9

114.21

121

28-Dec-15

7

1186.04

125

FRE-TRY9

Jigsaw,

30 cm blade

12-Aug-15

67

11.15

125

SE-67-89

Jigsaw,

20 cm blade

11-Oct-15

34

11.07

125

23-Apr-16

14

110.26

123

01-Mar-16

15

17.07

121

ZW-QR/AV

Hardware

ZX-WR/FR

Claw

VENDOR

Foreign

none

key:

cloth,

Primary key: VEND_CODE

VEND_CODE

VEND_NAME

120

Bargain

121

Cut n

122

Rip & Rattle

123

Tools R

124

Trowel

125

Bow

review

2020 has

Cengage deemed

VEND_CONTACT

Snapper, Glow

write

Learning. that

any

All suppressed

Anne

does

May not

not materially

0181

899-1234

Olero

0181

342-9896

Morrins

0113

225-1127

G. McHenry

0161

546-7894

F. Frederick

0113

453-4567

0113

324-9988

T. Travis

R.

George

Inc.

& Wow Tools

the

VEND_PHONE

J.

Juliette

& Dowel,

Reserved. content

Co.

Us

None in

Rights

Henry

Co. Supply

VEND_AREACODE

Melanie

Inc.

For each table, identify key,

Copyright

0.6 cm.

hammer

Table name:

14

Editorial

PROD_STOCK_

Bill S. Sedwick

the primary key and the foreign space

be

copied, affect

key(s). If a table

does not have aforeign

provided.

Table        Primary Key        Foreign Key(s)
PRODUCT
VENDOR

15 Do the tables exhibit entity integrity? Answer yes or no; then explain your answer.

Table        Entity Integrity        Explanation
PRODUCT
VENDOR

16 Do the tables exhibit referential integrity? Answer yes or no; then explain your answer. Write NA (not applicable) if the table does not have a foreign key.

Table        Referential Integrity        Explanation
PRODUCT
VENDOR

17 Create the ERD using UML notation for this database.

18 Create the data dictionary for this database.

Use the database shown in Figure P3.4 to answer Problems 19-24.

FIGURE P3.4 The Ch03_TransCo database tables
Database name: Ch03_TransCo

Table name: TRUCK
Primary key: TRUCK_NUM
Foreign key: BASE_CODE, TYPE_CODE

TRUCK_NUM  BASE_CODE  TYPE_CODE  TRUCK_KM    TRUCK_BUY_DATE  TRUCK_SERIAL_NUM
1001       501        1          32 123.50   23-Sep-13       AA-322-12212-W11
1002       502        1          76 984.30   05-Feb-12       AC-342-22134-Q23
1003       501        2          12 346.60   11-Nov-13       AC-445-78656-Z99
1004       (null)     1          2 894.30    06-Jan-14       WQ-112-23144-T34
1005       503        2          45 673.10   01-Mar-13       FR-998-32245-W12
1006       501        2          193 245.70  15-Jul-10       AD-456-00845-R45
1007       502        3          32 012.30   17-Oct-11       AA-341-96573-Z84
1008       502        3          44 213.60   07-Aug-12       DR-559-22189-D33
1009       503        2          10 932.90   12-Feb-14       DE-887-98456-E94

Table name: BASE
Primary key: BASE_CODE
Foreign key: none

BASE_CODE

BASE_CITY

BASE_PROVINCE

BASE_AREA_CODE

BASE_MANAGER

BASE_ PHONE

3

501

Polokwane

502

Cape

503

Best

North

504

Durban

KwaZulu-Natal

Table

name:

Foreign

Town

Western

Cape

Brabant

0700

123-4567

Sibusiso

7100

234-5678

Clementine

4567

345-6789

4001

456-7890

Primary

TYPE

key:

19

Limpopo

key:

Balisa Daniels

Maria J. Talindo Pragasen

Khan

Table name: TYPE
Primary key: TYPE_CODE
Foreign key: none

TYPE_CODE  TYPE_DESCRIPTION
1          Single box, double-axle
2          Single box, single-axle
3          Tandem trailer, single-axle

19 For each table, identify the primary key and the foreign key(s). If a table does not have a foreign key, write None in the space provided.

Table        Primary Key        Foreign Key(s)
TRUCK
BASE
TYPE

20 Do the tables exhibit entity integrity? Answer yes or no; then explain your answer.

Table        Entity Integrity        Explanation
TRUCK
BASE
TYPE

21 Do the tables exhibit referential integrity? Answer yes or no; then explain your answer. Write NA (not applicable) if the table does not have a foreign key.

Table        Referential Integrity        Explanation
TRUCK
BASE
TYPE

22 Identify the TRUCK table's candidate key(s).

23 For each table, identify a superkey and a secondary key.

Table        Superkey        Secondary Key
TRUCK
BASE
TYPE

24 Create the ERD using UML notation for this database.

FIGURE Database

P3.5 name:

Table name: CHAR_ TRIP

The Ch03_AviaCo

database tables

Ch03_AviaCo

CHARTER

CHAR_

CHAR_

CHAR_

AC_

CHAR_

CHAR_

CHAR_

CHAR_

DATE

PILOT

COPILOT

NUMBER

DESTINATION

DISTANCE

HOURS_

HOURS_

FLOWN

10001

05-Feb-20

104

10002

05-Feb-20

101

10003

05-Feb-20

105

10004

06-Feb-20

106

1484P

CPT

10005

06-Feb-20

101

2289L

CDG

10006

06-Feb-20

109

4278Y

CPT

10007

06-Feb-20

104

2778V

10008

07-Feb-20

106

1484P

10009

07-Feb-20

105

2289L

LHR

10010

07-Feb-20

109

10011

07-Feb-20

101

10012

08-Feb-20

101

2778V

10013

08-Feb-20

105

4278Y

10014

09-Feb-20

106

4278Y

10015

09-Feb-20

104

101

2289L

10016

09-Feb-20

109

105

2778V

10017

10-Feb-20

101

10018

10-Feb-20

105

The

destinations

CDG CPT

Copyright Editorial

review

2020 has

5 PARIS 5 CAPE

Cengage deemed

Learning. that

any

All

109

105

May

not materially

320.00

1.6

0

7.8

0

472.00

2.9

4.9

023.00

5.7

3.5

397.7

472.00

2.6

5.2

LHR

1 574.00

7.9

TYS

644.00

4.1

1 574.00

6.6

23.4

affect

998.00

6.2

352.00

1.9

884.00 644.00

the

97.2

1

10019

2

10011

117.1

0

10017

0

348.4

2

10012

0

140.6

1

10014

459.9

0

10017

3.2

279.7

0

10016

5.3

66.4

1

10012

4.8

4.2

215.1

0

10010

3.9

4.5

174.3

1

10011

936.00

6.1

2.1

302.6

0

10017

1 645.00

MOB TYS

6.7

0

459.5

2

10016

MQY

312.00

1.5

0

67.2

0

10011

CPT

508.00

3.1

0

105.5

0

10014

644.00

3.8

4.5

167.4

0

10017

three-letter

airport

FRANCE,

LHR

SOUTH

AFRICA

duplicated, learning

72.6

10014

CDG

or

10011

10016

TYS

overall

1

2

CDG

scanned,

CODE

0

1

LHR

CUS_

OIL_QTS

339.8

BNA

by standard

copied,

354.1

1 574.00

4278Y

be

2.2

LHR

DE GAULLE,

not

5.1

BNA

INTERNATIONAL,

does

936.00

CHAR_

GALLONS

4278Y

1484P

Reserved. content

WAIT

2778V

1484P

104

CHARLES

Rights

CDG

4278Y

104

are indicated

TOWN

suppressed

2289L

CHAR_ FUEL_

in experience.

whole

or in Cengage

codes.

5 LONDON

part.

Due Learning

to

electronic reserves

For example, HEATHROW,

rights, the

right

some to

third remove

party additional

UNITED

content

may content

be

KINGDOM

suppressed at

any

time

from if

the

subsequent

AND

eBook rights

and/or restrictions

eChapter(s). require

it.

Table name: AIRCRAFT

AC_NUMBER  MOD_CODE  AC_TTAF   AC_TTEL   AC_TTER
1484P      PA23-250  1 833.10  1 833.10  101.80
2289L      C-90A     4 243.80  768.90    1 123.40
2778V      PA31-350  7 992.90  1 513.10  789.50
4278Y      PA31-350  2 147.30  622.10    243.20

AC_TTAF = Aircraft total time, airframe (hours)
AC_TTEL = Total time, left engine (hours)
AC_TTER = Total time, right engine (hours)

In a fully developed system, such attribute values would be updated by application software when the CHARTER table entries are posted.

Table name: MODEL

MOD_CODE  MOD_MANUFACTURER  MOD_NAME          MOD_SEATS  MOD_CHG_MILE
C-90A     Beechcraft        KingAir           8          1.67
PA23-250  Piper             Aztec             6          1.20
PA31-350  Piper             Navajo Chieftain  10         1.47

Customers are charged per round-trip mile, using the MOD_CHG_MILE rate. The MOD_SEATS gives the total number of seats in the airplane, including the pilot and copilot seats. Therefore, a PA31-350 trip that is flown by a pilot and copilot has eight passenger seats available.
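The charging rule described for the MODEL table can be illustrated with a short calculation. This is only a sketch: the function name `charter_charge` and the 936-mile trip are assumptions made for illustration, while the rates are the MOD_CHG_MILE values listed for each model.

```python
from decimal import Decimal

# Charge rates per round-trip mile, taken from the MODEL table (MOD_CHG_MILE).
MOD_CHG_MILE = {
    "C-90A": Decimal("1.67"),
    "PA23-250": Decimal("1.20"),
    "PA31-350": Decimal("1.47"),
}


def charter_charge(mod_code: str, round_trip_distance: Decimal) -> Decimal:
    """Return distance x rate for the given model, rounded to two decimals."""
    rate = MOD_CHG_MILE[mod_code]
    return (rate * round_trip_distance).quantize(Decimal("0.01"))


# Hypothetical trip: 936 round-trip miles flown in a PA31-350.
charge = charter_charge("PA31-350", Decimal("936"))
print(charge)  # 1375.92
```

Decimal arithmetic is used here instead of floats so that currency amounts are not affected by binary rounding.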

Table name: PILOT

EMP_NUM  PIL_LICENCE  PIL_RATINGS                  PIL_MED_TYPE  PIL_MED_DATE  PIL_PT135_DATE
101      ATP          ATP/SEL/MEL/Instr/CFII       1             20-Jan-20     11-Jan-20
104      ATP          ATP/SEL/MEL/Instr            1             18-Dec-19     17-Jan-20
105      COM          COMM/SEL/MEL/Instr/CFI       2             05-Jan-20     02-Jan-20
106      COM          COMM/SEL/MEL/Instr           2             10-Dec-19     02-Feb-20
109      COM          ATP/SEL/MEL/SES/Instr/CFII   1             22-Jan-20     15-Jan-20

The pilot licences shown in the PILOT table include the ATP = Airline Transport Pilot and COM = Commercial Pilot. Businesses that operate on demand air services are governed by Part 135 of the Federal Air Regulations (FARs) that are enforced by the Federal Aviation Administration (FAA). Such businesses are known as Part 135 operators. Part 135 operations require that pilots successfully complete flight proficiency checks every six months. The Part 135 flight proficiency check date is recorded in PIL_PT135_DATE. To fly commercially, pilots must have at least a commercial licence and a second-class medical certificate (PIL_MED_TYPE = 2).

The PIL_RATINGs include:
SEL = Single engine, land
MEL = Multi-engine, land
SES = Single engine, sea
Instr. = Instrument
CFI = Certified flight instructor
CFII = Certified flight instructor, instrument

Table name: EMPLOYEE

EMP_NUM

EMP_TITLE

EMP_LNAME

EMP_FNAME

EMP_INITIAL

EMP_DOB

EMP_HIRE_DATE

100

Mr.

Nkosi

Cela

D

15-Jun-52

15-Mar-98

101

Ms.

Naude

Amahle

G

19-Mar-75

25-Apr-96

102

Mr.

Vandam

Rhett

14-Nov-68

18-May-03

103

Ms.

Jones

Anne

11-May-84

26-Jul-09

104

Mr.

Lange

John

P

12-Jul-81

20-Aug-00

105

Mr.

Williams

Robert

D

14-Mar-85

19-Jun-13

106

Mrs.

Duzak

Jeanine

K

12-Feb-78

13-Mar-99

107

Mr.

Diante

Jorge

D

01-May-85

02-Jul-07

108

Mr.

Wiesenbach

Paul

R

14-Feb-76

03-Jun-03

109

Ms.

Travis

Elizabeth

K

18-Jun-71

14-Feb-16

110

Mrs.

Genkazi

Leighla

19-May-80

29-Jun-10

Table

name:

M

W

3

CUSTOMER

CUS_ LNAME

CUS_ FNAME

10010

Ramas

Alfred

10011

Dunne

Leona

10012

Smith

Kathy

10013

Pieterse

Jaco

10014

Orlando

10015

OBrian

Amy

10016

Brown

James

10017

Williams

George

10018

Padayachee

Vinaya

10019

Smith

Olette

CUS_CODE

Use the

117

database

CUS_ PHONE

A

0181

844-2573

10.00

K

0161

894-1238

10.00

0181

894-2285

1559.73

0181

894-2180

1802.09

0181

222-1672

1420.15

B

0161

442-3381

1633.19

G

0181

297-1228

10.00

0181

290-2556

10.00

G

0161

382-7185

10.00

K

0178

297-3809

1283.33

W F

Myron

shown in

Figure

P3.5 to

CUS_ BALANCE

CUS_ AREACODE

CUS_ INITIAL

answer

Problems

25-28.

ROBCOR is an aircraft charter company that supplies on-demand charter flight services using a fleet of four aircraft. Aircraft are identified by a unique registration number. Therefore, the aircraft registration number is an appropriate primary key for the AIRCRAFT table. The nulls in the CHARTER table's CHAR_COPILOT column indicate that a copilot is not required for some charter trips or for some aircraft. (Federal Aviation Administration (FAA) rules require a copilot on jet aircraft and on aircraft having a gross take-off weight over 5 500 kg. None of the aircraft in the AIRCRAFT table are governed by this requirement; however, some customers may require the presence of a copilot for insurance reasons.) All charter trips are recorded in the CHARTER table.


NOTE
Earlier in the chapter it was stated that it is best to avoid homonyms and synonyms. In this problem, both the pilot and the copilot are pilots in the PILOT table, but EMP_NUM cannot be used for both in the CHARTER table. Therefore, the synonyms CHAR_PILOT and CHAR_COPILOT were used in the CHARTER table.

Although the solution works in this case, it is very restrictive and it generates nulls when a copilot is not required. Worse, such nulls proliferate as crew requirements change. For example, if the AviaCo charter company grows and starts using larger aircraft, crew requirements may increase to include flight engineers and load masters. The CHARTER table would then have to be modified to include the additional crew assignments; such attributes as CHAR_FLT_ENGINEER and CHAR_LOADMASTER would have to be added to the CHARTER table. Given this change, each time a smaller aircraft flew a charter trip without the number of crew members required in larger aircraft, the missing crew members would yield additional nulls in the CHARTER table.

You will have a chance to correct those design shortcomings in Problem 27. The problem illustrates two important points:
Don't use synonyms. If your design requires the use of synonyms, revise the design!
To the greatest possible extent, design the database to accommodate growth without requiring structural changes in the database tables. Plan ahead and try to anticipate the effects of change on the database.

25 For each table, where possible, identify:
a The primary key.
b A superkey.
c A candidate key.
d The foreign key(s).
e A secondary key.

26 Create the ERD using UML notation. (Hint: Look at the table contents. You will discover that an AIRCRAFT can fly many CHARTER trips, but each CHARTER trip is flown by one AIRCRAFT, that a MODEL references many AIRCRAFT, but each AIRCRAFT references a single MODEL, etc.)

27 Modify the ERD you created in Problem 26 to eliminate the problems created by the use of synonyms. (Hint: Modify the CHARTER table structure by eliminating the CHAR_PILOT and CHAR_COPILOT attributes; then create a composite table named CREW to link the CHARTER and EMPLOYEE tables. Some crew members, such as flight attendants, may not be pilots. That's why the EMPLOYEE table enters into this relationship.)

28 Create the ERD using UML notation for the design you revised in Problem 27. (After you have had a chance to revise the design, your instructor will show you the results of the design change, using a copy of the revised database named Ch03_AviaCo_2.)


CHAPTER 4 Relational Algebra and Calculus

IN THIS CHAPTER, YOU WILL LEARN:
What is meant by relational algebra and relational calculus
How to manipulate database tables using relational set operators
How the DBMS supports the key relational operators: select, project and join
The different types of joins
How to write queries using relational algebra expressions
About tuple and domain relational calculus

PREVIEW Relational

algebra

databases

and

relational of

how

both

it

and relational

model.

Codd

proposed

actually

be

components

and

of formal

is

language.

a

Predicate

basis

for

Once

we have

Query

Copyright review

from

which

required theory,

Language,

you

model.

relations and

manipulation

is relatively

easy

will learn

how

the

understand. calculus,

modified.

SQL

such In

in

the

These which

can

to

logic

as the and

set

important

This is

usually

as SQL (Structured DML languages

both

8, Beginning be used

Set theory used

next

a relation. such

in

a database.

provide

as SQL use alimited

Chapter

commands

or false.

predicate

(DML)

to

as a result.

and is

database, within

data to

a collection

a framework

on relations

data

key

as a procedural

of things,

Together,

and relational

often

which allows

as either true

language

by any DML. Languages are

in

modify

of the

algebra is

provides

or groups

described

one

described

mathematics,

operations

be

that

new relations

and is

the

independently

should

Relational

produce

defining

have

the

basic

implementation Structured

accomplish

Query relational

tasks.

Cengage deemed

relational

the

algebra

and

that

sets,

performing

data

relational

of relational

has

the for

specified

a high-level

operations

2020

in

basis

with

modelled

data

of a relation,

manner.

in

be

basis for relational basis for

we identified

can be verified

deals

how to retrieve

Language),

algebra

Editorial

is

using

stemmed

of fact)

that

2

set theory

extensively

(statement

an ideal

consideration achieved

used

and

as the

do this,

concept

relations

logic

1971 should

to

Chapter

was the

on these

manipulation

provide

data

that,

in a structured

on predicate

science

data

model

logic,

mathematical

theory

In

acting

based

which an assertion is

minimally.

mathematical

Codd in

the

and

database

operations

The algebra

that

used

of the relational within the

are the

by E.F.

would

mathematically

be stored

calculus

were proposed


Although language algebra that

is

both

gain

provides

us to

be used to

is

not an easy language of the

with aformal retrieve

express

the

form

relational

can

operators

queries

using

both tuple

also

be

and

how

relational

modify

same

data

of how the

which

they

by the

can

be used

complete

to

database

First, data.

you

Then,

will explore

study

the

relational

and the

mathematics

relational

calculus

a relationally

you

to

Essentially,

operates,

if any query that

manipulate

necessary

and tuple

we have

language.

Finally,

it is

operations.

algebra

means that

query

expressions.

relational

understand,

Relational

is relationally

expressed

to

manipulation

a relational

data.

queries,

algebraic

and domain

basic

description

and

We say a query language

algebraic

using

algebra

an understanding

necessary

language.

write

relational

to

complete

can

query

can be written in relational will learn you

about

will learn

how to

the

basic

about

write

how

simple

to

queries

calculus.

4.1 RELATIONAL OPERATORS

Relational algebra defines the theoretical way of manipulating table contents through a number of relational operators. Codd originally defined eight relational operators, called SELECT (or RESTRICT), PROJECT, JOIN, PRODUCT, INTERSECT, UNION, DIFFERENCE and DIVIDE. The most important operators are SELECT, PROJECT and JOIN, which can be used to formulate relational algebra expressions to answer many user queries. The relational operators have the property of closure; that is, relational algebra operators are used on existing tables to produce new tables. The relational operators are classed as being unary or binary. Unary operators, such as SELECT and PROJECT, can be applied to one relation, whilst binary operators such as JOIN are applied on two relations.

In Chapter 3, Relational Model Characteristics, we learnt about a number of important concepts and properties of relations that are essential for understanding the relational model. In this chapter, we will build on these concepts to understand how relational algebra can be used to write queries. Within Chapter 3, we modelled a relation on a mathematical construct, which had to abide by a set of rules (Table 3.1). When applying relational operators to relations, we have to follow these rules in addition to those defined for each relational operator. In the following sections you will learn about the theory associated with common relational operators and view some practical examples. Remember that the term relation is a synonym for table.
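The closure property and the unary/binary distinction can be made concrete with a small sketch. The code below is an illustration under stated assumptions, not how any DBMS implements the operators: relations are modelled as lists of dictionaries, the function names `select`, `project` and `product` are chosen for this example, and the STORE and REGION rows (including the attribute name `r_code`) are hypothetical.

```python
from typing import Callable, Dict, List

Row = Dict[str, object]
Relation = List[Row]


def select(relation: Relation, predicate: Callable[[Row], bool]) -> Relation:
    """Unary operator: keep the rows that satisfy the predicate."""
    return [row for row in relation if predicate(row)]


def project(relation: Relation, attributes: List[str]) -> Relation:
    """Unary operator: keep the named attributes and drop duplicate rows."""
    result: Relation = []
    for row in relation:
        projected = {name: row[name] for name in attributes}
        if projected not in result:
            result.append(projected)
    return result


def product(left: Relation, right: Relation) -> Relation:
    """Binary operator: pair every left row with every right row.
    Assumes the two relations use distinct attribute names."""
    return [{**l, **r} for l in left for r in right]


# Hypothetical sample relations.
store = [
    {"store_code": 1, "region_code": 2},
    {"store_code": 2, "region_code": 1},
    {"store_code": 3, "region_code": 2},
]
region = [
    {"r_code": 1, "region_descript": "East"},
    {"r_code": 2, "region_descript": "West"},
]

# Closure: each operator returns a relation, so calls can be nested.
paired = select(product(store, region),
                lambda row: row["region_code"] == row["r_code"])
answer = project(paired, ["store_code", "region_descript"])
print(answer)
```

Because every operator takes relations and returns a relation, the output of `product` can be fed to `select`, and that result to `project`; this nesting is what the closure property permits.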

NOTE
To be considered minimally relational, the DBMS must support the key relational operators SELECT, PROJECT and JOIN. Very few DBMSs are capable of supporting all eight relational operators.

A NOTE ON SET THEORY
Set theory is one of the most fundamental concepts in mathematics.1 The theory is based on the idea that elements have membership in a set. Given two sets, A and B, we say that A is a member of B, which can be written as $A \in B$. Alternatively, we can say that the set B contains A as its element. The elements of a set can be numbers, the names of students who enrolled in a course or the flight numbers of all the flights operated by an airline. Each set is then determined by its elements and each element in a set is unique.

Venn diagrams2 are a way of visually representing sets. Suppose we have the following two sets:

Set A = Students who take the Databases unit {Sarah, Phinda, Paul, Hamzah, Mikla}
Set B = Students who take the Programming unit {Paul, Mikla, Asanda, Kiki, Craig}

Some of the students in set A appear also in set B and vice versa. We can represent these facts using a Venn diagram as shown in Figure 4.1.

FIGURE 4.1 A simple Venn diagram

In Figure 4.1, the two circles represent the two sets A and B. The students who appear in both sets are Paul and Mikla. These students take both the Database and Programming units and will go in the overlapping sections of the two circles. Sarah, Phinda and Hamzah only take the Database unit, so these go only in the left-hand circle, whilst Asanda, Kiki and Craig only take the Programming unit and only appear in the right-hand circle.

We will be using Venn diagrams throughout this chapter to illustrate the three relational set operators: union, intersection and difference.

1 Karel Hrbacek and Thomas Jech, Introduction to Set Theory, third edn. Marcel Dekker, Inc., 1999.
2 John Venn (1880) On the Diagrammatic and Mechanical Representation of Propositions and Reasonings. Dublin Philosophical Magazine and Journal of Science 9(59): 1-18.
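The regions of the Venn diagram for the two student sets can be checked with Python's built-in set type. This is a small sketch; the variable names are chosen for this example only.

```python
# The two sets of students described in the note on set theory.
databases = {"Sarah", "Phinda", "Paul", "Hamzah", "Mikla"}
programming = {"Paul", "Mikla", "Asanda", "Kiki", "Craig"}

# Overlapping section of the two circles: students in both sets.
both = databases & programming

# Left-hand circle only and right-hand circle only.
only_databases = databases - programming
only_programming = programming - databases

# Everything covered by either circle.
either = databases | programming

# Membership test: is an element a member of a set?
paul_takes_databases = "Paul" in databases

print(sorted(both))              # ['Mikla', 'Paul']
print(sorted(only_databases))    # ['Hamzah', 'Phinda', 'Sarah']
print(sorted(only_programming))  # ['Asanda', 'Craig', 'Kiki']
print(len(either))               # 8
```

The operators `|`, `&` and `-` correspond to union, intersection and difference; because a set holds each element only once, Paul and Mikla are counted once in the union.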

4.1.1 Selection

The relational operator SELECT, also known as RESTRICT, can be used to list all of the row values, or it can return only those row values that match a specified criterion. In other words, SELECT returns a horizontal subset of a relation. The SELECT operator, denoted by $\sigma_{\theta}$, is formally defined as:

$\sigma_{\theta}(R)$ or $\sigma_{\langle\text{criterion}\rangle}(\text{RELATION})$

where $\sigma_{\theta}(R)$ is the set of specified tuples of the relation R and $\theta$ is the predicate (or criterion) to extract the required tuples.

NOTE
The Euro, denoted as €, became the official currency of 12 European member states in 2002. Today the Euro is used by more than 175 million Europeans in 19 of 28 EU member countries, as well as some countries that are not formally members of the EU.

Figure 4.2 (a) shows visually how rows within a relation are selected. An example of a relation that contains information about products which are sold in a store is shown in Figure 4.2 (b). Figure 4.2 (c) shows the effects of selecting all rows with no criteria. The criterion specified in Figure 4.2 (d) selects only those rows where the price is less than 2.00 and, in Figure 4.2 (e), only the row containing P_CODE 123456 is displayed.

FIGURE 4.2 The SELECT operator
Database name: Ch04_Relational_DB_Operators

Figure 4.2 (a) SELECTION
[Diagram: Rows 1 to 5 against Columns 1 and 2]

Figure 4.2 (b) The PRODUCT Relation
P_CODE  P_DESCRIPT   PRICE
123456  Flashlight   4.16
123457  Lamp         19.87
123458  Box Fan      8.68
213345  9 v battery  1.52
254467  100 W bulb   1.16
311452  Powerdrill   27.64

Figure 4.2 (c) $\sigma(\text{PRODUCT})$
P_CODE  P_DESCRIPT   PRICE
123456  Flashlight   4.16
123457  Lamp         19.87
123458  Box Fan      8.68
213345  9 v battery  1.52
254467  100 W bulb   1.16
311452  Powerdrill   27.64

Figure 4.2 (d) $\sigma_{\text{price} < 2.00}(\text{PRODUCT})$
P_CODE  P_DESCRIPT   PRICE
213345  9 v battery  1.52
254467  100 W bulb   1.16

Figure 4.2 (e) $\sigma_{\text{p\_code} = 123456}(\text{PRODUCT})$
P_CODE  P_DESCRIPT   PRICE
123456  Flashlight   4.16

It is also possible to create more complex criteria by using the logical operators AND, OR and NOT. Figure 4.3 illustrates the use of the logical AND operator using the COURSE relation, which stores information about courses offered at Tiny University. Figure 4.3 (b) shows the new relation, which contains only the tuples where the DEPT_CODE is CIS and the CRS_CREDIT value is 4.
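The three selections on the PRODUCT relation in Figure 4.2 can be reproduced with a short sketch. It assumes a relation is held as a list of dictionaries; the function name `select` and its default "match everything" predicate are choices made for this illustration, not part of relational algebra itself.

```python
from decimal import Decimal

# The PRODUCT relation of Figure 4.2 (b).
PRODUCT = [
    {"p_code": 123456, "p_descript": "Flashlight", "price": Decimal("4.16")},
    {"p_code": 123457, "p_descript": "Lamp", "price": Decimal("19.87")},
    {"p_code": 123458, "p_descript": "Box Fan", "price": Decimal("8.68")},
    {"p_code": 213345, "p_descript": "9 v battery", "price": Decimal("1.52")},
    {"p_code": 254467, "p_descript": "100 W bulb", "price": Decimal("1.16")},
    {"p_code": 311452, "p_descript": "Powerdrill", "price": Decimal("27.64")},
]


def select(relation, predicate=lambda row: True):
    """Return the horizontal subset of rows for which predicate(row) is true."""
    return [row for row in relation if predicate(row)]


# Figure 4.2 (c): no criterion, so every row is returned.
all_rows = select(PRODUCT)

# Figure 4.2 (d): price less than 2.00.
cheap = select(PRODUCT, lambda row: row["price"] < Decimal("2.00"))

# Figure 4.2 (e): p_code equal to 123456.
flashlight = select(PRODUCT, lambda row: row["p_code"] == 123456)

print([row["p_descript"] for row in cheap])       # ['9 v battery', '100 W bulb']
print([row["p_descript"] for row in flashlight])  # ['Flashlight']
```

Each result has the same attributes as PRODUCT and only fewer rows, which is what "horizontal subset" means.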

Online Content
All of the databases used to illustrate the material in this chapter are found on the online platform for this book. The database names used in the folder match the database names used in the figures.
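A compound criterion such as the one applied to the COURSE relation in Figure 4.3 can be sketched in the same way. The example below assumes the COURSE rows shown in Figure 4.3 (a); the function name `select` and the OR and NOT queries are illustrative additions that do not appear in the figure.

```python
# The COURSE relation of Figure 4.3 (a).
COURSE = [
    {"crs_code": "ACCT-211", "dept_code": "ACCT",
     "crs_description": "Accounting I", "crs_credit": 3},
    {"crs_code": "ACCT-212", "dept_code": "ACCT",
     "crs_description": "Accounting II", "crs_credit": 3},
    {"crs_code": "CIS-220", "dept_code": "CIS",
     "crs_description": "Introduction to Computer Science", "crs_credit": 3},
    {"crs_code": "CIS-420", "dept_code": "CIS",
     "crs_description": "Database Design and Implementation", "crs_credit": 4},
    {"crs_code": "QM-261", "dept_code": "CIS",
     "crs_description": "Intro. to Statistics", "crs_credit": 3},
    {"crs_code": "QM-362", "dept_code": "CIS",
     "crs_description": "Statistical Applications", "crs_credit": 4},
]


def select(relation, predicate):
    """Return the rows of the relation for which predicate(row) is true."""
    return [row for row in relation if predicate(row)]


# AND: both conditions must hold (the selection shown in Figure 4.3 (b)).
cis_four = select(
    COURSE, lambda r: r["dept_code"] == "CIS" and r["crs_credit"] == 4)

# OR: at least one condition must hold (illustrative query).
acct_or_four = select(
    COURSE, lambda r: r["dept_code"] == "ACCT" or r["crs_credit"] == 4)

# NOT: the condition must not hold (illustrative query).
not_cis = select(COURSE, lambda r: not r["dept_code"] == "CIS")

print([r["crs_code"] for r in cis_four])      # ['CIS-420', 'QM-362']
print([r["crs_code"] for r in acct_or_four])  # ['ACCT-211', 'ACCT-212', 'CIS-420', 'QM-362']
print([r["crs_code"] for r in not_cis])       # ['ACCT-211', 'ACCT-212']
```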

FIGURE 4.3 Selecting from the COURSE relation
Database name: Ch04_TinyUniversity

Figure 4.3 (a) the COURSE Relation
CRS_CODE  DEPT_CODE  CRS_DESCRIPTION                     CRS_CREDIT
ACCT-211  ACCT       Accounting I                        3
ACCT-212  ACCT       Accounting II                       3
CIS-220   CIS        Introduction to Computer Science    3
CIS-420   CIS        Database Design and Implementation  4
QM-261    CIS        Intro. to Statistics                3
QM-362    CIS        Statistical Applications            4

Figure 4.3 (b) $\sigma_{\text{dept\_code} = \text{CIS AND crs\_credit} = 4}(\text{COURSE})$
CRS_CODE  DEPT_CODE  CRS_DESCRIPTION                     CRS_CREDIT
CIS-420   CIS        Database Design and Implementation  4
QM-362    CIS        Statistical Applications            4

4.1.2 Projection

The PROJECT operator returns all values for selected attributes. In other words, PROJECT returns a vertical subset of a relation excluding any duplicates. The PROJECT operator, denoted by $\Pi$, is formally defined as:

$\Pi_{a_1 \ldots a_n}(R)$ or $\Pi_{\langle\text{List of attributes}\rangle}(\text{Relation})$

where the projection of the relation R, denoted by $\Pi_{a_1 \ldots a_n}(R)$, is the set of specified attributes $a_1 \ldots a_n$ of the relation R.

Figure 4.4 (a) shows visually how columns within a relation are selected. Figure 4.4 (b) shows a relation that stores information about the products which are sold in a store. Figure 4.4 (c) shows the effect of applying the PROJECT operator on the PRODUCT relation, to create a new relation containing only the attribute PRICE. The two further examples in Figure 4.4 (d) and (e) illustrate the PROJECT relational operator. Notice that the order of attributes is maintained in the resulting relations.

FIGURE 4.4 The PROJECT operator
Database name: Ch04_Relational_DB_Operators

Figure 4.4 (a) PROJECTION (diagram: Column 1 and Column 2 selected across Rows 1 to 5)

Figure 4.4 (b) The PRODUCT relation

P_CODE   P_DESCRIPT    PRICE
123456   Flashlight     4.16
123457   Lamp          19.87
123458   Box Fan        8.68
213345   9 v battery    1.52
254467   100 W bulb     1.16
311452   Powerdrill    27.64

Figure 4.4 (c) $\Pi_{\text{price}}(\text{PRODUCT})$

PRICE
 4.16
19.87
 8.68
 1.52
 1.16
27.64

Figure 4.4 (d) $\Pi_{\text{p\_descript, price}}(\text{PRODUCT})$

P_DESCRIPT    PRICE
Flashlight     4.16
Lamp          19.87
Box Fan        8.68
9 v battery    1.52
100 W bulb     1.16
Powerdrill    27.64

Figure 4.4 (e) $\Pi_{\text{p\_code, price}}(\text{PRODUCT})$

P_CODE   PRICE
123456    4.16
123457   19.87
123458    8.68
213345    1.52
254467    1.16
311452   27.64
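The projections in Figure 4.4 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the relation is held as a list of dictionaries, and the helper name `project` is hypothetical rather than anything defined in the text.

```python
# PRODUCT relation from Figure 4.4 (b): one dict per tuple.
PRODUCT = [
    {"P_CODE": "123456", "P_DESCRIPT": "Flashlight", "PRICE": 4.16},
    {"P_CODE": "123457", "P_DESCRIPT": "Lamp", "PRICE": 19.87},
    {"P_CODE": "123458", "P_DESCRIPT": "Box Fan", "PRICE": 8.68},
    {"P_CODE": "213345", "P_DESCRIPT": "9 v battery", "PRICE": 1.52},
    {"P_CODE": "254467", "P_DESCRIPT": "100 W bulb", "PRICE": 1.16},
    {"P_CODE": "311452", "P_DESCRIPT": "Powerdrill", "PRICE": 27.64},
]


def project(relation, attributes):
    """Keep only `attributes`, in the order given, and drop duplicate tuples."""
    seen = set()
    result = []
    for row in relation:
        values = tuple(row[name] for name in attributes)
        if values not in seen:  # a relation never holds duplicate tuples
            seen.add(values)
            result.append(dict(zip(attributes, values)))
    return result


prices = project(PRODUCT, ["PRICE"])                    # Figure 4.4 (c)
code_and_price = project(PRODUCT, ["P_CODE", "PRICE"])  # Figure 4.4 (e)
```

Because every PRICE value in PRODUCT is distinct, the projection on PRICE still has six tuples; had two products shared a price, only one copy of that value would survive.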

4.1.3 UNION

The UNION operator combines all tuples from two relations, excluding duplicate tuples. The relations must have the same attribute characteristics (the columns and domains must be identical) to be used in the UNION. When two or more tables share the same number of columns, i.e. have the same degree, and when they share the same (or compatible) domains, they are said to be union-compatible.

The UNION set operator, denoted by $\cup$, is formally defined as:

The union of relations $R_1(a_1, a_2, \ldots, a_n)$ and $R_2(b_1, b_2, \ldots, b_n)$, denoted $R_1 \cup R_2$ with degree $n$, is the relation $R_3(c_1, c_2, \ldots, c_n)$ where for each $i$ ($i = 1, 2, \ldots, n$), $a_i$ and $b_i$ must have compatible domains.

The degree of $R_3$ is the same as that of $R_1$ and $R_2$. However, if $a$ and $b$ are the cardinalities of $R_1$ and $R_2$ respectively, the cardinality of $R_3$ is $a + b$ only if $R_1$ and $R_2$ have no tuples in common. Otherwise it is smaller, since each duplicated tuple appears only once in the result.

Figure 4.5 (a) visually shows $R_1 \cup R_2$. Figure 4.5 (b) to (d) shows the effect of the UNION operator on relations PRODUCT1 and PRODUCT2. Both PRODUCT1 and PRODUCT2 are union-compatible as they have the same degree and share the same domains.

FIGURE 4.5 The UNION operator
Database name: Ch04_Relational_DB_Operators

Figure 4.5 (a) R1 Union R2 (diagram of $R_1 \cup R_2$)

Figure 4.5 (b) The UNION_PRODUCT1 relation

P_CODE   P_DESCRIPT    PRICE
123456   Flashlight     4.16
123457   Lamp          19.87
123458   Box Fan        8.68
213345   9 v battery    1.52
254467   100 W bulb     1.16
311452   Powerdrill    27.64

Figure 4.5 (c) The UNION_PRODUCT2 relation

P_CODE   P_DESCRIPT    PRICE
345678   Microwave    126.40
345679   Dishwasher   395.00

Figure 4.5 (d) Result of UNION_PRODUCT1 $\cup$ UNION_PRODUCT2

P_CODE   P_DESCRIPT    PRICE
123456   Flashlight     4.16
123457   Lamp          19.87
123458   Box Fan        8.68
213345   9 v battery    1.52
254467   100 W bulb     1.16
311452   Powerdrill    27.64
345678   Microwave    126.40
345679   Dishwasher   395.00
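The effect of UNION on cardinality can be checked with a small Python sketch. It assumes each relation is a set of equal-length tuples and only verifies that the degrees agree; checking that the domains are compatible is left out, and the function name `union` is an illustrative choice.

```python
# Relations from Figure 4.5 (b) and (c), held as sets of tuples.
UNION_PRODUCT1 = {
    ("123456", "Flashlight", 4.16),
    ("123457", "Lamp", 19.87),
    ("123458", "Box Fan", 8.68),
    ("213345", "9 v battery", 1.52),
    ("254467", "100 W bulb", 1.16),
    ("311452", "Powerdrill", 27.64),
}
UNION_PRODUCT2 = {
    ("345678", "Microwave", 126.40),
    ("345679", "Dishwasher", 395.00),
}


def union(r1, r2):
    """UNION of two relations; raises if their degrees differ."""
    degrees = {len(t) for t in r1} | {len(t) for t in r2}
    if len(degrees) > 1:
        raise ValueError("relations are not union-compatible (degrees differ)")
    return r1 | r2  # set union removes duplicate tuples


result = union(UNION_PRODUCT1, UNION_PRODUCT2)
# A second operand that repeats a tuple already in UNION_PRODUCT1.
overlapping = union(UNION_PRODUCT1, {("123456", "Flashlight", 4.16)})
```

Here `result` holds 6 + 2 = 8 tuples because the operands share nothing, while `overlapping` still holds 6, which is the case where the cardinality falls below $a + b$.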

Figure 4.6 shows the effects of the UNION operator when two relations contain duplicate tuples. Notice that only one additional tuple has been added in Figure 4.6 (c), as the tuple with CRS_CODE = ACCT-211 already exists in the COURSE relation.

FIGURE 4.6 The Union operator
Database name: Ch04_TinyUniversity

Figure 4.6 (a) The COURSE relation

CRS_CODE   DEPT_CODE  CRS_DESCRIPTION                      CRS_CREDIT
ACCT-211   ACCT       Accounting I                         3
ACCT-212   ACCT       Accounting II                        3
CIS-220    CIS        Introduction to Computer Science     3
CIS-420    CIS        Database Design and Implementation   4
QM-261     CIS        Intro. to Statistics                 3
QM-362     CIS        Statistical Applications             4

Figure 4.6 (b) The COURSE2 relation

CRS_CODE   DEPT_CODE  CRS_DESCRIPTION                      CRS_CREDIT
ACCT-211   ACCT       Accounting I                         3
CIS-430    CIS        Advanced Databases                   6

Figure 4.6 (c) Result of COURSE $\cup$ COURSE2

CRS_CODE   DEPT_CODE  CRS_DESCRIPTION                      CRS_CREDIT
ACCT-211   ACCT       Accounting I                         3
ACCT-212   ACCT       Accounting II                        3
CIS-220    CIS        Introduction to Computer Science     3
CIS-420    CIS        Database Design and Implementation   4
QM-261     CIS        Intro. to Statistics                 3
QM-362     CIS        Statistical Applications             4
CIS-430    CIS        Advanced Databases                   6

If two relations are not union-compatible, then the UNION operator cannot be applied as the results would be invalid. For example, applying the UNION operator to the COURSE relation in Figure 4.6 (a) and the CLASS relation in Figure 4.7 (a) is not allowed (COURSE $\cup$ CLASS). In order to get around this problem, the PROJECT operator could be used to restrict the columns in each relation over a common attribute. In the example, both relations COURSE and CLASS have a common attribute, CRS_CODE. We could therefore write $\Pi_{\text{CRS\_CODE}}(\text{COURSE}) \cup \Pi_{\text{CRS\_CODE}}(\text{CLASS})$ to obtain the resulting relation shown in Figure 4.7 (b).

FIGURE 4.7 The Union operator: not union-compatible example
Database name: Ch04_TinyUniversity

Figure 4.7 (a) The CLASS relation

CLASS_CODE  CRS_CODE  CLASS_SECTION  CLASS_TIME             CLASS_ROOM  LECTURER_NUM
10012       ACCT-211  1              MWF 8:00-8:50 a.m.     BUS311      105
10013       ACCT-211  2              MWF 9:00-9:50 a.m.     BUS200      105
10014       ACCT-211  3              TTh 2:30-3:45 p.m.     BUS252      342
10015       ACCT-212  1              MWF 10:00-10:50 a.m.   BUS311      301
10016       ACCT-212  2              Th 6:00-8:40 p.m.      BUS252      301
10017       CIS-220   1              MWF 9:00-9:50 a.m.     KLR209      228
10018       CIS-220   2              MWF 9:00-9:50 a.m.     KLR211      114
10019       CIS-220   3              MWF 10:00-10:50 a.m.   KLR209      228
10020       CIS-420   1              W 6:00-8:40 p.m.       KLR209      162
10021       QM-261    1              MWF 8:00-8:50 a.m.     KLR200      114
10022       QM-261    2              TTh 1:00-2:15 p.m.     KLR200      114
10023       QM-362    1              MWF 11:00-11:50 a.m.   KLR200      162
10024       QM-362    2              TTh 2:30-3:45 p.m.     KLR200      162

Figure 4.7 (b) Result of $\Pi_{\text{CRS\_CODE}}(\text{COURSE}) \cup \Pi_{\text{CRS\_CODE}}(\text{CLASS})$

CRS_CODE
ACCT-211
ACCT-212
CIS-220
CIS-420
QM-261
QM-362
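The workaround of projecting both relations onto CRS_CODE before taking the union can be sketched as follows. The tuple layout and the helper `project_column` are assumptions made for this illustration, and only three of the thirteen CLASS tuples are included to keep the example short.

```python
# COURSE tuples: (CRS_CODE, DEPT_CODE, CRS_DESCRIPTION, CRS_CREDIT), degree 4.
COURSE = {
    ("ACCT-211", "ACCT", "Accounting I", 3),
    ("ACCT-212", "ACCT", "Accounting II", 3),
    ("CIS-220", "CIS", "Introduction to Computer Science", 3),
    ("CIS-420", "CIS", "Database Design and Implementation", 4),
    ("QM-261", "CIS", "Intro. to Statistics", 3),
    ("QM-362", "CIS", "Statistical Applications", 4),
}
# A sample of CLASS tuples (degree 6); CRS_CODE is the second attribute.
CLASS = {
    ("10012", "ACCT-211", 1, "MWF 8:00-8:50 a.m.", "BUS311", 105),
    ("10017", "CIS-220", 1, "MWF 9:00-9:50 a.m.", "KLR209", 228),
    ("10024", "QM-362", 2, "TTh 2:30-3:45 p.m.", "KLR200", 162),
}


def project_column(relation, index):
    """Project a relation onto one attribute, given its position in the tuple."""
    return {(t[index],) for t in relation}


course_degree = len(next(iter(COURSE)))  # 4
class_degree = len(next(iter(CLASS)))    # 6, so COURSE and CLASS cannot be unioned

# After projection both operands have degree 1 and are union-compatible.
course_codes = project_column(COURSE, 0)
class_codes = project_column(CLASS, 1)
result = course_codes | class_codes
```

Every course code used in CLASS already appears in COURSE, so the union contains the same six codes listed in Figure 4.7 (b).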

4.1.4 INTERSECT

The INTERSECT operator, denoted as $\cap$, returns only the tuples that appear in both relations. As was true in the case of UNION, the tables must be union-compatible to give valid results. For example, you cannot use INTERSECT if one of the attributes in the first table is numeric and the corresponding one in the second table is character-based. The INTERSECT operator is formally defined as:

The intersect of relations $R_1(a_1, a_2, \ldots, a_n)$ and $R_2(b_1, b_2, \ldots, b_n)$, denoted $R_1 \cap R_2$ with degree $n$, is the relation $R_3(c_1, c_2, \ldots, c_n)$ that includes only those tuples of $R_1$ that also appear in $R_2$, where for each $i$ ($i = 1, 2, \ldots, n$), $a_i$ and $b_i$ must have compatible domains.

Figure 4.8 (a) visually shows $R_1 \cap R_2$.

The effect of applying the INTERSECT operator to the first name column (F_NAME) in two relations is shown in Figure 4.8 (d). Only Kuhle and Jorge appear in the final relation as they are the only two F_NAMEs that appear in both INTERSECT_RELATION_1 and INTERSECT_RELATION_2.

FIGURE 4.8 The INTERSECT operator
Database name: Ch04_Relational_DB_Operators

Figure 4.8 (a) R1 INTERSECT R2 (diagram of $R_1 \cap R_2$)

Figure 4.8 (b) The INTERSECT_RELATION_1 relation

F_NAME
George
Kuhle
Elaine
Piet
Jorge

Figure 4.8 (c) The INTERSECT_RELATION_2 relation

F_NAME
Kuhle
William
Jorge
Dennis

Figure 4.8 (d) Result of INTERSECT_RELATION_1 $\cap$ INTERSECT_RELATION_2

F_NAME
Kuhle
Jorge
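As a quick check of Figure 4.8, the intersection can be written with Python sets. The relations are modelled here as sets of one-element tuples, which is an assumption of this sketch rather than notation from the text.

```python
# F_NAME values from Figure 4.8 (b) and (c), one single-attribute tuple each.
INTERSECT_RELATION_1 = {("George",), ("Kuhle",), ("Elaine",), ("Piet",), ("Jorge",)}
INTERSECT_RELATION_2 = {("Kuhle",), ("William",), ("Jorge",), ("Dennis",)}


def intersect(r1, r2):
    """Return the tuples that appear in both union-compatible relations."""
    return r1 & r2


result = intersect(INTERSECT_RELATION_1, INTERSECT_RELATION_2)
```

Unlike DIFFERENCE, the order of the operands does not matter here: `intersect(r2, r1)` gives the same two tuples.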

4.1.5 DIFFERENCE

The DIFFERENCE operator returns all tuples in one relation that are not found in the other relation; that is, it subtracts one relation from the other. The DIFFERENCE operator also requires that the two relations must be union-compatible. The DIFFERENCE operator is formally defined as:

The difference of relations $R_1(a_1, a_2, \ldots, a_m)$ and $R_2(b_1, b_2, \ldots, b_m)$, denoted $R_1 - R_2$ with degree $m$, is the relation $R_3(c_1, c_2, \ldots, c_m)$ that includes all tuples that are in $R_1$ but not in $R_2$, where for each $i$ ($i = 1, 2, \ldots, m$), $a_i$ and $b_i$ must have compatible domains.

Figure 4.9 (a) shows how $R_1 - R_2$ can be visualised. The effect of applying the DIFFERENCE operator to two relations is shown in Figure 4.9. The resulting relation in Figure 4.9 (d) shows only the values of F_NAME George, Elaine and Piet, as these are the only values that appear in DIFF_RELATION_1 and not in DIFF_RELATION_2.

Note that the order of the relations is important in the DIFFERENCE operator: $A - B$ will not give the same result as $B - A$.

FIGURE 4.9 The DIFFERENCE operator
Database name: Ch04_Relational_DB_Operators

Figure 4.9 (a) R1 - R2 (diagram of $R_1 - R_2$)

Figure 4.9 (b) The DIFF_RELATION_1 relation

F_NAME
George
Kuhle
Elaine
Piet
Jorge

Figure 4.9 (c) The DIFF_RELATION_2 relation

F_NAME
Kuhle
William
Jorge
Dennis

Figure 4.9 (d) Result of DIFF_RELATION_1 - DIFF_RELATION_2

F_NAME
George
Elaine
Piet
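The point that operand order matters for DIFFERENCE is easy to confirm with a short sketch. As before, the set-of-tuples representation and the function name `difference` are assumptions for illustration only.

```python
# F_NAME values from Figure 4.9 (b) and (c).
DIFF_RELATION_1 = {("George",), ("Kuhle",), ("Elaine",), ("Piet",), ("Jorge",)}
DIFF_RELATION_2 = {("Kuhle",), ("William",), ("Jorge",), ("Dennis",)}


def difference(r1, r2):
    """Return the tuples of r1 that do not appear in r2."""
    return r1 - r2


one_minus_two = difference(DIFF_RELATION_1, DIFF_RELATION_2)  # Figure 4.9 (d)
two_minus_one = difference(DIFF_RELATION_2, DIFF_RELATION_1)  # operands swapped
```

The first result contains George, Elaine and Piet; swapping the operands yields William and Dennis instead.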

4.1.6 CARTESIAN PRODUCT

The CARTESIAN PRODUCT is usually written as $R_1 \times R_2$, with the new resulting relation $R_3$ containing all the attributes that are present in both $R_1$ and $R_2$ along with all the possible combinations of tuples from $R_1$ and $R_2$. It can be formally defined as:

The CARTESIAN PRODUCT of two relations $R_1(a_1, a_2, \ldots, a_n)$ with cardinality $i$ and $R_2(b_1, b_2, \ldots, b_m)$ with cardinality $j$ is a relation $R_3$ with degree $k = n + m$, cardinality $i * j$ and attributes $(a_1, a_2, \ldots, a_n, b_1, b_2, \ldots, b_m)$. This can be denoted as $R_3 = R_1 \times R_2$.

Therefore, if one relation has six rows and four attributes, and the other relation has three rows and two attributes, the new relation would be composed of $6 \times 3 = 18$ rows and $4 + 2 = 6$ attributes, i.e. the cardinality would be 18 tuples and the degree would be 6.

Figure 4.10 (c) shows how the CARTESIAN PRODUCT operation combines the PRODUCT and LOCATION relations shown in Figures 4.10 (a) and (b) respectively. You can see in Figure 4.10 (c) that the result of PRODUCT $\times$ LOCATION is a new relation with a cardinality of 18 ($6 \times 3$) and a degree of 6 ($3 + 3$). The CARTESIAN PRODUCT operator by itself is not a very useful operator as it creates many tuples that have no association with each other. However, if it is used in conjunction with the RESTRICT (SELECT) operator, it becomes a very important operator known as a JOIN.

FIGURE 4.10 The CARTESIAN PRODUCT
Database name: Ch04_Relational_DB_Operators

Figure 4.10 (a) The PRODUCT relation

P_CODE   P_DESCRIPT    PRICE
123456   Flashlight     4.16
123457   Lamp          19.87
123458   Box Fan        8.68
213345   9 v battery    1.52
254467   100 W bulb     1.16
311452   Powerdrill    27.64

Figure 4.10 (b) The LOCATION relation

STORE  AISLE  SHELF
23     W      5
24     K      9
25     Z      6

Figure 4.10 (c) PRODUCT $\times$ LOCATION

P_CODE   P_DESCRIPT    PRICE  STORE  AISLE  SHELF
123456   Flashlight     4.16  23     W      5
123456   Flashlight     4.16  24     K      9
123456   Flashlight     4.16  25     Z      6
123457   Lamp          19.87  23     W      5
123457   Lamp          19.87  24     K      9
123457   Lamp          19.87  25     Z      6
123458   Box Fan        8.68  23     W      5
123458   Box Fan        8.68  24     K      9
123458   Box Fan        8.68  25     Z      6
213345   9 v battery    1.52  23     W      5
213345   9 v battery    1.52  24     K      9
213345   9 v battery    1.52  25     Z      6
254467   100 W bulb     1.16  23     W      5
254467   100 W bulb     1.16  24     K      9
254467   100 W bulb     1.16  25     Z      6
311452   Powerdrill    27.64  23     W      5
311452   Powerdrill    27.64  24     K      9
311452   Powerdrill    27.64  25     Z      6
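The degree and cardinality rules for the CARTESIAN PRODUCT can be verified with a brief sketch built on Python's standard `itertools.product`. The tuple layout and the wrapper name `cartesian_product` are assumptions of this example.

```python
from itertools import product

# PRODUCT tuples: (P_CODE, P_DESCRIPT, PRICE); LOCATION tuples: (STORE, AISLE, SHELF).
PRODUCT = [
    ("123456", "Flashlight", 4.16),
    ("123457", "Lamp", 19.87),
    ("123458", "Box Fan", 8.68),
    ("213345", "9 v battery", 1.52),
    ("254467", "100 W bulb", 1.16),
    ("311452", "Powerdrill", 27.64),
]
LOCATION = [(23, "W", 5), (24, "K", 9), (25, "Z", 6)]


def cartesian_product(r1, r2):
    """Pair every tuple of r1 with every tuple of r2 and concatenate them."""
    return [t1 + t2 for t1, t2 in product(r1, r2)]


result = cartesian_product(PRODUCT, LOCATION)
cardinality = len(result)   # i * j
degree = len(result[0])     # n + m
```

With six PRODUCT tuples and three LOCATION tuples, `cardinality` is 18 and `degree` is 6, matching Figure 4.10 (c).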

4.1.7 DIVISION

The DIVISION operation produces a new relation by selecting the tuples in one relation, $R_1$, that match every row in another relation, $R_2$. It is essentially the inverse of the CARTESIAN PRODUCT operation, just like the arithmetic divide is the inverse of multiplication. DIVISION, denoted by $R_1 \div R_2$, can be formally defined as:

The DIVISION of two relations $R_1(a_1, a_2, \ldots, a_n)$ with cardinality $i$ and $R_2(b_1, b_2, \ldots, b_m)$ with cardinality $j$ is a relation $R_3$ with degree $k = n - m$ and a cardinality of at most $i \div j$. The cardinality is exactly $i \div j$ only when $R_1$ is the Cartesian product $R_3 \times R_2$.

Using the example shown in Figure 4.11, note that:

Table 1 (Figure 4.11 (a)) is divided by Table 2 (Figure 4.11 (b)) to produce Table 3 (Figure 4.11 (c)). Tables 1 and 2 both contain the column CODE but do not share LOC.

To be included in the resulting Table 3, a value in the unshared column (LOC) must be associated in Table 1 with every value of CODE in the dividing Table 2.

The only value associated with both A and B is 5.

FIGURE 4.11 The DIVISION operator
Database name: Ch04_Relational_DB_Operators

Figure 4.11 (a) Division Table 1

CODE  LOC
A     5
A     9
A     4
B     5
B     3
C     6
D     7
D     8
E     8

Figure 4.11 (b) Division Table 2

CODE
A
B

Figure 4.11 (c) Result of Division Table 1 $\div$ Division Table 2

LOC
5
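The division in Figure 4.11 can be reproduced with a minimal sketch. It handles only this two-column case, with Table 1 as (CODE, LOC) pairs and Table 2 as single CODE values; the function name `divide` is hypothetical.

```python
# Division Table 1: (CODE, LOC) pairs; Division Table 2: (CODE,) tuples.
TABLE_1 = {
    ("A", 5), ("A", 9), ("A", 4),
    ("B", 5), ("B", 3),
    ("C", 6),
    ("D", 7), ("D", 8),
    ("E", 8),
}
TABLE_2 = {("A",), ("B",)}


def divide(r1, r2):
    """Return each LOC that is paired in r1 with every CODE listed in r2."""
    required = {code for (code,) in r2}
    result = set()
    for loc in {loc for (_, loc) in r1}:
        codes_for_loc = {code for (code, other) in r1 if other == loc}
        if required <= codes_for_loc:  # loc is associated with all required codes
            result.add((loc,))
    return result


TABLE_3 = divide(TABLE_1, TABLE_2)
```

Only LOC 5 is paired with both A and B, so `TABLE_3` holds a single tuple even though Table 1 has nine rows and Table 2 has two.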

4.2 JOINS

The JOIN operation is one of the essential operations of relational algebra. It is a binary operation that allows the user to combine two relations in a specified way. JOIN operations are the real power behind the relational database, allowing the use of independent tables linked by common attributes. The JOIN of two relations $R_1$ and $R_2$ is a restriction on their Cartesian product $R_1 \times R_2$ to meet a specified criterion. The join itself is defined on an attribute $a$ of $R_1$ and an attribute $b$ of $R_2$, where the attributes $a$ and $b$ share the same domain. A JOIN operator may be formally defined as:

The join of two relations $R_1(a_1, a_2, \ldots, a_n)$ and $R_2(b_1, b_2, \ldots, b_m)$ is a relation $R_3$ with degree $k = n + m$ and attributes $(a_1, a_2, \ldots, a_n, b_1, b_2, \ldots, b_m)$ that satisfy a specific join condition.

In this section we will look at a number of different kinds of join operations including the THETA JOIN, EQUIJOIN, NATURAL JOIN, LEFT OUTER JOIN and RIGHT OUTER JOIN.

4.2.1 Theta Join and Equijoin

One of the most commonly used joins is known as an equijoin, which links tables on the basis of an equality condition that compares specified columns of each table. The outcome of the equijoin does not eliminate duplicate columns, and the condition or criterion used to join the tables must be explicitly defined. The equijoin takes its name from the equality comparison operator (=) used in the condition. If any other comparison operator is used the join is called a theta join, denoted with the symbol $\theta$ ($\theta$-join). So, theta represents a predicate that consists of one of the comparison operators $\{=, <, \le, \ge, \ne, >\}$. The equijoin is therefore one special type of theta join:

Let $R_1(a_1, a_2, \ldots, a_n)$ and $R_2(b_1, b_2, \ldots, b_m)$ be relations that may have different schemas. Then the $\theta$-join of $R_1$ and $R_2$ is denoted as $R_1 \bowtie_{\theta} R_2$ and the equijoin is denoted as $R_1 \bowtie_{R_1.a = R_2.b} R_2$.

It is also possible to express both the $\theta$-join and the equijoin in terms of the restriction and Cartesian product operations. So, for example, the equijoin $R_1 \bowtie_{R_1.a = R_2.b} R_2$ may also be written as $\sigma_{R_1.a = R_2.b}(R_1 \times R_2)$. Looking at the $\theta$-join and the equijoin in this way allows us to create some simple rules, which will allow us to compute such joins on any two relations:

1  Compute $R_1 \times R_2$. This first performs a Cartesian product to form all possible combinations of the rows of $R_1$ and $R_2$.

2  Restrict the Cartesian product to only those rows where the values in certain columns match.

For example, suppose we wish to find out all students who take classes in each department at Tiny University. To answer this query, we must join together the two relations STUDENT-2 and DEPARTMENT-2 shown in Figure 4.12 (a) and (b). Following the two rules stated above, this will first involve finding the Cartesian product of the STUDENT-2 and DEPARTMENT-2 relations, shown in Figure 4.12 (c). Then, we need to restrict the resulting relation in Figure 4.12 (c) to only those tuples that satisfy the join condition on the common column DEPT_CODE, which is found in both relations (Figure 4.12 (d)). In this case, this would be where STUDENT.DEPT_CODE = DEPARTMENT.DEPT_CODE. This query, which we will call STUDENT_IN_DEPT, can be written in relational algebra as:

$\text{STUDENT\_IN\_DEPT} = \sigma_{\text{STUDENT.DEPT\_CODE} = \text{DEPARTMENT.DEPT\_CODE}}(\text{STUDENT} \times \text{DEPARTMENT})$

FIGURE 4.12 Equijoin example
Database name: Ch04_TinyUniversity

Figure 4.12 (a) The STUDENT-2 relation

STU_NUM  STU_LNAME  STU_FNAME  STU_DOB            DEPT_CODE
321452   Ndlovu     Amehlo     12 February 1992   BIOL
324257   Smithson   Anne       15 November 1997   CIS
324258   Le Roux    Dan        23 August 1986     ACCT
324269   Oblonski   Walter     16 September 1993  CIS
324273   Smith      John       30 December 1975   ENGL

Figure 4.12 (b) The DEPARTMENT-2 relation

DEPT_CODE  DEPT_NAME
ACCT       Accounting
BIOL       Biology
CIS        Computer Info. Systems
ENGL       English

Figure 4.12 (c) The Cartesian product (STUDENT $\times$ DEPARTMENT)

STU_NUM  STU_LNAME  STU_FNAME  STU_DOB            S.DEPT_CODE  D.DEPT_CODE  DEPT_NAME
321452   Ndlovu     Amehlo     12 February 1992   BIOL         ACCT         Accounting
321452   Ndlovu     Amehlo     12 February 1992   BIOL         BIOL         Biology
321452   Ndlovu     Amehlo     12 February 1992   BIOL         CIS          Computer Info. Systems
321452   Ndlovu     Amehlo     12 February 1992   BIOL         ENGL         English
324257   Smithson   Anne       15 November 1997   CIS          ACCT         Accounting
324257   Smithson   Anne       15 November 1997   CIS          BIOL         Biology
324257   Smithson   Anne       15 November 1997   CIS          CIS          Computer Info. Systems
324257   Smithson   Anne       15 November 1997   CIS          ENGL         English
324258   Le Roux    Dan        23 August 1986     ACCT         ACCT         Accounting
324258   Le Roux    Dan        23 August 1986     ACCT         BIOL         Biology
324258   Le Roux    Dan        23 August 1986     ACCT         CIS          Computer Info. Systems
324258   Le Roux    Dan        23 August 1986     ACCT         ENGL         English
324269   Oblonski   Walter     16 September 1993  CIS          ACCT         Accounting
324269   Oblonski   Walter     16 September 1993  CIS          BIOL         Biology
324269   Oblonski   Walter     16 September 1993  CIS          CIS          Computer Info. Systems
324269   Oblonski   Walter     16 September 1993  CIS          ENGL         English
324273   Smith      John       30 December 1975   ENGL         ACCT         Accounting
324273   Smith      John       30 December 1975   ENGL         BIOL         Biology
324273   Smith      John       30 December 1975   ENGL         CIS          Computer Info. Systems
324273   Smith      John       30 December 1975   ENGL         ENGL         English

Figure 4.12 (d) The final relation $\text{STUDENT\_IN\_DEPT} = \sigma_{\text{STUDENT.DEPT\_CODE} = \text{DEPARTMENT.DEPT\_CODE}}(\text{STUDENT} \times \text{DEPARTMENT})$

STU_NUM  STU_LNAME  STU_FNAME  STU_DOB            S.DEPT_CODE  D.DEPT_CODE  DEPT_NAME
321452   Ndlovu     Amehlo     12 February 1992   BIOL         BIOL         Biology
324257   Smithson   Anne       15 November 1997   CIS          CIS          Computer Info. Systems
324258   Le Roux    Dan        23 August 1986     ACCT         ACCT         Accounting
324269   Oblonski   Walter     16 September 1993  CIS          CIS          Computer Info. Systems
324273   Smith      John       30 December 1975   ENGL         ENGL         English

Notice in Figure 4.12 (c) that there are two columns called DEPT_CODE. This is because STUDENT-2 and DEPARTMENT-2 both contain a column of the same name. In this case DEPT_CODE also shares the same domain and provides referential integrity between the two relations. In order to distinguish between them, a prefix of S and D has been added to the name of these columns, i.e. S.DEPT_CODE and D.DEPT_CODE, to make them easier to read. You can also see these two common columns again in the resulting relation in Figure 4.12 (d), as the equijoin does not eliminate duplicate columns. Ideally, it would be far better not to show duplicate columns in the resulting relation. As equijoins are so common, an operator called the natural join was defined.
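The two rules for computing a join (form the Cartesian product, then restrict it) can be sketched as a small Python function. This is an illustration under stated assumptions: relations are lists of dictionaries, only a few STUDENT-2 attributes are carried to keep the data short, and the name `theta_join` and its parameters are hypothetical.

```python
import operator

# Trimmed STUDENT-2 and DEPARTMENT-2 relations from Figure 4.12 (a) and (b).
STUDENT = [
    {"STU_NUM": "321452", "STU_LNAME": "Ndlovu", "DEPT_CODE": "BIOL"},
    {"STU_NUM": "324257", "STU_LNAME": "Smithson", "DEPT_CODE": "CIS"},
    {"STU_NUM": "324258", "STU_LNAME": "Le Roux", "DEPT_CODE": "ACCT"},
    {"STU_NUM": "324269", "STU_LNAME": "Oblonski", "DEPT_CODE": "CIS"},
    {"STU_NUM": "324273", "STU_LNAME": "Smith", "DEPT_CODE": "ENGL"},
]
DEPARTMENT = [
    {"DEPT_CODE": "ACCT", "DEPT_NAME": "Accounting"},
    {"DEPT_CODE": "BIOL", "DEPT_NAME": "Biology"},
    {"DEPT_CODE": "CIS", "DEPT_NAME": "Computer Info. Systems"},
    {"DEPT_CODE": "ENGL", "DEPT_NAME": "English"},
]


def theta_join(r1, r2, left, right, compare=operator.eq, prefixes=("S", "D")):
    """Restrict r1 x r2 to rows where compare(r1.left, r2.right) holds."""
    p1, p2 = prefixes
    joined = []
    for t1 in r1:                  # rule 1: every combination of rows
        for t2 in r2:
            if compare(t1[left], t2[right]):  # rule 2: keep matching rows only
                row = {f"{p1}.{name}": value for name, value in t1.items()}
                row.update({f"{p2}.{name}": value for name, value in t2.items()})
                joined.append(row)
    return joined


# Equijoin: the default comparison is equality.
STUDENT_IN_DEPT = theta_join(STUDENT, DEPARTMENT, "DEPT_CODE", "DEPT_CODE")
# A theta join with a different comparison operator (not equal).
other_depts = theta_join(STUDENT, DEPARTMENT, "DEPT_CODE", "DEPT_CODE", operator.ne)
```

Each row of `STUDENT_IN_DEPT` carries both `S.DEPT_CODE` and `D.DEPT_CODE`, which mirrors the duplicate columns visible in Figure 4.12 (d).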

4.2.2 The Natural Join

The natural join operation is the most common variant of the joins. The natural join operation requires that the two operand relations must have at least one common attribute, i.e. attributes that share the same domain. The common column(s) is (are) referred to as the join column(s). The natural join is in fact an equijoin; however, in addition, we drop the duplicate attributes, so the resulting relation contains one less column than that of the equijoin for each join column.

Let $R_1$ be a relation having attributes $(a_1, a_2, \ldots, a_n, y)$ and $R_2$ be another relation having attributes $(b_1, b_2, \ldots, b_m, y)$, where $y$ is a set of common attributes (join column(s)) that share the same domain. The natural join operator is defined as:

The natural join of $R_1$ and $R_2$, denoted $R_1 \bowtie R_2$, consists of combining the tuples of $R_1$ and $R_2$ to build a new relation $R_3$, such that if R1Tuple $\in R_1$, R2Tuple $\in R_2$, and R1Tuple.y = R2Tuple.y, then R3Tuple = R1Tuple.a1, ..., R1Tuple.an, R1Tuple.y, R2Tuple.b1, ..., R2Tuple.bm.

Note that the common set of attributes $y$ appears only once in $R_3$; the notation R1Tuple.a1 corresponds to the $a_1$ attribute value of a tuple of $R_1$.

Although this definition appears to be quite complicated, the steps required to compute the natural join of two relations are quite straightforward and form a three-stage process:

1  Compute $R_1 \times R_2$. This first performs a Cartesian product to form all possible combinations of the rows of $R_1$ and $R_2$.

2  Select those tuples where R1Tuple.y = R2Tuple.y. Only the rows where the attribute values in the join column(s) are equal are selected.

3  Perform a PROJECT operation on either R1.y or R2.y to the result of step (2), and call it y in the final relation. This is to ensure that the final relation results in a single copy of each attribute in the joining column, thereby eliminating duplicate columns. For example, if we joined the STUDENT and DEPARTMENT tables on the DEPT_CODE joining column, we would only want one column called DEPT_CODE in our final relation. Finally, project the rest of the attributes in $R_1$ and $R_2$ except y and drop the prefix R1 and R2 in the final relation.

Let us now apply these steps to an example. Figure 4.13 shows two relations called CUSTOMER and AGENT that will be used to illustrate the natural join operator.

FIGURE 4.13 The CUSTOMER and AGENT relations
Database name: Ch04_Relational_DB_Operators

Relation: CUSTOMER

CUS_CODE  CUS_LNAME  CUS_POSTCODE  AGENT_CODE
1132445   Strydom    4001          231
1217782   Adares     7550          125
1312243   Nokwe      678954        167
1321242   Reddy      2094          125
1542311   Smithson   1401          421
1657399   Vanloo     67543W        231

Relation: AGENT

AGENT_CODE  AGENT_PHONE
125         01812439887
167         01813426778
231         01812431124
333         01131234445

1  First, compute the Cartesian product of CUSTOMER and AGENT, i.e. CUSTOMER $\times$ AGENT. This operation will produce the results shown in Figure 4.14.

FIGURE 4.14 Step 1: CUSTOMER X AGENT
Database name: Ch04_Relational_DB_Operators

C.CUS_CODE  C.CUS_LNAME  C.CUS_POSTCODE  C.AGENT_CODE  A.AGENT_CODE  A.AGENT_PHONE
1132445     Strydom      4001            231           125           01812439887
1132445     Strydom      4001            231           167           01813426778
1132445     Strydom      4001            231           231           01812431124
1132445     Strydom      4001            231           333           01131234445
1217782     Adares       7550            125           125           01812439887
1217782     Adares       7550            125           167           01813426778
1217782     Adares       7550            125           231           01812431124
1217782     Adares       7550            125           333           01131234445
1312243     Nokwe        678954          167           125           01812439887
1312243     Nokwe        678954          167           167           01813426778
1312243     Nokwe        678954          167           231           01812431124
1312243     Nokwe        678954          167           333           01131234445
1321242     Reddy        2094            125           125           01812439887
1321242     Reddy        2094            125           167           01813426778
1321242     Reddy        2094            125           231           01812431124
1321242     Reddy        2094            125           333           01131234445
1542311     Smithson     1401            421           125           01812439887
1542311     Smithson     1401            421           167           01813426778
1542311     Smithson     1401            421           231           01812431124
1542311     Smithson     1401            421           333           01131234445
1657399     Vanloo       67543W          231           125           01812439887
1657399     Vanloo       67543W          231           167           01813426778
1657399     Vanloo       67543W          231           231           01812431124
1657399     Vanloo       67543W          231           333           01131234445

Notice in Figure 4.14 that each column has been prefixed with the starting letter of its relation, so that C.AGENT_CODE refers to the AGENT_CODE column from the CUSTOMER relation whilst A.AGENT_CODE refers to the AGENT_CODE column from the AGENT relation.

2  Select those tuples where R1Tuple.y = R2Tuple.y. To perform this step we must first identify the join column. In this example the join column is AGENT_CODE, as it appears in both relations. Therefore we SELECT only the rows for which the AGENT_CODE values are equal, i.e. C.AGENT_CODE = A.AGENT_CODE. Figure 4.15 shows the results of Step 2.

3  Perform a PROJECT operation on either C.AGENT_CODE or A.AGENT_CODE to the result of Step 2 so that only one AGENT_CODE column appears in the final relation. Then project the rest of the attributes (CUS_CODE, CUS_LNAME, CUS_POSTCODE, AGENT_PHONE) and drop the prefix C and A in the final relation. The final relation for this example is shown in Figure 4.16.

FIGURE 4.15 Step 2: Selecting rows where values in the join column match
Database name: Ch04_Relational_DB_Operators

Relation CUSTOMER X AGENT: the same 24 tuples as in Figure 4.14, with C.AGENT_CODE and A.AGENT_CODE marked as the joining columns.

The tuples shaded in blue in the original figure are those where C.AGENT_CODE = A.AGENT_CODE. These are then selected to produce the results of Step 2.

C.CUS_CODE  C.CUS_LNAME  C.CUS_POSTCODE  C.AGENT_CODE  A.AGENT_CODE  A.AGENT_PHONE
1132445     Strydom      4001            231           231           01812431124
1217782     Adares       7550            125           125           01812439887
1312243     Nokwe        678954          167           167           01813426778
1321242     Reddy        2094            125           125           01812439887
1657399     Vanloo       67543W          231           231           01812431124

review

2020 has

Cengage deemed

Learning. that

any

All suppressed

Rights

Reserved. content

does

May not

not materially

be

copied, affect

scanned, the

overall

or

duplicated, learning

in experience.

whole

or in Cengage

part.

Due Learning

to

electronic reserves

rights, the

right

some to

third remove

party additional

content

may content

be

suppressed at

any

time

from if

the

subsequent

eBook rights

and/or restrictions

eChapter(s). require

it

CHAPTER

FIGURE 4.16 Step 3: Final relation

Database name: Ch04_Relational_DB_Operators
CUSTOMER ⋈ AGENT

CUS_CODE  CUS_LNAME  CUS_POSTCODE  AGENT_CODE  AGENT_PHONE
1132445   Strydom    4001          231         01812431124
1217782   Adares     7550          125         01812439887
1312243   Nokwe      678954        167         01813426778
1321242   Reddy      2094          125         01812439887
1657399   Vanloo     67543W        231         01812431124

Note a few crucial features of the natural join operation:

If no match is made between the tuples in the relations, the new relation does not include the unmatched tuple. In that case, neither AGENT_CODE 421 nor the customer whose last name is Smithson is included. Smithson's AGENT_CODE 421 does not match any entry in the AGENT table.

The column on which the join was made, that is, AGENT_CODE, occurs only once in the new table.

If the same AGENT_CODE were to occur several times in the AGENT table, a customer would be listed for each match. For example, if the AGENT_CODE 167 were to occur three times in the AGENT table, the customer named Nokwe, who is associated with AGENT_CODE 167, would occur three times in the resulting table. (A good AGENT table cannot, of course, yield such a result because it would contain unique primary key values.)
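The three natural join steps can be sketched in a few lines of Python. This is only an illustration under stated assumptions: relations are modelled as lists of dictionaries, the function name natural_join_steps and its parameters are hypothetical, and the CUSTOMER and AGENT rows are read off the Cartesian product in Figure 4.15. The sketch also assumes that the join column is the only attribute the two relations share.

```python
from itertools import product

# Rows taken from the Cartesian product shown in Figure 4.15.
CUSTOMER = [
    {"CUS_CODE": "1132445", "CUS_LNAME": "Strydom", "CUS_POSTCODE": "4001", "AGENT_CODE": "231"},
    {"CUS_CODE": "1217782", "CUS_LNAME": "Adares", "CUS_POSTCODE": "7550", "AGENT_CODE": "125"},
    {"CUS_CODE": "1312243", "CUS_LNAME": "Nokwe", "CUS_POSTCODE": "678954", "AGENT_CODE": "167"},
    {"CUS_CODE": "1321242", "CUS_LNAME": "Reddy", "CUS_POSTCODE": "2094", "AGENT_CODE": "125"},
    {"CUS_CODE": "1542311", "CUS_LNAME": "Smithson", "CUS_POSTCODE": "1401", "AGENT_CODE": "421"},
    {"CUS_CODE": "1657399", "CUS_LNAME": "Vanloo", "CUS_POSTCODE": "67543W", "AGENT_CODE": "231"},
]
AGENT = [
    {"AGENT_CODE": "125", "AGENT_PHONE": "01812439887"},
    {"AGENT_CODE": "167", "AGENT_PHONE": "01813426778"},
    {"AGENT_CODE": "231", "AGENT_PHONE": "01812431124"},
    {"AGENT_CODE": "333", "AGENT_PHONE": "01131234445"},
]


def natural_join_steps(r1, r2, y, p1, p2):
    """Return the results of Steps 1, 2 and 3 of a natural join on column y."""
    # Step 1: Cartesian product, prefixing each attribute with its relation letter.
    step1 = [
        {**{f"{p1}.{a}": v for a, v in t1.items()},
         **{f"{p2}.{a}": v for a, v in t2.items()}}
        for t1, t2 in product(r1, r2)
    ]
    # Step 2: SELECT only the rows whose join-column values are equal.
    step2 = [t for t in step1 if t[f"{p1}.{y}"] == t[f"{p2}.{y}"]]
    # Step 3: PROJECT a single copy of y and drop the prefixes.
    step3 = [
        {name.split(".", 1)[1]: v for name, v in t.items() if name != f"{p2}.{y}"}
        for t in step2
    ]
    return step1, step2, step3


step1, step2, final = natural_join_steps(CUSTOMER, AGENT, "AGENT_CODE", "C", "A")
for row in final:
    print(row)
```

With this data the Cartesian product has 24 rows, five of which survive Step 2. Smithson is absent from the final relation because AGENT_CODE 421 has no match in AGENT.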

4.2.3 The Outer Join

When using the theta join and the natural join, it is possible that some tuples do not have identical values for the common attributes. As a result these tuples will be lost. If we require that all the tuples from the original tables are to be shown in the resulting relation, then it is necessary to have a join which keeps all the tuples in relation R1, or R2, or both relations, whether or not they have matching values in the common attributes. This type of join is known as an outer join. There are three types of outer join:

Left outer join (⟕) keeps data from the left-hand relation.
Right outer join (⟖) keeps data from the right-hand relation.
Full outer join (⟗) keeps data from both relations.

As you will see, the steps for computing an outer join are very similar to those for performing a natural join, except that we also include some tuples from R1 or R2 which have no corresponding values in the other relation. In these tuples, the attributes in the second relation will have null values in the joined relation. Which tuples are kept comes from the left or right side of the join, depending on whether we are determining a left or right outer join.

The stages in determining a left outer join are:

1 Compute R1 × R2. This performs a Cartesian product to form all possible combinations of the rows of R1 and R2.

2 Select those tuples where R1Tuple.y = R2Tuple.y. Only the rows where the attribute values in the joining column(s) are equal are selected.

3 Select those tuples in R1 that do not have matching values in R2, so R1Tuple.y ≠ R2Tuple.y for every tuple in R2.

4 Perform a PROJECT operation on either R1.y or R2.y to the result of Step 2, and call it simply y in the final relation. This is to ensure that the final relation has a single copy of each attribute, thereby eliminating duplicate columns. Finally, project the rest of the attributes in R1 and R2, except y, and drop the prefix R1 and R2 in the final relation.

For example, consider Figure 4.14. A left outer join for relations CUSTOMER and AGENT will return all of the tuples in the CUSTOMER relation, including those that do not have a matching value in the AGENT relation. The result of this join is shown in Figure 4.17. Notice that there is no AGENT_PHONE for the customer Smithson, and a value of NULL has been entered in the AGENT_PHONE field.

FIGURE 4.17 Left outer join: CUSTOMER ⟕ AGENT

Database name: Ch04_Relational_DB_Operators

CUS_CODE  CUS_LNAME  CUS_POSTCODE  AGENT_CODE  AGENT_PHONE
1132445   Strydom    4001          231         01812431124
1217782   Adares     7550          125         01812439887
1312243   Nokwe      678954        167         01813426778
1321242   Reddy      2094          125         01812439887
1657399   Vanloo     67543W        231         01812431124
1542311   Smithson   1401          421         NULL

A right outer join, performing CUSTOMER ⟖ AGENT, returns all of the tuples in the AGENT relation, including those which do not have matching values in the CUSTOMER relation. The result of this join is shown in Figure 4.18.

FIGURE 4.18 Right outer join: CUSTOMER ⟖ AGENT

Database name: Ch04_Relational_DB_Operators

CUS_CODE  CUS_LNAME  CUS_POSTCODE  AGENT_CODE  AGENT_PHONE
1132445   Strydom    4001          231         01812431124
1217782   Adares     7550          125         01812439887
1312243   Nokwe      678954        167         01813426778
1321242   Reddy      2094          125         01812439887
1657399   Vanloo     67543W        231         01812431124
NULL      NULL       NULL          333         01131234445

So, regardless of the type of outer join, the examples in Figures 4.17 and 4.18 have shown that the matched pairs would be retained and any unmatched values in the other relation would be left null.

Outer joins are especially useful when you are trying to determine what value(s) in related tables cause(s) referential integrity problems. Such problems are created when foreign key values do not match the primary key values in the related table(s). In fact, if you are asked to convert large spreadsheets or other non-database data into relational database tables, you will discover that the outer joins save you vast amounts of time and uncounted headaches when you encounter referential integrity errors after the conversions.

You may wonder why the outer joins are labelled left and right. The labels refer to the order in which the tables are listed in the SQL command. Chapter 8 will explore such joins.
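The left and right outer joins described above can be sketched as follows. This is a minimal illustration, not a DBMS implementation: relations are lists of dictionaries, Python's None stands in for NULL, and the function names left_outer_join and right_outer_join are hypothetical. The data are the CUSTOMER and AGENT rows used in Figures 4.17 and 4.18.

```python
CUSTOMER = [
    dict(zip(("CUS_CODE", "CUS_LNAME", "CUS_POSTCODE", "AGENT_CODE"), row))
    for row in [
        ("1132445", "Strydom", "4001", "231"),
        ("1217782", "Adares", "7550", "125"),
        ("1312243", "Nokwe", "678954", "167"),
        ("1321242", "Reddy", "2094", "125"),
        ("1542311", "Smithson", "1401", "421"),
        ("1657399", "Vanloo", "67543W", "231"),
    ]
]
AGENT = [
    dict(zip(("AGENT_CODE", "AGENT_PHONE"), row))
    for row in [
        ("125", "01812439887"),
        ("167", "01813426778"),
        ("231", "01812431124"),
        ("333", "01131234445"),
    ]
]


def left_outer_join(r1, r2, y):
    """Keep every tuple of r1; pad with None where r2 has no match on y."""
    r2_attrs = [a for a in r2[0] if a != y] if r2 else []
    result = []
    for t1 in r1:
        matches = [t2 for t2 in r2 if t2[y] == t1[y]]
        if matches:
            # Matched pairs are retained, with a single copy of column y.
            result.extend({**t1, **t2} for t2 in matches)
        else:
            # Unmatched tuple: attributes of the other relation are left null.
            result.append({**t1, **{a: None for a in r2_attrs}})
    return result


def right_outer_join(r1, r2, y):
    """A right outer join is a left outer join with the operands swapped."""
    return left_outer_join(r2, r1, y)


left = left_outer_join(CUSTOMER, AGENT, "AGENT_CODE")
right = right_outer_join(CUSTOMER, AGENT, "AGENT_CODE")
```

Here left contains a Smithson row whose AGENT_PHONE is None, and right contains one row for agent 333 whose customer attributes are all None. Column order in right differs from Figure 4.18 because the operands are swapped; only the set of rows is intended to match.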

4.3 CONSTRUCTING QUERIES USING RELATIONAL ALGEBRAIC EXPRESSIONS

The main purpose of relational algebra is to provide a way to create queries that manipulate the relations (tables) in a database. Queries are used to tell the DBMS what data is required. The operations of relational algebra that you have just read about in the previous section provide the base for formulating queries.

In 1972, Codd proposed relational calculus, a non-procedural language based on predicate logic, which provides a notation for specifying the definition of the required relation in terms of other relations. Codd's proposal is the form of relational calculus known as tuple relational calculus, and this was later followed by domain relational calculus (Lacroix and Pirotte, 1977). Relational algebra is classed as a procedural language, whilst relational calculus is non-procedural. However, both are equivalent in their expressive power. In this section, we will be focusing on how to use relational algebra to build queries. For those who are interested in the mathematical definitions, there is a section on relational calculus at the end of this chapter.

During the lifetime of a database, users will ask many different kinds of queries. Some queries will be asked over and over again, whilst others will be asked on the spur of the moment.

4.3.1 Building Queries

Building a query often involves breaking down the query into a number of smaller steps, where each step generates a set of intermediate results that are then used in the following steps. There are many different ways to represent a query using relational algebraic expressions. The results obtained will always be the same, but the order of execution of the individual steps is very important to the efficiency of the query. However, it is worth pointing out that when writing relational algebraic expressions, and again when using SQL, users do not need to determine the most efficient order of execution. In most real DBMSs, it is the job of the query optimiser to analyse the query and find the most efficient way to access the data. You will discover more about the query optimiser in Chapter 13, Managing Database and SQL Performance.

In order to build a query using a relational algebraic expression, you should take the following steps:

1 List all the attributes we need to give the answer.
2 Select all the relations we need, based on the list of attributes.
3 Specify the relational operators and the intermediate results that are needed.

To learn how to build queries following these steps, we will now look at some examples that are based on a small database that stores information about the maintenance of cars (the ERD is shown in Figure 4.19). Each car is required to undergo an inspection each year to test whether it is roadworthy. After each inspection, a new maintenance record is created and any repairs that are needed are recorded. If a car needs a repair, then the EVALUATION is set to FAIL until all the repairs are completed and then it is set to PASS. A repair can require parts to be purchased and fitted. The tables representing this database are shown in Figure 4.20.

FIGURE 4.19

The car inspection ERD

Entities and attributes shown in the ERD:
CAR (REGISTRATION {PK}, CAR_MAKE, CAR_MODEL, MODEL_YEAR, LICENCE_NO)
MAINTENANCE_RECORD (INSPECTION_CODE {PK}, REGISTRATION {FK}, INSPECTION_DATE, EVALUATION)
REPAIR (INSPECTION_CODE {PK} {FK}, PART_NO {PK} {FK})
PART (PART_NO {PK}, PART_NAME, PART_COST)
Relationship labels shown: requires, is_for, requires, with multiplicities 1..1 and 0..*.

FIGURE 4.20 The car inspection database

Database name: Ch04_Car_Inspection

Table name: CAR

REGISTRATION  CAR_MAKE    CAR_MODEL  CAR_COLOUR  MODEL_YEAR  LICENCE_NO
3679MR82      Toyota      Corolla    Blue        2016        1967fr89768
E-TS865       Nissan      Micra      Red         2004        1973Smith121
PE57UVP       Peugeot     508        Blue        2017        1990bty3212
PISE567       Volkswagen  Eos        Lime        2016        DF-678-WV
ROMA482       Volkswagen  Golf GT    Black       2017        AQ-123-AV
Z-BA975       Peugeot     208        Black       2017        1980vrt7312

Table name: PART

PART_NO  PART_NAME       PART_COST
12390    Paint sealants  14.95
12391    Wiper           19.95
12392    Brake pads      24.99
12393    Brake discs     49.54
12395    Spark plugs     0.99
12396    Airbag          24.95
12397    Tyres           25.00

Table name: MAINTENANCE_RECORD

INSPECTION_CODE  REGISTRATION  INSPECTION_DATE  EVALUATION
100036           PE57UVP       10/05/2018       FAIL
100390           ROMA482       01/09/2018
106750           E-TS865       01/03/2016       PASS
122456           Z-BA975       03/10/2018       FAIL
145678           PISE567       30/09/2017       PASS
200450           E-TS865       21/02/2015       PASS
200456           E-TS865       01/04/2017       FAIL

Table name: REPAIR

INSPECTION_CODE  PART_NO
106750           12396
106750           12397
100036           12393
200450           12391
100036           12397
200450           12392
200456           12397

Example 1

Consider the following query asked by a user:

List all information about cars where the model year is after 2016.

To answer this query, you must first interpret that the user only wants to see information on cars, so the only relation required is CAR. List all information means list all the attributes in this relation, where the attribute MODEL_YEAR > 2016. Using the relational operator SELECT, we can write this query as a relational algebraic expression as:

$\sigma_{\text{model\_year} > 2016}(\text{CAR})$

The resulting relation is shown in Figure 4.21.

FIGURE 4.21 Result of $\sigma_{\text{model\_year} > 2016}(\text{CAR})$

REGISTRATION  CAR_MAKE    CAR_MODEL  CAR_COLOUR  MODEL_YEAR  LICENCE_NO
PE57UVP       Peugeot     508        Blue        2017        1990bty3212
ROMA482       Volkswagen  Golf GT    Black       2017        AQ-123-AV
Z-BA975       Peugeot     208        Black       2017        1980vrt7312

Example 2

Supposing the mechanic at the garage wishes to find out information about parts and their prices in stock. The following query is asked:

Display all the names of the parts and their cost where the cost of the part is greater than 20.00.

This query requires only specific attributes to be displayed, PART_NAME and PART_COST. Both are obviously in the relation PART, so we will need the PROJECT operator on this relation. This query will also use the SELECT operator to restrict the rows to those where PART_COST > 20.00. The relational algebraic expression for this query is:

$\Pi_{\text{part\_name, part\_cost}}(\sigma_{\text{part\_cost} > 20.00}(\text{PART}))$

The resulting relation is shown in Figure 4.22.

FIGURE 4.22 Result of $\Pi_{\text{part\_name, part\_cost}}(\sigma_{\text{part\_cost} > 20.00}(\text{PART}))$

PART_NAME    PART_COST
Brake Pads   24.99
Brake Discs  49.54
Airbag       24.95

Example 3

The final example will also use the natural join operator and show how we can write expressions when data is required from a number of different tables. Consider a more complex query:

List the car registration and model details and part numbers for all cars where the model year is 2017, and where an inspection was carried out after 01/03/2018, which resulted in a part being required for a repair.

This query will have to be broken down into a number of different stages, each one having a set of intermediate results. The first part of the query states that we need the attributes REGISTRATION and CAR_MODEL, which are located in the CAR relation. Also, we are only interested in cars whose MODEL_YEAR is 2017. This information can be written using the following relational algebraic expression:

$\Pi_{\text{registration, car\_model}}(\sigma_{\text{model\_year} = 2017}(\text{CAR}))$

The result of applying this statement to the CAR table is shown in Figure 4.23.

FIGURE 4.23 Result of $\Pi_{\text{registration, car\_model}}(\sigma_{\text{model\_year} = 2017}(\text{CAR}))$

REGISTRATION  CAR_MODEL
PE57UVP       508
ROMA482       Golf GT
Z-BA975       208

The next part of this query requires information about inspections that were carried out after 01/03/2018. Information about inspections is stored in the MAINTENANCE_RECORD relation. The query is not asking for any specific attributes, so we will assume that information about all attributes in the MAINTENANCE_RECORD relation is required, which means the values of all attributes. However, we must restrict the tuples by only selecting those inspections where the INSPECTION_DATE > 01/03/2018. This second part of the query can be written as:

$\sigma_{\text{inspection\_date} > 01/03/2018}(\text{MAINTENANCE\_RECORD})$

The result of applying this expression to the MAINTENANCE_RECORD table is shown in Figure 4.24.

FIGURE 4.24 Result of $\sigma_{\text{inspection\_date} > 01/03/2018}(\text{MAINTENANCE\_RECORD})$

INSPECTION_CODE  REGISTRATION  INSPECTION_DATE  EVALUATION
100036           PE57UVP       10/05/2018       FAIL
100390           ROMA482       01/09/2018
122456           Z-BA975       03/10/2018       FAIL

We now have relational algebraic expressions for the first two parts of the query. The next stage is to join the rows from the resulting tables shown in Figures 4.23 and 4.24. This join operation is the natural join, with the common column being REGISTRATION, which appears in both the CAR and MAINTENANCE_RECORD relations. This can now be written as:

$\text{TempR} = \Pi_{\text{registration, car\_model}}(\sigma_{\text{model\_year} = 2017}(\text{CAR})) \bowtie \sigma_{\text{inspection\_date} > 01/03/2018}(\text{MAINTENANCE\_RECORD})$

where TempR is a relation which stores the intermediate results. The result of the natural join is shown using three steps in Figure 4.25. Notice that the attributes have been prefixed with the letters M and C to show which relations they were originally from (MAINTENANCE_RECORD and CAR respectively).

FIGURE 4.25 The TempR relation

Step 1: Compute the Cartesian product: MAINTENANCE_RECORD X CAR.

M.INSPECTION_CODE  M.REGISTRATION  M.INSPECTION_DATE  M.EVALUATION  C.REGISTRATION  C.CAR_MODEL
100036             PE57UVP         10/05/2018         FAIL          PE57UVP         508
100036             PE57UVP         10/05/2018         FAIL          ROMA482         Golf GT
100036             PE57UVP         10/05/2018         FAIL          Z-BA975         208
100390             ROMA482         01/09/2018                       PE57UVP         508
100390             ROMA482         01/09/2018                       ROMA482         Golf GT
100390             ROMA482         01/09/2018                       Z-BA975         208
122456             Z-BA975         03/10/2018         FAIL          PE57UVP         508
122456             Z-BA975         03/10/2018         FAIL          ROMA482         Golf GT
122456             Z-BA975         03/10/2018         FAIL          Z-BA975         208

Step 2: SELECT only the rows for which the REGISTRATION values are equal, i.e. M.REGISTRATION = C.REGISTRATION.

Joining columns: M.REGISTRATION and C.REGISTRATION

M.INSPECTION_CODE  M.REGISTRATION  M.INSPECTION_DATE  M.EVALUATION  C.REGISTRATION  C.CAR_MODEL
100036             PE57UVP         10/05/2018         FAIL          PE57UVP         508
100390             ROMA482         01/09/2018                       ROMA482         Golf GT
122456             Z-BA975         03/10/2018         FAIL          Z-BA975         208

Step 3: Perform a PROJECT on either C.REGISTRATION or M.REGISTRATION to the result of Step 2 and drop the prefixes C and M in the final relation. The table below shows the relation TempR, which has been created as a result of Step 3.

INSPECTION_CODE  REGISTRATION  INSPECTION_DATE  EVALUATION  CAR_MODEL
100036           PE57UVP       10/05/2018       FAIL        508
100390           ROMA482       01/09/2018                   Golf GT
122456           Z-BA975       03/10/2018       FAIL        208

The next part of the query requires the information we have obtained so far to be restricted even further by only displaying information for cars where a part was needed for a repair. To find out this information we have to look to see if there is a PART_NO in the REPAIR relation which corresponds to a specific INSPECTION_CODE in the MAINTENANCE_RECORD relation. The relation TempR already stores the intermediate results from the first part of our query, so we must now connect TempR to the REPAIR relation using a natural join on the INSPECTION_CODE column. This can be written as the expression:

$\text{QueryResult} = \text{TempR} \bowtie \text{REPAIR}$

Figure 4.26 shows the result of performing this natural join operation and stores the results in a relation called QueryResult.

FIGURE 4.26 The QueryResult relation

The relation TempR

INSPECTION_CODE  REGISTRATION  INSPECTION_DATE  EVALUATION  CAR_MODEL
100036           PE57UVP       10/05/2018       FAIL        508
100390           ROMA482       01/09/2018                   Golf GT
122456           Z-BA975       03/10/2018       FAIL        208

The relation REPAIR

INSPECTION_CODE  PART_NO
106750           12396
106750           12397
100036           12393
200450           12391
100036           12397
200450           12392
200456           12397

$\text{QueryResult} = \text{TempR} \bowtie \text{REPAIR}$

INSPECTION_CODE  REGISTRATION  INSPECTION_DATE  EVALUATION  CAR_MODEL  PART_NO
100036           PE57UVP       10/05/2018       FAIL        508        12393
100036           PE57UVP       10/05/2018       FAIL        508        12397

Finally, the original query requested that we only list the car registration, model details and part numbers. This requires us to perform a PROJECT operation on the intermediate results in the QueryResult relation using the following expression:

$\Pi_{\text{registration, car\_model, part\_no}}(\text{QueryResult})$

The final results of the query are shown in Figure 4.27.

FIGURE 4.27 Solution to example 3

REGISTRATION  CAR_MODEL  PART_NO
PE57UVP       508        12393
PE57UVP       508        12397

As you can see, it is possible to solve a complex query by breaking down the query into a number of smaller relational algebra expressions. The full expression for example 3 can be written as:

$\Pi_{\text{registration, car\_model, part\_no}}\big(\text{REPAIR} \bowtie (\Pi_{\text{registration, car\_model}}(\sigma_{\text{model\_year} = 2017}(\text{CAR})) \bowtie \sigma_{\text{inspection\_date} > 01/03/2018}(\text{MAINTENANCE\_RECORD}))\big)$
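The staged construction of example 3 can be mirrored in a short Python sketch. It is an illustration only: the helper names select, project and natural_join are hypothetical, relations are lists of dictionaries, and only the attributes that the query actually touches are copied from Figure 4.20 (CAR_MAKE, CAR_COLOUR, LICENCE_NO and EVALUATION are left out for brevity). Dates are written day/month/year in the figures and are converted to date objects here.

```python
from datetime import date

CAR = [
    {"REGISTRATION": "3679MR82", "CAR_MODEL": "Corolla", "MODEL_YEAR": 2016},
    {"REGISTRATION": "E-TS865", "CAR_MODEL": "Micra", "MODEL_YEAR": 2004},
    {"REGISTRATION": "PE57UVP", "CAR_MODEL": "508", "MODEL_YEAR": 2017},
    {"REGISTRATION": "PISE567", "CAR_MODEL": "Eos", "MODEL_YEAR": 2016},
    {"REGISTRATION": "ROMA482", "CAR_MODEL": "Golf GT", "MODEL_YEAR": 2017},
    {"REGISTRATION": "Z-BA975", "CAR_MODEL": "208", "MODEL_YEAR": 2017},
]
MAINTENANCE_RECORD = [
    {"INSPECTION_CODE": 100036, "REGISTRATION": "PE57UVP", "INSPECTION_DATE": date(2018, 5, 10)},
    {"INSPECTION_CODE": 100390, "REGISTRATION": "ROMA482", "INSPECTION_DATE": date(2018, 9, 1)},
    {"INSPECTION_CODE": 106750, "REGISTRATION": "E-TS865", "INSPECTION_DATE": date(2016, 3, 1)},
    {"INSPECTION_CODE": 122456, "REGISTRATION": "Z-BA975", "INSPECTION_DATE": date(2018, 10, 3)},
    {"INSPECTION_CODE": 145678, "REGISTRATION": "PISE567", "INSPECTION_DATE": date(2017, 9, 30)},
    {"INSPECTION_CODE": 200450, "REGISTRATION": "E-TS865", "INSPECTION_DATE": date(2015, 2, 21)},
    {"INSPECTION_CODE": 200456, "REGISTRATION": "E-TS865", "INSPECTION_DATE": date(2017, 4, 1)},
]
REPAIR = [
    {"INSPECTION_CODE": 106750, "PART_NO": 12396},
    {"INSPECTION_CODE": 106750, "PART_NO": 12397},
    {"INSPECTION_CODE": 100036, "PART_NO": 12393},
    {"INSPECTION_CODE": 200450, "PART_NO": 12391},
    {"INSPECTION_CODE": 100036, "PART_NO": 12397},
    {"INSPECTION_CODE": 200450, "PART_NO": 12392},
    {"INSPECTION_CODE": 200456, "PART_NO": 12397},
]


def select(relation, predicate):
    """SELECT: keep the tuples for which the predicate holds."""
    return [t for t in relation if predicate(t)]


def project(relation, attrs):
    """PROJECT: keep only the named attributes and remove duplicate tuples."""
    seen, out = set(), []
    for t in relation:
        row = tuple(t[a] for a in attrs)
        if row not in seen:
            seen.add(row)
            out.append(dict(zip(attrs, row)))
    return out


def natural_join(r1, r2):
    """Natural join on every attribute name the two relations share."""
    if not r1 or not r2:
        return []
    common = [a for a in r1[0] if a in r2[0]]
    return [{**t1, **t2} for t1 in r1 for t2 in r2
            if all(t1[a] == t2[a] for a in common)]


cars_2017 = project(select(CAR, lambda t: t["MODEL_YEAR"] == 2017),
                    ["REGISTRATION", "CAR_MODEL"])
late_inspections = select(MAINTENANCE_RECORD,
                          lambda t: t["INSPECTION_DATE"] > date(2018, 3, 1))
temp_r = natural_join(cars_2017, late_inspections)
query_result = natural_join(temp_r, REPAIR)
answer = project(query_result, ["REGISTRATION", "CAR_MODEL", "PART_NO"])
print(answer)
```

Each intermediate variable corresponds to one stage of the example: cars_2017 to Figure 4.23, late_inspections to Figure 4.24, temp_r to TempR and query_result to QueryResult. A real DBMS would not necessarily evaluate the stages in this order, since that choice is left to the query optimiser.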

4.4 RELATIONAL CALCULUS

Relational calculus is a formal language based upon a branch of mathematical logic called predicate calculus. There are two types of relational calculus: tuple relational calculus and domain relational calculus. Relational calculus is a precise language that allows users to describe what they want, rather than how to compute it. In addition, it underlies Structured Query Language (SQL), which you will learn about in Chapter 8. Domain relational calculus is different from tuple relational calculus, as it uses domain variables which take on values from an attribute domain, rather than values for an entire tuple. In the following two sections you will learn more about these two types of relational calculus.

NOTE

A NOTE ON PREDICATE CALCULUS

First-order logic or predicate calculus can be used to express queries. Predicates are words that describe certain relations and properties. In logic, a predicate has the form:

name_of_predicate(arguments)

Consider the following statements:

student(Alex)
studies(Alex, Database Systems)

In these two statements, student and studies are the names of the predicates. The statement student(Alex) has a value TRUE if Alex is a student, and a value FALSE if Alex is not a student. Variables are used if we want to express the property of being a student, and not refer to a specific individual. So the above statements become:

student(x)
studies(x, y)

The expression student(x) is now referred to as a predicate expression. It has no predetermined truth value as the value of x is currently unknown. Variables in a predicate expression can take values within a certain domain. The domain of a predicate variable is the set of all values that can be substituted in the place of the variable.

When writing expressions in predicate calculus, we use a capital letter as the name of the predicate. For example, P(x) represents a predicate with one variable x. When x has a value we can say whether or not the expression is true or false. Every predicate has what is known as a Truth Set, which is defined as:

$\{x \in D \mid P(x)\}$

So, a truth set of a predicate P(x) with a domain D is the set of all elements of D that make P(x) true when substituted for x. For example, consider the following predicate, lecturer(x). The domain would be all people and the truth set would be all lecturers.

A formula in predicate calculus can comprise:

Set of comparison operators: $<$, $\leq$, $>$, $\geq$, $=$, $\neq$
Set of connectives: and ($\land$), or ($\lor$), not ($\neg$)
Implication ($\Rightarrow$), where $x \Rightarrow y$ means: if x is true, then y is true.
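The idea of a truth set can be made concrete with a small Python sketch. Everything in it other than the predicate name lecturer and the individual Alex is a hypothetical assumption made for illustration: the domain D is a tiny made-up set of people, and the helper implies simply encodes the usual truth table for implication.

```python
# Hypothetical domain D: a handful of people and their roles.
ROLES = {
    "Alex": "student",
    "Priya": "lecturer",
    "Tom": "lecturer",
    "Zanele": "student",
}
D = set(ROLES)


def lecturer(x):
    """Predicate lecturer(x): TRUE if x is a lecturer, FALSE otherwise."""
    return ROLES[x] == "lecturer"


def student(x):
    """Predicate student(x): TRUE if x is a student, FALSE otherwise."""
    return ROLES[x] == "student"


def implies(x, y):
    """Implication: false only when x is true and y is false."""
    return (not x) or y


# Truth set {x in D | lecturer(x)}: the elements of D that make the predicate true.
truth_set = {x for x in D if lecturer(x)}
print(sorted(truth_set))
```

Before x is given a value, lecturer(x) has no truth value. Once the domain is fixed, the set comprehension evaluates the predicate for every element of D and keeps those for which it is true.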

4.4.1 Tuple Relational Calculus

Tuple relational calculus is a non-procedural query language which is used to describe what information is required from the database without giving a specific method for obtaining that information. When specifying a query in tuple relational calculus we say only which attributes are to be retrieved and not how the query is to be executed. This is in contrast to relational algebra, which provides a procedural way of writing the query and incorporates a strategy for executing the query through the way in which the operations are ordered.

Relational algebra and tuple relational calculus can both be used to express the same queries, which means that we have a relationally complete query language. We say a query language is relationally complete if any query that can be written in relational algebraic form can also be expressed by the query language. Most relational query languages such as SQL are not only relationally complete, but also contain additional features like aggregate functions that allow more complex queries to be written.

In tuple relational calculus, we specify a number of tuple variables, where each tuple variable ranges over a database table. The values of the tuple variables are the actual tuples in the table. A query in the tuple relational calculus is expressed as:

$$\{\, t \mid P(t) \,\}$$

which represents the set of tuples, $t$, for which the predicate, $P$, is true. Therefore, the results of this query are all tuples that satisfy the condition represented by predicate $P$. For example, consider the car inspection database in Figure 4.20. If we wanted to write the query 'Find all cars with a model_year >= 2018' using tuple relational calculus, we would write:

$$\{\, t \mid t \in \text{Car} \land t.\text{MODEL\_YEAR} \geq 2018 \,\}$$

This query means: return the set of tuples, $t$, where $t$ belongs to the Car relation and the model_year for $t$ is greater than or equal to 2018. As you can see in the example, a query or expression in tuple relational calculus can also be written in the following extended form:

$$\{\, t_1.A_1,\ t_2.A_2,\ \ldots,\ t_n.A_n \mid P(t_1, \ldots, t_n, t_{n+1}, \ldots, t_{n+m}) \,\}$$

where:

$t_1, \ldots, t_n, t_{n+1}, \ldots, t_{n+m}$ are tuple variables,
$A_1, \ldots, A_n$ are attributes of the relation on which $t_i$ ranges,
$P$ is a predicate.

A formula in tuple relational calculus consists of predicate calculus atoms. An atom has one of the following forms:

(i) R(t) where t is a tuple variable and R is a relation name.
(ii) t.A oper s.B where t and s are tuple variables, A and B are attributes and oper is a comparison operator.
(iii) t.A oper const where t is a tuple variable, A is an attribute, oper is a comparison operator, and const is a constant.

Each of these types of atoms evaluates to either TRUE or FALSE for a specific combination of tuples. Every atom has a truth value.

Tuple relational calculus formulae are either an atom, or atoms or other formulae connected via the logical Boolean operators AND, OR, NOT ($\land$, $\lor$, $\lnot$).

Existential and Universal Quantifiers

Tuple relational calculus formulae can contain existential ($\exists$) and universal ($\forall$) quantifiers. The role of these quantifiers is to constrain the variables of tuples in a single relation. Any variable that is not bound by a quantifier is said to be free. A tuple relational calculus expression may contain at most one free variable. Consider the following two expressions:

$\exists t \in R\,(P(t))$ reads that there exists a tuple $t$ in relation $R$ such that predicate $P(t)$ is true.

$\forall t \in R\,(P(t))$ reads that $P$ is true for all tuples $t$ in relation $R$.

The existential quantifier ($\exists$) states that a formula must be true for at least one instance, while the universal quantifier ($\forall$) states that the formula must be true for all instances.
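As a rough illustration, the set-builder form and the two quantifiers map onto a Python comprehension and the built-in functions `any` and `all`. The CAR rows and the predicate name `is_recent` below are assumptions made up for this sketch; they do not come from Figure 4.20.

```python
# Hypothetical CAR rows, invented purely for illustration.
CAR = [
    {"CAR_REG": "AB12 CDE", "MODEL_YEAR": 2016},
    {"CAR_REG": "FG34 HIJ", "MODEL_YEAR": 2018},
    {"CAR_REG": "KL56 MNO", "MODEL_YEAR": 2020},
]


def is_recent(t):
    """Predicate P(t): the model year of tuple t is 2018 or later."""
    return t["MODEL_YEAR"] >= 2018


# {t | t in CAR and P(t)}: t is the only free variable.
recent_cars = [t for t in CAR if is_recent(t)]

# Existential quantifier: P(t) holds for at least one tuple t in CAR.
some_recent = any(is_recent(t) for t in CAR)

# Universal quantifier: P(t) holds for every tuple t in CAR.
all_recent = all(is_recent(t) for t in CAR)
```

With these rows, the existential test succeeds because at least one car qualifies, while the universal test fails because the 2016 car does not.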

4.4.2 Building a Tuple Relational Calculus Expression

To specify a tuple relational calculus expression, take the following steps:

(i) Specify the range relation R of each tuple variable t. In the expressions, this is shown in the form of R(t).
(ii) Specify a condition to select particular combinations of tuples.
(iii) Specify a set of attributes to be retrieved.

To learn how to build tuple relational calculus expressions, we will look at some examples based on a simple small database which stores information about customers at a bank. Customers can withdraw money and deposit money at any branch of the bank. The ERD is shown in Figure 4.28 and the relations (tables) representing this database are shown in Figure 4.29.

FIGURE 4.28 (ERD; diagram not reproduced, entities and relationships listed as text)

CUSTOMER: CUS_ACCNO {PK}, CUS_LNAME, CUS_FNAME, CUS_BALANCE
BRANCH: BRANCH_NO {PK}, BRANCH_NAME, BRANCH_CITY
WITHDRAWAL: WITH_TRANS_NO {PK}, WITH_DATE, WITH_AMOUNT, CUS_ACCNO {FK1}, BRANCH_NO {FK2}
DEPOSIT: DEP_TRANS_NO {PK}, DEP_DATE, DEP_AMOUNT, CUS_ACCNO {FK1}, BRANCH_NO {FK2}

Relationships (each labelled "makes"): CUSTOMER (1..1) to WITHDRAWAL (0..*); BRANCH (1..1) to WITHDRAWAL (0..*); CUSTOMER (1..1) to DEPOSIT (0..*); BRANCH (1..1) to DEPOSIT (0..*).

FIGURE 4.29

Relation: CUSTOMER
CUS_ACCNO   CUS_LNAME   CUS_FNAME   CUS_BALANCE
2465454     Emerson     Percy       1034
1012345     Adares      Constance   1865

Relation: BRANCH
BRANCH_NO   BRANCH_NAME   BRANCH_CITY
125         Monsuir       London
333         FirstStep     Paris
231         Cross_St      Rome

Relation: WITHDRAWAL
WITH_TRANS_NO   WITH_DATE   WITH_AMOUNT   CUS_ACCNO   BRANCH_NO
48887211        01-Jul-18   50            2465454     125
48867666        02-Jul-18   100           1012345     333
64446566        18-Jul-18   200           2465454     125
64443229        20-Jul-18   400           2465454     231

Relation: DEPOSIT
DEP_TRANS_NO   DEP_DATE    DEP_AMOUNT   CUS_ACCNO   BRANCH_NO
90000034       30-Jun-18   1000         2465454     125
90000780       30-Jun-18   1400         1012345     333

Example 1

Suppose we wanted to find out which customers had made any withdrawals of 200 or more. We would write the following expression:

$$\{\, w \mid w \in \text{WITHDRAWAL} \land w.\text{WITH\_AMOUNT} \geq 200 \,\}$$

This expression gives us all attributes from the WITHDRAWAL relation, but suppose we only want the last names of customers who have withdrawn 200 or more. CUS_LNAME exists in the CUSTOMER relation, which means we will have to perform a join on the CUSTOMER and WITHDRAWAL relations. The attribute CUS_ACCNO appears in both CUSTOMER and WITHDRAWAL and is used to join the two relations together, as shown in the expression below:

$$\{\, c.\text{CUS\_LNAME} \mid c \in \text{CUSTOMER} \land (\exists w)\,(w \in \text{WITHDRAWAL} \land (c.\text{CUS\_ACCNO} = w.\text{CUS\_ACCNO}) \land w.\text{WITH\_AMOUNT} \geq 200) \,\}$$

Here $c$ is the free variable, because CUS_LNAME is an attribute of CUSTOMER, and $w$ is bound by the existential quantifier. In English, the above expression would read: display the last names of all customers such that there exists a tuple in the WITHDRAWAL relation for which the values of the CUS_ACCNO attribute in the WITHDRAWAL and CUSTOMER tuples are equal, and the value of the WITH_AMOUNT attribute is greater than or equal to 200.
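The three building steps (range relation, condition, retrieved attributes) can be traced in a short Python sketch of Example 1. The rows are copied from the CUSTOMER and WITHDRAWAL relations of Figure 4.29, with the date column left out for brevity; representing a relation as a list of dictionaries is an assumption of this sketch, not something the text prescribes.

```python
# Rows taken from Figure 4.29 (WITH_DATE omitted for brevity).
CUSTOMER = [
    {"CUS_ACCNO": 2465454, "CUS_LNAME": "Emerson", "CUS_FNAME": "Percy", "CUS_BALANCE": 1034},
    {"CUS_ACCNO": 1012345, "CUS_LNAME": "Adares", "CUS_FNAME": "Constance", "CUS_BALANCE": 1865},
]
WITHDRAWAL = [
    {"WITH_TRANS_NO": 48887211, "WITH_AMOUNT": 50, "CUS_ACCNO": 2465454, "BRANCH_NO": 125},
    {"WITH_TRANS_NO": 48867666, "WITH_AMOUNT": 100, "CUS_ACCNO": 1012345, "BRANCH_NO": 333},
    {"WITH_TRANS_NO": 64446566, "WITH_AMOUNT": 200, "CUS_ACCNO": 2465454, "BRANCH_NO": 125},
    {"WITH_TRANS_NO": 64443229, "WITH_AMOUNT": 400, "CUS_ACCNO": 2465454, "BRANCH_NO": 231},
]

# First query: w ranges over WITHDRAWAL (step i), the condition is
# WITH_AMOUNT >= 200 (step ii), and whole tuples are retrieved (step iii).
large_withdrawals = [w for w in WITHDRAWAL if w["WITH_AMOUNT"] >= 200]

# Second query: c is the free variable ranging over CUSTOMER; w is bound by
# an existential quantifier (any) and joined to c on CUS_ACCNO.
last_names = {
    c["CUS_LNAME"]
    for c in CUSTOMER
    if any(
        w["CUS_ACCNO"] == c["CUS_ACCNO"] and w["WITH_AMOUNT"] >= 200
        for w in WITHDRAWAL
    )
}
```

Only the account 2465454 has withdrawals of 200 or more in these rows, so the second query yields a single last name.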

Example 2

Find all customers who have made a deposit at a branch in London.

$$\{\, c \mid c \in \text{CUSTOMER} \land (\exists b)\,(b \in \text{BRANCH} \land (\exists d)\,(d \in \text{DEPOSIT} \land (c.\text{CUS\_ACCNO} = d.\text{CUS\_ACCNO}) \land (b.\text{BRANCH\_NO} = d.\text{BRANCH\_NO}) \land (b.\text{BRANCH\_CITY} = \text{'London'}))) \,\}$$

The above expression relies on joins existing between the CUSTOMER, DEPOSIT and BRANCH relations. As can be seen from Figure 4.29, the common column between CUSTOMER and DEPOSIT is CUS_ACCNO and the common attribute between DEPOSIT and BRANCH is BRANCH_NO.

NOTE Safety of Expressions

It is possible to write tuple calculus expressions that generate infinite relations. For example, the expression

$$\{\, t \mid \lnot(t \in R) \,\}$$

results in an infinite relation if the domain of any attribute of relation $R$ is infinite. In order to solve this problem, the set of allowable expressions is restricted to safe expressions. An expression $\{\, t \mid P(t) \,\}$ in the tuple relational calculus is classed as safe if every component of $t$ appears in one of the relations, tuples or constants that appear in the formula $P$. For example, consider the following expression:

$$\{\, t \mid \lnot(t \in \text{CUSTOMER}) \,\}$$

This expression is NOT safe, as it reads: display all tuples that are NOT in the CUSTOMER relation. It is possible to have infinitely many tuples that do not appear in CUSTOMER.
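Example 2 and the idea of a safe expression can be sketched together in Python. The rows are read from the flattened Figure 4.29; the pairing of branch 125 with London in particular is reconstructed from that figure and should be treated as an assumption of this sketch. The final query shows a negation that stays safe because its variable still ranges over a stored relation.

```python
# Rows read from Figure 4.29; the city for branch 125 is an assumption
# reconstructed from the flattened figure.
CUSTOMER = [
    {"CUS_ACCNO": 2465454, "CUS_LNAME": "Emerson"},
    {"CUS_ACCNO": 1012345, "CUS_LNAME": "Adares"},
]
BRANCH = [
    {"BRANCH_NO": 125, "BRANCH_NAME": "Monsuir", "BRANCH_CITY": "London"},
    {"BRANCH_NO": 333, "BRANCH_NAME": "FirstStep", "BRANCH_CITY": "Paris"},
    {"BRANCH_NO": 231, "BRANCH_NAME": "Cross_St", "BRANCH_CITY": "Rome"},
]
DEPOSIT = [
    {"DEP_TRANS_NO": 90000034, "DEP_AMOUNT": 1000, "CUS_ACCNO": 2465454, "BRANCH_NO": 125},
    {"DEP_TRANS_NO": 90000780, "DEP_AMOUNT": 1400, "CUS_ACCNO": 1012345, "BRANCH_NO": 333},
]

# c is free; b and d are existentially quantified and joined on the
# common attributes CUS_ACCNO and BRANCH_NO.
london_depositors = [
    c
    for c in CUSTOMER
    if any(
        c["CUS_ACCNO"] == d["CUS_ACCNO"]
        and b["BRANCH_NO"] == d["BRANCH_NO"]
        and b["BRANCH_CITY"] == "London"
        for b in BRANCH
        for d in DEPOSIT
    )
]

# A safe negation: c still ranges over CUSTOMER, so the result is finite.
# The unsafe form "all tuples not in CUSTOMER" has no such range to iterate.
other_customers = [c for c in CUSTOMER if c not in london_depositors]
```

The unsafe expression cannot be written as a loop at all, since there is no finite collection to iterate over, which is a practical way of seeing why safety is required.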

4.4.3 Domain Relational Calculus

Domain relational calculus is classed as a non-procedural query language that is seen to be equivalent in power to tuple relational calculus. However, domain relational calculus is different from tuple relational calculus in that it uses domain variables that take on values from an attribute domain, rather than values for an entire tuple. A general expression in domain relational calculus is of the form:

$$\{\, \langle x_1, x_2, \ldots, x_n \rangle \mid P(x_1, x_2, \ldots, x_n) \,\}$$

where $x_1, x_2, \ldots, x_n$ represent domain variables and $P$ represents formulae composed of atoms, as was the case in tuple relational calculus. Formulae are recursively defined, starting with simple atomic formulas that involve getting tuples from relations and making comparisons of attribute values. Bigger formulae are created using the logical connectives AND, OR and NOT. A formula in domain relational calculus is constructed using the following rules:

(i) an atomic formula;
(ii) $\lnot p$, $p \land q$, $p \lor q$ where $p$ and $q$ are formulas;
(iii) $\exists X\,(p(X))$ where $X$ is a domain variable;
(iv) $\forall X\,(p(X))$ where $X$ is a domain variable.

The use of quantifiers $\exists x$ and $\forall x$ in a formula is said to bind $x$. A variable that is not bound is said to be free. This means that, when writing expressions in domain relational calculus, the variables $x_1, x_2, \ldots, x_n$ must be the only free variables in the formulae $P(x_1, x_2, \ldots, x_n)$.

Let us take a look at some examples using the simple banking database shown in Figures 4.28 and 4.29.

Example 1

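Find all customers with a balance greater than 500.

$$\{\, \langle \text{CUS\_ACCNO}, \text{CUS\_LNAME}, \text{CUS\_BALANCE} \rangle \mid (\langle \text{CUS\_ACCNO}, \text{CUS\_LNAME}, \text{CUS\_BALANCE} \rangle \in \text{CUSTOMER}) \land \text{CUS\_BALANCE} > 500 \,\}$$

In this formula, the condition $\langle \text{CUS\_ACCNO}, \text{CUS\_LNAME}, \text{CUS\_BALANCE} \rangle \in \text{CUSTOMER}$ ensures that the domain variables CUS_ACCNO, CUS_LNAME and CUS_BALANCE are bound to the fields of the same CUSTOMER tuple. The term to the left of | means that every tuple $\langle \text{CUS\_ACCNO}, \text{CUS\_LNAME}, \text{CUS\_BALANCE} \rangle$ that satisfies CUS_BALANCE > 500 should be included in the result set.

Example 2

Find all customers with a balance greater than 500 and who have deposited money at branch 125.

$$\{\, \langle \text{CUS\_ACCNO}, \text{CUS\_LNAME}, \text{CUS\_BALANCE} \rangle \mid (\langle \text{CUS\_ACCNO}, \text{CUS\_LNAME}, \text{CUS\_BALANCE} \rangle \in \text{CUSTOMER} \land \text{CUS\_BALANCE} > 500) \land \exists\, \text{DEPOSIT.CUS\_ACCNO}, \text{DEPOSIT.BRANCH\_NO}\,(\langle \text{DEPOSIT.CUS\_ACCNO}, \text{DEPOSIT.BRANCH\_NO} \rangle \in \text{DEPOSIT} \land \text{DEPOSIT.CUS\_ACCNO} = \text{CUS\_ACCNO} \land \text{DEPOSIT.BRANCH\_NO} = 125) \,\}$$

In this example, the existential quantifier $\exists$ has been used to find a tuple in DEPOSIT that joins with the CUSTOMER tuple. The join is made on CUS_ACCNO, the attribute that CUSTOMER and DEPOSIT have in common.

Example 3

List the branches where there have been no deposits.

$$\{\, \langle \text{BRANCH\_NO}, \text{BRANCH\_NAME}, \text{BRANCH\_CITY} \rangle \mid (\langle \text{BRANCH\_NO}, \text{BRANCH\_NAME}, \text{BRANCH\_CITY} \rangle \in \text{BRANCH}) \land \lnot(\exists\, \text{DEPOSIT.BRANCH\_NO})\,(\langle \text{DEPOSIT.BRANCH\_NO} \rangle \in \text{DEPOSIT} \land \text{DEPOSIT.BRANCH\_NO} = \text{BRANCH\_NO}) \,\}$$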
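The three domain relational calculus examples can be imitated in Python by unpacking each stored tuple into one variable per attribute, so that every variable ranges over a single attribute domain. This is a sketch under assumptions: the rows are read from the flattened Figure 4.29, and variable names such as `accno` and `d_branch` are invented here for readability.

```python
# Relations from Figure 4.29 as plain tuples, one position per attribute.
CUSTOMER = [  # (CUS_ACCNO, CUS_LNAME, CUS_FNAME, CUS_BALANCE)
    (2465454, "Emerson", "Percy", 1034),
    (1012345, "Adares", "Constance", 1865),
]
BRANCH = [  # (BRANCH_NO, BRANCH_NAME, BRANCH_CITY)
    (125, "Monsuir", "London"),
    (333, "FirstStep", "Paris"),
    (231, "Cross_St", "Rome"),
]
DEPOSIT = [  # (DEP_TRANS_NO, DEP_DATE, DEP_AMOUNT, CUS_ACCNO, BRANCH_NO)
    (90000034, "30-Jun-18", 1000, 2465454, 125),
    (90000780, "30-Jun-18", 1400, 1012345, 333),
]

# Example 1: customers with a balance greater than 500.
example1 = [
    (accno, lname, balance)
    for (accno, lname, _fname, balance) in CUSTOMER
    if balance > 500
]

# Example 2: as Example 1, plus an existentially quantified DEPOSIT tuple
# for the same account at branch 125.
example2 = [
    (accno, lname, balance)
    for (accno, lname, _fname, balance) in CUSTOMER
    if balance > 500
    and any(
        d_accno == accno and d_branch == 125
        for (_trans, _date, _amount, d_accno, d_branch) in DEPOSIT
    )
]

# Example 3: branches for which NO deposit tuple exists.
example3 = [
    (bno, bname, bcity)
    for (bno, bname, bcity) in BRANCH
    if not any(d_branch == bno for (*_rest, d_branch) in DEPOSIT)
]
```

Example 3 depends on negating the existential test; without the `not`, the query would list the branches that do have deposits.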

SUMMARY

One of the key components of the relational database model is the relation, which allows data to be stored in a structured manner. The relational model supports both relational algebra and relational calculus (tuple and domain calculus) for the manipulation of data within the database.

Relational algebra is a collection of formal operations that act on relations and produce new relations as a result. Relational algebra has eight operators, originally defined by Codd. These are known as SELECT (or RESTRICT), PROJECT, JOIN, PRODUCT, INTERSECT, UNION, DIFFERENCE and DIVIDE. Of those, SELECT, PROJECT and JOIN are the operators most commonly used to retrieve the required information. A summary of these operators is shown in Table 4.1.

Relational calculus provides a notation for formulating the definition of the required relation in terms of other relations. Both relational algebra and relational calculus are formally equivalent to each other, and they form the mathematical basis for specifying queries in real query languages for relational databases, such as SQL.

User queries can be written as relational algebraic expressions. In order to write such an expression, the following steps should be taken:

- List all the attributes we need to give the answer.
- Select all the relations we need, based on the list of attributes.
- Specify the relational operators and the intermediate results that are needed.

Relational calculus is a formal language based upon a branch of mathematical logic called predicate calculus. Relational calculus allows users to describe what they want, rather than how to compute it, and underlies the appearance of relational query languages such as Structured Query Language (SQL). Expressions in tuple relational calculus return tuples for which a given predicate is true. Domain relational calculus is different from tuple relational calculus as it uses domain variables that take on values from an attribute domain.

TABLE 4.1 Summary of relational operators

Operator | Symbol | Description
SELECT | $\sigma$ | Selects a subset of tuples from a relation.
PROJECT | $\Pi$ | Selects a subset of columns from a relation.
DIFFERENCE | $-$ | Selects tuples in Relation1 but not in Relation2*.
INTERSECT | $\cap$ | Selects tuples that are in both Relation1 and Relation2*.
UNION | $\cup$ | Selects tuples in Relation1 or in Relation2, excluding duplicate tuples*.
CARTESIAN PRODUCT | $\times$ | Computes all the possible combinations of tuples.
THETA JOIN | $\theta$ | Allows two relations to be combined using one of the comparison operators {$=$, $<$, $\leq$, $\geq$, $\neq$, $>$}. When the operator is $=$ the operator is known as an EQUIJOIN.
NATURAL JOIN | $\bowtie$ | A version of the EQUIJOIN which selects those tuples where Relation1Tuple.Y = Relation2Tuple.Y. Y is a set of attributes common to both relations, which must share the same domain. Duplicate columns are removed.
OUTER JOIN | | Based on the $\theta$-JOIN and natural JOIN, the OUTER JOIN in addition selects all the tuples in Relation1 that have no corresponding values in the relation Relation2.
DIVIDE | $\div$ | Selects tuples in Relation1 that match every row in Relation2.
EXISTENTIAL | $\exists$ | A formula must be true for at least one instance.
UNIVERSAL | $\forall$ | The formula must be true for all instances.

* In the case of these operators, relations must be union-compatible.
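Several of the operators in Table 4.1 can be sketched as small Python functions over lists of dictionaries. This is a minimal illustration under stated assumptions, not a reference implementation: the function names and the relations R1, R2 and S are hypothetical, each input relation is assumed to contain no duplicate tuples, and R1 and R2 are union-compatible because they share the same attributes.

```python
def select(relation, predicate):
    """SELECT: the subset of tuples satisfying the predicate."""
    return [t for t in relation if predicate(t)]


def project(relation, attributes):
    """PROJECT: the named columns only, with duplicate rows removed."""
    result = []
    for t in relation:
        row = {a: t[a] for a in attributes}
        if row not in result:
            result.append(row)
    return result


def union(r1, r2):
    """UNION: tuples in r1 or r2, excluding duplicates."""
    return r1 + [t for t in r2 if t not in r1]


def intersect(r1, r2):
    """INTERSECT: tuples present in both r1 and r2."""
    return [t for t in r1 if t in r2]


def difference(r1, r2):
    """DIFFERENCE: tuples in r1 that are not in r2."""
    return [t for t in r1 if t not in r2]


def natural_join(r1, r2):
    """NATURAL JOIN: combine tuples that agree on all common attributes."""
    result = []
    for t1 in r1:
        for t2 in r2:
            common = set(t1) & set(t2)
            if all(t1[a] == t2[a] for a in common):
                result.append({**t1, **t2})
    return result


# Hypothetical relations, invented for illustration.
R1 = [{"ID": 1, "NAME": "a"}, {"ID": 2, "NAME": "b"}]
R2 = [{"ID": 2, "NAME": "b"}, {"ID": 3, "NAME": "c"}]
S = [{"ID": 1, "CITY": "x"}, {"ID": 3, "CITY": "y"}]

names_in_either = project(union(R1, R2), ["NAME"])
joined = natural_join(R1, S)
```

Because every function takes relations and returns a relation, the calls can be nested freely, which mirrors the way relational algebraic expressions are composed.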

KEY TERMS

closure
COURSE_RELATION
DIFFERENCE
DIVISION
domain
domain relational calculus
equijoin
INTERSECT
join column(s)
left outer join
natural join
predicate calculus
predicate expression
PROJECT
relational algebra
relational algebraic expression
RESTRICT
right outer join
safe expression
SELECT
set theory
theta join
tuple relational calculus
UNION
union-compatible

FURTHER READING

Codd, E.F. A Relational Model of Data for Large Shared Data Banks. CACM 13, No. 6, June 1970. Republished in Milestones of Research: Selected Papers 1958-1982. CACM 25th Anniversary Issue, CACM 26, No. 1, January 1983.
Codd, E.F. Relational completeness of Data Base Sublanguages. Data Base Systems, pp. 65-98, Prentice Hall and IBM Research Report RJ 987, San Jose, California, 1972.
Date, C. J. An Introduction to Database Systems, 8th edition. Addison-Wesley, 2004.
Dietrich, S. Understanding Relational Database Query Languages, 1st edition. Prentice Hall, 2001.
Hrbacek, K. and Jech, T. Introduction to Set Theory, 3rd edition. Marcel Dekker, Inc., 1999.
Lacroix, M. and Pirotte, A. Domain-Oriented Relational Languages. Proceedings of the 3rd International Conference on Very Large Data Bases, pp. 370-378, 1977.
Venn, J. On the Diagrammatic and Mechanical Representation of Propositions and Reasonings. Dublin Philosophical Magazine and Journal of Science 9(59): 1-18, 1880.

Online Content: Answers to selected Review Questions and Problems for this chapter are contained on the online platform accompanying this book.

REVIEW QUESTIONS

1 What are the main operations of relational algebra?
2 What is the Cartesian product? Illustrate your answer with an example.
3 What is the difference between PROJECTION and SELECTION?
4 Explain the difference between the natural join and the outer join.
5 What is the difference between tuple relational calculus and domain relational calculus?
6 Use the small database shown in Figure Q4.1 to illustrate the difference between a natural join, an equijoin and an outer join.

FIGURE Q4.1 The Ch04_UniversityQue database tables

Database name: Ch04_UniversityQue

Table name: STUDENT
STU_CODE   LECT_CODE
100278
128569     2
512272     4
531235     2
531268
553427     1

Table name: LECTURER
LECT_CODE   DEPT_CODE
1           2
2           6
3           6
4           4

Online Content: All of the databases used in the questions and problems are found on the online platform for this book. The database names used in the folder match the database names used in the figures. For example, the source of the tables shown in Figure Q4.1 is the 'Ch04_UniversityQue' database.

7 Using the relations shown in Figure Q4.2, compute the following relational algebra expressions:
a TREK_UK TREK_EUROPE
b TREK_UK BOOKING
c TREK_UK TREK_EUROPE
d TREK_UK TREK_EUROPE
e TREK_EUROPE TREK_UK
f TREK_UK $\times$ TREK_EUROPE
g $\sigma_{\text{price\_band} = \text{'P2'}}(\text{TREK\_UK})$
h $\Pi_{\text{tour\_name},\, \text{price\_band}}(\text{TREK\_EUROPE})$
i $\Pi_{\text{tour\_name}}(\sigma_{\text{price\_band} = \text{'P2'}}(\text{TREK\_UK}))$
j TREK_UK $\bowtie$ BOOKING
k TREK_EUROPE $\bowtie$ BOOKING
l BOOKING TREK_EUROPE
m $\Pi_{\text{tour\_name},\, \text{price\_band}}(\sigma_{\text{tour\_no} = \text{'A1'} \text{ or } \text{tour\_no} = \text{'A2'}}(\text{TREK\_UK} \bowtie \text{TREK\_EUROPE}))$

8 Using the relations shown in Figure Q4.2, compute the following tuple relational calculus expressions:
a Find all bookings with a rating of S6.
b List the tour names offered by TREK_UK and TREK_EUROPE.

9 Using the relations shown in Figure Q4.2, compute the following domain relational calculus expressions:
a Find all bookings with a rating of S7.
b List the tours from TREK_UK that have not yet been booked.

FIGURE Q4.2 The Ch04_Tours database tables

Database name: Ch04_Tours

Table name: TREK_UK
TOUR_NO   TOUR_NAME      PRICE_BAND
A1        TREK PERU      P2
A2        TREK ANDES     P2
A3        TREK EVEREST   P3
A4        TREK K2        P5

Table name: TREK_EUROPE
TOUR_NO   TOUR_NAME      PRICE_BAND
A3        TREK EVEREST   P3
A1        TREK K2        P4
A2        TREK ALPS      P9

Table name: BOOKING
TOUR_NAME    CUSTOMER_NO   RATING
TREK ANDES   C2            S5
TREK K2      C3            S6
TREK K2      C4            S7

FIGURE Q4.3 The Ch04_Vending database tables

Database name: Ch03_VendingCo
Table name: BOOTH
Table name: MACHINE

Use Figure Q4.3 to answer Questions 10-14.

10 Write the relational algebra formula to apply a UNION relational operator to the tables shown in Figure Q4.3.
11 Create the table that results from applying a UNION relational operator to the tables shown in Figure Q4.3.
12 Write the relational algebra formula to apply an INTERSECT relational operator to the tables shown in Figure Q4.3.
13 Create the table that results from applying an INTERSECT relational operator to the tables shown in Figure Q4.3.
14 Using the tables in Figure Q4.3, create the table that results from MACHINE DIFFERENCE BOOTH.

PROBLEMS

The four relations shown in Figure P4.1 represent tables in a database which contains information about customers' eating habits. The database tables store information about customers and the types of restaurants that they frequently visit. In addition, for each restaurant the type of cuisine which is served is recorded.

Use the relations shown in Figure P4.1 to write relational algebraic expressions for the following queries in Problems 1-12:

1 Display all information about restaurants where the restaurant price is equal to .
2 Find all the customers who frequently visit McDonalds.
3 List the names of all restaurants where it is possible to have fine dining.
4 Show the names of all customers who went to Claridges before 10 January 2008 or have spent more than 250 on the last bill.
5 Find the names and phone numbers of all customers who have visited fast food restaurants more than 40 times.

Use the relations shown in Figure P4.1 to compute the following relational algebra expressions:

6 RESTAURANT $\times$ CUSINE
7 CUSTOMER $\bowtie$ VISIT
8 CUSTOMER $\bowtie$ VISIT $\bowtie$ RESTAURANT
Hint: When trying to solve this problem see how you can use your answer from Problem 7.
9 RESTAURANT VISIT
10 $\Pi_{\text{cus\_lname}}(\sigma_{\text{rest\_name} = \text{'MacDonalds'}}(\text{CUSTOMER} \bowtie \text{VISIT}))$
11 $\Pi_{\text{rest\_name},\, \text{last\_bill\_amount}}(\text{VISIT}) \bowtie (\sigma_{\text{cus\_lname} = \text{'Dunnes'}}(\text{CUSTOMER}))$

FIGURE P4.1 The Ch04_Restaurant_Guide database tables

Database name: Ch04_Restaurant_Guide

Table name: CUSINE
TYPE            CATEGORY
American        FAST FOOD
French          FINE DINING
Chinese         BUFFET
South African   FINE DINING

Table name: CUSTOMER
CUS_CODE   CUS_LNAME   CUS_PHONE
10010      Ramas       844-2573
10011      Dunne       894-1238
10012      Smith       894-2285

Table name: RESTAURANT (the REST_PRICE values are not recoverable here)
REST_NAME     REST_LOCATION   REST_PRICE   REST_TYPE
McDonalds     The Hague                    American
Claridges     London                       French
Pompidou      Paris                        French
The Islands   Cape Town                    South African
Frankies      Milan                        American

Table name: VISIT
CUS_CODE   REST_NAME     NO_TIMES_VISITED   DATE_LAST_VISITED   LAST_BILL_AMOUNT
10010      The Islands   10                 02/01/2018          146.78
10011      McDonalds     87                 30/12/2017          7.98
10011      Claridges     1                  01/01/2018          520.22
10012      Pompidou      5                  03/01/2017          68.75
10012      McDonalds     32                 04/01/2018          12.75

Figure P4.2 shows a set of database tables that store information about student assessments at Tiny University. Use the relations shown in Figure P4.2 to write relational algebraic expressions for the following queries in Problems 12-20:

12 STUDENT-1 STUDENT-2
13 STUDENT-1 STUDENT-2
14 STUDENT-1 STUDENT-2
15 STUDENT-1 $\bowtie$ ASSESSMENT
16 STUDENT-2 ASSESSMENTS
17 $\Pi_{\text{stu\_lname}}(\text{STUDENT-1} \bowtie (\sigma_{\text{exam\_mark} > 60}(\text{ASSESSMENT})))$
18 $\Pi_{\text{class\_name}}(\text{CLASS}) \bowtie ((\text{ASSESSMENT}) \bowtie (\sigma_{\text{stu\_lname} = \text{'Vos'}}(\text{STUDENT-1})))$
19 Write a relational algebraic expression to find out the names of all students in STUDENT-1 who scored less than 60 in the Java_Prog exam.
20 To obtain a merit in a class, students must achieve 65 or over in both coursework and exam marks. Write a relational algebraic expression to show the names and numbers of all students in STUDENT-1 who have achieved a merit in their classes.

FIGURE P4.2 The Ch04_Student_Assess database tables

Database name: Ch04_Student_Assess

Table name: STUDENT-1
STU_NUM   STU_LNAME   CRS_CODE
321452    Vos         Comp-600
324257    Smith       Eng-534
324258    Oblonski    Comp-600

Table name: STUDENT-2
STU_NUM   STU_LNAME   CRS_CODE
324258    Oblonski    Comp-600
324787    Swithety    Comp-600

Table name: CLASS
CLASS_CODE   CLASS_NAME
12           Databases
43           Info_Sys
46           Java_Prog

Table name: ASSESSMENT
STU_NUM   CLASS_CODE   EXAM_MARK   COURSE_WORK_MARK
321452    12           60          70
321452    46           50          60
324258    46           65          65
324457    43           0           70

21 Use the following relational schema to write relational algebra expressions for the following queries:
a Show the names of all authors who have published books after 1st January 2019.
b List the ISBNs of all books in stock.
c Show all the stores in Belgium.
d Find the ISBN of all stores that carry a non-zero quantity of every book in the BOOK relation.
e Find the name and address of all stores that do not carry any books by Cornell.

Relational schema
BOOK(ISBN, Author_name, Title, Publisher, Publish Date, Pages, Notes)
STORE(Store_No, Store_Name, Street, Country, Postcode)
STOCK(ISBN, Store_No, Price, Quantity)

22 Use the following relational schema to write relational algebra expressions for the following queries:
a Show the Reservation_No and Total_cost of all flights that were paid before 21 December 2020.
b List the last name of passengers travelling on flight number VO345.
c Find the efficiency ratings of all planes, including in your answer the airline name for each plane.
d List the Passport_No of passengers sitting in seats 36C, 38F and 42D on Flight_No V0667.

Relational schema
PASSENGER(Passenger_ID, Passenger_firstname, Passenger_lastname, Passport_No, Date_of_Birth)
FLIGHT(Flight_No, Airline_Name, Plane_Type)
RESERVATION(Reservation_No, passenger_ID, Flight_No, Seat_No, Flight_date, Date_paid, Total_Cost)
PLANE(Plane_type, Traveller_Capacity, Efficiency_Rating)

23 Using the relations shown in Figure P4.2, compute the following tuple relational calculus expressions:
a Find the names of all students who are studying Comp-600.
b List all students who have course work and exam marks that are both greater than 50.
c List all students studying the class Java_Prog who have taken the assessment.

24 Repeat Problem 7, but compute the expressions using domain relational calculus.

PART II

DESIGN CONCEPTS

5 Data Modelling with Entity Relationship Diagrams
6 Data Modelling Advanced Concepts
7 Normalising Database Designs

BUSINESS VIGNETTE

USING DATA TO IMPROVE THE LIVES OF CHILDREN AND WOMEN

Over the past 20 years, UNICEF¹ has assisted charities, organisations and governments by providing data, analytics and insights to help improve the welfare of children and women worldwide. They are harnessing the Big Data available to them to make data work for children. In 2017, UNICEF released the Data for Children Strategic Framework, which has allowed it to expand its commitments in three areas that are essential for good data work: coordination, strategic planning and knowledge sharing.² UNICEF currently holds data assets that have been generated from household surveys, global data advocacy and data provided by individual countries; the framework provides an opportunity to build a new data landscape to work within the data governance frameworks of individual countries and provide a gateway to reliable and open data and analysis on the situation of children and women worldwide.²

UNICEF Data and Analytics teams work to ensure that the data collected is statistically sound by using Multiple Indicator Cluster Surveys (MICS).³ Global databases are used to track children and women, and new methodologies and monitoring tools have been designed to enable successful data gathering on issues such as low birth weight, education and child labour. UNICEF houses the power of a modern data warehouse to enable data to be more accessible through interoperability, and data visualisation is achieved through the use of interactive maps and graphs. The ultimate aim is to put data into action.

1 UNICEF, available: https://data.unicef.org/about-us/

2 Data for Children Strategic Framework, available: https://data.unicef.org/resources/data-children-strategic-framework/

3 Multiple Indicator Cluster Surveys, available: http://mics.unicef.org/


One example

project

Uganda through project

the

was to

impact

tackle

to the

has been concerned

use of near real-time the

scorecards,

allowing

health

of care.4 This, coupled including

SMS

Action

Ministry

of

of the

are

successfully

awareness

Copyright Editorial

review

2020 has

Cengage

to

near

Learning. that

any

to

quality,

be

Swaziland

and report

using

and impact

collected

the

colour-coded of their

through

and the

and

The aim of the

areas

collection

reach

measured

Kenya,

community.4

in rural

data

feedback

provide

delivery

many sources,

ability for the community

real-time

data

and

children

The

to raise tip

there

for

feedback

education

virus

in

2013

data received schools

built

to

from

or the impact

Education

Management

an

adaptive

EMIS,

and periods

when

Framework

has

Strategic

has

75 counties

of the

data

Lebanon.4

together

history

Children

Zika

data

and

to

no

MEHE

virus

analytics

severe

South

up to

was used to and

analyse

data

maternal,

social

prevention of

mining

newborn

and

To raise

media

a data-informed

the impact

and

distress

America.

develop

provide

demonstrate

support

caused in

UNICEF teamed

awareness

of the iceberg

community

for

at least

and

came

in

use in 355 schools.4

Data

in

refugees

of the

was and

attendance,

anonymised

data

needs

UNICEF

Brazil, the

Facebook

Brazil.

the

In

child

UNICEF

the

time,

2016,

UNICEFs

and robust

and

At this

to

and poor-quality

determine

a childs

of children.

within

clean

and

Today, this system is in where

are just

education

(MEHE)

During

to track

campaign

studies

provide

to

delivered.

measures,

Zika

to

Education

Lebanon.

women

of prevention

case

Using

deemed

of

communications

available:

being in

the lives

about

UNICEF

was impossible

examples

well-being

conversations

4

the

community

impact

services

mobile

but due to the inadequate

werent in school. many

the

it

(EMIS)

improved

databases

health

enabled

understand

Higher

a way for schools

were and

These

has

time

enabled and

services

System

which provided

public

to

to children,

educational

affected

facilities

has enabled

Education

of schools,

There

Action

also

number

Information

they

has

free education

alimited

child health in from the

solutions.

Data for

provide

monitoring

of decentralised

Data for

with near-real

messaging,

to recommend

The

problems

communities.

with

data and feedback

strategies.

well architected

purposes.

and

child

health

in

East

Africa,

https://data.unicef.org/wp-content/uploads/2018/01/From-Insight-to-Action-November-2017.pdf


CHAPTER 5 Data Modelling with Entity Relationship Diagrams

IN THIS CHAPTER, YOU WILL LEARN:

The main characteristics of entity relationship components

How relationships between entities are defined and refined, and how those relationships are incorporated into the database design process

How ERD components affect database design and implementation

That real-world database design often requires the reconciliation of conflicting goals

PREVIEW

This chapter expands coverage of the data modelling aspect of database design. Data modelling is the first step in the database design journey, serving as a bridge between real-world objects and the database model that is implemented in the computer. Therefore, the importance of data modelling details, expressed graphically through entity relationship diagrams (ERDs), cannot be overstated.

Most of the basic concepts and definitions used in the entity relationship model (ERM) were introduced in Chapter 2, Data Models. For example, the basic components of entities and relationships and their representation should now be familiar to you. This chapter goes much deeper and broader, analysing the graphic depiction of relationships among the entities and showing how those depictions help you summarise the wealth of data required to implement a successful design. Finally, the chapter illustrates how conflicting goals can be a challenge in database design, possibly requiring you to make design compromises.

Throughout this chapter, two case studies will be used to illustrate different types of relationships amongst entities. One case study is based on an international travel company called ILoveHolidays, which owns a number of travel agents around the world. The second case study, known as Tiny University, is based on the internal structure of a university.

NOTE

As this book generally focuses on the relational model, you might be tempted to conclude that the ERM is exclusively a relational tool. Actually, conceptual models such as the ERM can be used to understand and design the data requirements of an organisation. Therefore, the ERM is independent of the database type. Conceptual models are used in the conceptual design of databases, while relational models are used in the logical design of databases. However, since you are now familiar with the relational model from Chapter 3, the relational model is used extensively in this chapter to explain the ER constructs and the way they are used to develop database designs.

5.1 THE ENTITY RELATIONSHIP (ER) MODEL

You should remember from Chapter 2, Data Models, and Chapter 3, Relational Model Characteristics, that the ERM forms the basis of an ERD. The ERD represents the conceptual database as viewed by the end user. ERDs depict the database's main components: entities, attributes and relationships. Because an entity represents a real-world object, the words entity and object are often used interchangeably. Thus, the entities (objects) of the ILoveHolidays database design developed in this chapter include entities such as customers, employees, bookings, hotels and flights. The order in which the ERD components are covered in the chapter is dictated by the way the modelling tools are used to develop ERDs that can form the basis for successful database design and implementation.

Let's start by introducing a simple ERD in Figure 5.1, which has been created to model the recipes within a cookery book and their ingredients. By looking at the ERD, we can find out some basic information about the database design. In Figure 5.1, BOOK, RECIPE, RECIPE_INGREDIENT and INGREDIENT are all examples of entities. By looking more closely at the relationships that exist between these entities, they would be identified as:

a BOOK contains at least one RECIPE, but may contain many RECIPEs

a RECIPE requires at least one RECIPE_INGREDIENT, but may have many RECIPE_INGREDIENTs

one INGREDIENT can be found in a number of RECIPE_INGREDIENTs, but may not appear in any RECIPE_INGREDIENT.

You can also see in Figure 5.1 that each entity has a number of attributes. For example, BOOK contains an attribute called ISBN, which has the notation {PK} next to it. You will learn in this chapter that {PK} is used to denote an attribute that is the PRIMARY KEY of an entity, which is an attribute that identifies each instance of that entity. In this example, a book's ISBN is used to identify each book uniquely. Likewise, FK is used to denote a FOREIGN KEY. We will use examples such as the one shown in Figure 5.1 to illustrate all the different concepts of entity relationship modelling in this chapter.

FIGURE 5.1 A recipe ERD

In Chapter 2, you learnt about the different notations used within ERDs, including the traditional Crow's Foot notation and the more contemporary UML notation. Within this chapter, we will continue to use UML notation to model ERDs, using relational concepts and terminology.

ONLINE CONTENT

For a more detailed description of the Chen, Crow's Foot and other ER model notation systems, see Appendix E, Comparison of ER Modelling Notations, available on the online platform for this book.

5.1.1 Entities

An entity is an object of interest to the end user. In Chapter 2, you learnt that, at the ER modelling level, an entity actually refers to the entity set and not to a single entity occurrence. In other words, the word entity in the ERM corresponds to a table and not to a row in the relational environment. The ERM refers to a specific table row as an entity instance or entity occurrence. In UML notation, an entity is represented by a box that is subdivided into three parts:

The top part is used to name the entity. The entity name, a noun, is usually written in capital letters.

The middle part is used to name and describe the attributes.

The bottom part is used to list the methods. Methods are used only when designing object-relational or object-orientated database models and therefore will be left blank in the examples within this book.

note One component

database some

of this

However, an entity

of

UML is the

modelling. class

diagram

it is important is referred

The

UML

UML

be shown

aware

ERDs

standards

to

you

similar

to the function

book for

modelling

will be described that

in

UML

are

For

see in this reflected

However,

another,

formats. in

but it

are

capabilities.

vendor

presentation

which is

in this

the

of the

entities

using relational terminology

is

ER diagram

in relational

and their relationships, terminology

different.

uses

and concepts.

For

example,

in

UML,

as a class.

These

software

you

diagram

modelling

diagram,

adopted

notation,

that

to

class

standards.

class

The notation

in

any

although

most of the

example,

chapter

the

to the

commercial

the software

software entity

adhere

that

name

database

details

generates

may be

generally

accepted modelling

UML

modelling

software

that

has

do not vary significantly

from

one

such

ERDs lets

boldfaced

and the

you select

entity

name

various box

may

colour.

5.1.2 Attributes

Attributes are characteristics of entities. For example, the TRAVEL_AGENT entity includes the attributes AGENT_ID, AGENT_NAME and AGENT_ADDRESS. In the UML model, the attributes are written in the attribute box below the entity rectangle (see Figure 5.2).

As you examine Figure 5.2, note that AGENT_ID, AGENT_NAME and AGENT_ADDRESS will require data entries, because of the assumption that all travel agents have an ID, name and address. However, if the travel agent has just been established, it might not have a phone number, email address and manager yet.

FIGURE 5.2 The attributes of the TRAVEL_AGENT entity

[Entity box: TRAVEL_AGENT with attributes AGENT_ID {PK}, AGENT_NAME, AGENT_ADDRESS, AGENT_PHONE, AGENT_EMAIL]

ONLINE CONTENT

Microsoft Visio Professional was used to generate both the Crow's Foot ERDs and UML class diagrams in this and subsequent chapters. Appendix A, Designing Databases with Visio Professional: A Tutorial, available on the accompanying online platform, shows you how to create ERD models like the ones in this chapter.
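The distinction drawn above between attributes that always need a value and attributes that may be unknown for a newly established travel agent can be sketched in code. The following is a minimal illustration, assuming an in-memory SQLite database accessed through Python's standard `sqlite3` module; the column types and the sample values are assumptions for illustration and are not taken from the ILoveHolidays case study.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE TRAVEL_AGENT (
        AGENT_ID      INTEGER PRIMARY KEY,
        AGENT_NAME    TEXT NOT NULL,   -- required: every agent has a name
        AGENT_ADDRESS TEXT NOT NULL,   -- required: every agent has an address
        AGENT_PHONE   TEXT,            -- optional: may be unknown for a new agent
        AGENT_EMAIL   TEXT             -- optional
    )
    """
)

# A newly established agent with no phone or email yet is accepted.
conn.execute(
    "INSERT INTO TRAVEL_AGENT (AGENT_ID, AGENT_NAME, AGENT_ADDRESS) VALUES (?, ?, ?)",
    (1, "Sunrise Travel", "1 High Street"),  # hypothetical sample values
)
stored_phone = conn.execute(
    "SELECT AGENT_PHONE FROM TRAVEL_AGENT WHERE AGENT_ID = 1"
).fetchone()[0]

# An agent that lacks the required address is rejected by the NOT NULL rule.
try:
    conn.execute(
        "INSERT INTO TRAVEL_AGENT (AGENT_ID, AGENT_NAME) VALUES (?, ?)",
        (2, "No Address Ltd"),
    )
    missing_address_rejected = False
except sqlite3.IntegrityError:
    missing_address_rejected = True

print(stored_phone, missing_address_rejected)
```

Leaving AGENT_PHONE and AGENT_EMAIL nullable is one way to record "not known yet"; whether that is appropriate depends on the business rules of the organisation being modelled.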

Domains

Attributes have a domain. As you learnt in Chapter 3, a domain is the attribute's set of possible values. For example, the domain for the (numeric) attribute grade point average (GPA) is written (0,4) because the lowest possible GPA value is 0 and the highest possible value is 4. The domain for the (character) attribute GENDER consists of only two possibilities: M or F (or some other equivalent code). The domain for a company's date of hire attribute consists of all dates that fit in a range (for example, company startup date to current date).

Attributes may share a domain. For instance, an employee of a travel agency may also be a customer of the travel agency and share the same domain of all possible addresses. In fact, the data dictionary may let a newly declared attribute inherit the characteristics of an existing attribute if the same attribute name is used. For example, the TRAVEL_AGENT and EMPLOYEE entities may each have an attribute named ADDRESS.

Identifiers (Primary Keys)

The ERM uses identifiers to uniquely identify each entity instance. In the relational model, such identifiers are mapped to primary keys in tables. Identifiers are underlined in the ERD. Key attributes are also underlined when writing the relational schema, using the notation introduced in Chapter 3:

TABLE NAME (KEY_ATTRIBUTE 1, ATTRIBUTE 2, ATTRIBUTE 3, ... ATTRIBUTE K)

For example, a CAR entity may be represented by:

CAR(CAR_REG, MOD_CODE, CAR_YEAR, CAR_COLOUR)

(REG is the standard acronym for vehicle registration number.)
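To see what "uniquely identify each entity instance" means once the identifier is mapped to a primary key, the CAR schema above can be tried out in a small script. This is a sketch under stated assumptions: it uses Python's standard `sqlite3` module with an in-memory database, the column types are guesses, and the registration numbers and other values are invented for illustration.

```python
import sqlite3

def build_car_table():
    """Create an in-memory CAR table whose identifier is CAR_REG."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        """
        CREATE TABLE CAR (
            CAR_REG    TEXT PRIMARY KEY,  -- the identifier in the schema above
            MOD_CODE   TEXT,
            CAR_YEAR   INTEGER,
            CAR_COLOUR TEXT
        )
        """
    )
    return conn

def try_insert(conn, row):
    """Return True if the row was stored, False if it violated the key."""
    try:
        conn.execute("INSERT INTO CAR VALUES (?, ?, ?, ?)", row)
        return True
    except sqlite3.IntegrityError:
        return False

conn = build_car_table()
first = try_insert(conn, ("AB12 CDE", "M1", 2019, "Blue"))  # hypothetical values
second = try_insert(conn, ("AB12 CDE", "M2", 2020, "Red"))  # same CAR_REG again
print(first, second)
```

The second insert is refused because two CAR instances cannot share one CAR_REG value, which is exactly the guarantee an identifier is meant to provide.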

Composite Primary Keys

Ideally, a primary key is composed of only a single attribute. For example, the table in Figure 5.3 uses a single-attribute primary key named PAYMENT_NO. However, it is possible to use a composite key, that is, a primary key composed of more than one attribute. For instance, the ILoveHolidays database administrator may decide to identify each PAYMENT entity instance (occurrence) by using a composite primary key composed of the combination of CUST_NO and INVOICE_NO instead of using PAYMENT_NO. Either approach uniquely identifies each entity instance. Given the current structure of the PAYMENT table shown in Figure 5.3, PAYMENT_NO is the primary key, and the combination of CUST_NO and INVOICE_NO is a proper candidate key. If the PAYMENT_NO attribute is deleted from the PAYMENT entity, the candidate key (CUST_NO and INVOICE_NO) becomes an acceptable composite primary key.

FIGURE 5.3 The PAYMENT (entity) components and contents

PAYMENT_NO  INVOICE_NO  CUST_NO  AMOUNT_PAID  PAYMENT_TYPE      DATE_PAID
152675687   631304      152001   500          VISA              03-Apr-19
152342111   631304      152002   500          VISA              03-May-18
152887222   631304      152003   1000         VISA              03-June-19
152228445   712344      152010   350          American Express  24-May-19
152987877   712344      152011   550          VISA              01-Jul-19
152344223   901234      152132   2000         MasterCard        06-Jun-19
152334534   091234      152167   4329         MasterCard        02-Aug-19

If the PAYMENT_NO in Figure 5.3 is used as the primary key, the PAYMENT entity may be represented in shorthand form by:

PAYMENT (PAYMENT_NO, CUST_NO, INVOICE_NO, AMOUNT_PAID, PAYMENT_TYPE, DATE_PAID)

On the other hand, if PAYMENT_NO is deleted and the composite primary key is the combination of CUST_NO and INVOICE_NO, the PAYMENT entity may be represented by:

PAYMENT (CUST_NO, INVOICE_NO, AMOUNT_PAID, PAYMENT_TYPE, DATE_PAID)

Note that both key attributes are underlined in the entity notation.
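The claim that either PAYMENT_NO or the combination of CUST_NO and INVOICE_NO can identify a payment can be checked against the sample rows of Figure 5.3. The sketch below, written in Python with the standard `sqlite3` module, first tests which column combinations are free of repeats in the sample and then declares the composite primary key in a table. The helper `is_unique` and the column types are assumptions introduced for illustration, and only four of the six columns are transcribed. Note that uniqueness in a handful of sample rows only suggests a candidate key; the business rules decide whether it really is one.

```python
import sqlite3

# (PAYMENT_NO, INVOICE_NO, CUST_NO, AMOUNT_PAID) as read from Figure 5.3.
PAYMENTS = [
    ("152675687", "631304", "152001", 500),
    ("152342111", "631304", "152002", 500),
    ("152887222", "631304", "152003", 1000),
    ("152228445", "712344", "152010", 350),
    ("152987877", "712344", "152011", 550),
    ("152344223", "901234", "152132", 2000),
    ("152334534", "091234", "152167", 4329),
]

def is_unique(rows, positions):
    """True if the chosen column positions never repeat a value combination."""
    keys = [tuple(row[p] for p in positions) for row in rows]
    return len(set(keys)) == len(keys)

payment_no_unique = is_unique(PAYMENTS, (0,))    # single-attribute key
composite_unique = is_unique(PAYMENTS, (2, 1))   # (CUST_NO, INVOICE_NO)
invoice_no_unique = is_unique(PAYMENTS, (1,))    # INVOICE_NO alone repeats

# Declare the composite primary key and load the sample rows.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE PAYMENT (
        CUST_NO     TEXT NOT NULL,
        INVOICE_NO  TEXT NOT NULL,
        AMOUNT_PAID INTEGER,
        PRIMARY KEY (CUST_NO, INVOICE_NO)
    )
    """
)
conn.executemany(
    "INSERT INTO PAYMENT VALUES (?, ?, ?)",
    [(cust, inv, amount) for _, inv, cust, amount in PAYMENTS],
)

# A second payment for an existing (CUST_NO, INVOICE_NO) pair is refused.
try:
    conn.execute("INSERT INTO PAYMENT VALUES (?, ?, ?)", ("152001", "631304", 75))
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

print(payment_no_unique, composite_unique, invoice_no_unique, duplicate_rejected)
```

One design consequence is visible in the last step: with the composite key, a customer cannot make two payments against the same invoice, whereas the single-attribute PAYMENT_NO key would allow it.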

Composite and Simple Attributes

Attributes are classified as simple or composite. A composite attribute, not to be confused with a composite key, is an attribute that can be further subdivided to yield additional attributes. For example, the attribute ADDRESS can be subdivided into street, city, state and postal code. Similarly, the attribute PHONE_NUMBER can be subdivided into area code and exchange number. A simple attribute is an attribute that cannot be subdivided. For example, age, gender and marital status would be classified as simple attributes. To facilitate detailed queries, it is usually appropriate to change composite attributes into a series of simple attributes.

Single-valued Attributes

A single-valued attribute is an attribute that can have only a single value. For example, a person can have only one ID number and a manufactured part can have only one serial number. Keep in mind that a single-valued attribute is not necessarily a simple attribute. For instance, a part's serial number, such as SE-08-02-189935, is single-valued, but it is a composite attribute because it can be subdivided into the region in which the part was produced (SE), the plant within that region (08), the shift within the plant (02) and the part number (189935).

Multivalued Attributes

Multivalued attributes are attributes that can have many values. For instance, a person may have several university degrees or a household may have several different phones, each with its own number. Similarly, a car's colour may be subdivided into many colours (that is, colours for the roof, body and trim).

The ERD in Figure 5.4 contains all of the components introduced thus far. In the UML notation there is no support for primary keys. However, primary keys can be easily added to an entity by adding the notation {PK} after the attribute(s) determined to be the primary key.

FIGURE 5.4 The multivalued attribute in an entity

Resolving Multivalued Attribute Problems

Although the conceptual model can handle *:* relationships and multivalued attributes, you should not implement them in the RDBMS. Remember from Chapter 3, Relational Model Characteristics, that, in the relational table, each column/row intersection represents a single data value. So if multivalued attributes exist, the designer must decide on one of two possible courses of action:

1 Within the original entity, create several new attributes, one for each of the original multivalued attribute's components. For example, the CAR entity's attribute CAR_COLOUR can be split to create the new attributes CAR_TOPCOLOUR, CAR_BODYCOLOUR and CAR_TRIMCOLOUR, which are then assigned to the CAR entity, as shown in Figure 5.5.

FIGURE 5.5 Splitting the multivalued attribute into new attributes

Although this solution seems to work, its adoption can lead to major structural problems in the table. For example, if additional colour components, such as a logo colour, are added for some cars, the table structure must be modified to accommodate the new colour section. In that case, cars that do not have such colour sections generate nulls for the non-existing components, or their colour entries for those sections are entered as N/A to indicate not applicable. (Imagine how the solution in Figure 5.5, splitting a multivalued attribute into new attributes, would cause problems when it is applied to an employee entity containing employee degrees and certifications. If some employees have ten degrees and certifications while most have fewer or none, the number of degree/certification attributes would number ten and most of those attribute values would be null for most of the employees.) In short, although you have seen solution 1 applied, it is not an acceptable solution.

2 Create a new entity composed of the original multivalued attribute's components. (See Figure 5.6.) The new (independent) CAR_COLOUR entity is then related to the original CAR entity in a 1:* relationship. Note that such a change allows the designer to define colour for different sections of the car (see Table 5.1).

FIGURE 5.6 A new entity set composed of a multivalued attribute's components

TABLE 5.1 Components of the multivalued attribute

Section   Colour
Top       White
Body      Blue
Trim      Gold
Interior  Blue

Using the approach illustrated in Table 5.1, you even get a fringe benefit: you are now able to assign as many colours as necessary without having to change the table structure. Note that the ERMs in Figure 5.5 (a) and (b) reflect the components listed in Table 5.1. This is the preferred way to deal with multivalued attributes. Creating a new entity in a 1:* relationship with the original entity yields several benefits: it is a more flexible, expandable solution, and it is compatible with the relational model!

Derived Attributes

Finally, an attribute may be classified as a derived attribute. A derived attribute is an attribute whose value is calculated (derived) from other attributes. The derived attribute need not be physically stored within the database; instead, it can be derived by using an algorithm. For example, an employee's age, EMP_AGE, may be found by computing the integer value of the difference between the current date and EMP_DOB. If you use Microsoft Access, you would use INT((DATE() - EMP_DOB)/365).

If you use Oracle, you would use SYSDATE instead of DATE(). (You are assuming, of course, that EMP_DOB was stored in the Julian date format.) Similarly, the total cost of an order line can be derived by multiplying the quantity ordered by the unit price. Or the estimated average speed can be derived by dividing the trip distance by the time spent en route. In UML, derived attributes are prefixed with a /, which can be seen on the attribute EMP_AGE in Figure 5.7.

FIGURE 5.7 Depiction of a derived attribute

Derived attributes are sometimes referred to as computed attributes. A derived attribute computation could be as simple as adding two attribute values located on the same row, or it could be the result of aggregating the sum of values located on many table rows (from the same table or from a different table). The decision to store derived attributes in database tables depends on the processing requirements and the constraints placed on a particular application. The designer should be able to balance the design in accordance with such constraints. Table 5.2 shows the advantages and disadvantages of storing (or not storing) derived attributes in the database.

TABLE 5.2 Advantages and disadvantages of storing derived attributes

Derived attribute: Stored
Advantage: Saves CPU processing cycles. Data value is readily available. Can be used to keep track of historical data.
Disadvantage: Requires constant maintenance to ensure derived value is current, especially if any values used in the calculation change.

Derived attribute: Not stored
Advantage: Saves storage space. Computation always yields current value.
Disadvantage: Uses CPU processing cycles. Adds coding complexity to queries.
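The "not stored" option for the EMP_AGE example discussed in this section can be sketched as a function that derives the age whenever it is needed. The code below is an illustration in Python, not the Access or Oracle expression from the text: `emp_age_approx` mirrors the divide-by-365 formula, while `emp_age` is an assumed alternative that compares calendar dates. Both function names and the sample dates are invented for this sketch.

```python
from datetime import date

def emp_age(emp_dob, today):
    """Derive EMP_AGE in whole years from EMP_DOB; nothing is stored."""
    years = today.year - emp_dob.year
    # Subtract one year if this year's birthday has not happened yet.
    if (today.month, today.day) < (emp_dob.month, emp_dob.day):
        years -= 1
    return years

def emp_age_approx(emp_dob, today):
    """Integer part of (days elapsed / 365), as in the formula in the text."""
    return (today - emp_dob).days // 365

dob = date(1990, 6, 15)          # hypothetical EMP_DOB
day_before = date(2020, 6, 14)   # one day before the 30th birthday

exact = emp_age(dob, day_before)
approx = emp_age_approx(dob, day_before)
print(exact, approx)
```

Because leap days accumulate, dividing by 365 can report a birthday a few days early, so the two functions disagree for the sample dates. This is the kind of coding complexity that Table 5.2 lists as a cost of computing a derived attribute on demand.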

5.1.3 Relationships

A relationship is an association between entities. The entities that participate in a relationship are also known as participants. You should recall from Chapter 2, Data Models, that each relationship is identified by a name that is descriptive of the relationship. The relationship name is an active or passive verb; for example, a STUDENT takes a CLASS, a LECTURER teaches a CLASS, a DEPARTMENT employs a LECTURER, a DIVISION is managed by an EMPLOYEE, a CUSTOMER makes a BOOKING and an AIRCRAFT is flown by a CREW.

Relationships between entities always operate in both directions. That is, to define the relationship between the entities named CUSTOMER and INVOICE, you would specify that:

A CUSTOMER may generate many INVOICEs.

Each INVOICE is generated by one CUSTOMER.

Because you know both directions of the relationship between CUSTOMER and INVOICE, it is easy to see that this relationship can be classified as 1:*.

The relationship classification is difficult to establish if you know only one side of the relationship. For example, if you specify that:

A DIVISION is managed by one EMPLOYEE.

you don't know if the relationship is 1:1 or 1:*. Therefore, you should ask the question 'Can an employee manage more than one division?' If the answer is yes, the relationship is 1:*, and the second part of the relationship is then written as:

An EMPLOYEE may manage many DIVISIONs.

If an employee cannot manage more than one division, the relationship is 1:1, and the second part of the relationship is then written as:

An EMPLOYEE may manage only one DIVISION.

NOTE

In UML class diagrams the relationship name is often referred to as an association name. Normally, the name of the association is written over the association line. Associations also have a direction, represented by an arrow pointing in the direction in which the relationship flows. Alternatively, the association name may be replaced with role names. A role name expresses the role played by a given entity (class) in the relationship. Each relationship is usually described by two role names which represent the relationship as seen by each class; for example:

A CUSTOMER generates an INVOICE and each INVOICE belongs to a CUSTOMER.

A VENDOR supplies a PRODUCT and each PRODUCT is supplied by a VENDOR.

In this chapter, all relationship names will be described using the singular association name, as it is the same as the relationship name used in traditional relational modelling.

5.1.4 Multiplicity

You learnt in Chapter 2 that entity relationships may be classified as one-to-one, one-to-many, or many-to-many. Multiplicity is the main constraint that exists on a relationship, which enables us to define the number of participants in that relationship. So, multiplicity refers to the number of instances of one entity that are associated with one instance of a related entity. Figure 5.8 illustrates how Visio shows multiplicity on an ERD using UML notation.

cHAPteR

FIguRe

5.8

5

Data

Modelling

with

Entity

Relationship

Diagrams

175

Multiplicity in an eRD Relationship

name: teaches

Herethe arrow indicates

the direction

of the

relationship

LECTURER

teaches

CLASS

c

1..1

1..4

5 Multiplicities

As you examine Figure 5.8, notice that the multiplicities represent the number of occurrences in the related entity. For example, the multiplicity (1..4) written next to the CLASS entity in the LECTURER teaches CLASS relationship indicates that the LECTURER table's primary key value occurs at least once and no more than four times as foreign key values in the CLASS table. If the multiplicity had been written as (1..*), there would be no upper limit to the number of classes a lecturer might teach. Similarly, the multiplicity (1..1) written next to the LECTURER entity indicates that each class is taught by one and only one lecturer. That is, each CLASS entity occurrence is associated with one and only one entity occurrence in LECTURER.

If you examine multiplicity further, you will see that each numerical range actually describes two important constraints: participation and cardinality. The word cardinality is a common term used in traditional entity relationship modelling, and is used to express the maximum number of entity occurrences associated with one occurrence of the related entity. Participation determines whether all occurrences of an entity participate in the relationship or not. So, the multiplicity (1..4), written next to the CLASS entity in Figure 5.8, can be interpreted as follows:

The 1 represents the participation and indicates that all lecturers must participate in the relationship and that it is mandatory.
The 4 represents the cardinality, and indicates that one lecturer must teach at least one and up to four classes.

You will learn more about relationship participation in Section 5.1.8.
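The reading of a multiplicity range described above (lower bound as participation, upper bound as cardinality) can be sketched in a few lines of Python. This is only an illustration: the names Multiplicity, parse_multiplicity and allows are hypothetical helpers, not part of UML, Visio or any library.

```python
from typing import NamedTuple, Optional


class Multiplicity(NamedTuple):
    minimum: int             # participation: 0 = optional, 1 or more = mandatory
    maximum: Optional[int]   # cardinality: None stands for "*" (no upper limit)

    @property
    def mandatory(self) -> bool:
        return self.minimum >= 1


def parse_multiplicity(text: str) -> Multiplicity:
    """Parse a UML-style range such as '1..4', '0..*' or '(1..1)'."""
    low, high = text.strip("() ").split("..")
    return Multiplicity(int(low), None if high == "*" else int(high))


def allows(m: Multiplicity, count: int) -> bool:
    """Return True if `count` related occurrences fit within the range."""
    return count >= m.minimum and (m.maximum is None or count <= m.maximum)


# The (1..4) written next to CLASS in Figure 5.8
class_side = parse_multiplicity("(1..4)")
print(class_side.mandatory, class_side.maximum)
print(allows(class_side, 0), allows(class_side, 4), allows(class_side, 5))
```

Under this reading, a lecturer with no classes or with five classes would violate the (1..4) range, while (1..*) would accept any number of classes from one upwards.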

Note: Traditional modelling notations such as Chen and Crow's Foot use cardinality to represent the number of occurrences of associated entities. In Crow's Foot ERDs, symbols are used to represent a minimum cardinality of zero, one or many. Cardinality is indicated by placing the appropriate numbers beside the entities, using the format (x,y). The first value represents the minimum number of associated entities, while the second value represents the maximum number of associated entities. When using a tool such as MS Visio to draw Crow's Foot ERDs, specific numbers cannot be written in the notation. Instead, text has to be used to specify the numeric cardinality on the ERD.

Knowing the minimum and maximum number of entity occurrences is very useful at the application software level. For example, Tiny University may want to ensure that a class is not taught unless it has at least ten students enrolled. Similarly, if the classroom can hold only 30 students, the application software should use that cardinality to limit enrolment in the class. However, keep in mind that the DBMS cannot handle the implementation of the cardinalities at the table level; that capability is provided by the application software or by triggers. You will learn how to create and execute triggers in Chapter 9, Procedural Language SQL and Advanced SQL.

Multiplicities are established by very concise statements known as business rules. Such rules, derived from a precise and detailed description of an organisation's data environment, also establish the ERM's entities, attributes, relationships, cardinalities and constraints. (Business rules were introduced in Chapter 2.)
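The enrolment limit mentioned earlier (a classroom that holds only 30 students) is the kind of cardinality that can be enforced with a trigger. The sketch below uses Python's built-in sqlite3 module purely for illustration; the CLASS and ENROL tables, their columns, the trigger name TRG_ENROL_LIMIT and the class code are all assumptions made for this example, and trigger syntax differs between DBMS products. The minimum of ten students is not covered here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE CLASS (
    CLASS_CODE INTEGER PRIMARY KEY,
    CLASS_MAX  INTEGER NOT NULL          -- maximum cardinality for this class
);
CREATE TABLE ENROL (
    CLASS_CODE INTEGER NOT NULL REFERENCES CLASS (CLASS_CODE),
    STU_NUM    INTEGER NOT NULL,
    PRIMARY KEY (CLASS_CODE, STU_NUM)
);
-- Reject an enrolment once the class already holds CLASS_MAX students
CREATE TRIGGER TRG_ENROL_LIMIT
BEFORE INSERT ON ENROL
WHEN (SELECT COUNT(*) FROM ENROL WHERE CLASS_CODE = NEW.CLASS_CODE)
     >= (SELECT CLASS_MAX FROM CLASS WHERE CLASS_CODE = NEW.CLASS_CODE)
BEGIN
    SELECT RAISE(ABORT, 'class is full');
END;
""")
conn.execute("INSERT INTO CLASS VALUES (10012, 30)")


def enrol(stu_num: int) -> bool:
    """Try to enrol one student in class 10012; report whether it worked."""
    try:
        with conn:
            conn.execute("INSERT INTO ENROL VALUES (10012, ?)", (stu_num,))
        return True
    except sqlite3.DatabaseError:
        return False


results = [enrol(n) for n in range(1, 32)]   # 31 attempts against a limit of 30
print(results.count(True), results[-1])
```

The first 30 inserts succeed and the 31st is rejected by the trigger, so the limit is enforced in the database rather than by each application.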

Online content: Since the careful definition of complete and accurate business rules is crucial to good design, their derivation is examined in detail in Appendix B, where you will undertake a real-life database design exercise for a university lab. The modelling skills you are learning in this chapter are applied in the development of a real database design in Appendices B and C (Global Tickets Ltd e-commerce database). In Appendices B and C you will be taken through all stages of the database design process, from conceptual design through logical and physical database design to verification and implementation. (Both appendices are available on the online platform accompanying this book.)

Since business rules define the ERM's components, making sure that all appropriate business rules are identified is an important part of a database designer's job.

5.1.5 Existence Dependence

An entity is said to be existence-dependent if it can exist in the database only when it is associated with another related entity occurrence. In implementation terms, an entity is existence-dependent if it has a mandatory foreign key, that is, a foreign key attribute that cannot be null. For example, if an XYZ Corporation employee wants to claim one or more dependents for tax-withholding purposes, the relationship EMPLOYEE claims DEPENDENT would be appropriate. In that case, the DEPENDENT entity is clearly existence-dependent on the EMPLOYEE entity, because it is impossible for the dependent to exist apart from the EMPLOYEE in the XYZ Corporation database.

If an entity can exist apart from one or more related entities, it is said to be existence-independent. (Sometimes designers refer to such an entity as a strong or regular entity.) For example, suppose that


the XYZ Corporation uses parts to produce its products. Further, suppose that some of those parts are produced in-house and other parts are bought from vendors. In that scenario, it is quite possible for a PART to exist independently from a VENDOR in the relationship PART is supplied by VENDOR. (After all, at least some of the parts are not supplied by a vendor.) Therefore, PART is existence-independent from VENDOR.

5.1.6 Relationship Strength

The concept of relationship strength is based on how the primary key of a related entity is defined. To implement a relationship, the primary key of one entity appears as a foreign key in the related entity. For example, the 1:* relationship between VENDOR and PRODUCT, in Chapter 3, Figure 3.5, is implemented by using the VEND_CODE primary key in VENDOR as a foreign key in PRODUCT. There are times when the foreign key also is a primary key component in the related entity. For example, in Figure 5.6, the CAR entity primary key (CAR_REG) also appears as a primary key component and a foreign key in the CAR_COLOUR entity. In this section, you will learn how different relationship strength decisions affect primary key arrangement in database design.

Weak (non-identifying) relationships

A weak relationship, also known as a non-identifying relationship, exists if the PK of the related entity does not contain a PK component of the parent entity. By default, relationships are established by having the PK of the parent entity appear as a FK on the related entity. For example, suppose that in the Travel Agent case study the TRAVEL_AGENT and EMPLOYEE entities are defined as:

TRAVEL_AGENT(AGENT_ID, AGENT_NAME, AGENT_ADDRESS, AGENT_PHONE, AGENT_EMAIL)
EMPLOYEE(EMP_ID, AGENT_ID, EMP_LNAME, EMP_FNAME, EMP_PHONE, EMP_GRADE, PAYROLL_NO)

In this case, a weak relationship exists between TRAVEL_AGENT and EMPLOYEE because EMP_ID is the EMPLOYEE entity's PK, while AGENT_ID is only an FK. In this example, the EMPLOYEE entity's PK did not inherit the PK component from the TRAVEL_AGENT entity. Figure 5.9 shows the weak relationship between TRAVEL_AGENT and EMPLOYEE.

By examining Figure 5.9, you will see that the UML notation does not make a distinction between weak and strong relationships. UML class diagrams do not require foreign key attributes to be added to the class on the many side of a 1:* relationship. However, because the focus here is on the use of UML class diagrams to model relational databases, the foreign key attributes are shown in the diagrams by adding {FK} after the attribute name.

FIGURE 5.9 A weak (non-identifying) relationship between TRAVEL_AGENT and EMPLOYEE

[Class diagram: TRAVEL_AGENT (1..1) employs EMPLOYEE (1..*).
TRAVEL_AGENT: AGENT_ID {PK}, AGENT_NAME, AGENT_ADDRESS, AGENT_PHONE, AGENT_EMAIL.
EMPLOYEE: EMP_ID {PK}, AGENT_ID {FK1}, PAYROLL_NO, EMP_LNAME, EMP_FNAME, EMP_PHONE, EMP_GRADE.]
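The weak (non-identifying) arrangement described above can be sketched as table definitions. The example below is a minimal illustration using Python's built-in sqlite3 module, not the book's own implementation: the column types and the reduced column lists are assumptions, the helper pk_columns is hypothetical, and the 'Nobody' row is invented test data. It also shows the mandatory foreign key from Section 5.1.5, since AGENT_ID is declared NOT NULL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")   # SQLite enforces FKs only on request
conn.executescript("""
CREATE TABLE TRAVEL_AGENT (
    AGENT_ID   INTEGER PRIMARY KEY,
    AGENT_NAME TEXT NOT NULL
);
CREATE TABLE EMPLOYEE (
    EMP_ID     INTEGER PRIMARY KEY,        -- PK contains no part of the parent PK
    AGENT_ID   INTEGER NOT NULL            -- mandatory FK, and only an FK
               REFERENCES TRAVEL_AGENT (AGENT_ID),
    EMP_LNAME  TEXT,
    PAYROLL_NO TEXT
);
INSERT INTO TRAVEL_AGENT VALUES (8, 'FlightLite');
INSERT INTO EMPLOYEE VALUES (5009323, 8, 'Lefu', 'TY334Z');
""")


def pk_columns(table: str) -> list:
    """List a table's primary key columns in key order (hypothetical helper)."""
    rows = conn.execute(f"PRAGMA table_info({table})").fetchall()
    return [row[1] for row in sorted(rows, key=lambda r: r[5]) if row[5] > 0]


weak_pk = pk_columns("EMPLOYEE")

# An employee without a travel agent violates the mandatory foreign key
try:
    conn.execute("INSERT INTO EMPLOYEE VALUES (1, NULL, 'Nobody', 'X1')")
    orphan_allowed = True
except sqlite3.IntegrityError:
    orphan_allowed = False

print(weak_pk, orphan_allowed)
```

Because the EMPLOYEE primary key is EMP_ID alone, the relationship is weak even though every employee row must still reference a travel agent.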


Note: If you are used to looking at relational diagrams such as the ones produced by Microsoft Access, you expect to see the relationship line in the relational diagram drawn from the PK to the FK. However, the relational diagram convention is not necessarily reflected in the ERD. In an ERD, the focus is on the entities and the relationships between them, rather than on the way those relationships are anchored graphically. You will discover that the placement of the relationship lines in a complex ERD that includes both horizontally and vertically placed entities is largely dictated by the designer's decision to improve the readability of the design. (Remember that the ERD is used for communication between the designer(s) and end users.)

In fact, if you use database design tools such as Visio Professional to create the ERD, the relationship lines are anchored by the software, and the line points from the PK to the FK attribute only after the FK has been defined. (This feature ensures that FKs are established properly, so it is impossible to move the relationship line until the relationship has been completed.) You can, of course, decide to update the anchor points after the relationship has been created, but that decision reflects the designer's choice rather than necessity.

An example of the weak relationship between TRAVEL_AGENT and EMPLOYEE is shown in Figure 5.10.

FIGURE 5.10 Weak (non-identifying) relationship between TRAVEL_AGENT and EMPLOYEE

Database name: CH05_Travel_Agent

Table name: TRAVEL_AGENT
Primary Key: AGENT_ID

AGENT_ID | AGENT_NAME | AGENT_ADDRESS | AGENT_PHONE | AGENT_EMAIL
1 | Timeless Travel | Upper Keys Business Village, Cannock, Staffordshire, WS12 2HA, UK | 0800 333 2233 | [email protected]
8 | FlightLite | Anansi Park, Durbanville, 7550, Cape Town, SA | 0860232425 | [email protected]
9 | VILLANOVO Paris | 244 Rue De Rivoli 75001, 222 Rue De Rivoli 75001 Paris, France | 33170809753 | [email protected]

Table name: EMPLOYEE
Primary Key: EMP_ID

EMP_ID | AGENT_ID | EMP_LNAME | EMP_FNAME | EMP_PHONE | EMP_GRADE | PAYROLL_NO
1239909 | 9 | Meniur | Adele | 044573322 | Manager | NW445T
1239986 | 9 | Vos | Astrid | 049989900 | Deputy Manager | NW211Q
1344255 | 9 | Marin | Gaston | 046656671 | Staff | NW887L
1556743 | 9 | Vulstrek | Henry | 043343322 | Staff | NW667P
4000566 | 1 | Khoza | Buhle | 087632343 | Staff | CW990U
4000768 | 1 | Fenyang | Abri | 084544477 | Staff | CW211R
4005655 | 1 | Xu | Chang | 088765676 | Manager | CW223V
5009323 | 8 | Lefu | Mosa | 081231133 | Manager | TY334Z

Strong (identifying) relationships

A strong relationship, also known as an identifying relationship, exists when the PK of the related entity contains a PK component of the parent entity. For example, the definitions of the TRAVEL_AGENT and EMPLOYEE entities:

TRAVEL_AGENT(AGENT_ID, AGENT_NAME, AGENT_ADDRESS, AGENT_PHONE, AGENT_EMAIL)
EMPLOYEE(AGENT_ID, PAYROLL_NO, EMP_LNAME, EMP_FNAME, EMP_PHONE, EMP_GRADE)

indicate that a strong relationship exists between TRAVEL_AGENT and EMPLOYEE because the EMPLOYEE entity's composite PK is composed of AGENT_ID + PAYROLL_NO. (Note that the AGENT_ID in EMPLOYEE is also the FK to the TRAVEL_AGENT entity.) Whether the relationship between TRAVEL_AGENT and EMPLOYEE is strong or weak depends on how the EMPLOYEE entity's primary key is defined. Figure 5.11 shows the strong relationship between TRAVEL_AGENT and EMPLOYEE.

Online content: All of the databases used to illustrate the material in this chapter are available on the online platform accompanying this book.


FIGURE 5.11 Strong (identifying) relationship between TRAVEL_AGENT and EMPLOYEE

[Class diagram: TRAVEL_AGENT (1..1) employs EMPLOYEE (1..*).
TRAVEL_AGENT: AGENT_ID {PK}, AGENT_NAME, AGENT_ADDRESS, AGENT_PHONE, AGENT_EMAIL.
EMPLOYEE: AGENT_ID {PK} {FK1}, PAYROLL_NO {PK}, EMP_LNAME, EMP_FNAME, EMP_PHONE, EMP_GRADE.]

Database name: CH05_Travel_Agent

Table name: TRAVEL_AGENT
Primary Key: AGENT_ID

AGENT_ID | AGENT_NAME | AGENT_ADDRESS | AGENT_PHONE | AGENT_EMAIL
1 | Timeless Travel | Upper Keys Business Village, Cannock, Staffordshire, WS12 2HA, UK | 0800 333 2233 | [email protected]
8 | FlightLite | Anansi Park, Durbanville, 7550, Cape Town, SA | 0860232425 | [email protected]
9 | VILLANOVO Paris | 244 Rue De Rivoli 75001, 222 Rue De Rivoli 75001 Paris, France | 33170809753 | [email protected]

Table name: EMPLOYEE
Primary Key: AGENT_ID AND PAYROLL_NO
Foreign Key: AGENT_ID

AGENT_ID | PAYROLL_NO | EMP_ID | EMP_LNAME | EMP_FNAME | EMP_PHONE | EMP_GRADE
9 | NW445T | 1239909 | Meniur | Adele | 044573322 | Manager
9 | NW211Q | 1239986 | Vos | Astrid | 049989900 | Deputy Manager
9 | NW887L | 1344255 | Marin | Gaston | 046656671 | Staff
9 | NW667P | 1556743 | Vulstrek | Henry | 043343322 | Staff
1 | CW990U | 4000566 | Khoza | Buhle | 087632343 | Staff
1 | CW211R | 4000768 | Fenyang | Abri | 084544477 | Staff
1 | CW223V | 4005655 | Xu | Chang | 088765676 | Manager
8 | TY334Z | 5009323 | Lefu | Mosa | 081231133 | Manager
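The strong (identifying) arrangement in Figure 5.11 can be sketched in the same way. Again this is only an illustration with Python's built-in sqlite3 module: column types and the shortened column lists are assumptions, and the 'Duplicate' and 'Other' rows are invented to exercise the composite key.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE TRAVEL_AGENT (
    AGENT_ID   INTEGER PRIMARY KEY,
    AGENT_NAME TEXT NOT NULL
);
CREATE TABLE EMPLOYEE (
    AGENT_ID   INTEGER NOT NULL REFERENCES TRAVEL_AGENT (AGENT_ID),
    PAYROLL_NO TEXT    NOT NULL,
    EMP_LNAME  TEXT,
    PRIMARY KEY (AGENT_ID, PAYROLL_NO)     -- the parent's PK is part of this PK
);
INSERT INTO TRAVEL_AGENT VALUES (1, 'Timeless Travel'), (9, 'VILLANOVO Paris');
INSERT INTO EMPLOYEE VALUES (9, 'NW445T', 'Meniur'), (1, 'CW990U', 'Khoza');
""")

rows = conn.execute("PRAGMA table_info(EMPLOYEE)").fetchall()
strong_pk = [row[1] for row in sorted(rows, key=lambda r: r[5]) if row[5] > 0]


def try_insert(agent_id: int, payroll_no: str, lname: str) -> bool:
    """Insert one EMPLOYEE row; return False if a constraint rejects it."""
    try:
        with conn:
            conn.execute("INSERT INTO EMPLOYEE VALUES (?, ?, ?)",
                         (agent_id, payroll_no, lname))
        return True
    except sqlite3.IntegrityError:
        return False


duplicate_accepted = try_insert(9, "NW445T", "Duplicate")   # same agent, same number
other_agent_accepted = try_insert(1, "NW445T", "Other")     # same number, other agent
print(strong_pk, duplicate_accepted, other_agent_accepted)
```

With the composite key, a payroll number only has to be unique within one travel agent, which is one practical consequence of choosing the strong form over the weak one.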


Keep in mind that the order in which the tables are created and loaded is very important. For example, in the TRAVEL_AGENT employs EMPLOYEE relationship (Figure 5.11), the TRAVEL_AGENT table must be created before the EMPLOYEE table. After all, it would not be acceptable to have the EMPLOYEE table's foreign key reference a TRAVEL_AGENT table that did not yet exist. In fact, you must load the data of the 1 side first in a 1:* relationship to avoid the possibility of referential integrity errors, regardless of whether the relationships are weak or strong. In some DBMSs, this sequencing problem does not crop up until the data are loaded into the tables.

Remember that the nature of the relationship is often determined by the database designer, who must use professional judgement to determine which relationship type and strength will best suit the database transaction, efficiency and information requirements. That point will often be emphasised in detail!
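The loading order described above can be tried out directly. In the sketch below (Python's built-in sqlite3 module, with assumed column types and a shortened EMPLOYEE table), SQLite happens to be one of the systems in which the problem only appears at load time: it does not check the parent table when the child table is created, but it rejects a child row whose parent row has not been loaded yet. The helper try_insert is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE TRAVEL_AGENT (
    AGENT_ID   INTEGER PRIMARY KEY,
    AGENT_NAME TEXT NOT NULL
);
CREATE TABLE EMPLOYEE (
    EMP_ID    INTEGER PRIMARY KEY,
    AGENT_ID  INTEGER NOT NULL REFERENCES TRAVEL_AGENT (AGENT_ID),
    EMP_LNAME TEXT
);
""")


def try_insert(sql: str, params: tuple) -> str:
    """Run one INSERT and report whether referential integrity allowed it."""
    try:
        with conn:
            conn.execute(sql, params)
        return "ok"
    except sqlite3.IntegrityError as exc:
        return f"rejected: {exc}"


employee_row = ("INSERT INTO EMPLOYEE VALUES (?, ?, ?)", (5009323, 8, "Lefu"))

child_first = try_insert(*employee_row)   # many side loaded before the 1 side
parent = try_insert("INSERT INTO TRAVEL_AGENT VALUES (?, ?)", (8, "FlightLite"))
child_after = try_insert(*employee_row)   # the same row, after its parent exists

print(child_first)
print(parent, child_after)
```

The same EMPLOYEE row is rejected before travel agent 8 is loaded and accepted afterwards, whichever relationship strength is chosen.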

5.1.7 Weak Entities

A weak entity is one that meets two conditions:

1 It is existence-dependent; that is, it cannot exist without the entity with which it has a relationship.
2 It has a primary key that is partially or totally derived from the parent entity in the relationship.

For example, a company insurance policy insures an employee and his/her dependants. For the purpose of describing an insurance policy, an EMPLOYEE may or may not have a DEPENDANT, but the DEPENDANT must be associated with an EMPLOYEE. Moreover, the DEPENDANT cannot exist without the EMPLOYEE; that is, a person cannot get insurance coverage as a dependant at the XYZ Corporation unless s(he) happens to be a dependant of an employee working for the XYZ Corporation. DEPENDANT is the weak entity in the relationship EMPLOYEE has DEPENDANT. An example of that relationship is shown in Figure 5.12.

A strong (identifying) relationship indicates that the related entity is weak. Such a relationship means that both conditions of the weak entity definition have been met: the related entity is existence-dependent, and the PK of the related entity contains a PK component of the parent entity.

As you examine the ERD in Figure 5.12, you will notice that there is no diagrammatic distinction between weak and strong entities when using the UML notation.

FIGURE 5.12 A weak entity in an ERD

[Class diagram: EMPLOYEE (1..1) has DEPENDANT (0..*).
EMPLOYEE (strong entity): EMP_NUM {PK}, EMP_LNAME, EMP_FNAME, EMP_INITIAL, EMP_DOB, EMP_HIREDATE.
DEPENDANT (weak entity): EMP_NUM {PK} {FK1}, DEP_NUM {PK}, DEP_FNAME, DEP_DOB.]

Remember that the weak entity inherits part of its primary key from its strong counterpart. For example, at least part of the DEPENDANT entity's key shown in Figure 5.12 was inherited from the EMPLOYEE entity:

EMPLOYEE (EMP_NUM, EMP_LNAME, EMP_FNAME, EMP_INITIAL, EMP_DOB, EMP_HIREDATE)
DEPENDANT (EMP_NUM, DEP_NUM, DEP_FNAME, DEP_DOB)

Figure 5.13 illustrates the implementation of the relationship between the weak entity (DEPENDANT) and its parent or strong counterpart (EMPLOYEE). Note that DEPENDANT's primary key is composed of two attributes, EMP_NUM and DEP_NUM, and that EMP_NUM was inherited from EMPLOYEE. Given this scenario, and with the help of this relationship, you can determine:

Linda J. De Lange claims two dependants, Annelise and Jorge.

FIGURE 5.13 A weak entity in a strong relationship

Database name: CH05_ShortCo

Table name: EMPLOYEE
Primary key: EMP_NUM

EMP_NUM | EMP_LNAME | EMP_FNAME | EMP_INITIAL | EMP_DOB | EMP_HIREDATE
1001 | De Lange | Linda | J | 12-Mar-74 | 25-May-07
1002 | Smithson | William | K | 23-Nov-80 | 28-May-07
1003 | Washington | Herman | H | 15-Aug-78 | 28-May-07
1004 | Chen | Lydia | B | 23-Mar-84 | 15-Oct-08
1005 | Johnson | Melanie | | 28-Sep-76 | 20-Dec-08
1006 | Khumalo | Mandla | G | 12-Jul-89 | 05-Jan-12
1007 | O'Donnell | Peter | D | 10-Jun-81 | 23-Jun-12
1008 | Brzenski | Barbara | A | 12-Feb-80 | 01-Nov-13

Table name: DEPENDANT
Primary keys: EMP_NUM and DEP_NUM
Foreign key: EMP_NUM

EMP_NUM | DEP_NUM | DEP_FNAME | DEP_DOB
1001 | 1 | Annelise | 05-Dec-07
1001 | 2 | Jorge | 30-Sep-12
1003 | 1 | Suzanne | 25-Jan-14
1006 | 1 | Nonhlanhla | 25-May-11
1008 | 1 | Michael | 19-Feb-05
1008 | 2 | George | 27-Jun-08
1008 | 3 | Katherine | 18-Aug-13
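The two weak entity conditions can be seen at work in a small sketch based on a few rows from Figure 5.13. It uses Python's built-in sqlite3 module with assumed column types and a reduced set of columns. The ON DELETE CASCADE rule is an assumption added here to make the existence dependence visible; the text does not prescribe a particular delete rule. The helper dependants_of is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE EMPLOYEE (
    EMP_NUM   INTEGER PRIMARY KEY,
    EMP_LNAME TEXT,
    EMP_FNAME TEXT
);
CREATE TABLE DEPENDANT (
    EMP_NUM   INTEGER NOT NULL
              REFERENCES EMPLOYEE (EMP_NUM) ON DELETE CASCADE,  -- assumed rule
    DEP_NUM   INTEGER NOT NULL,
    DEP_FNAME TEXT,
    PRIMARY KEY (EMP_NUM, DEP_NUM)     -- key partly inherited from EMPLOYEE
);
INSERT INTO EMPLOYEE VALUES
    (1001, 'De Lange', 'Linda'), (1003, 'Washington', 'Herman'),
    (1008, 'Brzenski', 'Barbara');
INSERT INTO DEPENDANT VALUES
    (1001, 1, 'Annelise'), (1001, 2, 'Jorge'), (1003, 1, 'Suzanne'),
    (1008, 1, 'Michael'), (1008, 2, 'George'), (1008, 3, 'Katherine');
""")


def dependants_of(emp_num: int) -> list:
    """Return the first names of one employee's dependants, in DEP_NUM order."""
    cursor = conn.execute(
        "SELECT DEP_FNAME FROM DEPENDANT WHERE EMP_NUM = ? ORDER BY DEP_NUM",
        (emp_num,))
    return [row[0] for row in cursor]


linda = dependants_of(1001)

# Removing the strong entity occurrence removes its weak entity occurrences
conn.execute("DELETE FROM EMPLOYEE WHERE EMP_NUM = 1001")
after_delete = dependants_of(1001)

print(linda, after_delete)
```

DEP_NUM values restart at 1 for each employee, so a dependant can only be identified together with the EMP_NUM inherited from EMPLOYEE.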

Keep in mind that the database designer usually determines whether an entity can be described as weak based on the business rules. An examination of the relationship between TRAVEL_AGENT and EMPLOYEE in Figure 5.10 may cause you to conclude that EMPLOYEE is a weak entity to TRAVEL_AGENT. After all, if you examine the EMPLOYEE table rows in Figure 5.10, it seems clear that an employee cannot exist without being employed by a travel agent; that is, he is an employee only when attached to an existing travel agency. For example, employee Mosa Lefu cannot be an employee unless he is employed by travel agent 8, which is called FlightLite, so there is an existence dependency. Note that the EMPLOYEE entity's primary key in this case is EMP_ID, which is not derived from the parent entity. That is, EMPLOYEE may be represented by:

EMPLOYEE(EMP_ID, AGENT_ID, EMP_LNAME, EMP_FNAME, EMP_PHONE, EMP_GRADE, PAYROLL_NO)

The second weak entity requirement has not been met; therefore, by definition, the EMPLOYEE entity in Figure 5.10 may not be classified as weak. On the other hand, if the EMPLOYEE entity's primary key had been defined as a composite key, composed of the combination of AGENT_ID and PAYROLL_NO, EMPLOYEE could be represented by:

EMPLOYEE(AGENT_ID, PAYROLL_NO, EMP_LNAME, EMP_FNAME, EMP_PHONE, EMP_GRADE)

In that case, as illustrated in Figure 5.11, the EMPLOYEE primary key is partially derived from TRAVEL_AGENT because AGENT_ID is the TRAVEL_AGENT table's primary key. Given this decision, EMPLOYEE is a weak entity by definition. (In Visio Professional terms, the relationship between TRAVEL_AGENT and EMPLOYEE is classified as strong, or identifying.) In any case, EMPLOYEE is always existence-dependent on TRAVEL_AGENT, whether or not it is defined as weak.

5.1.8 Relationship Participation

Participation in an entity relationship is either optional or mandatory. Optional participation means that one entity occurrence does not require a corresponding entity occurrence in a particular relationship. For example, consider the relationship between the two entities BOOKING and FLIGHT in Figure 5.14. In the relationship BOOKING consists of FLIGHT, some bookings may not be for a flight. In other words, an entity occurrence (row) in the BOOKING table does not necessarily require the existence of a corresponding entity occurrence in the FLIGHT table. Therefore, the FLIGHT entity is considered to be optional to the BOOKING entity. In UML notation, an optional relationship between entities is shown by a 0..1 or 0..* multiplicity label, which indicates that the minimum cardinality is 0 for the optional entity. (Remember that the term optionality is used to label any condition in which one or more optional relationships exist.)

FIGURE 5.14 An optional FLIGHT entity in the relationship BOOKING consists of FLIGHT

[Class diagram: BOOKING (0..*) consists_of FLIGHT (0..1).
BOOKING: BOOKING_NO {PK}, EMP_ID {FK1}, CUST_NO {FK2}, BOOK_STATUS_CODE {FK3}, EVENT_ID {FK4}, HOTEL_ID {FK5}, FLIGHT_NO {FK6}, BOOK_TOTAL_COST.
FLIGHT: FLIGHT_NO {PK}, FLIGHT_AIRLINE, FLIGHT_DEPART_AIRPORT, FLIGHT_ARRIVE_AIRPORT, FLIGHT_DEPART_TIME, FLIGHT_ARRIVE_TIME, FLIGHT_COST.]


Note: Remember that the burden of establishing the relationship is always placed on the entity that contains the foreign key. In most cases, that will be the entity on the many side of the relationship.

Mandatory participation means that one entity occurrence requires a corresponding entity occurrence in a particular relationship. If no optionality symbol is depicted with the entity, the entity exists in a mandatory relationship with the related entity. The existence of a mandatory relationship indicates that the minimum cardinality is 1 for the mandatory entity.
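In implementation terms, the difference between optional and mandatory participation often comes down to whether the foreign key may be null. The sketch below contrasts the two cases using Python's built-in sqlite3 module. It is an assumption-laden illustration, not the book's schema: column types are guessed, most columns are omitted, the LECTURER and CLASS column names are invented, and all rows are made-up test data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE FLIGHT (
    FLIGHT_NO      TEXT PRIMARY KEY,
    FLIGHT_AIRLINE TEXT
);
CREATE TABLE BOOKING (
    BOOKING_NO INTEGER PRIMARY KEY,
    FLIGHT_NO  TEXT REFERENCES FLIGHT (FLIGHT_NO)      -- nullable: FLIGHT is optional (0..1)
);
CREATE TABLE LECTURER (
    LECT_NUM INTEGER PRIMARY KEY
);
CREATE TABLE CLASS (
    CLASS_CODE INTEGER PRIMARY KEY,
    LECT_NUM   INTEGER NOT NULL
               REFERENCES LECTURER (LECT_NUM)          -- NOT NULL: LECTURER is mandatory (1..1)
);
INSERT INTO FLIGHT VALUES ('XY100', 'Example Air');
INSERT INTO LECTURER VALUES (1);
""")


def accepted(sql: str, params: tuple) -> bool:
    """Run one INSERT; return False if a constraint rejects the row."""
    try:
        with conn:
            conn.execute(sql, params)
        return True
    except sqlite3.IntegrityError:
        return False


booking_with_flight = accepted("INSERT INTO BOOKING VALUES (?, ?)", (1, "XY100"))
booking_without_flight = accepted("INSERT INTO BOOKING VALUES (?, ?)", (2, None))
class_with_lecturer = accepted("INSERT INTO CLASS VALUES (?, ?)", (10012, 1))
class_without_lecturer = accepted("INSERT INTO CLASS VALUES (?, ?)", (10013, None))

print(booking_with_flight, booking_without_flight)
print(class_with_lecturer, class_without_lecturer)
```

A booking with no flight is stored without complaint, whereas a class with no lecturer is rejected, which mirrors the 0 and 1 minimum values in the two multiplicity labels.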

Note: You may be tempted to conclude that relationships are weak when they occur between entities in an optional relationship and that relationships are strong when they occur between entities in a mandatory relationship. However, this conclusion is not warranted. Keep in mind that relationship participation and relationship strength do not describe the same thing. You are likely to encounter a strong relationship when one entity is optional to another. For example, the relationship between EMPLOYEE and DEPENDANT is clearly a strong one, but DEPENDANT is clearly optional to EMPLOYEE. After all, you cannot require employees to have dependents. And it is just as possible for a weak relationship to be established when one entity is mandatory to another. The relationship strength depends on how the PK of the related entity is formulated, while the relationship participation depends on how the business rule is written. For example, the business rules 'Each part must be supplied by a vendor' and 'A part may or may not be supplied by a vendor' create different optionalities for the same entities! Failure to understand this distinction may lead to poor design decisions that cause major problems when table rows are inserted or deleted.

Since relationship participation turns out to be an important part of the database design process, let's examine a few more scenarios. Suppose that Tiny University employs some lecturers who conduct research without teaching classes. If you examine the LECTURER teaches CLASS relationship, it is quite possible for a LECTURER not to teach a CLASS. Therefore, CLASS is optional to LECTURER. On the other hand, a CLASS must be taught by a LECTURER. Therefore, LECTURER is mandatory to CLASS. Note that the ERD model shown in Figure 5.15 shows the multiplicity next to CLASS to be (0..3), thus indicating that a lecturer may teach no classes at all or as many as three classes. And each CLASS table row will reference one and only one LECTURER row, assuming each class is taught by one and only one lecturer, represented by the multiplicity (1..1) next to the LECTURER entity.

FIGURE 5.15 An optional CLASS entity in the relationship LECTURER teaches CLASS

Failure to understand the distinction between mandatory and optional participation in relationships may yield designs in which awkward (and unnecessary) temporary rows (entity instances) must be created just to accommodate the creation of required entities. Therefore, it is important that you clearly understand the concepts of mandatory and optional participation.

It is also important to understand that the semantics of a problem may determine the type of participation in a relationship. For example, suppose that Tiny University offers several courses; each course has several classes. Note again the distinction between class and course in this discussion: a CLASS constitutes a specific offering (or section) of a COURSE. (Typically, courses are listed in the university's course catalogue, while classes are listed in the class schedules that students use to register for their classes.)

Analysing the CLASS entity's contribution to the COURSE generates CLASS relationship, it is easy to see that a CLASS cannot exist without a COURSE. Therefore, you can conclude that the COURSE entity is mandatory in the relationship. But two scenarios for the CLASS entity may be written, as shown in Figures 5.16 and 5.17. The different scenarios are a function of the semantics of the problem; that is, they depend on how the relationship is defined:

1 CLASS is optional. It is possible for the department to create the entity COURSE first and then create the CLASS entity after making the teaching assignments. In the real world, such a scenario is very likely; there may be courses for which sections (classes) have not yet been defined. In fact, some courses are taught only once a year and do not generate classes each semester.
2 CLASS is mandatory. This condition is created by the constraint that is imposed by the semantics of the statement 'Each COURSE generates one or more CLASSes'. In ER terms, each COURSE in the generates relationship must have at least one CLASS. Therefore, a CLASS must be created as the COURSE is created in order to comply with the semantics of the problem.

FIGURE 5.16 CLASS is optional to COURSE

FIGURE 5.17 COURSE and CLASS in a mandatory relationship

Keep in mind the practical aspects of the scenario shown in Figure 5.17. Given the semantics of this relationship, the system should not accept a course that is not associated with at least one class. Is such a rigid environment desirable? For example, when a new COURSE is created, the database first updates the COURSE table, so that rows are inserted for a COURSE entity that does not yet have a CLASS associated with it.

has several

between

awkward the

concepts

also important

participation

distinction

which

accommodate

the

course

to

the

in

5

of the

scenario

accept an

a course

operational

the

with it.

not

be

copied, affect

in not

Figure

overall

apparent

CLASS table.

or

duplicated, learning

in experience.

whole

or in Cengage

For

Due Learning

to

electronic reserves

rights, right

semantics

with at least

a COURSE

some to

third remove

of the

party additional

content

a new

entity that

be

any

time

is

does not

relationship,

suppressed at

such

when CLASS

mandatory

may

Is

COURSE

be solved

content

of this

one class.

when

seems to

because

the

Given the

example,

inserting problem

However,

part.

5.17.

associated

of view?

COURSE table, thereby

scanned, the

that is point

Naturally, the

corresponding

materially

presented

from if

the

subsequent

eBook rights

and/or restrictions

eChapter(s). require

it

186

PARt

II

Design

Concepts

the system desirable

will bein temporary

to

Finally, DBMS.

classify as you

To

the examine

maintain

When you create

tAbLe

on both

the

5.3

order

DBMS

rule constraint.

to

presented

the

the foreign

Table

in

scenarios

a relationship

sides.

of the business

as optional

data integrity,

with a COURSE through

many

violation

CLASS

produce

in

must

Figures ensure

For practical

a more flexible 5.16

that

and

the

purposes,

it

would be

design.

5.17,

keep in

many

mind the

side (CLASS)

role

is

of the

associated

key rules. in

MS Visio using

5.3 shows

the

various

UML, the

default relationship

multiplicities

that

will be optional

are supported

by the

UML

and

notation.

TABLE 5.3 Multiplicity

Multiplicity  Description
0..1          A minimum of zero and a maximum of one instance of this class are associated with an instance of the other related class (indicates an optional class).
0..*          A minimum of zero and a maximum of many instances of this class are associated with an instance of the other related class (indicates an optional class).
1..1          A minimum of one and a maximum of one instance of this class are associated with an instance of the other related class (indicates a mandatory class).
1..*          A minimum of one and a maximum of many instances of this class are associated with an instance of the other related class (indicates a mandatory class).
1             Exactly one instance of this class is associated with an instance of the other related class. Equivalent to 1..1.
*             Many instances of this class are associated with an instance of the other related class. Equivalent to 0..*.

ONLINE CONTENT
To learn how to define relationships properly with the help of MS Visio, see Appendix A, Designing Databases with Visio Professional: A Tutorial, available on the online platform for this book.

5.1.9 Relationship Degree

A relationship degree indicates the number of entities or participants associated with a relationship. A unary relationship exists when an association is maintained within a single entity. A binary relationship exists when two entities are associated. A ternary relationship exists when three entities are associated. Although higher degrees exist, they are rare and are not specifically named. (For example, an association of four entities is described simply as a four-degree relationship.) Figure 5.18 shows these types of relationship degrees using UML notation.

Unary relationships

In the case of the unary relationship shown in Figure 5.18, an employee within the EMPLOYEE entity is the manager for one or more employees within that entity. In this case, the existence of the 'manages' relationship means that EMPLOYEE requires another EMPLOYEE to be the manager; that is, EMPLOYEE has a relationship with itself. Such a relationship is known as a recursive relationship. The different cases of recursive relationships will be explored in Section 5.1.10.

Binary relationships

A binary relationship exists when two entities are associated in a relationship. Binary relationships are most common. In fact, to simplify the conceptual design, whenever possible, most higher-order (ternary and higher) relationships are decomposed into appropriate equivalent binary relationships. In Figure 5.18, the relationship 'a LECTURER teaches one or more CLASSes' represents a binary relationship.

FIGURE 5.18 Three types of relationship degree

Unary relationship: EMPLOYEE manages EMPLOYEE (0..*)
Binary relationship: LECTURER (1..1) teaches CLASS (0..*)
Ternary relationship: DOCTOR (1..1) writes PRESCRIPTION (0..*); PATIENT (1..1) receives PRESCRIPTION (0..*); DRUG (1..1) appears_in PRESCRIPTION (0..*)

Ternary and higher-order relationships

Although most relationships are binary, the use of ternary and higher-order relationships does allow the designer some latitude regarding the semantics of a problem. A ternary relationship implies an association among three different entities. For example, note the relationships (and their consequences) in Figure 5.18, which are represented by the following business rules:

A DOCTOR writes one or more PRESCRIPTIONs.
A PATIENT may receive one or more PRESCRIPTIONs.
A DRUG may appear on one or more PRESCRIPTIONs.

(To simplify this example, assume that the business rule states that each prescription contains only one drug. In short, if a doctor prescribes more than one drug, a separate prescription must be written for each drug.)


The reason why this is a ternary relationship and not three binary relationships is because the PRESCRIPTION entity reflects a single event or object that simultaneously includes all three parent entities (DOCTOR, PATIENT and DRUG).

FIGURE 5.19 The implementation of a ternary relationship

Database name: Ch05_Clinic

Table name: Drug
Primary key: DRUG_CODE

DRUG_CODE  DRUG_NAME                 DRUG_PRICE
AF15       Afgapan-15                 25.00
AF25       Afgapan-25                 35.00
DRO        Droalene Chloride         111.89
DRZ        Druzocholar Cryptolene     18.99
KO15       Koliabar Oxyhexalene       65.75
OLE        Oleander-Drizapan         123.95
TRYP       Tryptolac Heptadimetric    79.45

Table name: Patient
Primary key: PAT_NUM

PAT_NUM  PAT_TITLE  PAT_LNAME   PAT_FNAME  PAT_INITIAL  PAT_DOB      PAT_AREACODE  PAT_PHONE
100      Mr         Dlamini     Phindile   D            15-Jun-1952  0181          324-5456
101      Ms         Lewis       Rhonda     G            19-Mar-2015  0181          324-4472
102      Mr         Vandam      Rhett                   14-Nov-1968  0879          675-8993
103      Ms         Jones       Anne       M            16-Oct-1984  0181          898-3456
104      Mr         Lange       John       P            08-Nov-1981  0879          504-4430
105      Mr         Mthembu     Nsizwa     D            14-Mar-1985  0181          890-3220
106      Mrs        Smith       Jeanine    K            12-Feb-2013  0181          324-7883
107      Mr         Diante      Jorge      D            21-Aug-1984  0181          890-4567
108      Mr         Wiesenbach  Paul       R            14-Feb-1976  0181          897-4358
109      Mr         Smith       George     K            18-Jun-1971  0879          504-3339
110      Mrs        Genkazi     Leighla    W            19-May-1980  0879          569-0093
111      Mr         Washington  Rupert     E            03-Jan-1976  0181          890-4925
112      Mr         Johnson     Edward     E            14-May-1971  0181          898-4387
113      Ms         Gounden     Melanie    P            15-Sep-1980  0181          324-9006
114      Ms         Brandon     Marie      G            02-Nov-1942  0879          882-0845
115      Mrs        Saranda     Hermine    R            25-Jul-1982  0181          324-5505
116      Mr         Smith       George     A            08-Nov-1975  0181          890-2984


Table name: Doctor
Primary key: DOC_ID

DOC_ID  DOC_LNAME   DOC_FNAME  DOC_INITIAL  DOC_SPECIALTY
29827   Ndosi       Sipho      J            Dermatology
32445   Jorgensen   Annelise   G            Neurology
33456   Jali        Phakamile  A            Urology
33989   LeGrande    George                  Paediatrics
34409   Washington  Dennis     F            Orthopaedics
36221   McPherson   Katye      H            Dermatology
36712   Dreifag     Herman     G            Psychiatry
38995   Minh        Tran                    Neurology
40004   Chin        Ming       D            Orthopaedics
40028   Cele        Denise     L            Gynaecology

Table name: Prescription
Primary key: DOC_ID, PAT_NUM, DRUG_CODE and PRES_DATE
Foreign keys: DOC_ID, PAT_NUM and DRUG_CODE

DOC_ID  PAT_NUM  DRUG_CODE  PRES_DATE  PRES_DOSAGE
32445   102      DRZ        12-Nov-19  One tablet every four hours, 50 tablets total
32445   113      OLE        14-Nov-19  One teaspoon with each meal, 250 ml total
34409   101      KO15       14-Nov-19  Two tablets every six hours, 30 tablets total
36221   109      DRO        14-Nov-19  Two tablets with every meal, 60 tablets total
38995   107      KO15       14-Nov-19  One tablet every six hours, 30 tablets total

As you examine the table contents in Figure 5.19, note that it is possible to track all transactions. For instance, you can tell that the first prescription was written by doctor 32445 for patient 102, using the drug DRZ on 12 November 2019.
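The tracking described above can be sketched with Python's built-in sqlite3 module. This is a minimal illustration under stated assumptions, not the book's implementation: the tables are cut down to a few columns, only the first PRESCRIPTION row is loaded, the date is stored as ISO text, and the function names build_clinic and track_prescriptions are hypothetical.

```python
import sqlite3


def build_clinic():
    """Create a cut-down Ch05_Clinic schema in memory (assumed column subset)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves FK checks off by default
    conn.executescript(
        """
        CREATE TABLE DOCTOR  (DOC_ID    INTEGER PRIMARY KEY, DOC_LNAME TEXT NOT NULL);
        CREATE TABLE PATIENT (PAT_NUM   INTEGER PRIMARY KEY, PAT_LNAME TEXT NOT NULL);
        CREATE TABLE DRUG    (DRUG_CODE TEXT    PRIMARY KEY, DRUG_NAME TEXT NOT NULL);
        -- PRESCRIPTION ties the three parents together: one row is one event.
        CREATE TABLE PRESCRIPTION (
            DOC_ID    INTEGER NOT NULL REFERENCES DOCTOR (DOC_ID),
            PAT_NUM   INTEGER NOT NULL REFERENCES PATIENT (PAT_NUM),
            DRUG_CODE TEXT    NOT NULL REFERENCES DRUG (DRUG_CODE),
            PRES_DATE TEXT    NOT NULL,
            PRIMARY KEY (DOC_ID, PAT_NUM, DRUG_CODE, PRES_DATE)
        );
        INSERT INTO DOCTOR  VALUES (32445, 'Jorgensen');
        INSERT INTO PATIENT VALUES (102, 'Vandam');
        INSERT INTO DRUG    VALUES ('DRZ', 'Druzocholar Cryptolene');
        INSERT INTO PRESCRIPTION VALUES (32445, 102, 'DRZ', '2019-11-12');
        """
    )
    return conn


def track_prescriptions(conn):
    """Resolve each prescription to its doctor, patient and drug."""
    return conn.execute(
        """
        SELECT d.DOC_LNAME, p.PAT_LNAME, g.DRUG_NAME, r.PRES_DATE
        FROM PRESCRIPTION AS r
        JOIN DOCTOR  AS d ON d.DOC_ID    = r.DOC_ID
        JOIN PATIENT AS p ON p.PAT_NUM   = r.PAT_NUM
        JOIN DRUG    AS g ON g.DRUG_CODE = r.DRUG_CODE
        ORDER BY r.PRES_DATE
        """
    ).fetchall()


if __name__ == "__main__":
    for row in track_prescriptions(build_clinic()):
        print(row)
```

Because every PRESCRIPTION row carries all three foreign keys, a single join answers 'who prescribed what to whom, and when'. With foreign key checks enabled, a prescription that names an unknown doctor, patient or drug is refused.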

5.1.10 Recursive Relationships

As was previously mentioned, a recursive relationship is one in which a relationship can exist between occurrences of the same entity set. (Naturally, such a condition is found within a unary relationship.) For example, a 1:* unary relationship can be expressed by 'an EMPLOYEE may manage many EMPLOYEEs, and each EMPLOYEE is managed by one EMPLOYEE'. As long as polygamy is not legal, a 1:1 unary relationship may be expressed by 'an EMPLOYEE may be married to one and only one other EMPLOYEE'. Finally, the *:* unary relationship may be expressed by 'a COURSE may be a prerequisite to many other COURSEs, and each COURSE may have many other COURSEs as prerequisites'. Those relationships are shown in Figure 5.20.


FIGURE 5.20 An ER representation of a recursive relationship

The 1:1 relationship shown in Figure 5.20 can be implemented in the single table shown in Figure 5.21. Note that you can determine that Nishok Singh is married to Vediga Singh, who is married to Nishok Singh. Anne Jones is married to Anton Shapiro, who is married to Anne Jones.

FIGURE 5.21 The 1:1 recursive relationship 'EMPLOYEE is married to EMPLOYEE'

Database name: Ch05_PartCo
Table name: EMPLOYEE_V1

EMP_NUM  EMP_LNAME  EMP_FNAME  EMP_SPOUSE
345      Singh      Nishok     347
346      Jones      Anne       349
347      Singh      Vediga     345
348      Delaney    Robert
349      Shapiro    Anton      346

FIGURE 5.22 Another unary relationship: 'PART contains PART'

Database name: Ch05_PartCo
Table name: PART_V1

PART_CODE  PART_DESCRIPTION           PART_IN_STOCK  PART_UNITS_NEEDED  PART_OF_PART
AA21-6     2.5 cm washer, 1.0 mm rim  432            4                  C-130
AB-121     Cotter pin, copper         1034           2                  C-130
C-130      Rotor assembly             36
E129       2.5 cm steel shank         128            1                  C-130
X10        10.25 cm rotor blade       345            4                  C-130
X34AW      2.5 cm hex nut             879            2                  C-130
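The 1:1 recursive relationship in Figure 5.21 can be read back with a self-join, in which the same table plays both the 'employee' and the 'spouse' role. The sketch below uses Python's sqlite3 module and the rows of EMPLOYEE_V1. The UNIQUE and deferred foreign key constraints are assumptions added for illustration, and the function name married_pairs is hypothetical.

```python
import sqlite3

# Rows copied from the EMPLOYEE_V1 table in Figure 5.21 (None = no spouse recorded).
EMPLOYEES = [
    (345, "Singh", "Nishok", 347),
    (346, "Jones", "Anne", 349),
    (347, "Singh", "Vediga", 345),
    (348, "Delaney", "Robert", None),
    (349, "Shapiro", "Anton", 346),
]


def married_pairs():
    """Return (employee, spouse) name pairs by joining EMPLOYEE_V1 to itself."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")
    conn.execute(
        """
        CREATE TABLE EMPLOYEE_V1 (
            EMP_NUM    INTEGER PRIMARY KEY,
            EMP_LNAME  TEXT NOT NULL,
            EMP_FNAME  TEXT NOT NULL,
            -- UNIQUE keeps the relationship 1:1; the FK points back at the same table.
            -- Deferring the check lets 345 reference 347 before 347 is inserted.
            EMP_SPOUSE INTEGER UNIQUE
                REFERENCES EMPLOYEE_V1 (EMP_NUM) DEFERRABLE INITIALLY DEFERRED
        )
        """
    )
    with conn:  # one transaction; the deferred FK is verified at commit
        conn.executemany("INSERT INTO EMPLOYEE_V1 VALUES (?, ?, ?, ?)", EMPLOYEES)
    return conn.execute(
        """
        SELECT e.EMP_FNAME || ' ' || e.EMP_LNAME,
               s.EMP_FNAME || ' ' || s.EMP_LNAME
        FROM EMPLOYEE_V1 AS e
        JOIN EMPLOYEE_V1 AS s ON e.EMP_SPOUSE = s.EMP_NUM
        ORDER BY e.EMP_NUM
        """
    ).fetchall()


if __name__ == "__main__":
    for employee, spouse in married_pairs():
        print(employee, "is married to", spouse)
```

Robert Delaney does not appear in the result because his EMP_SPOUSE value is null, which is how the optional side of the relationship shows up in the data.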


Unary relationships are common in manufacturing industries. For example, Figure 5.22 illustrates that a rotor assembly (C-130) is composed of many parts, but each part is used to create only one rotor assembly. Figure 5.22 indicates that a rotor assembly is composed of four 2.5 cm washers, two cotter pins, one 2.5 cm steel shank, four 10.25 cm rotor blades and two 2.5 cm hex nuts. The relationship implemented in Figure 5.22 thus enables you to track each part within each rotor assembly.

If a part can be used to assemble several different kinds of other parts and is itself composed of many parts, two tables are required to implement the 'PART contains PART' relationship. Figure 5.23 illustrates such an environment. Parts tracking is increasingly important as managers become more aware of the legal ramifications of producing more complex output. In fact, in many industries, especially those involving aviation, full parts tracking is mandatory.

FIGURE 5.23 Implementation of the *:* recursive 'PART contains PART' relationship

Database name: Ch05_PartCo

Table name: COMPONENT

COMP_CODE  PART_CODE  COMP_PARTS_NEEDED
C-130      AA21-6     4
C-130      AB-121     2
C-130      E129       1
C-131A2    E129       1
C-130      X10        4
C-131A2    X10        1
C-130      X34AW      2
C-131A2    X34AW      2

Table name: PART

PART_CODE  PART_DESCRIPTION           PART_IN_STOCK
AA21-6     2.5 cm washer, 1.0 mm rim  432
AB-121     Cotter pin, copper         1 034
C-130      Rotor assembly             36
E129       2.5 cm steel shank         128
X10        10.25 cm rotor blade       345
X34AW      2.5 cm hex nut             879
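The two-table design in Figure 5.23 supports both directions of parts tracking: which parts go into an assembly, and in which assemblies a given part is used. The following sketch, using Python's sqlite3 module, loads the figure's rows and answers both questions. The function names are hypothetical. No foreign keys are declared, because assembly C-131A2 does not appear among the PART rows shown in the figure.

```python
import sqlite3

# Rows copied from Figure 5.23.
PARTS = [
    ("AA21-6", "2.5 cm washer, 1.0 mm rim", 432),
    ("AB-121", "Cotter pin, copper", 1034),
    ("C-130", "Rotor assembly", 36),
    ("E129", "2.5 cm steel shank", 128),
    ("X10", "10.25 cm rotor blade", 345),
    ("X34AW", "2.5 cm hex nut", 879),
]
COMPONENTS = [
    ("C-130", "AA21-6", 4), ("C-130", "AB-121", 2), ("C-130", "E129", 1),
    ("C-131A2", "E129", 1), ("C-130", "X10", 4), ("C-131A2", "X10", 1),
    ("C-130", "X34AW", 2), ("C-131A2", "X34AW", 2),
]


def build_partco():
    """Load PART and the COMPONENT bridge table into an in-memory database."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE PART (
            PART_CODE        TEXT PRIMARY KEY,
            PART_DESCRIPTION TEXT NOT NULL,
            PART_IN_STOCK    INTEGER NOT NULL
        );
        -- Both key columns name a part: the assembly and the part it contains.
        CREATE TABLE COMPONENT (
            COMP_CODE         TEXT NOT NULL,
            PART_CODE         TEXT NOT NULL,
            COMP_PARTS_NEEDED INTEGER NOT NULL,
            PRIMARY KEY (COMP_CODE, PART_CODE)
        );
        """
    )
    conn.executemany("INSERT INTO PART VALUES (?, ?, ?)", PARTS)
    conn.executemany("INSERT INTO COMPONENT VALUES (?, ?, ?)", COMPONENTS)
    return conn


def parts_needed(conn, comp_code):
    """List (part code, description, quantity) for one assembly."""
    return conn.execute(
        """
        SELECT c.PART_CODE, p.PART_DESCRIPTION, c.COMP_PARTS_NEEDED
        FROM COMPONENT AS c
        JOIN PART AS p ON p.PART_CODE = c.PART_CODE
        WHERE c.COMP_CODE = ?
        ORDER BY c.PART_CODE
        """,
        (comp_code,),
    ).fetchall()


def where_used(conn, part_code):
    """List the assemblies that contain the given part."""
    rows = conn.execute(
        "SELECT COMP_CODE FROM COMPONENT WHERE PART_CODE = ? ORDER BY COMP_CODE",
        (part_code,),
    )
    return [code for (code,) in rows]


if __name__ == "__main__":
    db = build_partco()
    print(parts_needed(db, "C-130"))
    print(where_used(db, "X10"))
```

The second query is what distinguishes this design from the single-table version in Figure 5.22, where each part row can point to only one parent assembly.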

The *:* recursive relationship might be more familiar in a school environment. For instance, note how the *:* 'COURSE requires COURSE' relationship illustrated in Figure 5.20 is implemented in Figure 5.24. In this example, MATH-243 is a prerequisite to QM-261 and QM-362, while both MATH-243 and QM-261 are prerequisites to QM-362.


FIGURE 5.24 Implementation of the *:* recursive 'COURSE requires COURSE' relationship

Database name: Ch05_TinyUniversity

Table name: COURSE

CRS_CODE  DEPT_CODE  CRS_DESCRIPTION                     CRS_CREDIT
ACCT-211  ACCT       Accounting I                        3
ACCT-212  ACCT       Accounting II                       3
CIS-220   CIS        Intro. to Computer Science          3
CIS-420   CIS        Database Design and Implementation  4
MATH-243  MATH       Mathematics for Managers            3
QM-261    CIS        Intro. to Statistics                3
QM-362    CIS        Statistical Applications

Table name: PREREQ

CRS_CODE  PRE_TAKE
CIS-420   CIS-220
QM-261    MATH-243
QM-362    MATH-243
QM-362    QM-261

Finally, the 1:* recursive relationship 'EMPLOYEE manages EMPLOYEE', shown in Figure 5.20, is implemented in Figure 5.25.

FIGURE 5.25 Implementation of the 1:* 'EMPLOYEE manages EMPLOYEE' recursive relationship

Database name: Ch05_PartCo
Table name: EMPLOYEE_V2

EMP_CODE  EMP_LNAME  EMP_MANAGER
101       Mazwai     102
102       Orincona
103       Jones      102
104       Malherbe   102
105       Robertson  102
106       Deltona    102
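Both implementations above can be queried without any special table structure. As a hedged sketch using Python's sqlite3 module: a recursive common table expression walks the PREREQ rows of Figure 5.24 to collect every direct and indirect prerequisite, and a simple filter on EMP_MANAGER lists the direct reports from Figure 5.25. The function names are hypothetical and only the columns needed for the queries are created.

```python
import sqlite3

# Rows copied from the PREREQ table (Figure 5.24) and EMPLOYEE_V2 (Figure 5.25).
PREREQS = [
    ("CIS-420", "CIS-220"),
    ("QM-261", "MATH-243"),
    ("QM-362", "MATH-243"),
    ("QM-362", "QM-261"),
]
EMPLOYEES = [
    (101, "Mazwai", 102), (102, "Orincona", None), (103, "Jones", 102),
    (104, "Malherbe", 102), (105, "Robertson", 102), (106, "Deltona", 102),
]


def build_db():
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE PREREQ (
            CRS_CODE TEXT NOT NULL,
            PRE_TAKE TEXT NOT NULL,
            PRIMARY KEY (CRS_CODE, PRE_TAKE)
        );
        CREATE TABLE EMPLOYEE_V2 (
            EMP_CODE    INTEGER PRIMARY KEY,
            EMP_LNAME   TEXT NOT NULL,
            EMP_MANAGER INTEGER          -- null for an employee with no manager
        );
        """
    )
    conn.executemany("INSERT INTO PREREQ VALUES (?, ?)", PREREQS)
    conn.executemany("INSERT INTO EMPLOYEE_V2 VALUES (?, ?, ?)", EMPLOYEES)
    return conn


def all_prerequisites(conn, crs_code):
    """Collect direct and indirect prerequisites of a course."""
    rows = conn.execute(
        """
        WITH RECURSIVE needs(code) AS (
            SELECT PRE_TAKE FROM PREREQ WHERE CRS_CODE = ?
            UNION                       -- UNION drops duplicates reached twice
            SELECT p.PRE_TAKE
            FROM PREREQ AS p
            JOIN needs AS n ON p.CRS_CODE = n.code
        )
        SELECT code FROM needs ORDER BY code
        """,
        (crs_code,),
    )
    return [code for (code,) in rows]


def direct_reports(conn, manager_code):
    """List the employees whose EMP_MANAGER is the given employee."""
    rows = conn.execute(
        "SELECT EMP_CODE FROM EMPLOYEE_V2 WHERE EMP_MANAGER = ? ORDER BY EMP_CODE",
        (manager_code,),
    )
    return [code for (code,) in rows]


if __name__ == "__main__":
    db = build_db()
    print(all_prerequisites(db, "QM-362"))
    print(direct_reports(db, 102))
```

MATH-243 is reached twice for QM-362, once directly and once through QM-261; the UNION in the recursive query keeps it from being reported twice.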

5.1.11 Composite Entities

You should recall from Chapter 3, Relational Model Characteristics, that the relational model generally requires the use of 1:* relationships. (You should also recall that the 1..1 relationship has its place, but it should be used with caution and proper justification.) If *:* relationships are encountered, you must create a bridge between the entities that display such relationships. Recall that the bridge entity (also known as a composite entity) is composed of the primary keys of each of the entities to be connected. (An example of such a bridge is shown in Figure 5.26.)

FIGURE 5.26 Converting the *:* relationship into two 1:* relationships

Database name: CH05_Travel_Agent

Table name: BOOKING

BOOKING_NO  CUST_NO  EMP_ID  BOOK_STATUS_CODE  EVENT_ID  HOTEL_ID  FLIGHT_NO  BOOK_TOTAL_COST  BOOKING_DATE
204200      1239986  101     1                                                225              06/04/2019
301200      1239986  102     1                                                90               04/02/2019
401211      4000768  1099    2                                                185              25/05/2019

Table name: TOUR_BOOKING

TOUR_ID  BOOKING_NO  TOUR_DATE
1001     401211      06/07/2019
1002     401211      08/07/2019
1004     204200      03/08/2019
1005     301200      07/09/2019
1001     301200      28/09/2019

Table name: TOUR

TOUr_ iD

TOUr_ NAMe

TOUr_DeSCriPTiON

1001

The

See the

changing

Covent

Garden, the

Total

London Experience

Westminster

of the

guards

London

at Buckingham

Eye, St Pauls

Abbey, the river

Thames

Palace,

TOUr_

TOUr_

TOUr_

PriCe_ ADULT

PriCe_ CHiLD

PriCe_ CON

120

99

99

65

50

55

26

20

20

125

100

115

20

10

20

Cathedral,

and

more.

Meeting Point: 4 Fountain Square. Daily at 08:45 a.m. 1002

London

Visit the Tower of London

Gems

1003

Big

123151

Buckingham

See nine

attractions

Bus

on and

off in

of stops

take

Nairobi

Copyright review

2020 has

Cengage deemed

Learning. that

any

All suppressed

including different

on the first a relaxing

floor

scenic

at 6:15

p.m.

Pick

up from

location/locations

7:45 p.m. Arrive at the the game drive/park

Reserved. content

does

May not

of the Seine

National

Rights

not materially

be

copied, affect

Eiffel

places.

daily

Safari

the

Meet

Daily at 1:00

p.m.

Tower.

Receive

Hop

details

58 Tour Eiffel restaurant,

Park Day Tour

Road.

when booking.

located

Tour

Editorial

nine

Palace

Enjoy dinner at the

Paris Night

1005

Crown Jewels and

go on a boat cruise on the River Thames.

City Tour

1004

and the

Eiffel

Tower,

River cruise.

to

Nairobi

be advised

National

formalities.

then Departs

Park for

Go on escorted

Walk.


As you examine Figure 5.26, note that the composite TOUR_BOOKING entity is existence-dependent on the other two entities; the composition of the TOUR_BOOKING entity is based on the primary keys of the entities that are connected by the composite entity. The composite entity may also contain additional attributes that play no role in the connective process. For example, although the entity must be composed of at least the BOOKING and TOUR primary keys, it may also include additional attributes, in this case the tour date, which uniquely identifies the date on which that instance of the tour will take place for a specific booking.

Finally, keep in mind that the TOUR_BOOKING table's key (TOUR_ID and BOOKING_NO) is composed entirely of the primary keys of the BOOKING and TOUR tables. Therefore, no null entries are possible in the TOUR_BOOKING table's key attributes.

Implementing the small database shown in Figure 5.26 requires that you define the relationships clearly. Specifically, you must know the 1 and the * sides of each relationship, and you must know whether the relationships are mandatory or optional. For example, note the following points:

A TOUR may exist even though no bookings have currently been made for it. Therefore, if you examine Figure 5.27, an optional multiplicity (0..*) should appear on the BOOKING side of the *:* relationship between BOOKING and TOUR.

FIGURE 5.27 The *:* relationship between BOOKING and TOUR

BOOKING (0..*) may_contain TOUR (0..*)

You might argue that, for a tour to exist, at least one BOOKING must be made. Therefore, TOUR is mandatory to BOOKING from a purely conceptual point of view. However, when a new tour is first offered, it will not have had the opportunity to be booked. Therefore, at least initially, TOUR is optional to BOOKING. Note that the practical considerations in the data environment help dictate the use of optionalities. If TOUR is not optional to BOOKING from a database point of view, a booking must be made for the tour to allow it to be included in the database. But that's not how the process actually works. In short, the optionality reflects practice.

The ERD in Figure 5.28 shows that the *:* relationship between BOOKING and TOUR has been decomposed into two 1:* relationships through TOUR_BOOKING. In Figure 5.28, the optionalities have been transferred to TOUR_BOOKING. In other words, it now becomes possible for a TOUR not to occur in TOUR_BOOKING if no customer has actually booked that tour. Because a tour need not occur in TOUR_BOOKING, the TOUR_BOOKING entity becomes optional to TOUR. And because the TOUR_BOOKING entity is created before any bookings have been made, the TOUR_BOOKING entity is also optional to BOOKING.

FIGURE 5.28 A composite entity in an ERD

BOOKING: BOOKING_NO {PK}, EMP_ID {FK1}, CUST_NO {FK2}, BOOK_STATUS_CODE {FK3}, EVENT_ID {FK4}, HOTEL_ID {FK5}, FLIGHT_NO {FK6}, BOOK_TOTAL_COST
TOUR_BOOKING: TOUR_ID {PK} {FK1}, BOOKING_NO {PK} {FK2}, TOUR_DATE
TOUR: TOUR_ID {PK}, TOUR_NAME, TOUR_DESCRIPTION, TOUR_PRICE_ADULT, TOUR_PRICE_CHILD, TOUR_PRICE_CON

BOOKING (1..1) may_contain TOUR_BOOKING (0..*)
TOUR (1..1) has TOUR_BOOKING (0..*)
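The rules stated for the composite entity (a key built from both parent keys, no nulls in the key, and parents that may exist without any bridge rows) can be checked with a small sketch in Python's sqlite3 module. The parent tables are reduced to their key columns, the date is stored as ISO text, and the helper name is_rejected is hypothetical. Note that SQLite, as an engine-specific quirk, needs an explicit NOT NULL on composite key columns.

```python
import sqlite3


def build_bridge():
    """BOOKING and TOUR reduced to their keys, joined by the TOUR_BOOKING bridge."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")  # FK checks are off by default in SQLite
    conn.executescript(
        """
        CREATE TABLE BOOKING (BOOKING_NO INTEGER PRIMARY KEY);
        CREATE TABLE TOUR    (TOUR_ID    INTEGER PRIMARY KEY);
        CREATE TABLE TOUR_BOOKING (
            -- SQLite allows NULL in a composite PRIMARY KEY unless NOT NULL is stated.
            TOUR_ID    INTEGER NOT NULL REFERENCES TOUR (TOUR_ID),
            BOOKING_NO INTEGER NOT NULL REFERENCES BOOKING (BOOKING_NO),
            TOUR_DATE  TEXT,
            PRIMARY KEY (TOUR_ID, BOOKING_NO)
        );
        INSERT INTO BOOKING VALUES (401211);
        INSERT INTO TOUR VALUES (1001), (1003);   -- tour 1003 has no bookings yet
        INSERT INTO TOUR_BOOKING VALUES (1001, 401211, '2019-07-06');
        """
    )
    return conn


def is_rejected(conn, tour_id, booking_no):
    """Try to add a bridge row; report whether the DBMS refused it."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO TOUR_BOOKING (TOUR_ID, BOOKING_NO) VALUES (?, ?)",
                (tour_id, booking_no),
            )
    except sqlite3.IntegrityError:
        return True
    return False


if __name__ == "__main__":
    db = build_bridge()
    print(is_rejected(db, 1001, 401211))  # same key pair again
    print(is_rejected(db, None, 401211))  # null in the key
    print(is_rejected(db, 9999, 401211))  # tour that does not exist
    print(is_rejected(db, 1003, 401211))  # valid new pairing
```

Tour 1003 sits in TOUR with no TOUR_BOOKING rows until the last call, which is the optional participation discussed above expressed in data.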


NOTE
In a UML class diagram, an association class is used to represent a *:* association between two entities. The association class exists within the context of the associated entities and, as in the ER model, can have its own attributes. Figure 5.29 shows the use of the TOUR_BOOKING association class to represent the *:* relationship between BOOKING and TOUR. Note the multiplicities (0..*) on each side of the relationship, which indicate that the participation of both BOOKING and TOUR is optional.

FIGURE 5.29 An association class

BOOKING: BOOKING_NO {PK}, EMP_ID {FK1}, CUST_NO {FK2}, BOOK_STATUS_CODE {FK3}, EVENT_ID {FK4}, HOTEL_ID {FK5}, FLIGHT_NO {FK6}, BOOK_TOTAL_COST
TOUR: TOUR_ID {PK}, TOUR_NAME, TOUR_DESCRIPTION, TOUR_PRICE_ADULT, TOUR_PRICE_CHILD, TOUR_PRICE_CON
TOUR_BOOKING (association class): TOUR_ID {PK} {FK1}, BOOKING_NO {PK} {FK2}, TOUR_DATE

BOOKING (0..*) to TOUR (0..*)

As customers make bookings for specific tours, they will be entered into the TOUR_BOOKING entity. Naturally, if a customer books more than one tour, then that booking number will appear more than once in TOUR_BOOKING. For example, note that in the TOUR_BOOKING table in Figure 5.26, BOOKING_NO = 401211 occurs twice. On the other hand, each customer booking occurs only once in the BOOKING entity. (Note that the BOOKING table in Figure 5.26 has only one BOOKING_NO = 401211 entry.) Therefore, in Figure 5.28, the relationship between BOOKING and TOUR_BOOKING is shown to be 1:* with the * (shown as the multiplicity (0..*)) on the TOUR_BOOKING side.

If you examine the tables shown in Figure 5.26, you will see that a tour can occur more than once in the TOUR_BOOKING table. For example, TOUR_ID = 1001 occurs twice in the TOUR_BOOKING table. However, TOUR_ID = 1001 occurs only once in the TOUR table to reflect that the relationship between TOUR_BOOKING and TOUR is 1:*. Note that, in Figure 5.28, the * (shown as the multiplicity (0..*)) is located on the TOUR_BOOKING side, while the 1 (shown as the multiplicity (1..1)) is located on the TOUR side.
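The occurrence counts discussed above can be confirmed with two aggregate queries. This is a minimal sketch in Python's sqlite3 module using the TOUR_BOOKING rows from Figure 5.26. It assumes that the TOUR table holds tour IDs 1001 to 1005, and the function names are hypothetical.

```python
import sqlite3

# (TOUR_ID, BOOKING_NO) pairs copied from the TOUR_BOOKING table in Figure 5.26.
TOUR_BOOKINGS = [
    (1001, 401211),
    (1002, 401211),
    (1004, 204200),
    (1005, 301200),
    (1001, 301200),
]


def load():
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE TOUR (TOUR_ID INTEGER PRIMARY KEY);
        CREATE TABLE TOUR_BOOKING (
            TOUR_ID    INTEGER NOT NULL,
            BOOKING_NO INTEGER NOT NULL,
            PRIMARY KEY (TOUR_ID, BOOKING_NO)
        );
        """
    )
    # Assumption: the TOUR table lists tours 1001 to 1005.
    conn.executemany("INSERT INTO TOUR VALUES (?)", [(t,) for t in range(1001, 1006)])
    conn.executemany("INSERT INTO TOUR_BOOKING VALUES (?, ?)", TOUR_BOOKINGS)
    return conn


def tours_per_booking(conn):
    """The * side seen from BOOKING: how often each booking number occurs."""
    return conn.execute(
        "SELECT BOOKING_NO, COUNT(*) FROM TOUR_BOOKING "
        "GROUP BY BOOKING_NO ORDER BY BOOKING_NO"
    ).fetchall()


def bookings_per_tour(conn):
    """The * side seen from TOUR; the LEFT JOIN keeps tours with no bookings."""
    return conn.execute(
        """
        SELECT t.TOUR_ID, COUNT(tb.BOOKING_NO)
        FROM TOUR AS t
        LEFT JOIN TOUR_BOOKING AS tb ON tb.TOUR_ID = t.TOUR_ID
        GROUP BY t.TOUR_ID
        ORDER BY t.TOUR_ID
        """
    ).fetchall()


if __name__ == "__main__":
    db = load()
    print(tours_per_booking(db))
    print(bookings_per_tour(db))
```

A count of zero for a tour in the second query corresponds to the lower bound of the (0..*) multiplicity on the TOUR_BOOKING side.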


5.2 DEVELOPING AN ER DIAGRAM

The process of database design is an iterative rather than a linear or sequential process. The verb iterate means to do again or repeatedly. An iterative process is thus one based on repetition of processes and procedures. Building an ERD usually involves the following activities:

Create a detailed narrative of the organisation's description of operations.
Identify the business rules based on the descriptions of operations.
Identify all main entities from the business rules.
Identify all main relationships between entities from the business rules.
Develop an initial ERD.
Determine the multiplicities and the participation of all relationships. Remember, participation involves identifying whether a relationship can be optional or mandatory for each entity.
Identify the primary and foreign keys.
Identify all attributes.
Revise and review the ERD.

During the review process, it is likely that additional objects, attributes and relationships will be uncovered. Therefore, the basic ERM will be modified to incorporate the newly discovered ER components. Subsequently, another round of reviews may yield additional components or clarification of the existing diagram. The process is repeated until the end users and designers agree that the ERD is a fair representation of the organisation's activities and functions.

During the design process, the database designer does not depend simply on interviews to help define entities, attributes and relationships. A surprising amount of information can be gathered by examining the business forms and reports that an organisation uses in its daily operations. In this section, we will use two case studies Tiny University and ILoveHolidays to show the interactive process involved

5.2.1 tiny

in creating

an ERD.

university

case study

To start constructing an ERD, an initial interview is required with the Tiny University administrators. The interview process yields the following business rules:

1

Tiny University (TU) is divided into several schools: a school of business, a school of arts and sciences, a school of education, and a school of applied sciences. Each school is administered by a dean, who is a lecturer who has reached the grade of professor (LECT_GRADE has the value PROF). Keep in mind that each dean can administer only one school. Therefore, a 1:1 relationship exists between LECTURER and SCHOOL. Note that the multiplicity can be expressed by (1..1) for the entity LECTURER and by (0..1) for the entity SCHOOL. (The smallest number of deans per school is one, as is the largest number, and each dean is assigned to only one school.) However, not all lecturers are deans, so we need to ensure that the entity SCHOOL has optional participation.

2


Each school is composed of several departments. For example, the school of business has an accounting department, a management/marketing department, an economics/finance department and a computer information systems department. Note again the cardinality rules: the smallest number of departments operated by a school is one, and the largest number of departments is indeterminate (*). On the other hand, each department belongs to only a single school; thus, the multiplicity is expressed by (1..1). That is, the minimum number of schools that a department belongs to is one, as is the maximum number. Figure 5.30 illustrates these first two business rules.

FIGURE 5.30 The first Tiny University ERD segment

NOTE

It is again appropriate to evaluate the reason for maintaining the 1:1 relationship between LECTURER and SCHOOL in the LECTURER is dean of SCHOOL relationship. It is worth repeating that the existence of 1:1 relationships often indicates a misidentification of attributes as entities. In this case, the 1:1 relationship could easily be eliminated by storing the dean's attributes in the SCHOOL entity. This solution also would make it easier to answer the queries: who is the school's dean, and what are that dean's credentials? The downside of this solution is that it requires the duplication of data that are already stored in the LECTURER table, thus setting the stage for anomalies. However, because each school is run by a single dean, the problem of data duplication is rather minor. The selection of one approach over another often depends on information requirements, transaction speed, and the database designer's professional judgement. In short, do not use 1:1 relationships lightly, and make sure that each 1:1 relationship within the database design is defensible.

3

Each department may offer courses. For example, the management/marketing department offers courses such as Introduction to Management, Principles of Marketing, and Production Management. The ERD segment for this condition is shown in Figure 5.31. Note that this relationship is based on the way Tiny University operates. If, for example, Tiny University had some departments that were classified as research only, those departments would not offer courses; therefore, the COURSE entity would be optional to the DEPARTMENT entity.

FIGURE 5.31 The second Tiny University ERD segment
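The first three business rules can be sketched as table definitions. The following is a minimal sketch, assuming Python's built-in sqlite3 module; the function names, the sample values and the reduced attribute lists are illustrative assumptions rather than part of the case study. It shows one way the 1:1 dean relationship might be held as a unique foreign key in SCHOOL, with the 1:* relationships held as ordinary foreign keys on the "many" side.

```python
import sqlite3


def build_schema(conn):
    """Create a reduced, hypothetical version of the first four entities."""
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript("""
    CREATE TABLE LECTURER (
        LECT_NUM   INTEGER PRIMARY KEY,
        LECT_LNAME TEXT NOT NULL,
        LECT_GRADE TEXT NOT NULL
    );
    CREATE TABLE SCHOOL (
        SCHOOL_CODE TEXT PRIMARY KEY,
        SCHOOL_NAME TEXT NOT NULL,
        -- NOT NULL: every school has exactly one dean (1..1).
        -- UNIQUE: a lecturer is dean of at most one school (0..1).
        LECT_NUM    INTEGER NOT NULL UNIQUE REFERENCES LECTURER(LECT_NUM)
    );
    CREATE TABLE DEPARTMENT (
        DEPT_CODE   TEXT PRIMARY KEY,
        DEPT_NAME   TEXT NOT NULL,
        -- Each department belongs to exactly one school (1..1).
        SCHOOL_CODE TEXT NOT NULL REFERENCES SCHOOL(SCHOOL_CODE)
    );
    CREATE TABLE COURSE (
        CRS_CODE  TEXT PRIMARY KEY,
        CRS_TITLE TEXT NOT NULL,
        -- Each course is offered by exactly one department.
        DEPT_CODE TEXT NOT NULL REFERENCES DEPARTMENT(DEPT_CODE)
    );
    """)


def second_deanship_rejected(conn, school_code, school_name, lect_num):
    """Return True if the database refuses to make lect_num dean of another school."""
    try:
        conn.execute(
            "INSERT INTO SCHOOL VALUES (?, ?, ?)",
            (school_code, school_name, lect_num),
        )
    except sqlite3.IntegrityError:
        return True
    return False


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    build_schema(conn)
    conn.execute("INSERT INTO LECTURER VALUES (1, 'Example', 'PROF')")
    conn.execute("INSERT INTO SCHOOL VALUES ('BUS', 'School of Business', 1)")
    print(second_deanship_rejected(conn, "EDU", "School of Education", 1))
```

The sketch does not enforce every rule: the requirement that a dean holds the grade PROF, and the minimum of one department per school, would need additional checks in the application or in triggers.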

4

A CLASS is a section of a COURSE. That is, a department may offer several sections (classes) of the same database course. Each of those classes is taught by a lecturer at a given time in a given place. In short, a 1:* relationship exists between COURSE and CLASS. However, because a course may exist in Tiny University's course catalogue even when it is not offered as a class in a current class schedule, CLASS is optional to COURSE. Therefore, the relationship between COURSE and CLASS can look like that shown in Figure 5.32.

FIGURE 5.32 The third Tiny University ERD segment

5

Each department may have lecturers assigned to it. One of the lecturers whose grade is a professor (LECT_GRADE = PROF) chairs the department. Only one of the lecturers can chair the department to which (s)he is assigned, and no lecturer is required to accept the chair position. Therefore, DEPARTMENT is optional to LECTURER in the chairs relationship. Those relationships are summarised in the ER segments shown in Figure 5.33.

FIGURE 5.33 The fourth Tiny University ERD segment

[Diagram: DEPARTMENT (1..1) employs LECTURER (0..*); LECTURER (1..1) chairs DEPARTMENT (0..1). LECTURER: LECT_NUM {PK}, DEPT_CODE {FK}, LECT_SPECIALITY, LECT_GRADE, LECT_LNAME, LECT_FNAME, LECT_INITIAL, LECT_EMAIL. DEPARTMENT: DEPT_CODE {PK}, LECT_NUM {FK1}, DEPT_NAME.]

6

Each lecturer may teach up to four classes; each class is a section of a course. A lecturer may also be on a research contract and teach no classes at all. The ERD segments in Figure 5.34 depict those conditions.

FIGURE 5.34 The fifth Tiny University ERD segment

7

A student may enrol in several classes, but (s)he takes each class only once during any given enrolment period. For example, during the current enrolment period, a student may decide to take five classes (Statistics, Accounting, English, Database and History), but that student would never be enrolled in the same Statistics class five times during the enrolment period! Each student may enrol in up to six classes, and each class may have up to 35 students, thus creating a *:* relationship between STUDENT and CLASS. Because a CLASS can initially exist (at the start of the enrolment period) even though no students have enrolled in it, STUDENT is optional to CLASS in the *:* relationship. This *:* relationship must be divided into two 1:* relationships through the use of the ENROL entity, shown in the ERD segment in Figure 5.35. But note that the optional participation is shown next to ENROL. If a class exists but has no students enrolled in it, that class never occurs in the ENROL table. Note also that the ENROL entity is weak: it is existence-dependent, and its (composite) PK is composed of the PKs of the STUDENT and CLASS entities. You can add the multiplicities (0..6) and (0..35) next to the ENROL entity to reflect the business rule constraints, as shown in Figure 5.35.

FIGURE 5.35 The sixth Tiny University ERD segment

[Diagram: STUDENT (1..1) is_written_in ENROL (0..6); CLASS (1..1) is_found_in ENROL (0..35). STUDENT: STU_NUM {PK}, STU_FNAME, STU_LNAME, STU_INITIAL, STU_EMAIL. ENROL: STU_NUM {PK}{FK1}, CLASS_CODE {PK}{FK2}, ENROL_DATE, ENROL_GRADE. CLASS: CLASS_CODE {PK}, CLASS_TIME.]
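The way the ENROL entity resolves the *:* relationship can be illustrated with a small example. This is a minimal sketch, assuming Python's built-in sqlite3 module; the function names, the constant names and the reduced attribute lists are illustrative assumptions, and the enrolment period itself is not modelled. The composite primary key prevents a student from being enrolled in the same class twice, while the (0..6) and (0..35) limits are checked in application code before the row is inserted.

```python
import sqlite3

MAX_CLASSES_PER_STUDENT = 6   # upper bound of the (0..6) multiplicity
MAX_STUDENTS_PER_CLASS = 35   # upper bound of the (0..35) multiplicity


def build_enrol_schema(conn):
    """Create reduced STUDENT and CLASS tables plus the ENROL bridge table."""
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript("""
    CREATE TABLE STUDENT (
        STU_NUM   INTEGER PRIMARY KEY,
        STU_LNAME TEXT NOT NULL
    );
    CREATE TABLE CLASS (
        CLASS_CODE TEXT PRIMARY KEY,
        CLASS_TIME TEXT
    );
    CREATE TABLE ENROL (
        STU_NUM     INTEGER NOT NULL REFERENCES STUDENT(STU_NUM),
        CLASS_CODE  TEXT NOT NULL REFERENCES CLASS(CLASS_CODE),
        ENROL_DATE  TEXT,
        ENROL_GRADE TEXT,
        -- Composite PK built from the PKs of STUDENT and CLASS.
        PRIMARY KEY (STU_NUM, CLASS_CODE)
    );
    """)


def enrol(conn, stu_num, class_code):
    """Insert an ENROL row; return False if a business rule or key blocks it."""
    (taken,) = conn.execute(
        "SELECT COUNT(*) FROM ENROL WHERE STU_NUM = ?", (stu_num,)
    ).fetchone()
    (seats,) = conn.execute(
        "SELECT COUNT(*) FROM ENROL WHERE CLASS_CODE = ?", (class_code,)
    ).fetchone()
    if taken >= MAX_CLASSES_PER_STUDENT or seats >= MAX_STUDENTS_PER_CLASS:
        return False
    try:
        conn.execute(
            "INSERT INTO ENROL (STU_NUM, CLASS_CODE) VALUES (?, ?)",
            (stu_num, class_code),
        )
    except sqlite3.IntegrityError:
        # Duplicate enrolment, or an unknown student or class.
        return False
    return True
```

A class with no students simply has no rows in ENROL, which matches the optional participation described above.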

8

Each department has several (hopefully many) students whose major is offered by that department. However, each student has only a single major and is, therefore, associated with a single department. (See Figure 5.36.) However, in the Tiny University environment, it is possible, at least for a while, for a student not to declare a major field of study. Such a student would not be associated with a department; therefore, DEPARTMENT is optional to STUDENT. It is worth repeating that the relationships between entities and the entities themselves reflect the organisation's operating environment. That is, the business rules define the ERD components.

FIGURE 5.36 The seventh Tiny University ERD segment

9

Each student has an advisor in his or her department; each advisor counsels several students. An advisor is also a lecturer, but not all lecturers advise students. Therefore, STUDENT is optional to LECTURER in the LECTURER advises STUDENT relationship. (See Figure 5.37.)

FIGURE 5.37 The eighth Tiny University ERD segment

[Diagram: LECTURER (1..1) advises STUDENT (0..*). LECTURER: LECT_NUM {PK}, LECT_SPECIALITY, LECT_RANK, LECT_LNAME, LECT_FNAME, LECT_INITIAL, LECT_EMAIL, LECT_GRADE. STUDENT: STU_NUM {PK}, LECT_NUM {FK1}, STU_FNAME, STU_LNAME, STU_INITIAL, STU_EMAIL.]

10

When you examine the CLASS entity in Figure 5.38, you'll note that this entity contains a ROOM_CODE attribute. Given the naming conventions, it is clear that ROOM_CODE is a FK to another entity. Clearly, because a class is taught in a room, it is reasonable to assume that the ROOM_CODE in CLASS is the FK to an entity named ROOM. In turn, each room is located in a building. So the last Tiny University ERD is created by observing that a BUILDING can contain many ROOMs, but each ROOM is found in a single BUILDING. (See Figure 5.38.) In this ERD segment, it is clear that some buildings do not contain (class) rooms. For example, a storage building might not contain any named rooms at all.

FIGURE 5.38 The ninth Tiny University ERD segment

Using the preceding summary, you can identify the following entities:

SCHOOL
COURSE
DEPARTMENT
CLASS
LECTURER
STUDENT
BUILDING
ROOM
ENROL (the bridge entity between STUDENT and CLASS)

Once you have discovered the relevant entities, you can define the initial set of relationships among them. Next, you describe the entity attributes. Identifying the attributes of the entities helps you better understand the relationships among entities. Table 5.4 summarises the ERM's components, and names the entities and their relations.

TABLE 5.4 Components of the ERM

Entity        Relationship    Connectivity    Entity
SCHOOL        operates        1..*            DEPARTMENT
DEPARTMENT    has             1..*            STUDENT
DEPARTMENT    employs         1..*            LECTURER
DEPARTMENT    offers          1..*            COURSE
COURSE        generates       1..*            CLASS
LECTURER      is dean of      1..1            SCHOOL
LECTURER      chairs          1..1            DEPARTMENT
LECTURER      teaches         1..*            CLASS
LECTURER      advises         1..*            STUDENT
STUDENT       enrols in       1..*            CLASS
BUILDING      contains        1..*            ROOM
ROOM          is used for     1..*            CLASS

Note: ENROL is the composite entity that implements the relationship STUDENT enrols in CLASS.

You must also define the connectivity and cardinality for the just-discovered relations by querying the end user extensively. Having defined the ERM's components, you can now draw the ERD, depicted in Figure 5.39. Actually, the entity attributes and their domains should also be displayed in the conceptual diagram, or ERD. However, to avoid crowding the diagram, the entity attributes may be depicted separately.

FIGURE 5.39 The completed Tiny University ERD segment

[Diagram: the entities LECTURER, SCHOOL, DEPARTMENT, STUDENT, COURSE, CLASS, ENROL, ROOM and BUILDING, connected by the relationships is_dean_of, operates, employs, chairs, has, advises, teaches, offers, generates, is_written_in, is_found_in, is_used_for and contains.]

5.2.2 ILoveHolidays

ILoveHolidays is a small international company that owns a number of independent travel agencies in a number of countries. The travel agencies specialise in booking complete holidays, hotels, flights, tours and one-off events. They also offer information to customers on attractions and places of interest in a number of cities worldwide. From interviews with various stakeholders and employees, the following business rules have been established:

1

Each travel agent has a number of employees who each have an associated grade (Manager, Deputy Manager or Staff). Each travel agent must have one employee who takes the role of Manager. Therefore a 1:* mandatory relationship exists between TRAVEL_AGENT and EMPLOYEE. Figure 5.40 illustrates this first business rule.

FIGURE 5.40 Segment 1: The TRAVEL_AGENT-EMPLOYEE relationship

[Diagram: TRAVEL_AGENT (1..1) employs EMPLOYEE (1..*). TRAVEL_AGENT: AGENT_ID {PK}, AGENT_NAME, AGENT_ADDRESS, AGENT_PHONE, AGENT_EMAIL. EMPLOYEE: EMP_ID {PK}, PAYROLL_NO, AGENT_ID {FK1}, EMP_LNAME, EMP_FNAME, EMP_PHONE, EMP_GRADE.]

2

Each employee may make bookings on behalf of customers when they visit one of the travel agencies. However, some employees, such as the Manager, may be confined to back office duties and may not make a booking. This is why BOOKING is optional to EMPLOYEE. A booking can only exist if it has been made by an employee. The ERD segment is shown in Figure 5.41 and shows the 1:* relationship that exists between EMPLOYEE and BOOKING.

FIGURE 5.41 Segment 2: The EMPLOYEE-BOOKING relationship

[Diagram: EMPLOYEE (1..1) makes BOOKING (0..*). EMPLOYEE: EMP_ID {PK}, PAYROLL_NO, AGENT_ID {FK1}, EMP_LNAME, EMP_FNAME, EMP_PHONE, EMP_GRADE. BOOKING: BOOKING_NO {PK}, EMP_ID {FK1}, CUST_NO {FK2}, BOOK_STATUS_CODE {FK3}, EVENT_ID {FK4}, HOTEL_ID {FK5}, FLIGHT_NO {FK6}, BOOK_TOTAL_COST, BOOKING_DATE.]

3

Figure 5.42 shows the relationships between the CUSTOMER, BOOKING and PARTY_MEMBERS entities. A customer can make one or more bookings at any of the travel agencies. Each booking could be for at least the customer himself or herself but may also include other people such as family or friends. In this scenario the customer becomes the lead traveller and is responsible for the overall booking. This is represented by the optional 1:* relationship between CUSTOMER and PARTY_MEMBERS. A booking is for only one customer, and the booking party may include one or more party members at the same time. The relationship between CUSTOMER and BOOKING is 1:*, and the relationship between PARTY_MEMBERS and BOOKING is therefore *:1.

FIGURE 5.42 Segment 3: The CUSTOMER-BOOKING-PARTY_MEMBERS relationship

[Diagram: CUSTOMER, BOOKING and PARTY_MEMBERS connected by the relationships makes, party and has. CUSTOMER: CUST_NO {PK}, CUST_FNAME, CUST_LNAME, CUST_ADDRESS, CUST_DOB, CUST_PHONE, CUST_EMAIL, OUT_SEAT_NO, IN_SEAT_NO. PARTY_MEMBERS: CUST_NO {PK}{FK1}, BOOKING_NO {PK}{FK2}, PARTY_FNAME, PARTY_LNAME, PARTY_DOB, OUT_SEAT_NO, IN_SEAT_NO. BOOKING: BOOKING_NO {PK}, EMP_ID {FK1}, CUST_NO {FK2}, BOOK_STATUS_CODE {FK3}, EVENT_ID {FK4}, HOTEL_ID {FK5}, FLIGHT_NO {FK6}, BOOK_TOTAL_COST, BOOKING_DATE.]

4

Each BOOKING is assigned a BOOKING_STATUS_CODE. These codes allow the travel agencies to track the status of the booking. When a customer makes a booking, he or she must pay a deposit. The booking status code is then set to Deposit Paid. Once the cost of the booking is paid in full, the booking status code changes to Fully Paid. Booking status codes also exist if the booking is cancelled and when the booking is complete, which is set after the customer has completed his or her travel plans. Figure 5.43 shows the 1:* relationship between BOOKING_STATUS_CODE and BOOKING where BOOKING is optional to BOOKING_STATUS_CODE.

FIGURE 5.43 Segment 4: The BOOKING-BOOKING_STATUS_CODE relationship

[Diagram: BOOKING_STATUS_CODE (1..1) to BOOKING (0..*). BOOKING_STATUS_CODE: BOOKING_STATUS_CODE {PK}, DESCRIPTION. BOOKING: BOOKING_NO {PK}, EMP_ID {FK1}, CUST_NO {FK2}, BOOKING_STATUS_CODE {FK3}, EVENT_ID {FK4}, HOTEL_ID {FK5}, FLIGHT_NO {FK6}, BOOKING_TOTAL_COST, BOOKING_DATE.]
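The status-code rule above can be illustrated as a lookup table referenced by BOOKING. This is a minimal sketch, assuming Python's built-in sqlite3 module; the three-letter code values (DEP, FUL, CAN, COM), the function names and the reduced BOOKING attribute list are hypothetical and are not given in the case study. The foreign key means a booking can only ever carry a status that exists in BOOKING_STATUS_CODE, while a status code with no bookings is still allowed.

```python
import sqlite3


def build_status_schema(conn):
    """Create the status lookup table and a reduced BOOKING table."""
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript("""
    CREATE TABLE BOOKING_STATUS_CODE (
        BOOKING_STATUS_CODE TEXT PRIMARY KEY,
        DESCRIPTION         TEXT NOT NULL
    );
    CREATE TABLE BOOKING (
        BOOKING_NO          INTEGER PRIMARY KEY,
        -- Every booking carries exactly one status code (1..1).
        BOOKING_STATUS_CODE TEXT NOT NULL
            REFERENCES BOOKING_STATUS_CODE(BOOKING_STATUS_CODE),
        BOOKING_TOTAL_COST  INTEGER NOT NULL
    );
    -- Hypothetical code values; the descriptions follow the business rule.
    INSERT INTO BOOKING_STATUS_CODE VALUES
        ('DEP', 'Deposit Paid'),
        ('FUL', 'Fully Paid'),
        ('CAN', 'Cancelled'),
        ('COM', 'Complete');
    """)


def set_status(conn, booking_no, code):
    """Change a booking's status; return False if the code is not in the lookup table."""
    try:
        conn.execute(
            "UPDATE BOOKING SET BOOKING_STATUS_CODE = ? WHERE BOOKING_NO = ?",
            (code, booking_no),
        )
    except sqlite3.IntegrityError:
        return False
    return True


def current_status(conn, booking_no):
    """Return the description of the booking's current status."""
    row = conn.execute(
        """SELECT s.DESCRIPTION
           FROM BOOKING b
           JOIN BOOKING_STATUS_CODE s
             ON s.BOOKING_STATUS_CODE = b.BOOKING_STATUS_CODE
           WHERE b.BOOKING_NO = ?""",
        (booking_no,),
    ).fetchone()
    return row[0] if row else None
```

Keeping the descriptions in a separate table means a new status can be introduced by adding a row rather than by changing the BOOKING table.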

5

ILoveHolidays also sells tickets for a number of events such as the Monaco Grand Prix or the Wimbledon Tennis Championships. Figure 5.44 shows the relationship between EVENT and BOOKING. Both sides of the relationship allow for optional participation. That is because a booking may or may not be for an event and an event may or may not be booked by a customer. Regardless, the travel agencies will keep details of all events offered within their database.

FIGURE 5.44 Segment 5: The BOOKING-EVENT relationship

[Diagram: BOOKING (0..*) may_contain EVENT (0..1). EVENT: EVENT_ID {PK}, EVENT_DESCRIPTION, EVENT_PRICE_ADULT, EVENT_PRICE_CHILD, EVENT_PRICE_CON, EVENT_DATE. BOOKING: BOOKING_NO {PK}, EMP_ID {FK1}, CUST_NO {FK2}, BOOKING_STATUS_CODE {FK3}, EVENT_ID {FK4}, HOTEL_ID {FK5}, FLIGHT_NO {FK6}, BOOKING_TOTAL_COST, BOOKING_DATE.]

6

A booking made by a customer may be for a number of tours. For example, a customer visiting Sydney in Australia may book separate tours to visit the Sydney Opera House, see the Blue Mountains and spend a day on Bondi Beach. Details of all tours offered by the travel agencies are stored in the TOUR table (entity). The initial relationship between BOOKING and TOUR is *:*, but this relationship must be divided into two 1:* relationships through the use of the TOUR_BOOKING entity, as shown in Figure 5.45. Note that optional participation is shown next to TOUR_BOOKING: a TOUR can exist without a BOOKING being made, and if a tour exists that no one ever books, it can never appear in the TOUR_BOOKING table. TOUR_BOOKING is also a weak entity; it is existence-dependent and has a composite PK composed of the PKs of the BOOKING and TOUR entities.

FIGURE 5.45 Segment 6: The BOOKING-TOUR_BOOKING-TOUR relationship

[Diagram: BOOKING (1..1) may_contain TOUR_BOOKING (0..*); TOUR (1..1) has TOUR_BOOKING (0..*). TOUR_BOOKING: TOUR_ID {PK}{FK1}, BOOKING_NO {PK}{FK2}, TOUR_DATE, TOTAL_TOUR_COST. TOUR: TOUR_ID {PK}, TOUR_NAME, TOUR_DESCRIPTION, TOUR_PRICE_ADULT, TOUR_PRICE_CHILD, TOUR_PRICE_CON.]

7

Figure 5.46 shows the relationship between TOUR, ATTRACT_TOUR and ATTRACTION. A tour may comprise visits to a number of attractions and at the same time different combinations of attractions may be offered on different tours. This means that, initially, a *:* relationship existed between TOUR and ATTRACTION, which needed to be resolved by the addition of the weak entity ATTRACT_TOUR. An attraction may exist without belonging to a tour and therefore the travel agencies would be able to provide information to the customer about the attraction such as travel instructions. This is a specific requirement of ILoveHolidays in order to provide additional help to customers and exceed expectations. Note that ATTRACT_TOUR has a composite PK comprising the PKs from TOUR and ATTRACTION.

FIGURE 5.46 Segment 7: The TOUR-ATTRACT_TOUR-ATTRACTION relationship

[Diagram: TOUR (1..1) may_contain ATTRACT_TOUR (0..*); ATTRACTION (1..1) may_be_visited ATTRACT_TOUR (0..*). TOUR: TOUR_ID {PK}, TOUR_NAME, TOUR_DESCRIPTION, TOUR_PRICE_ADULT, TOUR_PRICE_CHILD, TOUR_PRICE_CON. ATTRACT_TOUR: TOUR_ID {PK}{FK1}, ATTRACTION_NO {PK}{FK2}. ATTRACTION: ATTRACTION_NO {PK}, CITY_ID {FK}, ATTRACT_TYPE, ATTRACT_NAME, ATTRACT_WEBSITE, ATTRACT_PHONE, ATTRACT_OPENING_TIME, ATTRACT_CLOSING_TIME, ATTRACT_ADDRESS, ATTRACT_TRAVEL_INSTRUCT, ATTRACT_COST_ADULT, ATTRACT_COST_CHILD, ATTRACT_COST_CON.]

8

Each booking may be for one hotel. Hotels that exist in the HOTEL table may never be booked or may be booked on multiple occasions. This business rule is illustrated in Figure 5.47, which shows the relationship between the BOOKING and HOTEL entities. Note that both sides of the relationship are optional.

FIGURE 5.47 Segment 8: The BOOKING-HOTEL relationship

[Diagram: BOOKING (0..*) consists_of HOTEL (0..1). HOTEL: HOTEL_ID {PK}, HOTEL_NAME, HOTEL_STARS, HOTEL_PHONE, HOTEL_EMAIL, HOTEL_ADDRESS, CITY_ID {FK}, DOUBLE_ROOM_PRICE, FAMILY_ROOM_PRICE, SINGLE_ROOM_PRICE, HOTEL_NO_NIGHTS, HOTEL_DATE. BOOKING: BOOKING_NO {PK}, EMP_ID {FK1}, CUST_NO {FK2}, BOOKING_STATUS_CODE {FK3}, EVENT_ID {FK4}, HOTEL_ID {FK5}, FLIGHT_NO {FK6}, BOOKING_TOTAL_COST, BOOKING_DATE.]

9

A booking may be for one specific flight and a specific flight may be on many bookings. The relationship between BOOKING and FLIGHT is therefore *:1 with both sides being optional, as can be seen in Figure 5.48.

FIGURE 5.48 Segment 9: The BOOKING-FLIGHT relationship

[Diagram: BOOKING (0..*) consists_of FLIGHT (0..1). FLIGHT: FLIGHT_NO {PK}, AIRLINE, FLIGHT_DEPART_AIRPORT, FLIGHT_ARRIVE_AIRPORT, FLIGHT_DEPART_DATETIME, FLIGHT_ARRIVE_DATETIME, FLIGHT_COST. BOOKING: BOOKING_NO {PK}, EMP_ID {FK1}, CUST_NO {FK2}, BOOKING_STATUS_CODE {FK3}, EVENT_ID {FK4}, HOTEL_ID {FK5}, FLIGHT_NO {FK6}, BOOKING_TOTAL_COST, BOOKING_DATE.]

10

In order for employees to search for attractions in any given city, ILoveHolidays wishes to store details of what attractions exist in each city. An attraction exists in one and only one city whilst a city may have any number of attractions. The relationship between ATTRACTION and CITY is shown in Figure 5.49.

FIGURE 5.49 Segment 9: The ATTRACTION-CITY relationship

[Diagram: ATTRACTION (0..*) exists_in CITY (1..1). ATTRACTION: ATTRACTION_NO {PK}, CITY_ID {FK}, ATTRACT_TYPE, ATTRACT_NAME, ATTRACT_WEBSITE, ATTRACT_PHONE, ATTRACT_OPENING_TIME, ATTRACT_CLOSING_TIME, ATTRACT_ADDRESS, ATTRACT_TRAVEL_INSTRUCT, ATTRACT_COST_ADULT, ATTRACT_COST_CHILD, ATTRACT_COST_CON. CITY: CITY_ID {PK}, COUNTRY_ID {FK}, CITY_NAME, LOCAL_WEBSITE.]

11

In order to deal with more detailed enquiries from customers about which cities to visit in a given country (for example, based upon the number of attractions in each city), it is necessary to model a relationship between cities and their associated country. One country has one or more cities whilst a city can only exist in one country (see Figure 5.50).

FIGURE 5.50 Segment 10: The CITY-COUNTRY relationship

[Diagram: COUNTRY (1..1) has CITY (1..*). CITY: CITY_ID {PK}, COUNTRY_ID {FK}, CITY_NAME, LOCAL_WEBSITE. COUNTRY: COUNTRY_ID {PK}, COUNTRY_NAME, TOURISM_WEB_SITE, MAIN_LANGUAGE.]

12

Each city where a customer would like to stay will hopefully have a selection of hotels that are available. To allow the travel agencies to search for hotels by city, two entities, CITY and HOTEL, are required. Figure 5.51 shows the 1:* relationship between HOTEL and CITY. Notice that HOTEL is optional in the relationship as a city may not have any hotels that are deemed good enough for the travel agencies to recommend (and therefore would not be included in the HOTEL table).

FIGURE 5.51 Segment 11: The HOTEL-CITY relationship

[Diagram: HOTEL (0..*) exists_in CITY (1..1). HOTEL: HOTEL_ID {PK}, HOTEL_NAME, HOTEL_STARS, HOTEL_PHONE, HOTEL_EMAIL, HOTEL_ADDRESS, CITY_ID {FK}, DOUBLE_ROOM_PRICE, FAMILY_ROOM_PRICE, SINGLE_ROOM_PRICE, HOTEL_NO_NIGHTS, HOTEL_DATE. CITY: CITY_ID {PK}, COUNTRY_ID {FK}, CITY_NAME, LOCAL_WEBSITE.]

13

A customer makes at least one or, in some cases, many payments in order to pay off the total cost of their booking. The relationship between CUSTOMER and PAYMENT (shown in Figure 5.52) is mandatory on both sides as a customer must make at least one payment and one payment must be associated with a CUSTOMER.

FIGURE 5.52 Segment 12: The CUSTOMER-PAYMENT relationship

[Diagram: CUSTOMER (1..1) provides PAYMENT (1..*). CUSTOMER: CUST_NO {PK}, CUST_FNAME, CUST_LNAME, CUST_ADDRESS, CUST_DOB, CUST_PHONE, CUST_EMAIL, OUT_SEAT_NO, IN_SEAT_NO. PAYMENT: PAYMENT_NO {PK}, CUST_NO {FK1}, INVOICE_NO {FK2}, AMOUNT_PAID, PAYMENT_TYPE, DATE_PAID.]

14

A booking will generate at least one invoice but may generate many invoices. This will depend on whether the customer chooses to pay for his or her booking all at once. In this case, only one invoice will be produced. Otherwise, several invoices may need to be generated for a specific booking. The 1:* mandatory relationship between BOOKING and INVOICE can be seen in Figure 5.53.

FIGURE 5.53 Segment 13: The BOOKING-INVOICE relationship

[Diagram: BOOKING (1..1) generates INVOICE (1..*). INVOICE: INVOICE_NO {PK}, BOOKING_NO {FK1}, INVOICE_DATE, INVOICE_BALANCE. BOOKING: BOOKING_NO {PK}, EMP_ID {FK1}, CUST_NO {FK2}, BOOKING_STATUS_CODE {FK3}, EVENT_ID {FK4}, HOTEL_ID {FK5}, FLIGHT_NO {FK6}, BOOKING_TOTAL_COST, BOOKING_DATE.]

15

Figure 5.54 shows the 1:* relationship between INVOICE and PAYMENT. Similar to the relationship between BOOKING and INVOICE, PAYMENT and INVOICE also participate in a mandatory relationship. One invoice may be paid by a number of payments whilst one payment is assigned to reduce the balance on one invoice.

FIGURE 5.54 Segment 14: The INVOICE-PAYMENT relationship

[Diagram: INVOICE (1..1) is_paid_by PAYMENT (1..*). INVOICE: INVOICE_NO {PK}, BOOKING_NO {FK1}, INVOICE_DATE, INVOICE_BALANCE. PAYMENT: PAYMENT_NO {PK}, CUST_NO {FK1}, INVOICE_NO {FK2}, AMOUNT_PAID, PAYMENT_TYPE, DATE_PAID.]
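How one payment reduces the balance on one invoice can be sketched as follows. This is a minimal sketch, assuming Python's built-in sqlite3 module; the function names are hypothetical, the attribute lists are reduced (the CUST_NO foreign key and the BOOKING table are omitted), and amounts are held as whole pence, which is an assumption made here to keep the arithmetic exact.

```python
import sqlite3


def build_billing_schema(conn):
    """Create reduced INVOICE and PAYMENT tables (1:* from INVOICE to PAYMENT)."""
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript("""
    CREATE TABLE INVOICE (
        INVOICE_NO      INTEGER PRIMARY KEY,
        BOOKING_NO      INTEGER NOT NULL,
        INVOICE_DATE    TEXT,
        INVOICE_BALANCE INTEGER NOT NULL   -- amount still owed, in pence
    );
    CREATE TABLE PAYMENT (
        PAYMENT_NO   INTEGER PRIMARY KEY,
        -- Each payment is assigned to exactly one invoice.
        INVOICE_NO   INTEGER NOT NULL REFERENCES INVOICE(INVOICE_NO),
        AMOUNT_PAID  INTEGER NOT NULL CHECK (AMOUNT_PAID > 0),
        PAYMENT_TYPE TEXT,
        DATE_PAID    TEXT
    );
    """)


def record_payment(conn, invoice_no, amount):
    """Store a payment and reduce the invoice balance in one transaction.

    Returns the new balance. Raises sqlite3.IntegrityError if the invoice
    does not exist or the amount is not positive; nothing is stored then.
    """
    with conn:  # commits on success, rolls back if an exception is raised
        conn.execute(
            "INSERT INTO PAYMENT (INVOICE_NO, AMOUNT_PAID) VALUES (?, ?)",
            (invoice_no, amount),
        )
        conn.execute(
            "UPDATE INVOICE SET INVOICE_BALANCE = INVOICE_BALANCE - ? "
            "WHERE INVOICE_NO = ?",
            (amount, invoice_no),
        )
    (balance,) = conn.execute(
        "SELECT INVOICE_BALANCE FROM INVOICE WHERE INVOICE_NO = ?",
        (invoice_no,),
    ).fetchone()
    return balance
```

For example, an invoice with a balance of 120000 pence that receives payments of 20000 and then 100000 ends with a balance of 0, and both payments remain linked to that invoice through INVOICE_NO.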


We have now completed all the segments of the ILoveHolidays ERD. Now that we have defined all the components, we can draw the completed conceptual ERD as shown in Figure 5.55.

FIGURE 5.55 Final ILoveHolidays ERD

[Figure: UML class diagram combining all segments. Entities shown: TRAVEL_AGENT, EMPLOYEE, CUSTOMER, PARTY_MEMBERS, BOOKING, BOOKING_STATUS_CODE, INVOICE, PAYMENT, EVENT, FLIGHT, HOTEL, TOUR_BOOKING, TOUR, ATTRACT_TOUR, ATTRACTION, CITY and COUNTRY.]


5.3 DATABASE DESIGN CHALLENGES: CONFLICTING GOALS

Database designers often need to make design compromises that are triggered by conflicting goals, such as adherence to design standards (design elegance), processing speed and information requirements.

The database design must conform to design standards. Such standards have guided you in developing logical structures that minimise data redundancies, thereby minimising the likelihood that destructive data anomalies will occur. You have also learnt how standards prescribed avoiding nulls to the greatest extent possible. In fact, you have learnt that design standards govern the presentation of all components within the database design. In short, design standards allow you to work with well-defined components and to evaluate the interaction of those components with some precision. Without design standards, it is nearly impossible to formulate a proper design process, to evaluate an existing design, or to trace the likely logical impact of changes in design.

In many organisations, particularly those generating large numbers of transactions, high processing speeds are often a top priority in database design. High processing speed means minimal access time, which may be achieved by minimising the number and complexity of logically desirable relationships. For example, a perfect design might use a 1:1 relationship to avoid nulls, while a higher-transaction-speed design might combine the two tables to avoid the use of an additional relationship, using dummy entries to avoid the nulls. If the focus is on data-retrieval speed, you might also be forced to include derived attributes in the design.

The quest for timely information might be the focus of database design. Complex information requirements may dictate data transformations, and they may expand the number of entities and attributes within the design. Therefore, the database may have to sacrifice some of its clean design structures and/or some of its high transaction speed to ensure maximum information generation. For example, suppose that a detailed sales report must be generated periodically. The sales report includes all invoice subtotals, taxes and totals; even the invoice lines include subtotals. If the sales report includes hundreds of thousands (or even millions) of invoices, computing the totals, taxes and subtotals is likely to take some time. If those computations had been made and the results had been stored as derived attributes in the INVOICE and LINE tables at the time of the transaction, the real-time transaction speed might have declined, but that loss of speed would only be noticeable if there had been many simultaneous transactions. The cost of a slight loss of transaction speed at the front end and the addition of multiple derived attributes is likely to pay off when the sales reports are generated (not to mention the fact that it will be simpler to generate the queries).

Another issue that needs to be borne in mind, if derived values are used to improve performance, is data integrity. Should the values from which the derived value is calculated change, then triggers need to be in place to ensure that the derived values are automatically updated. Failing to do this would result in data integrity issues. As a rule, you should first strive for a design that has integrity before attempting to denormalise the design for performance reasons. Once a normalised design is in place, issues around improving performance by merging tables, including derived values, etc., can be included.

A design that meets all logical requirements and design conventions is an important goal. However, if this perfect design fails to meet the customer's transaction speed and/or information requirements, the designer will not have done a proper job from the end user's point of view. Compromises are a fact of life in the real world of database design.

Even as the designer focuses on the entities, attributes, relationships and constraints, he or she should begin thinking about end-user requirements such as performance, security, shared access and data integrity. The designer must consider processing requirements and verify that all update, retrieval and deletion options are available. Finally, a design is of little value unless the end product is capable of delivering all specified query and reporting requirements.
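The point made in this section about derived attributes and triggers can be illustrated with a small sketch. It assumes SQLite (via Python's sqlite3 module) and hypothetical column names (INV_NUMBER, INV_TOTAL, LINE_NUMBER, LINE_UNITS, LINE_PRICE); only the table names INVOICE and LINE come from the text. The derived total is stored in INVOICE and three triggers keep it in step with the LINE rows. The sketch also assumes that a line never moves from one invoice to another.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE INVOICE (
    INV_NUMBER INTEGER PRIMARY KEY,
    INV_TOTAL  REAL NOT NULL DEFAULT 0   -- derived: sum of the invoice's line amounts
);
CREATE TABLE LINE (
    INV_NUMBER  INTEGER NOT NULL REFERENCES INVOICE (INV_NUMBER),
    LINE_NUMBER INTEGER NOT NULL,
    LINE_UNITS  INTEGER NOT NULL,
    LINE_PRICE  REAL NOT NULL,
    PRIMARY KEY (INV_NUMBER, LINE_NUMBER)
);

-- Recompute the stored total whenever a line is added, changed or removed.
CREATE TRIGGER LINE_AFTER_INSERT AFTER INSERT ON LINE
BEGIN
    UPDATE INVOICE
       SET INV_TOTAL = (SELECT COALESCE(SUM(LINE_UNITS * LINE_PRICE), 0)
                          FROM LINE WHERE INV_NUMBER = NEW.INV_NUMBER)
     WHERE INV_NUMBER = NEW.INV_NUMBER;
END;
CREATE TRIGGER LINE_AFTER_UPDATE AFTER UPDATE ON LINE
BEGIN
    UPDATE INVOICE
       SET INV_TOTAL = (SELECT COALESCE(SUM(LINE_UNITS * LINE_PRICE), 0)
                          FROM LINE WHERE INV_NUMBER = NEW.INV_NUMBER)
     WHERE INV_NUMBER = NEW.INV_NUMBER;
END;
CREATE TRIGGER LINE_AFTER_DELETE AFTER DELETE ON LINE
BEGIN
    UPDATE INVOICE
       SET INV_TOTAL = (SELECT COALESCE(SUM(LINE_UNITS * LINE_PRICE), 0)
                          FROM LINE WHERE INV_NUMBER = OLD.INV_NUMBER)
     WHERE INV_NUMBER = OLD.INV_NUMBER;
END;
""")


def stored_total(inv_number):
    """Read the derived attribute; no aggregation happens at report time."""
    row = conn.execute(
        "SELECT INV_TOTAL FROM INVOICE WHERE INV_NUMBER = ?", (inv_number,)
    ).fetchone()
    return row[0]


# Made-up sample data.
conn.execute("INSERT INTO INVOICE (INV_NUMBER) VALUES (1001)")
conn.execute("INSERT INTO LINE VALUES (1001, 1, 2, 10.0)")
conn.execute("INSERT INTO LINE VALUES (1001, 2, 1, 5.0)")
total_after_insert = stored_total(1001)

conn.execute(
    "UPDATE LINE SET LINE_UNITS = 3 WHERE INV_NUMBER = 1001 AND LINE_NUMBER = 1"
)
total_after_update = stored_total(1001)

conn.execute("DELETE FROM LINE WHERE INV_NUMBER = 1001 AND LINE_NUMBER = 2")
total_after_delete = stored_total(1001)
```

Every change to LINE now carries the extra cost of one UPDATE on INVOICE, which is the slight loss of transaction speed described above. In return, the report query reads INV_TOTAL directly instead of summing the lines each time.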

You are quite likely to discover that even the best design process produces an ERD that requires further changes mandated by operational requirements. Such changes should not discourage you from using the process. ER modelling is essential in the development of a sound design that is capable of meeting the demands of adjustment and growth. Using ERDs yields perhaps the richest bonus of all: a thorough understanding of how an organisation really functions.

There are occasional design and implementation problems that do not yield clean implementation solutions. To get a sense of the design and implementation choices a database designer faces, let's revisit the 1:1 recursive relationship EMPLOYEE is married to EMPLOYEE, first examined in Figure 5.21. Figure 5.56 shows three different ways of implementing such a relationship.

FIGURE 5.56 Various implementations of the 1:1 recursive relationship

Database name: Ch05_PartCo

First implementation

Table name: EMPLOYEE_V1

EMP_NUM  EMP_LNAME  EMP_FNAME  EMP_SPOUSE
345      Singh      Nishok     347
346      Jones      Anne       349
347      Singh      Vediga     345
348      Delaney    Robert
349      Shapiro    Anton      346

Second implementation

Table name: EMPLOYEE

EMP_NUM  EMP_LNAME  EMP_FNAME
345      Singh      Nishok
346      Jones      Anne
347      Singh      Vediga
348      Delaney    Robert
349      Shapiro    Anton

Table name: MARRIED_V1

EMP_NUM  EMP_SPOUSE
345      347
346      349
347      345
349      346

Third implementation

Table name: MARRIAGE

MAR_NUM  MAR_DATE
1        04-Mar-13
2        02-Feb-09

Table name: MARPART

MAR_NUM  EMP_NUM
1        345
1        347
2        346
2        349

Table name: EMPLOYEE

EMP_NUM  EMP_LNAME  EMP_FNAME
345      Singh      Nishok
346      Jones      Anne
347      Singh      Vediga
348      Delaney    Robert
349      Shapiro    Anton

As you examine the EMPLOYEE_V1 table in Figure 5.56, note that this table is likely to yield data anomalies. For example, if Anne Jones divorces Anton Shapiro, two records must be updated by setting the respective EMP_SPOUSE values to null to properly reflect that change. If only one record is updated, inconsistent data occur. The problem becomes even worse if several of the divorced employees then marry each other. In addition, that implementation also produces undesirable nulls for employees who are not married to other employees in the company.

Another approach would be to create a new entity shown as MARRIED_V1 in a 1:* relationship with EMPLOYEE. (See Figure 5.56, second implementation.) This second implementation does eliminate the nulls for employees who are not married to somebody working for the same company. (Such employees would not be entered in the MARRIED_V1 table.) However, this approach still yields possible duplicate values. For example, the marriage between employees 345 and 347 may still appear twice, once as 345 347 and once as 347 345. (Since each of those permutations is unique the first time it appears, the creation of a unique index will not solve the problem.)
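The remark about the unique index is easy to check. The sketch below assumes SQLite through Python's sqlite3 module; the table and column names follow Figure 5.56, while the index name MARRIED_V1_PAIR is an assumption made for illustration. A composite unique index on (EMP_NUM, EMP_SPOUSE) rejects an exact repeat of a row, but it accepts the reversed pair, so the same marriage can still be stored twice.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE EMPLOYEE (
    EMP_NUM   INTEGER PRIMARY KEY,
    EMP_LNAME TEXT NOT NULL,
    EMP_FNAME TEXT NOT NULL
);
CREATE TABLE MARRIED_V1 (
    EMP_NUM    INTEGER NOT NULL REFERENCES EMPLOYEE (EMP_NUM),
    EMP_SPOUSE INTEGER NOT NULL REFERENCES EMPLOYEE (EMP_NUM)
);
-- Hypothetical index name; it makes each (EMP_NUM, EMP_SPOUSE) pair unique.
CREATE UNIQUE INDEX MARRIED_V1_PAIR ON MARRIED_V1 (EMP_NUM, EMP_SPOUSE);
""")

conn.executemany(
    "INSERT INTO EMPLOYEE VALUES (?, ?, ?)",
    [(345, "Singh", "Nishok"), (347, "Singh", "Vediga")],
)

conn.execute("INSERT INTO MARRIED_V1 VALUES (345, 347)")
# The reversed pair is a different permutation, so the index accepts it.
conn.execute("INSERT INTO MARRIED_V1 VALUES (347, 345)")

# Only an exact repeat of an existing row is rejected.
try:
    conn.execute("INSERT INTO MARRIED_V1 VALUES (345, 347)")
    exact_duplicate_rejected = False
except sqlite3.IntegrityError:
    exact_duplicate_rejected = True

rows = conn.execute(
    "SELECT EMP_NUM, EMP_SPOUSE FROM MARRIED_V1 ORDER BY EMP_NUM"
).fetchall()
```

After these statements the table holds both 345 347 and 347 345, which is the duplicate-value problem described for the second implementation.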

As you can see, the first two implementations yield several problems:

Both solutions use synonyms. The EMPLOYEE_V1 table uses EMP_NUM and EMP_SPOUSE to refer to an employee. The MARRIED_V1 table uses the same synonyms.

Both solutions are likely to produce inconsistent data. For example, it is possible to enter employee 345 as married to employee 347 and to enter employee 348 as married to employee 345.

Both solutions allow data entries to show one employee married to several other employees. For example, it is possible to have data pairs such as 345 347 and 348 347 and 349 347 that will not violate entity integrity requirements because they are all unique.

A third approach would be to have two new entities, MARRIAGE and MARPART, in a 1:* relationship. MARPART contains the EMP_NUM foreign key to EMPLOYEE. (See the UML class diagram in Figure 5.38.) This third approach would be the preferred solution in a relational environment. But even this approach requires some fine-tuning. For example, to ensure that an employee occurs only once in any given marriage, you would have to use a unique index on the EMP_NUM attribute in the MARPART table.

As you can see, a recursive 1:1 relationship yields many different solutions with varying degrees of effectiveness and adherence to basic design principles. Your job as a database designer is to use your professional judgement to yield a solution that meets the requirements imposed by business rules, processing requirements and basic design principles.

Finally, document, document and document! Put all design activities in writing. Then review what you've written. Documentation not only helps you stay on track during the design process, but also enables you (or those following you) to pick up the design thread when it comes time to modify the design. Although the need for documentation should be obvious, one of the most vexing problems in database and systems analysis work is that the put it in writing rule is often not observed in all of the design and implementation stages. The development of organisational documentation standards is a very important aspect of ensuring data compatibility and coherence.
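The third implementation and its fine-tuning step can be sketched as follows. This is a minimal sketch, assuming SQLite through Python's sqlite3 module. The tables, columns and rows follow Figure 5.56, whereas the index name MARPART_EMP and the column types are assumptions. The unique index on EMP_NUM in MARPART means an employee can appear in at most one MARPART row, which fits a table that records current marriages only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite checks foreign keys only when this is on
conn.executescript("""
CREATE TABLE EMPLOYEE (
    EMP_NUM   INTEGER PRIMARY KEY,
    EMP_LNAME TEXT NOT NULL,
    EMP_FNAME TEXT NOT NULL
);
CREATE TABLE MARRIAGE (
    MAR_NUM  INTEGER PRIMARY KEY,
    MAR_DATE TEXT NOT NULL
);
CREATE TABLE MARPART (
    MAR_NUM INTEGER NOT NULL REFERENCES MARRIAGE (MAR_NUM),
    EMP_NUM INTEGER NOT NULL REFERENCES EMPLOYEE (EMP_NUM),
    PRIMARY KEY (MAR_NUM, EMP_NUM)
);
-- Fine-tuning step from the text; the index name is hypothetical.
CREATE UNIQUE INDEX MARPART_EMP ON MARPART (EMP_NUM);
""")

# Rows as shown in Figure 5.56 (dates kept in the figure's own format).
conn.executemany(
    "INSERT INTO EMPLOYEE VALUES (?, ?, ?)",
    [
        (345, "Singh", "Nishok"),
        (346, "Jones", "Anne"),
        (347, "Singh", "Vediga"),
        (348, "Delaney", "Robert"),
        (349, "Shapiro", "Anton"),
    ],
)
conn.executemany(
    "INSERT INTO MARRIAGE VALUES (?, ?)",
    [(1, "04-Mar-13"), (2, "02-Feb-09")],
)
conn.executemany(
    "INSERT INTO MARPART VALUES (?, ?)",
    [(1, 345), (1, 347), (2, 346), (2, 349)],
)

# Employee 345 is already a partner in marriage 1, so this entry is refused.
try:
    conn.execute("INSERT INTO MARPART VALUES (2, 345)")
    second_marriage_rejected = False
except sqlite3.IntegrityError:
    second_marriage_rejected = True

partner_rows = conn.execute("SELECT COUNT(*) FROM MARPART").fetchone()[0]
```

Employee 348, who is not married to another employee, simply has no MARPART row, so no nulls are needed, and a divorce is recorded by removing the rows for one MAR_NUM rather than by updating two separate employee records.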

SUMMARY

The ERM uses ERDs to represent the conceptual database as viewed by the end user. The ERM's main components are entities, relationships and attributes. The ERD also includes connectivity and cardinality notations. An ERD can also show relationship strength, relationship participation (optional or mandatory), and degree of relationship (unary, binary, ternary, etc.).

Multiplicity is the main constraint that exists on a relationship, which enables us to define the number of participants in that relationship. Multiplicity describes the number of instances of one entity that are associated with one instance of a related entity, and refers to the two important constraints known as participation and cardinality. Multiplicities are usually based on business rules. Participation determines whether all occurrences of an entity participate in the relationship or not. Participation is either mandatory or optional. Cardinality expresses the specific number of entity occurrences associated with an occurrence of a related entity.

In the ERM, a *:* relationship is valid at the conceptual level. However, when implementing the ERM in a relational database, the *:* relationship must be mapped to a set of 1:* relationships through a composite entity.

ERDs may be based on many different ERMs. However, regardless of which model is selected, the modelling logic remains the same. Because no ERM can accurately portray all real-world data and action constraints, application software must be used to augment the implementation of at least some of the business rules.

Database designers, no matter how well they are able to produce designs that conform to all applicable modelling conventions, are often forced to make design compromises. Those compromises are required when end users have vital transaction speed and/or information requirements that prevent the use of perfect modelling logic and adherence to all modelling conventions. Therefore, database designers must use their professional judgement to determine how and to what extent the modelling conventions are subject to modification. To ensure that their professional judgements are sound, database designers must have detailed and in-depth knowledge of data-modelling conventions. It is also important to document the design process from beginning to end, which helps keep the design process on track and allows for easy modifications in the future.

The steps in creating an entity relationship model are:

1 Create a detailed narrative of the organisation's description of operations.
2 Identify the business rules based on the descriptions of operations.
3 Identify all main entities from the business rules.
4 Identify all main relationships between entities from the business rules.
5 Develop an initial ERD.
6 Determine the multiplicities and the participation of all relationships. Remember, participation involves identifying whether a relationship can be optional or mandatory for each entity.
7 Identify the primary and foreign keys.
8 Identify all attributes.
9 Revise and review the ERD.

KEY TERMS

association
binary relationship
cardinality
class
composite attribute
composite key
derived attribute
existence-dependent
existence-independent
identifiers
identifying relationship
iterative process
mandatory participation
multiplicity
multivalued attribute
non-identifying relationship
optional participation
participants
participation
recursive relationship
relationship degree
simple attribute
single-valued attribute
strong relationship
ternary relationship
unary relationship
weak entity
weak relationship


Online content: Answers to selected Review Questions and Problems for this chapter are available on the online platform accompanying this book.

REVIEW QUESTIONS

1 Which two conditions must be met before an entity can be classified as a weak entity? Give an example of a weak entity.

2 What is a strong (or identifying) relationship?

3 Given the business rule 'an employee may have many degrees', discuss its effect on attributes, entities and relationships. (Hint: Remember what a multivalued attribute is and how it might be implemented.)

4 What is a composite entity and when is it used?

FIGURE Q5.1 The conceptual model for question 5

5 Suppose you are working within the framework of the conceptual model shown in Figure Q5.1. Given the conceptual model in Figure Q5.1:
  a Write the business rules that are reflected in it.
  b Identify all of the cardinalities.

6 What is a recursive relationship? Give an example.

7 How would you (graphically) identify each of the following ERM components in a UML model:
  a an entity?
  b the multiplicity (0:*)?

8 Discuss the difference between a composite key and a composite attribute. How would each be indicated in an ERD?

9 What two courses of action are available to a designer when he or she encounters a multivalued attribute?

10 What is a derived attribute? Give an example.

11 How is a relationship between entities indicated in an ERD? Give an example using UML notation.

12 Discuss two ways in which the 1:* relationship between COURSE and CLASS can be implemented. (Hint: Think about relationship strength.)

13 How is a composite entity represented in an ERD, and what is its function? Illustrate using the UML notation.

14 Which three (often conflicting) database requirements must be addressed in database design?

15 Briefly, but precisely, explain the difference between single-valued attributes and simple attributes. Give an example of each.

16 What are multivalued attributes, and how can they be handled within the database design?

Questions 17–20 are based on the ERD in Figure Q5.2.

FIGURE Q5.2 The ERD for questions 17–20

17 Write the ten cardinalities (multiplicities) that are appropriate for this ERD.

18 Write the business rules reflected in this ERD.

19 Which two attributes must be contained in the composite entity between STORE and PRODUCT? Use proper terminology in your answer.

20 Describe precisely the composition of the DEPENDANT weak entity's primary key. Use proper terminology in your answer.

21 The local city youth league needs a database system to help track children who sign up to play soccer. Data needs to be kept on each team, the children who will play on each team, and their parents. Also, data needs to be kept on the coaches for each team. Draw a data model with the entities and attributes described here.

  Entities required: Team, Player, Coach, and Parent.

  Attributes required:
  Team: Team ID number, Team name, and Team colours
  Player: Player ID number, Player first name, Player last name, and Player age
  Coach: Coach ID number, Coach first name, Coach last name, and Coach home phone number
  Parent: Parent ID number, Parent last name, Parent first name, Home phone number, and Home address (Street, City, Province, and Postal code)

  The following relationships must be defined:
  Team is related to Player.
  Team is related to Coach.
  Player is related to Parent.

  Connectivities and participations are defined as follows:
  A Team may or may not have a Player.
  A Player must have a Team.
  A Team may have many Players.
  A Player has only one Team.
  A Team may or may not have a Coach.
  A Coach must have a Team.
  A Team may have many Coaches.
  A Coach has only one Team.
  A Player must have a Parent.
  A Parent must have a Player.
  A Player may have many Parents.
  A Parent may have many Players.

220

PARt

II

Design

Concepts

PRobLeMs 1

2

5

3

Using the following

business

a

A company

operates

b

Each department

c

Each of the employees

d

Each employee

Using the following

rules,

create the

appropriate

ERD using

UML notation.

many departments.

employs

one or

more employees.

may or may not have one or more dependants.

may or may not have an employment business

rules,

create the

history.

appropriate

ERD using

a

Afootball team has at least 11 players and

b

Each player

c

A minimum of 11 players and a maximum of 14 players

d

A player

e

Each game

UML notation:

may have up to 40 players.

may or may not play one or more games.

may or may not score

may participate in one game.

one or more goals.

may have zero or more goals.

Using the following a

A musician

b

Onerecording

c

Atrack

business rules, create aninitial

makes atleast

one recording,

but

ERD using UML notation: may over a period

of time

make many recordings.

consists of at least three or more tracks.

can appear

on

more than

one recording.

4

Revise the ERD you developed in Problem 3 and resolve any *:* relationships.

5

The Hudson Engineering Group (HEG) has contacted you to create a conceptual model whose application will meetthe expected database requirements for the companys training programme. The HEG administrator

environment. cardinalities.

gives

you the

description

(Hint: Some of the following Can you tell which ones?)

(see

below)

sentences

identify

of the training

the

groups

volume

operating

of data rather than

The HEG has 12instructors and can handle up to 30 trainees per class. HEG offers five advanced technology courses, each of which may generate several classes. If a class has fewer than ten trainees, it will be cancelled. Therefore, it is possible for a course not to generate any classes. Each class is taught

by one instructor.

to do research the following:

6

Define all of the entities and relationships.

b

Describe the relationship between instructor and existence-dependence.

review

2020 has

a

A department employs

b

Some employees,

c

A division operates

d

An employee

Cengage deemed

up to two

any

All suppressed

Rights

many employees,

known asrovers,

may be assigned

may be assigned

do

and class in terms

of cardinality,

UML notation.

participation

Write all appropriate

but each employee is employed by one department.

are not assigned to any department.

many departments,

must have atleast

Reserved. content

or

but each department is operated

many projects,

and a project

by one division.

may have

many employees

to it.

A project

Learning. that

classes

(Use Table 5.4 as your guide.)

Use the following business rules to create an ERD using multiplicities in the ERD.

e

Copyright

may teach

maytake up to two classes per year. Giventhat information,

a

assigned

Editorial

Each instructor

only. Each trainee

does

May not

not materially

be

copied, affect

scanned, the

overall

one employee assigned to it.

or

duplicated, learning

in experience.

whole

or in Cengage

part.

Due Learning

to

electronic reserves

rights, the

right

some to

third remove

party additional

content

may content

be

suppressed at

any

time

from if

the

subsequent

eBook rights

and/or restrictions

eChapter(s). require

it

cHAPteR

f

One of the employees one

g

7

5

manages each department,

Data

Modelling

with

Entity

Relationship

and each department is

Diagrams

221

managed by only

employee.

One of the employees runs each division, and each division is run by only one employee.

During peak periods, Temporary Employment Corporation (TEC) places temporary workers in companies. TEC's manager gives you the following description of the business:

TEC has a file of candidates who are willing to work.

If the candidate has worked before, that candidate has a specific job history. (Naturally, no job history exists if the candidate has never worked.) Each time the candidate worked, one additional job history record was created.

Each candidate has earned several qualifications. Each qualification may be earned by more than one candidate. (For example, it is possible for more than one candidate to have earned a BA degree or a Microsoft Network Certification. And clearly, a candidate may have earned both a BA and a Microsoft Network Certification.)

TEC also has a list of companies that request temporaries.

Each time a company requests a temporary employee, TEC makes an entry in the Openings folder. That folder contains an opening number, a company name, required qualifications, a starting date, an anticipated ending date, and hourly pay.

Each opening requires only one specific or main qualification.

When a candidate matches the qualification, he or she is given the job and an entry is made in the Placement Record folder. That folder contains an opening number, a candidate number, the total hours worked, etc. In addition, an entry is made in the job history for the candidate.

An opening can be filled by many candidates, and a candidate can fill many openings.

TEC uses special codes to describe a candidate's qualifications for an opening. The list of codes is shown in the table below.

CODE            DESCRIPTION
SEC-45          Secretarial work, at least 45 words per minute
SEC-60          Secretarial work, at least 60 words per minute
CLERK           General clerking work
PRG-PY          Programmer, Python
PRG-C++         Programmer, C++
DBA-ORA         Database Administrator, Oracle
DBA-DB2         Database Administrator, IBM DB2
DBA-SQLSERV     Database Administrator, MS SQL Server
SYS-1           Systems Analyst, level 1
SYS-2           Systems Analyst, level 2
NW-NOV          Network Administrator, Novell experience
WD-CF           Web Developer, ColdFusion


TEC's management wants to keep track of the following entities:

COMPANY
OPENING
QUALIFICATION
CANDIDATE
JOB_HISTORY
PLACEMENT

Given that information, do the following:

a Draw the ERD using UML notation for this enterprise.

b Identify all possible relationships.

c Identify the multiplicities (including the mandatory/optional dependencies) for each relationship.

d Resolve all *:* relationships.

8
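As an illustration of what resolving a *:* relationship can look like once implemented, the following minimal sketch uses Python's built-in sqlite3 module. It assumes the *:* relationship between CANDIDATE and QUALIFICATION is resolved with a bridge entity; the table name `education` and all column names are hypothetical and are not prescribed by the problem.

```python
import sqlite3

# Hypothetical table and column names; the exercise does not prescribe them.
SCHEMA = """
CREATE TABLE candidate (
    cand_num  INTEGER PRIMARY KEY,
    cand_name TEXT NOT NULL
);
CREATE TABLE qualification (
    qual_code TEXT PRIMARY KEY,
    qual_desc TEXT NOT NULL
);
-- Bridge entity: one row per qualification earned by a candidate.
CREATE TABLE education (
    cand_num  INTEGER NOT NULL REFERENCES candidate(cand_num),
    qual_code TEXT    NOT NULL REFERENCES qualification(qual_code),
    PRIMARY KEY (cand_num, qual_code)
);
"""


def build_demo():
    """Create an in-memory database with a few illustrative rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when asked
    conn.executescript(SCHEMA)
    conn.executemany("INSERT INTO candidate VALUES (?, ?)",
                     [(1, "A. Example"), (2, "B. Example")])
    conn.executemany("INSERT INTO qualification VALUES (?, ?)",
                     [("SEC-45", "Secretarial work, at least 45 words per minute"),
                      ("PRG-PY", "Programmer, Python")])
    conn.executemany("INSERT INTO education VALUES (?, ?)",
                     [(1, "SEC-45"), (1, "PRG-PY"), (2, "PRG-PY")])
    conn.commit()
    return conn


def candidates_with(conn, qual_code):
    """Return the candidate numbers that hold the given qualification."""
    rows = conn.execute(
        "SELECT cand_num FROM education WHERE qual_code = ? ORDER BY cand_num",
        (qual_code,))
    return [row[0] for row in rows]
```

Each side of the original *:* relationship becomes a 1:* relationship to the bridge table, and the composite primary key prevents the same qualification from being recorded twice for one candidate.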

The Gauteng Netball Conference (GNC) is an amateur netball association. Each town in the province has one team as its representative. Each team has a maximum of 14 players and a minimum of 11 players. Each team also has up to three coaches (offensive, defensive and physical training coaches). During the season, each team plays two games (home and visitor) against each of the other teams. Given those conditions, do the following:

a Identify the multiplicities of each relationship.

b Identify the type of dependency that exists between TOWN and TEAM.

c Identify the cardinality between teams and players and between teams and town.

d Identify the dependency between coach and team and between team and player.

e Draw the ERD to represent the GNC database.

9

Automata Inc. produces specialty vehicles by contract. The company operates several departments, each of which builds a particular vehicle, such as a limousine, a truck, a van or an RV.

When a new vehicle is built, the department places an order with the purchasing department to request specific components. Automata's purchasing department is interested in creating a database to keep track of orders and to accelerate the process of delivering materials.

The order received by the purchasing department may contain several different items. An inventory is maintained so that the most frequently requested items are delivered almost immediately. When an order comes in, it is checked to determine whether the requested item is in inventory. If an item is not in inventory, it must be ordered from a supplier. Each item may have several suppliers.

Using that functional description of the processes encountered at Automata's purchasing department, do the following:

a Identify all of the main entities.

b Identify all of the relations and multiplicities among entities.

c Identify the type of existence dependency in all relations.

d Give at least two examples of the types of reports that can be obtained from the database.


10

Create an ERD based on the UML notation, using the following requirements:

An INVOICE is written by a SALESREP. Each sales representative can write many invoices, but each invoice is written by a single sales representative.

The INVOICE is written for a single CUSTOMER. However, each customer can have many invoices.

An INVOICE can include many detail lines (LINE), which describe the products bought by the customer.

The product information is stored in a PRODUCT entity.

The product's vendor information is found in a VENDOR entity.
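To make the 1:* relationships in these requirements concrete, here is a minimal sketch of one way they might be implemented, using Python's built-in sqlite3 module. Only the entity names (SALESREP, CUSTOMER, INVOICE, LINE, PRODUCT, VENDOR) come from the requirements; every column name is an assumption added so that the tables can actually be created.

```python
import sqlite3

# Entity names follow the requirements; all column names are hypothetical.
SCHEMA = """
CREATE TABLE salesrep (rep_id  INTEGER PRIMARY KEY, rep_name  TEXT NOT NULL);
CREATE TABLE customer (cus_id  INTEGER PRIMARY KEY, cus_name  TEXT NOT NULL);
CREATE TABLE vendor   (vend_id INTEGER PRIMARY KEY, vend_name TEXT NOT NULL);
CREATE TABLE product (
    prod_code  TEXT PRIMARY KEY,
    prod_desc  TEXT NOT NULL,
    prod_price REAL NOT NULL,
    vend_id    INTEGER REFERENCES vendor(vend_id)
);
CREATE TABLE invoice (
    inv_num  INTEGER PRIMARY KEY,
    inv_date TEXT NOT NULL,
    rep_id   INTEGER NOT NULL REFERENCES salesrep(rep_id),   -- one rep per invoice
    cus_id   INTEGER NOT NULL REFERENCES customer(cus_id)    -- one customer per invoice
);
CREATE TABLE line (
    inv_num    INTEGER NOT NULL REFERENCES invoice(inv_num),
    line_num   INTEGER NOT NULL,
    prod_code  TEXT    NOT NULL REFERENCES product(prod_code),
    line_units INTEGER NOT NULL,
    PRIMARY KEY (inv_num, line_num)
);
"""


def build_invoice_demo():
    """Create an in-memory database holding one invoice with two lines."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript(SCHEMA)
    conn.execute("INSERT INTO salesrep VALUES (1, 'Rep One')")
    conn.execute("INSERT INTO customer VALUES (1, 'Customer One')")
    conn.execute("INSERT INTO vendor VALUES (1, 'Vendor One')")
    conn.executemany("INSERT INTO product VALUES (?, ?, ?, ?)",
                     [("P1", "Widget", 10.0, 1), ("P2", "Gadget", 2.5, 1)])
    conn.execute("INSERT INTO invoice VALUES (1001, '2020-01-15', 1, 1)")
    conn.executemany("INSERT INTO line VALUES (?, ?, ?, ?)",
                     [(1001, 1, "P1", 2), (1001, 2, "P2", 4)])
    conn.commit()
    return conn


def invoice_total(conn, inv_num):
    """Sum units * price over the detail lines of one invoice."""
    row = conn.execute(
        """SELECT SUM(l.line_units * p.prod_price)
           FROM line AS l JOIN product AS p ON p.prod_code = l.prod_code
           WHERE l.inv_num = ?""", (inv_num,)).fetchone()
    return row[0]
```

LINE is given a composite key (invoice number plus line number) because a detail line has no meaning outside its invoice; a surrogate key would also work.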

Note

Limit your ERD to the entities and relationships based on the business rules shown here. In other words, do not add realism to your design by expanding or refining the business rules. However, make sure you include the attributes that would permit the model to be successfully implemented.

11

Using the following brief summary of business rules for the ROBCOR catering service, draw the fully labelled ERD. Make sure you include all appropriate entities, relationships, connectivities and cardinalities.

Each dinner is based on a single entrée, but each entrée can be served at many dinners.

A guest can attend many dinners, and each dinner can be attended by many guests.

Each dinner invitation can be mailed to many guests, and each guest can receive many invitations.

12

Create an ERD using the UML notation that can be implemented for a medical clinic, using at least the following business rules:

A patient can make many appointments with one or more doctors in the clinic, and a doctor can accept appointments with many patients. However, each appointment is made with only one doctor, and each appointment references a single patient.

Emergency cases do not require an appointment. However, for appointment management purposes, an emergency is entered in the appointment book as unscheduled.

If kept, an appointment yields a visit with the doctor specified in the appointment. The visit yields a diagnosis and, when appropriate, treatment.

With each visit, the patient's records are updated to provide a medical history.

Each patient visit creates a bill. Each patient visit is billed by one doctor, and each doctor can bill many patients.

Each bill must be paid. However, a bill may be paid in many instalments, and a payment may cover more than one bill.

A patient may pay the bill directly, or the bill may be the basis for a claim submitted to an insurance company.

If the bill is paid by an insurance company, the deductible is submitted to the patient for payment.


13

Tiny University is so pleased with your design and implementation of its student registration/tracking system that it wants you to expand the design to include its car pool. A brief description of its operations follows:

Faculty members may use the vehicles owned by Tiny University for officially sanctioned travel. For example, the vehicles may be used by faculty members to travel to off-campus learning centres, to travel to locations at which research papers are presented, to transport students to officially sanctioned locations, and to travel for public service purposes. The vehicles used for such purposes are managed by Tiny University's TFBS (Travel Far But Slowly) Centre.

Using reservation forms, each department can reserve vehicles for its faculty, who are responsible for filling out the appropriate trip completion form at the end of a trip. The reservation form includes the expected departure date, vehicle type required, destination and name of the authorised faculty member. When the faculty member arrives to pick up a vehicle, (s)he must sign a checkout form to log out the vehicle and pick up a trip completion form. (The TFBS employee who releases the vehicle for use also signs the checkout form.) The faculty member's trip completion form includes the faculty member's identification code, the vehicle's identification, the odometer readings at the start and end of the trip, maintenance complaints (if any), litres of fuel purchased (if any) and the Tiny University credit card number used to pay for the fuel. If fuel is purchased, the credit card receipt must be stapled to the trip completion form. Upon receipt of the faculty trip completion form, the faculty member's department is billed at a mileage rate based on the vehicle type (sedan, station wagon, panel van, minivan or minibus) used. (Hint: Do not use more entities than are necessary. Remember the difference between attributes and entities!)

All vehicle maintenance is performed by TFBS. Each time a vehicle requires maintenance, a maintenance log entry is completed on a prenumbered maintenance log form. The maintenance log form includes the vehicle identification, a brief description of the type of maintenance required, the initial log entry date, the date on which the maintenance was completed and the identification of the mechanic who released the vehicle back into service. (Only mechanics who have an inspection authorisation may release a vehicle back into service.)

As soon as the log form has been initiated, the log form's number is transferred to a maintenance detail form; the log form's number is also forwarded to the parts department manager, who fills out a parts usage form on which the maintenance log number is recorded. The maintenance detail form contains separate lines for each maintenance item performed, for the parts used and for identification of the mechanic who performed the maintenance item. When all maintenance items have been completed, the maintenance detail form is stapled to the maintenance log form, the maintenance log form's completion date is filled out, and the mechanic who releases the vehicle back into service signs the form. The stapled forms are then filed, to be used later as the source for various maintenance reports.

TFBS maintains a parts inventory, including oil, oil filters, air filters and belts of various types. The parts inventory is checked daily to monitor parts usage and to reorder parts that reach the minimum quantity on hand level. To track parts usage, the parts manager requires each mechanic to sign out the parts that are used to perform each vehicle's maintenance; the parts manager records the maintenance log number under which the part is used.

Each month, TFBS issues a set of reports. The reports include the mileage driven by vehicle, by department and by faculty members within a department. In addition, various revenue reports are generated by vehicle and department. A detailed parts usage report is also filed each month. Finally, a vehicle maintenance summary is created each month.

Given that brief summary of operations, draw the appropriate (and fully labelled) ERD. Use the UML notation to indicate entities, relationships and multiplicities.


14

Using the following information, produce an ERD based on the UML model that can be implemented. Make sure you include all appropriate entities, relationships and multiplicities:

EverFail company is in the quick oil change and lube business. Although customers bring in their cars for what is described as quick oil changes, EverFail also replaces windshield wipers, oil filters and air filters, subject to customer approval. The invoice contains the charges for the oil and all parts used, and a standard labour charge. When the invoice is presented to customers, they pay cash, use a credit card or write a cheque. EverFail does not extend credit. EverFail's database is to be designed to keep track of all components in all transactions.

Given the high parts usage of the business operations, EverFail must maintain careful control of its parts (oil, wipers, oil filters and air filters) inventory. Therefore, if parts reach their minimum on-hand quantity, the parts in low supply must be reordered from an appropriate vendor. EverFail maintains a vendor list, which contains vendors actually used and potential vendors.

Periodically, based on the date of the car's service, EverFail mails updates to customers. EverFail also tracks each customer's car mileage.

15

Create a complete ERD that can be implemented in the relational model using the following description of operations.

Hot Water (HW) is a small start-up company that sells spas. HW does not carry any stock. A few spas are set up in a simple warehouse so customers can see some of the models available, but any products sold must be ordered at the time of the sale.

HW can get spas from several different manufacturers.

Each manufacturer produces one or more different brands of spas.

Each and every brand is produced by only one manufacturer.

Every brand has one or more models.

Every model is produced as part of a brand. For example, Meerkat Bay Spas is a manufacturer that produces Big Blue Meerkat spas, a premium-level brand, and Lazy Lizard spas, an entry-level brand. The Big Blue Meerkat brand offers several models, including the BBI-6, an 81-jet spa with two 6 hp motors, and the BBI-10, a 102-jet spa with three 6 hp motors.

Every manufacturer is identified by a manufacturer code. The company name, address, area code, phone number and account number are kept in the system for every manufacturer.

For each brand, the brand name and brand level (premium, mid-level or entry-level) are kept in the system.

For each model, the model number, number of jets, number of motors, number of horsepower per motor, suggested retail price, HW retail price, dry weight, water capacity and seating capacity must be kept in the system.

16

United Helpers is a non-profit organisation that provides aid to people after natural disasters. Based on the following brief description of operations, create the appropriate fully labelled ERD:

Volunteers carry out the tasks of the organisation. The name, address and telephone number are tracked for each volunteer. Each volunteer may be assigned to several tasks, and some tasks require many volunteers. A volunteer might be in the system without having been assigned a task yet. It is possible to have tasks that no one has been assigned. When a volunteer is assigned to a task, the system should track the start time and end time of that assignment.


Each task has a task code, task description, task type and task status. For example, there may be a task with task code 101, a description of answer the telephone, a type of recurring, and a status of ongoing. Another task might have a code of 102, a description of prepare 5 000 packages of basic medical supplies, a type of packing, and a status of open.

For all tasks of type packing, there is a packing list that specifies the contents of the packages. There are many packing lists to produce different packages, such as basic medical packages, child-care packages and food packages. Each packing list has an ID number, a packing list name and a packing list description, which describes the items that should make up the package. Every packing task is associated with only one packing list. A packing list may not be associated with any tasks, or it may be associated with many tasks. Tasks that are not packing tasks are not associated with any packing list.

Packing tasks result in the creation of packages. Each individual package of supplies produced by the organisation is tracked, and each package is assigned an ID number. The date the package was created and its total weight are recorded. A given package is associated with only one task. Some tasks (such as answer the phones) will not produce any packages, while other tasks (such as prepare 5 000 packages of basic medical supplies) will be associated with many packages.

The packing list describes the ideal contents of each package, but it is not always possible to include the ideal number of each item. Therefore, the actual items included in each package should be tracked. A package can contain many different items, and a given item can be used in many different packages.

Each item that the organisation provides has an item ID number, item description, item value and item quantity on hand stored in the system. Along with tracking the actual items that are placed in each package, the quantity of each item placed in the package must be tracked as well. For example, a packing list may state that basic medical packages should include 100 bandages, 4 bottles of iodine and 4 bottles of hydrogen peroxide. However, because of the limited supply of items, a given package may include only 10 bandages, 1 bottle of iodine and no hydrogen peroxide. The fact that the package includes bandages and iodine needs to be recorded along with the quantity of each item included. It is possible for the organisation to have items that have not been included in any package yet, but every package will contain at least one item.

17

Luxury-Oriented Scenic Tours (LOST) provides guided tours to groups of visitors to the Cape Town area. In recent years, LOST has grown quickly and is having difficulty keeping up with all of the various information needs of the company. The company's operations are as follows:

LOST offers many different tours. For each tour, the tour name, approximate length (in hours) and fee charged is needed.

Guides are identified by an employee ID, but the system should also record a guide's name, home address and date of hire. Guides take a test to be qualified to lead specific tours. It is important to know which guides are qualified to lead which tours and the date that they completed the qualification test for each tour. A guide may be qualified to lead many different tours. A tour can have many different qualified guides. New guides may or may not be qualified to lead any tours, just as a new tour may or may not have any qualified guides.

Every tour must be designed to visit at least three locations. For each location, a name, type and official description are kept. Some locations (such as Table Mountain) are visited by more than one tour, while others (such as District Six) are visited by a single tour. All locations are visited by at least one tour. The order in which the tour visits each location should be tracked as well.


When a tour is actually given, that is referred to as an outing. LOST schedules outings well in advance so they can be advertised and so employees can understand their upcoming work schedules. A tour can have many scheduled outings, although newly designed tours may not have any outings scheduled. Each outing is for a single tour and is scheduled for a particular date and time. All outings must be associated with a tour.

All tours at LOST are guided tours, so a guide must be assigned to each outing. Each outing has one and only one guide. Guides are occasionally asked to lead an outing of a tour even if they are not officially qualified to lead that tour. Newly hired guides may never have been scheduled to lead any outings.

Tourists, called clients by LOST, pay to join a scheduled outing. For each client, the name and telephone number are recorded. Clients may sign up to join many different outings, and each outing can have many clients. Information is kept only on clients who have signed up for at least one outing, although newly scheduled outings may not have any clients signed up yet.

a Create an ERD to support LOST operations.

b The operations provided state that it is possible for a guide to lead an outing of a tour even if the guide is not officially qualified to lead outings of that tour. Imagine that the business rules instead specified that a guide is never, under any circumstances, allowed to lead an outing unless he or she is qualified to lead outings of that tour. How could the data model in Part a be modified to enforce this new constraint?
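One possible way to think about the constraint in part b, offered only as a hedged sketch and not as the intended answer, is to relate an outing to a guide's qualification for a tour rather than to the guide and the tour separately. The sqlite3 example below assumes hypothetical table and column names; the composite foreign key is the design choice being illustrated.

```python
import sqlite3

# Hypothetical names throughout; only the business rule comes from the problem.
SCHEMA = """
CREATE TABLE guide (guide_id INTEGER PRIMARY KEY, guide_name TEXT NOT NULL);
CREATE TABLE tour  (tour_id  INTEGER PRIMARY KEY, tour_name  TEXT NOT NULL);
CREATE TABLE qualification (
    guide_id  INTEGER NOT NULL REFERENCES guide(guide_id),
    tour_id   INTEGER NOT NULL REFERENCES tour(tour_id),
    qual_date TEXT    NOT NULL,
    PRIMARY KEY (guide_id, tour_id)
);
CREATE TABLE outing (
    outing_id       INTEGER PRIMARY KEY,
    tour_id         INTEGER NOT NULL,
    guide_id        INTEGER NOT NULL,
    outing_datetime TEXT    NOT NULL,
    -- The (guide, tour) pair must exist as a qualification.
    FOREIGN KEY (guide_id, tour_id) REFERENCES qualification(guide_id, tour_id)
);
"""


def build_lost_demo():
    """Guide 1 is qualified for tour 10 only; tour 20 has no qualified guide."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript(SCHEMA)
    conn.execute("INSERT INTO guide VALUES (1, 'Guide One')")
    conn.executemany("INSERT INTO tour VALUES (?, ?)",
                     [(10, "Tour A"), (20, "Tour B")])
    conn.execute("INSERT INTO qualification VALUES (1, 10, '2020-01-01')")
    conn.commit()
    return conn


def try_schedule(conn, outing_id, tour_id, guide_id, when):
    """Insert an outing; return False if the guide is not qualified for the tour."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute("INSERT INTO outing VALUES (?, ?, ?, ?)",
                         (outing_id, tour_id, guide_id, when))
        return True
    except sqlite3.IntegrityError:
        return False
```

With this arrangement the database itself rejects an outing whose guide has no matching qualification row, so the rule does not depend on application code.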

Note

Problems 18 and 19 may be used as the basis for class projects. These problems illustrate the challenge of translating a description of operations into a set of business rules that will define the components for an ERD that can be successfully implemented. These problems can also be used as the basis for discussions about the components and contents of a proper description of operations. One of the things you need to learn if you want to create databases that can be successfully implemented is to separate the generic background material from the details that directly affect database design. You must also keep in mind that many constraints cannot be incorporated into the database design; instead, such constraints are handled by the applications software.

Although Problem 18 deals with a Web-based business, the focus should be on the database aspects of the design, rather than on its interface and the transaction management details. In fact, the argument can easily be made that the existence of Web-based businesses has made database design more important than ever. (You might be able to get away with a bad database design if you sell only a few items per day, but the problems of poorly designed databases are compounded as the number of transactions increases.)

18

Use the following description of the operations of RC_Models Company to complete this exercise.

RC_Models Company sells its products, plastic models (aircraft, ships and cars) and add-on decals for those models, through its internet website (www.rc_models.com). Models and decals are available in scales that vary from 1/144 to 1/32.

Customers use the website to select the products and to pay by credit card. If a product is not currently available, it is placed on back order at the customer's discretion. (Back orders are not charged to the customer's credit card until the order is shipped.) When a customer completes his or her transactions, the invoice is printed and the products listed on the invoice are pulled from inventory for shipment. (The invoice includes a shipping charge.) The printed invoice is enclosed in the shipping container. The customer credit card charges are transmitted to the CC Bank, at which RC_Models Company maintains a commercial account. (Note: The CC Bank is not part of the RC_Models database.)


RC_Models Company tracks customer purchases and periodically sends out promotional materials. Because management at RC_Models Company requires detailed information to conduct its operations, numerous reports are available. Those reports include, but are not limited to, customer purchases by product category and amount, product turnover and revenues by product and customer. If a product has not recorded a sale within four weeks of being stocked, it is removed from inventory and scrapped.

Many of the customers on the RC_Models customer list have bought RC_Models products. However, RC_Models Company also has purchased a copy of the FineScale Modeler magazine subscription list to use in marketing its products to customers who have not yet bought from RC_Models Company. In addition, customer data are recorded when potential customers request product information.

RC_Models Company orders its products directly from the manufacturers. For example, the plastic models are ordered from Tamiya, Academy, Revell/Monogram and others. Decals are ordered from Aeromaster, Tauro, WaterMark and others. (Note: Not all manufacturers in the RC_Models Company database have received orders.) All orders are placed via the manufacturers' websites, and the order amounts are automatically handled through RC_Models' commercial bank account with the CC Bank. Orders are automatically placed when product inventory reaches the specified minimum quantity on hand. (The number of product units ordered depends on the minimum order quantity specified for each product.)

a Given that brief and incomplete description of operations for RC_Models Company, write all applicable business rules to establish entities, relationships, optionalities and multiplicities. (Hint: Use the following three business rules as examples, writing the remaining business rules in the same format.)

A customer may generate many invoices.

Each invoice is generated by only one customer.

Some customers have not (yet) generated an invoice.

b Draw the fully labelled and implementable ERD based on the business rules you wrote in Part a of this problem. Include all entities, relationships, optionalities and multiplicities.

19

Use the following description of the operations of the RC_Charter2 Company to complete this exercise.

The RC_Charter2 Company operates a fleet of aircraft under the Federal Air Regulations (FARs) Part 135 (air taxi or charter) certificate, enforced by the FAA. The aircraft are available for air taxi (charter) operations within the United States and Canada.

Charter companies provide so-called unscheduled operations; that is, charter flights take place only after a customer reserves the use of an aircraft to fly at a customer-designated date and time to one or more customer-designated destinations, transporting passengers, cargo or some combination of passengers and cargo. A customer can, of course, reserve many different charter flights (trips) during any time frame. However, for billing purposes, each charter trip is reserved by one and only one customer. Some of RC_Charter2's customers do not use the company's charter operations; instead, they purchase fuel, use maintenance services or use other RC_Charter2 services. (Note: This database design will focus on the charter operations only.)

Each charter trip yields revenue for the RC_Charter2 Company. This revenue is generated by the charges that a customer pays upon the completion of a flight. The charter flight charges are a function of aircraft model used, distance flown, waiting time, special customer requirements and crew expenses. The distance flown charges are computed by multiplying the round-trip miles by the model's charge per mile. Round-trip miles are based on the actual navigational path flown. The sample route traced in Figure P5.1 illustrates the procedure. Note that the number of round-trip miles is calculated to be 130 + 200 + 180 + 390 = 900.
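The round-trip arithmetic above can be checked with a few lines of Python. This is only a sketch: the four leg distances come from the sample route, while the function names and the charge-per-mile value used in the usage note are hypothetical.

```python
# Leg distances (miles) of the sample route in the problem description.
SAMPLE_LEGS = [130, 200, 180, 390]


def round_trip_miles(legs):
    """Round-trip miles are the sum of every leg actually flown."""
    return sum(legs)


def distance_charge(legs, charge_per_mile):
    """Distance charge = round-trip miles multiplied by the model's charge per mile."""
    return round_trip_miles(legs) * charge_per_mile
```

For example, `round_trip_miles(SAMPLE_LEGS)` gives 900, and with an assumed charge of 2 per mile `distance_charge(SAMPLE_LEGS, 2)` gives 1800. In a database design, the charge per mile would be an attribute of the aircraft model rather than of the individual charter trip.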


FIGURE P5.1 Round-trip mile determination
[Diagram: a route linking Home Base, Pax Pickup, Intermediate Stop and Destination, with legs of 130, 200, 180 and 390 miles.]

Depending on whether a customer has RC_Charter2 credit authorisation, he or she may:

Pay the entire charter bill upon the completion of the charter flight.
Pay a part of the charter bill and charge the remainder to the account. The charge amount may not exceed the available credit.
Charge the entire charter bill to the account. The charge amount may not exceed the available credit.
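The three payment options share one rule: whatever is charged to the account may not exceed the available credit. A minimal sketch of that rule follows; the function name, parameter names and error messages are assumptions made for illustration only.

```python
def amount_charged_to_account(bill, paid_now, available_credit):
    """Return the part of the bill charged to the customer's account.

    paid_now == bill  -> option 1 (pay the entire bill)
    0 < paid_now < bill -> option 2 (pay part, charge the remainder)
    paid_now == 0     -> option 3 (charge the entire bill)
    """
    if paid_now < 0 or paid_now > bill:
        raise ValueError("payment must be between 0 and the bill total")
    charged = bill - paid_now
    if charged > available_credit:
        raise ValueError("charge amount exceeds the available credit")
    return charged


print(amount_charged_to_account(1000, 1000, 0))    # nothing is charged
print(amount_charged_to_account(1000, 400, 600))   # remainder is charged
```

A customer without credit authorisation can be modelled here as having an available credit of zero, which leaves paying the entire bill as the only valid option.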

Customers may pay all or part of the existing balance for previous charter trips. Such payments may be made at any time and are not necessarily tied to a specific charter trip. The charter mileage charge includes the expense of the pilot(s) and other crew required by FAR 135. However, if customers request additional crew not required by FAR 135, those customers are charged for the crew members on an hourly basis. The hourly crew-member charge is based on each crew member's qualifications.

The database must be able to handle crew assignment. Each charter trip requires the use of an aircraft, and a crew flies each aircraft. The smaller piston engine-powered charter aircraft require a crew consisting of only a single pilot. Larger aircraft (that is, aircraft having a gross takeoff weight of 5 500 kg or more) and jet-powered aircraft require a pilot and a copilot, while some of the larger aircraft used to transport passengers may require flight attendants as part of the crew. Some of the older aircraft require the assignment of a flight engineer, and larger cargo-carrying aircraft require the assignment of a loadmaster. In short, a crew can consist of more than one person, and not all crew members are pilots.

The charter flight's aircraft waiting charges are computed by multiplying the hours waited by the model's hourly waiting charge. Crew expenses are limited to meals and overnight expenses such as hotel/motel and ground transportation.


The RC_Charter2 database must be designed to generate a monthly summary of all charter trips, expenses and revenues derived from the charter records. Such records are based on the data that each pilot in command is required to record for each charter trip: trip date(s) and time(s), destination(s), aircraft number, pilot (and other crew) data, distance flown, fuel usage and other data pertinent to the charter flight. Such charter data are then used to generate monthly reports that detail revenue and operating cost information for customers, aircraft and pilots. All pilots and other crew members are RC_Charter2 Company employees; that is, the company does not use contract pilots and crew.

FAR Part 135 operations are conducted under a strict set of requirements that govern the licensing and training of crew members. For example, pilots must have earned either a commercial licence or an airline transport pilot (ATP) licence. Both licences require appropriate ratings. Ratings are specific competency requirements. For example:

To operate a multi-engine aircraft designed for takeoffs and landings on land only, the appropriate rating is MEL, or multi-engine landplane. When a multi-engine aircraft can take off and land on water, the appropriate rating is MES, or multi-engine seaplane.
The instrument rating is based on a demonstrated ability to conduct all flight operations with sole reference to cockpit instrumentation. The instrument rating is required to operate an aircraft under Instrument Meteorological Conditions (IMC), and all such operations are governed under FAR-specified Instrument Flight Rules (IFR). In contrast, operations conducted under good weather or visual flight conditions are based on the FAR Visual Flight Rules (VFR).
The type rating is required for all aircraft with a takeoff weight of more than 5 500 kg or for aircraft that are purely jet-powered. If an aircraft uses jet engines to drive propellers, that aircraft is said to be turboprop-powered. A turboprop (that is, a turbo propeller-powered aircraft) does not require a type rating unless it meets the 5 500 kg weight limitation.

Although pilot licences and ratings are not time-limited, exercising the privilege of the licence and ratings under Part 135 requires both a current medical certificate and a current Part 135 checkride. The following distinctions are important:

The medical certificate may be Class I or Class II. The Class I medical is more stringent than the Class II, and it must be renewed every six months. The Class II medical must be renewed yearly. If the Class I medical is not renewed during the six-month period, it automatically reverts to a Class II certificate. If the Class II medical is not renewed within the specified period, it automatically reverts to a Class III medical, which is not valid for commercial flight operations.
A Part 135 checkride is a practical flight examination that must be successfully completed every six months. The checkride includes all flight manoeuvres and procedures specified in Part 135.
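The medical certificate reversion rules lend themselves to a small worked example. The sketch below is one possible reading of those rules, under stated assumptions: all periods are measured in whole months from the original issue date, a lapsed Class I certificate counts as Class II until twelve months have passed, and the function names are invented for the illustration.

```python
from datetime import date


def months_between(start, end):
    """Whole months elapsed from start to end (simplified calendar maths)."""
    months = (end.year - start.year) * 12 + (end.month - start.month)
    if end.day < start.day:
        months -= 1
    return months


def effective_medical_class(issued_class, issued_on, today):
    """Class currently in force for a certificate issued as Class 1 or 2."""
    elapsed = months_between(issued_on, today)
    if issued_class == 1 and elapsed < 6:
        return 1          # Class I is current for six months
    if issued_class in (1, 2) and elapsed < 12:
        return 2          # lapsed Class I, or a current Class II
    return 3              # not valid for commercial flight operations


def valid_for_commercial_ops(issued_class, issued_on, today):
    return effective_medical_class(issued_class, issued_on, today) in (1, 2)


issued = date(2019, 1, 15)
print(effective_medical_class(1, issued, date(2019, 6, 14)))  # still Class I
print(effective_medical_class(1, issued, date(2019, 7, 15)))  # reverted to Class II
print(valid_for_commercial_ops(1, issued, date(2020, 1, 15)))
```

In a database design, the same logic suggests storing the examination date and the class issued, and deriving the class in force, rather than storing a status that silently goes stale.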

Non-pilot crew members must also have the proper certificates in order to meet specific job requirements. For example, loadmasters need an appropriate certificate, as do flight attendants. In addition, crew members such as loadmasters and flight attendants, who may be required in operations that involve large aircraft (more than a 5 500 kg takeoff weight and passenger numbers over 19), are also required periodically to pass a written and practical exam. The RC_Charter2 Company is required to keep a complete record of all test types, dates and results for each crew member, as well as pilot medical certificate examination dates.

In addition, all flight crew members are required to submit to periodic drug testing; the results must be tracked, too. (Note that non-pilot crew members are not required to take pilot-specific tests such as Part 135 checkrides. Nor are pilots required to take crew tests such as loadmaster and flight attendant practical exams.) However, many crew members have licences and/or certifications in several areas. For example, a pilot may have an ATP and a loadmaster certificate. If that pilot is assigned to be a loadmaster on a given charter flight, the loadmaster certificate is required. Similarly, a flight attendant may have earned a commercial pilot's licence. Sample data formats are shown in the table below.

Part A Tests
Test code   Test description             Test frequency
1           Part 135 Flight Check        6 months
2           Medical, Class 1             6 months
3           Medical, Class 2             12 months
4           Loadmaster Practical         12 months
5           Flight Attendant Practical   12 months
6           Drug test                    Random
7           Operations, written exam     6 months

Part B Results
Employee   Test code   Test date    Test result
101        1           12-Nov-18    Pass-1
103        6           23-Dec-18    Pass-1
112        4           23-Dec-18    Pass-2
103        7           11-Jan-19    Pass-1
112        7           16-Jan-19    Pass-1
101        7           16-Jan-19    Pass-1
101        6           11-Feb-19    Pass-2
125        2           15-Feb-19    Pass-1

Part C Licences and Certificates
Licence or Certificate   Licence or Certificate Description
ATP                      Airline Transport Pilot
Comm                     Commercial licence
Med-1                    Medical certificate, class 1
Med-2                    Medical certificate, class 2
Instr                    Instrument rating
MEL                      Multi-engine land aircraft rating
LM                       Loadmaster
FA                       Flight attendant

Part D Licences and Certificates Held
Employee   Licence or Certificate   Date earned
101        Comm                     12-Nov-03
101        Instr                    28-Jun-04
101        MEL                      9-Aug-04
103        Comm                     21-Dec-05
112        FA                       23-Jun-12
103        Instr                    18-Jan-06
112        LM                       27-Nov-15

Pilots and other crew members must receive recurrency training appropriate to their work assignments. Recurrency training is based on an FAA-approved curriculum that is job-specific. For example, pilot recurrency training includes a review of all applicable Part 135 flight rules and regulations, weather data interpretation, company flight operations requirements and specified flight procedures. The RC_Charter2 Company is required to keep a complete record of all recurrency training for each crew member subject to the training.

The RC_Charter2 Company is required to maintain a detailed record of all crew credentials and all training mandated by Part 135. The company must keep a complete record of each requirement and of all compliance data.

To conduct a charter flight, the company must have a properly maintained aircraft available. A pilot who meets all of the FAA's licensing and currency requirements must fly the aircraft as Pilot in Command (PIC). For those aircraft that are powered by piston engines or turboprops and have a gross takeoff weight under 5 500 kg, single-pilot operations are permitted under Part 135 as long as a properly maintained autopilot is available. However, even if FAR Part 135 permits single-pilot operations, many customers require the presence of a copilot who is capable of conducting the flight operations under Part 135.

The RC_Charter2 operations manager anticipates the lease of turbojet-powered aircraft, and those aircraft are required to have a crew consisting of a pilot and copilot. Both pilot and copilot must meet the same Part 135 licensing, ratings and training requirements.

The company also leases larger aircraft that exceed the 5 500 kg gross takeoff weight. Those aircraft can carry the number of passengers that requires the presence of one or more flight attendants. If those aircraft carry cargo weighing over 5 500 kg, a loadmaster must be assigned as a crew member to supervise the loading and securing of the cargo. The database must be designed to meet the anticipated additional charter crew assignment capability.

a Using this incomplete description of operations, write all applicable business rules to establish entities, relationships, optionalities and multiplicities. (Hint: Use the following five business rules as examples, writing the remaining business rules in the same format.)

A customer may request many charter trips.
Each charter trip is requested by only one customer.
Some customers have not (yet) requested a charter trip.
An employee may be assigned to serve as a crew member on many charter trips.
Each charter trip may have many employees assigned to it to serve as crew members.

b Draw the fully labelled and implementable ERD based on the business rules you wrote in Part a of this problem. Include all entities, relationships, optionalities and multiplicities.

CHAPTER 6 Data Modelling Advanced Concepts

IN THIS CHAPTER, YOU WILL LEARN:

About the extended entity relationship (EER) model's main constructs
How entity clusters are used to represent multiple entities and relationships
The characteristics of good primary keys and how to select them
How to use flexible solutions for special data modelling cases

PREVIEW

In the previous chapters, you learnt how to use entity relationship diagrams (ERDs) to properly create a data model. In this chapter, you will learn about the extended entity relationship (EER) model. The EER model builds on ER concepts and adds support for entity supertypes, subtypes and entity clustering.

Most current database implementations are based on relational databases. As the relational model uses keys to create associations among tables, it is essential to learn the characteristics of good primary keys and how to select them. Primary key selection is too important to be left to chance, which is why this chapter covers critical aspects of primary key identification and placement.

Focusing on practical database design, this chapter also illustrates some special design cases that highlight the importance of flexible designs, proper identification of primary keys and placement of foreign keys. (Flexible designs are designs that can be adapted to meet the demands of changing data and information requirements.) Data modelling is a vital step in the development of databases, providing a good foundation for successful application development. (You should know the mantra: good database applications cannot be based on bad database designs, and no amount of outstanding coding can overcome the limitations of poor database design.) To help you carry out data modelling tasks, this chapter concludes with a data modelling checklist that outlines basic database modelling principles.

6.1 THE EXTENDED ENTITY RELATIONSHIP MODEL

As the complexity of the data structures being modelled has increased, and as application software requirements have become more stringent, there has been an increasing need to capture more information in the data model. The extended entity relationship model (EERM), sometimes referred to as the enhanced entity relationship model, is the result of adding more semantic constructs to the original entity relationship (ER) model. As you might expect, a diagram using this model is called an EER diagram (EERD). In the following sections, you will learn about the main EER model constructs (entity supertypes, entity subtypes and entity clustering) and see how they are represented in ERDs. Following on from Chapter 5, Data Modelling with Entity Relationship Diagrams, this chapter will use UML notation to produce EER diagrams.

6.1.1 Entity Supertypes and Subtypes

In the real world, most businesses employ people with a wide range of skills and special qualifications. In fact, data modellers find many ways to group employees based on employee characteristics. For instance, a retail company would group employees as salaried and hourly employees, while a university would group employees as faculty, staff and administrators. The grouping of employees to create various types of employees provides two important benefits:

It avoids unnecessary nulls in the employee attributes when some employees have characteristics that are not shared by other employees.
It enables a particular employee type to participate in relationships that are unique to that employee type.

To illustrate those benefits, let's explore the case of an aviation business. The aviation business employs pilots, mechanics, secretaries, accountants, database managers and many other types of employees. Figure 6.1 illustrates how pilots share certain characteristics with other employees, such as a last name (EMP_LNAME) and hire date (EMP_HIRE_DATE). On the other hand, many pilot characteristics are not shared by other employees. For example, unlike other employees, pilots must meet special requirements such as flight hour restrictions, flight checks and periodic training. Therefore, if all employee characteristics and special qualifications were stored in a single EMPLOYEE entity, you would have a lot of nulls or you would have to make a lot of needless dummy entries.

FIGURE 6.1 Nulls created by unique attributes

EMP_NUM  EMP_LNAME  EMP_FNAME  EMP_INITIAL  EMP_LICENCE  EMP_RATINGS             EMP_MED_TYPE  EMP_HIRE_DATE
100      Nkosi      Cela       T                                                               15-Mar-98
101      Lewis      Marcos                  ATP          SEL/MEL/Instr/CFII      1             25-Apr-99
102      Vandam     Jean                                                                       20-Dec-03
103      Jones      Victoria   R                                                               28-Aug-13
104      Lange      Edith                   ATP          SEL/MEL/Instr           1             20-Oct-07
105      Williams   Gabriel    U            COM          SEL/MEL/Instr/CFI       2             08-Nov-07
106      Naidu      Theeban                 COM          SEL/MEL/Instr           2             05-Jan-14
107      Diante     Venite     L                                                               02-Jul-07
108      Shenge     Mhambi                                                                     18-Nov-05
109      Travis     Brett      T            COM          SEL/MEL/SES/Instr/CFII  1             14-Apr-11
110      Genkazi    Stan                                                                       01-Dec-13

In this case, special pilot characteristics such as EMP_LICENCE, EMP_RATINGS and EMP_MED_TYPE will generate nulls for employees who are not pilots. In addition, pilots participate in some relationships that are unique to their qualifications. For example, not all employees can fly aircraft; only employees who are pilots can participate in the "employee flies aircraft" relationship.

Based on the preceding discussion, you would correctly deduce that the PILOT entity stores only those attributes that are unique to pilots and that the EMPLOYEE entity stores attributes that are common to all employees. Based on that hierarchy, you can conclude that PILOT is a subtype of EMPLOYEE and that EMPLOYEE is the supertype of PILOT. In modelling terms, an entity supertype is a generic entity type that is related to one or more entity subtypes, where the entity supertype contains the common characteristics and the entity subtypes contain the unique characteristics of each entity subtype. In the next section, you will learn how the entity supertypes and subtypes are related in a specialisation hierarchy.

6.1.2 Specialisation Hierarchy

Entity supertypes and subtypes are organised in a specialisation hierarchy. The specialisation hierarchy depicts the arrangement of higher-level entity supertypes (parent entities) and lower-level entity subtypes (child entities). Figure 6.2 shows the specialisation hierarchy formed by an EMPLOYEE supertype and three entity subtypes: PILOT, MECHANIC and ACCOUNTANT. The specialisation hierarchy reflects the 1:1 relationship between EMPLOYEE and its subtypes. For example, a PILOT subtype occurrence is related to one instance of the EMPLOYEE supertype, and a MECHANIC subtype occurrence is related to one instance of the EMPLOYEE supertype. The terminology and symbols in Figure 6.2 will be explained throughout this chapter.

NOTE
In UML notation, subtypes and supertypes are known as subclasses and superclasses. A UML class hierarchy resembles an upside-down tree in which each child class (subclass) has only one parent class (superclass). In this chapter, we will continue to use the terms supertype and subtype in all discussions.

The relationships depicted within the specialisation hierarchy are sometimes described in terms of IS-A relationships. For example, a pilot is an employee, a mechanic is an employee and an accountant is an employee. It is important to understand that, within a specialisation hierarchy, a subtype can exist only within the context of a supertype, and every subtype can have only one supertype to which it is directly related. However, a specialisation hierarchy can have many levels of supertype/subtype relationships; that is, you can have a specialisation hierarchy in which a supertype has many subtypes and, in turn, one of the subtypes is the supertype to other lower-level subtypes.

FIGURE 6.2 A specialisation hierarchy
[Diagram: the EMPLOYEE supertype, with attributes EMP_NUM {PK}, EMP_LNAME, EMP_FNAME, EMP_INITIAL, EMP_HIRE_DATE and EMP_TYPE, which are inherited by all subtypes. EMPLOYEE has a relationship, also inherited, with DEPENDANT (EMP_NUM {PK} {FK1}, DPNT_NUM {PK}, DPNT_FNAME, DPNT_LNAME, DPNT_RELATION). The subtype discriminator is EMP_TYPE, annotated {OR, Optional}: OR is an example of a disjoint constraint and Optional is an example of a participation constraint. The subtypes and the attributes unique to them are PILOT (PIL_LICENCE, PIL_RATINGS, PIL_MED_TYPE), MECHANIC (MEC_TITLE, MEC_CERT) and ACCOUNTANT (ACT_TITLE, ACT_CPA_DATE).]

Online Content
This chapter covers only specialisation hierarchies. The EER model also supports specialisation lattices, where a subtype can have multiple parents (supertypes). However, those concepts are better covered under the object-oriented model in Appendix G, Object-Orientated Databases. The appendix is available on the online platform for this book.

As you can see in Figure 6.2, specialisation hierarchies enable the data model to capture additional semantic content (meaning) into the ERD. A specialisation hierarchy provides the means to:

Support attribute inheritance.
Define a special supertype attribute known as the subtype discriminator.
Define disjoint/overlapping constraints and complete/partial constraints.

The following sections will cover such characteristics and constraints in more detail.

6.1.3 Inheritance

The property of inheritance enables an entity subtype to inherit the attributes and relationships of the supertype. As discussed earlier, a supertype contains those attributes that are common to all of its subtypes. In contrast, subtypes contain only the attributes that are unique to the subtype. For example, Figure 6.2 illustrates that pilots, mechanics and accountants all inherit the employee number, last name, first name, middle initial, hire date and so on from the EMPLOYEE entity. However, Figure 6.2 also illustrates that pilots have attributes that are unique; the same is true for mechanics and accountants. One important inheritance characteristic is that all entity subtypes inherit their primary key attribute from their supertype. Note in Figure 6.2 that the EMP_NUM attribute is the primary key for each of the subtypes, but it is not shown in the subtype.

At the implementation level, the supertype and its subtype(s) depicted in the specialisation hierarchy maintain a 1:1 relationship. For example, the specialisation hierarchy lets you replace the undesirable EMPLOYEE table structure in Figure 6.1 with two tables, one representing the supertype EMPLOYEE and the other representing the subtype PILOT. (See Figure 6.3.)

FIGURE 6.3 The EMPLOYEE-PILOT supertype-subtype relationship

Table name: EMPLOYEE
EMP_NUM  EMP_LNAME  EMP_FNAME  EMP_INITIAL  EMP_HIRE_DATE  EMP_TYPE
100      Nkosi      Cela       T            15-Mar-98
101      Lewis      Marcos                  25-Apr-99      P
102      Vandam     Jean                    20-Dec-03      A
103      Jones      Victoria   R            28-Aug-13
104      Lange      Edith                   20-Oct-07      P
105      Williams   Gabriel    U            08-Nov-07      P
106      Naidu      Theeban                 05-Jan-14      P
107      Diante     Venite     L            02-Jul-07      M
108      Shengi     Mhambi                  18-Nov-05      M
109      Travis     Brett      T            14-Apr-11      P
110      Genkazi    Stan                    01-Dec-13      A

Table name: PILOT
EMP_NUM  PIL_LICENCE  PIL_RATINGS             PIL_MED_TYPE
101      ATP          SEL/MEL/Instr/CFII      1
104      ATP          SEL/MEL/Instr           1
105      COM          SEL/MEL/Instr/CFI       2
106      COM          SEL/MEL/Instr           2
109      COM          SEL/MEL/SES/Instr/CFII  1

Entity subtypes inherit all relationships in which the supertype entity participates. For example, Figure 6.2 shows the EMPLOYEE entity supertype participating in a 1:* relationship with a DEPENDANT entity. Through inheritance, all subtypes are also able to participate in that relationship. In specialisation hierarchies with multiple levels of supertype/subtypes, a lower-level subtype inherits all of the attributes and relationships from all of its upper-level supertypes.
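The 1:1 supertype-subtype implementation in Figure 6.3 can be tried out with Python's standard sqlite3 module. The sketch below is an illustration under assumptions: the data types, the NOT NULL choices and the helper function name are not taken from the chapter, and only a few of the figure's rows are loaded. The key point is that PILOT reuses EMP_NUM as both its primary key and a foreign key to EMPLOYEE.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")   # SQLite enforces FKs only on request
conn.executescript("""
CREATE TABLE EMPLOYEE (
    EMP_NUM       INTEGER PRIMARY KEY,
    EMP_LNAME     TEXT NOT NULL,
    EMP_FNAME     TEXT NOT NULL,
    EMP_INITIAL   TEXT,
    EMP_HIRE_DATE TEXT NOT NULL,
    EMP_TYPE      TEXT                     -- subtype discriminator
);
CREATE TABLE PILOT (
    -- Inherited primary key: same value as the supertype row (1:1).
    EMP_NUM      INTEGER PRIMARY KEY REFERENCES EMPLOYEE (EMP_NUM),
    PIL_LICENCE  TEXT NOT NULL,
    PIL_RATINGS  TEXT NOT NULL,
    PIL_MED_TYPE INTEGER NOT NULL
);
""")

conn.executemany("INSERT INTO EMPLOYEE VALUES (?, ?, ?, ?, ?, ?)", [
    (100, "Nkosi", "Cela", "T", "15-Mar-98", None),
    (101, "Lewis", "Marcos", None, "25-Apr-99", "P"),
    (102, "Vandam", "Jean", None, "20-Dec-03", "A"),
    (104, "Lange", "Edith", None, "20-Oct-07", "P"),
])
conn.executemany("INSERT INTO PILOT VALUES (?, ?, ?, ?)", [
    (101, "ATP", "SEL/MEL/Instr/CFII", 1),
    (104, "ATP", "SEL/MEL/Instr", 1),
])


def pilots_with_employee_data():
    """Join each PILOT row to the attributes it inherits from EMPLOYEE."""
    return conn.execute(
        """SELECT e.EMP_NUM, e.EMP_LNAME, e.EMP_HIRE_DATE, p.PIL_LICENCE
           FROM PILOT AS p
           JOIN EMPLOYEE AS e ON e.EMP_NUM = p.EMP_NUM
           ORDER BY e.EMP_NUM"""
    ).fetchall()


for row in pilots_with_employee_data():
    print(row)
```

With this layout, the pilot-only columns never hold nulls for non-pilots, and the foreign key stops a PILOT row from existing without its EMPLOYEE row.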

6.1.4 Subtype Discriminator

A subtype discriminator is the attribute in the supertype entity that determines to which entity subtype each supertype occurrence is related. As seen in Figure 6.2, the subtype discriminator is the employee type (EMP_TYPE).

It is common practice to show the subtype discriminator and its value for each subtype in the ER diagram, as seen in Figure 6.2. However, not all ER modelling tools follow that practice. In Figure 6.2, the discriminator was added in MS Visio by using the UML generalisation properties.

It's important to note that the default comparison condition for the subtype discriminator attribute is the equality comparison. However, there may be situations in which the subtype discriminator is not necessarily based on an equality comparison. For example, based on business requirements, you may create two new pilot subtypes, PIC (pilot-in-command) qualified and copilot qualified only. A PIC-qualified pilot will be anyone with more than 1 500 PIC flight hours. In this case, the subtype discriminator would be Flight_Hours and the criteria would be > 1 500 or <= 1 500, respectively.
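The two kinds of discriminator test can be contrasted in a short sketch. The names below are assumptions made for illustration; the codes P, M and A are the EMP_TYPE values that appear in Figure 6.3, and their mapping to subtypes is inferred from that figure.

```python
# Equality comparison: the discriminator value names the subtype directly.
EMP_TYPE_TO_SUBTYPE = {"P": "PILOT", "M": "MECHANIC", "A": "ACCOUNTANT"}


def employee_subtype(emp_type):
    """Return the subtype for an EMP_TYPE code, or None if there is none."""
    return EMP_TYPE_TO_SUBTYPE.get(emp_type)


# Non-equality comparison: the subtype depends on a range of Flight_Hours.
PIC_HOURS_THRESHOLD = 1500


def pilot_subtype(flight_hours):
    """More than 1 500 PIC hours -> PIC qualified, otherwise copilot only."""
    if flight_hours > PIC_HOURS_THRESHOLD:
        return "PIC_QUALIFIED"
    return "COPILOT_QUALIFIED_ONLY"


print(employee_subtype("P"))
print(pilot_subtype(1500))   # exactly 1 500 hours is not "more than 1 500"
```

Note the boundary: a pilot with exactly 1 500 hours falls into the copilot-only subtype, which is why the second criterion is written as less than or equal to 1 500.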

NOTE
When creating a specialisation hierarchy using UML notation in MS Visio, you should use the generalisation object to connect the subtype entity to the supertype entity. The subtype discriminator is typed into the field called discriminator through the UML generalisation properties box.

Online Content
For a tutorial on using MS Visio to create a specialisation hierarchy, see Appendix A, Designing Databases with Visio Professional: A Tutorial, available on the online platform for this book.

6.1.5 Disjoint and Overlapping Constraints

An entity supertype can have disjoint or overlapping entity subtypes. For example, in the aviation example, an employee can be a pilot or a mechanic or an accountant. Assume that one of the business rules dictates that an employee cannot belong to more than one subtype at a time; that is, an employee cannot be a pilot and a mechanic at the same time. Disjoint subtypes, also known as non-overlapping subtypes, are subtypes that contain a unique subset of the supertype entity set; in other words, each entity instance of the supertype can appear in only one of the subtypes. When using UML notation, a disjoint relationship is represented by an OR, and an overlapping constraint is represented by an AND.

For example, in Figure 6.2, an employee (supertype) who is a pilot (subtype) can appear only in the PILOT subtype, not in any of the other subtypes. You can see that when using MS Visio to produce ERDs using UML notation, the disjoint subtype is indicated by placing the word OR in brackets. On the other hand, if the business rule specifies that employees can have multiple classifications, the EMPLOYEE supertype may contain overlapping job classification subtypes. Overlapping or non-disjoint subtypes are subtypes that contain non-unique subsets of the supertype entity set; that is, each entity instance of the supertype may appear in more than one subtype. For example, in a university environment, a person may be an employee or a student or both. In turn, an employee may be a lecturer as well as an administrator. Because an employee also may be a student, STUDENT and EMPLOYEE are overlapping subtypes of the supertype PERSON, just as LECTURER and ADMINISTRATOR are overlapping subtypes of the supertype EMPLOYEE. Figure 6.4 illustrates how these overlapping subtypes are represented in UML notation by placing the word AND in brackets.

FIGURE 6.4 Specialisation hierarchy with overlapping subtypes

It is common practice to show the disjoint/overlapping constraints in the ERD. (See Figure 6.2 and Figure 6.4.) However, not all ER modelling tools follow that practice. For example, when using UML notation, MS Visio does not show the disjoint/overlapping constraints. Therefore, the MS Visio text tool was used to manually add the OR and AND symbols in Figures 6.2 and 6.4.

NOTE
Alternative notations exist for representing disjoint/overlapping subtypes. For example, Toby J. Teorey popularised the use of G and Gs to indicate disjoint and overlapping subtypes.

As you learnt earlier in this section, the implementation of disjoint subtypes is based on the value of the subtype discriminator attribute in the supertype. However, implementing overlapping subtypes requires the use of one discriminator attribute for each subtype. For example, in the case of the Tiny University database design you saw in Chapter 5, Data Modelling with Entity Relationship Diagrams, a lecturer can also be an administrator. Therefore, the EMPLOYEE supertype would have the subtype discriminator attributes and values shown in Table 6.1.
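As a minimal sketch of the one-discriminator-per-subtype idea, the following Python example uses the standard sqlite3 module to build an EMPLOYEE table with one Y/N flag for each overlapping subtype. The column names EMP_IS_LECTURER and EMP_IS_ADMIN, the helper subtypes_of and the sample rows are assumptions made for illustration; they are not taken from the Tiny University design.

```python
import sqlite3


def build_employee_table():
    """Create an in-memory EMPLOYEE table with one discriminator per subtype."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        """
        CREATE TABLE EMPLOYEE (
            EMP_NUM         INTEGER PRIMARY KEY,
            EMP_LNAME       TEXT NOT NULL,
            -- One flag per overlapping subtype (hypothetical column names).
            EMP_IS_LECTURER TEXT NOT NULL CHECK (EMP_IS_LECTURER IN ('Y', 'N')),
            EMP_IS_ADMIN    TEXT NOT NULL CHECK (EMP_IS_ADMIN IN ('Y', 'N'))
        )
        """
    )
    # Illustrative rows only: lecturer, administrator, and both.
    conn.executemany(
        "INSERT INTO EMPLOYEE VALUES (?, ?, ?, ?)",
        [
            (101, "Ndlovu", "Y", "N"),
            (102, "Naidoo", "N", "Y"),
            (103, "Adams", "Y", "Y"),
        ],
    )
    return conn


def subtypes_of(conn, emp_num):
    """Return the names of the subtypes that the given employee belongs to."""
    row = conn.execute(
        "SELECT EMP_IS_LECTURER, EMP_IS_ADMIN FROM EMPLOYEE WHERE EMP_NUM = ?",
        (emp_num,),
    ).fetchone()
    names = ("LECTURER", "ADMINISTRATOR")
    return [name for name, flag in zip(names, row) if flag == "Y"]


conn = build_employee_table()
print(subtypes_of(conn, 103))
```

With a single discriminator attribute, employee 103 could be recorded in only one subtype; separate flags let the same row belong to both.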


TABLE 6.1 Discriminator attributes with overlapping subtypes

Discriminator Attributes
Lecturer   Administrator   Comment
Y          N               The Employee is a member of the Lecturer subtype.
N          Y               The Employee is a member of the Administrator subtype.
Y          Y               The Employee is both a Lecturer and an Administrator.

6.1.6 Completeness Constraint

The completeness constraint specifies whether each entity supertype occurrence must also be a member of at least one subtype. The completeness constraint can be partial or total. Partial completeness means that not every supertype occurrence is a member of a subtype; that is, there may be some supertype occurrences that are not members of any subtype. Total completeness means that every supertype occurrence must be a member of at least one subtype.
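One way to see the difference between partial and total completeness is to query for supertype occurrences that have no matching subtype row. The sketch below assumes a simplified aviation schema with only PILOT and MECHANIC subtype tables; the table layout, the sample rows and the function name employees_without_subtype are illustrative assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE EMPLOYEE (EMP_NUM INTEGER PRIMARY KEY, EMP_TYPE TEXT);
    CREATE TABLE PILOT (
        EMP_NUM INTEGER PRIMARY KEY REFERENCES EMPLOYEE (EMP_NUM)
    );
    CREATE TABLE MECHANIC (
        EMP_NUM INTEGER PRIMARY KEY REFERENCES EMPLOYEE (EMP_NUM)
    );
    -- Employee 3 has a null discriminator and no subtype row.
    INSERT INTO EMPLOYEE VALUES (1, 'P'), (2, 'M'), (3, NULL);
    INSERT INTO PILOT VALUES (1);
    INSERT INTO MECHANIC VALUES (2);
    """
)


def employees_without_subtype(conn):
    """List supertype occurrences that are not members of any subtype."""
    rows = conn.execute(
        """
        SELECT E.EMP_NUM
        FROM EMPLOYEE E
        WHERE NOT EXISTS (SELECT 1 FROM PILOT P WHERE P.EMP_NUM = E.EMP_NUM)
          AND NOT EXISTS (SELECT 1 FROM MECHANIC M WHERE M.EMP_NUM = E.EMP_NUM)
        ORDER BY E.EMP_NUM
        """
    ).fetchall()
    return [row[0] for row in rows]


orphans = employees_without_subtype(conn)
print(orphans)
```

Under partial completeness a non-empty result is acceptable; under total completeness the same query would be expected to return no rows.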

NOTE
Alternative notations exist to represent the completeness constraint. For example, when using Crow's Foot notation, the completeness constraint is represented based on the MS Visio category shape. A single horizontal line under the circle represents a partial completeness constraint; a double horizontal line under the circle represents a total completeness constraint. In UML, the completeness constraint as described above is often referred to as the participation constraint. Partial completeness is represented by Optional participation, whilst total completeness is represented by Mandatory participation. This representation is shown in brackets, as can be seen in Figures 6.2 and 6.4.

Given the disjoint/overlapping subtypes and completeness constraints, it is possible to have the specialisation hierarchy constraint scenarios shown in Table 6.2.

TABLE 6.2 Specialisation hierarchy constraint scenarios

Type: Partial {Optional}
  Disjoint Constraint {OR}: Supertype has optional subtypes. Subtype discriminator can be null. Subtype sets are unique.
  Overlapping Constraint {AND}: Supertype has optional subtypes. Subtype discriminators can be null. Subtype sets are not unique.

Type: Total {Mandatory}
  Disjoint Constraint {OR}: Every supertype instance is a member of a (at least one) subtype. Subtype discriminator cannot be null. Subtype sets are unique.
  Overlapping Constraint {AND}: Every supertype instance is a member of a (at least one) subtype. Subtype discriminators cannot be null. Subtype sets are not unique.

6.1.7 Specialisation and Generalisation

You can use different approaches to develop entity supertypes and subtypes. For example, you can first identify a regular entity, and then identify all entity subtypes based on their distinguishing characteristics. You also can start by identifying multiple entity types and then later extract the common characteristics of those entities and create a higher-level supertype entity. Specialisation is the top-down process of identifying lower-level, more specific entity subtypes from a higher-level entity supertype. Specialisation is based on grouping unique characteristics and relationships of the subtypes. In the aviation example, you used specialisation to identify multiple entity subtypes from the original employee supertype. Generalisation is the bottom-up process of identifying a higher-level, more generic entity supertype from lower-level entity subtypes. Generalisation is based on grouping common characteristics and relationships of the subtypes. For example, you might identify multiple types of musical instruments: piano, violin and guitar. Using the generalisation approach, you could identify a string instrument entity supertype to hold the common characteristics of the multiple subtypes.

6.1.8 Composition and Aggregation

So far we have looked at how to model relationships between entities using IS-A relationships. Suppose we have two entities, one called DEPARTMENT and one called UNIVERSITY. The relationship between the two could be described as a part_of or has_a relationship, as the DEPARTMENT entity is a part_of the UNIVERSITY entity. This type of relationship is known as aggregation, whereby a larger entity can be composed of smaller entities. A special case of aggregation is known as composition. This is a much stronger relationship than aggregation, since when the parent entity instance is deleted, all child entity instances are automatically deleted. Consider the two entities BUILDING and ROOM, where BUILDING is the parent entity and ROOM is the child entity. A ROOM is part_of a BUILDING and if the building was destroyed then all the rooms would also be destroyed.
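The different delete behaviour of the two constructs can be sketched with foreign key actions. The example below is only one possible relational mapping, and it rests on assumptions: aggregation is approximated with ON DELETE SET NULL (the child survives and loses its parent reference), composition with ON DELETE CASCADE, and all column names are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite enforces foreign keys only when this pragma is switched on.
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript(
    """
    -- Aggregation (assumed mapping): a DEPARTMENT can outlive its UNIVERSITY.
    CREATE TABLE UNIVERSITY (UNI_ID INTEGER PRIMARY KEY);
    CREATE TABLE DEPARTMENT (
        DEPT_ID INTEGER PRIMARY KEY,
        UNI_ID  INTEGER REFERENCES UNIVERSITY (UNI_ID) ON DELETE SET NULL
    );

    -- Composition (assumed mapping): a ROOM cannot exist without its BUILDING.
    CREATE TABLE BUILDING (BLDG_ID INTEGER PRIMARY KEY);
    CREATE TABLE ROOM (
        BLDG_ID  INTEGER NOT NULL
                 REFERENCES BUILDING (BLDG_ID) ON DELETE CASCADE,
        ROOM_NUM INTEGER NOT NULL,
        PRIMARY KEY (BLDG_ID, ROOM_NUM)
    );

    INSERT INTO UNIVERSITY VALUES (1);
    INSERT INTO DEPARTMENT VALUES (10, 1), (20, 1);
    INSERT INTO BUILDING VALUES (1);
    INSERT INTO ROOM VALUES (1, 101), (1, 102);
    """
)

# Delete both parents and see what happens to the children.
conn.execute("DELETE FROM UNIVERSITY WHERE UNI_ID = 1")
conn.execute("DELETE FROM BUILDING WHERE BLDG_ID = 1")

remaining_departments = conn.execute("SELECT COUNT(*) FROM DEPARTMENT").fetchone()[0]
remaining_rooms = conn.execute("SELECT COUNT(*) FROM ROOM").fetchone()[0]
print(remaining_departments, remaining_rooms)
```

Note that ROOM carries the parent key as part of its own primary key, which is how a composition is usually carried into a relational design.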

TABLE 6.3 Aggregations and compositions

UML Construct: Aggregation
UML Symbol: empty diamond
Description: This type of association represents a part_of or has_a relationship (that is, an entity instance is formed as a collection of other entity instances). This type of association indicates that the dependent (child) entity instance has an optional association with the parent entity instance. An aggregation association is represented with an empty diamond in the side of the (parent) entity. When the parent entity instance is deleted, the child entity instances are not deleted.

UML Construct: Composition
UML Symbol: filled diamond
Description: This type of association is a special case of the aggregation association with a strong association between parent and child. This type of association indicates that a dependent (child) entity instance has a mandatory association with a parent entity instance. A composition association is represented with a filled diamond in the side of the parent entity. When the parent entity instance is deleted, all child entity instances are automatically deleted. This is the equivalent of a weak entity in the ER model.


There is no ability to model such relationships using Crow's Foot notation, and the concept of aggregation and composition has been developed through the use of the more contemporary UML class diagrams approach. In particular, the UML standard guides the use of aggregation and composition constructs to indicate the strength of the dependency relationship between the two participating entities. Table 6.3 summarises the main characteristics of the aggregation and composition UML constructs. The use of the aggregation and composition constructs in UML can be classified as follows:

- An aggregation construct is used when an entity is composed of (or is formed by) a collection of other entities, but the entities are independent of each other. That is, the relationship can be classified as a has_a relationship type. For example, a team has many players, or a band has many musicians.

- A composition construct is used when two entities are associated in a strong identifying relationship. That is, deleting the parent deletes the children instances. For example, an invoice contains many invoice lines, or an order contains many order lines.

Examine Figure 6.5 to help you understand the use of aggregation and composition.

FIGURE 6.5 Aggregation and composition

[UML class diagram with two panels.
Aggregation: OWNER (OWNER_ID, OWNER_FNAME, OWNER_LNAME, OWNER_INIT, OWNER_DRIVER_LIC) 1..1, is_owned_by, 0..* CAR (CAR_VIN, OWNER_ID, CAR_YEAR, CAR_BRAND, CAR_MODEL). Deleting an OWNER parent instance does not delete all related CAR children instances.
Composition: INVOICE (INV_NUMBER, INV_DATE, CUS_CODE) 1, contains, 1..* LINE (INV_NUMBER, LINE_NUMBER, P_CODE, LINE_UNITS, LINE_PRICE). Deleting an INVOICE parent instance deletes all related LINE children instances.]

6.2 ENTITY CLUSTERING

Developing an ER diagram entails the discovery of possibly hundreds of entity types and their respective relationships. Generally, the data modeller will develop an initial ERD containing a few entities. As the design approaches completion, the ERD will contain hundreds of entities and relationships that crowd the diagram to the point of making it unreadable and inefficient as a communication tool. In those cases, you can use entity clusters to minimise the number of entities shown in the ERD.

An entity cluster is a virtual entity type used to represent multiple entities and relationships in the ERD. An entity cluster is formed by combining multiple interrelated entities into a single abstract entity object. An entity cluster is considered virtual or abstract in the sense that it is not actually an entity in the final ERD; it is a temporary entity used to represent multiple entities and relationships, with the purpose of simplifying the ERD and thus enhancing its readability.

Figure 6.6 illustrates the use of entity clusters based on the Tiny University example that was first introduced in Chapter 5, Data Modelling with Entity Relationship Diagrams. Note that the ERD contains two entity clusters:

- OFFERING, grouping the COURSE and CLASS entities and relationships.
- LOCATION, grouping the ROOM and BUILDING entities and relationships.

Note also that the ERD in Figure 6.6 does not show attributes for the entities. When using entity clusters, the key attributes of the combined entities are no longer available. Without the key attributes, primary key inheritance rules change. In turn, the change in the inheritance rules can have undesirable consequences, such as changes in relationships (from identifying to non-identifying or vice versa) and the loss of foreign key attributes from some entities. To eliminate those problems, the general rule is to avoid the display of attributes when entity clusters are used.

FIGURE 6.6 Tiny University ERD using entity clusters

[UML class diagram showing the entities SCHOOL, DEPARTMENT, LECTURER, STUDENT and ENROL together with the OFFERING and LOCATION entity clusters, linked by the relationships is_dean_of, operates, employs, chairs, has, offers, is_written_in, is_found_in, teaches and is_used_for.]

6.3 ENTITY INTEGRITY: SELECTING PRIMARY KEYS

Arguably, the most important characteristic of an entity is its primary key (a single attribute or some combination of attributes) used to uniquely identify each entity instance. The primary key's function is to guarantee entity integrity. Furthermore, primary keys and foreign keys work together to implement relationships in the relational model. Therefore, the importance of properly selecting the primary key has a direct bearing on the efficiency and effectiveness of database implementation.

6.3.1 Natural Keys and Primary Keys

The concept of a unique identifier is commonly encountered in the real world. For example, you use class (or section) numbers to register for classes, invoice numbers to identify specific invoices, account numbers to identify credit cards, and so on. Those examples illustrate natural identifiers or keys. A natural key or natural identifier is a real-world, generally accepted identifier used to distinguish (that is, uniquely identify) real-world objects. As its name implies, a natural key is familiar to end users and forms part of their day-to-day business vocabulary. Usually, a data modeller uses a natural identifier as the primary key of the entity being modelled, assuming that the entity has a natural identifier. Generally, most natural keys make acceptable primary key identifiers. However, there are occasions when the entity being modelled does not have a natural primary key or the natural key is not a good primary key. For example, assume an ASSIGNMENT entity composed of the following attributes:

ASSIGNMENT (ASSIGN_DATE, PROJ_NUM, EMP_NUM, ASSIGN_HOURS, ASSIGN_CHG_HOUR, ASSIGN_CHARGE)

Which attribute (or combination of attributes) would make a good primary key? In Chapter 7, Normalising Database Designs, you will learn that trade-offs are often associated with the selection of different combinations of attributes to serve as the primary key for a specific table. You will also learn in Chapter 7 about the use of surrogate keys, which can also be used as a primary key. But what makes a good primary key? The next section gives some basic guidelines for selecting primary keys.

6.3.2 Primary Key Guidelines

A primary key is the attribute or combination of attributes that uniquely identifies entity instances in an entity set. However, can the primary key be based on, say, 12 attributes? And just how long can a primary key be? In previous examples, why was EMP_NUM selected as a primary key of EMPLOYEE and not a combination of EMP_LNAME, EMP_FNAME, EMP_INITIAL and EMP_DOB? Can a single 256-byte text attribute be a good primary key? The answer may depend on whom you ask. There is no single answer to those questions; however, there is a body of practice that database experts have built over the years. This section will examine that body of documented practices.

First, you should understand the function of a primary key. The primary key's main function is to uniquely identify an entity instance or row within a table. In particular, given a primary key value (that is, the determinant), the relational model can determine the value of all dependent attributes that describe the entity. Note that identification and description are separate semantic constructs in the model. The function of the primary key is to guarantee entity integrity, not to describe the entity.


Second, primary keys and foreign keys are used to implement relationships among entities. However, the implementation of such relationships is done mostly behind the scenes, hidden from end users. In the real world, end users identify objects based on the characteristics they know about the objects. For example, when shopping at a grocery store, you select products by taking them from a store display shelf and reading the labels, not by looking at the stock number. It is desirable for database applications to mimic the human selection process as much as possible. Therefore, database applications should let the end user choose among multiple descriptive narratives of different objects while using primary key values behind the scenes. Keeping those concepts in mind, look at Table 6.4, which summarises desirable primary key characteristics.

TABLE 6.4 Desirable primary key characteristics

PK Characteristic: Unique values
Rationale: The PK must uniquely identify each entity instance. A primary key must be able to guarantee unique values. It cannot contain nulls.

PK Characteristic: Nonintelligent
Rationale: The PK should not have embedded semantic meaning. An attribute with embedded semantic meaning is probably better used as a descriptive characteristic of the entity rather than as an identifier. In other words, a student ID of 650973 would be preferred over Smith, Martha L. as a primary key identifier.

PK Characteristic: No change over time
Rationale: If an attribute has semantic meaning, it may be subject to updates. This is why names do not make good primary keys. If you have Vickie Smith as the primary key, what happens when she gets married and decides to change her surname to her husband's surname? If a primary key is subject to change, the foreign key values must be updated, thus adding to the database workload. Furthermore, changing a primary key value means that you are basically changing the identity of an entity.

PK Characteristic: Preferably single-attribute
Rationale: A primary key should have the minimum number of attributes possible. Single-attribute primary keys are desirable but not required. Single-attribute primary keys simplify the implementation of foreign keys. Having multiple-attribute primary keys can cause primary keys of related entities to grow through the possible addition of many attributes, thus adding to the database workload and making (application) coding more cumbersome.

PK Characteristic: Preferably numeric
Rationale: Unique values can be better managed when they are numeric because the database can use internal routines to implement a counter-style attribute that automatically increments values with the addition of each new row. In fact, most database systems include the ability to use special constructs, such as Autonumber in Microsoft Access, to support self-incrementing primary key attributes.

PK Characteristic: Security-compliant
Rationale: The selected primary key must not be composed of any attribute(s) that might be considered a security risk or violation. For example, using an ID number as a PK in an EMPLOYEE table is not a good idea.

6.3.3 When to Use Composite Primary Keys

In the previous section, you learnt about the desirable characteristics of primary keys. For example, you learnt that the primary key should use the minimum number of attributes possible. However, that does not mean that composite primary keys are not permitted in a model. In fact, composite primary keys are particularly useful in two cases:

- As identifiers of composite entities, where each primary key combination is allowed only once in the *:* relationship.

- As identifiers of weak entities, where the weak entity has a strong identifying relationship with the parent entity.

To illustrate the first case, let us consider two examples. For the first example, assume that you have a STUDENT entity set and a CLASS entity set. In addition, assume that those two sets are related in a *:* relationship via an ENROL entity set in which each student/class combination may appear only once in the composite entity. Figure 6.7 shows the ERD to represent such a relationship using UML notation. As shown in Figure 6.7, the composite primary key automatically provides the benefit of ensuring that there cannot be duplicate values; that is, it ensures that the same student cannot enrol more than once in the same class.
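The uniqueness benefit of a composite primary key can be checked directly. The following sketch declares ENROL with the composite key (CLASS_CODE, STU_NUM) and tries to enrol the same student twice in the same class. The helper function enrol and the sample values are illustrative, and the foreign keys to STUDENT and CLASS are left out to keep the example short.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE ENROL (
        CLASS_CODE  INTEGER NOT NULL,
        STU_NUM     INTEGER NOT NULL,
        ENROL_GRADE TEXT,
        -- Composite primary key: each class/student pair may appear once.
        PRIMARY KEY (CLASS_CODE, STU_NUM)
    )
    """
)


def enrol(conn, class_code, stu_num, grade=None):
    """Insert an enrolment; return False if the pair already exists."""
    try:
        conn.execute(
            "INSERT INTO ENROL (CLASS_CODE, STU_NUM, ENROL_GRADE) VALUES (?, ?, ?)",
            (class_code, stu_num, grade),
        )
        return True
    except sqlite3.IntegrityError:
        return False


first = enrol(conn, 10014, 321452, "C")        # new pair: accepted
duplicate = enrol(conn, 10014, 321452, "A")    # same pair again: rejected
other_class = enrol(conn, 10018, 321452, "A")  # same student, other class: accepted
print(first, duplicate, other_class)
```

The database rejects the second insert on its own; no application code is needed to detect the duplicate.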

FIGURE 6.7 The *:* relationship between STUDENT and CLASS

[UML class diagram: STUDENT (STU_NUM {PK}, STU_LNAME, STU_INIT) 1..1, is_written_in, 0..* ENROL (CLASS_CODE {PK} {FK}, STU_NUM {PK} {FK}, ENROL_GRADE) 0..*, is_found_in, 1..1 CLASS (CLASS_CODE {PK}, CRS_CODE, CLASS_SECTION).]

Table name: STUDENT

STU_NUM   STU_LNAME   STU_FNAME   STU_INIT
321452    Ndlovu      Amehlo      C
324257    Smithson    Anne        K
324258    Le Roux     Dan
324269    Oblonski    Walter      H
324273    Smith       John        D
324274    Katinga     Raphael     P
324291    Ismail      Hemalika    T
324299    Smith       John        B

Table name: ENROL

CLASS_CODE   STU_NUM   ENROL_GRADE
10014        321452    C
10014        324257    B
10018        321452    A
10018        324257    B
10021        321452    C
10021        324257    C

Table name: CLASS

CLASS_CODE   CRS_CODE   CLASS_SECTION
10012        ACCT-211   1
10013        ACCT-211   2
10014        ACCT-211   3
10015        ACCT-212   1
10016        ACCT-212   2
10017        CIS-220    1
10018        CIS-220    2
10019        CIS-220    3
10020        CIS-420    1
10021        QM-261     1
10022        QM-261     2
10023        QM-362     1
10024        QM-362     2
10025        MATH-243   1

The second example shown in Figure 6.8 further illustrates the use of composite primary keys. The entities BOOKING and TOUR are related in a *:* relationship via the TOUR_BOOKING entity, in which each booking/tour combination can appear only once in the TOUR_BOOKING entity.

FIGURE 6.8 The *:* relationship between BOOKING and TOUR

[UML class diagram: BOOKING (BOOKING_NO {PK}, EMP_ID, CUST_NO, BOOKING_STATUS_CODE, EVENT_ID, HOTEL_ID, FLIGHT_NO, BOOKING_TOTAL_COST, BOOKING_DATE) and TOUR (TOUR_ID {PK}, TOUR_NAME, TOUR_DESCRIPTION, TOUR_PRICE_ADULT, TOUR_PRICE_CHILD, TOUR_PRICE_CON) are each related 1..1 to 0..* with the composite entity TOUR_BOOKING, whose primary key combines TOUR_ID and BOOKING_NO. The relationships are labelled may_contain and has; the diagram also shows the attributes TOUR_DATE and TOTAL_TOUR_COST.]

In the second case, a weak entity in a strong identifying relationship with a parent entity is normally used to represent one of two situations:

1 A real-world object that is existent-dependent on another real-world object. Those types of objects are distinguishable in the real world. A dependant and an employee are two separate people who exist independently of each other. However, such objects can exist in the model only when they relate to each other in a strong identifying relationship. For example, the relationship between EMPLOYEE and DEPENDANT is one of existence dependency in which the primary key of the dependent entity is a composite key that contains the key of the parent entity.

2 A real-world object that is represented in the data model as two separate entities in a strong identifying relationship. For example, the real-world invoice object is represented by two entities in a data model: INVOICE and LINE. Clearly, the LINE entity does not exist in the real world as an independent object, but rather as part of an INVOICE.

In both situations, having a strong identifying relationship ensures that the dependent entity can exist only when it is related to the parent entity. In summary, the selection of a composite primary key for composite and weak entity types provides benefits that enhance the integrity and consistency of the model.

6.3.4 When to Use Surrogate Primary Keys

There are some instances when a primary key doesn't exist in the real world or when the existing natural key may not be a suitable primary key. For example, consider the case of a park recreation facility that houses rooms for small parties. The manager of the facility keeps track of all events, using a folder with the format shown in Table 6.5.

TABLE 6.5 Data used to keep track of events

Date       Time_Start   Time_End   Room      Event_Name       Party_Of
17/06/19   11:00AM      2:00PM     Allure    Ndlovu Wedding   60
17/06/19   11:00AM      2:00PM     Bonanza   Adams Office     12
17/06/19   3:00PM       5:30PM     Allure    Naidoo Family    15
17/06/19   3:30PM       5:30PM     Bonanza   Adams Office     12
18/06/19   1:00PM       3:00PM     Bonanza   Scouts           33
18/06/19   11:00AM      2:00PM     Allure    March of Dimes   25
18/06/19   11:00AM      12:30PM    Bonanza   Naidoo Family    12

Given the data shown in Table 6.5, you would model the EVENT entity as:

EVENT (DATE, TIME_START, TIME_END, ROOM, EVENT_NAME, PARTY_OF)

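What primary key would you suggest? In this case, there is no simple natural key that could be used as a primary key in the model. Based on the primary key concepts you learnt about in previous chapters, you might suggest one of these options:

(DATE, TIME_START, ROOM) or (DATE, TIME_END, ROOM)

Assume that you select the composite primary key (DATE, TIME_START, ROOM) for the EVENT entity. Next, you determine that one EVENT may use many RESOURCEs (such as tables, projectors, PCs and stands) and that the same RESOURCE may be used for many EVENTs. The RESOURCE entity would be represented by the following attributes:

RESOURCE (RSC_ID, RSC_DESCRIPTION, RSC_TYPE, RSC_QTY, RSC_PRICE)

Given the business rules, the *:* relationship between RESOURCE and EVENT would be represented via the EVNTRSC composite entity with a composite primary key as follows:

EVNTRSC (DATE, TIME_START, ROOM, RSC_ID, QTY_USED)

You now have a lengthy, four-attribute composite primary key. What would happen if the EVNTRSC entity's primary key were inherited by another existence-dependent entity? At this point, you can see that the composite primary key could make the implementation of the database and program coding unnecessarily complex.

As a data modeller, you may have noticed that the EVENT entity's selected primary key may not work well, given the primary key guidelines in Table 6.4. In this case, the EVENT entity's selected primary key contains embedded semantic information and is formed by a combination of date, time and text data columns (that is, several different data types). In addition, the selected primary key would cause lengthy primary keys for existence-dependent entities. The solution to the problem is to use a numeric, single-attribute surrogate primary key.

Surrogate primary keys are accepted practice in today's complex data environments. Surrogate keys are especially helpful when there is no natural key, when the selected candidate key has embedded semantic contents, or when the selected candidate key is too long or cumbersome. However, there is a trade-off: if you use a surrogate key, you must ensure that the candidate key of the entity in question performs properly through the use of unique index and not null constraints.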
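The trade-off described above can be sketched as follows: the surrogate key identifies each row, while a UNIQUE constraint together with NOT NULL keeps the original candidate key (date, start time, room) enforced. The names EVENT_ID, EVENT_DATE and add_event, the ISO date and 24-hour time strings, and the sample rows are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE EVENT (
        EVENT_ID   INTEGER PRIMARY KEY,  -- surrogate key, assigned by SQLite
        EVENT_DATE TEXT NOT NULL,
        TIME_START TEXT NOT NULL,
        TIME_END   TEXT NOT NULL,
        ROOM       TEXT NOT NULL,
        EVENT_NAME TEXT,
        PARTY_OF   INTEGER,
        -- The candidate key stays unique even though it is not the PK.
        UNIQUE (EVENT_DATE, TIME_START, ROOM)
    )
    """
)


def add_event(conn, date, start, end, room, name, party_of):
    """Insert an event and return the surrogate key generated for it."""
    cur = conn.execute(
        "INSERT INTO EVENT (EVENT_DATE, TIME_START, TIME_END, ROOM,"
        " EVENT_NAME, PARTY_OF) VALUES (?, ?, ?, ?, ?, ?)",
        (date, start, end, room, name, party_of),
    )
    return cur.lastrowid


first_id = add_event(conn, "2019-06-17", "11:00", "14:00", "Allure", "Ndlovu Wedding", 60)
second_id = add_event(conn, "2019-06-17", "11:00", "14:00", "Bonanza", "Adams Office", 12)

# Same date, start time and room as the first event: the UNIQUE constraint fires.
try:
    add_event(conn, "2019-06-17", "11:00", "13:00", "Allure", "Double booking", 5)
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

print(first_id, second_id, duplicate_rejected)
```

A dependent entity such as EVNTRSC could then carry the single EVENT_ID column as its foreign key instead of three inherited attributes.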

DESIGN CASES: LEARNING FLEXIBLE DATABASE DESIGN

Data

modelling

acquired and

many

primary

composite

embedded

a numeric

are

has embedded

6.4

could previous

ROOM) for the

(such

RESOURCE

a composite

by another

guidelines

contains

primary

keys

in

for

between

may have noticed

key

columns

key

entity

TIME_START,

RESOURCEs

RSC_TYPE,

ROOM,

key

you

primary

However,

with

four-attribute

modeller,

the

primary

primary

in

Concepts

ROOM)

be used

*:* relationship entity,

primary

key that about

Advanced

complex.

As a data well,

the

were inherited

composite

natural

you learnt

Modelling

attributes:

TIME_START,

key

unnecessarily

may

many

RSC_DESCRIPTION,

a lengthy

no simple

concepts

key (DATE,

may use

RESOURCE

rules,

(DATE,

key

TIME_END,

primary

EVENT

composite

primary

or (DATE,

following

(RSC_ID,

EVNTRSC

entitys

one

case, there is

primary

Data

options:

composite

same

by the

In this

on the

ROOM)

the

and that

Based

of these

determine

be represented

that

one

you select

you

You

model.

TIME_START,

Assume

via the

would you suggest?

suggest

(DATE,

Next,

key

key in the

Data modelling and design require skills that are acquired through experience. In turn, experience is acquired through practice: regular and frequent repetition, applying the concepts learnt to specific and different design problems. This section will present four special design cases that highlight the importance of flexible designs, proper identification of primary keys and placement of foreign keys.

NOTE

In describing the different modelling concepts throughout this book, the focus has been and continues to be on relational models. Also, given the focus on the practical nature of database design, all design issues are addressed with the implementation goal in mind. Therefore, there is no sharp line of demarcation between design and implementation.

At the pure conceptual stage of the design, foreign keys are not part of an ER diagram. The ERD displays only entities and relationships. Entities are identified by identifiers that may become primary keys. During the design, the modeller attempts to understand and define the entities and relationships. Foreign keys are the mechanism through which the relationship designed in an ERD is implemented in a relational model. If you use MS Visio Professional as your modelling tool, you will discover that this methodology is reflected in the Visio modelling practice.

6.4.1 Design Case #1: Implementing 1:1 Relationships

Foreign keys work with primary keys to properly implement relationships in the relational model. The basic rule is very simple: put the primary key of the one side (the parent entity) on the many side (the dependent entity) as a foreign key. However, where do you place the foreign key when you are working with a 1:1 relationship? For example, assume the case of a 1:1 relationship between EMPLOYEE and DEPARTMENT based on the business rule "one EMPLOYEE is the manager of one DEPARTMENT, and one DEPARTMENT is managed by one EMPLOYEE". In that case, there are two options for selecting and placing the foreign key:

• Place a foreign key in both entities. That option is derived from the basic rule you learnt in Chapter 5, Data Modelling with Entity Relationship Diagrams. Place EMP_NUM as a foreign key in DEPARTMENT and DEPT_ID as a foreign key in EMPLOYEE. However, that solution is not recommended, as it would create duplicated work and it could conflict with other existing relationships. (Remember that DEPARTMENT and EMPLOYEE also participate in a 1:* relationship: one department employs many employees.)

• Place a foreign key in one of the entities. In that case, the primary key of one of the two entities appears as a foreign key in the other entity. That is the preferred solution, but there is a remaining question: which primary key should be used as a foreign key? The answer to that question is found in Table 6.6, which shows the rationale for selecting the foreign key in a 1:1 relationship based on the relationship properties in the ERD.

TABLE 6.6 Selection of foreign key in a 1:1 relationship

Case | ER Relationship Constraints | Action
I | One side is mandatory and the other side is optional. | Place the PK of the entity on the mandatory side in the entity on the optional side as a FK and make the FK mandatory.
II | Both sides are optional. | Select the FK that causes the fewest number of nulls or place the FK in the entity in which the (relationship) role is played.
III | Both sides are mandatory. | See Case II or consider revising your model to ensure that the two entities do not belong together in a single entity.

Figure 6.9 illustrates the EMPLOYEE manages DEPARTMENT relationship. Note that, in this case, EMPLOYEE is mandatory to DEPARTMENT. Therefore, EMP_NUM is placed as the foreign key in DEPARTMENT. Alternatively, you might argue that the manager role is played by the EMPLOYEE in the DEPARTMENT.

FIGURE 6.9 A 1:1 relationship between DEPARTMENT and EMPLOYEE

[Figure: A one-to-one (1:1) relationship. An EMPLOYEE manages zero or one DEPARTMENT; each DEPARTMENT is managed by one EMPLOYEE. EMPLOYEE (EMP_NUM {PK}, EMP_LNAME, EMP_FNAME) 1..1 manages 0..1 DEPARTMENT (DEPT_ID {PK}, DEPT_NAME, EMP_NUM {FK1}).]
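The placement rule for Case I can be tried out in a small script. The following is a minimal sketch, assuming Python's built-in sqlite3 module and the attribute names shown in Figure 6.9; the sample rows and the helper try_insert are hypothetical and are not taken from the book. The NOT NULL constraint makes the foreign key mandatory, and the UNIQUE constraint keeps the relationship 1:1 rather than 1:*.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when asked to

conn.executescript("""
CREATE TABLE EMPLOYEE (
    EMP_NUM   INTEGER PRIMARY KEY,
    EMP_LNAME TEXT NOT NULL,
    EMP_FNAME TEXT NOT NULL
);
CREATE TABLE DEPARTMENT (
    DEPT_ID   INTEGER PRIMARY KEY,
    DEPT_NAME TEXT NOT NULL,
    -- FK on the optional side; NOT NULL = mandatory FK, UNIQUE = at most one department per manager
    EMP_NUM   INTEGER NOT NULL UNIQUE REFERENCES EMPLOYEE (EMP_NUM)
);
""")

# Hypothetical sample data: employee 2 manages no department (the 0..1 side).
conn.execute("INSERT INTO EMPLOYEE VALUES (1, 'Smith', 'Ann')")
conn.execute("INSERT INTO EMPLOYEE VALUES (2, 'Jones', 'Bob')")
conn.execute("INSERT INTO DEPARTMENT VALUES (10, 'Sales', 1)")


def try_insert(sql):
    """Return True if the statement is accepted, False if a constraint rejects it."""
    try:
        conn.execute(sql)
        return True
    except sqlite3.IntegrityError:
        return False


# One employee cannot manage two departments (UNIQUE).
second_dept_same_manager = try_insert("INSERT INTO DEPARTMENT VALUES (20, 'Marketing', 1)")
# A department cannot exist without a manager (NOT NULL).
dept_without_manager = try_insert("INSERT INTO DEPARTMENT (DEPT_ID, DEPT_NAME) VALUES (30, 'Shipping')")
# A manager must be an existing employee (FOREIGN KEY).
dept_unknown_manager = try_insert("INSERT INTO DEPARTMENT VALUES (40, 'Purchasing', 99)")
```

All three attempted inserts are rejected, so only the Sales department remains. Dropping the UNIQUE constraint would silently turn the design into a 1:* relationship, which is one reason to declare it explicitly.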

As a designer, you need to recognise that 1:1 relationships exist in the real world and, therefore, they should be supported in the data model. In fact, a 1:1 relationship is used to ensure that two entity sets are not placed in the same table. In other words, EMPLOYEE and DEPARTMENT are clearly separate and unique entity types that do not belong together in a single entity. If you group them together in one entity, what would be the name of that entity?

6.4.2 Design Case #2: Maintaining History of Time-Variant Data

Company managers generally realise that good decision making is based on the information generated through the data stored in databases. Such data reflect current as well as past events. In fact, company managers use the data stored in databases to answer questions such as, "How do the current company profits compare to those of previous years?" and "What are XYZ product's sales trends?" In other words, the data stored in databases reflect not only current data, but also historic data.

Normally, data changes are managed by replacing the existing attribute value with the new value, without regard to the previous value. However, there are situations when the history of values for a given attribute must be preserved. From a data modelling point of view, time-variant data refer to data whose values change over time and for which you must keep a history of the data changes. You could argue that all data in a database are subject to change over time and are, therefore, time variant. However, some attribute values, such as your date of birth or your student ID number, are not time variant. On the other hand, attributes such as your student grades or your bank account balance are subject to change over time. Sometimes the data changes are externally originated and event driven, such as a product price change. On other occasions, changes are based on well-defined schedules, such as the daily stock quote open and close values.

In any case, keeping the history of time-variant data is equivalent to having a multivalued attribute in your entity. To model time-variant data, you must create a new entity in a 1:* relationship with the original entity. This new entity will contain the new value, the date of the change, and whatever other attribute is pertinent to the event being modelled. For example, if you want to keep track of the current manager as well as the history of all department managers over time, you could create the model shown in Figure 6.10.

FIGURE 6.10 Maintaining manager history

As you examine Figure 6.10, note that the MGR_HIST entity has a 1:* relationship with EMPLOYEE and a 1:* relationship with DEPARTMENT to reflect the fact that, over time, an employee could be the manager of many different departments and a department could have many different employee managers. Because you are recording time-variant data, you must store the DATE_ASSIGN attribute in the MGR_HIST entity to provide the date on which the employee (EMP_NUM) became the department manager. The primary key of MGR_HIST permits the same employee to be the manager of the same department, but on different dates. If that scenario is not the case in your environment (if, for example, an employee is the manager of a department only once), you could make DATE_ASSIGN a non-prime attribute in the MGR_HIST entity.

Note in Figure 6.10 that the manages relationship is optional in theory and redundant in practice. At any time, you could find out who the manager of a department is by retrieving the most recent DATE_ASSIGN date from MGR_HIST for a given department. On the other hand, the ERD in Figure 6.10 differentiates between current data and historic data. The current manager relationship is implemented by the manages relationship between EMPLOYEE and DEPARTMENT. Additionally, the historic data are managed through EMP_MGR_HIST and DEPT_MGR_HIST. The trade-off with that model is that each time a new manager is assigned to a department, there will be two data modifications: one update in the DEPARTMENT entity and one insert in the MGR_HIST entity.

The flexibility of the model proposed in Figure 6.10 becomes more apparent when you add the 1:* relationship "one department employs many employees". In that case, the PK of the 1 side (DEPT_ID) appears in the many side (EMPLOYEE) as a foreign key. Now suppose you would like to keep track of the job history for each of the company's employees. In that case, you would keep track of the department, the job code, the date assigned and the salary. To accomplish that task, you would modify the model in Figure 6.10 by adding a JOB_HIST entity. Figure 6.11 shows the use of the new JOB_HIST entity to maintain the employee's history.

FIGURE 6.11 Maintaining job history

Again, it is worth emphasising that the manages and employs relationships are theoretically optional and redundant in practice. You can always find out where each employee works by looking at the job history and selecting only the most current data row for each employee. However, as you will discover in Chapter 8, Beginning Structured Query Language (SQL), and in Chapter 9, Procedural Language SQL and Advanced SQL, finding where each employee works is not a trivial task. Therefore, the model represented in Figure 6.11 includes the admittedly redundant but unquestionably useful manages and employs relationships to separate current data from historic data.
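To make the trade-off concrete, the manager-history design can be sketched in a few lines of code. The example below is illustrative only: it uses Python's built-in sqlite3 module, assumes that the primary key of MGR_HIST is the combination (EMP_NUM, DEPT_ID, DATE_ASSIGN), stores dates as ISO-format text, and uses hypothetical sample rows and helper function names. It shows the two data modifications needed for each new assignment and the two ways of finding the current manager.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")

conn.executescript("""
CREATE TABLE EMPLOYEE (
    EMP_NUM   INTEGER PRIMARY KEY,
    EMP_LNAME TEXT NOT NULL
);
CREATE TABLE DEPARTMENT (
    DEPT_ID   INTEGER PRIMARY KEY,
    DEPT_NAME TEXT NOT NULL,
    EMP_NUM   INTEGER UNIQUE REFERENCES EMPLOYEE (EMP_NUM)  -- current manager (manages)
);
CREATE TABLE MGR_HIST (
    EMP_NUM     INTEGER NOT NULL REFERENCES EMPLOYEE (EMP_NUM),
    DEPT_ID     INTEGER NOT NULL REFERENCES DEPARTMENT (DEPT_ID),
    DATE_ASSIGN TEXT    NOT NULL,                 -- ISO date, e.g. '2020-03-01'
    PRIMARY KEY (EMP_NUM, DEPT_ID, DATE_ASSIGN)   -- assumed composite key
);
INSERT INTO EMPLOYEE VALUES (1, 'Smith'), (2, 'Jones');
INSERT INTO DEPARTMENT VALUES (10, 'Sales', NULL);
""")


def assign_manager(dept_id, emp_num, date_assign):
    """Record a new manager: one update plus one insert, in a single transaction."""
    with conn:
        conn.execute(
            "UPDATE DEPARTMENT SET EMP_NUM = ? WHERE DEPT_ID = ?",
            (emp_num, dept_id),
        )
        conn.execute(
            "INSERT INTO MGR_HIST (EMP_NUM, DEPT_ID, DATE_ASSIGN) VALUES (?, ?, ?)",
            (emp_num, dept_id, date_assign),
        )


def current_manager_from_history(dept_id):
    """Derive the current manager from the most recent DATE_ASSIGN."""
    row = conn.execute(
        "SELECT EMP_NUM FROM MGR_HIST WHERE DEPT_ID = ? "
        "ORDER BY DATE_ASSIGN DESC LIMIT 1",
        (dept_id,),
    ).fetchone()
    return row[0] if row else None


def current_manager_from_department(dept_id):
    """Read the current manager directly through the redundant manages relationship."""
    return conn.execute(
        "SELECT EMP_NUM FROM DEPARTMENT WHERE DEPT_ID = ?", (dept_id,)
    ).fetchone()[0]


assign_manager(10, 1, "2019-01-15")
assign_manager(10, 2, "2020-03-01")
```

Both lookups return employee 2 for department 10, which illustrates why the manages relationship is redundant in theory. The direct lookup is simpler, at the price of keeping the two tables in step on every assignment.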

6.4.3 Design Case #3: Fan Traps

Creating a data model requires proper identification of the data relationships among entities. However, due to miscommunication or incomplete understanding of the business rules or processes, it is not uncommon to misidentify relationships among entities. Under those circumstances, the ERD may contain a design trap. A design trap occurs when a relationship is improperly or incompletely identified and, therefore, is represented in a way that is not consistent with the real world. The most common design trap is known as a fan trap.

A fan trap occurs when you have one entity in two 1:* relationships to other entities, thus producing an association among the other entities that is not expressed in the model. For example, assume the football league has many divisions. Each division has many players, and each division has many teams. Given those incomplete business rules, you might create an ERD that looks like the one shown in Figure 6.12.

FIGURE 6.12 Incorrect ERD with fan trap problem

[Figure: Fan trap due to misidentification of relationships. DIVISION (DIV_ID {PK}, DIV_NAME) 1..1 to 0..* TEAM (TEAM_ID {PK}, TEAM_NAME, DIV_ID {FK1}); DIVISION 1..1 to 0..* PLAYER (PLAYER_ID {PK}, PLAYER_NAME, DIV_ID {FK1}).]

As you can see in Figure 6.12, DIVISION is in a 1:* relationship with TEAM and in a 1:* relationship with PLAYER. Although that representation is semantically correct, the relationships are not properly identified. For example, there is no way to identify which players belong to which team. Figure 6.12 also shows a sample instance relationship representation for the ERD. Note that the DIVISION instances' relationship lines fan out to the TEAM and PLAYER entity instances, thus the "fan trap" label.

Figure 6.13 shows the correct ERD after the fan trap has been eliminated. Note that, in this case, DIVISION is in a 1:* relationship with TEAM. In turn, TEAM is in a 1:* relationship with PLAYER. Figure 6.13 also shows the instance relationship representation after eliminating the fan trap.

FIGURE 6.13 Corrected ERD after removal of the fan trap

[Figure: Fan trap eliminated by proper identification of relationships. DIVISION (DIV_ID {PK}, DIV_NAME) 1..1 to 0..* TEAM (TEAM_ID {PK}, TEAM_NAME, DIV_ID {FK1}); TEAM 1..1 to 0..* PLAYER (PLAYER_ID {PK}, PLAYER_NAME, TEAM_ID {FK1}). Sample instances: divisions U-15 and U-18; team labels Club, Pirates, Ajax and FC; players Jordan, Baird, Dlamini, Malone, Shezi and Zulu.]
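The corrected structure in Figure 6.13 can be checked with a short script. This is a sketch under stated assumptions: it uses Python's built-in sqlite3 module, the entity and attribute names from the figure, and a hypothetical assignment of players to teams and teams to divisions (the figure's actual instance links are not reproduced here). The helper functions team_of and players_in_division are likewise illustrative names.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")

conn.executescript("""
CREATE TABLE DIVISION (
    DIV_ID   INTEGER PRIMARY KEY,
    DIV_NAME TEXT NOT NULL
);
CREATE TABLE TEAM (
    TEAM_ID   INTEGER PRIMARY KEY,
    TEAM_NAME TEXT NOT NULL,
    DIV_ID    INTEGER NOT NULL REFERENCES DIVISION (DIV_ID)
);
CREATE TABLE PLAYER (
    PLAYER_ID   INTEGER PRIMARY KEY,
    PLAYER_NAME TEXT NOT NULL,
    TEAM_ID     INTEGER NOT NULL REFERENCES TEAM (TEAM_ID)  -- PLAYER hangs off TEAM, not DIVISION
);
-- Hypothetical sample instances
INSERT INTO DIVISION VALUES (1, 'U-15'), (2, 'U-18');
INSERT INTO TEAM VALUES (1, 'Pirates', 1), (2, 'Ajax', 2);
INSERT INTO PLAYER VALUES (1, 'Jordan', 1), (2, 'Baird', 1), (3, 'Dlamini', 2);
""")


def team_of(player_name):
    """Which team does a player belong to? Answerable directly after the fix."""
    row = conn.execute(
        "SELECT T.TEAM_NAME FROM PLAYER P JOIN TEAM T ON T.TEAM_ID = P.TEAM_ID "
        "WHERE P.PLAYER_NAME = ?",
        (player_name,),
    ).fetchone()
    return row[0] if row else None


def players_in_division(div_name):
    """Players per division, reached transitively through TEAM."""
    rows = conn.execute(
        "SELECT P.PLAYER_NAME FROM DIVISION D "
        "JOIN TEAM T ON T.DIV_ID = D.DIV_ID "
        "JOIN PLAYER P ON P.TEAM_ID = T.TEAM_ID "
        "WHERE D.DIV_NAME = ? ORDER BY P.PLAYER_NAME",
        (div_name,),
    ).fetchall()
    return [name for (name,) in rows]
```

With the fan-trap design of Figure 6.12, team_of could not be written at all, because PLAYER would carry only DIV_ID. Here the division of a player costs one extra join, which is the price of having each fact stored in exactly one place.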

Given the design in Figure 6.13, note how easy it is to see which players play for which team. However, note that to find out which players play in which division, you first need to see which teams belong to each division; then you need to find out which players play on each team. In other words, there is a transitive relationship between DIVISION and PLAYER via the TEAM entity.

6.4.4 Design Case #4: Redundant Relationships

Although redundancy is often seen as a good thing to have in computer environments (multiple backups in multiple places come to mind), redundancy is seldom a good thing in the database environment. (As you learnt in Chapter 3, Relational Model Characteristics, redundancies can cause data anomalies in a database.) Redundant relationships occur when there are multiple relationship paths between related entities. The main concern with redundant relationships is that they remain consistent across the model. However, it is important to note that some designs use redundant relationships as a way to simplify the design.

An example of redundant relationships was first introduced in Figure 6.10 during the discussion on maintaining a history of time-variant data. However, the use of the redundant manages and employs relationships was justified by the fact that such relationships were dealing with current data rather than historic data. Another more specific example of a redundant relationship is represented in Figure 6.14.

In Figure 6.14, note the transitive 1:* relationship between DIVISION and PLAYER through the TEAM entity set. Therefore, the relationship that connects DIVISION and PLAYER is, for all practical purposes, redundant. (So too is the additional attribute DIV_ID in PLAYER.) In that case, the relationship could be safely deleted without losing any information-generation capabilities in the model.

FIGURE 6.14 A redundant relationship
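The consistency concern with the redundant relationship in Figure 6.14 can be illustrated with a small check. The sketch below assumes Python's built-in sqlite3 module and a PLAYER table that carries both TEAM_ID and the redundant DIV_ID; the sample rows are hypothetical, and one of them is deliberately inconsistent.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

conn.executescript("""
CREATE TABLE TEAM (
    TEAM_ID   INTEGER PRIMARY KEY,
    TEAM_NAME TEXT NOT NULL,
    DIV_ID    INTEGER NOT NULL
);
CREATE TABLE PLAYER (
    PLAYER_ID   INTEGER PRIMARY KEY,
    PLAYER_NAME TEXT NOT NULL,
    TEAM_ID     INTEGER NOT NULL REFERENCES TEAM (TEAM_ID),
    DIV_ID      INTEGER NOT NULL   -- redundant: already implied by the player's team
);
-- Hypothetical data: Baird plays for a division 1 team but is recorded in division 2.
INSERT INTO TEAM VALUES (1, 'Pirates', 1), (2, 'Ajax', 2);
INSERT INTO PLAYER VALUES (1, 'Jordan', 1, 1), (2, 'Baird', 1, 2), (3, 'Dlamini', 2, 2);
""")

# Players whose redundant DIV_ID disagrees with the division of their team.
inconsistent_players = [
    name
    for (name,) in conn.execute(
        "SELECT P.PLAYER_NAME FROM PLAYER P "
        "JOIN TEAM T ON T.TEAM_ID = P.TEAM_ID "
        "WHERE P.DIV_ID <> T.DIV_ID "
        "ORDER BY P.PLAYER_NAME"
    )
]
```

Nothing in the table definitions prevents the contradictory row from being stored, so a design that keeps the redundant path has to run a check like this (or enforce it some other way). Removing DIV_ID from PLAYER makes the contradiction impossible to express.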

6.5 DATA MODELLING CHECKLIST

Data modelling translates a specific real-world environment into a data model that represents the real-world data, users, processes and interactions. You have learnt in this chapter how the EERM enables the designer to add more semantic content to the model. You also learnt about the trade-offs and intricacies in the selection of primary keys and the modelling of time-variant data. The modelling techniques you have learnt thus far give you the tools needed to produce successful database designs. However, just as any good pilot uses a checklist to ensure that all is in order for a successful flight, the data modelling checklist shown in Table 6.7 will help ensure that you perform data modelling tasks successfully. (The data modelling checklist in Table 6.7 is based on the concepts and tools you have learnt, beginning in Chapter 3: the relational model, the entity relationship model, normalisation and the extended entity relationship model.) Therefore, it is assumed that you are familiar with the majority of terms and labels used in the checklist, such as synonyms, aliases and relationships.

TABLE 6.7 Data modelling checklist

BUSINESS RULES

• Properly document and verify all business rules with the end users.
• Ensure that all business rules are written precisely, clearly and simply. The business rules must help identify entities, attributes, relationships and constraints.
• Identify the source of all business rules and ensure that each business rule is accompanied by the reason for its existence and by the date and person(s) responsible for the business rule's verification and approval.

DATA MODELLING

Naming Conventions: All names should be limited in length (database-dependent size).

Entity names:
• Should be nouns that are familiar to business and should be short and meaningful
• Should include abbreviations, synonyms and aliases for each entity
• Should be unique within the model
• For composite entities, may include a combination of abbreviated names of the entities linked through the composite entity

Attribute names:
• Should be unique within the entity
• Should use the entity abbreviation or prefix
• Should be descriptive of the characteristic
• Should use suffixes such as _ID, _NUM or _CODE for the PK attribute
• Should not be a reserved word
• Should not contain spaces or special characters such as @, ! or &

Relationship names:
• Should be active or passive verbs that clearly indicate the nature of the relationship

Entities:
• All entities should represent a single subject
• All entities should be in 3NF or higher (covered in Chapter 7, Normalising Database Designs)
• The granularity of the entity instance is clearly defined
• The PK is clearly defined and supports the selected data granularity

Attributes:
• Should be simple and single-valued (atomic data)
• Should include default values, constraints, synonyms and aliases
• Derived attributes should be clearly identified and include source(s)
• Should not be redundant, unless they are required for transaction accuracy or for maintaining a history or are used as a foreign key

Relationships:
• Should clearly identify relationship participants
• Should clearly define participation and cardinality rules

ER Diagram:
• Should be validated against expected processes: inserts, updates and deletions
• Should evaluate where, when, and how to maintain a history
• Should not contain redundant relationships except as required (see attributes)
• Should minimise data redundancy to ensure single-place updates
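Some of the attribute-naming items in the checklist are mechanical enough to be checked automatically. The following is a rough sketch in Python of how such a check might look; the function name check_attribute_name, the 30-character limit and the short reserved-word list are all assumptions made for illustration, since the real limits and reserved words depend on the DBMS in use.

```python
import re

# Illustrative subset only; consult your DBMS documentation for the full list.
RESERVED_WORDS = {"SELECT", "FROM", "WHERE", "TABLE", "ORDER", "GROUP"}
PK_SUFFIXES = ("_ID", "_NUM", "_CODE")


def check_attribute_name(name, is_pk=False, max_length=30):
    """Return a list of checklist violations for one attribute name (empty if none)."""
    problems = []
    if len(name) > max_length:
        problems.append("name exceeds the length limit")
    # Letters, digits and underscores only: no spaces, @, ! or &.
    if not re.fullmatch(r"[A-Za-z][A-Za-z0-9_]*", name):
        problems.append("contains spaces or special characters")
    if name.upper() in RESERVED_WORDS:
        problems.append("is a reserved word")
    if is_pk and not name.upper().endswith(PK_SUFFIXES):
        problems.append("PK attribute lacks an _ID, _NUM or _CODE suffix")
    return problems
```

For example, check_attribute_name("EMP_NUM", is_pk=True) reports no problems, whereas "EMP NAME" is flagged for containing a space. Items such as "descriptive of the characteristic" still need human judgement and are not covered by a check like this.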

SUMMARY

• The extended entity relationship (EER) model adds semantics to the ER model via entity supertypes, subtypes and clusters. An entity supertype is a generic entity type that is related to one or more entity subtypes.

• A specialisation hierarchy depicts the arrangement and relationships between entity supertypes and entity subtypes. Inheritance allows an entity subtype to inherit the attributes and relationships of the supertype. Subtypes can be disjoint or overlapping. A subtype discriminator is used to determine to which entity subtype the supertype occurrence is related. The subtypes can exhibit partial or total completeness. There are basically two approaches to developing a specialisation hierarchy of entity supertypes and subtypes: specialisation and generalisation.

• An entity cluster is a virtual entity type used to represent multiple entities and relationships in the ERD. An entity cluster is formed by combining multiple interrelated entities and relationships into a single abstract entity object.

• Natural keys are identifiers that exist in the real world. Natural keys do not necessarily make good primary keys. Primary keys should have these characteristics: they must have unique values, they should be non-intelligent, they must not change over time, and they are preferably numeric and composed of a single attribute.

• Composite keys are useful to represent *:* relationships and weak (strong-identifying) entities.

• Surrogate primary keys are useful when there is no natural key that makes a suitable primary key, when the primary key is a composite primary key with multiple different data types, or when the primary key is too long to be usable.

• In a 1:1 relationship, place the PK of the mandatory entity as a foreign key in the optional entity, place it in the entity that causes the least number of nulls, or place it where the role is played.

• Time-variant data refers to data whose values change over time and whose requirements mandate that you keep a history of data changes. To maintain the history of time-variant data, you must create an entity containing the new value, the date of change and any other time-relevant data. This entity maintains a 1:* relationship with the entity for which the history is to be maintained.

• A fan trap occurs when you have one entity in two 1:* relationships to other entities and there is an association among the other entities that is not expressed in the model. Redundant relationships occur when there are multiple relationship paths between related entities. The main concern with redundant relationships is that they remain consistent across the model.

• Aggregation and composition are used to represent has_a or part_of relationships between entities.

• The data modelling checklist provides a way for the designer to check that the ERD meets a set of minimum requirements.

KEY TERMS

aggregation
completeness constraint
composition
design trap
disjoint subtypes (non-overlapping subtypes)
EER diagram (EERD)
entity cluster
entity subtype
entity supertype
extended entity relationship model (EERM)
fan trap
inheritance
natural key (natural identifier)
overlapping (non-disjoint) subtypes
partial completeness
specialisation hierarchy
subtype discriminator
time-variant data
total completeness

FURTHER READING

Advances in Conceptual Modelling Theory and Practice, Lecture Notes in Computer Science, Volume 4231, Springer, 2006.
Booch, G. Unified Modelling Language User Guide, Addison-Wesley, 2005.
Gordon, K. Modelling Business Information: Entity Relationship and Class Modelling for Business Analysts. British Computer Society, 2017.
Hernandez, M. J. Database Design for Mere Mortals: A Hands-On Guide to Relational Database Design. Addison-Wesley, 2003.


Online Content: Answers to selected Review Questions and Problems for this chapter are available on the online platform accompanying this book.

REVIEW QUESTIONS

1 What is an entity supertype and why is it used?

2 What kinds of data would you store in an entity subtype?

3 What is a specialisation hierarchy?

4 What is a subtype discriminator? Give an example of its use.

5 What is an overlapping subtype? Give an example.

6 What is the difference between partial completeness and total completeness?

7 What is an entity cluster, and which advantages are derived from its use?

8 Which primary key characteristics are considered desirable? Explain why each characteristic is considered desirable.

9 Under which circumstances would composite primary keys be appropriate?

10 What is a surrogate primary key, and when would you use one?

11 When implementing a 1:1 relationship, where should you place the foreign key if one side is mandatory and one side is optional? Should the foreign key be mandatory or optional?

12 What are time-variant data, and how would you deal with such data from a database design point of view?

13 What is the most common design trap, and how does it occur?

PROBLEMS

1 AVANTIVE Corporation is a company specialising in the commercialisation of automotive parts. AVANTIVE has two types of customers: retail and wholesale. All customers have a customer ID, a name, an address, a phone number, a default shipping address, a date of last purchase and a date of last payment. Retail customers have the credit card type, credit card number, expiration date and email address. Wholesale customers have a contact name, contact phone number, contact email address, purchase order number and date, discount percentage, billing address, tax status (if exempt) and tax identification number. A retail customer cannot be a wholesale customer and vice versa. Given that information, create the ERD containing all primary keys, foreign keys and main attributes.

2 AVANTIVE Corporation has five departments: administration, marketing, sales, shipping and purchasing. Each department employs many employees. Each employee has an ID, a name, a home address, a home phone number, a salary and a tax ID. Some employees are classified as sales representatives, some as technical support and some as administrators. Sales representatives receive a commission based on sales. Technical support employees are required to be certified in their areas of expertise. For example, some are certified as drivetrain specialists; others, as electrical systems specialists. All administrators have a title and a bonus. Given that information, create the ERD containing all primary keys, foreign keys and main attributes.

3 AVANTIVE Corporation operates under the following business rules:

• AVANTIVE keeps a list of car models with information about the manufacturer, model and year.

• AVANTIVE keeps several parts in stock. A part will have a part ID, description, unit price and quantity on hand.

• A part can be used for many car models, and a car model has many parts.

• A retail customer normally pays by credit card and is charged the list price for each purchased item.

• A wholesale customer normally pays via purchase order with terms of net 30 days and is charged a discounted price for each item purchased. (The discount varies from customer to customer.)

• A customer (retail or wholesale) can place many orders. Each order will have an order number, a date, a shipping address, a billing address and a list of part codes, quantities, unit prices and extended line totals. Each order also has a sales representative ID (an employee) to identify the person who made the sale, an order subtotal, an order tax total, a shipping cost, a shipping date, an order total cost, an order total paid and an order status (open, closed or cancel).

Using that information, create the complete ERD containing all primary keys, foreign keys and main attributes.
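The pricing rules in problem 3 (list price for retail customers, a customer-specific discounted price for wholesale customers, and extended line totals on each order) can be illustrated with a little arithmetic. The sketch below is an assumption-laden illustration rather than part of the problem: it assumes the discount is a percentage taken off the list price, and the class name, function names, part codes and prices are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class OrderLine:
    part_code: str      # hypothetical part codes are used below
    quantity: int
    list_price: float


def unit_price(line, discount_pct=0.0):
    """Price charged per unit; assumes the discount is a percentage off list price."""
    return round(line.list_price * (1 - discount_pct / 100), 2)


def extended_total(line, discount_pct=0.0):
    """Extended line total: quantity multiplied by the unit price charged."""
    return round(line.quantity * unit_price(line, discount_pct), 2)


def order_subtotal(lines, discount_pct=0.0):
    """Order subtotal: the sum of the extended line totals."""
    return round(sum(extended_total(line, discount_pct) for line in lines), 2)


lines = [OrderLine("P-100", 2, 50.00), OrderLine("P-200", 1, 20.00)]

retail_subtotal = order_subtotal(lines)              # retail: charged list price
wholesale_subtotal = order_subtotal(lines, 10.0)     # wholesale: hypothetical 10% discount
```

Because the extended totals and the subtotal can be derived from quantities and unit prices, a design has to decide whether to store them or compute them on demand.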

4 In Chapter 5, Data Modelling with Entity Relationship Diagrams, you saw the creation of the Tiny University database design. That design reflected such business rules as 'a lecturer may advise many students' and 'a lecturer may chair one department'. Modify the design shown in Figure 5.39 to include these business rules:

• An employee could be staff or a lecturer or an administrator.

• A lecturer may also be an administrator.

• Staff employees have a work level classification, such as Level I and Level II.

• Only lecturers can chair a department. A department is chaired by only one lecturer.

• Only lecturers can serve as the dean of a school. Each of the university's schools is served by one dean.

• A lecturer can teach many classes.

• Administrators have a position title.

Given that information, create the complete ERD using UML class diagram notation, containing all primary keys, foreign keys and main attributes.

5 Tiny University wants to keep track of the history of all administrative appointments (date of appointment and date of termination). (Hint: Time variant data are at work.) The Tiny University chancellor may want to know how many deans worked in the School of Business between 1 January, 2000 and 1 January, 2018 or who the dean of the School of Education was in 2010. Given that information, create the complete ERD containing all primary keys, foreign keys and main attributes.

6 Some Tiny University staff employees are information technology (IT) personnel. Some IT personnel provide technology support for academic programmes. Some IT personnel provide technology infrastructure support. Some IT personnel provide technology support for academic programmes and technology infrastructure support. IT personnel are not lecturers. IT personnel are required to take periodic training to retain their technical expertise. Tiny University tracks all IT personnel training by date, type and results (completed vs not completed). Given that information, create the complete ERD containing all primary keys, foreign keys and main attributes.

7 The FlyRight Aircraft Maintenance (FRAM) division of the FlyRight Company (FRC) performs all maintenance for FRC's aircraft. Produce a data model segment that reflects the following business rules:

• All mechanics are FRC employees. Not all employees are mechanics.

• Some mechanics are specialised in engine (EN) maintenance. Some mechanics are specialised in airframe (AF) maintenance. Some mechanics are specialised in avionics (AV) maintenance. (Avionics are the electronic components of an aircraft that are used in communication and navigation.) All mechanics take periodic refresher courses to stay current in their areas of expertise. FRC tracks all courses taken by each mechanic: date, course type, certification (Y/N) and performance.

• FRC keeps a history of the employment of all mechanics. The history includes the date hired, date promoted, date terminated and so on. (Note: The 'and so on' component is, of course, not a real-world requirement. Instead, it has been used here to limit the number of attributes you will show in your design.)

Given those requirements, create the ERD segment using UML notation.

8 You have been asked to create a database design for the BoingX Aircraft Company (BAC), which has two HUD (heads-up display) products: TRX-5A and TRX-5B. The database must enable managers to track blueprints, parts and software for each HUD, using the following business rules:

• For simplicity's sake, you may assume that the TRX-5A unit is based on two engineering blueprints and that the TRX-5B unit is based on three engineering blueprints. You are free to make up your own blueprint names.

• All parts used in the TRX-5A and TRX-5B are classified as hardware. For simplicity's sake, you may assume that the TRX-5A unit uses three parts and that the TRX-5B unit uses four parts. You are free to make up your own part names.

NOTE Some parts are supplied by vendors, while others are supplied by the BoingX Aircraft Company. Parts suppliers must be able to meet the technical requirements specification (TRSs) set by the BoingX Aircraft Company. Any parts supplier that meets the BoingX Aircraft Company's TRSs may be contracted to supply parts. Therefore, any part may be supplied by multiple suppliers and a supplier can supply many different parts.

• BAC wants to keep track of all part price changes and the dates of those changes.

• BAC wants to keep track of all TRX-5A and TRX-5B software components. For simplicity's sake, you may assume that the TRX-5A unit uses two named software components and that the TRX-5B unit also uses two named software components. You are free to make up your own software names.

• BAC wants to keep track of all changes made in blueprints and software. Those changes must reflect the date and time of the change, the description of the change, the person who authorised the change, the person who actually made the change and the reason for the change.

• BAC wants to keep track of all HUD test data by test type, test date and test outcome.

Given those requirements, create the ERD segment using UML notation.
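The requirement in problem 8 that BAC tracks all part price changes and their dates is an instance of time-variant data. One common way to handle it, sketched here under stated assumptions, is a history table keyed by the part and the date on which each price took effect. The table names, column names, part name and prices below are hypothetical, and Python's built-in sqlite3 module is used only to keep the example self-contained.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE part (
    part_id    INTEGER PRIMARY KEY,
    part_name  TEXT NOT NULL
);
-- One row per price change; the composite key keeps the full history.
CREATE TABLE part_price_hist (
    part_id     INTEGER NOT NULL REFERENCES part (part_id),
    price_date  TEXT    NOT NULL,   -- ISO date on which the price took effect
    price       REAL    NOT NULL,
    PRIMARY KEY (part_id, price_date)
);
""")

# Hypothetical sample data.
conn.execute("INSERT INTO part VALUES (1, 'Display lens')")
conn.executemany(
    "INSERT INTO part_price_hist VALUES (?, ?, ?)",
    [(1, "2019-01-01", 100.0), (1, "2019-06-01", 110.0), (1, "2020-01-01", 125.0)],
)


def price_on(conn, part_id, as_of):
    """Return the price in effect on the given ISO date, or None if none existed yet."""
    row = conn.execute(
        "SELECT price FROM part_price_hist "
        "WHERE part_id = ? AND price_date <= ? "
        "ORDER BY price_date DESC LIMIT 1",
        (part_id, as_of),
    ).fetchone()
    return row[0] if row else None
```

With this structure a question such as "what did the part cost in August 2019?" becomes a lookup of the most recent history row on or before that date. The same pattern could serve the appointment history in problem 5 and the employment history in problem 7.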

9 Given the following business scenario, create a Crow's Foot ERD using a specialisation hierarchy, if appropriate.

Granite Sales Company keeps information on employees and the departments in which they work. For each department, the department name, internal mail box number and office phone extension are kept. A department can have many assigned employees, and each employee is assigned to only one department. Employees can be salaried, hourly, or work on contract. All employees are assigned an employee number, which is kept along with the employee's name and address. For hourly employees, hourly wages and target weekly work hours are stored; for example, the company may target 40 hours/week for some, 32 for others and 20 for others. Some salaried employees are salespeople who can earn a commission in addition to their base salary. For all salaried employees, the yearly salary amount is recorded in the system. For salespeople, their commission percentage on sales and commission percentage on profit are stored in the system. For example, John is a salesperson with a base salary of R500 000 per year plus a 2 per cent commission on the sales price for all sales he makes, plus another 5 per cent of the profit on each of those sales. For contract employees, the beginning date and end date of their contracts are stored along with the billing rate for their hours.
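The commission rule in John's example can be checked with a short calculation. The sketch below is only an illustration of that arithmetic: the function name and the two sample sales (sale price and profit for each) are hypothetical, while the base salary and the two percentages come from the scenario.

```python
def salesperson_pay(base_salary, sales, sales_pct, profit_pct):
    """Yearly pay: base salary plus commission on each sale.

    `sales` is an iterable of (sale_price, profit) pairs. The commission on a sale is
    sales_pct per cent of its sale price plus profit_pct per cent of its profit.
    """
    commission = sum(
        price * sales_pct / 100 + profit * profit_pct / 100
        for price, profit in sales
    )
    return base_salary + commission


# Hypothetical sales for John: (sale price, profit) in rand.
john_sales = [(10_000, 2_000), (40_000, 8_000)]

# 2% of 50 000 in sales = 1 000; 5% of 10 000 in profit = 500; total commission = 1 500.
john_total = salesperson_pay(500_000, john_sales, sales_pct=2, profit_pct=5)
```

The calculation needs two percentages per salesperson but a sale price and profit per sale, which suggests the commission percentages belong with the salesperson subtype rather than with every salaried employee.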

CASE STUDIES

1

Sedgefield

Bike Rentals is a small family-owned

Tourists

regularly

visit the

coastline.

The

main business

maintaining suitable

those

for

bikes

a complete bike

has

manufacturer, CHILD,

three

to

ERD to

and type

TEENAGER,

Sedgefield

Bike

that

depending

(e.g.

bike is

date

condition.

is

on

which

with

a class

day or full

it

All

South Africa. areas

bikes to

on bikes

described

beautiful

hire to tourists,

that

are

no longer

below:

by a unique

number.

is recorded

along

has reached

the

dealers

they

For each with the

end

bike,

the

size (e.g.

model,

INFANT,

of its lifespan

The price it is

sell to

on

A dealer

poor condition

it is

make

along

with the

Rights

into

Reserved. content

does

May not

it is

date

the

not

be

(typically

sold for and the

a regular

may or

then it is

to

affect

hire If

scanned, the

overall

to

Bike

basis

may not

and

maintain

purchase

not offered

to

a bike

a dealer

or

duplicated, learning

in experience.

need

by either

whole

Cengage

bike

For each

Due Learning

good

class

size,

working

A description

has

example,

been

a typical

maintenance

telephoning to

can rent

part.

for

a

order.

of the fault

repaired,

the

problem

is that

on a regular

and

action new

basis.

maintenance.

agrees

or in

rates

CHILD_YOUNG,

rates.

Log.

will undergo

A customer

standard

ensure that it is still in

For

a customer

determine

LARGE_ADULT}.

When the

a bike

to

of bike are {INFANT,

Maintenance

may never bikes

used

and full-day

is recorded.

a bike

shop.

copied,

checked

lifespan,

that

is

sizes and

also recorded.

are recorded.

materially

that

half-day

in the

are

a request

details

code

STANDARD_ADULT

Over its

possible

size

day). The class

was noticed

and the

walking

suppressed

needs

information.

bike is in

it is recorded

are required.

and contact

review

a bike

of

contact

If the

assigned,

noticed,

Customers

Copyright

or road)

After

a number

dealers

TEENAGER,

was taken

However,

Editorial

business

mountain

has

the

associated

code

simply

the

etc.).

After a bike has been returned,

tyres

selling

the

in the bike record.

of hire (half

If a fault is

that

acquiring

and

on usage) it is sold on to a dealer.

Rentals

on its

CHILD_OLD,

the

order

Sedgefield, around

scrapped.

period

unique

and cycle

Rentals is in

working

and is identified

ADULT,

contains

and just Each

business located in

countryside Bike

a good

support

a bike record

date are also recorded

the

the

of Sedgefield ensure

years but dependent

a list

admire

hiring.

Create Every

area to

to

electronic reserves

Sedgefield

hire

a bike,

Bike

his or her

one or more bikes

rights, the

right

some to

third remove

party additional

content

Rentals

name,

or by

address

at a given time.

For each time In

bike that is rented,

the

bike

addition,

code.

the

Each

number

was taken amount

rental

or

When a bike is for

ordering

each

being

week, for

one

costs

order

A order

but

and

a grand

over

time

can be

one

Explorer

part

but

mountain

the rental

actual

time

determined

customer.

it

date, the

was returned.

through

the

can

be rented

A bike

order

total. can

one

place

many

a part

contain

Bike

Rentals

orders

are

may place

can

class

size out

any

be responsible on Fridays

of orders.

order,

any associated

be associated

with

placing

orders.

can be ordered

description

a request

will

placed

any number of the

employee

and parts

number,

often

Typically

date, a subtotal

Only

many parts

bike large

of Sedgefield

An employee

number,

employees

can

one

are required.

Mondays.

by a unique

and the

which is

with only

that

order

made for

This contains

back

an employee

parts

on

due

be rented.

of an order

part is identified at least

bike

is created.

was

paid is recorded,

maintained,

delivery

consists

it

associated

may never

additional

An order delivery

is

record

time

of rent

record

of times

a rental

out, the

for

and

on part

several

multiple cost.

of the

occasions.

Each

An order

must

parts,

e.g. three

same

be for

saddles.

6 A particular in

stock

part

with

Sedgefield

manufacturer,

to

2

of the

is

placed Bike

uses

Unsolicited

the

title

ensure

that

its

the journal,

name,

is

some

a part is part in

basis.

address,

not

stock.

For

telephone

may supply

for

is

affiliation

manuscript

to to

record

publication.

publication.

support

it

the

each

number

parts

via

orders

When a

the

and

a

including

(the

school

order

authors.

which

is received,

below:

the

each

authors

A single

authors

name,

are kept in

author a

of

Every

manuscripts

when

the

status

or company).

Additionally, in

of the journal

described

manuscript

who have submitted

have several

needs

cent

about it in the system,

was received

the journal.

10 per

A new issue

manuscript

authors

systems research

Only about

business

also recorded,

Only authors

to

for

basic information

date

author(s) and

does

within

ensure

their

email

May

not materially

be

affect

the

scope

of the

scope

overall

or

duplicated, learning

of the journal,

reviewer

may have

manuscript

are listed

(for areas

in

or in

whole

Cengage

part.

then the other

Due Learning

to

electronic reserves

has

in the

right

is

An area

some to

is the an

third remove

area

party additional

content

manuscript

not

author

selects

to

appropriate is

notified

three

or

via

more

or universities

the

of interest.

and

rights,

editor

reviewer,

IS2003

the

content

companies

areas

of interest,

of the

and the

has specified.

example,

topic

If the

For each and

the

rejected

work for

validity.

many

review

journal. to

affiliation

the

experience.

briefly

changed

Reviewers

a description have

scanned, the

is

scientific

that

can

copied,

will

address,

expertise

and includes

not

editor

manuscript.

name, of

the

status

within the

A reviewer

Reserved. content

manuscripts

ERD to

the

manuscripts

fall

the to

areas

Rights

the

manuscripts

number,

modelling).

All

on a regular

manufacturer

are accepted

and records

for a

contents

content

by an IS code

suppressed

a

by authors.

manuscript,

convenience,

the

manuscripts

predefined

select

submitted

about

to review

reviewer

and

a complete

it is important

or his earliest

any

the

the

basis.

to the journal

different

At her

Learning.

use

If

have

credits.

email. If the

that

manufacturers.

see if they

they

is recorded:

to

address

It is typical

authors,

reviewers

Cengage

of

to

Research Knowledge is a prestigious information

are

email

many

manuscript

read

number

manufacturers

a regular

must have an author.

system.

for

any

be checked

manufacturer

Create

Information

multiple

of

process

of the

address,

submitted

deemed

on

it a number

the

manuscript

has

Rentals

quarter.

assigns

received.

2020

one

manuscripts

including

review

with

submitted

each

mailing

alist

can

information

a peer-review

manuscripts

published

editor

from

others

keep

The Journal of E-commerce

is

Copyright

Rentals

the following

Sedgefield

It

be obtained

address.

order

journal.

Editorial

always

manufacturer,

Bike

and email Each

can

one

system Areas

and

records

a

of interest

of interest

are

is identified

code for database of interest

can

be

associated with many reviewers. All reviewers must specify at least one area of interest. It is unusual, but possible, to have an area of interest for which the journal has no reviewers. The editor will change the status of the manuscript to 'under review' and record which reviewers received the manuscript and the date it was sent to each reviewer. A reviewer will typically receive several manuscripts to review each year, although new reviewers may not have received any manuscripts yet.

• The reviewers will read the manuscript at their earliest convenience and provide feedback to the editor. The feedback from each reviewer includes rating the manuscript on a ten-point scale for appropriateness, clarity, methodology and contribution to the field, as well as a recommendation for publication (accept or reject). The editor will record all of this information in the system for each review, along with the date the feedback was received. Once all of the reviewers have provided their evaluations, the editor will decide whether to publish the manuscript and change its status to 'accepted' or 'rejected'. If the manuscript will be published, the date of acceptance is recorded.

• Once a manuscript has been accepted for publication, it must be scheduled. For each issue of the journal, the publication period (autumn, winter, spring or summer), publication year, volume and number are recorded. An issue will contain many manuscripts, although the issue may be created in the system before it is known which manuscripts will be published in that issue. An accepted manuscript appears in only one issue of the journal. Each manuscript goes through a typesetting process that formats the content, including fonts, font size, line spacing, justification and so on. Once the manuscript has been typeset, its number of pages is recorded in the system. The editor will then decide which issue each accepted manuscript will appear in and the order of manuscripts within each issue. The order and the beginning page number for each manuscript must be stored in the system. Once the manuscript has been scheduled for an issue, the status of the manuscript is changed to 'scheduled'. Once an issue is published, the print date for the issue is recorded and the status of each manuscript in that issue is changed to 'published'.
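The manuscript statuses used in this case study (received, rejected, under review, accepted, scheduled and published) form a small workflow. The sketch below shows one way to make the permitted status changes explicit. It is an interpretation of the narrative rather than a rule stated by the case study: the names Status, ALLOWED and advance are hypothetical, and the transition table assumes that a manuscript moves only along the paths the text describes.

```python
from enum import Enum


class Status(Enum):
    RECEIVED = "received"
    REJECTED = "rejected"
    UNDER_REVIEW = "under review"
    ACCEPTED = "accepted"
    SCHEDULED = "scheduled"
    PUBLISHED = "published"


# Assumed transitions, read from the case study narrative.
ALLOWED = {
    Status.RECEIVED: {Status.REJECTED, Status.UNDER_REVIEW},   # scope check by the editor
    Status.UNDER_REVIEW: {Status.ACCEPTED, Status.REJECTED},   # decision after the reviews
    Status.ACCEPTED: {Status.SCHEDULED},                       # assigned to an issue
    Status.SCHEDULED: {Status.PUBLISHED},                      # issue is published
    Status.REJECTED: set(),
    Status.PUBLISHED: set(),
}


def advance(current, new):
    """Return the new status if the change is permitted, otherwise raise ValueError."""
    if new not in ALLOWED[current]:
        raise ValueError(f"cannot change status from {current.value!r} to {new.value!r}")
    return new


status = advance(Status.RECEIVED, Status.UNDER_REVIEW)
status = advance(status, Status.ACCEPTED)
```

In the ERD the status is simply an attribute of the manuscript; a table like ALLOWED documents which values may follow which, something the diagram alone does not capture.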

3 Global Computer Solutions (GCS) is an information technology consulting company with many offices located throughout Europe and South Africa. The company's success is based on its ability to maximise its resources, that is, its ability to assign highly skilled employees to work on projects according to region. To better manage its projects, GCS has contacted you to design a database so that GCS managers can keep track of their customers, employees, projects, project schedules, assignments and invoices.

The GCS database must support all of GCS's operations and information requirements. A basic description of the main entities follows:

• The employees working for GCS have an employee ID, an employee last name, a middle initial, a first name, a region and a date of hire.

• Valid regions are as follows: Northern Europe (NE), Southern Europe (SE), Eastern Europe (EE), Western Europe (WE) and South Africa (SA).

• Each employee has many skills, and many employees have the same skill.

• Each skill has a skill ID, description and rate of pay. Valid skills are as follows: data entry I, data entry II, systems analyst I, systems analyst II, database designer I, database designer II, C11 I, C11 II, Python I, Python II, Java I, Java II, C I, C II, ASP I, ASP II, Oracle DBA, MS SQL Server DBA, network engineer I, network engineer II, web administrator, technical writer and project manager. The following table shows an example of the skills inventory.

Employee

Skill

6

Seaton

Data

Entry I

Data

Entry II

Williams

Systems

Analyst

I

Craig

Systems

Analyst

II

Chandler

Josh;

Brett;

Williams

Josh;

Seaton

Amy

Sewell

Beth;

Joseph;

Burklow

Designer

I

Yarbrough

Peter;

Smith

DB

Designer

II

Yarbrough

Peter;

Pascoe

Kattan Chris; Epahnor

C11 I

Smith Jose;

C11 II

Nokwi Londe;

Python I

Zebras

Steve; Ellis Maria

Python

Zebras

Steve;

Duarte

Miriam;

Ismail

Summers

Victor,

Nkosi Cela

Nokwi Londe;

Pieterse Bush

Hemalika;

Oracle DBA

Smith Jose;

Peter;

Smith

Engineer

I

Ismail

Hemalika;

Smith

Mary

Network

Engineer

II

Ismail

Hemalika;

Smith

Mary

Ismail

Hemalika;

Smith

Mary;

Manager

GCS has number GCS to

many customers.

works

design,

by

projects.

develop

a project date (an

start

estimated

cost

employee

assigned

actual

employee

is, in

effect,

has

Cengage deemed

Learning. that

any

All

Rights

date

as

Kenyon

Tiffany;

Connor

has a customer

based

Sean

ID, customer

on a contract

a computerised

(that

is,

the

date

a project

an

actual

manager project

start

of the

is

the

name,

phone

who is the

manager

to

which date

date,

the

Each

which the

the

end

date,

and

GCS

specific

belongs,

contract

an estimate),

an actual

has

project

projects

(also

customer

project

a brief

was

signed),

a project

a

budget

an actual

cost

(total

and

one

of

pay)

project.

updated

hours

on

end

between

solution.

project ID, the customer

multiplying

and

the

each

each

Friday

employee

does

May not

not materially

that

copied, affect

scanned, the

overall

by adding worked

duplicated, learning

must

In the

performed

that

times

of people

in experience.

whole

or in Cengage

part.

complete

project

to

description,

number

or

project plan.

will be

a brief task

and the

be

of the

development

tasks

needed,

Reserved. content

is

estimate),

has a task ID,

of skill

suppressed

as the

project),

a design

determine

Each task

2020

Roger;

weeks

the

cost

skills

rate

to

cost.

The

type

Jaco

Larry

Each customer

A project

of the

by

actual

must

of

cost

(computed

review

such

project

the

Mudd

and implement

description,

The

Brad;

Bender

Pieterse

and region.

characteristics

Copyright

Surgena;

Paine

Maria

Jose

Network

Project

Ellis

Pascoe Jonathan

Yarbrough

Kilby

Anna;

Pieterse Jaco

Miriam; Pieterse Jaco

Writer

Erin

Emily

Duarte

Technical

Steve

Jaco

ASP II

Web Administrator

Zebras

Bible Hanah

Miriam; Bush Emily

DBA

Emily;

Cope Leslie

Duarte

Server

Robbins

Victor;

ASP I

SQL

Bush

Jonathan

CII

Java II

Erin;

Mary

Kattan

I

Epahnor

Buhle

Shane;

CI

II

Chris;

Khoza

Robbins

DB

Java

Editorial

Amy;

take

schedule

(or

project

from

the

the tasks (with

Due Learning

to

electronic reserves

the

starting

required

rights, the

a project

right

some to

third remove

schedule, plan),

the

and ending

additional

content

manager

beginning

skills)

party

which

may

be

to

suppressed at

any

time

end.

date, the

required

content

to

from if

complete

the

the task.

General tasks

coding,

testing,

schedule

Project

shown

in

the

See

next

Date:

1/3/2019

Start

Date

End

sign-off.

Management

Contract End

Date

database

and

and system

For

Data

Modelling

Advanced

Concepts

265

design, implementation,

example,

GCS

would

have the

project

table.

Sales

Rocks

Start

interview,

evaluation

Description:

ID:1

Company:

are initial

and final

6

Task

Date:

Date:

System 12/2/2019

1/7/2019

Region:

WE

Budget:

R375

Skill(s)

Description

000

Required

Quantity

Required 1/3/13

Initial Interview

6/3/19

11/03/13

15/03/19

Database

11/03/13

12/04/19

System

Project

Manager

Systems

Analyst

Design Design

1 II

1

DB Designer

I

1

DB Designer

I

1

Systems

Analyst

II

1

Systems

Analyst I

2

18/03/13

22/03/19

Database Implementation

Oracle DBA

1

25/03/13

20/05/19

System

CI

2

C II

1

Coding

and

Testing

Oracle

25/03/13

07/06/19

System

10/06/13

14/06/19

Final Evaluation

DBA

Technical

Documentation

1

Writer

Project

Manager

Systems

Analyst

DB Designer

1 1 II

1

I

1

Cobol II 17/06/13

21/06/19

On-Site

System

Online

and

Data Loading

1

Project

Manager

Systems

1

Analyst II

DB Designer

1

I

1

CII 01/07/13

01/07/19

Sign-Off

Assignments: are

assigned

first

projects

analyst

II,

GCS pools to

is assigned that

customer,

can

work

skills

multiple

project

If

can

tasks.

an employee

cannot

is

All suppressed

Rights

task

Reserved. content

does

May not

not materially

be

and

copied, affect

scanned, the

overall

duplicated, learning

task

in experience.

whole

them

to

work

on a project

until his/her

current

or in Cengage

part.

Due Learning

to

electronic reserves

ahead

rights, the

right

of the

some to

third remove

party additional

Using as the

task.

and

a given

on only task

may content

employee

one

from

project 20/02/19

is closed ending

behind)

content

manager

region

match the of (or

the

project.)

same

assignment

does not necessarily be completed

project

project to it,

for

a systems

(The

in the

can

work

employees

06/03/19

duration

assigned

to

pool,

1

For example,

needed.

to the

an employee

assigned

can

are

for the

employees

is closed

or

01/03/19

who are located

However,

a task

manager.

manager

assigning

many

work on another

because

period

Manager

and from this

project

and remains

already

on which an assignment

the

employees

have

(s)he

schedule

for

a project

required

The date

any

and

the

03/03/19,

Learning. that

the

task

on

that

I

by region,

by the

project is created

schedule

at a time.

project

Editorial

know

GCS searches

matching

project

task

you

employees

scheduled

designer

when the

information,

Each

task

schedule, a database

1

Project

all of its

a specific

6

to

(ends).

date of the

schedule.

be

Project

Description:

ID:1

Company:

Sales

Management

Contract

See Rocks

System

As of: 29/03/19

Date: 12/2/2019 Actual

Scheduled Project

Task

Start

Initial Interview

Date

1/3/19

Employee

End Date

Skill

6/3/19

Project

11/03/19

15/03/19

11/03/19

12/04/19

Mgr.

Start

101

Connor

Analyst

II

102

Cele

DB Designer

I

103

Pillay

DB Designer

I

104

Pillay

Sys.

Database

assignments Date

End Date

01/03/19

06/03/13

01/03/19

06/03/13

M.

01/03/19

06/03/13

M.

11/03/19

14/03/13

S.

S.

Design System

Design

Database

Sys. Analyst II

105

Cele S.

11/03/19

Sys. Analyst I

106

Hemalika I.

11/03/19

Sys.

107

Zebras

11/03/19

108

Smith

Analyst

I

DBA

S.

15/03/19

J.

18/03/19

22/03/19

Oracle

25/03/19

20/05/19

Cobol I

109

Summers

Cobol I

110

Ellis

Cobol II

111

Epahnor

DBA

112

Smith

Writer

113

Kilby

19/03/13

Implementation

6

System

Coding

& Testing

Oracle System

25/03/19

07/06/19

10/06/19

14/06/19

Tech.

A.

21/03/19

M.

21/03/19 21/03/19

V.

21/03/19

J.

25/03/19

S.

Documentation Final

Evaluation

Project

Mgr.

Sys. Analyst II DB Designer

I

Cobol II On-Site

17/06/19

System and

21/06/19

Project

Online

Data

DB

Loading

(Note:

01/07/19

The

assignment

assignments whatever

shown number

01/07/19

number previously

matches

Given with

all

your

shown

as

fills

following

end

each

assignment

of the bill to

Cengage

Learning. that

any

All suppressed

Rights

you

project

name; date

are

can

for

of this

see that

schedule.

example, design.

assignment

101,

The

102.)

Assume

assignment

week (Friday)

of the

month

ID, the total

hours

that

number

which the

work log

that

the

can

be

entry is charged.

of the

current

is

project

shown

an employee

assignment,

schedule

on page

a record

of each

month.

month if it

up to the

Obviously,

work log

of the

you

task,

date

projects

run

267.

of the

actual

hours

work log is a weekly form that the employee

of the

week (or

associates

track

be any dates as some

containing

end

day

keep

employee,

form

The

or at the

or the last

worked

ID,

a work log

on a given assignment.

assignment to

could

assignment

kept in

the

Therefore,

ends (which

A sample

works

of each

Friday

end

each

The form

doesnt

contains

fall

of the

month),

work log

the

on a Friday), and the

the number

entry can be related

entries for the first sample

to

project is shown in

table.

Reserved. content

employee

as of the

information:

only one bill. A sample list

deemed

of the

existing

schedule).

an employee

out at the

the following

Mgr.

starts and date assignment

of or behind

date (of

has

I

II

information, the

worked by an employee

2020

Designer

design.

using

the

a prefix ones

preceding task,

at least

The hours

review

II

Project

only

database

of the

assignment ahead

is

are the

a project

require

Copyright

Analyst

Cobol

Sign-Off

Editorial

Mgr.

Sys.

Employee

Week

Name

Hours

Advanced

Worked

Bill

4

xxx

01/03/19

1-101

4

xxx

01/03/19

1-103

4

xxx

08/03/19

1-102

24

xxx

08/03/19

1-101

24

xxx

08/03/19

1-103

24

xxx

15/03/19

1-105

40

xxx

Hemalika I.

15/03/19

1-106

40

xxx

Pillay

15/03/19

1-108

6

xxx

15/03/19

1-104

32

xxx

15/03/18

1-107

35

xxx

22/02/19

1-105

40

Hemalika I.

22/02/19

1-106

40

Ellis

22/02/19

1-110

12

22/02/19

1-111

12

Pillay J.

22/02/19

1-108

12

Pillay

22/02/19

1-112

12

22/02/19

1-109

12

22/02/19

1-107

35

22/02/19

1-105

40

29/03/19

1-106

40

29/03/19

1-110

35

Connor

S.

M.

Cele

S.

Connor

S.

Smith

M.

Cele

S.

J.

Pillay

M.

Zebras Cele

S. S.

M.

Mbaso

V.

J.

Summers Zebras

A. S.

Cele S. Hemalika Ellis

I.

M.

29/03/19

1-111

35

Kilby

S.

29/03/19

1-113

40

Smith

J.

29/03/19

1-112

35

29/03/19

1-109

35

29/03/19

1-107

35

Mbaso V.

Summers Zebras (Note:

A.

S. xxx represents

Finally,

the

the

every

on the

15

period.

entries that

and

each

first

you

work-log

can

work log

Create

all of the

Create

the

Populate

has

Cengage deemed

Learning. that

is to

any

All suppressed

Reserved. content

does

May not

sent to the

not

be

copied, affect

scanned, the

overall

that

only

bill.

the

hours

using the

GCS sent

between

bill in this

will fulfil the

operations

bill

and

table

worked

bill number,

many work log one

01/02/19

one

skill,

additional create

maintain

or

totalling

a bill can refer to one

Number

database.)

customer,

only

267

6

a bill, it updates,

worked is

duplicated, learning

in experience.

entity

whole

customer, required

all of the

as needed (as indicated

materially

to

hours there

are and

to

your

bill. In summary,

are employee,

tables

in

Concepts

on 15/03/19

15/03/19.

and that

bill

covers

above form.

bill. (There

indexes

the tables

Rights

that

a database

entities and

bill number

be related the

assume

required

required

can

totalling

create

and

of that

shown in the

minimum required

assignment,

2020

safely

entries

the

When GCS generates

entry

(Xerox),

matches

written

are part

work log

project

Your assignment The

one that

a bill is

Therefore,

the

Use the

days

entries the

bill ID.

project for that

work-log

for

review

Modelling

1-102

Smith

Copyright

Number

Data

01/03/19

Cele S.

Editorial

Assignment

Ending

6

in the

or in Cengage

part.

Due Learning

region, entities

required

integrity

electronic reserves

project,

that

are

project

problem. schedule,

not listed.)

using

surrogate

primary

keys.

data and forms).

rights, the

in this

relationships.

when

sample

to

described


4 Martial Arts R Us (MARU) needs a database. MARU is a martial arts school with hundreds of students. The database must keep track of all the classes that are offered, who is assigned to teach each class, and which students attend each class. Also, it is important to track the progress of each student as they advance. Create a complete Crow's Foot ERD for these requirements:

Students are given a student number when they join the school. The number is stored along with their name, date of birth, and the date they joined the school.

All instructors are also students, but clearly not all students are instructors. In addition to the normal student information, for all instructors, the date that they start working as an instructor must be recorded along with their instructor status (compensated or volunteer).

An instructor may be assigned to teach any number of classes, but each class has one and only one assigned instructor. Some instructors, especially volunteer instructors, may not be assigned to any class.

A class is offered for a specific level at a specific time, day of the week, and location. For example, one class taught on Mondays at 5:00 p.m. in Room 1 is an intermediate-level class. Another class taught on Mondays at 6:00 p.m. in Room 1 is a beginner-level class. A third class taught on Tuesdays at 5:00 p.m. in Room 2 is an advanced-level class.

Students may attend any class of the appropriate level during each week, so there is no expectation that any particular student will attend any particular class session. Therefore, the attendance of students at each individual class meeting must be tracked.

A student will attend many different class meetings, and each class meeting is normally attended by many students. Some class meetings may not be attended by any students. New students may not have attended any class meetings yet.

At any given meeting of a class, instructors other than the assigned instructor may show up to help. Therefore, a given class meeting may have a head instructor and many assistant instructors, but it will always have at least the one instructor who is assigned to teach that class. For each class meeting, the date of the class and the instructors' roles (head instructor or assistant instructor) need to be recorded. For example, Mr Jones is assigned to teach the Monday, 5:00 p.m., intermediate class in Room 1. During a particular meeting of that class, Mr Jones was the head instructor and Ms Khumalo served as an assistant instructor.

Each student holds a rank in the martial arts. The rank name, belt colour, and rank requirements are stored. Most ranks have numerous rank requirements, but each requirement is associated with only one particular rank. All ranks except white belt have at least one requirement.

A given rank may be held by many students. While it is customary to think of a student as having a single rank, it is necessary to track each student's progress through the ranks. Therefore, every rank that a student attains is kept in the system. New students joining the school are automatically given a white belt rank. The date that a student is awarded each rank should be kept in the system. All ranks have at least one student who has achieved that rank at some time.

5 Global Unified Technology Sales (GUTS) is moving towards a "bring your own device" (BYOD) model for employee computing. Employees can use traditional desktop computers in their offices. They can also use a variety of personal mobile computing devices such as tablets, smartphones and laptops. The new computing model introduces some security risks that GUTS is attempting to address. The company wants to ensure that any devices connecting to their servers are properly registered and approved by the Information Technology department. Create a complete ERD to support the business needs described below:

Every employee works for a department that has a department code, name, mail box number, and phone number. The smallest department currently has 5 employees, and the largest department has 40 employees. This system will only track in which department an employee is currently employed. Very rarely, a new department can be created within the company. At such times, the department may exist temporarily without any employees. For every employee, an employee number and name (first, last, and middle initial) are recorded in the system. It is also necessary to keep each employee's title.

An employee can have many devices registered in the system. Each device is assigned an identification number when it is registered. Most employees have at least one device, but newly hired employees might not have any devices registered initially. For each device, the brand and model need to be recorded. Only devices that are registered to an employee will be in the system. While unlikely, it is possible that a device could transfer from one employee to another. However, if that happens, only the employee who currently owns the device is tracked in the system. When a device is registered in the system, the date of that registration needs to be recorded.

Devices can be either desktop systems that reside in a company office or mobile devices. Desktop devices are typically provided by the company and are intended to be a permanent part of the company network. As such, each desktop device is assigned a static IP address, and the MAC address for the computer hardware is kept in the system. A desktop device is kept in a static location (building name and office number). This location should also be kept in the system so that, if the device becomes compromised, the IT department can dispatch someone to remediate the problem.

For mobile devices, it is important to also capture the device's serial number, which operating system (OS) it is using, and the version of the OS. The IT department is also verifying that each mobile device has a screen lock enabled and has encryption enabled for data. The system should support storing information on whether or not each mobile device has these capabilities enabled.

Once a device is registered in the system, the device should be approved for connections to one or more servers. Not all devices meet the requirements for approval at first, so the device might be in the system for a period of time before it is approved to connect to any server. GUTS has a number of servers, and a device must be approved for each server individually. Therefore, it is possible for a single device to be approved for several servers but not for all servers.

Each server has a name, brand, and IP address. Within the IT department's facilities are a number of climate-controlled server rooms where the physical servers can be located. Which room each server is in should also be recorded. Further, it is necessary to track which operating system is being used on each server. Some servers are virtual servers and some are physical servers. If a server is a virtual server, then the system should track which physical server it is running on. A single physical server can host many virtual servers, but each virtual server is hosted on only one physical server. Only physical servers can host a virtual server. In other words, one virtual server cannot host another virtual server. Not all physical servers host a virtual server.

A server will normally have many devices that are approved to access the server, but it is possible for new servers to be created that do not yet have any approved devices. When a device is approved for connection to a server, the date of that approval should be recorded. It is also possible for a device that was approved to connect to a server to lose its approval. If that happens, the date that the approval was removed should be recorded. If a device loses its approval, it may regain that approval at a later date if whatever circumstance that led to the removal is resolved.

A server can provide many user services, such as email, chat, homework managers, and others. Each service on a server has a unique identification number and name. The date that GUTS began offering that service should be recorded. Each service runs on only one server, although new servers might not offer any services initially. Client-side services are not tracked in this system, so every service must be associated with a server.

Employees must get permission to access a service before they can use it. Most employees have permissions to use a wide array of services, but new employees might not have permission on any service. Each service can support multiple approved employees as users, but new services might not have any approved users at first. The date on which the employee is approved to use a service is tracked by the system. The first time an employee is approved to access a service, the employee must create a username and password. This will be the same username and password that the employee will use for every service for which the employee is eventually approved.

CHAPTER 7 Normalising Database Designs

IN THIS CHAPTER, YOU WILL LEARN:

What normalisation is and what role it plays in the database design process

About the normal forms 1NF, 2NF, 3NF, BCNF and 4NF

How normal forms can be transformed from lower normal forms to higher normal forms

That normalisation and ER modelling are used concurrently to produce a good database design

That some situations require denormalisation to generate information efficiently

PREVIEW

Good database design must be matched to good table structures. In this chapter, you will learn to evaluate and design good table structures to control data redundancies, thereby avoiding data anomalies. The process that yields such desirable results is known as normalisation.

In order to recognise and appreciate the characteristics of a good table structure, it is useful to examine a poor one. Therefore, the chapter begins by examining the characteristics of a poor table structure and the problems it creates. You will then learn how to correct a poor table structure. This methodology will yield important dividends: you will know how to design a good table structure and how to repair an existing poor one.

You will discover not only that data anomalies can be eliminated through normalisation, but also that a properly normalised set of table structures is actually less complicated to use than an unnormalised set. In addition, you will learn that the normalised set of table structures more faithfully reflects an organisation's real operations.

7.1

DATABASE TABLES AND NORMALISATION

Having good relational database software is not enough to avoid the data redundancy discussed in Chapter 1, The Database Approach. If the database tables are treated as though they are files in a file system, the relational database management system (RDBMS) never has a chance to demonstrate its superior data-handling capabilities.

The table is a basic building block in the database design process. Consequently, the table's structure is of great interest. Ideally, the database design process explored in Chapter 5, Data Modelling with Entity Relationship Diagrams, yields good table structures. Yet it is possible to create poor table structures even in a good database design. So, how do you recognise a poor table structure, and how do you produce a good table? The answer to both questions is based on normalisation. Normalisation is a process for evaluating and correcting table structures to minimise data redundancies, thereby reducing the likelihood of data anomalies. The normalisation process involves assigning attributes to tables based on the concept of determination you learned about in Chapter 3, Relational Model Characteristics.

Normalisation works through a series of stages called normal forms. The first three stages are described as first normal form (1NF), second normal form (2NF) and third normal form (3NF). From a structural point of view, 2NF is better than 1NF and 3NF is better than 2NF. For most business database design purposes, 3NF is as high as you need to go in the normalisation process. (Actually, you will also discover in Section 7.3 that properly designed 3NF structures meet the requirements of fourth normal form (4NF).)

Although normalisation is a very important database design ingredient, you should not assume that the highest level of normalisation is always the most desirable. Generally, the higher the normal form, the more relational join operations are required to produce a specified output and the more resources are required by the database system to respond to end-user queries. A successful design must also consider end-user demand for fast performance. Therefore, you will occasionally be expected to denormalise some portions of a database design in order to meet performance requirements. (Denormalisation produces a lower normal form; that is, a 3NF will be converted to a 2NF through denormalisation.) However, the price you pay for increased performance through denormalisation is greater data redundancy.
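The trade-off between join operations and redundancy can be sketched with Python's built-in sqlite3 module. This is only an illustration under stated assumptions: the table names EMPLOYEE, JOB and EMP_JOB and the JOB_CODE column are hypothetical names chosen for this sketch, and the sample values are borrowed from the report discussed in Section 7.2.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Normalised design: each charge rate is stored once, in JOB.
    CREATE TABLE JOB (
        JOB_CODE  TEXT PRIMARY KEY,   -- hypothetical code, not from the text
        JOB_CLASS TEXT NOT NULL,
        CHG_HOUR  REAL NOT NULL
    );
    CREATE TABLE EMPLOYEE (
        EMP_NUM  INTEGER PRIMARY KEY,
        EMP_NAME TEXT NOT NULL,
        JOB_CODE TEXT NOT NULL REFERENCES JOB (JOB_CODE)
    );
    INSERT INTO JOB VALUES ('DBD', 'Database Designer', 82.95);
    INSERT INTO JOB VALUES ('SA',  'Systems Analyst',   76.43);
    INSERT INTO EMPLOYEE VALUES (101, 'John G. News',     'DBD');
    INSERT INTO EMPLOYEE VALUES (104, 'Noxolo K. Maseki', 'SA');
    INSERT INTO EMPLOYEE VALUES (105, 'Alice K. Johnson', 'DBD');

    -- Denormalised design: the rate is repeated in every employee row.
    CREATE TABLE EMP_JOB AS
        SELECT E.EMP_NUM, E.EMP_NAME, J.JOB_CLASS, J.CHG_HOUR
        FROM EMPLOYEE E JOIN JOB J ON E.JOB_CODE = J.JOB_CODE;
""")

# Normalised: a join is needed to pair each employee with a charge rate.
normalised = conn.execute("""
    SELECT E.EMP_NUM, J.JOB_CLASS, J.CHG_HOUR
    FROM EMPLOYEE E JOIN JOB J ON E.JOB_CODE = J.JOB_CODE
    ORDER BY E.EMP_NUM
""").fetchall()

# Denormalised: no join, but the same rate is now stored in several rows.
denormalised = conn.execute(
    "SELECT EMP_NUM, JOB_CLASS, CHG_HOUR FROM EMP_JOB ORDER BY EMP_NUM"
).fetchall()

copies_of_dbd_rate = conn.execute(
    "SELECT COUNT(*) FROM EMP_JOB WHERE JOB_CLASS = 'Database Designer'"
).fetchone()[0]

print(normalised == denormalised)  # both designs answer the query identically
print(copies_of_dbd_rate)          # how many rows carry the Database Designer rate
```

Both queries return the same rows; the denormalised table simply avoids the join at the cost of storing the Database Designer rate once per employee rather than once per job.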

7.2

THE NEED FOR NORMALISATION

The normalisation process can be illustrated with a business application, the simplified database activities of a construction company that manages several building projects. Each project has its own project number, name, employees assigned to it and so on. Each employee has an employee number, name and job classification, such as engineer or computer technician.

The company charges its clients by billing for the hours spent on each contract. The hourly billing rate is dependent on the employee's position. (For example, one hour of computer technician time is billed at a different rate from one hour of engineer time.) Periodically, a report is generated that contains the information displayed in Table 7.1.


TABLE 7.1 A sample report layout

Proj. Project       Employee  Employee               Job Class                Chg/   Hours       Total
Num.  Name          Number    Name                                            Hour  Billed      Charge
15    Evergreen     103       Mzwandile E. Baloyi    Elec. Engineer          67.55    23.8    1 607.69
                    101       John G. News           Database Designer       82.95    19.4    1 609.23
                    105       Alice K. Johnson*      Database Designer       82.95    35.7    2 961.32
                    106       William Smithfield     Programmer              26.66    12.6      335.92
                    102       Kavyara H. Moonsamy    Systems Analyst         76.43    23.8    1 819.03
                                                     Subtotal                                 8 333.19
18    Amber Wave    114       Annelise Jones         Applications Designer   38.00    25.6      972.80
                    118       James J. Frommer       General Support         14.50    45.3      656.85
                    104       Noxolo K. Maseki*      Systems Analyst         76.43    32.4    2 476.33
                    112       Darlene M. Smithson    DSS Analyst             36.30    45.0    1 633.50
                                                     Subtotal                                 5 739.48
22    Rolling Tide  105       Alice K. Johnson       Database Designer       82.95    65.7    5 449.82
                    104       Noxolo K. Maseki       Systems Analyst         76.43    48.4    3 699.21
                    113       Delbert K. Joenbrood*  Applications Designer   38.00    23.6      896.80
                    111       Geoff B. Wabash        Clerical Support        21.23    22.0      467.06
                    106       William Smithfield     Programmer              28.24    12.8      361.47
                                                     Subtotal                                10 874.36
25    Starflight    107       Maria D. Alonzo        Programmer              28.24    25.6      722.94
                    115       Travis B. Bawangi      Systems Analyst         76.43    45.8    3 500.49
                    101       John G. News*          Database Designer       82.95    56.3    4 670.09
                    114       Annelise Jones         Applications Designer   38.00    33.1    1 257.80
                    108       Krishshanth B. Khan    Systems Analyst         76.43    23.6    1 803.75
                    118       James J. Frommer       General Support         14.50    30.5      442.25
                    112       Darlene M. Smithson    DSS Analyst             36.30    41.4    1 502.82
                                                     Subtotal                                13 900.14
                                                     Total                                   38 942.09

Note: * indicates project leader.

The project subtotals and total charge in Table 7.1 are derived attributes and, at this point, not stored in the table.
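Because the total charge and the project subtotals are derived attributes, they can be recomputed from the stored values whenever the report is produced. The following is a minimal sketch using the Evergreen (project 15) rows of Table 7.1. The function name total_charge and the choice of half-up rounding to two decimal places are assumptions made for this illustration; the text does not state a rounding rule.

```python
from decimal import Decimal, ROUND_HALF_UP

# (EMP_NUM, CHG_HOUR, HOURS) for project 15, Evergreen, as printed in Table 7.1.
evergreen_rows = [
    (103, "67.55", "23.8"),
    (101, "82.95", "19.4"),
    (105, "82.95", "35.7"),
    (106, "26.66", "12.6"),
    (102, "76.43", "23.8"),
]

def total_charge(chg_hour: str, hours: str) -> Decimal:
    """Derived attribute: charge per hour times hours billed, rounded to cents."""
    amount = Decimal(chg_hour) * Decimal(hours)
    # Assumed rounding rule: half-up to two decimal places.
    return amount.quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)

# One derived charge per employee, plus the derived project subtotal.
charges = {emp: total_charge(chg, hrs) for emp, chg, hrs in evergreen_rows}
subtotal = sum(charges.values(), Decimal("0"))

for emp, charge in charges.items():
    print(emp, charge)
print("Subtotal", subtotal)
```

Decimal is used instead of float so that products such as 82.95 x 35.7 round the way a printed report would.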

The easiest short-term way to generate the required report might seem to be a table whose contents correspond to the reporting requirements. (See Figure 7.1.)

Online Content: The databases used to illustrate the material in this chapter are available on the online platform accompanying this book.

FIGURE 7.1 Tabular representation of the report format

Database name: Ch07_ConstructCo

Table name: RPT_FORMAT

PROJ_NUM  PROJ_NAME     EMP_NUM  EMP_NAME               JOB_CLASS              CHG_HOUR   HOURS
15        Evergreen     103      Mzwandile E. Baloyi    Elect. Engineer           67.55   23.80
                        101      John G. News           Database Designer         82.95   19.40
                        105      Alice K. Johnson *     Database Designer         82.95   35.70
                        106      William Smithfield     Programmer                26.66   12.60
                        102      Kavyara H. Moonsamy    Systems Analyst           76.43   23.80
18        Amber Wave    114      Annelise Jones         Applications Designer     38.00   24.60
                        118      James J. Frommer       General Support           14.50   45.30
                        104      Noxolo K. Maseki *     Systems Analyst           76.43   32.40
                        112      Darlene M. Smithson    DSS Analyst               36.30   44.00
22        Rolling Tide  105      Alice K. Johnson       Database Designer         82.95   64.70
                        104      Noxolo K. Maseki *     Systems Analyst           76.43   48.40
                        113      Delbert K. Joenbrood   Applications Designer     38.10   23.60
                        111      Geoff B. Wabash        Clerical Support          21.23   22.00
                        106      William Smithfield     Programmer                28.24   12.80
25        Starflight    107      Maria D. Alonzo        Programmer                28.24   24.60
                        115      Travis B. Bawangi      Systems Analyst           76.43   45.80
                        101      John G. News *         Database Designer         82.95   56.30
                        114      Annelise Jones         Applications Designer     38.00   33.10
                        108      Krishshanth B. Khan    Systems Analyst           76.43   23.60
                        118      James J. Frommer       General Support           14.50   30.50
                        112      Darlene M. Smithson    DSS Analyst               36.30   41.40

As you examine the data in Figure 7.1, note that it reflects the assignment of employees to projects. Apparently, an employee can be assigned to more than one project. For example, Darlene Smithson (EMP_NUM = 112) has been assigned to two projects: Amber Wave and Starflight. Given the structure of the data set, each project includes only a single occurrence of any one employee. Therefore, knowing the PROJ_NUM and EMP_NUM value will let you find the job classification and its hourly charge. In addition, you will know the total number of hours each employee worked on each project. (The total charge, a derived attribute whose value can be computed by multiplying the hours billed and the charge per hour, has not been included in Figure 7.1. No structural harm is done if this derived attribute is included.)
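The claim that PROJ_NUM and EMP_NUM together identify a row can be checked with a small lookup structure. The sketch below is illustrative only: it uses four assignments taken from Figure 7.1, and the variable names (assignments, by_key, projects_for_112) are made up for this example.

```python
# A few (PROJ_NUM, EMP_NUM, JOB_CLASS, CHG_HOUR, HOURS) rows taken from Figure 7.1.
assignments = [
    (15, 105, "Database Designer", 82.95, 35.7),
    (18, 112, "DSS Analyst", 36.30, 44.0),
    (22, 105, "Database Designer", 82.95, 64.7),
    (25, 112, "DSS Analyst", 36.30, 41.4),
]

# Index the rows by the composite key; a repeated key would mean the
# combination does not identify a single row.
by_key = {}
for proj_num, emp_num, job_class, chg_hour, hours in assignments:
    key = (proj_num, emp_num)
    if key in by_key:
        raise ValueError(f"duplicate assignment {key}")
    by_key[key] = (job_class, chg_hour, hours)

# EMP_NUM alone is not enough: employee 112 appears on two projects.
projects_for_112 = sorted(p for (p, e) in by_key if e == 112)

print(by_key[(25, 112)])
print(projects_for_112)
```

Looking up (25, 112) returns exactly one job classification, hourly charge and hours value, whereas searching on 112 alone returns one entry per project.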

Unfortunately, the structure of the data set in Figure 7.1 does not conform to the requirements discussed in Chapter 3, Relational Model Characteristics, nor does it handle data very well.

1 The project number (PROJ_NUM) is apparently intended to be a primary key (PK) or at least a part of a PK, but it contains nulls. (Given the preceding discussion, you know that PROJ_NUM + EMP_NUM will define each row.)

2 The table entries invite data inconsistencies. For example, the JOB_CLASS value "Elect. Engineer" might be entered as "Elect.Eng." in some cases, "El. Eng." in others, and "EE" in still others.

3 The table displays data redundancies. Those data redundancies yield the following anomalies:

a Update anomalies. Modifying the JOB_CLASS for employee number 105 requires (potentially) many alterations, one for each EMP_NUM = 105.

b Insertion anomalies. Just to complete a row definition, a new employee must be assigned to a project. If the employee is not yet assigned, a phantom project must be created to complete the employee data entry.

c Deletion anomalies. Suppose that only one employee is associated with a given project. If that employee leaves the company and the employee data are deleted, the project information will also be deleted. To prevent the loss of the project information, a fictitious employee must be created just to save the project information.

In spite of those structural deficiencies, the table structure appears to work; the report is generated with ease. Unfortunately, the report may yield different results, depending on what data anomaly has occurred. For example, if you want to print a report to show the total "hours worked" value by the job classification "Database Designer", that report will not include data for "DB Design" and "Database Design" data entries. (Such reporting anomalies could drive managers up the proverbial wall, and they cannot be fixed through programming.)

Even if very careful data entry auditing can eliminate most of the reporting problems (at a high cost), it is easy to demonstrate that even a simple data entry becomes inefficient. Given the existence of update anomalies, suppose Darlene M. Smithson is assigned to work on the Evergreen project. The data entry clerk must update the PROJECT file with the entry:

15 Evergreen 112 Darlene M. Smithson DSS Analyst 36.30 0.0

to match the attributes PROJ_NUM, PROJ_NAME, EMP_NUM, EMP_NAME, JOB_CLASS, CHG_HOUR and HOURS. (When Ms Smithson has just been assigned to the project, she has not yet worked, so the total number of hours worked is 0.0.)

Each time another employee is assigned to a project, some data entries (such as PROJ_NAME, EMP_NAME and CHG_HOUR) are unnecessarily repeated. Imagine the data entry chore when 200 or 300 table entries must be made! Note that the entry of the employee number should be sufficient to identify Darlene M. Smithson, her job description and her hourly charge. Because there is only one person identified by the number 112, that person's characteristics (name, job classification and so on) should not have to be typed in each time the main file is updated. Unfortunately, the structure displayed in Figure 7.1 does not make allowances for that possibility.

The data redundancy evident in Figure 7.1 leads to wasted disk space. What's more, data redundancy produces data anomalies. For example, suppose the data entry clerk had entered the data as:

15 Evergeen 112 Darla Smithson DCS Analyst 36.30 0.0

At first glance, the data entry appears to be correct. But is Evergeen the same project as Evergreen? And is DCS Analyst supposed to be DSS Analyst? Is Darla Smithson the same person as Darlene M. Smithson? Such confusion is a data integrity problem that was caused because the data entry failed to conform to the rule that all copies of redundant data must be identical.

The possibility of introducing data integrity problems caused by data redundancy must be considered when a database is designed. The relational database environment is especially well suited to helping the designer overcome those problems.
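The rule that all copies of redundant data must be identical can be tested mechanically. The sketch below is a hedged illustration rather than a prescribed auditing method: the function name conflicting_copies is hypothetical, and the rows are the employee 112 entries from Figure 7.1 plus the mistyped Evergeen entry discussed above.

```python
from collections import defaultdict

# (PROJ_NAME, EMP_NUM, EMP_NAME, JOB_CLASS) for employee 112, including the
# mistyped entry discussed in the text.
entries = [
    ("Amber Wave", 112, "Darlene M. Smithson", "DSS Analyst"),
    ("Starflight", 112, "Darlene M. Smithson", "DSS Analyst"),
    ("Evergeen",   112, "Darla Smithson",      "DCS Analyst"),
]

def conflicting_copies(rows):
    """Return EMP_NUM values whose redundant copies are not all identical."""
    seen = defaultdict(set)
    for _proj_name, emp_num, emp_name, job_class in rows:
        seen[emp_num].add((emp_name, job_class))
    # More than one distinct (name, job class) pair means the copies disagree.
    return {emp: copies for emp, copies in seen.items() if len(copies) > 1}

conflicts = conflicting_copies(entries)
print(conflicts)
```

A check like this can only flag the disagreement; it cannot tell which copy is correct, which is why the text treats the redundancy itself as the problem to be designed out.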

NOTE
Remember that the naming convention makes it easy to see what each attribute stands for and what its likely origin is. For example, PROJ_NAME uses the prefix PROJ to indicate that the attribute is associated with the PROJECT table, while the NAME component is self-documenting, too. However, keep in mind that name length is also an issue, especially in the prefix designation. For that reason, the prefix CHG was used rather than CHARGE. (Given the database's context, it is not likely that that prefix will be misunderstood.)

7.3 THE NORMALISATION PROCESS

In this section, you learn how to use normalisation to produce a set of normalised tables to store the data that will be used to generate the required information. The objective is to create tables that have the following characteristics:

Each table represents a single subject. For example, a course table will contain only data that directly pertain to courses. Similarly, a student table will contain only student data.

No data item will be unnecessarily stored in more than one table. The reason for this requirement is to ensure that the data are updated in only one place.

All attributes in a table are dependent on the primary key, the entire primary key and nothing but the primary key.

To accomplish the objective, the normalisation process takes you through the steps that lead to successively higher normal forms. The most common normal forms and their basic characteristics are listed in Table 7.2. You will learn the details of these normal forms in the indicated sections.

TABLE 7.2 Normal forms

Normal form                    Characteristic                                                Section
First normal form (1NF)        Table format; no repeating groups and PK identified           7.3.1
Second normal form (2NF)       1NF and no partial dependencies                               7.3.2
Third normal form (3NF)        2NF and no transitive dependencies                            7.3.3
Boyce-Codd normal form (BCNF)  Every determinant is a candidate key (special case of 3NF)    7.6.1
Fourth normal form (4NF)       3NF and no independent multivalued dependencies               7.6.2

Even higher-level normal forms exist. However, normal forms such as the fifth normal form (5NF) and domain-key normal form (DKNF) are not likely to be encountered in a business environment and are mainly of theoretical interest. Some very specialised applications, such as statistical research, may require normalisation beyond the 4NF, but those applications fall outside the scope of most business operations. Since this book focuses on practical applications of database techniques, the higher-level normal forms are not covered.

7.3.1 Conversion To First Normal Form

Because the relational model views the data as part of a table or collection of tables in which all key values must be identified, the data depicted in Figure 7.1 might not be stored as shown. Note that Figure 7.1 contains what is known as repeating groups. A repeating group derives its name from the fact that a group of multiple entries of the same type can exist for any single key attribute occurrence. In Figure 7.1, note that each project number (PROJ_NUM) can reference a group of related data entries. For example, the Evergreen project (PROJ_NUM = 15) is associated with five data entries, one for each person working on the project. Those entries are related because they will have an identical PROJ_NUM value of 15. Each time a new record is entered for the Evergreen project, the number of entries in the repeating group grows by one.

A relational table must not contain repeating groups. The existence of repeating groups provides evidence that the RPT_FORMAT table in Figure 7.1 fails to meet even the lowest normal form requirements, thus reflecting data redundancies.

Normalising the table structure will reduce the data redundancies. If repeating groups do exist, they must be eliminated by making sure that each row defines a single entity. In addition, the dependencies must be identified to diagnose the normal form. Identification of the normal form will let you know where you are in the normalisation process. The normalisation process starts with a simple three-step procedure.

Step 1: Eliminate the Repeating Groups

Start by presenting the data in a tabular format, where each cell has a single value and there are no repeating groups. To eliminate the repeating groups, eliminate the nulls by making sure that each repeating group attribute contains an appropriate data value. This change converts the table in Figure 7.1 to 1NF in Figure 7.2.

FIGURE 7.2 A table in first normal form

Database name: Ch07_ConstructCo
Table name: DATA_ORG_1NF

PROJ_NUM  PROJ_NAME     EMP_NUM  EMP_NAME                JOB_CLASS              CHG_HOUR  HOURS
15        Evergreen     103      Mzwandile E. Baloyi     Elect. Engineer        67.55     23.80
15        Evergreen     101      John G. News            Database Designer      82.95     19.40
15        Evergreen     105      Alice K. Johnson *      Database Designer      82.95     35.70
15        Evergreen     106      William Smithfield      Programmer             26.66     12.60
15        Evergreen     102      Kavyara H. Moonsamy     Systems Analyst        76.43     23.80
18        Amber Wave    114      Annelise Jones          Applications Designer  38.00     24.60
18        Amber Wave    118      James J. Frommer        General Support        14.50     45.30
18        Amber Wave    104      Noxolo K. Maseki *      Systems Analyst        76.43     32.40
18        Amber Wave    112      Darlene M. Smithson     DSS Analyst            36.30     44.00
22        Rolling Tide  105      Alice K. Johnson        Database Designer      82.95     64.70
22        Rolling Tide  104      Noxolo K. Maseki        Systems Analyst        76.43     48.40
22        Rolling Tide  113      Delbert K. Joenbrood *  Applications Designer  38.00     23.60
22        Rolling Tide  111      Geoff B. Wabash         Clerical Support       21.23     22.00
22        Rolling Tide  106      William Smithfield      Programmer             28.24     12.80
25        Starflight    107      Maria D. Alonzo         Programmer             28.24     24.60
25        Starflight    115      Travis B. Bawangi       Systems Analyst        76.43     45.80
25        Starflight    101      John G. News *          Database Designer      82.95     56.30
25        Starflight    114      Annelise Jones          Applications Designer  38.00     33.10
25        Starflight    108      Krishshanth B. Khan     Systems Analyst        76.43     23.60
25        Starflight    118      James J. Frommer        General Support        14.50     30.50
25        Starflight    112      Darlene M. Smithson     DSS Analyst            36.30     41.40

Step 2: Identify the Primary Key

The layout in Figure 7.2 represents more than a mere cosmetic change. Even a casual observer will note that PROJ_NUM is not an adequate primary key because the project number does not uniquely identify all of the remaining entity (row) attributes. For example, the PROJ_NUM value 15 can identify any one of five employees. To maintain a proper primary key that will uniquely identify any attribute value, the new key must be composed of a combination of PROJ_NUM and EMP_NUM. This is called a composite key. For example, using the data shown in Figure 7.2, if you know that PROJ_NUM = 15 and EMP_NUM = 103, the entries for the attributes PROJ_NAME, EMP_NAME, JOB_CLASS, CHG_HOUR and HOURS must be Evergreen, Mzwandile E. Baloyi, Elect. Engineer, 67.55 and 23.8, respectively.

Step 3: Identify All Dependencies

The identification of the PK in Step 2 means that you have already identified the following dependency:

PROJ_NUM, EMP_NUM → PROJ_NAME, EMP_NAME, JOB_CLASS, CHG_HOUR, HOURS

That is, the PROJ_NAME, EMP_NAME, JOB_CLASS, CHG_HOUR and HOURS values are all dependent on, that is, they are determined by, the combination of PROJ_NUM and EMP_NUM. There are additional dependencies. For example, the project number identifies (determines) the project name. In other words, the project name is dependent on the project number. You can write that dependency as:

PROJ_NUM → PROJ_NAME

Also, if you know an employee number, you also know that employee's name, that employee's job classification and that employee's charge per hour. Therefore, you can identify the dependency shown next:

EMP_NUM → EMP_NAME, JOB_CLASS, CHG_HOUR

However, given the previous dependency components, you can see that knowing the job classification means knowing the charge per hour for that job classification. In other words, you can identify one last dependency:

JOB_CLASS → CHG_HOUR
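The dependencies listed in Step 3 can be sanity-checked against sample data. The following is a minimal Python sketch, not part of the original text: the helper name `holds` is hypothetical, and the rows are a small subset copied from Figure 7.2. Note that sample rows can only refute a dependency; they can never prove that it holds for all future data.

```python
# Column order used by the sample rows below.
COLUMNS = ("PROJ_NUM", "PROJ_NAME", "EMP_NUM", "EMP_NAME",
           "JOB_CLASS", "CHG_HOUR", "HOURS")

# A subset of the rows shown in Figure 7.2.
ROWS = [
    (15, "Evergreen", 101, "John G. News", "Database Designer", 82.95, 19.40),
    (15, "Evergreen", 105, "Alice K. Johnson", "Database Designer", 82.95, 35.70),
    (18, "Amber Wave", 114, "Annelise Jones", "Applications Designer", 38.00, 24.60),
    (22, "Rolling Tide", 105, "Alice K. Johnson", "Database Designer", 82.95, 64.70),
    (25, "Starflight", 101, "John G. News", "Database Designer", 82.95, 56.30),
    (25, "Starflight", 114, "Annelise Jones", "Applications Designer", 38.00, 33.10),
]


def holds(rows, determinant, dependent):
    """Return True if no two rows agree on `determinant` but differ on `dependent`."""
    index = {name: i for i, name in enumerate(COLUMNS)}
    seen = {}
    for row in rows:
        key = tuple(row[index[c]] for c in determinant)
        value = tuple(row[index[c]] for c in dependent)
        # Remember the first value seen for this key; any later mismatch
        # means the determinant does not determine the dependent attributes.
        if seen.setdefault(key, value) != value:
            return False
    return True


print(holds(ROWS, ["PROJ_NUM"], ["PROJ_NAME"]))
print(holds(ROWS, ["EMP_NUM"], ["EMP_NAME", "JOB_CLASS", "CHG_HOUR"]))
print(holds(ROWS, ["JOB_CLASS"], ["CHG_HOUR"]))
print(holds(ROWS, ["PROJ_NUM"], ["HOURS"]))
print(holds(ROWS, ["PROJ_NUM", "EMP_NUM"], ["HOURS"]))
```

In this sample, HOURS is determined only by the full combination of PROJ_NUM and EMP_NUM, which is consistent with the composite key chosen in Step 2.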

The dependencies you have just examined can also be depicted with the help of the diagram shown in Figure 7.3. Because such a diagram depicts all dependencies found within a given table structure, it is known as a dependency diagram. Dependency diagrams are very helpful in getting a bird's-eye view of all of the relationships among a table's attributes, and their use makes it less likely that you will overlook an important dependency.

The diagram below shows two types of dependencies: partial and transitive. Partial dependencies are where a non-key attribute can be determined by part of the primary key. Transitive dependencies are when one non-key attribute determines another non-key attribute.

FIGURE 7.3 First normal form (1NF) dependency diagram

PROJ_NUM  PROJ_NAME  EMP_NUM  EMP_NAME  JOB_CLASS  CHG_HOUR  HOURS
[Dependency arrows: partial dependencies from PROJ_NUM and from EMP_NUM; transitive dependency from JOB_CLASS to CHG_HOUR]

1NF (PROJ_NUM, EMP_NUM, PROJ_NAME, EMP_NAME, JOB_CLASS, CHG_HOUR, HOURS)

PARTIAL DEPENDENCIES:
(PROJ_NUM → PROJ_NAME)
(EMP_NUM → EMP_NAME, JOB_CLASS, CHG_HOUR)

TRANSITIVE DEPENDENCY:
(JOB_CLASS → CHG_HOUR)

As you examine Figure 7.3, note the following dependency diagram features:

1 The primary key attributes are bold, underlined and shaded in a different colour.

2 The arrows above the attributes indicate all desirable dependencies, that is, dependencies that are based on the primary key. In this case, note that the entity's attributes are dependent on the combination of PROJ_NUM and EMP_NUM.

3 The arrows below the dependency diagram indicate less desirable dependencies. Two types of such dependencies exist:

a Partial dependencies. You need to know only the PROJ_NUM to determine the PROJ_NAME; that is, the PROJ_NAME is dependent on only part of the primary key. And you need to know only the EMP_NUM to find the EMP_NAME, the JOB_CLASS and the CHG_HOUR. A dependency based on only a part of a composite primary key is called a partial dependency.

b Transitive dependencies. As you examine Figure 7.3, note that CHG_HOUR is dependent on JOB_CLASS. Because neither CHG_HOUR nor JOB_CLASS is a prime attribute, that is, neither attribute is at least part of a key, the condition is known as a transitive dependency. (In other words, a transitive dependency is a dependency of one non-prime attribute on another non-prime attribute.) The problem with transitive dependencies is that they still yield data anomalies.

NOTE
Partial and transitive dependencies are important concepts when performing normalisation. To recap:

Partial dependency refers to attributes that are only dependent on part of the composite primary key.

Transitive dependency is when an attribute is dependent on any other attribute except the primary key.
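The two definitions in the recap can be expressed as a small classification rule. The following Python sketch is illustrative only: the function name `classify` and the labels it returns are assumptions made for this example, and it presumes a table whose primary key has already been identified.

```python
def classify(determinant, dependent, primary_key):
    """Classify one dependency (determinant -> dependent) relative to the primary key.

    All three arguments are collections of attribute names.
    """
    det, dep, pk = set(determinant), set(dependent), set(primary_key)
    if det == pk:
        return "full"        # depends on the entire primary key
    if det and det < pk and not (dep & pk):
        return "partial"     # depends on only part of a composite key
    if det and not (det & pk) and not (dep & pk):
        return "transitive"  # a non-key attribute determines a non-key attribute
    return "other"


# Primary key and dependencies taken from Figure 7.3.
PK = {"PROJ_NUM", "EMP_NUM"}

print(classify({"PROJ_NUM"}, {"PROJ_NAME"}, PK))
print(classify({"EMP_NUM"}, {"EMP_NAME", "JOB_CLASS", "CHG_HOUR"}, PK))
print(classify({"JOB_CLASS"}, {"CHG_HOUR"}, PK))
print(classify(PK, {"HOURS"}, PK))
```

Applied to the dependencies of Figure 7.3, the rule labels the PROJ_NUM and EMP_NUM dependencies as partial and the JOB_CLASS dependency as transitive.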

Note that Figure 7.3 includes the relational schema for the table in 1NF and a textual notation for each identified dependency.

All relational tables satisfy the 1NF requirements. The problem with the 1NF table structure shown in Figure 7.3 is that it contains partial dependencies, that is, dependencies based on only a part of the primary key.

While partial dependencies are sometimes used for performance reasons, they should be used with caution. (If the information requirements seem to dictate the use of partial dependencies, it is time to evaluate the need for a data warehouse design, discussed in Chapter 15, Databases for Business Intelligence.) Such caution is warranted because a table that contains partial dependencies is still subject to data redundancies and, therefore, to various anomalies. The data redundancies occur because every row entry requires duplication of data. For example, if the user makes 20 table entries for EMP_NUM = 105 during the course of a day, the EMP_NAME, JOB_CLASS and CHG_HOUR must be entered each time, even though the attribute values are identical for EMP_NUM = 105. Such duplication of effort is very inefficient. What's more, the duplication of effort helps create data anomalies; nothing prevents the user from typing slightly different versions of the employee name, the position or the hourly pay. For instance, the employee name for EMP_NUM = 102 might be entered as Kavyara Moonsamy or K. Moonsamy. The project name might also be entered correctly as Evergreen or misspelled as Evergeen. Such data anomalies violate the relational database's integrity and consistency rules.

NOTE
The term first normal form (1NF) describes the tabular format in which:

All of the key attributes are defined.

There are no repeating groups in the table. In other words, each row/column intersection contains one and only one value, not a set of values.

All attributes are dependent on the primary key.

7.3.2 Conversion To Second Normal Form

Fortunately, the relational database design can be improved easily by converting the database into a format known as the second normal form (2NF). The 1NF-to-2NF conversion is simple. Starting with the 1NF format displayed in Figure 7.3, you do the following:

Step 1: Write Each Key Component on a Separate Line

Write each of the composite primary key's components on a separate line; then write the original (composite) key on the last line. For example:

PROJ_NUM
EMP_NUM
PROJ_NUM EMP_NUM

Each component will become the key in a new table. In other words, the original table is now divided into three tables (PROJECT, EMPLOYEE and ASSIGNMENT).

Step 2: Assign Corresponding Dependent Attributes

Use Figure 7.3 to determine those attributes that are dependent on other attributes. The dependencies for the original key components are found by examining the arrows below the dependency diagram shown in Figure 7.3. In other words, the three new tables (PROJECT, EMPLOYEE and ASSIGNMENT) are described by the following relational schemas:

PROJECT (PROJ_NUM, PROJ_NAME)
EMPLOYEE (EMP_NUM, EMP_NAME, JOB_CLASS, CHG_HOUR)
ASSIGNMENT (PROJ_NUM, EMP_NUM, ASSIGN_HOURS)

As the number of hours spent on each project by each employee is dependent on both PROJ_NUM and EMP_NUM in the ASSIGNMENT table, you place those hours in the ASSIGNMENT table as ASSIGN_HOURS.
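The 1NF-to-2NF split can be tried out with Python's built-in sqlite3 module. This is a sketch under stated assumptions rather than the book's procedure: the column types are guesses, the four sample rows are a subset of Figure 7.2, and the use of SELECT DISTINCT to populate the new tables is one possible way to carry out the conversion.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE DATA_ORG_1NF (
    PROJ_NUM INTEGER, PROJ_NAME TEXT, EMP_NUM INTEGER, EMP_NAME TEXT,
    JOB_CLASS TEXT, CHG_HOUR REAL, HOURS REAL,
    PRIMARY KEY (PROJ_NUM, EMP_NUM)
);
-- One table per key component, plus one for the original composite key.
CREATE TABLE PROJECT (PROJ_NUM INTEGER PRIMARY KEY, PROJ_NAME TEXT);
CREATE TABLE EMPLOYEE (
    EMP_NUM INTEGER PRIMARY KEY, EMP_NAME TEXT, JOB_CLASS TEXT, CHG_HOUR REAL
);
CREATE TABLE ASSIGNMENT (
    PROJ_NUM INTEGER REFERENCES PROJECT (PROJ_NUM),
    EMP_NUM INTEGER REFERENCES EMPLOYEE (EMP_NUM),
    ASSIGN_HOURS REAL,
    PRIMARY KEY (PROJ_NUM, EMP_NUM)
);
""")

# A subset of the rows in Figure 7.2.
conn.executemany(
    "INSERT INTO DATA_ORG_1NF VALUES (?, ?, ?, ?, ?, ?, ?)",
    [
        (15, "Evergreen", 101, "John G. News", "Database Designer", 82.95, 19.40),
        (15, "Evergreen", 105, "Alice K. Johnson", "Database Designer", 82.95, 35.70),
        (22, "Rolling Tide", 105, "Alice K. Johnson", "Database Designer", 82.95, 64.70),
        (25, "Starflight", 101, "John G. News", "Database Designer", 82.95, 56.30),
    ],
)

# Each attribute moves to the table whose key it depends on.
conn.executescript("""
INSERT INTO PROJECT SELECT DISTINCT PROJ_NUM, PROJ_NAME FROM DATA_ORG_1NF;
INSERT INTO EMPLOYEE
    SELECT DISTINCT EMP_NUM, EMP_NAME, JOB_CLASS, CHG_HOUR FROM DATA_ORG_1NF;
INSERT INTO ASSIGNMENT SELECT PROJ_NUM, EMP_NUM, HOURS FROM DATA_ORG_1NF;
""")

project_count = conn.execute("SELECT COUNT(*) FROM PROJECT").fetchone()[0]
employee_count = conn.execute("SELECT COUNT(*) FROM EMPLOYEE").fetchone()[0]
assignment_count = conn.execute("SELECT COUNT(*) FROM ASSIGNMENT").fetchone()[0]
print(project_count, employee_count, assignment_count)
```

After the split, each project name and each employee's details are stored once, while ASSIGNMENT keeps one row per project and employee combination.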

NOTE
The ASSIGNMENT table contains a composite primary key composed of the attributes PROJ_NUM and EMP_NUM. Any attribute that is at least part of a key is known as a prime attribute or a key attribute. Therefore, both PROJ_NUM and EMP_NUM are prime (or key) attributes. Conversely, a non-prime attribute, or a non-key attribute, is not even part of a key.

The results of Steps 1 and 2 are displayed in Figure 7.4. At this point, most of the anomalies discussed earlier have been eliminated. For example, if you now want to add/change/delete a PROJECT record, you need to go only to the PROJECT table and add/change/delete only one row.

A partial dependency can exist only when a table's primary key is composed of several attributes, so a table whose primary key consists of only a single attribute is automatically in 2NF when it is in 1NF.

Figure 7.4 still shows a transitive dependency, which can generate anomalies. For example, if the charge per hour changes for a job classification held by many employees, that change must be made for each of those employees. If you forget to update some of the employee records that are affected by the charge per hour change, different employees with the same job description will generate different hourly charges.

FIGURE 7.4 Second normal form (2NF) conversion results

Table name: PROJECT
PROJ_NUM  PROJ_NAME
PROJECT (PROJ_NUM, PROJ_NAME)

Table name: EMPLOYEE
EMP_NUM  EMP_NAME  JOB_CLASS  CHG_HOUR
[Dependency arrow: transitive dependency from JOB_CLASS to CHG_HOUR]
EMPLOYEE (EMP_NUM, EMP_NAME, JOB_CLASS, CHG_HOUR)
TRANSITIVE DEPENDENCY (JOB_CLASS → CHG_HOUR)

Table name: ASSIGNMENT
PROJ_NUM  EMP_NUM  ASSIGN_HOURS
ASSIGNMENT (PROJ_NUM, EMP_NUM, ASSIGN_HOURS)

NOTE
A table is in second normal form (2NF) when:

It is in 1NF.

and

It includes no partial dependencies; that is, no attribute is dependent on only a portion of the primary key.

(It is still possible for a table in 2NF to exhibit transitive dependency; that is, one or more attributes may be functionally dependent on non-key attributes.)

7.3.3 Conversion To Third Normal Form

The data anomalies created by the database organisation shown in Figure 7.4 are easily eliminated by completing the following three steps:

Step 1: Identify Each New Determinant

For every transitive dependency, write its determinant as a PK for a new table. (A determinant is any attribute whose value determines other values within a row.) If you have three different transitive dependencies, you will have three different determinants. Figure 7.4 shows a table that contains only one transitive dependency. Therefore, write the determinant for this transitive dependency as:

JOB_CLASS

Step 2: Identify the Dependent Attributes

Identify the attributes that are dependent on each determinant identified in Step 1 and identify the dependency. In this case, you write:

JOB_CLASS → CHG_HOUR

Name the table to reflect its contents and function. In this case, JOB seems appropriate.

Step 3: Remove the Dependent Attributes from Transitive Dependencies

Eliminate all dependent attributes in the transitive relationship(s) from each of the tables that have such a transitive relationship. In this example, eliminate CHG_HOUR from the EMPLOYEE table shown in Figure 7.4 to leave the EMPLOYEE table dependency definition as:

EMP_NUM → EMP_NAME, JOB_CLASS

Note that the JOB_CLASS remains in the EMPLOYEE table to serve as the FK.

Draw a new dependency diagram to show all of the tables you have defined in Steps 1-3. Check the new tables as well as the tables you modified in Step 3 to make sure that each table has a determinant and that no table contains inappropriate dependencies.

When you have completed Steps 1-3, you will see the results in Figure 7.5. (The usual procedure is to complete Steps 1-3 by simply drawing the revisions as you make them.)

In other words, after the 3NF conversion has been completed, your database contains four tables:

PROJECT (PROJ_NUM, PROJ_NAME)
EMPLOYEE (EMP_NUM, EMP_NAME, JOB_CLASS)
JOB (JOB_CLASS, CHG_HOUR)
ASSIGNMENT (PROJ_NUM, EMP_NUM, ASSIGN_HOURS)

Note that this conversion has eliminated the original EMPLOYEE table's transitive dependency; the tables are now said to be in third normal form (3NF).
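The benefit of moving CHG_HOUR into its own JOB table can be illustrated with Python's built-in sqlite3 module. The sketch below is an assumption-laden illustration: the column types are guesses, the rows are taken from Figure 7.2, and the new rate of 85.00 is a made-up value used only to show that a single-row update reaches every employee in that job classification.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- The determinant JOB_CLASS becomes the PK of the new JOB table.
CREATE TABLE JOB (JOB_CLASS TEXT PRIMARY KEY, CHG_HOUR REAL);
-- JOB_CLASS stays in EMPLOYEE to serve as the FK.
CREATE TABLE EMPLOYEE (
    EMP_NUM INTEGER PRIMARY KEY,
    EMP_NAME TEXT,
    JOB_CLASS TEXT REFERENCES JOB (JOB_CLASS)
);
INSERT INTO JOB VALUES ('Database Designer', 82.95), ('Systems Analyst', 76.43);
INSERT INTO EMPLOYEE VALUES
    (101, 'John G. News', 'Database Designer'),
    (105, 'Alice K. Johnson', 'Database Designer'),
    (102, 'Kavyara H. Moonsamy', 'Systems Analyst');
""")

# Hypothetical rate change: one row is updated, not one row per employee.
conn.execute("UPDATE JOB SET CHG_HOUR = 85.00 WHERE JOB_CLASS = 'Database Designer'")

# Look up each employee's current charge per hour through the FK.
rates = dict(conn.execute("""
    SELECT E.EMP_NUM, J.CHG_HOUR
    FROM EMPLOYEE AS E JOIN JOB AS J ON E.JOB_CLASS = J.JOB_CLASS
"""))
print(rates)
```

Because the charge per hour is stored once per job classification, two employees with the same job description can no longer end up with different hourly charges.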

NOTE
A table is in third normal form (3NF) when:

It is in 2NF.

and

It contains no transitive dependencies.

FIGURE 7.5 Third normal form (3NF) conversion results

Table name: PROJECT
PROJ_NUM  PROJ_NAME
PROJECT (PROJ_NUM, PROJ_NAME)

Table name: EMPLOYEE
EMP_NUM  EMP_NAME  JOB_CLASS
EMPLOYEE (EMP_NUM, EMP_NAME, JOB_CLASS)

Table name: JOB
JOB_CLASS  CHG_HOUR
JOB (JOB_CLASS, CHG_HOUR)

Table name: ASSIGNMENT
PROJ_NUM  EMP_NUM  ASSIGN_HOURS
ASSIGNMENT (PROJ_NUM, EMP_NUM, ASSIGN_HOURS)

7.4 IMPROVING THE DESIGN

The table structures are cleaned up to eliminate the troublesome partial and transitive dependencies. You can now focus on improving the database's ability to provide information and on enhancing its operational characteristics. In the next few paragraphs, you will learn about the different types of issues you need to address to produce a good normalised set of tables. Please note that, for space reasons, each section presents just one example; the designer must apply the principle to all remaining tables in the design. Remember that normalisation cannot, by itself, be relied on to make good designs. Instead, normalisation is valuable because its use helps eliminate data redundancies. At a minimum, all designs should be in third normal form, unless intentionally left in lower normal forms due to performance issues, as discussed later.

7.4.1 Evaluate PK Assignments

As the number of employees grows, a JOB_CLASS value must be entered each time a new employee is entered into the EMPLOYEE table. Unfortunately, it is too easy to make data-entry errors that lead to referential integrity violations. For example, the entry of DB Designer, rather than Database Designer, into the JOB_CLASS attribute in the EMPLOYEE table will trigger such a violation. Therefore, it would be better to add a JOB_CODE attribute to create a unique identifier. The addition of a JOB_CODE attribute produces the dependency:

JOB_CODE → JOB_CLASS, CHG_HOUR

If you assume that the JOB_CODE is a proper primary key, this new attribute does produce a transitive dependency:

JOB_CLASS → CHG_HOUR

A transitive dependency exists because a non-key attribute, the JOB_CLASS, determines the value of another non-key attribute, the CHG_HOUR. However, that transitive dependency is an easy price to pay because the presence of JOB_CODE greatly decreases the likelihood of referential integrity violations. Note that the new JOB table now has two candidate keys (JOB_CODE and JOB_CLASS). In this case, JOB_CODE is the chosen primary key as well as a surrogate key. A surrogate key is an artificial PK introduced by the designer with the purpose of simplifying the assignment of primary keys to tables. Surrogate keys are usually numeric, they are often automatically generated by the DBMS, they are free of semantic content (they have no special meaning), and they are usually hidden from the end users. You learnt about PK characteristics and assignment in Chapter 6, Data Modelling Advanced Concepts.
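The effect of a surrogate JOB_CODE can be sketched with Python's built-in sqlite3 module. This is an illustration under assumptions, not the book's implementation: the column types and constraints are guesses, the job code 999 is a deliberately invalid made-up value, and SQLite only enforces foreign keys once `PRAGMA foreign_keys = ON` has been issued for the connection.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when enabled
conn.executescript("""
CREATE TABLE JOB (
    JOB_CODE INTEGER PRIMARY KEY,    -- surrogate key, generated by SQLite
    JOB_CLASS TEXT NOT NULL UNIQUE,  -- remains a candidate key
    CHG_HOUR REAL NOT NULL
);
CREATE TABLE EMPLOYEE (
    EMP_NUM INTEGER PRIMARY KEY,
    EMP_NAME TEXT NOT NULL,
    JOB_CODE INTEGER NOT NULL REFERENCES JOB (JOB_CODE)
);
""")

cur = conn.execute(
    "INSERT INTO JOB (JOB_CLASS, CHG_HOUR) VALUES (?, ?)",
    ("Database Designer", 82.95),
)
designer_code = cur.lastrowid  # the generated surrogate key value

# A valid entry: the employee refers to an existing job code.
conn.execute("INSERT INTO EMPLOYEE VALUES (?, ?, ?)", (101, "John G. News", designer_code))

# An invalid entry: job code 999 does not exist, so the DBMS rejects the row.
try:
    conn.execute("INSERT INTO EMPLOYEE VALUES (?, ?, ?)", (105, "Alice K. Johnson", 999))
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

employee_count = conn.execute("SELECT COUNT(*) FROM EMPLOYEE").fetchone()[0]
print(rejected, employee_count)
```

With a short numeric code there is no free-text job description to mistype in EMPLOYEE, and any code that does not exist in JOB is refused by the foreign key constraint.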

7.4.2 Evaluate Naming Conventions It is

best

to

adhere

CHG_HOUR addition,

to

the

naming

will be changed

the

Database

attribute

Designer

have

noticed

That

change

and

that

to

name so

on; the was

It generally

is good

the

use of the

practice

EMP_NAME

a last

name,

flexibility.

For

to

generate

if the

phone

lists

If the

For example,

would

be desirable.

Data

Models.

association entries

fits

the

in

the

with the JOB table.

such

as

entries

Systems

better.

conversion

ASSIGNMENT

Therefore,

Also,

from

In

Analyst, you

1NF

may

to

2NF.

table.

to the

use

atomicity

table is not atomic By improving

names,

by the

degree

(An

atomic

names

and

and initials.

In

rules

EMP_NAME

attribute

general,

and

Such

processing

you

also

EMP_INITIAL, a task

designers

is

one

Clearly, the

can be decomposed

of atomicity,

EMP_FNAME,

first

business

because

the

EMP_LNAME,

last

requirement.

Such an attribute is said to display atomicity.)

an initial.

were used in

a real-world

gross

salary

an employee

of employment

gain you

would

querying can

be very

easily difficult

prefer to use simple,

single-valued

requirements.

environment,

payments

hire

and serve

date

and

attribute

as a basis for

several

would have to

UIF (Unemployment

Insurance

(EMP_HIREDATE)

could

awarding

measures. The same principle

other attributes

bonuses

Fund)

be used

to long-term

be

payments to track

employees

an and

must be applied to all other tables in your design.

New Relationships

ability

EMP_NUM

with the

were within a single attribute.

morale-enhancing

to

supply

as a foreign

PROJECTs

designer

you

Adding

7.4.5 Identify

each

and

year-to-date

length

systems

its

describe

ASSIGN_HOURS

worked

pay attention

by sorting

table

added.

for other

quite

2,

New Attributes

EMPLOYEE

employees

to

hours

subdivided.

as indicated

7.4.4 Identify

the

if

name components attributes

The

name

example,

not

JOB_DESCRIPTION

in the EMPLOYEE

a first

Chapter

Atomicity

that cannot be usefully further into

label

in

to indicate

does

changed

you associate

7.4.3 Refine Attribute

outlined

JOB_CHG_HOUR

JOB_CLASS

HOURS

lets

conventions

key in

manager

must take

care

detailed

data

to

information

PROJECT.

without

place

the

about

That action

producing

right

each

projects

ensures

unnecessary

attributes

in the

7.4.6 Refine Primary Keys as Required for

that

and

right

manager

is

ensured

you can access

the

undesirable

data

by using

normalisation

tables

by using

details

duplication.

of The

principles.

Data Granularity

Granularity refers to the level of detail represented by the values stored in a table's row. Data stored at their lowest level of granularity are said to be atomic data. In Figure 7.5, the ASSIGNMENT table in 3NF uses the ASSIGN_HOURS attribute to represent the hours worked by a given employee on a given project. However, are those values recorded at their lowest level of granularity? In other words, does ASSIGN_HOURS represent the hourly total, daily total, weekly total, monthly total or yearly total? Clearly, ASSIGN_HOURS requires more careful definition. In this case, the relevant question would be as follows: For what time frame (hour, day, week, month, and so on) do you want to record the ASSIGN_HOURS data?

For example, assume that the combination of EMP_NUM and PROJ_NUM is an acceptable (composite) primary key in the ASSIGNMENT table. That primary key is useful only in representing the total number of hours an employee worked on a project since its start. Using a surrogate primary key such as ASSIGN_NUM provides lower granularity and yields greater flexibility. For example, assume that the EMP_NUM and PROJ_NUM combination is used as the primary key and then an employee makes two hours-worked entries in the ASSIGNMENT table. That action violates the entity integrity requirement. Even if you add the ASSIGN_DATE as part of a composite PK, an entity integrity violation is still generated if any employee makes two or more entries for the same project on the same day. (The employee may have worked on the project a few hours in the morning and then worked on it again later in the day.) The same data entry yields no problems when ASSIGN_NUM is used as the primary key.
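The granularity argument above can be tried out directly. The following is a minimal sketch, assuming SQLite through Python's standard sqlite3 module; the table names ASSIGNMENT_COMPOSITE and ASSIGNMENT_SURROGATE and the two sample entries are hypothetical and are used only for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Variant 1: composite primary key (EMP_NUM, PROJ_NUM, ASSIGN_DATE).
conn.execute("""
    CREATE TABLE ASSIGNMENT_COMPOSITE (
        EMP_NUM      INTEGER NOT NULL,
        PROJ_NUM     INTEGER NOT NULL,
        ASSIGN_DATE  TEXT    NOT NULL,
        ASSIGN_HOURS REAL    NOT NULL,
        PRIMARY KEY (EMP_NUM, PROJ_NUM, ASSIGN_DATE)
    )
""")

# Variant 2: surrogate primary key ASSIGN_NUM (SQLite assigns the value).
conn.execute("""
    CREATE TABLE ASSIGNMENT_SURROGATE (
        ASSIGN_NUM   INTEGER PRIMARY KEY,
        EMP_NUM      INTEGER NOT NULL,
        PROJ_NUM     INTEGER NOT NULL,
        ASSIGN_DATE  TEXT    NOT NULL,
        ASSIGN_HOURS REAL    NOT NULL
    )
""")

# Hypothetical data: the same employee works on the same project twice in one day.
entries = [(103, 15, "2018-03-05", 1.9), (103, 15, "2018-03-05", 2.5)]


def try_insert_composite(entry):
    """Return True if the row is accepted, False on an entity integrity violation."""
    try:
        conn.execute("INSERT INTO ASSIGNMENT_COMPOSITE VALUES (?, ?, ?, ?)", entry)
        return True
    except sqlite3.IntegrityError:
        return False


composite_results = [try_insert_composite(e) for e in entries]

for e in entries:
    conn.execute(
        "INSERT INTO ASSIGNMENT_SURROGATE "
        "(EMP_NUM, PROJ_NUM, ASSIGN_DATE, ASSIGN_HOURS) VALUES (?, ?, ?, ?)",
        e,
    )
surrogate_rows = conn.execute(
    "SELECT COUNT(*) FROM ASSIGNMENT_SURROGATE"
).fetchone()[0]

print(composite_results, surrogate_rows)
```

With the composite key the second entry is rejected, while the surrogate-key table stores both entries as separate rows.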

NOTE In an ideal (database design) world, the level of desired granularity is determined at the conceptual design or the requirements gathering phase. However, as you have already seen in this chapter, many database designs involve the refinement of existing data requirements, thus triggering design modifications. In a real-world environment, changing granularity requirements may dictate changes in primary key selection. And those changes may ultimately require the use of surrogate keys.

7.4.7 Maintain Historical Accuracy

Writing the job charge per hour into the ASSIGNMENT table is crucial to maintaining the historical accuracy of the data in the ASSIGNMENT table. It would be appropriate to name this attribute ASSIGN_CHG_HOUR. Although this attribute would appear to have the same value as JOB_CHG_HOUR, that is true only if the JOB_CHG_HOUR value remains forever the same. However, it is reasonable to assume that the job charge per hour will change over time. Suppose that the charges to each project were calculated (and billed) by multiplying the hours worked on a project, found in the ASSIGNMENT table, by the charge per hour found in the JOB table. Those charges would always show the current charge per hour stored in the JOB table, rather than the charge per hour that was in effect at the time of the assignment. (See Chapter 3, Section 3.6, for a more detailed discussion on how historical accuracy of the data can be maintained within the database.)

7.4.8 Evaluate Using Derived Attributes

Finally, you can use a derived attribute in the ASSIGNMENT table to store the actual charge made to a project. That derived attribute, to be named ASSIGN_CHARGE, is the result of multiplying the ASSIGN_HOURS by the ASSIGN_CHG_HOUR. From a strictly database point of view, such derived attribute values can be calculated when they are needed to write reports or invoices. However, storing the derived attribute in the table makes it easy to write the application software to produce the desired results. Also, if many transactions must be reported and/or summarised, the availability of the derived attribute will save reporting time. (If the calculation is done at the time of data entry, it will be completed when the user presses the Enter key, thus speeding up the process.)

The enhancements described in the preceding sections are illustrated in the tables shown in Figure 7.6. Figure 7.6 is a vast improvement over the original database design. If the application software is designed properly, the most active table (ASSIGNMENT) requires the entry of only the PROJ_NUM, EMP_NUM and ASSIGN_HOURS values. The values for the attributes ASSIGN_NUM and ASSIGN_DATE can be generated by the application.
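The two ideas above, copying the rate into the assignment row and storing the derived charge, can be sketched in a few lines. This is only an illustration under stated assumptions: the dictionaries stand in for the JOB and ASSIGNMENT tables, the function record_assignment is hypothetical, and the later rate change to 66.76 is assumed purely to show the effect.

```python
from decimal import Decimal, ROUND_HALF_UP

# Stand-in for the JOB table: JOB_CODE -> JOB_CHG_HOUR (rate at assignment time).
job_chg_hour = {503: Decimal("67.55")}


def record_assignment(assign_num, job_code, hours):
    """Copy the current job rate into the row and store the derived charge."""
    rate = job_chg_hour[job_code]
    charge = (hours * rate).quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)
    return {
        "ASSIGN_NUM": assign_num,
        "ASSIGN_HOURS": hours,
        "ASSIGN_CHG_HOUR": rate,   # historical copy of JOB_CHG_HOUR
        "ASSIGN_CHARGE": charge,   # derived: ASSIGN_HOURS * ASSIGN_CHG_HOUR
    }


row = record_assignment(1001, 503, Decimal("2.60"))

# Assumed later change of the job rate in the JOB table.
job_chg_hour[503] = Decimal("66.76")

# Recomputing from the JOB table would now misstate the historical charge.
recomputed = (row["ASSIGN_HOURS"] * job_chg_hour[503]).quantize(
    Decimal("0.01"), rounding=ROUND_HALF_UP
)

print(row["ASSIGN_CHARGE"], recomputed)
```

The stored ASSIGN_CHARGE keeps the amount actually billed, whereas the value recomputed from the JOB table drifts as soon as the rate changes.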

FIGURE 7.6 The completed database

Database name: Ch07_ConstructCo

Table name: PROJECT

PROJ_NUM  PROJ_NAME     EMP_NUM
15        Evergreen     105
18        Amber Wave    104
22        Rolling Tide  113
25        Starflight    101

Table name: JOB

JOB_CODE  JOB_DESCRIPTION        JOB_CHG_HOUR
500       Programmer             28.24
501       Systems Analyst        76.43
502       Database Designer      82.95
503       Electrical Engineer    66.76
504       Mechanical Engineer    53.64
505       Civil Engineer         44.07
506       Clerical Support       21.23
507       DSS Analyst            36.30
508       Applications Designer  38.00
509       Bio Technician         27.29
510       General Support        14.50

Table name: ASSIGNMENT

ASSIGN_NUM  ASSIGN_DATE  PROJ_NUM  EMP_NUM  ASSIGN_HOURS  ASSIGN_CHG_HOUR  ASSIGN_CHARGE
1001        04-Mar-18    15        103      2.60          67.55            175.63
1002        04-Mar-18    18        118      1.40          14.50            20.30
1003        05-Mar-18    15        101      3.60          82.95            298.62
1004        05-Mar-18    22        113      2.50          38.00            95.00
1005        05-Mar-18    15        103      1.90          67.55            128.35
1006        05-Mar-18    25        115      4.20          76.43            321.01
1007        05-Mar-18    22        105      5.20          82.95            431.34
1008        05-Mar-18    25        101      1.70          82.95            141.02
1009        05-Mar-18    15        105      2.00          82.95            165.90
1010        06-Mar-18    15        102      3.80          76.43            290.43
1011        06-Mar-18    22        104      2.60          76.43            198.72
1012        06-Mar-18    15        101      2.30          82.95            190.79
1013        06-Mar-19    25        114      1.80          38.00            68.40
1014        06-Mar-19    22        111      4.00          21.23            84.92
1015        06-Mar-19    25        114      3.40          38.00            129.20
1016        06-Mar-19    18        112      1.20          36.30            43.56
1017        06-Mar-19    18        118      2.00          14.50            29.00
1018        06-Mar-19    18        104      2.60          76.43            198.72
1019        06-Mar-19    15        103      3.00          67.55            202.65
1020        07-Mar-19    22        105      2.70          82.95            223.97
1021        08-Mar-19    25        108      4.20          76.43            321.01
1022        07-Mar-19    25        114      5.80          38.00            220.40
1023        07-Mar-19    22        106      2.40          28.24            67.78

Table name: EMPLOYEE

EMP_NUM  EMP_LNAME   EMP_FNAME    EMP_INITIAL  EMP_HIREDATE  JOB_CODE
101      News        John         G            08-Nov-10     502
102      Moonsamy    Kavyara      H            12-Jul-99     501
103      Baloyi      Mzwandile    E            01-Dec-07     503
104      Maseki      Noxolo       K            15-Nov-98     501
105      Johnson     Alice        K            01-Feb-04     502
106      Smithfield  William                   22-Jun-15     500
107      Alonzo      Maria        D            10-Oct-04     500
108      Khan        Krishshanth  B            22-Aug-99     501
109      Smith       Larry        W            18-Jul-09     501
110      Olenko      Gerald       A            11-Dec-06     505
111      Wabash      Geoff        B            04-Apr-99     506
112      Smithson    Darlene      M            23-Oct-05     507
113      Joenbrood   Delbert      K            15-Nov-04     508
114      Jones       Annelise                  20-Aug-01     508
115      Bawangi     Travis       B            25-Jan-00     501
116      Pratt       Gerald       L            05-Mar-05     510
117      Williamson  Angie        H            19-Jun-04     509
118      Frommer     James        J            04-Jan-16     510

For example, the ASSIGN_NUM can be created by using a counter, and the ASSIGN_DATE can be the system date read by the application and automatically entered into the ASSIGNMENT table. In addition, the application software can automatically insert the correct ASSIGN_CHG_HOUR value by writing the appropriate JOB table's JOB_CHG_HOUR value into the ASSIGNMENT table. (The JOB and ASSIGNMENT tables are related through the JOB_CODE.) If the JOB table's JOB_CHG_HOUR value changes, the next insertion of that value into the ASSIGNMENT table will reflect the change automatically. The table structure thus minimises the need for human intervention. In fact, if the system requires the employees to enter their own work hours, they can scan their EMP_NUM into the ASSIGNMENT table by using a magnetic card reader that enters their identity. Thus, the ASSIGNMENT table's structure can set the stage for maintaining some desired level of security.

7.5 SURROGATE KEY CONSIDERATIONS

Although this design meets the vital entity and referential integrity requirements, the designer must still address some concerns. For example, a composite primary key may become too cumbersome to use as the number of attributes grows. (It becomes difficult to create a suitable foreign key when the related table uses a composite primary key. In addition, a composite primary key makes it more difficult to write search routines.) Or a primary key attribute may simply have too much descriptive content to be usable, which is why the JOB_CODE attribute was added to the JOB table to serve as that table's primary key. When, for whatever reason, the primary key is considered to be unsuitable, designers use surrogate keys.

At the implementation level, a surrogate key is a system-defined attribute generally created and managed via the DBMS. Usually, a system-defined surrogate key is numeric and its value is automatically incremented for each new row. For example, Microsoft Access uses an AutoNumber data type, MS SQL Server uses an identity column, and Oracle uses a sequence object.

Recall from Section 7.4 that the JOB_CODE attribute was designated to be the JOB table's primary key. However, remember that the JOB_CODE attribute does not prevent duplicate entries from being made, as shown in the JOB table in Table 7.3.

TABLE 7.3 Duplicate entries in the job table

JOB_CODE  JOB_DESCRIPTION  JOB_CHG_HOUR
511       Programmer       26.66
512       Programmer       26.66

Clearly, the data entries in Table 7.3 are inappropriate because they duplicate existing records, yet there has been no violation of either entity integrity or referential integrity. This multiple duplicate records problem was created when the JOB_CODE attribute was added as the PK. (When the JOB_DESCRIPTION was initially designated to be the PK, the DBMS would ensure unique values for all job description entries when it was asked to enforce entity integrity. However, that option created the problems that caused use of the JOB_CODE attribute in the first place!) In any case, if JOB_CODE is to be the surrogate PK, you must still ensure the existence of unique values in the JOB_DESCRIPTION through the use of a unique index.

Note that all of the remaining tables (PROJECT, ASSIGNMENT and EMPLOYEE) are subject to the same limitations. For example, if you use the EMP_NUM attribute in the EMPLOYEE table as the PK, you can make multiple entries for the same employee. To avoid that problem, you might create a unique index for EMP_LNAME, EMP_FNAME and EMP_INITIAL. But how would you then deal with two employees named Joe B. Smith? In that case, you might use another (preferably externally defined) attribute, such as ID number, to serve as the basis for a unique index.
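The duplicate-entry problem and the unique-index remedy can be reproduced with a small experiment. This is a minimal sketch, assuming SQLite through Python's sqlite3 module; the index name JOB_DESCRIPTION_UQ is hypothetical, and other DBMSs use their own syntax for surrogate keys and unique indexes.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE JOB (
        JOB_CODE        INTEGER PRIMARY KEY,   -- surrogate key
        JOB_DESCRIPTION TEXT NOT NULL,
        JOB_CHG_HOUR    REAL NOT NULL
    )
""")

# With only the surrogate PK, both rows of Table 7.3 are accepted.
conn.execute("INSERT INTO JOB VALUES (511, 'Programmer', 26.66)")
conn.execute("INSERT INTO JOB VALUES (512, 'Programmer', 26.66)")
duplicates_without_index = conn.execute(
    "SELECT COUNT(*) FROM JOB WHERE JOB_DESCRIPTION = 'Programmer'"
).fetchone()[0]

# Remove the duplicate, then protect the natural key with a unique index.
conn.execute("DELETE FROM JOB WHERE JOB_CODE = 512")
conn.execute("CREATE UNIQUE INDEX JOB_DESCRIPTION_UQ ON JOB (JOB_DESCRIPTION)")

try:
    conn.execute("INSERT INTO JOB VALUES (512, 'Programmer', 26.66)")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

print(duplicates_without_index, rejected)
```

The surrogate key alone lets the duplicate description through; once the unique index exists, the same insert is refused.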

It is worth repeating that database design often involves trade-offs and the exercise of professional judgement. In a real-world environment, you must strike a balance between design integrity and flexibility. For example, you might design the ASSIGNMENT table to use a unique index on PROJ_NUM, EMP_NUM and ASSIGN_DATE if you want to limit an employee to only one ASSIGN_HOURS entry per date. That limitation would ensure that employees couldn't enter the same hours multiple times for any given date. Unfortunately, that limitation is likely to be undesirable from a managerial point of view. After all, if an employee works several different times on a project during any given day, it must be possible to make multiple entries for that same employee and the same project during that day. In that case, the best solution might be to add a new externally defined attribute, such as a stub, voucher or ticket number, to ensure uniqueness. In any case, frequent data audits would be appropriate.

7.6 HIGHER-LEVEL NORMAL FORMS

Tables in 3NF will perform suitably in business transactional databases. However, there are occasions when higher normal forms are useful. In this section, you learn about a special case of 3NF, known as Boyce-Codd normal form (BCNF), and about 4NF.

7.6.1 The Boyce-Codd Normal Form (BCNF)

A table is in Boyce-Codd normal form (BCNF) when every determinant in the table is a candidate key. (Recall from Chapter 3 that a candidate key has the same characteristics as a primary key, but for some reason, it was not chosen to be the primary key.) Clearly, when a table contains only one candidate key, the 3NF and the BCNF are equivalent. Putting that proposition another way, BCNF can be violated only when the table contains more than one candidate key.

NOTE A table is in BCNF when every determinant in the table is a candidate key.

Most designers consider the BCNF as a special case of the 3NF. In fact, if the techniques shown here are used, most tables conform to the BCNF requirements once the 3NF is reached. So, how can a table be in 3NF and not be in BCNF? To answer that question, you must keep in mind that a transitive dependency exists when one non-prime attribute is dependent on another non-prime attribute.

In other words, a table is in 3NF when it is in 2NF and there are no transitive dependencies. However, what about a case in which a non-key attribute is the determinant of a key attribute? That condition does not violate 3NF, yet it fails to meet the BCNF requirements because BCNF requires that every determinant in the table be a candidate key.

The situation just described (a 3NF table that fails to meet BCNF requirements) is shown in Figure 7.7.

FIGURE 7.7 A table that is in 3NF but not in BCNF
[Dependency diagram over the attributes A, B, C and D]

As you examine Figure 7.7, note these functional dependencies:

A + B → C, D
C → B

The table structure shown in Figure 7.7 has no partial dependencies, nor does it contain transitive dependencies. (The condition C → B indicates that a non-key attribute determines part of the primary key and that dependency is not transitive!) Thus, the table structure in Figure 7.7 meets the 3NF requirements. Yet the condition C → B causes the table to fail to meet the BCNF requirements.

FIGURE 7.8 Decomposition to BCNF
[Dependency diagrams: A B C D (3NF, but not BCNF); A C B D (1NF, partial dependency); A C D (3NF and BCNF); C B (3NF and BCNF)]
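The BCNF test for the dependencies of Figure 7.7 can be checked mechanically with an attribute-closure computation. The following is a sketch under stated assumptions: the helper names closure and bcnf_violations are hypothetical, and each functional dependency is written as a pair of attribute sets.

```python
def closure(attrs, fds):
    """Return every attribute functionally determined by attrs under fds."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return frozenset(result)


def bcnf_violations(relation, fds):
    """List the dependencies whose determinant is not a superkey of relation."""
    return [(lhs, rhs) for lhs, rhs in fds if not relation <= closure(lhs, fds)]


relation = frozenset("ABCD")
fds = [
    (frozenset("AB"), frozenset("CD")),  # A + B -> C, D
    (frozenset("C"), frozenset("B")),    # C -> B
]

violations = bcnf_violations(relation, fds)
print(violations)

# After changing the key to A + C and decomposing, each table passes the test.
table_acd = bcnf_violations(frozenset("ACD"), [(frozenset("AC"), frozenset("D"))])
table_cb = bcnf_violations(frozenset("CB"), [(frozenset("C"), frozenset("B"))])
```

Only C → B is reported, because the closure of C is {B, C} and therefore C is not a candidate key of the original table; both decomposed tables report no violations.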

To convert the table structure in Figure 7.7 into table structures that are in 3NF and in BCNF, first change the primary key to A + C. That is an appropriate action because the dependency C → B means that C is, in effect, a superset of B. At this point, the table is in 1NF because it contains a partial dependency C → B. Next, follow the standard decomposition procedures to produce the results shown in Figure 7.8.

To see how this procedure can be applied to an actual problem, examine the sample data in Table 7.4.

TABLE 7.4 Sample data for a BCNF conversion

STU_ID  STAFF_ID  CLASS_CODE  ENROL_GRADE
125     25        21334       A
125     20        32456       C
135     20        28458       B
144     25        27563       C
144     20        32456       B

Table 7.4 reflects the following conditions:

Each CLASS_CODE identifies a class uniquely. This condition illustrates the case in which a course might generate many classes. For example, a course labelled INFS 420 might be taught in two classes (sections), each identified by a unique code to facilitate registration. Thus, the CLASS_CODE 32456 might identify INFS 420, class section 1, while the CLASS_CODE 32457 might identify INFS 420, class section 2. Or the CLASS_CODE 28458 might identify QM 362, class section 5.

A student can take many classes. Note, for example, that student 125 has taken both 21334 and 32456, earning the grades A and C, respectively.

A staff member can teach many classes, but each class is taught by only one staff member. Note that staff member 20 teaches the classes identified as 32456 and 28458.

The structure shown in Table 7.4 is reflected in Panel A of Figure 7.9:

STU_ID + STAFF_ID → CLASS_CODE, ENROL_GRADE
CLASS_CODE → STAFF_ID

Panel A of Figure 7.9 shows a structure that is clearly in 3NF, but the table represented by this structure has a major problem, because it is trying to describe two things: staff assignments to classes and student enrolment information. Such a dual-purpose table structure will cause anomalies. For example, if a different staff member is assigned to teach class 32456, two rows will require updates, thus producing an update anomaly. And if student 135 drops class 28458, information about who taught that class is lost, thus producing a deletion anomaly.

The solution to the problem is to decompose the table structure, following the procedure outlined earlier. Note that the decomposition shown in Panel B of Figure 7.9 yields two table structures that conform to both 3NF and BCNF requirements.

Remember that a table is in BCNF when every determinant in that table is a candidate key. Therefore, when a table contains only one candidate key, 3NF and BCNF are equivalent.

FIGURE 7.9 Another BCNF decomposition

Panel A: 3NF, but not BCNF
STU_ID  STAFF_ID  CLASS_CODE  ENROL_GRADE

Panel B: 3NF and BCNF
STU_ID  CLASS_CODE  ENROL_GRADE
CLASS_CODE  STAFF_ID
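The decomposition in Panel B can be applied to the sample rows of Table 7.4 to see that no information is lost and that both anomalies disappear. This is an illustrative sketch only; the variable names are hypothetical and plain Python sets stand in for the two tables.

```python
# Rows of Table 7.4: (STU_ID, STAFF_ID, CLASS_CODE, ENROL_GRADE)
rows = [
    (125, 25, 21334, "A"),
    (125, 20, 32456, "C"),
    (135, 20, 28458, "B"),
    (144, 25, 27563, "C"),
    (144, 20, 32456, "B"),
]

# Panel B: project onto the two BCNF structures.
enrol = {(stu, cls, grade) for stu, staff, cls, grade in rows}
class_staff = {(cls, staff) for stu, staff, cls, grade in rows}

# CLASS_CODE -> STAFF_ID holds, so the projection maps each class to one staff member.
staff_of = dict(class_staff)

# Joining the projections on CLASS_CODE reproduces the original rows exactly.
rejoined = {(stu, staff_of[cls], cls, grade) for stu, cls, grade in enrol}

# Update anomaly: reassigning class 32456 touches two rows before, one row after.
rows_to_update_before = sum(1 for r in rows if r[2] == 32456)
rows_to_update_after = sum(1 for cls, staff in class_staff if cls == 32456)

# Deletion anomaly: student 135 drops class 28458, yet the staff assignment survives.
enrol_after_drop = {r for r in enrol if not (r[0] == 135 and r[1] == 28458)}
staff_still_known = staff_of[28458]

print(rows_to_update_before, rows_to_update_after, staff_still_known)
```

Because the join of the two projections returns exactly the original five rows, the decomposition is lossless for this sample.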

7.6.2 Fourth Normal Form (4NF)

You may encounter poorly designed databases in which a number of multivalued attributes exist. For example, consider the possibility that an employee can have multiple assignments and can also be involved in multiple service organisations. Suppose employee 10123 does volunteer work for the Red Cross and United Way. In addition, the same employee might be assigned to work on three projects: 1, 3 and 4. Figure 7.10 illustrates how that set of facts can be recorded in very different ways.

If you examine the tables in Figure 7.10, there seems to be a problem. The attributes ORG_CODE and ASSIGN_NUM may each have many different values. That is, the tables contain two sets of independent multivalued dependencies. (One employee can have many service entries and many assignment entries.) The presence of multiple sets of independent multivalued dependencies means that, if versions 1 and 2 are implemented, the tables are likely to contain quite a few null values; in fact, the tables do not even have a viable candidate key. (The EMP_NUM values are not unique, so they cannot be PKs. No combination of the attributes in table versions 1 and 2 (Table VOLUNTEER_V1 and VOLUNTEER_V2) can be used to create a PK because some of them contain nulls.) Such a condition is not desirable, especially when there are thousands of employees, many of whom may have multiple job assignments and many service activities. Version 3 (VOLUNTEER_V3) at least has a PK, but it is composed of all of the attributes in the table. In fact, version 3 meets 3NF requirements, yet it contains many redundancies that are clearly undesirable.

The solution is to eliminate the problems caused by independent multivalued dependencies. You do this by creating the ASSIGNMENT and SERVICE_V1 tables depicted in Figure 7.11. As you examine Figure 7.11, note that neither the ASSIGNMENT nor the SERVICE_V1 table contains independent multivalued dependencies. Those tables are said to be in 4NF.

FIGURE 7.10 Tables with multivalued dependencies

Database name: Ch07_Service

Table name: VOLUNTEER_V1

EMP_NUM  ORG_CODE  ASSIGN_NUM
10123    RC        1
10123    UW        3
10123              4

Table name: VOLUNTEER_V2

EMP_NUM  ORG_CODE  ASSIGN_NUM
10123    RC
10123    UW
10123              1
10123              3
10223              4

Table name: VOLUNTEER_V3

EMP_NUM  ORG_CODE  ASSIGN_NUM
10123    RC        1
10123    RC        3
10123    UW        4

NOTE A table is in fourth normal form (4NF) when it is in 3NF and has no multiple sets of multivalued dependencies.

If you follow the proper design procedures illustrated in this book, you shouldn't encounter the previously described problem. Specifically, the discussion of 4NF is largely academic if you make sure that your tables conform to the following two rules:

1 All attributes must be dependent on the primary key, but they must be independent of each other.
2 No row may contain two or more multivalued facts about an entity.
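The effect of keeping two independent multivalued facts in separate tables can be illustrated with the employee 10123 data from the SERVICE_V1 and ASSIGNMENT tables of Figure 7.11. This is a rough sketch: the single-table variant combined is hypothetical and is built only to show how many rows are needed when every service entry must be paired with every assignment.

```python
from itertools import product

# Independent facts about employee 10123, one table per multivalued fact.
service_v1 = [(10123, "RC"), (10123, "UW"), (10123, "WF")]   # (EMP_NUM, ORG_CODE)
assignment = [(1, 10123, 1), (3, 10123, 3), (4, 10123, 4)]   # (ASSIGN_NUM, EMP_NUM, PROJ_CODE)

# Hypothetical single table holding both facts: every organisation must be
# paired with every assignment to avoid implying a link between them.
combined = [
    (emp, org, assign_num)
    for (emp, org), (assign_num, emp2, proj) in product(service_v1, assignment)
    if emp == emp2
]

rows_single_table = len(combined)
rows_4nf = len(service_v1) + len(assignment)

# Projecting the combined table gives back the two independent fact sets.
orgs_back = sorted({(emp, org) for emp, org, _ in combined})
assigns_back = sorted({(emp, a) for emp, _, a in combined})

print(rows_single_table, rows_4nf)
```

Nine rows are needed in the combined form against six in the 4NF form, and recording one more organisation costs a single SERVICE_V1 row rather than one row per assignment.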

FIGURE 7.11 A set of tables in 4NF
[Relational diagram]

Database name: Ch07_Service

Table name: EMPLOYEE

EMP_NUM  EMP_LNAME
10121    Rogers
10122    O'Leery
10123    Panera
10124    Johnson

Table name: PROJECT

PROJ_CODE  PROJ_NAME   PROJ_BUDGET
1          BeThere     808 363.55
2          BlueMoon    15 956 900.32
3          GreenThumb  2 555 220.24
4          GoFast      4 482 460.00
5          GoSlow      791 975.00

Table name: ORGANISATION

ORG_CODE  ORG_NAME
RC        Red Cross
UW        United Way
WF        Wildlife Fund

Table name: ASSIGNMENT

ASSIGN_NUM  EMP_NUM  PROJ_CODE
1           10123    1
2           10121    2
3           10123    3
4           10123    4
5           10121    1
6           10124    2
7           10124    3
8           10124    5

Table name: SERVICE_V1

EMP_NUM  ORG_CODE
10123    RC
10123    UW
10123    WF

7.7 NORMALISATION AND DATABASE DESIGN

The tables shown in Figure 7.6 illustrate how normalisation procedures can be used to produce good tables from poor ones. You will likely have ample opportunity to put this skill into practice when you begin to work with real-world databases. Normalisation should be part of the design process. Therefore, make sure that proposed entities meet the required normal form before the table structures are created. Keep in mind that, if you follow the design procedures discussed in Chapter 3, Relational Model Characteristics, and Chapter 5, Data Modelling with Entity Relationship Diagrams, the likelihood of data anomalies will be small. (But even the best database designers are known to make occasional mistakes that come to light during normalisation checks.) However, many of the real-world databases you encounter will have been improperly designed or burdened with anomalies if they were improperly modified during the course of time. And that means you may be asked to redesign and modify existing databases that are, in effect, anomaly traps. Therefore, you should be aware of good design principles and procedures as well as normalisation procedures.

First, an ERD is created through an iterative process. You begin by identifying relevant entities, their attributes and their relationships. Then you use the results to identify additional entities and attributes. The ERD provides the big picture, or macro view, of an organisation's data requirements and operations.

Second, normalisation focuses on the characteristics of specific entities; that is, normalisation represents a micro-view of the entities within the ERD. And as you learnt in the previous sections of this chapter, the normalisation process may yield additional entities and attributes to be incorporated into the ERD. Therefore, it is difficult to separate the normalisation process from the ER modelling process; the two techniques are used in an iterative and incremental process.

To illustrate the proper role of normalisation in the design process, let's re-examine the operations of the contracting company whose tables were normalised in the preceding sections. Those operations can be summarised by using the following business rules:

The company manages many projects.
Each project requires the services of many employees.
An employee may be assigned to several different projects.
Some employees are not assigned to a project and perform duties not specifically related to a project. Some employees are part of a labour pool, to be shared by all project teams. For example, the company's executive secretary would not be assigned to any one particular project.
Each employee has a single primary job classification. That job classification determines the hourly billing rate.
Many employees can have the same job classification. For example, the company employs more than one electrical engineer.

Given that simple description of the company's operations, two entities and their attributes are initially defined:

PROJECT (PROJ_NUM, PROJ_NAME)
EMPLOYEE (EMP_NUM, EMP_LNAME, EMP_FNAME, EMP_INITIAL, JOB_DESCRIPTION, JOB_CHG_HOUR)

Those two entities constitute the initial ERD shown in Figure 7.12.

FIGURE 7.12 Initial contracting company ERD

After creating the initial ERD shown in Figure 7.12, the normal forms are defined:

PROJECT is in 3NF and needs no modification at this point. EMPLOYEE requires additional scrutiny. The JOB_DESCRIPTION attribute defines job classifications such as systems analyst, database designer and programmer. In turn, those classifications determine the billing rate, JOB_CHG_HOUR. Therefore, EMPLOYEE contains a transitive dependency.

The removal of EMPLOYEE's transitive dependency yields three entities:

PROJECT (PROJ_NUM, PROJ_NAME)
EMPLOYEE (EMP_NUM, EMP_LNAME, EMP_FNAME, EMP_INITIAL, JOB_CODE)
JOB (JOB_CODE, JOB_DESCRIPTION, JOB_CHG_HOUR)

Because the normalisation process yields an additional entity (JOB), the initial ERD is modified as shown in Figure 7.13.

FIGURE 7.13 Modified contracting company ERD
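The effect of moving JOB_DESCRIPTION and JOB_CHG_HOUR into their own JOB entity can be sketched with Python's built-in sqlite3 module. This is only an illustration under stated assumptions: the column types, the in-memory database and the 85.00 rate change are hypothetical, while the entity and attribute names and the sample rows follow the chapter's contracting company data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE JOB (
        JOB_CODE        INTEGER PRIMARY KEY,
        JOB_DESCRIPTION TEXT NOT NULL,
        JOB_CHG_HOUR    REAL NOT NULL
    );
    CREATE TABLE EMPLOYEE (
        EMP_NUM     INTEGER PRIMARY KEY,
        EMP_LNAME   TEXT NOT NULL,
        EMP_FNAME   TEXT NOT NULL,
        EMP_INITIAL TEXT,
        JOB_CODE    INTEGER NOT NULL REFERENCES JOB (JOB_CODE)
    );
    INSERT INTO JOB VALUES (501, 'Systems Analyst', 76.43),
                           (502, 'Database Designer', 82.95);
    INSERT INTO EMPLOYEE VALUES (101, 'News', 'John', 'G', 502),
                                (102, 'Moonsamy', 'Kavyara', 'H', 501),
                                (105, 'Johnson', 'Alice', 'K', 502);
""")

# Hypothetical rate change: because the rate is stored once per job
# classification, a single UPDATE on JOB covers every employee in that job.
conn.execute("UPDATE JOB SET JOB_CHG_HOUR = 85.00 WHERE JOB_CODE = 502")

rows = conn.execute("""
    SELECT E.EMP_NUM, J.JOB_DESCRIPTION, J.JOB_CHG_HOUR
    FROM EMPLOYEE AS E
    JOIN JOB AS J ON J.JOB_CODE = E.JOB_CODE
    ORDER BY E.EMP_NUM
""").fetchall()

for row in rows:
    print(row)
```

Had the rate stayed in EMPLOYEE, the same change would have needed one update per affected employee row, which is where update anomalies tend to creep in.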

To represent the *:* relationship between EMPLOYEE and PROJECT, you might think that two 1:* relationships could be used: an employee can be assigned to many projects, and each project can have many employees assigned to it (see Figure 7.14). Unfortunately, that representation yields a design that cannot be correctly implemented.

Because the *:* relationship between EMPLOYEE and PROJECT cannot be implemented, the ERD in Figure 7.14 must be modified to include the ASSIGNMENT entity to track the assignment of employees to projects, thus yielding the ERD shown in Figure 7.15. The ASSIGNMENT entity in Figure 7.15 uses the primary keys from the entities PROJECT and EMPLOYEE to serve as its foreign keys. However, note that in this implementation, the ASSIGNMENT entity's surrogate primary key is ASSIGN_NUM, to avoid the use of a composite primary key. Therefore, the "enters" relationship between EMPLOYEE and ASSIGNMENT and the "requires" relationship between PROJECT and ASSIGNMENT are in fact weak or non-identifying relationships.

NOTE
In Chapter 5, Data Modelling with Entity Relationship Diagrams, it was discussed that UML notation does not make a distinction between weak and strong relationships.

FIGURE 7.14 Modified contracting company ERD

FIGURE 7.15 Final contracting company ERD
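One way to picture the ASSIGNMENT entity is as a table with its own surrogate key (ASSIGN_NUM) and two ordinary foreign keys. The sketch below uses Python's sqlite3 module; the column types and the trimmed-down PROJECT and EMPLOYEE tables are simplifying assumptions, and the rows are a small subset of the chapter's sample data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when enabled
conn.executescript("""
    CREATE TABLE PROJECT  (PROJ_NUM INTEGER PRIMARY KEY, PROJ_NAME TEXT NOT NULL);
    CREATE TABLE EMPLOYEE (EMP_NUM  INTEGER PRIMARY KEY, EMP_LNAME TEXT NOT NULL);
    CREATE TABLE ASSIGNMENT (
        ASSIGN_NUM INTEGER PRIMARY KEY,  -- surrogate key, not (PROJ_NUM, EMP_NUM)
        PROJ_NUM   INTEGER NOT NULL REFERENCES PROJECT (PROJ_NUM),
        EMP_NUM    INTEGER NOT NULL REFERENCES EMPLOYEE (EMP_NUM)
    );
    INSERT INTO PROJECT  VALUES (15, 'Evergreen'), (25, 'Starflight');
    INSERT INTO EMPLOYEE VALUES (101, 'News'), (103, 'Baloyi');
    INSERT INTO ASSIGNMENT VALUES (1001, 15, 103), (1003, 15, 101),
                                  (1008, 25, 101), (1012, 15, 101);
""")

# One employee on many projects ...
projects_of_101 = [r[0] for r in conn.execute(
    "SELECT DISTINCT PROJ_NUM FROM ASSIGNMENT WHERE EMP_NUM = 101 ORDER BY PROJ_NUM")]
# ... and one project with many employees.
staff_of_15 = [r[0] for r in conn.execute(
    "SELECT DISTINCT EMP_NUM FROM ASSIGNMENT WHERE PROJ_NUM = 15 ORDER BY EMP_NUM")]

# An assignment that points at a non-existent project is rejected.
try:
    conn.execute("INSERT INTO ASSIGNMENT VALUES (9999, 99, 101)")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

print(projects_of_101, staff_of_15, rejected)
```

Because the key is a surrogate, the same employee can be assigned to the same project more than once (rows 1003 and 1012), which a composite key on (PROJ_NUM, EMP_NUM) alone would not allow.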

As you examine Figure 7.15, note that the ASSIGN_HOURS attribute is assigned to the composite entity named ASSIGNMENT. Because you will likely need detailed information about each project's manager, the creation of a "manages" relationship is useful. The "manages" relationship is implemented through the foreign key in PROJECT. Finally, some additional attributes may be created to improve the system's ability to generate additional information. For example, you may want to include the date on which the employee was hired (EMP_HIREDATE) to keep track of worker employment length. Based on this last modification, the model should include four entities and their attributes:

PROJECT (PROJ_NUM, PROJ_NAME, EMP_NUM)
EMPLOYEE (EMP_NUM, EMP_LNAME, EMP_FNAME, EMP_INITIAL, EMP_HIREDATE, JOB_CODE)
JOB (JOB_CODE, JOB_DESCRIPTION, JOB_CHG_HOUR)
ASSIGNMENT (ASSIGN_NUM, ASSIGN_DATE, PROJ_NUM, EMP_NUM, ASSIGN_HOURS, ASSIGN_CHG_HOUR, ASSIGN_CHARGE)

The design process is now on the right track. The ERD represents the operations accurately, and the entities now reflect their conformance to 3NF. The combination of normalisation and ER modelling yields a useful ERD, whose entities may now be translated into appropriate table structures. As you examine Figure 7.15, note that PROJECT is optional to EMPLOYEE in the "manages" relationship. This optionality exists because not all employees manage projects. The final database contents are shown in Figure 7.16.

FIGURE 7.16 The implemented database

Database name: Ch07_ConstructCo

Table name: EMPLOYEE

EMP_NUM  EMP_LNAME   EMP_FNAME    EMP_INITIAL  EMP_HIREDATE  JOB_CODE
101      News        John         G            08-Nov-10     502
102      Moonsamy    Kavyara      H            12-Jul-99     501
103      Baloyi      Mzwandile    E            01-Dec-07     503
104      Maseki      Noxolo       K            15-Nov-98     501
105      Johnson     Alice        K            01-Feb-04     502
106      Smithfield  William                   22-Jun-15     500
107      Alonzo      Maria        D            10-Oct-04     500
108      Khan        Krishshanth  B            22-Aug-99     501
109      Smith       Larry        W            18-Jul-09     501
110      Olenko      Gerald       A            11-Dec-06     505
111      Wabash      Geoff        B            04-Apr-99     506
112      Smithson    Darlene      M            23-Oct-05     507
113      Joenbrood   Delbert      K            15-Nov-04     508
114      Jones       Annelise                  20-Aug-01     508
115      Bawangi     Travis       B            25-Jan-00     501
116      Pratt       Gerald       L            05-Mar-05     510
117      Williamson  Angie        H            19-Jun-2004   509
118      Frommer     James        J            04-Jan-2016   510

Table name: JOB

JOB_CODE  JOB_DESCRIPTION        JOB_CHG_HOUR
500       Programmer             28.24
501       Systems Analyst        76.43
502       Database Designer      82.95
503       Electrical Engineer    66.76
504       Mechanical Engineer    53.64
505       Civil Engineer         44.07
506       Clerical Support       21.23
507       DSS Analyst            36.30
508       Applications Designer  38.00
509       Bio Technician         27.29
510       General Support        14.50

Table name: PROJECT

PROJ_NUM  PROJ_NAME     EMP_NUM
15        Evergreen     105
18        Amber Wave    104
22        Rolling Tide  113
25        Starflight    101

Table name: ASSIGNMENT

ASSIGN_NUM  ASSIGN_DATE  PROJ_NUM  EMP_NUM  ASSIGN_HOURS  ASSIGN_CHG_HOUR  ASSIGN_CHARGE
1001        04-Mar-19    15        103      2.60          67.55            175.63
1002        04-Mar-19    18        118      1.40          14.50            20.30
1003        05-Mar-19    15        101      3.60          82.95            298.62
1004        05-Mar-19    22        113      2.50          38.00            95.00
1005        05-Mar-19    15        103      1.90          67.55            128.35
1006        05-Mar-19    25        115      4.20          76.43            321.01
1007        05-Mar-19    22        105      5.20          82.95            431.34
1008        05-Mar-19    25        101      1.70          82.95            141.02
1009        05-Mar-19    15        105      2.00          82.95            165.90
1010        06-Mar-19    15        102      3.80          76.43            290.43
1011        06-Mar-19    22        104      2.60          76.43            198.72
1012        06-Mar-19    15        101      2.30          82.95            190.79
1013        06-Mar-19    25        114      1.80          38.00            68.40
1014        06-Mar-19    22        111      4.00          21.23            84.92
1015        06-Mar-19    25        114      3.40          38.00            129.20
1016        06-Mar-19    18        112      1.20          36.30            43.56
1017        06-Mar-19    18        118      2.00          14.50            29.00
1018        06-Mar-19    18        104      2.60          76.43            198.72
1019        06-Mar-19    15        103      3.00          67.55            202.65
1020        07-Mar-19    22        105      2.70          82.95            223.97
1021        08-Mar-19    25        108      4.20          76.43            321.01
1022        07-Mar-19    25        114      5.80          38.00            220.40
1023        07-Mar-19    22        106      2.40          28.24            67.78
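In the ASSIGNMENT sample data, ASSIGN_CHARGE appears to be the product of ASSIGN_HOURS and ASSIGN_CHG_HOUR. The following sketch checks that reading against the first eight assignment rows and totals the charges per project. The rounding rule (half-up to two decimal places) is an assumption that happens to reproduce the stored values; the helper name charge is hypothetical.

```python
from collections import defaultdict
from decimal import Decimal, ROUND_HALF_UP

# (ASSIGN_NUM, PROJ_NUM, ASSIGN_HOURS, ASSIGN_CHG_HOUR, ASSIGN_CHARGE)
ASSIGNMENTS = [
    (1001, 15, "2.60", "67.55", "175.63"),
    (1002, 18, "1.40", "14.50", "20.30"),
    (1003, 15, "3.60", "82.95", "298.62"),
    (1004, 22, "2.50", "38.00", "95.00"),
    (1005, 15, "1.90", "67.55", "128.35"),
    (1006, 25, "4.20", "76.43", "321.01"),
    (1007, 22, "5.20", "82.95", "431.34"),
    (1008, 25, "1.70", "82.95", "141.02"),
]


def charge(hours: str, rate: str) -> Decimal:
    """Hours times hourly rate, rounded half-up to two decimals (assumed rule)."""
    return (Decimal(hours) * Decimal(rate)).quantize(
        Decimal("0.01"), rounding=ROUND_HALF_UP)


# Rows whose stored charge differs from the recomputed one (expected: none).
mismatches = [num for num, _, hours, rate, stored in ASSIGNMENTS
              if charge(hours, rate) != Decimal(stored)]

# Total charge per project over the rows listed above.
totals = defaultdict(Decimal)
for _, proj, hours, rate, _ in ASSIGNMENTS:
    totals[proj] += charge(hours, rate)

print(mismatches)
print(dict(totals))
```

Decimal is used instead of float because products such as 1.90 x 67.55 fall exactly on a rounding boundary, where binary floating point may round the wrong way.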

7.8 DENORMALISATION

Although the creation of normalised relations is an important database design goal, it is only one of many such goals. Good database design also considers processing requirements. As tables are decomposed to conform to normalisation requirements, the number of database tables expands. Joining the larger number of tables takes additional input/output (I/O) operations and processing logic, thereby reducing system speed. Consequently, occasional circumstances may allow some degree of denormalisation so processing speed can be increased.

Keep in mind that the advantage of higher processing speed must be carefully weighed against the disadvantage of data anomalies. On the other hand, some anomalies are of only theoretical interest. For example, should people in a real-world database environment worry that a POST_CODE determines CITY in a CUSTOMER table whose primary key is the customer number? Is it really practical to produce a separate table for:

POST_CODE (POST_CODE, CITY)

to eliminate a transitive dependency from the CUSTOMER table? (Perhaps your answer to that question changes if you are in the business of producing mailing lists.) The advice is simple: use common sense during the normalisation process.

Normalisation purity is often difficult to sustain in the modern database environment. The conflicts between design efficiency, information requirements and processing speed are often resolved through compromises that may include denormalisation. You will also learn (in Chapter 15, Databases for Business Intelligence) that lower normalisation forms occur (and are even required) in specialised databases known as data warehouses. Such specialised databases reflect the ever-growing demand for greater scope and depth in the data on which decision-support systems increasingly rely. You will discover that the data warehouse routinely uses 2NF structures in its complex, multilevel, multisource data environment. In short, although normalisation is very important, especially in the so-called production database environment, 2NF is no longer disregarded as it once was.
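The POST_CODE and CITY trade-off can be made concrete with a small sketch. It keeps CITY inside a denormalised CUSTOMER table and shows the kind of anomaly that may result when only one of several rows sharing a post code is updated. The table layout, customer names and city values are hypothetical and serve only as an illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Denormalised: CITY depends on POST_CODE, not directly on CUS_NUM.
    CREATE TABLE CUSTOMER (
        CUS_NUM   INTEGER PRIMARY KEY,
        CUS_LNAME TEXT NOT NULL,
        POST_CODE TEXT NOT NULL,
        CITY      TEXT NOT NULL
    );
    INSERT INTO CUSTOMER VALUES (1, 'Dlamini', '12345', 'Murkywater'),
                                (2, 'Jones',   '12345', 'Murkywater'),
                                (3, 'Smith',   '12349', 'Highlight');
""")

# A careless update touches only one of the two rows that share the post code.
conn.execute("UPDATE CUSTOMER SET CITY = 'Murky Water' WHERE CUS_NUM = 1")

# Post codes that now map to more than one city spelling.
conflicts = conn.execute("""
    SELECT POST_CODE, COUNT(DISTINCT CITY)
    FROM CUSTOMER
    GROUP BY POST_CODE
    HAVING COUNT(DISTINCT CITY) > 1
""").fetchall()

print(conflicts)
```

Whether this risk justifies a separate POST_CODE table is exactly the common-sense judgement the text describes: for a mailing-list business it probably does, for many other applications it may not.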

Although 2NF tables cannot always be avoided, the problem of working with tables that contain partial and/or transitive dependencies in a production database environment should not be minimised. Aside from the possibility of troublesome data anomalies being created, unnormalised tables in a production database tend to suffer from these defects:

Data updates are less efficient because programs that read and update tables must deal with larger tables.

Indexing is more cumbersome. It is simply not practical to build all of the indexes required for the many attributes that may be located in a single unnormalised table.

Unnormalised tables yield no simple strategies for creating virtual tables known as views. (You will learn how to create and use views in Chapter 8, Beginning Structured Query Language.)

Remember that good design cannot be created in the application programs that use a database. Also keep in mind that unnormalised database tables often lead to various data redundancy disasters in production databases such as the ones examined thus far. In other words, use denormalisation cautiously and make sure that you can explain why the unnormalised tables are a better choice in some circumstances than their normalised counterparts.
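As a brief illustration of the point about views, the sketch below defines a view over normalised EMPLOYEE and JOB tables so that applications can read a wide, joined result without that result being stored redundantly. The view name EMP_BILLING, the column types and the reduced EMPLOYEE table are assumptions made for this example; the rows come from the chapter's sample data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE JOB (
        JOB_CODE        INTEGER PRIMARY KEY,
        JOB_DESCRIPTION TEXT NOT NULL,
        JOB_CHG_HOUR    REAL NOT NULL
    );
    CREATE TABLE EMPLOYEE (
        EMP_NUM   INTEGER PRIMARY KEY,
        EMP_LNAME TEXT NOT NULL,
        JOB_CODE  INTEGER NOT NULL REFERENCES JOB (JOB_CODE)
    );
    INSERT INTO JOB VALUES (500, 'Programmer', 28.24),
                           (505, 'Civil Engineer', 44.07);
    INSERT INTO EMPLOYEE VALUES (106, 'Smithfield', 500),
                                (107, 'Alonzo', 500),
                                (110, 'Olenko', 505);

    -- A virtual table: the join is evaluated whenever the view is queried.
    CREATE VIEW EMP_BILLING AS
        SELECT E.EMP_NUM, E.EMP_LNAME, J.JOB_DESCRIPTION, J.JOB_CHG_HOUR
        FROM EMPLOYEE AS E
        JOIN JOB AS J ON J.JOB_CODE = E.JOB_CODE;
""")

rows = conn.execute("SELECT * FROM EMP_BILLING ORDER BY EMP_NUM").fetchall()
for row in rows:
    print(row)
```

The base tables stay in 3NF, while the view gives reporting code the flattened shape it might otherwise be tempted to obtain through denormalisation.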

SUMMARY

Normalisation is a technique used to design tables in which data redundancies are minimised. The first three normal forms (1NF, 2NF and 3NF) are most commonly encountered. From a structural point of view, higher normal forms are better than lower normal forms because higher normal forms yield relatively fewer data redundancies in the database. Almost all business designs use 3NF as the ideal normal form. (A special, more restricted 3NF is known as Boyce-Codd normal form, or BCNF.)

A table is in 1NF when all key attributes are defined and when all remaining attributes are dependent on the primary key. However, a table in 1NF can still contain both partial and transitive dependencies. (A partial dependency is one in which an attribute is functionally dependent on only a part of a multi-attribute primary key. A transitive dependency is one in which one attribute is functionally dependent on another non-key attribute.) A table with a single-attribute primary key cannot exhibit partial dependencies.

A table is in 2NF when it is in 1NF and contains no partial dependencies. Therefore, a 1NF table is automatically in 2NF when its primary key is based on only a single attribute, i.e. it is not a composite key. A table in 2NF may still contain transitive dependencies.

A table is in 3NF when it is in 2NF and contains no transitive dependencies. Given that definition of 3NF, the Boyce-Codd normal form (BCNF) is merely a special 3NF case in which all determinant keys are candidate keys. When a table has only a single candidate key, a 3NF table is automatically in BCNF.

A table that is not in 3NF may be split into new tables until all of the tables meet the 3NF requirements. The process is illustrated in Figures 7.17 to 7.19.

Normalisation is an important part, but only a part, of the design process. As entities and attributes are defined during the ER modelling process, subject each entity (set) to normalisation checks and form new entity (sets) as required. Incorporate the normalised entities into the ERD and continue the iterative ER process until all entities and their attributes are defined and all equivalent tables are in 3NF.

A table in 3NF may contain multivalued dependencies that produce either numerous null values or redundant data. Therefore, it may be necessary to convert a 3NF table to the fourth normal form (4NF) by splitting the table to remove the multivalued dependencies. Thus, a table is in 4NF when it is in 3NF and contains no multivalued dependencies.

The larger the number of tables, the more additional I/O operations and processing logic are required to join them. Therefore, tables are sometimes denormalised to yield less I/O in order to increase processing speed. Unfortunately, with larger tables, you pay for the increased processing speed by making the data updates less efficient, by making indexing more cumbersome, and by introducing data redundancies that are likely to yield data anomalies. In the design of production databases, use denormalisation sparingly and cautiously.

FIGURE 7.17

The initial 1NF structure

[Dependency diagram: attributes A, B, C, D, E and F, with one partial dependency and one transitive dependency marked.]

Step 1: Write each PK component on a separate line; then write the original (composite) PK on the last line.

A
B
A B

FIGURE 7.18 Identifying possible PK attributes

Step 2: Place all dependent attributes with the PK attributes identified in Step 1.

A: No attributes are dependent on A. Therefore, A does not become a PK for a new table structure.
B C: This table is in 3NF because it is in 2NF (no partial dependencies) and it contains no transitive dependencies.
A B D E F: This table is in 2NF because it contains a transitive dependency.

FIGURE 7.19 Table structures based on the selected PKs

Step 3: Remove all transitive dependencies identified in Step 2 and retain all 3NF structures.

All tables are in 3NF because they are in 2NF (no partial dependencies) and they do not contain transitive dependencies.

B C
D F
A B D E: Attribute D is retained in this table structure to serve as the FK to the second table.
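The three steps above can be sketched in a few lines of Python. The dependencies B -> C (partial) and D -> F (transitive) are inferred from the table structures shown in Steps 2 and 3 rather than stated explicitly, and the function names classify and decompose_to_3nf are hypothetical. The sketch handles only this simple case: determinants that are either part of the primary key or entirely outside it.

```python
def classify(determinant, pk):
    """Label a dependency relative to the primary key of the 1NF table."""
    det, key = set(determinant), set(pk)
    if det == key:
        return "full"
    if det < key:                 # proper subset of a composite PK
        return "partial"
    if det.isdisjoint(key):       # determinant is a non-key attribute
        return "transitive"
    raise ValueError("mixed determinants are outside this sketch")


def decompose_to_3nf(attributes, pk, dependencies):
    """Split a 1NF table into tables keyed by each offending determinant.

    Returns a dict mapping each new table's key to its non-key attributes.
    """
    remaining = [a for a in attributes if a not in pk]
    tables = {}
    for determinant, dependents in dependencies.items():
        if classify(determinant, pk) == "full":
            continue
        # Steps 2 and 3: move the dependents into their determinant's table.
        tables[determinant] = list(dependents)
        remaining = [a for a in remaining if a not in dependents]
    # A transitive determinant (here D) stays behind as a foreign key.
    tables[tuple(pk)] = remaining
    return tables


ATTRIBUTES = ["A", "B", "C", "D", "E", "F"]
PK = ("A", "B")
DEPENDENCIES = {("B",): ["C"], ("D",): ["F"]}  # inferred from the figures

kinds = {det: classify(det, PK) for det in DEPENDENCIES}
tables = decompose_to_3nf(ATTRIBUTES, PK, DEPENDENCIES)

print(kinds)
print(tables)
```

The result matches the three structures of Step 3: a table keyed on B holding C, a table keyed on D holding F, and the original composite key (A, B) with D and E, where D doubles as the foreign key.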

KEY TERMS

atomic attribute
atomicity
Boyce-Codd normal form (BCNF)
denormalisation
dependency diagram
determinant
first normal form (1NF)
fourth normal form (4NF)
granularity
key attribute
non-key attribute
non-prime attribute
normalisation
partial dependency
prime attribute
repeating group
second normal form (2NF)
surrogate key
third normal form (3NF)
transitive dependency

Online Content
Answers to selected Review Questions and Problems for this chapter are available on the online platform accompanying this book.


REVIEW QUESTIONS

1 What is normalisation?

2 When is a table in 1NF?

3 When is a table in 2NF?

4 When is a table in 3NF?

5 When is a table in BCNF?

6 Given the dependency diagram shown in Figure Q7.1, answer Items 6a-6c.

a Identify and discuss each of the indicated dependencies.
b Create a database whose tables are at least in 2NF, showing the dependency diagrams for each table.
c Create a database whose tables are at least in 3NF, showing the dependency diagrams for each table.

FIGURE Q7.1 Dependency diagram for Question 6
[Dependency diagram: attributes C1, C2, C3, C4 and C5.]

7 What is a partial dependency? With which normal form is it associated?

8 Which three data anomalies are likely to be the result of data redundancy? How can such anomalies be eliminated?

9 Define and discuss the concept of transitive dependency.

10 What is a surrogate key, and when should you use one?

11 Why is a table whose primary key consists of a single attribute automatically in 2NF when it is in 1NF?

12 How would you describe a condition in which one attribute is dependent on another attribute when neither attribute is part of the primary key?

13 Suppose that someone tells you that an attribute that is part of a composite primary key is also a candidate key. How would you respond to that statement?

14 A table is in ________ normal form when it is in ___________ and there are no transitive dependencies.

15 The dependency diagram in Figure Q7.2 indicates that authors are paid royalties for each book they write for a publisher. The amount of the royalty can vary by author, by book, and by edition of the book.

FIGURE Q7.2 Book royalty dependency diagram
[Dependency diagram: attributes ISBN, BOOK_TITLE, AUTHOR_NUM, LAST_NAME, PUBLISHER, ROYALTY and EDITION.]
SOURCE: Course Technology/Cengage Learning

a Based on the dependency diagram, create a database whose tables are at least in 2NF, showing the dependency diagram for each table.
b Create a database whose tables are at least in 3NF, showing the dependency diagram for each table.

16 The dependency diagram in Figure Q7.3 indicates that a patient can receive many prescriptions for one or more medicines over time. Based on the dependency diagram, create a database whose tables are at least in 2NF, showing the dependency diagram for each table.

FIGURE Q7.3 Prescription dependency diagram
[Dependency diagram: attributes MED_NAME, PATIENT_ID, DATE, REFILLS_ALLOWED, PATIENT_NAME, DOSAGE and SHELF_LIFE.]
SOURCE: Course Technology/Cengage Learning

17 Suppose someone tells you that an attribute that is part of a composite primary key is also a candidate key. How would you respond to that statement?

PROBLEMS

1 Using the INVOICE table structure shown below, write the relational schema, draw its dependency diagram, and identify all dependencies (including all partial and transitive dependencies). You can assume that the table does not contain repeating groups and that an invoice number references more than one product. (Hint: This table uses a composite primary key.)

Attribute Name  Sample Value     Sample Value        Sample Value  Sample Value     Sample Value
INV_NUM         211347           211347              211347        211348           211349
PROD_NUM        AA-E3422QW       QD-300932X          RU-995748G    AA-E3422QW       GH-778345P
SALE_DATE       15-Jan-2019      15-Jan-2019         15-Jan-2019   15-Jan-2019      16-Jan-2019
PROD_LABEL      Rotary sander    0.25-in. drill bit  Band saw      Rotary sander    Power drill
VEND_CODE       211              211                 309           211              157
VEND_NAME       NeverFail, Inc.  NeverFail, Inc.     BeGood, Inc.  NeverFail, Inc.  ToughGo, Inc.
QUANT_SOLD      1                8                   1             2                1
PROD_PRICE      34.46            2.73                31.59         34.46            69.32

2 Using the answer to Problem 1, remove all partial dependencies, write the relational schema and draw the new dependency diagrams. Identify the normal forms for each table structure you created.

NOTE
You can assume that any given product is supplied by a single vendor, but a vendor can supply many products. Therefore, it is proper to conclude that the following dependency exists:
PROD_NUM → PROD_DESCRIPTION, PROD_PRICE, VEND_CODE, VEND_NAME
(Hint: Your actions should produce three dependency diagrams.)

3 Using the answer to Problem 2, remove all transitive dependencies, write the relational schema and draw the new dependency diagrams. Identify the normal forms for each table structure you created.

4 Using the results of Problem 3, draw the ERD using UML class notation.

5 Using the STUDENT table structure shown here, write the relational schema and draw its dependency diagram. Identify all dependencies (including all transitive dependencies).

Attribute Name  Sample Value    Sample Value    Sample Value    Sample Value    Sample Value
STU_NUM         211343          200128          199876          199876          223456
STU_LNAME       Stephanos       Smith           Jones           Ortiz           McKulski
STU_MAJOR       Accounting      Accounting      Marketing       Marketing       Statistics
DEPT_CODE       ACCT            ACCT            MKTG            MKTG            MATH
DEPT_NAME       Accounting      Accounting      Marketing       Marketing       Mathematics
DEPT_PHONE      4356            4356            4378            4378            3420
COLLEGE_NAME    Business Admin  Business Admin  Business Admin  Business Admin  Arts & Sciences
ADVISOR_LNAME   Grastrand       Grastrand       Gentry          Tillery         Chen
ADVISOR_OFFICE  T201            T201            T228            T356            J331
ADVISOR_BLDG    Torre Building  Torre Building  Torre Building  Torre Building  Jones Building
ADVISOR_PHONE   2115            2115            2123            2159            3209
STU_GPA         3.87            2.78            2.31            3.45            3.58
STU_HOURS       75              45              117             113             87
STU_CLASS       UG1             UG2             UG3             UG3             UG1

6 Using the answer to Problem 5, write the relational schema and draw the dependency diagram to meet the 3NF requirements to the greatest practical extent possible. If you believe that practical considerations dictate using a 2NF structure, explain why your decision to retain 2NF is appropriate. If necessary, add or modify attributes to create appropriate determinants and to adhere to the naming conventions.

NOTE
Although the completed student hours (STU_HOURS) do determine the student classification (STU_CLASS), this dependency is not as obvious as you might initially assume it to be. For example, a student is considered a junior if that student has completed between 61 and 90 credit hours. Therefore, a student who is classified as a junior may have completed 66, 72 or 87 hours, or any other number of hours within the specified range of 61-90 hours. In short, any hour value within a specified range will define the classification.

7 Using the results of Problem 6, draw the ERD using UML notation.

NOTE
This ERD constitutes a small segment of a university's full-blown design. For example, this segment might be combined with the Tiny University presentation in Chapter 5, Data Modelling with Entity Relationship Diagrams.

8 To keep track of office furniture, computers, printers and so on, the FOUNDIT company uses the following table structure:

Attribute Name  Sample Value     Sample Value     Sample Value
ITEM_ID         231134-678       342245-225       254668-449
ITEM_LABEL      HP DeskJet 3755  HP Toner         DT Scanner
ROOM_NUMBER     325              325              123
BLDG_CODE       NTC              NTC              CSF
BLDG_NAME       Nottooclear      Nottooclear      Canseefar
BLDG_MANAGER    I. B. Rightonit  I. B. Rightonit  May B. Next

Given that information, write the relational schema and draw the dependency diagram. Make sure that you label the transitive and/or partial dependencies.

9 Using the answer to Problem 8, write the relational schema and create a set of dependency diagrams that meet 3NF requirements. Rename attributes to meet the naming conventions and create new entities and attributes as necessary.

10 Using the results of Problem 9, draw the ERD using UML notation.

NOTE
Problems 11-13 may be combined to serve as a case or a mini-project.

11 The table structure shown below contains many unsatisfactory components and characteristics. (For example, there are several multivalued attributes, naming conventions are violated and some attributes are not atomic.)

Attribute Name       Sample Value                                  Sample Value    Sample Value     Sample Value
EMP_NUM              1003                                          1018            1019             1023
EMP_LNAME            Willaker                                      Smith           McGuire          McGuire
EMP_EDUCATION        BBA, MBA                                      BBA                              BS, MS, Ph.D.
JOB_CLASS            SLS                                           SLS             JNT              DBA
EMP_DEPENDANTS       Gerald (spouse), Mary (daughter), John (son)                  JoAnne (spouse)  George (spouse), Jill (daughter)
DEPT_CODE            MKTG                                          MKTG            SVC              INFS
DEPT_NAME            Marketing                                     Marketing       General Service  Info. Systems
DEPT_MANAGER         Jill H. Martin                                Jill H. Martin  Hank B. Jones    David G. Dlamini
EMP_TITLE            Sales Agent                                   Sales Agent     Janitor          DB Admin
EMP_DOB              23-Dec-1978                                   28-Mar-1989     18-May-1992      20-Jul-1969
EMP_HIRE_DATE        14-Oct-2007                                   15-Jan-2016     21-Apr-2013      15-Jul-2009
EMP_TRAINING         L1, L2                                        L1              L1               L1, L3, L8, L15
EMP_BASE_SALARY      30 221.45                                     24 095.00       15 602.50        101 041.00
EMP_COMMISSION_RATE  0.015                                         0.010

Given that structure, write the relational schema and draw its dependency diagram. Label all transitive and/or partial dependencies.

12 Using the answer to Problem 11, draw the dependency diagrams that are in 3NF. (Hint: You might have to create a few new attributes. Also make sure that the new dependency diagrams contain attributes that meet proper design criteria; that is, make sure that there are no multivalued attributes, that the naming conventions are met, and so on.)

13 Using the results of Problem 12, draw the UML ERD.

NOTE Problems 14-16 may be combined to serve as a case or a mini-project.

14 Suppose you are given the following business rules to form the basis for a database design. The database must enable the manager of a company dinner club to mail invitations to the club's members, to plan the meals, to keep track of who attends the dinners and so on:

Each dinner serves many members, and each member may attend many dinners.
A member receives many invitations, and each invitation is mailed to many members.

A dinner is based on a single entrée, but an entrée may be used as the basis for many dinners. For example, a dinner may be composed of a fish entrée, rice and corn. Or the dinner may be composed of a fish entrée, a baked potato and string beans.
A member may attend many dinners, and each dinner may be attended by many members.

Because the manager is not a database expert, the first attempt at creating the database uses the structure shown in the following table.

Attribute Name | Sample Value | Sample Value | Sample Value
MEMBER_NUM | 214 | 235 | 214
MEMBER_NAME | Alice B. VanderVoort | Gerald M. Gallega | Alice B. VanderVoort
MEMBER_ADDRESS | 325 Meadow Park | 123 Rose Court | 325 Meadow Park
MEMBER_CITY | Murkywater | Highlight | Murkywater
MEMBER_POSTCODE | 12345 | 12349 | 12345
INVITE_NUM | 8 | 9 | 10
INVITE_DATE | 23-Feb-2020 | 12-Mar-2020 | 23-Feb-2020
ACCEPT_DATE | 27-Feb-2020 | 15-Mar-2020 | 27-Feb-2020
DINNER_DATE | 15-Mar-2020 | 17-Mar-2020 | 15-Mar-2020
DINNER_ATTENDED | Yes | Yes | No
DINNER_CODE | DI5 | DI5 | DI2
DINNER_DESCRIPTION | Glowing Sea Delight | Glowing Sea Delight | Ranch Superb
ENTREE_CODE | EN3 | EN3 | EN5
ENTREE_DESCRIPTION | Stuffed crab | Stuffed crab | Marinated steak
DESSERT_CODE | DE8 | DE5 | DE2
DESSERT_DESCRIPTION | Chocolate mousse with raspberry sauce | Cherries Jubilee | Apple pie with honey crust

Given that structure, write the relational schema and draw its dependency diagram. Label all transitive and/or partial dependencies. (Hint: This structure uses a composite primary key.)

15 Break up the dependency diagram you drew in Problem 14 to produce dependency diagrams that are in 3NF and write the relational schema. (Hint: You might have to create a few new attributes. Also make sure that the new dependency diagrams contain attributes that meet proper design criteria; that is, make sure that there are no multivalued attributes, that the naming conventions are met and so on.)

16 Using the results of Problem 15, draw the ERD.

NOTE Problems 17-19 may be combined to serve as a case or a mini-project.

17 The manager of a consulting firm has asked you to evaluate a database that contains the table structure shown in the following table.

Attribute Name | Sample Value | Sample Value | Sample Value
CLIENT_NUM | 298 | 289 | 289
CLIENT_NAME | Marianne R. Brown | James D. Smith | James D. Smith
CLIENT_REGION | Gauteng | Western Cape | Western Cape
CONTRACT_DATE | 10-Feb-2018 | 15-Feb-2018 | 12-Mar-2018
CONTRACT_NUMBER | 5841 | 5842 | 5843
CONTRACT_AMOUNT | 2 358 150.00 | 529 537.00 | 987 500.00
CONSULT_CLASS_1 | Database Administration | Internet Services | Database Design
CONSULT_CLASS_2 | Web Applications | | Database Administration
CONSULT_CLASS_3 | | | Network Installation
CONSULT_CLASS_4 | | |
CONSULTANT_NUM_1 | 29 | 34 | 25
CONSULTANT_NAME_1 | Rachel G. Carson | Gerald K. Ricardo | Angela M. Jamison
CONSULTANT_REGION_1 | Gauteng | Western Cape | Western Cape
CONSULTANT_NUM_2 | 56 | 38 | 34
CONSULTANT_NAME_2 | Karl M. Spenser | Anne T. Dimarco | Gerald K. Ricardo
CONSULTANT_REGION_2 | Gauteng | Western Cape | Western Cape
CONSULTANT_NUM_3 | 22 | 45 |
CONSULTANT_NAME_3 | Julian H. Donatello | Geraldo J. Rivera |
CONSULTANT_REGION_3 | Gauteng | Western Cape |
CONSULTANT_NUM_4 | | 18 |
CONSULTANT_NAME_4 | | Donald Chen |
CONSULTANT_REGION_4 | | Eastern Cape |

This table was created to enable the manager to match clients with consultants. The objective is to match a client within a given region with a consultant in that region and to make sure that the client's need for specific consulting services is properly matched to the consultant's expertise. For example, if the client needs help with database design and he or she is located in the Western Cape, the objective is to make a match with a consultant who is located in the Western Cape and whose expertise is in database design. (Although the consulting company manager tries to match consultant and client locations to minimise travel expense, it is not always possible to do so.) The following basic business rules are maintained:

Each client is located in one region.
A region can contain many clients.
Each consultant can work on many contracts.
Each contract may require the services of many consultants.
A client can sign more than one contract, but each contract is signed by one client.
Each contract may cover multiple consulting classifications. (For example, a contract may list consulting services in database design and networking.)

Each consultant is located in one region.
A region can contain many consultants.
Each consultant has one or more areas of expertise (class). For example, a consultant may be classified as an expert in both database design and networking.
Each area of expertise (class) can have many consultants in it. For example, the consulting company may employ many consultants who are networking experts.

Given that brief description of the requirements and the business rules, write the relational schema and draw the dependency diagram for the preceding (and very poor) table structure. Label all transitive and/or partial dependencies.

18 Break up the dependency diagram you drew in Problem 17 to produce dependency diagrams that are in 3NF and write the relational schema. (Hint: You may have to create a few new attributes. Also make sure that the new dependency diagrams contain attributes that meet proper design criteria; that is, make sure that there are no multivalued attributes, that the naming conventions are met and so on.)

19 Using the results of Problem 18, draw the ERD using UML notation.

20 Given the sample records in the CHARTER table that follows, write the relational schema and draw the dependency diagram for the table structure. Make sure that you label all dependencies. CHAR_PAX indicates the number of passengers carried. The CHAR_MILES entry is based on round-trip miles, including pickup points. (Hint: Look at the data values to determine the nature of the relationships. For example, note that employee Melton has flown two charter trips as pilot and one trip as copilot.)

Attribute Name | Sample Value | Sample Value | Sample Value | Sample Value
CHAR_TRIP | 10232 | 10233 | 10234 | 10235
CHAR_DATE | 15-Jan-2019 | 15-Jan-2019 | 16-Jan-2019 | 17-Jan-2019
CHAR_CITY | STL | MIA | TYS | ATL
CHAR_MILES | 580 | 1 290 | 524 | 768
CUST_NUM | 784 | 231 | 544 | 784
CUST_LNAME | Brown | Hanson | Bryana | Brown
CHAR_PAX | 5 | 12 | 2 | 5
CHAR_CARGO | 235 kg | 18 940 kg | 348 kg | 155 kg
PILOT | Melton | Chen | Henderson | Melton
COPILOT | | Henderson | Melton |
FLT_ENGINEER | | O'Shaski | |
LOAD_MASTER | | Benkasi | |
AC_NUMBER | 1234Q | 3456Y | 1234Q | 2256W
MODEL_CODE | PA31-350 | CV-580 | PA31-350 | PA31-350
MODEL_SEATS | 10 | 38 | 10 | 10
MODEL_CHG_MILE | 2.13 | 18.45 | 2.20 | 2.20

21 Decompose the dependency diagram in Problem 20 to create table structures that are in 3NF and write the relational schema. Make sure that you label all dependencies.

22 Draw the ERD to reflect the properly decomposed dependency diagrams you created in Problem 21. Make sure that the ERD produces a database that can track all of the data shown in Problem 20.

NOTE Use the dependency diagram shown in Figure P7.1 to work on Problems 23-24.

FIGURE P7.1 Initial dependency diagram for Problems 23-24
[Figure: dependency diagram over the attributes A, B, C, D, E, F and G]

23 Break up the dependency diagram to create two new dependency diagrams, one in 3NF and one in 2NF.

24 Modify the dependency diagrams you created in Problem 23 to produce a set of dependency diagrams that are in 3NF. To keep the entire collection of attributes together, copy the 3NF dependency diagram from Problem 23; then show the new dependency diagrams that are also in 3NF. (Hint: One of your dependency diagrams will be in 3NF but not in BCNF.)

25 Modify the dependency diagrams in Problem 24 to produce a collection of dependency diagrams that are in 3NF and BCNF. To ensure that all attributes are accounted for, copy the 3NF dependency diagrams from Problem 24; then show the new 3NF and BCNF dependency diagrams.

26 Suppose you have been given the table structure and data shown here, which was imported from an Excel spreadsheet. The data reflect that a lecturer can have multiple advisees, can serve on multiple committees and can edit more than one journal.

Attribute Name | Sample Value | Sample Value | Sample Value | Sample Value
EMP_NUM | 123 | 104 | 118 |
LECT_RANK | Professor | Asst. Lecturer | Lecturer | Lecturer
EMP_NAME | Ghee | Rankin | Ortega | Smith
DEPT_CODE | CIS | CHEM | CIS | ENG
DEPT_NAME | Computer Info. Systems | Chemistry | Computer Info. Systems | English
PROF_OFFICE | KDD-567 | BLF-119 | KDD-562 | PRT-345
ADVISEE | 1215, 2312, 3233, 2218, 2098 | 3102, 2782, 3311, 2008, 2876, 2222, 3745, 1783, 2378 | 2134, 2789, 3456, 2002, 2046, 2018, 2764 | 2873, 2765, 2238, 2901, 2308
COMMITTEE_CODE | PROMO, TRAF, APPL, DEV | DEV | SPR, TRAF | PROMO, SPR, DEV
JOURNAL_CODE | JMIS, QED, JMGT | | JCIS, JMGT |

Given the information in this table:

a Draw the dependency diagram.
b Identify the multivalued dependencies.
c Create the dependency diagrams to yield a set of table structures in 3NF.
d Eliminate the multivalued dependencies by converting the affected table structures to 4NF.
e Draw the ERD to reflect the dependency diagrams you drew in Part c. (Note: You may have to create additional attributes to define the proper PKs and FKs. Make sure that all of your attributes conform to the naming conventions.)

27 Using the descriptions of the attributes given in Figure P7.2, convert the ERD into a dependency diagram that is in at least 3NF.

FIGURE P7.2 Appointment ERD for Problem 27

28 Using the descriptions of the attributes given in Figure P7.3, convert the ERD into a dependency diagram that is in at least 3NF.

FIGURE P7.3 Presentation ERD for Problem 28

PART III
DATABASE PROGRAMMING

8 Beginning Structured Query Language
9 Procedural Language SQL and Advanced SQL


BUSINESS VIGNETTE
OPEN SOURCE DATABASES

acquiring

MySQL in

significantly

With Oracle

growing?

An interesting

being

the

ranked

moving,

and

licensing.

worlds

continue

Oracle

most

to

have

procedures,

which have led

For example,

in

2011,

system

PostgreSQL2

(CNAF).3

Frances

69

billion

to

to reduce

took

for

over

11

18

The

BBC news

website

The idea

was that

the

currently

began to

of the

news

per

comprises

minute There

and

over

is

an increasing

using

source

databases

as

Oracle

1

The

2

Open

24 tables

open

using the about

per

8

source real-time

the

rows,

made

features

existing

of data.3

up

world

DBMS

Today,

2006 in which

most

could

the

system

and

in

system,

picked

was tight,

million

was

essential the

database

budget

operating

from

4 terabytes

being

Linux

more than

DBMS

have all the

throughout

As the

Familiales to

the

other

be fed

to

solution

about

would give

by

users. the

was to

and a MySQL

processes

order to

site,

develop

database.

30 000

The

data inserts

hour.4

number

source

zones

users.

DBMSs.

management

day.

open

website

licensing

open source

source

Migration

containing every

a dynamic

on the

all time

to

complex

amounting

an open

was found

queries

produce

to

been

software

dAllocations

benefits

required.

use a MySQL

million

and

switch

databases SQL

stories

from 35

4 000 requests

develop

to

of

database

Nationale

allocating

PostgreSQL

was to

news

stories

attracts

decision

source

have

costs

and

still

MySQL still

organisations

expensive

open

Caisse

with

systems by

due to the

and switch to

to the

system

general,

for their

performance

a billion

BBC News Live Stats system

database

The

and

The aim

sense

criticised

deals

168 individual almost

monitor reader interest. a real

Security

On evaluation,

runs

In

management be answered

databases

switched

system

of reliability

can

platform.1 source

been

database

perhaps

to look for alternatives

people.

and involved

system

audiences

which

Social

million

levels

months

PostgreSQL

often

Government

Security

costs.

necessary

both

source

that

open

customers

its

open

database

towards

French

Social

licensing

and the

the

are

question

popular

move,

and IBM

2009,

of

success

databases.

stories

However,

still have a way to go to

from

organisations

in terms

match the

that

of business

capabilities

of the

have

chosen

intelligence, commercial

to

these

open

vendors

such

and IBM.

1 The Most Popular Databases 2019. Available: www.explore-group.com/blog/the-most-popular-databases-2018/bp46/
2 Open Source Database Engine for France's Social Security. Available: www.linuxbsdos.com/2010/11/25/open-source-database-new-engine-of-frances-social-security/
3 PostgreSQL. Available: www.postgresql.org/
4 BBC News Website uses MySQL to Monitor Reader Interest. Available: www-it.mysql.com/whymysql/case-studies/


CHAPTER 8
Beginning Structured Query Language

IN THIS CHAPTER, YOU WILL LEARN:
The basic commands and functions of SQL
How to use SQL for data administration (to create tables, indexes and views)
How to use SQL for data manipulation (to add, modify, delete and retrieve data)
How to use SQL to query a database to extract useful information

PREVIEW

In this chapter, you learn the basics of Structured Query Language (SQL). SQL, pronounced S-Q-L or sequel, is composed of commands that enable users to create database and table structures, perform various types of data manipulation and data administration, and query the database to extract useful information. All relational DBMS software supports SQL, and many software vendors have developed extensions to the basic SQL command set.

SQL's vocabulary is simple, so the language is relatively easy to learn. Its simplicity is enhanced by the fact that much of its work takes place behind the scenes. For example, a single command creates the complex table structures required to store and manipulate data successfully. Furthermore, SQL is a non-procedural language; that is, the user specifies what must be done, but not how it is to be done. To issue SQL commands, end users and programmers do not need to know the physical data storage format or the complex activities that take place when a SQL command is executed.

Although quite useful and powerful, SQL is not meant to stand alone in the applications arena. Data entry with SQL is possible but awkward, as are data corrections and additions. SQL itself does not create menus, special report forms, overlays, pop-ups or any of the other utilities and screen devices that end users usually expect. Instead, those features are available as vendor-supplied enhancements. SQL focuses on data definition (creating tables, indexes and views) and data manipulation (adding, modifying, deleting and retrieving data), the basic functions presented in this chapter. In spite of its limitations, SQL is a powerful tool for extracting information and managing data.

8.1 INTRODUCTION TO SQL

Ideally, a database language allows you to create database and table structures, to perform basic data management chores (add, delete and modify), and to perform complex queries designed to transform the raw data into useful information. Moreover, a database language must perform such basic functions with minimal user effort, and its command structure and syntax must be easy to learn. Finally, it must be portable; that is, it must conform to some basic standard so that an individual does not have to relearn the basics when moving from one RDBMS to another. SQL meets those ideal database language requirements well.

SQL functions fit into several broad categories:

It is a data definition language (DDL): SQL includes commands to create database objects such as tables, indexes and views, as well as commands to define access rights to those database objects. The data definition commands you will learn in this chapter are listed in Table 8.1.

It is a data manipulation language (DML): It includes commands to insert, update, delete and retrieve data within the database tables. The data manipulation commands you will learn in this chapter are listed in Table 8.2.

It is a transaction control language (TCL): The DML commands in SQL are executed within the context of a transaction, which is a logical unit of work composed of one or more SQL commands. SQL provides commands to control the processing of these statements in an indivisible unit of work. This will be discussed further in Chapter 9.

It is a data control language (DCL): Data control commands are used to control access to data objects, such as giving a specific user permission to view the PRODUCT table. Common TCL and DCL commands are shown in Table 8.3.
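The four categories can be seen side by side in a short session. The following is only a sketch, not the book's own example: it assumes SQLite driven from Python's standard sqlite3 module rather than Oracle or MySQL, and the column names P_CODE and P_PRICE and the user name alice are illustrative assumptions. SQLite has no user accounts, so the DCL statement is shown as text and is not executed.

```python
import sqlite3

# isolation_level=None puts the connection in autocommit mode, so the
# BEGIN / COMMIT / ROLLBACK statements below are sent exactly as written.
conn = sqlite3.connect(":memory:", isolation_level=None)

# DDL: create a database object (column names are illustrative assumptions).
conn.execute("CREATE TABLE PRODUCT (P_CODE TEXT PRIMARY KEY, P_PRICE REAL NOT NULL)")

# DML + TCL: insert a row and make the change permanent.
conn.execute("BEGIN")
conn.execute("INSERT INTO PRODUCT VALUES ('HAMMER', 9.95)")
conn.execute("COMMIT")

# DML + TCL: change the row, then undo the change.
conn.execute("BEGIN")
conn.execute("UPDATE PRODUCT SET P_PRICE = 0 WHERE P_CODE = 'HAMMER'")
conn.execute("ROLLBACK")

# DML: retrieve the data; the rolled-back update has left no trace.
price = conn.execute("SELECT P_PRICE FROM PRODUCT WHERE P_CODE = 'HAMMER'").fetchone()[0]
print(price)

# DCL: shown but not executed, because SQLite does not implement GRANT/REVOKE.
# The user name 'alice' is hypothetical.
dcl_example = "GRANT SELECT ON PRODUCT TO alice"
```

The committed INSERT survives, while the UPDATE between BEGIN and ROLLBACK is discarded, which is what treating the statements as an indivisible unit of work means in practice.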

TABLE 8.1 SQL data definition commands

Command or Option | Description
CREATE SCHEMA AUTHORISATION | Creates a database schema
CREATE TABLE | Creates a new table in the user's database schema
NOT NULL | Ensures that a column will not have null values
UNIQUE | Ensures that a column will not have duplicate values
PRIMARY KEY | Defines a primary key for a table
FOREIGN KEY | Defines a foreign key for a table
DEFAULT | Defines a default value for a column (when no value is given)
CHECK | Constraint used to validate data in an attribute
CREATE INDEX | Creates an index for a table
CREATE VIEW | Creates a dynamic subset of rows/columns from one or more tables
ALTER TABLE | Modifies a table's definition (adds, modifies or deletes attributes or constraints)
CREATE TABLE AS | Creates a new table based on a query in the user's database schema
DROP TABLE | Permanently deletes a table (and thus its data)
DROP INDEX | Permanently deletes an index
DROP VIEW | Permanently deletes a view
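Several of the options in Table 8.1 are constraints that the DBMS enforces on every later insert or update. As a hedged sketch (SQLite through Python's sqlite3 module is assumed here, and the column names, the index name P_PRICE_NDX, the view name PRICEY and the helper function rejected are all illustrative assumptions), the following shows a default being filled in and four kinds of invalid row being refused:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only on request

conn.executescript("""
CREATE TABLE VENDOR (
    V_CODE  INTEGER PRIMARY KEY,
    V_NAME  TEXT NOT NULL UNIQUE
);
CREATE TABLE PRODUCT (
    P_CODE  TEXT PRIMARY KEY,
    P_PRICE REAL NOT NULL CHECK (P_PRICE >= 0),
    P_QOH   INTEGER DEFAULT 0,
    V_CODE  INTEGER,
    FOREIGN KEY (V_CODE) REFERENCES VENDOR (V_CODE)
);
CREATE INDEX P_PRICE_NDX ON PRODUCT (P_PRICE);
CREATE VIEW PRICEY AS SELECT P_CODE, P_PRICE FROM PRODUCT WHERE P_PRICE > 100;
""")

conn.execute("INSERT INTO VENDOR VALUES (1, 'Acme')")
conn.execute("INSERT INTO PRODUCT (P_CODE, P_PRICE, V_CODE) VALUES ('SAW', 120.0, 1)")


def rejected(sql):
    """Return True if the DBMS refuses the statement because a constraint is violated."""
    try:
        conn.execute(sql)
        return False
    except sqlite3.IntegrityError:
        return True


results = {
    # DEFAULT: P_QOH was not supplied for 'SAW', so the default applies.
    "default_qoh": conn.execute("SELECT P_QOH FROM PRODUCT WHERE P_CODE = 'SAW'").fetchone()[0],
    # NOT NULL, CHECK, UNIQUE and FOREIGN KEY violations.
    "null_price": rejected("INSERT INTO PRODUCT (P_CODE, P_PRICE) VALUES ('NAIL', NULL)"),
    "negative_price": rejected("INSERT INTO PRODUCT (P_CODE, P_PRICE) VALUES ('NAIL', -1)"),
    "duplicate_vendor": rejected("INSERT INTO VENDOR VALUES (2, 'Acme')"),
    "unknown_vendor": rejected("INSERT INTO PRODUCT VALUES ('NAIL', 1.0, 5, 99)"),
    # The view is a dynamic subset: it reflects whatever PRODUCT currently holds.
    "view_rows": conn.execute("SELECT P_CODE FROM PRICEY").fetchall(),
}
print(results)
```

Declaring the rules once in the table definition means that no application code has to repeat the checks; other DBMSs may report the violations with different error classes and messages.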

TABLE 8.2 SQL data manipulation commands

Command or Option | Description
INSERT | Inserts row(s) into a table
SELECT | Selects attributes from rows in one or more tables or views
WHERE | Restricts the selection of rows based on a conditional expression
GROUP BY | Groups the selected rows based on one or more attributes
HAVING | Restricts the selection of grouped rows based on a condition
ORDER BY | Orders the selected rows based on one or more attributes
UPDATE | Modifies an attribute's values in one or more table's rows
DELETE | Deletes one or more rows from a table
COMMIT | Permanently saves data changes
ROLLBACK | Restores data to their original values
Comparison operators: =, <, >, <=, >=, <> | Used in conditional expressions
Logical operators: AND/OR/NOT | Used in conditional expressions
Special operators | Used in conditional expressions
BETWEEN | Checks whether an attribute value is within a range
IS NULL | Checks whether an attribute value is null
LIKE | Checks whether an attribute value matches a given string pattern
IN | Checks whether an attribute value matches any value within a value list
EXISTS | Checks whether a subquery returns any rows
DISTINCT | Limits values to unique values
Aggregate functions | Used with SELECT to return mathematical summaries on columns
COUNT | Returns the number of rows with non-null values for a given column
MIN | Returns the minimum attribute value found in a given column
MAX | Returns the maximum attribute value found in a given column
SUM | Returns the sum of all values for a given column
AVG | Returns the average of all values for a given column

SQL is relatively easy to learn. Its basic command set has a vocabulary of fewer than 100 words. Better yet, SQL is a non-procedural language: you merely command what is to be done; you don't have to worry about how it is to be done.

The American National Standards Institute (ANSI) prescribes a standard SQL. The most recent version is SQL:2011, which was formally adopted in December 2011. The ANSI SQL standards are also accepted by the International Organization for Standardization (ISO), a consortium composed of national standards bodies of more than 150 countries. Although adherence to the ANSI/ISO SQL standard is usually required in commercial and government contract database specifications, many RDBMS vendors add their own special enhancements. Consequently, it is seldom possible to move a SQL-based application from one RDBMS to another without making some changes.

However, even though there are several different SQL dialects, the differences among them are minor. Whether you use Oracle, Microsoft SQL Server, IBM's DB2, Microsoft Access or any other well-established RDBMS, a software manual should be sufficient to get you up to speed if you know the material presented in this chapter.
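To make the non-procedural point concrete, the sketch below combines several of the Table 8.2 clauses, operators and aggregate functions in two queries. It is an illustration under assumptions, not the book's example: SQLite via Python's sqlite3 module stands in for the RDBMS, and the PRODUCT columns and sample rows are invented for the purpose. Each query states only what is wanted; no loop or search strategy is written.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE PRODUCT (P_CODE TEXT, V_CODE INTEGER, P_PRICE REAL)")
conn.executemany(
    "INSERT INTO PRODUCT VALUES (?, ?, ?)",
    [("HAMMER", 1, 10.0), ("SAW", 1, 30.0), ("DRILL", 2, 90.0),
     ("NAIL", 2, 1.0), ("GLUE", None, 5.0)],
)

# WHERE filters rows, GROUP BY forms groups, HAVING filters the groups and
# ORDER BY sorts the result; COUNT and AVG are aggregate functions.
summary = conn.execute("""
    SELECT V_CODE, COUNT(*), AVG(P_PRICE)
    FROM PRODUCT
    WHERE V_CODE IS NOT NULL
    GROUP BY V_CODE
    HAVING AVG(P_PRICE) > 25
    ORDER BY V_CODE
""").fetchall()

# Special operators: BETWEEN (range), LIKE (string pattern), IN (value list).
special = conn.execute("""
    SELECT P_CODE FROM PRODUCT
    WHERE P_PRICE BETWEEN 5 AND 30
      AND (P_CODE LIKE 'S%' OR P_CODE IN ('GLUE', 'DRILL'))
    ORDER BY P_CODE
""").fetchall()

print(summary)
print(special)
```

With this sample data only vendor 2 has an average price above 25, and only GLUE and SAW satisfy both the price range and the name conditions. String-matching details such as the case sensitivity of LIKE are among the small dialect differences mentioned above.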


NOTE Throughout

this

Community degrees

ISO

so there

standards

know

book

Edition

RDBMs

Release and

2(11.2)

Part

11,

examples

differences

complies the

between

functionality

by a Swedish

aim

In editions

to

MySQL,

one

developers.

If

number

of versions

benefits

from

SQL and support MySQL

of

which

available

both

is the

Release

however,

For

important

to

to

note that

the

to

of databases SQL

in

MySQL different

Users are interested

portability

defined

and

standard

parts.

example,

which is

2 (11.2)

SQL:2011

several

offers.

within

between Oracle

11g

Part 2, SQL/Foundation,

depending

open source

8.3

MySQL

has

open

wishes

been

on the

development

will remain

or option

use

aims

same to

of

time

MySQL

over the

adding

enhance

the

RDBMS in 1995 and years

additional

usability

has

was

been to

SQL extensions.

of the

MySQL

DBMS

features.

source

to

as an open source

One of the

whil at the

for non-SQL

free

an organisation

Other SQL commands

of Sun.

Currently,

Community

Edition,

a commercial

business

version

requirements,

and revenue

Oracle which is

of each

generated

open source in the future

for

a number

supported

MySQL,

Oracle

a price.

by commercial

is up for

offer

of

by open has

Oracle

versions

made reap

of

a

the

MySQL.

debate.

8

Definition

control language

COMMIT

Permanently

ROLLBACK

Restores

control

It is,

Standard,

as part of its takeover

TABLE

Data

RDBMS

AB.

community

of

Transaction

11g

can assess the

MySQL started life MySQL

Whether a version

Command

each

standards

development

Oracle acquired

of

source

called

with the ISO

MySQL

extensions

2009

dialects.

and comprise

SQL:2011

DBMS,

company

compliance

of the

by adding

Core

Oracle

with the ISO

SQL/Schemata.

work towards The

SQL

so that they

that

with the

both

comply

the

with standards

compliant

using

RDBMSs

are very complex

Whereas Oracle is a commercial sponsored

be given

of these

as SQL:2011

and is

will

Both

are small

such

how a RDBMS

different

SQL

RDBMS.

TABLE 8.3 Other SQL commands

Command or option | Definition
Transaction control language |
COMMIT | Permanently saves data changes
ROLLBACK | Restores data to its original values
Data control language |
GRANT | Gives a user permission to take a system action or access a data object
REVOKE | Removes a previously granted permission from a user

At the heart of SQL is the query. In Chapter 1, The Database Approach, you learned that a query is a spur-of-the-moment question. Actually, in the SQL environment, the word query covers both questions and actions. Most SQL queries are used to answer questions such as these:

Which products currently held in inventory are priced over 100, and what is the quantity on hand of each of those products?
How many employees have been hired since 1 January 2019 by each of the company's departments?

However, many SQL queries are used to perform actions such as adding or deleting table rows or changing attribute values within tables. Still other SQL queries create new tables or indexes. In short, for a DBMS, a query is simply a SQL statement that must be executed. However, before you can use SQL to query a database, you must define the database environment for SQL with its data definition commands.
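The two sample questions, and one action, might be expressed as follows. This is a sketch under stated assumptions rather than the book's schema: SQLite via Python's sqlite3 module is used, the tables PRODUCT and EMPLOYEE with the columns shown are hypothetical, and hire dates are stored as ISO text (YYYY-MM-DD) so that a plain string comparison orders them correctly.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE PRODUCT (P_CODE TEXT PRIMARY KEY, P_PRICE REAL, P_QOH INTEGER);
INSERT INTO PRODUCT VALUES ('DRILL', 129.0, 8), ('HAMMER', 9.5, 40), ('SAW', 101.0, 3);
CREATE TABLE EMPLOYEE (EMP_NUM INTEGER PRIMARY KEY, DEPT_CODE TEXT, EMP_HIRE_DATE TEXT);
INSERT INTO EMPLOYEE VALUES
    (1, 'MKTG', '2018-11-30'), (2, 'MKTG', '2019-03-01'),
    (3, 'INFS', '2019-01-01'), (4, 'INFS', '2020-06-15');
""")

# Question 1: which products are priced over 100, and what is their quantity on hand?
over_100 = conn.execute(
    "SELECT P_CODE, P_QOH FROM PRODUCT WHERE P_PRICE > 100 ORDER BY P_CODE"
).fetchall()

# Question 2: how many employees has each department hired since 1 January 2019?
hired = conn.execute("""
    SELECT DEPT_CODE, COUNT(*) FROM EMPLOYEE
    WHERE EMP_HIRE_DATE >= '2019-01-01'
    GROUP BY DEPT_CODE
    ORDER BY DEPT_CODE
""").fetchall()

# Action: a query that changes attribute values instead of answering a question.
changed = conn.execute("UPDATE PRODUCT SET P_QOH = P_QOH + 10 WHERE P_QOH < 5").rowcount

print(over_100, hired, changed)
```

A question returns rows, whereas an action reports only how many rows it touched; to the DBMS both are simply statements to execute.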


324

PART III

Database

8.2

Programming

DATA DEFINITION COMMANDS

Before

examining

the

SQL syntax for creating

and defining tables

the simple database model and the database tables that youll explore in this chapter.

and other elements,

willform the basis ofthe

lets

first

examine

many SQL examples

8.2.1 The Database Model A simple

database

composed

of the following

chapter: CUSTOMER, INVOICE, Figure 8.1.

FIGURE 8.1

The database

tables

is

used to illustrate

the

SQL commands

LINE, PRODUCT and VENDOR. This database

in this

model is shown in

model

8

The database model in Figure 8.1 reflects the following business rules:

A customer may generate many invoices. Each invoice is generated by one customer.
An invoice contains one or more invoice lines. Each invoice line is associated with one invoice.
Each invoice line references one product. A product may be found in many invoice lines. (You can sell more than one hammer to more than one customer.)
A vendor may supply many products. Some vendors do not yet supply products. (For example, a vendor list may include potential vendors.)
If a product is vendor supplied, that product is supplied by only a single vendor.
Some products are not supplied by a vendor. (For example, some products may be produced in-house or may have been bought on the open market.)

Online Content
The database model in Figure 8.1 is implemented in the Microsoft Access 'Ch08_SaleCo' database located on the online platform for this book. (This database contains a few additional tables that are not reflected in Figure 8.1. These tables are used for discussion purposes only.) If you use Microsoft Access, you can use the database available on the online platform accompanying this book. However, it is strongly suggested that you create your own database structures so you can practice the SQL commands illustrated in this chapter. If you are using the Oracle or MySQL DBMS, SQL script files for creating the tables and loading the data in the database are also available online. How you connect to the Oracle database depends on how the Oracle software was installed on your server and on the access paths and methods defined and managed by your database administrator. Follow the instructions provided by your instructor, college or university.

As you can see in Figure 8.1, the database model contains many tables. However, to illustrate the initial set of data definition commands, the focus of attention will be the PRODUCT and VENDOR tables. You will have the opportunity to use the remaining tables later in this chapter and in the problem section. So that you have a point of reference for understanding the effect of the SQL queries, the contents of the PRODUCT and VENDOR tables are listed in Figure 8.2. Note the following about these tables. (The features correspond to the business rules reflected in the ERD shown in Figure 8.1.)

FIGURE 8.2 The VENDOR and PRODUCT tables

Database name: Ch8_SaleCo

Table name: VENDOR

V_CODE  V_NAME           V_CONTACT  V_POSTAL_CODE  V_PHONE   V_Country  V_ORDER
21225   Bryson, Inc.     Smithson   0181           223-3234  UK         Y
21226   SuperLoo, Inc.   Flushing   0113           215-8995  SA         N
21231   D&E Supply       Singh      0181           228-3245  UK         Y
21344   Jabavu Bros.     Khumalo    0181           889-2546  SA         N
22567   Dome Supply      Smith      7253           678-1419  FR         N
23119   Randsets Ltd.    Anderson   7253           678-3998  FR         Y
24004   Brackman Bros.   Browning   0181           228-1410  UK         N
24288   ORDVA, Inc.      Dandala    0181           898-1234  SA         Y
25443   B&K, Inc.        Smith      0113           227-0093  SA         N
25501   Damal Supplies   Gounden    0181           890-3529  SA         N
25595   Rubicon Systems  Du Toit    0113           456-0092  SA         Y

Table name: PRODUCT

P_CODE    P_DESCRIPT                                          P_INDATE   P_QOH  P_MIN  P_PRICE  P_DISCOUNT  V_CODE
11QER/31  Power painter, 15 psi., 3-nozzle                    03-Nov-18      8      5   109.99        0.00   25595
13-Q2/P2  18 cm pwr. saw blade                                13-Dec-18     32     15    14.99        0.05   21344
14-Q1/L3  22 cm pwr. saw blade                                13-Nov-18     18     12    17.49        0.00   21344
1546-QQ2  Hrd. cloth, 0.6 cm, 2x50                            15-Jan-19     15      8    39.95        0.00   23119
1558-QW1  Hrd. cloth, 1.25 cm, 3x50                           15-Jan-19     23      5    43.99        0.00   23119
2232/QTY  B&D jigsaw, 30 cm blade                             30-Dec-18      8      5   109.92        0.05   24288
2232/QWE  B&D jigsaw, 20 cm blade                             24-Dec-18      6      5    99.87        0.05   24288
2238/QPD  B&D cordless drill, 1.25 cm                         20-Jan-19     12      5    38.95        0.05   25595
23109-HB  Claw hammer                                         20-Jan-19     23     10     9.95        0.10   21225
23114-AA  Sledge hammer, 7 kg                                 02-Jan-19      8      5    14.40        0.05
54778-2T  Rat-tail file, 0.5 cm fine                          15-Dec-18     43     20     4.99        0.00   21344
89-WRE-Q  Hicut chain saw, 40 cm                              07-Feb-19     11      5   256.99        0.05   24288
PVC23DRT  PVC pipe, 9 cm, 2.5 m                               20-Feb-19    188     75     5.87        0.00
SM-18277  3 cm metal screw, 25                                01-Mar-19    172     75     6.99        0.00   21225
SW-23116  6 cm wd. screw, 50                                  24-Feb-19    237    100     8.45        0.00   21231
WR3/TT3   Steel matting, 10 cm x 20 cm x 0.5 cm, 1.25 cm mesh 17-Jan-19     18      5   119.95        0.10   25595

The VENDOR table contains vendors who are not referenced in the PRODUCT table. Database designers note that possibility by saying that PRODUCT is optional to VENDOR; a vendor may exist without a reference to a product. You examined such optional relationships in detail in Chapter 5, Data Modelling with Entity Relationship Diagrams.
Existing V_CODE values in the PRODUCT table must (and do) have a match in the VENDOR table to ensure referential integrity.
A few products are supplied factory direct, a few are made in-house, and a few may have been bought in a special warehouse sale. In other words, a product is not necessarily supplied by a vendor. Therefore, VENDOR is optional to PRODUCT.

A few of the conditions just described were made for the sake of illustrating specific SQL features. For example, null V_CODE values were used in the PRODUCT table to illustrate (later) how you can track such nulls using SQL.
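As a preview of how such optional relationships show up in practice, the sketch below finds products with no vendor and vendors that no product references. It is a hedged illustration only: it assumes Python's built-in sqlite3 module with an in-memory SQLite database, and it loads just two vendors and two products from Figure 8.2 into simplified versions of the tables.

```python
import sqlite3

# Assumption: SQLite stands in for the RDBMS; the tables are cut-down
# versions of VENDOR and PRODUCT with only the columns needed here.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE VENDOR (V_CODE INTEGER PRIMARY KEY, V_NAME VARCHAR(35));
CREATE TABLE PRODUCT (P_CODE VARCHAR(10) PRIMARY KEY,
                      P_DESCRIPT VARCHAR(35),
                      V_CODE INTEGER REFERENCES VENDOR);
INSERT INTO VENDOR VALUES (21225, 'Bryson, Inc.');
INSERT INTO VENDOR VALUES (21226, 'SuperLoo, Inc.');
INSERT INTO PRODUCT VALUES ('23109-HB', 'Claw hammer', 21225);
INSERT INTO PRODUCT VALUES ('23114-AA', 'Sledge hammer, 7 kg', NULL);
""")

# VENDOR is optional to PRODUCT: products whose V_CODE is null.
products_without_vendor = [
    row[0]
    for row in conn.execute("SELECT P_CODE FROM PRODUCT WHERE V_CODE IS NULL")
]

# PRODUCT is optional to VENDOR: vendors not referenced by any product.
vendors_without_products = [
    row[0]
    for row in conn.execute(
        "SELECT V_CODE FROM VENDOR WHERE V_CODE NOT IN "
        "(SELECT V_CODE FROM PRODUCT WHERE V_CODE IS NOT NULL)"
    )
]

print(products_without_vendor)   # ['23114-AA']
print(vendors_without_products)  # [21226]
```

Note that a null must be tested with IS NULL rather than with an equals sign, and that the subquery filters out nulls so that NOT IN behaves as expected.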

8.2.2 Creating the Database

Before you can use a new RDBMS, you must complete two tasks: first, create the database structure; second, create the tables that will hold the end-user data. To complete the first task, the RDBMS creates the physical files that will hold the database. When you create a new database, the RDBMS automatically creates the data dictionary tables in which to store the metadata and creates a default database administrator. Creating the physical files that will hold the database means interacting with the operating system and the file systems supported by the operating system. Therefore, creating the database structure is the one feature that tends to differ substantially from one RDBMS to another. The good news is that it is relatively easy to create a database structure, regardless of which RDBMS you use.

If you use Microsoft Access, creating the database is simple: start Access, select File/New/Blank Database, specify the folder in which you want to store the database, and then name the database. However, if you work in the database environment typically used by larger organisations, you will probably use an enterprise RDBMS such as Oracle, SQL Server or DB2. Given their security requirements and greater complexity, those database products require a more elaborate database creation process.

You will be relieved to discover that, with the exception of the database creation process, most RDBMS vendors use SQL that deviates little from the ANSI standard SQL. For example, most RDBMSs require that any SQL command ends with a semicolon. However, some SQL implementations do not use a semicolon. Important syntax differences among implementations will be highlighted in Note boxes.

If you are using an enterprise RDBMS, before you can start creating tables you must be authenticated by the RDBMS. Authentication is the process through which the DBMS verifies that only registered users may access the database. To be authenticated, you must log on to the RDBMS using a user ID and a password created by the database administrator. In an enterprise RDBMS, every user ID is associated with a database schema.
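The idea that the RDBMS maintains its own data dictionary as soon as a database exists can be seen on a very small scale. The sketch below is an illustration under assumptions: it uses Python's built-in sqlite3 module, where connecting creates the database, and it reads SQLite's own catalogue table, sqlite_master. That table name is specific to SQLite; enterprise products expose their data dictionary through their own views, and SQLite has no user authentication step.

```python
import sqlite3

# Assumption: SQLite stands in for the RDBMS. Connecting creates the
# database (here in memory rather than as a physical file).
conn = sqlite3.connect(":memory:")

# Creating a table is recorded automatically in SQLite's catalogue.
conn.execute(
    "CREATE TABLE VENDOR ("
    " V_CODE INTEGER PRIMARY KEY,"
    " V_NAME VARCHAR(35) NOT NULL)"
)

# sqlite_master is SQLite's data dictionary table: one row per object.
catalogue = conn.execute(
    "SELECT type, name FROM sqlite_master ORDER BY name"
).fetchall()

print(catalogue)  # [('table', 'VENDOR')]
```

The metadata row appears without any extra command from the user, which is the behaviour the text describes for data dictionary tables in general.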

8.2.3 The Database Schema

In the SQL environment, a schema is a group of database objects, such as tables and indexes, that are related to each other. Usually, the schema belongs to a single user or application. A single database can hold multiple schemas belonging to different users or applications. Think of a schema as a logical grouping of database objects, such as tables, indexes and views. Schemas are useful in that they group tables by owner (or function) and enforce a first level of security by allowing each user to see only the tables that belong to him or her.

ANSI SQL standards define a command to create a database schema:

CREATE SCHEMA AUTHORIZATION {creator};

Therefore, if the creator is JONES, use the command:

CREATE SCHEMA AUTHORIZATION JONES;

Most enterprise RDBMSs support that command. However, the command is seldom used directly, that is, from the command line. (When a user is created, the DBMS automatically assigns a schema to that user.) When the DBMS is used, the CREATE SCHEMA AUTHORIZATION command must be issued by the user who owns the schema. That is, if you log on as JONES, you can use only CREATE SCHEMA AUTHORIZATION JONES.

For most RDBMSs, the CREATE SCHEMA AUTHORIZATION command is optional. That is why this chapter focuses on the ANSI SQL commands required to create and manipulate tables.

8.2.4 Data Types

After the database schema has been created, you are ready to define the PRODUCT and VENDOR table structures within the database. The table-creating SQL commands used in the example are based on the data dictionary shown in Table 8.4.

TABLE 8.4 Data dictionary for the Ch8_SaleCo database

Table Name  Attribute Name  Contents             Data Type    Format          Range         Required  PK or FK  FK Referenced Table
PRODUCT     P_CODE          Product code         CHAR(10)     XXXXXXXXXX      NA            Y         PK
            P_DESCRIPT      Product description  VARCHAR(35)  Xxxxxxxxxxxx    NA            Y
            P_INDATE        Stocking date        DATE         DD-MON-YYYY     NA            Y
            P_QOH           Units available      SMALLINT     ####            0-9999        Y
            P_MIN           Minimum units        SMALLINT     ####            0-9999        Y
            P_PRICE         Product price        NUMBER(8,2)  ####.##         0.00-9999.00  Y
            P_DISCOUNT      Discount rate        NUMBER(4,2)  0.##            0.00-0.20     Y
            V_CODE          Vendor code          INTEGER      ###             100-999       N         FK        VENDOR
VENDOR      V_CODE          Vendor code          INTEGER      #####           1000-9999     Y         PK
            V_NAME          Vendor name          CHAR(35)     Xxxxxxxxxxxxxx  NA            Y
            V_CONTACT       Contact person       CHAR(25)     Xxxxxxxxxxxxxx  NA            Y
            V_AREACODE      Area code            CHAR(5)      99999           NA            Y
            V_PHONE         Phone number         CHAR(12)     999-999-9999    NA            Y
            V_COUNTRY       Country              CHAR(2)      XX              NA            Y
            V_ORDER         Previous order       CHAR(1)      X               Y or N        Y

FK = Foreign key
PK = Primary key
CHAR = Fixed character length data, 1 to 255 characters
VARCHAR = Variable character length data, 1 to 2 000 characters. May be labelled VARCHAR2 in Oracle.
NUMBER = Numeric data. NUMBER(9,2) is used to specify numbers with two decimal places and up to nine digits long, including the decimal places. Some RDBMSs permit the use of a MONEY or a CURRENCY data type.
INTEGER = Integer values only. Represented by NUMBER in Oracle.
SMALLINT = Small integer values only. Represented by NUMBER in Oracle.
DATE formats may vary. Commonly accepted formats are DD-MON-YYYY, DD-MON-YY, MM/DD/YYYY and MM/DD/YY.
* Not all of the ranges shown here will be illustrated in this chapter. However, you can use these constraints to practise writing your own.

As you examine the data dictionary in Table 8.4, note particularly the data types selected. Keep in mind that data type selection is usually dictated by the nature of the data and by the intended use. For example:

P_PRICE clearly requires some kind of numeric data type; defining it as a character field is not acceptable.
Just as clearly, a vendor name is an obvious candidate for a character data type. For example, VARCHAR2(35) fits well because vendor names are variable-length character strings, and in this case, such strings may be up to 35 characters long.
Country abbreviations are always two characters, so CHAR(2) is a logical choice.
Selecting P_INDATE to be a (Julian) DATE field rather than a character field is desirable because Julian dates allow you to make simple date comparisons and to perform date arithmetic. For instance, if you have used DATE fields, you can determine how many days are between 1 March, 2018 and 15 April, 2019 by using 15-APR-2019 - 01-MAR-2018.

If you use DATE fields, you can also determine what the date will be 60 days from 15 February 2018, using 15-FEB-2018 + 60. Or you can use the RDBMS's system date (SYSDATE in Oracle, Date() in Access) to answer questions such as, 'What will be the date 60 days from today?' For example, you might use SYSDATE + 60 (in Oracle) or Date() + 60 (in Microsoft Access).

Date arithmetic capability is particularly useful in billing. Perhaps you want your system to start charging interest on a customer balance 60 days after the invoice is generated. Such simple date arithmetic would be impossible if you used a character data type.

Data type selection sometimes requires professional judgement. For example, you must make a decision about the V_CODE's data type as follows:

If you want the computer to generate new vendor codes by adding 1 to the largest recorded vendor code, you must classify V_CODE as a numeric attribute. (You cannot perform mathematical procedures on character data.) The designation INTEGER will ensure that only the counting numbers (integers) can be used. Most SQL implementations also permit the use of SMALLINT for integer values up to six digits.
If you do not want to perform mathematical procedures based on V_CODE, you should classify it as a character attribute, even though it is composed entirely of numbers. Character data are quicker to process in queries. Therefore, when there is no need to perform mathematical procedures on the attribute, store it as a character attribute.

The first option is used to demonstrate the SQL procedures in this chapter.

When you define the attribute's data type, you must pay close attention to the expected use of the attributes for sorting and data retrieval purposes. For example, in a real estate application, an attribute that represents the number of bathrooms in a home (H_BATH_NUM) could be assigned the CHAR(3) data type because it is highly unlikely the application will do any addition, multiplication or division with the number of bathrooms. Based on the CHAR(3) data type definition, valid H_BATH_NUM values would be 2, 1, 2.5, 10. However, this data type decision creates potential problems. For example, if an application sorts the homes by number of bathrooms, a query would see the value 10 as less than 2, which is clearly incorrect. So you need to give some thought to the expected use of the data in order to define the attribute data type properly.

The data dictionary in Table 8.4 contains only a few of the data types supported by SQL. For teaching purposes, the selection of data types is limited to ensure that almost any RDBMS can be used to implement the examples. If your RDBMS is fully compliant with ANSI SQL, it will support many more data types than the ones shown in Table 8.5. And many RDBMSs support data types beyond the ones specified in ANSI SQL.

TABLE 8.5 Some common SQL data types

Data Type: Numeric

NUMBER(L,D)
The declaration NUMBER(7,2) indicates numbers that will be stored with two decimal places and may be up to seven digits long, including the sign and the decimal place (for example, 12.32 or -134.99).

INTEGER
May be abbreviated as INT. Integers are (whole) counting numbers, so they cannot be used if you want to store numbers that require decimal places.

SMALLINT
Like INTEGER but limited to integer values up to six digits. If your integer values are relatively small, use SMALLINT instead of INT.

DECIMAL(L,D)
Like the NUMBER specification, but the storage length is a minimum specification. That is, greater lengths are acceptable, but smaller ones are not. DECIMAL(9,2), DECIMAL(9) and DECIMAL are all acceptable.

FLOAT(L,D)
Float is similar to DECIMAL and is used more often in MySQL.

Data Type: Character

CHAR(L)
Fixed-length character data for up to 255 characters. If you store strings that are not as long as the CHAR parameter value, the remaining spaces are left unused. Therefore, if you specify CHAR(25), strings such as Smith and Katzenjammer are each stored as 25 characters. However, a US area code is always three digits long, so CHAR(3) would be appropriate if you wanted to store such codes.

VARCHAR(L) or VARCHAR2(L)
Variable-length character data. The designation VARCHAR2(25) will let you store characters up to 25 characters long. However, VARCHAR will not leave unused spaces. Oracle automatically converts VARCHAR to VARCHAR2.

Data Type: Date

DATE
Stores dates in the Julian date format.

In addition to the data types shown in Table 8.5, SQL supports several other data types, including TIME, TIMESTAMP, REAL, DOUBLE, FLOAT and intervals such as INTERVAL DAY TO HOUR. Many RDBMSs also have expanded the list to include other types of data, such as LOGICAL, CURRENCY, AutoNumber (Access) and sequence (Oracle). However, because this chapter is designed to introduce the SQL basics, the discussion is limited to the data types summarised in Table 8.5.
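Two of the data type consequences discussed in this section, the sorting problem with numbers stored as characters and the usefulness of date arithmetic, can be sketched as follows. The sketch makes several assumptions: it uses Python's built-in sqlite3 module; the HOME table and its H_CODE and H_BATH_VAL columns are hypothetical names introduced only for this illustration; and because SQLite has no Oracle-style DATE arithmetic, its julianday() and date() functions stand in for expressions such as 15-APR-2019 - 01-MAR-2018 and 15-FEB-2018 + 60.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# HOME, H_CODE and H_BATH_VAL are hypothetical; only H_BATH_NUM is from the text.
conn.execute(
    "CREATE TABLE HOME ("
    " H_CODE INTEGER PRIMARY KEY,"
    " H_BATH_NUM CHAR(3),"      # number of bathrooms stored as character data
    " H_BATH_VAL NUMBER(3,1))"  # the same value stored as numeric data
)
conn.executemany(
    "INSERT INTO HOME VALUES (?, ?, ?)",
    [(1, "2", 2), (2, "1", 1), (3, "2.5", 2.5), (4, "10", 10)],
)

# Character data sorts character by character, so '10' comes before '2'.
as_char = [r[0] for r in conn.execute(
    "SELECT H_BATH_NUM FROM HOME ORDER BY H_BATH_NUM")]

# Numeric data sorts by value.
as_number = [r[0] for r in conn.execute(
    "SELECT H_BATH_VAL FROM HOME ORDER BY H_BATH_VAL")]

# Date arithmetic: days between 1 March 2018 and 15 April 2019.
days_between = conn.execute(
    "SELECT julianday('2019-04-15') - julianday('2018-03-01')"
).fetchone()[0]

# Date arithmetic: the date 60 days after 15 February 2018.
due_date = conn.execute(
    "SELECT date('2018-02-15', '+60 days')"
).fetchone()[0]

print(as_char)       # ['1', '10', '2', '2.5']
print(as_number)     # [1, 2, 2.5, 10]
print(days_between)  # 410.0
print(due_date)      # 2018-04-16
```

The character sort places 10 before 2, which is the incorrect ordering the text warns about, while the date results would be awkward to obtain if the dates were stored as plain character strings in a format such as DD-MON-YYYY.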

8.2.5 Creating Table Structures

Now you are ready to implement the PRODUCT and VENDOR table structures with the help of SQL, using the CREATE TABLE syntax shown next.

CREATE TABLE tablename (
    column1    data type    [constraint] [,
    column2    data type    [constraint] ] [,
    PRIMARY KEY (column1 [, column2]) ] [,
    FOREIGN KEY (column1 [, column2]) REFERENCES tablename ] [,
    CONSTRAINT constraint ] );

Online Content
For Oracle users, all the SQL commands you will see in this chapter are available on the online platform for this book. You can copy and paste the SQL commands into Oracle SQL Developer or Oracle APEX.

To make the SQL code more readable, most SQL programmers use one line per column (attribute) definition. In addition, spaces are used to line up the attribute characteristics and constraints. Finally, both table and attribute names are fully capitalised. Those conventions are used in the following examples that create VENDOR and PRODUCT tables and throughout the book.

NOTE ABOUT SQL SYNTAX

Syntax notation for SQL commands used in this book:

CAPITALS        Required SQL command keywords
Italics         An end-user-provided parameter (generally required)
{a | b | ..}    A mandatory parameter; use one option from the list separated by |
[......]        An optional parameter; anything inside square brackets is optional
tablename       The name of a table
column          The name of an attribute in a table
data type       A valid data type definition
constraint      A valid constraint definition
condition       A valid conditional expression (evaluates to true or false)
columnlist      One or more column names or expressions separated by commas
tablelist       One or more table names separated by commas
conditionlist   One or more conditional expressions separated by logical operators
expression      A simple value (that is, 76 or Married) or a formula (that is, P_PRICE - 10)

CREATE TABLE VENDOR (
    V_CODE        INTEGER        NOT NULL    UNIQUE,
    V_NAME        VARCHAR(35)    NOT NULL,
    V_CONTACT     VARCHAR(15)    NOT NULL,
    V_AREACODE    CHAR(5)        NOT NULL,
    V_PHONE       CHAR(12)       NOT NULL,
    V_Country     CHAR(2)        NOT NULL,
    V_ORDER       CHAR(1)        NOT NULL,
    PRIMARY KEY (V_CODE));

NOTE

Because the PRODUCT table contains a foreign key that references the VENDOR table, create the VENDOR table first. (In fact, the * side of a relationship always references the 1 side. Therefore, in a 1:* relationship, you must always create the table for the 1 side first.)
If your RDBMS does not support the VARCHAR2 and FCHAR format, use CHAR.
Oracle accepts the VARCHAR data type and automatically converts it to VARCHAR2.
If your RDBMS does not support SINT or SMALLINT, use INTEGER or INT. If INTEGER is not supported, use NUMBER.
If you use Microsoft Access, you can use the NUMBER data type, but you cannot use the number delimiters at the SQL level. For example, using NUMBER(8,2) to indicate numbers with up to eight characters and two decimal places is fine in Oracle, but you cannot use it in Access; instead, use NUMBER without delimiters.
If your RDBMS does not support primary and foreign key designations or the UNIQUE specification, delete them from the SQL code shown here.
If you use the PRIMARY KEY designation in Oracle, you do not need the NOT NULL and UNIQUE specifications.
The ON UPDATE CASCADE clause is part of the ANSI standard, but it may not be supported by your RDBMS. In that case, delete the ON UPDATE CASCADE clause.

CREATE TABLE PRODUCT (
    P_CODE        VARCHAR(10)    NOT NULL    UNIQUE,
    P_DESCRIPT    VARCHAR(35)    NOT NULL,
    P_INDATE      DATE           NOT NULL,
    P_QOH         SMALLINT       NOT NULL,
    P_MIN         SMALLINT       NOT NULL,
    P_PRICE       NUMBER(8,2)    NOT NULL,
    P_DISCOUNT    NUMBER(4,2)    NOT NULL,
    V_CODE        INTEGER,
    PRIMARY KEY (P_CODE),
    FOREIGN KEY (V_CODE) REFERENCES VENDOR ON UPDATE CASCADE);

As you examine the preceding SQL table-creating command sequences, note the following features:

The NOT NULL specifications for the attributes ensure that a data entry will be made. When it is crucial to have the data available, the NOT NULL specification will not allow the end user to leave the attribute empty (with no data entry at all). Because this specification is made at the table level and stored in the data dictionary, application programs can use this information to create the data dictionary validation automatically.
The UNIQUE specification creates a unique index in the respective attribute. Use it to avoid duplicated values in a column.
The primary key attributes contain both a NOT NULL and a UNIQUE specification. Those specifications enforce the entity integrity requirements. If the NOT NULL and UNIQUE specifications are not supported, use PRIMARY KEY without the specifications. (For example, if you designate the PK in Microsoft Access, the NOT NULL and UNIQUE specifications are automatically assumed and are not spelled out.)
The entire table definition is enclosed in parentheses. A comma is used to separate each table element (attributes, primary key and foreign key) definition.
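The effect of the NOT NULL and UNIQUE specifications can be observed by trying to break them. The following is a minimal sketch rather than a definitive implementation: it assumes Python's built-in sqlite3 module, which accepts the VENDOR definition from this section largely as written, and the helper function try_insert is a hypothetical name introduced for the illustration. The rows come from Figure 8.2.

```python
import sqlite3

# Assumption: SQLite stands in for the RDBMS used in the chapter.
conn = sqlite3.connect(":memory:")
conn.execute("""
CREATE TABLE VENDOR (
    V_CODE     INTEGER     NOT NULL UNIQUE,
    V_NAME     VARCHAR(35) NOT NULL,
    V_CONTACT  VARCHAR(15) NOT NULL,
    V_AREACODE CHAR(5)     NOT NULL,
    V_PHONE    CHAR(12)    NOT NULL,
    V_COUNTRY  CHAR(2)     NOT NULL,
    V_ORDER    CHAR(1)     NOT NULL,
    PRIMARY KEY (V_CODE))
""")

insert = "INSERT INTO VENDOR VALUES (?, ?, ?, ?, ?, ?, ?)"
conn.execute(
    insert, (21225, "Bryson, Inc.", "Smithson", "0181", "223-3234", "UK", "Y")
)


def try_insert(row):
    """Hypothetical helper: report whether the DBMS accepts the row."""
    try:
        conn.execute(insert, row)
        return "accepted"
    except sqlite3.IntegrityError as exc:
        return str(exc)


# Violates NOT NULL: the vendor name is missing.
missing_name = try_insert(
    (21226, None, "Flushing", "0113", "215-8995", "SA", "N")
)

# Violates UNIQUE / PRIMARY KEY: V_CODE 21225 already exists.
duplicate_code = try_insert(
    (21225, "D&E Supply", "Singh", "0181", "228-3245", "UK", "Y")
)

print(missing_name)
print(duplicate_code)
```

Both inserts are rejected by the DBMS itself, without any validation code in the application, which is the point of declaring the constraints in the table definition.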

NOTE

If you are working with a composite primary key, all of the primary key's attributes are contained within the parentheses and are separated with commas. For example, the LINE table in Figure 8.1 has a primary key that consists of the two attributes INV_NUMBER and LINE_NUMBER. Therefore, you would define the primary key by typing:

PRIMARY KEY (INV_NUMBER, LINE_NUMBER),

The order of the primary key components is important because the indexing starts with the first-mentioned attribute, then proceeds with the next attribute, and so on. In this example, the line numbers would be ordered within each of the invoice numbers:

INV_NUMBER    LINE_NUMBER
1001          1
1001          2
1002          1
1003          1
1003          2

The ON UPDATE CASCADE specification ensures that if you make a change in any VENDOR's V_CODE, that change is automatically applied to all foreign key references throughout the system (cascade) to ensure that referential integrity is maintained. (Although the ON UPDATE CASCADE clause is part of the ANSI standard, some RDBMSs, such as Oracle, do not support ON UPDATE CASCADE. If your RDBMS does not support the clause, delete it from the code shown here.)
An RDBMS automatically enforces referential integrity for foreign keys. That is, you cannot have an invalid entry in the foreign key column; at the same time, you cannot delete a vendor row as long as a product row references that vendor.
The command sequence ends with a semicolon. (Remember, your RDBMS may require that you omit the semicolon.)

NOTE ABOUT COLUMN NAMES

Do not use mathematical symbols such as +, - and / in your column names. For example, PER-NUM may generate an error message, but PER_NUM is acceptable. Also, do not use reserved words. Reserved words are words used by SQL to perform specific functions. For example, in some RDBMSs, the column name INITIAL generates the message invalid column name.
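The ON UPDATE CASCADE behaviour and the automatic enforcement of referential integrity described above can be sketched as follows. The sketch rests on assumptions: it uses Python's built-in sqlite3 module, in which foreign key enforcement must be switched on with a PRAGMA statement; the tables are cut-down versions of VENDOR and PRODUCT; the replacement vendor code 31225 and the helper function attempt are hypothetical and introduced only for this illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite-specific assumption: foreign keys are enforced only when enabled.
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE VENDOR (V_CODE INTEGER PRIMARY KEY, V_NAME VARCHAR(35) NOT NULL);
CREATE TABLE PRODUCT (
    P_CODE VARCHAR(10) PRIMARY KEY,
    V_CODE INTEGER,
    FOREIGN KEY (V_CODE) REFERENCES VENDOR ON UPDATE CASCADE);
INSERT INTO VENDOR VALUES (21225, 'Bryson, Inc.');
INSERT INTO PRODUCT VALUES ('23109-HB', 21225);
""")

# Changing the vendor's V_CODE cascades to the foreign key in PRODUCT.
conn.execute("UPDATE VENDOR SET V_CODE = 31225 WHERE V_CODE = 21225")
cascaded = conn.execute(
    "SELECT V_CODE FROM PRODUCT WHERE P_CODE = '23109-HB'"
).fetchone()[0]


def attempt(sql):
    """Hypothetical helper: report whether the DBMS accepts the statement."""
    try:
        conn.execute(sql)
        return "accepted"
    except sqlite3.IntegrityError:
        return "rejected"


# An invalid foreign key entry: vendor 99999 does not exist.
orphan_product = attempt("INSERT INTO PRODUCT VALUES ('SM-18277', 99999)")

# Deleting a vendor that a product row still references.
delete_vendor = attempt("DELETE FROM VENDOR WHERE V_CODE = 31225")

print(cascaded)        # 31225
print(orphan_product)  # rejected
print(delete_vendor)   # rejected
```

In an RDBMS that does not support ON UPDATE CASCADE, the UPDATE of the vendor code would itself be rejected while a product row still references the old value.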

NOTE TO ORACLE USERS

If you are using Oracle's SQL to create tables in Oracle, when you press the Enter key after typing each line, a line number is automatically generated as long as you do not type a semicolon before pressing the Enter key. Line numbers are also automatically generated when using Oracle's SQL Developer. For example, the execution of the CREATE TABLE command for the PRODUCT table looks like this:

CREATE TABLE PRODUCT (
 2  P_CODE        VARCHAR2(10)
 3  CONSTRAINT PRODUCT_P_CODE_PK PRIMARY KEY,
 4  P_DESCRIPT    VARCHAR2(35)   NOT NULL,
 5  P_INDATE      DATE           NOT NULL,
 6  P_QOH         NUMBER         NOT NULL,
 7  P_MIN         NUMBER         NOT NULL,
 8  P_PRICE       NUMBER(8,2)    NOT NULL,
 9  P_DISCOUNT    NUMBER(5,2)    NOT NULL,
10  V_CODE        NUMBER,
11  CONSTRAINT PRODUCT_V_CODE_FK
12  FOREIGN KEY (V_CODE) REFERENCES VENDOR);

As you examine the preceding SQL command sequence, note the following:

The attribute definition for P_CODE starts in line 2 and ends with a comma at the end of line 3.
The CONSTRAINT clause (line 3) allows you to define and name a constraint in Oracle. You can name the constraint to meet your own naming conventions. In this case, the constraint was named PRODUCT_P_CODE_PK.
Examples of constraints are NOT NULL, UNIQUE, PRIMARY KEY, FOREIGN KEY and CHECK. For additional details about constraints, see below.
To define a PRIMARY KEY constraint, you could also use the following syntax: P_CODE VARCHAR2(10) PRIMARY KEY. In this case, Oracle automatically names the constraint.
Lines 11 and 12 define a FOREIGN KEY constraint named PRODUCT_V_CODE_FK for the attribute V_CODE. The CONSTRAINT clause is generally used at the end of the CREATE TABLE command sequence.
If you do not name the constraints yourself, Oracle automatically names them.

of constraints

VARCHAR2(10)

8

NOT

PRODUCT_V_CODE_FK

KEY V_CODE

the

The attribute

NULL,

NUMBER,

FOREIGN

As you

NOT

constraints

yourself,

Oracle-assigned

deciphering

name

it later.

You

Oracle makes

should

automatically

sense

assign

assigns

only to

a name

Oracle,

that

a name. so you

makes

sense

will have to

a

human

beings!

8.2.6 SQL Constraints

In Chapter 3, Relational Model Characteristics, you learnt that adherence to entity integrity and referential integrity rules is crucial in a relational database environment. Fortunately, most SQL implementations support both integrity rules. Entity integrity is enforced automatically when the primary key is specified in the CREATE TABLE command sequence. For example, you can create the VENDOR table structure and set the stage for the enforcement of entity integrity rules by using:

    PRIMARY KEY (V_CODE)

As you look at the PRODUCT table's CREATE TABLE sequence, note that referential integrity has been enforced by specifying in the PRODUCT table:

    FOREIGN KEY (V_CODE) REFERENCES VENDOR ON UPDATE CASCADE

That foreign key constraint definition ensures that:

• You cannot delete a vendor from the VENDOR table if at least one product row references that vendor. This is the default behaviour for the treatment of foreign keys.
• On the other hand, if a change is made in an existing VENDOR table's V_CODE, that change must be reflected automatically in any PRODUCT table V_CODE reference (ON UPDATE CASCADE). That restriction makes it impossible for a V_CODE value to exist in the PRODUCT table pointing to a non-existent VENDOR table V_CODE value. In other words, the ON UPDATE CASCADE specification ensures the preservation of referential integrity. (Oracle does not support ON UPDATE CASCADE.)
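The default foreign key behaviour described above can be sketched with a cut-down pair of tables. The names VENDOR_DEMO and PRODUCT_DEMO, their reduced column lists and the sample rows are assumptions made purely for illustration; they are not the chapter's full table definitions. ON UPDATE CASCADE is left out so that the sketch should also be acceptable to Oracle.

```sql
-- Hypothetical, cut-down tables (not the chapter's full definitions).
CREATE TABLE VENDOR_DEMO (
    V_CODE  INTEGER      PRIMARY KEY,     -- entity integrity: unique and not null
    V_NAME  VARCHAR(35)  NOT NULL
);

CREATE TABLE PRODUCT_DEMO (
    P_CODE  VARCHAR(10)  PRIMARY KEY,
    V_CODE  INTEGER,                      -- optional: a product may have no vendor
    FOREIGN KEY (V_CODE) REFERENCES VENDOR_DEMO (V_CODE)
);

INSERT INTO VENDOR_DEMO  VALUES (21225, 'Bryson, Inc.');
INSERT INTO PRODUCT_DEMO VALUES ('23109-HB', 21225);

-- Rejected: vendor 99999 does not exist, so the foreign key has no parent row.
INSERT INTO PRODUCT_DEMO VALUES ('XX-0001', 99999);

-- Rejected: product 23109-HB still references vendor 21225.
DELETE FROM VENDOR_DEMO WHERE V_CODE = 21225;

-- Accepted once the referencing product row has been removed first.
DELETE FROM PRODUCT_DEMO WHERE P_CODE = '23109-HB';
DELETE FROM VENDOR_DEMO  WHERE V_CODE = 21225;
```

The exact wording of the two error messages depends on the RDBMS; what matters is that both statements fail and leave the tables unchanged.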


Online Content
For a more detailed discussion of the options for the ON DELETE and ON UPDATE clauses, see Appendix D, Converting an ER Model into a Database Structure, Section D.2, General Rules Governing Relationships Among Tables. Appendix D is available to download on the online platform for this book.

NOTE ABOUT REFERENTIAL CONSTRAINT ACTIONS
The support for the referential constraint actions varies from product to product. For example:

• MySQL, SQL Server and Oracle support ON DELETE CASCADE.
• MySQL and SQL Server support ON UPDATE CASCADE.
• Oracle does not support ON UPDATE CASCADE.
• Oracle supports SET NULL (for ON DELETE). Current versions of MySQL (with the InnoDB storage engine) and SQL Server also support SET NULL.

Refer to your product manuals for additional information on referential constraints.

While Microsoft Access does not support ON DELETE CASCADE or ON UPDATE CASCADE at the SQL command-line level, it does support them through the relationship window interface. In fact, whenever you try to establish a relationship between two tables in Access, the relationship window interface automatically pops up.

In general, ANSI SQL permits the use of ON DELETE and ON UPDATE clauses to cover any of the following actions: CASCADE, SET NULL or SET DEFAULT.
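As a brief sketch of one of these actions, the following uses ON DELETE SET NULL. The tables SUPPLIER_DEMO and ITEM_DEMO, the constraint name and the sample values are hypothetical and exist only for this illustration; check your product manual before relying on the clause.

```sql
-- Hypothetical tables, used only to illustrate ON DELETE SET NULL.
CREATE TABLE SUPPLIER_DEMO (
    S_CODE  INTEGER  PRIMARY KEY
);

CREATE TABLE ITEM_DEMO (
    I_CODE  VARCHAR(10)  PRIMARY KEY,
    S_CODE  INTEGER,                      -- must allow nulls for SET NULL to work
    CONSTRAINT ITEM_DEMO_S_CODE_FK
        FOREIGN KEY (S_CODE) REFERENCES SUPPLIER_DEMO (S_CODE)
        ON DELETE SET NULL
);

INSERT INTO SUPPLIER_DEMO VALUES (1);
INSERT INTO ITEM_DEMO     VALUES ('A-1', 1);

-- The supplier row is removed; the item row is kept, but without a supplier.
DELETE FROM SUPPLIER_DEMO WHERE S_CODE = 1;

-- S_CODE is now null for item A-1.
SELECT I_CODE, S_CODE FROM ITEM_DEMO;
```

SET NULL suits optional relationships, where the child row still makes sense on its own; CASCADE suits children that have no meaning without their parent.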

Besides the PRIMARY KEY and FOREIGN KEY constraints, the ANSI SQL standard also defines the following constraints:

• The NOT NULL constraint is used to ensure that a column does not accept nulls.
• The UNIQUE constraint is used to ensure that all values in a column are unique.
• The DEFAULT constraint is used to assign a value to an attribute when a new row is added to a table. The end user may, of course, enter a value other than the default value.
• The CHECK constraint is used to validate data when an attribute value is entered. The CHECK constraint does precisely what its name suggests: it checks to see that a specified condition exists. Examples of such constraints include the following:
    • The minimum order value must be at least ten.
    • The date must be after 15 April 2019.
  If the CHECK constraint is met for the specified attribute (that is, the condition is true), the data are accepted for that attribute. If the condition is found to be false, an error message is generated and the data are not accepted.

Note that the CREATE TABLE command lets you define constraints in two different places:

• When you create the column definition (known as a column constraint).
• When you use the CONSTRAINT keyword (known as a table constraint).
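The two placements can be sketched side by side. ORDER_DEMO, its columns and the constraint name ORDER_DEMO_CK1 are hypothetical; the first CHECK borrows the 'minimum order value must be at least ten' rule mentioned above.

```sql
-- Hypothetical table showing both places in which a constraint can be written.
CREATE TABLE ORDER_DEMO (
    ORD_NUM    INTEGER       PRIMARY KEY,          -- column constraint
    ORD_VALUE  NUMERIC(9,2)  NOT NULL
               CHECK (ORD_VALUE >= 10),            -- column constraint: tests one column
    ORD_DATE   DATE          NOT NULL,
    SHIP_DATE  DATE,
    CONSTRAINT ORDER_DEMO_CK1                      -- table constraint: compares two columns
        CHECK (SHIP_DATE >= ORD_DATE)
);
```

A rule that compares two columns, such as ORDER_DEMO_CK1, is most naturally written as a table constraint. While SHIP_DATE is null the comparison is unknown rather than false, so the row is still accepted.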

A column constraint applies to just one column; a table constraint may apply to many columns. Those constraints are supported at varying levels of compliance by enterprise RDBMSs.

In this chapter, Oracle is used to illustrate SQL constraints. For example, note that the following SQL command sequence uses the DEFAULT and CHECK constraints to define the table named CUSTOMER.

    CREATE TABLE CUSTOMER (
        CUS_CODE      NUMBER       PRIMARY KEY,
        CUS_LNAME     VARCHAR(15)  NOT NULL,
        CUS_FNAME     VARCHAR(15)  NOT NULL,
        CUS_INITIAL   CHAR(1),
        CUS_AREACODE  CHAR(5)      DEFAULT '0181' NOT NULL
                      CHECK(CUS_AREACODE IN ('0181','0161','7253')),
        CUS_PHONE     CHAR(12)     NOT NULL,
        CUS_BALANCE   NUMBER(9,2)  DEFAULT 0.00,
        CONSTRAINT CUS_UI1 UNIQUE (CUS_LNAME, CUS_FNAME));

In that case, the CUS_AREACODE attribute is assigned a default value of 0181. Therefore, if a new CUSTOMER table row is added and the end user makes no entry for the area code, the 0181 value is recorded. Also note that the CHECK condition restricts the values for the customer's area code to 0181, 0161 and 7253; any other values are rejected.

It is important to note that the DEFAULT value applies only when new rows are added to a table and then only when no value is entered for the customer's area code. (The default value is not used when the table is modified.) In contrast, the CHECK condition is validated whether a customer row is added or modified. However, while the CHECK condition may include any valid expression, it applies only to the attributes in the table being checked. If you want to check for conditions that include attributes in other tables, you must use triggers. (See Chapter 9, Procedural Language SQL and Advanced SQL.) Finally, the last line of the CREATE TABLE command sequence creates a unique index constraint (named CUS_UI1) on the customer's last name and first name. The index prevents the entry of two customers with the same last name and first name. (This index merely illustrates the process. Clearly, it should be possible to have more than one person named John Smith in the CUSTOMER table.)
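The behaviour of the DEFAULT, CHECK and UNIQUE constraints on CUSTOMER can be sketched with a few statements. This assumes the CUSTOMER table has been created as defined in this section and is still empty; the customer codes, names and phone numbers are invented for illustration.

```sql
-- No area code supplied: the DEFAULT constraint records '0181'.
INSERT INTO CUSTOMER (CUS_CODE, CUS_LNAME, CUS_FNAME, CUS_PHONE)
VALUES (10010, 'Ramas', 'Alfred', '844-2573');

-- Rejected by the CHECK constraint: '0207' is not in the permitted list.
INSERT INTO CUSTOMER (CUS_CODE, CUS_LNAME, CUS_FNAME, CUS_AREACODE, CUS_PHONE)
VALUES (10011, 'Dunne', 'Leona', '0207', '894-1238');

-- Rejected by CUS_UI1: there is already a customer named Alfred Ramas.
INSERT INTO CUSTOMER (CUS_CODE, CUS_LNAME, CUS_FNAME, CUS_AREACODE, CUS_PHONE)
VALUES (10012, 'Ramas', 'Alfred', '0161', '222-1672');

-- The CHECK condition is validated on modification too, so this is rejected.
UPDATE CUSTOMER
SET    CUS_AREACODE = '0207'
WHERE  CUS_CODE = 10010;
```

Only the first statement changes the table; after running all four, CUSTOMER holds a single row with CUS_AREACODE 0181 and CUS_BALANCE 0.00.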

NOTE TO MICROSOFT ACCESS USERS
Microsoft Access will not accept the DEFAULT or CHECK constraints. However, Microsoft Access will accept the CONSTRAINT CUS_UI1 UNIQUE (CUS_LNAME, CUS_FNAME) line and create the unique index.

In the following SQL command to create the INVOICE table, the DEFAULT constraint assigns a default date to a new invoice, and the CHECK constraint validates that the invoice date is greater than 1 January 2019.

    CREATE TABLE INVOICE (
        INV_NUMBER  NUMBER  PRIMARY KEY,
        CUS_CODE    NUMBER  NOT NULL REFERENCES CUSTOMER(CUS_CODE),
        INV_DATE    DATE    DEFAULT SYSDATE NOT NULL,
        CONSTRAINT INV_CK1 CHECK (INV_DATE > TO_DATE('01-JAN-2019','DD-MON-YYYY')));

In this case, notice the following:

• The CUS_CODE attribute definition contains REFERENCES CUSTOMER (CUS_CODE) to indicate that the CUS_CODE is a foreign key. This is another way to define a foreign key.
• The DEFAULT constraint uses the SYSDATE special function. This function always returns today's date.
• The invoice date (INV_DATE) attribute is automatically given today's date (returned by SYSDATE) when a new row is added and no value is given for the attribute.
• A CHECK constraint is used to validate that the invoice date is greater than '1 January 2019'. When comparing a date to a manually entered date in a CHECK clause, Oracle requires the use of the TO_DATE function. The TO_DATE function takes two parameters, the literal date and the date format used.

The final SQL command sequence creates the LINE table. The LINE table has a composite primary key (INV_NUMBER, LINE_NUMBER) and uses a UNIQUE constraint in INV_NUMBER and P_CODE to ensure that the same product is not ordered twice in the same invoice.

    CREATE TABLE LINE (
        INV_NUMBER   NUMBER       NOT NULL,
        LINE_NUMBER  NUMBER(2,0)  NOT NULL,
        P_CODE       VARCHAR(10)  NOT NULL,
        LINE_UNITS   NUMBER(9,2)  DEFAULT 0.00 NOT NULL,
        LINE_PRICE   NUMBER(9,2)  DEFAULT 0.00 NOT NULL,
        PRIMARY KEY (INV_NUMBER,LINE_NUMBER),
        FOREIGN KEY (INV_NUMBER) REFERENCES INVOICE ON DELETE CASCADE,
        FOREIGN KEY (P_CODE) REFERENCES PRODUCT(P_CODE),
        CONSTRAINT LINE_UI1 UNIQUE(INV_NUMBER, P_CODE));

In the creation of the LINE table, note that a UNIQUE constraint is added to prevent the duplication of an invoice line. A UNIQUE constraint is enforced through the creation of a unique index. Also note that the ON DELETE CASCADE foreign key action enforces referential integrity. The use of ON DELETE CASCADE is recommended for weak entities to ensure that the deletion of a row in the strong entity automatically triggers the deletion of the corresponding rows in the dependent weak entity. In that case, the deletion of an INVOICE row automatically deletes all of the LINE rows related to the invoice. In the following section, you will learn more about indexes and how to use SQL commands to create them.
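The effect of LINE_UI1 and ON DELETE CASCADE can be sketched as follows. The sketch assumes that the CUSTOMER, PRODUCT, INVOICE and LINE tables exist as defined in this chapter, and that customer 10010 and products 13-Q2/P2 and 23109-HB have already been entered; the invoice number and line values are invented for illustration.

```sql
-- INV_DATE is omitted, so the DEFAULT constraint supplies SYSDATE.
INSERT INTO INVOICE (INV_NUMBER, CUS_CODE) VALUES (1001, 10010);

INSERT INTO LINE VALUES (1001, 1, '13-Q2/P2', 1, 14.99);
INSERT INTO LINE VALUES (1001, 2, '23109-HB', 1, 9.95);

-- Rejected by LINE_UI1: product 13-Q2/P2 is already on invoice 1001.
INSERT INTO LINE VALUES (1001, 3, '13-Q2/P2', 2, 14.99);

-- Deleting the strong entity row also deletes its two dependent LINE rows.
DELETE FROM INVOICE WHERE INV_NUMBER = 1001;

-- No LINE rows remain for invoice 1001.
SELECT COUNT(*) FROM LINE WHERE INV_NUMBER = 1001;
```

Without ON DELETE CASCADE, the DELETE on INVOICE would be rejected for as long as any LINE row referenced invoice 1001.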

NOTE
The current release of the MySQL community edition is currently 8.0. The InnoDB storage engine was integrated as the default storage engine for tables created with MySQL. This meant that MySQL Data Manipulation Language (DML) operations followed the standard ACID (Atomicity, Consistency, Isolation and Durability) properties, in line with other major RDBMSs such as Oracle. (ACID properties of transactions will be covered in more detail in Chapter 12, Managing Transactions and Concurrency.) MySQL maintains data integrity through the InnoDB engine by supporting FOREIGN KEY constraints. This ensures that database consistency is maintained across all tables when data is inserted, updated or deleted.

To summarise, MySQL version 8.0 (and beyond) supports the following SQL constraints:

• NOT NULL
• UNIQUE
• PRIMARY KEY
• FOREIGN KEY

It is important to note that, in MySQL releases before 8.0.16, a CHECK constraint can be added when a table is created, but it has no effect when the data is actually entered into the table; CHECK constraints are enforced only from version 8.0.16 onwards. For that reason, the CHECK constraint is omitted from the example below. Consider the CUSTOMER table that has been created in this section using Oracle SQL. Below is the corresponding MySQL command sequence to create the CUSTOMER table (MySQL has no NUMBER data type, so DECIMAL is used for the balance):

    CREATE TABLE CUSTOMER (
        CUS_CODE      INTEGER       PRIMARY KEY,
        CUS_LNAME     VARCHAR(15)   NOT NULL,
        CUS_FNAME     VARCHAR(15)   NOT NULL,
        CUS_INITIAL   CHAR(1),
        CUS_AREACODE  CHAR(5)       NOT NULL DEFAULT '0181',
        CUS_PHONE     CHAR(12)      NOT NULL,
        CUS_BALANCE   DECIMAL(9,2)  DEFAULT 0.00,
        UNIQUE (CUS_LNAME, CUS_FNAME));

8.2.7 SQL Indexes

You learnt in Chapter 3, Relational Model Characteristics, that indexes can be used to improve the efficiency of searches and to avoid duplicate column values. In the previous section, you saw how to declare unique indexes on selected attributes when the table is created. In fact, when you declare a primary key, the DBMS automatically creates a unique index. Even with this feature, you often need additional indexes. The ability to create indexes quickly and efficiently is important. Using the CREATE INDEX command, SQL indexes can be created on the basis of any selected attribute. The syntax is:

    CREATE [UNIQUE] INDEX indexname ON tablename(column1 [, column2])

For example, based on the attribute P_INDATE stored in the PRODUCT table, the following command creates an index named P_INDATEX:

    CREATE INDEX P_INDATEX ON PRODUCT(P_INDATE);

SQL does not let you overwrite an existing index without warning you first, thus preserving the index structure within the data dictionary. Using the UNIQUE index qualifier, you can even create an index that prevents you from using a value that has been used before. Such a feature is especially useful when the index attribute is a primary key (PK) whose values must not be duplicated:

    CREATE UNIQUE INDEX P_CODEX ON PRODUCT(P_CODE);


If you now try to enter a duplicate P_CODE value, SQL produces the error message 'duplicate value in index'. Many RDBMSs, including Access, automatically create a unique index on the PK attribute(s) when you declare the PK.

A common practice is to create an index on any field that is used as a search key, in comparison operations in a conditional expression, or when you want to list rows in a specific order. For example, if you want to create a report of all products by vendor, it would be useful to create an index on the V_CODE attribute in the PRODUCT table. Remember that a vendor can supply many products. Therefore, you should not create a UNIQUE index in this case. Better yet, to make the search as efficient as possible, a composite index is recommended.

Unique composite indexes are often used to prevent data duplication. For example, consider the case illustrated in Table 8.6, in which required employee test scores are stored. (An employee can take a test only once on a given date.) Given the structure of Table 8.6, the PK is EMP_NUM + TEST_NUM. The third test entry for employee 111 meets entity integrity requirements (the combination 111,3 is unique), yet the WEA test entry is clearly duplicated.

TABLE 8.6 A duplicated test record

EMP_NUM   TEST_NUM   TEST_CODE   TEST_DATE     TEST_SCORE
110       1          WEA         15-May-2018   93
110       2          WEA         12-May-2018   87
111       1          HAZ         14-Dec-2018   91
111       2          WEA         18-Feb-2019   95
111       3          WEA         18-Feb-2019   95
112       1          CHEM        17-Aug-2018   91

Such duplication could have been avoided through the use of a unique composite index, using the attributes EMP_NUM, TEST_CODE and TEST_DATE:

    CREATE UNIQUE INDEX EMP_TESTDEX ON TEST(EMP_NUM, TEST_CODE, TEST_DATE);
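To see the unique composite index at work, consider the sketch below. The text does not give the TEST table's definition, so the column types shown here are assumptions based on the columns of Table 8.6; the ANSI DATE literal form is also an assumption about what your RDBMS accepts.

```sql
-- Hypothetical definition of TEST, following the columns of Table 8.6.
CREATE TABLE TEST (
    EMP_NUM     INTEGER     NOT NULL,
    TEST_NUM    INTEGER     NOT NULL,
    TEST_CODE   VARCHAR(4)  NOT NULL,
    TEST_DATE   DATE        NOT NULL,
    TEST_SCORE  INTEGER,
    PRIMARY KEY (EMP_NUM, TEST_NUM)
);

CREATE UNIQUE INDEX EMP_TESTDEX ON TEST (EMP_NUM, TEST_CODE, TEST_DATE);

INSERT INTO TEST VALUES (111, 2, 'WEA', DATE '2019-02-18', 95);

-- Passes the primary key test (111,3 is new) but is rejected by EMP_TESTDEX,
-- because employee 111 already has a WEA test recorded for 18 February 2019.
INSERT INTO TEST VALUES (111, 3, 'WEA', DATE '2019-02-18', 95);
```

The primary key guards against duplicate test numbers for an employee, while the index guards against the same test being recorded twice on the same date; the two rules complement each other.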

By default, all indexes produce results that are listed in ascending order, but you can create an index that yields output in descending order. For example, if you routinely print a report that lists all products ordered by price from highest to lowest, you could create an index named PROD_PRICEX by typing:

    CREATE INDEX PROD_PRICEX ON PRODUCT(P_PRICE DESC);

To delete an index, use the DROP INDEX command:

    DROP INDEX indexname

For example, if you want to eliminate the PROD_PRICEX index, type:

    DROP INDEX PROD_PRICEX;

After creating the tables and some indexes, you are ready to start entering data. The following sections use two tables (VENDOR and PRODUCT) to demonstrate most of the data manipulation commands.

8.3 DATA MANIPULATION COMMANDS

In this section, you will learn how to use the basic SQL data manipulation commands INSERT, SELECT, COMMIT, UPDATE, ROLLBACK and DELETE.

8.3.1 Adding Table Rows

SQL requires the use of the INSERT command to enter data into a table. The INSERT command's basic syntax looks like this:

    INSERT INTO tablename VALUES (value1, value2, ... , valuen)

Because the PRODUCT table uses its V_CODE to reference the VENDOR table's V_CODE, an integrity violation occurs if those VENDOR table V_CODE values don't yet exist. Therefore, you need to enter the VENDOR rows before the PRODUCT rows. Given the VENDOR table structure defined earlier and the sample VENDOR data shown in Figure 8.2, you would enter the first two data rows as follows:

    INSERT INTO VENDOR
    VALUES (21225,'Bryson, Inc.','Smithson','0181','223-3234','UK','Y');

    INSERT INTO VENDOR
    VALUES (21226,'Superloo, Inc.','Flushing','0113','215-8995','SA','N');

and so on, until all of the VENDOR table records have been entered. (To see the contents of the VENDOR table, type SELECT * FROM VENDOR;)

Enter the PRODUCT table rows in the same fashion, using the PRODUCT data shown in Figure 8.2. For example, the first two data rows would be entered as follows, pressing Enter at the end of each line:

    INSERT INTO PRODUCT
    VALUES ('11QER/31','Power painter, 15 psi., 3-nozzle','03-Nov-18',8,5,109.99,0.00,25595);

    INSERT INTO PRODUCT
    VALUES ('13-Q2/P2','7.25-in. pwr. saw blade','13-Dec-18',32,15,14.99, 0.05, 21344);

(To see the contents of the PRODUCT table, type: SELECT * FROM PRODUCT;)

NOTE
Date entry is a function of the date format expected by the DBMS. For example, 25 March 2019 might be shown as 25-Mar-2019 in Microsoft Access and Oracle, or in other presentation formats depending on your RDBMS. In MySQL, the default date format would be 2019-03-25. Microsoft Access requires the use of # delimiters when performing any computations or comparisons based on date attributes, as in P_INDATE >= #25-Mar-19#.

As you examine the preceding data entry lines, observe that:

• The row contents are entered between parentheses. Note that the first character after VALUES is a parenthesis and that the last character in the command sequence is also a parenthesis.
• Character (string) and date values must be entered between apostrophes (').
• Numerical entries are not enclosed in apostrophes.
• Attribute entries are separated by commas.
• A value is required for each column in the table.

This version of the INSERT command adds one table row at a time.

Inserting Rows with Null Attributes

Thus far, you have entered rows in which all of the attribute values are specified. But what do you do if a product does not have a vendor or if you don't yet know the vendor code? In those cases, you want to leave the vendor code null. To enter a null, use the following syntax:

    INSERT INTO PRODUCT
    VALUES ('BRT-345','Titanium drill bit','18-Oct-18', 75, 10, 4.50, 0.06, NULL);

Incidentally, note that the NULL entry is accepted only because the V_CODE attribute is optional: the NOT NULL declaration was not used in the CREATE TABLE statement for this attribute.

Inserting Rows with Optional Attributes

There may be occasions when more than one attribute is optional. Rather than declaring each attribute as NULL in the INSERT command, you can indicate just the attributes that have required values. You do that by listing the attribute names inside parentheses after the table name. For the purpose of this example, assume that the only required attributes for the PRODUCT table are P_CODE and P_DESCRIPT:

    INSERT INTO PRODUCT(P_CODE, P_DESCRIPT)
    VALUES ('BRT-345','Titanium drill bit');
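With the PRODUCT table as actually defined in this chapter, every attribute except V_CODE is declared NOT NULL, so a column-list INSERT must name all of those attributes. The sketch below shows that form; the product codes, descriptions and values are invented for illustration, and the ANSI DATE literal is an assumption about what your RDBMS accepts.

```sql
-- The column list names every NOT NULL column of PRODUCT; V_CODE is omitted
-- and is therefore stored as null.
INSERT INTO PRODUCT (P_CODE, P_DESCRIPT, P_INDATE, P_QOH, P_MIN, P_PRICE, P_DISCOUNT)
VALUES ('BRT-346', 'Cobalt drill bit', DATE '2018-10-18', 75, 10, 4.50, 0.06);

-- The columns may be listed in any order, provided the values follow the same order.
INSERT INTO PRODUCT (P_DESCRIPT, P_CODE, P_PRICE, P_DISCOUNT, P_MIN, P_QOH, P_INDATE)
VALUES ('Carbide drill bit', 'BRT-347', 5.25, 0.00, 10, 40, DATE '2018-10-18');

-- Both new rows show a null V_CODE.
SELECT P_CODE, P_DESCRIPT, V_CODE
FROM   PRODUCT
WHERE  P_CODE IN ('BRT-346', 'BRT-347');
```

Naming the columns also protects the statement against later changes to the table, such as a column being added, which would break an INSERT that relies on column position.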

8.3.2 Saving Table Changes

Any changes made to the table contents are not physically saved on disk until you close the database, close the program you are using, or use the COMMIT command. If you are using the database, and a power outage or some other interruption occurs before you issue the COMMIT command, your changes are lost and only the original table contents are retained. The syntax for the COMMIT command is:

    COMMIT [WORK]

The COMMIT command will permanently save any changes, such as rows added, attributes modified and rows deleted, made to any table in the database. Therefore, if you intend to make your changes to the PRODUCT table permanent, it is a good idea to save those changes by using:

    COMMIT;

NOTE TO MICROSOFT ACCESS USERS
Microsoft Access doesn't support the COMMIT command. Access automatically saves changes after the execution of each SQL command.

NOTE TO MYSQL USERS
MySQL version 5.6 and onwards supports the use of the COMMIT command. When started, the storage engine defaults to the autocommit mode. As soon as any DML statement is executed that updates a table, MySQL automatically commits the transaction, making it permanent.


However, the COMMIT command's purpose is not just to save changes. In fact, the ultimate purpose of the COMMIT and ROLLBACK commands (see Section 8.3.5) is to ensure database update integrity in transaction management. (You will see how such issues are addressed in Chapter 12, Managing Transactions and Concurrency.)
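A minimal sketch of COMMIT working together with ROLLBACK (covered in Section 8.3.5) is shown below. It assumes a session in which autocommit is switched off and a VENDOR table with the seven columns used in Section 8.3.1; vendor 21299 and its details are invented for illustration.

```sql
-- Assumes autocommit is off, so changes stay pending until COMMIT or ROLLBACK.
INSERT INTO VENDOR
VALUES (21299, 'Example Supplies', 'Brown', '0181', '555-0100', 'UK', 'N');
COMMIT;                                    -- the new vendor row is now permanent

DELETE FROM VENDOR WHERE V_CODE = 21299;
ROLLBACK;                                  -- the uncommitted DELETE is undone

-- The row is still present, because only the INSERT was committed.
SELECT V_CODE FROM VENDOR WHERE V_CODE = 21299;
```

In autocommit mode each statement is committed as soon as it runs, so the ROLLBACK would have nothing to undo and the vendor row would be gone.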

8.3.3 Listing Table Rows

Use the SELECT command to list the contents of a table. The syntax of the SELECT command is as follows:

    SELECT columnlist FROM tablename

The columnlist represents one or more attributes, separated by commas. You could use the * (asterisk) as a wildcard character to list all attributes. (A wildcard character is a symbol that can be used as a general substitute for other characters or commands.) For example, to list all attributes and all rows of the PRODUCT table, use:

    SELECT * FROM PRODUCT;

NOTE
The SELECT command is based on the relational operator SELECT, which was introduced in Chapter 4, Relational Algebra and Calculus. For example, the statement

    SELECT * FROM PRODUCT;

can be written in relational algebra as

    σ (PRODUCT)

Figure 8.3 shows the output generated by that command. (Figure 8.3 shows all of the rows in the PRODUCT table that serve as the basis for subsequent discussions. If you entered only the PRODUCT table's first two records, as shown in the preceding section, the output of the preceding SELECT command would show only the rows you entered. Don't worry about the difference between your SELECT output and the output shown in Figure 8.3. When you complete the work in this section, you will have created and populated your VENDOR and PRODUCT tables with the correct rows for use in future sections.)

NOTE
Your listing may not be in the order shown in Figure 8.3. The listings shown in the figure are the result of system-controlled primary-key-based index operations. You will learn later how to control the output so that it conforms to the order you have specified.
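The difference between the wildcard and an explicit columnlist can be sketched with the PRODUCT table used throughout this section; the choice of three columns is arbitrary.

```sql
-- All columns, all rows.
SELECT * FROM PRODUCT;

-- Only three named columns, still all rows. The output columns appear in the
-- order given in the column list, not in the order in which the table defines them.
SELECT P_DESCRIPT, P_CODE, P_PRICE
FROM   PRODUCT;
```

Listing the columns explicitly keeps the output stable if the table structure changes later, which is why it is often preferred to the wildcard outside quick interactive checks.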


FIGURE 8.3 The contents of the PRODUCT table

P_CODE    P_DESCRIPT                                    P_INDATE   P_QOH  P_MIN  P_PRICE  P_DISCOUNT  V_CODE
11QER/31  Power painter, 15 psi., 3-nozzle              03-Nov-18      8      5   109.99        0.00   25595
13-Q2/P2  7.25 cm pwr. saw blade                        13-Dec-18     32     15    14.99        0.05   21344
14-Q1/L3  9.00 cm pwr. saw blade                        13-Nov-18     18     12    17.49        0.00   21344
1546-QQ2  Hrd. cloth, 1/4 cm, 2 × 50                    15-Jan-19     15      8    39.95        0.00   23119
1558-QW1  Hrd. cloth, 1/2 cm, 3 × 50                    15-Jan-19     23      5    43.99        0.00   23119
2232/QTY  B&D jigsaw, 12 cm blade                       30-Dec-18      8      5   109.92        0.05   24288
2232/QWE  B&D jigsaw, 8 cm blade                        24-Dec-18      6      5    99.87        0.05   24288
2238/QPD  B&D cordless drill, 1/2 cm                    20-Jan-19     12      5    38.95        0.05   25595
23109-HB  Claw hammer                                   20-Jan-19     23     10     9.95        0.10   21225
23114-AA  Sledge hammer, 12 kg                          02-Jan-19      8      5    14.40        0.05
54778-2T  Rat-tail file, 1/8 cm fine                    15-Dec-18     43     20     4.99        0.00   21344
89-WRE-Q  Hicut chain saw, 16 cm                        07-Feb-19     11      5   256.99        0.05   24288
PVC23DRT  PVC pipe, 3.5 cm, 8 m                         20-Feb-19    188     75     5.87        0.00
SM-18277  1.25 cm metal screw, 25                       01-Mar-19    172     75     6.99        0.00   21225
SW-23116  2.5 cm wd. screw, 50                          24-Feb-19    237    100     8.45        0.00   21231
WR3/TT3   Steel matting, 4 m × 8 m × 1/6 m, .5 m mesh   17-Jan-19     18      5   119.95        0.10   25595

NOTE TO ORACLE USERS
Some SQL implementations (such as Oracle) cut the attribute labels to fit the width of the column. However, Oracle lets you set the width of the display column to show the complete attribute name. You can also change the display format, regardless of how the data are stored in the table. For example, if you want to display the euro symbols and commas in the P_PRICE output, you can declare:

    COLUMN P_PRICE FORMAT 99,999.99

to change the output 12347.67 to 12,347.67. In the same manner, to display only the first 12 characters of the P_DESCRIPT attribute, use:

    COLUMN P_DESCRIPT FORMAT A12 TRUNCATE

Although SQL commands can be grouped together on a single line, complex command sequences are best shown on separate lines, with space between the SQL command and the command's components. Using that formatting convention makes it much easier to see the components of the SQL statements, making it easy to trace the SQL logic and, if necessary, to make corrections. The number of spaces used in the indention is up to you. For example, note the following format for a more complex statement:


SELECT    P_CODE, P_DESCRIPT, P_INDATE, P_QOH, P_MIN, P_PRICE, P_DISCOUNT, V_CODE
FROM      PRODUCT;

When you run a SELECT command on a table, the RDBMS returns a set of one or more rows that have the same characteristics as a relational table. In addition, the SELECT command lists all rows from the table you specified in the FROM clause. This is a very important characteristic of SQL commands. By default, most SQL data manipulation commands operate over an entire table (or relation). That is why SQL commands are said to be set-orientated commands. A SQL set-orientated command works over a set of rows. The set may include one or more columns and zero or more rows from one or more tables.

8.3.4 Updating Table Rows
Use the UPDATE command to modify data in a table. The syntax for this command is:

UPDATE    tablename
SET       columnname = expression [, columnname = expression]
[WHERE    conditionlist ];

For example, if you want to change P_INDATE from 13 December 2018 to 18 January 2019 in the second row of the PRODUCT table (see Figure 8.3), use the primary key (13-Q2/P2) to locate the correct (second) row. Therefore, type:

UPDATE    PRODUCT
SET       P_INDATE = '18-JAN-2019'
WHERE     P_CODE = '13-Q2/P2';

If more than one attribute is to be updated in the row, separate the corrections with commas:

UPDATE    PRODUCT
SET       P_INDATE = '18-JAN-2019', P_PRICE = 17.99, P_MIN = 10
WHERE     P_CODE = '13-Q2/P2';

What would have happened if the previous UPDATE command had not included the WHERE condition? The P_INDATE, P_PRICE and P_MIN values would have been changed in all rows of the PRODUCT table. Remember, the UPDATE command is a set-oriented operator. Therefore, if you don't specify a WHERE condition, the UPDATE command applies the changes to all rows in the specified table.

Confirm the correction(s) by using this command to check the PRODUCT table's listing:

SELECT    *
FROM      PRODUCT;
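The effect of the WHERE condition on UPDATE can be checked with a minimal sketch. It assumes SQLite driven through Python's standard sqlite3 module rather than the Oracle, MySQL or Access systems discussed in the text; the three rows are sample data and the dates are written as ISO text because SQLite has no dedicated date type.

```python
import sqlite3

# In-memory SQLite database; the rows are illustrative sample data
# using a few of the PRODUCT columns from the chapter.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE PRODUCT (P_CODE TEXT PRIMARY KEY, P_INDATE TEXT, "
    "P_MIN INTEGER, P_PRICE REAL)"
)
conn.executemany(
    "INSERT INTO PRODUCT VALUES (?, ?, ?, ?)",
    [
        ("11QER/31", "2018-11-03", 5, 109.99),
        ("13-Q2/P2", "2018-12-13", 15, 14.99),
        ("14-Q1/L3", "2018-11-13", 12, 17.49),
    ],
)

# Targeted update: the WHERE clause limits the change to one row.
cur = conn.execute(
    "UPDATE PRODUCT SET P_INDATE = '2019-01-18', P_PRICE = 17.99, P_MIN = 10 "
    "WHERE P_CODE = '13-Q2/P2'"
)
rows_with_where = cur.rowcount

# No WHERE clause: the change is applied to every row in the table.
cur = conn.execute("UPDATE PRODUCT SET P_MIN = 10")
rows_without_where = cur.rowcount

print(rows_with_where, rows_without_where)
```

The row counts reported by the driver make the set-oriented behaviour visible: one row is touched when the primary key is named, and all rows are touched when the condition is omitted.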

8.3.5 Restoring Table Contents
If you have not yet used the COMMIT command to store the changes permanently in the database, you can restore the database to its previous condition with the ROLLBACK command. ROLLBACK undoes any changes and brings the data back to the values that existed before the changes were made. To restore the data to their prechange condition, type:

ROLLBACK;


and press Enter. Use the SELECT statement again to see that the ROLLBACK did, in fact, restore the data to their original values.

NOTE TO MICROSOFT ACCESS USERS
Microsoft Access doesn't support the COMMIT and ROLLBACK commands. The lack of commands such as COMMIT, ROLLBACK, etc. illustrates one of the key differences between Microsoft Access and enterprise databases such as MySQL and Oracle. Enterprise databases are designed to support large multi-user environments and need robust data integrity controls.

COMMIT and ROLLBACK only work with data manipulation commands that are used to add, modify or delete table rows. For example, assume that you perform these actions:
- CREATE a table called SALES.
- INSERT ten rows in the SALES table.
- UPDATE two rows in the SALES table.
- Execute the ROLLBACK command.
Will the SALES table be removed by the ROLLBACK command? No, the ROLLBACK command will undo only the results of the INSERT and UPDATE commands. All data definition commands (CREATE TABLE) are automatically committed to the data dictionary and cannot be rolled back. The COMMIT and ROLLBACK commands are examined in greater detail in Chapter 9.

Some RDBMSs, such as Oracle, automatically COMMIT data changes when issuing data definition commands. For example, if you had used the CREATE INDEX command after updating the two rows in the previous example, all previous changes would have been committed automatically; doing a ROLLBACK afterwards wouldn't have undone anything. Check your RDBMS manual to understand these subtle differences.
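The SALES scenario can be tried out with a minimal sketch, assuming SQLite through Python's standard sqlite3 module. The column names S_ID and S_AMOUNT are hypothetical. The table is created before the transaction starts, so the example does not depend on how a particular engine treats data definition commands.

```python
import sqlite3

# isolation_level=None switches off the module's implicit transaction
# handling, so BEGIN / COMMIT / ROLLBACK are issued explicitly below.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute("CREATE TABLE SALES (S_ID INTEGER PRIMARY KEY, S_AMOUNT REAL)")

conn.execute("BEGIN")
conn.executemany(
    "INSERT INTO SALES VALUES (?, ?)", [(i, 10.0 * i) for i in range(1, 11)]
)
conn.execute("UPDATE SALES SET S_AMOUNT = 0 WHERE S_ID IN (1, 2)")
conn.execute("ROLLBACK")  # undoes the INSERT and the UPDATE above

rows_after_rollback = conn.execute("SELECT COUNT(*) FROM SALES").fetchone()[0]

conn.execute("BEGIN")
conn.execute("INSERT INTO SALES VALUES (1, 10.0)")
conn.execute("COMMIT")    # the row is now stored permanently
conn.execute("BEGIN")
conn.execute("ROLLBACK")  # no uncommitted change is left to undo

rows_after_commit = conn.execute("SELECT COUNT(*) FROM SALES").fetchone()[0]
print(rows_after_rollback, rows_after_commit)
```

After the first ROLLBACK the table still exists but is empty; after the COMMIT, a later ROLLBACK leaves the committed row in place.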

8.3.6 Deleting Table Rows
It is easy to delete a table row using the DELETE statement; the syntax is:

DELETE FROM    tablename
[WHERE         conditionlist ];

For example, if you want to delete from the PRODUCT table the product that you added earlier whose product code (P_CODE) is 'BRT-345', use:

DELETE FROM    PRODUCT
WHERE          P_CODE = 'BRT-345';

In that example, the primary key value lets SQL find the exact record to be deleted. However, deletions are not limited to a primary key match; any attribute may be used. For example, if you examine your PRODUCT table, you will see that there are several products for which the P_MIN attribute is equal to 5. Use the following command to delete all rows from the PRODUCT table for which the P_MIN is equal to 5:

DELETE FROM    PRODUCT
WHERE          P_MIN = 5;

Check the PRODUCT table's contents again to verify that all products with P_MIN equal to 5 have been deleted.
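The difference between a primary key match and a match on another attribute can be sketched as follows, assuming SQLite through Python's standard sqlite3 module. The four rows are sample data, and the P_MIN value given to 'BRT-345' is made up for the illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE PRODUCT (P_CODE TEXT PRIMARY KEY, P_MIN INTEGER)")
conn.executemany(
    "INSERT INTO PRODUCT VALUES (?, ?)",
    [("BRT-345", 7), ("2232/QTY", 5), ("2232/QWE", 5), ("23109-HB", 10)],
)

# Primary-key match: at most one row can qualify.
deleted_by_key = conn.execute(
    "DELETE FROM PRODUCT WHERE P_CODE = 'BRT-345'"
).rowcount

# Non-key match: every row with P_MIN = 5 qualifies.
deleted_by_min = conn.execute("DELETE FROM PRODUCT WHERE P_MIN = 5").rowcount

remaining = [row[0] for row in conn.execute("SELECT P_CODE FROM PRODUCT")]
print(deleted_by_key, deleted_by_min, remaining)
```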


Finally, remember that DELETE is a set-oriented command. And keep in mind that the WHERE condition is optional. Therefore, if you do not specify a WHERE condition, all rows from the specified table are deleted!

8.3.7 Inserting Table Rows with a Select Subquery
You learnt in Section 8.3.1 how to use the INSERT statement to add rows to a table. In that section, you added rows one at a time. In this section, you learn how to add multiple rows to a table, using another table as the source of the data. The syntax for the INSERT statement is:

INSERT INTO    tablename    SELECT    columnlist    FROM    tablename;

In that case, the INSERT statement uses a SELECT subquery. A subquery, also known as a nested query or an inner query, is a query that is embedded (or nested) inside another query. The inner query is always executed first by the RDBMS. Given the previous SQL statement, the INSERT portion represents the outer query and the SELECT portion represents the inner query, or subquery. You can nest queries (place queries inside queries) many levels deep; in every case, the output of the inner (lower-level) query is used as the input for the outer (higher-level) query. In Chapter 9, Procedural Language SQL and Advanced SQL, you will learn more about the different types of subqueries.

The values returned by the SELECT subquery should match the attributes and data types of the table in the INSERT statement. If the table into which you are inserting rows has one date attribute, one number attribute and one character attribute, the SELECT subquery should return one or more rows in which the first column has date values, the second column has number values and the third column has character values.

Populating the VENDOR and PRODUCT Tables
The following steps guide you through the process of populating the VENDOR and PRODUCT tables with the data to be used in the rest of the chapter. To accomplish that task, two tables named V and P are used as the data source. Tables V and P have the same table structure (attributes) as the VENDOR and PRODUCT tables.

Online Content
The following sections assume that the database has been restored to its original condition. Therefore, you must do the following:
- If you are using Oracle or MySQL, run the sqlintrodbinit.sql script file located in either the MySQL or Oracle folder hosted on the online platform to create all tables and load the data in the database. To connect to the database, follow the instructions specific to your school setup provided by your instructor.
- If you are using Microsoft Access, copy the original 'Ch08_SaleCo.mdb' file available to download from the online platform for this book.

Use the following steps to populate your VENDOR and PRODUCT tables. (If you haven't already created the PRODUCT and VENDOR tables to practise the SQL commands in the previous sections, do so before completing these steps.)

- Delete all rows from the PRODUCT and VENDOR tables:
  DELETE FROM PRODUCT;
  DELETE FROM VENDOR;
- Add the rows to VENDOR by copying all rows from V.
  If you are using Microsoft Access, type: INSERT INTO VENDOR SELECT * FROM V;
  If you are using Oracle or MySQL, type: INSERT INTO VENDOR SELECT * FROM TEACHER.V;
- Add the rows to PRODUCT by copying all rows from P.
  If you are using Microsoft Access, type: INSERT INTO PRODUCT SELECT * FROM P;
  If you are using Oracle or MySQL, type: INSERT INTO PRODUCT SELECT * FROM TEACHER.P;
- Oracle users must permanently save the changes: COMMIT;

If you followed those steps correctly, you now have the VENDOR and PRODUCT tables populated with the data that are used in the remaining sections of the chapter.
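The copy step can be sketched in miniature, assuming SQLite through Python's standard sqlite3 module. The V_NAME column and the three vendor rows are hypothetical stand-ins for the real V table, and no TEACHER schema prefix is used because the sketch has a single in-memory database.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# V is the source table and VENDOR the target; both share one structure.
# The column names and rows are illustrative only.
for name in ("V", "VENDOR"):
    conn.execute(
        f"CREATE TABLE {name} (V_CODE INTEGER PRIMARY KEY, V_NAME TEXT)"
    )
conn.executemany(
    "INSERT INTO V VALUES (?, ?)",
    [(21225, "Vendor A"), (21344, "Vendor B"), (24288, "Vendor C")],
)

# The inner SELECT runs first; its result set feeds the outer INSERT.
conn.execute("INSERT INTO VENDOR SELECT * FROM V")
conn.commit()

copied = conn.execute("SELECT COUNT(*) FROM VENDOR").fetchone()[0]
print(copied)
```

Because the subquery returns the same number of columns with matching data types, one statement copies every row from the source table.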

Online Content
If you are using Oracle or MySQL, you can run the sqlintrodbinit.sql script file located in either the MySQL or Oracle folder hosted on the online platform to create all tables and load the data in the database. This script file populates the remaining tables (CUSTOMER, INVOICE, LINE, EMP and EMPLOYEE). To connect to the database, follow the instructions specific to your college or university setup provided by your instructor.

8.4 SELECT QUERIES
In this section, you will learn how to fine-tune the SELECT command by adding restrictions to the search criteria. SELECT, coupled with appropriate search conditions, is an incredibly powerful tool that enables you to transform data into information. For example, in the following sections, you learn how to create queries that can be used to answer questions such as these: 'Which products were supplied by a particular vendor?', 'Which products are priced below 10?', 'How many products supplied by a given vendor were sold between 5 January 2019 and 20 March 2019?'

8.4.1 Selecting Rows with Conditional Restrictions
You can select partial table contents by placing restrictions on the rows to be included in the output. To do this, add conditional restrictions to the SELECT statement, using the WHERE clause. The following syntax enables you to specify which rows to select:

SELECT    columnlist
FROM      tablelist
[WHERE    conditionlist ];

The SELECT statement retrieves all rows that match the specified condition(s), also known as the conditional criteria, you specified in the WHERE clause. The conditionlist in the WHERE clause of the SELECT statement is represented by one or more conditional expressions separated by logical operators. The WHERE clause is optional. If no rows match the specified criteria in the WHERE clause, you may see a blank screen or a message that tells you that no rows were retrieved. For example, the query:

SELECT    P_DESCRIPT, P_INDATE, P_PRICE, V_CODE
FROM      PRODUCT
WHERE     V_CODE = 21344;

returns the description, date, and price of products with a vendor code of 21344, as shown in Figure 8.4.

FIGURE 8.4 Selected PRODUCT table attributes for vendor code 21344
P_DESCRIPT                    P_INDATE    P_PRICE   V_CODE
7.25 cm pwr. saw blade        13-Dec-18     14.99    21344
9.00 cm pwr. saw blade        13-Nov-18     17.49    21344
Rat-tail file, 1/8-in. fine   15-Dec-18      4.99    21344

NOTE
The query:

SELECT    P_DESCRIPT, P_INDATE, P_PRICE, V_CODE
FROM      PRODUCT
WHERE     V_CODE = 21344;

comprises both the relational algebra SELECT and PROJECT operators and can be written in relational algebra as:

Π_{p_descript, p_indate, p_price, v_code}(σ_{v_code = 21344}(PRODUCT))

For more information on the SELECT and PROJECT operators, see Sections 4.1.1 and 4.1.2 in Chapter 4, Relational Algebra and Calculus.

Microsoft Access users can use the Access QBE (query by example) query generator. Although the Access QBE generates its own native version of SQL, you can also choose to type standard SQL in the Access SQL window, as shown at the bottom of Figure 8.5. Figure 8.5 shows the Access QBE screen, the SQL window's QBE-generated SQL, and the listing of the modified SQL.

NOTE TO MICROSOFT ACCESS USERS
The Microsoft Access QBE interface automatically designates the data source by using the table name as a prefix. You will discover later that the table name prefix is used to avoid ambiguity when the same column name appears in multiple tables. For example, both the VENDOR and the PRODUCT tables contain the V_CODE attribute. Therefore, if both tables are used, as they would be in a join, the source of the V_CODE attribute must be specified.

FIGURE 8.5 The Microsoft Access QBE and its SQL
(Screenshot labels: Query view options; Microsoft Access-generated SQL; User-entered SQL)
SOURCE: Course Technology/Cengage Learning

Numerous conditional restrictions can be placed on the selected table contents. For example, use the comparison operators shown in Table 8.7 to restrict output.

TABLE 8.7 Comparison operators
Symbol      Meaning
=           Equal to
<           Less than
<=          Less than or equal to
>           Greater than
>=          Greater than or equal to
<> or !=    Not equal to

The following example uses the not equal to operator:

SELECT    P_DESCRIPT, P_INDATE, P_PRICE, V_CODE
FROM      PRODUCT
WHERE     V_CODE <> 21344;

The output, shown in Figure 8.6, lists all of the rows for which the vendor code is not 21344.

FIGURE 8.6 Selected PRODUCT table attributes for vendor codes other than 21344
P_DESCRIPT                                  P_INDATE    P_PRICE   V_CODE
Power painter, 15 psi., 3-nozzle            03-Nov-18    109.99    25595
Hrd. cloth, 1/4 cm, 2 × 50                  15-Jan-19     39.95    23119
Hrd. cloth, 1/2 cm, 3 × 50                  15-Jan-19     43.99    23119
B&D jigsaw, 12 cm blade                     30-Dec-18    109.92    24288
B&D jigsaw, 3 cm blade                      24-Dec-18     99.87    24288
B&D cordless drill, 1/2 cm                  20-Jan-19     38.95    25595
Claw hammer                                 20-Jan-19      9.95    21225
Hicut chain saw, 16 cm                      07-Feb-19    256.99    24288
1.25 cm metal screw, 25                     01-Mar-19      6.99    21225
2.5 cm wd. screw, 50                        24-Feb-19      8.45    21231
Steel matting, 4 × 8 × 1/6, .5 cm mesh      17-Jan-19    119.95    25595

As you examine Figure 8.6, note that rows with nulls in the V_CODE column (see Figure 8.3) are not included in the SELECT command's output.

The command sequence:

SELECT    P_DESCRIPT, P_QOH, P_MIN, P_PRICE
FROM      PRODUCT
WHERE     P_PRICE <= 10;

yields the output shown in Figure 8.7.

FIGURE 8.7 Selected PRODUCT table attributes with a P_PRICE restriction
P_DESCRIPT                    P_QOH   P_MIN   P_PRICE
Claw hammer                      23      10      9.95
Rat-tail file, 1/8 cm fine       43      20      4.99
PVC pipe, 3.5 cm, 8 m           188      75      5.87
1.25 cm metal screw, 25         172      75      6.99
2.5 cm wd. screw, 50            237     100      8.45
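The way comparison operators treat nulls can be tried out with a minimal sketch, assuming SQLite through Python's standard sqlite3 module. The three rows are a cut-down sample, and the IS NULL test in the last query is standard SQL that this section has not introduced.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE PRODUCT (P_DESCRIPT TEXT, P_PRICE REAL, V_CODE INTEGER)"
)
conn.executemany(
    "INSERT INTO PRODUCT VALUES (?, ?, ?)",
    [
        ("Claw hammer", 9.95, 21225),
        ("Rat-tail file", 4.99, 21344),
        ("Sledge hammer", 14.40, None),  # no vendor recorded (null V_CODE)
    ],
)


def descriptions(condition):
    """Return the P_DESCRIPT values of rows matching a WHERE condition."""
    sql = f"SELECT P_DESCRIPT FROM PRODUCT WHERE {condition} ORDER BY P_DESCRIPT"
    return [row[0] for row in conn.execute(sql)]


equal = descriptions("V_CODE = 21344")
not_equal = descriptions("V_CODE <> 21344")  # the null row is not returned
cheap = descriptions("P_PRICE <= 10")
no_vendor = descriptions("V_CODE IS NULL")   # nulls need an explicit test

print(equal, not_equal, cheap, no_vendor)
```

The row with a null V_CODE appears in neither the "equal" nor the "not equal" result, which matches the behaviour noted for Figure 8.6.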

Using Comparison Operators on Character Attributes
Because computers identify all characters by their (numeric) American Standard Code for Information Interchange (ASCII) codes, comparison operators may even be used to place restrictions on character-based attributes. Therefore, the command:

SELECT    P_CODE, P_DESCRIPT, P_QOH, P_MIN, P_PRICE
FROM      PRODUCT
WHERE     P_CODE < '1558-QW1';

would be correct and would yield a list of all rows in which the P_CODE is alphabetically less than 1558-QW1. (Because the ASCII code value for the letter B is greater than the value of the letter A, it follows that A is less than B.) Therefore, the output is generated as shown in Figure 8.8.

FIGURE 8.8 Selected PRODUCT table attributes; the ASCII code effect
P_CODE     P_DESCRIPT                          P_QOH   P_MIN   P_PRICE
11QER/31   Power painter, 15 psi., 3-nozzle        8       5    109.99
13-Q2/P2   7.25 cm pwr. saw blade                 32      15     14.99
14-Q1/L3   9.00 cm pwr. saw blade                 18      12     17.49
1546-QQ2   Hrd. cloth, 1/4 cm, 2 × 50             15       8     39.95

String (character) comparisons are made from left to right. This left-to-right comparison is especially useful when attributes such as names are to be compared. For example, the string 'Ardmore' would be judged greater than the string 'Aarenson' but less than the string 'Brown'; use such results to generate alphabetical listings like those found in a phone directory. If the characters 0-9 are stored as strings, the same left-to-right string comparisons can lead to apparent anomalies. For example, the ASCII code for the character '5' is, as expected, greater than the ASCII code for the character '4'. Yet the same '5' will also be judged greater than the string '44' because the first character in the string '44' is less than the string '5'. For that reason, you may get some unexpected results from comparisons when dates or other numbers are stored in character format. For example, the left-to-right ASCII character comparison would force the conclusion that the date '01/01/2019' occurred before '12/31/2018'. Since the leftmost character '0' in '01/01/2019' is less than the leftmost character '1' in '12/31/2018', '01/01/2019' is less than '12/31/2018'. Naturally, if date strings are stored in a yyyy/mm/dd format, the comparisons will yield appropriate results, but this is a non-standard date presentation. That's why all current RDBMSs support date data types, and that's why you should use them. In addition, using date data types gives you the benefit of date arithmetic.

Using Comparison Operators on Dates
Date procedures are often more software specific than other SQL procedures. For example, the query to list all of the rows in which the inventory stock dates occur on or after 20 January 2019 will look like this:

SELECT    P_DESCRIPT, P_QOH, P_MIN, P_PRICE, P_INDATE
FROM      PRODUCT
WHERE     P_INDATE >= '20-Jan-2019';

(Remember that Microsoft Access users must use the # delimiters for dates. For example, you would use #20-Jan-19# in the above WHERE clause). The date-restricted output is shown in Figure 8.9.

FIGURE 8.9 Selected PRODUCT table attributes: date restriction
P_DESCRIPT                      P_QOH   P_MIN   P_PRICE   P_INDATE
B&D cordless drill, 1.25 cm        12       5     38.95   20-Jan-19
Claw hammer                        23      10      9.95   20-Jan-19
Hicut chain saw, 40 cm             11       5    256.99   07-Feb-19
PVC pipe, 9 cm, 2.5 m             188      75      5.87   20-Feb-19
3 cm metal screw, 25              172      75      6.99   01-Mar-19
6 cm wd. screw, 50                237     100      8.45   24-Feb-19
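The left-to-right character comparisons described in this section can be checked with a minimal sketch, assuming SQLite through Python's standard sqlite3 module and its default binary text ordering. SQLite reports the result of a comparison as 1 (true) or 0 (false); the yyyy-mm-dd strings in the last query are an assumption used to show a text layout that happens to sort in calendar order.

```python
import sqlite3

conn = sqlite3.connect(":memory:")


def scalar(sql):
    """Run a one-row, one-column query and return its single value."""
    return conn.execute(sql).fetchone()[0]


# Character comparisons proceed left to right, one character at a time.
ardmore_gt_aarenson = scalar("SELECT 'Ardmore' > 'Aarenson'")
ardmore_lt_brown = scalar("SELECT 'Ardmore' < 'Brown'")

# Digits stored as text: '5' is compared with the first '4' of '44'.
five_gt_44 = scalar("SELECT '5' > '44'")

# Dates stored as mm/dd/yyyy text compare in the wrong order...
text_date_before = scalar("SELECT '01/01/2019' < '12/31/2018'")

# ...whereas year-first text compares in calendar order.
iso_date_before = scalar("SELECT '2019-01-01' < '2018-12-31'")

print(ardmore_gt_aarenson, ardmore_lt_brown, five_gt_44,
      text_date_before, iso_date_before)
```

The fourth result is the anomaly discussed above: as text, 1 January 2019 is judged to come before 31 December 2018.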

Using Computed Columns and Column Aliases
Suppose you want to determine the total value of each of the products currently held in inventory. Logically, that determination requires the multiplication of each product's quantity on hand by its current price. You can accomplish this task with the following command:

SELECT    P_DESCRIPT, P_QOH, P_PRICE, P_QOH * P_PRICE
FROM      PRODUCT;

Entering that SQL command in Access generates the output shown in Figure 8.10.

FIGURE 8.10 SELECT statement with a computed column in Access

SQL accepts any valid expressions (or formulas) in the computed columns. Such formulas can contain any valid mathematical operators and functions that are applied to attributes in any of the tables specified in the FROM clause of the SELECT statement. Note also that Access automatically adds an Expr label to all computed columns. (The first computed column would be labelled Expr1; the second, Expr2; and so on.) Oracle uses the actual formula text as the label for the computed column.

To make the output more readable, the SQL standard permits the use of aliases for any column in a SELECT statement. An alias is an alternative name given to a column or table in any SQL statement.

For example, you can rewrite the previous SQL statement as:

SELECT    P_DESCRIPT, P_QOH, P_PRICE, P_QOH * P_PRICE AS TOTVALUE
FROM      PRODUCT;

The output of that command is shown in Figure 8.11.

FIGURE 8.11 SELECT statement with a computed column and an alias
P_DESCRIPT                                     P_QOH   P_PRICE   TOTVALUE
Power painter, 15 psi., 3-nozzle                   8    109.99     879.92
7.25 cm pwr. saw blade                            32     14.99     479.68
9.00 cm pwr. saw blade                            18     17.49     314.82
Hrd. cloth, 1/4 cm, 2 × 50                        15     39.95     599.25
Hrd. cloth, 1/2 cm, 3 × 50                        23     43.99    1011.77
B&D jigsaw, 12 cm blade                            8    109.92     879.36
B&D jigsaw, 8 cm blade                             6     99.87     599.22
B&D cordless drill, 1/2 cm                        12     38.95     467.40
Claw hammer                                       23      9.95     228.85
Sledge hammer, 12 kg                               8     14.40     115.20
Rat-tail file, 1/8 cm fine                        43      4.99     214.57
Hicut chain saw, 16 cm                            11    256.99    2826.89
PVC pipe, 3.5 cm, 8 m                            188      5.87    1103.56
1.25 cm metal screw, 25                          172      6.99    1202.28
2.5 cm wd. screw, 50                             237      8.45    2002.65
Steel matting, 4 × 8 × 1/6 cm, .5 cm mesh         18    119.95    2159.10

You could also use a computed column, an alias and date arithmetic in a single query. For example, assume that you want to get a list of out-of-warranty products that have been stored more than 90 days. In that case, the P_INDATE is at least 90 days less than the current (system) date. The Microsoft Access version of this query is shown as:
SELECT

P_CODE, P_INDATE, DATE() - 90 AS CUTDATE
FROM      PRODUCT
WHERE     P_INDATE <= DATE() - 90;

The Oracle version of the same query is shown below:

SELECT    P_CODE, P_INDATE, SYSDATE - 90 AS CUTDATE
FROM      PRODUCT
WHERE     P_INDATE <= SYSDATE - 90;

Note that DATE() and SYSDATE are special functions that return today's date in Microsoft Access and Oracle, respectively. You could use the DATE() and SYSDATE functions anywhere a date literal is expected, such as in the value list of an INSERT statement, in an UPDATE statement when changing the value of a date attribute or in a SELECT statement as shown here. Of course, the previous query output changes based on today's date.

Suppose a manager wants a list of all products, the dates they were received and the warranty expiration date (90 days from when the product was received). To generate that list, type:

SELECT    P_CODE, P_INDATE, P_INDATE + 90 AS EXPDATE
FROM      PRODUCT;

Note that you can use all arithmetic operators with date attributes as well as with numeric attributes.

8.4.2 Arithmetic Operators: The Rule of Precedence
As you saw in the previous example, you can use arithmetic operators with table attributes in a column list or in a conditional expression. In fact, SQL commands are often used in conjunction with the arithmetic operators shown in Table 8.8.

TABLE 8.8 The arithmetic operators
Arithmetic Operator   Description
+                     Add
-                     Subtract
*                     Multiply
/                     Divide
^                     Raise to the power of (some applications use ** instead of ^)

Do not confuse the multiplication symbol (*) with the wildcard symbol used by some SQL implementations, such as Microsoft Access; the latter is used only in string comparisons, while the former is used in conjunction with mathematical procedures.

As you perform mathematical operations on attributes, remember the rules of precedence. As the name suggests, the rules of precedence are the rules that establish the order in which computations are completed. For example, note the order of the following computational sequence:
1 Perform operations within parentheses
2 Perform power operations
3 Perform multiplications and divisions
4 Perform additions and subtractions

The application of the rules of precedence will tell you that 8 + 2 * 5 = 8 + 10 = 18, but (8 + 2) * 5 = 10 * 5 = 50. Similarly, 4 + 5^2 * 3 = 4 + 25 * 3 = 79, but (4 + 5)^2 * 3 = 81 * 3 = 243, while the operation expressed by (4 + 5^2) * 3 yields the answer (4 + 25) * 3 = 29 * 3 = 87.
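A computed column with an alias, together with the rules of precedence for addition and multiplication, can be sketched as follows. The sketch assumes SQLite through Python's standard sqlite3 module and uses two sample rows; the power operator is left out because its spelling differs between systems.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE PRODUCT (P_DESCRIPT TEXT, P_QOH INTEGER, P_PRICE REAL)"
)
conn.executemany(
    "INSERT INTO PRODUCT VALUES (?, ?, ?)",
    [("Claw hammer", 23, 9.95), ("Sledge hammer, 12 kg", 8, 14.40)],
)

# Computed column: quantity on hand times price, labelled with an alias.
cur = conn.execute(
    "SELECT P_DESCRIPT, P_QOH * P_PRICE AS TOTVALUE "
    "FROM PRODUCT ORDER BY P_DESCRIPT"
)
labels = [col[0] for col in cur.description]
totals = [(descript, round(value, 2)) for descript, value in cur]

# Multiplication is performed before addition unless parentheses intervene.
no_parens, with_parens = conn.execute(
    "SELECT 8 + 2 * 5, (8 + 2) * 5"
).fetchone()

print(labels, totals, no_parens, with_parens)
```

The alias becomes the column label in the result, and the two totals agree with the TOTVALUE figures listed for the same products in Figure 8.11.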

8.4.3 Logical Operators: AND, OR and NOT

In the real world, a search of data normally involves multiple conditions. For example, when you are buying a new house, you look for a certain area, three bedrooms, two and a half bathrooms, two stories and so on. In the same way, SQL allows you to have multiple conditions in a query through the use of logical operators. The logical operators are AND, OR and NOT. For example, if you want a list of the table contents for either the V_CODE = 21344 OR the V_CODE = 24288, you can use the following command sequence:

SELECT P_DESCRIPT, P_INDATE, P_PRICE, V_CODE
FROM PRODUCT
WHERE V_CODE = 21344
OR V_CODE = 24288;

That command generates the six rows shown in Figure 8.12 that match the logical restriction.

FIGURE 8.12 Select PRODUCT table attributes: logical OR

P_DESCRIPT                    P_INDATE     P_PRICE   V_CODE
18 cm pwr. saw blade          13-Dec-18      14.99    21344
22 cm pwr. saw blade          13-Nov-18      17.49    21344
B&D jigsaw, 30 cm blade       30-Dec-18     109.92    24288
B&D jigsaw, 20 cm blade       24-Dec-18      99.87    24288
Rat-tail file, 0.3 cm fine    15-Dec-18       4.99    21344
Hicut chain saw, 40 cm        07-Feb-19     256.99    24288

The logical AND has the same SQL syntax requirement. The following command generates a list of all rows for which P_PRICE is less than 50 AND for which P_INDATE is a date occurring after 15 January 2019:

SELECT P_DESCRIPT, P_INDATE, P_PRICE, V_CODE
FROM PRODUCT
WHERE P_PRICE < 50
AND P_INDATE > '15-Jan-2019';

This command produces the output shown in Figure 8.13.

FIGURE 8.13 Select PRODUCT table attributes: logical AND

P_DESCRIPT                    P_INDATE     P_PRICE   V_CODE
B&D cordless drill, 1.25 cm   20-Jan-19      38.95    25595
Claw hammer                   20-Jan-19       9.95    21225
PVC pipe, 9 cm, 2.5 m         20-Feb-19       5.87
3 cm metal screw, 25          01-Mar-19       6.99    21225
6 cm wd. screw, 50            24-Feb-19       8.45    21231
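The OR and AND restrictions described above can be tried without a full DBMS. The sketch below uses Python's built-in sqlite3 module with an in-memory database; the five rows are an abbreviated, illustrative subset of the PRODUCT data, and the dates are stored as ISO strings (YYYY-MM-DD), an assumption made so that SQLite's text comparison follows calendar order.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE PRODUCT (P_DESCRIPT TEXT, P_INDATE TEXT, P_PRICE REAL, V_CODE INTEGER)"
)
# Illustrative subset only; dates use ISO format so string order equals date order.
conn.executemany(
    "INSERT INTO PRODUCT VALUES (?, ?, ?, ?)",
    [
        ("18 cm pwr. saw blade", "2018-12-13", 14.99, 21344),
        ("B&D jigsaw, 30 cm blade", "2018-12-30", 109.92, 24288),
        ("Claw hammer", "2019-01-20", 9.95, 21225),
        ("Hicut chain saw, 40 cm", "2019-02-07", 256.99, 24288),
        ("PVC pipe, 9 cm, 2.5 m", "2019-02-20", 5.87, None),
    ],
)

# Logical OR: a row qualifies if either condition is true.
rows_or = [r[0] for r in conn.execute(
    "SELECT P_DESCRIPT FROM PRODUCT "
    "WHERE V_CODE = 21344 OR V_CODE = 24288 ORDER BY P_INDATE"
)]

# Logical AND: a row qualifies only if both conditions are true.
rows_and = [r[0] for r in conn.execute(
    "SELECT P_DESCRIPT FROM PRODUCT "
    "WHERE P_PRICE < 50 AND P_INDATE > '2019-01-15' ORDER BY P_INDATE"
)]

print(rows_or)
print(rows_and)
```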

You can combine the logical OR with the logical AND to place further restrictions on the output. For example, suppose you want a table listing for the following conditions:

The P_INDATE is after 15 January 2019, and the P_PRICE is less than 50.

Or the V_CODE is 24288.

The required listing can be produced by using:

SELECT P_DESCRIPT, P_INDATE, P_PRICE, V_CODE
FROM PRODUCT
WHERE (P_PRICE < 50 AND P_INDATE > '15-Jan-2019')
OR V_CODE = 24288;

Note the use of parentheses to combine logical restrictions. Where you place the parentheses depends on how you want the logical restrictions to be executed. Conditions listed within parentheses are always executed first. The preceding query yields the output shown in Figure 8.14.

FIGURE 8.14 Select PRODUCT table attributes: logical AND and OR

P_DESCRIPT                    P_INDATE     P_PRICE   V_CODE
B&D jigsaw, 30 cm blade       30-Dec-18     109.92    24288
B&D jigsaw, 20 cm blade       24-Dec-18      99.87    24288
B&D cordless drill, 1.25 cm   20-Jan-19      38.95    25595
Claw hammer                   20-Jan-19       9.95    21225
Hicut chain saw, 40 cm        07-Feb-19     256.99    24288
PVC pipe, 9 cm, 2.5 m         20-Feb-19       5.87
3 cm metal screw, 25          01-Mar-19       6.99    21225
6 cm wd. screw, 50            24-Feb-19       8.45    21231

Note that the three rows with the V_CODE = 24288 are included regardless of the P_INDATE and P_PRICE entries for those rows.

The use of the logical operators OR and AND can become quite complex when numerous restrictions are placed on the query. In fact, a specialty field in mathematics known as Boolean algebra is dedicated to the use of logical operators.

The logical operator NOT is used to negate the result of a conditional expression. That is, in SQL, all conditional expressions evaluate to true or false. If an expression is true, the row is selected; if an expression is false, the row is not selected. The NOT logical operator is typically used to find the rows that do not match a certain condition. For example, if you want to see a listing of all rows for which the vendor code is not 21344, use the command sequence:

SELECT *
FROM PRODUCT
WHERE NOT (V_CODE = 21344);

Note that the condition is enclosed in parentheses; that practice is optional, but it is highly recommended for clarity. The logical NOT can be combined with AND and OR.
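To see how the placement of parentheses changes a combined AND/OR restriction, and what NOT returns, the following sketch again uses Python's sqlite3 module with a small illustrative subset of PRODUCT rows and ISO-format dates (both assumptions, not the book's data files). The helper function descriptions is hypothetical and exists only to keep the example short. One detail worth noticing: a row whose V_CODE is null is not returned by NOT (V_CODE = 21344), because a comparison with a null is neither true nor false.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE PRODUCT (P_DESCRIPT TEXT, P_INDATE TEXT, P_PRICE REAL, V_CODE INTEGER)"
)
conn.executemany(
    "INSERT INTO PRODUCT VALUES (?, ?, ?, ?)",
    [
        ("18 cm pwr. saw blade", "2018-12-13", 14.99, 21344),
        ("B&D jigsaw, 30 cm blade", "2018-12-30", 109.92, 24288),
        ("Claw hammer", "2019-01-20", 9.95, 21225),
        ("Hicut chain saw, 40 cm", "2019-02-07", 256.99, 24288),
        ("PVC pipe, 9 cm, 2.5 m", "2019-02-20", 5.87, None),  # no vendor assigned
    ],
)


def descriptions(where_clause):
    """Hypothetical helper: run one WHERE clause, return descriptions by date."""
    sql = "SELECT P_DESCRIPT FROM PRODUCT WHERE " + where_clause + " ORDER BY P_INDATE"
    return [row[0] for row in conn.execute(sql)]


# Parentheses around the AND pair: cheap recent products, or anything from vendor 24288.
grouped_and = descriptions("(P_PRICE < 50 AND P_INDATE > '2019-01-15') OR V_CODE = 24288")

# Parentheses around the OR pair: the price limit now applies to every row.
grouped_or = descriptions("P_PRICE < 50 AND (P_INDATE > '2019-01-15' OR V_CODE = 24288)")

# NOT negates the condition; the row with a null V_CODE is not selected.
not_21344 = descriptions("NOT (V_CODE = 21344)")

print(grouped_and)
print(grouped_or)
print(not_21344)
```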

NOTE
If your SQL version does not support the logical NOT, you can generate the required output by using the condition:

WHERE V_CODE <> 21344

If your version of SQL does not support <>, use:

WHERE V_CODE != 21344

8.4.4 Special Operators

ANSI-standard SQL allows the use of special operators in conjunction with the WHERE clause. These special operators include:

BETWEEN: Used to check whether an attribute value is within a range.
IS NULL: Used to check whether an attribute value is null.
LIKE: Used to check whether an attribute value matches a given string pattern.
IN: Used to check whether an attribute value matches any value within a value list.
EXISTS: Used to check whether a subquery returns any rows.

The BETWEEN Special Operator

If you use software that implements a standard SQL, the operator BETWEEN may be used to check whether an attribute value is within a range of values. For example, if you want to see a listing for all products whose prices are between 50 and 100, use the following command sequence:

SELECT *
FROM PRODUCT
WHERE P_PRICE BETWEEN 50.00 AND 100.00;

NOTE TO ORACLE AND MYSQL USERS
Always specify the lower range value first when using the BETWEEN special operator. If you list the higher range value first, Oracle returns an empty result set.

If your DBMS does not support BETWEEN, you can use the equivalent pair of comparisons (BETWEEN includes both end values of the range):

SELECT *
FROM PRODUCT
WHERE P_PRICE >= 50.00 AND P_PRICE <= 100.00;
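The behaviour of BETWEEN at the ends of the range is easy to verify. This is a minimal sketch using Python's sqlite3 module; the product codes P1 to P5 and their prices are hypothetical values chosen so that two rows sit exactly on the range limits.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE PRODUCT (P_CODE TEXT PRIMARY KEY, P_PRICE REAL)")
# Hypothetical rows: P2 and P4 sit exactly on the range limits.
conn.executemany(
    "INSERT INTO PRODUCT VALUES (?, ?)",
    [("P1", 43.99), ("P2", 50.00), ("P3", 99.87), ("P4", 100.00), ("P5", 109.92)],
)


def prices(where_clause):
    """Hypothetical helper: return the matching prices in ascending order."""
    sql = "SELECT P_PRICE FROM PRODUCT WHERE " + where_clause + " ORDER BY P_PRICE"
    return [row[0] for row in conn.execute(sql)]


between = prices("P_PRICE BETWEEN 50.00 AND 100.00")
inclusive = prices("P_PRICE >= 50.00 AND P_PRICE <= 100.00")  # same rows as BETWEEN
exclusive = prices("P_PRICE > 50.00 AND P_PRICE < 100.00")    # drops both limits
reversed_range = prices("P_PRICE BETWEEN 100.00 AND 50.00")   # higher value first

print(between, inclusive, exclusive, reversed_range)
```

In SQLite, as in the Oracle behaviour noted above, listing the higher range value first produces an empty result.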

The IS NULL Special Operator

Standard SQL allows the use of IS NULL to check for a null attribute value. For example, suppose you want to list all products that do not have a vendor assigned (V_CODE is null). To find such a null entry, use the command sequence:

SELECT P_CODE, P_DESCRIPT, V_CODE
FROM PRODUCT
WHERE V_CODE IS NULL;

Similarly, if you want to check a null date entry, the command sequence is:

SELECT P_CODE, P_DESCRIPT, P_INDATE
FROM PRODUCT
WHERE P_INDATE IS NULL;

Note that SQL uses a special operator to test for nulls. Why couldn't you just enter a condition such as V_CODE = NULL? No. Technically, NULL is not a value (such as the number 0 (zero) or the blank space), but a special property of an attribute that represents precisely the absence of any value.

The LIKE Special Operator

The LIKE special operator is used in conjunction with wildcards to find patterns within string attributes. Standard SQL allows you to use the per cent sign (%) and underscore (_) wildcard characters to make matches when the entire string is not known:

% means any and all following or preceding characters are eligible. For example:
J% includes Johnson, Jones, Jernigan, July and J-231Q
Jo% includes Johnson and Jones

_ means any one character may be substituted for the underscore. For example:
_23-456-6789 includes 123-456-6789, 223-456-6789 and 323-456-6789
_23-_56-678_ includes 123-156-6781, 123-256-6782 and 823-956-6788
_o_es includes Jones, Cones, Cokes, totes and roles

NOTE
Some RDBMSs, such as Microsoft Access, use the wildcard characters * and ? instead of % and _.

For example, the following query would find all VENDOR rows for contacts whose last names begin with Smith.

SELECT V_NAME, V_CONTACT, V_AREACODE, V_PHONE
FROM VENDOR
WHERE V_CONTACT LIKE 'Smith%';

If you check the original VENDOR data in Figure 8.2 again, you'll see that this SQL query yields three records: two Smiths and one Smithson.

Keep in mind that most SQL implementations yield case-sensitive searches. For example, Oracle will not yield a return that includes Jones if you use the wildcard search delimiter jo% in a search for last names. The reason is that Jones begins with a capital J and your wildcard search starts with a lowercase j. On the other hand, Microsoft Access searches are not case sensitive.

For example, suppose you typed the following query in Oracle:

SELECT V_NAME, V_CONTACT, V_AREACODE, V_PHONE
FROM VENDOR
WHERE V_CONTACT LIKE 'SMITH%';

No rows are returned because character-based queries may be case sensitive. That is, an uppercase character has a different ASCII code from a lowercase character, thus causing SMITH, Smith and smith to be evaluated as different (unequal) entries. Because the table contains no vendor whose last name begins with (uppercase) SMITH, the (uppercase) SMITH% used in the query cannot make a match. Matches can be made only when the query entry is written exactly like the table entry.

Some RDBMSs, such as Microsoft Access, automatically make the necessary conversions to eliminate case sensitivity. Others, such as Oracle, provide a special UPPER function to convert both table and query character entries to uppercase. (The conversion is done in the computer's memory only; the conversion has no effect on how the value is actually stored in the table.) So if you want to avoid a no-match result based on case sensitivity, and if your RDBMS allows the use of the UPPER function, you can generate the same results by using the query:

SELECT V_NAME, V_CONTACT, V_AREACODE, V_PHONE
FROM VENDOR
WHERE UPPER(V_CONTACT) LIKE 'SMITH%';

The preceding query produces a list including all rows that contain a last name that begins with Smith, regardless of uppercase or lowercase letter combinations such as Smith, SMITH and smith.

The logical operators may be used in conjunction with the special operators. For instance, the query:

SELECT V_NAME, V_CONTACT, V_AREACODE, V_PHONE
FROM VENDOR
WHERE V_CONTACT NOT LIKE 'Smith%';

will yield an output of all vendors whose names do not start with Smith.

Suppose you do not know whether a person's name is spelled Johnson or Johnsen. The wildcard character _ lets you find a match for either spelling. The proper search would be instituted by the query:

SELECT *
FROM VENDOR
WHERE V_CONTACT LIKE 'Johns_n';

Thus, the wildcards allow you to make matches when only approximate spellings are known. Wildcard characters may be used in combinations. For example, the wildcard search based on the string _l% can yield the strings Al, Alton, Elgin, Blakeston, blank, bloated and eligible.

The IN Special Operator

Many queries that would require the use of the logical OR can be more easily handled with the help of the special operator IN. For example, the query:

SELECT *
FROM PRODUCT
WHERE V_CODE = 21344
OR V_CODE = 24288;

can be handled more efficiently with:

SELECT *
FROM PRODUCT
WHERE V_CODE IN (21344, 24288);

Note that the IN operator uses a value list. All of the values in the list must be of the same data type. Each of the values in the value list is compared to the attribute, in this case, V_CODE. If the V_CODE value matches any of the values in the list, the row is selected. In this example, the rows selected will be only those in which the V_CODE is either 21344 or 24288.

If the attribute used is of a character data type, the list values must be enclosed in single quotation marks. For instance, if the V_CODE had been defined as CHAR(5) during the table-creation process, the preceding query would have read:

SELECT *
FROM PRODUCT
WHERE V_CODE IN ('21344', '24288');

The IN operator is especially valuable when it is used in conjunction with subqueries. For example, suppose you want to list the V_CODE and V_NAME of only those vendors who provide products. In that case, you could use a subquery within the IN operator to generate the value list automatically. The query is:

SELECT V_CODE, V_NAME
FROM VENDOR
WHERE V_CODE IN (SELECT V_CODE FROM PRODUCT);

The preceding query is executed in two steps:

1 The inner query or subquery generates a list of V_CODE values from the PRODUCT tables. Those V_CODE values represent the vendors who supply products.

2 The IN operator compares the values generated by the subquery to the V_CODE values in the VENDOR table and selects only the rows with matching values, that is, the vendors who provide products.

The IN special operator will receive additional attention in Chapter 9, Procedural Language SQL and Advanced SQL, where you will learn more about subqueries.

The EXISTS Special Operator

The EXISTS special operator can be used whenever there is a requirement to execute a command based on the result of another query. That is, if a subquery returns any rows, run the main query; otherwise, don't. For example, the following query will list all vendors, but only if there are products to order:

SELECT *
FROM VENDOR
WHERE EXISTS (SELECT * FROM PRODUCT WHERE P_QOH <= P_MIN);
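The IS NULL, LIKE, IN and EXISTS operators described in this section can be tried together in one short session. The sketch below uses Python's sqlite3 module; the vendor and product rows are a small illustrative sample rather than the book's full tables. One assumption to keep in mind: SQLite's LIKE ignores the case of ASCII letters by default, so the Oracle case-sensitivity behaviour discussed above is not reproduced here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE VENDOR (V_CODE INTEGER PRIMARY KEY, V_CONTACT TEXT)")
conn.execute(
    "CREATE TABLE PRODUCT (P_CODE TEXT PRIMARY KEY, V_CODE INTEGER, "
    "P_QOH INTEGER, P_MIN INTEGER)"
)
# Illustrative sample rows only.
conn.executemany(
    "INSERT INTO VENDOR VALUES (?, ?)",
    [(21225, "Smithson"), (21344, "Ortega"), (22567, "Smith"),
     (24288, "Hakford"), (25443, "Smith")],
)
conn.executemany(
    "INSERT INTO PRODUCT VALUES (?, ?, ?, ?)",
    [("13-Q2/P2", 21344, 32, 15), ("23109-HB", 21225, 23, 10),
     ("23114-AA", None, 8, 5), ("2232/QTY", 24288, 8, 5)],
)


def column(sql):
    """Hypothetical helper: return the first column of every result row."""
    return [row[0] for row in conn.execute(sql)]


# IS NULL finds the product with no vendor; '= NULL' never matches anything.
no_vendor = column("SELECT P_CODE FROM PRODUCT WHERE V_CODE IS NULL")
equals_null = column("SELECT P_CODE FROM PRODUCT WHERE V_CODE = NULL")

# LIKE with the % wildcard: contacts whose names begin with Smith.
smiths = column("SELECT V_CONTACT FROM VENDOR WHERE V_CONTACT LIKE 'Smith%' ORDER BY V_CODE")

# IN with a subquery: only vendors that appear in PRODUCT.
suppliers = column(
    "SELECT V_CODE FROM VENDOR WHERE V_CODE IN (SELECT V_CODE FROM PRODUCT) ORDER BY V_CODE"
)

# EXISTS: all vendors are listed only if the subquery returns at least one row.
if_at_minimum = column(
    "SELECT V_CODE FROM VENDOR WHERE EXISTS (SELECT * FROM PRODUCT WHERE P_QOH <= P_MIN)"
)
if_under_double = column(
    "SELECT V_CODE FROM VENDOR WHERE EXISTS (SELECT * FROM PRODUCT WHERE P_QOH < P_MIN * 2)"
)

print(no_vendor, equals_null, smiths, suppliers, len(if_at_minimum), len(if_under_double))
```

With this sample data no product has reached its minimum quantity, so the first EXISTS query returns no vendors, while one product is below double its minimum, so the second returns all five.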

The EXISTS special operator is used in the following example to list all vendors, but only if there are products with the quantity on hand less than double the minimum quantity:

SELECT *
FROM VENDOR
WHERE EXISTS (SELECT * FROM PRODUCT WHERE P_QOH < P_MIN * 2);

The EXISTS special operator will receive additional attention in the next chapter, where you will learn more about subqueries.

8.5 ADVANCED DATA DEFINITION COMMANDS

In this section, you will learn how to change (alter) table structures by changing attribute characteristics and by adding columns. Then you will learn how to do advanced data updates to the new columns. Finally, you will learn how to copy tables or parts of tables and how to delete tables.

All changes in the table structure are made by using the ALTER TABLE command, followed by a keyword that produces the specific change you want to make. Three options are available: ADD, MODIFY and DROP. ADD enables you to add a column, MODIFY enables you to change column characteristics, and DROP allows you to delete a column from a table. Most RDBMSs do not allow you to delete a column (unless the column does not contain any values) because such an action may delete crucial data that are used by other tables. The basic syntax to add or modify columns is:

ALTER TABLE tablename
{ADD | MODIFY} ( columnname datatype [ {ADD | MODIFY} columnname datatype] );

You can also use the ALTER TABLE command to add table constraints. In these cases, the syntax is:

ALTER TABLE tablename
ADD constraint [ ADD constraint ] ;

where constraint refers to a constraint definition similar to those you learned in Section 8.2.6.

You could also use the ALTER TABLE command to remove a column or table constraint. The syntax is:

ALTER TABLE tablename
DROP {PRIMARY KEY | COLUMN columnname | CONSTRAINT constraintname };

Notice that, when removing a constraint, you need to specify the name given to the constraint. That is one reason why you should always name your constraints in your CREATE TABLE or ALTER TABLE statement.

8.5.1 Changing a Column's Data Type

Using the ALTER syntax, the (integer) V_CODE in the PRODUCT table can be changed to a character V_CODE by using:

ALTER TABLE PRODUCT
MODIFY (V_CODE CHAR(5));

Some RDBMSs, such as Oracle, do not let you change data types unless the column to be changed is empty. For example, if you want to change the V_CODE field from the current number definition to a character definition, the above command will yield an error message, because the V_CODE column already contains data. The error message is easily explained. Remember that the V_CODE in PRODUCT references the V_CODE in VENDOR. If you change the V_CODE data type, the data types don't match, and there is a referential integrity violation, thus triggering the error message. If the V_CODE column does not contain data, the preceding command sequence produces the expected table structure alteration (if the foreign key reference was not specified during the creation of the PRODUCT table).

8.5.2 Changing a Column's Data Characteristics

If the column to be changed already contains data, you can make changes in the column's characteristics if those changes do not alter the data type. For example, if you want to increase the width of the P_PRICE column to nine digits, use the command:

ALTER TABLE PRODUCT
MODIFY (P_PRICE DECIMAL(9,2));

If you now list the table contents, you see that the column width of P_PRICE has increased by one digit.

NOTE
Some DBMSs impose limitations on when it's possible to change attribute characteristics. For example, Oracle lets you increase (but not decrease) the size of a column. The reason for this restriction is that an attribute modification affects the integrity of the data in the database. In fact, some attribute changes can be done only when there are no data in any rows for the affected attribute.

8.5.3 Adding a Column

You can alter an existing table by adding one or more columns. In the following example, you add the column named P_SALECODE to the PRODUCT table. (This column will be used later to determine whether goods that have been in inventory for a certain length of time should be placed on special sale.)

Suppose you expect the P_SALECODE entries to be 1, 2 or 3. Because there will be no arithmetic performed with the P_SALECODE, the P_SALECODE is classified as a single-character attribute. Note the inclusion of all required information in the following ALTER command:

ALTER TABLE PRODUCT
ADD (P_SALECODE CHAR(1));

Online Content
If you are using the Microsoft Access databases provided on the online platform accompanying this book, you can track each of the updates in the following sections. For example, look at the copies of the PRODUCT table in the 'Ch08_SaleCo' database, one named PRODUCT_2 and one named PRODUCT_3. Each of the two copies includes the new P_SALECODE column. If you want to see the cumulative effect of all UPDATE commands, you can continue using the PRODUCT table with the P_SALECODE modification and all of the changes you will make in the following sections. (You may even want to use both options, first to examine the individual effects of the update queries and then to examine the cumulative effects.)

When adding a column, be careful not to include the NOT NULL clause for the new column. Doing so causes an error message; if you add a new column to a table that already has rows, the existing rows will default to a value of null for the new column. Therefore, it is not possible to add the NOT NULL clause for this new column. (You can, of course, add the NOT NULL clause to the table structure after all of the data for the new column have been entered and the column no longer contains nulls.)
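The effect of adding a column to a table that already holds rows can be observed directly. The following sketch uses Python's sqlite3 module and makes two assumptions: SQLite spells the command ADD COLUMN without the parentheses of the Oracle form shown earlier, and the second column, P_FLAG, is a hypothetical name introduced only to show the NOT NULL error.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE PRODUCT (P_CODE TEXT PRIMARY KEY, P_DESCRIPT TEXT)")
conn.execute("INSERT INTO PRODUCT VALUES ('1546-QQ2', 'Hrd. cloth')")

# Add the new column; SQLite's form of ALTER TABLE ... ADD.
conn.execute("ALTER TABLE PRODUCT ADD COLUMN P_SALECODE CHAR(1)")

# The existing row now carries a null (None in Python) in the new column.
existing = conn.execute("SELECT P_CODE, P_SALECODE FROM PRODUCT").fetchall()

# Adding a NOT NULL column without a default value is rejected.
# P_FLAG is a hypothetical column used only for this demonstration.
try:
    conn.execute("ALTER TABLE PRODUCT ADD COLUMN P_FLAG CHAR(1) NOT NULL")
    not_null_rejected = False
except sqlite3.OperationalError:
    not_null_rejected = True

print(existing, not_null_rejected)
```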

8.5.4 Dropping a Column

Occasionally, you may want to modify a table by deleting a column. Suppose you want to delete the V_ORDER attribute from the VENDOR table. To accomplish that, you would use the following command:

ALTER TABLE VENDOR
DROP COLUMN V_ORDER;

Again, some RDBMSs impose restrictions on attribute deletion. For example, you may not drop attributes that are involved in foreign key relationships, nor may you delete an attribute of a table that contains only that one attribute.

8.5.5 Advanced Data Updates

To make data entries in an existing row's columns, SQL employs the UPDATE command. The UPDATE command updates only data in existing rows. For example, to enter the P_SALECODE value 2 in the fourth row, use the UPDATE command together with the primary key P_CODE 1546-QQ2. To enter the value, use the command sequence:

UPDATE PRODUCT
SET P_SALECODE = '2'
WHERE P_CODE = '1546-QQ2';

Subsequent data can be entered the same way, defining each entry location by its primary key (P_CODE) and its column location (P_SALECODE). For example, if you want to enter the P_SALECODE value 1 for the P_CODE values 2232/QWE and 2232/QTY, you use:

UPDATE PRODUCT
SET P_SALECODE = '1'
WHERE P_CODE IN ('2232/QWE', '2232/QTY');

If your RDBMS does not support IN, use the following command:

UPDATE PRODUCT
SET P_SALECODE = '1'
WHERE P_CODE = '2232/QWE' OR P_CODE = '2232/QTY';
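The key-based updates above can be sketched as follows with Python's sqlite3 module. The table is reduced to the two columns involved, and the four product codes are an illustrative subset; the rowcount attribute used here belongs to Python's cursor object and is not part of SQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE PRODUCT (P_CODE TEXT PRIMARY KEY, P_SALECODE CHAR(1))")
# Illustrative subset; every P_SALECODE starts out null.
conn.executemany(
    "INSERT INTO PRODUCT (P_CODE) VALUES (?)",
    [("1546-QQ2",), ("2232/QTY",), ("2232/QWE",), ("23109-HB",)],
)

# Update one row, located by its primary key.
conn.execute("UPDATE PRODUCT SET P_SALECODE = '2' WHERE P_CODE = '1546-QQ2'")

# Update two rows at once with IN; rowcount reports how many rows changed.
cursor = conn.execute(
    "UPDATE PRODUCT SET P_SALECODE = '1' WHERE P_CODE IN ('2232/QWE', '2232/QTY')"
)
rows_changed = cursor.rowcount

# Rows that matched no WHERE clause keep their null.
sale_codes = dict(conn.execute("SELECT P_CODE, P_SALECODE FROM PRODUCT"))
print(rows_changed, sale_codes)
```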

To check the results of your efforts, use:

SELECT P_CODE, P_DESCRIPT, P_INDATE, P_PRICE, P_SALECODE
FROM PRODUCT;

Although the UPDATE sequences just shown allow you to enter values into specified table cells, the process is very cumbersome. Fortunately, if a relationship can be established between the entries and the existing columns, the relationship can be used to assign values to their appropriate slots. For example, suppose you want to place sales codes based on the P_INDATE into the table, using the following schedule:

P_INDATE                                        P_SALECODE
before 25 December 2018                         2
between 16 January 2019 and 10 February 2019    1

Using the PRODUCT table, the following two command sequences make the appropriate assignments:

UPDATE PRODUCT
SET P_SALECODE = '2'
WHERE P_INDATE < '25-Dec-2018';

UPDATE PRODUCT
SET P_SALECODE = '1'
WHERE P_INDATE >= '16-Jan-2019' AND P_INDATE <= '10-Feb-2019';

To check the results of those two command sequences, use:

SELECT P_CODE, P_DESCRIPT, P_INDATE, P_PRICE, P_SALECODE
FROM PRODUCT;

If you have made all of the updates shown in this section using Oracle, your PRODUCT table should look like Figure 8.15. Make sure that you issue a COMMIT statement to save these changes.
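The schedule-based assignment can be reproduced with a small experiment. This sketch uses Python's sqlite3 module under one important assumption: SQLite has no DATE type, so the dates are stored as ISO strings (YYYY-MM-DD) whose text order matches calendar order, instead of the '25-Dec-2018' literals used with Oracle. The six rows are an illustrative subset.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE PRODUCT (P_CODE TEXT PRIMARY KEY, P_INDATE TEXT, P_SALECODE CHAR(1))"
)
# Illustrative subset with ISO-format dates.
conn.executemany(
    "INSERT INTO PRODUCT (P_CODE, P_INDATE) VALUES (?, ?)",
    [
        ("13-Q2/P2", "2018-12-13"),
        ("2232/QWE", "2018-12-24"),
        ("2232/QTY", "2018-12-30"),
        ("2238/QPD", "2019-01-20"),
        ("89-WRE-Q", "2019-02-07"),
        ("SM-18277", "2019-03-01"),
    ],
)

# Sale code 2: in inventory since before 25 December 2018.
conn.execute("UPDATE PRODUCT SET P_SALECODE = '2' WHERE P_INDATE < '2018-12-25'")

# Sale code 1: received between 16 January 2019 and 10 February 2019, inclusive.
conn.execute(
    "UPDATE PRODUCT SET P_SALECODE = '1' "
    "WHERE P_INDATE >= '2019-01-16' AND P_INDATE <= '2019-02-10'"
)

sale_codes = dict(conn.execute("SELECT P_CODE, P_SALECODE FROM PRODUCT"))
print(sale_codes)
```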

Online Content
The screenshots provided in Chapter 8, Beginning Structured Query Language and Chapter 9, Procedural Language SQL and Advanced SQL, were taken from Oracle SQL Developer.⁵ SQL Developer is a graphical tool for database development provided by Oracle. It is free to use and runs on Windows, Linux and Mac OSX. Throughout Chapters 8 and 9, it will be used as an editor to explore the use of DML and DDL commands. SQL Developer can be used with any Oracle Database version 10g and later. A guide for how to use Oracle SQL Developer can be found on the online platform accompanying this book in Appendix N.

Your college or university may be part of the Oracle Academy programme. If so, you may be using Oracle Application Express (APEX)⁶ within Oracle Academy software, which is a cloud-based platform which can be used to learn SQL and PL/SQL. All Oracle scripts provided with this book will also work on Oracle APEX. Learn more about Oracle Academy here: https://academy.oracle.com/en/oa-web-overview.html

⁵ Getting Started with Oracle SQL Developer. Available: www.oracle.com/database/technologies/appdev/sql-developer.html
⁶ Developing Applications with Oracle APEX. Available: www.oracle.com/database/technologies/appdev/apex.html

The arithmetic operators are particularly useful in data updates. For example, if the quantity on hand in your PRODUCT table has dropped below the minimum desirable value, you'll order more of the product. Suppose, for example, you have ordered 20 units of product 2232/QWE. When the 20 units arrive, you'll want to add them to inventory, using:

UPDATE PRODUCT
SET P_QOH = P_QOH + 20
WHERE P_CODE = '2232/QWE';

FIGURE 8.15 The cumulative effect of multiple updates in the PRODUCT table (Oracle-APEX)

If you want to add 10 per cent to the price for all products that have current prices below 50, you can use:

UPDATE PRODUCT
SET P_PRICE = P_PRICE * 1.10
WHERE P_PRICE < 50.00;
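Both arithmetic updates can be checked on a two-row table. The sketch below uses Python's sqlite3 module; the quantities are illustrative values, and because SQLite stores the prices as floating-point numbers the increased price is rounded before it is printed.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE PRODUCT (P_CODE TEXT PRIMARY KEY, P_QOH INTEGER, P_PRICE REAL)"
)
# Illustrative quantities on hand.
conn.executemany(
    "INSERT INTO PRODUCT VALUES (?, ?, ?)",
    [("2232/QWE", 6, 99.87), ("13-Q2/P2", 32, 14.99)],
)

# Add the 20 delivered units to the quantity on hand of one product.
conn.execute("UPDATE PRODUCT SET P_QOH = P_QOH + 20 WHERE P_CODE = '2232/QWE'")

# Raise by 10 per cent every price that is currently below 50.
conn.execute("UPDATE PRODUCT SET P_PRICE = P_PRICE * 1.10 WHERE P_PRICE < 50.00")

result = {
    code: (qoh, price)
    for code, qoh, price in conn.execute("SELECT P_CODE, P_QOH, P_PRICE FROM PRODUCT")
}
print(result["2232/QWE"][0], round(result["13-Q2/P2"][1], 3))
```

Only the saw blade price changes, because the jigsaw price is above the limit in the WHERE clause.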

NOTE
If you fail to roll back the changes of the preceding UPDATE queries, the output of the subsequent queries will not match the results shown in the figures. Therefore, if you are using Oracle, use the ROLLBACK command to restore the database to its previous state.
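The effect of rolling back an uncommitted update can be sketched with Python's sqlite3 module, whose connection object exposes commit() and rollback() methods that play the role of the COMMIT and ROLLBACK commands. The single row is illustrative, and the sketch relies on the module's default behaviour of opening a transaction before an UPDATE.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE PRODUCT (P_CODE TEXT PRIMARY KEY, P_PRICE REAL)")
conn.execute("INSERT INTO PRODUCT VALUES ('23109-HB', 9.95)")
conn.commit()  # save the starting state


def current_price():
    """Hypothetical helper: read the single price held in the table."""
    return conn.execute("SELECT P_PRICE FROM PRODUCT").fetchall()[0][0]


# Apply the price increase, but do not commit it.
conn.execute("UPDATE PRODUCT SET P_PRICE = P_PRICE * 1.10 WHERE P_PRICE < 50.00")
price_after_update = current_price()

# ROLLBACK discards every change made since the last COMMIT.
conn.rollback()
price_after_rollback = current_price()

print(round(price_after_update, 3), price_after_rollback)
```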

Online Content
If you are using Access, copy the original 'Ch08_SaleCo.mdb' file from the online platform for this book.

8.5.6 Copying Parts of Tables

As you will discover in later chapters on database design, sometimes it is necessary to break up a table structure into several component parts (or smaller tables). Fortunately, SQL allows you to copy the contents of selected table columns so that the data need not be re-entered manually into the newly created table(s). For example, if you want to copy P_CODE, P_DESCRIPT, P_PRICE and V_CODE from the PRODUCT table to a new table named PART, you create the PART table structure first, as follows:

CREATE TABLE PART(
PART_CODE CHAR(8) NOT NULL UNIQUE,
PART_DESCRIPT CHAR(35),
PART_PRICE DECIMAL(8,2),
V_CODE INTEGER,
PRIMARY KEY (PART_CODE));

Note that the PART column names need not be identical to those of the original table and that the new table need not have the same number of columns as the original table. In this case, the first column in the PART table is PART_CODE, rather than the original P_CODE found in the PRODUCT table. And the PART table contains only four columns rather than the seven columns found in the PRODUCT table. However, column characteristics must match; you cannot copy a character-based attribute into a numeric structure and vice versa.

Next, you need to add the rows to the new PART table, using the PRODUCT table rows. To do that, you use the INSERT command you learnt in Section 8.3.7. The syntax is:

INSERT INTO target_tablename[(target_columnlist)]
SELECT source_columnlist
FROM source_tablename;

Note that the target column list is required if the source column list doesn't match all of the attribute names and characteristics of the target table (including the order of the columns). Otherwise, you do not need to specify the target column list. In this example, you must specify the target column list in the INSERT command below because the column names of the target table are different:

INSERT INTO PART (PART_CODE, PART_DESCRIPT, PART_PRICE, V_CODE)
SELECT P_CODE, P_DESCRIPT, P_PRICE, V_CODE FROM PRODUCT;

The contents of the PART table can now be examined by using the query:

SELECT * FROM PART;

to generate the new PART table's contents, shown in Figure 8.16.

FIGURE 8.16 PART table attributes copied from the PRODUCT table

PART_CODE   PART_DESCRIPT                              PART_PRICE   V_CODE
11QER/31    Power painter, 15 psi., 3-nozzle               109.99    25595
13-Q2/P2    7.25 cm pwr. saw blade                          14.99    21344
14-Q1/L3    9.00 cm pwr. saw blade                          17.49    21344
1546-QQ2    Hrd. cloth, 1/4 cm, 2 × 50                      39.95    23119
1558-QW1    Hrd. cloth, 1/2 cm, 3 × 50                      43.99    23119
2232/QTY    B&D jigsaw, 12 cm blade                        109.92    24288
2232/QWE    B&D jigsaw, 8 cm blade                          99.87    24288
2238/QPD    B&D cordless drill, 1/2 cm                      38.95    25595
23109-HB    Claw hammer                                      9.95    21225
23114-AA    Sledge hammer, 12 kg                            14.40    (null)
54778-2T    Rat-tail file, 1/8 cm fine                       4.99    21344
89-WRE-Q    Hicut chain saw, 16 cm                         256.99    24288
PVC23DRT    PVC pipe, 3.5 cm, 8 m                            5.87    (null)
SM-18277    1.25 cm metal screw, 25                          6.99    21225
SW-23116    2.5 cm wd. screw, 50                             8.45    21231
WR3/TT3     Steel matting, 4 × 8 × 1/6 m, .5 m mesh        119.95    25595
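The copy procedure described in this section can be tried on a very small scale. The following is a minimal sketch, not the book's script: the table names PRODUCT_DEMO and PART_DEMO, the reduced column set and the two sample rows are assumptions made purely for illustration.

```sql
-- Hypothetical miniature of the PRODUCT table (illustration only).
CREATE TABLE PRODUCT_DEMO (
    P_CODE     VARCHAR(10) NOT NULL PRIMARY KEY,
    P_DESCRIPT VARCHAR(35),
    P_PRICE    DECIMAL(8,2),
    V_CODE     INTEGER
);
INSERT INTO PRODUCT_DEMO VALUES ('23109-HB', 'Claw hammer', 9.95, 21225);
INSERT INTO PRODUCT_DEMO VALUES ('23114-AA', 'Sledge hammer, 12 kg', 14.40, NULL);

-- Target table: different column names, but matching data characteristics.
CREATE TABLE PART_DEMO (
    PART_CODE     VARCHAR(10) NOT NULL PRIMARY KEY,
    PART_DESCRIPT VARCHAR(35),
    PART_PRICE    DECIMAL(8,2),
    V_CODE        INTEGER
);

-- The target column list is given because the column names differ.
INSERT INTO PART_DEMO (PART_CODE, PART_DESCRIPT, PART_PRICE, V_CODE)
SELECT P_CODE, P_DESCRIPT, P_PRICE, V_CODE
FROM PRODUCT_DEMO;

-- Both source rows should now be present in the copy.
SELECT COUNT(*) FROM PART_DEMO;   -- 2
```

Because the source and target columns are paired by position, the order of the SELECT list must follow the order of the target column list.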

SQL also provides another way to rapidly create a new table based on selected columns and rows of an existing table. In this case, the new table copies the attribute names, data characteristics and rows of the original table. The Oracle version of the command is:

CREATE TABLE PART AS
SELECT P_CODE AS PART_CODE, P_DESCRIPT AS PART_DESCRIPT,
P_PRICE AS PART_PRICE, V_CODE
FROM PRODUCT;

If the PART table already exists, Oracle will not let you overwrite the existing table. To run this command, you must first delete the existing PART table. (See Section 8.5.8.) The Microsoft Access version of this command is:

SELECT P_CODE AS PART_CODE, P_DESCRIPT AS PART_DESCRIPT,
P_PRICE AS PART_PRICE, V_CODE INTO PART
FROM PRODUCT;

If the PART table already exists, Microsoft Access will ask if you want to delete the existing table and continue with the creation of the new PART table.

The SQL command just shown creates a new PART table with PART_CODE, PART_DESCRIPT, PART_PRICE, and V_CODE columns. In addition, all of the data rows (for the selected columns) are copied automatically. But note that no entity integrity (primary key) or referential integrity (foreign key) rules are automatically applied to the new table. In the next section, you will learn how to define the PK to enforce entity integrity and the FK to enforce referential integrity, respectively.
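The text mentions that selected rows, as well as selected columns, can be copied this way. A minimal sketch of that idea follows, assuming an RDBMS that accepts CREATE TABLE ... AS SELECT (Oracle does; several other systems do as well). The names PRODUCT_DEMO and CHEAP_PART_DEMO and the sample rows are hypothetical.

```sql
-- Hypothetical source table (illustration only).
CREATE TABLE PRODUCT_DEMO (
    P_CODE  VARCHAR(10) NOT NULL PRIMARY KEY,
    P_PRICE DECIMAL(8,2),
    V_CODE  INTEGER
);
INSERT INTO PRODUCT_DEMO VALUES ('23109-HB', 9.95, 21225);
INSERT INTO PRODUCT_DEMO VALUES ('2232/QWE', 99.87, 24288);

-- Copy two columns, renamed, and only the rows priced below 10.
CREATE TABLE CHEAP_PART_DEMO AS
SELECT P_CODE AS PART_CODE, P_PRICE AS PART_PRICE
FROM PRODUCT_DEMO
WHERE P_PRICE < 10;

-- Only the claw hammer row qualifies.
SELECT PART_CODE FROM CHEAP_PART_DEMO;   -- 23109-HB
```

As the section notes, the primary key declared on the source table is not carried over to the new table.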

8.5.7 Adding Primary and Foreign Key Designations

When you create a new table based on another table, the new table does not include integrity rules from the old table. In particular, there is no primary key. To define the primary key for the new PART table, use the following command:

ALTER TABLE PART
ADD PRIMARY KEY (PART_CODE);

Aside from the fact that the integrity rules are not automatically transferred to a new table that derives its data from one or more other tables, several other scenarios could leave you without entity and referential integrity. For example, you might have forgotten to define the primary and foreign keys when you created the original tables. Or, if you imported tables from a different database, you might have discovered that the importing procedure did not transfer the integrity rules. In any case, you can re-establish the integrity rules by using the ALTER command. For example, if the PART table's foreign key has not yet been designated, it can be designated by:

ALTER TABLE PART
ADD FOREIGN KEY (V_CODE) REFERENCES VENDOR;

Alternatively, if neither the PART table's primary key nor its foreign key has been designated, you can incorporate both changes at once, using:

ALTER TABLE PART
ADD PRIMARY KEY (PART_CODE)
ADD FOREIGN KEY (V_CODE) REFERENCES VENDOR;

Even composite primary keys and multiple foreign keys can be designated in a single SQL command. For example, if you want to enforce the integrity rules for the LINE table shown in Figure 8.1, you can use:

ALTER TABLE LINE
ADD PRIMARY KEY (INV_NUMBER, LINE_NUMBER)
ADD FOREIGN KEY (INV_NUMBER) REFERENCES INVOICE
ADD FOREIGN KEY (PROD_CODE) REFERENCES PRODUCT;
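A common variation, sketched below, gives each constraint an explicit name so that it is easier to identify later. This is an assumption layered on the commands above rather than something the text prescribes; the table names and the constraint names PART_DEMO_PK and PART_DEMO_FK are hypothetical. The ADD CONSTRAINT form is accepted by Oracle and most other server RDBMSs, but not by every product (SQLite, for instance, does not support it).

```sql
-- Hypothetical parent and child tables, created without key designations
-- on the child (illustration only).
CREATE TABLE VENDOR_DEMO (
    V_CODE INTEGER NOT NULL PRIMARY KEY,
    V_NAME VARCHAR(35)
);
CREATE TABLE PART_DEMO (
    PART_CODE     VARCHAR(10) NOT NULL,
    PART_DESCRIPT VARCHAR(35),
    V_CODE        INTEGER
);
INSERT INTO VENDOR_DEMO VALUES (21225, 'Hypothetical vendor');

-- Entity integrity: designate the primary key after the fact.
ALTER TABLE PART_DEMO
    ADD CONSTRAINT PART_DEMO_PK PRIMARY KEY (PART_CODE);

-- Referential integrity: V_CODE must match a VENDOR_DEMO row or be null.
ALTER TABLE PART_DEMO
    ADD CONSTRAINT PART_DEMO_FK FOREIGN KEY (V_CODE)
    REFERENCES VENDOR_DEMO (V_CODE);

-- Accepted: vendor 21225 exists.
INSERT INTO PART_DEMO VALUES ('23109-HB', 'Claw hammer', 21225);
-- An insert naming a vendor that does not exist (for example 99999)
-- would now be rejected with a foreign key violation.
```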

8.5.8 Deleting a Table From the Database

Use the DROP TABLE command to delete a table from the database. For example, you can delete the PART table you just created with:

DROP TABLE PART;

You can drop a table only if that table is not participating as the 'one' side of any relationship. If you try to drop a table otherwise, the RDBMS generates an error message indicating that a foreign key integrity violation has occurred.

8.6 ADVANCED SELECT QUERIES

One of the most important advantages of SQL is its ability to produce complex free-form queries. The logical operators that were introduced earlier to update table contents work just as well in the query environment. In addition, SQL provides useful functions that count, find minimum and maximum values, calculate averages, and so on. Better yet, SQL allows the user to limit queries to only those entries that have no duplicates or entries whose duplicates can be grouped.

8.6.1 Ordering a Listing

The ORDER BY clause is especially useful when the listing order is important to you. The syntax is:

SELECT columnlist
FROM tablelist
[WHERE conditionlist]
[ORDER BY columnlist [ASC | DESC]];

Although you have the option of declaring the order type, ascending or descending, the default order is ascending. For example, if you want the contents of the PRODUCT table listed by P_PRICE in ascending order, use:

SELECT P_CODE, P_DESCRIPT, P_INDATE, P_PRICE
FROM PRODUCT
ORDER BY P_PRICE;

The output is shown in Figure 8.17. Note that ORDER BY yields an ascending price listing. Comparing the listing in Figure 8.17 to the actual table contents shown earlier in Figure 8.2, you will see that, in Figure 8.17, the lowest-priced product is listed first, followed by the next lowest-priced product, and so on. However, although ORDER BY produces a sorted output, the actual table contents are unaffected by the ORDER BY command.

To produce the list in descending order, you would enter:

SELECT P_CODE, P_DESCRIPT, P_INDATE, P_PRICE
FROM PRODUCT
ORDER BY P_PRICE DESC;

Ordered listings are used frequently. For example, suppose you want to create a phone directory. It would be helpful if you could produce an ordered sequence (last name, first name, initial) in three stages:

1 ORDER BY last name.
2 Within the last names, ORDER BY first name.
3 Within the order created in Step 2, ORDER BY middle initial.

FIGURE 8.17 Selected PRODUCT table attributes: ordered by (ascending) P_PRICE

P_CODE      P_DESCRIPT                                 P_INDATE     P_PRICE
54778-2T    Rat-tail file, 0.5 cm fine                 15-Dec-18       4.99
PVC23DRT    PVC pipe, 9 cm, 2.5 m                      20-Feb-19       5.87
SM-18277    3 cm metal screw, 25                       01-Mar-19       6.99
SW-23116    6 cm wd. screw, 50                         24-Feb-19       8.45
23109-HB    Claw hammer                                20-Jan-19       9.95
23114-AA    Sledge hammer, 7 kg                        02-Jan-19      14.40
13-Q2/P2    7.25 cm pwr. saw blade                     13-Dec-18      14.99
14-Q1/L3    9.00 cm pwr. saw blade                     13-Nov-18      17.49
2238/QPD    B&D cordless drill, 1/2 cm                 20-Jan-19      38.95
1546-QQ2    Hrd. cloth, 1/4 cm, 2 × 50                 15-Jan-19      39.95
1558-QW1    Hrd. cloth, 1/2 cm, 3 × 50                 15-Jan-19      43.99
2232/QWE    B&D jigsaw, 8 cm blade                     24-Dec-18      99.87
2232/QTY    B&D jigsaw, 12 cm blade                    30-Dec-18     109.92
11QER/31    Power painter, 15 psi., 3-nozzle           03-Nov-18     109.99
WR3/TT3     Steel matting, 4 × 8 × 1/6 m, .5 m mesh    17-Jan-19     119.95
89-WRE-Q    Hicut chain saw, 16 cm                     07-Feb-19     256.99

Such a multilevel ordered sequence is known as a cascading order sequence, and it can be created easily by listing several attributes, separated by commas, after the ORDER BY clause. The cascading order sequence is the basis for any telephone directory. To illustrate a cascading order sequence, use the following SQL command on the EMPLOYEE table:

SELECT EMP_LNAME, EMP_FNAME, EMP_INITIAL, EMP_AREACODE, EMP_PHONE
FROM EMPLOYEE
ORDER BY EMP_LNAME, EMP_FNAME, EMP_INITIAL;

That command yields the results shown in Figure 8.18. The ORDER BY clause is useful in many applications, especially because the DESC qualifier can be invoked. For example, listing the most recent items first is a standard procedure. Typically, invoice due dates are listed in descending order. Or if you want to examine budgets, it's probably useful to start by looking at the largest budget line items.

You can use the ORDER BY clause in conjunction with other SQL commands, too. For example, note the use of restrictions on date and price in the following command sequence:

SELECT P_DESCRIPT, V_CODE, P_INDATE, P_PRICE
FROM PRODUCT
WHERE P_INDATE < '21-Jan-2019'
AND P_PRICE <= 50.00
ORDER BY V_CODE, P_PRICE DESC;


FIGURE 8.18 Selected EMPLOYEE table attributes: ordered by EMP_LNAME, EMP_FNAME, EMP_INITIAL

EMP_LNAME    EMP_FNAME   EMP_INITIAL   EMP_AREACODE   EMP_PHONE
Brandon      Marie       G             7325           882-0845
Diante       Jorge       D             0181           890-4567
Genkazi      Leighla     W             7235           569-0093
Johnson      Edward      E             0181           898-4387
Jones        Anne        M             0181           898-3456
Cela         Nkosi       D             0181           324-5456
Lange        John        P             7325           504-4430
Lewis        Rhonda      G             0181           324-4472
Saranda      Hermine     R             0181           324-5505
Smith        George      A             0181           890-2984
Smith        George      K             7235           504-3339
Smith        Jeanine     K             0181           324-7883
Gounden      Melanie     P             0181           324-9006
Vandam       Rhett                     7325           675-8993
Washington   Rupert      E             0181           890-4925
Wiesenbach   Paul        R             0181           897-4358
Williams     Robert      D             0181           890-3220

The output is shown in Figure 8.19. Note that within each V_CODE, the P_PRICE values are in descending order.

FIGURE 8.19 A query based on multiple restrictions

P_DESCRIPT                    V_CODE   P_INDATE    P_PRICE
Sledge hammer, 12 kg          (null)   02-Jan-19     14.40
Claw hammer                   21225    20-Jan-19      9.95
9.00 cm pwr. saw blade        21344    13-Nov-18     17.49
7.25 cm pwr. saw blade        21344    13-Dec-18     14.99
Rat-tail file, 1/8 cm fine    21344    15-Dec-18      4.99
Hrd. cloth, 1/2 cm, 3 × 50    23119    15-Jan-19     43.99
Hrd. cloth, 1/4 cm, 2 × 50    23119    15-Jan-19     39.95
B&D cordless drill, 1/2 cm    25595    20-Jan-19     38.95
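The combination of a cascading order sequence with the DESC qualifier can be checked on a handful of rows. The sketch below is only an illustration: the table name PRODUCT_DEMO and its reduced column set are assumptions, and the four rows are a small subset chosen so that the expected order is easy to verify by eye.

```sql
-- Hypothetical four-row sample (illustration only).
CREATE TABLE PRODUCT_DEMO (
    P_CODE  VARCHAR(10) NOT NULL PRIMARY KEY,
    V_CODE  INTEGER,
    P_PRICE DECIMAL(8,2)
);
INSERT INTO PRODUCT_DEMO VALUES ('13-Q2/P2', 21344, 14.99);
INSERT INTO PRODUCT_DEMO VALUES ('14-Q1/L3', 21344, 17.49);
INSERT INTO PRODUCT_DEMO VALUES ('23109-HB', 21225, 9.95);
INSERT INTO PRODUCT_DEMO VALUES ('54778-2T', 21344, 4.99);

-- Vendor code ascending (the default); within each vendor, price descending.
SELECT P_CODE, V_CODE, P_PRICE
FROM PRODUCT_DEMO
ORDER BY V_CODE, P_PRICE DESC;
-- Expected row order:
--   23109-HB  21225   9.95
--   14-Q1/L3  21344  17.49
--   13-Q2/P2  21344  14.99
--   54778-2T  21344   4.99
```

Note that DESC applies only to the column it follows; V_CODE is still sorted in ascending order.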

NOTE

If the ordering column has nulls, they are listed either first or last (depending on the RDBMS).

The ORDER BY clause must always be listed last in the SELECT command sequence.

8.6.2 Listing Unique Values

How many different vendors are currently represented in the PRODUCT table? A simple listing (SELECT) is not very useful if the table contains several thousand rows and you have to sift through the vendor codes manually. Fortunately, SQL's DISTINCT clause is designed to produce a list of only those values that are different from one another. For example, the command:

SELECT DISTINCT V_CODE
FROM PRODUCT;

yields only the different (distinct) vendor codes (V_CODE) that are encountered in the PRODUCT table, as shown in Figure 8.20. Note that the first output row shows the null. (By default, Access places the null V_CODE at the top of the list, while Oracle places it at the bottom. The placement of nulls does not affect the list contents. In Oracle, you could use ORDER BY V_CODE NULLS FIRST to place nulls at the top of the list.)

FIGURE 8.20 A listing of distinct (different) V_CODE values in the PRODUCT table

V_CODE
(null)
21225
21231
21344
23119
24288
25595
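The way DISTINCT treats repeated values and nulls can be seen with five rows. This is a minimal sketch under stated assumptions: PRODUCT_DEMO and its rows are hypothetical, and no ORDER BY is given, so the order of the output rows is not guaranteed.

```sql
-- Hypothetical sample: two vendors, one of them repeated, plus two nulls.
CREATE TABLE PRODUCT_DEMO (
    P_CODE VARCHAR(10) NOT NULL PRIMARY KEY,
    V_CODE INTEGER
);
INSERT INTO PRODUCT_DEMO VALUES ('23109-HB', 21225);
INSERT INTO PRODUCT_DEMO VALUES ('13-Q2/P2', 21344);
INSERT INTO PRODUCT_DEMO VALUES ('14-Q1/L3', 21344);
INSERT INTO PRODUCT_DEMO VALUES ('23114-AA', NULL);
INSERT INTO PRODUCT_DEMO VALUES ('PVC23DRT', NULL);

-- Returns three rows: 21225, 21344 and a single null.
-- The two nulls are collapsed into one output row.
SELECT DISTINCT V_CODE
FROM PRODUCT_DEMO;
```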

8.6.3 Aggregate Functions

SQL can perform various mathematical summaries for you, such as counting the number of rows that contain a specified condition, finding the minimum or maximum values for some specified attribute, summing the values in a specified column, and averaging the values in a specified column. Those aggregate functions are shown in Table 8.9.

TABLE 8.9 Some basic SQL aggregate functions

Function   Output
COUNT      The number of rows containing non-null values
MIN        The minimum attribute value encountered in a given column
MAX        The maximum attribute value encountered in a given column
SUM        The sum of all values for a given column
AVG        The arithmetic mean (average) for a specified column

To illustrate another standard SQL command format, most of the remaining input and output sequences are presented using the Oracle RDBMS.

COUNT

Use the COUNT function to tally the number of non-null values of an attribute. COUNT can be used in conjunction with the DISTINCT clause. For example, suppose you want to find out how many different vendors are in the PRODUCT table. The answer, generated by the first SQL code set shown in Figure 8.21, is 6. The answer indicates that six different VENDOR codes are found in the PRODUCT table. (Note that the nulls are not counted as V_CODE values.)

FIGURE 8.21 COUNT function output example

The aggregate functions can be combined with the SQL commands explored earlier. For example, the second SQL command set in Figure 8.21 supplies the answer to the question, 'How many vendors referenced in the PRODUCT table have supplied products with prices that are less than or equal to 10?' The answer is 3, indicating that three vendors referenced in the PRODUCT table have supplied products that meet the price specification.

The COUNT aggregate function uses one parameter within parentheses, generally a column name such as COUNT(V_CODE) or COUNT(P_CODE). The parameter may also be an expression such as COUNT(DISTINCT V_CODE) or COUNT(P_PRICE+10). Using that syntax, COUNT always returns the number of non-null values in the given column. (Whether the column values are computed or show stored table row values is immaterial.) In contrast, the syntax COUNT(*) returns the number of total rows returned by the query, including the rows that contain nulls. In the example in Figure 8.21, SELECT COUNT(P_CODE) FROM PRODUCT and SELECT COUNT(*) FROM PRODUCT will yield the same answer because there are no null values in the P_CODE primary key column.

Note that the third SQL command set in Figure 8.21 uses the COUNT(*) command to answer the question, 'How many rows in the PRODUCT table have a P_PRICE value less than or equal to 10?' The answer, 5, indicates that five products have a listed price that meets the price specification. The COUNT(*) aggregate function is used to count rows in a query result set. In contrast, the COUNT(column) aggregate function counts the number of non-null values in a given column. For example, in Figure 8.20, the COUNT(*) function would return a value of 7 to indicate seven rows returned by the query. The COUNT(V_CODE) function would return a value of 6 to indicate the six non-null vendor code values.
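The difference between the three COUNT forms discussed above is easiest to see side by side. The following sketch uses a hypothetical five-row table (PRODUCT_DEMO and its contents are assumptions for illustration) containing one repeated vendor code and two nulls.

```sql
-- Hypothetical sample: vendor 21344 appears twice; two rows have no vendor.
CREATE TABLE PRODUCT_DEMO (
    P_CODE VARCHAR(10) NOT NULL PRIMARY KEY,
    V_CODE INTEGER
);
INSERT INTO PRODUCT_DEMO VALUES ('23109-HB', 21225);
INSERT INTO PRODUCT_DEMO VALUES ('13-Q2/P2', 21344);
INSERT INTO PRODUCT_DEMO VALUES ('14-Q1/L3', 21344);
INSERT INTO PRODUCT_DEMO VALUES ('23114-AA', NULL);
INSERT INTO PRODUCT_DEMO VALUES ('PVC23DRT', NULL);

SELECT COUNT(*) FROM PRODUCT_DEMO;                -- 5: every row, nulls included
SELECT COUNT(V_CODE) FROM PRODUCT_DEMO;           -- 3: non-null vendor codes only
SELECT COUNT(DISTINCT V_CODE) FROM PRODUCT_DEMO;  -- 2: distinct non-null vendor codes
```

As the note that follows explains, the COUNT(DISTINCT ...) form is not available in Microsoft Access.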

NOTE TO MICROSOFT ACCESS USERS

Microsoft Access does not support the use of COUNT with the DISTINCT clause. If you want to use such queries in Microsoft Access, you must create subqueries with DISTINCT and NOT NULL clauses. For example, the equivalent Microsoft Access queries for the first two queries shown in Figure 8.21 are:

SELECT COUNT(*)
FROM (SELECT DISTINCT V_CODE FROM PRODUCT
WHERE V_CODE IS NOT NULL)

and

SELECT COUNT(*)
FROM (SELECT DISTINCT(V_CODE)
FROM (SELECT V_CODE, P_PRICE FROM PRODUCT
WHERE V_CODE IS NOT NULL AND P_PRICE <= 10))

Those two queries can be found on the online platform in the 'Ch8_SaleCo' (Access) database. Microsoft Access does add a trailer at the end of the query after you have executed it, but you can delete that trailer the next time you use the query.

MAX and MIN

The MAX and MIN functions help you find answers to problems such as the:

- Highest (maximum) price in the PRODUCT table.
- Lowest (minimum) price in the PRODUCT table.

The highest price, 256.99, is supplied by the first SQL command set in Figure 8.22. The second command set shown in Figure 8.22 yields the minimum price of 4.99.

FIGURE 8.22 MIN and MAX function output examples

The third SQL command set in Figure 8.22 demonstrates that the numeric functions can be used in conjunction with more complex queries. However, you must remember that the numeric functions yield only one value based on all of the values found in the table: a single maximum value, a single minimum value, a single count or a single average value. It is easy to overlook this warning. For example, examine the question, 'Which product has the highest price?'

Although that query seems simple enough, the SQL command sequence:

SELECT P_CODE, P_DESCRIPT, P_PRICE
FROM PRODUCT
WHERE P_PRICE = MAX(P_PRICE);

does not yield the expected results. This is because the use of MAX(P_PRICE) to the right side of a comparison operator is incorrect, thus producing an error message. The aggregate function MAX(columnname) can be used only in the column list of a SELECT statement. Also, in a comparison that uses an equality symbol, you can use only a single value to the right of the equals sign.

To answer the question, therefore, you must compute the maximum price first, then compare it to each price returned by the query. To do that, you need a nested query. In this case, the nested query is composed of two parts:

- The inner query, which is executed first.
- The outer query, which is executed last. (The outer query is always the first SQL command you encounter in the sequence; in this case, SELECT.)

Using the following command sequence as an example, note that the inner query first finds the maximum price value, which is stored in memory. Since the outer query now has a value to which to compare each P_PRICE value, the query executes properly:

SELECT P_CODE, P_DESCRIPT, P_PRICE
FROM PRODUCT
WHERE P_PRICE = (SELECT MAX(P_PRICE) FROM PRODUCT);

The execution of that nested query yields the correct answer shown below the third (nested) SQL command set in Figure 8.22.

The MAX and MIN aggregate functions can also be used with date columns. For example, to find the product that has the oldest date, you would use MIN(P_INDATE). In the same manner, to find the most recent product, you would use MAX(P_INDATE).
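The nested-query pattern works the same way for MIN as for MAX. The sketch below is an illustration only; PRODUCT_DEMO and its three rows are hypothetical, chosen so that the highest-priced and lowest-priced rows are obvious.

```sql
-- Hypothetical three-row sample (illustration only).
CREATE TABLE PRODUCT_DEMO (
    P_CODE     VARCHAR(10) NOT NULL PRIMARY KEY,
    P_DESCRIPT VARCHAR(35),
    P_PRICE    DECIMAL(8,2)
);
INSERT INTO PRODUCT_DEMO VALUES ('23109-HB', 'Claw hammer', 9.95);
INSERT INTO PRODUCT_DEMO VALUES ('54778-2T', 'Rat-tail file', 4.99);
INSERT INTO PRODUCT_DEMO VALUES ('89-WRE-Q', 'Hicut chain saw', 256.99);

-- The inner query yields the single value 256.99; the outer query then
-- compares each row's price against it.
SELECT P_CODE, P_DESCRIPT, P_PRICE
FROM PRODUCT_DEMO
WHERE P_PRICE = (SELECT MAX(P_PRICE) FROM PRODUCT_DEMO);   -- 89-WRE-Q

-- The same pattern with MIN finds the lowest-priced product.
SELECT P_CODE, P_DESCRIPT, P_PRICE
FROM PRODUCT_DEMO
WHERE P_PRICE = (SELECT MIN(P_PRICE) FROM PRODUCT_DEMO);   -- 54778-2T
```

If two products shared the maximum price, the outer query would return both rows, because every row is compared against the same single value.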

NOTE

You can use expressions anywhere a column name is expected. Suppose you want to know which product has the highest inventory value. To find the answer, you can write the following query:

SELECT *
FROM PRODUCT
WHERE P_QOH * P_PRICE = (SELECT MAX(P_QOH*P_PRICE) FROM PRODUCT);

SUM

The SUM function computes the total sum for any specified attribute, using whichever condition(s) you have imposed. For example, if you want to compute the total amount owed by your customers, you could use the following command:

SELECT SUM(CUS_BALANCE) AS TOTBALANCE
FROM CUSTOMER;

You could also compute the sum total of an expression. For example, if you want to find the total value of all items carried in inventory, you could use:

SELECT SUM(P_QOH * P_PRICE) AS TOTVALUE
FROM PRODUCT;

because the total value is the sum of the product of the quantity on hand and the price for all items. (See Figure 8.23.)


FIGURE 8.23 The total value of all items in the PRODUCT table
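The phrase 'using whichever condition(s) you have imposed' means that a WHERE clause narrows the rows before SUM adds them up. A minimal sketch follows; PRODUCT_DEMO, its quantities and its rows are hypothetical values chosen to keep the arithmetic easy to check.

```sql
-- Hypothetical sample with round quantities (illustration only).
CREATE TABLE PRODUCT_DEMO (
    P_CODE  VARCHAR(10) NOT NULL PRIMARY KEY,
    P_QOH   INTEGER,
    P_PRICE DECIMAL(8,2),
    V_CODE  INTEGER
);
INSERT INTO PRODUCT_DEMO VALUES ('23109-HB', 10, 9.95, 21225);
INSERT INTO PRODUCT_DEMO VALUES ('SM-18277', 100, 6.99, 21225);
INSERT INTO PRODUCT_DEMO VALUES ('SW-23116', 200, 8.45, 21231);

-- All rows: 10*9.95 + 100*6.99 + 200*8.45 = 99.50 + 699.00 + 1690.00
SELECT SUM(P_QOH * P_PRICE) AS TOTVALUE
FROM PRODUCT_DEMO;                 -- 2488.50

-- Only vendor 21225: 99.50 + 699.00
SELECT SUM(P_QOH * P_PRICE) AS TOTVALUE
FROM PRODUCT_DEMO
WHERE V_CODE = 21225;              -- 798.50
```

The values shown assume exact decimal arithmetic; a system that stores DECIMAL columns as floating-point numbers may display tiny rounding differences.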

AVG

The AVG function format is similar to that of MIN and MAX and is subject to the same operating restrictions. The first SQL command set shown in Figure 8.24 shows how a simple average P_PRICE value can be generated to yield the computed average price of 56.42125. The second SQL command set in Figure 8.24 produces five output lines that describe products whose prices exceed the average product price. Note that the second query uses nested SQL commands and the ORDER BY clause examined earlier.

FIGURE 8.24 AVG function output examples
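Both uses of AVG described above, the plain average and the nested 'above average' query, can be reproduced on a three-row sample. The sketch below makes its own assumptions: the table PRODUCT_DEMO, the product codes and the round prices are all hypothetical.

```sql
-- Hypothetical sample with round prices (illustration only).
CREATE TABLE PRODUCT_DEMO (
    P_CODE  VARCHAR(10) NOT NULL PRIMARY KEY,
    P_PRICE DECIMAL(8,2)
);
INSERT INTO PRODUCT_DEMO VALUES ('A-100', 10.00);
INSERT INTO PRODUCT_DEMO VALUES ('B-200', 20.00);
INSERT INTO PRODUCT_DEMO VALUES ('C-300', 60.00);

-- (10 + 20 + 60) / 3 = 30; the displayed precision varies by RDBMS.
SELECT AVG(P_PRICE) FROM PRODUCT_DEMO;

-- Products priced above the average, most expensive first.
-- Only C-300 (60.00) exceeds 30.
SELECT P_CODE, P_PRICE
FROM PRODUCT_DEMO
WHERE P_PRICE > (SELECT AVG(P_PRICE) FROM PRODUCT_DEMO)
ORDER BY P_PRICE DESC;
```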

8.6.4 Grouping Data

Frequency distributions can be created quickly and easily using the GROUP BY clause within the SELECT statement. The syntax is:

SELECT columnlist
FROM tablelist
[WHERE conditionlist]
[GROUP BY columnlist]
[HAVING conditionlist]
[ORDER BY columnlist [ASC | DESC]];

The GROUP BY clause is generally used when you have attribute columns combined with aggregate functions in the SELECT statement. The GROUP BY clause is valid only when used in conjunction with one of the SQL aggregate functions, such as COUNT, MIN, MAX, AVG and SUM. For example, as shown in the first command set in Figure 8.25, if you try to group the output by using:

SELECT V_CODE, P_CODE, P_DESCRIPT, P_PRICE
FROM PRODUCT
GROUP BY V_CODE;

you generate a 'not a GROUP BY expression' error. However, if you write the preceding SQL command sequence in conjunction with some aggregate function, the GROUP BY clause works properly. The second SQL command sequence in Figure 8.25 properly answers the question, 'How many products are supplied by each vendor?', because it uses a COUNT aggregate function.

Note that the third output line in Figure 8.25 shows a null for the V_CODE, indicating that two products were not supplied by a vendor. Perhaps those products were produced in-house, or they may have been bought via a non-vendor channel, or the person making the data entry may have merely forgotten to enter a vendor code. (Remember that nulls can mean many things.)

FIGURE 8.25 Incorrect and correct use of the GROUP BY clause
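The 'products per vendor' question, including the group formed by null vendor codes, can be sketched on five rows. PRODUCT_DEMO and its contents are hypothetical and serve only to illustrate the behaviour described above; without an ORDER BY clause the order of the output groups is not guaranteed.

```sql
-- Hypothetical sample: one product for 21225, two for 21344, two without a vendor.
CREATE TABLE PRODUCT_DEMO (
    P_CODE VARCHAR(10) NOT NULL PRIMARY KEY,
    V_CODE INTEGER
);
INSERT INTO PRODUCT_DEMO VALUES ('23109-HB', 21225);
INSERT INTO PRODUCT_DEMO VALUES ('13-Q2/P2', 21344);
INSERT INTO PRODUCT_DEMO VALUES ('14-Q1/L3', 21344);
INSERT INTO PRODUCT_DEMO VALUES ('23114-AA', NULL);
INSERT INTO PRODUCT_DEMO VALUES ('PVC23DRT', NULL);

-- One output row per vendor code; the nulls form a group of their own.
SELECT V_CODE, COUNT(P_CODE) AS NUM_PRODUCTS
FROM PRODUCT_DEMO
GROUP BY V_CODE;
-- Expected groups:
--   21225   1
--   21344   2
--   (null)  2
```

Counting P_CODE rather than V_CODE matters here: COUNT(V_CODE) would report 0 for the null group, because COUNT(column) ignores null values.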

NOTE
When using the GROUP BY clause with a SELECT statement:
The SELECT's columnlist must include a combination of column names and aggregate functions.
The GROUP BY clause's columnlist must include all non-aggregate function columns specified in the SELECT's columnlist. If required, you could also group by any aggregate function columns that appear in the SELECT's columnlist.
The GROUP BY clause columnlist can include any columns from the tables in the FROM clause of the SELECT statement, even if they do not appear in the SELECT's columnlist.

The GROUP BY Feature's HAVING Clause

A particularly useful extension of the GROUP BY feature is the HAVING clause. Basically, HAVING operates like the WHERE clause in the SELECT statement. However, the WHERE clause applies to columns and expressions for individual rows, while the HAVING clause is applied to the output of a GROUP BY operation. For example, suppose you want to generate a listing of the number of products in the inventory supplied by each vendor. But this time you want to limit the listing to products whose prices average below 10. The first part of that requirement is satisfied with the help of the GROUP BY clause, as illustrated in the first SQL command set in Figure 8.26. Note that the HAVING clause is used in conjunction with the GROUP BY clause in the second SQL command set in Figure 8.26 to generate the desired result.

FIGURE 8.26 An application of the HAVING clause

Using the WHERE clause instead of the HAVING clause in the second SQL command set in Figure 8.26 produces an error message.

You can also combine multiple clauses and aggregate functions. For example, the following SQL statement will:
Aggregate the total cost of products grouped by V_CODE.
Select only the rows having totals that exceed 500.
List the results in descending order by the total cost.

SELECT      V_CODE, SUM(P_QOH * P_PRICE) AS TOTCOST
FROM        PRODUCT
GROUP BY    V_CODE
HAVING      (SUM(P_QOH * P_PRICE) > 500)
ORDER BY    SUM(P_QOH * P_PRICE) DESC;

Note the syntax used in the HAVING and ORDER BY clauses; in both cases, you must specify the column expression (formula) used in the SELECT statement's column list, rather than the column alias (TOTCOST). Some RDBMSs allow you to substitute the column expression with the column alias, while others do not.
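The combination of GROUP BY, HAVING and ORDER BY described above can be tried out with a short script. The following is only a sketch: it uses Python's built-in sqlite3 module and four made-up PRODUCT rows (the product codes and quantities are assumptions for illustration, not the chapter's data). It also shows that an aggregate placed in the WHERE clause is rejected; other DBMSs report a similar error with different wording.

```python
import sqlite3

# In-memory database with a few made-up PRODUCT rows (not the book's data).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE PRODUCT ("
    "P_CODE TEXT PRIMARY KEY, P_QOH INTEGER, P_PRICE REAL, V_CODE INTEGER)"
)
conn.executemany(
    "INSERT INTO PRODUCT VALUES (?, ?, ?, ?)",
    [
        ("A1", 10, 30.0, 21225),  # cost 300
        ("A2", 20, 20.0, 21225),  # cost 400, vendor 21225 totals 700
        ("B1", 5, 40.0, 21344),   # cost 200, vendor 21344 totals 200
        ("C1", 3, 400.0, 24288),  # cost 1200, vendor 24288 totals 1200
    ],
)

# Group by vendor, keep only groups whose total exceeds 500, largest first.
having_rows = conn.execute(
    """
    SELECT V_CODE, SUM(P_QOH * P_PRICE) AS TOTCOST
    FROM PRODUCT
    GROUP BY V_CODE
    HAVING SUM(P_QOH * P_PRICE) > 500
    ORDER BY SUM(P_QOH * P_PRICE) DESC
    """
).fetchall()

# WHERE is evaluated per row, before grouping, so an aggregate is rejected there.
try:
    conn.execute(
        "SELECT V_CODE FROM PRODUCT "
        "WHERE SUM(P_QOH * P_PRICE) > 500 GROUP BY V_CODE"
    )
    where_rejected = False
except sqlite3.OperationalError:
    where_rejected = True

print(having_rows)
print(where_rejected)
```

Vendor 21344 is dropped because its group total (200) does not satisfy the HAVING condition, even though the filter could not have been expressed in the WHERE clause.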

8.7 VIRTUAL TABLES: CREATING A VIEW

As you learnt earlier, the output of a relational operator (such as SELECT) is another relation (or table). Suppose that, at the end of every day, you would like to get a list of all products to reorder, that is, products with a quantity on hand that is less than or equal to the minimum quantity. Instead of typing the same query at the end of every day, wouldn't it be better to permanently save that query in the database? That's the function of a relational view. A view is a virtual table based on a SELECT query. The query can contain columns, computed columns, aliases and aggregate functions from one or more tables. The tables on which the view is based are called base tables. You can create a view by using the CREATE VIEW command:

CREATE VIEW viewname AS SELECT query

The CREATE VIEW statement is a data definition command that stores the subquery specification, the SELECT statement used to generate the virtual table, in the data dictionary.

The first SQL command set in Figure 8.27 shows the syntax used to create a view named PRICEGT50. This view contains only the designated three attributes (P_DESCRIPT, P_QOH and P_PRICE) and only the rows in which the price is over 50. The second SQL command sequence in Figure 8.27 shows the rows that make up the view.

NOTE TO MICROSOFT ACCESS USERS
The CREATE VIEW command is not directly supported in Microsoft Access. To create a view in Microsoft Access, you just need to create a SQL query and then save it. While this is not as versatile as an actual view, the resulting virtual table can be treated like a table, which achieves the same result.

FIGURE 8.27 Creating a virtual table with the CREATE VIEW command
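A view such as PRICEGT50 can be sketched with Python's built-in sqlite3 module. The product rows below are invented for illustration (they are assumptions, not the contents of the chapter's PRODUCT table). Because the view stores only the SELECT specification, querying it again after a new qualifying product is inserted returns the new row without the view being redefined.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE PRODUCT ("
    "P_CODE TEXT PRIMARY KEY, P_DESCRIPT TEXT, P_QOH INTEGER, P_PRICE REAL)"
)
# Made-up sample rows; only one of them costs more than 50.
conn.executemany(
    "INSERT INTO PRODUCT VALUES (?, ?, ?, ?)",
    [
        ("X1", "Sample drill", 12, 38.95),
        ("X2", "Sample saw", 8, 109.99),
        ("X3", "Sample hammer", 23, 9.95),
    ],
)

# The view stores the query, not a copy of the data.
conn.execute(
    """
    CREATE VIEW PRICEGT50 AS
    SELECT P_DESCRIPT, P_QOH, P_PRICE
    FROM PRODUCT
    WHERE P_PRICE > 50.00
    """
)

view_query = "SELECT P_DESCRIPT FROM PRICEGT50 ORDER BY P_DESCRIPT"
before = conn.execute(view_query).fetchall()

# Add a product that meets the criterion; the view reflects it on the next query.
conn.execute("INSERT INTO PRODUCT VALUES ('X4', 'Sample chain saw', 11, 256.99)")
after = conn.execute(view_query).fetchall()

print(before)
print(after)
```

The view name is used in the FROM clause exactly as a table name would be.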

A relational view has several special characteristics:
You can use the name of a view anywhere a table name is expected in a SQL statement.
Views are dynamically updated. That is, the view is re-created on demand each time it is invoked. Therefore, if new products are added (or deleted) to meet the criterion P_PRICE > 50.00, those new products automatically appear (or disappear) in the PRICEGT50 view the next time it is invoked.
Views provide a level of security in the database because the view can restrict users to specified columns and specified rows in a table. For example, if you have a company with hundreds of employees in several departments, you could give the secretary of each department a view of only certain attributes and only for the employees that belong to the secretary's department.
Views may also be used as the basis for reports. For example, if you need a report that shows a summary of total product cost and quantity-on-hand statistics grouped by vendor, you could create a PROD_STATS view as:


CREATE VIEW PROD_STATS AS
SELECT      V_CODE, SUM(P_QOH*P_PRICE) AS TOTCOST, MAX(P_QOH) AS MAXQTY, MIN(P_QOH) AS MINQTY, AVG(P_QOH) AS AVGQTY
FROM        PRODUCT
GROUP BY    V_CODE;

In Chapter 9, you will learn more about views and, in particular, about updating data in base tables through views.

8.8 JOINING DATABASE TABLES

The ability to combine (join) tables on common attributes is perhaps the most important distinction between a relational database and other DBMSs. A join is performed when data are retrieved from more than one table at a time. (If necessary, review the join definitions and examples in Chapter 4, Relational Algebra and Calculus.)

To join tables, you simply enumerate the tables in the FROM clause of the SELECT statement. The DBMS will create the Cartesian product of every table in the FROM clause. (Review Chapter 3 to revisit these terms, if necessary.) However, to get the correct result, that is, a natural join, you must select only the rows in which the common attribute values match. That is done with the WHERE clause. Use the WHERE clause to indicate the common attributes that are used to link the tables (sometimes referred to as the join condition).

The join condition is generally composed of an equality comparison between the foreign key and the primary key of related tables. For example, suppose you want to join the two tables, VENDOR and PRODUCT. Because V_CODE is the foreign key in the PRODUCT table and the primary key in the VENDOR table, the link is established on V_CODE. (See Table 8.10.)

TABLE 8.10 Creating links through foreign keys

Table      Attributes to be shown    Linking attribute
PRODUCT    P_DESCRIPT, P_PRICE       V_CODE
VENDOR     V_NAME, V_PHONE           V_CODE

When the same attribute name appears in more than one of the joined tables, the source table of the attributes listed in the SELECT command sequence must be defined. To join the PRODUCT and VENDOR tables, you would use the following, which produces the output shown in Figure 8.28:

SELECT      P_DESCRIPT, P_PRICE, V_NAME, V_CONTACT, V_AREACODE, V_PHONE
FROM        PRODUCT, VENDOR
WHERE       PRODUCT.V_CODE = VENDOR.V_CODE;

Your output might be presented in a different order because the SQL command produces a listing in which the order of the rows is not relevant. In fact, you are likely to get a different order of the same listing the next time you execute the command. However, you can generate a more predictable list by using an ORDER BY clause:

SELECT      P_DESCRIPT, P_PRICE, V_NAME, V_CONTACT, V_AREACODE, V_PHONE
FROM        PRODUCT, VENDOR
WHERE       PRODUCT.V_CODE = VENDOR.V_CODE
ORDER BY    P_PRICE;

FIGURE 8.28 The results of a join (output columns: P_DESCRIPT, P_PRICE, V_NAME, V_CONTACT, V_AREACODE, V_PHONE)

With the ORDER BY P_PRICE clause, your listing will always be arranged from the lowest price to the highest price.

NOTE
Table names were used as prefixes in the preceding SQL command sequence. For example, PRODUCT.V_CODE was used rather than V_CODE. Most current-generation RDBMSs do not require table names to be used as prefixes unless the same attribute name occurs in several of the tables being joined. In this case, V_CODE is used as a foreign key in PRODUCT and as a primary key in VENDOR; therefore, you must use the table names as prefixes in the WHERE clause. In other words, you can write the previous query as:

SELECT      P_DESCRIPT, P_PRICE, V_NAME, V_CONTACT, V_AREACODE, V_PHONE
FROM        PRODUCT, VENDOR
WHERE       PRODUCT.V_CODE = VENDOR.V_CODE;

Naturally, if an attribute name occurs in several places, its origin (table) must be specified. If you fail to provide such a specification, SQL generates an error message to indicate that you have been ambiguous about the attribute's origin.
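The ambiguity error mentioned in the note can be reproduced with a minimal sketch. It uses Python's built-in sqlite3 module and two tiny, made-up tables (the single vendor and product rows are assumptions for illustration). SQLite's exact wording of the message is specific to that engine; other DBMSs phrase it differently.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE VENDOR (V_CODE INTEGER PRIMARY KEY, V_NAME TEXT);
    CREATE TABLE PRODUCT (
        P_CODE TEXT PRIMARY KEY, P_DESCRIPT TEXT, P_PRICE REAL,
        V_CODE INTEGER REFERENCES VENDOR (V_CODE)
    );
    INSERT INTO VENDOR VALUES (1, 'Vendor One');
    INSERT INTO PRODUCT VALUES ('P1', 'Sample widget', 4.50, 1);
    """
)

# V_CODE exists in both tables, so an unqualified reference is ambiguous.
try:
    conn.execute(
        "SELECT V_CODE, P_DESCRIPT FROM PRODUCT, VENDOR "
        "WHERE PRODUCT.V_CODE = VENDOR.V_CODE"
    )
    error_message = None
except sqlite3.OperationalError as exc:
    error_message = str(exc)

# Prefixing the shared attribute with its table name removes the ambiguity.
qualified = conn.execute(
    "SELECT PRODUCT.V_CODE, P_DESCRIPT, V_NAME FROM PRODUCT, VENDOR "
    "WHERE PRODUCT.V_CODE = VENDOR.V_CODE"
).fetchall()

print(error_message)
print(qualified)
```

Attributes that occur in only one of the joined tables, such as P_DESCRIPT and V_NAME here, need no prefix.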

The preceding SQL command sequence joins a row in the PRODUCT table with a row in the VENDOR table in which the V_CODE values of these rows are the same, as indicated in the WHERE clause's condition. Because any vendor can deliver any number of ordered products, the PRODUCT table may contain multiple V_CODE entries for each V_CODE entry in the VENDOR table. In other words, each V_CODE in VENDOR can be matched with many V_CODE rows in PRODUCT.

If you do not specify the WHERE clause, the result will be the Cartesian product of PRODUCT and VENDOR. Because the PRODUCT table contains 16 rows and the VENDOR table contains 11 rows, the Cartesian product would produce a listing of (16 × 11) = 176 rows. (Each row in PRODUCT would be joined to each row in the VENDOR table.)

All of the SQL commands can be used on the joined tables. For example, the following command sequence is quite acceptable in SQL and produces the output shown in Figure 8.29:

SELECT      P_DESCRIPT, P_PRICE, V_NAME, V_CONTACT, V_AREACODE, V_PHONE
FROM        PRODUCT, VENDOR
WHERE       PRODUCT.V_CODE = VENDOR.V_CODE
AND         P_INDATE > '15-Jan-2019';

FIGURE 8.29 An ordered and limited listing after a join (output columns: P_DESCRIPT, P_PRICE, V_NAME, V_CONTACT, V_AREACODE, V_PHONE)
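The difference between the Cartesian product and a join restricted by a WHERE condition can be checked with a small sketch. It uses Python's built-in sqlite3 module; the three vendors and four products are made-up rows (assumptions for illustration, much smaller than the chapter's 11-row VENDOR and 16-row PRODUCT tables).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE VENDOR (V_CODE INTEGER PRIMARY KEY, V_NAME TEXT);
    CREATE TABLE PRODUCT (
        P_CODE TEXT PRIMARY KEY, P_DESCRIPT TEXT, P_PRICE REAL,
        V_CODE INTEGER REFERENCES VENDOR (V_CODE)
    );
    INSERT INTO VENDOR VALUES (1, 'Vendor One'), (2, 'Vendor Two'), (3, 'Vendor Three');
    INSERT INTO PRODUCT VALUES
        ('P1', 'Sample hammer', 9.95, 1),
        ('P2', 'Sample screw', 6.99, 1),
        ('P3', 'Sample drill', 38.95, 2),
        ('P4', 'Sample saw', 14.99, 2);
    """
)

# No join condition: every PRODUCT row is paired with every VENDOR row (4 x 3).
cartesian_count = conn.execute("SELECT COUNT(*) FROM PRODUCT, VENDOR").fetchone()[0]

# Join condition in WHERE: only pairs with matching V_CODE values survive.
joined = conn.execute(
    """
    SELECT P_DESCRIPT, P_PRICE, V_NAME
    FROM PRODUCT, VENDOR
    WHERE PRODUCT.V_CODE = VENDOR.V_CODE
    ORDER BY P_PRICE
    """
).fetchall()

print(cartesian_count)
print(joined)
```

Vendor Three supplies no products, so it does not appear in the joined listing at all.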

NOTE
In Chapter 4, Relational Algebra and Calculus, you learnt that a JOIN is used to combine two relations in a specified way. In SQL, the natural join is used to join tables together. The SQL statement:

SELECT      P_DESCRIPT, P_PRICE, V_NAME, V_CONTACT, V_AREACODE, V_PHONE
FROM        PRODUCT, VENDOR
WHERE       PRODUCT.V_CODE = VENDOR.V_CODE
AND         P_INDATE > '15-Jan-2019';

can be written in relational algebra as:

$$\Pi_{\text{P\_DESCRIPT, P\_PRICE, V\_NAME, V\_CONTACT, V\_AREACODE, V\_PHONE}}\big(\sigma_{\text{P\_INDATE} > \text{'15-Jan-2019'}}(\text{PRODUCT}) \bowtie \text{VENDOR}\big)$$

For more information on JOIN operators, see Section 4.2 in Chapter 4, Relational Algebra and Calculus.

When joining three or more tables, you need to specify a join condition for each pair of tables. The number of join conditions will always be N-1, where N represents the number of tables listed in the FROM clause. For example, if you have three tables, you must have two join conditions; if you have five tables, you must have four join conditions; and so on.

Remember, the join condition will match the foreign key of a table to the primary key of the related table. For example, using Figure 8.1, if you want to list the customer last name, invoice number, invoice date and product descriptions for all invoices for customer 10014, you must type the following:

SELECT      CUS_LNAME, INVOICE.INV_NUMBER, INV_DATE, P_DESCRIPT
FROM        CUSTOMER, INVOICE, LINE, PRODUCT
WHERE       CUSTOMER.CUS_CODE = INVOICE.CUS_CODE
AND         INVOICE.INV_NUMBER = LINE.INV_NUMBER
AND         LINE.P_CODE = PRODUCT.P_CODE
AND         CUSTOMER.CUS_CODE = 10014
ORDER BY    INVOICE.INV_NUMBER;

INV_NUMBER occurs in both INVOICE and LINE, so it is prefixed with a table name in the SELECT and ORDER BY clauses to avoid an ambiguity error.

Finally, be careful not to create circular join conditions. For example, if Table A is related to Table B, Table B is related to Table C and Table C is also related to Table A, create only two join conditions: join A with B and B with C. Do not join C with A!
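The N-1 rule can be illustrated with a four-table join that needs three join conditions. The sketch below uses Python's built-in sqlite3 module; the table layouts are cut down to the columns the query needs, and all rows, including the customer names and the LINE_NUMBER column used for a stable sort, are assumptions made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE CUSTOMER (CUS_CODE INTEGER PRIMARY KEY, CUS_LNAME TEXT);
    CREATE TABLE INVOICE (INV_NUMBER INTEGER PRIMARY KEY, CUS_CODE INTEGER, INV_DATE TEXT);
    CREATE TABLE PRODUCT (P_CODE TEXT PRIMARY KEY, P_DESCRIPT TEXT);
    CREATE TABLE LINE (
        INV_NUMBER INTEGER, LINE_NUMBER INTEGER, P_CODE TEXT,
        PRIMARY KEY (INV_NUMBER, LINE_NUMBER)
    );
    INSERT INTO CUSTOMER VALUES (10014, 'Orlando'), (10011, 'Dunne');
    INSERT INTO INVOICE VALUES
        (1001, 10014, '2019-01-16'),
        (1002, 10011, '2019-01-16'),
        (1004, 10014, '2019-01-17');
    INSERT INTO PRODUCT VALUES
        ('P1', 'Sample hammer'), ('P2', 'Sample saw blade'), ('P3', 'Sample drill');
    INSERT INTO LINE VALUES
        (1001, 1, 'P1'), (1001, 2, 'P2'), (1002, 1, 'P1'), (1004, 1, 'P3');
    """
)

# Four tables in FROM, therefore three join conditions, plus one row filter.
rows = conn.execute(
    """
    SELECT CUS_LNAME, INVOICE.INV_NUMBER, INV_DATE, P_DESCRIPT
    FROM CUSTOMER, INVOICE, LINE, PRODUCT
    WHERE CUSTOMER.CUS_CODE = INVOICE.CUS_CODE
      AND INVOICE.INV_NUMBER = LINE.INV_NUMBER
      AND LINE.P_CODE = PRODUCT.P_CODE
      AND CUSTOMER.CUS_CODE = 10014
    ORDER BY INVOICE.INV_NUMBER, LINE.LINE_NUMBER
    """
).fetchall()

for row in rows:
    print(row)
```

Leaving out any one of the three join conditions would pair unrelated rows and inflate the result, in the same way as a Cartesian product.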

8.8.1 Joining Tables with an Alias

An alias may be used to identify the source table from which the data are taken. The aliases P and V are used to label the PRODUCT and VENDOR tables in the next command sequence. Any legal table name may be used as an alias. (Also notice that there are no table name prefixes because the attribute listing contains no duplicate names in the SELECT statement.)

SELECT      P_DESCRIPT, P_PRICE, V_NAME, V_CONTACT, V_AREACODE, V_PHONE
FROM        PRODUCT P, VENDOR V
WHERE       P.V_CODE = V.V_CODE
ORDER BY    P_PRICE;

8.8.2 Self-Joins

An alias is especially useful when a table must be joined to itself in a recursive query. In order to join a table to itself, a self-join is used. For example, suppose you are working with the EMP table shown in Figure 8.30.

FIGURE 8.30 The contents of the EMP table

EMP_NUM  EMP_TITLE  EMP_LNAME   EMP_FNAME  EMP_INITIAL  EMP_DOB    EMP_HIRE_DATE  EMP_AREACODE  EMP_PHONE  EMP_MGR
100      Mr         Cela        Nkosi      D            15-Jun-52  15-Mar-95      0181          324-5456
101      Ms         Lewis       Rhonda     G            19-Mar-75  25-Apr-96      0181          324-4472   100
102      Mr         Vandam      Rhett                   14-Nov-68  20-Dec-00      7253          675-8993   100
103      Ms         Jones       Anne       M            16-Oct-84  28-Aug-04      0181          898-3456   100
104      Mr         Lange       John       P            08-Nov-81  20-Oct-04      7253          504-4430   105
105      Mr         Williams    Robert     D            14-Mar-85  08-Nov-08      0181          890-3220
106      Mrs        Smith       Jeanine    K            12-Feb-78  05-Jan-99      0181          324-7883   105
107      Mr         Diante      Jorge      D            21-Aug-84  02-Jul-04      0181          890-4567   105
108      Mr         Wiesenbach  Paul       R            14-Feb-76  18-Nov-02      0181          897-4358
109      Mr         Smith       George     K            18-Jun-71  14-Apr-99      7253          504-3339   108
110      Mrs        Genkazi     Leighla    W            19-May-80  01-Dec-00      7253          569-0093   108
111      Mr         Washington  Rupert     E            03-Jan-76  21-Jun-03      0181          890-4925   105
112      Mr         Johnson     Edward     E            14-May-71  01-Dec-93      0181          898-4387   100
113      Ms         Gounden     Melanie    P            15-Sep-80  11-May-09      0181          324-9006   105
114      Ms         Brandon     Marie      G            02-Nov-66  15-Nov-89      7253          882-0845   108
115      Mrs        Saranda     Hermine    R            25-Jul-82  23-Apr-03      0181          324-5505   105
116      Mr         Smith       George     A            08-Nov-75  10-Dec-98      0181          890-2984   108

Using the data in the EMP table, you can generate a list of all employees with their managers' names by joining the EMP table to itself. In that case, you would also use aliases to differentiate the tables. The SQL command sequence would look like this:

SELECT      E.EMP_MGR, M.EMP_LNAME, E.EMP_NUM, E.EMP_LNAME
FROM        EMP E, EMP M
WHERE       E.EMP_MGR = M.EMP_NUM
ORDER BY    E.EMP_MGR;

The output of the above command sequence is shown in Figure 8.31.

FIGURE 8.31 Using an alias to join a table to itself

EMP_MGR  M.EMP_LNAME  EMP_NUM  E.EMP_LNAME
100      Cela         112      Johnson
100      Cela         103      Jones
100      Cela         102      Vandam
100      Cela         101      Lewis
105      Williams     115      Saranda
105      Williams     113      Gounden
105      Williams     111      Washington
105      Williams     107      Diante
105      Williams     106      Smith
105      Williams     104      Lange
108      Wiesenbach   116      Smith
108      Wiesenbach   114      Brandon
108      Wiesenbach   110      Genkazi
108      Wiesenbach   109      Smith
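The self-join can be sketched on a reduced copy of the EMP table. The example below uses Python's built-in sqlite3 module and keeps only three columns and five of the employees from Figure 8.30; the reduced layout and the extra sort on E.EMP_NUM (added to make the row order deterministic) are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE EMP (EMP_NUM INTEGER PRIMARY KEY, EMP_LNAME TEXT, EMP_MGR INTEGER)"
)
# A subset of Figure 8.30; EMP_MGR is NULL for employees without a manager.
conn.executemany(
    "INSERT INTO EMP VALUES (?, ?, ?)",
    [
        (100, "Cela", None),
        (101, "Lewis", 100),
        (104, "Lange", 105),
        (105, "Williams", None),
        (106, "Smith", 105),
    ],
)

# E plays the role of "employee", M the role of "manager"; both are the same table.
rows = conn.execute(
    """
    SELECT E.EMP_MGR, M.EMP_LNAME, E.EMP_NUM, E.EMP_LNAME
    FROM EMP E, EMP M
    WHERE E.EMP_MGR = M.EMP_NUM
    ORDER BY E.EMP_MGR, E.EMP_NUM
    """
).fetchall()

for row in rows:
    print(row)
```

Employees whose EMP_MGR is null (Cela and Williams in this subset) do not appear in the employee column, because a null never satisfies the equality in the join condition.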

NOTE
In Microsoft Access, add AS to the previous SQL command sequence, making it read:

SELECT      E.EMP_MGR, M.EMP_LNAME, E.EMP_NUM, E.EMP_LNAME
FROM        EMP AS E, EMP AS M
WHERE       E.EMP_MGR = M.EMP_NUM
ORDER BY    E.EMP_MGR;

8.8.3 Outer Joins

Figure 8.28 showed the results of joining the PRODUCT and VENDOR tables. If you examine the output, you will note that 14 product rows are listed. If you compare those results with the PRODUCT table in Figure 8.2, you will note that two products are missing. Why? The reason is that there are two products with nulls in the V_CODE attribute. Because there is no matching null value in the VENDOR table's V_CODE attribute, the products do not show up in the final output based on the join. Also, if you examine the VENDOR table in Figure 8.2, you will notice that several vendors have no matching V_CODE in the PRODUCT table. To include those rows in the final join output, you must use an outer join.

There are two types of outer joins: left and right. (See Chapter 4.) Given the contents of the PRODUCT and VENDOR tables, the following left outer join will show all VENDOR rows and all matching PRODUCT rows:

SELECT      P_CODE, VENDOR.V_CODE, V_NAME
FROM        VENDOR LEFT JOIN PRODUCT
ON          VENDOR.V_CODE = PRODUCT.V_CODE;

Figure 8.32 shows the output generated by the left outer join command in Microsoft Access. Both Oracle and MySQL yield the same result, but show the output in a different order.

FIGURE 8.32 The left outer join results

P_CODE     V_CODE  V_NAME
23109-HB   21225   Bryson, Inc.
SM-18277   21225   Bryson, Inc.
           21226   SuperLoo, Inc.
SW-23116   21231   D&E Supply
13-Q2/P2   21344   Jabavu Bros.
14-Q1/L3   21344   Jabavu Bros.
54778-2T   21344   Jabavu Bros.
           22567   Dome Supply
1546-QQ2   23119   Randsets Ltd.
1558-QW1   23119   Randsets Ltd.
           24004   Brackman Bros.
2232/QTY   24288   ORDVA, Inc.
2232/QWE   24288   ORDVA, Inc.
89-WRE-Q   24288   ORDVA, Inc.
           25443   B&K, Inc.
           25501   Damal Supplies
11QER/31   25595   Rubicon Systems
2238/QPD   25595   Rubicon Systems
WR3/TT3    25595   Rubicon Systems

The right outer join will join both tables and show all product rows with all matching vendor rows. The SQL command for the right outer join is:

SELECT      PRODUCT.P_CODE, VENDOR.V_CODE, V_NAME
FROM        VENDOR RIGHT JOIN PRODUCT
ON          VENDOR.V_CODE = PRODUCT.V_CODE;

Figure 8.33 shows the output generated by the right outer join command sequence in Microsoft Access. Again, both Oracle and MySQL yield the same result, but show the output in a different order.

FIGURE 8.33 The right outer join results

P_CODE     V_CODE  V_NAME
23109-HB   21225   Bryson, Inc.
SM-18277   21225   Bryson, Inc.
SW-23116   21231   D&E Supply
13-Q2/P2   21344   Jabavu Bros.
14-Q1/L3   21344   Jabavu Bros.
54778-2T   21344   Jabavu Bros.
1546-QQ2   23119   Randsets Ltd.
1558-QW1   23119   Randsets Ltd.
2232/QTY   24288   ORDVA, Inc.
2232/QWE   24288   ORDVA, Inc.
89-WRE-Q   24288   ORDVA, Inc.
11QER/31   25595   Rubicon Systems
2238/QPD   25595   Rubicon Systems
WR3/TT3    25595   Rubicon Systems
23114-AA
PVC23DRT
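Both kinds of outer join can be sketched on a few rows taken from Figures 8.32 and 8.33. The example uses Python's built-in sqlite3 module with reduced table layouts (an assumption for illustration). Because older SQLite releases do not accept the RIGHT JOIN keyword, the right outer join is expressed here as the equivalent left outer join with the two tables swapped; in a DBMS that supports RIGHT JOIN, the chapter's syntax returns the same rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE VENDOR (V_CODE INTEGER PRIMARY KEY, V_NAME TEXT);
    CREATE TABLE PRODUCT (P_CODE TEXT PRIMARY KEY, V_CODE INTEGER);
    INSERT INTO VENDOR VALUES
        (21225, 'Bryson, Inc.'), (21226, 'SuperLoo, Inc.'), (21344, 'Jabavu Bros.');
    INSERT INTO PRODUCT VALUES
        ('23109-HB', 21225), ('13-Q2/P2', 21344), ('23114-AA', NULL);
    """
)

# Left outer join: every VENDOR row, with matching PRODUCT rows where they exist.
left_rows = conn.execute(
    """
    SELECT P_CODE, VENDOR.V_CODE, V_NAME
    FROM VENDOR LEFT JOIN PRODUCT
    ON VENDOR.V_CODE = PRODUCT.V_CODE
    ORDER BY VENDOR.V_CODE
    """
).fetchall()

# Equivalent of VENDOR RIGHT JOIN PRODUCT: every PRODUCT row, matched or not.
right_rows = conn.execute(
    """
    SELECT PRODUCT.P_CODE, VENDOR.V_CODE, V_NAME
    FROM PRODUCT LEFT JOIN VENDOR
    ON VENDOR.V_CODE = PRODUCT.V_CODE
    ORDER BY PRODUCT.P_CODE
    """
).fetchall()

print(left_rows)
print(right_rows)
```

The vendor without products appears only in the first result, and the product with a null V_CODE appears only in the second; the unmatched side is filled with nulls (None in Python).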

In Chapter 9, Procedural Language SQL and Advanced SQL, you will learn more about joins and how to use the latest ANSI SQL standard syntax.

Online Content
For a complete walk-through example of converting an ER model into a database structure and using SQL commands to create tables, see Appendix D, Converting an ER Model into a Database Structure, on the online platform for this book.

SUMMARY

The SQL commands can be divided into two overall categories: data definition language (DDL) commands and data manipulation language (DML) commands.

The ANSI standard data types are supported by all RDBMS vendors in different ways. The basic data types are NUMBER, INTEGER, CHAR, VARCHAR and DATE.

The basic data definition commands allow you to create tables, indexes and views. Many SQL constraints can be used with columns. The commands are CREATE TABLE, CREATE INDEX, CREATE VIEW, ALTER TABLE, DROP TABLE, DROP VIEW and DROP INDEX.

DML commands allow you to add, modify and delete rows from tables. The basic DML commands are SELECT, INSERT, UPDATE, DELETE, COMMIT and ROLLBACK.

The INSERT command is used to add new rows to tables. The UPDATE command is used to modify data values in existing rows of a table. The DELETE command is used to delete rows from tables. The COMMIT and ROLLBACK commands are used to permanently save or roll back changes made to the rows. Once you COMMIT the changes, you cannot undo them with a ROLLBACK command.

The SELECT statement is the main data retrieval command in SQL. A SELECT statement has the following syntax:

SELECT      columnlist
FROM        tablelist
[WHERE      conditionlist ]
[GROUP BY   columnlist ]
[HAVING     conditionlist ]
[ORDER BY   columnlist [ASC | DESC] ] ;

The column list represents one or more column names separated by commas. The column list may also include computed columns, aliases and aggregate functions. A computed column is represented by an expression or formula (for example, P_PRICE * P_QOH). The FROM clause contains a list of table names or view names.

The WHERE clause can be used with the SELECT, UPDATE and DELETE statements to restrict the rows affected by the DML command. The condition list represents one or more conditional expressions separated by logical operators (AND/OR/NOT). The conditional expression can contain any comparison operators (=, >, <, >=, <=, <>) as well as special operators (BETWEEN, IS NULL, LIKE, IN and EXISTS).


Aggregate functions (COUNT, MIN, MAX, SUM, AVG) are special functions that perform arithmetic computations over a set of rows. The aggregate functions are usually used in conjunction with the GROUP BY clause to group the output of aggregate computations by one or more attributes. The HAVING clause is used to restrict the output of the GROUP BY clause by selecting only the aggregate rows that match a given condition.

The ORDER BY clause is used to sort the output of a SELECT statement. The ORDER BY clause can sort by one or more columns and use either ascending or descending order.

You can join the output of multiple tables with the SELECT statement. The join operation is performed every time you specify two or more tables in the FROM clause and use a join condition in the WHERE clause to match the foreign key of one table to the primary key of the related table. If you do not specify a join condition, the DBMS automatically performs a Cartesian product of the tables you specify in the FROM clause.

The natural join uses the join condition to match only rows with equal values in the specified columns. You could also do a right outer join and left outer join to select the rows that have no matching values in the other related table.

KEY TERMS

alias
ALTER TABLE
AND
authentication
AVG
base tables
BETWEEN
Boolean algebra
cascading order sequence
COMMIT
COUNT
CREATE INDEX
CREATE TABLE
CREATE VIEW
DELETE
DISTINCT
DROP INDEX
DROP TABLE
EXISTS
GROUP BY
HAVING
IN
INSERT
IS NULL
LIKE
MAX
MIN
NOT
OR
ORDER BY
recursive query
reserved words
ROLLBACK
rules of precedence
schema
SELECT
subquery
SUM
UPDATE
view
wildcard character


Online Content
Answers to selected Review Questions and Problems for this chapter are available on the online platform for this book.

REVIEW QUESTIONS

Online Content: The Review Questions in this chapter are based on the 'Ch08_Review' database located on the online platform for this book. This database is stored in Microsoft Access format. If you use another DBMS such as Oracle, SQL Server or MySQL, use its import utilities to move the Access database contents.

The Ch08_Review database stores data for a consulting company that tracks all charges to projects. The charges are based on the hours each employee works on each project. The structure and contents of the Ch08_Review database are shown in Figure Q8.1.

FIGURE Q8.1 The Ch08_Review database

Table name: EMPLOYEE

EMP_NUM  EMP_LNAME   EMP_FNAME    EMP_INITIAL  EMP_HIREDATE  JOB_CODE  EMP_YEARS
101      News        John         G            08-Nov-10     502       4
102      Moonsamy    Kavyara      H            12-Jul-99     501       15
103      Baloyi      Mzwandile    E            01-Dec-06     503       8
104      Maseki      Noxolo       K            15-Nov-97     501       17
105      Johnson     Alice        K            01-Feb-03     502       12
106      Smithfield  William                   22-Jun-14     500       0
107      Alonzo      Maria        D            10-Oct-03     500       11
108      Khan        Krishshanth  B            22-Aug-01     501       13
109      Smith       Larry        W            18-Jul-07     501       7
110      Olenko      Gerald       A            11-Dec-05     505       9
111      Wabash      Geoff        B            04-Apr-01     506       14
112      Smithson    Darlene      M            23-Oct-04     507       10
113      Joenbrood   Delbert      K            15-Nov-06     508       8
114      Jones       Annelise                  20-Aug-03     508       11
115      Bawangi     Travis       B            25-Jan-02     501       13
116      Pratt       Gerald       L            05-Mar-07     510       8
117      Williamson  Angie        H            19-Jun-06     509       8
118      Frommer     James        J            04-Jan-15     510       0

Table name: ASSIGNMENT

ASSIGN_NUM  ASSIGN_DATE  PROJ_NUM  EMP_NUM  ASSIGN_JOB  ASSIGN_CHG_HR  ASSIGN_HOURS  ASSIGN_CHARGE
1001        22-Mar-19    18        103      503         84.50          3.50          295.75
1002        22-Mar-19    22        117      509         34.55          4.20          145.11
1003        22-Mar-19    18        117      509         34.55          2.00          69.10
1004        22-Mar-19    18        103      503         84.50          5.90          498.55
1005        22-Mar-19    25        108      501         96.75          2.20          212.85
1006        22-Mar-19    22        104      501         96.75          4.20          406.35
1007        22-Mar-19    25        113      508         50.75          3.80          192.85
1008        22-Mar-19    18        103      503         84.50          0.90          76.05
1009        23-Mar-19    15        115      501         96.75          5.60          541.80
1010        23-Mar-19    15        117      509         34.55          2.40          82.92
1011        23-Mar-19    25        105      502         105.00         4.30          451.50
1012        23-Mar-19    18        108      501         96.75          3.40          328.95
1013        23-Mar-19    25        115      501         96.75          2.00          193.50
1014        23-Mar-19    22        104      501         96.75          2.80          270.90
1015        23-Mar-19    15        103      503         84.50          6.10          515.45
1016        23-Mar-19    22        105      502         105.00         4.70          493.50
1017        23-Mar-19    18        117      509         34.55          3.80          131.29
1018        23-Mar-19    25        117      509         34.55          2.20          76.01
1019        24-Mar-19    25        104      501         110.50         4.90          541.45
1020        24-Mar-19    15        101      502         125.00         3.10          387.50
1021        24-Mar-19    22        108      501         110.50         2.70          298.35
1022        24-Mar-19    22        115      501         110.50         4.90          541.45
1023        24-Mar-19    22        105      502         125.00         3.50          437.50
1024        24-Mar-19    15        103      503         84.50          3.30          278.85
1025        24-Mar-19    18        117      509         34.55          4.20          145.11

Table name: JOB

JOB_CODE  JOB_DESCRIPTION        JOB_CHG_HOUR  JOB_LAST_UPDATE
500       Programmer             35.75         20-Nov-18
501       Systems Analyst        96.75         20-Nov-18
502       Database Designer      125.00        24-Mar-19
503       Electrical Engineer    84.50         20-Nov-19
504       Mechanical Engineer    67.90         20-Nov-19
505       Civil Engineer         55.78         20-Nov-19
506       Clerical Support       26.87         20-Nov-19
507       DSS Analyst            45.95         20-Nov-19
508       Applications Designer  48.10         24-Mar-19
509       Bio Technician         34.55         20-Nov-18
510       General Support        18.36         20-Nov-18

Table name: PROJECT

PROJ_NUM  PROJ_NAME     PROJ_VALUE  PROJ_BALANCE  EMP_NUM
15        Evergreen     1453500.00  1002350.00    103
18        Amber Wave    3500500.00  2110346.00    108
22        Rolling Tide  805000.00   500345.20     102
25        Starflight    2650500.00  2309880.00    107

As you examine Figure Q8.1, note that the ASSIGNMENT table stores the JOB_CHG_HOUR values as an attribute (ASSIGN_CHG_HR) to maintain the historical accuracy of the data. The JOB_CHG_HOUR values are likely to change over time. In fact, a JOB_CHG_HOUR change is reflected in the ASSIGNMENT table. And, naturally, the employee primary job assignment may change, so the ASSIGN_JOB is also stored. Because those attributes are required to maintain the historical accuracy of the data, they are not redundant.

Given the structure and contents of the Ch08_Review database shown in Figure Q8.1, use SQL commands to answer questions 1 to 25.

1 Write the SQL code that will create the table structure for a table named EMP_1. This table is a subset of the EMPLOYEE table. The basic EMP_1 table structure is summarised in the table below. (Note that the JOB_CODE is the FK to JOB.)

Attribute (Field) Name  Data Declaration
EMP_NUM                 CHAR(3)
EMP_LNAME               VARCHAR(15)
EMP_FNAME               VARCHAR(15)
EMP_INITIAL             CHAR(1)
EMP_HIREDATE            DATE
JOB_CODE                CHAR(3)

2 Having created the table structure in Question 1, write the SQL code to enter the first two rows for the table shown in Figure Q8.2.

FIGURE Q8.2 The contents of the EMP_1 table

EMP_NUM  EMP_LNAME   EMP_FNAME    EMP_INITIAL  EMP_HIREDATE  JOB_CODE
101      News        John         G            08-Nov-10     502
102      Moonsamy    Kavyara      H            12-Jul-99     501
103      Baloyi      Mzwandile    E            01-Dec-06     500
104      Maseki      Noxolo       K            15-Nov-07     501
105      Johnson     Alice        K            01-Feb-03     502
106      Smithfield  William                   22-Jun-14     500
107      Alonzo      Maria        D            10-Oct-03     500
108      Khan        Krishshanth  B            22-Aug-01     501
109      Smith       Larry        W            18-Jul-07     501

3 Assuming the data shown in the EMP_1 table have been entered, write the SQL code that will list all attributes for a job code of 502.

4 Write the SQL code that will save the changes made to the EMP_1 table.
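One way Questions 1 to 4 might be approached is sketched below. It is an illustration under stated assumptions, not a model answer: the SQL is run through Python's built-in sqlite3 module against an in-memory database rather than Access or Oracle, dates are stored as ISO-8601 text, and EMP_NUM is treated as the primary key, which the question does not state. The one-column JOB stub table is hypothetical and exists only so that the foreign key has a target.

```python
import sqlite3

# In-memory SQLite database as a stand-in for the reader's DBMS (assumption).
conn = sqlite3.connect(":memory:")

# Question 1: create the EMP_1 structure from the declaration table.
# PRIMARY KEY and the JOB stub are assumptions added for illustration.
conn.executescript("""
CREATE TABLE JOB (
    JOB_CODE CHAR(3) PRIMARY KEY
);
INSERT INTO JOB VALUES ('501');
INSERT INTO JOB VALUES ('502');
CREATE TABLE EMP_1 (
    EMP_NUM      CHAR(3) PRIMARY KEY,
    EMP_LNAME    VARCHAR(15),
    EMP_FNAME    VARCHAR(15),
    EMP_INITIAL  CHAR(1),
    EMP_HIREDATE DATE,
    JOB_CODE     CHAR(3),
    FOREIGN KEY (JOB_CODE) REFERENCES JOB (JOB_CODE)
);
""")

# Question 2: enter the first two rows of Figure Q8.2.
conn.execute(
    "INSERT INTO EMP_1 VALUES ('101', 'News', 'John', 'G', '2010-11-08', '502')")
conn.execute(
    "INSERT INTO EMP_1 VALUES ('102', 'Moonsamy', 'Kavyara', 'H', '1999-07-12', '501')")

# Question 4: save (commit) the changes.
conn.commit()

# Question 3: list all attributes for a job code of 502.
rows_502 = conn.execute(
    "SELECT * FROM EMP_1 WHERE JOB_CODE = '502'").fetchall()
print(rows_502)
```

In an interactive SQL tool the same CREATE TABLE, INSERT, SELECT and COMMIT statements would be typed directly. Date literal formats and data type names vary between DBMSs, so the declarations may need adjusting.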

5 Write the SQL code to change the job code to 501 for the person whose employee number (EMP_NUM) is 107. After you have completed the task, examine the results, and then reset the job code to its original value.

6 Write the SQL code to delete the row for the person named William Smithfield, who was hired on 22 June, 2014, and whose job code classification is 500. (Hint: Use logical operators to include all of the information given in this problem.)

7 Write the SQL code that will restore the data to its original status; that is, the table should contain the data that existed before you made the changes in Questions 5 and 6.

8 Write the SQL code to create a copy of EMP_1, naming the copy EMP_2. Then write the SQL code that will add the attributes EMP_PCT and PROJ_NUM to its structure. The EMP_PCT is the bonus percentage to be paid to each employee. The new attribute characteristics are:

EMP_PCT   NUMBER(4,2)
PROJ_NUM  CHAR(3)

9 Write the SQL code to change the EMP_PCT value to 3.85 for the person whose employee number (EMP_NUM) is 103. Next, write the SQL command sequences to change the EMP_PCT values as shown in Figure Q8.3.

FIGURE Q8.3 The contents of the EMP_2 table

EMP_NUM  EMP_LNAME   EMP_FNAME    EMP_INITIAL  EMP_HIREDATE  JOB_CODE  EMP_PCT  PROJ_NUM
101      News        John         G            08-Nov-10     502       5.00
102      Moonsamy    Kavyara      H            12-Jul-99     501       8.00
103      Baloyi      Mzwandile    E            01-Dec-06     500       3.85
104      Maseki      Noxolo       K            15-Nov-97     501       10.00
105      Johnson     Alice        K            01-Feb-03     502       5.00
106      Smithfield  William                   22-Jun-14     500       6.20
107      Alonzo      Maria        D            10-Oct-03     500       5.15
108      Khan        Krishshanth  B            22-Aug-01     501       10.00
109      Smith       Larry        W            18-Jul-07     501       2.00

10 Using a single command sequence, write the SQL code that will change the project number (PROJ_NUM) to 18 for all employees whose job classification (JOB_CODE) is 500.

11 Using a single command sequence, write the SQL code that will change the project number (PROJ_NUM) to 25 for all employees whose job classification (JOB_CODE) is 502 or higher. When you finish questions 10 and 11, the EMP_2 table will contain the data shown in Figure Q8.4. (You may assume that the table has been saved again at this point.)

FIGURE Q8.4 The contents of the EMP_2 table after the modification

EMP_NUM  EMP_LNAME   EMP_FNAME    EMP_INITIAL  EMP_HIREDATE  JOB_CODE  EMP_PCT  PROJ_NUM
101      News        John         G            08-Nov-10     502       5.00     25
102      Moonsamy    Kavyara      H            12-Jul-99     501       8.00
103      Baloyi      Mzwandile    E            01-Dec-06     500       3.85     18
104      Maseki      Noxolo       K            15-Nov-97     501       10.00
105      Johnson     Alice        K            01-Feb-03     502       5.00     25
106      Smithfield  William                   22-Jun-14     500       6.20     18
107      Alonzo      Maria        D            10-Oct-03     500       5.15     18
108      Khan        Krishshanth  B            22-Aug-01     501       10.00
109      Smith       Larry        W            18-Jul-07     501       2.00
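The update, delete, rollback and table-copy steps in Questions 5 to 11 can be sketched as follows. This is a hedged illustration, not the book's solution: it uses Python's built-in sqlite3 module with an in-memory database, stores dates as ISO-8601 text, and declares EMP_PCT as NUMERIC(4,2) because SQLite has no NUMBER type. The rows are those of Figure Q8.2, and CREATE TABLE ... AS SELECT is only one of several ways a DBMS may copy a table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE EMP_1 (
        EMP_NUM CHAR(3) PRIMARY KEY, EMP_LNAME VARCHAR(15),
        EMP_FNAME VARCHAR(15), EMP_INITIAL CHAR(1),
        EMP_HIREDATE DATE, JOB_CODE CHAR(3))""")
emp_rows = [  # Figure Q8.2, with dates rewritten as ISO text (assumption)
    ("101", "News", "John", "G", "2010-11-08", "502"),
    ("102", "Moonsamy", "Kavyara", "H", "1999-07-12", "501"),
    ("103", "Baloyi", "Mzwandile", "E", "2006-12-01", "500"),
    ("104", "Maseki", "Noxolo", "K", "2007-11-15", "501"),
    ("105", "Johnson", "Alice", "K", "2003-02-01", "502"),
    ("106", "Smithfield", "William", None, "2014-06-22", "500"),
    ("107", "Alonzo", "Maria", "D", "2003-10-10", "500"),
    ("108", "Khan", "Krishshanth", "B", "2001-08-22", "501"),
    ("109", "Smith", "Larry", "W", "2007-07-18", "501"),
]
conn.executemany("INSERT INTO EMP_1 VALUES (?, ?, ?, ?, ?, ?)", emp_rows)
conn.commit()


def scalar(sql):
    """Return the single value produced by a one-row, one-column query."""
    return conn.execute(sql).fetchall()[0][0]


# Question 5: change the job code for employee 107.
conn.execute("UPDATE EMP_1 SET JOB_CODE = '501' WHERE EMP_NUM = '107'")

# Question 6: delete William Smithfield, using AND to combine every condition.
conn.execute("""
    DELETE FROM EMP_1
    WHERE EMP_LNAME = 'Smithfield' AND EMP_FNAME = 'William'
      AND EMP_HIREDATE = '2014-06-22' AND JOB_CODE = '500'""")
rows_after_changes = scalar("SELECT COUNT(*) FROM EMP_1")

# Question 7: undo the uncommitted changes made in Questions 5 and 6.
conn.rollback()
rows_after_rollback = scalar("SELECT COUNT(*) FROM EMP_1")

# Question 8: copy EMP_1 to EMP_2, then add the two new attributes.
conn.execute("CREATE TABLE EMP_2 AS SELECT * FROM EMP_1")
conn.execute("ALTER TABLE EMP_2 ADD COLUMN EMP_PCT NUMERIC(4,2)")
conn.execute("ALTER TABLE EMP_2 ADD COLUMN PROJ_NUM CHAR(3)")

# Questions 9 to 11: one UPDATE per condition.
conn.execute("UPDATE EMP_2 SET EMP_PCT = 3.85 WHERE EMP_NUM = '103'")
conn.execute("UPDATE EMP_2 SET PROJ_NUM = '18' WHERE JOB_CODE = '500'")
conn.execute("UPDATE EMP_2 SET PROJ_NUM = '25' WHERE JOB_CODE >= '502'")
conn.commit()

proj_by_emp = dict(conn.execute("SELECT EMP_NUM, PROJ_NUM FROM EMP_2"))
print(rows_after_changes, rows_after_rollback, proj_by_emp)
```

The rollback only works because the changes in Questions 5 and 6 were never committed. Once a COMMIT has been issued, the earlier values would have to be restored with further UPDATE and INSERT statements.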

12 Write the SQL code that will change the PROJ_NUM to 14 for those employees who were hired before 1 January 2004, and whose job code is at least 501. (You may assume that the table will be restored to its original condition preceding this question.)

13 Write the two SQL command sequences required to:
a Create a temporary table named TEMP_1 whose structure is composed of the EMP_2 attributes EMP_NUM and EMP_PCT.
b Copy the matching EMP_2 values into the TEMP_1 table.

14 Write the SQL command that will delete the newly created TEMP_1 table from the database.

15 Write the SQL code required to list all employees whose last names start with Smith. In other words, the rows for both Smith and Smithfield should be included in the listing. Assume case sensitivity.

16 Using the EMPLOYEE, JOB, and PROJECT tables in the Ch08_Review database (see Figure Q8.1), write the SQL code that will produce the results shown in Figure Q8.5.

FIGURE Q8.5 The query results for Question 16

PROJ_NAME     PROJ_VALUE  PROJ_BALANCE  EMP_LNAME  EMP_FNAME    EMP_INITIAL  JOB_CODE  JOB_DESCRIPTION  JOB_CHG_HOUR
Rolling Tide  805000.00   500345.20     Moonsamy   Kavyara      H            501       Systems Analyst  96.75
Evergreen     1453500.00  1002350.00    Baloyi     Mzwandile    E            500       Programmer       35.75
Starflight    2650500.00  2309880.00    Alonzo     Maria        D            500       Programmer       35.75
Amber Wave    3500500.00  2110346.00    Khan       Krishshanth  B            501       Systems Analyst  96.75
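Questions 12 to 15 combine a compound WHERE condition, a helper table and pattern matching. The sketch below shows one possible reading of them, again using Python's built-in sqlite3 module as an assumed stand-in DBMS. The EMP_2 table is cut down to the columns the questions need, with values taken from Figure Q8.3 and dates stored as ISO-8601 text so that a plain string comparison orders them correctly.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE EMP_2 (
        EMP_NUM CHAR(3) PRIMARY KEY, EMP_LNAME VARCHAR(15),
        EMP_HIREDATE DATE, JOB_CODE CHAR(3),
        EMP_PCT NUMERIC(4,2), PROJ_NUM CHAR(3))""")
conn.executemany(
    "INSERT INTO EMP_2 (EMP_NUM, EMP_LNAME, EMP_HIREDATE, JOB_CODE, EMP_PCT) "
    "VALUES (?, ?, ?, ?, ?)",
    [
        ("101", "News", "2010-11-08", "502", 5.00),
        ("102", "Moonsamy", "1999-07-12", "501", 8.00),
        ("103", "Baloyi", "2006-12-01", "500", 3.85),
        ("104", "Maseki", "1997-11-15", "501", 10.00),
        ("105", "Johnson", "2003-02-01", "502", 5.00),
        ("106", "Smithfield", "2014-06-22", "500", 6.20),
        ("107", "Alonzo", "2003-10-10", "500", 5.15),
        ("108", "Khan", "2001-08-22", "501", 10.00),
        ("109", "Smith", "2007-07-18", "501", 2.00),
    ])
conn.commit()

# Question 12: both conditions must hold, so they are joined with AND.
conn.execute("""
    UPDATE EMP_2 SET PROJ_NUM = '14'
    WHERE EMP_HIREDATE < '2004-01-01' AND JOB_CODE >= '501'""")
moved_to_14 = [row[0] for row in conn.execute(
    "SELECT EMP_NUM FROM EMP_2 WHERE PROJ_NUM = '14' ORDER BY EMP_NUM")]

# Question 13a: create the TEMP_1 structure.
conn.execute("CREATE TABLE TEMP_1 (EMP_NUM CHAR(3), EMP_PCT NUMERIC(4,2))")
# Question 13b: copy the matching EMP_2 values with INSERT ... SELECT.
conn.execute("INSERT INTO TEMP_1 SELECT EMP_NUM, EMP_PCT FROM EMP_2")
temp_count = conn.execute("SELECT COUNT(*) FROM TEMP_1").fetchall()[0][0]
conn.commit()

# Question 14: remove the table again.
conn.execute("DROP TABLE TEMP_1")
temp_left = conn.execute(
    "SELECT name FROM sqlite_master "
    "WHERE type = 'table' AND name = 'TEMP_1'").fetchall()

# Question 15: % matches any run of characters after 'Smith'.
smiths = conn.execute(
    "SELECT EMP_NUM, EMP_LNAME FROM EMP_2 "
    "WHERE EMP_LNAME LIKE 'Smith%' ORDER BY EMP_NUM").fetchall()
print(moved_to_14, temp_count, smiths)
```

Whether LIKE distinguishes upper and lower case depends on the DBMS and its settings. SQLite, for instance, ignores case for ASCII letters by default, so the case-sensitivity assumption in Question 15 should be checked against the product you use.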

17 Write the SQL code that produces a virtual table named REP_1, containing the same information that was shown in Question 16.

18 Write the SQL code to find the average bonus percentage in the EMP_2 table you created in Question 8.

19 Write the SQL code that produces a listing for the data in the EMP_2 table in ascending order by the bonus percentage.

20 Write the SQL code that will list only the different project numbers found in the EMP_2 table.

21 Write the SQL code to calculate the ASSIGN_CHARGE values in the ASSIGNMENT table in the Ch08_Review database. (See Figure Q8.1.) Note that ASSIGN_CHARGE is a derived attribute that is calculated by multiplying ASSIGN_CHG_HR by ASSIGN_HOURS.

22 Using the data in the ASSIGNMENT table, write the SQL code that will yield the total number of hours worked for each employee and the total charges stemming from those hours worked. The results of running that query are shown in Figure Q8.6.

FIGURE Q8.6 Total hours and charges by employee

EMP_NUM  EMP_LNAME   SumOfASSIGN_HOURS  SumOfASSIGN_CHARGE
101      News        3.1                387.50
103      Baloyi      19.7               1664.65
104      Maseki      11.9               1218.70
105      Johnson     12.5               1382.50
108      Khan        8.3                840.15
113      Joenbrood   3.8                192.85
115      Bawangi     12.5               1276.75
117      Williamson  18.8               649.54

23 Write a query to produce the total number of hours and charges for each of the projects represented in the ASSIGNMENT table. The output is shown in Figure Q8.7.

FIGURE Q8.7 Total hours and charges by project

PROJ_NUM  SumOfASSIGN_HOURS  SumOfASSIGN_CHARGE
15        20.5               1806.52
18        23.7               1544.80
22        27                 2593.16
25        19.4               1668.16

24 Write the SQL code to generate the total hours worked and the total charges made by all employees. The results are shown in Figure Q8.8. (Hint: This is a nested query. If you use Microsoft Access, you can generate the result by using the query output shown in Figure Q8.6 as the basis for the query that will produce the output shown in Figure Q8.8.)

FIGURE Q8.8 Total hours and charges, all employees

SumOfSumOfASSIGN_HOURS  SumOfSumOfASSIGN_CHARGE
90.6                    7612.64

25 Write the SQL code to generate the total hours worked and the total charges made to all projects. The results should be the same as those shown in Figure Q8.8. (Hint: This is a nested query. If you use Microsoft Access, you can generate the result by using the query output as the basis for this query.)
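The derived attribute and the grouped totals asked for in Questions 21 to 25 can be sketched as shown below. The example is a hedged illustration that runs the SQL through Python's built-in sqlite3 module. It loads only the ASSIGNMENT columns needed for the arithmetic (values from Figure Q8.1), the column aliases are hypothetical, and ROUND is applied so that binary floating-point noise does not obscure the comparison with Figures Q8.6 to Q8.8.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE ASSIGNMENT (
        ASSIGN_NUM INTEGER PRIMARY KEY, PROJ_NUM INTEGER, EMP_NUM INTEGER,
        ASSIGN_CHG_HR REAL, ASSIGN_HOURS REAL)""")
conn.executemany("INSERT INTO ASSIGNMENT VALUES (?, ?, ?, ?, ?)", [
    (1001, 18, 103, 84.50, 3.5), (1002, 22, 117, 34.55, 4.2),
    (1003, 18, 117, 34.55, 2.0), (1004, 18, 103, 84.50, 5.9),
    (1005, 25, 108, 96.75, 2.2), (1006, 22, 104, 96.75, 4.2),
    (1007, 25, 113, 50.75, 3.8), (1008, 18, 103, 84.50, 0.9),
    (1009, 15, 115, 96.75, 5.6), (1010, 15, 117, 34.55, 2.4),
    (1011, 25, 105, 105.00, 4.3), (1012, 18, 108, 96.75, 3.4),
    (1013, 25, 115, 96.75, 2.0), (1014, 22, 104, 96.75, 2.8),
    (1015, 15, 103, 84.50, 6.1), (1016, 22, 105, 105.00, 4.7),
    (1017, 18, 117, 34.55, 3.8), (1018, 25, 117, 34.55, 2.2),
    (1019, 25, 104, 110.50, 4.9), (1020, 15, 101, 125.00, 3.1),
    (1021, 22, 108, 110.50, 2.7), (1022, 22, 115, 110.50, 4.9),
    (1023, 22, 105, 125.00, 3.5), (1024, 15, 103, 84.50, 3.3),
    (1025, 18, 117, 34.55, 4.2),
])
conn.commit()

# Question 21: ASSIGN_CHARGE is derived as rate multiplied by hours.
charges = dict(conn.execute("""
    SELECT ASSIGN_NUM, ROUND(ASSIGN_CHG_HR * ASSIGN_HOURS, 2)
    FROM ASSIGNMENT"""))

# Question 22: one output row per employee.
by_employee = conn.execute("""
    SELECT EMP_NUM,
           ROUND(SUM(ASSIGN_HOURS), 1)                 AS TOTAL_HOURS,
           ROUND(SUM(ASSIGN_CHG_HR * ASSIGN_HOURS), 2) AS TOTAL_CHARGE
    FROM ASSIGNMENT
    GROUP BY EMP_NUM
    ORDER BY EMP_NUM""").fetchall()

# Question 25: a nested query that sums the per-project totals of Question 23.
grand_total = conn.execute("""
    SELECT ROUND(SUM(PROJ_HOURS), 1), ROUND(SUM(PROJ_CHARGE), 2)
    FROM (SELECT PROJ_NUM,
                 SUM(ASSIGN_HOURS)                 AS PROJ_HOURS,
                 SUM(ASSIGN_CHG_HR * ASSIGN_HOURS) AS PROJ_CHARGE
          FROM ASSIGNMENT
          GROUP BY PROJ_NUM) AS PER_PROJECT""").fetchall()[0]
print(by_employee)
print(grand_total)
```

Grouping by EMP_NUM in the inner query instead of PROJ_NUM gives the Question 24 variant. The grand total is the same either way, because every assignment row belongs to exactly one employee and one project.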

26 Explain why it would be preferable to use a DATE data type to store date data instead of a character data type.

27 Explain why the following command would create an error and which changes could be made to fix the error:

SELECT V_CODE, SUM(P_QOH) FROM PRODUCT;

28 Explain the difference between an ORDER BY clause and a GROUP BY clause.

29 Explain why the following two commands produce different results:

SELECT DISTINCT COUNT (V_CODE) FROM PRODUCT;
SELECT COUNT (DISTINCT V_CODE) FROM PRODUCT;

30 What is the difference between the COUNT aggregate function and the SUM aggregate function?

31 In a SELECT query, what is the difference between a WHERE clause and a HAVING clause?

32 Rewrite the following WHERE clause without the use of the IN operator:

WHERE V_COUNTRY IN ('UK', 'SA', 'USA')
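The behaviour behind Questions 27, 29 and 32 can be explored with a small experiment such as the one below. Everything about the data is an assumption made for illustration: the rows are invented, the P_CODE column and the VENDOR table name are hypothetical, and Python's built-in sqlite3 module stands in for the DBMS. Note that SQLite is more lenient than most products and accepts the ungrouped query from Question 27 without raising an error, so only the corrected, grouped form is run here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE PRODUCT (P_CODE TEXT PRIMARY KEY, V_CODE INTEGER, P_QOH INTEGER);
INSERT INTO PRODUCT VALUES ('P1', 21344, 8);
INSERT INTO PRODUCT VALUES ('P2', 21344, 12);
INSERT INTO PRODUCT VALUES ('P3', 23119, 5);
INSERT INTO PRODUCT VALUES ('P4', NULL, 7);
INSERT INTO PRODUCT VALUES ('P5', 23119, 3);
CREATE TABLE VENDOR (V_CODE INTEGER PRIMARY KEY, V_COUNTRY TEXT);
INSERT INTO VENDOR VALUES (21344, 'UK');
INSERT INTO VENDOR VALUES (23119, 'SA');
INSERT INTO VENDOR VALUES (24288, 'FR');
INSERT INTO VENDOR VALUES (25595, 'USA');
""")

# Question 27: a non-aggregated column listed beside SUM() needs GROUP BY.
vendor_totals = conn.execute("""
    SELECT V_CODE, SUM(P_QOH)
    FROM PRODUCT
    GROUP BY V_CODE
    ORDER BY V_CODE""").fetchall()

# Question 29: DISTINCT outside COUNT removes duplicate result rows (there is
# only one row), while DISTINCT inside COUNT removes duplicate V_CODE values
# before they are counted. Neither form counts NULLs.
count_then_distinct = conn.execute(
    "SELECT DISTINCT COUNT(V_CODE) FROM PRODUCT").fetchall()[0][0]
distinct_then_count = conn.execute(
    "SELECT COUNT(DISTINCT V_CODE) FROM PRODUCT").fetchall()[0][0]

# Question 32: IN is shorthand for a chain of OR-ed equality tests.
with_in = conn.execute("""
    SELECT V_CODE FROM VENDOR
    WHERE V_COUNTRY IN ('UK', 'SA', 'USA')
    ORDER BY V_CODE""").fetchall()
with_or = conn.execute("""
    SELECT V_CODE FROM VENDOR
    WHERE V_COUNTRY = 'UK' OR V_COUNTRY = 'SA' OR V_COUNTRY = 'USA'
    ORDER BY V_CODE""").fetchall()
print(vendor_totals, count_then_distinct, distinct_then_count, with_in == with_or)
```

With these invented rows the first COUNT reports four non-null vendor codes and the second reports two different ones, which is the kind of gap Question 29 asks you to explain.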

PROBLEMS

Online Content: Problems 1 to 15 are based on the 'Ch08_AviaCo' database located on the online platform for this book. This database is stored in Microsoft Access format. If you use another DBMS such as Oracle, SQL Server or MySQL, use its import utilities to move the Access database contents.

Before you attempt to write any SQL queries, familiarise yourself with the Ch08_AviaCo database structure and contents shown in Figure P8.1. Although the relational schema does not show optionalities, keep in mind that all pilots are employees, but not all employees are flight crew members. (Although, in this database, the crew member assignments all involve pilots and copilots, the design is sufficiently flexible to accommodate crew member assignments, such as loadmasters and flight attendants, of people who are not pilots. That's why the relationship between CHARTER and EMPLOYEE is implemented through CREW.) Note also that this design implementation does not include multivalued attributes. For example, multiple ratings such as Instrument and Certified Flight Instructor ratings are stored in the (composite) EARNEDRATINGS table. Nor does the CHARTER table include multiple crew assignments, which are properly stored in the CREW table.

FIGURE P8.1 The Ch08_AviaCo database

Table name: CREW

CHAR_TRIP  EMP_NUM  CREW_JOB
10001      104      Pilot
10002      101      Pilot
10003      105      Pilot
10003      109      Copilot
10004      106      Pilot
10005      101      Pilot
10006      109      Pilot
10007      104      Pilot
10007      105      Copilot
10008      106      Pilot
10009      105      Pilot
10010      108      Pilot
10011      101      Pilot
10011      104      Copilot
10012      101      Pilot
10013      105      Pilot
10014      106      Pilot
10015      101      Copilot
10015      104      Pilot
10016      105      Copilot
10016      109      Pilot
10017      101      Pilot
10018      104      Copilot
10018      105      Pilot

Table name: RATING

RTG_CODE  RTG_NAME
CFI       Certified Flight Instructor
CFII      Certified Flight Instructor, Instrument
INSTR     Instrument
MEL       Multiengine Land
SEL       Single Engine, Land
SES       Single Engine, Sea

Table name: EMPLOYEE

EMP_NUM  EMP_TITLE  EMP_LNAME   EMP_FNAME  EMP_INITIAL  EMP_DOB    EMP_HIRE_DATE
100      Mr         Nkosi       Cela       D            15-Jun-52  15-Mar-98
101      Ms         Lewis       Rhonda     G            19-Mar-75  25-Apr-96
102      Mr         Vandam      Rhett                   14-Nov-68  18-May-03
103      Ms         Jones       Anne       M            11-May-84  26-Jul-09
104      Mr         Lange       John       P            12-Jul-81  20-Aug-00
105      Mr         Williams    Robert     D            14-Mar-85  19-Jun-13
106      Mrs        Duzak       Jeanine    K            12-Feb-78  13-Mar-99
107      Mr         Diante      Jorge      D            01-May-85  02-Jul-07
108      Mr         Wiesenbach  Paul       R            14-Feb-76  03-Jun-03
109      Ms         Travis      Elizabeth  K            18-Jun-71  14-Feb-16
110      Mrs        Genkazi     Leighla    W            19-May-80  29-Jun-00

Table name: PILOT

EMP_NUM  PIL_LICENSE  PIL_RATINGS             PIL_MED_TYPE  PIL_MED_DATE  PIL_PT135_DATE
101      ATP          SEL/MEL/Instr/CFII      1             12-Apr-2018   15-Jun-2018
104      ATP          SEL/MEL/Instr           1             10-Jun-2018   23-Mar-2019
105      COM          SEL/MEL/Instr/CFI       2             25-Feb-2019   12-Feb-2019
106      COM          SEL/MEL/Instr           2             02-Apr-2019   24-Dec-2019
109      COM          SEL/MEL/SES/Instr/CFII  1             14-Apr-2019   21-Apr-2019

Table name: EARNEDRATING

EMP_NUM  RTG_CODE  EARNRTG_DATE
101      CFI       18-Feb-08
101      CFII      15-Dec-15
101      INSTR     08-Nov-03
101      MEL       23-Jun-04
101      SEL       21-Apr-03
104      INSTR     15-Jul-06
104      MEL       29-Jan-07
104      SEL       12-Mar-05
105      CFI       18-Nov-07
105      INSTR     17-Apr-05
105      MEL       12-Aug-05
105      SEL       23-Sep-04
106      INSTR     20-Dec-05
106      MEL       02-Apr-06
106      SEL       10-Mar-04
109      CFI       05-Nov-08
109      CFII      21-Jun-13
109      INSTR     23-Jul-06
109      MEL       15-Mar-07
109      SEL       05-Feb-06
109      SES       12-May-06

Table name: CUSTOMER

CUS_CODE  CUS_LNAME  CUS_FNAME  CUS_INITIAL  CUS_AREACODE  CUS_PHONE  CUS_BALANCE
10010     Ramas      Alfred     A            0181          844-2573   0.00
10011     Dunne      Leona      K            0161          894-1238   0.00
10012     Smith      Kathy      W            0181          894-2285   896.54
10013     Pieterse   Jaco       F            0181          894-2180   1285.19
10014     Orlando    Myron                   0181          222-1672   673.21
10015     OBrian     Amy        B            0161          442-3381   1014.56
10016     Brown      James      G            0181          297-1228   0.00
10017     Williams   George                  0181          290-2556   0.00
10018     Farriss    Anne       G            0161          382-7185   0.00
10019     Smith      Olette     K            0181          297-3809   453.98

Table name: CHARTER

CHAR_TRIP  CHAR_DATE  AC_NUMBER  CHAR_DESTINATION  CHAR_DISTANCE  CHAR_HOURS_FLOWN  CHAR_HOURS_WAIT  CHAR_FUEL_GALLONS  CHAR_OIL_QTS  CUS_CODE
10001      05-Feb-19  2289L      ATL               936.00         5.1               2.2              354.1              1             10011
10002      05-Feb-19  2778V      BNA               320.00         1.6               0                72.6               0             10016
10003      05-Feb-19  4278Y      GNV               1574.00        7.8               0                339.8              2             10014
10004      06-Feb-19  1484P      STL               472.00         2.9               4.9              97.2               1             10019
10005      06-Feb-19  2289L      ATL               1023.00        5.7               3.5              397.7              2             10011
10006      06-Feb-19  4278Y      STL               472.00         2.6               5.2              117.1              0             10017
10007      06-Feb-19  2778V      GNV               1574.00        7.9               0                348.4              2             10012
10008      07-Feb-19  1484P      TYS               644.00         4.1               0                140.6              1             10014
10009      07-Feb-19  2289L      GNV               1574.00        6.6               23.4             459.9              0             10017
10010      07-Feb-19  4278Y      ATL               998.00         6.2               3.2              279.7              0             10016
10011      07-Feb-19  1484P      BNA               352.00         1.9               5.3              66.4               1             10012
10012      08-Feb-19  2778V      MOB               884.00         4.8               4.2              215.1              0             10010
10013      08-Feb-19  4278Y      TYS               644.00         3.9               4.5              174.3              1             10011
10014      09-Feb-19  4278Y      ATL               936.00         6.1               2.1              302.6              0             10017
10015      09-Feb-19  2289L      GNV               1645.00        6.7               0                459.5              2             10016
10016      09-Feb-19  2778V      MQY               312.00         1.5               0                67.2               0             10011
10017      10-Feb-19  1484P      STL               508.00         3.1               0                105.5              0             10014
10018      10-Feb-19  4278Y      TYS               644.00         3.8               4.5              167.4              0             10017

Table name: AIRCRAFT

AC_NUMBER  MOD_CODE  AC_TTAF  AC_TTEL  AC_TTER
1484P      PA23-250  1833.10  1833.10  101.80
2289L      C-90A     4243.80  768.90   1123.40
2778V      PA31-350  7992.90  1513.10  789.50
4278Y      PA31-350  2147.30  243.20   622.10

Table name: MODEL

MOD_CODE  MOD_MANUFACTURER  MOD_NAME          MOD_SEATS  MOD_CHG_MILE
C-90A     Beechcraft        KingAir           8          2.67
PA23-250  Piper             Aztec             6          1.93
PA31-350  Piper             Navajo Chieftain  10         2.35

1 Write the SQL code that will list the values for the first four attributes in the CHARTER table.

2 Using the contents of the CHARTER table, write the SQL query that will produce the output shown in Figure P8.2. Note that the output is limited to selected attributes for aircraft number 2778V.

FIGURE P8.2 Problem 2 query results

CHAR_DATE  AC_NUMBER  CHAR_DESTINATION  CHAR_DISTANCE  CHAR_HOURS_FLOWN
05-Feb-19  2778V      BNA               320.00         1.60
06-Feb-19  2778V      GNV               1574.00        7.90
08-Feb-19  2778V      MOB               884.00         4.80
09-Feb-19  2778V      MQY               312.00         1.50

3 Create a virtual table (named AC2778V) containing the output presented in Problem 2.

4 Produce the output shown in Figure P8.3 for aircraft 2778V. Note that this output includes data from the CHARTER and CUSTOMER tables. (Hint: Use a JOIN in this query.)

FIGURE P8.3 Problem 4 query results

CHAR_DATE  AC_NUMBER  CHAR_DESTINATION  CUS_LNAME  CUS_AREACODE  CUS_PHONE
08-Feb-19  2778V      MOB               Ramas      0181          844-2573
09-Feb-19  2778V      MQY               Dunne      0161          894-1238
06-Feb-19  2778V      GNV               Smith      0181          894-2285
05-Feb-19  2778V      BNA               Brown      0181          297-1228

5 Produce the output shown in Figure P8.4. The output, derived from the CHARTER and MODEL tables, is limited to 6 February 2019. (Hint: The join passes through another table. Note that the connection between CHARTER and MODEL requires the existence of AIRCRAFT because the CHARTER table does not contain a foreign key to MODEL. However, CHARTER does contain AC_NUMBER, a foreign key to AIRCRAFT, which contains a foreign key to MODEL.)

FIGURE P8.4 Problem 5 query results

CHAR_DATE  CHAR_DESTINATION  AC_NUMBER  MOD_NAME          MOD_CHG_MILE
06-Feb-19  STL               1484P      Aztec             1.93
06-Feb-19  ATL               2289L      KingAir           2.67
06-Feb-19  STL               4278Y      Navajo Chieftain  2.35
06-Feb-19  GNV               2778V      Navajo Chieftain  2.35

6 Modify the query in Problem 5 to include data from the CUSTOMER table. This time the output is limited to charter records generated since 9 February 2019. (The query results are shown in Figure P8.5.)

FIGURE P8.5 Problem 6 query results

CHAR_DATE  CHAR_DESTINATION  AC_NUMBER  MOD_NAME          MOD_CHG_MILE  CUS_LNAME
09-Feb-19  ATL               4278Y      Navajo Chieftain  2.35          Williams
09-Feb-19  MQY               2778V      Navajo Chieftain  2.35          Dunne
09-Feb-19  GNV               2289L      KingAir           2.67          Brown
10-Feb-19  TYS               4278Y      Navajo Chieftain  2.35          Williams
10-Feb-19  STL               1484P      Aztec             1.93          Orlando
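The hint in Problem 5, that the join from CHARTER to MODEL has to pass through AIRCRAFT, can be sketched as follows. This is an illustration under stated assumptions rather than the expected answer: Python's built-in sqlite3 module stands in for the DBMS, only the columns needed for the join are loaded, dates are stored as ISO-8601 text, and just a handful of CHARTER rows from Figure P8.1 are included.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE MODEL (
    MOD_CODE TEXT PRIMARY KEY, MOD_NAME TEXT, MOD_CHG_MILE REAL);
CREATE TABLE AIRCRAFT (
    AC_NUMBER TEXT PRIMARY KEY,
    MOD_CODE  TEXT REFERENCES MODEL (MOD_CODE));
CREATE TABLE CHARTER (
    CHAR_TRIP INTEGER PRIMARY KEY, CHAR_DATE DATE,
    AC_NUMBER TEXT REFERENCES AIRCRAFT (AC_NUMBER),
    CHAR_DESTINATION TEXT);
INSERT INTO MODEL VALUES ('C-90A', 'KingAir', 2.67);
INSERT INTO MODEL VALUES ('PA23-250', 'Aztec', 1.93);
INSERT INTO MODEL VALUES ('PA31-350', 'Navajo Chieftain', 2.35);
INSERT INTO AIRCRAFT VALUES ('1484P', 'PA23-250');
INSERT INTO AIRCRAFT VALUES ('2289L', 'C-90A');
INSERT INTO AIRCRAFT VALUES ('2778V', 'PA31-350');
INSERT INTO AIRCRAFT VALUES ('4278Y', 'PA31-350');
INSERT INTO CHARTER VALUES (10003, '2019-02-05', '4278Y', 'GNV');
INSERT INTO CHARTER VALUES (10004, '2019-02-06', '1484P', 'STL');
INSERT INTO CHARTER VALUES (10005, '2019-02-06', '2289L', 'ATL');
INSERT INTO CHARTER VALUES (10006, '2019-02-06', '4278Y', 'STL');
INSERT INTO CHARTER VALUES (10007, '2019-02-06', '2778V', 'GNV');
INSERT INTO CHARTER VALUES (10008, '2019-02-07', '1484P', 'TYS');
""")

# CHARTER has no foreign key to MODEL, so the path is
# CHARTER -> AIRCRAFT (via AC_NUMBER) -> MODEL (via MOD_CODE).
charters_6_feb = conn.execute("""
    SELECT CHAR_DATE, CHAR_DESTINATION, CHARTER.AC_NUMBER,
           MOD_NAME, MOD_CHG_MILE
    FROM CHARTER
    JOIN AIRCRAFT ON CHARTER.AC_NUMBER = AIRCRAFT.AC_NUMBER
    JOIN MODEL    ON AIRCRAFT.MOD_CODE = MODEL.MOD_CODE
    WHERE CHAR_DATE = '2019-02-06'
    ORDER BY CHAR_TRIP""").fetchall()
for row in charters_6_feb:
    print(row)
```

Problem 6 extends the same chain with one more join, from CHARTER to CUSTOMER on CUS_CODE, and swaps the date test for a greater-than-or-equal comparison. AC_NUMBER is qualified with its table name in the SELECT list because it appears in both CHARTER and AIRCRAFT.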

7 Modify the query in Problem 6 to produce the output shown in Figure P8.6. The date limitation in Problem 6 applies to this problem, too. Note that this query includes data from the CREW and EMPLOYEE tables. (Note: You may wonder why the date restriction seems to generate more records than it did in Problem 6. Actually, the number of (CHARTER) records is the same, but several records are listed twice to reflect a crew of two: a pilot and a copilot. For example, the record for the 09-Feb-2019 flight to GNV, using aircraft 2289L, required a crew consisting of a pilot (Lange) and a copilot (Lewis).)

FIGURE P8.6 Problem 7 query results

CHAR_DATE  CHAR_DESTINATION  AC_NUMBER  MOD_CHG_MILE  CHAR_DISTANCE  EMP_NUM  CREW_JOB  EMP_LNAME
09-Feb-19  GNV               2289L      2.67          1645.00        104      Pilot     Lange
09-Feb-19  GNV               2289L      2.67          1645.00        101      Copilot   Lewis
09-Feb-19  MQY               2778V      2.35          312.00         109      Pilot     Travis
09-Feb-19  MQY               2778V      2.35          312.00         105      Copilot   Williams
09-Feb-19  ATL               4278Y      2.35          936.00         106      Pilot     Duzak
10-Feb-19  STL               1484P      1.93          508.00         101      Pilot     Lewis
10-Feb-19  TYS               4278Y      2.35          644.00         105      Pilot     Williams
10-Feb-19  TYS               4278Y      2.35          644.00         104      Copilot   Lange

8

Modify the query in Problem 5 to include the computed (derived) attribute fuel per hour. (Hint: It is possible to use SQL to produce computed attributes that are not stored in any table. For example, the following SQL query is perfectly acceptable:

SELECT CHAR_DISTANCE, CHAR_FUEL_GALLONS/CHAR_DISTANCE
FROM CHARTER;

The above query produces the gallons per mile flown value.) Use a similar technique on joined tables to produce the gallons per hour output shown in Figure P8.7. (Note that 254.3 litres/1.5 hour produces the 169.54 litres per hour.)

Query output such as the gallons per hour result shown in Figure P8.7 provides managers with very important information. In this case, why is the fuel burn for the Navajo Chieftain 4278Y flown on 9-Feb-19 so much higher than the fuel burn for that aircraft on 8-Feb-18? Such a query result might lead to additional queries to find out who flew the aircraft or which special circumstances might have existed. Is the fuel burn difference due to poor fuel management by the pilot, does it reflect an engine fuel metering problem, or was there an error in the fuel recording? The ability to generate useful query output is an important management asset.

FIGURE P8.7  Problem 8 query results

CHAR_DATE   AC_NUMBER   MOD_NAME           CHAR_HOURS_FLOWN   CHAR_FUEL_GALLONS   Expr1
09-Feb-18   2778V       Navajo Chieftain   1.5                 67.2               44.8
09-Feb-18   2289L       KingAir            6.7                459.5               68.5820895522388
09-Feb-18   4278Y       Navajo Chieftain   6.1                302.6               49.6065573770492
10-Feb-18   4278Y       Navajo Chieftain   3.8                167.4               44.0526315789474
10-Feb-18   1484P       Aztec              3.1                105.5               34.0322580645161
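The hint in Problem 8 (a computed attribute built from columns of joined tables) can be tried on a small scale. The following is a minimal sketch, not the book's solution: it assumes Python's built-in sqlite3 module as a stand-in for Access or Oracle, loads only the five charter rows printed in Figure P8.7, and uses an assumed AIRCRAFT table with invented MOD_CODE values (M1 to M3) to link CHARTER to MODEL. Dates are rewritten in ISO form for convenience.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE MODEL    (MOD_CODE TEXT PRIMARY KEY, MOD_NAME TEXT);
CREATE TABLE AIRCRAFT (AC_NUMBER TEXT PRIMARY KEY, MOD_CODE TEXT);
CREATE TABLE CHARTER  (CHAR_DATE TEXT, AC_NUMBER TEXT,
                       CHAR_HOURS_FLOWN REAL, CHAR_FUEL_GALLONS REAL);
-- MOD_CODE values are invented placeholders (assumption)
INSERT INTO MODEL VALUES ('M1','KingAir'), ('M2','Navajo Chieftain'), ('M3','Aztec');
INSERT INTO AIRCRAFT VALUES ('2289L','M1'), ('2778V','M2'), ('4278Y','M2'), ('1484P','M3');
-- Rows copied from Figure P8.7
INSERT INTO CHARTER VALUES
  ('2018-02-09','2778V',1.5, 67.2),
  ('2018-02-09','2289L',6.7,459.5),
  ('2018-02-09','4278Y',6.1,302.6),
  ('2018-02-10','4278Y',3.8,167.4),
  ('2018-02-10','1484P',3.1,105.5);
""")

# The last column is not stored in any table; it is derived for each row.
fuel_rows = con.execute("""
    SELECT C.CHAR_DATE, C.AC_NUMBER, M.MOD_NAME,
           C.CHAR_HOURS_FLOWN, C.CHAR_FUEL_GALLONS,
           C.CHAR_FUEL_GALLONS / C.CHAR_HOURS_FLOWN AS GALLONS_PER_HOUR
    FROM CHARTER C
         JOIN AIRCRAFT A ON C.AC_NUMBER = A.AC_NUMBER
         JOIN MODEL M    ON A.MOD_CODE  = M.MOD_CODE
    ORDER BY C.CHAR_DATE, C.AC_NUMBER
""").fetchall()

for row in fuel_rows:
    print(row)
```

Naming the expression with AS is one way to avoid a DBMS-generated column heading such as the Expr1 label seen in Figure P8.7.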

NOTE

The output format is determined by the RDBMS you use. In this example, the Access software defaulted to an output heading labelled Expr1 to indicate the expression resulting from the division:

[CHARTER]![CHAR_FUEL_GALLONS]/[CHARTER]![CHAR_HOURS]

created by its expression builder. Oracle defaults to the full division label. You should learn to control the output format with the help of your RDBMS's utility software.

9

Create a query to produce the output shown in Figure P8.8. Note that, in this case, the computed attribute requires data found in two different tables. (Hint: The MODEL table contains the charge per mile, and the CHARTER table contains the total miles flown.) Note also that the output is limited to charter records generated since 9 February 2019. In addition, the output is ordered by date and, within the date, by the customer's last name.

FIGURE P8.8  Problem 9 query results

CHAR_DATE   CUS_LNAME   CHAR_DISTANCE   MOD_CHG_MILE   Mileage Charge
09-Feb-19   Brown       1645.00         2.67           4392.15
09-Feb-19   Dunne        312.00         2.35            733.20
09-Feb-19   Williams     936.00         2.35           2199.60
10-Feb-19   Orlando      508.00         1.93            980.44
10-Feb-19   Williams     644.00         2.35           1513.40

10

Use the techniques that produced the output in Problem 9 to produce the charges shown in Figure P8.9. The total charge to the customer is computed by:

Miles flown * charge per mile.
Hours waited * 50 per hour.

The miles flown (CHAR_DISTANCE) value is found in the CHARTER table, the charge per mile (MOD_CHG_MILE) is found in the MODEL table, and the hours waited (CHAR_HOURS_WAIT) are found in the CHARTER table.

FIGURE P8.9  Problem 10 query results

CHAR_DATE   CUS_LNAME   Mileage Charge   Waiting Charge   Total Charge
09-Feb-19   Brown       4392.15            0.00           4392.15
09-Feb-19   Dunne        733.20            0.00            733.20
09-Feb-19   Williams    2199.60           85.00           2304.60
08-Feb-19   Orlando      980.44            0.00            980.44
08-Feb-19   Williams    1513.40          225.00           1738.40
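The charge computation described in Problem 10 (miles flown times the charge per mile, plus hours waited times 50) can be sketched as follows. This is an illustration under stated assumptions, not the book's solution: it uses Python's sqlite3 module instead of Access, stores CUS_LNAME directly in CHARTER to keep the example short (the real database would join a CUSTOMER table), invents the MOD_CODE values, and infers the 4.5 waiting hours for the last row by dividing the printed waiting charge of 225.00 by 50. Distances and per-mile charges are taken from the Problem 9 output.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE MODEL    (MOD_CODE TEXT PRIMARY KEY, MOD_CHG_MILE REAL);
CREATE TABLE AIRCRAFT (AC_NUMBER TEXT PRIMARY KEY, MOD_CODE TEXT);
-- Simplification (assumption): customer name kept in CHARTER
CREATE TABLE CHARTER  (CHAR_DATE TEXT, AC_NUMBER TEXT, CUS_LNAME TEXT,
                       CHAR_DISTANCE REAL, CHAR_HOURS_WAIT REAL);
INSERT INTO MODEL VALUES ('M1', 2.67), ('M2', 2.35), ('M3', 1.93);
INSERT INTO AIRCRAFT VALUES ('2289L','M1'), ('2778V','M2'), ('4278Y','M2'), ('1484P','M3');
INSERT INTO CHARTER VALUES
  ('2019-02-09','2289L','Brown',   1645.0, 0.0),
  ('2019-02-09','2778V','Dunne',    312.0, 0.0),
  ('2019-02-10','1484P','Orlando',  508.0, 0.0),
  ('2019-02-10','4278Y','Williams', 644.0, 4.5);  -- 4.5 h inferred from 225.00 / 50
""")

WAIT_RATE = 50  # charge per hour waited, as stated in the problem

charges = con.execute("""
    SELECT C.CHAR_DATE, C.CUS_LNAME,
           ROUND(C.CHAR_DISTANCE * M.MOD_CHG_MILE, 2)        AS MILEAGE_CHARGE,
           ROUND(C.CHAR_HOURS_WAIT * :rate, 2)               AS WAITING_CHARGE,
           ROUND(C.CHAR_DISTANCE * M.MOD_CHG_MILE
                 + C.CHAR_HOURS_WAIT * :rate, 2)             AS TOTAL_CHARGE
    FROM CHARTER C
         JOIN AIRCRAFT A ON C.AC_NUMBER = A.AC_NUMBER
         JOIN MODEL M    ON A.MOD_CODE  = M.MOD_CODE
    ORDER BY C.CHAR_DATE, C.CUS_LNAME
""", {"rate": WAIT_RATE}).fetchall()

for row in charges:
    print(row)
```

Each charge is an expression over columns from two different tables, so the join has to be in place before the arithmetic can be written.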

11

Create the SQL query that will produce a list of customers who have an unpaid balance. The required output is shown in Figure P8.10. Note that the balances are listed in descending order.

FIGURE P8.10  Problem 11 query results

CUS_LNAME   CUS_FNAME   CUS_INITIAL   CUS_BALANCE
Pieterse    Jaco        F             1285.19
OBrian      Amy         B             1014.56
Smith       Kathy       W              896.54
Orlando     Myron                      673.21
Smith       Olette      K              453.98

12

Find the average customer balance, the minimum balance, the maximum balance, and the total of the unpaid balances. The resulting values are shown in Figure P8.11.

FIGURE P8.11  Problem 12 query results

Average Balance   Minimum Balance   Maximum Balance   Total Unpaid Bills
432.35            0.00              1285.19           4323.48

13

Using the CHARTER table as the source, group the aircraft data. Then use the SQL functions to produce the output shown in Figure P8.12. (Utility software was used to modify the headers, so your headers may look different.)

FIGURE P8.12  Problem 13 query results

AC_NUMBER   Number of Trips   Total Distance   Average Distance   Total Hours   Average Hours
1484P       4                 1976.00           494.00            12.00         3.00
2289L       4                 5178.00          1294.50            24.10         6.03
2778V       4                 3090.00           772.50            15.80         3.95
4278Y       6                 5268.00           878.00            30.40         5.07

14

Write the SQL code to generate the output shown in Figure P8.13. Note that the listing includes all CHARTER flights that did not include a copilot crew assignment. (Hint: The crew assignments are listed in the CREW table. Also note that the pilot's last name requires access to the EMPLOYEE table, while the MOD_CODE requires access to the MODEL table.)
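Problems 11 and 12 rely on a WHERE filter with ORDER BY ... DESC and on the aggregate functions AVG, MIN, MAX and SUM. A minimal sketch with Python's sqlite3 module is shown below. The five unpaid balances come from the Problem 11 output; the five zero-balance rows and all CUS_CODE values are assumptions, added because the printed average of 432.35 is consistent with the total of 4323.48 being divided across ten customers.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("""CREATE TABLE CUSTOMER
               (CUS_CODE INTEGER PRIMARY KEY, CUS_LNAME TEXT, CUS_BALANCE REAL)""")
con.executemany("INSERT INTO CUSTOMER VALUES (?, ?, ?)", [
    (1, "Pieterse", 1285.19), (2, "OBrian", 1014.56), (3, "Smith", 896.54),
    (4, "Orlando", 673.21), (5, "Smith", 453.98),
    # Placeholder customers with nothing owing (assumption, see lead-in)
    (6, "Cust6", 0.0), (7, "Cust7", 0.0), (8, "Cust8", 0.0),
    (9, "Cust9", 0.0), (10, "Cust10", 0.0),
])

# Problem 11 style: only customers who owe something, largest balance first
unpaid = con.execute("""
    SELECT CUS_LNAME, CUS_BALANCE
    FROM CUSTOMER
    WHERE CUS_BALANCE > 0
    ORDER BY CUS_BALANCE DESC
""").fetchall()

# Problem 12 style: one summary row computed over every customer
balance_summary = con.execute("""
    SELECT ROUND(AVG(CUS_BALANCE), 2) AS AVERAGE_BALANCE,
           MIN(CUS_BALANCE)           AS MINIMUM_BALANCE,
           MAX(CUS_BALANCE)           AS MAXIMUM_BALANCE,
           ROUND(SUM(CUS_BALANCE), 2) AS TOTAL_UNPAID
    FROM CUSTOMER
""").fetchone()

print(unpaid)
print(balance_summary)
```

Because the aggregates run over all rows, zero balances pull the average and the minimum down; restricting the summary with WHERE CUS_BALANCE > 0 would give different figures.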

FIGURE P8.13  Problem 14 query results

CHAR_TRIP   CHAR_DATE   AC_NUMBER   MOD_NAME           CHAR_HOURS_FLOWN   EMP_LNAME    CREW_JOB
10001       05-Feb-19   2289L       KingAir            5.1                Lange        Pilot
10002       05-Feb-19   2778V       Navajo Chieftain   1.6                Lewis        Pilot
10004       06-Feb-19   1484P       Aztec              2.9                Duzak        Pilot
10005       06-Feb-19   2289L       KingAir            5.7                Lewis        Pilot
10006       06-Feb-19   4278Y       Navajo Chieftain   2.6                Travis       Pilot
10008       07-Feb-19   1484P       Aztec              4.1                Duzak        Pilot
10009       07-Feb-19   2289L       KingAir            6.6                Williams     Pilot
10010       07-Feb-19   4278Y       Navajo Chieftain   6.2                Wiesenbach   Pilot
10012       08-Feb-19   2778V       Navajo Chieftain   4.8                Lewis        Pilot
10013       08-Feb-19   4278Y       Navajo Chieftain   3.9                Williams     Pilot
10014       09-Feb-19   4278Y       Navajo Chieftain   6.1                Duzak        Pilot
10017       10-Feb-19   1484P       Aztec              3.1                Lewis        Pilot

15

Write a query that lists the ages of the employees and the date on which the query was run. The required output is shown in Figure P8.14. (As you can tell, the query was run on 4 February 2019, so the ages of the employees are current as of that date.)

FIGURE P8.14  Problem 15 query results

EMP_NUM   EMP_LNAME    EMP_FNAME   EMP_HIRE_DATE   EMP_DOB       Age   Query Date
100       Nkosi        Cela        15-Mar-1997     15-Jun-1952   67    04-Feb-19
101       Lewis        Rhonda      25-Apr-1998     19-Mar-1975   44    04-Feb-19
102       Vandam       Rhett       20-Dec-2002     14-Nov-1968   51    04-Feb-19
103       Jones        Anne        28-Aug-2015     16-Oct-1984   35    04-Feb-19
104       Lange        John        20-Oct-2006     08-Nov-1981   38    04-Feb-19
105       Williams     Robert      08-Jan-2016     14-Mar-1985   34    04-Feb-19
106       Duzak        Jeanine     05-Jan-2001     12-Feb-1978   41    04-Feb-19
107       Diante       Jorge       02-Jul-2006     21-Aug-1984   35    04-Feb-19
108       Wiesenbach   Paul        18-Nov-2004     14-Feb-1976   43    04-Feb-19
109       Travis       Elizabeth   14-Apr-2001     18-Jun-1971   48    04-Feb-19
110       Genkazi      Leighla     01-Dec-2002     19-May-1980   39    04-Feb-19
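Problem 15 combines a date calculation with the date on which the query runs. The sketch below is one possible approach, written for SQLite through Python's sqlite3 module rather than for Access or Oracle. It passes the query date in as a parameter (fixed at 4 February 2019 so the result is reproducible) and computes the age as the difference between calendar years, which is what the ages printed in Figure P8.14 appear to be; the choice of that formula is an assumption, and it can overstate a person's age by one year when the birthday has not yet occurred. Only four employees are loaded.

```python
import sqlite3

QUERY_DATE = "2019-02-04"  # stands in for the DBMS's current-date function

con = sqlite3.connect(":memory:")
con.execute("""CREATE TABLE EMPLOYEE
               (EMP_NUM INTEGER PRIMARY KEY, EMP_LNAME TEXT, EMP_DOB TEXT)""")
con.executemany("INSERT INTO EMPLOYEE VALUES (?, ?, ?)", [
    (100, "Nkosi",   "1952-06-15"),
    (101, "Lewis",   "1975-03-19"),
    (104, "Lange",   "1981-11-08"),
    (110, "Genkazi", "1980-05-19"),
])

# Age = year of the query date minus year of birth (calendar-year difference)
ages = con.execute("""
    SELECT EMP_NUM, EMP_LNAME, EMP_DOB,
           CAST(strftime('%Y', :today)  AS INTEGER)
         - CAST(strftime('%Y', EMP_DOB) AS INTEGER) AS AGE,
           :today AS QUERY_DATE
    FROM EMPLOYEE
    ORDER BY EMP_NUM
""", {"today": QUERY_DATE}).fetchall()

for row in ages:
    print(row)
```

In a live database the parameter would normally be replaced by the system's own current-date function, for example Date() in Access or SYSDATE in Oracle, which is why the same query returns different ages when it is run on a later date.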

Online Content

Problems 16-33 are based on the 'Ch8_SaleCo' database located on the online platform for this book. This database is stored in Microsoft Access format. If you use another DBMS such as Oracle, SQL Server or MySQL, use its import utilities to move the Access database contents.

The structure and contents of the Ch8_SaleCo database are shown in Figure P8.15. Use this database to answer the following problems. Save each query as QXX, where XX is the problem number.

FIGURE P8.15  The Ch8_SaleCo database

Table name: CUSTOMER

CUS_CODE   CUS_LNAME   CUS_FNAME   CUS_INITIAL   CUS_AREACODE   CUS_PHONE   CUS_BALANCE
10010      Ramas       Alfred      A             0181           844-2573      0.00
10011      Dunne       Leona       K             0161           894-1238      0.00
10012      Smith       Kathy       W             0181           894-2285    345.86
10013      Pieterse    Jaco        F             0181           894-2180    536.75
10014      Orlando     Myron                     0181           222-1672      0.00
10015      OBrian      Amy         B             0161           442-3381      0.00
10016      Brown       James       G             0181           297-1228    221.19
10017      Williams    George                    0181           290-2556    768.93
10018      Farriss     Anne        G             0161           382-7185    216.55
10019      Smith       Olette      K             0181           297-3809      0.00

Table name: VENDOR

V_CODE   V_NAME           V_CONTACT   V_AREACODE   V_PHONE    V_COUNTRY   V_ORDER
21225    Bryson, Inc.     Smithson    0181         223-3234   UK          Y
21226    SuperLoo, Inc.   Flushing    0113         215-8995   SA          N
21231    D&E Supply       Singh       0181         228-3245   UK          Y
21344    Jabavu Bros.     Khumalo     0181         889-2546   UK          N
22567    Dome Supply      Smith       7253         678-1419   FR          N
23119    Randsets Ltd.    Anderson    7253         678-3998   FR          Y
24004    Brackman Bros.   Browning    0181         228-1410   UK          N
24288    ORDVA, Inc.      Hakford     0181         898-1234   UK          Y
25443    B&K, Inc.        Smith       0113         227-0093   SA          N
25501    Damal Supplies   Smythe      0181         890-3529   UK          N
25595    Rubicon Systems  Du Toit     0113         456-0092   SA          Y

Table name: PRODUCT

P_CODE     P_DESCRIPT                         P_INDATE    P_QOH   P_MIN   P_PRICE   P_DISCOUNT   V_CODE
11QER/31   Power painter, 15 psi., 3-nozzle   03-Nov-18     8       5     109.99    0.00         25595
13-Q2/P2   7.25 cm pwr. saw blade             13-Dec-18    32      15      14.99    0.05         21344
14-Q1/L3   9.00 cm pwr. saw blade             13-Nov-18    18      12      17.49    0.00         21344
1546-QQ2   Hrd. cloth, 1/4 cm, 2x50           15-Jan-19    15       8      39.95    0.00         23119
1558-QW1   Hrd. cloth, 1/2 cm, 3x50           15-Jan-19    23       5      43.99    0.00         23119
2232/QTY   B&D jigsaw, 12 cm blade            30-Dec-18     8       5     109.92    0.05         24288
2232/QWE   B&D jigsaw, 8 cm blade             24-Dec-18     6       5      99.87    0.05         24288
2238/QPD   B&D cordless drill, 1/2 cm         20-Jan-19    12       5      38.95    0.05         25595
23109-HB   Claw hammer                        20-Jan-19    23      10       9.95    0.10         21225
23114-AA   Sledge hammer, 12 kg               02-Jan-19     8       5      14.40    0.05
54778-2T   Rat-tail file, 1/8 cm fine         15-Dec-18    43      20       4.99    0.00         21344
89-WRE-Q   Hicut chain saw, 16 cm             07-Feb-19    11       5     256.99    0.05         24288
PVC23DRT   PVC pipe, 3.5 cm, 8 m              20-Feb-19   188      75       5.87    0.00

P_CODE     P_DESCRIPT                                P_INDATE    P_QOH   P_MIN   P_PRICE   P_DISCOUNT   V_CODE
SM-18277   1.25 cm metal screw, 25                   01-Mar-19   172      75       6.99    0.00         21225
SW-23116   2.5 cm wd. screw, 50                      24-Feb-19   237     100       8.45    0.00         21231
WR3/TT3    Steel matting, 4 x 8 x 1/6 m, .5 m mesh   17-Jan-19    18       5     119.95    0.10         25595

Table name: INVOICE

INV_NUMBER   CUS_CODE   INV_DATE
1001         10014      16-Mar-19
1002         10011      16-Mar-19
1003         10012      16-Mar-19
1004         10011      17-Mar-19
1005         10018      17-Mar-19
1006         10014      17-Mar-19
1007         10015      17-Mar-19
1008         10011      17-Mar-19

Table name: LINE

INV_NUMBER   LINE_NUMBER   P_CODE     LINE_UNITS   LINE_PRICE
1001         1             13-Q2/P2    1            14.99
1001         2             23109-HB    1             9.95
1002         1             54778-2T    2             4.99
1003         1             2238/QPD    1            38.95
1003         2             1546-QQ2    1            39.95
1003         3             13-Q2/P2    5            14.99
1004         1             54778-2T    3             4.99
1004         2             23109-HB    2             9.95
1005         1             PVC23DRT   12             5.87
1006         1             SM-18277    3             6.99
1006         2             2232/QTY    1           109.92
1006         3             23109-HB    1             9.95
1006         4             89-WRE-Q    1           256.99
1007         1             13-Q2/P2    2            14.99
1007         2             54778-2T    1             4.99
1008         1             PVC23DRT    5             5.87
1008         2             WR3/TT3     3           119.95
1008         3             23109-HB    1             9.95

16

Write a query to count the number of invoices.

17

Write a query to count the number of customers with a customer balance over 500.

18

Generate a listing of all purchases made by the customers, using the output shown in Figure P8.16 as your guide. (Hint: Use the ORDER BY clause to order the resulting rows as shown in Figure P8.16.)

FIGURE P8.16  Problem 18 query results

CUS_CODE   INV_NUMBER   INV_DATE    P_DESCRIPT                                LINE_UNITS   LINE_PRICE
10011      1002         16-Mar-19   Rat-tail file, 1/8 cm fine                 2             4.99
10011      1004         17-Mar-19   Claw hammer                                2             9.95
10011      1004         17-Mar-19   Rat-tail file, 1/8 cm fine                 3             4.99
10011      1008         17-Mar-19   Claw hammer                                1             9.95
10011      1008         17-Mar-19   PVC pipe, 3.5 cm, 8 m                      5             5.87
10011      1008         17-Mar-19   Steel matting, 4 x 8 x 1/6 m, .5 m mesh    3           119.95
10012      1003         16-Mar-19   7.25 cm pwr. saw blade                     5            14.99
10012      1003         16-Mar-19   B&D cordless drill, 1/2 cm                 1            38.95
10012      1003         16-Mar-19   Hrd. cloth, 1/4 cm, 2 x 50                 1            39.95
10014      1001         16-Mar-19   7.25 cm pwr. saw blade                     1            14.99
10014      1001         16-Mar-19   Claw hammer                                1             9.95
10014      1006         17-Mar-19   1.25 cm metal screw, 25                    3             6.99
10014      1006         17-Mar-19   B&D jigsaw, 12 cm blade                    1           109.92
10014      1006         17-Mar-19   Claw hammer                                1             9.95
10014      1006         17-Mar-19   Hicut chain saw, 16 cm                     1           256.99
10015      1007         17-Mar-19   7.25 cm pwr. saw blade                     2            14.99
10015      1007         17-Mar-19   Rat-tail file, 1/8 cm fine                 1             4.99
10018      1005         17-Mar-19   PVC pipe, 3.5 cm, 8 m                     12             5.87

19

Using the output shown in Figure P8.17 as your guide, generate the listing of customer purchases, including the subtotals for each of the invoice line numbers. (Hint: Modify the query format used to produce the listing of customer purchases in Problem 18, delete the INV_DATE column, and add the derived (computed) attribute LINE_UNITS * LINE_PRICE to calculate the subtotals.)

FIGURE P8.17  Problem 19 query results

CUS_CODE   INV_NUMBER   P_DESCRIPT                                Units Bought   Unit Price   Subtotal
10011      1002         Rat-tail file, 1/8 cm fine                 2               4.99         9.98
10011      1004         Claw hammer                                2               9.95        19.90
10011      1004         Rat-tail file, 1/8 cm fine                 3               4.99        14.97
10011      1008         Claw hammer                                1               9.95         9.95
10011      1008         PVC pipe, 3.5 cm, 8 m                      5               5.87        29.35
10011      1008         Steel matting, 4 x 8 x 1/6 m, .5 m mesh    3             119.95       359.85
10012      1003         7.25 cm pwr. saw blade                     5              14.99        74.95
10012      1003         B&D cordless drill, 1/2 cm                 1              38.95        38.95
10012      1003         Hrd. cloth, 1/4 cm, 2 x 50                 1              39.95        39.95
10014      1001         7.25 cm pwr. saw blade                     1              14.99        14.99
10014      1001         Claw hammer                                1               9.95         9.95
10014      1006         1.25 cm metal screw, 25                    3               6.99        20.97
10014      1006         B&D jigsaw, 12 cm blade                    1             109.92       109.92
10014      1006         Claw hammer                                1               9.95         9.95
10014      1006         Hicut chain saw, 16 cm                     1             256.99       256.99
10015      1007         7.25 cm pwr. saw blade                     2              14.99        29.98
10015      1007         Rat-tail file, 1/8 cm fine                 1               4.99         4.99
10018      1005         PVC pipe, 3.5 cm, 8 m                     12               5.87        70.44

20

Modify the query used in Problem 19 to produce the summary shown in Figure P8.18.

FIGURE P8.18  Customer purchase summary

CUS_CODE   CUS_BALANCE   Total Purchases
10011        0.00        444.00
10012      345.86        153.85
10014        0.00        422.77
10015        0.00         34.97
10018      216.55         70.44

21

Modify the query in Problem 20 to include the number of individual product purchases made by each customer. (In other words, if the customer's invoice is based on three products, one per LINE_NUMBER, you would count three product purchases. If you examine the original invoice data, you will note that customer 10011 generated three invoices, which contained a total of six lines, each representing a product purchase.) Your output values must match those shown in Figure P8.19.
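The counting rule explained in Problem 21 (one purchase per invoice line, totalled per customer) can be sketched as follows. This is an illustration with Python's sqlite3 module rather than Access, using the INVOICE and LINE rows of the Ch8_SaleCo database and only the five customers who appear on an invoice; the P_CODE column is left out because the summary does not need it, and the output column names are made up for the example.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE CUSTOMER (CUS_CODE INTEGER PRIMARY KEY, CUS_BALANCE REAL);
CREATE TABLE INVOICE  (INV_NUMBER INTEGER PRIMARY KEY, CUS_CODE INTEGER);
CREATE TABLE LINE     (INV_NUMBER INTEGER, LINE_NUMBER INTEGER,
                       LINE_UNITS INTEGER, LINE_PRICE REAL);
""")
con.executemany("INSERT INTO CUSTOMER VALUES (?, ?)", [
    (10011, 0.00), (10012, 345.86), (10014, 0.00), (10015, 0.00), (10018, 216.55)])
con.executemany("INSERT INTO INVOICE VALUES (?, ?)", [
    (1001, 10014), (1002, 10011), (1003, 10012), (1004, 10011),
    (1005, 10018), (1006, 10014), (1007, 10015), (1008, 10011)])
con.executemany("INSERT INTO LINE VALUES (?, ?, ?, ?)", [
    (1001, 1, 1, 14.99), (1001, 2, 1, 9.95), (1002, 1, 2, 4.99),
    (1003, 1, 1, 38.95), (1003, 2, 1, 39.95), (1003, 3, 5, 14.99),
    (1004, 1, 3, 4.99), (1004, 2, 2, 9.95), (1005, 1, 12, 5.87),
    (1006, 1, 3, 6.99), (1006, 2, 1, 109.92), (1006, 3, 1, 9.95),
    (1006, 4, 1, 256.99), (1007, 1, 2, 14.99), (1007, 2, 1, 4.99),
    (1008, 1, 5, 5.87), (1008, 2, 3, 119.95), (1008, 3, 1, 9.95)])

# One output row per customer: SUM adds the line subtotals, COUNT counts the lines.
purchase_summary = con.execute("""
    SELECT C.CUS_CODE, C.CUS_BALANCE,
           ROUND(SUM(L.LINE_UNITS * L.LINE_PRICE), 2) AS TOTAL_PURCHASES,
           COUNT(*)                                   AS NUMBER_OF_PURCHASES
    FROM CUSTOMER C
         JOIN INVOICE I ON I.CUS_CODE   = C.CUS_CODE
         JOIN LINE L    ON L.INV_NUMBER = I.INV_NUMBER
    GROUP BY C.CUS_CODE, C.CUS_BALANCE
    ORDER BY C.CUS_CODE
""").fetchall()

for row in purchase_summary:
    print(row)
```

Customer 10011 comes out with six purchases across three invoices, which is the case the problem statement walks through.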

FIGURE P8.19  Customer total purchase amounts and number of purchases

CUS_CODE   CUS_BALANCE   Total Purchases   Number of Purchases
10011        0.00        444.00            6
10012      345.86        153.85            3
10014        0.00        422.77            6
10015        0.00         34.97            2
10018      216.55         70.44            1

22

Use a query to compute the average purchase amount per product made by each customer. (Hint: Use the results of Problem 21 as the basis for this query.) Your output values must match those shown in Figure P8.20. Note that the average purchase amount is equal to the total purchases divided by the number of purchases.

FIGURE P8.20  Average purchase amount by customer

CUS_CODE   CUS_BALANCE   Total Purchases   Number of Purchases   Average Purchase Amount
10011        0.00        444.00            6                     74.00
10012      345.86        153.85            3                     51.28
10014        0.00        422.77            6                     70.46
10015        0.00         34.97            2                     17.48
10018      216.55         70.44            1                     70.44

23

Create a query to produce the total purchase per invoice, generating the results shown in Figure P8.21. The invoice total is the sum of the product purchases in the LINE that corresponds to the INVOICE.

FIGURE P8.21  Invoice totals

INV_NUMBER   Invoice Total
1001          24.94
1002           9.98
1003         153.85
1004          34.87
1005          70.44
1006         397.83
1007          34.97
1008         399.15
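The invoice total defined in Problem 23 (the sum of the line subtotals that belong to one invoice) maps directly onto SUM with GROUP BY. A minimal sketch with Python's sqlite3 module follows; it loads only the LINE rows of the Ch8_SaleCo database, drops the P_CODE column for brevity, and uses an invented output column name.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("""CREATE TABLE LINE (INV_NUMBER INTEGER, LINE_NUMBER INTEGER,
                                  LINE_UNITS INTEGER, LINE_PRICE REAL)""")
con.executemany("INSERT INTO LINE VALUES (?, ?, ?, ?)", [
    (1001, 1, 1, 14.99), (1001, 2, 1, 9.95), (1002, 1, 2, 4.99),
    (1003, 1, 1, 38.95), (1003, 2, 1, 39.95), (1003, 3, 5, 14.99),
    (1004, 1, 3, 4.99), (1004, 2, 2, 9.95), (1005, 1, 12, 5.87),
    (1006, 1, 3, 6.99), (1006, 2, 1, 109.92), (1006, 3, 1, 9.95),
    (1006, 4, 1, 256.99), (1007, 1, 2, 14.99), (1007, 2, 1, 4.99),
    (1008, 1, 5, 5.87), (1008, 2, 3, 119.95), (1008, 3, 1, 9.95)])

# Each group is one invoice; the aggregate adds units * price over its lines.
invoice_totals = con.execute("""
    SELECT INV_NUMBER,
           ROUND(SUM(LINE_UNITS * LINE_PRICE), 2) AS INVOICE_TOTAL
    FROM LINE
    GROUP BY INV_NUMBER
    ORDER BY INV_NUMBER
""").fetchall()

for inv_number, total in invoice_totals:
    print(inv_number, total)
```

Since every LINE row already carries its INV_NUMBER, no join is needed for this particular summary; a join to INVOICE only becomes necessary once customer data is wanted alongside the totals.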

24

Use a query to show the invoices and invoice totals as shown in Figure P8.22. (Hint: Group by the CUS_CODE.)

FIGURE P8.22  Invoice totals by customer

CUS_CODE   INV_NUMBER   Invoice Total
10011      1002           9.98
10011      1004          34.87
10011      1008         399.15
10012      1003         153.85
10014      1001          24.94
10014      1006         397.83
10015      1007          34.97
10018      1005          70.44

25

Write a query to produce the number of invoices and the total purchase amounts by customer, using the output shown in Figure P8.23 as your guide. (Compare this summary to the results shown in Problem 24.)

FIGURE P8.23  Number of invoices and total purchase amounts by customer

CUS_CODE   Number of Invoices   Total Customer Purchases
10011      3                    444.00
10012      1                    153.85
10014      2                    422.77
10015      1                     34.97
10018      1                     70.44

26

Using the query results in Problem 25 as your basis, write a query to generate the total number of invoices, the invoice total for all of the invoices, the smallest invoice amount, the largest invoice amount and the average of all of the invoices. (Hint: Check the figure output in Problem 25.) Your output must match Figure P8.24.

FIGURE P8.24  Number of invoices; invoice totals; minimum, maximum and average sales

Total # of Invoices   Total Sales   Minimum Sale   Largest Sale   Average Sale
8                     1 126.03      34.97          444.00         225.21

27

List the balance characteristics of the customers who have made purchases during the current invoice cycle, that is, for the customers who appear in the INVOICE table. The results of this query are shown in Figure P8.25.
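Problem 26 asks for aggregates computed over the rows of another summary (the per-customer result of Problem 25). The sketch below shows one way to express that in a single statement, using Python's sqlite3 module and a derived table in the FROM clause; in Access the inner query would more typically be a saved query, so the derived table is an assumption of this example, as are the column aliases. Note that the minimum and maximum in Figure P8.24 (34.97 and 444.00) are the smallest and largest per-customer totals of Figure P8.23, not per-invoice totals, which is what the hint points to.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE INVOICE (INV_NUMBER INTEGER PRIMARY KEY, CUS_CODE INTEGER);
CREATE TABLE LINE    (INV_NUMBER INTEGER, LINE_NUMBER INTEGER,
                      LINE_UNITS INTEGER, LINE_PRICE REAL);
""")
con.executemany("INSERT INTO INVOICE VALUES (?, ?)", [
    (1001, 10014), (1002, 10011), (1003, 10012), (1004, 10011),
    (1005, 10018), (1006, 10014), (1007, 10015), (1008, 10011)])
con.executemany("INSERT INTO LINE VALUES (?, ?, ?, ?)", [
    (1001, 1, 1, 14.99), (1001, 2, 1, 9.95), (1002, 1, 2, 4.99),
    (1003, 1, 1, 38.95), (1003, 2, 1, 39.95), (1003, 3, 5, 14.99),
    (1004, 1, 3, 4.99), (1004, 2, 2, 9.95), (1005, 1, 12, 5.87),
    (1006, 1, 3, 6.99), (1006, 2, 1, 109.92), (1006, 3, 1, 9.95),
    (1006, 4, 1, 256.99), (1007, 1, 2, 14.99), (1007, 2, 1, 4.99),
    (1008, 1, 5, 5.87), (1008, 2, 3, 119.95), (1008, 3, 1, 9.95)])

# Inner query: one row per customer (the Problem 25 summary).
# Outer query: aggregates over those per-customer rows.
sales_summary = con.execute("""
    SELECT SUM(NUM_INVOICES)          AS TOTAL_INVOICES,
           ROUND(SUM(CUS_TOTAL), 2)   AS TOTAL_SALES,
           ROUND(MIN(CUS_TOTAL), 2)   AS MINIMUM_SALE,
           ROUND(MAX(CUS_TOTAL), 2)   AS LARGEST_SALE,
           ROUND(AVG(CUS_TOTAL), 2)   AS AVERAGE_SALE
    FROM (SELECT I.CUS_CODE,
                 COUNT(DISTINCT I.INV_NUMBER)     AS NUM_INVOICES,
                 SUM(L.LINE_UNITS * L.LINE_PRICE) AS CUS_TOTAL
          FROM INVOICE I
               JOIN LINE L ON L.INV_NUMBER = I.INV_NUMBER
          GROUP BY I.CUS_CODE) AS PER_CUSTOMER
""").fetchone()

print(sales_summary)
```

COUNT(DISTINCT ...) is used in the inner query because the join produces one row per invoice line, so a plain COUNT(*) would count lines rather than invoices.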

FIGURE P8.25  Balances of customers who made purchases

CUS_CODE   CUS_BALANCE
10011        0.00
10012      345.86
10014        0.00
10015        0.00
10018      216.55

28

Using the results of the query created in Problem 27, provide a summary of customer balance characteristics as shown in Figure P8.26.

FIGURE P8.26  Balance summary for customers who made purchases

Minimum Balance   Maximum Balance   Average Balance
0.00              345.86            112.48

29

Create a query to find the customer balance characteristics for all customers, including the total of the outstanding balances. The results of this query are shown in Figure P8.27.

FIGURE P8.27  Balance summary for all customers

Total Balance   Minimum Balance   Maximum Balance   Average Balance
2089.28         0.00              768.93            208.93

30

Find the listing of customers who did not make purchases during the invoicing period. Your output must match the output shown in Figure P8.28.

FIGURE P8.28  Balances of customers who did not make purchases

CUS_CODE   CUS_BALANCE
10010        0.00
10013      536.75
10016      221.19
10017      768.93
10019        0.00
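Problem 30 needs the customers that have no matching row in INVOICE. One common way to express this is a NOT IN subquery, sketched below with Python's sqlite3 module; the choice of NOT IN (rather than, say, an outer join) is this example's own and is not prescribed by the problem. The data are the CUS_CODE and CUS_BALANCE columns of CUSTOMER and the INVOICE rows of the Ch8_SaleCo database.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE CUSTOMER (CUS_CODE INTEGER PRIMARY KEY, CUS_BALANCE REAL);
CREATE TABLE INVOICE  (INV_NUMBER INTEGER PRIMARY KEY, CUS_CODE INTEGER);
""")
con.executemany("INSERT INTO CUSTOMER VALUES (?, ?)", [
    (10010, 0.00), (10011, 0.00), (10012, 345.86), (10013, 536.75),
    (10014, 0.00), (10015, 0.00), (10016, 221.19), (10017, 768.93),
    (10018, 216.55), (10019, 0.00)])
con.executemany("INSERT INTO INVOICE VALUES (?, ?)", [
    (1001, 10014), (1002, 10011), (1003, 10012), (1004, 10011),
    (1005, 10018), (1006, 10014), (1007, 10015), (1008, 10011)])

# Keep only the customers whose code never appears on an invoice.
no_purchases = con.execute("""
    SELECT CUS_CODE, CUS_BALANCE
    FROM CUSTOMER
    WHERE CUS_CODE NOT IN (SELECT CUS_CODE FROM INVOICE)
    ORDER BY CUS_CODE
""").fetchall()

for row in no_purchases:
    print(row)
```

NOT IN behaves unexpectedly if the subquery can return NULL, so this form is only safe when INVOICE.CUS_CODE is always filled in, as it is here.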

31

Find the customer balance summary for all customers who have not made purchases during the current invoicing period. The results are shown in Figure P8.29.

FIGURE P8.29  Balance summary for customers who did not make purchases

Total Balance   Minimum Balance   Maximum Balance   Average Balance
1526.87         0.00              768.93            305.37

32

Create a query to produce the summary of the value of products currently in inventory. Note that the value of each product is produced by the multiplication of the units currently in inventory and the unit price. Use the ORDER BY clause to match the order shown in Figure P8.30.

FIGURE P8.30  Value of products currently in inventory

P_DESCRIPT                                P_QOH   P_PRICE   Subtotal
Power painter, 15 psi., 3-nozzle            8     109.99     879.92
7.25 cm pwr. saw blade                     32      14.99     479.68
9.00 cm pwr. saw blade                     18      17.49     314.82
Hrd. cloth, 1/4 cm, 2 x 50                 15      39.95     599.25
Hrd. cloth, 1/2 cm, 3 x 50                 23      43.99    1011.77
B&D jigsaw, 12 cm blade                     8     109.92     879.36
B&D jigsaw, 8 cm blade                      6      99.87     599.22
B&D cordless drill, 1/2 cm                 12      38.95     467.40
Claw hammer                                23       9.95     228.85
Sledge hammer, 12 kg                        8      14.40     115.20
Rat-tail file, 1/8 cm fine                 43       4.99     214.57
Hicut chain saw, 16 cm                     11     256.99    2826.89
PVC pipe, 3.5 cm, 8 m                     188       5.87    1103.56
1.25 cm metal screw, 25                   172       6.99    1202.28
2.5 cm wd. screw, 50                      237       8.45    2002.65
Steel matting, 4 x 8 x 1/6 m, .5 m mesh    18     119.95    2159.10

33

Using the results of the query created in Problem 32, find the total value of the product inventory. The results are shown in Figure P8.31.

FIGURE P8.31  Total value of all products in inventory

Total value of inventory
15084.52
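The inventory valuation described in Problems 32 and 33 (units on hand multiplied by unit price, then summed) can be sketched as follows. This uses Python's sqlite3 module rather than Access, loads only the four PRODUCT columns the calculation needs, and orders the detail rows by P_CODE, which is an assumption that happens to reproduce the row order printed in Figure P8.30.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("""CREATE TABLE PRODUCT (P_CODE TEXT PRIMARY KEY, P_DESCRIPT TEXT,
                                     P_QOH INTEGER, P_PRICE REAL)""")
con.executemany("INSERT INTO PRODUCT VALUES (?, ?, ?, ?)", [
    ("11QER/31", "Power painter, 15 psi., 3-nozzle", 8, 109.99),
    ("13-Q2/P2", "7.25 cm pwr. saw blade", 32, 14.99),
    ("14-Q1/L3", "9.00 cm pwr. saw blade", 18, 17.49),
    ("1546-QQ2", "Hrd. cloth, 1/4 cm, 2 x 50", 15, 39.95),
    ("1558-QW1", "Hrd. cloth, 1/2 cm, 3 x 50", 23, 43.99),
    ("2232/QTY", "B&D jigsaw, 12 cm blade", 8, 109.92),
    ("2232/QWE", "B&D jigsaw, 8 cm blade", 6, 99.87),
    ("2238/QPD", "B&D cordless drill, 1/2 cm", 12, 38.95),
    ("23109-HB", "Claw hammer", 23, 9.95),
    ("23114-AA", "Sledge hammer, 12 kg", 8, 14.40),
    ("54778-2T", "Rat-tail file, 1/8 cm fine", 43, 4.99),
    ("89-WRE-Q", "Hicut chain saw, 16 cm", 11, 256.99),
    ("PVC23DRT", "PVC pipe, 3.5 cm, 8 m", 188, 5.87),
    ("SM-18277", "1.25 cm metal screw, 25", 172, 6.99),
    ("SW-23116", "2.5 cm wd. screw, 50", 237, 8.45),
    ("WR3/TT3", "Steel matting, 4 x 8 x 1/6 m, .5 m mesh", 18, 119.95),
])

# Problem 32 style: one derived subtotal per product
inventory = con.execute("""
    SELECT P_DESCRIPT, P_QOH, P_PRICE,
           ROUND(P_QOH * P_PRICE, 2) AS SUBTOTAL
    FROM PRODUCT
    ORDER BY P_CODE
""").fetchall()

# Problem 33 style: the same expression inside SUM gives a single grand total
total_value = con.execute(
    "SELECT ROUND(SUM(P_QOH * P_PRICE), 2) FROM PRODUCT"
).fetchone()[0]

for row in inventory:
    print(row)
print("Total value of inventory:", total_value)
```

The grand total is computed from the unrounded products rather than from the rounded subtotals, which avoids accumulating rounding differences.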

Online Content

Problems 34-42 are based on the 'Ch8_ThemePark' database located on the online platform for this book. This database is stored in Microsoft Access format. If you use another DBMS such as Oracle, SQL Server or MySQL, use its import utilities to move the Access database contents.

The structure and contents of the Ch8_ThemePark database are shown in Figure P8.32. Use this database to answer the following problems. Save each query as QXX, where XX is the problem number.

FIGURE P8.32  The Ch8_ThemePark database

Table name: THEMEPARK

PARK_CODE   PARK_NAME       PARK_CITY      PARK_COUNTRY
FR1001      FairyLand       PARIS          FR
NL1202      Efling          NOORD          NL
SP4533      AdventurePort   BARCELONA      SP
SW2323      Labyrinthe      LAUSANNE       SW
UK2622      MiniLand        WINDSOR        UK
UK3452      PleasureLand    STOKE          UK
ZA1342      GoldTown        JOHANNESBURG   ZA

Table name: TICKET

TICKET_NO   TICKET_PRICE   TICKET_TYPE   PARK_CODE
4668        24.99          Adult         SP4533
13001       14.99          Child         FR1001
13002       34.99          Adult         FR1001
13003       34.99          Adult         FR1001
18721       14.99          Child         FR1001
18722       14.99          Child         FR1001
18723       20.99          Senior        FR1001
18724       34.99          Adult         FR1001
32450       24.99          Adult         SP4533
45767       24.99          Adult         SP4533
67832       18.56          Child         ZA1342
67833       28.67          Adult         ZA1342
67855       18.56          Child         ZA1342
88567       22.50          Child         UK3452
88568       42.10          Adult         UK3452
89720       22.50          Child         UK3452
89723       22.50          Child         UK3452
89725       22.50          Child         UK3452
89728       42.10          Adult         UK3452

Table name: ATTRACTION

ATTRACT_NO   ATTRACT_NAME      ATTRACT_AGE   ATTRACT_CAPACITY   PARK_CODE
10034        ThunderCoaster    11            34                 FR1001
10056        SpinningTeacups    4            62                 FR1001
10067        FlightToStars     11            24                 FR1001
10078        Ant-Trap          23            30                 FR1001

ATTRACT_NO   ATTRACT_NAME   ATTRACT_AGE   ATTRACT_CAPACITY   PARK_CODE
10098        Carnival        3            120                FR1001
20056        3D-Lego_Show    3            200                UK2622
30011        BlackHole2     12             34                UK3452
30012        Pirates        10             42                UK3452
30044        UnderSeaWord    4             80                UK3452
98764        GoldRush        5             80                ZA1342

Table name: HOURS

EMP_NUM   ATTRACT_NO   HOURS_PER_ATTRACT   HOUR_RATE   DATE_WORKED
100       10034        6                   6.5         18/05/2019
100       10034        6                   6.5         20/05/2019
101       10034        6                   6.5         18/05/2019
102       30012        3                   5.99        23/05/2019
102       30044        6                   5.99        22/05/2019
102       30044        3                   5.99        23/05/2019
104       30011        6                   7.2         21/05/2019
104       30012        6                   7.2         22/05/2019
105       10078        3                   8.5         18/05/2019
105       10098        3                   8.5         18/05/2019
105       10098        6                   8.5         19/05/2019

Table name: EMPLOYEE

EMP_NUM   EMP_TITLE   EMP_LNAME    EMP_FNAME   EMP_DOB     EMP_HIRE_DATE   EMP_AREACODE   EMP_PHONE
100       Ms          Calderdale   Emma        15-Jun-82   15-Mar-02       0181           324-9134
101       Ms          Ricardo      Marshel     19-Mar-88   25-Apr-06       0181           324-4472
102       Mr          Arshad       Arif        14-Nov-79   20-Dec-00       7253           675-8993
103       Ms          Roberts      Anne        16-Oct-84   16-Aug-04       0181           898-3456
104       Mr          Denver       Enrica      08-Nov-90   20-Oct-11       7253           504-4434
105       Ms          Namowa       Mirrelle    14-Mar-00   08-Nov-16       0181           890-3243
106       Mrs         Smith        Gemma       12-Feb-78   05-Jan-99       0181           324-7845

34

Write the SQL code which lists all the attractions in each theme park.

35

Write the SQL code to display the attraction name and the capacity for all attractions in the theme park FairyLand. The results are shown in Figure P8.33.

FIGURE P8.33  Attractions and their capacities in theme park FairyLand

ATTRACT_NAME      ATTRACT_CAPACITY
ThunderCoaster     34
SpinningTeacups    62
FlightToStars      24
Ant-Trap           30
Carnival          120

36

Using the output in Figure P8.34 as your guide, display the total number of hours worked by each employee on each attraction.

FIGURE P8.34  Number of hours worked on each attraction by employees

EMP_FNAME   EMP_LNAME    ATTRACT_NAME     SumOfHOURS_PER_ATTRACT
Arif        Arshad       Pirates           3
Arif        Arshad       UnderSeaWord      9
Emma        Calderdale   ThunderCoaster   12
Enrica      Denver       BlackHole2        6
Enrica      Denver       Pirates           6
Marshel     Ricardo      ThunderCoaster    6
Mirrelle    Namowa       Ant-Trap          3
Mirrelle    Namowa       Carnival          9

37

Write a query which shows the total price of all adult tickets sold at all theme parks. Label the total price column as Total Adult Ticket Sales and round up the total price to two decimal places. The results of this query are shown in Figure P8.35.

FIGURE P8.35  Total adult ticket sales in each theme park

PARK_NAME       Total Adult Ticket Sales
AdventurePort    74.97
FairyLand       104.97
GoldTown         28.67
PleasureLand     84.20

38

Write a query to show the last names, area codes and phone numbers of all employees who worked on 18 May 2019. Your query should output the rows shown in Figure P8.36.

CHAPTER

FIGURE P8.36 Employees who worked on 18 May 2019

EMP_LNAME   EMP_AREACODE  EMP_PHONE  ATTRACT_NAME
Calderdale  0181          324-9134   ThunderCoaster
Ricardo     0181          324-4472   ThunderCoaster
Namowa      0181          890-3243   Carnival
Namowa      0181          890-3243   Ant-Trap

39 Using Figure P8.37 as a guide, show the number of tickets sold at each theme park.

FIGURE P8.37 Total tickets sold at each theme park

PARK_NAME      TOTAL_TICKETS_SOLD
AdventurePort  3
FairyLand      7
GoldTown       3
PleasureLand   6

40 Write a query to show the details of all employees who have not worked on any attractions as shown in Figure P8.38.

FIGURE P8.38 Employees who have not worked on any attractions

EMP_LNAME  EMP_FNAME  EMP_DOB    EMP_HIRE_DATE  EMP_AREACODE  EMP_PHONE
Roberts    Anne       16-Oct-84  16-Aug-04      0181          898-3456
Smith      Gemma      12-Feb-78  05-Jan-99      0181          324-7845

41 Write a query that will list the length of service in years of each employee. Sample output is shown in Figure P8.39 when this query was run on 5 February 2019. Remember, your output will be different.

FIGURE P8.39 The length of service of each employee

EMP_NUM  EMP_LNAME   EMP_FNAME  EMP_HIRE_DATE  EMP_DOB    Length_of_Service
100      Calderdale  Emma       15-Mar-02      15-Jun-82  14
101      Ricardo     Marshel    25-Apr-06      19-Mar-88  10
102      Arshad      Arif       20-Dec-00      14-Nov-79  16
103      Roberts     Anne       16-Aug-04      16-Oct-84  12
104      Denver      Enrica     20-Oct-11      08-Nov-90  5
105      Namowa      Mirrelle   08-Nov-16      14-Mar-00  0
106      Smith       Gemma      05-Jan-99      12-Feb-78  18

42 Write the SQL code that will produce a VIEW named EMP_PARIS, containing all the information of employees who work in PARIS.

Chapter 9 Procedural Language SQL and Advanced SQL

In this chapter, you will learn:

About the relational set operators UNION, UNION ALL, INTERSECT and MINUS
How to use the advanced SQL JOIN operator syntax
About the different types of subqueries and correlated queries
How to use SQL functions to manipulate dates, strings and other data
How to create and use updatable views
Use Procedural Language SQL (PL/SQL) to create triggers, stored procedures and PL/SQL functions
How to create embedded SQL

Preview

In Chapter 8, Beginning Structured Query Language, you learnt the basic SQL data definition and data manipulation commands used to create and manipulate relational data. In this chapter, you build on what you learnt in Chapter 8 and learn how to use more advanced SQL features.

In this chapter, you will learn about the SQL relational set operators (UNION, INTERSECT and MINUS) and how those operators are used to merge the results of multiple queries. Joins are at the heart of SQL. Therefore, you need to learn how to use the SQL JOIN statement to extract information from multiple tables. In the previous chapter, you learnt how cascading queries inside other queries can be useful in certain circumstances. In this chapter, you will also learn about the different styles of sub-queries that can be implemented in a SELECT statement. In addition, you will learn more of SQL's many functions to extract information from data, including manipulation of dates and strings as well as computations based on stored or even derived data.

In the real world, business procedures require the execution of clearly defined actions when a specific event occurs, such as the addition of a new invoice or a student's enrolment in a class. Such procedures can be applied within the DBMS through the use of triggers and stored procedures. Finally, you will learn how SQL facilitates the application of business procedures when it is embedded in a programming language such as Visual Basic .NET, C# or Java.

Online Content: Most of the examples used in this chapter are based on Oracle. If you want to see the examples in action, you need to load the required SQL script files for creating the tables and loading the data in the database. The SQL script files for the Oracle tables are located on the online platform for this book. How you connect to the Oracle database depends on how the Oracle software is installed on your server and on the access paths and methods defined and managed by the database administrator. Follow the instructions provided by your instructor or your college or university's technology department.

9.1 Relational Set Operators

In Chapter 3, Relational Model Characteristics, you learnt about the eight general relational operators. In this section, you will learn how to use SQL commands (UNION, INTERSECT and MINUS) to implement the union, intersection and difference relational operators.

In previous chapters, you learnt that SQL data manipulation commands are set-orientated; that is, they operate over entire sets of rows and columns (tables) at once. Using sets, you can combine two or more sets to create new sets (or relations). That's precisely what the UNION, INTERSECT and MINUS statements do. In relational database terms, you can use the words sets, relations and tables interchangeably because they all provide a conceptual view of the data set as it is presented to the relational database user.

Note: The SQL-2011 standard defines the operations that all DBMSs must perform on data, but it leaves the implementation details to the DBMS vendors. Therefore, some advanced SQL features may not work on all DBMS implementations. Also, some DBMS vendors may implement additional features not found in the SQL standard. UNION, INTERSECT and MINUS are the names of the SQL statements implemented in Oracle. The SQL standard uses the keyword EXCEPT to refer to the difference (MINUS) relational operator. Other RDBMS vendors may use a different command name or might not implement a given command at all. For example, MySQL 8.0 releases before 8.0.31 support the UNION operator but not INTERSECT. To learn more about the ANSI/ISO SQL standards, check the ANSI website (www.ansi.org) to find out how to obtain the latest standard documents in electronic form.

UNION, INTERSECT and MINUS work properly only if relations are union-compatible. In SQL terms, union-compatible means that the names of the relation attributes must be the same and their data types must be identical. In practice, some RDBMS vendors require the data types to be compatible but not necessarily exactly the same. For example, compatible data types are VARCHAR(35) and CHAR(15). In that case, both attributes store character (string) values; the only difference is the string size. Another example of compatible data types is NUMBER and SMALLINT. Both data types are used to store numeric values.

Note: Some DBMS products may require union-compatible tables to have identical data types.

9.1.1 UNION

Suppose that SaleCo has bought another company. SaleCo's management wants to make sure that the acquired company's customer list is properly merged with SaleCo's customer list. Because it is quite possible that some customers have purchased goods from both companies, the two lists may contain common customers. SaleCo's management wants to make sure that customer records are not duplicated when the two customer lists are merged. The UNION query is a perfect tool for generating a combined listing of customers, one that excludes duplicate records.

Online Content: The 'Ch09_SaleCo' database used to illustrate the UNION commands is located on the online platform for this book.

The UNION statement combines rows from two or more queries without including duplicate rows. The syntax of the UNION statement is:

    query UNION query

In other words, the UNION statement combines the output of two SELECT queries. (Remember that the SELECT statements must be union-compatible. That is, they must return the same attribute names and similar data types.)

To demonstrate the use of the UNION statement in SQL, let's use the CUSTOMER and CUSTOMER_2 tables in the Ch09_SaleCo database. To show the combined CUSTOMER and CUSTOMER_2 records without the duplicates, the UNION query is written as follows:

    SELECT CUS_LNAME, CUS_FNAME, CUS_INITIAL, CUS_AREACODE, CUS_PHONE
    FROM CUSTOMER
    UNION
    SELECT CUS_LNAME, CUS_FNAME, CUS_INITIAL, CUS_AREACODE, CUS_PHONE
    FROM CUSTOMER_2;

Figure 9.1 shows the contents of the CUSTOMER and CUSTOMER_2 tables and the result of the UNION query.

FIGURE 9.1 UNION query results

Database name: Ch09_SaleCo

Table name: CUSTOMER

CUS_CODE  CUS_LNAME   CUS_FNAME  CUS_INITIAL  CUS_AREACODE  CUS_PHONE  CUS_BALANCE
10010     Ramas       Alfred     A            0181          844-2573   0.00
10011     Dunne       Leona      K            0161          894-1238   0.00
10012     Moloi       Marlene    W            0181          894-2285   345.86
10013     Pieterse    Jaco       F            0181          894-2180   536.75
10014     Orlando     Myron                   0181          222-1672   0.00
10015     O'Brian     Amy        B            0161          442-3381   0.00
10016     Brown       James      G            0181          297-1228   221.19
10017     Williams    George                  0181          290-2556   768.93
10018     Padayachee  Vinaya     G            0161          382-7185   216.55
10019     Moloi       Mlilo      K            0161          297-3809   0.00

Table name: CUSTOMER_2

CUS_CODE  CUS_LNAME   CUS_FNAME  CUS_INITIAL  CUS_AREACODE  CUS_PHONE
345       Terrell     Justine    H            0181          322-9870
347       Pieterse    Jaco       F            0181          894-2180
351       Hernandez   Carlos     J            8192          123-7654
352       McDowell    George                  8192          123-7768
365       Tirpin      Khaleed    G            8192          123-9876
368       Lewis       Marie      J            8192          332-1789
369       Dunne       Leona      K            0161          894-1238

Query: CUSTOMER UNION CUSTOMER_2

CUS_LNAME   CUS_FNAME  CUS_INITIAL  CUS_AREACODE  CUS_PHONE
Brown       James      G            0181          297-1228
Dunne       Leona      K            0161          894-1238
Padayachee  Vinaya     G            0161          382-7185
Hernandez   Carlos     J            8192          123-7654
Lewis       Marie      J            8192          332-1789
McDowell    George                  8192          123-7768
O'Brian     Amy        B            0161          442-3381
Pieterse    Jaco       F            0181          894-2180
Orlando     Myron                   0181          222-1672
Ramas       Alfred     A            0181          844-2573
Moloi       Marlene    W            0181          894-2285
Moloi       Mlilo      K            0161          297-3809
Terrell     Justine    H            0181          322-9870
Tirpin      Khaleed    G            8192          123-9876
Williams    George                  0181          290-2556

As you examine Figure 9.1, note the following:

The CUSTOMER table contains ten rows, while the CUSTOMER_2 table contains seven rows.
Customers Dunne and Pieterse are included in the CUSTOMER table as well as in the CUSTOMER_2 table.
The UNION query yields 15 records because the duplicate records of customers Dunne and Pieterse are not included. In short, the UNION query yields a unique set of records.
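To try the duplicate-removing behaviour of UNION without an Oracle installation, the following minimal sketch uses Python's built-in sqlite3 module. It is an illustration under stated assumptions, not the book's script: the database is an in-memory SQLite database, only three rows from each table in Figure 9.1 are loaded, and the CUS_INITIAL and CUS_BALANCE columns are left out to keep the example short.

```python
import sqlite3

# In-memory SQLite database holding a small subset of the Figure 9.1 rows.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE CUSTOMER   (CUS_LNAME TEXT, CUS_FNAME TEXT, CUS_AREACODE TEXT, CUS_PHONE TEXT);
    CREATE TABLE CUSTOMER_2 (CUS_LNAME TEXT, CUS_FNAME TEXT, CUS_AREACODE TEXT, CUS_PHONE TEXT);
    INSERT INTO CUSTOMER VALUES
        ('Ramas',    'Alfred', '0181', '844-2573'),
        ('Dunne',    'Leona',  '0161', '894-1238'),
        ('Pieterse', 'Jaco',   '0181', '894-2180');
    INSERT INTO CUSTOMER_2 VALUES
        ('Terrell',  'Justine', '0181', '322-9870'),
        ('Pieterse', 'Jaco',    '0181', '894-2180'),
        ('Dunne',    'Leona',   '0161', '894-1238');
""")

# Both SELECT lists have the same columns in the same order (union-compatible).
union_rows = conn.execute("""
    SELECT CUS_LNAME, CUS_FNAME, CUS_AREACODE, CUS_PHONE FROM CUSTOMER
    UNION
    SELECT CUS_LNAME, CUS_FNAME, CUS_AREACODE, CUS_PHONE FROM CUSTOMER_2
    ORDER BY CUS_LNAME
""").fetchall()

for row in union_rows:
    print(row)
```

Six rows go in, but Dunne and Pieterse appear in both tables, so the combined listing contains each of them only once.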

Note: You were first introduced to the UNION operator in Chapter 4, Relational Algebra and Calculus, when you learnt how to combine all tuples from two relations. We could therefore write the SQL query:

    SELECT CUS_LNAME, CUS_FNAME, CUS_INITIAL, CUS_AREACODE, CUS_PHONE
    FROM CUSTOMER
    UNION
    SELECT CUS_LNAME, CUS_FNAME, CUS_INITIAL, CUS_AREACODE, CUS_PHONE
    FROM CUSTOMER_2;

as the following relational algebra statement:

    Π CUS_LNAME, CUS_FNAME, CUS_INITIAL, CUS_AREACODE, CUS_PHONE (CUSTOMER) ∪ Π CUS_LNAME, CUS_FNAME, CUS_INITIAL, CUS_AREACODE, CUS_PHONE (CUSTOMER_2)

Note: The SQL standard calls for the elimination of duplicate rows when the UNION SQL statement is used. However, some DBMS vendors may not adhere to that standard. Check your DBMS manual to see if the UNION statement is supported and, if so, how it is supported. For example, the latest version of MySQL 8.0 and Oracle 18c both support the UNION SQL statement.

The UNION statement can be used to unite more than just two queries. For example, assume that you have four union-compatible queries named T1, T2, T3 and T4. With the UNION statement, you can combine the output of all four queries into a single result set. The SQL statement will be similar to this:

    SELECT column-list FROM T1
    UNION
    SELECT column-list FROM T2
    UNION
    SELECT column-list FROM T3
    UNION
    SELECT column-list FROM T4;

9.1.2 UNION ALL

If SaleCo's management wants to know how many customers are on both the CUSTOMER and CUSTOMER_2 lists, a UNION ALL query can be used to produce a relation that retains the duplicate rows. Therefore, the following query will keep all rows from both queries (including the duplicate rows) and return 17 rows.

    SELECT CUS_LNAME, CUS_FNAME, CUS_INITIAL, CUS_AREACODE, CUS_PHONE
    FROM CUSTOMER
    UNION ALL
    SELECT CUS_LNAME, CUS_FNAME, CUS_INITIAL, CUS_AREACODE, CUS_PHONE
    FROM CUSTOMER_2;

Running the preceding UNION ALL query produces the result shown in Figure 9.2.

FIGURE 9.2 UNION ALL query results

Database name: Ch09_SaleCo

CUS_LNAME   CUS_FNAME  CUS_INITIAL  CUS_AREACODE  CUS_PHONE
Ramas       Alfred     A            0181          844-2573
Dunne       Leona      K            0161          894-1238
Moloi       Marlene    W            0181          894-2285
Pieterse    Jaco       F            0181          894-2180
Orlando     Myron                   0181          222-1672
O'Brian     Amy        B            0161          442-3381
Brown       James      G            0181          297-1228
Williams    George                  0181          290-2556
Padayachee  Vinaya     G            0161          382-7185
Moloi       Mlilo      K            0161          297-3809
Terrell     Justine    H            0181          322-9870
Pieterse    Jaco       F            0181          894-2180
Hernandez   Carlos     J            8192          123-7654
McDowell    George                  8192          123-7768
Tirpin      Khaleed    G            8192          123-9876
Lewis       Marie      J            8192          332-1789
Dunne       Leona      K            0161          894-1238

Like the UNION statement, the UNION ALL statement can be used to unite more than just two queries.
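The difference between UNION and UNION ALL can be checked by counting the rows each one returns. The sketch below is a minimal illustration using Python's sqlite3 module with an in-memory database and a three-row subset of each customer table; the helper function count_rows is hypothetical and exists only for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE CUSTOMER   (CUS_LNAME TEXT, CUS_FNAME TEXT, CUS_PHONE TEXT);
    CREATE TABLE CUSTOMER_2 (CUS_LNAME TEXT, CUS_FNAME TEXT, CUS_PHONE TEXT);
    INSERT INTO CUSTOMER VALUES
        ('Ramas',    'Alfred', '844-2573'),
        ('Dunne',    'Leona',  '894-1238'),
        ('Pieterse', 'Jaco',   '894-2180');
    INSERT INTO CUSTOMER_2 VALUES
        ('Terrell',  'Justine', '322-9870'),
        ('Pieterse', 'Jaco',    '894-2180'),
        ('Dunne',    'Leona',   '894-1238');
""")

def count_rows(set_operator):
    """Count the rows produced when the two lists are combined with set_operator."""
    sql = f"""
        SELECT COUNT(*) FROM (
            SELECT CUS_LNAME, CUS_FNAME, CUS_PHONE FROM CUSTOMER
            {set_operator}
            SELECT CUS_LNAME, CUS_FNAME, CUS_PHONE FROM CUSTOMER_2
        )
    """
    return conn.execute(sql).fetchone()[0]

union_count = count_rows("UNION")          # duplicate rows removed
union_all_count = count_rows("UNION ALL")  # duplicate rows retained
duplicate_count = union_all_count - union_count
print(union_count, union_all_count, duplicate_count)
```

With the full tables in Figure 9.1 the same comparison gives 15 and 17 rows. The difference equals the number of customers on both lists only when neither list contains duplicates of its own.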

9.1.3 INTERSECT

If SaleCo's management wants to know which customer records are duplicated in the CUSTOMER and CUSTOMER_2 tables, the INTERSECT statement can be used to combine rows from two queries, returning only the rows that appear in both sets. The syntax for the INTERSECT statement is:

    query INTERSECT query

To generate the list of duplicate customer records, you can use:

    SELECT CUS_LNAME, CUS_FNAME, CUS_INITIAL, CUS_AREACODE, CUS_PHONE
    FROM CUSTOMER
    INTERSECT
    SELECT CUS_LNAME, CUS_FNAME, CUS_INITIAL, CUS_AREACODE, CUS_PHONE
    FROM CUSTOMER_2;

Note: The SQL query you have just seen can be written using the relational algebra INTERSECT operator as follows:

    Π CUS_LNAME, CUS_FNAME, CUS_INITIAL, CUS_AREACODE, CUS_PHONE (CUSTOMER) ∩ Π CUS_LNAME, CUS_FNAME, CUS_INITIAL, CUS_AREACODE, CUS_PHONE (CUSTOMER_2)

The INTERSECT statement can be used to generate additional useful customer information. For example, the following query returns the customer codes for all customers who are located in area code 0181 and who have made purchases. (If a customer has made a purchase, there must be an invoice record for that customer.)

    SELECT CUS_CODE FROM CUSTOMER
    WHERE CUS_AREACODE = '0181'
    INTERSECT
    SELECT DISTINCT CUS_CODE FROM INVOICE;

Figure 9.3 shows both sets of SQL statements and their output.
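The area-code query can be run as written on any DBMS that supports INTERSECT. The following sketch is one way to try it, assuming an in-memory SQLite database accessed through Python's sqlite3 module and a reduced data set: only the CUS_CODE and CUS_AREACODE columns of five customers and the first four invoices from the chapter's figures are loaded.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE CUSTOMER (CUS_CODE INTEGER PRIMARY KEY, CUS_AREACODE TEXT);
    CREATE TABLE INVOICE  (INV_NUMBER INTEGER PRIMARY KEY, CUS_CODE INTEGER);
    INSERT INTO CUSTOMER VALUES
        (10010, '0181'), (10011, '0161'), (10012, '0181'),
        (10013, '0181'), (10014, '0181');
    INSERT INTO INVOICE VALUES
        (1001, 10014), (1002, 10011), (1003, 10012), (1004, 10011);
""")

# Customers in area code 0181 INTERSECT customers that appear on an invoice.
rows = conn.execute("""
    SELECT CUS_CODE FROM CUSTOMER WHERE CUS_AREACODE = '0181'
    INTERSECT
    SELECT DISTINCT CUS_CODE FROM INVOICE
    ORDER BY CUS_CODE
""").fetchall()

buyers_in_0181 = [code for (code,) in rows]
print(buyers_in_0181)
```

Customer 10011 has invoices but is in area code 0161, and customers 10010 and 10013 are in area code 0181 but have no invoices, so none of them appears in the result.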

Note: Microsoft Access does not support the INTERSECT query, nor does it support other complex queries you'll explore in this chapter. At least, in some cases, Access might be able to give you the desired results if you use an alternative query format or procedure. For example, although Access does not support SQL triggers and stored procedures, you can use Visual Basic code to perform similar actions. However, the objective here is to show you how to use some important standard SQL features.

FIGURE 9.3 INTERSECT query results

9.1.4 MINUS

The MINUS statement in SQL combines rows from two queries and returns only the rows that appear in the first set but not in the second. The syntax for the MINUS statement is:

    query MINUS query

For example, if the SaleCo managers want to know what customers in the CUSTOMER table are not found in the CUSTOMER_2 table, they can use:

    SELECT CUS_LNAME, CUS_FNAME, CUS_INITIAL, CUS_AREACODE, CUS_PHONE
    FROM CUSTOMER
    MINUS
    SELECT CUS_LNAME, CUS_FNAME, CUS_INITIAL, CUS_AREACODE, CUS_PHONE
    FROM CUSTOMER_2;

If the managers want to know which customers in the CUSTOMER_2 table are not found in the CUSTOMER table, they merely switch the table designations:

    SELECT CUS_LNAME, CUS_FNAME, CUS_INITIAL, CUS_AREACODE, CUS_PHONE
    FROM CUSTOMER_2
    MINUS
    SELECT CUS_LNAME, CUS_FNAME, CUS_INITIAL, CUS_AREACODE, CUS_PHONE
    FROM CUSTOMER;

You can extract much useful information by combining MINUS with various clauses such as WHERE. For example, the following query returns the customer codes for all customers located in area code 0181 minus the ones who have made purchases, leaving the customers in area code 0181 who have not made purchases.

    SELECT CUS_CODE FROM CUSTOMER
    WHERE CUS_AREACODE = '0181'
    MINUS
    SELECT DISTINCT CUS_CODE FROM INVOICE;

Figure 9.4 shows the preceding three SQL statements and their output.

FIGURE 9.4 MINUS query results
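The difference operator can be tried in the same way. The sketch below assumes SQLite (through Python's sqlite3 module), which spells the operator EXCEPT, the SQL standard keyword, rather than Oracle's MINUS; apart from that keyword the query is the third MINUS query above. The rows are a reduced subset of the chapter's CUSTOMER and INVOICE data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE CUSTOMER (CUS_CODE INTEGER PRIMARY KEY, CUS_AREACODE TEXT);
    CREATE TABLE INVOICE  (INV_NUMBER INTEGER PRIMARY KEY, CUS_CODE INTEGER);
    INSERT INTO CUSTOMER VALUES
        (10010, '0181'), (10011, '0161'), (10012, '0181'),
        (10013, '0181'), (10014, '0181');
    INSERT INTO INVOICE VALUES
        (1001, 10014), (1002, 10011), (1003, 10012), (1004, 10011);
""")

# EXCEPT is the standard keyword for the operator that Oracle calls MINUS.
rows = conn.execute("""
    SELECT CUS_CODE FROM CUSTOMER WHERE CUS_AREACODE = '0181'
    EXCEPT
    SELECT DISTINCT CUS_CODE FROM INVOICE
    ORDER BY CUS_CODE
""").fetchall()

non_buyers_in_0181 = [code for (code,) in rows]
print(non_buyers_in_0181)
```

Swapping the two SELECT statements changes the result, because the difference operator is not commutative: the swapped version would list invoiced customers who are not in area code 0181.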


Note: Some DBMS products do not support the INTERSECT or MINUS statements, while others may implement the difference relational operator in SQL as EXCEPT. Consult your DBMS manual to see if the statements illustrated here are supported by your DBMS. For example, MySQL does not support the MINUS keyword, and releases before 8.0.31 support neither INTERSECT nor EXCEPT.

9.1.5 Syntax Alternatives

If your DBMS doesn't support the INTERSECT or MINUS statements, you can use IN and NOT IN subqueries to obtain similar results. For example, the following query produces the same results as the INTERSECT query shown in Section 9.1.3.

    SELECT CUS_CODE FROM CUSTOMER
    WHERE CUS_AREACODE = '0181' AND
          CUS_CODE IN (SELECT DISTINCT CUS_CODE FROM INVOICE);

Figure 9.5 shows the use of the INTERSECT alternative.

FIGURE 9.5 INTERSECT alternative

Database name: Ch09_SaleCo

Table name: CUSTOMER

CUS_CODE  CUS_LNAME   CUS_FNAME  CUS_INITIAL  CUS_AREACODE  CUS_PHONE  CUS_BALANCE
10010     Ramas       Alfred     A            0181          844-2573   0.00
10011     Dunne       Leona      K            0161          894-1238   0.00
10012     Moloi       Marlene    W            0181          894-2285   345.86
10013     Pieterse    Jaco       F            0181          894-2180   536.75
10014     Orlando     Myron                   0181          222-1672   0.00
10015     O'Brian     Amy        B            0161          442-3381   0.00
10016     Brown       James      G            0181          297-1228   221.19
10017     Williams    George                  0181          290-2556   768.93
10018     Padayachee  Vinaya     G            0161          382-7185   216.55
10019     Moloi       Mlilo      K            0181          297-3809   0.00

Table name: INVOICE

INV_NUMBER  CUS_CODE  INV_DATE
1001        10014     16-Jan-19
1002        10011     16-Jan-19
1003        10012     16-Jan-19
1004        10011     17-Jan-19
1005        10018     17-Jan-19
1006        10014     17-Jan-19
1007        10015     17-Jan-19
1008        10011     17-Jan-19

Query result:

CUS_CODE
10012
10014
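One way to convince yourself that the IN subquery is a drop-in replacement is to run both forms against the same data and compare the results. The sketch below does that with Python's sqlite3 module, assuming an in-memory SQLite database loaded with the CUS_CODE and CUS_AREACODE columns of Figure 9.5 and its eight invoices.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE CUSTOMER (CUS_CODE INTEGER PRIMARY KEY, CUS_AREACODE TEXT);
    CREATE TABLE INVOICE  (INV_NUMBER INTEGER PRIMARY KEY, CUS_CODE INTEGER);
    INSERT INTO CUSTOMER VALUES
        (10010, '0181'), (10011, '0161'), (10012, '0181'), (10013, '0181'),
        (10014, '0181'), (10015, '0161'), (10016, '0181'), (10017, '0181'),
        (10018, '0161'), (10019, '0181');
    INSERT INTO INVOICE VALUES
        (1001, 10014), (1002, 10011), (1003, 10012), (1004, 10011),
        (1005, 10018), (1006, 10014), (1007, 10015), (1008, 10011);
""")

# Alternative form: membership test against a subquery.
in_result = conn.execute("""
    SELECT CUS_CODE FROM CUSTOMER
    WHERE CUS_AREACODE = '0181'
      AND CUS_CODE IN (SELECT DISTINCT CUS_CODE FROM INVOICE)
    ORDER BY CUS_CODE
""").fetchall()

# Set-operator form from Section 9.1.3.
intersect_result = conn.execute("""
    SELECT CUS_CODE FROM CUSTOMER WHERE CUS_AREACODE = '0181'
    INTERSECT
    SELECT DISTINCT CUS_CODE FROM INVOICE
    ORDER BY CUS_CODE
""").fetchall()

print(in_result, intersect_result)
```

Both queries return customer codes 10012 and 10014, matching the query result shown in Figure 9.5.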

Note: Microsoft Access generates an input request for the CUS_AREACODE if you use apostrophes around the area code. (If you supply the 0181 area code, the query will execute properly.) To eliminate that problem, use standard double quotation marks, writing the WHERE clause in the second line of the preceding SQL statement as:

    WHERE CUS_AREACODE = "0181" AND

Microsoft Access will also accept single quotation marks.

Using the same alternative to the MINUS statement, you can generate the output for the third MINUS query shown in Section 9.1.4 by using:

    SELECT CUS_CODE FROM CUSTOMER
    WHERE CUS_AREACODE = '0181' AND
          CUS_CODE NOT IN (SELECT DISTINCT CUS_CODE FROM INVOICE);

The results of that query are shown in Figure 9.6. Note that the query output includes only the customers in area code 0181 who have not made any purchases and, therefore, have not generated invoices.

FIGURE 9.6 MINUS alternative

Database name: Ch09_SaleCo

Table name: CUSTOMER

CUS_CODE  CUS_LNAME   CUS_FNAME  CUS_INITIAL  CUS_AREACODE  CUS_PHONE  CUS_BALANCE
10010     Ramas       Alfred     A            0181          844-2573   0.00
10011     Dunne       Leona      K            0161          894-1238   0.00
10012     Moloi       Marlene    W            0181          894-2285   345.86
10013     Pieterse    Jaco       F            0181          894-2180   536.75
10014     Orlando     Myron                   0181          222-1672   0.00
10015     O'Brian     Amy        B            0161          442-3381   0.00
10016     Brown       James      G            0181          297-1228   221.19
10017     Williams    George                  0181          290-2556   768.93
10018     Padayachee  Vinaya     G            0161          382-7185   216.55
10019     Moloi       Mlilo      K            0181          297-3809   0.00

Table name: INVOICE

INV_NUMBER  CUS_CODE  INV_DATE
1001        10014     16-Jan-19
1002        10011     16-Jan-19
1003        10012     16-Jan-19
1004        10011     17-Jan-19
1005        10018     17-Jan-19
1006        10014     17-Jan-19
1007        10015     17-Jan-19
1008        10011     17-Jan-19

Query result:

CUS_CODE
10010
10013
10016
10017
10019
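The NOT IN form can be checked against the Figure 9.6 data in the same way. This is a minimal sketch, assuming an in-memory SQLite database accessed through Python's sqlite3 module and only the columns the query needs.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE CUSTOMER (CUS_CODE INTEGER PRIMARY KEY, CUS_AREACODE TEXT);
    CREATE TABLE INVOICE  (INV_NUMBER INTEGER PRIMARY KEY, CUS_CODE INTEGER);
    INSERT INTO CUSTOMER VALUES
        (10010, '0181'), (10011, '0161'), (10012, '0181'), (10013, '0181'),
        (10014, '0181'), (10015, '0161'), (10016, '0181'), (10017, '0181'),
        (10018, '0161'), (10019, '0181');
    INSERT INTO INVOICE VALUES
        (1001, 10014), (1002, 10011), (1003, 10012), (1004, 10011),
        (1005, 10018), (1006, 10014), (1007, 10015), (1008, 10011);
""")

# Customers in area code 0181 whose code never appears on an invoice.
rows = conn.execute("""
    SELECT CUS_CODE FROM CUSTOMER
    WHERE CUS_AREACODE = '0181'
      AND CUS_CODE NOT IN (SELECT DISTINCT CUS_CODE FROM INVOICE)
    ORDER BY CUS_CODE
""").fetchall()

not_in_result = [code for (code,) in rows]
print(not_in_result)
```

One caution when using this alternative on other data: if the subquery can return a NULL value, a NOT IN condition never evaluates to true, so the query returns no rows. In this sketch every invoice has a customer code, so the issue does not arise.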

9.2 SQL Join Operators

The relational join operation merges rows from two tables and returns the rows with one of the following conditions:

Have common values in common columns (natural join).
Meet a given join condition (equality or inequality).
Have common values in common columns or have no matching values (outer join).

In Chapter 8, Beginning Structured Query Language, you learnt how to use the SELECT statement in conjunction with the WHERE clause to join two or more tables. For example, you can join the PRODUCT and VENDOR tables through their common V_CODE by writing:

    SELECT P_CODE, P_DESCRIPT, P_PRICE, V_NAME
    FROM PRODUCT, VENDOR
    WHERE PRODUCT.V_CODE = VENDOR.V_CODE;


The preceding SQL join syntax is sometimes referred to as an old-style join. Note that the FROM clause contains the tables being joined and that the WHERE clause contains the condition(s) used to join the tables.

As you examine the preceding query, note the following points:

The FROM clause indicates which tables are to be joined. If three or more tables are included, the join operation takes place two tables at a time, starting from left to right. For example, if you are joining tables T1, T2 and T3, first table T1 is joined to T2; the results of that join are then joined to table T3.
The join condition in the WHERE clause tells the SELECT statement which rows will be returned. In this case, the SELECT statement returns all rows for which the V_CODE values in the PRODUCT and VENDOR tables are equal.
The number of join conditions is always equal to the number of tables being joined minus one. For example, if you join three tables (T1, T2 and T3), you will have two join conditions (j1 and j2). All join conditions are connected through an AND logical operator. The first join condition (j1) defines the join criteria for T1 and T2. The second join condition (j2) defines the join criteria for the output of the first join and T3.
Generally, the join condition will be an equality comparison of the primary key in one table and the related foreign key in the second table.

Join operations can be classified as inner joins and outer joins. The inner join is the traditional join in which only rows that meet a given criteria are selected. The join criteria can be an equality condition (natural join or equijoin) or an inequality condition (theta join). An outer join returns not only the matching rows, but also the rows with unmatched attribute values for one table or both tables to be joined. The SQL standard also introduces a special type of join that returns the same result as the Cartesian product of two sets or tables.

In this section, you will learn some different ways to express join operations that meet the ANSI SQL standard (see Table 9.1). It is useful to remember that not all DBMS vendors provide the same level of SQL support and that some do not support the join styles shown in this section. Oracle 11g is used to demonstrate the use of the following queries. Refer to your DBMS manual if you are using a different DBMS.

Table 9.1 SQL join expression styles

Join Classification: CROSS
  Join Type: CROSS JOIN
    SQL Syntax Example: SELECT * FROM T1, T2
    Description: Returns the Cartesian product of T1 and T2 (old style).
    SQL Syntax Example: SELECT * FROM T1 CROSS JOIN T2
    Description: Returns the Cartesian product of T1 and T2.

Join Classification: INNER
  Join Type: Old-Style JOIN
    SQL Syntax Example: SELECT * FROM T1, T2 WHERE T1.C1 = T2.C1
    Description: Returns only the rows that meet the join condition in the WHERE clause (old style). Only rows with matching values are selected.
  Join Type: NATURAL JOIN
    SQL Syntax Example: SELECT * FROM T1 NATURAL JOIN T2
    Description: Returns only the rows with matching values in the matching columns. The matching columns must have the same names and similar data types.
  Join Type: JOIN USING
    SQL Syntax Example: SELECT * FROM T1 JOIN T2 USING (C1)
    Description: Returns only the rows with matching values in the columns indicated in the USING clause.
  Join Type: JOIN ON
    SQL Syntax Example: SELECT * FROM T1 JOIN T2 ON T1.C1 = T2.C1
    Description: Returns only the rows that meet the join condition indicated in the ON clause.

Join Classification: OUTER
  Join Type: LEFT JOIN
    SQL Syntax Example: SELECT * FROM T1 LEFT OUTER JOIN T2 ON T1.C1 = T2.C1
    Description: Returns rows with matching values and includes all rows from the left table (T1) with unmatched values.
  Join Type: RIGHT JOIN
    SQL Syntax Example: SELECT * FROM T1 RIGHT OUTER JOIN T2 ON T1.C1 = T2.C1
    Description: Returns rows with matching values and includes all rows from the right table (T2) with unmatched values.
  Join Type: FULL JOIN
    SQL Syntax Example: SELECT * FROM T1 FULL OUTER JOIN T2 ON T1.C1 = T2.C1
    Description: Returns rows with matching values and includes all rows from both tables (T1 and T2) with unmatched values.

9.2.1 Cross Join

A cross join performs a relational product (also known as the Cartesian product) of two tables. The cross join syntax is:

SELECT column-list FROM table1 CROSS JOIN table2

For example,

SELECT * FROM INVOICE CROSS JOIN LINE;

performs a cross join of the INVOICE and LINE tables. That CROSS JOIN query generates 144 rows. (There were eight invoice rows and 18 line rows, thus yielding 8 × 18 = 144 rows.)

You can also perform a cross join that yields only specified attributes. For example, you can specify:

SELECT INVOICE.INV_NUMBER, CUS_CODE, INV_DATE, P_CODE
FROM INVOICE CROSS JOIN LINE;

The results generated through that SQL statement can also be generated by using the following syntax:

SELECT INVOICE.INV_NUMBER, CUS_CODE, INV_DATE, P_CODE
FROM INVOICE, LINE;
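The row arithmetic above can be checked with a minimal sketch. It assumes Python's built-in sqlite3 module rather than Oracle, and the cut-down INVOICE and LINE tables hold placeholder rows (eight and 18 of them), not the book's data.

```python
import sqlite3

# Hypothetical miniature tables: 8 invoice rows and 18 line rows, as in the text.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE INVOICE (INV_NUMBER INTEGER PRIMARY KEY)")
conn.execute("CREATE TABLE LINE (INV_NUMBER INTEGER, LINE_NUMBER INTEGER)")
conn.executemany("INSERT INTO INVOICE VALUES (?)",
                 [(1001 + i,) for i in range(8)])
conn.executemany("INSERT INTO LINE VALUES (?, ?)",
                 [(1001 + i % 8, i // 8 + 1) for i in range(18)])

# The ANSI CROSS JOIN and the old-style comma list build the same product.
ansi_rows = conn.execute("SELECT * FROM INVOICE CROSS JOIN LINE").fetchall()
old_rows = conn.execute("SELECT * FROM INVOICE, LINE").fetchall()
print(len(ansi_rows), len(old_rows))
```

Every invoice row is paired with every line row, whether or not the invoice numbers match, which is why a cross join is rarely what you want on real data.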

9.2.2 Natural Join

Recall from Chapter 3, Relational Model Characteristics, that a natural join returns all rows with matching values in the matching columns and eliminates duplicate columns. That style of query is used when the tables share one or more common attributes with common names. The natural join syntax is:

SELECT column-list FROM table1 NATURAL JOIN table2

The natural join will perform the following tasks:

Determine the common attribute(s) by looking for attributes with identical names and compatible data types.

Select only the rows with common values in the common attribute(s).

If there are no common attributes, return the relational product of the two tables.

The following example performs a natural join of the CUSTOMER and INVOICE tables and returns only selected attributes:

SELECT CUS_CODE, CUS_LNAME, INV_NUMBER, INV_DATE
FROM CUSTOMER NATURAL JOIN INVOICE;

The SQL code and its results are shown at the top of Figure 9.7.

Figure 9.7 Natural join query results

You are not limited to two tables. For example, you can perform a natural join of the INVOICE, LINE and PRODUCT tables and project only selected attributes by writing:

SELECT INV_NUMBER, P_CODE, P_DESCRIPT, LINE_UNITS, LINE_PRICE
FROM INVOICE NATURAL JOIN LINE NATURAL JOIN PRODUCT;

The SQL code and its results are shown at the bottom of Figure 9.7.
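As a rough illustration of the natural join tasks listed above, the following sketch runs the CUSTOMER and INVOICE example against an in-memory SQLite database (an assumption; the chapter itself uses Oracle). The sample rows are invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE CUSTOMER (CUS_CODE INTEGER PRIMARY KEY, CUS_LNAME TEXT);
CREATE TABLE INVOICE (INV_NUMBER INTEGER PRIMARY KEY, CUS_CODE INTEGER,
                      INV_DATE TEXT);
-- Hypothetical sample rows; customer 10013 has no invoice.
INSERT INTO CUSTOMER VALUES (10011, 'Dunne'), (10012, 'Smith'),
                            (10013, 'Olowski');
INSERT INTO INVOICE VALUES (1001, 10011, '2020-01-16'),
                           (1002, 10012, '2020-01-16'),
                           (1003, 10011, '2020-01-17');
""")

# NATURAL JOIN finds the shared column (CUS_CODE) on its own.
natural_rows = conn.execute("""
    SELECT CUS_CODE, CUS_LNAME, INV_NUMBER, INV_DATE
    FROM CUSTOMER NATURAL JOIN INVOICE
    ORDER BY INV_NUMBER
""").fetchall()

# Old-style equivalent: the join condition is spelled out and qualified.
old_rows = conn.execute("""
    SELECT CUSTOMER.CUS_CODE, CUS_LNAME, INV_NUMBER, INV_DATE
    FROM CUSTOMER, INVOICE
    WHERE CUSTOMER.CUS_CODE = INVOICE.CUS_CODE
    ORDER BY INV_NUMBER
""").fetchall()
print(natural_rows)
```

Both forms return the same three rows; the customer without invoices is dropped because a natural join is an inner join.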

One important difference between the natural join and the old-style join syntax is that the natural join does not require the use of a table qualifier for the common attributes. In the first natural join example, you projected CUS_CODE, yet the projection did not require any table qualifier, even though the CUS_CODE attribute appeared in both the CUSTOMER and INVOICE tables. The same can be said of the INV_NUMBER attribute in the second natural join example.

9.2.3 JOIN USING Clause

A second way to express a join is through the USING keyword. This query returns only the rows with matching values in the column indicated in the USING clause, and that column must exist in both tables. The syntax is:

SELECT column-list FROM table1 JOIN table2 USING (common-column)

To see the JOIN USING query in action, let's perform a join of the INVOICE, LINE and PRODUCT tables by writing:

SELECT INV_NUMBER, P_CODE, P_DESCRIPT, LINE_UNITS, LINE_PRICE
FROM INVOICE JOIN LINE USING (INV_NUMBER)
JOIN PRODUCT USING (P_CODE);

The SQL statement produces the results shown in Figure 9.8.

Figure 9.8 JOIN USING results

As was the case with the NATURAL JOIN command, the JOIN USING operand does not require table qualifiers. As a matter of fact, Oracle will return an error if you specify the table name in the USING clause.
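The JOIN USING query can be tried out with a small sketch like the one below. It assumes SQLite through Python's sqlite3 module, and the rows are illustrative stand-ins for the book's tables.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE INVOICE (INV_NUMBER INTEGER PRIMARY KEY);
CREATE TABLE PRODUCT (P_CODE TEXT PRIMARY KEY, P_DESCRIPT TEXT);
CREATE TABLE LINE (INV_NUMBER INTEGER, P_CODE TEXT,
                   LINE_UNITS INTEGER, LINE_PRICE REAL);
-- Hypothetical sample rows.
INSERT INTO INVOICE VALUES (1001), (1002);
INSERT INTO PRODUCT VALUES ('13-Q2/P2', '7.25-in. power saw blade'),
                           ('23109-HB', 'Claw hammer'),
                           ('54778-2T', 'Rat-tail file'),
                           ('89-WRE-Q', 'Hicut chain saw');
INSERT INTO LINE VALUES (1001, '13-Q2/P2', 1, 14.99),
                        (1001, '23109-HB', 1, 9.95),
                        (1002, '54778-2T', 2, 4.99);
""")

# Each USING clause names the single column shared by the two sides.
using_rows = conn.execute("""
    SELECT INV_NUMBER, P_CODE, P_DESCRIPT, LINE_UNITS, LINE_PRICE
    FROM INVOICE JOIN LINE USING (INV_NUMBER)
                 JOIN PRODUCT USING (P_CODE)
    ORDER BY INV_NUMBER, P_CODE
""").fetchall()
for row in using_rows:
    print(row)
```

The product that never appears on an invoice line ('89-WRE-Q') is absent from the result, and none of the projected columns needs a table qualifier.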


9.2.4 JOIN ON Clause

The previous two join styles used common attribute names in the joining tables. Another way to express a join when the tables have no common attribute names is to use the JOIN ON operand. That query will return only the rows that meet the indicated join condition. The join condition will typically include an equality comparison expression of two columns. (The columns may or may not share the same name but, obviously, must have comparable data types.) The syntax is:

SELECT column-list FROM table1 JOIN table2 ON join-condition

The following example performs a join of the INVOICE, LINE and PRODUCT tables, using the ON clause. The result is shown in Figure 9.9.

SELECT INVOICE.INV_NUMBER, PRODUCT.P_CODE, P_DESCRIPT, LINE_UNITS, LINE_PRICE
FROM INVOICE JOIN LINE ON INVOICE.INV_NUMBER = LINE.INV_NUMBER
JOIN PRODUCT ON LINE.P_CODE = PRODUCT.P_CODE;

Figure 9.9 JOIN ON results

Note that, unlike the NATURAL JOIN and the JOIN USING operands, the JOIN ON clause requires a table qualifier for the common attributes. If you do not specify the table qualifier, you will get a 'column ambiguously defined' error message.

Keep in mind that the JOIN ON syntax lets you perform a join even when the tables do not share a common attribute name. For example, to generate a list of all employees with the managers' names, you can use the following (recursive) query:

SELECT E.EMP_MGR, M.EMP_LNAME, E.EMP_NUM, E.EMP_LNAME
FROM EMP E JOIN EMP M ON E.EMP_MGR = M.EMP_NUM
ORDER BY E.EMP_MGR;
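The recursive query joins EMP to itself under two aliases. The sketch below shows the idea on a four-row EMP table; it assumes SQLite via Python's sqlite3 module, the rows are invented, and a second sort key (E.EMP_NUM) is added only to make the output order deterministic.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE EMP (EMP_NUM INTEGER PRIMARY KEY, EMP_LNAME TEXT,
                  EMP_MGR INTEGER);
-- Hypothetical rows: employee 100 has no manager (NULL).
INSERT INTO EMP VALUES (100, 'Kolmycz', NULL), (101, 'Lewis', 100),
                       (102, 'Vandam', 100), (103, 'Jones', 101);
""")

# E is the employee's row, M is the row of that employee's manager.
emp_rows = conn.execute("""
    SELECT E.EMP_MGR, M.EMP_LNAME, E.EMP_NUM, E.EMP_LNAME
    FROM EMP E JOIN EMP M ON E.EMP_MGR = M.EMP_NUM
    ORDER BY E.EMP_MGR, E.EMP_NUM
""").fetchall()
for row in emp_rows:
    print(row)
```

Because this is an inner join, the employee whose EMP_MGR is null does not appear as an employee row, only as a manager.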

9.2.5 Outer Joins

An outer join returns not only the rows matching the join condition (that is, rows with matching values in the common columns), but also the rows with unmatched values. The ANSI standard defines three types of outer joins: left, right and full. The left and right designations reflect the order in which the tables are processed by the DBMS. Remember that join operations take place two tables at a time. The first table named in the FROM clause will be the left side, and the second table named will be the right side. If three or more tables are being joined, the result of joining the first two tables becomes the left side; the third table becomes the right side.

The left outer join returns not only the rows matching the join condition (that is, rows with matching values in the common column), but also the rows in the left side table with unmatched values in the right side table. The syntax is:

SELECT column-list
FROM table1 LEFT [OUTER] JOIN table2 ON join-condition

For example, the following query lists the product code, vendor code and vendor name for all products and includes those vendors with no matching products:

SELECT P_CODE, VENDOR.V_CODE, V_NAME
FROM VENDOR LEFT JOIN PRODUCT ON VENDOR.V_CODE = PRODUCT.V_CODE;

The preceding SQL code and its result are shown in Figure 9.10.

Figure 9.10 LEFT JOIN results

The right outer join returns not only the rows matching the join condition (that is, rows with matching values in the common column), but also the rows in the right side table with unmatched values in the left side table. The syntax is:

SELECT column-list
FROM table1 RIGHT [OUTER] JOIN table2 ON join-condition

For example, the following query lists the product code, vendor code and vendor name for all products and also includes those products that do not have a matching vendor code:

SELECT P_CODE, VENDOR.V_CODE, V_NAME
FROM VENDOR RIGHT JOIN PRODUCT ON VENDOR.V_CODE = PRODUCT.V_CODE;

The SQL code and its output are shown in Figure 9.11.

Figure 9.11 RIGHT JOIN results

The full outer join returns not only the rows matching the join condition (that is, rows with matching values in the common column), but also all of the rows with unmatched values in either side table. The syntax is:

SELECT column-list
FROM table1 FULL [OUTER] JOIN table2 ON join-condition

For example, the following query lists the product code, vendor code and vendor name for all products and includes all product rows (products without matching vendors) as well as all vendor rows (vendors without matching products):

SELECT P_CODE, VENDOR.V_CODE, V_NAME
FROM VENDOR FULL JOIN PRODUCT ON VENDOR.V_CODE = PRODUCT.V_CODE;

The SQL code and its result are shown in Figure 9.12.

Figure 9.12 FULL JOIN results
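The three outer join flavours can be compared side by side with a small sketch. It assumes SQLite through Python's sqlite3 module; because older SQLite releases only implement LEFT JOIN, the right join is emulated here by swapping the table order and the full join by appending the unmatched right-side rows. The sample rows are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE VENDOR (V_CODE INTEGER PRIMARY KEY, V_NAME TEXT);
CREATE TABLE PRODUCT (P_CODE TEXT PRIMARY KEY, V_CODE INTEGER);
-- Hypothetical rows: vendor 21226 supplies nothing,
-- product 23114-AA has no vendor.
INSERT INTO VENDOR VALUES (21225, 'Bryson, Inc.'), (21226, 'SuperLoo, Inc.'),
                          (21344, 'Gomez Bros.');
INSERT INTO PRODUCT VALUES ('13-Q2/P2', 21344), ('23109-HB', 21225),
                           ('23114-AA', NULL);
""")

# Left outer join: every vendor, with or without products.
left_rows = conn.execute("""
    SELECT P_CODE, VENDOR.V_CODE, V_NAME
    FROM VENDOR LEFT JOIN PRODUCT ON VENDOR.V_CODE = PRODUCT.V_CODE
""").fetchall()

# Right outer join, emulated: PRODUCT becomes the preserved (left) side.
right_rows = conn.execute("""
    SELECT P_CODE, VENDOR.V_CODE, V_NAME
    FROM PRODUCT LEFT JOIN VENDOR ON VENDOR.V_CODE = PRODUCT.V_CODE
""").fetchall()

# Full outer join, emulated: left join plus the products with no vendor.
full_rows = conn.execute("""
    SELECT P_CODE, VENDOR.V_CODE, V_NAME
    FROM VENDOR LEFT JOIN PRODUCT ON VENDOR.V_CODE = PRODUCT.V_CODE
    UNION ALL
    SELECT P_CODE, VENDOR.V_CODE, V_NAME
    FROM PRODUCT LEFT JOIN VENDOR ON VENDOR.V_CODE = PRODUCT.V_CODE
    WHERE VENDOR.V_CODE IS NULL
""").fetchall()
print(left_rows, right_rows, full_rows, sep="\n")
```

On a DBMS that supports RIGHT JOIN and FULL JOIN directly, the queries in the text can be used as written; the emulation is only a workaround for this sketch.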

9.3 Subqueries and Correlated Queries

The use of joins in a relational database allows you to get information from two or more tables. For example, the following query allows you to get the customers' data with their respective invoices by joining the CUSTOMER and INVOICE tables.

SELECT INV_NUMBER, INVOICE.CUS_CODE, CUS_LNAME, CUS_FNAME
FROM CUSTOMER, INVOICE
WHERE CUSTOMER.CUS_CODE = INVOICE.CUS_CODE;

In the previous query, the data from both tables (CUSTOMER and INVOICE) are processed at once, matching rows with shared CUS_CODE values.

However, it is often necessary to process data based on other processed data. Suppose, for example, you want to generate a list of vendors who do not provide products. (Recall that not all vendors in the VENDOR table have provided products; some of them are only potential vendors.) In Chapter 8, Beginning Structured Query Language, you learnt that you could generate such a list by writing the following query:

SELECT V_CODE, V_NAME FROM VENDOR
WHERE V_CODE NOT IN (SELECT V_CODE FROM PRODUCT);

Similarly, to generate a list of all products with a price greater than or equal to the average product price, you can write the following query:

SELECT P_CODE, P_PRICE FROM PRODUCT
WHERE P_PRICE >= (SELECT AVG(P_PRICE) FROM PRODUCT);

In both of those cases, you needed to get information that was not previously known:

Which vendors provide products?

What is the average price of all products?

In both cases, you used a subquery to generate the required information that could then be used as input for the originating query. Although you learnt how to use subqueries in Chapter 8, let's review the basic characteristics of a subquery:

A subquery is a query (SELECT statement) inside a query.

A subquery is normally expressed inside parentheses.

The first query in the SQL statement is known as the outer query.

The query inside the SQL statement is known as the inner query.

The inner query is executed first.

The output of an inner query is used as the input for the outer query.

The entire SQL statement is sometimes referred to as a nested query.

In this section, you will learn more about the practical use of subqueries. You already know that a subquery is based on the use of the SELECT statement to return one or more values to another query. But subqueries have a wide range of uses. For example, you can use a subquery within a SQL data manipulation language (DML) statement (such as INSERT, UPDATE or DELETE) where a value or a list of values (such as multiple vendor codes or a table) is expected. Table 9.2 uses simple examples to summarise the use of SELECT subqueries in DML statements.
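The two introductory subqueries can be exercised with the sketch below, which assumes SQLite via Python's sqlite3 module and a handful of made-up rows. One caveat it illustrates is standard SQL behaviour rather than something stated in the text: if the NOT IN list contains a null, no row can satisfy the condition, so the inner query here filters nulls out.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE VENDOR (V_CODE INTEGER PRIMARY KEY, V_NAME TEXT);
CREATE TABLE PRODUCT (P_CODE TEXT PRIMARY KEY, V_CODE INTEGER, P_PRICE REAL);
-- Hypothetical rows; product 23114-AA has a null vendor code.
INSERT INTO VENDOR VALUES (21225, 'Bryson, Inc.'), (21226, 'SuperLoo, Inc.'),
                          (21344, 'Gomez Bros.');
INSERT INTO PRODUCT VALUES ('13-Q2/P2', 21344, 14.99),
                           ('23109-HB', 21225, 9.95),
                           ('23114-AA', NULL, 14.40),
                           ('89-WRE-Q', 21344, 256.99);
""")

# Vendors whose code never appears in PRODUCT (nulls removed from the list).
no_products = conn.execute("""
    SELECT V_CODE, V_NAME FROM VENDOR
    WHERE V_CODE NOT IN (SELECT V_CODE FROM PRODUCT
                         WHERE V_CODE IS NOT NULL)
""").fetchall()

# Same query without the null filter: the null makes NOT IN match nothing.
naive = conn.execute("""
    SELECT V_CODE, V_NAME FROM VENDOR
    WHERE V_CODE NOT IN (SELECT V_CODE FROM PRODUCT)
""").fetchall()

# Products priced at or above the average price (a single-value subquery).
above_avg = conn.execute("""
    SELECT P_CODE, P_PRICE FROM PRODUCT
    WHERE P_PRICE >= (SELECT AVG(P_PRICE) FROM PRODUCT)
""").fetchall()
print(no_products, naive, above_avg)
```

In each case the inner SELECT supplies a value, or a list of values, that the outer query could not know in advance.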

Table 9.2 SELECT subquery examples

SELECT Subquery Example:
  INSERT INTO PRODUCT
  SELECT * FROM P;
Explanation: Inserts all rows from Table P into the PRODUCT table. Both tables must have the same attributes. The subquery returns all rows from Table P.

SELECT Subquery Example:
  UPDATE PRODUCT
  SET P_PRICE = (SELECT AVG(P_PRICE) FROM PRODUCT)
  WHERE V_CODE IN (SELECT V_CODE FROM VENDOR WHERE V_AREACODE = '0181')
Explanation: Updates the product price to the average product price, but only for the products that are provided by vendors who have an area code equal to 0181. The first subquery returns the average price; the second subquery returns the list of vendors with an area code equal to 0181.

SELECT Subquery Example:
  DELETE FROM PRODUCT
  WHERE V_CODE IN (SELECT V_CODE FROM VENDOR WHERE V_AREACODE = '0181')
Explanation: Deletes the PRODUCT table rows that are provided by vendors with an area code equal to 0181. The subquery returns the list of vendor codes with an area code equal to 0181.
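The three DML statements in Table 9.2 can be run in sequence on toy data, as in the sketch below. It assumes SQLite via Python's sqlite3 module; the tables are cut down to the columns the statements need, and the rows and codes (P1 to P4, vendors 1 and 2) are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE VENDOR (V_CODE INTEGER PRIMARY KEY, V_AREACODE TEXT);
CREATE TABLE PRODUCT (P_CODE TEXT, P_PRICE REAL, V_CODE INTEGER);
CREATE TABLE P (P_CODE TEXT, P_PRICE REAL, V_CODE INTEGER);
-- Hypothetical rows: only vendor 1 is in area code 0181.
INSERT INTO VENDOR VALUES (1, '0181'), (2, '0161');
INSERT INTO PRODUCT VALUES ('P1', 10.0, 1), ('P2', 20.0, 2), ('P3', 30.0, 2);
INSERT INTO P VALUES ('P4', 40.0, 2);
""")

# INSERT with a subquery: copy every row of P into PRODUCT.
conn.execute("INSERT INTO PRODUCT SELECT * FROM P")

# UPDATE with two subqueries: one single value, one list of vendor codes.
conn.execute("""
    UPDATE PRODUCT
    SET P_PRICE = (SELECT AVG(P_PRICE) FROM PRODUCT)
    WHERE V_CODE IN (SELECT V_CODE FROM VENDOR WHERE V_AREACODE = '0181')
""")
price_after_update = conn.execute(
    "SELECT P_PRICE FROM PRODUCT WHERE P_CODE = 'P1'").fetchone()[0]

# DELETE with a subquery: remove products from area code 0181 vendors.
conn.execute("""
    DELETE FROM PRODUCT
    WHERE V_CODE IN (SELECT V_CODE FROM VENDOR WHERE V_AREACODE = '0181')
""")
remaining = [r[0] for r in
             conn.execute("SELECT P_CODE FROM PRODUCT ORDER BY P_CODE")]
print(price_after_update, remaining)
```

After the insert the four prices average 25.0, so that is the value assigned to the single product supplied from area code 0181 before it is deleted.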

Using the examples shown in Table 9.2, note that the subquery is always at the right side of a comparison or assigning expression. Also, a subquery can return one value or multiple values. To be precise, the subquery can return:

One single value (one column and one row). This subquery is used anywhere a single value is expected, as in the right side of a comparison expression (such as in the UPDATE example above when you assign the average price to the product's price). Obviously, when you assign a value to an attribute, that value is a single value, not a list of values. Therefore, the subquery must return only one value (one column, one row). If the query returns multiple values, the DBMS will generate an error.

A list of values (one column and multiple rows). This type of subquery is used anywhere a list of values is expected, such as when using the IN clause (that is, when comparing the vendor code to a list of vendors). Again, in this case, there is only one column of data with multiple value instances. This type of subquery is used frequently in combination with the IN operator in a WHERE conditional expression.

A virtual table (multicolumn, multirow set of values). This type of subquery can be used anywhere a table is expected, such as when using the FROM clause. You will see this type of query later in this chapter.

It is important to note that a subquery can return no values at all; it is a NULL. In such cases, the output of the outer query may result in an error or a null empty set, depending on where the subquery is used (in a comparison, an expression or a table set).

In the following sections, you will learn how to write subqueries within the SELECT statement to retrieve data from the database.

9.3.1 WHERE Subqueries

The most common type of subquery uses an inner SELECT subquery on the right side of a WHERE comparison expression. For example, to find all products with a price greater than or equal to the average product price, you write the following query:

SELECT P_CODE, P_PRICE FROM PRODUCT
WHERE P_PRICE >= (SELECT AVG(P_PRICE) FROM PRODUCT);

The output of the preceding query is shown in Figure 9.13. Note that this type of query, when used in a >, <, =, >= or <= conditional expression, requires a subquery that returns only one single value (one column, one row). The value generated by the subquery must be of a comparable data type; if the attribute to the left of the comparison symbol is a character type, the subquery must return a character string. Also, if the query returns more than a single value, the DBMS will generate an error.

Figure 9.13 WHERE subquery examples

Subqueries can also be used in combination with joins. For example, the following query lists all of the customers who ordered the product claw hammer:

SELECT DISTINCT CUS_CODE, CUS_LNAME, CUS_FNAME
FROM CUSTOMER JOIN INVOICE USING (CUS_CODE)
JOIN LINE USING (INV_NUMBER)
JOIN PRODUCT USING (P_CODE)
WHERE P_CODE = (SELECT P_CODE FROM PRODUCT WHERE P_DESCRIPT = 'Claw hammer');

The result of that query is also shown in Figure 9.13.

In the preceding example, the inner query finds the P_CODE for the product claw hammer. The P_CODE is then used to restrict the selected rows to only those where the P_CODE in the LINE table matches the P_CODE for Claw hammer. Note that the previous query could have been written this way:

SELECT DISTINCT CUS_CODE, CUS_LNAME, CUS_FNAME
FROM CUSTOMER JOIN INVOICE USING (CUS_CODE)
JOIN LINE USING (INV_NUMBER)
JOIN PRODUCT USING (P_CODE)
WHERE P_DESCRIPT = 'Claw hammer';

But what happens if the original query encounters the claw hammer string in more than one product description? You get an error message. To compare one value to a list of values, you must use an IN operand, as shown in the next section.

9.3.2 IN Subqueries

What would you do if you wanted to find all customers who purchased a hammer or any kind of saw or saw blade? Note that the product table has two different types of hammers: claw hammer and sledge hammer. Also note that there are multiple occurrences of products that contain saw in their product descriptions. There are saw blades, jigsaws and so on. In such cases, you need to compare the P_CODE not to one product code (single value), but to a list of product code values. When you want to compare a single attribute to a list of values, you use the IN operator. When the P_CODE values are not known beforehand but they can be derived using a query, you must use an IN subquery. The following example lists all customers who have purchased hammers or saws or saw blades.

SELECT DISTINCT CUS_CODE, CUS_LNAME, CUS_FNAME
FROM CUSTOMER JOIN INVOICE USING (CUS_CODE)
JOIN LINE USING (INV_NUMBER)
JOIN PRODUCT USING (P_CODE)
WHERE P_CODE IN (SELECT P_CODE FROM PRODUCT
WHERE P_DESCRIPT LIKE '%hammer%'
OR P_DESCRIPT LIKE '%saw%');

The result of that query is shown in Figure 9.14.

Figure 9.14 IN subquery examples
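The IN subquery can be tried on a few rows with the sketch below. It assumes SQLite via Python's sqlite3 module, omits CUS_FNAME to keep the tables small, and uses invented customers and invoices.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE CUSTOMER (CUS_CODE INTEGER PRIMARY KEY, CUS_LNAME TEXT);
CREATE TABLE INVOICE (INV_NUMBER INTEGER PRIMARY KEY, CUS_CODE INTEGER);
CREATE TABLE LINE (INV_NUMBER INTEGER, P_CODE TEXT);
CREATE TABLE PRODUCT (P_CODE TEXT PRIMARY KEY, P_DESCRIPT TEXT);
-- Hypothetical sample rows.
INSERT INTO CUSTOMER VALUES (10011, 'Dunne'), (10012, 'Smith'),
                            (10014, 'Orlando');
INSERT INTO INVOICE VALUES (1001, 10014), (1002, 10011), (1003, 10012);
INSERT INTO LINE VALUES (1001, '23109-HB'), (1002, '54778-2T'),
                        (1003, '13-Q2/P2');
INSERT INTO PRODUCT VALUES ('23109-HB', 'Claw hammer'),
                           ('2238/QPD', 'Sledge hammer'),
                           ('54778-2T', 'Rat-tail file'),
                           ('13-Q2/P2', '7.25-in. power saw blade');
""")

# The inner query builds the list of hammer and saw product codes;
# the outer query keeps customers whose invoice lines use one of them.
buyers = conn.execute("""
    SELECT DISTINCT CUS_CODE, CUS_LNAME
    FROM CUSTOMER JOIN INVOICE USING (CUS_CODE)
                  JOIN LINE USING (INV_NUMBER)
                  JOIN PRODUCT USING (P_CODE)
    WHERE P_CODE IN (SELECT P_CODE FROM PRODUCT
                     WHERE P_DESCRIPT LIKE '%hammer%'
                        OR P_DESCRIPT LIKE '%saw%')
    ORDER BY CUS_CODE
""").fetchall()
print(buyers)
```

The inner query here returns three product codes, which is exactly the situation in which a plain equality comparison would not be appropriate.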

The result of that query is shown in Figure 9.16.

Figure 9.16 Multirow subquery operator example

As you examine the query and its output in Figure 9.16, it's important to note the following points:

The query is a typical example of a nested query.

The query has one outer SELECT statement with a SELECT subquery (call it sqA) containing a second SELECT subquery (call it sqB).

The last SELECT subquery (sqB) is executed first and returns a list of all vendors from South Africa.

The first SELECT subquery (sqA) uses the output of the SELECT subquery (sqB). The sqA subquery returns the list of product costs for all products provided by vendors from South Africa.

The use of the ALL operator allows you to compare a single value (P_QOH * P_PRICE) with a list of values returned by the first subquery (sqA), using a comparison operator other than equals.

For a row to appear in the result set, it has to meet the criterion P_QOH * P_PRICE > ALL of the individual values returned by the subquery sqA. The values returned by sqA are a list of product costs. In fact, greater than ALL is equivalent to greater than the highest product cost of the list. In the same way, a condition of less than ALL is equivalent to less than the lowest product cost of the list.

Another powerful operator is the ANY multirow operator (near cousin of the ALL multirow operator). The ANY operator allows you to compare a single value to a list of values and select only the rows for which the inventory cost is greater than any value of the list or less than any value of the list. You could use the equal to ANY operator, which would be the equivalent of the IN operator.
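The equivalences described above (greater than ALL behaves like greater than the highest value, greater than ANY like greater than the lowest) can be checked with a sketch. It assumes SQLite via Python's sqlite3 module, which does not implement the ALL and ANY operators, so the MAX and MIN forms are used instead. The V_COUNTRY column name and all rows are assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE VENDOR (V_CODE INTEGER PRIMARY KEY, V_COUNTRY TEXT);
CREATE TABLE PRODUCT (P_CODE TEXT PRIMARY KEY, P_QOH INTEGER,
                      P_PRICE REAL, V_CODE INTEGER);
-- Hypothetical rows; inventory costs are A=50, B=80, C=90, D=60.
INSERT INTO VENDOR VALUES (1, 'South Africa'), (2, 'UK');
INSERT INTO PRODUCT VALUES ('A', 10, 5.0, 1), ('B', 4, 20.0, 1),
                           ('C', 3, 30.0, 2), ('D', 1, 60.0, 2);
""")

# sqB: vendor codes from South Africa; sqA: costs of their products (50, 80).
sq_a = """SELECT {agg}(P_QOH * P_PRICE) FROM PRODUCT
          WHERE V_CODE IN (SELECT V_CODE FROM VENDOR
                           WHERE V_COUNTRY = 'South Africa')"""

# "> ALL (sqA)" written as "> (highest value of sqA)".
above_all = conn.execute(
    "SELECT P_CODE, P_QOH * P_PRICE FROM PRODUCT "
    "WHERE P_QOH * P_PRICE > (" + sq_a.format(agg="MAX") + ") "
    "ORDER BY P_CODE").fetchall()

# "> ANY (sqA)" written as "> (lowest value of sqA)".
above_any = [r[0] for r in conn.execute(
    "SELECT P_CODE FROM PRODUCT "
    "WHERE P_QOH * P_PRICE > (" + sq_a.format(agg="MIN") + ") "
    "ORDER BY P_CODE")]
print(above_all, above_any)
```

On a DBMS with ALL and ANY support, the conditions can be written directly as `> ALL (subquery)` and `> ANY (subquery)`; the aggregate forms only match those operators when the subquery returns at least one non-null value.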

9.3.5 FroM Subqueries So far,

you

have seen

how the

SELECT

statement

uses subqueries

within

WHERE,

HAVING

and IN

statements and how the ANY and ALL operators are used for multirow subqueries. In all of those cases, the subquery was part of a conditional expression and it always appeared at the right side ofthe expression. In this section, you willlearn how to use subqueries in the FROM clause. As you already

know, the

FROM clause

specifies

the table(s)

from

which the

data are drawn.

Because

the output of a SELECT statement is another table (or more precisely a virtual table), you could use a SELECT subquery in the FROM clause. For example, assume that you want to know all customers who have purchased products 13-Q2/P2 and 23109-HB. All product purchases are stored in the LINE table. It is easy to find out who purchased any given product by searching the P_CODE attribute in the

Copyright Editorial

review

2020 has

Cengage deemed

Learning. that

any

All suppressed

Rights

Reserved. content

does

May not

not materially

be

copied, affect

scanned, the

overall

or

duplicated, learning

in experience.

whole

or in Cengage

part.

Due Learning

to

electronic reserves

rights, the

right

some to

third remove

party additional

content

may content

be

suppressed at

any

time

from if

the

subsequent

eBook rights

and/or restrictions

eChapter(s). require

it

Chapter

LINE table. one.

But in this

You

could

case, you

write the

SELECT

DISTINCT

FROM

CUSTOMER, (SELECT

know

all customers

CUSTOMER.CUS_CODE,

P_CODE

(SELECT

SQL and

both

products,

who purchased

Advanced

SQL

453

not just

5 '13-Q2/P2')

P_CODE

CUSTOMER.CUS_LNAME

FROM

FROM

5 '23109-HB')

CUSTOMER.CUS_CODE

INVOICE

NATURAL

JOIN

LINE

INVOICE

NATURAL

JOIN

LINE

CP1,

INVOICE.CUS_CODE

WHERE

Language

query:

INVOICE.CUS_CODE

WHERE

WHERE

want to

following

9 Procedural

CP2

5

CP1.CUS_CODE

AND

CP1.CUS_CODE

5

CP2.

CUS_CODE;

The result

of that

FIgure

query is

9.17

shown in

Figure 9.17.

FroM subquery example

9

As you examine Figure 9.17, note that the first subquery returns all customers who purchased product 13-Q2/P2, while the second subquery returns all customers who purchased product 23109-HB. So, in this FROM subquery, you are joining the CUSTOMER table with two virtual tables. The join condition selects only the rows with matching CUS_CODE values in each table (base or virtual). In the previous chapter, you learnt that a view is also a virtual table; therefore, you can use a view name anywhere a table is expected. So, you could create two views: one listing all customers who purchased product 13-Q2/P2 and another listing all customers who purchased product 23109-HB. Doing so, you would write the query as:

CREATE VIEW CP1 AS
SELECT INVOICE.CUS_CODE
FROM INVOICE NATURAL JOIN LINE
WHERE P_CODE = '13-Q2/P2';

CREATE VIEW CP2 AS
SELECT INVOICE.CUS_CODE
FROM INVOICE NATURAL JOIN LINE
WHERE P_CODE = '23109-HB';

SELECT DISTINCT CUS_CODE, CUS_LNAME
FROM CUSTOMER NATURAL JOIN CP1 NATURAL JOIN CP2;
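The view-based version can be sketched the same way. This is an illustrative example, assuming Python's sqlite3 module and hypothetical sample rows; the explicit AS CUS_CODE alias in each view is an addition made here so that the view's column name does not depend on a particular DBMS's naming rules.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE CUSTOMER (CUS_CODE INTEGER PRIMARY KEY, CUS_LNAME TEXT);
CREATE TABLE INVOICE  (INV_NUMBER INTEGER PRIMARY KEY, CUS_CODE INTEGER);
CREATE TABLE LINE     (INV_NUMBER INTEGER, LINE_NUMBER INTEGER, P_CODE TEXT);
-- Hypothetical sample rows, not the book's data.
INSERT INTO CUSTOMER VALUES (1, 'Adams'), (2, 'Baker');
INSERT INTO INVOICE  VALUES (101, 1), (102, 1), (103, 2);
INSERT INTO LINE VALUES
  (101, 1, '13-Q2/P2'), (102, 1, '23109-HB'), (103, 1, '13-Q2/P2');

-- Each view is a stored query that behaves like a virtual table.
CREATE VIEW CP1 AS
  SELECT INVOICE.CUS_CODE AS CUS_CODE
  FROM INVOICE NATURAL JOIN LINE WHERE P_CODE = '13-Q2/P2';
CREATE VIEW CP2 AS
  SELECT INVOICE.CUS_CODE AS CUS_CODE
  FROM INVOICE NATURAL JOIN LINE WHERE P_CODE = '23109-HB';
""")

# The views are joined exactly as base tables would be.
via_views = conn.execute("""
SELECT DISTINCT CUS_CODE, CUS_LNAME
FROM CUSTOMER NATURAL JOIN CP1 NATURAL JOIN CP2
""").fetchall()
print(via_views)
```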

You may be tempted to speculate that the above query could also be written using the following syntax:

SELECT CUS_CODE, CUS_LNAME
FROM CUSTOMER NATURAL JOIN INVOICE NATURAL JOIN LINE
WHERE P_CODE = '13-Q2/P2' AND P_CODE = '23109-HB';

However, if you examine that query carefully, you will note that a P_CODE cannot be equal to two different values at the same time. Therefore, the query will not return any rows.

9.3.6 Attribute List Subqueries

The SELECT statement uses the attribute list to indicate which columns to project in the resulting set. Those columns can be attributes of base tables, computed attributes or the result of an aggregate function. The attribute list can also include a subquery expression, also known as an inline subquery. A subquery in the attribute list must return one single value; otherwise, an error code is raised. For example, a simple inline query can be used to list the difference between each product's price and the average product price:

SELECT P_CODE, P_PRICE,
       (SELECT AVG(P_PRICE) FROM PRODUCT) AS AVGPRICE,
       P_PRICE - (SELECT AVG(P_PRICE) FROM PRODUCT) AS DIFFERENCE
FROM PRODUCT;

Figure 9.18 shows the result of that query.

Figure 9.18 Inline subquery examples

In Figure 9.18, note that the inline query output returns one single value (the average product price) and that the value is the same in every row. Note also that the query used the full expression instead of the column aliases when computing the difference. In fact, if you try to use the alias in the difference expression, you will get an error message. The column alias cannot be used in computations in the attribute list when the alias is defined in the same attribute list. That DBMS requirement is due to the way the DBMS parses and executes queries.
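A short experiment makes both points about inline subqueries concrete: the subquery yields one value repeated on every row, and the alias cannot be reused inside the same attribute list. The sketch below assumes Python's sqlite3 module and three hypothetical products; SQLite stands in for the DBMS here, and other products may word the error differently.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE PRODUCT (P_CODE TEXT PRIMARY KEY, P_PRICE REAL);
-- Hypothetical prices chosen so that the average is exactly 20.0.
INSERT INTO PRODUCT VALUES ('A1', 10.0), ('B2', 20.0), ('C3', 30.0);
""")

# The inline subquery is written out in full both times it is needed.
rows = conn.execute("""
SELECT P_CODE, P_PRICE,
       (SELECT AVG(P_PRICE) FROM PRODUCT) AS AVGPRICE,
       P_PRICE - (SELECT AVG(P_PRICE) FROM PRODUCT) AS DIFFERENCE
FROM PRODUCT
ORDER BY P_CODE
""").fetchall()
for row in rows:
    print(row)

# Reusing the AVGPRICE alias in the same attribute list should be rejected.
try:
    conn.execute("""
    SELECT P_CODE,
           (SELECT AVG(P_PRICE) FROM PRODUCT) AS AVGPRICE,
           P_PRICE - AVGPRICE AS DIFFERENCE
    FROM PRODUCT
    """)
    alias_allowed = True
except sqlite3.OperationalError as exc:
    alias_allowed = False
    print("alias rejected:", exc)
```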

Another example will help you understand the use of attribute list subqueries and column aliases. For example, suppose you want to know the product code, the total sales by product and the contribution by employee of each product's sales. To get the sales by product, you need to use only the LINE table. To compute the contribution by employee, you need to know the number of employees (from the EMPLOYEE table). As you study the tables' structures, you can see that the LINE and EMPLOYEE tables do not share a common attribute. In fact, you don't need a common attribute. You need to know only the total number of employees, not the total employees related to each product. So to answer the query, you would write the following code:

SELECT P_CODE, SUM(LINE_UNITS * LINE_PRICE) AS SALES,
       (SELECT COUNT(*) FROM EMPLOYEE) AS ECOUNT,
       ROUND(SUM(LINE_UNITS * LINE_PRICE)/(SELECT COUNT(*) FROM EMPLOYEE),2) AS CONTRIB
FROM LINE
GROUP BY P_CODE;

The result of that query is shown in Figure 9.19. Notice that the CONTRIB column has been rounded to two decimal places using the SQL ROUND function.

Figure 9.19 Another example of an inline subquery

The use of that type of subquery is limited to certain instances where you need to include data from other tables that are not directly related to a main table or tables in the query. The value will remain the same for each row, like a constant in a programming language (although you will learn another use of inline subqueries later in Section 9.3.7, Correlated Subqueries). Note that you cannot use an alias in the attribute list to write the expression that computes the contribution per employee. Another way to write the same query by using column aliases requires the use of a subquery in the FROM clause, as follows:

SELECT P_CODE, SALES, ECOUNT, SALES/ECOUNT AS CONTRIB
FROM (SELECT P_CODE, SUM(LINE_UNITS * LINE_PRICE) AS SALES,
             (SELECT COUNT(*) FROM EMPLOYEE) AS ECOUNT
      FROM LINE
      GROUP BY P_CODE);

In that case, you are actually using two subqueries. The subquery in the FROM clause executes first and returns a virtual table with three columns: P_CODE, SALES and ECOUNT. The FROM subquery contains an inline subquery that returns the number of employees as ECOUNT. Because the outer query receives the output of the inner query, you can now refer to the columns in the outer subquery using the column aliases.
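The two ways of computing the contribution per employee can be compared side by side. This is a minimal sketch, assuming Python's sqlite3 module and hypothetical LINE and EMPLOYEE rows (three invoice lines, four employees); the reduced table layouts are illustrative only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE LINE (P_CODE TEXT, LINE_UNITS REAL, LINE_PRICE REAL);
CREATE TABLE EMPLOYEE (EMP_NUM INTEGER PRIMARY KEY);
-- Hypothetical rows: product A1 sells 30.0 in total, product B2 sells 20.0.
INSERT INTO LINE VALUES ('A1', 2, 10.0), ('A1', 1, 10.0), ('B2', 4, 5.0);
INSERT INTO EMPLOYEE VALUES (1), (2), (3), (4);
""")

# Version 1: inline subqueries, with the full expression repeated for CONTRIB.
inline = conn.execute("""
SELECT P_CODE,
       SUM(LINE_UNITS * LINE_PRICE) AS SALES,
       (SELECT COUNT(*) FROM EMPLOYEE) AS ECOUNT,
       ROUND(SUM(LINE_UNITS * LINE_PRICE) / (SELECT COUNT(*) FROM EMPLOYEE), 2) AS CONTRIB
FROM LINE
GROUP BY P_CODE
ORDER BY P_CODE
""").fetchall()

# Version 2: a FROM subquery builds SALES and ECOUNT first,
# so the outer query can use the aliases directly.
derived = conn.execute("""
SELECT P_CODE, SALES, ECOUNT, SALES / ECOUNT AS CONTRIB
FROM (SELECT P_CODE,
             SUM(LINE_UNITS * LINE_PRICE) AS SALES,
             (SELECT COUNT(*) FROM EMPLOYEE) AS ECOUNT
      FROM LINE
      GROUP BY P_CODE)
ORDER BY P_CODE
""").fetchall()

print(inline)
print(derived)
```

With these sample values the division is exact, so both versions return the same rows; with real data the unrounded second version may carry more decimal places.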

9.3.7 Correlated Subqueries

Until now, all subqueries you have learnt about execute independently. That is, each subquery in a command sequence executes in a serial fashion, one after another. The inner subquery executes first; its output is used by the outer query, which then executes until the last outer query executes (the first SQL statement in the code).

In contrast, a correlated subquery is a subquery that executes once for each row in the outer query. That process is similar to the typical nested loop in a programming language. For example:

FOR X = 1 TO 2
  FOR Y = 1 TO 3
    PRINT 'X = 'X, 'Y = 'Y
  END
END

will yield the output

X = 1 Y = 1
X = 1 Y = 2
X = 1 Y = 3
X = 2 Y = 1
X = 2 Y = 2
X = 2 Y = 3
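For readers who want to run the loop, here is the same nested loop as a small Python sketch. Python is a choice made for this illustration; the bounds simply mirror the pseudocode.

```python
lines = []
for x in range(1, 3):          # outer loop: X = 1 TO 2
    for y in range(1, 4):      # inner loop: Y = 1 TO 3, completed for each X
        lines.append(f"X = {x} Y = {y}")

print("\n".join(lines))
```

The inner loop runs to completion once per outer value, giving 2 x 3 = 6 output lines.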

Note that the outer loop X = 1 TO 2 begins the process by setting X = 1; then the inner loop Y = 1 TO 3 is completed for each X outer loop value. The relational DBMS uses the same sequence to produce correlated subquery results:

1 It initiates the outer query.
2 For each row of the outer query result set, it executes the inner query by passing the outer row to the inner query.

That process is the opposite of the subqueries you have seen so far. The query is called a correlated subquery because the inner query is related to the outer query: the inner query references a column of the outer query.

To see the correlated subquery in action, suppose you want to know all product sales in which the units sold value is greater than the average units sold value for that product (as opposed to the average for all products). In that case, complete the following procedure:

1 Compute the average-units-sold value for a product.
2 Compare the average computed in Step 1 to the units sold in each sale row; then select only the rows in which the number of units sold is greater.

The following correlated query completes the preceding two-step process:

SELECT INV_NUMBER, P_CODE, LINE_UNITS
FROM LINE LS
WHERE LS.LINE_UNITS > (SELECT AVG(LINE_UNITS)
                       FROM LINE LA
                       WHERE LA.P_CODE = LS.P_CODE);

The first example in Figure 9.20 shows the result of that query.

Figure 9.20 Correlated subquery examples

As you examine the top query and its result in Figure 9.20, note that the LINE table is used more than once; so, you need to use table aliases. In that case, the inner query computes the average units sold of the product that matches the P_CODE of the outer query. That is, the inner query runs once using the first product code found in the (outer) LINE table and returns the average sale for that product. When the number of units sold in that (outer) LINE row is greater than the average computed, the row is added to the output. Then the inner query runs again, this time using the second product code found in the (outer) LINE table. The process repeats until the inner query has run for all rows in the (outer) LINE table. In that case, the inner query is repeated as many times as there are rows in the outer query.
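The once-per-outer-row behaviour can be checked against a hand-written loop. The sketch below assumes Python's sqlite3 module and seven hypothetical LINE rows; the Python loop is only a conceptual emulation of the process described above, not a claim about how any DBMS actually executes the query.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE LINE (INV_NUMBER INTEGER, P_CODE TEXT, LINE_UNITS INTEGER);
-- Hypothetical rows: per-product averages are A1 = 2, B2 = 5, C3 = 6.
INSERT INTO LINE VALUES
  (1001, 'A1', 1), (1002, 'A1', 3), (1003, 'A1', 2),
  (1004, 'B2', 5), (1005, 'B2', 5),
  (1006, 'C3', 10), (1007, 'C3', 2);
""")

# Correlated subquery: the inner query references LS.P_CODE from the outer row.
correlated = conn.execute("""
SELECT INV_NUMBER, P_CODE, LINE_UNITS
FROM LINE LS
WHERE LS.LINE_UNITS > (SELECT AVG(LINE_UNITS)
                       FROM LINE LA
                       WHERE LA.P_CODE = LS.P_CODE)
ORDER BY INV_NUMBER
""").fetchall()

# Conceptual emulation: one inner pass per outer row, like the nested loop.
rows = conn.execute(
    "SELECT INV_NUMBER, P_CODE, LINE_UNITS FROM LINE ORDER BY INV_NUMBER"
).fetchall()
emulated = []
for inv, code, units in rows:                       # outer query row
    same = [u for _, c, u in rows if c == code]     # inner query for this P_CODE
    if units > sum(same) / len(same):               # compare with that average
        emulated.append((inv, code, units))

print(correlated)
print(emulated)
```

Note that the B2 rows (5 units each) are excluded even though they exceed the overall average of 4, because each row is compared with the average for its own product.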

To verify the results and to provide an example of how you can combine subqueries, you can add a correlated inline subquery to the previous query. That correlated inline subquery shows the average units sold column for each product. (See the second query and its results in Figure 9.20.) As you can see, the new query contains a correlated inline subquery that computes the average units sold for each product. You not only get an answer, but also can verify that the answer is correct.

You can also use correlated subqueries with the EXISTS special operator. For example, suppose you want to know all the customers who have placed an order lately. In that case, you could use a correlated subquery like the first one shown in Figure 9.21:

SELECT CUS_CODE, CUS_LNAME, CUS_FNAME
FROM CUSTOMER
WHERE EXISTS (SELECT CUS_CODE FROM INVOICE
              WHERE INVOICE.CUS_CODE = CUSTOMER.CUS_CODE);

Figure 9.21 EXISTS correlated subquery examples
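The EXISTS form of the customer query can be exercised in the same way. This is a minimal sketch, assuming Python's sqlite3 module and hypothetical customers and invoices in which one customer has no invoice at all.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE CUSTOMER (CUS_CODE INTEGER PRIMARY KEY, CUS_LNAME TEXT, CUS_FNAME TEXT);
CREATE TABLE INVOICE  (INV_NUMBER INTEGER PRIMARY KEY, CUS_CODE INTEGER);
-- Hypothetical rows: customer 2 has never placed an order.
INSERT INTO CUSTOMER VALUES (1, 'Adams', 'Ann'), (2, 'Baker', 'Bob'), (3, 'Clark', 'Cy');
INSERT INTO INVOICE  VALUES (101, 1), (102, 1), (103, 3);
""")

# EXISTS is true for a customer as soon as the inner query finds one matching invoice.
with_orders = conn.execute("""
SELECT CUS_CODE, CUS_LNAME, CUS_FNAME
FROM CUSTOMER
WHERE EXISTS (SELECT CUS_CODE FROM INVOICE
              WHERE INVOICE.CUS_CODE = CUSTOMER.CUS_CODE)
ORDER BY CUS_CODE
""").fetchall()
print(with_orders)
```

Customer 1 has two invoices but is listed once, because EXISTS only tests whether the inner query returns any row.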

The second example of an EXISTS correlated subquery in Figure 9.21 will help you understand how to use correlated queries. For example, suppose you want to know which vendors you must contact to start ordering products that are approaching the minimum quantity-on-hand value. In particular, you want to know the vendor code and name of vendors for products having a quantity on hand that is less than double the minimum quantity. The query that answers that question is as follows:

SELECT V_CODE, V_NAME
FROM VENDOR
WHERE EXISTS (SELECT * FROM PRODUCT
              WHERE P_QOH < P_MIN * 2
              AND VENDOR.V_CODE = PRODUCT.V_CODE);

As you examine the second query in Figure 9.21, note that:

1 The inner correlated subquery runs using the first vendor.
2 If any products match the condition (quantity on hand is less than double the minimum quantity), the vendor code and name are listed in the output.
3 The correlated subquery runs using the second vendor, and the process repeats itself until all vendors are used.

9.4 SQL Functions

The data in databases are the basis of critical business information. Generating information from data often requires many data manipulations. Sometimes, such data manipulation involves the decomposition of data elements. For example, an employee's date of birth can be subdivided into a day, a month and a year. A product manufacturing code (for example, SE-05-2-09-1234-1-3/12/04-19:26:48) can be designed to record the manufacturing region, plant, shift, production line, employee number, date and time. For years, conventional programming languages have had special functions that enabled programmers to perform data transformations like those data decompositions. If you know a modern programming language, it's very likely that the SQL functions in this section will look familiar.

SQL functions are useful tools. You'll need to use functions when you want to list all employees ordered by year of birth or when your marketing department wants you to generate a list of all customers ordered by postal code and the first four digits of their telephone numbers. In both of those cases, you'll need to use data elements that are not present as such in the database; instead, you'll need a SQL function that can be derived from an existing attribute. Functions always use a numerical, date or string value. The value may be part of the command itself (a constant or literal) or it may be an attribute located in a table. Therefore, a function may appear anywhere in a SQL statement where a value or an attribute can be used.

There are many types of SQL functions, such as arithmetic, trigonometric, string, date and time functions. This section will not explain all of those types of functions in detail, but it will give you a brief overview of the most useful ones.

Note
Although the main DBMS vendors support the SQL functions covered here, the syntax or degree of support may differ. In fact, DBMS vendors invariably add their own functions to products to lure new customers. The functions covered in this section represent just a small portion of functions supported by your DBMS. Read your DBMS SQL reference manual for a complete list of available functions.

9.4.1 Date and Time Functions

All SQL-standard DBMSs support date and time functions. All date functions take one parameter (of a date or character data type) and return a value (character, numeric or date type). Unfortunately, date/time data types are implemented differently by different DBMS vendors. The problem occurs because the ANSI SQL standard defines date data types, but does not say how those data types are to be stored; instead, it lets the vendor deal with that issue.

Because date/time functions differ from vendor to vendor, this section will cover basic date/time functions for Microsoft Access/SQL Server and for Oracle. Table 9.3 shows a list of selected Microsoft Access/SQL Server date/time functions.

Table 9.3 Selected Microsoft Access/SQL Server date/time functions

Function: YEAR
Returns a four-digit year.
Syntax: YEAR(date_value)
Example: Lists all employees born in 1966:
SELECT EMP_LNAME, EMP_FNAME, EMP_DOB, YEAR(EMP_DOB) AS YEAR
FROM EMPLOYEE
WHERE YEAR(EMP_DOB) = 1966;

Function: MONTH
Returns a two-digit month code.
Syntax: MONTH(date_value)
Example: Lists all employees born in November:
SELECT EMP_LNAME, EMP_FNAME, EMP_DOB, MONTH(EMP_DOB) AS MONTH
FROM EMPLOYEE
WHERE MONTH(EMP_DOB) = 11;

Function: DAY
Returns the number of the day.
Syntax: DAY(date_value)
Example: Lists all employees born on the 14th day of the month:
SELECT EMP_LNAME, EMP_FNAME, EMP_DOB, DAY(EMP_DOB) AS DAY
FROM EMPLOYEE
WHERE DAY(EMP_DOB) = 14;

Function: DATE() (Microsoft Access), GETDATE() (SQL Server)
Returns today's date.
Example: Lists how many days are left until Christmas:
SELECT #25-Dec-2019# - DATE();
Note two features:
- There is no FROM clause, which is acceptable in Microsoft Access.
- The Christmas date is enclosed in number signs (#) because you are doing date arithmetic.
In SQL Server: Use GETDATE() to get the current system date. To compute the difference between dates, use the DATEDIFF function (see below).

Function: DATEADD (SQL Server)
Adds a number of selected time periods to a date.
Syntax: DATEADD(datepart, number, date)
Example: Adds a number of dateparts to a given date. Dateparts can be minutes, hours, days, weeks, months, quarters or years. For example:
SELECT DATEADD(day, 90, P_INDATE) AS DueDate
FROM PRODUCT;
The preceding example adds 90 days to P_INDATE.
In Microsoft Access, use the following:
SELECT P_INDATE+90 AS DueDate
FROM PRODUCT;

Function: DATEDIFF (SQL Server)
Subtracts two dates.
Syntax: DATEDIFF(datepart, startdate, enddate)
Example: Returns the difference between two dates expressed in a selected datepart. For example:
SELECT DATEDIFF(day, P_INDATE, GETDATE()) AS DaysAgo
FROM PRODUCT;
In Microsoft Access, use the following:
SELECT DATE() - P_INDATE AS DaysAgo
FROM PRODUCT;

Table 9.4 shows the equivalent date/time functions used in Oracle. Note that Oracle uses the same function (TO_CHAR) to extract the different parts of a date. Also, another function (TO_DATE) is used to convert character strings to a valid Oracle date format that can be used in date arithmetic. Finally, Table 9.5 shows selected date/time functions for MySQL. It is worth noting that, due to Oracle's acquisition of MySQL, it is likely that there will be an overlap of a number of MySQL and Oracle functions in the future. The functions used in Table 9.5 are for version 5.6. It is therefore advisable that you refer to the appropriate DBMS SQL reference manual for a current list of up-to-date functions.

Table 9.4 Selected Oracle date/time functions

Function: TO_CHAR
Returns a character string or a formatted string from a date value.
Syntax: TO_CHAR(date_value, fmt)
fmt = format used; can be:
MONTH: name of month
MON: three-letter month name
MM: two-digit month name
D: number for day of week
DD: number of day of month
DAY: name of day of week
YYYY: four-digit year value
YY: two-digit year value
Examples:
Lists all employees born in 1992:
SELECT EMP_LNAME, EMP_FNAME, EMP_DOB, TO_CHAR(EMP_DOB,'YYYY') AS YEAR
FROM EMPLOYEE
WHERE TO_CHAR(EMP_DOB,'YYYY') = '1992';
Lists all employees born in November:
SELECT EMP_LNAME, EMP_FNAME, EMP_DOB, TO_CHAR(EMP_DOB,'MM') AS MONTH
FROM EMPLOYEE
WHERE TO_CHAR(EMP_DOB,'MM') = '11';
Lists all employees born on the 14th day of the month:
SELECT EMP_LNAME, EMP_FNAME, EMP_DOB, TO_CHAR(EMP_DOB,'DD') AS DAY
FROM EMPLOYEE
WHERE TO_CHAR(EMP_DOB,'DD') = '14';

Function: TO_DATE
Returns a date value using a character string and a date format mask; also used to translate a date between formats.
Syntax: TO_DATE(char_value, fmt)
fmt = format used; can be:
MONTH: name of month
MON: three-letter month name
MM: two-digit month name
D: number for day of week
DD: number of day of month
DAY: name of day of week
YYYY: four-digit year value
YY: two-digit year value
Examples:
Lists the approximate age of the employees on the company's tenth anniversary date (11/25/2018):
SELECT EMP_LNAME, EMP_FNAME, EMP_DOB, '11/25/2018' AS ANIV_DATE,
       (TO_DATE('11/25/2018','MM/DD/YYYY') - EMP_DOB)/365 AS YEARS
FROM EMPLOYEE
ORDER BY YEARS;
Note the following:
- '11/25/2018' is a text string, not a date.
- The TO_DATE function translates the text string to a valid Oracle date used in date arithmetic.
How many days between Thanksgiving and Christmas 2018?
SELECT TO_DATE('2018/12/25','YYYY/MM/DD') -
       TO_DATE('NOVEMBER 22, 2018','MONTH DD, YYYY')
FROM DUAL;
Note the following:
- The TO_DATE function translates the text string to a valid Oracle date used in date arithmetic.
- DUAL is Oracle's pseudo table used only for cases where a table is not really needed.

Function: SYSDATE
Returns today's date.
Example: Lists how many days are left until Christmas:
SELECT TO_DATE('25-Dec-2018','DD-MON-YYYY') - SYSDATE
FROM DUAL;
Notice two things:
- DUAL is Oracle's pseudo table used only for cases where a table is not really needed.
- The Christmas date is enclosed in a TO_DATE function to translate the date to a valid date format.

Function: ADD_MONTHS
Adds a number of months to a date; useful for adding months or years to a date.
Syntax: ADD_MONTHS(date_value, n)
n = number of months
Example: Lists all products with their expiration date (two years from the purchase date):
SELECT P_CODE, P_INDATE, ADD_MONTHS(P_INDATE,24)
FROM PRODUCT
ORDER BY ADD_MONTHS(P_INDATE,24);

Function: LAST_DAY
Returns the date of the last day of the month given in a date.
Syntax: LAST_DAY(date_value)
Example: Lists all employees who were hired within the last seven days of a month:
SELECT EMP_LNAME, EMP_FNAME, EMP_HIRE_DATE
FROM EMPLOYEE
WHERE EMP_HIRE_DATE >= LAST_DAY(EMP_HIRE_DATE)-7;

Table 9.5 Selected MySQL date/time functions

Function: DATE_FORMAT
Returns a character string or a formatted string from a date value.
Syntax: DATE_FORMAT(date_value, fmt)
fmt = format used; can be:
%M: name of month
%m: two-digit month number
%b: abbreviated month name
%d: number of day of month
%W: weekday name
%a: abbreviated weekday name
%Y: four-digit year
%y: two-digit year
Example: Displays the product code and date the product was last received into stock for all products:
SELECT P_CODE, DATE_FORMAT(P_INDATE, '%m/%d/%y')
FROM PRODUCT;
SELECT P_CODE, DATE_FORMAT(P_INDATE, '%M %d, %Y')
FROM PRODUCT;

Function: YEAR
Returns a four-digit year.
Syntax: YEAR(date_value)
Example: Lists all employees born in 1982:
SELECT EMP_LNAME, EMP_FNAME, EMP_DOB, YEAR(EMP_DOB) AS YEAR
FROM EMPLOYEE
WHERE YEAR(EMP_DOB) = 1982;

Function: MONTH
Returns a two-digit month code.
Syntax: MONTH(date_value)
Example: Lists all employees born in November:
SELECT EMP_LNAME, EMP_FNAME, EMP_DOB, MONTH(EMP_DOB) AS MONTH
FROM EMPLOYEE
WHERE MONTH(EMP_DOB) = 11;

Function: DAY
Returns the number of the day.
Syntax: DAY(date_value)
Example: Lists all employees born on the 14th day of the month:
SELECT EMP_LNAME, EMP_FNAME, EMP_DOB, DAY(EMP_DOB) AS DAY
FROM EMPLOYEE
WHERE DAY(EMP_DOB) = 14;

Function: ADDDATE
Adds a number of days to a date.
Syntax: ADDDATE(date_value, n)
n = number of days
Example: Lists all products with the date they will have been on the shelf for 30 days:
SELECT P_CODE, P_INDATE, ADDDATE(P_INDATE, 30)
FROM PRODUCT
ORDER BY ADDDATE(P_INDATE, 30);

Function: DATE_ADD
Adds a number of days, months or years to a date. This is similar to ADDDATE except it is more robust. It allows the user to specify the date unit to add.
Syntax: DATE_ADD(date, INTERVAL n unit)
n = number to add
unit = date unit; can be:
DAY: add n days
WEEK: add n weeks
MONTH: add n months
YEAR: add n years
Example: Lists all products with their expiration date (two years from the purchase date):
SELECT P_CODE, P_INDATE, DATE_ADD(P_INDATE, INTERVAL 2 YEAR)
FROM PRODUCT
ORDER BY DATE_ADD(P_INDATE, INTERVAL 2 YEAR);

Function: LAST_DAY
Returns the date of the last day of the month given in a date.
Syntax: LAST_DAY(date_value)
Example: Lists all employees who were hired within the last seven days of a month:
SELECT EMP_LNAME, EMP_FNAME, EMP_HIRE_DATE
FROM EMPLOYEE
WHERE EMP_HIRE_DATE >= DATE_ADD(LAST_DAY(EMP_HIRE_DATE), INTERVAL -7 DAY);
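Since date functions differ so much between vendors, it may help to see the same ideas in one more dialect. The sketch below assumes Python's sqlite3 module; SQLite has none of the functions listed in Tables 9.3 to 9.5, so its own strftime, date and julianday functions stand in for them here. The employees and dates are hypothetical, and dates are stored as ISO-format text, which is an assumption of this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE EMPLOYEE (EMP_LNAME TEXT, EMP_DOB TEXT, EMP_HIRE_DATE TEXT);
-- Hypothetical rows; dates are ISO-8601 text (YYYY-MM-DD).
INSERT INTO EMPLOYEE VALUES
  ('Adams', '1982-11-14', '2015-03-27'),
  ('Baker', '1990-05-02', '2016-03-10');
""")

# Decomposition: year, month and day of birth (strftime returns text).
parts = conn.execute("""
SELECT EMP_LNAME,
       strftime('%Y', EMP_DOB), strftime('%m', EMP_DOB), strftime('%d', EMP_DOB)
FROM EMPLOYEE
WHERE strftime('%Y', EMP_DOB) = '1982'
""").fetchall()

# Date arithmetic: days between two dates, and the date 90 days after a date.
days_between, due_date = conn.execute("""
SELECT julianday('2018-12-25') - julianday('2018-12-01'),
       date('2018-01-15', '+90 days')
""").fetchone()

# Hired within the last seven days of a month: compare the hire date with
# (last day of that month) minus seven days.
late_hires = conn.execute("""
SELECT EMP_LNAME
FROM EMPLOYEE
WHERE EMP_HIRE_DATE >= date(EMP_HIRE_DATE,
                            'start of month', '+1 month', '-1 day', '-7 days')
""").fetchall()

print(parts, days_between, due_date, late_hires)
```

The structure of each query matches the table entries (extract a part, add an interval, find the month end); only the function names change from one DBMS to another.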

9.4.2 Numeric Functions

Numeric functions can be grouped in many different ways, such as algebraic, trigonometric and logarithmic. In this section, you will learn two very useful functions. Do not confuse the SQL aggregate functions you saw in the previous chapter with the numeric functions in this section. The first group operates over a set of values (multiple rows; hence, the name aggregate functions), while the numeric functions covered here operate over a single row. Numeric functions take one numeric parameter and return one value. Table 9.6 shows a selected group of numeric functions available in an Oracle DBMS.

Table 9.6 Selected Oracle numeric functions

Function: ABS
Returns the absolute value of a number.
Syntax: ABS(numeric_value)
Example: Lists absolute values:
SELECT 1.95, -1.93, ABS(1.95), ABS(-1.93)
FROM DUAL;

Function: ROUND
Rounds a value to a specified precision (number of digits).
Syntax: ROUND(numeric_value, p)
p = precision
Example: Lists the product prices rounded to one and zero decimal places:
SELECT P_CODE, P_PRICE,
       ROUND(P_PRICE,1) AS PRICE1,
       ROUND(P_PRICE,0) AS PRICE0
FROM PRODUCT;

Function: TRUNC
Truncates a value to a specified precision (number of digits).
Syntax: TRUNC(numeric_value, p)
p = precision
Example: Lists the product price rounded to one and zero decimal places and truncated:
SELECT P_CODE, P_PRICE,
       ROUND(P_PRICE,1) AS PRICE1,
       ROUND(P_PRICE,0) AS PRICE0,
       TRUNC(P_PRICE,0) AS PRICEX
FROM PRODUCT;

Function: CEIL/FLOOR
Returns the smallest integer greater than or equal to a number, or returns the largest integer equal to or less than a number, respectively.
Syntax: CEIL(numeric_value)
        FLOOR(numeric_value)
Example: Lists the product price, the smallest integer greater than or equal to the product price, and the largest integer equal to or less than the product price:
SELECT P_PRICE, CEIL(P_PRICE), FLOOR(P_PRICE)
FROM PRODUCT;
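The difference between rounding, truncating, ceiling and floor is easiest to see on concrete numbers. The following sketch assumes Python's sqlite3 module for ABS and ROUND; because TRUNC, CEIL and FLOOR are not guaranteed to be available in every SQLite build, Python's math module is used as a stand-in for those three. The price values are hypothetical.

```python
import math
import sqlite3

conn = sqlite3.connect(":memory:")

# ABS and ROUND evaluated by the database engine on literal values.
abs_pos, abs_neg = conn.execute("SELECT ABS(1.95), ABS(-1.93)").fetchone()
price1, price0 = conn.execute(
    "SELECT ROUND(109.99, 1), ROUND(109.99, 0)"
).fetchone()

# TRUNC, CEIL and FLOOR illustrated with Python's math module instead.
price = 109.99                      # hypothetical product price
truncated = math.trunc(price)       # drops the fractional part
ceiling = math.ceil(price)          # smallest integer >= price
floor = math.floor(price)           # largest integer <= price

# For negative numbers, truncation and floor differ.
neg_truncated = math.trunc(-1.93)
neg_floor = math.floor(-1.93)

print(abs_pos, abs_neg, price1, price0)
print(truncated, ceiling, floor, neg_truncated, neg_floor)
```

Rounding 109.99 to zero decimal places moves it up to 110, whereas truncating simply discards the .99.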

9.4.3 String Functions

String manipulations are among the most-used functions in programming. If you have ever created a report using any programming language, you know the importance of properly concatenating strings of characters, printing names in uppercase, or knowing the length of a given attribute. Table 9.7 shows a subset of useful string manipulation functions.

Table 9.7 Selected string functions

Function: Concatenation
|| (Oracle), + (Microsoft Access and SQL Server)
Concatenates data from two different character columns and returns a single column.
Syntax: strg_value || strg_value
        strg_value + strg_value
Example: Lists all employee names (concatenated).
In Oracle, use the following:
SELECT EMP_LNAME || ', ' || EMP_FNAME AS NAME
FROM EMPLOYEE;
In Microsoft Access and SQL Server, use the following:
SELECT EMP_LNAME + ', ' + EMP_FNAME AS NAME
FROM EMPLOYEE;

Function: UPPER/LOWER
Returns a string in all capital or all lowercase letters.
Syntax: UPPER(strg_value)
        LOWER(strg_value)
Examples:
Lists all employee names in all capital letters (concatenated).
In Oracle, use the following:
SELECT UPPER(EMP_LNAME) || ', ' || UPPER(EMP_FNAME) AS NAME
FROM EMPLOYEE;
In SQL Server, use the following:
SELECT UPPER(EMP_LNAME) + ', ' + UPPER(EMP_FNAME) AS NAME
FROM EMPLOYEE;
Lists all employee names in all lowercase letters (concatenated).
In Oracle, use the following:
SELECT LOWER(EMP_LNAME) || ', ' || LOWER(EMP_FNAME) AS NAME
FROM EMPLOYEE;
In SQL Server, use the following:
SELECT LOWER(EMP_LNAME) + ', ' + LOWER(EMP_FNAME) AS NAME
FROM EMPLOYEE;
Not supported by Microsoft Access.

466

part

III

Database

Programming

Function: SUBSTRING
Returns a substring or part of a given string parameter.
Syntax:
SUBSTR(strg_value, p, l) (Oracle)
SUBSTRING(strg_value,p,l) (SQL Server)
p = start position
l = length of characters
Example(s): Lists the first three characters of all employee phone numbers.
In Oracle, use the following:
SELECT EMP_PHONE, SUBSTR(EMP_PHONE,1,3) AS PREFIX
FROM EMPLOYEE;
In SQL Server, use the following:
SELECT EMP_PHONE, SUBSTRING(EMP_PHONE,1,3) AS PREFIX
FROM EMPLOYEE;
Not supported by Microsoft Access.

Function: LENGTH
Returns the number of characters in a string value.
Syntax:
LENGTH(strg_value) (Oracle)
LEN(strg_value) (SQL Server)
Example(s): Lists all employee last names and the length of their names in descending order by last name length.
In Oracle, use the following:
SELECT EMP_LNAME, LENGTH(EMP_LNAME) AS NAMESIZE
FROM EMPLOYEE;
In Microsoft Access and SQL Server, use the following:
SELECT EMP_LNAME, LEN(EMP_LNAME) AS NAMESIZE
FROM EMPLOYEE;
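The functions in Table 9.7 can also be combined in one query. The following is a minimal Oracle sketch, assuming the EMPLOYEE table and the EMP_LNAME, EMP_FNAME and EMP_PHONE columns used in the table's examples. The ORDER BY clause is an addition (it does not appear in the table) that sorts the output by last name length in descending order.

```sql
-- Sketch only: assumes the EMPLOYEE table from Table 9.7.
-- Combines concatenation, UPPER, SUBSTR and LENGTH in a single query.
SELECT UPPER(EMP_LNAME) || ', ' || EMP_FNAME AS NAME,      -- e.g. SMITH, John
       SUBSTR(EMP_PHONE, 1, 3)               AS PREFIX,    -- first three characters
       LENGTH(EMP_LNAME)                     AS NAMESIZE   -- characters in the last name
FROM   EMPLOYEE
ORDER  BY LENGTH(EMP_LNAME) DESC, EMP_LNAME;               -- longest last names first
```

In SQL Server, the same idea would use + for concatenation, SUBSTRING in place of SUBSTR and LEN in place of LENGTH, as listed in Table 9.7.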

9.4.4 Conversion Functions

Conversion functions allow you to take a value of a given data type and convert it to the equivalent value in another data type. In Section 9.4.1, you learnt about two of the basic Oracle SQL conversion functions: TO_CHAR and TO_DATE. Note that the TO_CHAR function takes a date value and returns a character string representing a day, a month or a year. In the same way, the TO_DATE function takes a character string representing a date and returns an actual date in Oracle format. In this section, you will see how to use the TO_CHAR function to convert numbers to a formatted character string and how to use the TO_NUMBER function to convert text strings to numeric values. A summary of the selected conversion functions is shown in Table 9.8.

TABLE 9.8 Selected conversion functions

Function: Numeric to Character
TO_CHAR (Oracle)
CAST, CONVERT (SQL Server)
Returns a character string from a numeric value.
Syntax:
Oracle: TO_CHAR(numeric_value, fmt)
Example(s): Lists all product prices, quantity on hand, per cent discount, and total inventory cost using formatted values.
In Oracle, use the following:
SELECT P_CODE,
TO_CHAR(P_PRICE,'999.99') AS PRICE,
TO_CHAR(P_QOH,'9,999.99') AS QUANTITY,
TO_CHAR(P_DISCOUNT,'0.99') AS DISC,
TO_CHAR(P_PRICE*P_QOH,'99,999.99') AS TOTAL_COST
FROM PRODUCT;

SQL Server:
CAST (numeric AS varchar(length))
CONVERT(varchar(length), numeric)
In SQL Server, use the following:
SELECT P_CODE,
CAST(P_PRICE AS VARCHAR(8)) AS PRICE,
CONVERT(VARCHAR(4),P_QOH) AS QUANTITY,
CAST(P_DISCOUNT AS VARCHAR(4)) AS DISC,
CAST(P_PRICE*P_QOH AS VARCHAR(10)) AS TOTAL_COST
FROM PRODUCT;
Not supported in Microsoft Access.

Function: Date to Character
TO_CHAR (Oracle)
CAST, CONVERT (SQL Server)
Returns a character string or a formatted string from a date value.
Syntax:
Oracle: TO_CHAR(date_value, fmt)
SQL Server:
CAST (date AS varchar(length))
CONVERT(varchar(length), date)
Example(s): Lists all employee dates of birth, using different date formats.
In Oracle, use the following:
SELECT EMP_LNAME, EMP_DOB,
TO_CHAR(EMP_DOB, 'DAY, MONTH DD, YYYY') AS DATEOFBIRTH
FROM EMPLOYEE;
SELECT EMP_LNAME, EMP_DOB,
TO_CHAR(EMP_DOB, 'YYYY/MM/DD') AS DATEOFBIRTH
FROM EMPLOYEE;
In SQL Server, use the following:
SELECT EMP_LNAME, EMP_DOB,
CONVERT(varchar(11), EMP_DOB) AS "DATE OF BIRTH"
FROM EMPLOYEE;
SELECT EMP_LNAME, EMP_DOB,
CAST(EMP_DOB as varchar(11)) AS "DATE OF BIRTH"
FROM EMPLOYEE;
Not supported in Microsoft Access.

Function: String to Number

TO_NUMBER (Oracle)
CAST (SQL Server)
Returns a formatted number from a character string, using a given format.
Syntax:
Oracle: TO_NUMBER(char_value, fmt)
fmt = format used; can be:
9 = displays a digit
0 = displays a leading zero
, = displays the comma
. = displays the decimal point
$ = displays the dollar sign
B = leading blank
S = leading sign
MI = trailing minus sign
Example(s): Converts text strings to numeric values when importing data to a table from another source in text format; for example, the query shown below uses the TO_NUMBER function to convert text formatted to Oracle default numeric values using the format masks given.
In Oracle, use the following:
SELECT TO_NUMBER('-123.99', 'S999.99'),
TO_NUMBER('99.78-','B999.99MI')
FROM DUAL;
In SQL Server, use the following:
SELECT CAST('-123.99' AS NUMERIC(8,2)),
CAST(-99.78 AS NUMERIC(8,2))
The SQL Server CAST function does not support the trailing sign on the character string.
Not supported in Microsoft Access.

Function: CASE, DECODE
DECODE (Oracle)
CASE (SQL Server)
Compares an attribute or expression with a series of values and returns an associated value or a default value if no match is found.
Syntax:
Oracle: DECODE(e, x, y, d)
e = attribute or expression
x = value with which to compare e
y = value to return if e = x
d = default value to return if e is not equal to x
SQL Server: CASE WHEN condition THEN value1 ELSE value2 END
Example(s): The following example returns the sales tax rate for specified countries:
Compares V_COUNTRY to 'SA'; if the values match, it returns .08.
Compares V_COUNTRY to 'FR'; if the values match, it returns .05.
Compares V_COUNTRY to 'UK'; if the values match, it returns .085.
If there is no match, it returns 0.00 (the default value).
In Oracle, use the following:
SELECT V_CODE, V_COUNTRY,
DECODE(V_COUNTRY,'SA',.08,'FR',.05,'UK',.085, 0.00) AS TAX
FROM VENDOR;
In SQL Server, use the following:
SELECT V_CODE, V_COUNTRY,
CASE
WHEN V_COUNTRY = 'SA' THEN .08
WHEN V_COUNTRY = 'FR' THEN .05
WHEN V_COUNTRY = 'UK' THEN .085
ELSE 0.00
END AS TAX
FROM VENDOR;
Not supported in Microsoft Access.

9.5 Oracle Sequences

If you use Microsoft Access, you might be familiar with the Autonumber data type, which you can use to define a column in your table that will be automatically populated with unique numeric values. In fact, if you create a table in Microsoft Access and forget to define a primary key, Microsoft Access offers to create a primary key column; if you accept, you will notice that Microsoft Access creates a column named ID with an Autonumber data type. After you define a column as an Autonumber type, every time you insert a row in the table, Microsoft Access automatically adds a value to that column, starting with 1 and increasing by 1 in every new row you add. Also, you cannot include that column in your INSERT statements; Microsoft Access will not let you edit that value at all.

Similarly, in Oracle, you can use a sequence to assign values to a column on a table. But an Oracle sequence is very different from the Microsoft Access Autonumber data type and deserves close scrutiny:

• Oracle sequences are an independent object in the database. (Sequences are not a data type.)
• Oracle sequences have a name and can be used anywhere that a value is expected.
• Oracle sequences are not tied to a table or a column.
• Oracle sequences generate a numeric value that can be assigned to any column in any table.
• The table attribute to which you assign a value based on a sequence can be edited and modified.
• An Oracle sequence can be created and deleted any time.

The basic syntax to create a sequence in Oracle is:

CREATE SEQUENCE name [START WITH n] [INCREMENT BY n] [CACHE | NOCACHE]

where:

• name is the name of the sequence.
• n is an integer value that can be positive or negative.
• START WITH specifies the initial sequence value. (The default value is 1.)
• INCREMENT BY determines the value by which the sequence is incremented. (The default increment value is 1. The sequence increment can be positive or negative to enable you to create ascending or descending sequences.)
• The CACHE or NOCACHE clause indicates whether Oracle will pre-allocate sequence numbers in memory. (Oracle pre-allocates 20 values by default.)

For example, you could create a sequence to automatically assign values to the customer code each time a new customer is added and create another sequence to automatically assign values to the invoice number each time a new invoice is added. The SQL code to accomplish those tasks is:

CREATE SEQUENCE CUS_CODE_SEQ START WITH 20010 NOCACHE;
CREATE SEQUENCE INV_NUMBER_SEQ START WITH 4010 NOCACHE;

To check all of the sequences you have created, use the following SQL command, illustrated in Figure 9.22:

SELECT * FROM USER_SEQUENCES;

FIGURE 9.22 Oracle sequence
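The optional clauses of CREATE SEQUENCE can be sketched as follows. This is a minimal Oracle example using hypothetical sequence names (DEMO_UP_SEQ and DEMO_DOWN_SEQ) that are not part of the chapter's database. It also uses the MAXVALUE clause, which is outside the basic syntax shown above: Oracle requires the starting value of a descending sequence to be no greater than its maximum value, so the sketch sets one explicitly.

```sql
-- Hypothetical sequences, for illustration only.
-- Ascending: starts at 1000, steps by 10, pre-allocates 20 values in memory.
CREATE SEQUENCE DEMO_UP_SEQ
  START WITH 1000
  INCREMENT BY 10
  CACHE 20;

-- Descending: a negative increment counts down from 500.
-- MAXVALUE is set because START WITH may not exceed the maximum value.
CREATE SEQUENCE DEMO_DOWN_SEQ
  START WITH 500
  INCREMENT BY -1
  MAXVALUE 500
  NOCACHE;

-- Inspect the settings in the data dictionary.
SELECT SEQUENCE_NAME, INCREMENT_BY, CACHE_SIZE, LAST_NUMBER
FROM   USER_SEQUENCES
WHERE  SEQUENCE_NAME LIKE 'DEMO%';

-- Clean up the illustration objects.
DROP SEQUENCE DEMO_UP_SEQ;
DROP SEQUENCE DEMO_DOWN_SEQ;
```

Listing the columns by name instead of using SELECT * keeps the output narrow enough to read on a SQL*Plus screen.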

To use sequences during data entry, you must use two special pseudo columns: NEXTVAL and CURRVAL. NEXTVAL retrieves the next available value from a sequence. CURRVAL retrieves the current value of a sequence. For example, you can use the following code to enter a new customer:

INSERT INTO CUSTOMER
VALUES (CUS_CODE_SEQ.NEXTVAL, 'Connery', 'Sean', NULL, '0181', '898-2008', 0.00);

The preceding SQL statement adds a new customer to the CUSTOMER table and assigns the value 20010 to the CUS_CODE attribute. Let's examine some important sequence characteristics:

• CUS_CODE_SEQ.NEXTVAL retrieves the next available value from the sequence.
• Each time you use NEXTVAL, the sequence is incremented.
• Once a sequence value is used (through NEXTVAL), it cannot be used again. If, for some reason, your SQL statement rolls back, the sequence value does not roll back. If you issue another SQL statement (with another NEXTVAL), the next available sequence value is returned to the user; it will look as though the sequence skips a number.
• You can issue an INSERT statement without using the sequence.

CURRVAL retrieves the current value of a sequence, that is, the last sequence number used, which was generated with a NEXTVAL. You cannot use CURRVAL unless a NEXTVAL was issued previously in the same session. The main use for CURRVAL is to enter rows in dependent tables. For example, the INVOICE and LINE tables in the Ch09_SaleCo database are related in a one-to-many relationship through the INV_NUMBER attribute. You can use the INV_NUMBER_SEQ sequence to generate invoice numbers automatically. Then, using CURRVAL, you can get the latest INV_NUMBER used and assign it to the related INV_NUMBER foreign key attribute in the LINE table. For example:

INSERT INTO INVOICE VALUES (INV_NUMBER_SEQ.NEXTVAL, 20010, SYSDATE);
INSERT INTO LINE VALUES (INV_NUMBER_SEQ.CURRVAL, 1, '13-Q2/P2', 1, 14.99);
INSERT INTO LINE VALUES (INV_NUMBER_SEQ.CURRVAL, 2, '23109-HB', 1, 9.95);
COMMIT;

The results are shown in Figure 9.23.

FIGURE 9.23 Oracle sequence examples

In the example shown in Figure 9.23, INV_NUMBER_SEQ.NEXTVAL retrieves the next available sequence number (4011) and assigns it to the INV_NUMBER column in the INVOICE table. Also note the use of the SYSDATE attribute to automatically insert the current date in the INV_DATE attribute.

Next, the following two INSERT statements add the products being sold to the LINE table. In this case, INV_NUMBER_SEQ.CURRVAL refers to the last-used INV_NUMBER_SEQ sequence number (4011). In this way, the relationship between INVOICE and LINE is established automatically. The COMMIT statement at the end of the command sequence makes the changes permanent. Of course, you can also issue a ROLLBACK, in which case the rows you inserted in the INVOICE and LINE tables are rolled back (but the sequence number is not). Once you use a sequence number (with NEXTVAL), there is no way to reuse it! This no-reuse characteristic is designed to guarantee that the sequence always generates unique values.

Remember these points when you think about sequences:

• The use of sequences is optional. You can enter the values manually.
• A sequence is not associated with a table. As in the examples in Figure 9.23, two distinct sequences were created (one for customer code values and one for invoice number values), but you could have created just one sequence and used it to generate unique values for both tables.

Finally, you can drop a sequence from a database with a DROP SEQUENCE command. For example, to drop the sequences created earlier, you type:

DROP SEQUENCE CUS_CODE_SEQ;
DROP SEQUENCE INV_NUMBER_SEQ;

Dropping a sequence does not delete the values you assigned to table attributes (CUS_CODE and INV_NUMBER); it deletes only the sequence object from the database. The values you assigned to the table columns (CUS_CODE and INV_NUMBER) remain in the database.

Because the CUSTOMER and INVOICE tables are used in subsequent examples, you should keep the original data set. Therefore, you can delete the customer, invoice and line rows you just added by using the following commands:

DELETE FROM INVOICE WHERE INV_NUMBER = 4011;
DELETE FROM CUSTOMER WHERE CUS_CODE = 20010;
COMMIT;

Those commands delete the recently added invoice and all of the invoice line rows associated with the invoice (the LINE table's INV_NUMBER foreign key was defined with the ON DELETE CASCADE option) and the recently added customer. The COMMIT statement saves all changes to permanent storage.

NOTE
At this point, you'll need to re-create the CUS_CODE_SEQ and INV_NUMBER_SEQ sequences, as they will be used again later in the chapter. Enter:

CREATE SEQUENCE CUS_CODE_SEQ START WITH 20010 NOCACHE;
CREATE SEQUENCE INV_NUMBER_SEQ START WITH 4011 NOCACHE;
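The no-reuse behaviour of sequence values can be observed directly. The following is a minimal Oracle sketch using hypothetical objects (DEMO_SEQ and DEMO_ORDER) that are not part of the chapter's database. It assumes a single session with no other users of the sequence.

```sql
-- Hypothetical objects, for illustration only.
CREATE SEQUENCE DEMO_SEQ START WITH 100 INCREMENT BY 1 NOCACHE;
CREATE TABLE DEMO_ORDER (
  ORDER_ID   NUMBER PRIMARY KEY,
  ORDER_NOTE VARCHAR2(30)
);

-- The first insert takes a value from the sequence and is then rolled back.
INSERT INTO DEMO_ORDER VALUES (DEMO_SEQ.NEXTVAL, 'first attempt');
ROLLBACK;   -- the row is undone, but the sequence value is not returned

-- The second insert receives a new, higher value: the rolled-back one is skipped.
INSERT INTO DEMO_ORDER VALUES (DEMO_SEQ.NEXTVAL, 'second attempt');
COMMIT;

-- CURRVAL reports the last value generated by NEXTVAL in this session.
SELECT DEMO_SEQ.CURRVAL FROM DUAL;

-- Only the committed row remains, so the stored keys show a gap.
SELECT ORDER_ID, ORDER_NOTE FROM DEMO_ORDER;

-- Clean up the illustration objects.
DROP TABLE DEMO_ORDER;
DROP SEQUENCE DEMO_SEQ;
```

Because of such gaps, sequence-generated keys should be treated as unique identifiers, not as a count of rows.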

9.6 Updatable Views

In Chapter 8, Beginning Structured Query Language, you learnt how to create a view and why and how views are used. As mentioned in Chapter 8, Microsoft Access does not support views. While views can be simulated using a SQL query, as is seen here, a view is far more versatile. You will now look at how to make views serve common data management tasks executed by database administrators.

One of the most common operations in production database environments is using batch update routines to update a master table attribute (field) with transaction data. As the name implies, a batch update routine pools multiple transactions into a single batch to update a master table field in a single operation. For example, a batch update routine is commonly used to update a product's quantity on hand based on summary sales transactions. Such routines are typically run as overnight batch jobs to update the quantity on hand of products in inventory. The sales transactions performed, for example, by travelling salespeople in remote areas were entered during periods when the system was offline.

To demonstrate a batch update routine, let's begin by defining the master product table (PRODMASTER) and the product monthly sales totals table (PRODSALES) shown in Figure 9.24. As you examine the tables, note the 1:1 relationship between the two tables.

FIGURE 9.24 The PRODMASTER and PRODSALES tables

Table name: PRODMASTER
PROD_ID   PROD_DESC   PROD_QOH
A123      SCREWS      60
BX34      NUTS        37
C583      BOLTS       50

Table name: PRODSALES
PROD_ID   PS_QTY
A123      7
BX34      3

ONLINE CONTENT
For Microsoft Access users, the PRODMASTER and PRODSALES tables are located in the 'Ch09_UV' database, which is located on the online platform for this book. For Oracle users, all SQL commands you see in this section are located in script files (uv01.sql through uv04.sql) on the online student companion. After you locate the script files, you can copy and paste the command sequences into your SQL*Plus program.

Using the tables in Figure 9.24, let's update the PRODMASTER table by subtracting the PRODSALES table's product monthly sales quantity (PS_QTY) from the PRODMASTER table's PROD_QOH. To produce the required update, the update query is written like this:

UPDATE PRODMASTER, PRODSALES
SET PRODMASTER.PROD_QOH = PROD_QOH - PS_QTY
WHERE PRODMASTER.PROD_ID = PRODSALES.PROD_ID;

Note that the update statement reflects the following sequence of events:

• Join the PRODMASTER and PRODSALES tables.
• Update the PROD_QOH attribute (using the PS_QTY value in the PRODSALES table) in each row of the PRODMASTER table with matching PROD_ID values in the PRODSALES table.

To be used in a batch update, the PRODSALES data must be stored in a base table rather than in a view. That query works in Microsoft Access, but Oracle returns the error message shown in Figure 9.25.

Oracle produced the error message because Oracle expects to find a single table name in the UPDATE statement. In fact, you cannot join tables in the UPDATE statement in Oracle. To solve that problem, you have to create an updatable view. As its name suggests, an updatable view is a view that can be used to update attributes in the base table(s) that is (are) used in the view. You must realise that not all views are updatable. Actually, several restrictions govern updatable views, and some of them are vendor-specific.

FIGURE 9.25 The Oracle UPDATE error message

NOTE
Keep in mind that the examples in this section are generated in Oracle. To see which restrictions are placed on updatable views by the DBMS you are using, check the appropriate DBMS manual. MySQL version 8.0 supports both updatable and insertable views. For more information on the syntax, consult the MySQL 8.0 reference manual.

The most common updatable view restrictions are as follows:

• GROUP BY expressions or aggregate functions cannot be used in the updatable views.
• You cannot use set operators such as UNION, INTERSECT and MINUS.
• Most restrictions are based on the use of JOINs or group operators in views.

To meet the Oracle limitations, an updatable view named PSVUPD has been created, as shown in Figure 9.26.

FIGURE 9.26 Creating an updatable view in Oracle

One easy way to determine whether a view can be used to update a base table is to examine the view's output. If the primary key columns of the base table you want to update still have unique values in the view, the base table is updatable. For example, if the PROD_ID column of the view returns the A123 or BX34 values more than once, the PRODMASTER table cannot be updated through the view.

After creating the updatable view shown in Figure 9.26, you can use the UPDATE command to update the view, thereby updating the PRODMASTER table. Figure 9.27 shows how the UPDATE command is used and shows what the final contents of the PRODMASTER table are after the UPDATE is executed.

FIGURE 9.27 PRODMASTER table update, using an updatable view

Although the batch update procedure just illustrated meets the goal of updating a master table with data from a transaction table, the preferred, real-world solution to the update problem is to use procedural SQL. You'll learn about procedural SQL in the next section.
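The updatable-view approach described in this section can be sketched in Oracle SQL as follows. This is a minimal sketch under stated assumptions, not a reproduction of Figures 9.26 and 9.27: the view's column list is an assumption, and PROD_ID is assumed to be the primary key of both PRODMASTER and PRODSALES. Oracle needs that uniqueness to guarantee that each PRODMASTER row appears at most once in the joined view.

```sql
-- Sketch only: assumes PROD_ID is the primary key of both base tables.
CREATE VIEW PSVUPD AS (
  SELECT PRODMASTER.PROD_ID, PROD_QOH, PS_QTY
  FROM   PRODMASTER, PRODSALES
  WHERE  PRODMASTER.PROD_ID = PRODSALES.PROD_ID
);

-- Check the view's output: each PROD_ID should appear only once.
SELECT * FROM PSVUPD;

-- Updating the view updates the PRODMASTER base table.
UPDATE PSVUPD
SET    PROD_QOH = PROD_QOH - PS_QTY;

-- Confirm the new quantities on hand.
SELECT * FROM PRODMASTER;
```

With the data in Figure 9.24, the quantity on hand for A123 would drop from 60 to 53 and for BX34 from 37 to 34. C583 has no matching PRODSALES row, so it does not appear in the view and stays at 50.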

9.7 Procedural SQL

Thus far, you have learnt to use SQL to read, write and delete data in the database. For example, you learnt to update values in a record, to add records and to delete records. Unfortunately, SQL does not support the conditional execution of procedures that are typically supported by a programming language using the general format:

IF <condition>
THEN <perform procedure>
ELSE <perform alternate procedure>
END IF

SQL also fails to support the looping operations in programming languages that permit the execution of repetitive actions typically encountered in a programming environment. The typical format is:

DO WHILE
<perform procedure>
END DO


Traditionally, if you wanted to perform a conditional (IF-THEN-ELSE) or looping (DO-WHILE) type of operation (that is, a procedural type of programming), you would use a programming language such as .NET, C#, C, Visual Basic or Java. That's why many older (so-called legacy) business applications are based on enormous numbers of COBOL program lines. Although that approach is still common, it usually involves the duplication of application code in many programs. Therefore, when procedural changes are required, program modifications must be made in many different programs. An environment characterised by such redundancies often creates data management problems.

A better approach is to isolate critical code and then have all application programs call the shared code. The advantage of that modular approach is that the application code is isolated in a single program, thus yielding better maintenance and logic control. In any case, the rise of distributed databases (see Chapter 14, Distributed Databases) and object-orientated databases (see online Appendix G, Object-Orientated Databases) required that more application code be stored and executed within the database. To meet that requirement, most RDBMS vendors created numerous programming language extensions. Those extensions include:

• Flow control procedural programming structures (IF-THEN-ELSE, DO-WHILE) for logic representation.
• Variable declaration and designation within the procedures.
• Error management.

To remedy the lack of procedural functionality in SQL and to provide some standardisation within the many vendor offerings, the SQL-99 standard defined the use of persistent stored modules. A persistent stored module (PSM) is a block of code (containing standard SQL statements and procedural extensions) that is stored and executed at the DBMS server. The PSM represents business logic that can be encapsulated, stored and shared among multiple database users. A PSM lets an administrator assign specific access rights to a stored module to ensure that only authorised users can use it. Support for persistent stored modules is left to each vendor to implement. In fact, for many years, some RDBMSs (such as Oracle, SQL Server and DB2) supported stored procedure modules within the database before the official standard was promulgated.

Oracle implements PSMs through its procedural SQL language. Procedural SQL (PL/SQL) is a language that makes it possible to use and store procedural code and SQL statements within the database and to merge SQL and traditional programming constructs, such as variables, conditional processing (IF-THEN-ELSE), basic loops (FOR and WHILE loops) and error trapping. The procedural code is executed as a unit by the DBMS when it is invoked (directly or indirectly) by the end user. End users can use PL/SQL to create:

• Anonymous PL/SQL blocks.
• Triggers (covered in Section 9.7.1).
• Stored procedures (covered in Section 9.7.2 and Section 9.7.3).
• PL/SQL functions (covered in Section 9.7.4).

Do not confuse PL/SQL functions with SQL's built-in aggregate functions such as MIN and MAX. SQL built-in functions can be used only within SQL statements, while PL/SQL functions are mainly invoked within PL/SQL programs such as triggers and stored procedures. Functions can also be called within SQL statements, provided they conform to very specific rules that are dependent on your DBMS environment.
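The constructs listed above (variables, conditional processing, loops and error trapping) can be sketched in a single anonymous PL/SQL block. This is a minimal illustration for Oracle SQL*Plus, not an example from the chapter's database: the variable names are hypothetical, and the division by zero is included deliberately to trigger the error handler. The SET SERVEROUTPUT ON command and the DBMS_OUTPUT.PUT_LINE procedure it relies on are explained later in this section.

```sql
SET SERVEROUTPUT ON

DECLARE
  v_total NUMBER(5) := 0;   -- hypothetical running total
  v_count NUMBER(3) := 1;   -- hypothetical loop counter
BEGIN
  -- Basic loop: add the integers 1 through 5.
  WHILE v_count <= 5 LOOP
    v_total := v_total + v_count;
    v_count := v_count + 1;
  END LOOP;

  -- Conditional processing on the result (1+2+3+4+5 = 15).
  IF v_total > 10 THEN
    DBMS_OUTPUT.PUT_LINE('Total is above 10: ' || v_total);
  ELSE
    DBMS_OUTPUT.PUT_LINE('Total is 10 or less: ' || v_total);
  END IF;

  -- Deliberate error, to show error trapping.
  v_total := v_total / 0;
EXCEPTION
  WHEN ZERO_DIVIDE THEN
    DBMS_OUTPUT.PUT_LINE('Error trapped: division by zero');
END;
/
```

The block first reports that the total is above 10 and then prints the trapped-error message, because control passes to the EXCEPTION section as soon as the division fails.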

NOTE
PL/SQL, triggers and stored procedures are illustrated within the context of an Oracle DBMS. All examples in the following sections assume the use of Oracle RDBMS.

Using Oracle SQL*Plus, you can write a PL/SQL code block by enclosing the commands inside BEGIN and END clauses. For example, the following PL/SQL block inserts a new row in the VENDOR table. (See Figure 9.28.)

FIGURE 9.28 Anonymous PL/SQL block examples

478

part

III

Database

Programming

BEGIN INSERT

INTO

VALUES

VENDOR

(25678,'Microwork

Corp.',

'Adam

Gates','5910','546-8484','NL','N');

END; /

The PL/SQL block shown in Figure 9.28 is known as an anonymous PL/SQL block because it has not been given a specific name. (Incidentally, note that the block's last line uses a forward slash (/) to indicate the end of the command-line entry.) That type of PL/SQL block executes as soon as you press the Enter key after typing the forward slash. Following the PL/SQL block's execution, you see the message "PL/SQL procedure successfully completed."

But suppose that you want a more specific message displayed on the SQL*Plus screen after a procedure is completed, like "New Vendor Added." To produce a more specific message, you must do two things:

1. At the SQL> prompt, type SET SERVEROUTPUT ON. This SQL*Plus command enables the client console (SQL*Plus) to receive messages from the server side (Oracle DBMS). Remember, just like standard SQL, the PL/SQL code (anonymous blocks, triggers, and procedures) is executed at the server side, not at the client side. (To stop receiving messages from the server, you would enter SET SERVEROUTPUT OFF.)

2. To send messages from the PL/SQL block to the SQL*Plus console, use the DBMS_OUTPUT.PUT_LINE procedure.

The following anonymous PL/SQL block inserts a row in the VENDOR table and displays the message "New Vendor Added!" (See Figure 9.28.)

    BEGIN
        INSERT INTO VENDOR
        VALUES (25772,'Clue Store','Issac Hayes','5910','323-2009','NL','N');
        DBMS_OUTPUT.PUT_LINE('New Vendor Added!');
    END;
    /

In Oracle, you can use the SQL*Plus command SHOW ERRORS to help you diagnose errors found in PL/SQL blocks. The SHOW ERRORS command yields additional debugging information whenever you generate an error after creating or executing a PL/SQL block.

The following example of an anonymous PL/SQL block demonstrates several of the constructs supported by the procedural language. Remember that the exact syntax of the language is vendor-dependent; in fact, many vendors enhance their products with proprietary features.

    DECLARE
        W_P1  NUMBER(3) := 0;
        W_P2  NUMBER(3) := 10;
        W_NUM NUMBER(2) := 0;
    BEGIN
        WHILE W_P2 < 300 LOOP
            SELECT COUNT(P_CODE) INTO W_NUM FROM PRODUCT
            WHERE P_PRICE BETWEEN W_P1 AND W_P2;
            DBMS_OUTPUT.PUT_LINE('There are ' || W_NUM || ' Products with price between ' || W_P1 || ' and ' || W_P2);
            W_P1 := W_P2 + 1;
            W_P2 := W_P2 + 50;
        END LOOP;
    END;
    /

The block's code and execution are shown in Figure 9.29.

FIGURE 9.29 Anonymous PL/SQL block with variables and loops

The PL/SQL block shown in Figure 9.29 has the following characteristics:

- The PL/SQL block starts with the DECLARE section in which you declare the variable names; the data types; and, if desired, an initial value. Supported data types are shown in Table 9.9.


TABLE 9.9 PL/SQL basic data types

Data Type    Description
CHAR         Character values of a fixed length; for example: W_PCODE CHAR(5)
VARCHAR2     Variable length character values; for example: W_FNAME VARCHAR2(15)
NUMBER       Numeric values; for example: W_PRICE NUMBER(6,2)
DATE         Date values; for example: W_EMP_DOB DATE
%TYPE        Inherits the data type from a variable that you declared previously or from an attribute of a database table; for example: W_PRICE PRODUCT.P_PRICE%TYPE assigns W_PRICE the same data type as the P_PRICE column in the PRODUCT table

- A WHILE loop is used. Note the syntax:

    WHILE condition LOOP
        PL/SQL statements;
    END LOOP

- The SELECT statement uses the INTO keyword to assign the output of the query to a PL/SQL variable. You can use the INTO keyword only inside a PL/SQL block of code. If the SELECT statement returns more than one value, you will get an error.

- Note the use of the string concatenation symbol || to display the output.

- Each statement inside the PL/SQL code must end with a semicolon ;.
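As a small illustration of the %TYPE and SELECT ... INTO points above, the following sketch declares a variable that inherits its data type from the PRODUCT.P_PRICE column and fills it from a query that always returns exactly one row. The variable name W_MAX_PRICE and the message text are illustrative assumptions, and the block assumes SET SERVEROUTPUT ON has been run.

```sql
DECLARE
    -- W_MAX_PRICE (hypothetical name) takes the data type of PRODUCT.P_PRICE
    W_MAX_PRICE PRODUCT.P_PRICE%TYPE;
BEGIN
    -- An aggregate without GROUP BY returns exactly one row,
    -- so SELECT ... INTO cannot fail with "more than one value" here
    SELECT MAX(P_PRICE) INTO W_MAX_PRICE FROM PRODUCT;
    -- || concatenates the label and the value for display
    DBMS_OUTPUT.PUT_LINE('Highest product price: ' || W_MAX_PRICE);
END;
/
```

Using %TYPE rather than a hard-coded NUMBER(p,s) means the block should keep working if the column definition is later changed.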

Note: PL/SQL blocks can contain only standard SQL data manipulation language (DML) commands such as SELECT, INSERT, UPDATE and DELETE. The use of data definition language (DDL) commands is not directly supported in a PL/SQL block.

The most useful feature of PL/SQL blocks is that they let you create code that can be named, stored and executed either implicitly or explicitly by the DBMS. That capability is especially desirable when you need to use triggers and stored procedures. We explore database triggers and stored procedures in the next sections.

9.7.1 Triggers

Automating business procedures and automatically maintaining data integrity and consistency are critical in a modern business environment. One of the most critical business procedures is proper inventory management. For example, you want to make sure that current product sales can be supported with sufficient product availability. Therefore, it is necessary to ensure that a product order be written to a vendor when that product's inventory drops below its minimum allowable quantity on hand. Better yet, how about ensuring that the task is completed automatically?

Microsoft Access does not support triggers. While the functionality of triggers can be supported at the application level, this does not create the same level of data integrity that is provided by triggers. When business logic is implemented in the application layer, it is possible that edits made directly to the database are completed without the correct updates being propagated. Using triggers to embed business logic into the database ensures that the appropriate updates are always propagated.

To accomplish automatic product ordering, you first need to make sure that the product's quantity on hand reflects an up-to-date and consistent value. After the appropriate product availability requirements have been set, two key issues must be addressed:

- Business logic requires an update of the product quantity on hand each time there is a sale of that product.
- If the product's quantity on hand falls below its minimum allowable inventory level, the product must be reordered.

To accomplish those two tasks, you could write multiple SQL statements: one to update the product quantity on hand and another to update the product reorder flag. Next, you would have to run each statement in the correct order each time there was a new sale. Such a multistage process would be inefficient because a series of SQL statements must be written and executed each time a product is sold. Even worse, that SQL environment requires that somebody must remember to perform the SQL tasks.

A trigger is procedural SQL code that is automatically invoked by the RDBMS upon the occurrence of a given data manipulation event. It is useful to remember that:

- A trigger is invoked before or after a data row is inserted, updated, or deleted.
- A trigger is associated with a database table.
- Each database table may have one or more triggers.
- A trigger is executed as part of the transaction that triggered it.

Triggers tend to follow an Event-Condition-Action (ECA) model. An event can be any operation that changes the database, such as a SQL insert or update operation. The condition determines whether the action is executed after the event has occurred. The action is what is undertaken, such as a database update or an external program being called.

Triggers are critical to proper database operation and management. For example:

- Triggers can be used to enforce constraints that cannot be enforced at the DBMS design and implementation levels.
- Triggers add functionality by automating critical actions and providing appropriate warnings and suggestions for remedial action. In fact, one of the most common uses for triggers is to facilitate the enforcement of referential integrity.
- Triggers can be used to update table values, insert records in tables, and call other stored procedures.

Triggers play a critical role in making the database truly useful; they also add processing power to the RDBMS and to the database system as a whole. Oracle recommends triggers for:

- Auditing purposes (creating audit logs).
- Automatic generation of derived column values.
- Enforcement of business or security constraints.
- Creation of replica tables for backup purposes.

To see how a trigger is created and used, let's examine a simple inventory management problem. For example, if a product's quantity on hand is updated when the product is sold, the system should automatically check whether the quantity on hand falls below its minimum allowable quantity. To demonstrate that process, let's use the PRODUCT table in Figure 9.30. Note the use of the minimum order quantity (P_MIN_ORDER) and the product reorder flag (P_REORDER) columns. The P_MIN_ORDER indicates the minimum quantity for restocking an order. The P_REORDER column is a numeric field that indicates whether the product needs to be reordered (1 = Yes, 0 = No). The initial P_REORDER values are set to 0 (No) to serve as the basis for the initial trigger development.

FIGURE 9.30 The PRODUCT table

Online Content: Oracle users can run the PRODLIST.SQL script file to format the output of the PRODUCT table shown in Figure 9.30. The script file is located on the online platform for this book. (The PRODUCT table is also shown in the 'Ch09_SaleCo' database that is stored in Microsoft Access format.)

Given the PRODUCT table listing shown in Figure 9.30, let's create a trigger to evaluate the product's quantity on hand, P_QOH. If the quantity on hand is below the minimum quantity shown in P_MIN, the trigger sets the P_REORDER column to 1. (Remember that the number 1 in the P_REORDER column represents Yes.) The syntax to create a trigger in Oracle is:

    CREATE OR REPLACE TRIGGER trigger_name
    [BEFORE / AFTER] [DELETE / INSERT / UPDATE OF column_name] ON table_name
    [FOR EACH ROW]
    [DECLARE]
    [variable_name data type[:=initial_value] ]
    BEGIN
        PL/SQL instructions;
        ..........
    END;


As you can see, a trigger definition contains the following parts:

- The triggering timing: BEFORE or AFTER. This timing indicates when the trigger's PL/SQL code executes; in this case, before or after the triggering statement is completed.
- The triggering event: The statement that causes the trigger to execute (INSERT, UPDATE, or DELETE).
- The triggering level: There are two types of triggers, statement-level triggers and row-level triggers.
    - A statement-level trigger is assumed if you omit the FOR EACH ROW keywords. This type of trigger is executed once, before or after the triggering statement is completed. This is the default case.
    - A row-level trigger requires use of the FOR EACH ROW keywords. This type of trigger is executed once for each row affected by the triggering statement. (In other words, if you update ten rows, the trigger executes ten times.)
- The triggering action: The PL/SQL code enclosed between the BEGIN and END keywords. Each statement inside the PL/SQL code must end with a semicolon ;.

In the PRODUCT table's case, you will create a statement-level trigger that is implicitly executed AFTER an UPDATE of the P_QOH and P_MIN attributes for an existing row or AFTER an INSERT of a new row in the PRODUCT table. The trigger action executes an UPDATE statement that compares the P_QOH with the P_MIN column. If the value of P_QOH is equal to or less than P_MIN, the trigger updates the P_REORDER to 1. The trigger code is shown in Figure 9.31.

FIGURE 9.31 The TRG_PRODUCT_REORDER trigger

To test this trigger version, let's change the minimum quantity for product '23114-AA' to 8. After that update, the trigger makes sure that the reorder flag is properly set for all of the products in the PRODUCT table. (See Figure 9.32.)
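Because the figure is an image, its listing is not reproduced in this text. The following is a minimal sketch of a statement-level trigger that matches the description above (AFTER timing, fired by an INSERT or by an UPDATE of P_QOH or P_MIN, with an UPDATE as its action). It is an illustration under those stated assumptions and may differ in detail from the book's figure; the test statement assumes the product code is stored in a column named P_CODE.

```sql
-- Statement-level trigger: no FOR EACH ROW clause, so it runs once
-- after the triggering INSERT or UPDATE statement completes
CREATE OR REPLACE TRIGGER TRG_PRODUCT_REORDER
AFTER INSERT OR UPDATE OF P_QOH, P_MIN ON PRODUCT
BEGIN
    -- Flag every product whose quantity on hand is at or below its minimum
    UPDATE PRODUCT
       SET P_REORDER = 1
     WHERE P_QOH <= P_MIN;
END;
/

-- Test described in the text: raise the minimum quantity of one product
UPDATE PRODUCT
   SET P_MIN = 8
 WHERE P_CODE = '23114-AA';
```

Note that the trigger's own UPDATE changes only P_REORDER, which is not in the OF column list, so the action does not fire the trigger again.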


FIGURE 9.32 Successful trigger execution after the P_MIN value is updated

This trigger seems to work well, but what happens if you reduce the minimum quantity of product '2232/QWE'? Figure 9.33 shows that when you update the minimum quantity on hand of the product '2232/QWE', it falls below the new minimum, but the reorder flag is still 0. Why?

FIGURE 9.33 The P_REORDER value mismatch after update of the P_MIN attribute


The answer is that the trigger does not consider all possible cases. Let's examine the TRG_PRODUCT_REORDER trigger code (Figure 9.31) in more detail:

- The trigger fires after the triggering statement is completed. Therefore, the DBMS always executes two statements (INSERT plus UPDATE or UPDATE plus UPDATE). That is, after you do an update of P_MIN or P_QOH or you insert a new row in the PRODUCT table, the trigger executes another UPDATE statement automatically.
- The triggering action performs an UPDATE that updates all of the rows in the PRODUCT table, even if the triggering statement updates just one row! This can affect the performance of the database. Imagine what happens if you have a PRODUCT table with 519,128 rows and you insert just one product. The trigger will update all 519,129 rows (519,128 original rows plus the one you inserted), including the rows that do not need an update!
- The trigger sets the P_REORDER value only to 1; it does not reset the value to 0, even if such an action is clearly required when the inventory level is back to a value greater than the minimum value.

Now let's modify the trigger to handle all update scenarios, as shown in Figure 9.34.

FIGURE 9.34 The second version of the TRG_PRODUCT_REORDER trigger

The trigger in Figure 9.34 sports several new features:

- The trigger is executed before the actual triggering statement is completed. In Figure 9.34, the triggering timing is defined in line 2, BEFORE INSERT OR UPDATE. This clearly indicates that the trigger is executed before the INSERT or UPDATE completes, unlike the previous trigger examples.
- The trigger is a row-level trigger instead of a statement-level trigger. The FOR EACH ROW keywords make the trigger a row-level trigger. Therefore, this trigger executes once for each row affected by the triggering statement.
- The trigger action uses the :NEW attribute reference to change the value of the P_REORDER attribute.

The use of the :NEW attribute references deserves a more detailed explanation. To understand its use, you must first consider a basic computing tenet: all changes are done first in primary memory, then to permanent memory. In other words, the computer cannot change anything directly in permanent storage (disk). It must first read the data from permanent storage to primary memory; then it makes the change in primary memory; finally, it writes the changed data back to permanent memory (disk).

The DBMS does exactly the same thing, in addition to something more. To ensure data integrity, the DBMS makes two copies of every row being changed by a DML (INSERT, UPDATE, or DELETE) statement. (You will learn more about this in Chapter 12, Managing Transactions and Concurrency.) The first copy contains the original (old) values of the attributes before the changes. The second copy contains the changed (new) values of the attributes that are permanently saved to the database (after any changes made by an INSERT, UPDATE, or DELETE). You can use :OLD to refer to the original values; you can use :NEW to refer to the changed values (the values that are stored in the table). You can use :NEW and :OLD attribute references only within the PL/SQL code of a database trigger action. For example:

- IF :NEW.P_QOH <= :NEW.P_MIN compares the quantity on hand with the minimum quantity of a product. Remember that this is a row-level trigger. Therefore, this comparison is done for each row that is updated by the triggering statement.
- Although the trigger is a BEFORE trigger, this does not mean that the triggering statement hasn't executed yet. On the contrary, the triggering statement has already taken place; otherwise, the trigger would not have fired and the :NEW values would not exist. Remember, BEFORE means before the changes are permanently saved to disk, but after the changes are made in memory.
- The trigger uses the :NEW reference to assign a value to the P_REORDER column before the UPDATE or INSERT results are permanently stored in the table. The assignment is always done to the :NEW value (never to the :OLD value), and the assignment always uses the := assignment operator. The :OLD values are read-only values; you cannot change them. Note that :NEW.P_REORDER := 1; assigns the value 1 to the P_REORDER column and :NEW.P_REORDER := 0; assigns the value 0 to the P_REORDER column.
- This new trigger version does not use any DML statement!

FIGURE 9.35 Execution of the second version of the trigger
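The figures for the second trigger version are images and are not reproduced in this text. The sketch below shows one way a row-level BEFORE trigger with the features described above could be written; it is an assumption-based illustration rather than the book's exact listing. The final UPDATE is a hypothetical test statement that touches every row without changing any quantity.

```sql
CREATE OR REPLACE TRIGGER TRG_PRODUCT_REORDER
BEFORE INSERT OR UPDATE OF P_QOH, P_MIN ON PRODUCT
FOR EACH ROW
BEGIN
    -- Compare the new values of the row currently being inserted or updated
    IF :NEW.P_QOH <= :NEW.P_MIN THEN
        :NEW.P_REORDER := 1;   -- at or below the minimum: flag for reorder
    ELSE
        :NEW.P_REORDER := 0;   -- above the minimum: clear the flag
    END IF;
END;
/

-- Fires the trigger once per row, so every reorder flag is re-evaluated
UPDATE PRODUCT SET P_QOH = P_QOH;
```

Because the action only assigns to :NEW.P_REORDER, the trigger issues no DML of its own and handles both setting and resetting the flag.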


Before testing the new trigger, note that product '11QER/31' currently has a quantity on hand that is above the minimum quantity, yet the reorder flag is set to 1. Given that condition, the reorder flag must be 0. After you create the new trigger, you can execute an UPDATE statement to fire it, as shown in Figure 9.35.

As you examine Figure 9.35, note the following important features:

- The trigger is automatically invoked for each affected row; in this case, all rows of the PRODUCT table. If your triggering statement would have affected only three rows, not all PRODUCT rows would have the correct P_REORDER value set. That's the reason the triggering statement was set up as shown in Figure 9.35.
- The trigger will run only if you insert a new product row or update P_QOH or P_MIN. If you update any other attribute, the trigger won't run.

The use of triggers facilitates the automation of multiple data management tasks. Although triggers are independent objects, they are associated with database tables. When you delete a table, all its trigger objects are deleted with it. However, if you need to delete a trigger without deleting the table, you could use the following command:

    DROP TRIGGER trigger_name

9.7.2 Stored procedures

A stored procedure is a named collection of procedural and SQL statements. Just like database triggers, stored procedures are stored in the database. One of the major advantages of stored procedures is that they can be used to encapsulate and represent business transactions. For example, you can create a stored procedure to represent a product sale, a credit update, or the addition of a new customer. By doing that, you can encapsulate SQL statements within a single stored procedure and execute them as a single transaction. There are two clear advantages to the use of stored procedures:

- Stored procedures substantially reduce network traffic and increase performance. Because the stored procedure is stored at the server, there is no transmission of individual SQL statements over the network. The use of stored procedures improves system performance because all transactions are executed locally on the RDBMS, so each SQL statement does not have to travel over the network.
- Stored procedures help reduce code duplication by means of code isolation and code sharing (creating unique PL/SQL modules that are called by application programs), thereby minimising the chance of errors and the cost of application development and maintenance.

As with triggers, Microsoft Access does not support stored procedures. This business logic would need to be implemented at the application layer and as such would not have the same integrity and robustness as stored procedures.

To create a stored procedure, use the following syntax:

    CREATE OR REPLACE PROCEDURE procedure_name [(argument [IN/OUT] data-type, ... )] [IS/AS]
    [variable_name data type[:=initial_value] ]
    BEGIN
        PL/SQL or SQL statements;
        ...
    END;

Note the following important points about stored procedures and their syntax:

- Argument specifies the parameters that are passed to the stored procedure. A stored procedure could have zero or more arguments or parameters.
- IN/OUT indicates whether the parameter is for input or output or both.
- Data-type is one of the procedural SQL data types used in the RDBMS. The data types normally match those used in the RDBMS table-creation statement.
- Variables can be declared between the keywords IS and BEGIN. You must specify the variable name, its data type, and (optionally) an initial value.

To illustrate stored procedures, assume that you want to create a procedure (PRC_PROD_DISCOUNT) to assign an additional 5 per cent discount for all products when the quantity on hand is more than or equal to twice the minimum quantity. Figure 9.36 shows how the stored procedure is created.

FIGURE 9.36 Creating the PRC_PROD_DISCOUNT stored procedure
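Since the figure's listing is not reproduced in this text, here is a minimal sketch of how a procedure with the stated behaviour could look. The column name P_DISCOUNT (a discount stored as a fraction, so that 5 per cent is 0.05) and the message text are assumptions made for illustration; the book's figure may differ.

```sql
CREATE OR REPLACE PROCEDURE PRC_PROD_DISCOUNT
AS
BEGIN
    -- Add 5 per cent to the discount of well-stocked products
    -- (quantity on hand at least twice the minimum quantity)
    UPDATE PRODUCT
       SET P_DISCOUNT = P_DISCOUNT + 0.05
     WHERE P_QOH >= P_MIN * 2;
    -- Visible in SQL*Plus only after SET SERVEROUTPUT ON
    DBMS_OUTPUT.PUT_LINE('* * Update finished * *');
END;
/
```

The procedure takes no arguments, so the parameter list and its parentheses are omitted entirely.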

Online Content: The source code for all of the stored procedures shown in this section can be found on the online platform for this book.

As you examine Figure 9.36, note that the PRC_PROD_DISCOUNT stored procedure uses the DBMS_OUTPUT.PUT_LINE procedure to display a message when the procedure executes. (This action assumes you previously ran the SET SERVEROUTPUT ON command.)

To execute a stored procedure, you must use the following code:

    BEGIN
        PRC_PROD_DISCOUNT;
    END;
    /


Note that if you are using the SQL*Plus command line, you can also execute stored procedures using the following syntax:

    EXEC procedure_name[(parameter_list)];

For example, to see the results of running the PRC_PROD_DISCOUNT stored procedure, you can use the EXEC PRC_PROD_DISCOUNT command shown in Figure 9.37.

FIGURE 9.37 Results of the PRC_PROD_DISCOUNT stored procedure

Using Figure 9.37 as your guide, you can see how the product discount attribute was increased by 5 per cent for all products with a quantity on hand more than or equal to twice the minimum quantity. (Compare the first PRODUCT table listing to the second PRODUCT table listing.)

One of the main advantages of procedures is that you can pass values to them. For example, the previous PRC_PROD_DISCOUNT procedure worked fine, but what if you wanted to make the percentage increase an input variable? In that case, you can pass an argument to represent the rate of increase to the procedure. Figure 9.38 shows the code for that procedure.

FIGURE 9.38 Second version of the PRC_PROD_DISCOUNT stored procedure

Figure 9.39 shows the execution of the second version of the PRC_PROD_DISCOUNT stored procedure. Note that if the procedure requires arguments, those arguments must be enclosed in parentheses and they must be separated by commas. Also notice that, when we try to apply a product discount of 1.5, the error message from within the stored procedure is shown and the product discount is not applied.

FIGURE 9.39 Results of the second version of the PRC_PROD_DISCOUNT stored procedure
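The parameterised procedure described above can be sketched as follows. The parameter name WPI, the column name P_DISCOUNT, the accepted range (greater than 0 and less than 1) and the message texts are all illustrative assumptions; the figures themselves are not reproduced in this text and may differ.

```sql
CREATE OR REPLACE PROCEDURE PRC_PROD_DISCOUNT (WPI IN NUMBER)
AS
BEGIN
    -- Reject rates outside the assumed valid range, for example 1.5
    IF (WPI <= 0) OR (WPI >= 1) THEN
        DBMS_OUTPUT.PUT_LINE('Error: value must be greater than 0 and less than 1');
    ELSE
        UPDATE PRODUCT
           SET P_DISCOUNT = P_DISCOUNT + WPI
         WHERE P_QOH >= P_MIN * 2;
        DBMS_OUTPUT.PUT_LINE('* * Update finished * *');
    END IF;
END;
/

-- Arguments go inside parentheses; several arguments would be comma-separated
EXEC PRC_PROD_DISCOUNT(0.05);
EXEC PRC_PROD_DISCOUNT(1.5);
```

With this sketch, the second call only prints the error message and leaves the PRODUCT table unchanged, which mirrors the behaviour the text describes.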

Stored procedures are also useful for encapsulating shared code to represent business transactions. For example, you can create a simple stored procedure to add a new customer. By using a stored procedure, all programs can call the stored procedure by name each time a new customer is added. Naturally, if new customer attributes are added later, you would need to modify the stored procedure. However, the programs that use the stored procedure would not need to know the name of the newly added attribute and would need to add only a new parameter to the procedure call. (Take a look at the PRC_CUS_ADD stored procedure shown in Figure 9.40.)

FIGURE 9.40 The PRC_CUS_ADD stored procedure
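The figure's listing is not reproduced in this text. As a rough sketch of the idea, a customer-adding procedure could take one parameter per required attribute and draw the new customer code from the CUS_CODE_SEQ sequence. The CUSTOMER column names (CUS_CODE, CUS_LNAME, CUS_FNAME, CUS_INITIAL, CUS_AREACODE, CUS_PHONE), the parameter names and the message text are assumptions made for illustration only.

```sql
CREATE OR REPLACE PROCEDURE PRC_CUS_ADD
    (W_LN   IN VARCHAR2,   -- last name
     W_FN   IN VARCHAR2,   -- first name
     W_INIT IN VARCHAR2,   -- middle initial
     W_AC   IN VARCHAR2,   -- area code
     W_PH   IN VARCHAR2)   -- phone number
AS
BEGIN
    -- The sequence supplies the next customer code, so callers never pass one
    INSERT INTO CUSTOMER
        (CUS_CODE, CUS_LNAME, CUS_FNAME, CUS_INITIAL, CUS_AREACODE, CUS_PHONE)
    VALUES
        (CUS_CODE_SEQ.NEXTVAL, W_LN, W_FN, W_INIT, W_AC, W_PH);
    DBMS_OUTPUT.PUT_LINE('Customer ' || W_LN || ', ' || W_FN || ' added.');
END;
/
```

Calling programs depend only on the procedure name and its parameter list, so a change to the table's column names would be absorbed inside the procedure.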

As you examine Figure 9.40, note these features:

- The PRC_CUS_ADD procedure uses several parameters, one for each required attribute in the CUSTOMER table.
- The stored procedure uses the CUS_CODE_SEQ sequence to generate a new customer code.

Copyright Editorial

review

2020 has

Cengage deemed

Learning. that

any

All suppressed

Rights

Reserved. content

does

May not

not materially

be

copied, affect

scanned, the

overall

or

duplicated, learning

in experience.

whole

or in Cengage

part.

Due Learning

to

electronic reserves

rights, the

right

some to

third remove

party additional

content

may content

be

suppressed at

any

time

from if

the

subsequent

eBook rights

and/or restrictions

eChapter(s). require

it

492

part

III

Database

Programming

The required null

only

parameters

when the

second

customer

and

cannot

The

procedure

was

those

table

specified

specifications

addition

in the table

permit

nulls

was unsuccessful

definition

for

because

Until

now,

returned

error.

displays

a

message

in the

SQL*Plus

If you

returned

all

of the

SQL

a single want to

is

the

use

There

stored,

DBMS

If the

an SQL

you

have

SQL

statement

statement

used inside

that

types

SQL

to let

the

a PL/SQL

returns

returns

may return

two

CURSOR you

FETCH,

holding

not in the

client

of cursors:

is

user

note

that

a required

the

attribute

know

that

the

customer

more than

more rows

inside

cursor_name have

and

the

9.10

one value

you

inside

usedin procedural

and rows.

or stored

value,

your

PL/SQL

an code,

SQL to hold the data rows

area of memory in

Cursors

procedure)

will generate

are

held in

which the

output

a reserved

memory

and explicit. returns

An implicit

cursor

only one value.

is

automatically

Up to this

point,

created

in

all of the examples

cursor is created to hold the output of an SQL statement that

could

return

0 or only

DECLARE

one row).

To create

an explicit

cursor,

use the

section:

IS select-query;

declared

CLOSE)

summarises

(but

a PL/SQL

(trigger

one

computer.

implicit

An explicit

block

more than

as a reserved

columns

SQL statement

cursor.

or

syntax

of a cursor

an array

when the

created an implicit following

You can think

like

server,

are two

procedural

tabLe

CUS_AREACODE

and can be

example,

with Cursors

statements value.

by an SQL query. query

Once

the

console

you need to use a cursor. A cursor is a special construct

9

For

added.

have

area in

parameter.

be null.

9.7.3 pL/SQL processing

of the

must be included

that

a cursor,

anywhere

main use

you

can

between

of each

use

the

specific

BEGIN

of those

PL/SQL

and

END

cursor

processing

keywords

of the

commands

PL/SQL

(OPEN,

block.

Table

9.10

commands.

Table 9.10 Cursor processing commands

Cursor Command | Explanation
OPEN | Opening the cursor executes the SQL command and populates the cursor with data, opening the cursor for processing. The cursor declaration command only reserves a named memory area for the cursor; it doesn't populate the cursor with the data. Before you can use a cursor, you need to open it. For example: OPEN cursor_name
FETCH | Once the cursor is opened, you can use the FETCH command to retrieve data from the cursor and copy it to the PL/SQL variables for processing. The syntax is: FETCH cursor_name INTO variable1 [, variable2, ...] The PL/SQL variables used to hold the data must be declared in the DECLARE section and must have data types compatible with the columns retrieved by the SQL command. If the cursor's SQL statement returns five columns, there must be five PL/SQL variables to receive the data from the cursor. This type of processing resembles the one-record-at-a-time processing used in previous database models. The first time you fetch a row from the cursor, the first row of data from the cursor is copied to the PL/SQL variables; the second time you fetch a row from the cursor, the second row of data is placed in the PL/SQL variables; and so on.
CLOSE | The CLOSE command closes the cursor for processing.

Cursor-style processing involves retrieving data from the cursor one row at a time. Once you open a cursor, it becomes an active data set. That data set contains a current row pointer. Therefore, after opening a cursor, the current row is the first row of the cursor.

When you fetch a row from the cursor, the data from the current row in the cursor is copied to the PL/SQL variables. After the fetch, the current row pointer moves to the next row in the set and continues until it reaches the end of the cursor.

How do you know what number of rows are in the cursor? Or how do you know when you have reached the end of the cursor data set? You know because cursors have special attributes that convey important information. Table 9.11 summarises the cursor attributes.

Table 9.11 Cursor attributes

Attribute | Description
%ROWCOUNT | Returns the number of rows fetched so far. If the cursor is not OPEN, it returns an error. If no FETCH has been done but the cursor is OPEN, it returns 0.
%FOUND | Returns TRUE if the last FETCH returned a row. Returns FALSE if the last FETCH did not return any row. If the cursor is not OPEN, it returns an error. If no FETCH has been done, it contains NULL.
%NOTFOUND | Returns TRUE if the last FETCH did not return any row. Returns FALSE if the last FETCH returned a row. If the cursor is not OPEN, it returns an error. If no FETCH has been done, it contains NULL.
%ISOPEN | Returns TRUE if the cursor is open (ready for processing) or FALSE if the cursor is closed. Remember, before you can use a cursor, you must open it.

To illustrate the use of cursors, let's use a simple stored procedure example that lists all products that have a quantity on hand greater than the average quantity on hand for all products. The code is shown in Figure 9.41.

Figure 9.41 A simple PRC_CURSOR_EXAMPLE

As you examine the stored procedure code shown in Figure 9.41, note the following important characteristics:

The %TYPE data type is used in the variable definition section. As indicated in Table 9.9, the %TYPE data type is used to indicate that the given variable inherits the data type from a variable previously declared or from an attribute of a database table. In this case, you are using %TYPE to indicate that W_P_CODE and W_P_DESCRIPT will have the same data type as the respective columns in the PRODUCT table. This way, you ensure that the PL/SQL variable will have a compatible data type.

The cursor PROD_CURSOR is declared with a CURSOR PROD_CURSOR statement.

To open the cursor PROD_CURSOR and populate it, the following command is executed: OPEN PROD_CURSOR;

The LOOP statement is used to loop through the data in the cursor, fetching one row at a time.

The FETCH command is used to retrieve a row from the cursor and place it in the respective PL/SQL variables.

The EXIT command is used to evaluate when there are no more rows in the cursor (using the %NOTFOUND cursor attribute) and to exit the loop.

The %ROWCOUNT cursor attribute is used to obtain the total number of rows processed.

The CLOSE PROD_CURSOR command is used to close the cursor.

The use of cursors, combined with standard SQL, makes relational databases very desirable because they enable programmers to work in the best of both worlds: set-oriented processing and record-oriented processing. Any experienced programmer knows to use the tool that best fits the job. Sometimes you may be better off manipulating data in a set-oriented environment; at other times, it may be better to use a record-oriented environment. Procedural SQL lets you have your proverbial cake and eat it, too. Procedural SQL provides functionality that enhances the capabilities of the DBMS while maintaining a high degree of manageability.
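The cursor-processing steps described in this section can be sketched as follows. This is an illustrative reconstruction under stated assumptions, not the listing from Figure 9.41: the names PRC_CURSOR_EXAMPLE, PROD_CURSOR, W_P_CODE, W_P_DESCRIPT and the PRODUCT columns come from the text, while the output messages and overall layout are assumed.

```sql
-- Illustrative sketch; message texts and layout are assumptions.
CREATE OR REPLACE PROCEDURE PRC_CURSOR_EXAMPLE IS
  -- %TYPE makes each variable inherit the data type of its PRODUCT column.
  W_P_CODE      PRODUCT.P_CODE%TYPE;
  W_P_DESCRIPT  PRODUCT.P_DESCRIPT%TYPE;
  -- Declaring the cursor only reserves a named area; no data is read yet.
  CURSOR PROD_CURSOR IS
    SELECT P_CODE, P_DESCRIPT
      FROM PRODUCT
     WHERE P_QOH > (SELECT AVG(P_QOH) FROM PRODUCT);
BEGIN
  OPEN PROD_CURSOR;                      -- run the query, populate the cursor
  LOOP
    FETCH PROD_CURSOR INTO W_P_CODE, W_P_DESCRIPT;   -- copy the current row
    EXIT WHEN PROD_CURSOR%NOTFOUND;      -- leave the loop after the last row
    DBMS_OUTPUT.PUT_LINE(W_P_CODE || ' -> ' || W_P_DESCRIPT);
  END LOOP;
  -- %ROWCOUNT must be read while the cursor is still open.
  DBMS_OUTPUT.PUT_LINE('Total products processed: ' || PROD_CURSOR%ROWCOUNT);
  CLOSE PROD_CURSOR;
END;
/
```

Note the order of the last two statements: reading %ROWCOUNT after CLOSE would raise an error, as Table 9.11 indicates for a cursor that is not open.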

9.7.4 PL/SQL Stored Functions

Using programmable or procedural SQL, you can also create your own stored functions. Stored procedures and functions are very similar. A stored function is basically a named group of procedural and SQL statements that returns a value (indicated by a RETURN statement in its program code). To create a function, you use the following syntax:

    CREATE FUNCTION function_name (argument IN data-type, ... ) RETURN data-type [IS]
    BEGIN
      PL/SQL statements;
      ...
      RETURN (value or expression);
    END;

Stored functions can be invoked only from within stored procedures or triggers and cannot be invoked from SQL statements (unless the function follows some very specific compliance rules). Remember not to confuse built-in SQL functions (such as MIN, MAX and AVG) with stored functions.

9.8 Embedded SQL

There is little doubt that SQL's popularity as a data manipulation language is in part due to its ease of use and its powerful data-retrieval capabilities. But in the real world, database systems are related to other systems and programs, and you still need a conventional programming language such as Visual Basic, .Net, C# or COBOL to integrate database systems with other programs and systems. If you are developing Web applications, you are most likely familiar with Java, ASP and .NET. Yet, almost regardless of the programming tools you use, if your Web application or Windows-based GUI system requires access to a database such as Microsoft Access, SQL Server, Oracle or DB2, you will likely need to use SQL to manipulate the data in the database.

Embedded SQL is a term used to refer to SQL statements that are contained within an application programming language such as VB.Net, C# and Java. The program being developed may be a standard binary executable in Windows or Linux, or it may be a Web application designed to run over the internet. No matter which language you use, if it contains embedded SQL statements, it is called the host language. Embedded SQL is still the most common approach to maintaining procedural capabilities in DBMS-based applications. However, mixing SQL with procedural languages requires that you understand some key differences between SQL and procedural languages:

Run-time mismatch: Remember that SQL is a non-procedural, interpreted language; that is, each instruction is parsed, its syntax is checked and it is executed one instruction at a time.¹ All of the processing takes place at the server side. Meanwhile, the host language is generally a binary-executable program (also known as a compiled program). The host program typically runs at the client side in its own memory space (which is different from the DBMS environment).

Processing mismatch: Conventional programming languages (COBOL, Ada, FORTRAN, Pascal, C++ and PL/I) process one data element at a time. Although you can use arrays to hold data, you still process the array elements one row at a time. This is especially true for file manipulation, where the host language typically manipulates data one record at a time. However, newer programming environments (such as Visual Studio .NET) have adopted several object-oriented extensions that help the programmer manipulate data sets in a cohesive manner.

Data type mismatch: SQL provides several data types, but some of those data types may not match data types used in different host languages (for example, the date and varchar2 data types).

¹ The authors are grateful for the thoughtful comments provided by Emil T. Cipolla, who teaches at Mount Saint Mary College and whose IBM experience is the basis for his considerable and practical expertise.

To bridge the differences, the Embedded SQL Standard² defines a framework to integrate SQL within several programming languages. The Embedded SQL framework defines the following:

A standard syntax to identify embedded SQL code within the host language (EXEC SQL/END-EXEC).

A standard syntax to identify host variables. Host variables are variables in the host language that receive data from the database (through the embedded SQL code) and process the data in the host language. All host variables are preceded by a colon (:).

A communication area used to exchange status and error information between SQL and the host language. This communication area contains two variables: SQLCODE and SQLSTATE.

Another way to interface host languages and SQL is through the use of a call level interface (CLI),³ in which the programmer writes to an application programming interface (API). A common CLI in Windows is provided by the Open Database Connectivity (ODBC) interface.

Online Content: The source code for all of the stored procedures shown in this section is available on the online platform for this book.

Before continuing, let's explore the process required to create and run an executable program with embedded SQL statements. If you have ever programmed in COBOL or C++, you will be familiar with the multiple steps required to generate the final executable program. Although the specific details vary among language and DBMS vendors, the following general steps are standard:

1. The programmer writes embedded SQL code within the host language instructions. The code follows the standard syntax required for the host language and embedded SQL.

2. A preprocessor is used to transform the embedded SQL into specialised procedure calls that are DBMS- and language-specific. The preprocessor is provided by the DBMS vendor and is specific to the host language.

3. The program is compiled using the host language compiler. The compiler creates an object code module for the program containing the DBMS procedure calls.

4. The object code is linked to the respective library modules and generates the executable program. This process binds the DBMS procedure calls to the DBMS run-time libraries. Additionally, the binding process typically creates an access plan module that contains instructions to run the embedded code at run time.

5. The executable is run, and the embedded SQL statement retrieves data from the database.

² https://crate.io/docs/sql-99/en/latest/chapters/39.html

³ www.oracle.com/database/technologies/appdev/oci.html

Note that you can embed individual SQL statements or even an entire PL/SQL block. Up to this point in the book, you have used a DBMS-provided application (SQL*Plus) to write SQL statements and PL/SQL blocks in an interpretive mode to address one-time or ad hoc data requests. However, it is extremely difficult and awkward to use ad hoc queries to process transactions inside a host language. Programmers typically embed SQL statements within a host language that is compiled once and executed as often as needed. To embed SQL into a host language, follow this syntax:

    EXEC SQL
      SQL statement;
    END-EXEC.

The preceding syntax works for SELECT, INSERT, UPDATE and DELETE statements. For example, the following embedded SQL code will delete employee 109, George Smith, from the EMPLOYEE table:

    EXEC SQL
      DELETE FROM EMPLOYEE WHERE EMP_NUM = 109;
    END-EXEC.

Remember, the preceding embedded SQL statement is compiled to generate an executable statement. Therefore, the statement is fixed permanently and cannot change (unless, of course, the programmer changes it). Each time the program runs, it deletes the same row. In short, the preceding code is good only for the first run; all subsequent runs will likely generate an error. Clearly, this code would be more useful if you could specify a variable to indicate the employee number to be deleted.

In embedded SQL, all host variables are preceded by a colon (:). The host variables may be used to send data from the host language to the embedded SQL, or they may be used to receive the data from the embedded SQL. To use a host variable, you must first declare it in the host language. Common practice is to use host variable names similar to the SQL source attributes. For example, if you are using COBOL, you would define the host variables in the Working Storage section. Then you would refer to them in the embedded SQL section by preceding them with a colon (:). For example, to delete an employee whose employee number is represented by the host variable W_EMP_NUM, you would write the following code:

    EXEC SQL
      DELETE FROM EMPLOYEE WHERE EMP_NUM = :W_EMP_NUM;
    END-EXEC.

At run time, the host variable value will be used to execute the embedded SQL statement. What happens if the employee you are trying to delete doesn't exist in the database? How do you know that the statement has been completed without errors? As mentioned previously, the embedded SQL standard defines a SQL communication area to hold status and error information. In COBOL, such an area is known as the SQLCA area and is defined in the Data Division as follows:

    EXEC SQL
      INCLUDE SQLCA
    END-EXEC.

The SQLCA area contains two variables for status and error reporting. Table 9.12 shows some of the main values returned by the variables and their meaning.

Table 9.12 SQL status and error reporting variables

Variable Name | Value | Explanation
SQLCODE | | Old-style error reporting supported for backward compatibility only; returns an integer value (positive or negative).
 | 0 | Successful completion of command.
 | 100 | No data; the SQL statement did not return any rows or did not select, update or delete any rows.
 | -999 | Any negative value indicates that an error occurred.
SQLSTATE | | Added by SQL-92 standard to provide predefined error codes; defined as a character string (5 characters long).
 | 00000 | Successful completion of command.
 | | Multiple values in the format XXYYY where: XX represents the class code; YYY represents the subclass code.

The following embedded SQL code illustrates the use of the SQLCODE within a COBOL program.

    EXEC SQL
      SELECT EMP_FNAME, EMP_LNAME INTO :W_EMP_FNAME, :W_EMP_LNAME
      FROM EMPLOYEE
      WHERE EMP_NUM = :W_EMP_NUM;
    END-EXEC.
    IF SQLCODE = 0 THEN
      PERFORM DATA_ROUTINE
    ELSE
      PERFORM ERROR_ROUTINE
    END-IF.

In that example, the SQLCODE host variable is checked to determine whether the query completed successfully. If that is the case, the DATA_ROUTINE is performed; otherwise, the ERROR_ROUTINE is performed.

Just as with PL/SQL, embedded SQL requires the use of cursors to hold data from a query that returns more than one value. If COBOL is used, the cursor can be declared either in the Working Storage Section or in the Procedure Division. The cursor must be declared and processed, as you learnt earlier in this chapter. To declare a cursor, you use the syntax shown in the following example:

    EXEC SQL
      DECLARE PROD_CURSOR FOR
      SELECT P_CODE, P_DESCRIPT
      FROM PRODUCT
      WHERE P_QOH > (SELECT AVG(P_QOH) FROM PRODUCT);
    END-EXEC.

Next, you open the cursor to make the cursor ready for processing:

    EXEC SQL
      OPEN PROD_CURSOR;
    END-EXEC.

To process the data rows in the cursor, you use the FETCH command to retrieve one row of data at a time and place the values in the host variables. The SQLCODE must be checked to ensure that the FETCH command completed successfully. This section of code typically constitutes part of a routine in the COBOL program. Such a routine is executed with the PERFORM command. For example:

    EXEC SQL
      FETCH PROD_CURSOR INTO :W_P_CODE, :W_P_DESCRIPT;
    END-EXEC.
    IF SQLCODE = 0 THEN
      PERFORM DATA_ROUTINE
    ELSE
      PERFORM ERROR_ROUTINE
    END-IF.

When all rows have been processed, you close the cursor as follows:

    EXEC SQL
      CLOSE PROD_CURSOR;
    END-EXEC.

Thus far, you have seen examples of embedded SQL in which the programmer used predefined SQL statements and parameters. Therefore, the end users of the programs are limited to the actions that were specified in the application programs. That style of embedded SQL is known as static SQL, meaning that the SQL statements will not change while the application is running. For example, the SQL statement may read like this:

    SELECT P_CODE, P_DESCRIPT, P_QOH, P_PRICE
    FROM PRODUCT
    WHERE P_PRICE > 100;

Note that the attributes, tables and conditions are known in the preceding SQL statement. Unfortunately, end users seldom work in a static environment. They are more likely to require the flexibility of defining their data access requirements on the fly. Therefore, the end user requires that SQL be as dynamic as the data access requirements.

Dynamic SQL is a term used to describe an environment in which the SQL statement is not known in advance; instead, the SQL statement is generated at run time. At run time in a dynamic SQL environment, a program can generate the SQL statements that are required to respond to ad hoc queries. In such an environment, neither the programmer nor the end user is likely to know precisely what kind of queries are to be generated or how those queries are to be structured. For example, a dynamic SQL equivalent of the preceding example could be:

    SELECT :W_ATTRIBUTE_LIST
    FROM :W_TABLE
    WHERE :W_CONDITION;

Note that the attribute list and the condition are not known until the end user specifies them. W_TABLE, W_ATTRIBUTE_LIST and W_CONDITION are text variables that contain the end-user input values used in the query generation. Because the program uses the end-user input to build the text variables, the end user can run the same program multiple times to generate different outputs. For example, in one instance, the end user might want to know which products have a price less than 100; in another case, the end user might want to know how many units of a given product are available for sale at any given moment.

Although dynamic SQL is clearly flexible, such flexibility carries a price. Dynamic SQL tends to be much slower than static SQL. In addition, dynamic SQL requires more computer resources (overhead). Finally, you are more likely to find different levels of support and incompatibilities among DBMS vendors.
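To make the idea of run-time statement generation concrete, here is a minimal sketch that uses Oracle's native dynamic SQL (EXECUTE IMMEDIATE) inside an anonymous PL/SQL block rather than the embedded COBOL form used above. The variable W_SQL, the COUNT(*) query and the hard-coded values standing in for end-user input are assumptions made for illustration.

```sql
-- Hypothetical sketch: the literal values stand in for end-user input.
SET SERVEROUTPUT ON

DECLARE
  W_TABLE      VARCHAR2(30)  := 'PRODUCT';
  W_CONDITION  VARCHAR2(200) := 'P_PRICE > 100';
  W_SQL        VARCHAR2(400);
  W_COUNT      NUMBER;
BEGIN
  -- The statement text does not exist until run time; it is assembled here.
  W_SQL := 'SELECT COUNT(*) FROM ' || W_TABLE || ' WHERE ' || W_CONDITION;
  -- The DBMS parses and executes the generated text on every run,
  -- which is one source of the extra overhead noted above.
  EXECUTE IMMEDIATE W_SQL INTO W_COUNT;
  DBMS_OUTPUT.PUT_LINE('Rows matching: ' || W_COUNT);
END;
/
```

Changing W_TABLE or W_CONDITION changes the query without recompiling the block. The same flexibility means that text taken directly from end users should be validated before it is concatenated into a statement.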

Note: Appendix O, Building a Simple Object-Relational Database using Oracle Objects, expands on the concepts of procedural SQL that were introduced in this chapter. The appendix briefly introduces Oracle objects and highlights some of the advanced object features that have been incorporated into Oracle's data model. It uses a simple example to illustrate how an object-relational database may be developed and how a simple relational database can be implemented using Oracle objects.

Summary

SQL provides relational set operators to combine the output of two queries to generate a new relation. The UNION and UNION ALL set operators combine the output of two (or more) queries and produce a new relation with all unique (UNION) or duplicate (UNION ALL) rows from both queries. The INTERSECT relational set operator selects only the common rows. The MINUS set operator selects only the rows that are different. UNION, INTERSECT and MINUS require union-compatible relations.

Operations that join tables can be classified as inner joins and outer joins. An inner join is the traditional join in which only rows that meet a given criteria are selected. An outer join returns the matching rows as well as the rows with unmatched attribute values for one table or both tables to be joined.

A natural join returns all rows with matching values in the matching columns and eliminates duplicate columns. This style of query is used when the tables share a common attribute with a common name. One important difference between the natural join syntax and the old-style join syntax is that the natural join does not require the use of a table qualifier for the common attributes.

Joins may use keywords such as USING and ON. If the USING clause is used, the query will return only the rows with matching values in the column indicated in the USING clause; that column must exist in both tables. If the ON clause is used, the query will return only the rows that meet the specified join condition.

Subqueries and correlated queries are used when it is necessary to process data based on other processed data. That is, the query uses results that were previously unknown and that are generated by another query. Subqueries may be used with the FROM, WHERE, IN and HAVING clauses in a SELECT statement. A subquery may return a single row or multiple rows.

Most subqueries are executed in a serial fashion. That is, the outer query initiates the data request; then the inner subquery is executed. In contrast, a correlated subquery is a subquery that is executed once for each row in the outer query. That process is similar to the typical nested loop in a programming language. A correlated subquery is so named because the inner query is related to the outer query: the inner query references a column of the outer subquery.

SQL functions are used to extract or transform data. The most frequently used functions are date and time functions. The results of the function output can be used to store values in a database table, to serve as the basis for the computation of derived variables, or to serve as a basis for data comparisons. Function formats can be vendor-specific. Aside from time and date functions, there are numeric and string functions, as well as conversion functions that convert one data format to another.

Oracle sequences may be used to generate values to be assigned to a record. For example, a sequence may be used to number invoices automatically. Microsoft Access uses an Autonumber data type to generate numeric sequences.

Procedural SQL (PL/SQL) can be used to create triggers, stored procedures and PL/SQL functions. A trigger is procedural SQL code that is automatically invoked by the DBMS upon the occurrence of a specified data manipulation event (UPDATE, INSERT or DELETE). Triggers are critical to proper database operation and management. They help automate various transaction and data management processes, and they can be used to enforce constraints that are not enforced at the DBMS design and implementation levels.

A stored procedure is a named collection of SQL statements. Just like database triggers, stored procedures are stored in the database. One of the major advantages of stored procedures is that they can be used to encapsulate and represent complete business transactions. Use of stored procedures substantially reduces network traffic and increases system performance. Stored procedures help reduce code duplication by creating unique PL/SQL modules that are called by the application programs, thereby minimising the chance of errors and the cost of application development and maintenance.

When SQL statements are designed to return more than one value inside the PL/SQL code, a cursor is needed. You can think of a cursor as a reserved area of memory in which the output of the query is stored, like an array holding columns and rows. Cursors are held in a reserved memory area in the DBMS server, rather than in the client computer. There are two types of cursors: implicit and explicit.

Embedded SQL refers to the use of SQL statements within an application programming language such as Visual Basic, .NET, C#, Python or Java. The language in which the SQL statements are embedded is called the host language. Embedded SQL is still the most common approach to maintaining procedural capabilities in DBMS-based applications.
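The summary's comparison of a correlated subquery to a nested loop can be made concrete with a short sketch. It is an illustration only: it assumes Python as the host language and the standard-library sqlite3 module in place of Oracle, and the sale table, its columns and its values are hypothetical rather than taken from the chapter's databases. The SQL finds every sale whose amount is above the average for its own region, and the Python loop repeats the same work by hand, running the "inner query" once for each row of the "outer query".

```python
import sqlite3

# Hypothetical in-memory table; SQLite stands in for the chapter's Oracle DBMS.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE sale (sale_id INTEGER PRIMARY KEY, region TEXT, amount REAL);
    INSERT INTO sale (sale_id, region, amount) VALUES
        (1, 'north', 100.0),
        (2, 'north', 300.0),
        (3, 'south', 50.0),
        (4, 'south', 70.0),
        (5, 'south', 90.0);
""")

# Correlated subquery: the inner query references s.region, a column of the
# outer query, so it is evaluated once for each outer row.
correlated_sql = """
    SELECT s.sale_id
      FROM sale s
     WHERE s.amount > (SELECT AVG(i.amount)
                         FROM sale i
                        WHERE i.region = s.region)
     ORDER BY s.sale_id
"""
via_sql = [row[0] for row in conn.execute(correlated_sql)]

# The same logic written as an explicit nested loop in the host language.
rows = conn.execute(
    "SELECT sale_id, region, amount FROM sale ORDER BY sale_id"
).fetchall()
via_loop = []
for sale_id, region, amount in rows:                  # outer query: each row
    same_region = [amt for _, reg, amt in rows if reg == region]  # inner query
    if amount > sum(same_region) / len(same_region):
        via_loop.append(sale_id)

print(via_sql)   # sale_id values chosen by the correlated subquery
print(via_loop)  # sale_id values chosen by the hand-written nested loop
```

Both lists contain the same sale_id values. In practice the DBMS optimiser is free to evaluate a correlated subquery more efficiently than a literal nested loop, so the loop is a way to reason about the result, not a statement about how any particular DBMS executes it.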

Key Terms

anonymous PL/SQL block
batch update routine
correlated subquery
cross join
cursor
dynamic SQL
embedded SQL
Event-Condition-Action (ECA) model
explicit cursor
host language
implicit cursor
inner join
outer join
persistent stored module (PSM)
procedural SQL (PL/SQL)
row-level trigger
statement-level trigger
static SQL
stored function
stored procedure
trigger
updatable view


Online Content: Answers to selected Review Questions and Problems for this chapter are available on the online platform for this book.

Further Reading

MySQL 8.0 Reference Manual [online]. Available: https://dev.mysql.com/doc/refman/8.0/en/ (2019).

Oracle Database 18c PL/SQL [online]. Available: www.oracle.com/technetwork/database/features/plsql/index.html, 2019.

Malepati, T., Shah, B. and Vanier, E. Advanced MySQL 8. O'Reilly, 2019.

Review Questions

1 The relational set operators UNION, INTERSECT and MINUS work properly only when the relations are union-compatible. What does union-compatible mean, and how would you check for this condition?

2 What is the difference between UNION and UNION ALL? Write the syntax for each.

3 Suppose you have two tables: EMPLOYEE and EMPLOYEE_1. The EMPLOYEE table contains the records for three employees: Alice Cordoza, John Cretchakov and Anne McDonald. The EMPLOYEE_1 table contains the records for employees John Cretchakov and Mary Chen. Given that information, what is the query output for the UNION query? (List the query output.)

4 Given the employee information in Question 3, what is the query output for the UNION ALL query? (List the query output.)

5 Given the employee information in Question 3, what is the query output for the INTERSECT query? (List the query output.)

6 Given the employee information in Question 3, what is the query output for the MINUS query? (List the query output.)

7 What is a CROSS JOIN? Give an example of its syntax.

8 Which three join types are included in the OUTER JOIN classification?

9 Using tables named T1 and T2, write a query example for each of the three join types you described in Question 8. Assume that T1 and T2 share a common column named C1.

10 What is a subquery, and what are its basic characteristics?

11 What is a correlated subquery? Give an example.

12 Which Microsoft Access/SQL Server function should you use to calculate the number of days between the current date and 25 January 2019?

13 Which Oracle function should you use to calculate the number of days between the current date and 25 January 2019?

14 Suppose a PRODUCT table contains two attributes, PROD_CODE and VEND_CODE. Those two attributes have values of ABC, 125, DEF, 124, GHI, 124, and JKL, 123, respectively. The VENDOR table contains a single attribute, VEND_CODE, with values 123, 124, 125 and 126, respectively. (The VEND_CODE attribute in the PRODUCT table is a foreign key to the VEND_CODE in the VENDOR table.) Given that information, what would be the query output for:

a A UNION query based on the two tables?

b A UNION ALL query based on the two tables?

c An INTERSECT query based on the two tables?

d A MINUS query based on the two tables?

15 Which Oracle string function should you use to list the first three characters of a company's EMP_LNAME values? Give an example using a table named EMPLOYEE.

16 What is an Oracle sequence? Write its syntax.

17 What is a trigger, and what is its purpose? Give an example.

18 What is a stored procedure, and why is it particularly useful? Give an example.

19 Give an example of a stored function. How would the function be called?

20 What are the four occasions on which Oracle recommends you use a trigger?

Problems

Online Content: The 'Ch09_SimpleCo' database is located on the online platform for this book, as are the script files to duplicate this data set in Oracle.

Use the database tables in Figure P9.1 as the basis for Problems 1-18.

Figure P9.1 Ch09_SimpleCo database tables

Database name: Ch09_SimpleCo

Table name: CUSTOMER

CUST_NUM  CUST_LNAME  CUST_FNAME  CUST_BALANCE
1000      Smith       Jeanne      1050.11
1001      Ortega      Juan        840.92

Table name: CUSTOMER_2

CUST_NUM  CUST_LNAME  CUST_FNAME
2000      McPherson   Anne
2001      Ortega      Juan
2002      Kowalski    Jan
2003      Chen        George

Table name: INVOICE

INV_NUM  CUST_NUM  INV_DATE   INV_AMOUNT
8000     1000      23-Mar-19  235.89
8001     1001      23-Mar-19  312.82
8002     1001      30-Mar-19  528.10
8003     1000      12-Apr-19  194.78
8004     1000      23-Apr-19  619.44

1 Create the tables. (Use Figure P9.1 to see which table names and attributes to use.)

2 Insert the data into the tables you created in Problem 1.

3 Write the query that will generate a combined list of customers (from the tables CUSTOMER and CUSTOMER_2) that does not include the duplicate customer records. (Note that only the customer named Juan Ortega shows up in both customer tables.)

4 Write the query that will generate a combined list of customers to include the duplicate customer records.

5 Write the query that will show only the duplicate customer records.

6 Write the query that will generate only the records that are unique to the CUSTOMER_2 table.

7 Write the query to show the invoice number, the customer number, the customer name, the invoice date and the invoice amount for all customers with a customer balance of 1 000 or more.

8 Write the query that will show the invoice number, the invoice amount, the average invoice amount and the difference between the average invoice amount and the actual invoice amount.

9 Write the query that will write Oracle sequences to produce automatic customer number and invoice number values. Start the customer numbers at 1000 and the invoice numbers at 5000.

10 Modify the CUSTOMER table to include two new attributes: CUST_DOB and CUST_AGE. Customer 1000 was born on 15 March 1969, and customer 1001 was born on 22 December 1978.

11 Assuming you completed Problem 10, write the query that lists the names and ages of your customers.

12 Assuming the CUSTOMER table contains a CUST_AGE attribute, write the query to update the values in that attribute. (Hint: Use the results of the previous query.)

13 Write the query that will list the average age of your customers. (Assume that the CUSTOMER table has been modified to include the CUST_DOB and the derived CUST_AGE attribute.)

14 Write the trigger to update the CUST_BALANCE in the CUSTOMER table when a new invoice record is entered. (Assume that the sale is a credit sale.) Test the trigger, using the following new INVOICE record: 8005, 1001, '27-APR-19', 225.40. Name the trigger trg_updatecustbalance.

15 Write a stored procedure to add a new customer to the CUSTOMER table. Use the following values in the new record: 1002, 'Rauthor', 'Peter', 0.00. Name the procedure prc_cust_add. Run a query to see if the record has been added.

16 Write a procedure to add a new invoice record to the INVOICE table. Use the following values in the new record: 8006, 1000, '30-APR-19', 301.72. Name the procedure prc_invoice_add. Run a query to see if the record has been added.

17 Write a trigger to update the customer balance when an invoice is deleted. Name the trigger trg_updatecustbalance2.

18 Write a procedure to delete an invoice, giving the invoice number as a parameter. Name the procedure prc_inv_delete. Test the procedure by deleting invoices 8005 and 8006.

Use the database tables in Figure P9.2 as the basis for Problems 19-26.

Figure P9.2 Ch09_publishing database tables

Table name: BOOK

ISBN       TITLE                       NUMBER_PAGES  PRICE  TYPE
72121333   Cell                        496           6.99   Fiction
8990765    Rough Guide to Prague       245           10.45  Reference
912122048  Oracle 18c Reference Guide  976           34.99  Reference
912934511  Oracle Backup & Recovery    399           54.50  Reference
935642189  Introduction to SQL         4990          19.99  Reference

Table name: AUTHOR

AUTHOR_ID  FIRST_NAME  LAST_NAME
1          Stephen     King
2          Michael     Abbey
3          Michael     Robinson
4          Kenny       Smith
5          Steph       Haisley
6          Mandla      Langa
7          Rushford    Majoy
8          Farmyi      Madagore

Table name: AUTHOR_BOOK

ISBN       AUTHOR_ID
72121333   1
8990765    6
8990765    7
912122048  2
912122048  3
912934511  4
912934511  5
935642189  8

Table name: STOCK

ISBN       STATUS    STATUS_DATE  QUANTITY
72121333   IN STOCK               54
8990765    IN STOCK               9
912122048  ON ORDER  12/05/2019   20
912934511  FUTURE    30/03/2019   32
935642189  ON ORDER  15/04/2019   50

19 Create the tables. (Use Figure P9.2 to see which table names and attributes to use.)

20 Insert the data into the tables you created in Problem 19.

21 Modify the BOOK table to include a new attribute that records the DATE_PUBLISHED. Write the SQL code required to update the DATE_PUBLISHED for the following books.

ISBN       DATE_PUBLISHED
72121333   12-MAR-19
912122048  23-NOV-19
912934511  12-MAY-19
935642189  11-JUNE-19

22 Write the query that will display the ISBN and title of all books that have been published for more than two years.

23 Write a query that creates a list of unique author book ids, using the first five characters of the author's last name and the first eight characters of the book title. Label the column AUTHOR_BOOK_ID.

24 Write an anonymous PL/SQL block that displays the maximum author_id currently held in the database and displays it to the screen.

25 Write an anonymous PL/SQL block to display the status date entered in the STOCK table for the book titled Oracle 18c Reference Guide.

26 Write an anonymous PL/SQL block that contains a simple cursor to display only the first three titles from the BOOK table. (Hint: use the cursor function %ROWCOUNT.)

Note: The following problem sets can serve as the basis for a class project or case.

Use the Ch09_SaleCo2 database to work Problems 27-31 (Figure P9.3).

Figure P9.3 Ch09_SaleCo database tables

Table name: CUSTOMER

CUS_CODE  CUS_LNAME   CUS_FNAME  CUS_INITIAL  CUS_AREACODE  CUS_PHONE  CUS_BALANCE
10010     Ramas       Alfred     A            0181          844-2573   0.00
10011     Dunne       Leona      K            0161          894-1238   0.00
10012     Moloi       Marlene    W            0181          894-2285   345.86
10013     Pieterse    Jaco       F            0181          894-2180   536.75
10014     Orlando     Myron                   0181          222-1672   0.00
10015     O'Brian     Amy        B            0161          442-3381   0.00
10016     Brown       James      G            0181          297-1228   221.19
10017     Williams    George                  0181          290-2556   768.93
10018     Padayachee  Vinaya     G            0161          382-7185   216.55
10019     Moloi       Mlilo      K            0181          297-3809   0.00

Table name: PRODUCT

P_CODE    P_DESCRIPT                               P_INDATE   P_QOH  P_MIN  P_PRICE  P_DISCOUNT  V_CODE
11QER/31  Power painter, 15 psi., 3-nozzle         03-Nov-18  8      5      109.99   0.00        25595
13-Q2/P2  7.25 cm pwr. saw blade                   13-Dec-18  32     15     14.99    0.05        21344
14-Q1/L3  9.00 cm pwr. saw blade                   13-Nov-18  18     12     17.49    0.00        21344
1546-QQ2  Hrd. cloth, 1/4 cm, 2 × 50               15-Jan-19  15     8      39.95    0.00        23119
1558-QW1  Hrd. cloth, 1/2 cm, 3 × 50               15-Jan-19  23     5      43.99    0.00        23119
2232/QTY  B&D jigsaw, 12 cm blade                  30-Dec-18  8      5      109.92   0.05        24288
2232/QWE  B&D jigsaw, 8 cm blade                   24-Dec-18  6      5      99.87    0.05        24288
2238/QPD  B&D cordless drill, 1/2 cm               20-Jan-19  12     5      38.95    0.05        25595
23109-HB  Claw hammer                              20-Jan-19  23     10     9.95     0.10        21225
23114-AA  Sledge hammer, 12 kg                     02-Jan-19  8      5      14.40    0.05
54778-2T  Rat-tail file, 1/8 cm fine               15-Dec-18  43     20     4.99     0.00        21344
89-WRE-Q  Hicut chain saw, 16 cm                   07-Feb-19  11     5      256.99   0.05        24288
PVC23DRT  PVC pipe, 3.5 cm, 8 m                    20-Feb-19  188    75     5.87     0.00
SM-18277  1.25 cm metal screw, 25                  01-Mar-19  172    75     6.99     0.00        21225
SW-23116  2.5 cm wd. screw, 50                     24-Feb-19  237    100    8.45     0.00        21231
WR3/TT3   Steel matting, 4 × 8 × 1/6 m, .5 m mesh  17-Jan-19  18     5      119.95   0.10        25595

Table name: VENDOR

V_CODE  V_NAME           V_CONTACT  V_AREACODE  V_PHONE   V_COUNTRY  V_ORDER
21225   Bryson, Inc.     Smithson   0181        223-3234  UK         Y
21226   SuperLoo, Inc.   Flushing   0113        215-8995  SA         N
21231   D&E Supply       Singh      0181        228-3245  UK         Y
21344   Jabavu Bros.     Ortega     0181        889-2546  SA         N
22567   Dome Supply      Smith      7253        678-1419  FR         N
23119   Randsets Ltd.    Anderson   7253        678-3998  FR         Y
24004   Brackman Bros.   Browning   0181        228-1410  UK         N
24288   ORDVA, Inc.      Hakford    0181        898-1234  UK         Y
25443   B&K, Inc.        Smith      0113        227-0093  SA         N
25501   Damal Supplies   Smythe     0181        890-3529  UK         N
25595   Rubicon Systems  Orton      0113        456-0092  SA         Y

Table name: INVOICE

INV_NUMBER  CUS_CODE  INV_DATE   INV_SUBTOTAL  INV_TAX  INV_TOTAL
1001        10014     16-Jan-19  24.90         1.99     26.89
1002        10011     16-Jan-19  9.98          0.80     10.78
1003        10012     16-Jan-19  153.85        12.31    166.16
1004        10011     17-Jan-19  34.97         2.80     37.77
1005        10018     17-Jan-19  70.44         5.64     76.08
1006        10014     17-Jan-19  397.83        31.83    429.66
1007        10015     17-Jan-19  34.97         2.80     37.77
1008        10011     17-Jan-19  399.15        31.93    431.08

Table name: LINE

INV_NUMBER  LINE_NUMBER  P_CODE    LINE_UNITS  LINE_PRICE  LINE_TOTAL
1001        1            13-Q2/P2  1           14.99       14.99
1001        2            23109-HB  1           9.95        9.95
1002        1            54778-2T  2           4.99        9.98
1003        1            2238/QPD  1           38.95       38.95
1003        2            1546-QQ2  1           39.95       39.95
1003        3            13-Q2/P2  5           14.99       74.95
1004        1            54778-2T  3           4.99        14.97
1004        2            23109-HB  2           9.95        19.90
1005        1            PVC23DRT  12          5.87        70.44
1006        1            SM-18277  3           6.99        20.97
1006        2            2232/QTY  1           109.92      109.92
1006        3            23109-HB  1           9.95        9.95
1006        4            89-WRE-Q  1           256.99      256.99
1007        1            13-Q2/P2  2           14.99       29.98
1007        2            54778-2T  1           4.99        4.99
1008        1            PVC23DRT  5           5.87        29.35
1008        2            WR3/TT3   3           119.95      359.85
1008        3            23109-HB  1           9.95        9.95

Online Content: The 'Ch09_SaleCo2' database used in Problems 27-31 is located on the online platform for this book, as are the script files to duplicate this data set in Oracle.

27 Create a trigger named trg_line_total to write the LINE_TOTAL value in the LINE table every time you add a new LINE row. (The LINE_TOTAL value is the product of the LINE_UNITS and the LINE_PRICE values.)

28 Create a trigger named trg_line_prod that will automatically update the product quantity on hand for each product sold after a new LINE row is added.

29 Create a stored procedure named prc_inv_amounts to update the INV_SUBTOTAL, INV_TAX and INV_TOTAL. The procedure takes the invoice number as a parameter. The INV_SUBTOTAL is the sum of the LINE_TOTAL amounts for the invoice, the INV_TAX is the product of the INV_SUBTOTAL and the tax rate (8%), and the INV_TOTAL is the sum of the INV_SUBTOTAL and the INV_TAX.

Use the Ch09_AviaCo database to work Problems 31-42 (Figure P9.4).

Figure P9.4 Ch09_AviaCo database tables

Table name: CHARTER

CHAR_TRIP  CHAR_DATE  AC_NUMBER  CHAR_DESTINATION  CHAR_DISTANCE  CHAR_HOURS_FLOWN  CHAR_HOURS_WAIT  CHAR_FUEL_GALLONS  CHAR_OIL_QTS  CUS_CODE
10001      05-Feb-19  2289L      ATL               936.00         5.1               2.2              354.1              1             10011
10002      05-Feb-19  2778V      BNA               320.00         1.6               0.0              72.6               0             10016
10003      05-Feb-19  4278Y      GNV               1574.00        7.8               0.0              339.8              2             10014
10004      06-Feb-19  1484P      STL               472.00         2.9               4.9              97.2               1             10019
10005      06-Feb-19  2289L      ATL               1023.00        5.7               3.5              397.7              2             10011
10006      06-Feb-19  4278Y      STL               472.00         2.6               5.2              117.1              0             10017
10007      06-Feb-19  2778V      GNV               1574.00        7.9               0.0              348.4              2             10012
10008      07-Feb-19  1484P      TYS               644.00         4.1               0.0              140.6              1             10014
10009      07-Feb-19  2289L      GNV               1574.00        6.6               23.4             459.9              0             10017
10010      07-Feb-19  4278Y      ATL               998.00         6.2               3.2              279.7              0             10016
10011      07-Feb-19  1484P      BNA               352.00         1.9               5.3              66.4               1             10012
10012      08-Feb-19  2778V      MOB               884.00         4.8               4.2              215.1              0             10010
10013      08-Feb-19  4278Y      TYS               644.00         3.9               4.5              174.3              1             10011
10014      09-Feb-19  4278Y      ATL               936.00         6.1               2.1              302.6              0             10017
10015      09-Feb-19  2289L      GNV               1645.00        6.7               0.0              459.5              2             10016
10016      09-Feb-19  2778V      MQY               312.00         1.5               0.0              67.2               0             10011
10017      10-Feb-19  1484P      STL               508.00         3.1               0.0              105.5              0             10014
10018      10-Feb-19  4278Y      TYS               644.00         3.8               4.5              167.4              0             10017

Table name: CUSTOMER

CUS_CODE  CUS_LNAME   CUS_FNAME  CUS_INITIAL  CUS_AREACODE  CUS_PHONE  CUS_BALANCE
10010     Ramas       Alfred     A            0181          844-2573   0.00
10011     Dunne       Leona      K            0161          894-1238   0.00
10012     Moloi       Marlene    W            0181          894-2285   896.54
10013     Pieterse    Jaco       F            0181          894-2180   1285.19
10014     Orlando     Myron                   0181          222-1672   673.21
10015     O'Brian     Amy        B            0161          442-3381   1014.56
10016     Brown       James      G            0181          297-1228   0.00
10017     Williams    George                  0181          290-2556   0.00
10018     Padayachee  Vinaya     G            0161          382-7185   0.00
10019     Moloi       Mlilo      K            0181          297-3809   453.98

Table name: EMPLOYEE

EMP_NUM  EMP_TITLE  EMP_LNAME   EMP_FNAME  EMP_INITIAL  EMP_DOB      EMP_HIRE_DATE
100      Mr         Kolmycz     George     D            15-Jun-1952  15-Mar-1997
101      Ms         Lewis       Rhonda     G            19-Mar-1975  25-Apr-1998
102      Mr         Vandam      Rhett                   14-Nov-1968  20-Dec-2002
103      Ms         Jones       Anne       M            16-Oct-1984  28-Aug-2015
104      Mr         Lange       John       P            08-Nov-1981  20-Oct-2006
105      Mr         Williams    Robert     D            14-Mar-1985  08-Jan-2016
106      Mrs        Duzak       Jeanine    K            12-Feb-1978  05-Jan-2001
107      Mr         Diante      Jorge      D            21-Aug-1984  02-Jul-2006
108      Mr         Wiesenbach  Paul       R            14-Feb-1976  18-Nov-2004
109      Ms         Travis      Elizabeth  K            18-Jun-1971  14-Apr-2001
110      Mrs        Genkazi     Leighla    W            19-May-1980  01-Dec-2002

Table name: CREW

CHAR_TRIP  EMP_NUM  CREW_JOB
10001      104      Pilot
10002      101      Pilot
10003      105      Pilot
10003      109      Copilot
10004      106      Pilot
10005      101      Pilot
10006      109      Pilot
10007      104      Pilot
10007      105      Copilot
10008      106      Pilot
10009      105      Pilot
10010      108      Pilot
10011      101      Pilot
10011      104      Copilot
10012      101      Pilot
10013      105      Pilot
10014      106      Pilot
10015      101      Copilot
10015      104      Pilot
10016      105      Copilot
10016      109      Pilot
10017      101      Pilot
10018      104      Copilot
10018      105      Pilot

Table name: AIRCRAFT

AC_NUMBER  MOD_CODE  AC_TTAF  AC_TTEL  AC_TTER
1484P      PA23-250  1833.10  1833.10  101.80
2289L      C-90A     4243.80  768.90   1123.40
2778V      PA31-350  7992.90  1513.10  789.50
4278Y      PA31-350  2147.30  622.10   243.20

Table name: PILOT

EMP_NUM  PIL_LICENSE  PIL_MED_TYPE  PIL_MED_DATE  PIL_PT135_DATE
101      ATP          1             12-Apr-2018   15-Jun-2018
104      ATP          1             10-Jun-2018   23-Mar-2019
105      COM          2             25-Feb-2018   12-Feb-2018
106      COM          2             02-Apr-2018   24-Dec-2019
109      COM          1             14-Apr-2018   21-Apr-2018

Table name: RATING

RTG_CODE  RTG_NAME
CFI       Certified Flight Instructor
CFII      Certified Flight Instructor, Instrument
INSTR     Instrument
MEL       Multiengine Land
SEL       Single Engine, Land
SES       Single Engine, Sea

Table name: MODEL

MOD_CODE  MOD_MANUFACTURER  MOD_NAME          MOD_SEATS  MOD_CHG_MILE
C-90A     Beechcraft        KingAir           8          2.67
PA23-250  Piper             Aztec             6          1.93
PA31-350  Piper             Navajo Chieftain  10         2.35

Table name: EARNED_RATING

EMP_NUM  RTG_CODE  EARNRTG_DATE
101      CFI       18-Feb-08
101      CFII      15-Dec-15
101      INSTR     08-Nov-03
101      MEL       23-Jun-04
101      SEL       21-Apr-03
104      INSTR     15-Jul-06
104      MEL       29-Jan-07
104      SEL       12-Mar-05

EMP_NUM  RTG_CODE  EARNRTG_DATE
105      CFI       18-Nov-07
105      INSTR     17-Apr-05
105      MEL       12-Aug-05
105      SEL       23-Sep-04
106      INSTR     20-Dec-05
106      MEL       02-Apr-06
106      SEL       10-Mar-04
109      CFI       05-Nov-08
109      CFII      21-Jun-13
109      INSTR     23-Jul-06
109      MEL       15-Mar-07
109      SEL       05-Feb-06
109      SES       12-May-06

30 Create a procedure named prc_cus_balance_update that will take the invoice number as a parameter and update the customer balance. (Hint: You can use the DECLARE section to define a TOTINV numeric variable that holds the computed invoice total.)

Online Content: The 'Ch09_AviaCo' database used for Problems 31-42 is located on the online platform for this book, as are the script files to duplicate this data set in Oracle.

31 Modify the MODEL table to add the attribute and insert the values shown in the following table.

Attribute Name  Attribute Description                   Attribute Type  Attribute Values
MOD_WAIT_CHG    Waiting charge per hour for each model  Numeric         100 for C-90A
                                                                        50 for PA23-250
                                                                        75 for PA31-350

32 Write the queries to update the MOD_WAIT_CHG attribute values based on Problem 31.

33 Modify the CHARTER table to add the attributes shown in the following table.

Attribute Name   Attribute Description                                                                                  Attribute Type
CHAR_WAIT_CHG    Waiting charge for each model (copied from the MODEL table)                                            Numeric
CHAR_FLT_CHG_HR  Flight charge per mile for each model (copied from the MODEL table using the MOD_CHG_MILE attribute)   Numeric
CHAR_FLT_CHG     Flight charge (calculated by CHAR_HOURS_FLOWN × CHAR_FLT_CHG_HR)                                       Numeric
CHAR_TAX_CHG     CHAR_FLT_CHG × tax rate (8%)                                                                           Numeric
CHAR_TOT_CHG     CHAR_FLT_CHG + CHAR_TAX_CHG                                                                            Numeric
CHAR_PYMT        Amount paid by customer                                                                                Numeric
CHAR_BALANCE     Balance remaining after payment                                                                        Numeric

34 Write the sequence of commands required to update the CHAR_WAIT_CHG attribute values in the CHARTER table. (Hint: Use either an updatable view or a stored procedure.)

35 Write the sequence of commands required to update the CHAR_FLT_CHG_HR attribute values in the CHARTER table. (Hint: Use either an updatable view or a stored procedure.)

36 Write the command required to update the CHAR_FLT_CHG attribute values in the CHARTER table.

37 Write the command required to update the CHAR_TAX_CHG attribute values in the CHARTER table.

38 Write the command required to update the CHAR_TOT_CHG attribute values in the CHARTER table.

39 Modify the PILOT table to add the attribute shown in the following table.

Attribute Name  Attribute Description                                                                                                      Attribute Type
PIL_PIC_HRS     Pilot in command (PIC) hours; updated by adding the CHARTER table's CHAR_HOURS_FLOWN to the PIL_PIC_HRS when the CREW table shows the CREW_JOB to be pilot  Numeric

40 Create a trigger named trg_char_hours that automatically updates the AIRCRAFT table when a new CHARTER row is added. Use the CHARTER table's CHAR_HOURS_FLOWN to update the AIRCRAFT table's AC_TTAF, AC_TTEL and AC_TTER values.

41 Create a trigger named trg_pic_hours that automatically updates the PILOT table when a new CREW row is added. Use the CHARTER table's CHAR_HOURS_FLOWN to update the PILOT table's PIL_PIC_HRS only when the CREW table uses a pilot CREW_JOB entry.

42 Create a trigger named trg_cust_balance that automatically updates the CUSTOMER table's CUST_BALANCE when a new CHARTER row is added. Use the CHARTER table's CHAR_TOT_CHG as the update source. (Assume that all charter charges are charged to the customer balance.)

Chapter

9 Procedural

Language

SQL and

Advanced

SQL

515

Case

EliteVideo is a start-up company providing a concierge DVD kiosk service in upscale neighbourhoods. EliteVideo can own several copies (VIDEO) of each movie (MOVIE). For example, a kiosk may have 10 copies of the movie Cry, the Beloved Country. In the database, Cry, the Beloved Country would be one MOVIE, and each copy would be a VIDEO. A rental transaction (RENTAL) involves one or more videos being rented to a member (MEMBERSHIP). A video can be rented many times over its lifetime; therefore, there is an M:N relationship between RENTAL and VIDEO. DETAILRENTAL is the bridge table to resolve this relationship. The complete ERD is provided in Figure P9.5.
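The role of the bridge table can be illustrated with a minimal sketch. It uses Python's built-in sqlite3 module rather than the DBMS the chapter assumes, and it keeps only the key columns (RENT_NUM, VID_NUM, MOVIE_NUM, RENT_DATE); the function names, the ISO date format and the reduced column list are assumptions made for illustration, not the structure required by Figure P9.5.

```python
import sqlite3


def build_rental_schema(conn):
    """Create a cut-down RENTAL / VIDEO / DETAILRENTAL schema (illustration only)."""
    # SQLite only enforces foreign keys when this pragma is switched on.
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript(
        """
        CREATE TABLE RENTAL (
            RENT_NUM  INTEGER PRIMARY KEY,
            RENT_DATE TEXT NOT NULL
        );
        CREATE TABLE VIDEO (
            VID_NUM   INTEGER PRIMARY KEY,
            MOVIE_NUM INTEGER NOT NULL
        );
        -- Bridge table: one row per (rental, video) pair resolves the M:N relationship.
        CREATE TABLE DETAILRENTAL (
            RENT_NUM INTEGER NOT NULL REFERENCES RENTAL (RENT_NUM),
            VID_NUM  INTEGER NOT NULL REFERENCES VIDEO (VID_NUM),
            PRIMARY KEY (RENT_NUM, VID_NUM)
        );
        """
    )


def load_sample_rows(conn):
    """Insert a few rows taken from the chapter's sample data."""
    with conn:
        conn.executemany("INSERT INTO RENTAL VALUES (?, ?)",
                         [(1001, "2020-03-01"), (1005, "2020-03-02")])
        conn.executemany("INSERT INTO VIDEO VALUES (?, ?)",
                         [(34342, 1235), (61353, 1245)])
        conn.executemany("INSERT INTO DETAILRENTAL VALUES (?, ?)",
                         [(1001, 34342), (1001, 61353), (1005, 34342)])


def videos_in_rental(conn, rent_num):
    """One rental can contain many videos."""
    rows = conn.execute(
        "SELECT VID_NUM FROM DETAILRENTAL WHERE RENT_NUM = ? ORDER BY VID_NUM",
        (rent_num,))
    return [row[0] for row in rows]


def rentals_of_video(conn, vid_num):
    """One video can appear in many rentals over its lifetime."""
    rows = conn.execute(
        "SELECT RENT_NUM FROM DETAILRENTAL WHERE VID_NUM = ? ORDER BY RENT_NUM",
        (vid_num,))
    return [row[0] for row in rows]


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    build_rental_schema(conn)
    load_sample_rows(conn)
    print(videos_in_rental(conn, 1001))
    print(rentals_of_video(conn, 34342))
```

The composite primary key on DETAILRENTAL is the design choice that matters here: it lets the same video appear under several rentals and the same rental list several videos, while preventing a video from being listed twice on one rental.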

Figure P9.5 The Ch09_MovieCo ERD

43

Write the SQL code to create the table structures for the entities shown in Figure P9.5. The structures should contain the attributes specified in the ERD. Use data types that are appropriate for the data that will need to be stored in each attribute. Enforce primary key and foreign key constraints as indicated by the ERD.

44

The following tables provide a very small portion of the data that will be kept in the database. The data needs to be inserted into the database for testing purposes. Write the INSERT commands necessary to place the following data in the tables that were created in Problem 43. (If required by your DBMS, be certain to save the rows permanently.)

Table P9.1 Membership table

MEM_NUM    MEM_FNAME    MEM_LNAME    MEM_STREET    MEM_CITY    MEM_PROV    MEM_POSTALCODE    MEM_BALANCE

5200

110

KZN

4001

60

Pretoria

Gauteng

0001

0

Musket Ball Circle

Cape Town

Western Cape

7100

150

Maxwell Place

Durban

KZN

4001

0

Polokwane

Limpopo

0700

50

Bloemfontein

Free State

9300

0

26 Takli

Dawson

Circle

102

Tami

103

Koert

Wessels

45

104

Jamal

Melendez

78 East 145th

105

Palesa

Mamorobela

60

106

Nasima

Carrim

107

Rose

Ledimo

108

Mattie

Smith

430 Evergreen

109

Clint

Taylor

171

110

Thabang

Moroe

24 Southwind

111

Stacy

Mann

89

112

Louis

Du Toit

113

Sulaiyman

Philander

tabLe

p9.2

446

Cornell

Court

78 Danner

26 430

Elm

Avenue

Avenue Street

Street

East

East London

Eastern

Durban

Cape Circle

Cook

Avenue

Town

Western

Johannesburg

Gauteng

Drive

Cape

Upington

Northern

Mbombela

Mpumalanga

Melvin Avenue Vasili

Cape

Polokwane

Cape

Limpopo

7100

100

2001

0

8801

80

1200

30

0700

0

Table P9.2 Rental table

RENT_NUM    RENT_DATE    MEM_NUM
1001        01-MAR-20    103
1002        01-MAR-20    105
1003        02-MAR-20    102
1004        02-MAR-20    110
1005        02-MAR-20    111
1006        02-MAR-20    107
1007        02-MAR-20    104
1008        03-MAR-20    105
1009        03-MAR-20    111

Table P9.3 Detailrental table

RENT_NUM    VID_NUM    DETAIL_FEE    DETAIL_DUEDATE    DETAIL_RETURNDATE    DETAIL_DAILYLATEFEE
1001        34342      20            04-MAR-20         02-MAR-20            10
1001        61353      20            04-MAR-20         03-MAR-20            10
1002        59237      35            04-MAR-20         04-MAR-20            30
1003        54325      35            04-MAR-20         09-MAR-20            30
1003        61369      20            06-MAR-20         09-MAR-20            10
1003        61388      0             06-MAR-20         09-MAR-20            10
1004        44392      35            05-MAR-20         07-MAR-20            30
1004        34367      35            05-MAR-20         07-MAR-20            30
1004        34341      20            07-MAR-20         07-MAR-20            10
1005        34342      20            07-MAR-20         05-MAR-20            10
1005        44397      35            05-MAR-20         05-MAR-20            30
1006        34366      35            05-MAR-20         04-MAR-20            30
1006        61367      20            07-MAR-20                              10
1007        34368      35            05-MAR-20                              30
1008        34369      35            05-MAR-20         05-MAR-20            30
1009        54324      35            05-MAR-20                              30
1001        34366      35            04-MAR-20         02-MAR-20            30

Table P9.4 Video table

VID_NUM    VID_INDATE    MOVIE_NUM
54321      18-JUN-19     1234
54324      18-JUN-19     1234
54325      18-JUN-19     1234
34341      22-JAN-18     1235
34342      22-JAN-18     1235
34366      02-MAR-20     1236
34367      02-MAR-20     1236
34368      02-MAR-20     1236
34369      02-MAR-20     1236
44392      21-OCT-19     1237
44397      21-OCT-19     1237
59237      14-FEB-20     1237
61388      25-JAN-18     1239
61353      28-JAN-17     1245
61354      28-JAN-17     1245
61367      30-JUL-19     1246
61369      30-JUL-19     1246

Table P9.5 Movie table

MOVIE_NUM    MOVIE_TITLE                   MOVIE_YEAR    MOVIE_COST    MOVIE_GENRE    PRICE_CODE
1234         The Cesar Family Christmas    2016          39.95         FAMILY         2
1235         Smokey Mountain Wildlife      2013          59.95         ACTION         1
1236         Richard Goodhope              2017          59.95         DRAMA          2
1237         Beatnik Fever                 2016          29.95         COMEDY         2
1238         Constant Companion            2017          89.95         DRAMA
1239         Where Hope Dies               2007          25.49         DRAMA          3
1245         Time to Burn                  2014          45.49         ACTION         1
1246         What He Doesn't Know          2015          58.29         COMEDY         1

Table P9.6 Price table

PRICE_CODE    PRICE_DESCRIPTION    PRICE_RENTFEE    PRICE_DAILYLATEFEE
1             Standard             20               10
2             New Release          35               30
3             Discount             15               10
4             Weekly Special       10               05

For Questions 45–59, use the tables that were created in Problem 43 and the data that was loaded into those tables in Problem 44.

45

Write the SQL command to change the movie year for movie number 1245 to 2014.

46

Write the SQL command to change the price code for all action movies to price code 3.

47

Write a single SQL command to increase all price rental fee values in the PRICE table by ZAR7.00.

48

Alter the DETAILRENTAL table to include a derived attribute named DETAIL_DAYSLATE to store integers of up to three digits. The attribute should accept null values.

49

Update the DETAILRENTAL table to set the values in DETAIL_RETURNDATE to include a time component. Make each entry match the values shown in the following table.

Table P9.7 Updates for the Detailrental table

RENT_NUM    VID_NUM    DETAIL_RETURNDATE
1001        34342      02-MAR-20 10:00am
1001        61353      03-MAR-20 11:30am
1002        59237      04-MAR-20 03:30pm
1003        54325      09-MAR-20 04:00pm
1003        61369      09-MAR-20 04:00pm
1003        61388      09-MAR-20 04:00pm
1004        44392      07-MAR-20 09:00am
1004        34367      07-MAR-20 09:00am
1004        34341      07-MAR-20 09:00am
1005        34342      05-MAR-20 12:30pm
1005        44397      05-MAR-20 12:30pm
1006        34366      04-MAR-20 10:15pm
1006        61367
1007        34368
1008        34369      05-MAR-20 09:30pm
1009        54324
1001        34366      02-MAR-20 10:00am

50

Alter the VIDEO table to include an attribute named VID_STATUS to store character data up to four characters long. The attribute should have a constraint to enforce the domain (IN, OUT and LOST) and have a default value of IN.

51

Update the VID_STATUS attribute of the VIDEO table using a subquery to set the VID_STATUS to OUT for all videos that have a null value in the DETAIL_RETURNDATE attribute of the DETAILRENTAL table.

52

Alter the PRICE table to include an attribute named PRICE_RENTDAYS to store integers of up to two digits. The attribute should not accept null values, and it should have a default value of 3.

53

Update the PRICE table to place the values shown in the following table in the PRICE_RENTDAYS attribute.

Table P9.8 Updates for the price table

PRICE_CODE    PRICE_RENTDAYS
1             50
2             30
3             50
4             70

54

Create a trigger named trg_late_return that will write the correct value to DETAIL_DAYSLATE in the DETAILRENTAL table whenever a video is returned. The trigger should execute as a BEFORE trigger when the DETAIL_RETURNDATE or DETAIL_DUEDATE attributes are updated. The trigger should satisfy the following conditions:

a. If the return date is null, then the days late should also be null.
b. If the return date is not null, then the days late should determine if the video is returned late.
c. If the return date is noon of the day after the due date or earlier, then the video is not considered late, and the days late should have a value of zero (0).
d. If the return date is past noon of the day after the due date, then the video is considered late, so the number of days late must be calculated and stored.

55

Create a trigger named trg_mem_balance that will maintain the correct value in the membership balance in the MEMBERSHIP table when videos are returned late. The trigger should execute as an AFTER trigger when the due date or return date attributes are updated in the DETAILRENTAL table. The trigger should satisfy the following conditions:

a. Calculate the value of the late fee prior to the update that triggered this execution of the trigger. The value of the late fee is the days late multiplied by the daily late fee. If the previous value of the late fee was null, then treat it as zero (0).
b. Calculate the value of the late fee after the update that triggered this execution of the trigger. If the value of the late fee is now null, then treat it as zero (0).
c. Subtract the prior value of the late fee from the current value of the late fee to determine the change in late fee for this video rental.
d. If the amount calculated in Part c is not zero (0), then update the membership balance by the amount calculated for the membership associated with this rental.

56

Create a sequence named rent_num_seq to start with 1100 and increment by 1. Do not cache any values.

57

Create a stored procedure named prc_new_rental to insert new rows in the RENTAL table. The procedure should satisfy the following conditions:

a. The membership number will be provided as a parameter.
b. Use a Count() function to verify that the membership number exists in the MEMBERSHIP table. If it does not exist, then a message should be displayed that the membership does not exist and no data should be written to the database.
c. If the membership does exist, then retrieve the membership balance and display a message that the balance amount is the previous balance. (For example, if the membership has a balance of R5.00, then display "Previous balance: R5.00".)
d. Insert a new row in the rental table using the rent_num_seq sequence created above to generate the value for RENT_NUM, the current system date for the RENT_DATE value, and the membership number provided as the value for MEM_NUM.

58

Create a stored procedure named prc_new_detail to insert new rows in the DETAILRENTAL table. The procedure should satisfy the following requirements:

a. The video number will be provided as a parameter.
b. Verify that the video number exists in the VIDEO table. If it does not exist, then display a message that the video does not exist, and do not write any data to the database.
c. If the video number does exist, then verify that the VID_STATUS for the video is IN. If the status is not IN, then display a message that the return of the video must be entered before it can be rented again, and do not write any data to the database.
d. If the status is IN, then retrieve the values of the PRICE_RENTFEE, PRICE_DAILYLATEFEE and PRICE_RENTDAYS associated with the video from the PRICE table.
e. Calculate the due date for the video rental by adding the number of days in PRICE_RENTDAYS to 11:59:59PM (hours:minutes:seconds) on the current system date.
f. Insert a new row in the DETAILRENTAL table using the previous value returned by RENT_NUM_SEQ as the RENT_NUM, the video number provided in the parameter as the VID_NUM, the PRICE_RENTFEE as the value for DETAIL_FEE, the due date calculated above for the DETAIL_DUEDATE, PRICE_DAILYLATEFEE as the value for DETAIL_DAILYLATEFEE, and null for the DETAIL_RETURNDATE.

59

Create a stored procedure named prc_return_video to enter data about the return of videos that have been rented. The procedure should satisfy the following requirements:

a. The video number will be provided as a parameter.
b. Verify that the video number exists in the VIDEO table. If it does not exist, display a message that the video number provided was not found, and do not write any data to the database.
c. If the video number does exist, then use a Count() function to ensure that the video has only one record in DETAILRENTAL for which it does not have a return date. If more than one row in DETAILRENTAL indicates that the video is rented but not returned, display an error message that the video has multiple outstanding rentals, and do not write any data to the database.
d. If the video does not have any outstanding rentals, then update the video status to IN for the video in the VIDEO table, and display a message that the video had no outstanding rentals but is now available for rental. If the video has only one outstanding rental, then update the return date to the current system date, and update the video status to IN for that video in the VIDEO table. Then display a message that the video was successfully returned.
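The noon-of-the-next-day grace rule that trg_late_return has to apply can be sketched outside the database to make the boundary cases concrete. The sketch below is in Python, not in the trigger language the problem asks for. The function name days_late is hypothetical, and counting lateness in whole calendar days between the due date and the return date is an assumption, since the problem statement does not fix how the number of days is derived.

```python
from datetime import datetime, time, timedelta


def days_late(due, returned):
    """Return the days late for one rental line, or None if not yet returned.

    Assumption: lateness is counted in whole calendar days between the due
    date and the return date; the problem text leaves the formula open.
    """
    if returned is None:
        # Condition a: no return date means days late stays null.
        return None
    # Conditions b and c: returns up to noon of the day after the due date are on time.
    grace_end = datetime.combine(due.date() + timedelta(days=1), time(12, 0))
    if returned <= grace_end:
        return 0
    # Condition d: past the grace period, so compute and store the days late.
    return (returned.date() - due.date()).days


if __name__ == "__main__":
    due = datetime(2020, 3, 4, 23, 59, 59)
    print(days_late(due, None))
    print(days_late(due, datetime(2020, 3, 5, 12, 0)))
    print(days_late(due, datetime(2020, 3, 9, 16, 0)))
```

A trigger written for a specific DBMS would apply the same three branches to the new row's DETAIL_DUEDATE and DETAIL_RETURNDATE values before the row is stored.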

Part IV

Database Design

10 Database Development Process

11 Conceptual, Logical, and Physical Database Design


Business Vignette

EM-DAT: The International Disaster Database for Disaster Preparedness

In 1998, the first emergency events database known as EM-DAT was set up by the WHO Collaborating Centre for Research on the Epidemiology of Disasters (CRED) with support from the Belgian Government. The purpose of the database was to aid in decision making for disaster preparedness, as well as to provide an objective base for vulnerability assessment and priority setting.1 During the last few years EM-DAT has become the main global reference database.

EM-DAT stores information on over 22 000 disasters that have occurred across the world from 1900 to the present day. Data is collected from many sources such as the United Nations, governments and the International Red Cross. The data is of various quality, so it is constantly checked for inconsistencies, data redundancy and incompleteness. Each natural disaster is recorded using a unique disaster number identifier, the disaster type, subtypes, associated disasters, start and end dates, and location. Disasters are classified into 15 types of natural disasters (and more than 30 subtypes) and technological disasters, which cover 15 disaster types. This means that if a natural disaster affects a number of countries, all the data that are collected from each country can be recorded under one unique reference number. For example, the 2004 tsunami in South East Asia affected 13 different countries but is recorded as one single event. From the database, disaster-related economic damage estimates can be obtained, along with details of international aid contributions for specific disasters.

Each year EM-DAT aids CRED in conducting a review of disaster events throughout the year. For example, in 2018 there were 281 climate-related and geophysical events recorded in EM-DAT, with 10 733 deaths and over 60 million people affected across the world.2 Data analytics is used to produce summary tables of people in different geographical locations who have been affected by specific disaster types.

1 Information about EM-DAT is available: www.emdat.be/database
2 2018 Review Of Disaster Events, Centre for Research on the Epidemiology of Disasters, 2019.

The 2018 review concludes

better

disaster

a need

to

The

ensure

disasters

which

effective

effective

data is

of good

3

Copyright review

2020 has

Cengage deemed

and

any

disaster

planning.

and

education

for risk

of

global

risk.

is to

Risk

management.

The

governments

Wahlstrm,

Reduction

You

to

mitigate the

Margareta

Disaster

natural

measures.3

a tool for

The intention

in reducing

disaster

effects reduction

for

quality,

Reduction

(UNISDR),

20152030

The four

disaster

action

risk

Back

actions

is

is recorded

which

states

cannot

to

and Priority

manage

it

and from

has

the

a people-centred 1.

manage

to

disaster

risk . . . Priority

3.

preparedness

and reconstruction.4

been

trustworthy

approach

disaster

rehabilitation never

Sendai

Understanding

disaster

4. Enhancing

in recovery,

Hence,

accurately

adopts

is implementing

are: Priority

governance

Better

data.

currently

priorities

for resilience,

to Build

these

Strategy

for

Framework

Learning. that

disaster

Secretary-General

this to

that there is

more important

sources

to

enable

to its

ensure

true

worth

effectively.

International Sendai

of

attributing

be a focus. and

and provides

successful

Reduction

risk reduction

completing

be utilised

4

Risk reduction.

response

to

should

occurrences

years,

acknowledges

be identified

of the

Risk

2. Strengthening

disaster

Central

to

Editorial

risk

in

measures

to

collection

development

to

required

critical

Disaster

Disaster

Priority

for

for

for

Investing

data

previous

the report

measure.

disaster

risk . . .

is

than in

However,

of the

aid in the

Representative

Office

Framework reducing

to

the funding

to information

UN

consistent

populations

Special

what you cannot The

used

was lower

standards.

statistics

life through

Nations

Access

and

provides

vulnerable

determine

death toll

and living

complete

are then

allows

of human

United

2018 the

website

use in order to

that

that

EM-DAT

information

loss

that in

management

All suppressed

Rights

for

Reserved. content

does

Disaster Disaster

May not

not materially

be

Reduction. Risk

copied, affect

scanned, the

overall

Available:

Reduction.

or

duplicated, learning

www.unisdr.org/disaster-statistics/introduction.htm

Available:

in experience.

whole

or in Cengage

part.

www.unisdr.org/we/coordinate/sendai-framework


Chapter 10 Database Development Process

In this chapter, you will learn:

That successful database design must reflect the information system of which the database is a part

That successful information systems are developed within a framework known as the Systems Development Life Cycle (SDLC)

That, within the information system, the most successful databases are subject to frequent evaluation and revision within a framework known as the Database Life Cycle (DBLC)

How to conduct evaluation and revision within the SDLC and DBLC frameworks

About database design strategies: top-down vs bottom-up design and centralised vs decentralised design

Common threats to the security of the data and which security measures could be put in place

The importance of database administration in an organisation

The technical and managerial roles of the database administrator (DBA)

Preview

Databases are a part of a larger picture called an information system. Database designs that fail to recognise that the database is part of this larger whole are not likely to be successful. That is, database designers must recognise that the database is a critical means to an end rather than an end in itself. Managers want the database to serve their management needs, but too many databases seem to require that managers alter their routines to fit the database requirements.

Information systems don't just happen; they are the product of a carefully staged development process. Systems analysis is used to determine the need for an information system and to establish its limits. Within systems analysis, the actual information system is created through a process known as systems development.

The creation and evolution of information systems follows an iterative pattern called the Systems Development Life Cycle, a continuous process of creation, maintenance, enhancement and replacement of the information system. A similar cycle applies to databases. The database is created, maintained and enhanced, and eventually replaced.

This chapter explores two

security.

Data

data security

also is

could

data and about fully

very

important

and is

critical

have serious implications.

and

within

chapter,

and technical

to

protect

an organisation

you

roles

will learn

of the

issues: to

the

You willlearn

can be adopted

accepted

In this

managerial

resource

measures that

understood

be implemented. the

briefly

a corporate

about

database

database

which threats the

before

administration

organisation.

data.

important

data

administrator

a

data

breach

can affect the security

Database

a sound

and

Therefore,

data

administration

administration

management

in

of the must be

strategy

issues

can

by looking

at

(DBA).


10.1 The Information System

A database is a carefully designed and constructed repository of facts. The fact repository is a part of a larger whole, known as an information system. An information system provides for data collection, storage and retrieval. It also facilitates the transformation of data into information and the management of both data and information. Thus, a complete information system is composed of people, hardware, software, the database(s), application programs and procedures. Systems analysis is the process that establishes the need for and the scope of an information system. The process of creating an information system is known as systems development.

Note

This chapter is not meant to cover all aspects of systems analysis and development; these are usually covered in a separate course or book. However, this chapter should help you develop a better understanding of database design, implementation and management issues that are affected by the information system in which the database is a critical component.

Within the framework of systems development, applications transform data into the information that forms the basis of decision making. Applications usually produce formal reports, tabulations and graphic displays designed to produce insight. Figure 10.1 illustrates that every application is composed of two parts: the data and the code (program instructions) by which the data are transformed into information. Data and code work together to represent real-world business functions and activities. At any given moment, physically stored data represent a snapshot of the business. But the picture is not complete without an understanding of the business activities that are represented by the code.
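The split between data and code can be made concrete with a small sketch. The rows below reuse the rental data from Table P9.2 in Chapter 9 purely as sample data; the function name rentals_per_day and the choice of a per-day tabulation are assumptions made for illustration, not something the chapter prescribes.

```python
from collections import Counter

# The data: a snapshot of stored facts, as (RENT_NUM, RENT_DATE, MEM_NUM) rows.
RENTAL_ROWS = [
    (1001, "01-MAR-20", 103),
    (1002, "01-MAR-20", 105),
    (1003, "02-MAR-20", 102),
    (1004, "02-MAR-20", 110),
    (1005, "02-MAR-20", 111),
    (1006, "02-MAR-20", 107),
    (1007, "02-MAR-20", 104),
    (1008, "03-MAR-20", 105),
    (1009, "03-MAR-20", 111),
]


def rentals_per_day(rows):
    """The code: turn raw rental rows into a tabulation that supports a decision."""
    return dict(Counter(rent_date for _, rent_date, _ in rows))


if __name__ == "__main__":
    for rent_date, count in rentals_per_day(RENTAL_ROWS).items():
        print(f"{rent_date}: {count} rental(s)")
```

The rows alone say nothing about which days are busy; the tabulation produced by the code is the information a manager could act on, for example when deciding how to stock a kiosk.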

Figure 10.1 Generating information for decision making

[Figure labels: Data, Application code, Information, Decisions]

The performance of an information system depends on a triad of factors:

Database design and implementation.

Application design and implementation.

Administrative procedures.

Although this book emphasises the database segment of the triad (arguably the most important), failure to address the other two segments will likely yield a poorly functioning information system. Creating a sound information system is hard work: systems analysis and development require much planning to ensure that all of the activities interface with one another, that they complement one another, and that they are completed on time.

In a broad sense, the term database development describes the process of database design and implementation. The primary objective in database design is to create complete, normalised, nonredundant (to the extent possible) and fully integrated conceptual, logical and physical database models. The implementation phase includes creating the database storage structure, loading data into the database and providing for data management.

To make the procedures discussed in this chapter broadly applicable, the chapter focuses on the elements that are common to all information systems. Most of the processes and procedures described in this chapter do not depend on the size, type or complexity of the database being implemented. However, the procedures used to design a small database, such as one for a local shoe shop, do not precisely scale up to the procedures needed to design a database for a large corporation or even a segment of such a corporation. To use an analogy, building a small house requires a blueprint, just as building the Moses Mabhida stadium does, but the stadium requires more complex and far-ranging planning, analysis and design than the house.

The next sections trace the overall systems development life cycle and the related database life cycle. Once you are familiar with those processes and procedures, you will learn about different general approaches to database design, such as top-down vs bottom-up and centralised vs decentralised design.

NOTE

The Systems Development Life Cycle (SDLC) is a general framework through which you can track and come to understand the activities required to develop and maintain information systems. Within that framework, there are different ways to complete various tasks specified in the SDLC. For example, this text's focus is on ER modelling and on relational database design and implementation, and that focus is maintained in this chapter. However, there are alternative methodologies:

Unified Modelling Language (UML) provides tools to support the tasks associated with the development of information systems. UML is covered in Appendix G, Unified Modelling Language (UML), online.

Rapid Application Development (RAD)5 is an interactive software development methodology that uses prototypes, CASE tools and flexible management to develop application systems. RAD started as an alternative to traditional structured development, which suffered from long deliverable times and unfulfilled requirements.

Agile Software Development6 is a framework for developing software applications that divides the work into smaller subprojects to obtain valuable deliverables in shorter times and with better cohesion. This method emphasises close communication among all users and continuous evaluation with the purpose of increasing customer satisfaction.

Although the development methodologies may change, the basic framework within which they are used does not change.

5 See Rapid Application Development, James Martin, Prentice-Hall, Macmillan College Division, 1991.

6 For more information about Agile Software Development, go to www.agilealliance.org.

10.2 The Systems Development Life Cycle (SDLC)

The Systems Development Life Cycle (SDLC) traces the history (life cycle) of an information system. Perhaps more important to the system designer, the SDLC provides the big picture within which the database design and application development can be mapped out and evaluated.

Figure 10.2 The Systems Development Life Cycle (SDLC)

Phase                      Action(s)                                  Section
Planning                   Initial assessment                         10.2.1
                           Feasibility study
Analysis                   User requirements                          10.2.2
                           Existing system evaluation
                           Logical system design
Detailed systems design    Detailed system specification              10.2.3
Implementation             Coding, testing and debugging              10.2.4
                           Installation, fine-tuning
Maintenance                Evaluation                                 10.2.5
                           Maintenance
                           Enhancement

As illustrated in Figure 10.2, the traditional SDLC is divided into five phases: planning, analysis, detailed systems design, implementation and maintenance. The SDLC is an iterative rather than a sequential process. For example, the details of the feasibility study might help refine the initial assessment, and the details discovered during the user requirements portion of the SDLC might help refine the feasibility study. Because the Database Life Cycle (DBLC) fits into and resembles the SDLC, a brief description of the SDLC is in order.

10.2.1 Planning

The SDLC planning phase yields a general overview of the company and its objectives. An initial assessment of the information-flow-and-extent requirements must be made during this discovery portion of the SDLC. Such an assessment should answer some important questions:

Should the existing system be continued? If the information generator does its job well, there is no point in modifying or replacing it. To quote an old saying, 'If it's not broken, don't fix it.'


Should the existing system be modified? If the initial assessment indicates deficiencies in the extent and flow of the information, minor (or even major) modifications may be in order. When considering modifications, the participants in the initial assessment must keep in mind the distinction between wants and needs.

Should the existing system be replaced? The initial assessment might indicate that the current system's flaws are beyond fixing. Given the effort required to create a new system, a careful distinction between wants and needs is perhaps even more important in this case than it is in modifying the system.

Participants in the SDLC's initial assessment must begin to study and evaluate alternative solutions. If it is decided that a new system is necessary, the next question is whether it is feasible. The feasibility study must address the following:

The technical aspects of hardware and software requirements. The decisions might not (yet) be vendor-specific, but they must address the nature of the hardware requirements (desktop, multi-user mainframe, mid-range, supercomputer, mobile device) and the software requirements (single- or multi-user operating systems, database type and software, programming languages to be used by the applications, and so on).

The system cost. The admittedly mundane question, 'Can we afford it?' is crucial (and the answer to that question might force a careful review of the initial assessment). It bears repeating that a million-rand solution to a thousand-rand problem is not defensible. A decision may need to be made between building a system in-house or buying a third-party vendor system (with customisation); the aim is to find the most cost-effective solution that meets the business requirements.

The operational cost. Does the company have the human, technical and financial resources to keep the system operational? The impact of the new system on the company's culture should be assessed, as people's resistance to change should not be underestimated.

10.2.2 Analysis

Problems defined during the planning phase are examined in greater detail during the analysis phase. A macroanalysis must be made of both individual needs and organisational needs, addressing questions such as:

What are the requirements of the current system's end users?

Do those requirements fit into the overall information requirements?

The analysis phase of the SDLC is, in effect, a thorough audit of user requirements.

The existing hardware and software systems are also studied during the analysis phase. The result of analysis should be a better understanding of the system's functional areas, actual and potential problems, and opportunities.

End users and the system designer(s) must work together to identify processes and to uncover potential problem areas. Such cooperation is vital to defining the appropriate performance objectives by which the new system can be judged.

Along with a study of user requirements and the existing systems, the analysis phase also includes the creation of a logical systems design. The logical design must specify the appropriate conceptual data model, inputs, processes and expected output requirements.

When creating a logical design, the designer might use tools such as data flow diagrams (DFDs), hierarchical input process output (HIPO) diagrams and entity relationship (ER) diagrams. The database design's data-modelling activities take place at this point to discover and describe all entities and their attributes and the relationships among those entities within the database.

Defining the logical system also yields functional descriptions of the system's components (modules) for each process within the database environment. All data transformations (processes) are described and documented using such systems analysis tools as DFDs. The conceptual data model is validated against those processes.
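The idea of validating a conceptual data model against the documented processes can be illustrated with a small sketch. The entity names (CUSTOMER, INVOICE), attribute names and process names below are hypothetical and are not taken from the chapter; the classes are an illustrative assumption about how a designer might record entities, relationships and process inputs, not a standard notation or tool.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Entity:
    name: str
    attributes: list[str]


@dataclass
class Relationship:
    name: str
    left: str          # name of the entity on one side
    right: str         # name of the entity on the other side
    cardinality: str   # for example "1:M"


@dataclass
class ConceptualModel:
    entities: dict[str, Entity] = field(default_factory=dict)
    relationships: list[Relationship] = field(default_factory=list)

    def add_entity(self, name, attributes):
        self.entities[name] = Entity(name, list(attributes))

    def relate(self, name, left, right, cardinality):
        self.relationships.append(Relationship(name, left, right, cardinality))

    def dangling_relationships(self):
        """Names of relationships that mention an undefined entity."""
        return [r.name for r in self.relationships
                if r.left not in self.entities or r.right not in self.entities]

    def unsupported_inputs(self, process_inputs):
        """Map each process to the 'ENTITY.attribute' inputs the model lacks."""
        missing = {}
        for process, inputs in process_inputs.items():
            gaps = [f"{entity}.{attribute}" for entity, attribute in inputs
                    if entity not in self.entities
                    or attribute not in self.entities[entity].attributes]
            if gaps:
                missing[process] = gaps
        return missing


# Hypothetical model and processes, for illustration only.
model = ConceptualModel()
model.add_entity("CUSTOMER", ["cus_code", "cus_name"])
model.add_entity("INVOICE", ["inv_number", "inv_date", "inv_total"])
model.relate("generates", "CUSTOMER", "INVOICE", "1:M")

processes = {
    "Register customer": [("CUSTOMER", "cus_code"), ("CUSTOMER", "cus_name")],
    "Print invoice": [("CUSTOMER", "cus_name"), ("INVOICE", "inv_total"),
                      ("INVOICE", "inv_due_date")],
}

dangling = model.dangling_relationships()
gaps = model.unsupported_inputs(processes)
print(dangling)
print(gaps)
```

In this sketch the "Print invoice" process asks for an attribute the model does not hold, which is the kind of mismatch the validation step is meant to surface before detailed design begins.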

10.2.3 Detailed Systems Design

In the detailed systems design phase, the designer completes the design of the system's processes. The design includes all necessary technical specifications for the screens, menus, reports and other devices that might be used to help make the system a more efficient information generator. The steps are laid out for conversion from the old to the new system. Training principles and methodologies are also planned and must be submitted for management's approval.

NOTE

Because attention has been focused on the details of the systems design process, this book has not until this point explicitly recognised the fact that management's approval is needed at all stages of the process. Such approval is needed because a GO decision requires funding. There are many GO/NO GO decision points along the way to a completed systems design!

10.2.4 Implementation

During the implementation phase, the hardware, DBMS software and application programs are installed and the database design is implemented. During the initial stages of the implementation phase, the system enters into a cycle of coding, testing and debugging until it is ready to be delivered. The actual database is created, and the system is customised by the creation of tables and views, user authorisations and so on.

The database contents may be loaded interactively or in batch mode, using a variety of methods and devices:

Customised user programs.

Database interface programs.

Conversion programs that import the data from a different file structure, using batch programs, a database utility or both.

The system is subjected to exhaustive testing until it is ready for use. Traditionally, the implementation and testing of a new system took 50 to 60 per cent of the total development time. However, the advent of sophisticated application generators and debugging tools has substantially decreased coding and testing time. After testing is concluded, the final documentation is reviewed and printed, and end users are trained. The system is in full operation at the end of this phase but will be continuously evaluated and fine-tuned.
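The creation and loading steps described above can be sketched in a few lines. The example below is only an illustration under stated assumptions: it uses Python's built-in sqlite3 module with an in-memory database, and the table, view, column names and sample rows are hypothetical. SQLite has no GRANT statement, so user authorisations, which a multi-user DBMS would also define at this point, are not shown.

```python
import sqlite3

# In-memory database so the sketch leaves nothing behind on disk.
conn = sqlite3.connect(":memory:")

# Create a table and a view (hypothetical names).
conn.execute("""
    CREATE TABLE customer (
        cus_code    INTEGER PRIMARY KEY,
        cus_name    TEXT NOT NULL,
        cus_balance REAL NOT NULL DEFAULT 0
    )
""")
conn.execute("""
    CREATE VIEW customer_owing AS
    SELECT cus_code, cus_name, cus_balance
    FROM customer
    WHERE cus_balance > 0
""")

# Rows as they might arrive from an older file structure: everything is text.
legacy_rows = [
    ("10010", "Ramas ", "0.00"),
    ("10011", "Dunne", "345.86"),
    ("10012", "Smith", "221.19"),
]


def convert(row):
    """Conversion step: turn one legacy text record into typed values."""
    code, name, balance = row
    return int(code), name.strip(), float(balance)


# Batch load inside one transaction; the 'with' block commits on success
# and rolls back if any row fails.
with conn:
    conn.executemany(
        "INSERT INTO customer (cus_code, cus_name, cus_balance) VALUES (?, ?, ?)",
        [convert(row) for row in legacy_rows],
    )

loaded = conn.execute("SELECT COUNT(*) FROM customer").fetchone()[0]
owing = [name for (name,) in conn.execute(
    "SELECT cus_name FROM customer_owing ORDER BY cus_code")]
print(loaded, owing)
```

Loading in a single transaction is a deliberate choice here: if one legacy record cannot be converted, none of the batch is committed, which keeps the new database from being left half loaded.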


10.2.5 Maintenance

Almost as soon as the system is operational, end users begin to request changes in it. Those changes generate system maintenance activities, which can be grouped into three types:7

Corrective maintenance in response to systems errors.

Adaptive maintenance due to changes in the business environment.

Perfective maintenance to enhance the system.

Because every request for structural change requires retracing the SDLC steps, the system is, in a sense, always at some stage of the SDLC.

Each system has a predetermined operational life span. The actual operational life span of a system depends on its perceived utility. There are several reasons for reducing the operational life of certain systems. Rapid technological change is one reason, especially for systems based on processing speed and expandability. Another common reason is the cost of maintaining a system.

If the system's maintenance cost is high, its value becomes suspect. Computer-aided systems engineering (CASE) technology, such as System Architect or Visio Professional, helps make it possible to produce better systems within a reasonable amount of time and at a reasonable cost. In addition, the more structured, better-documented and especially standardised implementation of CASE-produced applications tends to prolong the operational life of systems by making them easier and cheaper to update and maintain.

10.3 The Database Life Cycle (DBLC)

Within the larger information system, the database, too, is subject to a life cycle. The Database Life Cycle (DBLC) contains six phases (Figure 10.3): database initial study, database design, implementation and loading, testing and evaluation, operation, and maintenance and evolution.

Figure 10.3 The Database Life Cycle (DBLC)

Phase                        Action(s)                                              Section
Database initial study       Analyse the company situation                          10.3.1
                             Define problems and constraints
                             Define objectives
                             Define scope and boundaries
Database design              Create the conceptual design                           10.3.2
                             DBMS software selection
                             Create the logical design
                             Create the physical design
Implementation and loading   Install the DBMS                                       10.3.3
                             Create the database(s)
                             Load or convert the data
Testing and evaluation       Test the database                                      10.3.4
                             Fine-tune the database
                             Evaluate the database and its application programs
Operation                    Produce the required information flow                  10.3.5
Maintenance and evolution    Introduce changes                                      10.3.6
                             Make enhancements

10.3.1 The Database Initial Study

If a designer has been called in, chances are the current system has failed to perform functions deemed vital by the company (you don't call the plumber unless the pipes leak). So, in addition to examining the current system's operation within the company, the designer must determine how and why the current system fails. That means spending a lot of time talking with (but mostly listening to) end users. Although database design is a technical business, it is also people-orientated. Database designers must be excellent communicators, and they must have finely tuned interpersonal skills.

Depending on the complexity and scope of the database environment, the database designer might be a lone operator or part of a systems development team composed of a project leader, one or more senior systems analysts and one or more junior systems analysts. The word designer is used generically here to cover a wide range of design team compositions.

The overall purpose of the database initial study is to:

Analyse the company situation.

Define problems and constraints.

Define objectives.

Define scope and boundaries.

Figure 10.4 depicts the interactive and iterative processes required to complete the first phase of the DBLC successfully. As you examine Figure 10.4, note that the database initial study phase leads to the development of the database system objectives. Using Figure 10.4 as a discussion template, let's examine each of its components in greater detail.

7 See E. Reed Doke and Neil E. Swanson, 'Software maintenance revisited: a product life cycle perspective', Information Executive, 4(1), Winter 1991, pp. 8-11. Although the date on this reference may cause you to consider it outdated, it remains relevant today. The software environment changes with dizzying frequency, but most of the underlying principles of software design, implementation and management have enjoyed remarkable longevity.

Figure 10.4 A summary of activities in the database initial study

Analysis of the company situation: company objectives, company operations, company structure
Definition of problems and constraints
Database system specifications: objectives, scope, boundaries

Analyse the Company Situation

The company situation describes the general conditions in which a company operates, its organisational structure and its mission. To analyse the company situation, the database designer must discover what the company's operational components are, how they function and how they interact.

These issues must be resolved:

What is the organisation's general operating environment, and what is its mission within that environment? The design must satisfy the operational demands created by the organisation's mission. For example, a mail-order business is likely to have operational requirements involving its database that are quite different from those of a manufacturing business.

What is the organisation's structure? Knowing who controls what and who reports to whom is quite useful when you are trying to define required information flows, specific report and query formats, and so on.

Define Problems and Constraints

The designer has both formal and informal sources of information. If the company has existed for any length of time, it already has some kind of system in place (either manual or computer-based). How does the existing system function? What input does the system require? Which documents does the system generate? By whom and how is the system output used? Studying the paper trail can be very informative. Aside from the official version of the system's operation, there is also the more informal, real version; the designer must be clever enough to see how these differ.

The process of defining problems might initially appear to be unstructured. Company end users are often unable to describe precisely the larger scope of company operations or to identify the real problems encountered during company operations. Often the managerial view of a company's operation and its problems is different from that of the end users, who perform the actual routine work.

Finding precise answers is important, especially concerning the operational relationships among business units. If a proposed system will solve the marketing department's problems but exacerbate those of the production department, not much progress will have been made. Using an analogy, suppose your home water bill is too high. You have determined the problem: the taps leak. The solution? You step outside and turn off the water supply to the house. Is that an adequate solution? Or would the replacement of tap washers do a better job of solving the problem? You may find the leaky tap scenario simplistic, yet almost any experienced database designer can find similar instances of so-called database problem solving (admittedly more complicated and less obvious).

Even the most complete and accurate problem definition does not always lead to the perfect solution. The real world usually intrudes to limit the design of even the most elegant database by imposing constraints. Such constraints include time, budget, personnel and more. If you must have a solution within a month and within a R20 000 budget, a solution that takes two years to develop at a cost of R800 000 is not a solution. The designer must learn to distinguish between what's perfect and what's possible.
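The point about constraints can be made concrete with a short, hedged sketch. The time and budget limits (one month, R20 000) and the rejected proposal (two years, R800 000) come from the text above; the second proposal, its figures and all names in the code are hypothetical and exist only to show a candidate that passes.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Proposal:
    name: str
    months: int       # estimated time to deliver
    cost_rand: int    # estimated cost in rand


def feasible(proposals, max_months, max_cost_rand):
    """Keep only the proposals that fit inside both constraints."""
    return [p for p in proposals
            if p.months <= max_months and p.cost_rand <= max_cost_rand]


proposals = [
    # Figures from the text: two years and R800 000.
    Proposal("Full redesign", months=24, cost_rand=800_000),
    # Hypothetical alternative, invented for this illustration.
    Proposal("Targeted fix", months=1, cost_rand=18_000),
]

# Constraints from the text: one month and R20 000.
shortlist = feasible(proposals, max_months=1, max_cost_rand=20_000)
print([p.name for p in shortlist])
```

A real study would weigh personnel and other constraints as well; the sketch only shows that a design which ignores the stated limits never reaches the shortlist, however elegant it is.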

NOTE

When trying to develop solutions, the database designer must look for the source of the problems. There are many cases of database systems that failed to satisfy the end users because they were designed to treat the symptoms of the problems rather than their source.

Define Objectives

A proposed database system must be designed to help solve at least the major problems identified during the problem discovery process. As the list of problems unfolds, several common sources are likely to be discovered.

Figure 10.5 Two views of data: business manager and designer

Company: Engineering, Manufacturing, Purchasing

Manager's view (shared information):
What are the problems?
What are the solutions?
What information is needed to implement the solutions?
What data is required to generate the desired information?

Designer's view (company database):
How must the data be structured?
How will the data be accessed?
How is the data transformed into information?

As you begin to examine the procedures required to complete the design phase in the DBLC, remember these points:

The process of database design is loosely related to the analysis and design of a larger system. The data component is only one element of a larger information system.

The systems analysts or systems programmers are in charge of designing the other system components. Their activities create the procedures that will help transform the data within the database into useful information.

The database design does not constitute a sequential process. Rather, it is an iterative process that provides continuous feedback designed to trace previous steps.

The database design process is depicted in Figure 10.6.

Figure 10.6 Database design process

Conceptual Design (Section 9-4; DBMS and hardware independent)
  Data analysis and requirements: determine end-user views, outputs and transaction requirements.
  Entity relationship modelling and normalisation: define entities, attributes, domains and relationships; draw ER diagrams; normalise entity attributes.
  Data model verification: identify ER modules and validate insert, update and delete rules; validate reports, queries, views, integrity, access and security.
  Distributed database design*: define the fragmentation and allocation strategy.

DBMS Selection (Section 9-5)
  Select the DBMS: determine DBMS and data model to use.

Logical Design (Section 9-6; DBMS dependent)
  Map conceptual model to logical model components: define tables, columns, relationships and constraints.
  Validate logical model using normalisation: normalised set of tables.
  Validate logical model integrity constraints: ensure entity and referential integrity; define column constraints.
  Validate logical model against user requirements: ensure the model supports user requirements.

Physical Design (Section 9-7; hardware dependent)
  Define data storage organisation: define tables, indexes and views physical organisation.
  Define integrity and security measures: define users, security groups, roles and access controls.
  Determine performance measures+: define database and query execution parameters.

* See Chapter 14, Distributed Databases.
+ See Chapter 13, Managing Database and SQL Performance.

In this section you will learn about each of the components in Figure 10.6. Knowing those details will help you successfully design and implement databases in a real-world setting. This chapter is only intended to provide an overview of these areas. In Chapter 11, Conceptual, Logical, and Physical Database Design, you will learn about each component in greater detail.

i. Conceptual Design

In the conceptual design stage, data modelling is used to create an abstract database structure that represents real-world objects in the most realistic way possible. The conceptual model must embody a clear understanding of the business and its functional areas. At this level of abstraction, the type of hardware and/or database model to be used might not yet have been identified. Therefore, the design must be software and hardware independent so the system can be set up within any hardware and software platform chosen later.

Keep in mind the following minimal data rule: All that is needed is there, and all that is there is needed.

In other words, make sure that all data needed are in the model and that all data in the model are needed. All data elements required by the database transactions must be defined in the model, and all data elements defined in the model must be used by at least one database transaction.

However, as you apply the minimal data rule, avoid an excessive short-term bias. Focus not only on the immediate data needs of the business, but also on the future data needs. Thus, the database design must leave room for future modifications and additions, ensuring that the business's investment in information resources will endure.

As you examine Figure 10.6, note that conceptual design requires the following four steps:

Data analysis and requirements

Entity relationship modelling and normalisation

Data model verification

Distributed database design

Each of these steps will be explained in detail in Chapter 11, Conceptual, Logical, and Physical Database Design.
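The minimal data rule lends itself to a mechanical check. The sketch below is an illustration only: the data element names and the two transactions are hypothetical, and treating the model and the transactions as plain sets of element names is a simplifying assumption rather than a method prescribed by the text.

```python
def minimal_data_rule(model_elements, transactions):
    """Compare the data elements in the model with those the transactions use.

    Returns (missing, unused):
      missing: elements some transaction needs but the model does not define
      unused:  elements the model defines but no transaction uses
    """
    model = set(model_elements)
    used = set()
    for elements in transactions.values():
        used.update(elements)
    return sorted(used - model), sorted(model - used)


# Hypothetical model and transactions, for illustration only.
model_elements = ["cus_code", "cus_name", "cus_balance", "cus_fax"]
transactions = {
    "record_payment": ["cus_code", "cus_balance"],
    "print_statement": ["cus_code", "cus_name", "cus_balance", "cus_email"],
}

missing, unused = minimal_data_rule(model_elements, transactions)
print("needed but not there:", missing)
print("there but not needed:", unused)
```

An element reported as unused is not automatically an error. As the text notes, the design should leave room for future needs, so the list is a prompt for a conversation with end users rather than a list of columns to delete.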

ii. DBMS Software Selection

The selection of DBMS software is critical to the information system's smooth operation. Consequently, the advantages and disadvantages of the proposed DBMS software should be carefully studied. To avoid false expectations, the end user must be made aware of the limitations of both the DBMS and the database.

Although the factors affecting the purchasing decision vary from company to company, some of the most common are:

Cost. Purchase, maintenance, operational, licence, installation, training and conversion costs.

DBMS features and tools. Some database software includes a variety of tools that facilitate the application development task. For example, the availability of query by example (QBE), screen painters, report generators, application generators, data dictionaries and so on, helps to create a more pleasant work environment for both the end user and the application programmer. Database administrator facilities, query facilities, ease of use, performance, security, concurrency control, transaction processing and third-party support also influence DBMS software selection.

Underlying model. Hierarchical, network, relational, object/relational or object-orientated.

Portability. Across platforms, systems and languages.

DBMS hardware requirements. Processor(s), RAM, disk space, and so on.

iii. Logical Design

The second stage in the database design cycle is known as logical design. The aim of the logical design stage is to map the conceptual model into a logical model that can then be implemented on a relational DBMS. The logical design stage consists of the following phases:

1 Creating the logical data model.

2 Validating the logical data model using normalisation.

3 Assigning and validating integrity constraints.

4 Merging logical models constructed for different parts for the database.

5 Reviewing the logical data model with the user.

You will learn in detail about logical design in Chapter 11, Conceptual, Logical, and Physical Database Design.


The right to use the database is also specified during the logical design phase. Who will be allowed to use the tables, and what portion(s) of the table(s) will be available to which users? Within a relational framework, the answers to those questions require the definition of appropriate access rights and views.

The logical design translates the software-independent conceptual model into a software-dependent model by defining the appropriate domain definitions, the required tables and the necessary access restrictions. The stage is now set to define the physical requirements that allow the system to function within the selected hardware environment.

iv. Physical Design

Physical design is the process of selecting the data storage and data access characteristics of the database. The storage characteristics are a function of the types of devices supported by the hardware, the type of data access methods supported by the system, and the DBMS. Physical design affects not only the location of the data in the storage device(s), but also the performance of the system.

Physical database design can be broken down into a number of stages:

1 Analyse data volume and database usage.
2 Translate each relation identified in the logical data model into tables.
3 Determine a suitable file organisation.
4 Define indexes.
5 Define user views.
6 Estimate data storage requirements.
7 Determine database security for users.

You will learn about each of these stages in more detail in Chapter 11, Conceptual, Logical, and Physical Database Design.

Physical design is a very technical job, more typical of the client/server and mainframe world than of the desktop world. Yet even in the more complex mid-range and mainframe environments, modern database software has assumed much of the burden of the physical portion of the design and its implementation.
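The "Define indexes" stage can be illustrated with a small experiment. The following is only a sketch, using Python's built-in sqlite3 module rather than any DBMS named in this chapter; the table customer, its columns and the index name idx_cus_lname are hypothetical names chosen for illustration. It asks the DBMS for its query plan before and after an index is created.

```python
import sqlite3

# Hypothetical table and column names, used only for illustration.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE customer ("
    "cus_code INTEGER PRIMARY KEY, cus_lname TEXT, cus_balance REAL)"
)
conn.executemany(
    "INSERT INTO customer (cus_lname, cus_balance) VALUES (?, ?)",
    [(f"Name{i}", float(i)) for i in range(1000)],
)


def plan(sql):
    """Return the human-readable plan steps SQLite reports for a statement."""
    return [row[3] for row in conn.execute("EXPLAIN QUERY PLAN " + sql)]


query = "SELECT cus_code FROM customer WHERE cus_lname = 'Name500'"

before = plan(query)  # no index yet, so the whole table must be scanned
conn.execute("CREATE INDEX idx_cus_lname ON customer (cus_lname)")
after = plan(query)   # the optimiser can now search through the index

print("before:", before)
print("after: ", after)
```

The exact wording of the plan text varies between SQLite versions, so the sketch only relies on the index name appearing in the plan once the index exists.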

Online Content Physical design is particularly important in the older hierarchical and network models described in Appendices I and J, The Hierarchical Database Model and The Network Database Model, respectively, available on the online platform for this book. Relational databases are more insulated from physical details than the older hierarchical and network models.

In spite of the fact that relational models tend to hide the complexities of the computer's physical characteristics, the performance of relational databases is affected by physical-level characteristics. For example, performance can be affected by the characteristics of the storage media, such as seek time, sector and block (page) size, buffer pool size and number of disk platters and read/write heads. In addition, factors such as the creation of an index can have a considerable effect on the relational database's performance, that is, data access speed and efficiency.

Even the type of data request must be analysed carefully to determine the optimum access method for meeting the application requirements, establishing the data volume to be stored and estimating the performance. Some DBMSs automatically reserve the space required to store the database definition and the users' data in permanent storage devices. This ensures that the data are stored in sequentially adjacent locations, thereby reducing data access time and increasing system performance. (Database performance tuning is covered in detail in Chapter 13, Managing Database and SQL Performance.)

Physical design becomes more complex when data are distributed at different locations because the performance is affected by the communication media's throughput. Given such complexities, it is not surprising that designers favour database software that hides as many of the physical-level activities as possible.

The preceding sections have separated the discussions of logical and physical design activities. In fact, logical and physical design can be carried out in parallel, on a table-by-table (or file-by-file) basis. Logical and physical design can also be carried out in parallel when the designer is working with hierarchical and network models. Such parallel activities require the designer to have a thorough understanding of the software and hardware in order to take full advantage of both software and hardware characteristics.

10.3.3 Implementation and Loading

The output of the database design phase is a series of instructions detailing the creation of tables, attributes, domains, views, indexes, security constraints, and storage and performance guidelines. In this phase, you actually implement all these design specifications.

Install the DBMS

This step is required only when a new dedicated instance of the DBMS is necessary for the system. In many cases, the organisation will have made a particular DBMS the standard to leverage investments in the technology and the skills that employees have already developed. The DBMS may be installed on a new server or on existing servers. One current trend is called virtualisation. Virtualisation is a technique that creates logical representations of computing resources that are independent of the underlying physical computing resources. The technique is used in many areas of computing, such as the creation of virtual servers, virtual storage and virtual private networks. In a database environment, database virtualisation refers to the installation of a new instance of the DBMS on a virtual server running on shared hardware. This is normally a task that involves system and network administrators to create appropriate user groups and services in the server configuration and network routing. Another common trend is the use of cloud database services such as Microsoft SQL Database Service or Amazon Relational Database Service (RDS). This new generation of services allows users to create databases that can be easily managed, tested and scaled up as needed.

Create the Database(s)

In most modern relational DBMSs, a new database implementation requires the creation of special storage-related constructs to house the end-user tables. The constructs usually include the storage group (or file groups), the table spaces and the tables. Figure 10.7 shows that a storage group can contain more than one table space and that a table space can contain more than one table. For example, the implementation of the logical design in IBM's DB2 would require the following:

1 The system administrator (SYSADM) would create the database storage group. This step is mandatory for such mainframe databases as DB2. Other DBMS software may create equivalent storage groups automatically when a database is created. (See Step 2.) Consult your DBMS documentation to see whether you need to create a storage group and, if so, what the command syntax must be.
2 The SYSADM creates the database within the storage group.
3 The SYSADM assigns the rights to use the database to a database administrator (DBA).
4 The DBA creates the table space(s) within the database.
5 The DBA creates the table(s) within the table space(s).
6 The DBA assigns access rights to the table spaces and to the tables within specified table spaces. Access rights may be limited to views rather than to whole tables. The creation of views is not required for database access in the relational environment, but views are desirable from a security standpoint. For example, using the following command, access rights to a table named PROFESSOR may be granted to the user Miriam Ledimo, whose identification code is MLEDIMO:

GRANT SELECT ON PROFESSOR TO USER MLEDIMO;

FIGURE 10.7 Physical organisation of a Db2 database environment
(Diagram: a storage group contains a database; the database contains table spaces; each table space contains one or more tables.)
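The idea that access may be limited to views rather than to whole tables can be sketched as follows. This is an illustration under stated assumptions, not the DB2 procedure described above: it uses Python's built-in sqlite3 module, the column names and the view name PROF_PUBLIC are hypothetical, and SQLite has no GRANT statement, so the sketch only shows how a view exposes a restricted portion of a table. In a DBMS that supports privileges, SELECT would then be granted on the view instead of on the base table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical PROFESSOR columns; the salary is the sensitive attribute.
conn.execute(
    "CREATE TABLE PROFESSOR ("
    "PROF_NUM INTEGER PRIMARY KEY, PROF_LNAME TEXT, PROF_SALARY REAL)"
)
conn.execute("INSERT INTO PROFESSOR VALUES (1, 'Smith', 52000.0)")

# The view exposes only the non-sensitive columns of the base table.
conn.execute(
    "CREATE VIEW PROF_PUBLIC AS SELECT PROF_NUM, PROF_LNAME FROM PROFESSOR"
)

cursor = conn.execute("SELECT * FROM PROF_PUBLIC")
view_columns = [description[0] for description in cursor.description]
rows = cursor.fetchall()

print(view_columns)
print(rows)
```

A user who can query only PROF_PUBLIC never sees PROF_SALARY, which is why views are desirable from a security standpoint even though they are not required for access.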

Load or Convert the Data

After the database has been created, the data must be loaded into the database tables. Typically, the data will have to be migrated from the prior version of the system. Often, data to be included in the system must be aggregated from multiple sources. In a best-case scenario, all of the data will be in a relational database so that it can be readily transferred to the new database. However, in some cases data may have to be imported from other relational databases, non-relational databases, flat files, legacy systems, or even manual paper-and-pencil systems. If the data format does not support direct importing into the new database, conversion programs may have to be created to reformat the data for importing. In a worst-case scenario, much of the data may have to be manually entered into the database. Once the data has been loaded, the DBA works with the application developers to test and evaluate the database.
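A conversion program of the kind mentioned above might look like the following sketch. It assumes a hypothetical flat-file export in which dates are written as DD/MM/YYYY and balances contain a thousands separator; the file layout, the table customer and all column names are invented for illustration, and Python's built-in csv and sqlite3 modules stand in for whatever tools a real migration would use.

```python
import csv
import io
import sqlite3

# Hypothetical legacy export: DD/MM/YYYY dates, balances with thousands separators.
LEGACY_CSV = """cus_code,cus_name,joined,balance
101,Ramas,15/03/2019,"1,250.50"
102,Dunne,02/11/2018,0.00
"""


def convert_row(row):
    """Reformat one legacy record into the layout the new table expects."""
    day, month, year = row["joined"].split("/")
    return (
        int(row["cus_code"]),
        row["cus_name"].strip(),
        f"{year}-{month}-{day}",                 # ISO date string
        float(row["balance"].replace(",", "")),  # strip thousands separator
    )


def load(conn, source):
    """Create the target table and load every converted row in one transaction."""
    conn.execute(
        "CREATE TABLE customer ("
        "cus_code INTEGER PRIMARY KEY, cus_name TEXT, joined TEXT, balance REAL)"
    )
    rows = [convert_row(record) for record in csv.DictReader(source)]
    with conn:  # commits on success, rolls back if any insert fails
        conn.executemany("INSERT INTO customer VALUES (?, ?, ?, ?)", rows)
    return len(rows)


conn = sqlite3.connect(":memory:")
loaded = load(conn, io.StringIO(LEGACY_CSV))
print(loaded, "rows loaded")
```

Loading inside a single transaction means a malformed record aborts the whole batch instead of leaving the table half populated, which makes the load easier to rerun after the source data has been corrected.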


Loading existing data into a cloud-based database service can sometimes be expensive. The reason for this is that most cloud services are priced based not only on the volume of data to be stored but also on the amount of data that travels over the network. In such cases, loading a 1 TB database could be a very expensive proposition. Therefore, system administrators must be very careful in reading and negotiating the terms of the cloud service contracts to ensure that there will be no hidden costs.

10.3.4 Database security

Data stored in the company database must be protected from access by unauthorised users. It does not take much imagination to predict the likely results when students have access to a student database or when employees have access to payroll data! Any misuse of or damage to the data can have a serious impact on the organisation. Security is therefore a major concern within database design, so it is essential that you have some basic ideas of what security actually is and of the measures an organisation should undertake to protect its data against accidental or intentional loss, destruction or misuse.

When developing security measures, it is important to ask questions such as: what are we trying to prevent, and what are we trying to protect the data from? This section will highlight the most common security goals and the likely problems related to this area.

The most common security goals relate to the integrity, confidentiality and availability of data. Security policies and procedures are developed to meet the security goals and to protect the database system against any kind of threat. A threat is any set of circumstances that has the potential to cause loss, misuse or harm to the system and/or its data. Threats can cause:

The loss of the integrity of data through unauthorised modification. For example, a person gaining unauthorised access to a bank account and removing some money from the account.

The loss of availability of data. For example, a hacker breaking into the database system and in doing so stopping the system being operational, which stops authorised users of the database from accessing any data.

The loss of confidentiality of data (also referred to as the privacy of data). This could be caused by a person gaining access to private information such as a password or a bank account balance.

Threats can occur internally and externally to an organisation and are of various levels of severity. Some examples of threats are:

Theft and fraud of data. Activities such as these are likely to be perpetrated by humans, often by electronic means. Both theft and fraud can occur both inside and outside the organisation. It is important to see that data can be lost not only to a person who gains unauthorised access to the system. Consider the case where an employee who has legitimate access to specific data steals data from a large company. An example would be a salesperson who resigns and takes the customer database with him or her, either to start working for a different company or so that he or she can start his or her own business. He or she is actually stealing the organisation's customers and their potential business. This internal security breach would be treated differently.

Human error that causes accidental loss of data. This is often caused by humans not following an organisation's security policies and procedures, such as user authorisation. However, it is often also caused by poor staff training. If employees do not know the procedures surrounding data security, then it will be impossible for the procedures to be followed. It is important that each organisation ensure that it has excellent security policies and procedures in place to begin with. In addition, the procedures must be followed.

Electronic infections. There are four general categories of electronic infections:

Viruses. A virus is a malicious piece of software that is capable of copying itself and spreading across a network. As viruses are usually attached to a program or application, they cannot be caught without human intervention, such as opening an email attachment or running an infected program.

Email viruses. This kind of virus attaches itself to email messages and replicates itself by automatically mailing itself to all people in the receiver's email address book.

Worms. Worms are small pieces of software that travel between systems using telecommunication networks or a hole in security. They are different from viruses in that they replicate themselves without any form of human intervention and can also replicate themselves very quickly through networks.

Trojan horses. The Trojan horse is a computer program that claims to perform one task or action. It remains dormant until run and then begins to do damage, such as erasing a hard disk.

The introduction of a virus to a computer system can result in both the loss of integrity of the data and the loss of availability of the network system, resulting in serious consequences to the business.

The occurrence of natural disasters such as storms, fires or floods. These are unpredictable and not deliberate actions but would still result in the loss of integrity and availability of data. In addition, data could be corrupted due to power surges, and hardware could become physically damaged.

Unauthorised access and modification of data. Gaining unauthorised access to a database is usually referred to by the phrase hacking. Hacking is defined as the act of illegally entering a computer system and making unauthorised changes to the files and data contained within. Obtaining unauthorised access could be used to gain information that would benefit a person, or to steal login information and assume the identity of genuine users. Unauthorised modification of data may involve data being changed or even deleted. The theft of data and the browsing of data are also covered by this threat.

Employee sabotage is concerned with deliberate acts of malice against the organisation, its property, its reputation and the safety of its employees. This could result not only in damage to the database but also in physical damage to other parts of the computer system and its hardware.

Poor database administration. The database administrator (DBA) is the person responsible for the security of the database system. One example of poor database administration would be that the DBA does not have enough knowledge of data security, often due to a lack of training. Another example is the DBA granting a user excessive privileges that exceed the requirements of his or her job; the user then goes on to abuse these privileges. A further example is the DBA setting up weak authentication schemes, which allow attackers to steal or obtain login information and then assume the identity of genuine database users.

The above list of threats is by no means exhaustive; a summary is provided in Figure 10.8. However, it does highlight the need for an organisation to have a comprehensive data security plan. The plan should contain a number of security measures to protect both the data within the DBMS and the infrastructure and hardware of the system. You will now look at some of the common data security measures on which an organisation will often rely.

FIGURE 10.8 A non-exhaustive summary of security threats
(Diagram with four groups of threats.)
External threats include: electronic infections such as viruses, worms and Trojan horses; hacking; gaining unauthorised access; unauthorised modification of data; theft of data; fraud.
Natural disasters: storms; flood; fire; power surges.
Poor database administration: poor security policies and procedures set by the DBA; granting excessive privileges; weak authentication schemes.
Internal threats by employees: employee not trained in procedures; no employee monitoring; theft of data/unauthorised modification of data; sabotage.

Data Security Measures

Physical security allows only authorised personnel physical access to specific areas. Depending on the type of database implementation, however, establishing physical security may not always be practical. For example, a university student research database is not a likely candidate for physical security. The existence of large multiserver microcomputer networks often makes physical security impractical. In terms of guarding against the loss of data due to a natural disaster, the placement of hardware in a building should be carefully considered. For example, do not locate hardware in the basement due to the possibility of floods. Physical access to rooms can be controlled by the use of swipe cards or push-button security controls. Recently, biometric systems have been seen as a convenient and, in most cases, secure way of identifying individuals. Biometric tools contain a digital imprint of a person's physical or behavioural characteristics, which they use to recognise or authenticate the person's identity. The most common biometrics include fingerprints, retina or iris, and hand geometry.

User authentication is a way of identifying the user and verifying that the user is allowed to access some restricted data or applications. This can be achieved through the use of passwords and access rights:

Password security allows the assignment of access rights to specific authorised users. Password security is usually enforced at logon time at the operating system level.

Access rights can be established through the use of database software. The assignment of access rights may restrict operations (CREATE, UPDATE, DELETE and so on) on predetermined objects such as databases, tables, views, queries and reports.
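The way access rights restrict operations on predetermined objects can be sketched in a few lines. The following is a deliberately simplified illustration, not how any particular DBMS stores its privileges: the rights table, the user DBA1 and the function is_allowed are hypothetical, and a real system would keep this information in its data dictionary and enforce it itself.

```python
# Hypothetical access-rights table: (user, object) -> operations permitted.
ACCESS_RIGHTS = {
    ("MLEDIMO", "PROFESSOR"): {"SELECT"},
    ("DBA1", "PROFESSOR"): {"SELECT", "INSERT", "UPDATE", "DELETE"},
}


def is_allowed(user, operation, obj):
    """Return True only if the user holds the right to run operation on obj."""
    return operation.upper() in ACCESS_RIGHTS.get((user, obj), set())


print(is_allowed("MLEDIMO", "SELECT", "PROFESSOR"))  # right was assigned
print(is_allowed("MLEDIMO", "DELETE", "PROFESSOR"))  # right was never assigned
```

Defaulting to an empty set means that any user or object missing from the table is denied, which follows the usual practice of refusing access unless a right has been explicitly assigned.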

User authentication is a function of authorisation management, which is part of the DBA's managerial role. This will be discussed later on in this chapter.

Audit trails are usually provided by the DBMS to check for access violations. Although the audit trail is an after-the-fact device, its mere existence can discourage unauthorised use. The audit trail can identify the existence of a violation of authorisation or security after it has occurred, and may then be used to link the violation to the user who carried it out and to repair the system.

Data encryption can be used to render data useless to unauthorised users who might have violated some of the database security layers. Although we would rather an attacker or unauthorised user does not gain access to the system, if all else fails, the encryption of the data itself is the last line of defence after the other security measures that we have so far discussed.

Data is encrypted by an algorithm, which uses a secret code, called the encryption key, to alter the data. For example, supposing a bank stores the account numbers for all its customers and wishes to encrypt this data. A very simple encryption algorithm would be to add a specific value to the real account number. Where the real account number is 32451 and the value added is five, the encrypted value stored in the database will be 32456. Here, the number five is known as the encryption key. The real account number can then be decrypted by the subtraction of five from the encrypted value. This example uses a one-digit encryption key. An intruder who guesses the algorithm would require up to ten guesses (the numbers zero to nine) to guess the key, whereas with a two-digit key up to 100 guesses (0 to 99) would be required. Therefore, the longer the key, the more difficult it is to decipher the encrypted data.

There are two encryption methods: one-key and two-key. The one-key method is known as the Data Encryption Standard (DES). With the one-key method, both the sender and receiver need to know the key in order to decipher the message. Where a two-key method is used, which is referred to as public key encryption, there is one public key and one private key. All users who wish to send an encrypted message have a public key. The first stage uses this public key and an encryption algorithm to transform the message into an encrypted message. The second key, the private key, is used with the decryption algorithm to convert the message back to the original. Only the person for whom the message was destined may hold the private key.

Some DBMS products include encryption routines. For example, Oracle DBMS has a feature known as Transparent Data Encryption (TDE), which allows a particular column in a database table to be easily encrypted without writing lots of complex code. When users insert data, the database transparently encrypts it and stores it in the column. Similarly, when users select the column, the database automatically decrypts it.

NOTE
The most common example of public key encryption in use is the Secure Sockets Layer (SSL) technology, which is used to create a secure connection between a user and an external web server, so that any amount of data can be sent securely. This is normally indicated by the use of https instead of http before the web address. An example of SSL at work can be seen when a person purchases goods from an internet-based store.
Another protocol is Secure Electronic Transactions (SET). SET is an open protocol that was designed by a large consortium of companies (VISA, MasterCard, GTE, IBM, Microsoft, Netscape, SAIC, Terisa Systems and Verisign) interested in ensuring data privacy in all electronic commerce transactions over the internet. SET ensures the authenticity of electronic transactions and provides a guarantee to customers that private and personal data are protected.
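The add-five account number example from the data encryption discussion can be written out as a short sketch. It is a toy illustration of the idea of a key and of why longer keys take more guesses, not a usable cipher; the function names and the brute_force helper are assumptions made for this illustration.

```python
KEY = 5  # the one-digit encryption key used in the account number example


def encrypt(account_number, key=KEY):
    """Toy algorithm from the example: add the key to the real value."""
    return account_number + key


def decrypt(encrypted_value, key=KEY):
    """Reverse the algorithm by subtracting the key."""
    return encrypted_value - key


def brute_force(encrypted_value, known_real_value, max_key=9):
    """Try each key from 0 to max_key in turn; return (key, guesses needed)."""
    for guesses, key in enumerate(range(max_key + 1), start=1):
        if decrypt(encrypted_value, key) == known_real_value:
            return key, guesses
    return None, max_key + 1


stored = encrypt(32451)
print(stored)                      # the value held in the database
print(decrypt(stored))             # the real account number
print(brute_force(stored, 32451))  # key found, and how many guesses it took
```

With max_key=9 the intruder never needs more than ten guesses, and with max_key=99 never more than 100, which is the point the example makes about key length.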

CHaPteR

that they is

also

are actually following

a responsibility

Backup

and

recovery

ultimately

recovered.

Data backup

lies

software

known

or potential

unique

signature

themselves.

should

with the

The establishment

will be discussed

DBA to

and recovery

a known

is

used

viruses.

in

be in

place

in

ensure

the

data

in the

context

basis,

Firewalls

is

only

are systems

a set of rules network.

more

detail

the

event

within

of disaster

applications.

are

If

Packet

to

in this

of a

the

Development

of polices

later

Process

547

and procedures

chapter.

disaster

database

occurring.

can

management

methods

which

The

always

be fully

will be discussed

later

each

that

are accepted

messages

message

or software

are allowed

from

devices

record

antivirus

to

for

the

software

any external

be used

applications

the

rules, in

any

viruss

will check,

source

to

media

see

scan

any

which

act as gatekeepers

not

out

contains

in

allowed

of the

device.

can

be

through.

checked

designated

you to establish

or out of the

organisations

accessed

by

Firewalls

organisations

data is

be sent to the

by allowing

database

it is

and

access

be allowed

organisations

that

to

should

an

flowing

or packet

also

media

vendors

The

unauthorised

messages

as breaking

software

network can

and

date.

devices

when

drives

database.

software

up to

used

control

hard

antivirus

software

the

of hardware

is flagged

filtering

Packets

discovered,

They are used to prevent

commonly

to

system

an organisations

kept

determine

most

a message

more of three

if regularly

search

their

On request,

comprising

or filters

They

a virus is

entering

enter.

network.

to

it into

messages to

useful

to an organisations

time

incorporate

all

virus is trying

measure

by organisations

Each

and then

on a real-time

This

and

Database

chapter.

Antivirus

if

procedures

DBA,

strategies

responsibility

in this

the

of the

10

use

Web

one

or

network:

against

system

a set

of filters.

and all others

are

discarded. Proxy server

the

organisation a proxy

proxy server

and

server,

external other

requested

so that

increases

response

users

than

all the

client

gateway

will run

so that

proxy the

measures.

the

internet.

It if

also

can

other

users

proxy server

between

the internal

There

are further

cache

the

network

advantages

Web pages

request

the

same

that

page.

can also be used to limit

of an

of using have This

the

been also

websites

that

organisation.

blocks

The

all communication as the

is reduced

addition,

the

machines

machine.

as the internet,

traffic

time. In

gateway

such

security

network

may view outside

Circuit-level

manages

networks

1

all incoming software server

messages to any host

to

allow

performs

internal

client

them

to

but itself.

establish

a connection

all communications

machines

never

Within the

with

actually

have

any

organisation,

with the

external

contact

circuit-level

network

, such

with the outside

world. Diskless workstations allow end users to access the database without being able to download the information from their workstations.

NOTE

James Martin provides an excellent enumeration and description of the desirable attributes of a database security strategy that remains relevant today.8 Martin's security strategy is based on the seven essentials of database security and may be summarised as one in which:

Data are Protected, Reconstructable, Auditable, Tamperproof

Users are Identifiable, Authorised, Monitored

8 Martin, J., Managing the Database Environment. Englewood Cliffs, NJ: Prentice-Hall, 1977.

10.3.5 Testing and evaluation

In the design phase, decisions were made to ensure integrity, security, performance and recoverability of the database. During implementation and loading, these plans were put into place. In testing and evaluation, the DBA tests and fine-tunes the database to ensure that it performs as expected. This phase occurs in conjunction with application programming. Programmers use database tools to prototype the applications during coding of the programs. Tools such as report generators, screen painters and menu generators are especially useful to application programmers.

Test the Database

During this step, the DBA tests the database to ensure that it maintains the integrity and security of the data. Data integrity is enforced by the DBMS through the proper use of primary and foreign key rules.
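The kind of integrity test described above can be sketched in a few lines. The example below is only an illustration, assuming Python's built-in sqlite3 module as a stand-in DBMS; the CUSTOMER and INVOICE tables and their columns are hypothetical and are not taken from the chapter. It checks that a duplicate primary key and an orphaned foreign key are both refused, while a valid row is accepted.

```python
import sqlite3


def build_test_db():
    """Create an in-memory database with one primary key and one foreign key rule."""
    conn = sqlite3.connect(":memory:")
    # SQLite enforces foreign keys only when this pragma is switched on.
    conn.execute("PRAGMA foreign_keys = ON")
    conn.execute(
        "CREATE TABLE customer (cus_id INTEGER PRIMARY KEY, cus_name TEXT NOT NULL)"
    )
    conn.execute(
        "CREATE TABLE invoice ("
        " inv_id INTEGER PRIMARY KEY,"
        " cus_id INTEGER NOT NULL REFERENCES customer (cus_id))"
    )
    conn.execute("INSERT INTO customer VALUES (1, 'Smith')")
    return conn


def is_rejected(conn, sql):
    """Return True if the DBMS refuses the statement with an integrity error."""
    try:
        conn.execute(sql)
    except sqlite3.IntegrityError:
        return True
    return False


conn = build_test_db()
# A second customer with the same key violates the primary key rule.
duplicate_key_rejected = is_rejected(conn, "INSERT INTO customer VALUES (1, 'Jones')")
# An invoice for a customer that does not exist violates the foreign key rule.
orphan_row_rejected = is_rejected(conn, "INSERT INTO invoice VALUES (10, 99)")
# An invoice for an existing customer should be accepted.
valid_row_rejected = is_rejected(conn, "INSERT INTO invoice VALUES (11, 1)")
```

In a real project the same checks would be run against the production DBMS and its actual schema; the point of the sketch is that each key rule gets at least one test that is expected to fail and one that is expected to succeed.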

Many DBMSs also support the creation of domain constraints and database triggers. Testing will ensure that these constraints are properly designed and implemented. Data integrity is also the result of properly implemented data management policies, which are part of a comprehensive data administration framework.

Evaluate the Database and its Application Programs

As the database and application programs are created and tested, the system must also be evaluated using a more holistic approach. Testing and evaluation of the individual components should culminate in a variety of broader system tests to ensure that all of the components interact properly to meet the needs of the users. At this stage, integration issues and deployment plans are refined, user training is conducted, and system documentation is finalised. Once the system receives final approval, it must be a sustainable resource for the organisation.

To ensure that the data contained in the database are protected against loss, backup and recovery plans are tested. Timely data availability is crucial for almost every database. Unfortunately, the database can lose data through unintended deletions, power outages and other causes. Data backup and recovery procedures create a safety valve, ensuring the availability of consistent data. Typically, database vendors encourage the use of fault-tolerant components such as uninterruptible power supply (UPS) units, RAID storage devices, clustered servers and data replication technologies to ensure the continuous operation of the database in case of a hardware failure. Even with these components, backup and restore functions constitute a very important part of daily database operations. Some DBMSs provide functions that allow the database administrator to schedule automatic database backups to permanent storage devices such as disks, DVDs, tapes and online storage.

Database backups can be performed at different levels:

A full backup, or dump, of the entire database. In this case, all database objects are backed up in their entirety.

A differential backup of the database, in which only the objects that have been updated or modified since the last full backup are backed up.

A transaction log backup, which backs up only the transaction log operations that are not reflected in a previous backup copy of the database. In this case, no other database objects are backed up. (For a complete explanation of the transaction log, see Chapter 12, Managing Transactions and Concurrency.)

The database backup is stored in a secure place, usually in a different building from the database itself, and is protected against dangers such as fire, theft, flood and other potential calamities. The main purpose of the backup is to guarantee database restoration following a hardware or software failure.

Failures that plague databases and systems are generally induced by software, hardware, programming exceptions, transactions, or external factors. Table 10.1 summarises the most common sources of database failure.

TABLE 10.1 Common sources of database failure

Source: Software
Description: Software-induced failures may be traceable to the operating system, the DBMS software, application programs, or viruses and other malware. In April 2017, a new vulnerability was found in the Oracle E-Business Suite, which allows an unauthenticated attacker to create, modify, or delete critical data.9

Source: Hardware
Description: Hardware-induced failures may include memory chip errors, disk crashes, bad disk sectors and disk-full errors. A bad memory module or a multiple hard-disk failure in a database system can bring it to an abrupt stop.

Source: Programming exceptions
Description: Application programs or end users may roll back transactions when certain conditions are defined. Programming exceptions can also be caused by malicious or improperly tested code that can be exploited by hackers. In February 2016, a group of unidentified hackers fraudulently instructed the New York Federal Reserve Bank to transfer $81 million from the central bank of Bangladesh to accounts in the Philippines. The hackers used fraudulent messages injected by malware disguised as a PDF reader.10

Source: Transactions
Description: The system detects deadlocks and aborts one of the transactions. (See Chapter 12.) Deadlock occurs when executing multiple simultaneous transactions.

Source: External factors
Description: Backups are especially important when a system suffers complete destruction from fire, earthquake, flood, or other natural disaster. In August 2015, lightning struck a local utility provider's power grid near Google's data centres in Belgium. Although backup power kicked in automatically, the interruption was long enough to cause permanent data loss in affected systems.

Depending on the type and extent of the failure, the recovery process ranges from a minor short-term inconvenience to a major long-term rebuild. Regardless of the extent of the required recovery process, recovery is not possible without a usable backup.

Database recovery generally follows a predictable scenario. First, the type and extent of the required recovery are determined. If the entire database needs to be recovered to a consistent state, the recovery uses the most recent backup copy of the database in a known consistent state. The backup copy is then rolled forward to restore all subsequent transactions by using the transaction log information. If the database needs to be recovered but the committed portion of the database is still usable, the recovery process uses the transaction log to undo all of the transactions that were not committed (see Chapter 12, Managing Transactions and Concurrency).

At the end of this phase, the database completes an iterative process of testing, evaluation and modification that continues until the system is certified as ready to enter the operational phase.
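The two recovery scenarios described above (rolling a backup forward, and undoing uncommitted work) can be illustrated with a toy model. The sketch below is not how any particular DBMS implements recovery; it assumes a database that is just a Python dictionary and a hypothetical LogRecord structure holding before and after values for each change.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LogRecord:
    txn: str        # transaction identifier
    key: str        # data item that was changed
    before: object  # value before the change (None if the item did not exist)
    after: object   # value after the change


def roll_forward(backup, log, committed):
    """Rebuild the database from a consistent backup by redoing committed changes."""
    db = dict(backup)  # start from the most recent consistent backup copy
    for rec in log:
        if rec.txn in committed:
            db[rec.key] = rec.after
    return db


def undo_uncommitted(db, log, committed):
    """Keep the usable database and reverse the changes of uncommitted transactions."""
    db = dict(db)
    for rec in reversed(log):  # undo in reverse order of execution
        if rec.txn not in committed:
            if rec.before is None:
                db.pop(rec.key, None)
            else:
                db[rec.key] = rec.before
    return db


backup = {"A": 100, "B": 50}
log = [
    LogRecord("T1", "A", 100, 80),
    LogRecord("T1", "B", 50, 70),
    LogRecord("T2", "A", 80, 0),    # T2 never committed
    LogRecord("T2", "C", None, 5),
]
committed = {"T1"}

# Scenario 1: the whole database is lost, so restore the backup and roll forward.
recovered = roll_forward(backup, log, committed)

# Scenario 2: the database is still usable but contains uncommitted changes.
damaged = {"A": 0, "B": 70, "C": 5}
repaired = undo_uncommitted(damaged, log, committed)
```

Both routes end in the same consistent state, which reflects T1 and contains nothing from T2. A real DBMS adds checkpoints, write-ahead rules and concurrency control, all of which are left out here.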


10.3.6 Operation

Once the database has passed the evaluation stage, it is considered to be operational. At this point, the database, its management, its users and its application programs constitute a complete information system.

The beginning of the operational phase invariably starts the process of system evolution. As soon as all of the targeted end users have entered the operations phase, problems that could not have been foreseen during the testing phase begin to surface. Some of the problems are serious enough to warrant emergency patchwork, while others are merely minor issues. For example, if the database design is implemented to interface with the Web, the sheer volume of transactions may cause even a well-designed system to bog down. In that case, the designers have to identify the source(s) of the bottleneck(s) and produce alternative solutions. These solutions may include using load-balancing software to distribute the transactions among multiple computers, increasing the available cache for the DBMS, and so on. In any case, the demand for change is the designer's constant, which leads to the next phase: maintenance and evolution.

10.3.7 Maintenance and evolution

The database administrator must be prepared to perform routine maintenance activities within the database. Some of the required periodic maintenance activities include:

Preventive maintenance (backup)

Corrective maintenance (recovery)

Adaptive maintenance (enhancing performance, adding entities and attributes and so on)

Assignment of access permissions and their maintenance for new and old users

Generation of database access statistics to improve the efficiency and usefulness of system audits and to monitor system performance

Periodic security audits based on the system-generated statistics

Periodic (monthly, quarterly, or yearly) system-usage summaries for internal billing or budgeting purposes

The likelihood of new information requirements and the demand for additional reports and new query formats require application changes and possible minor changes in the database components and contents. Those changes can be easily implemented only when the database design is flexible and when all documentation is updated and online. Eventually, even the best-designed database environment will no longer be capable of incorporating such evolutionary changes; then the whole DBLC process begins anew.

You should not be surprised to discover that many of the activities described in the Database Life Cycle (DBLC) remind you of those in the Systems Development Life Cycle (SDLC). After all, the SDLC represents the framework within which the DBLC activities take place. A summary of the parallel activities that take place within the SDLC and the DBLC is shown in Figure 10.9.


FIGURE 10.9 Parallel activities in the DBLC and the SDLC

[Figure: two parallel columns of phases. DBLC: Database initial study; Database design (Conceptual, Logical, Physical); Implementation and loading (Creation, Loading, Fine-tuning); Testing and evaluation; Operation; Database maintenance and evolution. SDLC: Analysis; Detailed system design (Screens, Reports, Procedures); Coding (Prototyping, Debugging); Testing and evaluation; System implementation; Application program maintenance.]

10.3.8 Determine Performance Measures

Physical design becomes more complex when data is distributed at different locations because the performance is affected by the communication media's throughput. Given such complexities, it is not surprising that designers favour database software that hides as many of the physical-level activities as possible. Despite the fact that relational models tend to hide the complexities of the computer's physical characteristics, the performance of relational databases is affected by physical storage properties. For example, performance can be affected by the characteristics of the storage media, such as seek time, sector and block (page) size, buffer pool size, and the number of disk platters and read/write heads. In addition, factors such as the creation of an index can have a considerable effect on the relational database's performance, that is, data access speed and efficiency.

In summary, physical design performance measurement deals with fine-tuning the DBMS and queries to ensure that they will meet end-user performance requirements.
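The effect of an index on data access can be observed by asking the DBMS how it plans to execute a query. The sketch below is only an illustration, assuming Python's built-in sqlite3 module and its EXPLAIN QUERY PLAN statement; the PRODUCT table, its columns and the index name are hypothetical. It captures the access plan for the same query before and after an index is created.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE product (p_code INTEGER PRIMARY KEY, p_descript TEXT, v_code INTEGER)"
)
conn.executemany(
    "INSERT INTO product (p_descript, v_code) VALUES (?, ?)",
    [(f"item {i}", i % 100) for i in range(10_000)],
)


def access_plan(conn, sql):
    """Return the descriptive text of each step in SQLite's query plan."""
    # The last column of every EXPLAIN QUERY PLAN row holds the description.
    return [row[-1] for row in conn.execute("EXPLAIN QUERY PLAN " + sql)]


query = "SELECT p_descript FROM product WHERE v_code = 42"

# Without an index on v_code the DBMS has to scan the whole table.
plan_without_index = access_plan(conn, query)

# After the index is created, the optimiser can search the index instead.
conn.execute("CREATE INDEX product_v_code_ndx ON product (v_code)")
plan_with_index = access_plan(conn, query)
```

Printing the two plans shows a table scan in the first case and a search that names the new index in the second. The exact wording of the plan text varies between SQLite versions, and other DBMSs expose their plans through different commands.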

10.4 Database Design Strategies

There are two classical approaches to database design:

Top-down design starts by identifying the data sets, then defines the data elements for each of those sets. This process involves the identification of different entity types and the definition of each entity's attributes.

Bottom-up design first identifies the data elements (items), then groups them together in data sets. In other words, it first defines attributes, then groups them to form entities.

The two approaches are illustrated in Figure 10.10. The selection of a primary emphasis on top-down or bottom-up procedures often depends on the scope of the problem or on personal preferences. Although the two methodologies are complementary rather than mutually exclusive, a primary emphasis on a bottom-up approach may be more productive for small databases with few entities, attributes, relations and transactions. For situations in which the number, variety and complexity of entities, relations and transactions is overwhelming, a primarily top-down approach may be more easily managed. Most companies have standards for systems development and database design already in place.

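FIGURE 10.10 Top-down vs bottom-up design sequencing

[Figure: a conceptual model at the top, entities in the middle and attributes at the bottom. Top-down design proceeds downwards from the conceptual model; bottom-up design proceeds upwards from the attributes.]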

NOTE

Even when a primarily top-down approach is selected, the normalisation process that revises existing table structures is (inevitably) a bottom-up technique. ER models constitute a top-down process even when the selection of attributes and entities can be described as bottom-up. Because both the ER model and normalisation techniques form the basis for most designs, the top-down vs bottom-up debate may be based on a distinction rather than a difference.

10.5 Centralised vs Decentralised Design

The two general approaches (bottom-up and top-down) to database design can be influenced by factors such as the scope and size of the system, the company's management style, and the company's structure (centralised or decentralised). Depending on such factors, the database design may be based on two very different design philosophies: centralised and decentralised.

Centralised design is productive when the data component is composed of a relatively small number of objects and procedures. The design can be carried out and represented in a fairly simple database. Centralised design is typical of relatively simple and/or small databases and can be successfully done by a single person (database administrator) or by a small, informal design team. The company operations and the scope of the problem are sufficiently limited to allow even a single designer to define the problem(s), create the conceptual design, verify the conceptual design with the user views, define system processes and data constraints to ensure the efficacy of the design, and ensure that the design will comply with all the requirements. Although centralised design is typical for small companies, do not make the mistake of assuming that centralised design is limited to small companies. Even large companies can operate within a relatively simple database environment. Figure 10.11 summarises the centralised design option. Note that a single conceptual design is completed and then validated in the centralised design approach.

FIGURE 10.11 Centralised design

[Figure: a single conceptual model passes through conceptual model verification against user views, system processes and data constraints, supported by the data dictionary.]

Decentralised design might be used when the data component of the system has a considerable number of entities and complex relations on which very complex operations are performed. Decentralised design is also likely to be employed when the problem itself is spread across several operational sites and each element is a subset of the entire data set. (See Figure 10.12.)

In large and complex projects, the database design typically cannot be done by only one person. Instead, a carefully selected team of database designers is employed to tackle a complex database project. Within the decentralised design framework, the database design task is divided into several modules. Once the design criteria have been established, the lead designer assigns design subsets or modules to design groups within the team.

As each design group focuses on modelling a subset of the system, the definition of boundaries and the interrelation among data subsets must be very precise. Each design group creates a conceptual data model corresponding to the subset being modelled. Each conceptual model is then verified individually against the user views, processes and constraints for each of the modules. After the verification process has been completed, all modules are integrated into one conceptual model. Because the data dictionary describes the characteristics of all objects within the conceptual data model, it plays a vital role in the integration process.

FIGURE 10.12 Decentralised design

[Figure: the data component is divided according to submodule criteria into Engineering, Purchasing and Manufacturing conceptual models. Each model is verified against its own views, processes and constraints. The verified models are then aggregated into one conceptual model, supported by the data dictionary.]

Naturally, after the subsets have been aggregated into a larger conceptual model, the lead designer must verify that the combined conceptual model is still able to support all of the required transactions. Keep in mind that the aggregation process (Figure 10.13) requires the designer to create a single model in which various aggregation problems must be addressed:

Synonyms and homonyms. Different departments might know the same object by different names (synonyms), or they might use the same name to address different objects (homonyms). The object can be an entity, an attribute or a relationship. An example of a synonym is where one department refers to "the client" while another refers to "the customer". An example of a homonym is if the IT department uses the term "the client" to refer to a computer, as in a client/server setup.

Entity and entity subtypes. An entity subtype might be viewed as a separate entity by one or more departments. The designer must integrate such subtypes into a higher-level entity.

Conflicting object definitions. Attributes can be recorded as different types (character, numeric), or different domains can be defined for the same attribute. Constraint definitions can also vary. The designer must remove such conflicts from the model.
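Part of the aggregation check described above can be mechanised once each design group has recorded its attributes in a data dictionary. The sketch below is only an illustration under stated assumptions: the dictionary layout, the find_conflicts function and the data types are hypothetical, while the PROFESSOR, PROF_ID and PROF_NUM names are borrowed from the chapter's own example. Synonyms cannot be discovered automatically, so the lead designer is assumed to supply a mapping from local names to agreed names.

```python
def find_conflicts(modules, synonyms):
    """Report attributes whose data types differ between departmental models.

    modules:  {department: {(entity, attribute): data_type}}
    synonyms: {local_name: agreed_name}, supplied by the lead designer
    """
    seen = {}  # agreed (entity, attribute) -> {data_type: [departments]}
    for dept, dictionary in modules.items():
        for (entity, attribute), data_type in dictionary.items():
            # Translate local names into the agreed names before comparing.
            key = (synonyms.get(entity, entity), synonyms.get(attribute, attribute))
            seen.setdefault(key, {}).setdefault(data_type, []).append(dept)
    # An attribute is in conflict when more than one data type was recorded for it.
    return {key: types for key, types in seen.items() if len(types) > 1}


modules = {
    "Payroll": {
        ("PROFESSOR", "PROF_ID"): "CHAR(9)",
        ("PROFESSOR", "PHONE"): "CHAR(8)",
    },
    "Systems": {
        ("PROFESSOR", "PROF_NUM"): "NUMBER(9)",
        ("PROFESSOR", "PHONE"): "CHAR(4)",
    },
}
synonyms = {"PROF_NUM": "PROF_ID"}  # both names identify the same professor

conflicts = find_conflicts(modules, synonyms)
```

Here both the key attribute and the phone attribute are reported, because the two departments recorded them with different types. Homonyms and entity subtypes still need the designer's judgement, since they depend on what the names mean rather than on how they are declared.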

FIGURE 10.13 Summary of aggregation problems

Synonyms: two departments use different names for the same entity. For entity X, Department A uses the label X and Department B uses the label Y.

Homonyms: two different entities are addressed by the same label. (Department B uses the label X to describe both entity X and entity Y.)

Entity and entity subclass: the entities X1 and X2 are subsets of entity X. Example: EMPLOYEE (entity X) holds the common attributes Name, Address and Phone. SECRETARY (entity X1, Department A) has the distinguishing attribute Typing speed. PILOT (entity X2, Department B) has the distinguishing attributes Hours flown, Licence and Classification.

Conflicting object definitions: attributes for the entity PROFESSOR.

Conflicting definitions   Payroll Dept.   Systems Dept.
Primary key:              PROF_ID         PROF_NUM
Phone attribute:          898-2853        2853

10.6 Database Administration

Data is an important and valuable resource within an organisation and requires a successful database administration strategy to be implemented. Data management is a complex job and has led to the development of the database administration function. The person responsible for the control of the centralised and shared database is the database administrator (DBA).

The size and role of the DBA function varies from company to company, as does its placement within a company's organisational structure. On the organisation chart, the DBA function might be defined as either a staff or line position. Placing the DBA function in a staff position often creates a consulting environment in which the DBA is able to devise the data administration strategy but does not have the authority to enforce it or to resolve possible conflicts. The DBA function in a line position has both the responsibility and the authority to plan, define, implement and enforce the policies, standards and procedures used in the data administration activity. The two possible DBA function placements are illustrated in Figure 10.14.


FIGURE 10.14 The placement of the DBA function

[Figure: two organisation charts. Line authority position: Information systems (IS) with Application development, Database operations and Database administration as line units. Staff consulting position: Database administration attached to Information systems (IS) in a staff role, with Application development and Database operations as line units.]

There is no standard for how the DBA function fits in an organisation's structure. In part, that is because the DBA function itself is probably the most dynamic of any organisation's functions. In fact, the fast-paced changes in DBMS technology dictate changing organisational styles. For example:

The development of distributed databases can force an organisation to decentralise the data administration function further. The distributed database requires the system DBA to define and delegate the responsibilities of each local DBA, thus imposing new and more complex coordinating activities on the system DBA.

The growing use of internet-ready and object-orientated databases and the growing number of data warehousing applications are likely to add to the DBA's data modelling and design activities, thus expanding and diversifying the DBA's job.

The increasing sophistication and power of desktop-based DBMS packages provide an easy platform for the development of user-friendly, cost-effective and efficient solutions to specific departmental information needs. But such an environment also invites data duplication, not to mention the problems created by people who lack the technical qualifications to produce good database designs. In short, the new desktop environment requires the DBA to develop a new set of technical and managerial skills.

Although no current standard exists, it is common practice to define the DBA function by dividing the DBA operations according to the DBLC phases. If that approach is used, the DBA function requires personnel to cover the following activities:

Database planning, including the definition of standards, procedures, and enforcement

Database requirements gathering and conceptual design

Database logical design and transaction design

Database physical design and implementation

Database testing and debugging

Database operations and maintenance, including installation, conversion, and migration

Database training and support

Figure 10.15 represents an appropriate DBA functional organisation according to that model.

FIGURE 10.15 A DBA functional organisation

[Figure: the DBA at the top, with functional units for Planning, Design (Conceptual, Logical, Physical), Implementation, Operations, Training and Testing.]

10

Keep in

mind that

different

operations.

support

top

a company

the

For example,

daily transactions

managements

in the each

different DBMS.

(SYSADM);

needs.

There

trend

charts

management

the

roles

two

manual.

Thus, the

DA is in

of the company

Copyright Editorial

review

2020 has

to

Cengage deemed

Learning. that

any

DAs job

charge

DBMS.

All

Rights

in the

does

May not

not

be

copied, affect

DBA

scanned, overall

or

some

middle

to and

DBMSs installed DBA assigned

systems

for

administrator

in

whole

For

example,

the

between

a DBA and the

manager

(irM),

of responsibility

and

usually

authority

than

the

extent.

data resources,

area of operations data,

expanded

structures

experience.

resource

degree

computerised the

function.

make a distinction

corporate

alarger

within

duplicated, learning

a higher

overall

covers

on the

the

as the

support DBMS

support

of desktop

management

as the information

to

the

only the

of the

materially

to

might have one

known

data

corporations

given

overlap

not

Depending

Reserved. content

to

description

placement

company.

and is

controlling

of controlling

The

suppressed

tend

for

company

to

with a hierarchical database

may also be a variety the

installed

in Figure 10.16. specialisation

The DA, also known

to top

DBMSs

corporations

a relational

of all DBAs is sometimes

towards

(DA).

and

an environment,

used by some of the larger

The DA is responsible

the

In such coordinator

and incompatible

to find

ad hoc information

a growing

although

uncommon level

departments.

directly

not

different

operational

administrator

DBA,

it is

several

at the

The general

organisation

reports

have

that position is illustrated

There is

data

might

but

Cengage

part.

Due Learning

to

than that also

the

organisational

components,

or in

both computerised

electronic reserves

the

rights, the

right

some to

third

party additional

content

might

may content

DBA because

outside

structure DBA

remove

of the

data

the

any

time

from

to the

suppressed at

scope

may vary report

be

and

from if

the

subsequent

DA,

the IRM, the IS manager or directly to the company's CEO. For simplicity and to avoid confusion, the label DBA is used here as a general title that encompasses all appropriate data administration functions.

FIGURE 10.16 Multiple database administrators in an organisation

[Figure: organisation chart. Recoverable box labels: Systems administrator, DBMS manager, Desktop DBA, and relational DBAs for DB2, Oracle, POSTGRES and SQL Server.]

You will now learn briefly about two distinct roles that a DBA must perform: the managerial role and the technical role. The DBA's managerial role is focused on personnel management and on interactions with the end-user community. The DBA's technical role involves the use of the DBMS (database design, development and implementation) as well as the production, development and use of application programs. A list of the desired skills is given in Table 10.2.

TABLE 10.2 Desired DBA skills

Managerial
Broad business understanding
Coordination skills
Analytical skills
Conflict resolution skills
Communications skills (oral and written)
Negotiation skills

Technical
Broad data-processing background
Systems development life cycle knowledge
Structured methodologies: data flow diagrams, structure charts, programming languages
Database life cycle knowledge
Database modelling and design skills: conceptual, logical, physical
Operational skills: database implementation, data dictionary management, security, and so on

Online Content: The database administration function is covered in much greater depth in Appendix K, Database Administration, which is available on the online platform for this book.

10.6.1 The Managerial Role of the DBA

As a manager, the DBA must concentrate on the control and planning dimensions of database administration. Therefore, the DBA is responsible for:

Coordinating, monitoring and allocating database administration resources: people and data.
Defining goals and formulating strategic plans for the database administration function.

More specifically, the DBA's responsibilities are shown in Table 10.3.

TABLE 10.3 DBA activities and services

DBA Activity
Planning
Organising
Testing
Monitoring
Delivering

DBA Service
End-user support
Policies, procedures and standards
Data security, privacy and integrity
Data backup and recovery
Data distribution and use

Table 10.3 illustrates that the DBA is generally responsible for planning, organising, testing, monitoring and delivering quite a few services. Those services might be performed by the DBA or, more likely, by the DBA's personnel. Let's examine the services in greater detail.

End-User Support

The DBA interacts with the end user by providing data and information support services to the organisation's departments. Because end users usually have dissimilar computer backgrounds, end-user support services include:

Gathering user requirements. The DBA must work within the end-user community to help gather the data required to identify and describe the end users' problems. The DBA's communications skills are very important at this stage because the DBA works closely with people who tend to have different computer backgrounds and communication styles. The gathering of user requirements requires the DBA to develop a precise understanding of the users' views and needs, and to identify present and future information needs.

Building end-user confidence. Finding adequate solutions to end users' problems increases end-user trust and confidence in the DBA function.

Resolving conflicts and problems. Finding solutions to end users' problems in one department might trigger conflicts with other departments. End users are typically concerned with their own specific data needs rather than with those of others, and they are not likely to consider how their data affect other departments within the organisation. When data/information conflicts arise, the DBA function has the authority and responsibility to resolve them.

Finding solutions to information needs. The ability and authority to resolve data conflicts enable the DBA to develop solutions that will properly fit within the existing data management framework. The DBA's primary objective is to provide solutions to the end users' information needs. Given the growing importance of the internet, those solutions are likely to require the development and management of


Web browsers to interface the

use

interfaces

quality

and integrity

Ensuring it

must

be properly

programmers data

access

internet that

end

and

access

database

interfaces

The

Standards

A prime

component

policies,

procedures

before

The

they

Policies

coordinates

and

of a successful and

Standards activity.

for

define,

to fire

both

application required

do

programs

must be given to the management For example,

a trigger,

that

for

transactions

application

transaction

One of the

trigger

must

DBMS features

if

an internal also

most time-consuming

properly.

understanding

correct

document

statements

The

of the

all activities

data and

DBA

functions

concerning

strategy

creation,

be fired

DBA

must ensure that and

use

end-user

is the

usage,

communicate

of direction

of the

all

DBMS

education.

the

continuous

distribution policies,

enforcement

and

deletion

procedures

or action that

communicate

of the within the

and

and support

are more detailed and specific than policies and describe the activity.

In

For example,

effect,

standards

standards

programmers

Procedures

are written instructions

performance

standards

support

that

are

used

to

of application

DBA goals.

minimum requirements

evaluate

the

programs

quality

and the

of the

naming

must use.

of a given activity.

must

are rules

define the structure

conventions

To illustrate

of the

attention the

with

procedures database

quality

environment.

use the database

monitors

the

has been found,

interface.

users.

a basic

work and

requires

sales.

solution

must

that

the

of e-commerce

product

be enforced:

DBA

and they

provide

data administration

standards

must

are general

of a given

10

have

sure

Special

is required

via the internet

database

make

database

of DBMS

DBA standards

Certifying

do not

and

and Procedures

DBA

can

transaction

end users how to

DBA

also

DBA function.

interfaces

and support

the

must

data quality.

is generated

is teaching

accessing

database.

DBA

the

database

DBMS-managed

database

Managing the training

Policies,

the

the

growth

queries

Once the right

Therefore,

them

The

the explosive product

and data.

used.

teach

those

in

when the transaction

software.

to

is a crucial

because found

application-based

users

and

users

manipulation.

the

In fact,

interactive

of applications

affect the databases

are typically

activities

databases.

to facilitate

implemented

and

not adversely that

with the

of dynamic

and

the distinctions

that describe a series of steps to be followed

Procedures

enhance

among

that

must be developed

within

existing

during the

working

conditions,

environment.

policies,

standards

and procedures,

look

at the following

examples:

Policies
All users must have passwords.
Passwords must be changed every six months.

Standards
A password must have a minimum of five characters.
A password may have a maximum of 12 characters.
ID numbers, names and birth dates cannot be used as passwords.
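The example password standards above are precise enough to be checked mechanically. The following is only a minimal sketch of such a check, written in Python. The `UserProfile` record, the `password_violations` function and the way names and dates are normalised before comparison are illustrative assumptions; they are not part of the text or of any particular DBMS.

```python
from dataclasses import dataclass, field
from typing import List

MIN_LENGTH = 5   # standard: a password must have a minimum of five characters
MAX_LENGTH = 12  # standard: a password may have a maximum of 12 characters


@dataclass
class UserProfile:
    """Hypothetical record of the personal details a password must not reuse."""
    id_number: str
    names: List[str] = field(default_factory=list)
    birth_date: str = ""  # e.g. "1990-04-23"; the date format is an assumption


def _normalise(text: str) -> str:
    """Lower-case the text and drop separators so '1990-04-23' equals '19900423'."""
    return "".join(ch for ch in text.lower() if ch.isalnum())


def password_violations(password: str, user: UserProfile) -> List[str]:
    """Return the standards a candidate password breaks (empty list = compliant)."""
    violations = []
    if len(password) < MIN_LENGTH:
        violations.append("shorter than five characters")
    if len(password) > MAX_LENGTH:
        violations.append("longer than 12 characters")

    # Collect the values the standard forbids: ID number, names and birth date.
    forbidden = {_normalise(user.id_number), _normalise(user.birth_date)}
    forbidden.update(_normalise(name) for name in user.names)
    forbidden.discard("")  # ignore details that were not supplied

    if _normalise(password) in forbidden:
        violations.append("matches an ID number, name or birth date")
    return violations
```

A DBA could run a check of this kind when an account is created or when a password is changed. The policy that passwords must be changed every six months would need a separate check against the date of the last change, which is not modelled here.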

Procedures
To create a password, (1) the end user sends to the DBA a written request for the creation of an account; (2) the DBA approves the request and forwards it to the computer operator; (3) the computer operator creates the account, assigns a temporary password and sends the account information to the end user; (4) a copy of the account information is sent to the DBA; and (5) the user changes the temporary password to a permanent one.

Standards and procedures defined by the DBA are used by all end users who want to benefit from the database. Standards and procedures must complement each other and must constitute an extension of data administration policies. Procedures must facilitate the work of end users and the DBA. The DBA must define, communicate and enforce procedures that cover areas such as:

End-user database requirements gathering. Which documentation is required? What forms must be used?

Database design and modelling. Which database design methodology to use (normalisation or object-oriented methodology)? Which tools to use (CASE tools, data dictionaries, or ER

diagrams)? Documentation elements, Design, for

coding

given

the

software

provides

work

with

and integrity. is

internet and

controlled

to

databases than

Database information


the

information


new

with

a

closely

standards

procedures

software that

by the

related

software

be

organisation, the

DBA

and

must

also

10

solutions.

governing

security defined

and integrity. and

of security

no system

strictly

scenarios

can ever be completely

standards.

security

from

other require

be clearly

multitude

the

and

environment,

threats

more traditional

work

any

might

needed

must

handle

define

standards.

connectivity

meet critical

to

must

and

policies

Although to

backup

The growing that

are far

internally

with internet

use of

more complex

generated

security

attacks launched

and

and

specialists

inadvertently

solutions

duplicated, learning

in experience.

management

operator

to

or attacks

of problems.

must

be clearly

whole

or in Cengage

part.

Due

Operational

be established

to

electronic reserves

backups.

must be clearly and

the

documented.

notes.

Such

procedures

notes must also

procedures.

specified.

Learning

must include

of the

instructions

and recovery

program must

procedures

daily operations

write

backup

or

and recovery

execution

must

and

training

the

to

must

those

DBA

internet

the

The DBMSs

training

copied,

the

standards

protected

and they causes

the

package

todays

define

door

proper

concerning

governing

Rights

must

Database

A full-featured

of all data

users.

guarantee

logs,

pinpointing

training.

DBA

and operation.

keep job

procedures

Cengage deemed

in

precise

End-user

review

must

helpful

include

and recovery.

maintenance

Operators

the

definition

standards

must enforce

Web-to-database

the

the

DBA DBA

has the features

In

encountered

are properly

to

that it

minimised.

by unauthorised

necessary

Database

and

databases

backup

DBMS

be designed

opens

those

use in

The The

For example,

Security

are

Therefore,

deliberately

of the

must be designed

manage

ensure that the

are

problems

interfaces.

launched

must

procedures

to

DBA

crucial.

procedures

interfaces difficult

DBA

proper

The

especially

security

security

programs.

and the

on investment. to find

security

ensure that

software,

return

to

and testing.

managed.

with existing

security

secure,

application

The selection

Database

documentation

database?

documentation

properly

Database

to

Copyright

be

a positive

Security

the

programmers,

Web administrators

enforced.

Editorial

of database

selection. must

Which

access

coding,

application

database

it

that

and testing

properly interfaced that

conventions.

programs

program

to the

Database

naming

and

application

are

to

and

sets

The

rights, the

within

right

some to

third remove

the

objective

party additional

content

organisation,

is to indicate


clearly

Design

who does

available

Procedures the

DBMS

and

can

The security,

to

the

Define

to ?

the

each

?

user

to

through

SQL

rights


of

company,

and

The

who

manage

information

sites,

thus

making

data configuration

provided

and

database

to limit

achieved

is

to,

In

by the

addition,

other

has

DBMS DBAs

security

a function

user

access

user to log

at two level,

levels:

the

on to the

user ID or employ

DBA

to

must

mechanisms

of authorisation

and guarantee access

database

management,

to the

database

at the

operating

can request

computer

the same

view

and likely

system.

system

the

creation

At the

DBMS

user ID to authorise

to

level of a level,

the

the

end user

Physical

a user

or a group

define

Reserved. does

May not

end users

operating

access

and

DBMS

according

to

the access

describes

to read-only,

less

common

privileges

needs

of individual

rights

the

of authorised

in relational

users to

access

or access type

The use

probable.

or the authorised

privileges

dates.

and to remind

access

privileges

privilege

system

expiration

periodically

unauthorised

managing

Access

security

can prevent

and facilities. include

must define

to

users.

specific

users

access.

access

mayinclude

databases

video,

user.

The

of one The

unauthorised common

entrances,

are

SQL

assigned

directly

security

practices

must

and

provide

the

and the

CREATE

found

workstations, biometric

and control the

more tables

command

physical

recognition

protect

DBMS or

users from

password-protected

voice

data views to

composed of users.

Some

secured

closed-circuit

are

both

commands.

an authorised

that

and

privileges.

at

with predetermined

user groups

assigns

REVOKE

DBA

be done

making

users into

DBA

badges,

The

thus

may be limited

and

access.

can

An access

rights

GRANT

of views

content

designed

system

of controlling

DELETE

accessible

Rights

multiple

section.

services

are not limited

DBA to screen

databases.

personnel

to

DBAs through

monitoring.

can be assigned

Classifying

installations

to

across

previous

proxy

but

This, too,

the

DBMS installation

databases

in the

to

multiple-site

mechanisms

data in the

usage is

periodically,

access

? View definition. are

concern productivity

defines procedures to protect

This is

user.

the

definition

of the

end

database

electronic

review

each

privileges.

physical

accessing

and integrity

include,

passwords

specified

WRITE

of great

security

operating

the

DBAs job

access

READ,

that

of the

DBMS.

the

access

Control

the introduction

reorganisation

greater

The

defined

a different

passwords

in large

Copyright

allows

create

are

way to

of data

DBMS

At the

dates enables

For example,

?

date and to ensure that

standards.

distribution

build

database.

change

facilitates

Editorial

to the

passwords

Assign

of the

procedures:

of expiration

to

the

and integrity.

This function

The database

their

in the

firewalls,

and

following

Define user groups.

?

Naturally,

violations,

the

to

procedures

DBMS level.

access the

Assign

environment.

work

security

privacy

control

user ID that

levels.

up to

database

pointed

management

Those

DBA can either

10

and extent

attacks.

and

management.

at the

logon

of the type

to keep them

and

policies

experts

possible

access

at least

and

the

data in the

use the

administration

Authorisation

User access

?

DBA

security

DBMS

includes

in

procedures

also resulted

security

and integrity.

definition,

annually

or integrity

has

data control,

the

data from

management. security

of the

of the

has

that

database

Protecting

changes

Technology

maintain

with internet

to safeguard

at least

of security

and integrity

Technology

the up

to

revision

installations.

made it imperative enforce

must be aware

Privacy and integrity

more difficult

team

quickly discovery

require

privacy

DBMS

management.

it

the

changes

Data Security,

Each end user

must be revised

adapt

software,

similar

current

when and how.

methodology.

and standards

organisation

new

what,

training

scope

tools

that

assignment

VIEW is

technology.

of the allow

data

the

of access

used

in relational

views.


?

DBMS

access

DBMSs

query

and

?

only

DBMS

control.

Database

and reporting

by authorised

usage

access

tools.

can be controlled

The

DBA

must

the

database Security

The DBA

Preserved:

description pinpoint

accesses

or just

can yield

Action

must also audit the

unnoticed

must

computer

viruses

The integrity For

example,

Whatever

the

procedures

reason,

are

the

can

ensure as

trails

all

of

to

external

but

not

problems,

destroy

database.

and the

beyond

or destruction

database access

by

data.

the

a fire makes

recovery

the

database

or alter

an explosion,

data

by unauthorised

disrupt

include

factors

by

corruption

does

breaches

designed

or corrupted: problems,

security

security are

exist

face

potentially

all database

loss

of the

is

DBAs

control.

or an earthquake.

backup

and recovery

security,

has lost

integrity,

backup

security

and recovery

includes

disaster

testing

measures and

tasks

all of the

or a database

of

database

must include

of the

In large

security

database

loss

part

might

database

the

integrity.

of

of the

1

mean that

is

insurance

is

that

of database

loss

entire

backup

ensure

physically

you can buy.

so critical that

many DBA

officer (DSO). The DSOs sole

shops,

the

DSOs

activities

are

often

management.

management

data

and integrity.

the

database

data

also

a physical A total

or that

cheapest

must

or loss

when

integrity.

lost

are the

Therefore,

DBA

data loss

be caused

entirely

procedures

losses.

The

of physical can

database

but its integrity

ruinous

installations.

in case

A partial

and recovery

database

and

recovery

in

part

of database

a physical

Periodic

when

to

backup

disaster

Disaster

recovery

audit

record

are produced

snooping

have created a position staffed by the database

classified

recovery

such

or destroyed

companies

critical

or total.

or

continues

job is to

organising

violations

of similar

because

of database

available,

partial

management

following

actions

can be fully recovered

be

any case,

The

but

Corrupting

be lost

Such to

DBA.

are

occurred

database

departments

most security

be damaged

possibility

not readily

has

lost. In

might

to any

database

database

repetition

avoid the repetition

whose

Several

accesses.

purposes,

by hackers

procedures

Data loss

properly

and recovery

recovery

data in the

used

database.

be tailored

security

and

the

563

which automatically

by all users.

can

of similar

information

might

trails

the

avoid

to

performed audit

preserved

state.

database

Process

use of the

are

data in the

is either

a consistent

a database

on the

tools

of an audit log,

whose integrity

to

of

crucial

When data

to

The

database

a database

for

the

Data Backup and

failed

Action is required

be recovered

those

use of the

operations

violations.

As a matter of fact,

access

Corrupted:

database

access

is required

may not be necessary. and

of the

DBA to

breaches

that

Development

personnel.

monitoring.

a brief

enable

Database

by placing limits

make sure

DBMS packages contain features that allow the creation records

10

automatic.

integrity

failure.

contingency

designed to secure

Disaster

plans

and

management

recovery

data availability

includes

procedures.

all planning,

The

backup

and

at least:

applications

data in the

DBA activities

backups.

Some

database.

The

Products

such

DBA

DBMSs

include

should

as IBMs

tools

use those

to

tools

DB2 allow the

ensure to

backup

render

creation

the

and backup

of different

and

backup

types: full, incremental and concurrent. A full backup, also known as a database dump, produces a complete copy of the entire database. An incremental backup produces a backup of all data changed since the last backup date; a concurrent backup takes place while the user is working on the database. Proper

backup

identification.

Backups must be clearly identified through detailed descriptions and date information, thus enabling the DBA to ensure that the correct backups are used to recover the database. While cloud-based backups are fast replacing tape backups, many organisations

still use tapes. must
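To make the difference between a full and an incremental backup more concrete, and to show what identifying a backup by description and date might look like, here is a small Python sketch. It is only an illustration under stated assumptions: the `BackupRecord` class, the `plan_backup` function and the table names are hypothetical, the sketch merely decides which items a run would copy, and it does not model a concurrent backup or any real DBMS backup utility.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Dict, List, Optional


@dataclass(frozen=True)
class BackupRecord:
    """Hypothetical identification record kept for every backup run."""
    kind: str            # "full" or "incremental"
    taken_at: datetime   # date information used later to pick the correct backup
    description: str     # detailed free-text description of the run
    items: List[str]     # names of the items copied in this run


def plan_backup(last_modified: Dict[str, datetime],
                now: datetime,
                last_backup: Optional[datetime] = None) -> BackupRecord:
    """Decide what a backup run copies and label the run.

    With no previous backup the run is full and copies every item;
    otherwise it is incremental and copies only items changed after last_backup.
    """
    if last_backup is None:
        kind = "full"
        items = sorted(last_modified)
    else:
        kind = "incremental"
        items = sorted(name for name, changed in last_modified.items()
                       if changed > last_backup)
    description = f"{kind} backup taken {now:%Y-%m-%d %H:%M}, {len(items)} item(s)"
    return BackupRecord(kind, now, description, items)


# Example: three tables with their last modification times (invented values).
changes = {
    "CUSTOMER": datetime(2024, 1, 10),
    "INVOICE": datetime(2024, 1, 20),
    "PRODUCT": datetime(2024, 1, 5),
}
full_run = plan_backup(changes, now=datetime(2024, 1, 21))
incremental_run = plan_backup(changes, now=datetime(2024, 1, 21),
                              last_backup=datetime(2024, 1, 15))
```

In this sketch the full run lists all three tables, while the incremental run lists only the table modified after the previous backup date. Keeping the kind, date and description with each run is one simple way to let the DBA confirm that the correct backups are used during recovery.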

As tapes

be done

and location. tapes

for

Storage (NAS) approach

and fast

restoration.

Convenient

and

outside

safe

the

and

access and

data in the financial

additional

to

but it is less

points

are

and

contingency

plans

and recovery it is

are

useful

are the

the

purpose

prepared

of

and

controls.

The

backups

to

the

of closed

might include computer

protection

sites

use to

also includes

installation.

Multilevel

tokens

be stored?

provide

the

air

provision

of

can

passwords

properly

and

identify

authorised

is

to

and

Current

to

when

than

the

or security

be thoroughly are

not likely

to

not to

officer

of a database

disaster

must

is

establish

they

reach

created

tested

and

must

failure.

by

secure

an

The insurance

massive

data loss.

evaluated,

be disparaged,

cover

priorities

and they

and they

all components

concerning

based

to

have

for the

require

must

top-level

the

of an information

system.

nature

and

extent

of the

data

accomplish

task

that

information

ensure that


being

copied, affect

right

time.

The

DBA is

responsible

at the right time

programming the

and in the right

very time-consuming,

to

especially

environment,

data in the

databases

database.

corporate

where Although

users,

makes it easy for authorised

is to facilitate ends.

and

scanned, the

the

overall

or

use

They

dependent

standards

not

opened

Web front

without

appropriate

access

at the

people,

their

The

when the

data

users

the use

for

format.

depend

internet has

on

and its

also

created

DBA.

philosophy

new internet

users

applications

programs

extensions

right

can become

on a typical

the

data distribution

and the

the

to the right

and use tasks

deliver

extranet

way to

any

DBA event

enforcement.

data are distributed

a new set of challenges

Learning.

The in the

drills

fire

program

distribution

capacity

that

defeats

each

and Use only

that the data

Cengage

place

Where

and

sites inside

process.

programmers

deemed

and

appropriate

Data Distribution ensuring

So-called

support

A backup

recovery

has

data,

of emergency.

database.

expensive making.

Therefore,

required

same

and temperature

of the

Physical

protection

worth

frequently.

managements

One

storage

must include

must be properly

Protection

challenge/response

for the

tools

software.

of a database

software

provide

intranet

same

well as preparation

control to the software

policy

delivery

Attached

use a layered

of the

in

(1)

use

disk-based

storage.

locations

questions:

use in case

coverage

DBAs

solutions

backups

well as humidity

protection.

Insurance

Data recovery

2020

as

insurance

Data

as

and

media for intermediate

storage the

not typically

on Network

archival

multiple The

currency

of resources.

be practised

review

and

and fire

and

be

for

backups

to two

hardware

DBMS for

hardware

disk

do

of tapes

of tape

be stored?

access,

and

backup

fast

track

optical

based

The storage locations

vaults,

respond to

power

computer

privileges

Two

Copyright

to

of both

may be expensive,

Editorial

quakeproof

backups

backup

Personal

different

a DBA

up to

location.

place.)

hire

and labelling

keep

include

storage

to tape

must

must

Enterprise

backed

There

a different

(Keeping

with restricted

a backup

10

storage. in

a policy

are

protection

conditioning,

users

data is transferred

backup

must establish

installations

the

to

solutions

online

(SAN).

are first

backups in the first

For how long

Physical

Networks

data

be stored

fire-safe

include

storage

DBA

enough

backup

the

organisation.

multiple

DBA

Later,

and the

are large

emerging

solutions

Area

which

must

may include

(2)

in

and copy

having

backup

it is vital that the

operators, that

Other

and Storage

backup

backup

organisations

Such

storage,

computer

backup.

devices.

physical

by the

However, enterprise

backup

require

diligently

duplicated,

the

generation

DBA to

on applications

procedures

learning

of a new

enable

in experience.

whole

end users to access the of

educate

more sophisticated end

programmers.

database.

users

to

Naturally,

query

produce

the

DBA

the must

are adhered to.


This distribution as

database

enabling to

end

more

philosophy

technology users

efficient

to

administration to

ensure

nature

function. the

Data

The

DBAs

technical

role

data

modelling,

the

DBAs technical

of the

DBMS

of the

application

Many

might

For

example,

and support. covered

The technical

Designing

Training

and

the

The following

DBAs

management the

DBA

That

plan

a computer

needs.

community,

of desired


the

details

the

programming

issues.

maintenance

implementation

For

example,

and upgrading and

maintenance

extension

of the

and integrity,

DBAs

backup

managerial

activities.

and recovery,

as a capsule

and

training

whose technical

core is

in the

following

and related

areas

of operation:

utilities

10

criteria

and supporting

on the

step

not

be

hardware

evaluating

and

organisations that

the

for

is

needs search

the

rather

is for

needs,

the

managers,

administration

and

than

must

is involved function

make

database

Therefore,

and

specific

problems

hardware.

software rather

and

than

and not a technological

plan is to

in the

can

to

tool

utilities

on

solutions

the

organisation.

DBMS,

acquisition

DBA

selecting

use in the

selecting

a DBMS is a management

of those

sure

that

process.

be clearly

for

toy.

determine

company

the

end-user

entire

Once

established

the

needs

and the

are

DBMS

can be defined. to the

materially

areas.

responsibilities

of the evaluation

data

That

May

operational

technical

a plan for

mid-level

capability

not

of those

DBMS and Utilities

Put simply,

of the

does

the

must recognise

top-and

Reserved. content

level

understand

and applications

applications

picture

features.

Rights

organisational

applications

primarily

objectives

DBMS

data

and applications

software

a clear

DBMS

DBMS

and

execute

DBA

and selection match

under

database.

are rooted

most important

most important

including the

features

Editorial

based The

To establish

identified,

To

and

and

operation,

might be conceptualised

the

and

or DBMS software.

The first

lead

produce

of the

configuration,

DBMS-related

development,

a logical

security

databases

and

utility

be

are

job

and installing

first

features.

can

DBAs job

completely

functions,

other

installation,

design,

with the

DBAs

will explore

develop

must

hardware

the selection,

database

utilities

system,

must

also

efficiency

at the

not

data

The

the

checks

of can

Clearly,

could inadvertently

function.

do

user.

users

DBMS,

Selecting

of the

data subsets

compromise

more common

end

use

their

565

shell.

utilities

sections

evaluating, One

and

databases

supporting

Maintaining

and

Process

data elements.

of DBMS

and installing

DBMS,

the

democracy

who

understanding

dual role

of the

and evaluating the

use of the

for

data

without

users

methodologies,

interact

and implementing

Operating

again

end

a broad

with

DBAs

selecting

Thus,

activities

managerial aspects

Evaluating,

might flourish

to

Development

will become

acquisition

data administration

complicated

as well as the

that

deals

Thus, the

the

Yet this

design

technical

DBA

and the

it

more flexible

micromanage

make improper

include

programs

by a clear

Testing

and

activities

the

in

that

Database

Role of the Dba

requires

DBAs

users

elements.

and utility software,

of the

end users

duplication

of data

is

process.

sufficiently

of data

10.6.2 the technical

languages,

those

and it is likely

self-sufficient

decision

become

uniqueness

and sources

an environment

Letting

between might

today,

Such

in the

side effects.

circumstances

common on.

relatively

of data

sever the connection those

become

use

some troublesome

is

marches

10

organisations

DBMS

copied, affect

scanned, the

overall

needs,

checklist

or

duplicated, learning

in experience.

should

whole

or in Cengage

part.

the

DBA

at least

Due Learning

to

electronic reserves

would

address

rights, the

right

some to

be

wise to

these

third remove

party additional

develop

a checklist

issues:


DBMS

Design

model. Are the

object/relational multidimensional DBMS

Security

Can the

audit

trail

Backup

and recovery.

offer?

DBMS

DBA

errors

How

many

capacity

are

Which

performance

tools

some

provided?

disk needed?

application

monitoring,

Does the

and

entity

to spot

integrity

errors

multiple

screen

DBMS

provide

rules,

access

and security

backup

rights,

violations?

and recovery

backups?

users?

coding

second

DBMS

is

does

offer

What levels

needed the

in the

DBMS

some

type

provide?

tools?

Does the

DBMS

of isolation

support?

of

DBA

Does the

(table,

application

page,

programs?

Are additional

management

DBMS

interface?

provide

alerts

to

the

occur? Can the

DBMS

or interoperability

to

automated

or network-based

DBA interface

distribution.

and standards. run

on

and from

without

DBMS

Hardware.

other

work

with

level is DBMS

other

DBMS

achieved?

packages?

types

in

Does the

the

same

DBMS support

Does the

DBMS support

systems

and platforms?

a

dictionary.

Vendor

training does

Available

with

and

third-party

Rights

data

Which

computers?

national

Can the

and industry

Can

DBMS

standards

Which

tool?

vendor

Is the

If

so,

Does the

what information

DBMS

offer in-house

DBMS

May

not materially

additional

in the

are required

not

dictionary?

dictionary

management

What is the

does

require?

a data

Does the

are involved

Reserved. content

desktop

support

training?

documentation

is any

Does

CASE tools?

What type

easy to read

kept in it?

and

and level

of

helpful?

What is

policy?

personnel

All

any

tools.

costs?

suppressed

operating

and

all platforms?

DBMS

have

provide?

access

What costs

additional

on different

computers

on

the

DBMS

support.

vendor

upgrade

data dictionary,

recurring

does

Does the

the

vendors

modification

hardware

interface

support

DBMS run

mid-range

follow?

Which

DBMS

Can the

mainframes,

run

does the

any

data

the

Which coexistence

applications

Learning.

or

architecture?

DBMS

that

storage

are supported?

referential

cloud,

per

violations

WRITE operations

Portability

Cengage

other

dictionary,

query

manual

Does the

does

or security and

client/server

deemed

tools.

of information

Cost.

4GLs

data

support

much

administration

What type

the

or

a relational

logs?

How

Database

the

should

size is required?

what

use of audit trails

disk,

transaction

Does the

DBMS

environment?

has

up the

needed?

Data

or

and

support

the

optical

many transactions

the

3GLs

DBMS provide

tape,

How

READ and

object-orientated,

is required,

modified?

processors

10

by a relational,

database

units

design,

DBMS

transaction

when

and

Are end-user

Does the

control. the

Interoperability

2020

be

support

back

Performance.

review

schema

available?

DBMS support

size

DBMS

Concurrency

Copyright

Which

Does the

Does the

automatically

Editorial

are

disk

many tape

support.

and integrity.

does

served

application

access?

and so on?

Does the

maximum How

(database

painters)

Web front-end

row)

What

be supported?

tools

menu

better

warehouse

be used?

development

development and

needs

a data

capacity.

must

Application

If

DBMS

storage

packages

companys

DBMS?

be

copied, affect

acquisition

and

expected

scanned, the

overall

or

tools

duplicated, learning

are

and control,

of the

what level payback

in experience.

whole

offered

by third-party

and storage software

of expertise

vendors

allocation and

hardware?

is required

(query

management How

of them?

tools,

tools)? many

What are the

period?


Pros and cons alternatives computer

devices,

the

The

both

computer

DBA

process

be familiar

cannot

mind that

Designing

and

department.

also

that

the

DBA

both

other to

be

data communication associated

with

costs.

For

example,

preparation

and

the

DBA

must

maintenance

of the

the

also

The transactions

require

any

All suppressed

are

Reserved. content

and The

network

such

of your

that

the

details

systems

be

services

to

include

and

enforce

within

design

standards

is in place, the the framework.

of the

design

is

database

both

dedicated

and

to

may

decision

and

database

organisational

support

modelling

ensure

determined

the

reviewing

quality

the

to

by the

production

systems.

activities.

and

covered

be assigned

10

and the

modelling

areas

at

DBMS-and

and hardware-independent,

on externally

programmers

the

Such

data-processing

framework

conceptual

personnel

design

community.

the

performed

during

to the

design

based

determine

are

support

people

data

end-user within

and procedures

or executive

the

The

DBA

That coordination priorities.

and integrity

database

of

database

application

design

to

does

events.

overload

the

compliant with

DBMS.

with integrity

broad

applications

database

and

the

the


conversion,

whole

or in Cengage

and

during

generation,

plan is a set of instructions

standards.

the

and

Due Learning

to

electronic reserves

skills.

of the physical

and

rights, right

some to

third remove

party additional

content

database.

including

migration

storage

at application

the

physical

design,

database

compilation

generated

part.

programming

implementation

oversight

data loading,

also include

and

design

requires

assistance

and creation, tasks

Rights

DBA is to

according

resources

do not

plan. An access

Learning.

Therefore,

to the

activities

and

and

mirror real-world

of the

determination

that

design

several

systems,

support

must provide

implementation

Cengage

data

components.

log files,

sections

group

DBMS-dependent

modelling

personnel

The implementation

access

and

be grouped

coordinate

The transactions

DBA

of such

the

installed;

are:

Efficient:

the

support

and transaction

configuration

services

of a

(Remember

that

might

of available

The transactions

Therefore,

levels.

with application

Correct:

activities

to

being

DBMS-dependent.

and

standards

assistance

requires

Such

are

design

activities

modelling

managerial

works

procedures

of backup

development

appropriate

database

and transactions.

Compliant:

and

application

hardware-dependent.)

to

designated components

and Applications

design is

and

startup

details

the logical

design jobs

hardware of the

the installation

primary

physical

reassignment

DBA

an

necessary

people

and

details.

DBMS-and

ensure that transactions

deemed

from

evaluations.

in the

and

modelling

database

example,

financial

may require

has

Available

use is likely

devices,

567

existing

support

DBMSs

Process

details.

data

usually

These

For

schedules

The

those

of the

and

function

activities.

Consult

Once the

the

logical

design is

systems,

2020

process.

and so on. The costs

preparation

as the location

storage

with

one

provides

conceptual,

application.

review

Development

organisations

it requires the

storage

in the

understanding

such

Databases

to be used.

must ensure

The

for

provides

Therefore,

DBA then

physical

Copyright

solution;

system,

of all software

configuration

book.

coordinated

hardware-independent,

Editorial

with the

example,

auxiliary

sites

configuration

and

in this

often

and procedures

DBAs

of the For

involved

a thorough

physical

and implementing are

space

the

during the selection

compatible

be included

expenditures

details

installation

guide

DBA function

services

These

consider

have

include

administration

design

part

processor must

with the installation,

be addressed

design

also

be

processor,

the installation

information

Keep in

the

must

must

procedures

configuration

DBA

programs.

a transaction

and recurring

strategy;

installation

The

is just

utility

components

must supervise

administration

DBMS

system,

must

DBMS

CPU, afront-end

software

one-time

a and

must be evaluated

software

Database

room installations.

The

The

available

and

selection

include

that

software

operating

solutions

because

Remember

by the

hardware

alternative

restricted

application

constrained

must

often

system.

hardware,

the

of several

are

10

storage

services.

of the

completion


The

applications

time that


predetermines how the application will access the database at run time. To be able to create and validate the access plan, the user must have the required rights to access the database.

Before an application comes online, the DBA must develop, test and implement the operational procedures required by the new system. Such operational procedures include utilising training, security, and backup and recovery plans, as well as assigning responsibility for database control and maintenance. Finally, the DBA must authorise application users to access the database from which the applications draw the required data.

The addition of a new database may require fine-tuning and/or reconfiguring of the DBMS. Remember that the DBMS assists all applications by managing the shared corporate data repository. Therefore, when data structures are added or modified, the DBA may require the assignment of additional resources to support the new and original users with equal efficiency.

Testing and Evaluating Databases and Applications

The DBA must also provide testing and evaluation services for all the database and end-user applications. These services are the logical extension of the design, development and implementation services described in the preceding section. Clearly, testing procedures and standards must already be in place before any application program can be approved for use in the company.

Although testing and evaluation services are closely related to database design and implementation services, they are usually maintained independently. The reason for the separation is that application programmers and designers are often too close to the problem being studied to detect errors and omissions.

Testing usually starts with the loading of the testbed database. That database contains test data for the applications, and its purpose is to check the data definition and integrity rules of the database and application programs.

The testing and evaluation of a database application cover all aspects of the system, from the simple collection and creation of data to its use and retirement. The evaluation process covers:

Technical aspects of both the applications and the database. Backup and recovery, security and integrity, use of SQL and application performance must be evaluated.
Evaluation of the written documentation to ensure that the documentation and procedures are accurate and easy to follow.
Observance of standards for naming, documenting and coding.
Data duplication conflicts with existing data.
The enforcement of all data validation rules.

Following the thorough testing of all applications, the database and the procedures, the system is declared operational and can be made available to end users.

Operating the DBMS, Utilities and Applications

DBMS operations can be divided into four main areas:

System support
Performance monitoring and tuning
Backup and recovery
Security auditing and monitoring


System

support

and its

applications.

verifying

the

activities

status

activities

for

Performance to

To carry

monitoring

Establish the

DBMSs

the

often

information.

the

performance

to

and

and

demonstrate

Since

sources.

programs

and time.

and

System-related

and

resource

These activities

satisfactory

DBA

performance

are

levels.

must:

objectives

are

being

met

objectives

allow

the

are not

DBA to

use

of indexes,

can

have

they

vendors,

facilities.

Most

bottlenecks.

query

tools,

by third-party

processor

met)

The

are

or they

of the

usage

available

from

may be included

performance-monitoring

most common

query-optimisation

database

bottlenecks

algorithms

in

and

DBMS

management

routines

database

Available

not

Such

much

most

a plan

give

the

user

an index

Managing

Database

is

trying

DBMS

guidelines

and

mode

choice

DBA should and

time

Typically,

command-line

13,

usually

is

and

DBMS

especially

within

to

educate

programmers examples

within

the

that

SQL Performance,

that

10

application

a query,

create indexes

of databases

of application

The amount

of primary

types

tuning.

The

DBMS

can improve

for

examples

pool

used

of

allowing access

determining by the

efficient

concurrency.

the

DBMS

the

desired level

and requested

operation

(See

few to

of the

Chapter

system,

12,

Managing

subject)

primary

and

allocation

secondary

of storage can

memory,

resources

be used

to

is

must also

determined

be

when

determine:

concurrently

or users supported

memory (buffer

thereby

concurrent

for

to the

parameters

may be opened

programs

of locks

on that

of both

package,

parameters

influence

configuration

that

The number

that

DBMS to improving

is important

more information

performance Storage

by the issue

in terms

the

orientated

DBA specify

affected

factors

for

into

are

concurrency

resources,

DBMS

integrated

let the

also

with the

configured.

number

spend

performance

both in the

routines

Concurrency,

during

useful

the

are

the

be familiar

storage

is

performance,

plan.

use of SQL statements.

Therefore,

packages

Because

and

usage

to

user.

Chapter

Concurrency

must

on system

tuning.)

applications.

Transactions

and

DBA is likely

contain

do

the

Query-optimisation

of concurrency.

considered

proper

systems

(See

Several

DBA

on the

for

effect

creation

of SQL statements,

relational

options.

index

the

manuals

selection

a negative

environment.

performance,

use

performance

database.

defined

database

Query-optimisation

The

attention

performance-monitoring

system

selection

end users

performance.

database

DBMS

the

that

are provided

to the

a carefully

proper

makes the index

the

power

DBMS

applications.

(if performance

tools

not include utilities

administration the

programs.

the

to

569

solutions

or transaction

satisfactory

programmers

by the

checking

special

maintain

performance

solutions

on selected

index

a relational

produce

tuning

the

alternative

are related

adhere in

system

whether

does

utilities

improper

installations

manuals

of the

tape

Process

resources.

Because

important

Development

operations

changing

emergency

DBAs

tasks

Database

goals

DBMS

DBA to focus tuning

of storage

To

DBMS

to

as running

applications

and tuning

performance

sources.

and

such

much of the and

day-to-day

logs

of database

performance-monitoring

system

allow

packages

versions

require

to the

out job

tasks

utilities

evaluate

selected

If the

operating

tools

DBMS,

and find

include

many different in

to

problem

Implement

and tuning

the

filling

disk

occasional

performance

DBMS

the

hardware,

upgraded

related

from

performance-monitoring

DBMS

Monitor

Isolate

that

directly

range

periodic,

new and/or

ensure

out the

all tasks

activities

of computer

include

configurations

designed

cover

These

10

concurrently

size) assigned

to

each database

and each

database

process


The size and location of the log files (remember that these files are used to recover the database. The log files can be located in a separate volume to reduce the disk's head movement and to increase performance)

Performance-monitoring issues are DBMS-specific. Therefore, the DBA must become familiar with the DBMS manuals to learn the technical details involved in this task.

Since data loss is likely to be devastating to the organisation, backup and recovery activities are of primary concern during the DBMS operation. The DBA must establish a schedule for backing up database and log files at appropriate intervals. Backup frequency is dependent on the application type and on the relative importance of the data. All critical system components (the database, the database applications and the transaction logs) must be backed up periodically.

Most DBMS packages include utilities that schedule automated database backups, be they full or incremental. Although incremental backups are faster than full backups, an incremental backup requires the existence of a periodic full backup to be useful for recovery purposes.

Database recovery after a media or systems failure requires application of the transaction log to the correct database copy. The DBA must plan, implement, test and enforce a bulletproof backup and recovery procedure.

Security auditing and monitoring assumes the appropriate assignment of access rights and the proper use of access privileges by programmers and end users. The technical aspects of security auditing and monitoring involve creating users, assigning access rights, using SQL commands to grant and revoke access rights to users and database objects, and creating audit trails to discover security violations or attempted violations. The DBA must periodically generate an audit trail report to determine whether there have been actual or attempted security violations and, if so, from which locations and, if possible, by whom.

Training and Supporting Users

Training people to use the DBMS and its tools is included in the DBA's technical activities. In addition, the DBA provides or secures technical training in the use of the DBMS and its utilities for the applications programmers. Application programmer training covers the use of the DBMS tools as well as the procedures and standards required for database programming.

Unscheduled, on-demand technical support for end users and programmers is also included in the DBA's activities. A technical troubleshooting procedure can be developed to facilitate such support. The technical procedure might include the development of a technical database used to find solutions to common technical problems.

Part of the support is provided by interaction with DBMS vendors. Establishing good relationships with software suppliers is one way to ensure that the company has a good external support source. Vendors are the source for up-to-date information concerning new products and personnel retraining. Good vendor-company relations are also likely to give organisations an edge in determining the future direction of database development.

Maintaining the DBMS, Utilities and Applications

The maintenance activities of the DBA are an extension of the operational activities. Maintenance activities are dedicated to the preservation of the DBMS environment.

Periodic DBMS maintenance includes management of the physical or secondary storage devices. One of the most common maintenance activities is reorganising the physical location of data in the database. (This is usually done as part of the DBMS fine-tuning activities.) The reorganisation of a database might be designed to allocate contiguous disk-page locations to the DBMS to increase performance. The reorganisation process also might free the space allocated to deleted data, thus providing more disk space for new data.
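The point that reorganisation can free the space allocated to deleted data can be tried out on a small scale. The following is a minimal sketch only, assuming SQLite (through Python's standard `sqlite3` module) as a stand-in DBMS and its `VACUUM` command as the reorganisation utility; the text does not name a product, and the function name `reorganise_demo`, the table `item` and the row counts are hypothetical.

```python
import os
import sqlite3


def reorganise_demo(db_path):
    """Load rows, delete most of them, then rebuild the database file.

    Returns the file size in bytes after loading, after the delete,
    and after the rebuild (VACUUM).
    """
    conn = sqlite3.connect(db_path)
    conn.execute("CREATE TABLE item (id INTEGER PRIMARY KEY, payload TEXT)")
    conn.executemany(
        "INSERT INTO item (payload) VALUES (?)",
        [("x" * 1000,) for _ in range(2000)],
    )
    conn.commit()
    size_loaded = os.path.getsize(db_path)

    # Deleting rows marks their pages as free inside the file;
    # it does not by itself return the space to the operating system.
    conn.execute("DELETE FROM item WHERE id > 100")
    conn.commit()
    size_after_delete = os.path.getsize(db_path)

    # VACUUM rewrites the database file compactly, releasing the free pages.
    conn.execute("VACUUM")
    conn.close()
    size_after_vacuum = os.path.getsize(db_path)

    return size_loaded, size_after_delete, size_after_vacuum
```

Calling `reorganise_demo` with a path to a new file in a scratch directory returns three sizes; the third should be clearly smaller than the second. Server DBMSs offer their own reorganisation utilities with different names and options, so the DBMS administration manual remains the reference for a production system.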


Maintenance activities also include upgrading the DBMS and utility software. The upgrade might require the installation of a new version of the DBMS software or an internet front-end tool. Or it might create an additional DBMS gateway to allow access to a host DBMS running on a different host computer. DBMS gateway services are common in distributed DBMS applications running in a client/server environment. Also, new-generation databases include features such as spatial data support, data warehousing and star query support, and support for Java programming interfaces for internet access.

Quite often companies are faced with the need to exchange data in dissimilar formats or between databases. The maintenance efforts of the DBA include migration and conversion services for data in incompatible formats or for different DBMS software. Such conditions are common when the system is upgraded from one version to another or when the existing DBMS is replaced by an entirely new DBMS. Database conversion services also include downloading data from the host DBMS (mainframe-based) to an end user's desktop computer to allow that user to perform a variety of activities: spreadsheet analysis, charting, statistical modelling, and so on. Migration and conversion services can be done at the logical level (DBMS- or software-specific) or at the physical level (storage-media- or operating-system-specific).
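A conversion service of the kind described above, downloading data from a DBMS into a format an end user can open in a spreadsheet, might look like the following. This is a minimal sketch, assuming an in-memory SQLite database as the source and CSV as the target format; the helper `export_query_to_csv`, the `sales` table and its rows are hypothetical and stand in for whatever the host DBMS actually holds.

```python
import csv
import io
import sqlite3


def export_query_to_csv(conn, query, out):
    """Write the result of `query` to the text stream `out` as CSV."""
    cur = conn.execute(query)
    writer = csv.writer(out)
    # cursor.description holds one entry per result column; item 0 is its name.
    writer.writerow([col[0] for col in cur.description])
    writer.writerows(cur)


def demo():
    """Build a tiny sample table and return its CSV export as a string."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE sales (region TEXT, amount REAL)")
    conn.executemany(
        "INSERT INTO sales VALUES (?, ?)",
        [("North", 120.5), ("South", 80.0)],
    )
    buf = io.StringIO()
    export_query_to_csv(
        conn, "SELECT region, amount FROM sales ORDER BY region", buf
    )
    conn.close()
    return buf.getvalue()
```

This is a logical-level conversion in the sense used above: it depends on the query interface of the DBMS, not on how the data is laid out on the storage media.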

10.6.3 Developing a Data Administration Strategy

For a company to succeed, its activities must be committed to its main objectives or mission. Therefore, regardless of a company's size, a critical step for any organisation is to ensure that its information system supports its strategic plans for each of its business areas.

The database administration strategy must not conflict with the information systems plans. After all, the information systems plans are derived from a detailed analysis of the company's goals, its condition or situation, and its business needs. Several methodologies are available to ensure the compatibility of data administration and information systems plans and to guide the strategic plan development. The most commonly used methodology is known as information engineering.

Information engineering (IE) allows for the translation of the company's strategic goals into the data and applications that will help the company achieve those goals. IE focuses on the description of the corporate data instead of the processes. The IE rationale is simple: business data types tend to remain fairly stable and do not change much during their existence. In contrast, processes change often and thus require the frequent modification of existing systems. By placing the emphasis on data, IE helps decrease the impact on systems when processes change.

The output of the IE process is an information systems architecture (ISA) that serves as the basis for planning, development and control of future information systems. Figure 10.17 shows the forces that affect ISA development.


FIGURE 10.17 Forces affecting the development of the ISA

(Figure labels: Company managers; Company mission; Goals; Critical success factors; Strategic plan; Information engineering; Information systems architecture)

Implementing IE methodologies in an organisation is a costly process that involves planning, commitment of resources, management liability, well-defined objectives, identification of critical factors, and control. An ISA provides a framework that includes the use of computerised, automated and integrated tools such as a DBMS and CASE tools.

The success of the overall information systems strategy and, therefore, of the data administration strategy depends on several critical success factors. Understanding the critical success factors helps the DBA develop a successful corporate data administration strategy. Critical success factors include managerial, technological and corporate culture issues, such as:

Management commitment. Top-level management commitment is necessary to enforce the use of standards, procedures, planning and controls. The example must be set at the top.
Thorough company situation analysis. The current situation of the corporate data administration must be analysed to understand the company's position and to have a clear vision of what must be done. For example, how are database analysis, design, documentation, implementation, standards, codification, and other issues handled? Needs and problems should be identified first, then prioritised.
End-user involvement. End-user involvement is another aspect critical to the success of the data administration strategy. What is the degree of organisational change involved? Successful organisational change requires that people be able to adapt to the change. Users should be given an open communication channel to upper-level management to ensure success of the implementation. Good communication channels are key to the overall process.
Defined standards. Analysts and programmers must be familiar with appropriate methodologies, procedures and standards. If analysts and programmers lack familiarity, they may need to be trained in the use of the procedures and standards.


Training. The vendor must train the DBA personnel in the use of the DBMS and other tools. End users must be trained to use the tools, standards and procedures to obtain and demonstrate the maximum benefit, thereby increasing end-user confidence. Key personnel should be trained first, so they can train others later.
A small pilot project. A small project is recommended to ensure that the DBMS will work in the company, that the output is what was expected, and that the personnel have been trained properly.

This list of factors is not and cannot be comprehensive. Nevertheless, it does provide the initial framework for the development of a successful strategy. However, no matter how comprehensive the list of success factors is, it must be based on the notion that development and implementation of a successful data administration strategy are tightly integrated with the overall information systems planning activity of the organisation.

SUMMARY

An information system is designed to facilitate the transformation of data into information and to manage both data and information. Thus, the database is a very important part of the information system. Systems analysis is the process that establishes the need for, and the extent of, an information system. Systems development is the process of creating an information system.

The Systems Development Life Cycle (SDLC) traces the history (life cycle) of an application within the information system. The SDLC can be divided into five phases: planning, analysis, detailed systems design, implementation and maintenance. The SDLC is an iterative rather than a sequential process.

The Database Life Cycle (DBLC) describes the history of the database within the information system. The DBLC is composed of six phases: database initial study, database design, implementation and loading, testing and evaluation, operation, and maintenance and evolution. Like the SDLC, the DBLC is iterative rather than sequential.

The conceptual portion of the design may be subject to several variations, based on two basic design philosophies: bottom-up vs top-down and centralised vs decentralised.

Threats to database security include the loss of integrity, confidentiality and availability of data. An organisation should develop a comprehensive data security plan, with security measures, to protect its data.

The database administrator (DBA) is responsible for managing the corporate database. The internal organisation of the database administration function varies from company to company. Although no standard exists, it is common practice to divide DBA operations according to the Database Life Cycle phases. Some companies have created a position with a broader data management mandate to manage computerised and other data within the organisation. This broader data management activity is handled by the data administrator (DA).

The DA and the DBA functions tend to overlap. Generally speaking, the DA is more managerially orientated than the more technically orientated DBA. Compared to the DBA function, the DA function is DBMS-independent, with a broader and longer-term focus. However, when the organisation chart does not include a DA position, the DBA executes all of the DA's functions. Because the DBA has both technical and managerial responsibilities, the DBA must have a diverse mix of skills.

The managerial services of the DBA function include at least:

Supporting the end-user community
Defining and enforcing policies, procedures and standards for the database function
Ensuring data security, privacy and integrity
Providing data backup and recovery services
Monitoring the distribution and use of the data in the database

The technical role requires the DBA to be involved in at least these activities:

Evaluating, selecting and installing the DBMS and utilities
Designing and implementing databases and applications
Testing and evaluating databases and applications
Operating the DBMS, utilities and applications
Training and supporting users
Maintaining the DBMS, utilities and applications

The development of the data administration strategy is closely related to the company's mission and objectives. Therefore, the development of an organisation's strategic plan corresponds to that of data administration, requiring a detailed analysis of company goals, situation and business needs. To guide the development of this overall plan, an integrating methodology is required. The most commonly used integrating methodology is known as information engineering (IE).

KEY TERMS

access plan
audit log
audit trail
authorisation management
bottom-up design
boundaries
centralised design
computer-aided systems engineering (CASE)
conceptual design
concurrent backup
data administrator (DA)
data encryption
database administrator (DBA)
database development
Database Life Cycle (DBLC)
database security officer (DSO)
decentralised design
disaster management
full backup
incremental backup
information engineering (IE)
information resource manager (IRM)
information system
information systems architecture (ISA)
logical design
minimal data rule
physical design
physical security
policies
procedures
scope
standards
systems administrator (SYSADM)
systems analysis
systems development
Systems Development Life Cycle (SDLC)
top-down design
user authentication
virtualisation

FURTHER READING

Bertino, E. and Sandhu, R. 'Database Security: Concepts, Approaches, and Challenges', IEEE Transactions on Dependable and Secure Computing, 2(1), 2005.
Du, W. Computer Security: A Hands-on Approach. CreateSpace Independent Publishing Platform, 2017.

Online Content
Answers to selected Review Questions and Problems for this chapter are contained on the online platform accompanying this book.

REVIEW QUESTIONS

1 What is an information system? What is its purpose?
2 How do systems analysis and systems development fit into a discussion about information systems?
3 What does the acronym SDLC mean, and what does an SDLC portray?
4 What does the acronym DBLC mean, and what does a DBLC portray?
5 Discuss the distinction between centralised and decentralised conceptual database design.
6 What is the minimal data rule in conceptual design? Why is it important?
7 Discuss the distinction between top-down and bottom-up approaches in database design.
8 What is the data dictionary's function in database design?
9 Which factors are important in a DBMS software selection?
10 Describe the DBA's responsibilities.
11 Describe and characterise the skills desired for a DBA.
12 What are the DBA's managerial roles? Describe the managerial activities and services provided by the DBA.
13 Which DBA activities are used to support the end-user community?
14 Explain the DBA's managerial role in the definition and enforcement of policies, procedures and standards.
15 Protecting data security, privacy and integrity are important database functions in authorisation management. Which activities are required in the DBA's managerial role of enforcing those functions?
16 Discuss the importance and characteristics of database backup and recovery procedures. Then describe the actions that must be detailed in backup and recovery plans.
17 Assume that your company assigned you the responsibility of selecting the corporate DBMS. Develop a checklist for the technical and other aspects involved in the selection process.
18 Describe the activities that are typically associated with the design and implementation services of the DBA technical function. Which technical skills are desirable in the DBA's personnel?
19 Briefly explain the concepts of information engineering (IE) and information systems architecture (ISA). How do those concepts affect the data administration strategy?
20 Identify and explain some of the critical success factors in the development and implementation of a successful data administration strategy.
21 What are the main categories of threats faced by an organisation in trying to protect its data?
22 What is data encryption and why is it important to data security?

PROBLEMS

1 Thabo's Car Service & Repair Centres are owned by Nationwide Car Dealers; Thabo's services and repairs only cars from Nationwide Car Dealers. Three Thabo's Car Service & Repair Centres provide service and repair for the entire province. Each of the three centres is independently managed and operated by a shop manager, a receptionist and at least eight mechanics. Each centre maintains a fully stocked parts inventory.
Each centre also maintains a manual file system in which each car's maintenance history is kept: repairs made, parts used, costs, service dates, owner, and so on. Files are also kept to track inventory, purchasing, billing, employees' hours and payroll.
You have been contacted by the manager of one of the centres to design and implement a computerised system. Given the preceding information, do the following:

a Indicate the most appropriate sequence of activities by labelling each of the following steps in the correct order. (For example, if you think that 'Load the database' is the appropriate first step, label it '1'.)
______________ Normalise the conceptual model.
______________ Obtain a general description of company operations.
______________ Load the database.
______________ Create a description of each system process.
______________ Test the system.
______________ Draw a data flow diagram and system flowcharts.
______________ Create a conceptual model, using ER diagrams.
______________ Create the application programs.
______________ Interview the mechanics.
______________ Create the file (table) structures.
______________ Interview the shop manager.

b Describe the different modules that you believe the system should include.

c How will a data dictionary help you develop the system? Give examples.

d Which general (system) recommendations might you make to the shop manager? For example, if the system will be integrated, which modules will be integrated? Which benefits would be derived from such an integrated system? Include several general recommendations.

e What is the best approach to conceptual database design? Why?

f Name and describe at least four reports the system should have. Explain their use. Who will use those reports?

2 Suppose you have been asked to create an information system for a manufacturing plant that produces nuts and bolts of many shapes, sizes and functions. Which questions would you ask, and how would the answers to those questions affect the database design?

3 What do you envision the SDLC to be?

4 What do you envision the DBLC to be?

5 Suppose you perform the same functions noted in Problem 2 for a larger warehousing operation. How are the two sets of procedures similar? How and why are they different?

6 Using the same procedures and concepts employed in Problem 1, how would you create an information system for the Tiny University example in Chapter 5?

7 You have been assigned to design the database for a new soccer club. Indicate the most appropriate sequence of activities by labelling each of the following steps in the correct order. (For example, if you think that 'Load the database' is the appropriate first step, label it '1'.)
______________ Create the application programs.
______________ Create a description of each system process.
______________ Test the system.
______________ Load the database.
______________ Normalise the conceptual model.
______________ Interview the soccer club president.
______________ Create a conceptual model, using ER diagrams.
______________ Interview the soccer club director of coaching.
______________ Create the file (table) structures.
______________ Obtain a general description of the soccer club operations.
______________ Draw a data flow diagram and system flowchart.

CHAPTER 11
Conceptual, Logical, and Physical Database Design

IN THIS CHAPTER, YOU WILL LEARN:

About the three stages of database design: conceptual, logical, and physical
How to design a conceptual model to represent the business and its key functional areas
How the conceptual model can be transformed into a logically equivalent set of relations
How to translate the logical data model into a set of specific DBMS table specifications
About different types of file organisation
How indexes can be applied to improve data access and retrieval
How to estimate data storage requirements

PREVIEW

In Chapter 10, you learnt about the Database Life Cycle. The most critical phase of the cycle was that of database design. It is essential that the user requirements that have been captured in the database initial study are used to build a data model that accurately reflects the needs and characteristics of the actual business. Such is the importance of database design; it is broken down into three distinct stages:

Conceptual database design, where we create the conceptual representation of the database by producing a data model that identifies the relevant entities and relationships within our system.
Logical database design, where we design relations based on each entity and define integrity rules to ensure there are no redundant relationships within our database.
Physical database design, where the physical database is implemented in the target DBMS. In this stage, we have to consider how each relation is stored and how the data is accessed.

Figure 11.1 shows the procedural flow of these stages and the steps within each that need to be taken.

FIGURE 11.1 The three stages of database design

Conceptual design
    Data analysis and requirements
    Entity relationship modelling and normalisation
    Data model verification
    Distributed database design

Logical design
    Creating the logical data model
    Validating the logical data model using normalisation
    Assigning and validating integrity constraints
    Merging logical models constructed for different parts of the database
    Reviewing the logical data model with the user

Physical design
    Translate each relation identified in the logical data model into tables
    Determine a suitable file organisation
    Define indexes
    Define user views
    Estimate data storage requirements
    Determine database security for users

These three stages of database design are not totally intuitive and obvious. There is no single quick or automated method for tackling each stage. A well-designed database takes a considerable amount of time and effort to envisage, build and refine. It cannot be stressed enough that, if you take the time to design your databases properly, the design will provide a solid foundation on which to build a complete system.

One of E.F. Codd's requirements when designing a relational database management system was that the design should maintain logical and physical data independence. The separation of these two stages is very important. Logical design is concerned with what the database looks like to the user. Physical design is concerned with how the logical design maps to the physical storage of the database in secondary storage. Codd's rules on relational database design stated that:

if the logical structure of the database should change, then the way the user views the database should not change (logical data independence)
if the physical methods (hardware, storage, etc.) of storing and retrieving data change, then the user interface should not be affected in any way (physical data independence).

In this chapter, you will learn about the steps required to complete conceptual, logical, and physical database design using a number of examples.
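Logical data independence can be illustrated with a small sketch. The example is not taken from the text: it assumes Python's built-in sqlite3 module, and the table, column and view names (CUSTOMER, CUSTOMER_V2, CUSTOMER_VIEW) are hypothetical. The application reads through a view, so the base table can be restructured while the query issued on behalf of the user stays the same.

```python
import sqlite3

# Hypothetical schema for illustration only; all names are assumptions.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE CUSTOMER (CUS_ID INTEGER PRIMARY KEY, CUS_NAME TEXT)")
conn.execute("INSERT INTO CUSTOMER VALUES (1, 'Thabo Nkosi')")

# The user's view of the data: applications query the view, not the base table.
conn.execute("CREATE VIEW CUSTOMER_VIEW AS SELECT CUS_ID, CUS_NAME FROM CUSTOMER")

USER_QUERY = "SELECT CUS_NAME FROM CUSTOMER_VIEW WHERE CUS_ID = 1"
before = conn.execute(USER_QUERY).fetchall()

# Logical change: the name is split into two columns in a new base table.
conn.execute(
    "CREATE TABLE CUSTOMER_V2 "
    "(CUS_ID INTEGER PRIMARY KEY, CUS_FNAME TEXT, CUS_LNAME TEXT)"
)
conn.execute("INSERT INTO CUSTOMER_V2 VALUES (1, 'Thabo', 'Nkosi')")
conn.execute("DROP VIEW CUSTOMER_VIEW")
conn.execute("DROP TABLE CUSTOMER")

# The view is redefined over the new structure; the user query is untouched.
conn.execute(
    "CREATE VIEW CUSTOMER_VIEW AS "
    "SELECT CUS_ID, CUS_FNAME || ' ' || CUS_LNAME AS CUS_NAME FROM CUSTOMER_V2"
)
after = conn.execute(USER_QUERY).fetchall()
```

Here `before` and `after` hold the same rows even though the underlying table changed. A view is only one possible mechanism for this kind of insulation; the sketch is meant to make the idea concrete, not to prescribe a technique.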

11.1 CONCEPTUAL DESIGN

In the conceptual design stage, data modelling is used to create an abstract database structure that represents real-world objects in the most realistic way possible. The conceptual model must embody a clear understanding of the business and its functional areas. At this level of abstraction, the type of hardware and/or database model to be used might not yet have been identified. Therefore, the design must be software and hardware independent so the system can be set up within any hardware and software platform chosen later.

Keep in mind the following minimal data rule: All that is needed is there, and all that is there is needed.

In other words, make sure that all data needed are in the model and that all data in the model are needed. All data elements required by the database transactions must be defined in the model, and all data elements defined in the model must be used by at least one database transaction.
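The minimal data rule can be checked mechanically once the model's data elements and the transactions' requirements are written down. The following is a minimal sketch in Python, not a method from the text; the function name `check_minimal_data_rule` and the element and transaction names are hypothetical.

```python
def check_minimal_data_rule(model_elements, transactions):
    """Return (missing, unused) data elements for a model.

    model_elements: set of data element names defined in the model.
    transactions: dict mapping a transaction name to the set of data
                  elements that the transaction requires.
    """
    required = set().union(*transactions.values()) if transactions else set()
    missing = required - set(model_elements)  # needed, but not in the model
    unused = set(model_elements) - required   # in the model, but never used
    return missing, unused


# Hypothetical elements and transactions, for illustration only.
model = {"CUS_ID", "CUS_NAME", "INV_NUMBER", "INV_DATE", "CUS_FAX"}
transactions = {
    "create_invoice": {"CUS_ID", "INV_NUMBER", "INV_DATE", "INV_TOTAL"},
    "list_customers": {"CUS_ID", "CUS_NAME"},
}

missing, unused = check_minimal_data_rule(model, transactions)
```

In this sketch `missing` flags an element that a transaction needs but the model lacks, and `unused` flags an element that no transaction touches. An unused element is a prompt for review rather than an automatic deletion, since a design should also leave room for future data needs.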

as you apply the

minimal

data rule,

avoid

an excessive

short-term

bias.

Focus

not only

on the immediate data needs of the business, but also on the future data needs. Thus, the database design mustleave room for future modifications and additions, ensuring that the businesss investment in information resources will endure. As you re-examine Figure 11.1, note that conceptual design requires four steps, each of which will be examined

in the

Data analysis

next

sections:

and requirements

Entity relationship

modelling and normalisation

Data model verification Distributed database design

11.1.1 Data Analysis and Requirements

The first step in conceptual design is to discover the characteristics of the data elements. An effective database is an information factory that produces key ingredients for successful decision making. Appropriate data element characteristics are those that can be transformed into appropriate information. Therefore, the designer's efforts are focused on:

Information needs. What kind of information is needed, that is, what output (reports and queries) must be generated by the system, what information does the current system generate, and to what extent is that information adequate?
Information users. Who will use the information? How is the information to be used? What are the different end-user data views?
Information sources. Where is the information to be found? How is the information to be extracted once it is found?
Information constitution. What data elements are needed to produce the information? What are the data attributes? What relationships exist among the data? What is the data volume? How frequently are the data used? What data transformations are to be used to generate the required information?

The designer obtains the answers to those questions from different sources to compile the necessary information. Note these sources:

Developing and gathering end-user data views. The database designer and the end user(s) interact jointly to develop a precise description of end-user data views. In turn, the end-user data views will be used to help identify the database's main data elements.
Directly observing the current system: existing and desired output. The end user usually has an existing system in place (manual or computer-based). The designer reviews the existing system to identify the data and their characteristics. The designer examines the input forms and files (tables) to discover the data type and volume. If the end user already has an automated system in place, the designer carefully examines the current and desired reports to describe the data required to support the reports.
Interfacing with the systems design group. The database design process is part of the Systems Development Life Cycle (SDLC). In some cases, the systems analyst in charge of designing the new system will also develop the conceptual database model. (This is usually true in a desktop environment.) In other cases, the database design is considered part of the database administrator's job. The presence of a database administrator (DBA) usually implies the existence of a formal data-processing department. The DBA designs the database according to the specifications created by the systems analyst.

To develop an accurate data model, the designer must have a thorough understanding of the company's data types and their extent and uses. But data do not by themselves yield the required understanding of the total business. From a database point of view, the collection of data becomes meaningful only when business rules are defined. Remember from Chapter 2, Data Models, that a business rule is a brief and precise description of a policy, procedure or principle within a specific organisation's environment. Business rules, derived from a detailed description of an organisation's operations, help to create and enforce actions within that organisation's environment. When business rules are written properly, they define entities, attributes, relationships, multiplicities and constraints.

To be effective, business rules must be easy to understand and they must be widely disseminated to ensure that every person in the organisation shares a common interpretation of the rules. Using simple language, business rules describe the main and distinguishing characteristics of the data as viewed by the company. Examples of business rules are as follows:

A customer may make many payments on account.
Each payment on account is credited to only one customer.
A customer may generate many invoices.
Each invoice is generated by only one customer.

Given their critical role in database design, business rules must not be established casually. Poorly defined or inaccurate business rules lead to database designs and implementations that fail to meet the needs of the organisation's end users.

Ideally, business rules are derived from a formal description of operations. As its name implies, the description of operations is a document that provides a precise, detailed, up-to-date and thoroughly reviewed description of the activities that define an organisation's operating environment. (To the database designer, the operating environment is both the data sources and the data users.) Naturally, an organisation's operating environment is dependent on the organisation's mission. For example, the operating environment of a university would be quite different from that of a steel manufacturer, an airline or a nursing home. Yet no matter how different the organisations may be, the data analysis and requirements component of the database design process is enhanced when the data environment and data use are described accurately and precisely within a description of operations.
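The four example business rules about customers, payments and invoices translate fairly directly into table structures. The sketch below is one possible reading, assuming Python's built-in sqlite3 module; the table and column names (CUSTOMER, INVOICE, PAYMENT, CUS_ID and so on) are hypothetical and are not specified by the rules themselves. The "only one customer" side of each rule becomes a single mandatory foreign key, and the "many" side needs no extra structure.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite enforces foreign keys only when this pragma is switched on.
conn.execute("PRAGMA foreign_keys = ON")

conn.executescript("""
CREATE TABLE CUSTOMER (
    CUS_ID   INTEGER PRIMARY KEY,
    CUS_NAME TEXT NOT NULL
);
-- Each invoice is generated by only one customer:
-- one mandatory foreign key per invoice row.
CREATE TABLE INVOICE (
    INV_NUMBER INTEGER PRIMARY KEY,
    CUS_ID     INTEGER NOT NULL REFERENCES CUSTOMER (CUS_ID)
);
-- Each payment on account is credited to only one customer.
CREATE TABLE PAYMENT (
    PAY_NUMBER INTEGER PRIMARY KEY,
    CUS_ID     INTEGER NOT NULL REFERENCES CUSTOMER (CUS_ID),
    PAY_AMOUNT REAL NOT NULL
);
""")

# A customer may generate many invoices and make many payments.
conn.execute("INSERT INTO CUSTOMER VALUES (1, 'A. Customer')")
conn.executemany("INSERT INTO INVOICE VALUES (?, ?)", [(1001, 1), (1002, 1)])
conn.executemany("INSERT INTO PAYMENT VALUES (?, ?, ?)", [(1, 1, 50.0), (2, 1, 25.0)])

invoice_count = conn.execute(
    "SELECT COUNT(*) FROM INVOICE WHERE CUS_ID = 1"
).fetchone()[0]

# An invoice for a customer that does not exist violates the rule.
try:
    conn.execute("INSERT INTO INVOICE VALUES (1003, 99)")
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True
```

The point of the sketch is that a precisely worded rule leaves little room for interpretation when it reaches the table definitions, whereas a vague rule would leave the designer guessing whether the foreign key should be mandatory at all.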

In a business environment, the main sources of information for the description of operations and, therefore, of business rules are company managers, policy makers, department managers and written documentation such as company procedures, standards and operations manuals. A faster and more direct source of business rules is direct interviews with end users. Unfortunately, because perceptions differ, the end user can be a less reliable source when it comes to specifying business rules. For example, a maintenance department mechanic may believe that any mechanic can initiate a maintenance procedure, when actually only mechanics with inspection authorisation should perform such a task. Such a distinction may seem trivial, but it has major legal consequences. Although end users are crucial contributors to the development of business rules, it pays to verify end-user perceptions. Often, interviews with several people who perform the same job yield very different perceptions of what their job components are. While such a discovery may point to management problems, that general diagnosis does not help the database designer. Given the discovery of such problems, the database designer's job is to reconcile the differences and verify the results of the reconciliation to ensure that the business rules are appropriate and accurate.

Knowing the business rules enables the designer to understand fully how the business works and what role the data plays within company operations. Consequently, the designer must identify the company's business rules and analyse their impact on the nature, role and scope of data.

Business rules yield several important benefits in the design of new systems:

They help standardise the company's view of data.
They constitute a communications tool between users and designers.
They allow the designer to understand the nature, role and scope of the data.
They allow the designer to understand business processes.
They allow the designer to develop appropriate relationship participation rules and foreign key constraints. (See Chapter 5, Data Modelling with Entity Relationship Diagrams.)

The last point is especially noteworthy: whether a given relationship is mandatory or optional is usually a function of the applicable business rule.

Example: Data Analysis and Requirements for a DVD Rental Store

To illustrate the first stage of the conceptual design process, let us now consider an example based on a DVD rental store. Within this store, movie titles are classified according to their type: comedy, family, documentary, action and new release. Each type contains many possible titles, and most titles within a type are available in multiple copies. For example, note the summary presented in Table 11.1.

TABLE 11.1 The DVD rental type and title relationship

Type      Title                   Copy
Family    Chronicles of Narnia    1
          Chronicles of Narnia    2
          Toy Story               1
          Toy Story               2
          Toy Story               3
Comedy    Simpsons                1
          Simpsons                2
          Simpsons                3
Action    Lord of the Rings       1
          Lord of the Rings       2
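The type, title and copy relationship summarised in Table 11.1 is a two-level one-to-many hierarchy: a type contains many titles, and a title exists in many copies. A minimal sketch in Python follows. It is an illustration only: the helper name `copies_available` is hypothetical, and the type assigned to each title in the sample rows is an assumption made for the example.

```python
from collections import defaultdict

# (type, title, copy) rows; the type assigned to each title is illustrative.
rows = [
    ("Family", "Chronicles of Narnia", 1),
    ("Family", "Chronicles of Narnia", 2),
    ("Comedy", "Simpsons", 1),
    ("Comedy", "Simpsons", 2),
    ("Comedy", "Simpsons", 3),
]

# type -> title -> list of copy numbers
catalogue = defaultdict(lambda: defaultdict(list))
for movie_type, title, copy_number in rows:
    catalogue[movie_type][title].append(copy_number)


def copies_available(movie_type, title):
    """Number of copies held for a title within a type (0 if none)."""
    return len(catalogue.get(movie_type, {}).get(title, []))


narnia_copies = copies_available("Family", "Chronicles of Narnia")
unknown_copies = copies_available("Documentary", "No Such Title")
```

The lookup for an unknown type returns 0 rather than failing, which mirrors the idea that a type can exist in the classification without any titles in stock. In a relational design the same hierarchy would normally become three related tables rather than a nested structure.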

You have been asked to produce a database for this DVD rental store and have been provided with the following set of business rules from the shop manager:

The movie type classification is standard; not all types are necessarily in stock.
The movie list is updated as necessary; however, a movie on that list might not be ordered if the DVD shop owner decides that it is not desirable for some reason.
The DVD rental store does not necessarily order movies from the entire vendor list; some vendors on the vendor list are merely potential vendors from whom movies may be ordered in the future.
Movies classified as new releases are reclassified to an appropriate type after they have been in stock for more than 30 days.
The video shop manager wants to have an end-of-period (week, month, year) report for the number of rentals by type.
If a customer requests a title, the shop assistant must be able to find it quickly. When a customer selects one or more titles, an invoice is written. Each invoice may thus contain charges for one or more titles. All customers pay in cash.
When the customer checks out a title, a record is kept of the checkout date and time and the expected return date and time. Upon the return of rented titles, the shop assistant must be able to check quickly whether the return is late and to assess the appropriate late return fee.
The DVD store owner wants to be able to generate periodic revenue reports by title and by type. The owner also wants to be able to generate periodic inventory reports and to keep track of titles on order.
The DVD store owner, who employs two (salaried) full-time and three (hourly) part-time employees, wants to keep track of all employee work time and payroll data. Part-time employees must arrange entries in a work schedule, while all employees sign in and out on a work log.

NOTE

11 When capturing aspects

As

you

of the

start

to

requirements

in

the requirements, business;

think

help

characteristics (whose

the

might

zero

for

the

differently to

use

review

2020 has

exist,

Cengage deemed

Learning. that

any

All suppressed

two,

ensure

full-time

Rights

situation

May not

not materially

be

handled If full-time

problem

employees,

while the

pay

computations, On the

in terms for

a supertype/subtype

copied, affect

scanned, the

overall

or

duplicated, learning

in experience.

and

whole

or in Cengage

part.

Due Learning

to

reserves

keep

differences.

are few

For

distinguishing

an attribute

EMP_CLASS

earn

salary

a base

by having two

approach, is

the

zero

selects

employees

benefits,

and so on, it employees.

for

is

the

either

are

and

attributes,

HOUR_PAY

set to

software

PART_TIME

electronic

Also,

using

part-time

relationship

next.

employees

application

of work scheduling, FULL_TIME

attributes.

for design

If there

EMP_BASE_PAY

hand, if

operational

and information

and

can be handled

Using this

the

other

by

the

we have listed

transaction

or part-time.

table.

wage, that

that

relationships

may be

establishes

objectives

many possibilities

as full-time

EMPLOYEE

classification

does

entities,

EMPLOYEE.

employees

Reserved. content

remember

required

not only

system

problem leaves

classification.

more sense

specific

in

correct

employee

the

the

P) in the

full-time

a supertype/subtype

variables

Copyright

To

from

by the

of operations

database,

classification

EMP_BASE_PAY,

salaried

on the

this

earn only an hourly

and

employees. depending

the

some

by defining

provided

be F or

employees

EMP_HOURPAY to

designing design

EMPLOYEE

between

values

part-time

Editorial

about

drive the

consider

description

it also establishes

mind that the description

example,

the

set part-time

F or

handled

P,

quite

would be better The

more

unique

makes.


Once the basic requirements have been captured, the first ER model can be created. The ER model for the DVD rental store is presented in the next section.

11.1.2 Entity Relationship Modelling and Normalisation

Before creating the ER model, the designer must communicate and enforce appropriate standards to be used in the documentation of the design. The standards include the use of diagrams and symbols, documentation writing style, layout and any other conventions to be followed during documentation. Designers often overlook this very important requirement, especially when they are working as members of a design team. Failure to standardise documentation often means a failure to communicate later. And communications failures often lead to poor design work. In contrast, well-defined and enforced standards make design work easier and promise (but do not guarantee) a smooth integration of all system components.

Because the business rules usually define the nature of the relationship(s) among the entities, the designer must incorporate them into the conceptual model. The process of defining business rules and developing the conceptual model using ER diagrams can be described using the steps shown in Table 11.2.¹

TABLE 11.2 Developing the conceptual model using ER diagrams

Step  Activity
1     Identify, analyse and refine the business rules.
2     Identify the main entities, using the results of Step 1.
3     Define the relationships among the entities, using the results of Steps 1 and 2.
4     Define the attributes, primary keys and foreign keys for each of the entities.
5     Normalise the entities. (Remember that entities are implemented as tables in an RDBMS.)
6     Complete the initial ER diagram.
7     Have the main end users verify the model in Step 6 against the data, information and processing requirements.
8     Modify the ER diagram, using the results of Step 7.

Some of the steps listed in Table 11.2 take place concurrently. And some, such as the normalisation process, can generate a demand for additional entities and/or attributes, thereby causing the designer to revise the ER model. For example, while identifying two main entities, the designer might also identify the composite bridge entity that represents the many-to-many relationship between those two main entities.

To review, suppose you are creating a conceptual model for the JollyGood Movie Rental Company, whose end users want to track customers' movie rentals. The simple ER diagram presented in Figure 11.2 shows a composite entity that helps track customers and their DVD rentals. Business rules define the optional nature of the relationships between the entities DVD and CUSTOMER depicted in Figure 11.2. For example, customers are not required to check out a DVD. A DVD need not be checked out in order to exist on the shelf. A customer may rent many DVDs, and a DVD may be rented by many customers.

In particular, note the composite RENTAL entity that connects the two main entities.

¹ See Alice Sandifer and Barbara von Halle, 'Linking Rules to Models', Database Programming and Design, 4(3), March 1991, pp. 13-16. Although the source seems dated, it remains the current standard. The technology has changed substantially, but the process has not.

FIGURE 11.2 A composite entity
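The composite RENTAL entity of Figure 11.2 can be tried out with a small, self-contained sketch. It uses Python's standard sqlite3 module with an in-memory database. The attribute names (CUST_ID, CUST_NAME, DVD_ID, DVD_TITLE, RENT_DATE) and the sample rows are assumptions made for illustration only; the figure itself does not list them. The sketch shows the optional participation described by the business rules: a customer with no rentals and a DVD that has never been rented can both exist.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE CUSTOMER (CUST_ID INTEGER PRIMARY KEY, CUST_NAME TEXT NOT NULL);
CREATE TABLE DVD (DVD_ID INTEGER PRIMARY KEY, DVD_TITLE TEXT NOT NULL);
-- Composite (bridge) entity: one row per rental of one DVD by one customer.
-- Attribute names are hypothetical.
CREATE TABLE RENTAL (
    CUST_ID   INTEGER NOT NULL REFERENCES CUSTOMER(CUST_ID),
    DVD_ID    INTEGER NOT NULL REFERENCES DVD(DVD_ID),
    RENT_DATE TEXT    NOT NULL,
    PRIMARY KEY (CUST_ID, DVD_ID, RENT_DATE)
);
""")

conn.executemany("INSERT INTO CUSTOMER VALUES (?, ?)",
                 [(1, "Ann"), (2, "Ben"), (3, "Cy")])
conn.executemany("INSERT INTO DVD VALUES (?, ?)",
                 [(10, "Star Wars"), (11, "Beauty and the Beast"), (12, "Toy Story")])
conn.executemany("INSERT INTO RENTAL VALUES (?, ?, ?)",
                 [(1, 10, "2020-01-05"), (1, 11, "2020-01-06"), (2, 10, "2020-01-07")])

# Optional participation: customers without rentals and DVDs never rented.
idle_customers = [row[0] for row in conn.execute(
    "SELECT CUST_NAME FROM CUSTOMER "
    "WHERE CUST_ID NOT IN (SELECT CUST_ID FROM RENTAL)")]
unrented_dvds = [row[0] for row in conn.execute(
    "SELECT DVD_TITLE FROM DVD "
    "WHERE DVD_ID NOT IN (SELECT DVD_ID FROM RENTAL)")]

# Many-to-many: one DVD rented by several customers.
star_wars_renters = conn.execute(
    "SELECT COUNT(DISTINCT CUST_ID) FROM RENTAL WHERE DVD_ID = 10").fetchone()[0]
```

Placing the two foreign keys in RENTAL, rather than in CUSTOMER or DVD, is what lets either side of the relationship exist without the other.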

As you will likely discover, the initial ER model may be subjected to several revisions before it meets the system's requirements. Such a revision process is quite natural. Remember that the ER model is a communications tool as well as a design blueprint. Therefore, the initial ER model should give rise to questions such as, 'Is this really what you meant?' when you continue to meet with the proposed system users. For example, the ERD shown in Figure 11.2 is far from complete. Clearly, many more attributes must be defined and the dependencies must be checked before the design can be implemented. In addition, the design cannot yet support the typical DVD rental transactions environment. For example, each DVD is likely to have many copies available for rental purposes. However, if the DVD entity shown in Figure 11.2 is used to store the titles as well as the copies, the design triggers the data redundancies shown in Table 11.3.

TABLE 11.3 Data redundancies in the DVD table

DVD_ID        DVD_TITLE             DVD_COPY  DVD_CHG  DVD_DAYS
SF-12345FT-1  Star Wars             1         13.50    1
SF-12345FT-2  Star Wars             2         13.50    1
SF-12345FT-3  Star Wars             3         13.50    1
WE-5432GR-1   Beauty and the Beast  1         12.30    2
WE-5432GR-2   Beauty and the Beast  2         12.30    2
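One possible way to remove the redundancy in Table 11.3 is sketched below, again with Python's standard sqlite3 module. It is an illustration under stated assumptions, not the book's final design: the title-level key (the DVD_ID without its copy suffix), the table name COPY and the column name COPY_NUM are hypothetical. The title, charge and rental days are stored once per title, and each physical copy becomes one row in a separate table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- One row per title: title, charge and rental days are stored only once.
CREATE TABLE DVD (
    DVD_ID    TEXT PRIMARY KEY,
    DVD_TITLE TEXT    NOT NULL,
    DVD_CHG   REAL    NOT NULL,
    DVD_DAYS  INTEGER NOT NULL
);
-- One row per physical copy (hypothetical table and column names).
CREATE TABLE COPY (
    DVD_ID   TEXT    NOT NULL REFERENCES DVD(DVD_ID),
    COPY_NUM INTEGER NOT NULL,
    PRIMARY KEY (DVD_ID, COPY_NUM)
);
""")

titles = [("SF-12345FT", "Star Wars", 13.50, 1),
          ("WE-5432GR", "Beauty and the Beast", 12.30, 2)]
copies = [("SF-12345FT", 1), ("SF-12345FT", 2), ("SF-12345FT", 3),
          ("WE-5432GR", 1), ("WE-5432GR", 2)]
conn.executemany("INSERT INTO DVD VALUES (?, ?, ?, ?)", titles)
conn.executemany("INSERT INTO COPY VALUES (?, ?)", copies)

# Each title now appears once; the copies are counted through the join.
title_rows = conn.execute("SELECT COUNT(*) FROM DVD").fetchone()[0]
copy_counts = dict(conn.execute(
    "SELECT DVD_TITLE, COUNT(*) FROM DVD JOIN COPY USING (DVD_ID) "
    "GROUP BY DVD_TITLE"))
```

With this split, changing the charge for a title means updating one row instead of one row per copy.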

The initial ERD shown in Figure 11.2 must be modified to reflect the answer to the question, 'Is more than one copy available for each title?' Also, payment transactions must be supported.

From the preceding discussion, you might get the impression that ER modelling activities (entity/attribute definition, normalisation and verification) take place in a precise sequence. In fact, once you have completed the initial ER model, chances are you will move back and forth among the activities until you are satisfied that the ER model accurately represents a database design that is capable of meeting the required system demands. (The activities often take place in parallel, and the process is iterative.) Figure 11.3 summarises the ER modelling process interactions. Figure 11.4 summarises the array of design tools and information sources that the designer can use to produce the conceptual model.


FIGURE 11.3 ER modelling is an iterative process based on many activities

[Figure labels: Database initial study; Data analysis; User views and business rules; DBLC processes and database transactions; Initial ER model; Attributes; Normalisation; Verification; Final ER model.]

FIGURE 11.4 Conceptual design tools and information sources

[Figure: three columns. Information sources: business rules and data constraints; data flow diagrams (DFD)*; process functional descriptions (FD)* (user views). Design tools: ER diagram; normalisation; data dictionary. Conceptual model: ERD; definition and validation.]

* Output generated by the systems analysis and design activities

CHAPTER

All objects is

used

(entities,

problems.

with the

During

Define the

attributes,

in tandem

this

ER

modelling

entities,

attributes,

relationships

among

Make

decisions

relations,

and

process

process,

primary

about

views

normalisation

keys

the

entities.)

adding

new

help

key

Logical,

defined

eliminate

designer

and foreign

primary

Conceptual,

so on) are

to

the

11

in

data

and

Physical

a data

Database

dictionary,

anomalies

and

Design

587

which

redundancy

must:

keys.

(The

attributes

to

foreign

satisfy

keys

serve

end-user

as the

and/or

basis

for

processing

requirements. Make

decisions

about

Make

decisions

about

Make decisions

about the

The Real

establish

Modelling

with

will

the

Entity

the following

and

TOUR

naming

entities

are

to

ignored

of the

at the

it is important

standards

completion

naming they

are

designers to

defined

design.

conventions

and

Therefore,

were

will be revisited across

to the will be

names

entity

should

peril.

ensure

that

enforced.

it is

in

established

greater

extent

more broadly

in

detail

a reasonably

greatest

wherever

contains entity

the

Proper

very

useful

This

range

As the

applicable.

Chapter

here.

broad

possible.

personal

possible.

of

older

You should

For example,

information

may be related

and

TOUR

show

what

entity

name

to

about

booking,

any

All suppressed

will be named

the

if the

Reserved. content

does

May not

not materially

be

Therefore,

to

5, Data book

uses

DBMSs

and

DBMSs fade

try to

in a travel customer

11

adhere to

agency who

customer

of those

choice

composite entity

had

makes

booked

(bridge)

be the

the

entity.

a

next point

composite

represent.

and a

entity that designer

In

such

finds

cases,

For example,

STUDENT

in the

they

many TOURs

names.

entity that links

would

of

Occasionally,

segments

one discussed

relationship

the composite

TOUR_BOOKING.

composite

the

may consist

by the

make the

A better in

describes

being linked

may borrow

sparingly.

Rights

are

that

a BOOKING

made for it.

entities

might

enrols

a name

database,

may be the

convention

STUDENT

Learning.

assigned

agency

many BOOKINGs

be used

the

usually

STU_CLASS

naming

that

frequently Therefore,

naming

be acceptable

HOTEL

in the travel

composite

Cengage

the

self-documenting.

to

and attribute

and the

BOOKING

University,

deemed

yet it is by teams.

attribute

conventions

CUSTOMER

may have

necessary

that

review

Concepts.)

hotel.

Composite

that

(If necessary,

Advanced

dictionary.

which

requirements

and

For example,

the

in

Diagrams,

are likely

entity

the

specific

it

1:1 relationships.

Modelling

conventions:

database,

links

requirements.

conventions.

successful

entity

that

scene, the

a BOOKING

has

the

Relationship

conventions

data

accomplished

are, in effect,

useful

Use descriptive

2020

to

meet self-documentation

from

Data

is important,

generally

that

some

naming

an environment

crucial

procedures

Although

naming

standard

is

work in is

in the

requirement

design

members

documentation

definitions

about

conventions

database

team

keys in

6,

processing

ER diagram.

element

decisions

naming

Chapter

satisfy

entities.

all data

Make

to

of foreign

in

attributes.

relationships.

corresponding the

multivalued attributes

placement

ternary

unnecessary

of

derived

Avoid

Include

review

adding

relationships

Normalise

Copyright

treatment

supertype/subtype

Draw the

Editorial

the

and

in

CLASS.

Tiny

However,

more cumbersome,

entity

name

ENROL,

so it

to indicate

a CLASS.


An attribute name should be descriptive, and it should contain a prefix that helps identify the table in which it is found. For the purposes here, the maximum prefix length will be five characters. For example, the VENDOR table might contain attributes such as VEND_ID and VEND_PHONE. Similarly, the ITEM table might contain attribute names such as ITEM_ID and ITEM_DESCRIPTION. The advantage of this naming convention is that it immediately identifies a table's foreign key(s). For example, if the EMPLOYEE table contains attributes such as EMP_ID, EMP_LNAME and DEPT_CODE, it is immediately obvious that DEPT_CODE is the foreign key that probably links EMPLOYEE to DEPARTMENT. Naturally, the existence of relationships and table names that start with the same characters might dictate that you bend this naming convention occasionally, as you can see in the next point.

If one table is named ORDER and its weak counterpart is named ORDER_ITEM, the prefix ORD will be used to indicate an attribute originating in the ORDER table. The ITEM prefix will identify an attribute originating in the ITEM table. Clearly, you cannot use ORD as a prefix to the attributes that originate in the ORDER_ITEM table, so you should use a combination of characters, such as OI, as the prefix to the ORDER_ITEM attribute names. In spite of that limitation, it is generally possible to assign prefixes that identify an attribute's origin. (Keep in mind that some RDBMSs use a reserved word list. For example, ORDER may be interpreted as a reserved word in a SELECT statement. In that case, you should use a table name other than ORDER.)

As you can tell, it is not always possible to adhere strictly to the naming conventions. Sometimes the requirement to limit name lengths makes the attribute or entity names less descriptive. Also, with a large number of entities and attributes in a complex design, you might have to be somewhat inventive about using proper attribute name prefixes. But then those prefixes are less helpful in identifying the precise source of the attribute. Nevertheless, the consistent use of prefixes will reduce sourcing doubts significantly. For example, while the prefix CO does not obviously relate to the CHECK_OUT table, it is just as obvious that it does not originate in WITHDRAW, ITEM or USER.

Example: Entity Relationship Modelling for the DVD Rental Store

Let us revisit our DVD rental store, for which we have gathered the basic requirements. Now examine the following additional requirements:

11 DVDs are classified Release),

so

The shop

assistant

This requirement entering

the

The shop the

add

actual

not

The store

Here

we

Copyright review

2020 has

Cengage deemed

Learning. that

any

the

All suppressed

do

creation

Rights

does

to

to

Documentary,

prevent

requests

way to

data

Action

and

New

redundancy.

quickly.

query the

quickly

fees create

DVD data (by name, type,

to

the

be able to keep track

new

etc.)

while

that

attributes

entity.

Note

relationship.

database

there

business

conceptual

assess

as expected is

no need

mind that

in the tables

the

of all employee

that

Keep in

attributes

enforces

and to

such

to

some

and by combining

rule.

Remember

that

diagram.

work time

and payroll

data.

This

entity.

entities:

and the

RENTAL

the appropriate

in the

or not the return is late

met by adding

an additional

of an EMPLOYEE two

whether is

program

be represented

schedule

Reserved. content

and late

met by including

can

payroll

check

an application

wants to

work

generate

Editorial

date

we need

must create

employees

customers

This requirement

return

rules

the

fee.

nor

through

owner

will require

return

are easily

all business

Family,

be created

an easy

must be able to

entity,

attributes

must

data.

late

requirements

TYPE (Comedy,

TYPE

met by creating

assistant

date,

to their

called

must be able to find

RENTAL

a new

those

entity

is

appropriate

return

according

a new

WORK_SCHEDULE

actual

times

and

worked,

WORK_LOG,

respectively.

which

These

will show

entities

also

help

from

eBook

the us

report.


The description also specifies some of the expected reports that the database eventually produces:

End-of-period report of the number of rentals by title and by type. This report will use the RENTAL and DVD entities to generate all the rental data for some specified period of time.
Revenue report by type. This report will use the RENTAL, DVD and TYPE entities to generate all rental data.
Periodic inventory reports. This report will use the DVD and TYPE entities.
Titles on order. This report will use the ORDER, DVD and TYPE entities.
Employee work times and payroll data. This report will use the EMPLOYEE, WORK_SCHEDULE and WORK_LOG entities.

After analysing all the requirements gathered so far, a first draft of the ER model is shown in Figure 11.5.

FIGURE 11.5 ERD of the DVD rental store
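To illustrate how a report requirement maps onto entities, the sketch below produces a count of rentals by type and title for a chosen period. It uses Python's standard sqlite3 module. Everything beyond the entity names RENTAL, DVD and TYPE is an assumption made for this example: the column names (TYPE_CODE, TYPE_NAME, RENT_ID, RENT_DATE), the sample rows and the helper function rentals_by_title are hypothetical, and TYPE is joined in here only so that the type name can be displayed.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE TYPE (TYPE_CODE TEXT PRIMARY KEY, TYPE_NAME TEXT NOT NULL);
CREATE TABLE DVD (
    DVD_ID    INTEGER PRIMARY KEY,
    DVD_TITLE TEXT NOT NULL,
    TYPE_CODE TEXT NOT NULL REFERENCES TYPE(TYPE_CODE)
);
CREATE TABLE RENTAL (
    RENT_ID   INTEGER PRIMARY KEY,
    DVD_ID    INTEGER NOT NULL REFERENCES DVD(DVD_ID),
    RENT_DATE TEXT NOT NULL
);
INSERT INTO TYPE VALUES ('COM', 'Comedy'), ('FAM', 'Family');
INSERT INTO DVD VALUES (1, 'Toy Story', 'FAM'), (2, 'Simpsons', 'COM');
INSERT INTO RENTAL VALUES (1, 1, '2020-03-02'), (2, 1, '2020-03-09'),
                          (3, 2, '2020-03-15'), (4, 2, '2020-04-01');
""")


def rentals_by_title(conn, start, end):
    """Return (type, title, number of rentals) rows for the given period."""
    sql = """
        SELECT T.TYPE_NAME, D.DVD_TITLE, COUNT(*)
        FROM RENTAL R
        JOIN DVD D ON D.DVD_ID = R.DVD_ID
        JOIN TYPE T ON T.TYPE_CODE = D.TYPE_CODE
        WHERE R.RENT_DATE BETWEEN ? AND ?
        GROUP BY T.TYPE_NAME, D.DVD_TITLE
        ORDER BY T.TYPE_NAME, D.DVD_TITLE
    """
    return conn.execute(sql, (start, end)).fetchall()


# The April rental falls outside the period and is not counted.
march_report = rentals_by_title(conn, "2020-03-01", "2020-03-31")
```

If a report such as this one cannot be written against the draft model, that is an early sign that an entity or attribute is missing.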


Although to

Design

there

is a temptation

WORK_LOG

entity

for

and

an

attribute.

It is far

EMPLOYEE

entity.

applications

software

entities,

there

The

point,

is

no

on the

the

entitys

LOG_DATE

and times to

same

are

out.

the

and

work

of verification

next

been

dont

against

the

been

rented

has

customer

cannot

which are then

reflects

perhaps be

related

a substitution

named

of an

EMP_TYPE,

P 5 part-time

WORK_LOG

transaction

or F

and

in the

5 full-time.

The

WORK_SCHEDULE

and

which

yet

be used

proposed

logged

it is

the

You

the

or out. Therefore, In

addition,

to record

two

how the

dates

if you

the

additional

rent work log

time

want

in

schedules.

requires

willlearn

customers

addition,

employees

produced

system.

in

For example,

If five In

necessary

to track

has been

copy.

LOG_DATE_OUT.

employee,

model that

requirements.

by a customer.

has

when all employees

part-time

the

then

against

LOG_DATE_IN

by each

data

attribute,

entries into the

which

named

schedule

an would

entities,

a decision

value.

video

know

PART_TIME

such

create

verified

of tracking

Clearly, the

and

values

attribute

yet

specific

you

worked

the

when?

process the

not

perhaps

hours

simply

be used to force

is incapable

Similarly,

worked

in

the

to

attribute

which

video,

necessary,

determine

time

has

check

of the

FULL_TIME respectively,

EMP_TYPE

ERD

way to

copies

better

EMP_TYPE

can then

depending

At this

to create

WORK_SCHEDULE,

and

Who has

work through

data

model is verified

section.
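The EMP_TYPE approach mentioned above (one attribute in EMPLOYEE holding P for part-time or F for full-time, instead of separate PART_TIME and FULL_TIME entities) can be sketched as follows with Python's standard sqlite3 module. The CHECK constraint, the sample rows and the helper function needs_work_schedule are assumptions for illustration; the helper stands in for the applications software that decides, from the EMP_TYPE value, whether work schedule entries are expected for an employee.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE EMPLOYEE (
        EMP_ID    INTEGER PRIMARY KEY,
        EMP_LNAME TEXT NOT NULL,
        -- P = part-time, F = full-time; any other value is rejected.
        EMP_TYPE  TEXT NOT NULL CHECK (EMP_TYPE IN ('P', 'F'))
    )
""")
conn.executemany("INSERT INTO EMPLOYEE VALUES (?, ?, ?)",
                 [(1, "Khan", "F"), (2, "Lopez", "P"), (3, "Smith", "P")])


def needs_work_schedule(conn, emp_id):
    """Hypothetical application rule: only part-time employees are scheduled."""
    row = conn.execute(
        "SELECT EMP_TYPE FROM EMPLOYEE WHERE EMP_ID = ?", (emp_id,)).fetchone()
    return row is not None and row[0] == "P"


# An invalid classification is stopped by the CHECK constraint.
try:
    conn.execute("INSERT INTO EMPLOYEE VALUES (4, 'Jones', 'X')")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True
```

Keeping the classification in one attribute leaves the rule in a single place; a supertype/subtype structure becomes worthwhile only when the two groups need clearly different attributes.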

11.1.3 Data Model Verification

The ER model must be verified against the proposed system processes in order to corroborate that the intended processes can be supported by the database model. Verification requires that the model be run through a series of tests against the following:

End-user data views and their required transactions. Such transactions include the data manipulation commands SELECT, INSERT, UPDATE and DELETE, which you learnt in Chapter 8, Beginning Structured Query Language.
Access paths and security.
Business-imposed data requirements and constraints.

Revision of the original database design starts with a careful re-evaluation of the entities, followed by a detailed examination of the attributes that describe those entities. This process serves several important purposes:

The emergence of attribute details may lead to a revision of the entities themselves. Perhaps some of the components first believed to be entities will, instead, turn out to be attributes within other entities. Or what was originally considered to be an attribute may turn out to contain a sufficient number of subcomponents to warrant the introduction of one or more new entities.
The focus on attribute details can provide clues about the nature of relationships as they are defined by the primary and foreign keys. Improperly defined relationships lead to implementation problems first and to application development problems later.
To satisfy processing and/or end-user requirements, it might be useful to create a new primary key to replace an existing primary key. For example, in the invoicing example illustrated in Figure 3.18 in Chapter 3, Relational Model Characteristics, a primary key composed of INV_NUMBER and LINE_NUMBER replaced the original primary key composed of INV_NUMBER and PROD_CODE. That change ensured that the items in the invoice would always appear in the same order as they were entered. To simplify queries and to increase processing speed, you may create a single-attribute surrogate primary key to replace an existing multiple-attribute primary key.

Unless the entity details (the attributes and their characteristics) are precisely defined, it is difficult to evaluate the extent of the design's normalisation. Knowledge of the normalisation levels helps guard against undesirable redundancies.
A careful review of the rough database design blueprint is likely to lead to revisions. Those revisions will help ensure that the design is capable of meeting end-user requirements.

Because real-world database design is generally done by teams, you should strive to organise the design's major components into modules. (A module is an information system component that handles a specific function, such as inventory, orders, payroll and so on. At the design level, a module is an ER model segment that is an integrated part of the overall ER model.) Creating and using modules accomplishes several important ends:

The modules (and even the segments within them) can be delegated to design groups within teams, greatly speeding up the development work.
The modules simplify the design work. The large number of entities within a complex design can be daunting. Each module contains a more manageable number of entities.
The modules can be prototyped quickly. Implementation and applications programming trouble spots can be identified more readily. (Quick prototyping is also a great confidence builder.)
Even if the entire system can't be brought online quickly, the implementation of one or more modules will demonstrate that progress is being made and that at least part of the system is ready to begin serving the end users.

As useful as modules are, they represent ER model fragments. Fragmentation creates a potential problem: the fragments may not include all of the ER model's components and may not, therefore, be able to support all of the required processes. To avoid that problem, the modules must be verified against the complete ER model. That verification process is detailed in Table 11.4.

TABLE 11.4 The ER model verification process

Step  Activity
1     Identify the ER model's central entity.
2     Identify each module and its components.
3     Identify each module's transaction requirements:
      Internal: Updates/Inserts/Deletes/Queries/Reports
      External: Module interfaces
4     Verify all processes against the ER model.
5     Make all necessary changes suggested in Step 4.
6     Repeat Steps 2-5 for all modules.

Keep in mind that the verification process requires the continuous verification of business transactions as well as system and user requirements. The verification sequence must be repeated for each of the system's modules. Figure 11.6 illustrates the iterative nature of the process.
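Steps 3 and 4 of Table 11.4 can be illustrated with a small sketch in plain Python. It is only a toy model of the idea, under stated assumptions: the ER model is reduced to a dictionary of entity names and attribute sets, each module is a list of (operation, entity, attributes) transactions, and the function verify_module, the module names and all attribute names are hypothetical. The check reports every transaction that refers to an entity or attribute the ER model does not contain.

```python
# A deliberately incomplete ER model: entity name -> set of attribute names.
er_model = {
    "DVD": {"DVD_ID", "DVD_TITLE", "TYPE_CODE"},
    "RENTAL": {"RENT_ID", "DVD_ID", "CUST_ID", "RENT_DATE"},
    "CUSTOMER": {"CUST_ID", "CUST_NAME"},
}

# Step 3: each module's transaction requirements (hypothetical examples).
modules = {
    "rentals": [
        ("INSERT", "RENTAL", {"DVD_ID", "CUST_ID", "RENT_DATE"}),
        ("QUERY", "DVD", {"DVD_TITLE", "DVD_COPY"}),
    ],
    "payroll": [
        ("REPORT", "WORK_LOG", {"EMP_ID", "LOG_DATE_IN"}),
    ],
}


def verify_module(er_model, transactions):
    """Step 4: list the transactions that the ER model cannot support."""
    problems = []
    for operation, entity, attributes in transactions:
        if entity not in er_model:
            problems.append((operation, entity, "missing entity"))
            continue
        for attribute in sorted(attributes - er_model[entity]):
            problems.append((operation, entity, "missing attribute " + attribute))
    return problems


# Step 6: repeat the check for every module.
findings = {name: verify_module(er_model, txns) for name, txns in modules.items()}
```

Each finding corresponds to a change for Step 5: here the model would need a copy attribute (or a separate copy entity) and a WORK_LOG entity before the two modules could be supported.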

FIGURE 11.6 Iterative ER model verification process

[Figure labels: Identify processes; Define transaction steps; Verify results; Change process; Change ER model.]

The

the

systems

in the

The

greatest

boundaries

attention

among

Analyse

each

coupling must

modular

have

other,

modules

highly

Finding

the

Processes

Operational

review

2020 has

Cengage deemed

Learning. that

any

All

they

unnecessary the

right

designer

selects

the

central

within the

results

according

display

entity that

has

belongs

uses it

modules

the

high

is

most

of

entity

more lines

and to

define

most frequently.

framework

to

to let

you

creating

more

is, the

another.

other

modules.

the

entities

among

entities. and that

Modules

Note:

One of the

coupling. are

coupling

of a truly

Often,

dependent

effect, resulting

designers

Module

Low

creation

cohesion modules

coupling.

of one

allowing

has the reverse database

that

module

of

thereby

of the relationships

self-sufficient.

address

between

coupling

strength

cohesivity

and

relationships

in

Decreasing

describes

are independent

balance

the

quest

on each

in low

cohesion.

job.

to their:

(INSERT

or ADD,

UPDATE

or exceptions). or

CHANGE,

DELETE,

queries

and reports,

batches,

backups). must

be verified

verification

and attributes

Reserved. content

entity of

the

module

are independent

dependencies,

type

Rights

central focus

it is the entity that

the

the

be complete

modules

that

yearly,

The process

suppressed

which

monthly,

entities

The

to:

modules

balance is a key part of the

processes

implemented.

to

placed

must

other

weekly,

and

which

need

must

(daily,

maintenance All identified

A module

with

modules

may be classified

Frequency

additional

eliminating

cohesive

entity is

module

to

achieve

to

cohesivity

indicating

high coupling. correct

entity,

ER diagram,

belongs

The term

intermodule

is to

entity

you

entities.

extent

unnecessary

hence

central

(In the

entity.

and it is the

details.

relationships

the

and

important)

relationships,

the

framework,

and the

coupling,

challenges

(most

models

or subsystem The

the central

modules

low

central

to identify

module scope.

modules

related,

system

design

the

cohesivity.

describes display

decreases

to

the

be strongly

of the

words,

entity/module

modules

the

most

of relationships.

and

on the

central

Ensure the

must

other

module is identified,

your

selecting in

other.)

step is to identify

found

with

number

any

next

Within the

Copyright

In

modules

focus

Editorial

participation

to it than

Once each

11

starts

of its

operations.

connected

that

process

in terms

involved

model

model

verification

defined

ER

steps


against

the

is repeated

ER

the

overall

or

duplicated, learning

in experience.

whole

If

for all of the

will be incorporated

scanned,

model.

or in Cengage

into

part.

Due Learning

necessary,

models

modules.

the conceptual

to


appropriate

changes

are

You can expect

that

model during its validation.


CHAPTER 11 Conceptual, Logical, and Physical Database Design

At this point, a conceptual model has been defined as hardware- and software-independent. Such independence ensures the system's portability across platforms. Portability may extend the database's life by making it possible to migrate to another DBMS and/or another hardware platform.

Example: Data Model Verification for the DVD Rental Store

Applying the verification process described in Table 11.4 to the conceptual data model of the DVD rental store from the previous section produces the verified model as shown in Figure 11.7.

FIGURE 11.7 Verified ER model for the DVD rental store


As you examine

the

components

All relationships LINE. (The factor

in

are read from

natural tendency

an

can

and

out

COPY

clearing

FK in

the

vary

from

are likely the

ORD_

to

keep

the

PKs

COPY

than in

movie, each

COPY not a composite

a requirement

that

combination

ORD_LINE.

the

FK in

vendors

that

the

COPY

and

vendor.

entity? entity

by the

PK of the

of DVD_ID

is

RENT_LINE.

entity

is the

COPY_CODE).

Therefore,

However, if the

individual

when

PK is referenced

goes to a particular

of the

will be the

are

entitys

PK. (Note

than the

track

may reside

be found the

The

of a given

order

the

VEND_

goes to a clearing

supplied

the

movies

to the

ORD_LINE.

Database Design in to

different

in

different must

design

physical locations.

another.

designer

For

example,

physical also

complications

1, The Database

11.2

ORDER contains

are 12 copies

So, why is

single-attribute

each order

one location

system,

database.

Chapter

Therefore,

points:

or from left to right is not the governing

movie. If there

entities.

case,

rather

VEND_CODE

of a database

process across

entity.

bottom

a single-attribute

that

want to

11.1.4 Distributed

may also

why

ORDER, rather

still

house,

Portions

of

have

to assume

you

of each

In this

COPY_CODE,

It is reasonable

and

top to

are composite

entity.

must

attribute

CODE is the

copies

example

by another

Therefore,

house

parent to the related

the following

separately.

RENT_LINE

an excellent

referenced

single

model in Figure 11.7, note particularly

to read from

individual

be rented

ORD_LINE Here is

the

ERD

ERD!)

We can now track copy

of the

locations.

develop

If the

the

introduced

by

Processes

that

process

and

a retail

data

database

and

the

is to

are

storage

be distributed

allocation

processes

database

warehouse

process

distribution

distributed

access a

strategies

examined

in

for

detail

in

Approach.

LOGICAL DATABASE DESIGN

11 When the

business

conceptual

design

rules that, in turn,

and constraints.

(Remember

enforced

application

within

at the five

definition

days

thus

attributes

that

requiring

the

For

The

to

meet the requisite

were used concurrently.

the

each

of the

entities

models

ERD.

normalisation such

entities

requirements.

item

must

are required

focus

to

was

In short,

design

use of design

the

before

they

can

and relationships,

on the

verified

be returned

includes

meet information

entities

models

the

cardinalities

model

must be normalised

data

level

and are, therefore,

conceptual

additional

the

the

modelled

checked-out

and that

Because

conceptual

connectivities,

be

the

may yield

design,

concurrent

a

addition,

process

initial

cannot

constraint In

conceptual

at the

optionalities,

elements

ERD.)

an implementable

In fact,

ERD reflects

the

normalisation of the

produce

design

example, in

the

relationships,

of the

describe

modification

design

completed,

some

mind that the

implemented.

conceptual

certain to

that

be reflected

Keep in

properly

is

define the entities,

level.

cannot

of the

requirements. be

phase

verification

in this

and normalisation

and normalisation

of the

chapter

were

processes

reflects

real-world

practice.

The second stage in the database design cycle is known as logical database design. The aim of the logical design stage is to map the conceptual model into a logical model that can then be implemented on a relational DBMS. The logical design stage consists of the following phases:

1 Creating the logical data model
2 Validating the logical data model using normalisation
3 Assigning and validating integrity constraints
4 Merging logical models constructed for different parts for the database
5 Reviewing the logical data model with the user

Next, you will learn about each of these phases in detail, which will help you to build a successful logical data model.

11.2.1 Creating the Logical Data Model

The first

stage

database

constructs.

of relations must

of logical

using

be created

no dependents

is

key attribute(s) however,

Step

lets

of rules.

whilst

at the

not

now

becomes

an

time

in

the

the

Strong

keys)

required

for

each

are

to

conceptual

entity

integrity

created

keys.

design

in

and

the

conceptual

a set

attributes

relations

relations,

brackets.

So far, this sounds the

phase into

Usually,

To create

enclosed

convert

a set of relational

and relationships

constraints.

first.

attributes

design into

with

the

Next, the

name

primary

quite straightforward;

model in

detail.

Entities

entities

relation. Notice

conceptual

required

by any foreign

steps

the

ER model from the

associated

followed actual

for

relation.

meeting

with its

all regular

attribute

corresponding

attribute

at the

the

must be created

any foreign

along

Relations

This rule transforms

its

same

is (are) identified, look

converting

A relation

containing

specified

1: Create

design is to translate

This involves

a set

(e.g.

of the relation

database

in the

Figure that

the

ER diagram

11.8

shows

primary

into

the

relations.

entity

Each attribute

DVD from

the

in the

relation

key is indicated

in the

DVD rental by

entity

store

and

underlining

the

DVD_ID.

FIGURE 11.8 Transforming the strong entity DVD into the DVD relation

Step 2: Create In

Chapter

Relations for

5, Data

Modelling

that is, it cannot entity has a primary states

that,

The primary

Copyright Editorial

review

2020 has

key

Cengage deemed

for

each

primary

Learning. that

any

key

cannot

All suppressed

Entity

exist

key that is weak

Weak Entities

with

without the

partially

entity,

of the

Reserved. content

does

May not

not materially

be

is then until

copied, affect

scanned, the

overall

Diagrams,

entity

or totally

a new relation

relation

be established

Rights

Relationship

or

duplicated,

with which it

derived from the must

be created

determined

all the

learning

we defined

in experience.

whole

or in Cengage

has a relationship.

each

owner

key relationships

part.

Due Learning

to

electronic reserves

entity

as being

existence-dependent;

In addition,

a weak

parent entity in the relationship.

that includes

from

foreign

a weak

rights, the

right

all attributes of the with the

some to

third remove

party additional

content

from

entity.

may

be

any

entity. the

entities

suppressed at

the

However,

owning

content

Step 2

time

from if

the

subsequent

have

eBook rights

and/or restrictions

eChapter(s). require

it

596

PART IV

Database

Design

been identified.

To do this, the

key attribute. primary the

The primary

key

weak

of the

entity

by a *

owner

the

FIGURE 11.9

key

the

and

relation.

foreign

key of the

new relation

and the

from

RENT_NUM

RENT_LINE

after

primary of the

entity

RENT_LINE

The attributes key on the

key

partial DVD

owner then

entity is included becomes

identifier

Rental

of the

Store

RENTLINE_NUM

in the new relation

a composite weak

appears

in

entity.

the

of transforming

11.9.

to indicate

keys

as a foreign

combining

An example

Figure

are underlined

You will also notice that foreign

key through

on the

the

composite

new relations

primary

are indicated

attribute.

Example of mapping the weak entity RENT_LINE
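The weak-entity rule can be tried out in a few lines. What follows is only a minimal sketch, assuming that RENTAL is the owner entity with primary key RENT_NUM and that RENT_LINE is identified by RENT_NUM together with the partial identifier RENTLINE_NUM. All other attributes are omitted, and the helper names `build_rental_schema` and `try_insert_line` are hypothetical. It uses Python's built-in sqlite3 module rather than any particular production DBMS.

```python
import sqlite3


def build_rental_schema() -> sqlite3.Connection:
    """Create RENTAL (owner) and RENT_LINE (weak entity) in an in-memory database."""
    conn = sqlite3.connect(":memory:")
    # SQLite only enforces foreign keys when this pragma is switched on.
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript("""
        CREATE TABLE RENTAL (
            RENT_NUM INTEGER PRIMARY KEY
        );
        CREATE TABLE RENT_LINE (
            -- owner's primary key, carried in as a foreign key
            RENT_NUM     INTEGER NOT NULL REFERENCES RENTAL (RENT_NUM),
            -- partial identifier of the weak entity
            RENTLINE_NUM INTEGER NOT NULL,
            -- composite primary key: owner key + partial identifier
            PRIMARY KEY (RENT_NUM, RENTLINE_NUM)
        );
    """)
    return conn


def try_insert_line(conn: sqlite3.Connection, rent_num: int, line_num: int) -> bool:
    """Return True if the rental line is accepted, False if a constraint rejects it."""
    try:
        conn.execute("INSERT INTO RENT_LINE VALUES (?, ?)", (rent_num, line_num))
        return True
    except sqlite3.IntegrityError:
        return False
```

With one row in RENTAL, lines (1, 1) and (1, 2) are accepted, a second (1, 1) is rejected by the composite primary key, and a line for a rental that does not exist is rejected by the foreign key.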

Validation of foreign keys occurs later on in the logical database design phase when we assign and validate integrity constraints.

Step 3: Map Multivalued Attributes

As part of the ER verification process, attributes that contain multiple values are identified and the database designer will have either:

created several new attributes, one for each of the original multivalued attribute's components, or
created a new entity composed of the original multivalued attribute's components.

Therefore, at logical database design, such attributes should not exist in the ERD. However, if they do, then for each multivalued attribute that is found within an entity, create a new relation. The new relation should have a foreign key, which is the primary key from the original entity. The primary key of the new relation is a composite key comprised of the primary key of the original entity and one or more attributes of the multivalued attribute itself.

Supposing we created an entity called CAR, where CAR_COLOUR was a multivalued attribute comprising of attributes containing information about the different colours (COL_COLOURS) that are used on different sections of the car (COL_SECTION). Figure 11.10 shows how a new entity called CAR_COLOUR is used to represent this multivalued attribute and the relations that are created.
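The CAR example can be sketched as two tables. This is an illustration under stated assumptions rather than the figure's exact schema: the key name CAR_ID is hypothetical (the text does not name CAR's primary key), and choosing COL_SECTION as the second half of the composite key is one possible reading of "one or more attributes of the multivalued attribute". The function name `colours_of_car` is also invented for the sketch.

```python
import sqlite3


def colours_of_car() -> list:
    """Map the multivalued attribute CAR_COLOUR to its own relation and query it."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript("""
        CREATE TABLE CAR (
            CAR_ID INTEGER PRIMARY KEY          -- hypothetical key name
        );
        CREATE TABLE CAR_COLOUR (
            CAR_ID      INTEGER NOT NULL REFERENCES CAR (CAR_ID),  -- FK to the original entity
            COL_SECTION TEXT    NOT NULL,       -- section of the car
            COL_COLOURS TEXT    NOT NULL,       -- colour used on that section
            PRIMARY KEY (CAR_ID, COL_SECTION)   -- original PK + part of the multivalued attribute
        );
        INSERT INTO CAR VALUES (1);
        INSERT INTO CAR_COLOUR VALUES (1, 'BODY', 'Red');
        INSERT INTO CAR_COLOUR VALUES (1, 'ROOF', 'Black');
    """)
    rows = conn.execute(
        "SELECT COL_SECTION, COL_COLOURS FROM CAR_COLOUR "
        "WHERE CAR_ID = 1 ORDER BY COL_SECTION"
    ).fetchall()
    conn.close()
    return rows
```

Each colour of a car becomes one row in CAR_COLOUR, so a car can carry any number of colours without repeating groups in CAR itself.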

FIGURE 11.10 Example of mapping multivalued attributes

Step 4: Map Binary Relations

One-to-many (1:*) Relationships

For each 1:* relationship, create the relations for each of the two entities that are participating in the relationship. To create the foreign key on the many side, include the primary key attribute from the one side. The one side is referred to as the parent table and the many side is referred to as the dependent table. For example, Figure 11.11 shows a portion of the ER diagram which represents the pays one-to-many relationship between CUSTOMER and RENTAL and the corresponding relations that are created.

FIGURE 11.11 Example of mapping a 1:* relationship
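The 1:* rule (parent key copied into the dependent table as a foreign key) might look as follows in practice. This is a hedged sketch only: CUST_NUM is an assumed name for CUSTOMER's primary key, the NOT NULL on the foreign key assumes that every rental must belong to a customer, and the sample rows and helper names are invented for illustration.

```python
import sqlite3


def build_customer_rental() -> sqlite3.Connection:
    """CUSTOMER is the parent (one side); RENTAL is the dependent (many side)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript("""
        CREATE TABLE CUSTOMER (
            CUST_NUM INTEGER PRIMARY KEY        -- assumed key name
        );
        CREATE TABLE RENTAL (
            RENT_NUM INTEGER PRIMARY KEY,
            -- primary key of the one side, included on the many side
            CUST_NUM INTEGER NOT NULL REFERENCES CUSTOMER (CUST_NUM)
        );
        INSERT INTO CUSTOMER VALUES (10);
        INSERT INTO CUSTOMER VALUES (20);
        INSERT INTO RENTAL VALUES (1, 10);
        INSERT INTO RENTAL VALUES (2, 10);
        INSERT INTO RENTAL VALUES (3, 20);
    """)
    return conn


def rentals_per_customer(conn: sqlite3.Connection) -> list:
    """Count the dependent rows attached to each parent row."""
    return conn.execute("""
        SELECT c.CUST_NUM, COUNT(r.RENT_NUM)
        FROM CUSTOMER c
        LEFT JOIN RENTAL r ON r.CUST_NUM = c.CUST_NUM
        GROUP BY c.CUST_NUM
        ORDER BY c.CUST_NUM
    """).fetchall()
```

One customer row can be referenced by many rental rows, while a rental that points at a non-existent customer is rejected by the foreign key.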

One-to-one (1:1) Relationships

1:1 relationships are a special kind of relationship and each one has to be treated individually depending on the participation constraints between the two entities:

If both entities are in a mandatory participation in a relationship and they do not participate in other relationships, it is most likely that the two entities should be part of the same entity.


If there is

mandatory

optional becomes

the

foreign If

key

both

role

becomes

dependent of the

entities

of the

are in

entity

to

of the

both sides key

LECTURER

would

SCHOOL

FIGURE 11.12

hard

entity that

has the

the

to

dependent

parent

has the

mandatory

entity

determine

entity.

To

be determined,

relationship

should

which make

perhaps

can be obtained,

are

both

again

mandatory,

in

either

Figure

then

create

deans

of this the

contain

the

would

take

a decision,

through

it is

the

more

the identification

up to the

database

key is

Each

new relation

mandatory

would

with two

exist

have

and optional

to

must

have

LECTURER

is

LECTURER

between at each

two

copies

participations

copies

of the

in this

between

a lecturer a

to

also shown in Figure

placed

can

constraints

the relationship

school

so the

is

relationship participation

or

primary

relationship.

SCHOOL

1:1 relationship

foreign

the

consider

of a school,

the

at the

one relation

the

11.12. from

(e.g.

look

of both

to represent

relationship

are the

participant,

is recursive

would

comprises

you could

shown

all lecturers

entities you

map a 1:1 relationship,

the

one. The mapping optional

then the that

to

it is

information

two

then

new relation

Therefore,

not

optional is the

we

and

school.

the

entity),

then

a further

how

be the

no further

If they

are optional,

To illustrate

However,

would

then

would have to

key. If the relationship

or create

entity

corresponding

participation,

which

between same

relationship.

primary

and the

decision.

1:1 relationship

of the

The relation

However, if

of the

entity

entity.

and

make the

occurrences side

parent

about the relationship

designer

of the

entity.

an optional

more attributes.

If the

on one side of the relationship, the

dependent

parent

information of

participation

participation

entities

who is the

mandatory

SCHOOL

11.12.

the

dean

participation.

relationship

Notice that

is

an

as SCHOOL

relation.

Example of mapping a 1:1 relationship

NOTE
Chapter 6, Data Modelling Advanced Concepts, contains more about design issues of implementing 1:1 relationships.

Many-to-many (*:*) Relationships

For each *:* relationship, create the relations for each of the two entities that are participating in the relationship. Then create a third relation to represent the actual relationship. The third relation will contain the foreign keys of the two original entities that participate in the original *:* relationship.

Figure 11.13 shows a *:* relationship between the entities TOUR and ATTRACTION. As you examine Figure 11.13, you can see that the relationship between the two entities is represented by the composite entity called ATTRACT_TOUR, which contains the foreign keys to both entities ATTRACTION and TOUR. Figure 11.13 also shows the corresponding relations.

FIGURE 11.13 Example of mapping a *:* relationship

[ER diagram: TOUR (TOUR_ID {PK}) 1..1 may_contain 0..* ATTRACT_TOUR (TOUR_ID {PK}{FK1}, ATTRACTION_NO {PK}{FK2}) 0..* may_be_visited 1..1 ATTRACTION (ATTRACTION_NO {PK}, CITY_ID {FK})]

TOUR Relation
TOUR {TOUR_ID, TOUR_NAME, TOUR_DESCRIPTION, TOUR_PRICE_ADULT, TOUR_PRICE_CHILD, TOUR_PRICE_CON}

ATTRACTION Relation
ATTRACTION {ATTRACTION_NO, CITY_ID*, ATTRACT_TYPE, ATTRACT_NAME, ATTRACT_WEBSITE, ATTRACT_PHONE, ATTRACT_OPENING_TIME, ATTRACT_CLOSING_TIME, ATTRACT_ADDRESS, ATTRACT_TRAVEL_INSTRUCT, ATTRACT_COST_ADULT, ATTRACT_COST_CHILD, ATTRACT_COST_CON}

ATTRACT_TOUR Relation
ATTRACT_TOUR {ATTRACTION_NO*, TOUR_ID*}
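The three relations of Figure 11.13 can be sketched as tables to show how the composite entity resolves the *:* relationship. The example below is a trimmed-down illustration, assuming only the key and name columns (the remaining attributes and the CITY_ID foreign key are left out for brevity); the sample tours and attractions and the helper names are invented.

```python
import sqlite3


def build_tour_schema() -> sqlite3.Connection:
    """TOUR and ATTRACTION linked through the composite entity ATTRACT_TOUR."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript("""
        CREATE TABLE TOUR (
            TOUR_ID   INTEGER PRIMARY KEY,
            TOUR_NAME TEXT
        );
        CREATE TABLE ATTRACTION (
            ATTRACTION_NO INTEGER PRIMARY KEY,
            ATTRACT_NAME  TEXT
        );
        CREATE TABLE ATTRACT_TOUR (
            ATTRACTION_NO INTEGER NOT NULL REFERENCES ATTRACTION (ATTRACTION_NO),
            TOUR_ID       INTEGER NOT NULL REFERENCES TOUR (TOUR_ID),
            PRIMARY KEY (ATTRACTION_NO, TOUR_ID)   -- both foreign keys form the key
        );
        -- illustrative sample data only
        INSERT INTO TOUR VALUES (1, 'City Walk');
        INSERT INTO TOUR VALUES (2, 'Harbour Trip');
        INSERT INTO ATTRACTION VALUES (100, 'Museum');
        INSERT INTO ATTRACTION VALUES (200, 'Castle');
        INSERT INTO ATTRACT_TOUR VALUES (100, 1);
        INSERT INTO ATTRACT_TOUR VALUES (200, 1);
        INSERT INTO ATTRACT_TOUR VALUES (100, 2);
    """)
    return conn


def attractions_on_tour(conn: sqlite3.Connection, tour_id: int) -> list:
    """List the attraction names linked to one tour through ATTRACT_TOUR."""
    rows = conn.execute("""
        SELECT a.ATTRACT_NAME
        FROM ATTRACT_TOUR link
        JOIN ATTRACTION a ON a.ATTRACTION_NO = link.ATTRACTION_NO
        WHERE link.TOUR_ID = ?
        ORDER BY a.ATTRACT_NAME
    """, (tour_id,)).fetchall()
    return [name for (name,) in rows]
```

Here the Museum appears on both tours and the City Walk visits two attractions, which is exactly the many-to-many behaviour that the third relation makes representable with two 1:* links.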

NOTE

11 Remember

that,

during the

at the

ER model verification

1:* relationships

and

of a ternary three

in the fourth

a fourth

The

keys

primary

and

is

no single

of relations.

depend

Copyright Editorial

review

2020 has

on a number

Cengage deemed

set

Learning. that

any

All

Rights

may have

not, then

been

identified

and

all *:* relationships

then

resolved

be

mapped to

should

phase.

does

entities

May not

not materially

be

amongst

must of the

be

entities

created

entities

in

composite

to the

for

designers

mapping have

One consideration

affect

scanned, the

overall

represent

ternary

primary

exist in the

the

ERD. In the

relationship

relationship

key along

supertype

come

is

up

or

duplicated, learning

in experience.

whole

or in Cengage

and with

case

amongst

become

the

foreign

with any additional

subtype

several

whether individual

not part of the inheritance

copied,

may also

keys

attributes.

Relationships

available

of factors.

Reserved. content

relation of each

database

with other

suppressed

design

degrees

and Subtype of rules

Therefore,

relationships

However, if

database

may also form the

Step 6: Map Supertype There

process.

of higher

relationship,

relation

*:* relationships

Relations

other relationships

entities.

level,

during the logical

Step 5: Map Ternary Ternary

conceptual

part.

hierarchy.

Due Learning

to

electronic reserves

right

subtypes

some to

third remove

party additional

content

into

a set

techniques,

participate

Another is the type

rights, the

relationships

different

may content

be

any

in further

of disjoint

suppressed at

which

time

from if

the

subsequent

and


overlapping options

constraints

that

exist

Option

1:

Merge the

participation In this and

in the

case,

subtypes

determines

into the

the

supertype

and the

which

for the

rows

subtypes

whether

2:

subtypes

in the

Create

example,

the

relationship

subtype.

Two of the

most common

for

an attribute

corresponding

called

of attributes

each

Both subtype

only

and the

is

used

provide

frequency

if the

is created

case,

in

in

STUDENT

to

determine

P

or F

the

supertype

to

11.14

illustrates

participate

option

the

1 and

in

in

between

Other

a

the

mandatory

merge the supertype

mapped relation

discriminate

and

subtypes.

Figure

subtypes

a guide. of

supertype

example,

no overlapping

shown two

Notice that

which

options

overlap.

in the

PART_TIME_STUDENT

value

this

we could follow

PERSON.

PERSON_TYPE,

table.

in

Therefore,

In are

relationship

called

when

or (P)ART_TIME.

subtype.

These

For

subtypes

the

can

placed

STUDENT_TYPE

and there

STUDENT.

subtypes

is

subtype.

be assigned

and

relationship

PERSON.

that

was (F)ULL_TIME

supertype

and

one relation

database

FIGURE 11.14

in the

could

that

type (with

attribute

supertype/subtype

with the supertype

contains

each

to

This is suitable

and the

attribute

belong

additional

STUDENT

EMPLOYEE

into

table

one relation.

mandatory

was student

STUDENT_TYPE of

participate

PERSON

subtypes

number

merged.

is

an additional

database

then the

one relation

subtypes

create

STUDENT

instance

optionally

overlapping

and the

were

a particular

Option

to

supertype

and create

relationship

discriminator

and FULL_TIME_STUDENT), if the

supertype

supertype/subtype

use the

discriminator

For

between

are:

factors

Figure

11.14

subtypes

to

in

consider

the

are the

overlapping.

Example of mapping a supertype/subtype relationship

{AND, Mandatory}
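Option 1 (merging the subtypes into the supertype relation and adding a discriminator attribute) can be sketched as a single table. This is a rough illustration, assuming a STUDENT relation whose discriminator STUDENT_TYPE takes the values 'F' (full-time) or 'P' (part-time) as the text describes; the key name STUDENT_ID, the sample rows and the function name are hypothetical.

```python
import sqlite3


def students_by_type() -> dict:
    """Store both subtypes in one STUDENT relation, told apart by STUDENT_TYPE."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE STUDENT (
            STUDENT_ID   INTEGER PRIMARY KEY,   -- hypothetical key name
            -- discriminator: 'F' = full-time, 'P' = part-time
            STUDENT_TYPE TEXT NOT NULL CHECK (STUDENT_TYPE IN ('F', 'P'))
        );
        INSERT INTO STUDENT VALUES (1, 'F');
        INSERT INTO STUDENT VALUES (2, 'P');
        INSERT INTO STUDENT VALUES (3, 'F');
    """)
    counts = dict(conn.execute(
        "SELECT STUDENT_TYPE, COUNT(*) FROM STUDENT GROUP BY STUDENT_TYPE"
    ).fetchall())
    conn.close()
    return counts
```

The discriminator lets a query pick out the rows of one subtype without a join; the price of this option is that subtype-specific attributes would have to allow NULLs for rows of the other subtype.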

11.2.2 Validating the Logical Data Model Using Normalisation

As you will have seen in Chapter 7, Normalising Database Designs, the normalisation process helps to establish appropriate attributes, their characteristics, and their domains. Nevertheless, because the conceptual modelling process does not preclude the definition of attributes, you can reasonably argue that normalisation often straddles the line between conceptual and logical modelling. In creating the logical data model, we have so far created relations from the ERD and these relations should already be in third normal form. If they are not, then it is likely that we have made some mistake during the model verification process and it will be necessary to revisit the ERD model.

If normalisation has not been undertaken prior to logical database design, it can be used as an additional tool in which to verify the relations within the logical database design. After all, a normalised relational database schema will avoid certain anomalies when inserting, updating or deleting data in the database and will therefore help to keep the data consistent.

11.2.3 Validate Integrity Constraints

The next stage in logical database design is to check and validate the integrity constraints within the database. In Chapter 3, Relational Model Characteristics, you were introduced to the three main types of integrity constraints that we can impose on objects within the database. These were domain constraints, entity integrity and referential integrity. In creating the logical data model, most of these constraints will have been identified, but it is essential that these are validated as a separate process.

First, let's look at domain constraints. All the values that are assigned to a specific attribute must be within the range of allowable values. For example, the domain constraints for the DVD relation are shown in Table 11.5.
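Domain constraints of this kind are commonly expressed as CHECK constraints. The sketch below covers only the three Table 11.5 rows whose domains are clearly legible (DVD_ID up to 10 characters, DVD_NAME up to 50 characters, DVD_COPIES between 1 and 50); the charge columns are left out because their ranges are not recoverable here. Treating "size" as a maximum length is an assumption, and the helper names are hypothetical.

```python
import sqlite3


def build_dvd_table() -> sqlite3.Connection:
    """Create a cut-down DVD relation with its domain constraints as CHECKs."""
    conn = sqlite3.connect(":memory:")
    conn.execute("""
        CREATE TABLE DVD (
            DVD_ID     TEXT PRIMARY KEY CHECK (length(DVD_ID) <= 10),
            DVD_NAME   TEXT             CHECK (length(DVD_NAME) <= 50),
            DVD_COPIES INTEGER          CHECK (DVD_COPIES BETWEEN 1 AND 50)
        )
    """)
    return conn


def accepts(conn: sqlite3.Connection, dvd_id: str, name: str, copies: int) -> bool:
    """Return True if the row satisfies every domain constraint."""
    try:
        conn.execute("INSERT INTO DVD VALUES (?, ?, ?)", (dvd_id, name, copies))
        return True
    except sqlite3.IntegrityError:
        return False
```

A row with 3 copies is accepted, whereas 0 or 51 copies, or an 11-character DVD_ID, is rejected before it can reach the table.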

TABLE 11.5

for the

Attribute

Description

DVD_ID

Set of all possible

DVD_COPIES

DVD relation Domain

Number

values for

movie codes

of copies held of each

Alphanumeric

movie

Integer

character

size 10

2 digits

Minimum value of 1 and

DVD_NAME

The name

DVD_CHARGE

The cost to rent

of the

movie a DVD

maximum

value

Character

size

The amount

to

be paid for

each

day the

DVDis late CATEGORY

Set of all possible

50

DVD_CHARGE

.5

6

DVD_CHARGE

.5

0.25

DVD_CHARGE

,5

25

CHARGE DVD_LATE_CHG_DAY

of 50

movie categories

,5

and

'Comedy',

Entity integrity

is

no null

singular

and

into

parent

the

keys

review

key

values

maintained.

must

What to

contain

is

foreign

key

If

has

are

11

Cengage deemed

Learning. that

any

All suppressed

Rights

Reserved. content

does

movie

May not

not materially

be

have

checking

place.

For example,

not

copied, affect

scanned, the

overall

or

values.

are

deleted.

in experience.

whole

row

If

then the

there

duplicated, learning

key.

This

in logical

'Doc'

constraint database

the relationships in the

in the

guarantees design,

every

prior to this copy

both

a number

of

ways in

DVDrelation This allows

or in Cengage

part.

Due Learning

to

electronic reserves

so that stage

key cannot

NULL values

the

is

user

rights, the

right

some to

are allowed which they

performing

third remove

party additional

content

child

or not

an entry

in the

may

can

be

be dealt

any

with:

COPY relation delete

suppressed at

COPY

as values in the

the

content

and

whether

be NULL in the

until all values in the

the

the relations

store, for every row

of a DVD requires

DVD_ID foreign then

between

DVD rental

DVD relation

been identified

was optional,

been

stage

'Action',

checked.

NULL

allowed

a primary

At this

involves

Don't allow any movies to be deleted from the with that

has

key.

be an existing

mandatory,

that if the relationship NULLs

relation

may have

It is possible

2020

are

keys are in there

be allowed

attributes.

each primary

constraints

and the relationship

associated

Copyright

the

relation.

1

Editorial

relation

is

that

into

foreign

COPY

would

DVD relation

integrity

correct

relationship

foreign

primary

referential

ensure that the

entered

by ensuring

are inserted

composite

Validating to

validated

values

and

Character size 6. Must be one of 'Family',

that

DVD_

16

time

operation

from if

the

to check if they wish to proceed with the deletion and, more importantly, ensures that referential integrity is maintained.

2 When a MOVIE is deleted, then delete all associated rows in the COPY relation. This is known as a cascading delete constraint.

3 Set the foreign key value to NULL in the COPY relation. This means we may have copies of a movie in the store that are no longer listed as being available for rental. Follow this strategy with care.

It is possible that different choices will apply to different relations in the same database, so careful consideration is needed.
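The three deletion strategies correspond to the standard SQL referential actions RESTRICT, CASCADE and SET NULL. The following is a small sketch, assuming a DVD relation keyed on DVD_ID and a COPY relation keyed on COPY_CODE with DVD_ID as its foreign key; the sample rows and the function name `delete_parent` are invented for the illustration.

```python
import sqlite3


def delete_parent(on_delete: str):
    """Delete a DVD row under the given referential action.

    Returns (deleted, copies), where copies lists the COPY rows left afterwards.
    """
    if on_delete not in ("RESTRICT", "CASCADE", "SET NULL"):
        raise ValueError("unsupported referential action")
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript(f"""
        CREATE TABLE DVD (
            DVD_ID TEXT PRIMARY KEY
        );
        CREATE TABLE COPY (
            COPY_CODE TEXT PRIMARY KEY,
            -- nullable so that SET NULL is possible
            DVD_ID    TEXT REFERENCES DVD (DVD_ID) ON DELETE {on_delete}
        );
        INSERT INTO DVD  VALUES ('D1');
        INSERT INTO COPY VALUES ('C1', 'D1');
        INSERT INTO COPY VALUES ('C2', 'D1');
    """)
    try:
        conn.execute("DELETE FROM DVD WHERE DVD_ID = 'D1'")
        deleted = True
    except sqlite3.IntegrityError:
        deleted = False   # strategy 1: deletion refused while copies exist
    copies = conn.execute(
        "SELECT COPY_CODE, DVD_ID FROM COPY ORDER BY COPY_CODE"
    ).fetchall()
    conn.close()
    return deleted, copies
```

Calling `delete_parent("RESTRICT")` leaves both copies in place and refuses the delete, `delete_parent("CASCADE")` removes the copies along with the movie, and `delete_parent("SET NULL")` keeps the copies but blanks their DVD_ID, which is the situation the text warns should be used with care.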

11.2.4 Merge Relations

In the database different

user views

directly

from

relations. small

design

next

databases,

Identify

For

the

example,

in the

database.

can

see

ideal

Although

this

view

data

combine

be easily

merged

be

model

them.

and

to

should

new logical

store,

redundancy

remove

merged

which

one

are

amongst

any redundancy.

not

at a time.

For

For

each

duplicated.

Usually,

these

relations

model,

ensuring

that

will have the

identified.

logical

managers

different

views

the following

data

and

sales

staff

of the database two

relations

all integrity

will require

different

information

may have been designed

for the

from

managers

which have been created:

EMP_FNAME,

Learning.

EMP_LNAME, relations

they

EMP_SALARY,

All suppressed

to

AREACODE,

Rights

does

May not

simple,

from

However,

was called

Reserved. content

not materially

EMP_ID

same

meaning.

these

be

copied, affect

SALES_STAFF

describe

the

called

PHONE)

there

different called

attributes

scanned, the

overall

or

duplicated, learning

PHONE)

have

the

characteristic

same

AREACODE, are a number

views.

When

EMPLOYEE.

PHONE,

of

These

of recognised we

merged

problems

the

We assumed

those

that

of employees,

are

in experience.

EMP_NO?

are

known

synonyms

whole

or in Cengage

part.

that

Due Learning

to

key

value,

This

makes

EMP_SALARY)

electronic reserves

We would then as synonyms represent

rights, the

right

some to

third remove

party additional

occur

SALES_

MANAGER

and

based

upon the fact that

if the

primary

have two

may content

be

with

database

number.

suppressed at

key in the

attributes

up to the

employees

content

can

and

both the

and it is the

that

MANAGER

what would have been our assumption instead

primary

of an employee.

EMPLOYEE.

to the same characteristics,

were the same.

that

AREACODE,

EMP_FNAME,

a new relation

but the

and

that

one relation

be quite

of relations

referred

understand

into

EMP_LNAME,

we formed

relation

attributes

merging

appears

relations

EMP_FNAME,

MANAGER

have

for

names

to

any

two

merging

keys

different

that

the

process the

primary

Cengage deemed

new

user

will so far have created relations

duplication

of relations

should:

can

EMP_LNAME,

addition,

relations,

designer

has

two

Consider

(EMP_NO,

SALES_STAFF

2020

DVD rental

candidates

SALES_STAFF

review

in the

sets

designer

and

relations

(EMP_NO,

that In

surrounding

Copyright

each

in the

are similar

so such

(EMP_NO,

EMPLOYEE

Editorial

from

process

some

to represent

SALES_STAFF_VIEW

EMP_NUM.

the

that

merge these

database

all relations

Therefore,

SALES_STAFF

the

of relations

merged the

design to

will have been generated

MANAGER_VIEW

In the

STAFF

to

of ERDs

maintained.

sales staff.

MANAGER

them

therefore

relationships are

You

is

sets

key,

all the

In the

11

lead

stage

constraints

and the

database

will inevitably

relations

primary

Check

The logical

which

include

those

same

system.

that is

Automatically

a number

ERD,

the

set of relations

that

of the

each

The

process, it is likely

any


A second meanings. and

problem These

can occur if attributes

kinds

SALES_STAFF

database

both

designer

stores

of attributes had

attributes

discovered

that

area code and phone

employees merged

home

EMPLOYEE

when

merging

already

been

if a duplicate

relations,

merged

sure that the

completing

complete

database moving

attributes

when

phone

database If

the

original

of the

logical

the

users

database

by verifying

logical

model

a single,

in

transactions

it is

need to a few

be solved

steps

11.3

database

by the

DBMS

specifications to

addition,

we

The

was

happen to

if the

a particular

to an individual

be included

in the

STORE_PHONE,

EMP_

check

that

supertype/subtype

supertype/subtypes

created,

then

the

that

decision

merging relation.

have

needs

to

This is just to

be

make

that

the

stages

the logical

in

the

actively model.

all the

different

should

exist

constraints model

which

should

with the

represents

be again

the

validated

user.

Model with the User

database

ensure

model

all integrity

reviewing

have

beginning

design

database.

reference

the

before

different

The next

in

stage

views.

This

database design

the

conceptual

involves

data requirements

user

physical

participated

reviewing

have

stage

design

is

design

been

the

stage,

completed

modelled

very important

of the

and

as any

even if it requires

all the

problems

going

back

process.

PHYSICAL DATABASE DESIGN

Physical used

to

within the

and revisiting

to

contains

logical

correct,

process:

should

conceptual

users

are supported

needs

exists in the

validated,

that

the

database

the

with the

have to

603

was correct.

stage

of the

also

relations

model

11.2.5 Review the Complete Logical So far,

would

Design

MANAGER

would

referred

STORE_AREACODE,

designer

one

relationship

To ensure

final

meanings

have

referred

they

Database

relations

What

relation

relation

Physical

relations

original

PHONE.

MANAGER

Both

EMP_FNAME,

the

stage,

system.

to the

and

the

and

separate

The two

SALES_STAFF

number?

correctly.

decision this

in

whilst in the

supertype/subtype

original

After

AREACODE

name in

homonyms.

Logical,

EMP_SALARY)

are represented

revisited

called

EMP_LNAME,

EMP_PHONE,

relationships

before

and

same

as

Conceptual,

i.e:

(EMP_NO,

AREACODE, Finally,

relation,

with the

known

these

number,

area code

EMPLOYEE

are

11

for attribute

must

be able

designer

ultimate

goal

to

and

definition

response been

data

time.

In

or access

data.

this,

decisions

that

storage

the logical

In

doing

key)

any relationships

complex

ensure

of specific must translate

by a primary

to represent

have

we

accessing

(represented

be to

of query

needs

the

do this,

with some

must

in terms

information

requires order to storing

a data

database

efficient

In

1

to

we

must

be located

that

occur

regarding

storage

order

can

between

effective

carry

out

to

physical

that

every

physical

is

This

and design,

stored.

security, the

In

presents

physically

integrity

database

logical

database.

relations.

database

ensure

will be

a set of specific

ensure

in the

how the

is

methods that

model into

and

following

collected:

1 A set of normalised relations devised from the ER model and the normalisation process. This would have been derived from the conceptual and logical design stages and would be the logical data model.

2 An estimate of the volume of data which will be stored in each database table and the usage statistics.

3 An estimate of the physical storage requirements for each field (attribute) within the database.

4 The physical storage characteristics of the DBMS that is being used.


Physical database design can be broken down into a number of stages:

1 Analyse data volume and database usage.

2 Translate each relation identified in the logical data model into a table.

3 Determine a suitable file organisation.

4 Define indexes.

5 Define user views.

6 Estimate data storage requirements.

7 Determine database security for users.

Next you will learn about each of these phases in more detail.

11.3.1 Analyse Data Volume and Database Usage Analysing The

user queries

process

is

introduced it is

often

to in

usage,

the

data

gather to

volume

predict

that

and the

has

cent

been

Wiederhold

in

of this

his 1983

chapter.

data volume

involved

either

that

book

may arise.

types are

in

design, that

will take

to

cent

of

system,

the

in

it

would

further

were

on each

involves

a given

file

and

essential occur

in

to order

be impossible

that

rule

to

account

suggested

reading

requested

used today in analysing

place

that

80/20

the

queries

you

database,

It is therefore

transactions

upon

design.

Every transaction of requests

may have, so transactions

based

which is listed

20 per

the

modification.

a large

users

This is

This rule is often

or

Generally,

that

designing that

number

most important

database

Cycle (SDLC)

processing.

the

viewing

of queries that

estimated

in the

the

considered.

on database

Wiederhold

of transactions

on both

for

Life

When physically

number

based

on at least

data

80 per cent of data accesses.

11

the

stage of physical

Development

Process.

possible

different to

Systems

requested

issues

possible

of the

is usually the first

Development

and limitations

as

of accesses

database

approximately

overheads

performance

all the

80 per

know

much information

establish

end

has

as part

Database

you

database

and

as

out

10,

that

within the

data

carried

Chapter

very important

table

and the size of the

section

by users

for

by

Gio

at the

account

data usage in existing

for

database

systems. The

steps

required

Identifying

the

of the

most

DVDs

every

an impact

In

and critical

shown each

Copyright Editorial

review

2020 has

transactions

to

would need to data

usage

map for

as dashed entity

Cengage deemed

usage

Learning. that

any

lines

represent

All suppressed

Rights

such

determine

For example renting

in our

a DVD.

as a Friday

does

May not

DVD rental

store,

While customers

or Saturday

evening,

one

would

which

rent

might

have

be

copied, affect

are usually

usage arrows

scanned, overall

or

shown

DVD rental

representing

number

the

in the

database

relations

(COPY,

duplicated, learning

on a simplified

map or a transaction

of the

estimated

not

relations

a DVD, four

participate

in these

RENT_LINE,

RENTAL

be accessed.

a section

materially

which

to rent

statistics

with the the

Reserved. content

transactions.

peak times

diagram is known as a composite composite

are:

will be a customer

may be

order for a customer

CUSTOMER) and

phase

performance.

of critical

Data volume

out this

transactions

day, there on the

Analysis

carrying

most frequent

common

transactions. and

for

store. the

of records

in experience.

whole

or in Cengage

part.

Due

Access

direction

stored

Learning

usage

to

electronic reserves

each

rights, the

frequencies of the

in

right

version

some to

of the

ERD.

This

map. Figure 11.15 shows the to

access.

each

relation

The numbers

are inside

relation.


FIGURE 11.15 Composite usage map for the DVD rental store

As you examine Figure 11.15, note the following:

It is estimated that the store has 600 customers.

The store holds 1 500 different movies and has on average three copies of each movie title (giving an average of 4 500 records in the COPY relation).
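The COPY estimate above is a simple product of two of the quoted figures. A minimal sketch of that arithmetic follows; the variable names are illustrative and do not come from the text.

```python
# Estimated figures quoted for the DVD rental store.
movie_titles = 1_500         # different movies held by the store
avg_copies_per_title = 3     # average number of copies of each title

# Expected number of records in the COPY relation.
copy_records = movie_titles * avg_copies_per_title
print(copy_records)
```

The same style of calculation can be repeated for each relation on the usage map once its driving estimates are known.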


There are approximately comprise

customers

Each customer

visits the

1 200 records gives

in the

an average

Of the

500

(renting

500

The

store

the

receives

estimated

also receives DVD is

It is important

to

available

are rented

accesses

with the

store.

to an estimate

by each

of

customer,

which

relation.

access

to

400

the

of these

RENTAL

(assuming

relation

a customer

to the

COPY table

movies

each

involve could

rents

to validate

20

generate

more than

the rented

week; therefore

DVD rentals a

one

DVD).

copy.

accesses

are required

to

degrade.

of

The

complete

that

has taken

of new

where

that

and

the

also

be taken

that

this

the

asking if a copy

to the

of new reporting

which

of a

of the

over the

database, as

can

cause

next

Once

and

users,

OLAP (Online

performance

The

an increase Analytical

of the

size

in

of

analysis

statistics

years.

for

but to gain

an initial

access

several

its

an idea

business.

additional

such

are only estimates exact figures

designer

of data volume

business

tools

obtain

database

functions

an estimate

shown

to

database,

give the

of the

with the

access

in

phase

provide

will grow

statistics

It is not necessary

may occur

an overview

require

introduction

into

customers

and volume

business.

from

to

database

applications

and the

a week from

data access

bottlenecks gathered

place it is important

assumption

the

within the

statistics

system

80 enquiries

for rent.

emphasise

an understanding

data

20 new

on average

most critical transactions

the

each

DVDs

day. These

registering

DVDs, leading

approximately

relation

500 accesses

on average

four

relation,

that

each

customers

at week to rent

RENT_LINE

RENT_LINE

relation

new

DVD relation.

specific

to

and

On average

in the

CUSTOMER

to the

could lead to

The store

the

It is

CUSTOMER

DVDs

on average twice relation.

to the

accesses

This in turn

store

RENTAL

accesses

to the

and returning

of 4 800 records

or returning).

further

500 accesses

renting

on the

development the

volume

Processing)

of

must

all

consideration.

11.3.2 Translate Logical Relations into Tables

The output from the logical database design stage was a complete set of normalised relations which represents the complete database system. For each relation we use the information documented in the data dictionary (e.g. the names of the attributes, their domains, details of any constraints, etc.) to construct each corresponding table. How the tables are constructed is specific to the target DBMS, but the stages involved are similar. For each relation you should:

a Identify each attribute name and its domain from the data dictionary. Note any attributes which require DEFAULT values to be inserted into the attribute whenever new rows are inserted into the database.

b Determine any attributes that require a CHECK constraint in order to validate the value of the attribute.

c Identify the primary and any foreign keys for each table.

d Identify those attributes that are not allowed to contain NULL values and those which should be UNIQUE. You can exclude the primary key attribute(s) here as the PRIMARY KEY constraint automatically imposes the NOT NULL and UNIQUE constraints.

Once you have identified all of the above, the DBMS-specific SQL can be written to create the table. The order of table creation is very important. Relations with no foreign key dependencies should be created first, followed by those that contain one foreign key, then two, etc. Let's now look at an example using two of the relations from the DVD rental store. Table 11.6 shows a portion of the data dictionary for the DVD and COPY relations that were determined during the logical database design stage, and Figure 11.16 shows the corresponding SQL code used to create the two tables where the target DBMS is Oracle.

FIGURE 11.16 Creating the DVD and COPY tables

CREATE TABLE DVD (
    DVD_ID VARCHAR2(10),
    DVD_COPIES NUMBER(3) NOT NULL,
    DVD_NAME VARCHAR2(50) NOT NULL,
    -- NUMBER(4,2) holds values up to 99.99, matching the 99.99 format
    DVD_CHARGE NUMBER(4,2),
    DVD_LATE_CHG_DAY NUMBER(4,2),
    CATEGORY CHAR(6),
    CONSTRAINT pk_dvd_dvdcode PRIMARY KEY(DVD_ID),
    CONSTRAINT ck_dvd_category CHECK (CATEGORY IN ('Family', 'Action', 'Comedy', 'Doc')),
    CONSTRAINT ck_dvd_charge CHECK (DVD_CHARGE BETWEEN 6 AND 16),
    CONSTRAINT ck_dvd_latecharge CHECK (DVD_LATE_CHG_DAY BETWEEN 0.25 AND 25),
    CONSTRAINT ck_dvd_dvdcopies CHECK (DVD_COPIES BETWEEN 1 AND 50));

CREATE TABLE COPY (
    COPY_CODE VARCHAR2(10),
    DVD_ID VARCHAR2(5),
    COPY_NUM NUMBER(2) DEFAULT 1 NOT NULL,
    CONSTRAINT pk_copy_copycode PRIMARY KEY(COPY_CODE),
    -- the foreign key column must exist in COPY and reference the DVD table
    CONSTRAINT fk_copy_movie FOREIGN KEY(DVD_ID) REFERENCES DVD(DVD_ID));

In Figure 11.16, notice that the constraints imposed on the attributes have been named. It is important to name constraints using a standard so that we can associate a particular constraint with a particular table. If we do not name them, the DBMS assigns a system-generated name that is difficult to understand. Naming them makes it easy to modify or drop constraints and quickly fix any errors, as we will know which table they are related to. An example standard format for constraint naming could be:

(constraint abbreviation)_(table name)_(column name)

where the constraint abbreviations would be PK (primary key), FK (foreign key), CK (check constraint) and UK (unique constraint). Using this format you can see that the primary key constraint has been named as pk_dvd_dvdcode in Figure 11.16.
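To see why named constraints help when an insert is rejected, the following is a minimal sketch that runs a cut-down version of the two tables in SQLite through Python's sqlite3 module. It is an assumption-laden stand-in for the Oracle code: the Oracle types are swapped for SQLite's TEXT and INTEGER, only some columns and checks are kept, and the names pk_dvd_dvdid and fk_copy_dvd, the helper try_insert and the sample rows are invented for illustration.

```python
import sqlite3

# Cut-down SQLite rendering of the DVD and COPY tables (not the Oracle DDL).
DDL = """
CREATE TABLE DVD (
    DVD_ID     TEXT,
    DVD_COPIES INTEGER NOT NULL,
    DVD_NAME   TEXT NOT NULL,
    CATEGORY   TEXT,
    CONSTRAINT pk_dvd_dvdid PRIMARY KEY (DVD_ID),
    CONSTRAINT ck_dvd_category
        CHECK (CATEGORY IN ('Family', 'Action', 'Comedy', 'Doc')),
    CONSTRAINT ck_dvd_dvdcopies CHECK (DVD_COPIES BETWEEN 1 AND 50)
);
CREATE TABLE COPY (
    COPY_CODE TEXT,
    DVD_ID    TEXT,
    COPY_NUM  INTEGER DEFAULT 1 NOT NULL,
    CONSTRAINT pk_copy_copycode PRIMARY KEY (COPY_CODE),
    CONSTRAINT fk_copy_dvd FOREIGN KEY (DVD_ID) REFERENCES DVD (DVD_ID)
);
"""


def try_insert(conn, sql, params):
    """Run one INSERT; return 'ok' or the integrity error message."""
    try:
        conn.execute(sql, params)
        return "ok"
    except sqlite3.IntegrityError as exc:
        return str(exc)


conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when enabled
conn.executescript(DDL)                   # parent table DVD is created before COPY

good_dvd = try_insert(conn, "INSERT INTO DVD VALUES (?, ?, ?, ?)",
                      ("M1000", 3, "Sample title", "Family"))
bad_category = try_insert(conn, "INSERT INTO DVD VALUES (?, ?, ?, ?)",
                          ("M1020", 2, "Another title", "Horror"))
orphan_copy = try_insert(conn, "INSERT INTO COPY VALUES (?, ?, ?)",
                         ("C0001", "ZZZZZ", 1))

print(good_dvd)
print(bad_category)
print(orphan_copy)
```

In recent SQLite versions the message for the rejected category typically includes the constraint name, which is the practical benefit of naming described above. The sketch also creates DVD before COPY, in line with the rule that tables without foreign key dependencies are created first.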

TABLE 11.6 Data dictionary for the DVD and COPY tables

Columns: Table Name, Attribute Name, Description, Type, Format, Range, Required, PK or FK, FK Referenced Table, Constraints.

Table Name   Attribute Name      PK or FK   FK Referenced Table
DVD          DVD_ID              PK
DVD          DVD_COPIES
DVD          DVD_NAME
DVD          DVD_CHARGE
DVD          DVD_LATE_CHG_DAY
DVD          CATEGORY
COPY         COPY_CODE           PK
COPY         DVD_ID              FK         DVD
COPY         COPY_NUM

PK = Primary key
FK = Foreign key
CHAR = Fixed character length data, 1 to 255 characters
VARCHAR2 = Variable character length data
NUMBER = Numeric data. NUMBER(9,2) is used to specify numbers with two decimal places and up to nine digits long, including the decimal places.

NOTE Both the In

SQL required

Chapter

11.3.3

8,

aspect

arranged.

from

model

one single

database record

store,

Selecting efficiently

the and

records

and

in

and

only

of the

Files that

are

Files

hashed

In the following

Heap

File

into

inserted

into

File

a sequential

more fields,

Copyright Editorial

review

2020 has

onto

stored

attribute

data

rows

to

SQL

Each

each

data

to

a record

Each a data

with the

are

are

file

DBMS.

types.

known

tables.

entity field.

as file many

Each row

in the logical For

fields

physically

may contain

many different

fields.

corresponds

records

storage

files.

from

of data

as

database

secondary

in

of a number

know

where

this

record

database

Oracle

data

example,

DVD_ID,

12c,

is

complex

techniques.

There

known

more fields,

such

known

stored

in the

DVD_TITLE,

are

the

are alot

data is

contains this

and

record

can identify

of file

are three

categories

you need to are

that of file

it. It

organisation

techniques

it is important

of

as quickly

of criteria that

However,

stored

thousands

how it

organisation

you

built

have

an

organisations:

files

organisations

which

are

based

on indexes

11

as hash files.

more detail about the characteristics

These

that

whether the type

file

as heap

as file

ensure

database and retrieve

and

administrator.

records

you learn in

to

your

As you can see, there

as

database

more fields,

If

be able to locate

of the

data loss.

ordered

possible.

must

growth

such

very important

as

you

must

or

is

quickly

organisation

or

organisation

file

each

heap,

any

come.

for

first

the

row.

sequential,

indexed,

of the

b-trees,

most commonly

bitmap

and join

used

indexes,

All

Heap files

time.

Since

of the

heap file are

The input

the

only

impractical

used

when

sequence

way to

if

only

where records

access

we want to

is

a large

often

a record

provide

are

quantity

used

in this

efficient

unordered.

to

of

Records

data

needs

automatically

type

of file is to

are to

be

generate

a

search

every

data retrieval.

Organisations organisation,

case

Rights

can it

Reserved. content

the

records

often the primary

search

suppressed

is that

as they

become

which is

Learning. that

file

and every record

not the

Cengage deemed

basic

on one

file

this

is

as

by the

by one

a table

key for

situations,

are specific

common

clusters.

the

Sequential

if this

tuning

sorted

row in the file, they

searched

attribute

some

Organisations

inserted

In

are

organisation

a DBMS

randomly

most basic file

primary

each

how the

relations

be represented

file

DBMS

techniques.

and

each

can

against

sections,

organisation

hash files

and

at the future

In

need

contain

for

choosing

may contain

consist

is required,

the

protection

Files that

The

record

consideration. often

or it

can

be retrieved

to look

understanding

file

can

To do this,

some

DVD

most suitable

one single

possible.

into

entity

is

a database

table,

a record

types

we identified

DVD_CHARGE.

data

is also important provides

the

and

design

arranging

in

that

data

Language,

File Organisation

physically All data

to

and the

Query

database

a database

DVD_COPIES

take

for

will correspond

DVD rental

as

of physical

techniques.

represents

tables

a Suitable

Techniques

organisation

creating Structured

Determine

An important

rows

for

Beginning

does

in the file be fast

can

May not

are

key. In

stored

if all the

not

be

copied, affect

scanned, the

overall

or

records

duplicated, learning

a sequence

order to locate

must be read in turn are

be very time-consuming

materially

in

in experience.

whole

or in Cengage

part.

find

Due Learning

a specific

record,

until the required

ordered to

based

based one

to

electronic reserves

on the

specific

rights, the

right

some to

third

the

record

party additional

content

value

of one

whole file

In some

key value.

Inserting

may content

be

any

However,

or

suppressed at

time

from if

or

must be

is located.

primary

record.

remove

on the

modifying

the


Inserting or modifying records usually results in rewriting the whole file, which is very impractical in a database. In addition, the deletion of records leads to storage space being wasted unless the file is reorganised. An example of a sequential file is a telephone book, as shown in Figure 11.17. Each record represents a name and phone number that is stored alphabetically.

FIGURE 11.17 Example of sequential file organisation

The figure lists records with the fields LAST NAME, FIRST NAME, AREACODE and PHONE in alphabetical order of last name, from Brown (the first record in the file) to Williams. To locate Williams, all other records must first be read.

Due to their deficiencies, sequential files are not used for modern database storage.

Indexed File Organisations

Accessing a record directly instead of searching through the entire file involves the use of an indexed file organisation. Records in a file supporting this type of file organisation can be stored in a sorted or unsorted sequence, and an index is created to locate specific records quickly. Indexes are crucial in speeding up data access. Indexes facilitate searching, sorting and using aggregate functions and even join operations. The improvement in data access speed occurs because an index is an ordered set of values that contains the index key and pointers. The pointers are the row IDs for the actual table rows.

Conceptually, a data index is similar to a book index. When you use a book index, you look up the word, similar to the index key, which is accompanied by the page number(s), similar to the pointer(s), which direct(s) you to the appropriate page(s). An index scan is more efficient than a full table scan because the index data are already ordered and the amount of data is usually an order of magnitude smaller. Therefore, when performing searches, it is almost always better for the DBMS to use an index to access a table than to scan all its rows sequentially. For example, Figure 11.18 shows the index representation of a CUSTOMER table with 14 786 rows and the index COUNTRY_NDX on the CUS_COUNTRY attribute.
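The idea of an index as an ordered set of (key, pointer) pairs can be sketched in a few lines. This is a minimal illustration, not how any particular DBMS stores its indexes; the sample rows, the row IDs and the function name lookup are all invented for the example.

```python
from bisect import bisect_left

# Hypothetical table rows keyed by row ID: (last name, country).
rows = {
    1: ("Ramas", "FR"),
    2: ("Dunne", "NL"),
    3: ("Moloi", "UK"),
    4: ("Olowski", "UK"),
    5: ("Orlando", "CZ"),
}

# The index: an ordered list of (index key, row ID) pairs.
country_ndx = sorted((country, row_id) for row_id, (_, country) in rows.items())


def lookup(index, key):
    """Return the row IDs for one key using a binary search on the index."""
    i = bisect_left(index, (key,))       # first entry whose key is >= key
    row_ids = []
    while i < len(index) and index[i][0] == key:
        row_ids.append(index[i][1])      # follow the pointer (row ID)
        i += 1
    return row_ids


print(lookup(country_ndx, "UK"))
```

Because the index is ordered, the matching entries sit next to each other, so the search stops as soon as the key changes instead of visiting every row.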


FIGURE 11.18 Index representation for the CUSTOMER table

The figure shows the COUNTRY_NDX index, whose entries pair a CUS_COUNTRY key with a row ID, pointing to the matching rows of the CUSTOMER table (14 786 rows).

Suppose you submit the following query:

SELECT CUS_NAME, CUS_COUNTRY
FROM CUSTOMER
WHERE CUS_COUNTRY = 'UK';

If there is no index, the DBMS must perform a full table scan, thus reading all 14 786 customer rows. Assuming that the index COUNTRY_NDX is created, the DBMS automatically uses the index to locate the first customer with a country equal to UK and then reads all subsequent CUSTOMER rows, using the row IDs in the index as a guide. Assuming that only five rows meet the condition CUS_COUNTRY = 'UK', the index would save 14 781 I/O requests for customer rows that do not meet the criteria. That's a lot of CPU cycles!

If indexes are so important, why not index every column in every table? It's not practical to do so. Indexing every column in every table will tax the DBMS too much in terms of index-maintenance processing, especially if the table has many attributes, has many rows and/or requires many inserts, updates and/or deletes.

Indexes are logically and physically independent of the data in the associated table. This means that they require their own storage space. How much space depends on the type of index, and is an important factor that is initially decided within the physical database design stage.
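The saving quoted for the indexed query can be checked with a small simulation. This is only a sketch under stated assumptions: the five row IDs holding 'UK' and the placeholder country 'XX' are invented, and a Python dictionary stands in for the index.

```python
TOTAL_ROWS = 14_786
UK_ROW_IDS = {3, 4, 8, 10, 13_245}   # hypothetical row IDs holding 'UK'

# Simulated CUSTOMER table: (row ID, CUS_COUNTRY).
table = [(rid, "UK" if rid in UK_ROW_IDS else "XX")
         for rid in range(1, TOTAL_ROWS + 1)]

# Full table scan: every row is read and tested.
rows_read_scan = 0
for rid, country in table:
    rows_read_scan += 1

# Index on CUS_COUNTRY: key -> list of row IDs (the pointers).
country_ndx = {}
for rid, country in table:
    country_ndx.setdefault(country, []).append(rid)

# Indexed access: only the rows the index points to are read.
rows_read_index = len(country_ndx.get("UK", []))

saved = rows_read_scan - rows_read_index
print(rows_read_scan, rows_read_index, saved)
```

The count ignores the cost of reading the index itself, which is why the figure should be treated as an upper bound on the saving rather than a measurement.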

Types of Indexes

There are three main types of indexes that can be used:

Primary index: these indexes are placed on unique fields such as the primary key. They are used to locate a specific record pointed to by the index. A file can have at most one primary index but can have several secondary indexes.

Secondary index: these indexes can be placed on any field in the file that is unordered.

Multilevel index: these indexes are used where one index becomes too large and is split into a number of separate indexes in order to reduce the search. This then results in a further index, which keeps track of these additional indexes! Figure 11.19 shows an example of a two-level index on the DVD table from the DVD rental store.

FIGURE 11.19 Multilevel index on the DVD table

The figure shows a Level 1 index and a Level 2 index over the DVD data file, whose records hold DVD_CODE, DVD_NAME and DVD_CHARGE. The index entries shown include M1231, S8756 and W6790.
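A two-level lookup of this kind can be sketched as follows. The DVD codes are those that appear in Figure 11.19, but the way they are grouped into blocks, the choice of the highest key of each block as the top-level entry, and the function name find are assumptions made for illustration.

```python
from bisect import bisect_left

# Lower-level index blocks (assumed grouping of the DVD codes).
lower_blocks = [
    ["M1000", "M1020", "M1231"],
    ["S3425", "S4854", "S4978", "S6785", "S8756"],
    ["W4567", "W6756", "W6790"],
]

# Top-level index: the highest key held in each lower-level block.
top_index = [block[-1] for block in lower_blocks]


def find(dvd_code):
    """Return (block number, slot) for a DVD code, or None if absent."""
    b = bisect_left(top_index, dvd_code)   # first block whose highest key >= code
    if b == len(top_index):
        return None                        # larger than every indexed key
    block = lower_blocks[b]
    slot = bisect_left(block, dvd_code)    # search only inside that one block
    if slot < len(block) and block[slot] == dvd_code:
        return (b, slot)
    return None


print(find("S4978"))
```

Only one small block is searched after the top level has been consulted, which is the reduction in search effort that the multilevel arrangement is meant to provide.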

Each index can be defined as being sparse or dense. When using a sparse index, index pointers are created only for some of the records, whereas with a dense index, an index pointer appears for every search key value in the file. In practice, dense indexes are faster, but sparse indexes require less storage space. In addition to these three types of index, there are a number of other types of index that are popular. You will learn about each of these indexes in the next sections.
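The trade-off between the two can be made concrete with a small sketch. It assumes a file that is sorted on the search key and split into fixed blocks, which a sparse index needs in order to work; the key values and names are invented for the example.

```python
from bisect import bisect_right

# A file sorted on the search key, stored in three blocks.
blocks = [[10, 20, 30], [40, 50, 60], [70, 80, 90]]

# Dense index: one entry for every search key value.
dense = {key: (b, slot)
         for b, block in enumerate(blocks)
         for slot, key in enumerate(block)}

# Sparse index: one entry per block (the first key in the block).
sparse = [block[0] for block in blocks]


def sparse_lookup(key):
    """Find the block via the sparse index, then scan inside the block."""
    b = bisect_right(sparse, key) - 1      # last block whose first key <= key
    if b < 0:
        return None
    for slot, candidate in enumerate(blocks[b]):
        if candidate == key:
            return (b, slot)
    return None


print(len(dense), len(sparse))
print(dense[50], sparse_lookup(50))
```

The dense index answers in one step but holds an entry per key, while the sparse index holds one entry per block and pays for it with a short scan inside the block.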

B-trees

Within a DBMS, indexes are often stored in a data structure known as a tree. Trees are generally more efficient at storing indexes as they reduce the time of the search compared with other data structures such as lists. These trees are often referred to as Balanced or B-trees and are used to maintain an ordered set of indexes or data to allow efficient operations to select, delete and insert data. A B-tree consists of a hierarchy of nodes that contain a set of pointers that link the nodes of the B-tree together. Each B-tree that is created is said to be of the order n where n is the maximum number of children allowed for each parent node. We can say that each node in a B-tree of order n contains at most 2n keys and 2n + 1 pointers. This is true except for the root node, which provides the starting point of the B-tree. When a node does not have any children, it is called a leaf node. Each item (index or data) stored in a B-tree is known as a key. Each key is unique and can occur in the B-tree in only one location. The B-tree must always be balanced in that every path from the root to the leaf must be exactly the same length. The general principle is that for every node (which we will call n) in the tree:

The left subtree of n contains only values smaller than the value in n.

The right subtree of n contains only values greater than the value in n.
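The ordering principle above is what makes a search through a B-tree fast: at each node the keys decide which single child to follow. The following is a minimal search-only sketch on a small hand-built tree; the Node class, the key values and the function name search are illustrative assumptions, and insertion and rebalancing are deliberately left out.

```python
from bisect import bisect_left


class Node:
    """A B-tree node: sorted keys plus child pointers (none for a leaf)."""

    def __init__(self, keys, children=None):
        self.keys = keys
        self.children = children or []


def search(node, key):
    """Return True if key is stored somewhere in the tree rooted at node."""
    while True:
        i = bisect_left(node.keys, key)        # position of key among this node's keys
        if i < len(node.keys) and node.keys[i] == key:
            return True
        if not node.children:                  # reached a leaf without a match
            return False
        node = node.children[i]                # smaller keys lie left, larger right


# Hand-built balanced tree: every leaf is at the same depth.
root = Node([40, 80], [
    Node([10, 20, 30]),
    Node([50, 60, 70]),
    Node([90, 100]),
])

print(search(root, 60), search(root, 65))
```

Each step discards all but one subtree, so the number of nodes visited is bounded by the height of the tree, which is the same for every leaf because the tree is balanced.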

scanned, the

overall

or

duplicated, learning

in experience.

whole

or in Cengage

part.

Due Learning

to

electronic reserves

rights, the

right

some to

third remove

party additional

content

may content

be

suppressed at

any

time

from if

the

subsequent

eBook rights

and/or restrictions

eChapter(s). require

it

A special kind of B-tree is known as the B+-tree, where all keys reside in the leaves. This tree is most often used to represent indexes which act as a road map so that each index can be quickly located. Often, B-trees are referred to as B+-trees and vice versa. Although they have similar properties, there are some differences. You can read more about these in an article by Douglas Comer, which details the basics of the two kinds of trees, in the further reading section of this chapter. As we are dealing with choosing a suitable file organisation based upon indexes in the context of physical database design, we will concentrate on the B+-tree in this section.

Figure 11.20 shows the general structure of a B+-tree, in which each node contains keys and pointers to other nodes. Here, the B+-tree represents an index on country names, and each node contains at most two keys (the country names). To locate the data record for Germany, we must first look in the root node. Germany is not in the root node, so we proceed to the second level in the tree. Alphabetically, Germany is greater than France and less than UK, so we select the middle pointer and follow the pointer to the child nodes. Here, we find a match for Germany and follow the pointer to the left of Germany to access the data record.

FIGURE 11.20 B+-tree terminology
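The lookup for Germany described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the tree has two levels, the root holds the keys France and UK as in the text, and the leaf contents other than France, Germany and UK are invented for the example, as are the names ROOT_KEYS, LEAVES, child_index and search.

```python
from bisect import bisect_right

# Root node of a hypothetical two-level B+-tree: two keys, three child pointers.
ROOT_KEYS = ["France", "UK"]

# Leaf nodes hold every key together with a pointer to its data record.
# Only France, Germany and UK come from the text; the other keys are invented.
LEAVES = [
    {"Austria": "rec-Austria", "Belgium": "rec-Belgium"},  # keys below France
    {"France": "rec-France", "Germany": "rec-Germany"},    # France up to, not including, UK
    {"UK": "rec-UK", "USA": "rec-USA"},                    # UK and above
]


def child_index(key):
    """Pick the root pointer to follow: 0 = left, 1 = middle, 2 = right."""
    return bisect_right(ROOT_KEYS, key)


def search(key):
    """Return the data-record pointer for key, or None if the key is absent."""
    leaf = LEAVES[child_index(key)]
    return leaf.get(key)
```

Calling search("Germany") follows the middle pointer because Germany sorts after France and before UK, then finds the key in that leaf. A real B+-tree has more levels and also links its leaves together; both are left out here to keep the sketch short.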

Now that we have introduced some of the basic terminology of B+-trees, let's see if we can insert a new key into the tree. Suppose we want to use the attribute DVD_IDs to act as the primary index on the DVD table. Figure 11.21 illustrates the steps required to construct a B+-tree of order two to store the DVD_IDs shown in Table 11.7.

TABLE 11.7 DVD codes

DVD_CODE
M1020
M1231
M1000
S3425
S4854


FIGURE 11.21 Creating a B+-tree

[Figure: step-by-step construction of the B+-tree. The final step reads: Insert MOVIE_CODE S4854. It is greater than S3425 and should therefore be placed to the right of S3425, but the node is full. So we have to split this node and promote S3425 to the parent node (which currently contains M1020). However, this will mean that the parent node becomes full, so we have to create a further child node for M1231.]

As you have seen, the B-tree is a powerful way of storing indexes as it allows the quick retrieval of data, which leads to a faster response time for user queries. The B-tree automatically sustains the appropriate levels of index for the file being indexed and, through careful management of storage space, ensures that each node is at least half full. There is never a case of overcrowding at a node as the tree reorganises itself, as you saw in Figure 11.21.

NOTE
You may be wondering what happens to B-trees when we delete a record. While it is possible to remove the leaf node pointer to the data record, some implementations do not perform the actual deletion at all. The basis for this is that all files are likely to grow again, and therefore the B-tree is likely to continue to grow once the maximum number of children in a given node has been reached.

Bitmap Indexes

So far, we have looked at indexes that are mainly used to speed up data retrieval from relational database tables. B-tree indexes are usually applied when you know that a query refers to a column which is indexed and will retrieve only a few rows. Another popular type of index that is often used on multidimensional data held in data warehouses is known as the bitmap index. Bitmap indexes are applied to attributes that are sparse in their given domain. For example, if customers were required to enter personal information on an application to join the DVD rental store, everyone would enter their name and address, but a large number would not enter their age. So, the values for age in the database would be sparse.

NOTE
You will look at how bitmap indexes can be used to optimise queries that use multidimensional data in Chapter 15, Databases for Business Intelligence.

In a bitmap index, a two-dimensional array is constructed. One row is generated for every row in the table that we want to index, with each column representing a distinct value within the bitmapped index. The size of the two-dimensional array is therefore the number of distinct values within the index multiplied by the number of rows in the table. An example of a bitmap index on the DVD_CHARGE field is shown in Figure 11.22. The DVD table also shown in Figure 11.22 currently has 11 rows and the DVD_CHARGE field has five different values {6.00, 6.50, 7.00, 7.50, 8.00}. This bitmap index has 11 entries with five bits per entry.

FIGURE 11.22 Bitmap index on the field MOVIE_CHARGE

The MOVIE table

MOVIE_CODE  MOVIE_COPIES  MOVIE_NAME                   MOVIE_CHARGE  MOVIE_LATE_CHG_DAY  CATEGORY
M3456       3             Ramblin' Tulip               6.50          0.25                Family
R2345       2             Once Upon a Midnight Breezy  8.00          0.25                Comedy
S4567       3             Tulips and Threelips         6.00          0.25                Family
S4854       3             Action Heros                 6.00          0.25                Action
S4978       2             Invictus                     6.50          0.50                Action
S6785       3             Tales of the Unexplained     6.50          0.25                Action
S8756       2             The Stars                    6.00          0.50                Doc
W1234       5             Khumba                       8.00          0.50                Family
W4567       2             Flowers in Summer            6.00          0.25                Doc
W6756       2             Flowers in Spring            6.00          0.25                Doc
W6970       3             The Winter Garden            6.00          25.00               Doc

Bitmap index on the field MOVIE_CHARGE

6.00  6.50  7.00  7.50  8.00
0     1     0     0     0
0     0     0     0     1
1     0     0     0     0
1     0     0     0     0
0     1     0     0     0
0     1     0     0     0
1     0     0     0     0
0     0     0     0     1
1     0     0     0     0
1     0     0     0     0
1     0     0     0     0

Bitmaps are more compact than B-trees and take up less storage space. However, combining multiple bitmap indexes can provide significant improvements in performance. Suppose we decide to also create a bitmap index on the CATEGORY field in the DVD table. Figure 11.23 shows the bitmap index associated with the CATEGORY field. This bitmap index has 11 entries with four bits per entry. Suppose we then wanted to find out the names of all movies with a movie charge of 6.00 and the category Family. The SQL to retrieve this data is:

SELECT MOVIE_NAME
FROM MOVIE
WHERE CATEGORY = 'Family' AND MOVIE_CHARGE = 6.00;

To retrieve this data, we would access the bit for 6.00 from the DVD_CHARGE bitmap index and the bit for Family from the CATEGORY bitmap index, and perform an AND operation with the matching bits. This would then allow the retrieval of data where both bits had a value of 1. Not only is this an efficient way of accessing data, but bitmaps are also easy to read.

FIGURE 11.23 Bitmap index on the CATEGORY field

Family  Comedy  Action  Doc
1       0       0       0
0       1       0       0
1       0       0       0
0       0       1       0
0       0       1       0
0       0       1       0
0       0       0       1
1       0       0       0
0       0       0       1
0       0       0       1
0       0       0       1

Bitmap indexes are usually used when:

A column in the table has low cardinality. Although all DBMSs vary, Oracle considers columns where the index has fewer than 100 distinct values to have low cardinality.

The table is not used often for data manipulation activities. This means that there are hardly any updates to the data in the table and few rows are inserted or deleted. Updating bitmapped indexes takes a lot of time, so, for example, if you update the data in the table regularly, another type of index would be less resource intensive. As a guideline, bitmapped indexes are most suitable for large, read-only tables.

Specific SQL queries reference a number of low cardinality values in their WHERE clauses.
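The way two bitmap indexes are combined with an AND operation can be sketched in Python. This is a toy model under stated assumptions, not how any DBMS stores its bitmaps: the column values are copied from Figures 11.22 and 11.23, and the function names build_bitmap_index, bitmap_and and family_at_six are made up for the illustration.

```python
# Column values for the 11 rows of the MOVIE table in Figure 11.22.
MOVIE_CODES = ["M3456", "R2345", "S4567", "S4854", "S4978", "S6785",
               "S8756", "W1234", "W4567", "W6756", "W6970"]
CHARGES = [6.50, 8.00, 6.00, 6.00, 6.50, 6.50, 6.00, 8.00, 6.00, 6.00, 6.00]
CATEGORIES = ["Family", "Comedy", "Family", "Action", "Action", "Action",
              "Doc", "Family", "Doc", "Doc", "Doc"]


def build_bitmap_index(values):
    """Map each distinct value to a list of bits, one bit per table row."""
    index = {}
    for row, value in enumerate(values):
        bits = index.setdefault(value, [0] * len(values))
        bits[row] = 1
    return index


def bitmap_and(a, b):
    """Bitwise AND of two bitmaps of equal length."""
    return [x & y for x, y in zip(a, b)]


def family_at_six():
    """Codes of the movies with CATEGORY = 'Family' AND MOVIE_CHARGE = 6.00."""
    charge_index = build_bitmap_index(CHARGES)
    category_index = build_bitmap_index(CATEGORIES)
    hits = bitmap_and(charge_index[6.00], category_index["Family"])
    return [code for code, bit in zip(MOVIE_CODES, hits) if bit]
```

Only the rows whose bit is 1 in both bitmaps survive the AND, so the table itself is read only for the matching rows. Unlike the figure, this sketch creates a bitmap only for values that actually occur, so there are no all-zero columns for 7.00 and 7.50.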

Join Index

Like the bitmap index, the join index is used mainly in data warehousing and applies to columns from two or more tables whose values come from the same domain. It is often referred to as a bitmap join index and it is a way of saving space by reducing the volume of data that must be joined. The bitmap join stores the ROWIDs of corresponding rows in a separate table. For example, Figure 11.24 shows two tables, CUSTOMER and EMPLOYEE, which both have columns containing area codes (CUST_AREACODE and EMP_AREACODE) that share the same domain. Each table also has a ROWID. The join index on the AREACODE column (also shown in Figure 11.24) shows the ROWIDs for the rows in each table which share the same AREACODE.

FIGURE 11.24 Join index on the AREACODE field

The customer table

ROWID  CUST_NUM  CUST_LNAME  CUST_FNAME  CUST_INITIAL  CUST_AREACODE  CUST_PHONE
50001  1001      Ramas       Alfred      A             0181           844-2573
50002  1002      Dunne       Leona       K             0161           894-1238
50003  1003      Moloi       Marlene     W             0191           894-2285
50004  1004      Olowski     Paul        F             0181           894-2180
50005  1005      Orlando     Myron                     0181           222-1672
50006  1006      O'Brian     Amy         B             0161           442-3381
50007  1007      Brown       James       G             0181           297-1228
50008  1008      Williams    George                    0113           290-2556
50009  1009      Padayachee  Vinaya      G             0161           382-7185
50010  1010      Moloi       Mlilo       K             0181           297-3809

The employee table

ROWID  EMP_NUM  EMP_LNAME  EMP_AREACODE  EMP_PHONE
72001  230      Smithson   0191          555-1234
72002  231      Johnson    0181          123-4536
72003  233      Wallace    0113          342-6567
72004  235      Ortozo     0161          899-3425

The join index on the common column AREACODE

ROWID  ROWID  AREACODE
50001  72002  0181
50002  72004  0161
50003  72001  0191
50004  72002  0181
50005  72002  0181
50006  72004  0161
50007  72002  0181
50008  72003  0113
50009  72004  0161
50010  72002  0181

This type of index is useful when dealing with large quantities of data that are typically found in data warehouses. Join indexes are less common in small relational databases. However, like bitmaps, they are unsuitable when there are high-volume updates. In addition, the queries that access these indexes may not reference, in the WHERE clause, any fields which are not in the join index.
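The join index of Figure 11.24 can be derived from the two base tables with a short Python sketch. It is illustrative only: each table is reduced to a mapping from ROWID to area code, and the name build_join_index is an assumption made for this example rather than anything a DBMS exposes.

```python
# ROWID -> area code, copied from the customer and employee tables in Figure 11.24.
CUSTOMER = {
    50001: "0181", 50002: "0161", 50003: "0191", 50004: "0181", 50005: "0181",
    50006: "0161", 50007: "0181", 50008: "0113", 50009: "0161", 50010: "0181",
}
EMPLOYEE = {72001: "0191", 72002: "0181", 72003: "0113", 72004: "0161"}


def build_join_index(left, right):
    """Return (left ROWID, right ROWID, area code) for rows sharing an area code."""
    # Group the right-hand ROWIDs by area code so each left row is matched quickly.
    by_code = {}
    for rowid, code in right.items():
        by_code.setdefault(code, []).append(rowid)
    return [
        (left_rowid, right_rowid, code)
        for left_rowid, code in left.items()
        for right_rowid in by_code.get(code, [])
    ]
```

Once the pairs of ROWIDs have been stored, a query that joins the two tables on the area code can read the matching rows directly instead of comparing every customer with every employee.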


Hashed File Organisations

A hashed file organisation uses a hashing algorithm to map a primary key value onto a specific record address in the file. Records are stored in a random order throughout the file. Thus, hashed files are often referred to as random or direct files. There are many different kinds of hashing algorithms, but the aim of each is to distribute the records evenly within the data storage area.

One common method of generating a hash uses an artificial number, called the logical relative record number, which the algorithm generates from the primary key. This artificial number has no direct real-world meaning except that it will tell the DBMS where the record is located relative to the start of the file. The algorithm will usually reduce the primary key value to a shorter number. This method is known as the division/remainder method. The steps for this hashing algorithm are:

1 Choose a prime number that is approximately 20 per cent larger than the number of records you want to store.

2 Divide the value of the primary key by the prime number and use the remainder as the logical relative record number for storing the record.

Let's look at an example. Suppose we need to store information about 800 customers. A suitable prime number would be 997 as it is approximately 20 per cent larger than 800. A customer with a customer number of 120001 would then have a hash value of 361 (120001 divided by 997 gives 120 with a remainder of 361). The value 361 would then be the relative record number of the record for the customer with customer number 120001. Figure 11.25 illustrates this hashing algorithm.

FIGURE 11.25 Hashing algorithm applied to the customer number field
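The two steps of the division/remainder method can be sketched in Python as follows. This is a minimal sketch and the function names (is_prime, choose_prime, relative_record_number) are hypothetical. Note that choose_prime returns the first prime at or above 120 per cent of the record count, which for 800 records is 967; the worked example in the text uses 997, and either choice follows the method.

```python
def is_prime(n):
    """Trial-division primality test, adequate for small table sizes."""
    if n < 2:
        return False
    divisor = 2
    while divisor * divisor <= n:
        if n % divisor == 0:
            return False
        divisor += 1
    return True


def choose_prime(expected_records, headroom_percent=20):
    """Step 1: first prime at least headroom_percent larger than the record count."""
    candidate = expected_records * (100 + headroom_percent) // 100
    while not is_prime(candidate):
        candidate += 1
    return candidate


def relative_record_number(primary_key, prime):
    """Step 2: the remainder of the division is the relative record number."""
    return primary_key % prime


# Customer number 120001 with the prime 997: 120001 = 120 * 997 + 361.
slot = relative_record_number(120001, 997)
```

The prime is chosen once, when the file is created, because changing it later would move every record to a different address.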


In hashed file organisations, each relative address that is generated is held in a storage location known as a bucket. If the bucket can hold more than one record, then individual records are held in a slot. If the bucket can hold several records, then the capacity of the bucket for a specific hashed file should be set to the number of records that the bucket can hold, including some free space for future modifications of records.

The main weakness with hashing algorithms is that there is no guarantee that a unique hash address is generated. If the hashing algorithm generates the same relative address for two different primary keys, this is known as a collision. If a lot of collisions occur, performance will decrease as the time taken to retrieve data increases.

To deal with this problem, an overflow area is used to store records where the primary bucket has already been filled to capacity, so the record must be stored elsewhere. A pointer in the primary bucket will point to the overflow area where the overflowed records are stored, so that the DBMS can keep track of these records. If the overflow area becomes full, then another overflow area is used to store the record. Alternatively, in order to prevent the bucket from overflowing, the key value of the overflowing record is rehashed to produce a new location, and the record is put into the next free bucket.

Dealing with collisions may seem complicated, but the good news is that most DBMSs will manage all of the hashed file organisation. This type of file organisation is generally used when records are in a random order and are retrieved by exact matches based upon the primary key value.

Clusters

User queries often require access to related data from multiple tables, which are usually stored in different physical files in secondary storage. Where these tables have fields in common and are frequently joined, it would make more sense to store them physically together. This would reduce the time required to access the data and speed up query response time. Tables that are physically stored together in this way are said to be clustered. For example, in the DVD rental store, the CUSTOMER table and the RENTAL table would frequently be joined through queries on the common column CUST_NUM. Figure 11.26 shows how a portion of the CUSTOMER and RENTAL tables would be stored together in a cluster. Notice that each CUST_NUM is stored only once and acts as the join between the two tables. The cluster key is a field, or set of fields, that the clustered tables have in common. The cluster key is determined when the cluster is created.

As part of physical database design, you would have to select appropriate tables for clustering. The general rules are:

Select tables that are frequently accessed together and joined within queries.

Select tables that are mainly used for queries and not other data manipulation operations such as insert or update.

Clustering is not a good idea when applications require a full table scan of only one of the clustered tables, as this often takes longer than a full scan of an unclustered table. Clustering tables obviously has an impact on data retrieval, so determining whether to cluster or not will largely depend upon the application. It is often necessary to undertake some experiments that compare the query response times when tables are clustered together and when they are stored separately.

FIGURE 11.26 Cluster key on the CUSTOMER and RENTAL tables

[Figure: rows of the CUSTOMER table (CUST_LNAME, CUST_FNAME, CUST_INITIAL, CUST_AREACODE, CUST_PHONE) stored together with the matching rows of the RENTAL table (RENT_NUM, RENT_CHARGE) for customers 1001, 1002 and 1003. Cluster key = CUST_NUM.]

11.3.4 Define Indexes

As you discovered in the previous section, indexes play an important role in improving the performance of the database system. Defining indexes is typically part of the physical database design, and decisions need to be made regarding the fields to be indexed and the type of index (primary or secondary) that will be applied. Each table usually has a primary index created for the primary key of the table. Secondary indexes can be placed on additional fields that are used regularly in user queries, in order to increase the speed of data retrieval.

In SQL, indexes are created using the CREATE INDEX statement. For example, if we wanted to create a primary index on the primary key field DVD_ID from the DVD table, the SQL would be:

CREATE UNIQUE INDEX DVDINDEX ON DVD(DVD_ID)

where:

UNIQUE specifies that each value in the DVD_ID field is unique and that the index may not have duplicate values. If the table already contains records that violate this unique constraint (e.g. records that have duplicate DVD_IDs), the database will return an error message and the index is not created. In addition, after the index creation, if an attempt is made to insert any further duplicate values, they are not inserted.

DVDINDEX is the name of the index being created.

The ON clause specifies the table and the column on which the index is created.

Secondary indexes are created using a similar SQL command. Suppose customers frequently ring up the DVD rental store to enquire if a specific DVD is stocked by that store. They are unlikely to know the DVD_ID and will instead give the DVD_TITLE. A regular query to the database may be:

SELECT DVD_ID
FROM DVD
WHERE DVD_TITLE = 'Flowers in Winter';

To speed up the results of this query, you could create a secondary index on the DVD_TITLE field:

CREATE INDEX DVD_TITLE_INDEX ON DVD(DVD_TITLE);

As it is possible that there could be two DVDs with the same title, the UNIQUE keyword should not be used.

When a table has both a primary index and secondary indexes, there can be additional performance overheads. This is true for large tables in particular: every time a new record is inserted into the table, a corresponding record must also be inserted into each index, and as a result the insert takes longer to complete.

During the initial physical database design, you must decide which fields to index. Indexes are also linked to database tuning and query optimisation, which are covered in Chapter 13. Generally, indexes should be considered on fields that are regularly accessed, where using an index is more efficient than scanning the complete table. As a general rule, indexes are likely to be used:

When an indexed column appears by itself in a search criterion of a WHERE or HAVING clause.

When an indexed column appears by itself in a GROUP BY or ORDER BY clause.

When a MAX or MIN function is applied to an indexed column.

When the data sparsity on the indexed column is high.

Indexes are very useful when you want to select a small subset of rows from a large table, based on a given condition. Data sparsity refers to the number of different values a column could possibly have. For example, a STU_GENDER column in a STUDENT table can have only two possible values, M or F; therefore, that column is said to have low sparsity. In contrast, the STU_DOB column that stores the student date of birth can have many different date values; therefore, that column is said to have high sparsity. Knowing the sparsity helps you decide whether the use of an index is appropriate. For example, when you perform a search in a column with low sparsity, you are likely to read a high percentage of the table rows anyway; therefore, index processing may be unnecessary work.

The objective is to create indexes with high selectivity. Index selectivity is a measure of how likely an index will be used in query processing. Here are some general guidelines for creating and using indexes:

Create indexes for each single attribute used in a WHERE, HAVING, ORDER BY, or GROUP BY clause. If you create indexes on all single attributes used in search conditions, the DBMS will access the table using an index scan instead of a full table scan. For example, if you have an index for P_PRICE, the condition P_PRICE > 10.00 can be handled by accessing the index instead of sequentially scanning all table rows and evaluating P_PRICE for each row. Indexes are also used in join expressions, such as CUSTOMER.CUS_CODE = INVOICE.CUS_CODE.

Do not use indexes in small tables or tables with low sparsity. Remember, small tables and low-sparsity tables are not the same thing. A search condition in a table with low sparsity may return a high percentage of table rows anyway, making the index operation too costly and making the full table scan a viable option. Using the same logic, do not create indexes for tables with few rows and few attributes, unless you must ensure the existence of unique values in a column.

Declare primary and foreign keys so the optimiser can use the indexes in join operations. All natural joins and old-style joins will benefit if you declare primary keys and foreign keys, because the optimiser will use the available indexes at join time. (In many DBMSs, the declaration of a PK will automatically create an index for the declared column; whether an FK is indexed automatically depends on the DBMS.) Also, for the same reason, it is better to write joins using the SQL JOIN syntax (see Chapter 9, Procedural Language SQL and Advanced SQL). (Note that the DBMS query optimiser will be covered in detail in Chapter 13, Managing Database and SQL Performance.)

Declare indexes in join columns other than PK/FK. If you do join operations on columns other than the primary and foreign keys, you may be better off declaring indexes in those columns.
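The difference between a unique primary index and a non-unique secondary index can be tried out with Python's built-in sqlite3 module. This is a sketch under stated assumptions: it uses SQLite rather than the Oracle DBMS discussed in the chapter, the DVD table is cut down to two columns, and the row values (W0001, W0002 and the second title) are invented for the example.

```python
import sqlite3


def index_demo():
    """Create both indexes, then show the effect of UNIQUE on inserts."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE DVD (DVD_ID TEXT, DVD_TITLE TEXT)")
    # Unique index on the key field: duplicate DVD_ID values are rejected.
    conn.execute("CREATE UNIQUE INDEX DVDINDEX ON DVD(DVD_ID)")
    # Secondary index on the title: duplicate titles are allowed.
    conn.execute("CREATE INDEX DVD_TITLE_INDEX ON DVD(DVD_TITLE)")

    conn.execute("INSERT INTO DVD VALUES ('W0001', 'Flowers in Winter')")
    conn.execute("INSERT INTO DVD VALUES ('W0002', 'Flowers in Winter')")
    try:
        conn.execute("INSERT INTO DVD VALUES ('W0001', 'Another Title')")
        duplicate_rejected = False
    except sqlite3.IntegrityError:
        duplicate_rejected = True

    rows = conn.execute(
        "SELECT DVD_ID FROM DVD WHERE DVD_TITLE = 'Flowers in Winter' "
        "ORDER BY DVD_ID"
    ).fetchall()
    conn.close()
    return duplicate_rejected, [row[0] for row in rows]
```

The third insert fails because it repeats a DVD_ID, while the two rows sharing a title are both stored and both returned by the title query. Error messages and index options differ between DBMSs, so the exact behaviour in Oracle should be checked against its own documentation.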

11.3.5 Define User Views

During the conceptual design stage, the different user views required for the database are determined. Using the relations defined in the logical data model, these views must now be defined. Views are often defined taking database security into account, as they can help to define the roles of different types of users. We discuss how to define roles in section 11.3.7. You can learn more about how to create views in SQL in Chapter 8, Beginning Structured Query Language.

11.3.6 Estimate Data Storage Requirements

Allocating physical storage characteristics depends on the DBMS and the operating systems used. Most of the information necessary for defining the physical storage characteristics can be found in the technical manuals of the software you are using.

NOTE
If the DBMS does not automate the process of determining storage locations and data access paths, physical design requires well-developed technical skills and a precise knowledge of the physical-level details of the database, operating system and hardware used by the database. Fortunately, the more recent versions of relational DBMS software hide most of the complexities inherent in the physical design phase.

During the process of physical database design it is important to estimate not only the size of each table but also its long-term growth pattern. It is not necessary to be 100 per cent accurate but it should be based upon the expected growth of the business. Therefore, input into this process should be provided by the business experts within the company. They will need to answer questions such as 'How many customers are we likely to have in the next five years?' or 'Are we likely to expand the products that we currently sell?'

Next, the physical requirements of each table must be estimated. One simple way of performing this for each table is to:

1 Estimate the size of each row by summing the length in bytes for each data type.

2 Estimate the number of rows, taking into consideration the expected growth.

3 Multiply the size by the estimated number of rows.

Table 11.7 shows this calculation for the DVD table from the DVD rental store database.


TABLE 11.7 Physical storage requirements: the DVD table

Attribute Name        Data Type       Storage Requirement (Bytes)
DVD_ID                VARCHAR2(10)    10
MOVIE_COPIES          NUMBER(3)       3
MOVIE_NAME            VARCHAR2(50)    50
MOVIE_CHARGE          NUMBER(2,2)     4
MOVIE_LATE_CHG_DAY    NUMBER(2,2)     4
CATEGORY              CHAR(6)         6

Row length:            77
Number of rows:        7 590
Total space required:  584 430

The physical size of any indexes that have been specified must also be estimated. This is more difficult than estimating table sizes, because the actual size can depend on the specific DBMS.
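The three estimation steps can be sketched in a few lines of Python. This is only an illustration of the arithmetic behind Table 11.7, not a sizing tool: the function name `estimate_table_bytes` is hypothetical, the 3 bytes for MOVIE_COPIES is taken as the value that makes the row length sum to the 77 bytes shown in the table, and a real DBMS adds row and block overhead that is ignored here.

```python
# Rough storage estimate for the DVD table, following the three steps above.
# Byte lengths are the simple per-attribute figures from Table 11.7; DBMS
# overhead (row headers, block padding, free space) is deliberately ignored.

# Step 1: length in bytes of each attribute.
dvd_columns = {
    "DVD_ID": 10,             # VARCHAR2(10)
    "MOVIE_COPIES": 3,        # NUMBER(3), assumed 3 bytes so the row sums to 77
    "MOVIE_NAME": 50,         # VARCHAR2(50)
    "MOVIE_CHARGE": 4,        # NUMBER(2,2)
    "MOVIE_LATE_CHG_DAY": 4,  # NUMBER(2,2)
    "CATEGORY": 6,            # CHAR(6)
}


def estimate_table_bytes(column_bytes, expected_rows):
    """Return (row_length, total_bytes) for a table (hypothetical helper)."""
    row_length = sum(column_bytes.values())    # step 1: sum the attribute lengths
    total_bytes = row_length * expected_rows   # step 3: multiply by the row count
    return row_length, total_bytes


# Step 2: the expected number of rows, including growth, comes from the business.
row_length, total_bytes = estimate_table_bytes(dvd_columns, expected_rows=7590)
print(row_length, total_bytes)
```

Changing `expected_rows` to a five-year projection supplied by the business experts gives a quick way to compare growth scenarios.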

NOTE
Oracle 18c provides a number of tools for estimating the size of database tables and indexes:

CREATE_TABLE_COST determines the size of the table given various attributes including the average row size in bytes.
CREATE_INDEX_COST determines the amount of storage space required to create an index on an existing table.

These tools, however, can usually only be accessed by the database administrator.

11.3.7 Determine Database Security for Users

In Chapter 10, Database Development Process, issues surrounding the security of the databases, such as potential threats and measures that could be taken to combat these threats, were discussed. As part of the Systems Development Life Cycle (SDLC), the security requirements of the database will have been identified. This will have included all the users of the database and their individual access requirements and restrictions. During physical database design, these requirements must be implemented within the target DBMS and database privileges for users will need to be established. For example, privileges may include selecting rows from specified tables or views, being able to modify or delete data in specified tables, etc.

Implementing basic data security in Oracle requires all users to be given an account comprising a user name and an associated password. Oracle has two levels of privilege (system and object) that allow the database administrator (DBA) to control how much power a specific user is granted. For example, we do not want all staff to be able to access the complete database or to drop tables when they have no right to do so. System privileges authorise a user account to execute SQL data definition language (DDL) commands such as CREATE TABLE. Object privileges allow a user account to execute SQL data manipulation language (DML) commands such as performing SELECT, INSERT, UPDATE and DELETE operations on specific tables.

The SQL commands GRANT and REVOKE are used to authorise or withdraw privileges on specific user accounts. For example, the following two SQL statements grant the user account with the username Craig the ability to select rows from the DVD table and the ability to create tables:

    GRANT SELECT ON DVD TO Craig;
    GRANT CREATE TABLE TO Craig;

Removing these privileges can be done using the following SQL statements:

    REVOKE SELECT ON DVD FROM Craig;
    REVOKE CREATE TABLE FROM Craig;

In any company, it is likely that there will be a very large number of users, and a DBA will have a very difficult time managing all the privileges that each user requires. To overcome this, privileges can be grouped under a single role. A role is a collection of privileges referred to under a single role name. The major benefit of roles is that a DBA can add or revoke privileges from the role at any time. These changes will then automatically apply to all the users who have been assigned that role.

For example, in the DVD rental store, the sales staff need to have the ability to perform SELECT and UPDATE operations on the CUSTOMER table. The SQL command CREATE ROLE is used to create the role STAFF_CUSTOMER_ROLE:

    CREATE ROLE STAFF_CUSTOMER_ROLE;

Once created, privileges on selected database objects can then be granted to the new role. For example:

    GRANT SELECT ON CUSTOMERS TO STAFF_CUSTOMER_ROLE;
    GRANT UPDATE ON CUSTOMERS TO STAFF_CUSTOMER_ROLE;

The last stage then involves granting the role to individual user accounts, e.g. Lindiwe:

    GRANT STAFF_CUSTOMER_ROLE TO Lindiwe;

If the DBA then chooses to revoke a privilege from the role, it is automatically removed from all users assigned to the role.
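The way a role propagates privilege changes to its members can be pictured with a small in-memory model. The Python sketch below is a teaching aid only and does not reflect how Oracle stores or checks privileges; the class `PrivilegeCatalog`, its method names and the second user `Thabo` are all hypothetical.

```python
# Toy model of roles: a role is a named set of (privilege, object) pairs, and a
# user holds whatever privileges its roles hold at the moment of the check.
# This is NOT Oracle's implementation, only an illustration of the idea.

class PrivilegeCatalog:
    def __init__(self):
        self.role_privs = {}   # role name -> set of (privilege, object)
        self.user_roles = {}   # user name -> set of role names

    def create_role(self, role):
        self.role_privs.setdefault(role, set())

    def grant_to_role(self, privilege, obj, role):
        self.role_privs[role].add((privilege, obj))

    def revoke_from_role(self, privilege, obj, role):
        self.role_privs[role].discard((privilege, obj))

    def grant_role(self, role, user):
        self.user_roles.setdefault(user, set()).add(role)

    def can(self, user, privilege, obj):
        # Privileges are looked up through the role each time, so a change to
        # the role is seen by every member without touching the user entries.
        return any(
            (privilege, obj) in self.role_privs.get(role, set())
            for role in self.user_roles.get(user, set())
        )


catalog = PrivilegeCatalog()
catalog.create_role("STAFF_CUSTOMER_ROLE")
catalog.grant_to_role("SELECT", "CUSTOMERS", "STAFF_CUSTOMER_ROLE")
catalog.grant_to_role("UPDATE", "CUSTOMERS", "STAFF_CUSTOMER_ROLE")
catalog.grant_role("STAFF_CUSTOMER_ROLE", "Lindiwe")
catalog.grant_role("STAFF_CUSTOMER_ROLE", "Thabo")   # hypothetical second user

before = catalog.can("Thabo", "UPDATE", "CUSTOMERS")
catalog.revoke_from_role("UPDATE", "CUSTOMERS", "STAFF_CUSTOMER_ROLE")
after = catalog.can("Thabo", "UPDATE", "CUSTOMERS")
print(before, after)
```

A single revoke against the role changes the answer for both users, which is the administrative saving the text describes.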

SUMMARY

Conceptual database design is where the conceptual representation of the database is created by producing a data model that identifies the relevant entities and relationships within the system. This stage of the Database Life Cycle can be broken down into four steps: data analysis and requirements, entity relationship modelling and normalisation, data model verification, and distributed database design.

Conceptual database design must embody a clear understanding of the business and its functional areas. The final conceptual model must be verified against the proposed system processes in order to corroborate that the intended processes can be supported by the database model. Verification requires a series of tests to be run against end-user data views and their required transactions, access paths, security, and business-imposed data requirements and constraints.

Logical database design is the second stage in the Database Life Cycle, where relations are designed based on each entity and its relationships within the conceptual model. Creating the logical data model involves the following stages: creating the logical data model, validating the logical data model using normalisation, assigning and validating integrity constraints, merging logical models constructed for different parts of the database, and reviewing the logical data model with the user.

When creating the logical data model, the order in which entities are translated into relations is important. Those entities with no dependents (e.g. foreign keys) should be translated first. To create relations, the name of the relation is specified along with its associated attribute(s) enclosed in brackets. The primary key is identified, followed by any foreign keys.

Physical database design is where the logical data model is mapped onto the physical database tables to be implemented in the chosen DBMS. The ultimate goal must be to ensure that data storage is used effectively, to ensure integrity and security, and to improve efficiency in terms of query response time.

Physical database design comprises the following seven stages: analysing data volume and database usage, translating each relation identified in the logical data model into tables, determining a suitable file organisation, defining indexes, defining user views, estimating data storage requirements, and determining database security for users.

Selecting a suitable file organisation is important for fast data retrieval and efficient use of storage space. The three most common types of file organisation are heap files, which contain randomly ordered records; indexed sequential files, which are sorted on one or more fields using indexes; and hashed files, in which a hashing algorithm is used to determine the address of each record based upon the value of the primary key.

Within a DBMS, indexes are often stored in data structures known as B-trees, which allow fast data retrieval. Two other kinds of indexes are bitmap indexes and join indexes. These are often used on multidimensional data held in data warehouses.

Indexes are crucial for speeding up data access. Indexes facilitate searching, sorting, and using aggregate functions and even join operations. The improvement in data access speed occurs because an index is an ordered set of values that contains the index key and pointers.

Data sparsity refers to the number of different values a column could possibly have. Indexes are recommended in highly sparse columns used in search conditions.

Online Content
In Appendices B and C, available on the online platform for this book, you will have the chance to experience all the stages of the database design life cycle through the creation of two real-world database systems: the University Lab and Global Tickets Ltd, a travel e-commerce database.

KEY TERMS

B-tree
bitmap index
cluster key
cohesivity
composite usage map
conceptual design
description of operations
file organisation
hashed file
heap file
index selectivity
indexes
join index
minimal data rule
module
module coupling
primary index
secondary index
sequential file
transaction usage map

FURTHER READING

Comer, D. The Ubiquitous B-Tree, ACM Computing Surveys, 11(2), pp. 121-137, 1979.
Garmany, J., Walker, J. and Clark, T. Logical Database Design Principles (Foundations of Database Design). AUERBACH, 2005.
Lightstone, S., Teorey, T. and Nadeau, T. Physical Database Design: The Database Professional's Guide to Exploiting Indexes, Views, Storage, and More, 4th revised edition. Morgan Kaufmann Series in Data Management Systems, 2007.
Pavlovic, Z. and Veselica, M. Oracle Database 12c Security Cookbook. Packt Publishing, 2016.
Teorey, T. Database Modelling and Design: Logical Design, 5th edition. The Morgan Kaufmann Series in Data Management Systems, 2011.
Wiederhold, G. Database Design, 2nd edition. McGraw-Hill, 1983.

Online Content
Answers to selected Review Questions and Problems for this chapter are available on the online platform accompanying this book.

REVIEW QUESTIONS

1 What are the stages of the conceptual database design?
2 What are business rules? Why are they important to a database designer?
3 Which steps are required in the development of an ER diagram?
4 List and briefly explain the activities involved in the verification of an ER model.
5 Describe the logical database design process.
6 Describe the steps required to convert the conceptual ER model into the logical model.
7 What are the typical problems in merging relations?
8 Which integrity constraints need validating during logical database design?
9 What are the stages of physical database design?
10 Why is it important to analyse data volume and usage statistics?
11 When should indexes be used?
12 Describe the purpose of a B-tree.
13 What factors are important in selecting a bitmap index?
14 How is basic database security implemented?
15 How is entity integrity and referential integrity enforced when creating tables in SQL?

PROBLEMS

1 Write the proper sequence of activities in the design of a video rental database. (The initial ERD was shown in Figure 11.7.) The design must support all rental activities, customer payment tracking and employee work schedules, as well as track which employees checked out the videos to the customers. When you have finished writing the design activity sequence, complete the ERD to ensure that the database design can be successfully implemented. (Make sure that the design is normalised properly and that it can support the required transactions.)

2 Create the initial ER diagram for a car dealership. The dealership sells both new and used cars, and it operates a service facility. Base your design on the following business rules:
a A salesperson can sell many cars, but each car is sold by only one salesperson.
b A customer can buy many cars, but each car is sold to only one customer.
c A salesperson writes a single invoice for each car sold.
d A customer gets an invoice for each car(s) he or she buys.
e A customer might come in only to have a car serviced; that is, one need not buy a car to be classified as a customer.
f When a customer takes in one or more cars for repair or service, one service ticket is written for each car.
g The car dealership maintains a service history for each car serviced. The service records are referenced by the car's serial number.
h A car brought in for service can be worked on by many mechanics, and each mechanic can work on many cars.
i A car that is serviced may or may not need parts. (For example, parts are not necessary to adjust a carburettor or to clean a fuel injector nozzle.)

3 Verify the conceptual model you created in Question 2. Create a data dictionary for the verified model.

4 Transform the ERD in Figure P11.1 into a relational schema showing all primary and foreign keys.

FIGURE P11.1 ERD for Problem 4

11 Should you create an index on EMP_DOB? Why or why not?

Problems 12 and 13 are based on the following query:

    SELECT P_CODE, SUM(LINE_UNITS)
    FROM LINE
    GROUP BY P_CODE
    HAVING SUM(LINE_UNITS) > (SELECT MAX(LINE_UNITS) FROM LINE);

12 What is the likely data sparsity of the LINE_UNITS column?

13 Should you create an index? If so, what would the index column(s) be and why would you create that index? If not, explain your reasoning.

14 Should you create an index on P_CODE? If so, write the SQL command to create that index. If not, explain your reasoning.

Problems 15 and 16 are based on the following query:

    SELECT P_CODE, P_QOH*P_PRICE
    FROM PRODUCT
    WHERE P_QOH*P_PRICE > (SELECT AVG(P_QOH*P_PRICE) FROM PRODUCT)

15 What is the likely data sparsity of the P_QOH and P_PRICE columns?

16 Should you create an index? If so, what would the index column(s) be, and why should you create that index?

17 Consider the composite usage map shown in Figure P11.3 for a building company called BricksRUs. The composite usage map shows that there are 1000 rows in the materials table (e.g. cement, bricks, etc.). There are two types of materials that are purchased: full price materials and wholesale materials. Full-price materials account for 35 per cent, while wholesale materials account for 70 per cent of purchases. As the same material can be of both subtypes, the percentages can be greater than 100 per cent. When contractors wish to apply for a contract for a building job, they send in estimates. There are roughly 40 contractors who undertake jobs and, on average, they provide 80 estimates (giving a total of 3200 estimates). On average there are 500 direct accesses to the material table per hour, which can be broken down into 350 accesses to wholesale materials and 175 accesses to full-price materials. Of these accesses, there are 40 subsequent accesses to the estimate table and 60 accesses to the contractor table.

FIGURE P11.3 Composite usage map for BricksRUs
[Figure not reproduced: entities CONTRACTOR, MATERIAL, FULL PRICE MATERIALS, WHOLESALE MATERIALS and ESTIMATE, annotated with row counts and hourly access counts.]

After a period of time, the assumptions for this usage map have changed for BricksRUs as follows:
a The number of direct accesses to the materials table has decreased to 400 per hour. Out of this, 175 require subsequent accesses to the estimate table.
b Wholesale materials now account for 80 per cent of all materials.
c Full price materials now represent only 25 per cent of all materials.
d There are now an average of 60 estimates for each supplier.
Draw a new composite usage map reflecting this new information.

18 Draw a B+-tree with n = 2 and insert the following keys in order: A, B, C, D and E into the tree. You should show the insertions at each stage. Remember n = 2 means that nodes are allowed to have no more than two keys and no fewer than one key each.

19 Draw another B+-tree with n = 2 and insert the following keys in order: B, D, C, A, E, F. You should show the insertions at each stage.

Part V

DATABASE TRANSACTIONS AND PERFORMANCE TUNING

12 Managing Transactions and Concurrency

13 Managing Database and SQL Performance


BUSINESS VIGNETTE
FROM DATA WAREHOUSE TO DATA LAKE

Since the early 1990s, a vast amount of data has been stored in data warehouses in order to provide a central repository for business intelligence within an organisation. The concept of a data warehouse originated from studies undertaken at MIT in the early 1970s.1 However, the term 'information warehouse' was first used in 1986 by Barry Devlin and Paul Murphy in an article entitled 'An Architecture for a Business and Information System' in IBM Systems Journal.2 They identified what was known commonly as the 'islands of information' problem. This is where organisations had many operational systems that were not integrated, data were duplicated and reporting from the global business perspective was rare.

Data warehousing took off in earnest in 1991 when Bill Inmon published his book entitled Building the Data Warehouse.3 While in 1996 there were more data warehouse projects initiated than in previous years, arguments began about whether data warehousing solutions were too generalised in trying to model the whole organisation. An alternative methodology to developing a data warehouse that focused on the use of data marts was championed by Ralph Kimball.4 The development of data marts focused on the data requirements of individual departments rather than the whole organisation. The data mart proved successful as it provided a quick return on investment and introduced the concepts of the dimensional modelling of data.

1 Haisten, M. 'Data warehousing: what's next? Part 4: integrate the new islands of information', DM Direct Newsletter. Available at www.dmreview.com/article_sub.cfm?articleId55238
2 Devlin, B. and Murphy, P. 'An architecture for a business and information system', IBM Systems Journal, 27(1), pp. 60-80, 1998.
3 Inmon, B. Building the Data Warehouse, 4th edition. John Wiley & Sons, 2005.
4 Kimball, Ralph. The Data Warehouse Toolkit: Practical Techniques for Building Dimensional Data Warehouses, Kindle Edition. Hungry Minds Inc, 2010.

It is now the norm for data warehouses to store terabytes of data. The number of users accessing a typical organisation's data warehouse has increased, along with the requirements for more complex queries and near real-time information. With the rise of Big Data, traditional data


warehouses

user to store

companies Today,

of the

to

whose

right

with data larger

Data

will be required

a

competitors

our

customers

used

to

Lake

deliver

In 2018,

new

based

business

upon

that

to

their

previous and

their

company

Big

analytics

and

artificial

Columbus,

L. 10

Charts

That

Business

Data, it

would

intelligence

of tools

to

The

Data is to

Big

analytics warehouse

dimensional

such

as What

products

do

business

business

from

value. Big

and

multiple

Today,

that, if they

more

as

to

of this

a data

questions

to Which

will increase reported

deliver

such

habits?

Data is likely

data analysis

uses

answer

view

be devised

intelligence

more efficient

approaches

Smart

need to

typically

shopping

produce

context.

complex

campaign?

Big Data revenues

decision

which

are important

cross-functional

architectures

intelligence

advertising

advanced

5

business

a variety

alongside

accelerate

right

mining. It can be used to

59 per cent of executives

and to

Data Lakes

a summarised,

organisations.

billion in 2027.5 In addition,

accuracy

is a Data Lake,

form.

Furthermore,

Business

opportunities

Forbes reported

the

through

data

provides

information

sizes.

results

an expensive

might like,

to find

and in

decisions

and

introduced

time

which

where different

timely

An alternative

an unstructured

driven.

a business.

make

tools

in

Data,

and different

within

to

data

right

Data

volumes

tools

Data

our

is

Big

forecasting

as rigid in structure.

Smart

at the

cope

analyses,

are

require

information

are fundamental

be seen

business

from

Tuning

data in its raw format

organisations

be extracted

and/or

Performance

can sometimes

allows the for

and

if

we think

intelligence

processes.

$42 billion in

2018 to

$103

used artificial intelligence main

driver

achieve

of combining

greater

predictive

making.

Will Change

Your

Perspective

Of Big

Datas

Growth,

Available:

www.forbes.com/sites/

louiscolumbus/2018/05/23/10-charts-that-will-change-your-perspective-of-big-datas-growth/#1d6790d32926


CHAPTER 12
Managing Transactions and Concurrency

IN THIS CHAPTER, YOU WILL LEARN:

What a database transaction is and what its properties are
What concurrency control is and what role it plays in maintaining the database's integrity
What locking methods are and how they work
How stamping methods are used for concurrency control
How optimistic methods are used for concurrency control
How database recovery management is used to maintain database integrity
The ANSI levels of transaction isolation

PREVIEW

Database transactions reflect real-world transactions that are triggered by events such as buying a product, registering for a course, or making a deposit in a current account. Transactions are likely to contain many parts. For example, a sales transaction may require updating the customer's account, adjusting the product inventory and updating the seller's accounts receivable. All parts of a transaction must be successfully completed to prevent data integrity problems. Therefore, executing and managing transactions are important database system activities.

The main database transaction properties are atomicity, consistency, isolation and durability. After defining those transaction properties, this chapter shows how SQL can be used to represent transactions and how transaction logs can ensure the DBMS's ability to recover transactions.

When many transactions take place at the same time, they are called concurrent transactions. Managing the execution of such transactions is called concurrency control. As you can imagine, concurrency control is especially important in a multi-user database environment (just imagine the number of transactions routinely handled by companies that conduct sales and provide services via the Web!). This chapter discusses some of the problems that can occur with concurrent transactions: lost updates, uncommitted data and inconsistent summaries. You will discover that such problems can be solved when a DBMS scheduler enforces concurrency control.

You will learn about the most common algorithms for concurrency control: locks, time stamping and optimistic methods. Because locks are the most widely used method, you will examine various levels and types of locks. Locks can also create deadlocks, so you will learn about strategies for managing deadlocks.

Database contents can be damaged or destroyed by critical operational errors, including transaction management failures. Therefore, you will learn how database recovery management maintains a database's contents by means of various backup procedures. Such backup procedures range from full backups to transaction log backups.
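The requirement that all parts of a transaction complete together can be previewed with a small sketch. It uses Python's standard `sqlite3` module rather than Oracle, and the tables `product` and `customer`, their columns and the helper `sell` are hypothetical stand-ins for a sales scenario, not the schema used later in the chapter.

```python
import sqlite3

# In-memory database with a made-up two-table schema for a sale.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE product (p_code TEXT PRIMARY KEY, "
    "p_qoh INTEGER CHECK (p_qoh >= 0))"
)
conn.execute(
    "CREATE TABLE customer (cust_id INTEGER PRIMARY KEY, cust_balance REAL)"
)
conn.execute("INSERT INTO product VALUES ('P1', 5)")
conn.execute("INSERT INTO customer VALUES (1, 0.0)")
conn.commit()


def sell(conn, cust_id, p_code, units, price):
    """Charge the customer and reduce stock as one unit of work."""
    try:
        with conn:  # commits if the block succeeds, rolls back on an exception
            conn.execute(
                "UPDATE customer SET cust_balance = cust_balance + ? "
                "WHERE cust_id = ?",
                (units * price, cust_id),
            )
            conn.execute(
                "UPDATE product SET p_qoh = p_qoh - ? WHERE p_code = ?",
                (units, p_code),
            )
        return True
    except sqlite3.IntegrityError:
        # The CHECK constraint rejected a negative quantity on hand; the
        # balance update made in the same transaction has been undone too.
        return False


ok_sale = sell(conn, 1, "P1", 2, 10.0)       # enough stock: both updates commit
failed_sale = sell(conn, 1, "P1", 10, 10.0)  # too many units: nothing is kept
balance = conn.execute("SELECT cust_balance FROM customer").fetchone()[0]
stock = conn.execute("SELECT p_qoh FROM product").fetchone()[0]
print(ok_sale, failed_sale, balance, stock)
```

The second call fails on the inventory update, and the customer balance is left exactly as it was after the first sale, which is the all-or-nothing behaviour that atomicity describes.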


12.1 WHAT IS A TRANSACTION?

To illustrate what transactions are and how they work, let's use the Ch12_SaleCo database. The relational diagram for that database is shown in Figure 12.1.

Online Content
The 'Ch12_SaleCo' database used to illustrate the material in this chapter is available on the online platform for this book.

FIGURE 12.1 The Ch12_SaleCo database ERD


NOTE: Although SQL commands illustrate several transaction and concurrency control issues, you should be able to follow the discussions even if you have not studied Chapter 8, Beginning Structured Query Language, and Chapter 9, Procedural Language SQL and Advanced SQL. If you don't know SQL, ignore the SQL commands and focus on the discussions. If you have a working knowledge of SQL, you can use the Ch12_SaleCo database to generate your own SELECT and UPDATE examples and to augment the material presented in Chapters 8 and 9 by writing your own triggers and stored procedures.

As you examine the relational diagram in Figure 12.1, note the following features:

• The design stores the customer balance (CUST_BALANCE) value in the CUSTOMER table to indicate the total amount owed by the customer. The CUST_BALANCE attribute is increased whenever the customer makes a purchase on credit, and it is decreased when the customer makes a payment. Including the current customer account balance in the CUSTOMER table makes it easy to write a query to determine the current balance for any customer and to generate important summaries such as total, average, minimum and maximum balances.

• The ACCT_TRANSACTION table records all customer purchases and payments to track the details of customer account activity.

Naturally, you can change the database design of the Ch12_SaleCo database to reflect accounting practice more precisely, but the implementation provided here will enable you to track the transactions well enough to serve the purpose of the chapter's discussions.

To understand the concept of a transaction, suppose that you sell a product to a customer. Furthermore, suppose that the customer may charge the purchase to her/his account. Given that scenario, your sales transaction consists of at least the following parts:

• You must write a new customer invoice.
• You must reduce the quantity on hand in the product's inventory.
• You must update the account transactions.
• You must update the customer balance.

The preceding sales transaction must be reflected in the database. In database terms, a transaction is any action that reads from and/or writes to a database. A transaction may consist of a simple SELECT statement to generate a list of table contents; it may consist of a series of related UPDATE statements to change the values of attributes in various tables; it may consist of a series of INSERT statements to add rows to one or more tables; or it may consist of a combination of SELECT, UPDATE and INSERT statements. The sales transaction example includes a combination of INSERT and UPDATE statements.

Given the preceding discussion, you can now augment the definition of a transaction. A transaction is a logical unit of work that must be entirely completed or entirely aborted; no intermediate states are acceptable. In other words, a multi-component transaction, such as the previously mentioned sale, must not be partially completed. Updating only the inventory or only the accounts receivable is not acceptable. All of the SQL statements in the transaction must be completed successfully. If any of the SQL statements fail, the entire transaction is rolled back to the original database state that existed before the transaction started. A successful transaction changes the database from one consistent state to another. A consistent database state is one in which all data integrity constraints are satisfied.

To ensure consistency of the database, every transaction must begin with the database in a known consistent state. If the database is not in a consistent state, the transaction will yield an inconsistent database that violates its integrity and business rules. For that reason, subject to limitations discussed later, all transactions are controlled and executed by the DBMS to guarantee database integrity.

Most real-world database transactions are formed by two or more database requests. A database request is the equivalent of a single SQL statement in an application program or transaction. Therefore, if a transaction is composed of two UPDATE statements and one INSERT statement, the transaction uses three database requests. In turn, each database request generates several input/output (I/O) operations that read from or write to physical storage media.

12.1.1 Evaluating Transaction Results

Not all transactions update the database. Suppose you want to examine the CUSTOMER table to determine the current balance for customer number 10016. Such a transaction can be completed by using the SQL code:

SELECT CUST_NUMBER, CUST_BALANCE
FROM CUSTOMER
WHERE CUST_NUMBER = 10016;

Although that query does not make any changes in the CUSTOMER table, the SQL code represents a transaction because it accesses the database. If the database existed in a consistent state before the access, the database remains in a consistent state after the access because the transaction did not alter the database.

Remember that a transaction may consist of a single SQL statement or a collection of related SQL statements. Let's revisit the previous sales example to illustrate a more complex transaction, using the Ch12_SaleCo database. Suppose that on 18 January 2019 you register the credit sale of one unit of product 89-WRE-Q to customer 10016 in the amount of 277.55. The required transaction affects the INVOICE, LINE, PRODUCT, CUSTOMER and ACCT_TRANSACTION tables. The SQL statements that represent this transaction are as follows:

INSERT INTO INVOICE
VALUES (1009, 10016, '18-Jan-2019', 256.99, 20.56, 277.55, 'cred', 0.00, 277.55);

INSERT INTO LINE
VALUES (1009, 1, '89-WRE-Q', 1, 256.99, 256.99);

UPDATE PRODUCT
SET PROD_QOH = PROD_QOH - 1
WHERE PROD_CODE = '89-WRE-Q';

UPDATE CUSTOMER
SET CUST_BALANCE = CUST_BALANCE + 277.55
WHERE CUST_NUMBER = 10016;

INSERT INTO ACCT_TRANSACTION
VALUES (10007, '18-Jan-19', 10016, 'charge', 277.55);

COMMIT;


The results of the successfully completed transaction are shown in Figure 12.2. (Note that all records involved in the transaction are highlighted.)

To further your understanding of the transaction results, note the following:

• A new row 1009 was added to the INVOICE table. In this row, derived attribute values were stored for the invoice subtotal, the tax, the invoice total and the invoice balance.

• The LINE row for invoice 1009 was added to reflect the purchase of one unit of product 89-WRE-Q with a price of 256.99. In this row, the derived attribute values for the line amount were stored.

• The product 89-WRE-Q's quantity on hand (PROD_QOH) in the PRODUCT table was reduced by one (the initial value was 12), thus leaving a quantity on hand of 11.

• The customer balance (CUST_BALANCE) for customer 10016 was updated by adding 277.55 to the existing balance (the initial value was 0.00).

• A new row was added to the ACCT_TRANSACTION table to reflect the new account transaction number 10007.

• The COMMIT statement is used to end a successful transaction. (See Section 12.1.3.)

FIGURE 12.2 Tracing the transaction in the Ch12_SaleCo database

Table name: INVOICE

INV_NUMBER  CUST_NUMBER  INV_DATE   INV_SUBTOTAL  INV_TAX  INV_TOTAL  INV_PAY_TYPE  INV_PAY_AMOUNT  INV_BALANCE
1001        10014        16-Jan-19         54.92     4.39      59.31  cc                     59.31         0.00
1002        10011        16-Jan-19          9.98     0.80      10.78  cash                   10.78         0.00
1003        10012        16-Jan-19        270.70    21.66     292.36  cc                    292.36         0.00
1004        10011        17-Jan-19         34.87     2.79      37.66  cc                     37.66         0.00
1005        10018        17-Jan-19         70.44     5.64      76.08  cc                     76.08         0.00
1006        10014        17-Jan-19        397.83    31.83     429.66  cred                  100.00       329.66
1007        10015        17-Jan-19         34.97     2.80      37.77  chk                    37.77         0.00
1008        10011        17-Jan-19       1033.08    82.65    1115.73  cred                  500.00       615.73
1009        10016        18-Jan-19        256.99    20.56     277.55  cred                    0.00       277.55

Table name: PRODUCT

PROD_CODE  PROD_DESCRIPT                                PROD_INDATE  PROD_QOH  PROD_MIN  PROD_PRICE  PROD_DISCOUNT  VEND_NUMBER
11QER/31   Power painter, 15 psi., 3-nozzle             03-Nov-18           8         5      109.99           0.00  25595
13-Q2/P2   7.25 cm pwr. saw blade                       13-Dec-18          32        15       14.99           0.05  21344
14-Q1/L3   9.00 cm pwr. saw blade                       13-Nov-18          18        12       17.49           0.00  21344
1546-QQ2   Hrd. cloth, 1/4 cm, 2 x 50                   15-Jan-19          15         8       39.95           0.00  23119
1558-QW1   Hrd. cloth, 1/2 cm, 3 x 50                   15-Jan-19          23         5       43.99           0.00  23119
2232/QTY   B&D jigsaw, 12 cm blade                      30-Dec-18           8         5      109.92           0.05  24288
2232/QWE   B&D jigsaw, 8 cm blade                       24-Dec-18           6         5       99.87           0.05  24288
2238/QPD   B&D cordless drill, 1/2 cm                   20-Jan-19          12         5       38.95           0.05  25595
23109-HB   Claw hammer                                  20-Jan-19          23        10        9.95           0.10  21225
23114-AA   Sledge hammer, 12 kg                         02-Jan-19           8         5       14.40           0.05
54778-2T   Rat-tail file, 1/8 cm fine                   15-Dec-18          43        20        4.99           0.00  21344
89-WRE-Q   Hicut chain saw, 16 cm                       07-Jan-19          11         5      256.99           0.05  24288
PVC23DRT   PVC pipe, 3.5 cm, 8 m                        06-Jan-18         188        75        5.87           0.00
SM-18277   1.25 cm metal screw, 25                      01-Mar-19         172        75        6.99           0.00  21225
SW-23116   2.5 cm wd. screw, 50                         24-Feb-19         237       100        8.45           0.00  21231
WR3/TT3    Steel matting, 4 m x 8 m x 1/6 m, .5 m mesh  17-Jan-19          18         5      119.95           0.10  25595

Table name: CUSTOMER

CUST_NUMBER  CUST_LNAME  CUST_FNAME  CUST_INITIAL  CUST_AREACODE  CUST_PHONE  CUST_BALANCE
10010        Ramas       Alfred      A             0181           844-2573            0.00
10011        Dunne       Leona       K             0161           894-1238          615.73
10012        Moloi       Marlene     W             0181           894-2285            0.00
10013        Pieterse    Jaco        F             0181           894-2180            0.00
10014        Orlando     Myron                     0181           222-1672            0.00
10015        O'Brian     Amy         B             0161           442-3381            0.00
10016        Brown       James       G             0181           297-1228          277.55
10017        Williams    George                    0181           290-2556            0.00
10018        Padayachee  Vinaya      G             0181           382-7185            0.00
10019        Moloi       Mlilo       K             0161           297-3809            0.00

Table name: LINE

INV_NUMBER  LINE_NUMBER  PROD_CODE  LINE_UNITS  LINE_PRICE  LINE_AMOUNT
1001        1            13-Q2/P2            3       14.99        44.97
1001        2            23109-HB            1        9.95         9.95
1002        1            54778-2T            2        4.99         9.98
1003        1            2238/QPD            4       38.95       155.80
1003        2            1546-QQ2            1       39.95        39.95
1003        3            13-Q2/P2            5       14.99        74.95
1004        1            54778-2T            3        4.99        14.97
1004        2            23109-HB            2        9.95        19.90
1005        1            PVC23DRT           12        5.87        70.44
1006        1            SM-18277            3        6.99        20.97
1006        2            2232/QTY            1      109.92       109.92
1006        3            23109-HB            1        9.95         9.95
1006        4            89-WRE-Q            1      256.99       256.99
1007        1            13-Q2/P2            2       14.99        29.98
1007        2            54778-2T            1        4.99         4.99
1008        1            PVC23DRT            5        5.87        29.35
1008        2            WR3/TT3             4      119.95       479.80
1008        3            23109-HB            1        9.95         9.95
1008        4            89-WRE-Q            2      256.99       513.98
1009        1            89-WRE-Q            1      256.99       256.99

Table name: ACCT_TRANSACTION

ACCT_TRANS_NUM  ACCT_TRANS_DATE  CUST_NUMBER  ACCT_TRANS_TYPE  ACCT_TRANS_AMOUNT
10003           17-Jan-19        10014        charge                      329.66
10004           17-Jan-19        10011        charge                      615.73
10006           29-Jan-19        10014        payment                     329.66
10007           18-Jan-19        10016        charge                      277.55

Now suppose that the DBMS completes the first three SQL statements. Further, suppose that during the execution of the fourth statement (the UPDATE of the CUSTOMER table's CUST_BALANCE value for customer 10016), the computer system experiences a loss of electrical power. If the computer does not have a backup power supply, the transaction cannot be completed. Therefore, the INVOICE and LINE rows were added and the PRODUCT table was updated to represent the sale of product 89-WRE-Q, but customer 10016 was not charged, nor was the required record in the ACCT_TRANSACTION table written. The database is now in an inconsistent state, and it is not usable for subsequent transactions. Assuming that the DBMS supports transaction management, the DBMS will roll back the database to a previous consistent state.
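The interrupted sale can be imitated in a few lines. The following is only a sketch, assuming SQLite (through Python's standard sqlite3 module) as a stand-in for the DBMS; the table definitions are cut down to the columns the example needs, and the helper names make_db, run_sale and snapshot are hypothetical. A raised exception plays the part of the power failure that strikes before the fourth statement.

```python
import sqlite3

def make_db():
    """Build a trimmed-down, in-memory stand-in for Ch12_SaleCo (assumed schema)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE INVOICE (INV_NUMBER INTEGER PRIMARY KEY, CUST_NUMBER INTEGER, INV_TOTAL REAL);
        CREATE TABLE LINE (INV_NUMBER INTEGER, LINE_NUMBER INTEGER, PROD_CODE TEXT, LINE_UNITS INTEGER);
        CREATE TABLE PRODUCT (PROD_CODE TEXT PRIMARY KEY, PROD_QOH INTEGER);
        CREATE TABLE CUSTOMER (CUST_NUMBER INTEGER PRIMARY KEY, CUST_BALANCE REAL);
        CREATE TABLE ACCT_TRANSACTION (ACCT_TRANS_NUM INTEGER PRIMARY KEY, CUST_NUMBER INTEGER,
                                       ACCT_TRANS_TYPE TEXT, ACCT_TRANS_AMOUNT REAL);
        INSERT INTO PRODUCT VALUES ('89-WRE-Q', 12);
        INSERT INTO CUSTOMER VALUES (10016, 0.00);
    """)
    return conn

def run_sale(conn, fail_before_customer_update=False):
    """Run the credit sale as one unit of work; return True only if it committed."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes the block
            conn.execute("INSERT INTO INVOICE VALUES (1009, 10016, 277.55)")
            conn.execute("INSERT INTO LINE VALUES (1009, 1, '89-WRE-Q', 1)")
            conn.execute("UPDATE PRODUCT SET PROD_QOH = PROD_QOH - 1 "
                         "WHERE PROD_CODE = '89-WRE-Q'")
            if fail_before_customer_update:
                raise RuntimeError("simulated power loss")  # stands in for the outage
            conn.execute("UPDATE CUSTOMER SET CUST_BALANCE = CUST_BALANCE + 277.55 "
                         "WHERE CUST_NUMBER = 10016")
            conn.execute("INSERT INTO ACCT_TRANSACTION "
                         "VALUES (10007, 10016, 'charge', 277.55)")
        return True
    except RuntimeError:
        return False

def snapshot(conn):
    """Return (quantity on hand, customer balance, number of invoices)."""
    qoh = conn.execute("SELECT PROD_QOH FROM PRODUCT").fetchone()[0]
    bal = conn.execute("SELECT CUST_BALANCE FROM CUSTOMER").fetchone()[0]
    invoices = conn.execute("SELECT COUNT(*) FROM INVOICE").fetchone()[0]
    return qoh, bal, invoices
```

Calling run_sale(conn, fail_before_customer_update=True) leaves snapshot(conn) exactly as it was before the call, because the three statements that had already run are undone together; a later run_sale(conn) applies all five statements or none.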

NOTE: Microsoft Access supports transaction management through its native JET engine, via an ODBC interface to an external DBMS, or via ActiveX Data Objects (ADO) components. More sophisticated DBMSs, such as Oracle, SQL Server and DB2, do support the transaction management components discussed in this chapter.

Although the DBMS is designed to recover a database to a previous consistent state when an interruption prevents the completion of a transaction, the transaction itself is defined by the end user or programmer and must be semantically correct. The DBMS cannot guarantee that the semantic meaning of the transaction truly represents the real-world event. For example, suppose that following the sale of ten units of product 89-WRE-Q, the inventory UPDATE commands were written this way:

UPDATE PRODUCT
SET PROD_QOH = PROD_QOH + 10
WHERE PROD_CODE = '89-WRE-Q';

The sale should have decreased the PROD_QOH value for product 89-WRE-Q by ten. Instead, the UPDATE added ten to product 89-WRE-Q's PROD_QOH value.

Although the UPDATE command's syntax is correct, its use yields incorrect results. Yet the DBMS will execute the transaction anyway. The DBMS cannot evaluate whether the transaction represents the real-world event correctly; that is the end user's responsibility. End users and programmers are capable of introducing many errors in this fashion. Imagine the consequences of reducing the quantity on hand for product 1546-QQ2 instead of product 89-WRE-Q or of crediting the CUST_BALANCE value for customer 10012 rather than customer 10016.

Clearly, improper or incomplete transactions can have a devastating effect on database integrity. Some DBMSs, especially the relational variety, provide means by which the user can define enforceable constraints based on business rules. Other integrity rules, such as those governing referential and entity integrity, are enforced automatically by the DBMS when the table structures are properly defined, thereby letting the DBMS validate some transactions. For example, if a transaction inserts a new customer number into a customer table and the customer number being inserted already exists, the DBMS will end the transaction with an error code to indicate a violation of the primary key integrity rule.

12.1.2 Transaction Properties

All transactions must display atomicity, consistency, isolation, durability and serialisability. These properties are sometimes referred to as the ACIDS test. Let's look briefly at each of these properties:

• Atomicity requires that all operations (SQL requests) of a transaction be completed; if not, the transaction is aborted. If a transaction T1 has four SQL requests, all four requests must be successfully completed; otherwise, the entire transaction is aborted. In other words, a transaction is treated as a single, indivisible, logical unit of work.

• Consistency indicates the permanence of the database's consistent state. A transaction takes a database from one consistent state to another. When a transaction is completed, the database reaches a consistent state. If any of the transaction parts violates an integrity constraint, the entire transaction is aborted.

• Isolation means that the data used during the execution of a transaction cannot be used by a second transaction until the first one is completed. In other words, if a transaction T1 is being executed and is using the data item X, that data item cannot be accessed by any other transaction (T2 ... Tn) until T1 ends. This property is particularly useful in multi-user database environments because several users can access and update the database at the same time.

• Durability ensures that once transaction changes are done (committed), they cannot be undone or lost, even in the event of a system failure.

• Serialisability ensures that the concurrent execution of several transactions yields consistent results. More specifically, the concurrent execution of transactions T1, T2 and T3 yields results that appear to have been executed in serial order (one after another). This property is important in multi-user and distributed databases, where multiple transactions are likely to be executed concurrently. Naturally, if only a single transaction is executed, serialisability is not an issue.

By its nature, a single-user database system automatically ensures serialisability and isolation of the database because only one transaction is executed at a time. The atomicity and durability of transactions must be guaranteed by the single-user DBMSs. (Even a single-user DBMS must manage recovery from errors created by operating-system-induced interruptions, power interruptions and improper application execution.)
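The atomicity and consistency properties can be observed with a declared constraint. This is a hedged sketch, not the chapter's own code: it assumes SQLite via Python's sqlite3 module, a trimmed two-table schema, and a hypothetical business rule (a CHECK constraint forbidding a negative PROD_QOH). When the second request of the transaction violates the rule, the first request is undone as well.

```python
import sqlite3

def make_db():
    """Two trimmed tables; the CHECK constraint is an assumed business rule."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE PRODUCT (
            PROD_CODE TEXT PRIMARY KEY,
            PROD_QOH  INTEGER CHECK (PROD_QOH >= 0)
        );
        CREATE TABLE CUSTOMER (
            CUST_NUMBER  INTEGER PRIMARY KEY,
            CUST_BALANCE REAL
        );
        INSERT INTO PRODUCT VALUES ('89-WRE-Q', 11);
        INSERT INTO CUSTOMER VALUES (10016, 277.55);
    """)
    return conn

def sell(conn, units, amount):
    """Charge the customer, then reduce stock; both requests succeed or neither does."""
    try:
        with conn:  # rolls back every request if any one of them raises
            conn.execute("UPDATE CUSTOMER SET CUST_BALANCE = CUST_BALANCE + ? "
                         "WHERE CUST_NUMBER = 10016", (amount,))
            conn.execute("UPDATE PRODUCT SET PROD_QOH = PROD_QOH - ? "
                         "WHERE PROD_CODE = '89-WRE-Q'", (units,))
        return "committed"
    except sqlite3.IntegrityError:
        return "aborted"

def state(conn):
    """Return (quantity on hand, customer balance)."""
    qoh = conn.execute("SELECT PROD_QOH FROM PRODUCT").fetchone()[0]
    bal = conn.execute("SELECT CUST_BALANCE FROM CUSTOMER").fetchone()[0]
    return qoh, bal
```

Selling 20 units when only 11 are on hand returns "aborted" and leaves the customer balance untouched, even though the balance UPDATE had already executed inside the transaction.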

Multi-user databases, whether mainframe- or LAN-based, are typically subject to multiple concurrent transactions. Therefore, the multi-user DBMS must implement controls to ensure serialisability and isolation of transactions, in addition to atomicity and durability, to guard the database's consistency and integrity. For example, if several concurrent transactions are executed over the same data set and the second transaction updates the database before the first transaction is finished, the isolation property is violated and the database is no longer consistent. The DBMS must manage the transactions by using concurrency control techniques to avoid such undesirable situations.

12.1.3 Transaction Management with SQL

The American National Standards Institute (ANSI) has defined standards that govern SQL database transactions. Transaction support is provided by two SQL statements: COMMIT and ROLLBACK. ANSI standards require that, when a transaction sequence is initiated by a user or an application program, the sequence must continue through all succeeding SQL statements until one of the following four events occurs:

1. A COMMIT statement is reached, in which case all changes are permanently recorded within the database. The COMMIT statement automatically ends the SQL transaction.

2. A ROLLBACK statement is reached, in which case all changes are aborted and the database is rolled back to its previous consistent state.

3. The end of a program is successfully reached, in which case all changes are permanently recorded within the database. This action is equivalent to COMMIT.

4. The program is abnormally terminated, in which case the changes made in the database are aborted and the database is rolled back to its previous consistent state. This action is equivalent to ROLLBACK.

The use of COMMIT is illustrated in the following simplified sales example, which updates a product's quantity on hand (PROD_QOH) and the customer's balance when the customer buys two units of product 1558-QW1 priced at 43.99 per unit (for a total of 87.98) and charges the purchase to his or her account:

UPDATE PRODUCT
SET PROD_QOH = PROD_QOH - 2
WHERE PROD_CODE = '1558-QW1';

UPDATE CUSTOMER
SET CUST_BALANCE = CUST_BALANCE + 87.98
WHERE CUST_NUMBER = '10011';

COMMIT;

(Note that the example is simplified to make it easy to trace the transaction. In the Ch12_SaleCo database, the transaction would involve several additional table updates.)
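To see what "permanently recorded" means in practice, the simplified sale can be run against a database file, committed, and then read back through a fresh connection. This is a minimal sketch assuming SQLite through Python's sqlite3 module; the helper names and the starting values (25 units on hand, a zero balance) are illustrative assumptions, not figures from the Ch12_SaleCo data.

```python
import os
import sqlite3
import tempfile

def create_db(path):
    """Create a two-table database file with assumed starting values."""
    conn = sqlite3.connect(path)
    conn.executescript("""
        CREATE TABLE PRODUCT (PROD_CODE TEXT PRIMARY KEY, PROD_QOH INTEGER);
        CREATE TABLE CUSTOMER (CUST_NUMBER INTEGER PRIMARY KEY, CUST_BALANCE REAL);
        INSERT INTO PRODUCT VALUES ('1558-QW1', 25);
        INSERT INTO CUSTOMER VALUES (10011, 0.00);
    """)
    conn.close()

def charge_sale(path):
    """Run both UPDATEs and COMMIT; report whether a transaction was open before COMMIT."""
    conn = sqlite3.connect(path)
    conn.execute("UPDATE PRODUCT SET PROD_QOH = PROD_QOH - 2 "
                 "WHERE PROD_CODE = '1558-QW1'")
    conn.execute("UPDATE CUSTOMER SET CUST_BALANCE = CUST_BALANCE + 87.98 "
                 "WHERE CUST_NUMBER = 10011")
    pending = conn.in_transaction  # the transaction began implicitly with the first UPDATE
    conn.commit()                  # changes are now permanently recorded
    conn.close()
    return pending

def read_state(path):
    """Open a new connection and return (quantity on hand, customer balance)."""
    conn = sqlite3.connect(path)
    qoh = conn.execute("SELECT PROD_QOH FROM PRODUCT").fetchone()[0]
    bal = conn.execute("SELECT CUST_BALANCE FROM CUSTOMER").fetchone()[0]
    conn.close()
    return qoh, bal
```

Because read_state opens its own connection after the writer has closed, any values it returns must have reached the database file; omitting the commit() call in charge_sale would leave the file unchanged.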

Actually, the COMMIT statement used in that example is not necessary if the UPDATE statement is the application's last action and the application terminates normally. However, good programming practice dictates that you include the COMMIT statement at the end of a transaction declaration.

A transaction begins implicitly when the first SQL statement is encountered. Not all SQL implementations follow the ANSI standard; some (such as SQL Server) use transaction management statements such as:

BEGIN TRANSACTION;

to indicate the beginning of a new transaction. Other SQL implementations, such as VAX/SQL, allow you to assign characteristics for the transactions as parameters to the BEGIN statement. For example, the Oracle RDBMS uses the SET TRANSACTION statement to declare a new transaction start and its properties.
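An explicit transaction start can be tried out in SQLite, which also accepts BEGIN TRANSACTION. The sketch below is an illustration under stated assumptions: it uses Python's sqlite3 module with isolation_level=None so that the module issues no implicit BEGIN, a one-table schema, and a hypothetical helper named demo. It is not SQL Server, VAX/SQL or Oracle syntax.

```python
import sqlite3

def demo():
    """Start a transaction explicitly, change a row, then roll the change back."""
    # isolation_level=None: the module never issues BEGIN on our behalf
    conn = sqlite3.connect(":memory:", isolation_level=None)
    conn.execute("CREATE TABLE PRODUCT (PROD_CODE TEXT PRIMARY KEY, PROD_QOH INTEGER)")
    conn.execute("INSERT INTO PRODUCT VALUES ('1558-QW1', 25)")

    def qoh():
        return conn.execute("SELECT PROD_QOH FROM PRODUCT").fetchone()[0]

    conn.execute("BEGIN TRANSACTION")   # explicit start of the transaction
    started = conn.in_transaction
    conn.execute("UPDATE PRODUCT SET PROD_QOH = PROD_QOH - 2 "
                 "WHERE PROD_CODE = '1558-QW1'")
    inside = qoh()                      # value seen by the transaction itself
    conn.execute("ROLLBACK")            # abort: restore the previous consistent state
    after = qoh()
    ended = conn.in_transaction
    conn.close()
    return started, inside, after, ended
```

The four values returned show the life cycle: a transaction is open after BEGIN, the update is visible inside it, ROLLBACK restores the original quantity, and no transaction remains open afterwards.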

12.1.4 The Transaction Log

A DBMS uses a transaction log to keep track of all transactions that update the database. The information stored in this log is used by the DBMS for a recovery requirement triggered by a ROLLBACK statement, a program's abnormal termination, or a system failure such as a network discrepancy or a disk crash. Some RDBMSs use the transaction log to recover a database forward to a currently consistent state. After a server failure, for example, Oracle automatically rolls back uncommitted transactions and rolls forward transactions that were committed but not yet written to the physical database.

While the DBMS executes transactions that modify the database, it also automatically updates the transaction log. The transaction log stores:

• A record for the beginning of the transaction.
• For each transaction component (SQL statement):
  • The type of operation being performed (update, delete, insert).
  • The names of the objects affected by the transaction (the name of the table).
  • The before and after values for the fields being updated.
  • Pointers to the previous and next transaction log entries for the same transaction.
• The ending (COMMIT) of the transaction.

Although using a transaction log increases the processing overhead of a DBMS, the ability to restore a corrupted database is worth the price. (Note: Microsoft Access does not support advanced transaction management options like COMMIT, ROLLBACK, etc. As such it is not as resilient to failure as enterprise databases such as Oracle.)

Table 12.1 illustrates a simplified transaction log that reflects a basic transaction composed of two SQL UPDATE statements. If a system failure occurs, the DBMS will examine the transaction log for all uncommitted or incomplete transactions and restore (ROLLBACK) the database to its previous state on the basis of that information. When the recovery process is complete, the DBMS writes in the log all committed transactions that were not physically written to the database before the failure occurred.

TABLE 12.1 A transaction log

TRL_ID  TRX_NUM  PREV PTR  NEXT PTR  OPERATION  TABLE                   ROW ID    ATTRIBUTE     BEFORE VALUE  AFTER VALUE
341     101      Null      352       START      ****Start Transaction
352     101      341       363       UPDATE     PRODUCT                 1558-QW1  PROD_QOH      25            23
363     101      352       365       UPDATE     CUSTOMER                10011     CUST_BALANCE  525.75        615.73
365     101      363       Null      COMMIT     ****End of Transaction

TRL_ID = Transaction log record ID
TRX_NUM = Transaction number
PTR = Pointer to a transaction log record ID
(Note: The transaction number is automatically assigned by the DBMS.)

If a ROLLBACK is issued before the termination of a transaction, the DBMS will restore the database only for that particular transaction, rather than for all transactions, to maintain the durability of the previous transactions. In other words, committed transactions are not rolled back.

The transaction log is itself a database, and it is managed by the DBMS like any other database. The transaction log is subject to common database dangers such as disk-full conditions and disk crashes. As the transaction log contains some of the most critical data in a DBMS, some implementations support transaction logs on several different disks to reduce the risk of a system failure.
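The role of the before values in recovery can be sketched with a toy log. This is a deliberately simplified, assumed model in plain Python: the LogRecord fields mirror the column names of a log such as the one in Table 12.1, the "database" is a nested dictionary, and recover is a hypothetical helper that only performs the undo half of recovery (no roll forward).

```python
from dataclasses import dataclass
from typing import Any, Optional

@dataclass
class LogRecord:
    trl_id: int                      # transaction log record ID
    trx_num: int                     # transaction number
    operation: str                   # 'START', 'UPDATE' or 'COMMIT'
    table: Optional[str] = None
    row_id: Optional[Any] = None
    attribute: Optional[str] = None
    before: Any = None               # value prior to the update
    after: Any = None                # value written by the update

def recover(db, log):
    """Undo, newest first, every UPDATE whose transaction has no COMMIT record."""
    committed = {rec.trx_num for rec in log if rec.operation == "COMMIT"}
    for rec in reversed(log):
        if rec.operation == "UPDATE" and rec.trx_num not in committed:
            db[rec.table][rec.row_id][rec.attribute] = rec.before
    return db

# Transaction 100 committed; transaction 101 was interrupted before its COMMIT.
log = [
    LogRecord(330, 100, "START"),
    LogRecord(335, 100, "UPDATE", "PRODUCT", "89-WRE-Q", "PROD_QOH", 12, 11),
    LogRecord(338, 100, "COMMIT"),
    LogRecord(341, 101, "START"),
    LogRecord(352, 101, "UPDATE", "PRODUCT", "1558-QW1", "PROD_QOH", 25, 23),
]
# State of the (toy) database at the moment of the failure.
db = {"PRODUCT": {"89-WRE-Q": {"PROD_QOH": 11}, "1558-QW1": {"PROD_QOH": 23}}}
recover(db, log)
```

After recover runs, the uncommitted change to product 1558-QW1 is restored to its before value, while the committed change to product 89-WRE-Q is left alone, which matches the rule that committed transactions are not rolled back.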

12.2 CONCURRENCY CONTROL

The coordination of the simultaneous execution of transactions in a multi-user database system is known as concurrency control. The objective of concurrency control is to ensure the serialisability of transactions in a multi-user database environment. Concurrency control is important because the simultaneous execution of transactions over a shared database can create several data integrity and consistency problems. The three main problems are lost updates, uncommitted data and inconsistent retrievals.

12.2.1 Lost Updates

The lost update problem occurs when two concurrent transactions, T1 and T2, are updating the same data element and one of the updates is lost (overwritten by the other transaction). To see an illustration of lost updates, let's examine a simple PRODUCT table. One of the PRODUCT table's attributes is a product's quantity on hand (PROD_QOH). Assume that you have a product whose current PROD_QOH value is 35. Also assume that two concurrent transactions, T1 and T2, occur that update the PROD_QOH value for some item in the PRODUCT table. The transactions are:

Transaction             Computation
T1: Purchase 100 units  PROD_QOH = PROD_QOH + 100
T2: Sell 30 units       PROD_QOH = PROD_QOH - 30

CHAPTER

Table 12.2 shows the correct

answer

serial

execution

PROD_QOH

TABLE 12.2

of those

transactions

Normal execution

Step

1

T1

Read

2

T1

PROD_QOH

3

T1

4

T2

Read

5

T2

PROD_QOH

6

T2

suppose

12.3

shows

35,

when

and its

promptly

a transaction (using

how the lost the

overwritten

TABLE 12.3

is able to read

the same

update

second

subtraction

Concurrency

yielding

Stored

647

the

product)

problem

can

transaction

5 35

1 100 135 135

PROD_QOH

5 135

2 30

yields

5 in

by T2. In short, the

a products

arise. is

In

the

memory.

105

PROD_QOH

has been committed.

(T2)

addition

Note that

transaction

Therefore,

meantime,

T1

value from the table

The sequence

the first

executed.

Value

35

PROD_QOH

Write PROD_QOH

that

and

circumstances,

Write PROD_QOH

transaction

committed

normal

Transactions

of two transactions

Transaction

However,

under

Managing

5 105.

Time

a previous

12

T2 still

writes

of 100 units is lost

the

depicted

(T1)

has

operates

value

before

in

not

yet

on the

135 to

Table

disk,

been value

which

is

during the process.

Lost updates Stored

Time

Transaction

Step

1

T1

Read

PROD_QOH

35

2

T2

Read

PROD_QOH

35

3

T1

PROD_QOH

5 35

1 100

4

T2

PROD_QOH

5 35

2 30

5

T1

Write PROD_QOH

6

T2

Write

(Lost

update)

Value

135

PROD_QOH

5

12
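The difference between the serial schedule and the interleaved schedule can be replayed in a few lines of Python. This is only an illustrative sketch: the function name `run` and the step labels are assumptions made for this example, and a single variable stands in for the stored PROD_QOH value.

```python
def run(schedule, stored=35):
    """Replay (transaction, action) steps against one stored PROD_QOH value."""
    local = {}                          # each transaction's private working copy
    deltas = {"T1": +100, "T2": -30}    # T1 purchases 100 units, T2 sells 30
    for txn, action in schedule:
        if action == "read":
            local[txn] = stored         # copy the stored value into memory
        elif action == "compute":
            local[txn] += deltas[txn]   # change only the in-memory copy
        elif action == "write":
            stored = local[txn]         # overwrite whatever is stored
    return stored

# Serial order: T1 finishes before T2 starts.
serial = [("T1", "read"), ("T1", "compute"), ("T1", "write"),
          ("T2", "read"), ("T2", "compute"), ("T2", "write")]

# Interleaved order: both transactions read 35 before either writes.
interleaved = [("T1", "read"), ("T2", "read"), ("T1", "compute"),
               ("T2", "compute"), ("T1", "write"), ("T2", "write")]

print(run(serial))        # 105
print(run(interleaved))   # 5
```

In the interleaved run, T2's write replaces T1's 135 with a value computed from the stale 35, so the purchase of 100 units disappears.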

12.2.2 Uncommitted Data

The phenomenon of uncommitted data occurs when two transactions, T1 and T2, are executed concurrently and the first transaction (T1) is rolled back after the second transaction (T2) has already accessed the uncommitted data, thus violating the isolation property of transactions. To illustrate this possibility, let's use the same transactions described during the lost updates discussion. T1 has two atomic parts to it, one of which is the update of the inventory, the other possibly being the update of the invoice total (not shown). T1 is forced to roll back due to an error during the update of the invoice total; hence, it rolls back all the way, undoing the inventory update as well. This time the T1 transaction is rolled back to eliminate the addition of the 100 units. Because T2 subtracts 30 from the original 35 units, the correct answer should be 5.
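The scenario can be traced with plain arithmetic. The sketch below is a simplification made for illustration: the function names are hypothetical, a single variable stands in for the stored PROD_QOH value, and the rollback is modelled by restoring the last committed value.

```python
def dirty_read_demo(committed=35):
    """T2 reads T1's uncommitted write before T1 rolls back."""
    stored = committed       # value currently stored
    t1 = stored + 100        # T1 reads 35 and computes 135
    stored = t1              # T1 writes 135 but has NOT committed
    t2 = stored - 30         # T2 reads the uncommitted 135 and computes 105
    stored = committed       # T1 rolls back: the stored value returns to 35
    stored = t2              # T2 writes its result
    return stored

def isolated_demo(committed=35):
    """T1's rollback completes before T2 reads anything."""
    stored = committed
    t1 = stored + 100
    stored = t1              # T1 writes 135 ...
    stored = committed       # ... and rolls back before T2 starts
    t2 = stored - 30         # T2 reads the committed 35 and computes 5
    stored = t2
    return stored

print(dirty_read_demo())     # 105
print(isolated_demo())       # 5
```

The first function ends with 105 because T2 based its subtraction on a value that was never committed; the second gives the expected 5.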


Transaction               Computation
T1: Purchase 100 units    PROD_QOH = PROD_QOH + 100 (Rolled back)
T2: Sell 30 units         PROD_QOH = PROD_QOH - 30

Table 12.4 shows how, under normal circumstances, the serial execution of those transactions yields the correct answer.

TABLE 12.4 Correct execution of two transactions

Time  Transaction  Step                    Stored Value
1     T1           Read PROD_QOH           35
2     T1           PROD_QOH = 35 + 100
3     T1           Write PROD_QOH          135
4     T1           *****ROLLBACK*****      35
5     T2           Read PROD_QOH           35
6     T2           PROD_QOH = 35 - 30
7     T2           Write PROD_QOH          5

Table 12.5 shows how the uncommitted data problem can arise when the ROLLBACK is completed after T2 has begun its execution.

TABLE 12.5 An uncommitted data problem

Time  Transaction  Step                                     Stored Value
1     T1           Read PROD_QOH                            35
2     T1           PROD_QOH = 35 + 100
3     T1           Write PROD_QOH                           135
4     T2           Read PROD_QOH (Read uncommitted data)    135
5     T2           PROD_QOH = 135 - 30
6     T1           *****ROLLBACK*****                       35
7     T2           Write PROD_QOH                           105

12.2.3 Inconsistent Retrievals

Inconsistent retrievals occur when a transaction accesses data before and after one or more other transactions finish working with such data. For example, an inconsistent retrieval would occur if transaction T1 calculated some summary (using SQL aggregate functions) over a set of data while another transaction, T2, was updating the same data. The problem is that the transaction might read some data before they are changed and other data after they are changed, thereby yielding inconsistent results.

To illustrate that problem, assume the following conditions:

1 T1 calculates the total quantity on hand of the products stored in the PRODUCT table.

2 At the same time, T2 updates the quantity on hand (PROD_QOH) for two of the PRODUCT table's products.


The two transactions are shown in Table 12.6.

TABLE 12.6 Retrieval during update

Transaction T1:
  SELECT SUM(PROD_QOH)
  FROM PRODUCT

Transaction T2:
  UPDATE PRODUCT
  SET PROD_QOH = PROD_QOH + 10
  WHERE PROD_CODE = '1546-QQ2'

  UPDATE PRODUCT
  SET PROD_QOH = PROD_QOH - 10
  WHERE PROD_CODE = '1558-QW1'

  COMMIT;

While T1 calculates the total quantity on hand (PROD_QOH) for all items, T2 represents the correction of a typing error: the user added ten units to product 1558-QW1's PROD_QOH, but meant to add the ten units to product 1546-QQ2's PROD_QOH. To correct the problem, the user adds ten to product 1546-QQ2's PROD_QOH and subtracts ten from product 1558-QW1's PROD_QOH. (See the two UPDATE statements in Table 12.6.) The initial and final PROD_QOH values are reflected in Table 12.7. (Only a few of the PROD_CODE values for the PRODUCT table are shown. To illustrate the point, the sum for the PROD_QOH values is given for those few products.)

TABLE 12.7 Transaction results: data entry correction

PROD_CODE   Before PROD_QOH   After PROD_QOH
11QER/31    8                 8
13-Q2/P2    32                32
1546-QQ2    15                (15 + 10) = 25
1558-QW1    23                (23 - 10) = 13
2232-QTY    8                 8
2232-QWE    6                 6
Total       92                92

Although the final results shown in Table 12.7 are correct after the adjustment, Table 12.8 demonstrates that inconsistent retrievals are possible during the transaction execution, making the result of T1's execution incorrect. The After summation shown in Table 12.8 reflects the fact that the value of 25 for product 1546-QQ2 was read after the write statement was completed. Therefore, the After total is 40 + 25 = 65. The Before total reflects the fact that the value of 23 for product 1558-QW1 was read before the next write statement was completed to reflect the corrected update of 13. Therefore, the Before total is 65 + 23 = 88.


TABLE 12.8 Inconsistent retrievals

Time  Transaction  Action                                       Value        Total
1     T1           Read PROD_QOH for PROD_CODE = '11QER/31'     8            8
2     T1           Read PROD_QOH for PROD_CODE = '13-Q2/P2'     32           40
3     T2           Read PROD_QOH for PROD_CODE = '1546-QQ2'     15
4     T2           PROD_QOH = 15 + 10
5     T2           Write PROD_QOH for PROD_CODE = '1546-QQ2'    25
6     T1           Read PROD_QOH for PROD_CODE = '1546-QQ2'     25           (After) 65
7     T1           Read PROD_QOH for PROD_CODE = '1558-QW1'     23           (Before) 88
8     T2           Read PROD_QOH for PROD_CODE = '1558-QW1'     23
9     T2           PROD_QOH = 23 - 10
10    T2           Write PROD_QOH for PROD_CODE = '1558-QW1'    13
11    T2           ***** COMMIT *****
12    T1           Read PROD_QOH for PROD_CODE = '2232-QTY'     8            96
13    T1           Read PROD_QOH for PROD_CODE = '2232-QWE'     6            102

The computed answer of 102 is obviously wrong because you know from Table 12.7 that the correct answer is 92. Unless the DBMS exercises concurrency control, a multi-user database environment can create havoc within the information system.
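The running total of 102 can be reproduced by replaying the interleaved reads and writes in order. The following is a minimal sketch: the dictionary stands in for the six PRODUCT rows shown in Table 12.7, and the function name `interleaved_sum` is an assumption made for this illustration.

```python
product = {"11QER/31": 8, "13-Q2/P2": 32, "1546-QQ2": 15,
           "1558-QW1": 23, "2232-QTY": 8, "2232-QWE": 6}

def interleaved_sum(qoh):
    """Replay T1's summation interleaved with T2's two updates."""
    qoh = dict(qoh)               # work on a copy of the rows
    total = 0
    total += qoh["11QER/31"]      # T1 reads 8,  running total 8
    total += qoh["13-Q2/P2"]      # T1 reads 32, running total 40
    qoh["1546-QQ2"] += 10         # T2 writes 25
    total += qoh["1546-QQ2"]      # T1 reads the After value 25, total 65
    total += qoh["1558-QW1"]      # T1 reads the Before value 23, total 88
    qoh["1558-QW1"] -= 10         # T2 writes 13 and commits
    total += qoh["2232-QTY"]      # T1 reads 8, total 96
    total += qoh["2232-QWE"]      # T1 reads 6, total 102
    return total, sum(qoh.values())

seen_by_t1, actual = interleaved_sum(product)
print(seen_by_t1, actual)         # 102 92
```

T1 counts the ten corrected units twice: once in the updated 1546-QQ2 row and once in the not yet updated 1558-QW1 row.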

12.2.4 The Scheduler

You now know that severe problems can arise when two or more concurrent transactions are executed. You also know that a database transaction involves a series of database I/O operations that take the database from one consistent state to another. Finally, you know that database consistency can be ensured only before and after the execution of transactions. A database always moves through an unavoidable temporary state of inconsistency during a transaction's execution. That temporary inconsistency exists because a computer cannot execute two operations at the same time and must therefore execute them serially. During this serial process, the isolation property of transactions prevents them from accessing the data not yet released by other transactions.

In previous examples, the operations within a transaction were executed in an arbitrary order. As long as two transactions, T1 and T2, access unrelated data, there is no conflict and the order of execution is irrelevant to the final outcome. However, if the transactions operate on related (or the same) data, conflict is possible among the transaction components and the selection of one operational order over another may have some undesirable consequences. So, how is the correct order determined, and who determines that order? Fortunately, the DBMS handles that tricky assignment by using a built-in scheduler.

The scheduler is a special DBMS program that establishes the order in which the operations within concurrent transactions are executed. The scheduler interleaves the execution of database operations to ensure serialisability and isolation of transactions. To determine the appropriate order, the scheduler bases its actions on concurrency control algorithms, such as locking or time stamping methods, which are explained in the next sections.
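To make "conflict" concrete, the sketch below tests whether the relative order of two operations can matter. It assumes the usual rule that two operations conflict when they belong to different transactions, touch the same data item, and at least one of them is a write. The `Op` type and the `conflicts` function are hypothetical names introduced for this illustration, not part of any DBMS.

```python
from typing import NamedTuple

class Op(NamedTuple):
    txn: str    # transaction name, e.g. "T1"
    kind: str   # "read" or "write"
    item: str   # data item the operation touches

def conflicts(a: Op, b: Op) -> bool:
    """True when swapping the order of a and b could change the outcome."""
    return (a.txn != b.txn
            and a.item == b.item
            and "write" in (a.kind, b.kind))

print(conflicts(Op("T1", "read", "X"), Op("T2", "read", "X")))    # False
print(conflicts(Op("T1", "read", "X"), Op("T2", "write", "X")))   # True
print(conflicts(Op("T1", "write", "X"), Op("T2", "write", "Y")))  # False
```

A scheduler is free to reorder pairs for which the function returns False; the pairs for which it returns True are the ones whose order must be controlled.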


The scheduler also makes sure that the computer's central processing unit (CPU) is used efficiently. If there were no way to schedule the execution of transactions, all transactions would be executed on a first-come, first-served basis. The problem with that approach is that processing time is wasted when the CPU waits for a READ or WRITE operation to finish, thereby losing several CPU cycles. In short, first-come, first-served scheduling tends to yield unacceptable response times within the multi-user DBMS environment. Therefore, some other scheduling method is needed to improve the efficiency of the overall system.

Additionally, the scheduler facilitates data isolation to ensure that two transactions do not update the same data element at the same time. Database operations may require READ and/or WRITE actions that produce conflicts. For example, Table 12.9 shows the possible conflict scenarios when two transactions, T1 and T2, are executed concurrently over the same data. Note that in Table 12.9, two operations are in conflict when they access the same data and at least one of them is a WRITE operation.

TABLE 12.9 Read/write conflict scenarios: conflicting database operations matrix

             Transactions
             T1      T2      Result
Operations   Read    Read    No conflict
             Read    Write   Conflict
             Write   Read    Conflict
             Write   Write   Conflict

Several methods have been proposed to schedule the execution of conflicting operations in concurrent transactions. Those methods have been classified as locking, time stamping and optimistic. Locking methods, discussed next, are used most frequently.

12.3 CONCURRENCY CONTROL WITH LOCKING METHODS

A lock guarantees exclusive use of a data item to a current transaction. In other words, transaction T2 does not have access to a data item that is currently being used by transaction T1. A transaction acquires a lock prior to data access; the lock is released (unlocked) when the transaction is completed so that another transaction can lock the data item for its exclusive use. This series of locking actions assumes that concurrent transactions might attempt to manipulate the same data item at the same time. The use of locks based on the assumption that conflict between transactions is likely is known as pessimistic locking.

Recall from the earlier discussion that data consistency cannot be guaranteed during a transaction; the database may be in a temporary inconsistent state when several updates are executed. Therefore, locks are required to prevent another transaction from reading inconsistent data.

Most multi-user DBMSs automatically initiate and enforce locking procedures. All lock information is managed by a lock manager, which is responsible for assigning and policing the locks used by the transactions.

12.3.1 Lock Granularity

Lock granularity indicates the level of lock use. Locking can take place at the following levels: database, table, page, row or even field (attribute).


Database Level

In a database-level lock, the entire database is locked, thus preventing the use of any tables in the database by transaction T2 while transaction T1 is being executed. This level of locking is good for batch processes, but it is unsuitable for online multi-user DBMSs. You can imagine how slow the data access would be if thousands of transactions had to wait for the previous transaction to be completed before the next one could reserve the entire database. Figure 12.3 illustrates the database-level lock. Note that transactions T1 and T2 cannot access the same database concurrently, even when they use different tables.

FIGURE 12.3 Database-level locking sequence
[Figure: timeline for the Payroll database. Transaction 1 (T1, update Table A) requests a lock on the database and receives OK. Transaction 2 (T2, update Table B) requests a lock on the database and must WAIT until T1 unlocks it; T2 then receives OK, locks the database and unlocks it when finished.]

Table Level

In a table-level lock, the entire table is locked, preventing access to any row by transaction T2 while transaction T1 is using the table. If a transaction requires access to several tables, each table may be locked. However, two transactions can access the same database as long as they access different tables.

Table-level locks, while less restrictive than database-level locks, cause traffic jams when many transactions are waiting to access the same table. Such a condition is especially irksome if the lock forces a delay when different transactions require access to different parts of the same table, that is, when the transactions would not interfere with each other. Consequently, table-level locks are not suitable for multi-user DBMSs. Figure 12.4 illustrates the effect of a table-level lock. As you examine Figure 12.4, note that transactions T1 and T2 cannot access the same table even when they try to use different rows; T2 must wait until T1 unlocks the table.

FIGURE 12.4 An example of a table-level lock
[Figure: timeline for Table A of the Payroll database. Transaction 1 (T1, update row 5) requests a lock on Table A and receives OK. Transaction 2 (T2, update row 30) requests a lock on Table A and must WAIT until T1 unlocks the table at the end of transaction 1; T2 then receives OK, locks the table and unlocks it at the end of transaction 2.]

Page Level

In a page-level lock, the DBMS locks an entire diskpage. A diskpage, or page, is the equivalent of a diskblock, which can be described as a directly addressable section of a disk. A page has a fixed size, such as 4K, 8K or 16K. For example, if you want to write only 73 bytes to a 4K page, the entire 4K page must be read from disk, updated in memory and written back to disk. A table can span several pages, and a page can contain several rows of one or more tables. Page-level locks are currently the most frequently used multi-user DBMS locking method. An example of a page-level lock is shown in Figure 12.5. As you examine Figure 12.5, note that T1 and T2 access the same table while locking different diskpages. If T2 requires the use of a row located on a page that is locked by T1, T2 must wait until the page is unlocked by T1.

FIGURE 12.5 An example of a page-level lock
[Figure: timeline for Table A of the Payroll database, whose rows 1 to 6 are stored on Page 1 and Page 2. Transaction 1 (T1, update row 1) requests a lock on page 1 and receives OK. Transaction 2 (T2, update rows 5 and 2) requests a lock on page 2 and receives OK, then requests a lock on page 1 and must wait until T1 unlocks page 1 at the end of its transaction. T2 then locks page 1 and unlocks pages 1 and 2 at the end of its transaction.]

Row Level

A row-level lock is much less restrictive than the locks discussed earlier. The DBMS allows concurrent transactions to access different rows of the same table, even when the rows are located on the same page. Although the row-level locking approach improves the availability of data, its management requires high overhead. (A lock exists for each row in each table of the database.) Figure 12.6 illustrates the use of a row-level lock. As you examine Figure 12.6, note that both transactions can execute concurrently, even when the requested rows are on the same page. T2 must wait only if it requests the same row as T1.

FIGURE 12.6 An example of a row-level lock
[Figure: timeline for Table A of the Payroll database, whose rows 1 to 6 are stored on Page 1 and Page 2. Transaction 1 (T1, update row 1) requests a lock on row 1 and receives OK. Transaction 2 (T2, update row 2) requests a lock on row 2 and receives OK. Each transaction unlocks its row at the end of the transaction.]

Field Level

The field-level lock allows concurrent transactions to access the same row as long as they require the use of different fields (attributes) within that row. Although field-level locking clearly yields the most flexible multi-user data access, it is rarely done because it requires an extremely high level of computer overhead.
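The five granularity levels can be compared by asking which resource a request would have to lock at each level. The sketch below is a simplified model for illustration only: every access is described by a hypothetical (database, table, page, row, field) tuple, the names `lock_key` and `must_wait` are assumptions, and a real lock manager is considerably more involved.

```python
LEVELS = ("database", "table", "page", "row", "field")

def lock_key(level, target):
    """Return the resource that must be locked at the given granularity."""
    depth = LEVELS.index(level) + 1
    return target[:depth]           # coarser levels keep fewer components

def must_wait(level, held, requested):
    """True if a request collides with a lock already held at this level."""
    return lock_key(level, held) == lock_key(level, requested)

# Hypothetical accesses: (database, table, page, row, field)
t1 = ("payroll", "A", 1, 1, "EMP_SALARY")
t2 = ("payroll", "A", 1, 2, "EMP_SALARY")   # same page as t1, different row
t3 = ("payroll", "A", 1, 1, "EMP_NAME")     # same row as t1, different field

for level in LEVELS:
    print(level, must_wait(level, t1, t2), must_wait(level, t1, t3))
```

The finer the level, the fewer requests collide: t2 stops waiting at the row level and t3 only at the field level, at the price of tracking many more locks.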


12.3.2 Lock Types

Regardless of the level of locking, the DBMS may use different lock types: binary or shared/exclusive.

Binary Locks

A binary lock has only two states: locked (1) or unlocked (0). If an object, that is, a database, table, page or row, is locked by a transaction, no other transaction can use that object. If an object is unlocked, any transaction can lock the object for its use. Every database operation requires that the affected object be locked. As a rule, a transaction must unlock the object after its termination. Therefore, every transaction requires a lock and unlock operation for each data item that is accessed. Such operations are automatically managed and scheduled by the DBMS; the user does not need to be concerned about locking or unlocking data items. (Every DBMS has a default locking mechanism. If the end user wants to override the default, the LOCK TABLE and other SQL commands are available for that purpose.)

The binary locking technique is illustrated in Table 12.10, using the lost updates problem you encountered in Table 12.3. As you examine Table 12.10, note that the lock and unlock features eliminate the lost update problem. (The lock is not released until the write statement is completed. Therefore, a PROD_QOH value cannot be used until it has been properly updated.) However, binary locks are now considered too restrictive to yield optimal concurrency conditions. For example, the DBMS will not allow two transactions to read the same database object even though neither transaction updates the database (and, therefore, no concurrency problems can occur). Remember from Table 12.9 that concurrency conflicts occur only when two transactions execute concurrently and one of them updates the database.

TABLE 12.10 An example of a binary lock

Time  Transaction  Step                   Stored Value
1     T1           Lock PRODUCT
2     T1           Read PROD_QOH          15
3     T1           PROD_QOH = 15 + 10
4     T1           Write PROD_QOH         25
5     T1           Unlock PRODUCT
6     T2           Lock PRODUCT
7     T2           Read PROD_QOH          23
8     T2           PROD_QOH = 23 - 10
9     T2           Write PROD_QOH         13
10    T2           Unlock PRODUCT
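A binary lock is small enough to model directly. The class below is a minimal single-threaded sketch with hypothetical names (`BinaryLockManager`, `lock`, `unlock`); it only records which transaction holds each object and says nothing about how a real DBMS queues the waiting transactions.

```python
class BinaryLockManager:
    """Each object is either unlocked (absent) or locked by one transaction."""

    def __init__(self):
        self.owner = {}              # object name -> transaction holding it

    def lock(self, txn, obj):
        """Return True if the lock is granted, False if txn must wait."""
        if obj in self.owner:        # state 1: already locked
            return False
        self.owner[obj] = txn        # state 0 -> 1
        return True

    def unlock(self, txn, obj):
        """Release the lock, but only if txn is the current holder."""
        if self.owner.get(obj) == txn:
            del self.owner[obj]

lm = BinaryLockManager()
print(lm.lock("T1", "PRODUCT"))      # True: T1 may read and write PROD_QOH
print(lm.lock("T2", "PRODUCT"))      # False: T2 waits, even just to read
lm.unlock("T1", "PRODUCT")
print(lm.lock("T2", "PRODUCT"))      # True: T2 proceeds after T1 unlocks
```

The second call shows the restriction discussed above: the manager cannot tell a harmless read from an update, so every request on a locked object is refused.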

Shared/Exclusive Locks

An exclusive lock exists when access is reserved specifically for the transaction that locked the object. The exclusive lock must be used when the potential for conflict exists (see Table 12.9). A shared lock exists when concurrent transactions are granted Read access on the basis of a common lock. A shared lock produces no conflict as long as all the concurrent transactions are read-only.

A shared lock is issued when a transaction wants to read data from the database and no exclusive lock is held on that data item. An exclusive lock is issued when a transaction wants to update (write) a data item and no locks are currently held on that data item by any other transaction. Using the shared/exclusive locking concept, a lock can have three states: unlocked, shared (Read) and exclusive (Write).

As you saw in Table 12.9, two transactions conflict only when at least one of them is a Write transaction. Because the two Read transactions can be safely executed at once, shared locks allow several Read transactions to read the same data item concurrently. For example, if transaction T1 has a shared lock on data item X and transaction T2 wants to read data item X, T2 may also obtain a shared lock on data item X.

If transaction T2 updates data item X, an exclusive lock is required by T2 over data item X. The exclusive lock is granted if and only if no other locks are held on the data item. Therefore, if a shared or exclusive lock is already held on data item X by transaction T1, an exclusive lock cannot be granted to transaction T2 and T2 must wait to begin until T1 commits. This condition is known as the mutually exclusive rule: only one transaction at a time can own an exclusive lock on the same object.

Although the use of shared locks renders data access more efficient, a shared/exclusive lock schema increases the lock manager's overhead, for several reasons:

The type of lock held must be known before a lock can be granted.

Three lock operations exist: READ_LOCK (to check the type of lock), WRITE_LOCK (to issue the lock) and UNLOCK (to release the lock).

The schema has been enhanced to allow a lock upgrade (from shared to exclusive) and a lock downgrade (from exclusive to shared).

Although locks prevent serious data inconsistencies, they can lead to two major problems:

The resulting transaction schedule may not be serialisable.

The schedule may create deadlocks. A database deadlock, which is equivalent to traffic gridlock in a big city, is caused when two transactions wait for each other to unlock data.

Fortunately, both problems can be managed: serialisability is guaranteed through a locking protocol known as two-phase locking, and deadlocks can be managed by using deadlock detection and prevention techniques. Those techniques are examined in the next two sections.
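The granting rules for shared and exclusive locks can be sketched as a small lock table. This is an illustrative model, not any DBMS's implementation: the class `SXLockManager`, its `request` and `release` methods and the "S"/"X" mode labels are assumptions, and a refused request simply returns False instead of being queued.

```python
class SXLockManager:
    """Tracks, per data item, the lock mode and the transactions holding it."""

    def __init__(self):
        self.holders = {}    # item -> (mode, set of transactions)

    def request(self, txn, item, mode):
        """mode is 'S' (shared) or 'X' (exclusive). True means granted."""
        held = self.holders.get(item)
        if held is None:                        # unlocked: grant anything
            self.holders[item] = (mode, {txn})
            return True
        held_mode, txns = held
        if mode == "S" and held_mode == "S":    # readers share the lock
            txns.add(txn)
            return True
        if txns == {txn}:                       # sole holder: keep or upgrade
            new_mode = "X" if "X" in (mode, held_mode) else "S"
            self.holders[item] = (new_mode, txns)
            return True
        return False                            # conflict: txn must wait

    def release(self, txn, item):
        held = self.holders.get(item)
        if held is None:
            return
        _, txns = held
        txns.discard(txn)
        if not txns:                            # last holder gone: unlocked
            del self.holders[item]

lm = SXLockManager()
print(lm.request("T1", "X", "S"))   # True: first reader
print(lm.request("T2", "X", "S"))   # True: second reader shares the lock
print(lm.request("T2", "X", "X"))   # False: T1 still holds a shared lock
lm.release("T1", "X")
print(lm.request("T2", "X", "X"))   # True: upgrade once T2 is the only holder
```

The third request is refused because an exclusive lock is granted only when no other transaction holds any lock on the item.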

12.3.3 Two-Phase Locking to Ensure Serialisability

Two-phase locking defines how transactions acquire and relinquish locks. Two-phase locking guarantees serialisability, but it does not prevent deadlocks. The two phases are:

1 A growing phase, in which a transaction acquires all required locks without unlocking any data. Once all locks have been acquired, the transaction is in its locked point.

2 A shrinking phase, in which a transaction releases all locks and cannot obtain any new lock.

The two-phase locking protocol is governed by the following rules:

Two transactions cannot have conflicting locks.

No unlock operation can precede a lock operation in the same transaction.

No data are affected until all locks are obtained, that is, until the transaction is in its locked point.

Figure 12.7 depicts the two-phase locking protocol.


FIGURE 12.7 Two-phase locking protocol
[Figure: timeline of operations 1 to 8 from Start to End. During the growing phase the transaction acquires its two locks and reaches the locked point; the locked phase follows; during the shrinking phase the transaction releases the two locks.]

In this example, the transaction acquires all of the locks it needs until it reaches its locked point. (In this example, the transaction requires two locks.) When the locked point is reached, the data are modified to conform to the transaction requirements. Finally, the transaction is completed as it releases all of the locks it acquired in the first phase.

Two-phase locking increases the transaction processing cost and may cause additional undesirable effects. One undesirable effect is the possibility of creating deadlocks.
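Whether a single transaction's sequence of lock and unlock operations respects the two phases can be checked mechanically. The function below is a sketch under simplifying assumptions: it inspects one transaction at a time, ignores lock modes, and its name `is_two_phase` is hypothetical.

```python
def is_two_phase(ops):
    """ops is a list of ('lock' | 'unlock', item) pairs for one transaction."""
    shrinking = False                # becomes True after the first unlock
    held = set()
    for action, item in ops:
        if action == "lock":
            if shrinking:
                return False         # acquiring a lock after a release
            held.add(item)
        elif action == "unlock":
            if item not in held:
                return False         # unlock without a preceding lock
            held.remove(item)
            shrinking = True
    return True

# Growing phase (two locks), locked point, then shrinking phase.
print(is_two_phase([("lock", "X"), ("lock", "Y"),
                    ("unlock", "X"), ("unlock", "Y")]))   # True

# A new lock is requested after X has been released.
print(is_two_phase([("lock", "X"), ("unlock", "X"),
                    ("lock", "Y"), ("unlock", "Y")]))     # False
```

The check covers only the shape of the lock sequence; the rule that two transactions cannot hold conflicting locks is the lock manager's job.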

12.3.4 Deadlocks

A deadlock occurs when two transactions wait for each other to unlock data. For example, a deadlock occurs when two transactions, T1 and T2, exist in the following mode:

T1 = access data items X and Y

T2 = access data items Y and X

If T1 has not unlocked data item Y, T2 cannot begin; if T2 has not unlocked data item X, T1 cannot continue. Consequently, T1 and T2 wait indefinitely, each waiting for the other to unlock the required data item. Such a deadlock is also known as a deadly embrace. Table 12.11 demonstrates how a deadlock condition is created.


TABLE 12.11 How a deadlock condition is created

                                   Lock Status
Time  Transaction   Reply   Data X     Data Y
0                           Unlocked   Unlocked
1     T1:LOCK(X)    OK      Locked     Unlocked
2     T2:LOCK(Y)    OK      Locked     Locked
3     T1:LOCK(Y)    WAIT    Locked     Locked
4     T2:LOCK(X)    WAIT    Locked     Locked
5     T1:LOCK(Y)    WAIT    Locked     Locked
6     T2:LOCK(X)    WAIT    Locked     Locked
7     T1:LOCK(Y)    WAIT    Locked     Locked
8     T2:LOCK(X)    WAIT    Locked     Locked
9     T1:LOCK(Y)    WAIT    Locked     Locked
...   ...           ...     ...        ...

From time 3 onward the two transactions are in DEADLOCK: each WAIT reply repeats indefinitely.

The preceding example used only two concurrent transactions to demonstrate a deadlock condition. In a real-world DBMS, many more transactions can be executed simultaneously, thereby increasing the probability of generating deadlocks. Note that deadlocks are possible only when one of the transactions wants to obtain an exclusive lock on a data item; no deadlock condition can exist among shared locks.

The three basic techniques to control deadlocks are:

Deadlock prevention. A transaction requesting a new lock is aborted when there is the possibility that a deadlock can occur. If the transaction is aborted, all changes made by this transaction are rolled back and all locks obtained by the transaction are released. The transaction is then rescheduled for execution. Deadlock prevention works because it avoids the conditions that lead to deadlocking.

Deadlock detection. The DBMS periodically tests the database for deadlocks. If a deadlock is found, one of the transactions (the victim) is aborted (rolled back and restarted) and the other transaction continues.

Deadlock avoidance. The transaction must obtain all of the locks it needs before it can be executed. This technique avoids the rollback of conflicting transactions by requiring that locks be obtained in succession. However, the serial lock assignment required in deadlock avoidance increases action response times.

The choice of the best deadlock control method depends on the database environment. For example, if the probability of deadlocks is low, deadlock detection is recommended. However, if the probability of deadlocks is high, deadlock prevention is recommended. If response time is not high on the system's priority list, deadlock avoidance might be employed.

duplicated, learning

in experience.

whole

or in Cengage

part.

Due Learning

to

electronic reserves

rights, the

right

some to

third remove

party additional

content

may content

be

suppressed at

any

time

from if

the

subsequent

eBook rights

and/or restrictions

eChapter(s). require

it
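The detection technique described above can be sketched with a wait-for structure. This is only an illustration under stated assumptions: the text does not say how a DBMS tests for deadlocks, and the `wait_for` mapping and the `find_deadlock` function are hypothetical names introduced here. Each waiting transaction is mapped to the transaction that holds the lock it wants; a cycle in that mapping corresponds to the situation in Table 12.11.

```python
def find_deadlock(wait_for):
    """Return the transactions that form a waiting cycle, or None.

    wait_for is a hypothetical mapping: waiting transaction -> transaction
    that currently holds the lock it is waiting for.
    """
    for start in wait_for:
        seen = [start]
        current = wait_for.get(start)
        # Follow the chain of "waits for" links until it ends or repeats.
        while current is not None and current not in seen:
            seen.append(current)
            current = wait_for.get(current)
        if current == start:
            return seen  # the chain came back to where it started
    return None


# Table 12.11 from time 4 onwards: T1 waits for Y (held by T2),
# and T2 waits for X (held by T1).
print(find_deadlock({"T1": "T2", "T2": "T1"}))
# A single waiting transaction is not a deadlock.
print(find_deadlock({"T1": "T2"}))
```

Once a cycle is found, the DBMS would pick one of the listed transactions as the victim; how the victim is chosen is not covered by this sketch.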

12.4 CONCURRENCY CONTROL WITH TIME STAMPING METHODS

The time stamping approach to scheduling concurrent transactions assigns a global, unique time stamp to each transaction. The time stamp value produces an explicit order in which transactions are submitted to the DBMS. Time stamps must have two properties: uniqueness and monotonicity. Uniqueness ensures that no equal time stamp values can exist, and monotonicity⁶ ensures that time stamp values always increase.

All database operations (Read and Write) within the same transaction must have the same time stamp. The DBMS executes conflicting operations in time stamp order, thereby ensuring serialisability of the transactions. If two transactions conflict, one is stopped, rolled back, rescheduled and assigned a new time stamp value.

The disadvantage of the time stamping approach is that each value stored in the database requires two additional time stamp fields: one for the last time the field was read and one for the last update. Time stamping thus increases memory needs and the database's processing overhead. Time stamping tends to demand considerable system resources because many transactions may have to be stopped, rescheduled and restamped.
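The following minimal sketch shows one way the two time stamp properties and the two per-value time stamp fields could look in code. It is an assumption-laden illustration, not the method of any particular DBMS: `TimeStampSource`, `DataItem`, `try_read` and `try_write` are hypothetical names, and the accept/reject rule used is the classic basic time stamp ordering rule, which the text above does not spell out.

```python
import itertools
import threading
from dataclasses import dataclass


class TimeStampSource:
    """Hands out unique, always-increasing time stamps (hypothetical helper)."""

    def __init__(self):
        self._counter = itertools.count(1)
        self._lock = threading.Lock()

    def next_stamp(self):
        with self._lock:  # one caller at a time, so no value is issued twice
            return next(self._counter)


@dataclass
class DataItem:
    read_ts: int = 0   # time stamp of the youngest transaction that read the value
    write_ts: int = 0  # time stamp of the youngest transaction that updated it


def try_read(item, ts):
    """Return False if the reading transaction must be rolled back."""
    if ts < item.write_ts:  # a younger transaction already updated the value
        return False
    item.read_ts = max(item.read_ts, ts)
    return True


def try_write(item, ts):
    """Return False if the writing transaction must be rolled back."""
    if ts < item.read_ts or ts < item.write_ts:
        return False
    item.write_ts = ts
    return True
```

A transaction whose `try_read` or `try_write` call returns False would be stopped, rolled back and rescheduled, as the section describes.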

12.4.1 Wait/Die and Wound/Wait Schemes

You have learnt that time stamping methods are used to manage concurrent transaction execution. In this section, you will learn about two schemes used to decide which transaction is rolled back and which continues executing: the wait/die scheme and the wound/wait scheme.⁷ An example illustrates the difference. Assume that you have two conflicting transactions, T1 and T2, each with a unique time stamp. Suppose T1 has a time stamp of 11548789 and T2 has a time stamp of 19562545. You can deduce from the time stamps that T1 is the older transaction (the lower time stamp value) and T2 is the newer transaction. Given that scenario, the four possible outcomes are shown in Table 12.12.

TABLE 12.12 Wait/die and wound/wait concurrency control schemes

Transaction requesting lock | Transaction owning lock | Wait/die scheme | Wound/wait scheme
T1 (11548789) | T2 (19562545) | T1 waits until T2 is completed and T2 releases its locks. | T1 pre-empts (rolls back) T2. T2 is rescheduled using the same time stamp.
T2 (19562545) | T1 (11548789) | T2 dies (rolls back). T2 is rescheduled using the same time stamp. | T2 waits until T1 is completed and T1 releases its locks.

Using the wait/die scheme:

• If the transaction requesting the lock is the older of the two transactions, it will wait until the other transaction is completed and the locks are released.

• If the transaction requesting the lock is the younger of the two transactions, it will die (roll back) and is rescheduled using the same time stamp.

In short, in the wait/die scheme, the older transaction waits and the younger is rolled back and rescheduled.

In the wound/wait scheme:

• If the transaction requesting the lock is the older of the two transactions, it will pre-empt (wound) the younger transaction (by rolling it back). T1 pre-empts T2 when T1 rolls back T2. The younger, pre-empted transaction is rescheduled using the same time stamp.

• If the transaction requesting the lock is the younger of the two transactions, it will wait until the other transaction is completed and the locks are released.

In short, in the wound/wait scheme, the older transaction rolls back the younger transaction and reschedules it.

In both schemes, one of the transactions waits for the other transaction to finish and release the locks. However, in many cases, a transaction requests multiple locks. How long does a transaction have to wait for each lock request? Obviously, that scenario can cause some transactions to wait indefinitely, causing a deadlock. To prevent that type of deadlock, each lock request has an associated time-out value. If the lock is not granted before the time-out expires, the transaction is rolled back.

6 The term monotonicity is part of the standard concurrency control vocabulary. The authors' first introduction to this term and its proper use was in an article written by W.H. Kohler, 'A survey of techniques for synchronization and recovery in decentralized computer systems', Computer Surveys 3(2), June 1981, pp. 149-283.

7 The procedure was first described by R.E. Stearns and P.M. Lewis II in 'System-level concurrency control for distributed database systems', ACM Transactions on Database Systems, 2, June 1978, pp. 178-98.
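The four outcomes in Table 12.12 can be expressed as a small decision function. This is a sketch for illustration only; the function name `resolve` and the outcome strings are assumptions introduced here, and the only rule used is the one stated in the text: the lower time stamp belongs to the older transaction.

```python
def resolve(scheme, requester_ts, owner_ts):
    """Decide what happens when a transaction requests a lock that another
    transaction owns. The lower time stamp marks the older transaction."""
    requester_is_older = requester_ts < owner_ts
    if scheme == "wait/die":
        # Older requester waits; younger requester dies and is rescheduled.
        return "requester waits" if requester_is_older else "requester dies"
    if scheme == "wound/wait":
        # Older requester wounds (rolls back) the owner; younger requester waits.
        return "owner is rolled back" if requester_is_older else "requester waits"
    raise ValueError(f"unknown scheme: {scheme}")


T1, T2 = 11548789, 19562545  # time stamps used in the example above
for scheme in ("wait/die", "wound/wait"):
    print(scheme, "| T1 requests, T2 owns:", resolve(scheme, T1, T2))
    print(scheme, "| T2 requests, T1 owns:", resolve(scheme, T2, T1))
```

In both schemes the transaction that is rolled back is always the younger one, which is why it keeps its original time stamp when it is rescheduled: it eventually becomes the oldest and cannot be rolled back forever.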

12.5 CONCURRENCY CONTROL WITH OPTIMISTIC METHODS

The optimistic approach is based on the assumption that the majority of the database operations do not conflict. The optimistic approach does not require locking or time stamping techniques. Instead, a transaction is executed without restrictions until it is committed. Using an optimistic approach, each transaction moves through two or three phases. The phases are Read, Validation and Write.⁸

• During the Read phase, the transaction reads the database, executes the needed computations and makes the updates to a private copy of the database values. All update operations of the transaction are recorded in a temporary update file, which is not accessed by the remaining transactions.

• During the Validation phase, the transaction is validated to ensure that the changes made will not affect the integrity and consistency of the database. If the validation test is positive, the transaction goes to the Write phase. If the validation test is negative, the transaction is restarted and the changes are discarded.

• During the Write phase, the changes are permanently applied to the database.

The optimistic approach is acceptable for most read or query database systems that require few update transactions.

In a heavily used DBMS environment, the management of deadlocks (their prevention and detection) constitutes an important DBMS function. The DBMS will use one or more of the techniques discussed, as well as variations on those techniques. However, the deadlock is sometimes worse than the disease that locks are supposed to cure. Therefore, it may be necessary to employ database recovery techniques to restore the database to a consistent state.

8 The optimistic approach to concurrency control is described in an article by H.T. Kung and J.T. Robinson, 'Optimistic methods for concurrency control', ACM Transactions on Database Systems 6(2), June 1981, pp. 213-26. Even the most current software is built on conceptual standards that were developed more than two decades ago.
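The Read, Validation and Write phases can be sketched as follows. This is a toy model under explicit assumptions: the class name `OptimisticTransaction` is hypothetical, the shared database is a plain dictionary, and validation is done by comparing per-value version numbers, which is one common way to implement the validation test but is not prescribed by the text.

```python
class OptimisticTransaction:
    """Toy optimistic transaction over a shared dict: key -> (value, version)."""

    def __init__(self, db):
        self.db = db
        self.read_versions = {}  # version of each value when it was first read
        self.private = {}        # private copy holding this transaction's updates

    def read(self, key):
        # Read phase: take values from the private copy or from the database.
        if key in self.private:
            return self.private[key]
        value, version = self.db[key]
        self.read_versions.setdefault(key, version)
        return value

    def write(self, key, value):
        # Updates go to the private copy only, never directly to the database.
        self.private[key] = value

    def commit(self):
        # Validation phase: fail if any value read has changed since the read.
        for key, version in self.read_versions.items():
            if self.db[key][1] != version:
                self.private.clear()  # changes are discarded; caller restarts
                return False
        # Write phase: apply the private changes permanently.
        for key, value in self.private.items():
            old_version = self.db.get(key, (None, 0))[1]
            self.db[key] = (value, old_version + 1)
        return True


db = {"PROD_QOH": (45, 1)}
a = OptimisticTransaction(db)
b = OptimisticTransaction(db)
a.write("PROD_QOH", a.read("PROD_QOH") - 2)
b.write("PROD_QOH", b.read("PROD_QOH") - 1)
print(a.commit(), b.commit(), db)
```

The second transaction fails validation because the value it read was changed by the first one in the meantime; in a real system it would be restarted.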

12.6 ANSI LEVELS OF TRANSACTION ISOLATION

The ANSI SQL standard (1992) defines transaction management based on transaction isolation levels. Transaction isolation levels refer to the degree to which transaction data is protected or isolated from other concurrent transactions. The isolation levels are described based on what data other transactions can see (read) during execution. More precisely, the transaction isolation levels are described by the type of reads that a transaction allows or not. The types of read operations are:

• Dirty read: A transaction can read data that is not yet committed.

• Non-repeatable read: A transaction reads a given row at time t1, and then it reads the same row at time t2, yielding different results. The original row may have been updated or deleted.

• Phantom read: A transaction executes a query at time t1, and then it runs the same query at time t2, yielding additional rows that satisfy the query.

Based on the above operations, ANSI defined four levels of transaction isolation: Read Uncommitted, Read Committed, Repeatable Read and Serialisable. Table 12.13 shows the four ANSI transaction isolation levels. The table also shows an additional level of isolation provided by Oracle and MS SQL Server databases.

Read Uncommitted will read uncommitted data from other transactions. At this isolation level, the database does not place any locks on the data, which increases transaction performance but at the cost of data consistency. Read Committed forces transactions to read only committed data. This is the default mode of operation for most databases (including Oracle and SQL Server). At this level, the database will use exclusive locks on data, causing other transactions to wait until the original transaction commits. The Repeatable Read isolation level ensures that queries return consistent results. This type of isolation level uses shared locks to ensure that other transactions do not update a row after the original query reads it. However, new rows are read (phantom read) as these rows did not exist when the first query ran. The Serialisable isolation level is the most restrictive level defined by the ANSI SQL standard. However, it is important to note that even with a Serialisable isolation level, deadlocks are always possible. Most databases use a deadlock detection approach to transaction management and, therefore, they will detect deadlocks during the transaction validation phase and reschedule the transaction.

The reason for the different levels of isolation is to increase transaction concurrency. The isolation levels go from the least restrictive (Read Uncommitted) to the more restrictive (Serialisable). The higher the isolation level, the more locks (shared and exclusive) are required to improve data consistency, at the expense of transaction concurrency performance.

The isolation level of a transaction is defined in the transaction statement, for example, using general ANSI SQL syntax:

    BEGIN TRANSACTION ISOLATION LEVEL READ COMMITTED
    ... SQL STATEMENTS ...
    COMMIT TRANSACTION;

Oracle and MS SQL Server use the SET TRANSACTION ISOLATION LEVEL statement to define the level of isolation. SQL Server supports all four ANSI isolation levels. Oracle by default provides consistent statement-level reads to ensure Read Committed and Repeatable Read transactions. MySQL uses START TRANSACTION WITH CONSISTENT SNAPSHOT to provide transactions with consistent reads; that is, the transaction can only see the committed data at the time the transaction started.

As you can see from the previous discussion, transaction management is a complex subject and databases make use of various techniques to manage the concurrent execution of transactions. However, it may be necessary sometimes to employ database recovery techniques to restore the database to a consistent state.

TABLE 12.13 Transaction isolation levels

Isolation level | Dirty read allowed | Non-repeatable read allowed | Phantom read allowed | Comment
Read Uncommitted | Y | Y | Y | The transaction reads uncommitted data, allows non-repeatable reads and phantom reads.
Read Committed | N | Y | Y | Does not allow uncommitted data reads but allows non-repeatable reads and phantom reads.
Repeatable Read | N | N | Y | Only allows phantom reads.
Serialisable | N | N | N | Does not allow dirty reads, non-repeatable reads or phantom reads.
Read Only/Snapshot (Oracle/SQL Server only) | N | N | N | Supported by Oracle and SQL Server. The transaction can only see the changes that were committed at the time the transaction started.

The four ANSI levels are listed from less restrictive (Read Uncommitted) to more restrictive (Serialisable).
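To make the difference between the first two isolation levels concrete, here is a deliberately simplified toy model. It is an assumption-based sketch, not how any real DBMS implements isolation (real systems use locks or row versions): `ToyStore` and its methods are hypothetical names, and only the dirty read phenomenon is modelled.

```python
class ToyStore:
    """Keeps committed values apart from another transaction's pending writes."""

    def __init__(self, data):
        self.committed = dict(data)
        self.pending = {}  # uncommitted changes made by some other transaction

    def write_uncommitted(self, key, value):
        self.pending[key] = value

    def commit(self):
        self.committed.update(self.pending)
        self.pending.clear()

    def rollback(self):
        self.pending.clear()

    def read(self, key, level):
        # Read Uncommitted may see pending (dirty) data; stricter levels may not.
        if level == "READ UNCOMMITTED" and key in self.pending:
            return self.pending[key]
        return self.committed[key]


store = ToyStore({"CUST_BALANCE": 0.00})
store.write_uncommitted("CUST_BALANCE", 277.55)   # not yet committed
print(store.read("CUST_BALANCE", "READ UNCOMMITTED"))  # sees the dirty value
print(store.read("CUST_BALANCE", "READ COMMITTED"))    # sees only committed data
store.rollback()
print(store.read("CUST_BALANCE", "READ UNCOMMITTED"))  # the dirty value is gone
```

The reader at Read Uncommitted acted on a balance that never became permanent, which is the data consistency cost the section describes.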

12.7 DATABASE RECOVERY MANAGEMENT

Database recovery restores a database from a given state, usually inconsistent, to a previously consistent state. Recovery techniques are based on the atomic transaction property: all portions of the transaction must be treated as a single, logical unit of work in which all operations are applied and completed to produce a consistent database. If, for some reason, any transaction operation cannot be completed, the transaction must be aborted and any changes to the database must be rolled back (undone). In short, transaction recovery reverses all of the changes that the transaction made to the database before the transaction was aborted.

Although this chapter has emphasised the recovery of transactions, recovery techniques also apply to the database and to the system after some type of critical error has occurred. Critical events can cause a database to stop working and compromise the integrity of the data. Examples of critical events are:

• Hardware/software failures. A failure of this type could be a hard disk media failure, a bad capacitor on a motherboard or a failing memory bank. Other causes of errors under this category include application program or operating system errors that cause data to be overwritten, deleted or lost. Some database administrators argue that this is one of the most common sources of database problems.

• Human-caused incidents. This type of event can be categorised as unintentional or intentional.
  - An unintentional failure is caused by a careless end user. Such errors include deleting the wrong rows from a table, pressing the wrong key on the keyboard or shutting down the main database server by accident.
  - Intentional events are of a more severe nature and normally indicate that the company data are at serious risk. Under this category are security threats caused by hackers trying to gain unauthorised access to data resources and virus attacks caused by disgruntled employees trying to compromise the database operation and damage the company.

• Natural disasters. This category includes fires, earthquakes, floods and power failures.

Whatever the cause, a critical error can render the database into an inconsistent state. The following section introduces the various techniques used to recover the database from an inconsistent to a consistent state.

12.7.1 Transaction Recovery

In Section 12.1.4, you learnt about the transaction log and how it contains data for database recovery purposes. Database transaction recovery focuses on the different methods used to recover a database from an inconsistent to a consistent state by using the data in the transaction log.

Before continuing, let's examine four important concepts that affect the recovery process:

• The write-ahead-log protocol. This protocol ensures that transaction logs are always written before any database data are actually updated. This protocol ensures that, in case of a failure, the database can later be recovered to a consistent state, using the data in the transaction log.

• Redundant transaction logs. Most DBMSs keep several copies of the transaction log to ensure that a physical disk failure will not impair the DBMS's ability to recover data.

• Database buffers. A buffer is a temporary storage area in primary memory used to speed up disk operations. To improve processing time, the DBMS software reads the data from the physical disk and stores a copy of it on a buffer in primary memory. When a transaction updates data, it actually updates the copy of the data in the buffer because that process is much faster than accessing the physical disk every time. Later on, all buffers that contain updated data are written to a physical disk during a single operation, thereby saving significant processing time.

• Database checkpoints. A database checkpoint is an operation in which the DBMS writes all of its updated buffers to disk. While this is happening, the DBMS does not execute any other requests. A checkpoint operation is also registered in the transaction log. As a result of this operation, the physical database and the transaction log will be in sync. This synchronisation is required because update operations update the copy of the data in the buffers and not in the physical database. Checkpoints are automatically scheduled by the DBMS several times per hour. As you will see next, checkpoints also play an important role in transaction recovery.
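The interplay of the write-ahead-log protocol, database buffers and checkpoints described above can be sketched with a toy model. Everything here is a simplifying assumption made for illustration: `ToyDBMS` is a hypothetical class, the disk, buffer and log are plain Python containers, and the log record layout is invented for this sketch.

```python
class ToyDBMS:
    """Toy model: log first, then buffer; disk changes only at a checkpoint."""

    def __init__(self, disk):
        self.disk = dict(disk)  # stands in for the physical database
        self.buffer = {}        # updated copies held in primary memory
        self.log = []           # stands in for the transaction log

    def update(self, trx, key, value):
        before = self.buffer.get(key, self.disk.get(key))
        # Write-ahead log: the log record is written before the data change.
        self.log.append(("UPDATE", trx, key, before, value))
        # The update touches the buffered copy, not the physical disk.
        self.buffer[key] = value

    def checkpoint(self):
        # All updated buffers are written to disk in a single operation,
        # and the checkpoint itself is registered in the log.
        self.disk.update(self.buffer)
        self.buffer.clear()
        self.log.append(("CHECKPOINT",))


dbms = ToyDBMS({"PROD_QOH": 45})
dbms.update(101, "PROD_QOH", 43)
print(dbms.disk, dbms.log)  # disk still holds 45; the log already has 45 -> 43
dbms.checkpoint()
print(dbms.disk, dbms.log)  # disk and log are now in sync
```

Between the update and the checkpoint, the log is the only durable record of the change, which is why it must be written first.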

The database recovery process involves bringing the database to a consistent state after a failure. Transaction recovery procedures generally make use of deferred-write and write-through techniques.

When the recovery procedure uses deferred write or deferred update, the transaction operations do not immediately update the physical database. Instead, only the transaction log is updated. The database is physically updated only after the transaction reaches its commit point, using information from the transaction log. If the transaction aborts before it reaches its commit point, no changes (no ROLLBACK or undo) need to be made to the database because the database was never updated. The recovery process for all started and committed transactions (before the failure) follows these steps:

1 Identify the last checkpoint in the transaction log. This is the last time transaction data were physically saved to disk.

2 For a transaction that started and committed before the last checkpoint, nothing needs to be done, because the data are already saved.

3 For a transaction that performed a COMMIT operation after the last checkpoint, the DBMS uses the transaction log records to redo the transaction and to update the database, using the after values in the transaction log. The changes are made in ascending order, from oldest to newest.

4 For any transaction that had a ROLLBACK operation after the last checkpoint or that was left active (with neither a COMMIT nor a ROLLBACK) before the failure occurred, nothing needs to be done because the database was never updated.

When the recovery procedure uses write-through or immediate update, the database is immediately updated by transaction operations during the transaction's execution, even before the transaction reaches its commit point. If the transaction aborts before it reaches its commit point, a ROLLBACK or undo operation needs to be done to restore the database to a consistent state. In that case, the ROLLBACK operation will use the transaction log before values. The recovery process follows these steps:

1 Identify the last checkpoint in the transaction log. This is the last time transaction data were physically saved to disk.

2 For a transaction that started and committed before the last checkpoint, nothing needs to be done because the data are already saved.

3 For a transaction that committed after the last checkpoint, the DBMS redoes the transaction, using the after values of the transaction log. Changes are applied in ascending order, from oldest to newest.

4 For any transaction that had a ROLLBACK operation after the last checkpoint or that was left active (with neither a COMMIT nor a ROLLBACK) before the failure occurred, the DBMS uses the transaction log records to ROLLBACK or undo the operations, using the before values in the transaction log. Changes are applied in reverse order, from newest to oldest.

Use the transaction log in Table 12.14 to trace a simple database recovery process. To make sure you understand the recovery process, a simple transaction log is used that includes three transactions and one checkpoint. This transaction log includes the transaction components used earlier in the chapter, so you should already be familiar with the basic process.
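The four write-through recovery steps can be sketched in a few lines. This is a minimal illustration under loud assumptions: the function name `recover_write_through` and the dictionary-based log record layout are hypothetical, and the sketch assumes that, after the last checkpoint, no two transactions touch the same data item (a real DBMS has to handle such interleavings more carefully).

```python
def recover_write_through(log, db):
    """Apply steps 1-4 of write-through recovery to db (a dict) and return it.

    Each log record is a dict with an "op" key; UPDATE records also carry
    "trx", "key", "before" and "after" (hypothetical layout).
    """
    # Step 1: position of the last checkpoint (-1 if there is none).
    last_cp = max((i for i, rec in enumerate(log) if rec["op"] == "CHECKPOINT"),
                  default=-1)
    # How and where each transaction ended, if it ended at all.
    ended = {rec["trx"]: (rec["op"], i) for i, rec in enumerate(log)
             if rec["op"] in ("COMMIT", "ROLLBACK")}

    # Step 2 needs no code: work committed before the checkpoint is skipped.
    # Step 3: redo, oldest to newest, transactions committed after the checkpoint.
    for rec in log:
        if rec["op"] == "UPDATE":
            end = ended.get(rec["trx"])
            if end and end[0] == "COMMIT" and end[1] > last_cp:
                db[rec["key"]] = rec["after"]

    # Step 4: undo, newest to oldest, transactions rolled back after the
    # checkpoint or still active at the time of the failure.
    for rec in reversed(log):
        if rec["op"] == "UPDATE":
            end = ended.get(rec["trx"])
            if end is None or (end[0] == "ROLLBACK" and end[1] > last_cp):
                db[rec["key"]] = rec["before"]
    return db


log = [
    {"op": "START", "trx": 1},
    {"op": "UPDATE", "trx": 1, "key": "A", "before": 10, "after": 8},
    {"op": "COMMIT", "trx": 1},
    {"op": "CHECKPOINT"},
    {"op": "START", "trx": 2},
    {"op": "UPDATE", "trx": 2, "key": "B", "before": 0, "after": 50},
    {"op": "COMMIT", "trx": 2},
    {"op": "START", "trx": 3},
    {"op": "UPDATE", "trx": 3, "key": "C", "before": 6, "after": 26},
]  # failure happens here: transaction 3 never committed
print(recover_write_through(log, {"A": 8, "B": 0, "C": 26}))
```

For deferred update, the same structure applies with the undo loop removed, because uncommitted changes never reach the database.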

TABLE 12.14 A transaction log for transaction recovery examples

TRL ID | TRX NUM | PREV PTR | NEXT PTR | Operation | Table | Row ID | Attribute | Before Value | After Value
341 | 101 | Null | 352 | START | **** Start Transaction | | | |
352 | 101 | 341 | 363 | UPDATE | PRODUCT | 54778-2T | PROD_QOH | 45 | 43
363 | 101 | 352 | 365 | UPDATE | CUSTOMER | 10011 | CUST_BALANCE | 615.73 | 675.62
365 | 101 | 363 | Null | COMMIT | **** End of Transaction | | | |
397 | 106 | Null | 405 | START | **** Start Transaction | | | |
405 | 106 | 397 | 415 | INSERT | INVOICE | 1009 | | | 1009,10016,...
415 | 106 | 405 | 419 | INSERT | LINE | 1009,1 | | | 1009,1,89-WRE-Q,1,...
419 | 106 | 415 | 427 | UPDATE | PRODUCT | 89-WRE-Q | PROD_QOH | 12 | 11
423 | | | | CHECKPOINT | | | | |
427 | 106 | 419 | 431 | UPDATE | CUSTOMER | 10016 | CUST_BALANCE | 0.00 | 277.55
431 | 106 | 427 | 457 | INSERT | ACCT_TRANSACTION | 10007 | | | 1007,18-JAN-2014,...
457 | 106 | 431 | Null | COMMIT | **** End of Transaction | | | |
521 | 155 | Null | 525 | START | **** Start Transaction | | | |
525 | 155 | 521 | 528 | UPDATE | PRODUCT | 2232/QWE | PROD_QOH | 6 | 26
528 | 155 | 525 | Null | COMMIT | **** End of Transaction | | | |

* * * * * C*R*A*S*H * * * * *

Given the transaction, the transaction log has the following characteristics:

• Transaction 101 consists of two UPDATE statements that reduce the quantity on hand for product 54778-2T and increase the customer balance for customer 10011 for a credit sale of two units of product 54778-2T.

• Transaction 106 is the same credit sales event you saw in Section 12.1.1. This transaction represents the credit sale of one unit of product 89-WRE-Q to customer 10016 in the amount of 277.55. This transaction consists of five SQL DML statements (three INSERT statements and two UPDATE statements).

• Transaction 155 represents a simple inventory update. This transaction consists of one UPDATE statement that increases the quantity on hand of product 2232/QWE from 6 units to 26 units.

• A database checkpoint writes all updated database buffers to disk. The checkpoint event writes only the changes for all previously committed transactions. In this case, the checkpoint applies all changes done by transaction 101 to the database data files.

Using Table 12.14, you can now trace the database recovery process for a DBMS, using the deferred update method as follows:

1 Identify the last checkpoint. In this case, the last checkpoint was TRL ID 423. This was the last time database buffers were physically written to disk.

2 Note that transaction 101 started and finished before the last checkpoint. Therefore, all changes were already written to disk, and no additional action needs to be taken.

3 For each transaction that committed after the last checkpoint (TRL ID 423), the DBMS uses the transaction log data to write the changes to disk, using the after values. For example, for transaction 106:
  a Find COMMIT (TRL ID 457).
  b Use the previous pointer values to locate the start of the transaction (TRL ID 397).
  c Use the next pointer values to locate each DML statement and apply the changes to disk, using the after values. (Start with TRL ID 405, then 415, 419, 427 and 431.) Remember that TRL ID 457 was the COMMIT statement for this transaction.
  d Repeat the process for transaction 155.

4 Any other transactions are ignored. Therefore, for transactions that ended with ROLLBACK or that were left active (do not end with a COMMIT or ROLLBACK), nothing is done because no changes were written to disk.
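The pointer-following trace in step 3 can be reproduced with a short sketch. The `LOG` dictionary below copies only the TRL ID, transaction number, pointer and operation columns of Table 12.14; the function name `redo_plan` and the tuple layout are assumptions made for this illustration, and the sketch assumes the log contains at least one checkpoint.

```python
# TRL ID -> (TRX NUM, PREV PTR, NEXT PTR, operation), copied from Table 12.14.
LOG = {
    341: (101, None, 352, "START"),
    352: (101, 341, 363, "UPDATE"),
    363: (101, 352, 365, "UPDATE"),
    365: (101, 363, None, "COMMIT"),
    397: (106, None, 405, "START"),
    405: (106, 397, 415, "INSERT"),
    415: (106, 405, 419, "INSERT"),
    419: (106, 415, 427, "UPDATE"),
    423: (None, None, None, "CHECKPOINT"),
    427: (106, 419, 431, "UPDATE"),
    431: (106, 427, 457, "INSERT"),
    457: (106, 431, None, "COMMIT"),
    521: (155, None, 525, "START"),
    525: (155, 521, 528, "UPDATE"),
    528: (155, 525, None, "COMMIT"),
}


def redo_plan(log):
    """Return {transaction: [TRL IDs to redo, oldest first]} for deferred update."""
    # Step 1: identify the last checkpoint.
    last_checkpoint = max(i for i, rec in log.items() if rec[3] == "CHECKPOINT")
    plan = {}
    for trl_id in sorted(log):
        trx, _prev, _next, op = log[trl_id]
        # Steps 2 and 4: skip everything except COMMITs after the checkpoint.
        if op != "COMMIT" or trl_id < last_checkpoint:
            continue
        # Step 3b: follow the previous pointers back to the START record.
        start = trl_id
        while log[start][1] is not None:
            start = log[start][1]
        # Step 3c: follow the next pointers, collecting each DML statement.
        steps = []
        current = log[start][2]
        while current is not None and log[current][3] != "COMMIT":
            steps.append(current)
            current = log[current][2]
        plan[trx] = steps
    return plan


print(redo_plan(LOG))
```

Transaction 101 does not appear in the plan because it committed before the checkpoint at TRL ID 423, matching step 2 of the trace.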

SUMMARY

A transaction is a sequence of database operations that access the database. A transaction represents real-world events. A transaction must be a logical unit of work; that is, no portion of the transaction can exist by itself. Either all parts are executed, or the transaction is aborted. A transaction takes a database from one consistent state to another. A consistent database state is one in which all data integrity constraints are satisfied.

Transactions have five main properties: atomicity (all parts of the transaction are executed; otherwise, the transaction is aborted), consistency (maintaining the permanence of the database's consistent state), isolation (data being used by one transaction cannot be accessed by another

CHAPTER

transaction cannot

until the first transaction

be rolled

concurrent

back

once

execution

serial

order).

SQL

provides

changes

disk)

for

and

SQL transactions request

keeps

Concurrency

control

execution

The

scheduler are

scheduler

of locking:

when

Serialisability

When two

or

a deadlock,

Concurrency

of the

control


COMMIT

(saves

in

requests.

Each

database

database.

The information

purposes. The

updates,

concurrent

uncommitted

data

and

which

order is

stamping

the

concurrent

critical

and

and

transaction

ensures

optimistic

database

methods

are

integrity

used

by the

by a transaction.

another

The lock

transaction

is

using

prevents

it.

There

one

are

several

levels

database

states:

systems:

1 (locked)

binary

locks

and

or 0 (unlocked).

a database

shared/exclusive

A shared

lock

and no other transaction

wants to

update

(write

a particular

to) the

locks. used

is updating

item.

database

is

the

An exclusive

and

no other

when

lock is

locks

(shared

or

data.

is

guaranteed phase,

any

data,

through

in

and

which

the

the

use of two-phase

transaction

a shrinking

phase,

locking.

acquires

in

all

which

the

to

release

The two-phase

of the locks

transaction

that

it

releases

all of

new locks. wait indefinitely

embrace.

There

for

are three

each

other

deadlock

control

a lock,

they

techniques:

are in

12

prevention,

which

conflict

stamping

transaction

wound/wait with

and that At commit

methods

of conflicting

assigns

transactions

is rolled

back

a unique

in time

and

which

time

stamp

stamp order.

continues

to

each

Two

executing:

transaction

schemes the

are

wait/die

scheme. optimistic

methods

transactions time,

are

assumes

executed

the

private

copies

isolation

levels:

Read

database

from

that

the

majority

concurrently, are

updated

using to the

Uncommitted,

of

database

private,

database.

Read

transactions

temporary The

Committed,

copies

ANSI

standard

Repeatable

Read,

Serialisable. recovery

Learning. that

any

restores

backups

be used in

Cengage deemed

the

lost

in

can exist for

execution

control

data.

order

locks

with time

the

and the

Database

Editorial

executed

and field.

data from

acquiring

decide

Database

to

being

of transactions.

problems:

or Read

a growing

defines four transaction and

of the

avoidance.

schedules

Concurrency

time

while

used in

more transactions

and

do not

on the

or a deadly

detection

be

unlocking

without

to

main

a data item

page, row

shared

has

to

only two

of schedules

schema

used

have

held

without

scheme

result

of transactions.

data item

a transaction

are

locks

can

can

Several

exclusive)

and

modify

execution

execution

Locking,

access

the

wants to read

data.

needs

that

the

The transaction

table,

of locks lock

a transaction

the

using

database,

Two types

locking

three

establishing

systems.

unique

from

issued

in

ensure the serialisability

guarantees

transaction

same

667

state).

or database

(ROLLBACK)

simultaneous

result

for

executed.

database

to

A binary

can

is responsible

multi-user

A lock

of all transactions

the

Concurrency

operations.

used for recovery

coordinates

and

made by a transaction (the

statements:

database

SQL statements

Transactions

transactions

of two

previous

changes

serialisability

of the

use

Managing

retrievals.

operations in

the

(the and

as that

the

database

log is

of transactions

inconsistent

through

by several

durability

committed)

same

(restores

track

stored in the transaction

is

is the

ROLLBACK

several I/O

log

completed)

transactions

are formed

originates

The transaction

is

transaction

of transactions

support

to

the

12

All suppressed

are permanent

case

Rights

of a critical

Reserved. content

the

does

May not

not materially

be

copies

of the

error in the

copied, affect

scanned, the

overall

or

duplicated, learning

a given

database;

master

in experience.

whole

state

to

a previous

they

consistent

are stored in

state.

a safe place

and are

database.
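The role of COMMIT and ROLLBACK in keeping a transaction atomic can be sketched in a few lines. The example below is only an illustration under stated assumptions: it uses Python's built-in sqlite3 module as a stand-in DBMS, and the account table, its columns and the transfer function are hypothetical names invented for this sketch, not part of the chapter's sample databases.

```python
import sqlite3


def transfer(conn, src, dst, amount):
    """Move `amount` from account `src` to `dst` as one logical unit of work."""
    try:
        conn.execute("BEGIN")  # start the transaction explicitly
        conn.execute(
            "UPDATE account SET balance = balance - ? WHERE id = ?", (amount, src)
        )
        conn.execute(
            "UPDATE account SET balance = balance + ? WHERE id = ?", (amount, dst)
        )
        (balance,) = conn.execute(
            "SELECT balance FROM account WHERE id = ?", (src,)
        ).fetchone()
        if balance < 0:
            raise ValueError("insufficient funds")
        conn.execute("COMMIT")  # both updates become permanent together
        return True
    except Exception:
        if conn.in_transaction:
            conn.execute("ROLLBACK")  # restore the previous consistent state
        return False


# isolation_level=None stops the module from opening transactions implicitly,
# so the BEGIN/COMMIT/ROLLBACK statements above are the only transaction control.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute("CREATE TABLE account (id INTEGER PRIMARY KEY, balance INTEGER NOT NULL)")
conn.executemany("INSERT INTO account VALUES (?, ?)", [(1, 100), (2, 50)])

ok = transfer(conn, 1, 2, 30)       # succeeds and is committed
failed = transfer(conn, 1, 2, 500)  # would overdraw account 1, so it is rolled back
balances = conn.execute("SELECT balance FROM account ORDER BY id").fetchall()
```

After the failed transfer, neither of its two updates is visible: the database is left exactly as the last committed transaction left it.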


KEY TERMS

atomicity
atomic transaction property
binary lock
buffer
checkpoint
concurrency control
consistency
consistent database state
database-level lock
database recovery
database request
deadlock
deadly embrace
deferred update
deferred write
differential backup
durability
exclusive lock
field-level lock
full backup
immediate update
inconsistent retrievals
isolation
lock
lock granularity
lock manager
lost updates
monotonicity
mutually exclusive rule
optimistic approach
page
page-level lock
Read Committed
Read Uncommitted
redundant transaction logs
repeatable read
row-level lock
scheduler
serialisability
serialisable
shared lock
table-level lock
time stamping
transaction
transaction log
transaction log backup
two-phase locking
uncommitted data
uniqueness
wait/die
wound/wait
write-ahead-log protocol
write-through

FURTHER READING

Assaf, W., West, R., Aelterman, S. and Curnutt, M. SQL Server 2017 Administration Inside Out. Microsoft Press, 2017.

Brumm, B. Beginning Oracle SQL for Oracle Database 18c: From Novice to Professional, 1st edition. Apress, 2019.

Seppo, S. and Soisalon-Soininen, S. Transaction Processing: Management of the Logical Database and its Underlying Physical Structure, Data-Centric Systems and Applications. Springer, 2016.

Online Content: Answers to selected Review Questions and Problems for this chapter are available on the online platform for this book.

REVIEW QUESTIONS

1 Explain the following statement: A transaction is a logical unit of work.

2 What is a consistent database state, and how is it achieved?

3 The DBMS does not guarantee that the semantic meaning of the transaction truly represents the real-world event. What are the possible consequences of that limitation? Give an example.

4 List and discuss the four transaction properties.

5 What is a transaction log, and what is its function?

6 What is a scheduler, what does it do and why is its activity important to concurrency control?

7 What is a lock, and how, in general, does it work?

8 What is concurrency control, and what is its objective?

9 What is an exclusive lock, and under which circumstances is it granted?

10 What is a deadlock, and how can it be avoided? Discuss several deadlock avoidance strategies.

11 Which three levels of backup may be used in database recovery management? Briefly describe what each of those three backup levels does.

12 What are the four ANSI transaction isolation levels? Which type of reads does each level allow?

13 What does serialisability of transactions mean?

PROBLEMS

1 Suppose you are a manufacturer of product ABC, which is composed of parts A, B and C. Each time a new product ABC is created, it must be added to the product inventory, using the PROD_QOH in a table named PRODUCT. And each time the product is created, the parts inventory, using PART_QOH in a table named PART, must be reduced by one each of parts A, B and C. The sample database contents are shown in the following tables.

Table name: PRODUCT
PROD_CODE    PROD_QOH
ABC          1205

Table name: PART
PART_CODE    PART_QOH
A            567
B            98
C            549

Given that information, answer questions a-e.

a How many database requests can you identify for an inventory update for both PRODUCT and PART?

b Using SQL, write each database request you identified in Step a.

c Write the complete transaction(s).

d Write the transaction log, using Table 12.1 on p. 646 as your template.

e Using the transaction log you created in Step d, trace its use in database recovery.

2 Describe the three most common concurrent transaction execution problems. Explain how concurrency control can be used to avoid those problems.

3 Which DBMS component is responsible for concurrency control? How is this feature used to resolve conflicts?

4 Using a simple example, explain the use of binary and shared/exclusive locks in a DBMS.

5 Suppose your database system has failed. Describe the database recovery process and the use of deferred-write and write-through techniques.

Online Content: The 'Ch12_ABC_Markets' database is available on the online platform for this book.

6 ABC Markets sell products to customers. The entity relationship diagram shown in Figure P12.6 represents the main entities for ABC's database. Note the following important characteristics:

• A customer may make many purchases, each one represented by an invoice.
  • The CUS_BALANCE is updated with each credit purchase or payment and represents the amount the customer owes.
  • The CUS_BALANCE is increased (+) with every credit purchase and decreased (-) with every customer payment.
  • The date of last purchase is updated with each new purchase made by the customer.
  • The date of last payment is updated with each new payment made by the customer.
• An invoice represents a product purchase by a customer.
  • An INVOICE can have many invoice LINEs, one for each product purchased.
  • The INV_TOTAL represents the total cost of the invoice, including taxes.
  • The INV_TERMS can be 30, 60 or 90 (representing the number of days of credit) or CASH, CHEQUE or CC.
  • The invoice status can be OPEN, PAID or CANCEL.
• A product's quantity on hand (P_QTYOH) is updated (decreased) with each product sale.
• A customer may make many payments. The payment type (PMT_TYPE) can be one of the following:
  • CASH for cash payments.
  • CHEQUE for cheque payments.
  • CC for credit card payments.
• The payment details (PMT_DETAILS) are used to record data about cheque or credit card payments:
  • The bank, account number and cheque number for cheque payments.
  • The issuer, credit card number and expiration date for credit card payments.

FIGURE P12.6 The ABC Markets Relational Diagram

Note: Not all entities and attributes are represented in this example. Use only the attributes indicated.

Using this database, write the SQL code to represent each of the following transactions. Use BEGIN TRANSACTION and COMMIT to group the SQL statements in logical transactions.

a On 11 May 2019 customer 10010 makes a credit purchase (30 days) of one unit of product 11QER/31 with a unit price of 110.00; the tax rate is 8 per cent. The invoice number is 10983, and this invoice has only one product line.

b On 3 June 2019 customer 10010 makes a payment of 100 in cash. The payment ID is 3428.

c Create a simple transaction log (using the format shown in Table 12.14) to represent the actions of the two previous transactions.

7 Create a simple transaction log (using the format shown in Table 12.14) to represent the actions of the transactions in Problems 6a and 6b.

8 Assuming that pessimistic locking is being used but the two-phase locking protocol is not, create a chronological list of locking, unlocking and data manipulation activities that would occur during the complete processing of the transaction described in Problem 6a.

9 Assuming that pessimistic locking is being used with the two-phase locking protocol, create a chronological list of locking, unlocking and data manipulation activities that would occur during the complete processing of the transaction described in Problem 6a.

10 Assuming that pessimistic locking is being used but the two-phase locking protocol is not, create a chronological list of locking, unlocking and data manipulation activities that would occur during the complete processing of the transaction described in Problem 6b.

11 Assuming that pessimistic locking is being used with the two-phase locking protocol, create a chronological list of locking, unlocking and data manipulation activities that would occur during the complete processing of the transaction described in Problem 6b.

CHAPTER 13 Managing Database and SQL Performance

IN THIS CHAPTER, YOU WILL LEARN:

• Basic database performance-tuning concepts
• How a DBMS processes SQL queries
• About the importance of indexes in query processing
• About the types of decisions the query optimiser has to make
• Some common practices used to write efficient SQL code
• How to formulate queries and tune the DBMS for optimal performance

PREVIEW

Database performance optimisation is a critical topic, yet it usually receives minimal coverage in the database curriculum. Most databases used in classrooms have only a few records per table. As a result, the focus is often on making SQL queries perform an intended task, without considering the efficiency of the query process. In fact, even the most efficient query environment gives no visible performance improvements over the least efficient query environment when only 20 or 30 table rows (records) are queried. Unfortunately, the lack of attention to query efficiency can give unacceptably slow results when, in the real world, queries are executed over tens of millions of records. In this chapter, you learn what it takes to create a more efficient query environment.

NOTE

As this book focuses on databases, this chapter covers only those factors directly affecting database performance. Also, because performance-tuning techniques can be DBMS-specific, the material in this chapter may not be applicable under all circumstances, nor will it necessarily pertain to all DBMS types. This chapter is designed to build a foundation for the general understanding of database performance-tuning issues and to help you choose appropriate performance-tuning strategies. (For the most current information about tuning your database, consult the vendor's documentation.)


13.1 DATABASE PERFORMANCE-TUNING CONCEPTS

One of the main functions of a database system is to provide answers to end users when they are required. End users and the DBMS interact through the use of queries to generate information, using the following sequence:

1 The end-user (client-end) application generates a query.

2 The query is sent to the DBMS (server end).

3 The DBMS (server end) executes the query.

4 The DBMS sends the resulting data set to the end-user (client-end) application.

End users expect their queries to return results as quickly as possible. How do you know whether the performance of a database is good? Good database performance is hard to evaluate. How do you know if a 1.06-second query response time is good enough? It's easier to identify bad database performance than good database performance: all it takes is end-user complaints about slow query results. Unfortunately, the same query may perform well one day and not so well two months later. Regardless of end-user perceptions, the goal of database performance is to execute queries as fast as possible. Therefore, database performance must be closely monitored and regularly tuned. Database performance tuning refers to a set of activities and procedures designed to reduce the response time of the database system, that is, to try to ensure that an end-user query is processed by the DBMS in the minimum amount of time.

The time required by a query to return a result set depends on many factors. These factors tend to be wide-ranging and to vary from environment to environment and from vendor to vendor. The performance of a typical DBMS is constrained by three main factors: CPU processing power, available primary memory (RAM) and input/output (hard disk and network) throughput. Table 13.1 lists some system components and summarises general guidelines for achieving better query performance.

TABLE 13.1 General guidelines for better system performance

Hardware

  CPU
    Client: The fastest possible.
    Server: Multiple processors. The fastest possible, i.e. quad-core or higher. Cluster of networked computers. Virtualised server technology.

  RAM
    Client: The maximum possible.
    Server: The maximum possible to avoid OS memory to disk swapping.

  Storage
    Client: Fast SATA/EIDE hard disk with sufficient free hard disk space. Solid state drives (SSDs) for faster speed.
    Server: Multiple high-speed, high-capacity disks (SCSI/SATA/Firewire/Fibre Channel) in RAID configuration. Solid state drives (SSDs) for faster speed. Separate disks for OS, DBMS and data spaces.

  Network
    Client: High-speed connection.
    Server: High-speed connection.

Software

  Operating system
    Client: Fine-tuned for best client application performance.
    Server: 64-bit OS for larger address spaces. Fine-tuned for best server application performance.

  Network
    Client: Fine-tuned for best throughput.
    Server: Fine-tuned for best throughput.

  Application
    Client: Optimise SQL in client application.
    Server: Optimise DBMS for best performance.
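The query round trip and the notion of response time described in this section can be made concrete with a small measurement. The following is a minimal sketch, assuming Python's built-in sqlite3 module as the DBMS and a hypothetical product table; timed_query is a helper name made up for this example, and an in-process database has no real client/server network hop, so the numbers only illustrate the idea.

```python
import sqlite3
import time


def timed_query(conn, sql, params=()):
    """Run one query and return (rows, elapsed_seconds)."""
    start = time.perf_counter()
    # The query is handed to the DBMS, executed, and the result set comes back.
    rows = conn.execute(sql, params).fetchall()
    elapsed = time.perf_counter() - start
    return rows, elapsed


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE product (p_code INTEGER PRIMARY KEY, p_price REAL)")
conn.executemany(
    "INSERT INTO product VALUES (?, ?)",
    [(i, i * 1.5) for i in range(1, 10001)],
)

rows, elapsed = timed_query(
    conn, "SELECT COUNT(*) FROM product WHERE p_price > ?", (7500.0,)
)
print(f"{rows[0][0]} rows matched in {elapsed:.6f} s")
```

Recording such timings regularly, rather than waiting for end-user complaints, is one simple way to notice that the same query performs well one day and poorly two months later.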

Naturally, the system performs best when its hardware and software resources are optimised. However, in the real world, unlimited resources are not the norm; internal and external constraints always exist. Therefore, the hardware and software components should be optimised to obtain the best throughput possible with existing (and often limited) resources, which is why database performance tuning is important.

Fine-tuning the performance of a system requires a holistic approach. That is, all factors must be checked to ensure that each one operates at its optimum level and has sufficient resources to minimise the occurrence of bottlenecks. As database design is such an important factor in determining the database system's performance efficiency, it is worth repeating this book's ethos: Good database performance starts with good database design. No amount of fine-tuning will make a poorly designed database perform as well as a well-designed database. This is true when an existing database is redesigned and the end user expects an unrealistic performance gain from older databases.

13.1.1 Performance Tuning: Client and Server

In general, database performance-tuning activities can be divided into those taking place either on the client side or on the server side:

• On the client side, the objective is to generate a SQL query that returns the correct answer in the least amount of time, using the minimum amount of resources at the server end. The activities required to achieve that goal are commonly referred to as SQL performance tuning.

• On the server side, the DBMS environment must be properly configured to respond to clients' requests in the fastest way possible, while making optimum use of existing resources. The activities required to achieve that goal are commonly referred to as DBMS performance tuning.
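One way to see whether a query uses "the minimum amount of resources at the server end" is to ask the DBMS how it intends to execute it. The sketch below is illustrative only: it assumes SQLite (through Python's sqlite3 module) and its EXPLAIN QUERY PLAN statement, other DBMSs expose plans through different commands, and the customer table layout, the cus_lname column and the index name are hypothetical.

```python
import sqlite3


def plan(conn, sql, params=()):
    """Return the access-path descriptions SQLite reports for a query."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    return [row[-1] for row in rows]  # the last column holds the description


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE customer ("
    "cus_code INTEGER PRIMARY KEY, cus_lname TEXT, cus_balance REAL)"
)
conn.executemany(
    "INSERT INTO customer VALUES (?, ?, ?)",
    [(i, f"Name{i % 500}", 0.0) for i in range(1, 5001)],
)

query = "SELECT * FROM customer WHERE cus_lname = ?"

before = plan(conn, query, ("Name42",))  # no index yet: full table scan
conn.execute("CREATE INDEX idx_customer_lname ON customer (cus_lname)")
after = plan(conn, query, ("Name42",))   # the optimiser can now use the index

print(before)
print(after)
```

The SQL text is identical in both runs; only the server-side structures changed, which is why client-side SQL tuning and server-side DBMS tuning are usually considered together.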

Online Content: If you want to learn more about clients and servers, check Appendix F, Client/Server Systems, located on the online platform for this book.

Keep in mind that DBMS implementations are typically more complex than just a two-tier client/server configuration. However, even in multi-tier (client front-end, application middleware and database server back-end) client/server environments, performance-tuning activities are frequently subdivided into subtasks to ensure the fastest possible response time between any two component points. This chapter covers SQL performance-tuning practices on the client side and DBMS performance-tuning practices on the server side. However, before you can start learning about the tuning processes, you must first learn more about the DBMS architectural components and processes, and how those processes interact to respond to end-user requests.

13.1.2 DBMS Architecture

The architecture of a DBMS is represented by the processes and structures (in memory and in permanent storage) used to manage a database. Such processes collaborate with one another to perform specific functions. Figure 13.1 illustrates the basic DBMS architecture.

FIGURE 13.1 Basic DBMS architecture

[Figure: The client process on the client computer sends an SQL query to the DBMS server computer, and the result set is sent back to the client. The DBMS processes running in primary memory (RAM) are the listener, user, scheduler, lock manager and optimiser, together with the SQL cache and the data cache. I/O operations connect the data cache with the database data stored in secondary memory (hard disk), organised into table spaces and data files.]

As you examine Figure 13.1, note the following components and functions:

• All data in a database are stored in data files. A typical enterprise database is normally composed of several data files. A data file can contain rows from one single table, or it can contain rows from many different tables. A database administrator (DBA) determines the initial size of the data files that make up the database; however, as required, the data files can automatically expand in predefined increments known as extends. For example, if more space is required, the DBA can define that each new extend will be in 10 KB or 10 MB increments.

• Data files are generally grouped in file groups creating table spaces. A table space or file group is a logical grouping of several data files that store data with similar characteristics. For example, you may have a system table space where the data dictionary table data are stored; a user data table space to store the user-created tables; an index table space to hold all indexes; and a temporary table space to do temporary sorts, grouping, and so on. Each time you create a new database, the DBMS automatically creates a minimum set of table spaces.

• To work with the data, the DBMS must retrieve the data from permanent storage (data files) and place it in RAM (data cache).

• The data cache or buffer cache is a shared, reserved memory area that stores the most recently accessed data blocks in RAM. The data cache is where the data read from the database data files are stored after the data have been read or before the data are written to the database data files. The data cache also caches system catalogue data and the contents of the indexes.

• The SQL cache or procedure cache is a shared, reserved memory area that stores the most recently executed SQL statements or PL/SQL procedures, including triggers and functions. (If you want to know more about PL/SQL procedures, triggers and SQL functions, study Chapter 9, Procedural Language SQL and Advanced SQL.) The SQL cache does not store the end-user written SQL. Rather, the SQL cache stores a processed version of the SQL that is ready for execution by the DBMS.

• To move data from permanent storage (data files) to the RAM (data cache), the DBMS issues I/O requests and waits for the replies. An input/output (I/O) request is a low-level (read or write) data access operation to/from computer devices (such as memory, hard disks, video and printer). The purpose of the I/O operation is to move data to and from different computer components and devices. Note that an I/O disk read operation retrieves an entire physical disk block, generally containing multiple rows, from permanent storage to the data cache, even if you use only one attribute from only one row. The physical disk block size depends on the operating system and could be 4K, 8K, 16K, 32K, 64K or even larger. Furthermore, depending on the situation, a DBMS might issue a single-block read request or a multi-block read request.

• Working with data in the data cache is many times faster than working with data in the data files, because the DBMS doesn't have to wait for the hard disk to retrieve the data. (That's because no hard disk I/O operations are needed to work within the data cache.)

• The majority of performance-tuning activities focus on minimising the number of I/O operations, because using I/O operations is many times slower than reading data from the data cache.

Also illustrated in Figure 13.1 are some typical DBMS processes. Although the number of processes and their names vary from vendor to vendor, the functionality is similar. The following processes are represented in Figure 13.1:

• Listener. The listener process listens for clients' requests and hands the processing of the SQL requests to other DBMS processes. Once a request is received, the listener passes the request to the appropriate user process.

• User. The DBMS creates a user process to manage each client session. Therefore, when you log on to the DBMS, you are assigned a user process. This process will handle all requests you submit to the server. There are many user processes, at least one per each logged-in client.

• Scheduler. The scheduler process schedules the concurrent execution of SQL requests. (See Chapter 12, Managing Transactions and Concurrency.)

• Lock manager. This process manages all locks placed on database objects. (See Chapter 12, Managing Transactions and Concurrency.)

• Optimiser. The optimiser process analyses SQL queries and finds the most efficient way to access the data. You will learn more about this process later in the chapter.
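The relationship between the data cache and I/O operations described above can be sketched with a toy model. This is not how any particular DBMS implements its buffer cache; it is a simplified illustration, assuming a least-recently-used eviction policy and a Python list standing in for a data file. The class name DataCache and its parameters are hypothetical, and capacity_blocks is assumed to be at least 1.

```python
from collections import OrderedDict


class DataCache:
    """Toy LRU data cache that counts physical block reads."""

    def __init__(self, table_rows, rows_per_block, capacity_blocks):
        self.table_rows = table_rows          # stands in for the data file
        self.rows_per_block = rows_per_block
        self.capacity_blocks = capacity_blocks
        self.blocks = OrderedDict()           # block number -> list of rows
        self.physical_reads = 0
        self.cache_hits = 0

    def read_row(self, row_id):
        block_no = row_id // self.rows_per_block
        if block_no in self.blocks:
            self.cache_hits += 1
            self.blocks.move_to_end(block_no)  # mark as most recently used
        else:
            # A miss costs one I/O request and brings in the whole block,
            # even though only one row was asked for.
            self.physical_reads += 1
            start = block_no * self.rows_per_block
            self.blocks[block_no] = self.table_rows[start:start + self.rows_per_block]
            if len(self.blocks) > self.capacity_blocks:
                self.blocks.popitem(last=False)  # evict least recently used block
        return self.blocks[block_no][row_id % self.rows_per_block]


cache = DataCache(table_rows=list(range(100)), rows_per_block=10, capacity_blocks=2)
for row_id in range(10):  # ten rows that all live in block 0
    cache.read_row(row_id)
first_pass = (cache.physical_reads, cache.cache_hits)
print(first_pass)
```

Reading ten rows from the same block costs a single physical read; the other nine requests are served from memory, which is the effect performance tuning tries to maximise.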

13.1.3 Database Query Optimisation Most of the algorithms The selection

of the

The

of sites

selection

Within those mode

two

or the

Automatic

Copyright review

2020 has

Cengage deemed

Learning. that

any

for

optimum

execution

to

timing

a query

of its

All

Manual

Rights

Reserved. content

does

May not

not

be

copied, affect

overall

are based

duplicated, learning

on two

communication

algorithm

Operation

or

efficient

way to

access

Modes

minimise

optimisation

the

most

principles:

order

optimisation

scanned,

the

chapter.

can

modes

A query optimisation algorithm can be evaluated on the basis of its operation mode or the timing of its optimisation.

Operation modes can be classified as manual or automatic. Automatic query optimisation means that the DBMS finds the most cost-effective access path without user intervention. Manual query optimisation requires that the optimisation be selected and scheduled by the end user or programmer. Automatic query optimisation is clearly more desirable from the end user's point of view, but the cost of such convenience is the increased overhead that it imposes on the DBMS.

Query optimisation algorithms can also be classified according to when the optimisation is done. Within this timing classification, query optimisation algorithms can be static or dynamic.

- Static query optimisation takes place at compilation time. In other words, the best optimisation strategy is selected when the query is compiled by the DBMS. This approach is common when SQL statements are embedded in procedural programming languages such as C# or Visual Basic .NET. When the program is submitted to the DBMS for compilation, it creates the plan necessary to access the database. When the program is executed, the DBMS uses that plan to access the database.
- Dynamic query optimisation takes place at execution time. Database access strategy is defined when the program is executed. Therefore, access strategy is dynamically determined by the DBMS at run time, using the most up-to-date information about the database. Although dynamic query optimisation is efficient, its cost is measured by run-time processing overhead. The best strategy is determined every time the query is executed; this could happen several times in the same program.

Finally, query optimisation techniques can be classified according to the type of information that is used to optimise the query. For example, queries may be based on statistically based or rule-based algorithms.

- A statistically based query optimisation algorithm uses statistical information about the database. The statistics provide information about database characteristics such as size, number of records, average access time, number of requests serviced and number of users with access rights. These statistics are then used by the DBMS to determine the best access strategy. Within statistically based optimisers, some DBMSs allow setting a goal to specify that the optimiser should attempt to minimise the time to retrieve the first row or the last row. Minimising the time to retrieve the first row is often used in transaction systems and interactive client environments. In these cases, the goal is to present the first several rows to the user as quickly as possible. Then, while the DBMS waits for the user to scroll through the data, it can fetch the other rows for the query. Setting the optimiser goal to minimise retrieval of the last row is typically done in embedded SQL and inside stored procedures. In these cases, the control will not pass back to the calling application until all of the data have been retrieved; therefore, it is important to retrieve all of the data to the last row as quickly as possible so control can be returned.
- The statistical information is managed by the DBMS and is generated in one of two different modes: dynamic or manual. In the dynamic statistical generation mode, the DBMS automatically evaluates and updates the statistics after each access. In the manual statistical generation mode, the statistics must be updated periodically through a user-selected utility, which is specific to the particular DBMS.
- A rule-based query optimisation algorithm is based on a set of user-defined rules to determine the best query access strategy. The rules are entered by the end user or database administrator, and they are typically general in nature.

Because database statistics play a crucial role in query optimisation, this topic is explored in more detail in the next section.

13.1.4 Database Statistics

Another DBMS process that plays an important role in query optimisation is gathering database statistics. The term database statistics refers to a number of measurements about database objects, such as tables and indexes, and available resources, such as number of processors used, processor speed and temporary space available. These statistics give a snapshot of database characteristics.

As you will learn later in this chapter, the DBMS uses the statistics to make critical decisions about improving query processing efficiency. Database statistics can be gathered on request by the DBA or automatically by the DBMS. For example, many DBMS vendors support the ANALYZE command in SQL to gather statistics. In addition, many vendors have their own routines to gather statistics. For example, IBM's DB2 uses the RUNSTATS procedure, while Microsoft's SQL Server uses the UPDATE STATISTICS procedure and provides the Auto-Update and Auto-Create Statistics options in its initialisation parameters. A sample of measurements that the DBMS may gather about database objects is shown in Table 13.2.

TABLE 13.2 Sample database statistics measurements

Database Object          Sample Measurements
Tables                   Number of rows, number of disk blocks used, row length, number of columns in each row, number of distinct values in each column, maximum value in each column, minimum value in each column, and columns that have indexes
Indexes                  Number and name of columns in the index key, number of key values in the index, number of distinct key values in the index key, and histogram of key values in an index
Environment Resources    Logical and physical disk block size, location and size of data files, and number of extends per data file

If the object statistics exist, the DBMS uses them in query processing. Although some of the newer DBMSs (such as Oracle, SQL Server and DB2) automatically gather statistics, others require the DBA to gather statistics on request. To generate the database object statistics, you could use the following syntax:

ANALYZE <TABLE/INDEX> object_name COMPUTE STATISTICS;

For example, to generate statistics for the VENDOR table, you would use the following command:

ANALYZE TABLE VENDOR COMPUTE STATISTICS;

When you generate statistics for a table, all related indexes are also analysed. However, you could generate statistics for a single index with the following command:

ANALYZE INDEX VEND_NDX COMPUTE STATISTICS;

(In this example, VEND_NDX is the name of the index.)

Database statistics are stored in the system catalogue in specially designated tables. It is common to periodically regenerate the statistics for database objects, especially those database objects that are subject to frequent change. For example, if you are the owner of a video store and you have a video rental DBMS, your system will likely use a RENTAL table to store the daily video rentals. That RENTAL table (and its associated indexes) would be subject to constant inserts and updates as you record your daily rentals and returns. Therefore, the RENTAL table statistics you generated last week do not depict an accurate picture of the table as it exists today. The more current the statistics, the better the chances are for the DBMS to find the fastest way to execute a given query.

Now that you know the basic architecture of DBMS processes and memory structures, you are ready to learn how the DBMS processes a SQL query request.
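The statistics-gathering step described in Section 13.1.4 can be tried out on a small scale. The following is a minimal sketch that assumes SQLite (through Python's standard sqlite3 module) as a stand-in DBMS; it is not the Oracle, DB2 or SQL Server syntax quoted in the text. SQLite's ANALYZE statement takes no COMPUTE STATISTICS clause and stores its results in a catalogue table named sqlite_stat1. The V_NAME column and the helper name gather_vendor_stats are illustrative assumptions.

```python
import sqlite3


def gather_vendor_stats(n_rows=300):
    """Build a toy VENDOR table, run ANALYZE, and return the stored statistics."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE VENDOR (V_CODE INTEGER, V_NAME TEXT)")
    conn.executemany(
        "INSERT INTO VENDOR VALUES (?, ?)",
        [(i, f"Vendor {i}") for i in range(n_rows)],
    )
    conn.execute("CREATE INDEX VEND_NDX ON VENDOR (V_CODE)")
    # SQLite-specific: ANALYZE writes one row per analysed index into
    # the catalogue table sqlite_stat1 (columns tbl, idx, stat).
    conn.execute("ANALYZE")
    rows = conn.execute(
        "SELECT tbl, idx, stat FROM sqlite_stat1 WHERE idx = 'VEND_NDX'"
    ).fetchall()
    conn.close()
    return rows


stats = gather_vendor_stats()
print(stats)
```

In SQLite, the first number in the stat column is the number of rows in the index, which is the kind of measurement listed for indexes in Table 13.2. If rows are later inserted, the stored figures stay unchanged until ANALYZE is run again, which mirrors the RENTAL table example.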

13.2 QUERY PROCESSING

What happens at the DBMS server end when the client's SQL statement is received? In simple terms, the DBMS processes queries in three phases:


1 Parsing. The DBMS parses the SQL query and chooses the most efficient access/execution plan.

2 Execution. The DBMS executes the SQL query, using the chosen execution plan.

3 Fetching. The DBMS fetches the data and sends the result set back to the client.

The processing of SQL DDL statements (such as CREATE TABLE) is different from the processing required by DML statements. The difference is that a DDL statement actually updates the data dictionary tables or system catalog, while a DML statement (SELECT, INSERT, UPDATE and DELETE) mostly manipulates end-user data. Figure 13.2 shows the steps required for query processing. Each of the steps is discussed in the following sections.

FIGURE 13.2 Query processing
[Figure: a SELECT ... FROM ... WHERE ... query passes through three phases. Parsing phase (SQL cache): syntax check, naming check, access rights check, decompose and analyse, generate access plan, store access plan in SQL cache. Execution phase (data cache): execute I/O operations, add locks for transaction management, retrieve data blocks from data files, place data blocks in data cache. Fetching phase: generate result set.]


13.2.1 SQL Parsing Phase

The optimisation process includes breaking down (parsing) the query into smaller units and transforming the original SQL query into a slightly different version of the original SQL code, but one that is fully equivalent and more efficient. Fully equivalent means that the optimised query results are always the same as the original query. More efficient means that the optimised query will almost always execute faster than the original query. (Note that it almost always executes faster because, as explained earlier, many factors affect the performance of a database. Those factors include the network, the client computer's resources, and even other queries running concurrently in the same database.) To determine the most efficient way to execute the query, the DBMS may use the database statistics you learnt about earlier.

The SQL parsing activities are performed by the query optimiser. The query optimiser analyses the SQL query and finds the most efficient way to access the data. This process is the most time-consuming phase in query processing. Parsing a SQL query requires several steps. The SQL query is:

- Validated for syntax compliance
- Validated against the data dictionary to ensure that tables and column names are correct
- Validated against the data dictionary to ensure that the user has proper access rights
- Analysed and decomposed into more atomic components
- Optimised through transformation into a fully equivalent but more efficient SQL query
- Prepared for execution by determining the most efficient execution or access plan.

Once the SQL statement is transformed, the DBMS creates what is commonly known as an access or execution plan. An access/execution plan contains the series of steps a DBMS uses to execute the query and return the result set in the most efficient way. First, the DBMS checks to see if an access plan for the query already exists in the SQL cache. If it does, the DBMS reuses the access plan to save time. If it doesn't, the optimiser evaluates different plans and makes decisions about which indexes to use and how best to perform join operations. The chosen access plan for the query is then placed in the SQL cache and made available for use and future reuse.

Access plans are DBMS-specific and translate the client's SQL query into the series of complex I/O operations required to read the data from the physical data files and generate the result set. Although the access plans are DBMS-specific, some commonly found I/O operations are illustrated in Table 13.3.

TABLE 13.3 Sample DBMS access plan I/O operations

Operation                Description
Table Scan (Full)        Reads the entire table sequentially, from the first row to the last row, one row at a time (slowest).
Table Access (Row ID)    Reads a table row directly, using the row ID value (fastest).
Index Scan (Range)       Reads the index first to obtain the row IDs and then accesses the table rows directly (faster than a full table scan).
Index Access (Unique)    Used when a table has a unique index in a column.
Nested Loop              Reads and compares a set of values to another set of values, using a nested loop style (slow).
Merge                    Merges two data sets (slow).
Sort                     Sorts a data set (slow).

Table 13.3 shows just a few database access I/O operations. (This illustration is based on an Oracle RDBMS.) However, Table 13.3 does illustrate the type of I/O operations that most DBMSs perform when accessing and manipulating data sets.

As you examine Table 13.3, note that a table access using a row ID is the fastest method. A row ID is a unique identification for every row saved in permanent storage and can be used to access the row directly. Conceptually, a row ID is similar to a slip you get when you park your car in an airport parking space. The parking slip contains the section number and space number. Using that information, you can go directly to your car without having to go through every single section and space.
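Three of the operations in Table 13.3 can be observed directly by asking a DBMS for its access plan. The sketch below assumes SQLite (via Python's standard sqlite3 module) rather than the Oracle RDBMS the table is based on, so the wording of the plan steps differs: SQLite reports a full table scan as SCAN, and row ID or index access as SEARCH. The PRODUCT columns, the sample data and the helper name plan are illustrative assumptions.

```python
import sqlite3


def plan(conn, sql):
    """Return the 'detail' text of each step SQLite reports for a query."""
    return [row[3] for row in conn.execute("EXPLAIN QUERY PLAN " + sql)]


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE PRODUCT (P_CODE TEXT, P_DESCRIPT TEXT, P_QOH INTEGER)")
conn.executemany(
    "INSERT INTO PRODUCT VALUES (?, ?, ?)",
    [(f"P{i}", f"Item {i}", i % 50) for i in range(1000)],
)
conn.execute("CREATE INDEX P_QOH_NDX ON PRODUCT (P_QOH)")

# No index on P_DESCRIPT: every row must be read (table scan, full).
full_scan = plan(conn, "SELECT * FROM PRODUCT WHERE P_DESCRIPT = 'Item 7'")
# Direct access through SQLite's built-in row identifier (table access by row ID).
rowid_access = plan(conn, "SELECT * FROM PRODUCT WHERE rowid = 7")
# The index is read first, then the matching table rows (index range scan).
index_range = plan(conn, "SELECT * FROM PRODUCT WHERE P_QOH < 10")

for name, steps in [("full scan", full_scan),
                    ("row ID", rowid_access),
                    ("index range", index_range)]:
    print(name, "->", steps)
```

The exact plan text varies between SQLite versions, so the example prints it rather than assuming a particular string.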

13.2.2 SQL Execution Phase

In this phase, all I/O operations indicated in the access plan are executed. When the execution plan is run, the proper locks are acquired (if needed) for the data to be accessed and the data are retrieved from the data files and placed in the DBMS's data cache. All transaction management commands are processed during the parsing and execution phases of query processing.

13.2.3 SQL Fetching Phase

After the parsing and execution phases are completed, all rows that match the specified condition(s) are retrieved, sorted, grouped and/or aggregated (if required). During the fetching phase, the rows of the resulting query result set are returned to the client. During this phase, the DBMS may use temporary table space to store temporary data.

13.2.4 Query Processing Bottlenecks

The main objective of query processing is to execute a given query in the fastest way possible with the fewest resources. As you have seen, the execution of a query requires the DBMS to break down the query into a series of interdependent I/O operations to be executed in a collaborative manner. The more complex a query is, the more complex the operations are, which means that bottlenecks are more likely. A query processing bottleneck is a delay introduced in the processing of an I/O operation that causes the overall system to slow down. In the same way, the more components a system has, the more interfacing among the components is required, increasing the likelihood of bottlenecks. Within a DBMS, five components typically cause bottlenecks:

- CPU. The CPU processing power of the DBMS should match the system's expected workload. A high CPU utilisation might indicate that the processor speed is too slow for the amount of work performed. However, heavy CPU utilisation can be caused by other factors, such as a defective component, not enough RAM (the CPU spends too much time swapping memory blocks), a badly written device driver or a rogue process. A CPU bottleneck will affect not only the DBMS but all processes running in the system.
- RAM. The DBMS allocates memory for specific usage, such as data cache and SQL cache. RAM must be shared among all running processes, including the operating system and DBMS. If there is not enough RAM available, moving data among components that are competing for scarce RAM can create a bottleneck.
- Hard disk. Hard disk space is used for more than just storing end-user data. Current operating systems also use the hard disk for virtual memory, which refers to copying areas of RAM to the hard disk as needed to make room in RAM for more urgent tasks. Therefore, more available hard disk storage space and faster data transfer rates reduce the likelihood of bottlenecks.
- Network. In a database environment, the database server and the clients are connected via a network. All networks have a limited amount of bandwidth that is shared among all clients. When many network nodes access the network at the same time, bottlenecks are likely.
- Application code. Two of the most common sources of bottlenecks are inferior application code and poorly designed databases. Inferior code can be improved with code optimisation techniques, as long as the underlying database design is sound. However, no amount of coding will make a poorly designed database perform better.

Learning how to avoid these bottlenecks and optimise database performance is the main focus of this chapter.

13.3 INDEXES AND QUERY OPTIMISATION

In Chapter 11, Conceptual, Logical, and Physical Database Design, you learnt that indexes are crucial in the physical database design stage and that an index speeds up data access. Indexes facilitate searching, sorting and using aggregate functions and even join operations. The improvement in data access speed occurs because an index is an ordered set of values that contains the index key and pointers. In addition, you also learnt that indexes are recommended where high-sparsity columns are used in search conditions.

The careful selection of indexes will play a huge part in query optimisation. In section 13.5, you will learn how indexes impact SQL performance tuning.

NOTE
You can learn how to select indexes in Chapter 11, Conceptual, Logical, and Physical Database Design (Section 11.3.3).


13.4 OPTIMISER CHOICES

Query optimisation is the central activity during the parsing phase in query processing. In this phase, the DBMS must choose what indexes to use, how to perform join operations, which table to use first, and so on. Each DBMS has its own algorithms for determining the most efficient way to access the data. The query optimiser can operate in one of two modes:

- A rule-based optimiser uses a set of preset rules and points to determine the best approach to execute a query. The rules assign a fixed cost to each SQL operation; the costs are then added to yield the cost of the execution plan. For example, a full table scan will have a set cost of ten, while a table access by row ID will have a set cost of three.
- A cost-based optimiser uses sophisticated algorithms based on the statistics about the objects being accessed to determine the best approach to execute a query. In this case, the optimiser process adds up the processing cost, the I/O costs, and the resource costs (RAM and temporary space) to come up with the total cost of a given execution plan.

The optimiser objective is to find alternative ways to execute a query, to evaluate the cost of each alternative and then to choose the one with the lowest cost. To understand the function of the query optimiser, let's use a simple example. Assume that you want to list all products provided by a vendor based in South Africa (SA). To acquire that information, you could write the following query:

SELECT P_CODE, P_DESCRIPT, P_PRICE, V_NAME, V_STATE
FROM PRODUCT, VENDOR
WHERE PRODUCT.V_CODE = VENDOR.V_CODE
AND VENDOR.V_COUNTRY = 'SA';

Furthermore, let's assume that the database statistics indicate that:

1 The PRODUCT table has 7000 rows.

2 The VENDOR table has 300 rows.

3 Ten vendors come from South Africa.

4 One thousand products come from vendors in South Africa.

It's important to point out that only items 1 and 2 are available to the optimiser. Items 3 and 4 are assumed to illustrate the choices that the optimiser must make. Armed with the information in items 1 and 2, the optimiser would try to find the most efficient way to access the data. The primary factor in determining the most efficient access plan is the I/O cost. (Remember, the DBMS will always try to minimise I/O operations.) Table 13.4 shows two sample access plans for the previous query and their respective I/O costs.


TABLE 13.4 Comparing access plans and I/O costs

Plan  Step  Operation                                                                                     I/O Operations  I/O Cost   Resulting Set Rows  Total I/O Cost
A     A1    Cartesian product: A1 = (PRODUCT X VENDOR)                                                    7 000 + 300     7 300      2 100 000           7 300
A     A2    Select rows in A1 with matching vendor codes: A2 = σ PRODUCT.v_code = VENDOR.v_code (A1)      2 100 000       2 100 000  7 000               2 107 300
A     A3    Select rows in A2 with V_COUNTRY = 'SA': A3 = σ V_COUNTRY = 'SA' (A2)                         7 000           7 000      1 000               2 114 300
B     B1    Select rows in VENDOR with V_COUNTRY = 'SA': B1 = σ V_COUNTRY = 'SA' (VENDOR)                 300             300        10                  300
B     B2    Cartesian product: B2 = (PRODUCT X B1)                                                        7 000 + 10      7 010      70 000              7 310
B     B3    Select rows in B2 with matching vendor codes: B3 = σ PRODUCT.v_code = B1.v_code (B2)          70 000          70 000     1 000               77 310

To make the example easier to understand, the I/O Operations and I/O Cost columns in Table 13.4 estimate only the number of I/O disk reads the DBMS must perform. For simplicity's sake, it is assumed that there are no indexes and that each row read has an I/O cost of 1. For example, in Step A1, the DBMS must perform a Cartesian product of PRODUCT and VENDOR. To do that, the DBMS must read all rows from PRODUCT (7 000) and all rows from VENDOR (300), giving a total of 7 300 I/O operations. The same computation is done in all steps. In Table 13.4, you can see how plan A has a total I/O cost that is almost 30 times higher than plan B. In this case, the optimiser will choose plan B to execute the SQL.
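The arithmetic behind Table 13.4 can be reproduced in a few lines. The sketch below uses the same simplifying assumptions as the text (no indexes, each row read costs one I/O) and the four row counts listed before the table; the variable and function names are illustrative only and do not describe how any real optimiser is implemented.

```python
# Row counts taken from the four items listed in the text.
PRODUCT_ROWS = 7000   # item 1 (known to the optimiser)
VENDOR_ROWS = 300     # item 2 (known to the optimiser)
SA_VENDORS = 10       # item 3 (assumed for illustration)
SA_PRODUCTS = 1000    # item 4 (assumed for illustration)

# Each entry is the number of rows read by one step (I/O cost of 1 per row).
plan_a = [
    PRODUCT_ROWS + VENDOR_ROWS,   # A1: read both tables to build the product
    PRODUCT_ROWS * VENDOR_ROWS,   # A2: read every row of the Cartesian product
    PRODUCT_ROWS,                 # A3: read the 7 000 rows that matched on V_CODE
]
plan_b = [
    VENDOR_ROWS,                  # B1: read VENDOR to keep the SA vendors
    PRODUCT_ROWS + SA_VENDORS,    # B2: read PRODUCT and the 10 SA vendors
    PRODUCT_ROWS * SA_VENDORS,    # B3: read every row of the smaller product
]


def total_io_cost(steps):
    """Total I/O cost of a plan is the sum of the cost of its steps."""
    return sum(steps)


cost_a = total_io_cost(plan_a)
cost_b = total_io_cost(plan_b)
cheapest = "A" if cost_a < cost_b else "B"
print(cost_a, cost_b, round(cost_a / cost_b, 1), cheapest)
```

Both plans return the same 1 000 rows (SA_PRODUCTS); the difference lies only in when the selective V_COUNTRY condition is applied. Filtering VENDOR first keeps the Cartesian product small, which is why plan B is so much cheaper.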

NOTE
Not all DBMSs optimise SQL queries the same way. As a matter of fact, Oracle parses queries differently from the way described in several sections in this chapter. Always read the documentation to examine the optimisation requirements for your DBMS implementation.

Given the right conditions, some queries could be answered entirely by using only an index. For example, assume the PRODUCT table and the index PQOH_NDX in the P_QOH attribute. Then a query such as SELECT MIN(P_QOH) FROM PRODUCT could be resolved by reading only the first entry in the PQOH_NDX index, without the need to access any of the data blocks for the PRODUCT table. (Remember that the index defaults to ascending order.)

You learnt in Chapter 11, Conceptual, Logical, and Physical Database Design, that columns with low sparsity are not good candidates for index creation. However, there are cases where an index in a low sparsity column would be helpful. For example, assume that the EMPLOYEE table has 122 483 rows. If you want to find out how many female employees are in the company, you could write a query such as:

SELECT COUNT(EMP_GENDER) FROM EMPLOYEE WHERE EMP_GENDER = 'F';

If you do not have an index for the EMP_GENDER column, the query would have to perform a full table scan to read all EMPLOYEE rows, and each full row includes attributes you do not need. However, if you have an index on EMP_GENDER, the query could be answered by reading only the index data, without the need to access the employee data at all.
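The index-only behaviour described for the EMP_GENDER query can be checked with a small experiment. This is a sketch that assumes SQLite (through Python's standard sqlite3 module) as the DBMS; SQLite labels an index that can answer a query on its own as a COVERING INDEX. The EMP_NUM and EMP_LNAME columns, the index name EMP_GENDER_NDX, the 1 000 sample rows and the helper plan_detail are assumptions made for the illustration.

```python
import sqlite3


def plan_detail(conn, sql):
    """Join the plan steps SQLite reports for a query into one string."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql)
    return " | ".join(row[3] for row in rows)


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE EMPLOYEE (EMP_NUM INTEGER, EMP_LNAME TEXT, EMP_GENDER TEXT)"
)
conn.executemany(
    "INSERT INTO EMPLOYEE VALUES (?, ?, ?)",
    [(i, f"Name{i}", "F" if i % 2 else "M") for i in range(1000)],
)

query = "SELECT COUNT(EMP_GENDER) FROM EMPLOYEE WHERE EMP_GENDER = 'F'"

# Without an index, the whole table must be read.
before = plan_detail(conn, query)
female_count = conn.execute(query).fetchone()[0]

# With an index on EMP_GENDER, the index alone holds everything the query needs.
conn.execute("CREATE INDEX EMP_GENDER_NDX ON EMPLOYEE (EMP_GENDER)")
after = plan_detail(conn, query)

print("before:", before)
print("after: ", after)
print("female employees:", female_count)
```

The query result is the same in both cases; only the access path changes, which is the point the text makes about low-sparsity columns.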

13.4.1 Using Hints to Affect Optimiser Choices

Although the optimiser generally performs well under most circumstances, in some instances the optimiser may not choose the best execution plan. Remember, the optimiser makes decisions based on the existing statistics. If the statistics are old, the optimiser may not do a good job in selecting the best execution plan. Even with current statistics, the optimiser choice may not be the most efficient one. There are some occasions when the end user would like to change the optimiser mode for the current SQL statement. In order to do that, you need to use hints. Optimiser hints are special instructions for the optimiser that are embedded inside the SQL command text. Table 13.5 summarises a few of the most common optimiser hints used in standard SQL.

TABLE 13.5 Optimiser hints

Hint           Usage
ALL_ROWS       Instructs the optimiser to minimise the overall execution time, that is, to minimise the time it takes to return all rows in the query result set. This hint is generally used for batch mode processes. For example:
               SELECT /*+ ALL_ROWS */ * FROM PRODUCT WHERE P_QOH < 10;
FIRST_ROWS     Instructs the optimiser to minimise the time it takes to process the first set of rows, that is, to minimise the time it takes to return only the first set of rows in the query result set. This hint is generally used for interactive mode processes. For example:
               SELECT /*+ FIRST_ROWS */ * FROM PRODUCT WHERE P_QOH < 10;
INDEX(name)    Forces the optimiser to use the P_QOH_NDX index to process this query. For example:
               SELECT /*+ INDEX(P_QOH_NDX) */ * FROM PRODUCT WHERE P_QOH < 10;

Now that you are familiar with the way the DBMS processes SQL queries, let's turn our attention to some general SQL coding recommendations to facilitate the work of the query optimiser.

13.5 SQL PERFORMANCE TUNING

SQL performance tuning is evaluated from the client perspective. Therefore, the goal is to illustrate some common practices used to write efficient SQL code. A few words of caution are appropriate:

1 Most current-generation relational DBMSs perform automatic query optimisation at the server end.

2 Most SQL performance optimisation techniques are DBMS-specific and, therefore, are rarely portable, even across different versions of the same DBMS. Part of the reason for this behaviour is the constant advancement in database technologies.

Does this mean that you should not worry about how a SQL query is written because the DBMS will always optimise it? No, because there is considerable room for improvement. (The DBMS uses general optimisation techniques, rather than focusing on specific techniques dictated by the special circumstances of the query execution.) A poorly written SQL query can, and usually will, bring the database system to its knees from a performance point of view. The majority of current database performance problems are related to poorly written SQL code. Therefore, although a DBMS provides general optimising services, a carefully written query almost always outperforms a poorly written one.

Although SQL data manipulation statements include many different commands (such as INSERT, UPDATE, DELETE and SELECT), most recommendations in this section are related to the use of the SELECT statement and, in particular, to the use of indexes and how to write conditional expressions.

13.5.1 Index Selectivity

Indexes are the most important technique used in SQL performance optimisation. As you learnt from Chapter 11, Conceptual, Logical, and Physical Database Design, index selectivity is a measure of how likely an index is to be used in query processing. To recap, the general guidelines for creating and using indexes are:

• Create indexes for each single attribute used in a WHERE, HAVING, ORDER BY or GROUP BY clause.
• Create an index on a column to which a MIN or MAX function is applied.
• Create an index when the data sparsity on the indexed column is high.
• Declare all primary and foreign keys so the optimiser can use the indexes in join operations.
• Declare indexes in join columns other than PK/FK.

However, you cannot always use an index to improve performance. For example, using the data shown in Table 13.6 in the next section, the creation of an index for P_MIN will not help the search condition P_QOH > P_MIN * 1.10. The reason is that indexes are ignored when you use functions in indexed attributes.

How many indexes should you create? It bears repeating that you should not create an index for every column in a table. Too many indexes will slow down INSERT, UPDATE and DELETE operations, especially if the table contains many thousands of rows. Furthermore, some query optimisers will choose only one index to be the driving index for a query, even if your query uses conditions in many different indexed columns. Which index does the optimiser use? If you use the cost-based optimiser, the answer will change with time as new rows are added to or deleted from the tables. In any case, you should create indexes in all search columns and then let the optimiser choose. A proper procedure for the evaluation of index usage is to monitor, test, evaluate and improve it if performance is not adequate.
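The P_MIN example can be tried on a small scale. The following is a minimal sketch, not the book's Oracle setup: it assumes Python's standard sqlite3 module, a hypothetical three-column PRODUCT table and an illustrative index name (PROD_MIN_NDX). SQLite's EXPLAIN QUERY PLAN plays the role of an access-plan display here; its wording differs from what Oracle reports.

```python
import sqlite3

# Hypothetical miniature PRODUCT table; only the columns named in the text.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE PRODUCT (P_CODE TEXT PRIMARY KEY, P_QOH INTEGER, P_MIN INTEGER)"
)
conn.executemany(
    "INSERT INTO PRODUCT VALUES (?, ?, ?)",
    [(f"P{i:03d}", i % 50, i % 20) for i in range(200)],
)
# Index name is illustrative, not taken from the book.
conn.execute("CREATE INDEX PROD_MIN_NDX ON PRODUCT (P_MIN)")


def plan(sql):
    """Return the access-plan detail lines SQLite reports for a statement."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return [row[3] for row in rows]  # the fourth column holds the plan text


# Indexed column compared with a literal: the index can drive the search.
simple_plan = plan("SELECT * FROM PRODUCT WHERE P_MIN = 5")

# Indexed column buried inside an expression: the P_MIN index is not used.
expr_plan = plan("SELECT * FROM PRODUCT WHERE P_QOH > P_MIN * 1.10")

print("literal comparison   :", simple_plan)
print("expression comparison:", expr_plan)
```

In this sketch the first plan mentions the index while the second falls back to a scan of the table, which is the behaviour the guideline warns about.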

13.5.2 Conditional Expressions

A conditional expression is normally expressed within the WHERE or HAVING clauses of a SQL statement. A conditional expression (also known as conditional criteria) restricts the output of a query to only the rows matching the conditional criteria. Generally, the conditional criteria have the form shown in Table 13.6.

TABLE 13.6 Conditional criteria

Operand1     Conditional Operator   Operand2
P_PRICE      >                      10.00
V_COUNTRY    =                      'SA'
V_CONTACT    LIKE                   'Moloi%'
P_QOH        >                      P_MIN * 1.10

As you examine Table 13.6, note that an operand can be:

• A simple column name such as P_PRICE or V_COUNTRY.
• A literal or a constant such as the value 10.00 or the text 'SA'.
• An expression such as P_MIN * 1.10.

Most of the query optimisation techniques mentioned next are designed to make the optimiser's work easier. Let's examine some common practices used to write efficient conditional expressions in SQL code:

• Use simple columns or literals as operands in a conditional expression; avoid the use of conditional expressions with functions whenever possible. Comparing the contents of a single column to a literal is faster than comparing to expressions. For example, P_PRICE > 10.00 is faster than P_QOH > P_MIN * 1.10 because the DBMS must evaluate the P_MIN * 1.10 expression first. The use of functions in expressions will also add to the total query execution time. For example, if your condition is UPPER (V_NAME) = 'JIM', try to use V_NAME = 'Jim' if all names in the V_NAME column are stored with proper capitalisation.

• Numeric field comparisons are faster than character, date and NULL comparisons. In search conditions, comparing a numeric attribute to a numeric literal is faster than comparing a character attribute to a character literal. In general, the CPU handles numeric comparisons (integer, decimal) faster than character and date comparisons. As indexes do not store references to null values, NULL conditions involve additional processing and, therefore, tend to be the slowest of all conditional operands.

• Equality comparisons are faster than inequality comparisons. As a general rule, equality comparisons are processed faster than inequality comparisons. For example, P_PRICE = 10.00 is processed faster because the DBMS can do a direct search using the index in the column. If there are no exact matches, the condition is evaluated as false. However, if you use an inequality symbol (>, >=, <, <=), the DBMS must perform additional processing to complete the request. The reason is that there will almost always be more "greater than" or "less than" values than exactly "equal" values in the index. The slowest of all comparison operators (with the exception of NULL) is LIKE with wildcard symbols, as in V_CONTACT LIKE '%glo%'. Also, using the "not equal" symbol (<>) yields slower searches, especially when the sparsity of the data is high, that is, when there are many more different values than there are equal values.

• Whenever possible, transform conditional expressions to use literals. For example, if your condition is P_PRICE - 10 = 7, change it to read P_PRICE = 17.

• When using multiple conditional expressions, write the equality conditions first. If you have a composite condition such as:
  P_QOH < P_MIN AND P_MIN = P_REORDER AND P_QOH = 10
  change it to read:
  P_QOH = 10 AND P_MIN = P_REORDER AND P_MIN > 10
  Remember, equality conditions are faster to process than inequality conditions. Although most RDBMSs will automatically do this for you, paying attention to this detail lightens the load for the query optimiser. (The optimiser won't have to do what you have already done.)

• If you use multiple AND conditions, write the condition most likely to be false first. If you use this technique, the DBMS will stop evaluating the rest of the conditions as soon as it finds a conditional expression that is evaluated as false. Remember, for multiple AND conditions to be found true, all conditions must be evaluated as true. If one of the conditions evaluates to false, everything else will be evaluated as false. Therefore, if you use this technique, the DBMS won't waste time unnecessarily evaluating additional conditions. Naturally, the use of this technique implies an implicit knowledge of the sparsity of the data set. For example, look at the following condition list:
  P_PRICE > 10 AND V_COUNTRY = 'SA'
  If you know that only a few vendors are located in South Africa, you could rewrite this condition as:
  V_COUNTRY = 'SA' AND P_PRICE > 10

• When using multiple OR conditions, put the condition most likely to be true first. By doing this, the DBMS will stop evaluating the remaining conditions as soon as it finds a conditional expression that is evaluated as true. Remember, for multiple OR conditions to evaluate to true, only one of the conditions must be evaluated as true.

NOTE
Oracle evaluates queries in an opposite way from what is described here. That is, Oracle evaluates conditions from last to first.

• Whenever possible, try to avoid the use of the NOT logical operator. It is best to transform a SQL expression containing a NOT logical operator into an equivalent expression. For example:
  NOT (P_PRICE > 10.00) can be written as P_PRICE <= 10.00.
  Also, NOT (EMP_GENDER = 'M') can be written as EMP_GENDER = 'F'.
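Two of the rewrites above (moving arithmetic to the literal side, and removing NOT) can be checked for equivalence on sample data. The sketch below is only an illustration under stated assumptions: it uses Python's standard sqlite3 module rather than Oracle, and the PRODUCT rows and the helper name codes are invented for the example. It confirms that the rewritten conditions select the same rows; it does not measure speed.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE PRODUCT (P_CODE TEXT PRIMARY KEY, P_PRICE REAL)")
# Invented sample rows, including one NULL price.
conn.executemany(
    "INSERT INTO PRODUCT VALUES (?, ?)",
    [("A1", 5.00), ("B2", 10.00), ("C3", 17.00), ("D4", 25.50), ("E5", None)],
)


def codes(where):
    """Return the sorted product codes selected by a WHERE condition."""
    sql = "SELECT P_CODE FROM PRODUCT WHERE " + where + " ORDER BY P_CODE"
    return [row[0] for row in conn.execute(sql)]


# Each pair: (condition as first written, rewritten equivalent form).
rewrites = [
    ("P_PRICE - 10 = 7", "P_PRICE = 17"),
    ("NOT (P_PRICE > 10.00)", "P_PRICE <= 10.00"),
]

for original, rewritten in rewrites:
    same = codes(original) == codes(rewritten)
    print(f"{original!r} -> {rewritten!r}: same rows = {same}")
```

Note that the row with a NULL price is excluded by both forms of the NOT example, so the rewrite does not change the result there either.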

13.6 QUERY FORMULATION

Queries are usually written to answer questions. For example, if an end user gives you a sample output and tells you to match that output format, you must write the required SQL. To get the job done, you must carefully evaluate which columns, tables and computations are required to generate the desired output. To do that, you must have a good understanding of the database environment and of the database that will be the focus of your SQL code.

This section will focus on SELECT queries because they are the queries you will find in most applications. To formulate a query, you would normally follow the steps outlined below:

1 Identify which columns and computations are required. The first step is to clearly determine which data values you want to return. Do you want to return just the names and addresses, or do you also want to include some computations? Remember that all columns in the SELECT statement should return single values.
  • Do you need simple expressions? That is, do you need to multiply the price times the quantity on hand to generate the total inventory cost? You may need some single attribute functions such as DATE(), SYSDATE() or ROUND().
  • Do you need aggregate functions? If you need to compute the total sales by product, you should use a GROUP BY clause. In some cases, you may need to use a subquery.
  • Determine the granularity of the raw data required for your output. The granularity is the level of detail of the data. Data at the lowest level of granularity is known as atomic data. You will learn more about granularity in Chapter 15, Databases for Business Intelligence. Sometimes, you may need to summarise data that are not readily available on any table. In such cases, you may consider breaking the query into multiple subqueries and storing those subqueries as views. Then you could create a top-level query that joins those views and generates the final output.

2 Identify the source tables. Once you know which columns are required, you can determine the source tables used in the query. Some attributes appear in more than one table. In those cases, try to use the fewest tables in your query to minimise the number of join operations.

3 Determine how to join the tables. Once you know which tables you need in your query statement, you must properly identify how to join the tables. In most cases, you will use some type of natural join, but in some instances, you may need to use an outer join.

4 Determine which selection criteria are needed. Most queries involve some type of selection criteria. In this case, you must determine which operands and operators are needed in your criteria. Ensure that the data type and granularity of the data in the comparison criteria are correct.
  • Simple comparison. In most cases, you will be comparing single values. For example:
    P_PRICE > 10
  • Single value to multiple values. If you are comparing a single value to multiple values, you may need to use an IN comparison operator. For example:
    V_COUNTRY IN ('FR', 'UK', 'SA');
  • Nested comparisons. In other cases, you may need to have some nested selection criteria involving subqueries. For example:
    P_PRICE >= (SELECT AVG(P_PRICE) FROM PRODUCT);
  • Grouped data selection. On other occasions, the selection criteria may apply not to the raw data, but to the aggregate data. In those cases, you need to use the HAVING clause.

5 Determine in which order to display the output. Finally, the required output may be ordered by one or more columns. In those cases, you need to use the ORDER BY clause.

13.7 DBMS PERFORMANCE TUNING

DBMS performance tuning includes global tasks such as managing the DBMS processes in primary memory (allocating memory space for caching purposes) and managing the structures in physical storage (allocating space for the data files).

Fine-tuning the performance of the DBMS also includes applying several practices examined in the previous section. For example, the DBA must work with developers to ensure that the queries perform as expected. The DBA is responsible for creating the indexes to speed up query response time and for generating the database statistics required by cost-based optimisers.

DBMS performance tuning at the server end focuses on setting the parameters used for:

• Data cache. The data cache must be set large enough to permit as many data requests as possible to be serviced from the cache. Each DBMS has settings that control the size of the data cache. The data cache is shared among all database users. The majority of primary memory resources will be allocated to the data cache.
• SQL cache. The SQL cache stores the most recently executed SQL statements (after the SQL statements have been parsed by the optimiser). Generally, if you have an application with multiple users accessing a database, the same query will likely be submitted by many different users. In these cases, the DBMS will parse the query only once and execute it many times, using the same access plan. In that way, the second and subsequent SQL requests for the same query are served from the SQL cache, skipping the parsing phase.

• Sort cache. The sort cache is used as a temporary storage area for ORDER BY or GROUP BY operations, as well as for index-creation functions.

• Optimiser mode. Most DBMSs operate in one of two optimisation modes: cost-based or rule-based. Others automatically determine the optimisation mode based on whether database statistics are available. For example, the DBA is responsible for generating the database statistics that are used by the cost-based optimiser. If the statistics are not available, the DBMS uses a rule-based optimiser.

From the performance point of view, it would be optimal to have the entire database stored in primary memory to minimise costly disk access. This is why several database vendors offer in-memory database options for their main products. In-memory database systems are optimised to store large portions (if not all) of the database in primary (RAM) storage rather than secondary (disk) storage. These systems are becoming popular because of increased performance demands of modern database applications (such as Business Analytics and Big Data), diminishing costs, and technology advances of components (such as flash memory and solid state drives). Even though this type of database eliminates disk access bottlenecks, such databases are still subject to query optimisation and performance tuning rules, especially when faced with poorly designed databases or poorly written SQL statements.

Although in-memory databases are carving a niche in selected markets, most database implementations still rely on data stored on disk drives. That is why managing the physical storage details of the data files plays an important role in DBMS performance tuning. Note the following general recommendations for physical storage of databases:

• Use I/O accelerators. This type of device uses flash solid state drives (SSDs) to store the database. An SSD does not have any moving parts and, therefore, performs I/O operations at a higher speed than traditional rotating disk drives. I/O accelerators deliver high transaction performance rates and reduce contention caused by typical storage drives.

• Use RAID (Redundant Array of Independent Disks) to provide balance between performance improvement and fault tolerance. Fault tolerance means that, in case of failure, data could be reconstructed and retrieved. RAID systems use multiple disks to create virtual disks (storage volumes) formed by several individual disks. Table 13.7 shows the most common RAID configurations.

TABLE 13.7 Common RAID configurations

RAID Level   Description
0            The data blocks are spread over separate drives. Also known as a striped array. Provides increased performance but no fault tolerance. Requires a minimum of two drives.
1            The same data blocks are written (duplicated) to separate drives. Also referred to as mirroring or duplexing. Provides increased read performance and fault tolerance via data redundancy. Requires a minimum of two drives.
3            The data are striped across separate drives, and parity data are computed and stored in a dedicated drive. (Parity data are specially generated data that permit the reconstruction of corrupted or missing data.) Provides good read performance and fault tolerance via parity data. Requires a minimum of three drives.
5            The data and the parity data are striped across separate drives. Provides good read performance and fault tolerance via parity data. Requires a minimum of three drives.
1+0          The data blocks are spread over separate drives and mirrored. This arrangement provides both speed and fault tolerance. This is the recommended RAID configuration for most database installations (if cost is not an issue).

• Minimise disk contention. Use multiple, independent storage volumes with independent spindles (a spindle is a rotating disk) to minimise hard disk cycles. Remember, a database is composed of many table spaces, each with a particular function. In turn, each table space is composed of several data files (in which the data are actually stored). A database should have at least the following table spaces:
  - System table space. Used to store the data dictionary tables. It is the most frequently accessed table space and should be stored in its own volume.
  - User data table space. Used to store end-user data. You should create as many user data table spaces and data files as are required. You can create and assign a different user data table space for each application and/or for each group of users.
  - Index table space. Used to store indexes. You can create and assign a different index table space for each application and/or for each group of users. The index table space data files should be stored on a storage volume that is separate from user data files or system data files.
  - Temporary table space. Used as a temporary storage area for merge, sort or set aggregate operations. You can create and assign a different temporary table space for each application and/or for each group of users.
  - Rollback segment table space. Used for transaction-recovery purposes.

• Put high-usage tables in their own table spaces. By doing this, the database minimises conflict with other tables.

• Take advantage of the various table storage organisations available in the database. For example, in Oracle, consider the use of index-organised tables (IOT); in SQL Server, consider clustered index tables. An index-organised table (or clustered index table) is a table that stores the end-user data and the index data in consecutive locations on permanent storage. This type of storage organisation provides a performance advantage to tables that are commonly accessed through a given index order, because the index contains the index key as well as the data rows. Therefore, the DBMS tends to perform fewer I/O operations.

• Assign separate files in separate storage volumes for the indexes, system and high-usage tables. This ensures that index operations will not conflict with end-user data or data dictionary table access operations.

• Partition tables based on usage. Some RDBMSs support horizontal partitioning of tables based on attributes. (See Chapter 14, Distributed Databases.) By doing so, a single SQL request could be processed by multiple data processors. Put the table partitions closest to where they are used the most.

• Use denormalised tables where appropriate. Another performance-improving technique involves taking a table from a higher normal form to a lower normal form; typically, from third to second normal form. This technique causes data duplication, but it minimises join operations. (Denormalisation was discussed in Chapter 7, Normalising Database Designs.)

• Store computed and aggregate attributes in tables. In short, use derived attributes in your tables. For example, you might add the invoice subtotal, the amount of tax and the total in the INVOICE table. Using derived attributes minimises computations in queries and join operations.

13.8 QUERY OPTIMISATION EXAMPLE

Now that you have learnt the basis of query optimisation, you are ready to test your new knowledge. Let's use a simple example to illustrate how the query optimiser works and how you can help it do its work. The example is based on the QOVENDOR and QOPRODUCT tables. Those tables are similar to the ones that you used in previous chapters. However, the QO prefix is used for the table name to ensure you do not overwrite previous tables.

Online Content
The databases and scripts used in this chapter can be found on the online platform for this book.

To perform this query optimisation illustration, you will be using the Oracle SQL*Plus interface. Some preliminary work must be done before you can start testing query optimisation. The following steps will guide you through this preliminary work:

1 Log in to Oracle SQL*Plus using the username and password provided by your instructor.

2 Create a fresh set of tables, using the QRYOPTDATA.SQL script file located on the online platform for this book. This step is necessary so that Oracle has a new set of tables and the new tables contain no statistics. At the SQL> prompt, type:
  @path\QRYOPTDATA.SQL
  where path is the location of the file in your computer.

3 Create the PLAN_TABLE. The PLAN_TABLE is a special table used by Oracle to store the access plan information for a given query. End users can then query the PLAN_TABLE to see how Oracle will execute the query. To create the PLAN_TABLE, run the UTLXPLAN.SQL script file located in the RDBMS\ADMIN folder of your Oracle RDBMS installation. The UTLXPLAN.SQL script file is also found on the online platform for this book. At the SQL> prompt, type:
  @path\UTLXPLAN.SQL

You use the EXPLAIN PLAN command to store the execution plan of a SQL query in the PLAN_TABLE. Then, you would use the SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY) command to display the access plan for a given SQL statement.

NOTE
Oracle, MySQL and SQL Server all default to cost-based optimisation. In Oracle, if table statistics are not available, the DBMS will fall back to a rule-based optimiser. The examples in this section were generated using Oracle 11g through the SQL*Plus interface. The examples will give different outputs depending on the version of Oracle you are using.

To see the access plan used by the DBMS to execute your query, use the EXPLAIN PLAN statement, followed by the SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY) command to display the access plan for a given SQL statement, as shown in Figure 13.3. Note that the first SQL statement generates the statistics for the QOVENDOR table. Also, the initial access plan in Figure 13.3 uses a full table scan on the QOVENDOR table, and the cost of the plan is 3.

FIGURE 13.3 INITIAL EXPLAIN PLAN (Oracle 11g)

Let's now create an index on V_AREACODE (note that V_AREACODE is used in the ORDER BY clause) and see how that affects the access plan generated by the cost-based optimiser. The results are shown in Figure 13.4.

FIGURE 13.4 EXPLAIN PLAN after index on V_AREACODE (Oracle 11g)

As you examine Figure 13.4, note that the new access plan cuts the cost of executing the query by half! Also note that this new plan scans the QOV_NDX1 index and accesses the QOVENDOR rows, using the index row ID. (Remember that access by row ID is one of the fastest access methods.) In this case, the creation of the QOV_NDX1 index had a positive impact on overall query optimisation results.
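The effect of an index on an ORDER BY column can be reproduced in miniature outside Oracle. The sketch below is a stand-in under stated assumptions, not the book's script: it uses Python's standard sqlite3 module, a hypothetical three-column QOVENDOR table with invented rows, and a simplified query. SQLite's EXPLAIN QUERY PLAN reports plan steps rather than Oracle-style cost figures, so only the change in access path is shown.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical cut-down QOVENDOR table; the real script has more columns.
conn.execute(
    "CREATE TABLE QOVENDOR (V_CODE INTEGER PRIMARY KEY, V_NAME TEXT, V_AREACODE TEXT)"
)
conn.executemany(
    "INSERT INTO QOVENDOR VALUES (?, ?, ?)",
    [(i, f"Vendor {i}", str(600 + i % 40)) for i in range(1, 501)],
)

# Simplified query for illustration: sort the vendors by area code.
QUERY = "SELECT * FROM QOVENDOR ORDER BY V_AREACODE"


def plan(sql):
    """Return the access-plan detail lines SQLite reports for a statement."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return [row[3] for row in rows]


# Without an index: a table scan followed by a separate sorting step.
plan_before = plan(QUERY)

# Same index name as in the text, created on the ORDER BY column.
conn.execute("CREATE INDEX QOV_NDX1 ON QOVENDOR (V_AREACODE)")

# With the index: rows are read in index order, so no sort step is needed.
plan_after = plan(QUERY)

print("before:", plan_before)
print("after :", plan_after)
```

The plan before the index includes a temporary sorting structure; the plan after it reads the rows through QOV_NDX1 instead, which mirrors the change described for Figure 13.4.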

At other times, indexes do not necessarily help in query optimisation. This is the case when you have indexes on small tables or when the query accesses a high percentage of table rows anyway. Let's see what happens when you create an index on V_NAME. The new access plan is shown in Figure 13.5. (Note that V_NAME is used on the WHERE clause as a conditional expression operand.)

FIGURE 13.5 EXPLAIN PLAN after index on V_NAME (Oracle 11g)

As you can see in Figure 13.5, creation of the second index did not help the query optimisation. However, there are occasions when an index could be used by the optimiser, but it is not selected because of the way in which the query is written. For example, Figure 13.6 shows the access plan for a different query using the V_NAME column.

Copyright Editorial

review

2020 has

Cengage deemed

Learning. that

any

All suppressed

Rights

Reserved. content

does

May not

not materially

be

copied, affect

scanned, the

overall

or

duplicated, learning

in experience.

whole

or in Cengage

part.

Due Learning

to

electronic reserves

rights, the

right

some to

third remove

party additional

content

may content

be

suppressed at

any

time

from if

the

subsequent

eBook rights

and/or restrictions

eChapter(s). require

it

696

PART

V Database

Transactions

FIGURE 13.6

and

Performance

Tuning

ACCESSPLAN using index

on V_NAME (Oracle 11g)

13

In Figure 13.6, note that the access plan for this new query uses the QOV_NDX2 index on the V_NAME column. Lets now use the table QOPRODUCT to demonstrate how an index can help when aggregate function

queries

using the cost of 3.

Copyright Editorial

review

2020 has

Cengage deemed

Learning. that

any

are being run.

For example,

Figure 13.7 shows the access

MAX(P_PRICE) aggregate function.

All suppressed

Rights

Reserved. content

does

May not

not materially

be

copied, affect

scanned, the

overall

or

duplicated, learning

in experience.

Note that this

whole

or in Cengage

part.

Due Learning

to

electronic reserves

plan for a SELECT statement

plan uses a full table

rights, the

right

some to

third remove

party additional

content

may content

be

scan

suppressed at

any

time

from if

with a total

the

subsequent

eBook rights

and/or restrictions

eChapter(s). require

it

CHAPTER

FIGURE 13.7

13

FIRST EXPLAIN PLAN: aggregate function

Managing

Database

and

SQL Performance

697

on a non-indexed column

13

That cost is very low already, but could you improve it? Yes, you could improve the previous query performance by creating an index on P_PRICE. Figure 13.8 shows how the plan cost is reduced by two-thirds after the index is created and the QOPRODUCT table is analysed. Also note that the second version of the access plan uses only the index QOP_NDX2 to answer the query; the QOPRODUCT table is never accessed.

FIGURE 13.8 Second explain plan: aggregate function on an indexed column
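The before-and-after comparison described for the MAX(P_PRICE) query can be reproduced in miniature. The following is a sketch under assumptions, not the book's Oracle session: it uses Python's sqlite3 module, the QOPRODUCT columns other than P_PRICE and the sample rows are made up for illustration, and SQLite's ANALYZE command stands in for analysing the table. SQLite reports plan steps rather than Oracle-style cost figures, so only the change in access path is shown.

```python
import sqlite3


def plan(conn, sql):
    """Return the 'detail' text of each step SQLite reports for the query."""
    return [row[3] for row in conn.execute("EXPLAIN QUERY PLAN " + sql)]


conn = sqlite3.connect(":memory:")
# Hypothetical product table; P_PRICE and the index name come from the text.
conn.execute(
    "CREATE TABLE QOPRODUCT (P_CODE TEXT PRIMARY KEY, P_DESCRIPT TEXT, P_PRICE REAL)"
)
conn.executemany(
    "INSERT INTO QOPRODUCT VALUES (?, ?, ?)",
    [(f"P{i:05d}", f"Product {i}", 1.0 + (i % 500)) for i in range(5000)],
)

query = "SELECT MAX(P_PRICE) FROM QOPRODUCT"

# First plan: no index on P_PRICE, so every row must be read.
before = plan(conn, query)

# Create the index and refresh the statistics, then ask for the plan again.
conn.execute("CREATE INDEX QOP_NDX2 ON QOPRODUCT (P_PRICE)")
conn.execute("ANALYZE")
after = plan(conn, query)

highest_price = conn.execute(query).fetchone()[0]

print("before index:", before)
print("after index: ", after)
print("MAX(P_PRICE):", highest_price)
```

The second plan names QOP_NDX2, which mirrors the point made above: once the column is indexed, the aggregate can be answered from the index alone.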

Although the few examples in this section show how important proper index selection is for query optimisation, you also saw examples in which index creation does not improve query performance. As a DBA, you should be aware that the main goal is to optimise overall database performance, not just for a single query, but for all requests and query types. Most database systems provide advanced graphical tools for performance monitoring and testing.

SUMMARY

Database performance tuning refers to a set of activities and procedures designed to ensure that an end-user query is processed by the DBMS in the minimum amount of time.

SQL performance tuning refers to the activities on the client side designed to generate SQL code that returns the correct answer in the least amount of time, using the fewest resources at the server end.

DBMS performance tuning refers to activities on the server side orientated to ensure that the DBMS is properly configured to respond to clients' requests in the fastest way possible while making optimum use of existing resources.

Database statistics refers to a number of measurements gathered by the DBMS that describe a snapshot of the database objects' characteristics. The DBMS gathers statistics about objects such as tables, indexes, and available resources such as number of processors used, processor speed and temporary space available. The DBMS uses the statistics to make critical decisions about improving the query processing efficiency.

The DBMS processes queries in three phases:
• Parsing. The DBMS parses the SQL query and chooses the most efficient access/execution plan.
• Execution. The DBMS executes the SQL query, using the chosen execution plan.
• Fetching. The DBMS fetches the data and sends the result set back to the client.

Indexes are crucial to the process that speeds up data access and should be carefully selected during physical database design in order to facilitate the searching, sorting and use of aggregate functions and join operations.

During query optimisation, the DBMS must choose which indexes to use, how to perform join operations, which table to use first, and so on. Each DBMS has its own algorithms for determining the most efficient way to access the data. The two most common approaches are rule-based optimisation and cost-based optimisation.

A rule-based optimiser uses a set of preset rules and points to determine the best approach to execute a query. The rules assign a fixed cost to each SQL operation; the costs are then added to yield the cost of the execution plan.

A cost-based optimiser uses sophisticated algorithms based on the statistics about the objects being accessed to determine the best approach to execute a query. In this case, the optimiser process adds up the processing cost, the I/O costs and the resource costs (RAM and temporary space) to come up with the total cost of a given execution plan.
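The two ways of arriving at a plan cost can be sketched in a few lines. This is a toy model only: the operation names, the fixed rule costs and the estimated figures are invented for illustration and do not come from any real DBMS, whose optimiser is far more elaborate.

```python
from dataclasses import dataclass

# Hypothetical fixed costs that a rule-based optimiser might assign per operation.
RULE_COSTS = {
    "full_table_scan": 10,
    "index_scan": 4,
    "table_access_by_rowid": 1,
    "sort": 3,
}


def rule_based_cost(operations):
    """Add up the fixed cost of every operation in the plan."""
    return sum(RULE_COSTS[op] for op in operations)


@dataclass
class PlanEstimate:
    """Statistics-derived estimates for one candidate plan (made-up units)."""
    name: str
    processing_cost: float
    io_cost: float
    resource_cost: float  # RAM and temporary space

    @property
    def total_cost(self):
        # Cost-based view: processing + I/O + resource costs.
        return self.processing_cost + self.io_cost + self.resource_cost


def choose_plan(candidates):
    """Pick the candidate plan with the lowest total estimated cost."""
    return min(candidates, key=lambda p: p.total_cost)


# Rule-based: the cost depends only on which operations the plan contains.
scan_plan_cost = rule_based_cost(["full_table_scan", "sort"])
index_plan_cost = rule_based_cost(["index_scan", "table_access_by_rowid", "sort"])

# Cost-based: the cost depends on estimates derived from the statistics.
candidates = [
    PlanEstimate("full table scan", processing_cost=2.0, io_cost=40.0, resource_cost=1.0),
    PlanEstimate("index range scan", processing_cost=1.0, io_cost=6.0, resource_cost=0.5),
]
best = choose_plan(candidates)

print(scan_plan_cost, index_plan_cost)
print(best.name, best.total_cost)
```

The rule-based figures never change with the data, whereas the cost-based choice would change if fresh statistics produced different estimates for the same two plans.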

Hints are used to change the optimiser mode for the current SQL statement. Hints are special instructions for the optimiser that are embedded inside the SQL command text.

SQL performance tuning deals with writing queries that make good use of the statistics. In particular, queries should make good use of indexes. Indexes are very useful when you want to select a small subset of rows from a large table based on a condition. When an index exists for the column used in the selection, the DBMS may choose to use it. The objective is to create indexes with high selectivity. Index selectivity is a measure of how likely an index will be used in query processing.

Query formulation deals with how to translate business questions into specific SQL code to generate the required results. To do this, you must carefully evaluate which columns, tables and computations are required to generate the desired output.

DBMS performance tuning includes tasks such as managing the DBMS processes in primary memory (allocating memory for caching purposes) and the structures in physical storage (allocating space for the data files).

KEY TERMS

access/execution plan
automatic query optimisation
cluster-indexed table
cost-based optimiser
data cache or buffer cache
data files
database performance tuning
DBMS performance tuning
dynamic statistical generation mode
dynamic query optimisation
extents
index-organised table
index selectivity
in-memory database
input/output (I/O) request
I/O accelerators
manual statistical generation mode
optimiser hints
query optimiser
query processing bottleneck
RAID
rule-based optimiser
rule-based query optimisation algorithm
SQL cache or procedure cache
SQL performance tuning
static query optimisation
statistically based query optimisation algorithm
table space or file group

Online Content
Answers to selected Review Questions and Problems for this chapter are contained on the online platform accompanying this book.

FURTHER READING

Fritchey, G. SQL Server 2017 Query Performance Tuning: Troubleshoot and Optimize Query Performance, 5th edition. Apress, 2018.
Niemiec, R. Oracle Database 12c Release 2 Performance Tuning Tips & Techniques, Oracle Press, 2017.

REVIEW QUESTIONS

1 What is SQL performance tuning?
2 What is database performance tuning?
3 What is the focus of most performance-tuning activities, and why does that focus exist?
4 What are database statistics, and why are they important?
5 How are database statistics obtained?
6 Which database statistics measurements are typical of tables, indexes and resources?
7 How is the processing of SQL DDL statements (such as CREATE TABLE) different from the processing required by DML statements?
8 In simple terms, the DBMS processes queries in three phases. What are those phases, and what is accomplished in each phase?
9 If indexes are so important, why not index every column in every table?
10 What is the difference between a rule-based optimiser and a cost-based optimiser?
11 What are optimiser hints, and how are they used?
12 Most of the query optimisation techniques are designed to make the optimiser's work easier. Which factors should you keep in mind if you intend to write conditional expressions in SQL code?
13 Which recommendations would you make for managing the data files in a DBMS with many tables and indexes?
14 What does RAID stand for, and what are some commonly used RAID levels?

PROBLEMS

Find the solutions to Problems 1 to 3 based on the following query:

    SELECT   EMP_LNAME, EMP_FNAME, EMP_AREACODE, EMP_GENDER
    FROM     EMPLOYEE
    WHERE    EMP_GENDER = 'F' AND EMP_AREACODE = '0181'
    ORDER BY EMP_LNAME, EMP_FNAME;

1 What is the likely data sparsity of the EMP_GENDER column?
2 Which indexes should you create? Write the required SQL commands.
3 Using Table 13.4 as an example, create two alternative access plans. Use the following assumptions:
  a There are 8 000 employees.
  b There are 4 150 female employees.
  c There are 370 employees in area code 0181.
  d There are 190 female employees in area code 0181.

Problems 4 to 6 are based on the following query:

    SELECT   EMP_LNAME, EMP_FNAME, EMP_DOB, YEAR(EMP_DOB) AS YEAR
    FROM     EMPLOYEE
    WHERE    YEAR(EMP_DOB) = 1976;

4 What is the likely data sparsity of the EMP_DOB column?
5 Should you create an index on EMP_DOB? Why or why not?
6 What type of database I/O operations will likely be used by the query? (See Table 13.3.)

Problems 7 to 10 are based on the ER model shown in Figure P13.1.

FIGURE P13.1 The Ch11-SaleCo ER model

Given the following query:

    SELECT   P_CODE, P_PRICE
    FROM     PRODUCT
    WHERE    P_PRICE >= (SELECT AVG(P_PRICE) FROM PRODUCT);

7 Assuming that there are no table statistics, what type of optimisation will the DBMS use?
8 What type of database I/O operations will likely be used by the query? (See Figure P13.1.)
9 What is the likely data sparsity of the P_PRICE column?
10 Should you create an index? Why or why not?

Problems 11 to 14 are based on the following query:

    SELECT   P_CODE, SUM(LINE_UNITS)
    FROM     LINE
    GROUP BY P_CODE
    HAVING   SUM(LINE_UNITS) > (SELECT MAX(LINE_UNITS) FROM LINE);

11 What is the likely data sparsity of the LINE_UNITS column?
12 Should you create an index? If so, what would the index column(s) be, and why would you create the index? If not, explain your reasoning.
13 Discuss whether or not you should create an index on P_CODE. Justify your answer.
14 Write the command to create statistics for this table.

Problems 15 to 19 are based on the following query:

    SELECT   V_CODE, V_NAME, V_CONTACT, V_COUNTRY
    FROM     VENDOR
    WHERE    V_COUNTRY = 'UK'
    ORDER BY V_NAME;

15 Which indexes should you create and why? Write the SQL command to create the indexes.
16 Assume that 10 000 vendors are distributed as shown in the table below. What percentage of rows will be returned by the query?

    Country   Number of Vendors     Country   Number of Vendors
    AB        15                    HG        47
    AN        55                    IC        358
    AU        100                   IT        25
    BE        3244                  LV        645
    BL        345                   LC        16
    BH        995                   LT        821
    BU        75                    LX        62
    CR        68                    MC        425
    CY        89                    MO        12
    CR        12                    MN        65
    DK        19                    NL        74
    ES        45                    NW        113
    FI        29                    PL        589
    FR        208                   SA        36
    GM        745                   UK        375
    GR        35                    VC        258

17 What type of I/O database operations would most likely be used to execute that query?
18 Using Table 13.4 as an example, create two alternative access plans.

19 Assume that you have 10 000 different products stored in the PRODUCT table and that you are writing a Web-based interface to list all products with a quantity on hand (P_QOH) that is less than or equal to the minimum quantity, P_MIN. Which optimiser hint would you use to ensure that your query returns the result set to the Web interface in the least time possible? Write the SQL code.

Problems 20 to 21 are based on the following query:

    SELECT   P_CODE, P_DESCRIPT, P_PRICE, P.V_CODE, V_COUNTRY
    FROM     PRODUCT P, VENDOR V
    WHERE    P.V_CODE = V.V_CODE
    AND      V_COUNTRY = 'UK'
    AND      V_AREACODE = '0181'
    ORDER BY P_PRICE;

20 Which indexes would you recommend?
21 Write the command(s) used to generate the statistics for the PRODUCT and VENDOR tables.

Problems 22 and 23 are based on the following query:

    SELECT   P_CODE, P_DESCRIPT, P_QOH, P_PRICE, V_CODE
    FROM     PRODUCT
    WHERE    V_CODE = '21344'
    ORDER BY P_CODE;

22 Which index would you recommend, and which command would you use?
23 How should you rewrite the query to ensure that it uses the index you created in your solution to Problem 22?

Problems 24 and 25 are based on the following query:

    SELECT   P_CODE, P_DESCRIPT, P_QOH, P_PRICE, V_CODE
    FROM     PRODUCT
    WHERE    P_QOH < P_MIN
    AND      P_MIN = P_REORDER
    AND      P_REORDER = 50
    ORDER BY P_QOH;

24 Use the recommendations given in Section 13.5.2 to rewrite the query to produce the required results more efficiently.

25 Which indexes would you recommend?

Problems 26 to 29 are based on the following query:

    SELECT   CUS_CODE, MAX(LINE_UNITS*LINE_PRICE)
    FROM     CUSTOMER NATURAL JOIN INVOICE NATURAL JOIN LINE
    WHERE    CUS_AREACODE = '0181'
    GROUP BY CUS_CODE;

26 Assuming that you generate 15 000 invoices per month, what recommendation would you give the designer about the use of derived attributes?
27 Assuming that you follow the recommendations you gave in Problem 26, how would you rewrite the query?
28 Which indexes would you recommend for the query you wrote in Problem 27, and what SQL commands would you use?
29 How would you rewrite the query to ensure that the index you created in Problem 28 is used?

PART VI
Database Management

14 Distributed Databases
15 Databases for Business Intelligence
16 Big Data and NoSQL
17 Database Connectivity and Web Technologies


BUSINESS VIGNETTE
The Facebook-Cambridge Analytica Data Scandal and the GDPR

In 2018, Facebook faced international investigations into illegally collecting users' personal data. The data was collected by Cambridge Analytica, which was a political consultation company that supported President Trump's 2016 election campaign. It was suggested that Cambridge Analytica had collected data from up to 87 million users across the globe and then used this data: firstly, to profile the candidate people were likely to vote for in the US election, and secondly, to target advertisements at users to try to influence who they would vote for. The data was collected through a Facebook app called thisisyourdigitallife, where users consented to take part in a personality study. However, the app also extracted personal data from linked Facebook friends without their consent. All the data obtained was then used without their knowledge to develop a software program to influence the US elections, which was sold to Trump campaigners. The major concern, even today, is that Facebook does not know which data the app shared with Cambridge Analytica. In 2019, lawsuits against Facebook continue, with US judges requesting that all Facebook's data privacy records be made available after the company's lawyers argued that users have no expectation of privacy.

What this scandal demonstrated was the power of Big Data analytics and how collecting personal data to profile individuals for the purpose of automated profiling could be used to mislead people and generate fake news. It raises a debate about giving a company consent to collect your personal data and what exactly this data will be used for. In the field of data mining, where hidden patterns are discovered in data that can be used to make inferences about a person's behaviour and perform predictive analytics, new knowledge can be discovered about a person that he or she does not even know about. So, this raises the questions: Who owns this knowledge, and was consent ever obtained to use it for a purpose unknown at the time of collection?

Better protection for users of data is now in place, thanks to the General Data Protection Regulation (GDPR),¹ which became a legal requirement from 25 May 2018 for all organisations in Europe that collect and process data. One of the major changes detailed in Article 22 of the GDPR includes the rights of an individual not to be subject to automated decision making, which includes profiling, unless explicit consent is given. Article 4(4) of the GDPR defines which forms of


data processing could be considered profiling. This includes, for example, any form of automated processing of personal data utilising personal data to evaluate certain personal aspects relating to a natural person, especially analysing or predicting aspects concerning that natural person's performance at work, economic situation, health, personal preferences, interests, reliability, behaviour, location or movements.¹ Recital 71 provides a definition of what is meant by the term profiling in relation to the data subject's performance at work, economic situation, health, personal preferences or interests, reliability or behaviour, location or movements, where it produces legal effects concerning him or her or similarly significantly affects him or her.

Given that the GDPR applies to all organisations and companies that process the personal data of European Union citizens, an organisation that breaches this regulation faces a penalty fine of up to 4 per cent of annual global turnover or €20 million (whichever is greater).¹ The challenge for organisations will be to ensure that they know exactly what personal data they store, where it is stored, and that they have consent to use it for the purpose for which it was collected.

Could the GDPR have stopped the Facebook-Cambridge Analytica scandal? It is unlikely, but if the news had broken two months later then Facebook might now be facing a lengthy legal process and fines for violation of GDPR rules.

¹ The GDPR Portal (2019), [online]. Available: https://eugdpr.org/

CHAPTER 14
Distributed Databases

IN THIS CHAPTER, YOU WILL LEARN:
What a distributed database management system (DDBMS) is and what its components are
How database implementation is affected by different levels of data and process distribution
How transactions are managed in a distributed database environment
How database design is affected by the distributed database environment

PREVIEW

A single database can be divided into several fragments. The fragments can be stored on different computers within a network. Processing, too, can be dispersed among several different network sites, or nodes. The multi-site database forms the core of the distributed database system.

The growth of distributed database systems has been fostered by the increased globalisation of business operations, the growth of Big Data and technological changes that have made distributed network-based services practical, more reliable and cost effective. The distributed database management system (DDBMS) treats a distributed database as a single logical database; therefore, the basic design concepts you learnt in earlier chapters apply. However, the distribution of data among different sites in a computer network clearly adds to a system's complexity. For example, the design of a distributed database must consider the location of the data and the partitioning and replication of the data into database fragments.

In today's Web-centric environment, any distributed data system must be highly scalable; in other words, it must grow dynamically as demand increases. As demand grows, so do the processing needs and inherent complexity of those systems. To accommodate such dynamic growth, trade-offs must be made to achieve some desirable properties.

14.1 THE EVOLUTION OF DISTRIBUTED DATABASE MANAGEMENT SYSTEMS

A distributed database management system (DDBMS) governs the storage and processing of logically related data over interconnected computer systems in which both data and processing functions are distributed among several sites. To understand how and why the DDBMS is different from the DBMS, it is useful to examine briefly the changes in the database environment that set the stage for the development of the DDBMS.

During the 1970s, corporations implemented centralised database management systems to meet their structured information needs. Structured information is usually presented as regularly issued formal reports in a standard format. Such information, generated by 3GL programming languages, is created by specialists in response to precisely channelled requests. Thus, structured information needs are well served by centralised systems.

Basically, the use of a centralised database required that corporate data be stored in a single central site, usually a mainframe or midrange computer. Data access was provided through dumb terminals. The centralised approach, illustrated in Figure 14.1, worked well to fill the structured information needs of corporations, but it fell short when quickly moving events required faster response times and equally quick access to information. The slow progression from information request to approval, to specialist, to user, simply did not serve decision makers well in a dynamic environment. What was needed was quick, unstructured access to databases, using ad hoc queries to generate on-the-spot information.

FIGURE 14.1 Centralised database management system
[Diagram: the end user's application issues a data request to the DBMS; the DBMS reads the data from the local database and returns the reply to the end user.]

Database management systems based on the relational model could provide the environment in which unstructured information needs would be met by employing ad hoc queries. End users would be given the ability to access data when needed. Unfortunately, the early relational model implementations did not yet deliver acceptable throughput when compared to the well-established hierarchical or network database models.

The past three decades gave birth to a series of crucial social and technological changes that have affected database development and design. Among those changes were:

• Business operations became global; with this change, competition expanded from the shop on the next corner to the Web store in cyberspace.
• Customer demands and market needs favoured an on-demand transaction style, mostly based on Web-based services.
• Rapid social and technological changes fuelled by low-cost, smart mobile devices increased the demand for complex and fast networks to interconnect them. As a consequence, corporations have increasingly adopted advanced network technologies as the platform for their computerised solutions.
• Data realms are converging in the digital world more frequently. As a result, applications must manage multiple types of data, such as voice, video, music and images. Such data tend to be geographically distributed and remotely accessed from diverse locations via location-aware mobile devices.

These factors created a dynamic business environment in which companies had to respond quickly to competitive and technological pressures. As large business units restructured to form leaner-and-meaner, quickly reacting, dispersed operations, two database requirements became obvious:

• Rapid ad hoc data access became crucial in the quick-response decision-making environment.
• The decentralisation of management structures based on the decentralisation of business units made decentralised multiple-access and multiple-location databases a necessity.

During recent years, the factors just described became even more firmly entrenched. However, the way those factors were addressed was strongly influenced by:

• The growing acceptance of the internet, particularly the World Wide Web (WWW), as the platform for data access and distribution. The WWW is, in effect, the repository for distributed data.
• The mobile wireless revolution. The widespread use of mobile wireless digital devices includes smartphones such as Apple's iPhone and Google's Pixel, and tablets such as Apple's iPad and Samsung's Galaxy. These devices have created high demand for data access. They access data from geographically dispersed locations and require varied data exchanges in multiple formats, such as voice, video, music and pictures. Although distributed data access does not necessarily imply distributed databases, performance and failure tolerance requirements often lead to the use of data replication techniques similar to those in distributed databases.
• The accelerated growth of companies using applications as a service. This new type of service provides remote applications to companies that want to outsource their application development, maintenance and operations. The company data are generally stored on central servers and are not necessarily distributed. Just as with mobile data access, this type of service may not require fully distributed data functionality; however, other factors such as performance and failure tolerance often require the use of data replication techniques similar to those in distributed databases.
• The increased focus on mobile business intelligence. More and more companies are embracing mobile technologies within their business plans. As companies use social networks to get closer to customers, the need for on-the-spot decision making increases. Although a data warehouse is not usually a distributed database, it does rely on techniques such as data replication and distributed queries that facilitate data extraction and integration.
• Emphasis on Big Data analytics. The era of mobile communications gave us data from many sources and of many different types. Today's customers have significant influence on the spending habits of communities, and organisations are investing in ways to harvest such data to discover new ways to effectively and efficiently reach customers.

Online Content
To learn more about the internet's impact on data access and distribution, see Appendix H, Databases in e-Commerce, available on the online platform for this book.

At this point in time, the long-term impact of the internet and the mobile revolution on distributed database design and management is unclear. Perhaps the success of the internet and mobile technologies will foster the use of distributed databases as bandwidth becomes a more troublesome bottleneck. Perhaps the resolution of bandwidth problems will simply confirm the centralised database standard. In any case, distributed databases exist today, and many distributed database operating concepts and components are likely to find a place in future database development.

The distributed database is especially desirable because centralised database management is subject to problems such as:

Performance degradation due to a growing number of remote locations over greater distances

High costs associated with maintaining and operating large central (mainframe) database systems

Reliability problems created by dependence on a central site (single point of failure syndrome) and the need for data replication

Scalability problems associated with the physical limits imposed by a single location, such as physical space, temperature conditioning and power consumption

Organisational rigidity imposed by the database, which means it might not support the flexibility and agility required by modern global organisations.

The dynamic business environment and the centralised database's shortcomings spawned a demand for applications based on accessing data from different sources at multiple locations. Such a multiple-source/multiple-location database environment is managed by a DDBMS.

14.2 DDBMS Advantages and Disadvantages

Distributed database management systems deliver several advantages over traditional systems. At the same time, they are subject to some problems. Table 14.1 summarises the advantages and disadvantages associated with a DDBMS.


Table 14.1 Distributed DBMS advantages and disadvantages

Advantages

Data are located near the greatest demand site. The data in a distributed database system are dispersed to match business requirements.

Faster data access. End users often work with only a locally stored subset of the company's data.

Faster data processing. A distributed database system spreads out the system's workload by processing data at several sites.

Growth facilitation. New sites can be added to the network without affecting the operations of other sites.

Improved communications. Because local sites are smaller and located closer to customers, local sites foster better communication among departments and between customers and company staff.

Reduced operating costs. It is more cost-effective to add workstations to a network than to update a mainframe system. Development work is done more cheaply and more quickly on low-cost PCs than on mainframes.

User-friendly interface. PCs and workstations are usually equipped with an easy-to-use graphical user interface (GUI). The GUI simplifies use and training for end users.

Less danger of a single-point failure. When one of the computers fails, the workload is picked up by other workstations. Data are also distributed at multiple sites.

Processor independence. The end user is able to access any available copy of the data, and an end user's request is processed by any processor at the data location.

Disadvantages

Complexity of management and control. Applications must recognise data location, and they must be able to stitch together data from different sites. Database administrators must have the ability to coordinate database activities to prevent database degradation due to data anomalies. Transaction management, concurrency control, security, backup, recovery, query optimisation, access path selection, and so on, must all be addressed and resolved.

Security. The probability of security lapses increases when data are located at multiple sites. The responsibility of data management will be shared by different people at several sites.

Lack of standards. There are no standard communication protocols at the database level. For example, different database vendors employ different and often incompatible techniques to manage the distribution of data and processing in a DDBMS environment.

Increased storage requirements. Multiple copies of data are required at different sites, thus requiring additional disk storage space.

Increased training cost. Training costs are generally higher in a distributed model than they would be in a centralised model, sometimes even to the extent of offsetting operational and hardware savings.

Higher costs. Distributed databases require duplicated infrastructure to operate, such as physical location, environment, personnel, software and licensing.

Distributed databases are used successfully but have a long way to go before they can yield the full flexibility and power of which they are theoretically capable. The inherently complex distributed data environment increases the urgency for standard protocols governing transaction management, concurrency control, security, backup, recovery, query optimisation, access path selection, and so on. Such issues must be addressed and resolved before DDBMS technology is widely embraced.

The remainder of this chapter will explore the basic components and concepts of the distributed database. Because the distributed database is usually based on the relational database model, relational terminology is used to explain the basic distributed concepts and components.


14.3 Distributed Processing and Distributed Databases

In distributed processing, a database's logical processing is shared among two or more physically independent sites that are connected through a network. For example, the data input/output (I/O), data selection and data validation might be performed on one computer, and a report based on that data might be created on another computer.

A basic distributed processing environment is illustrated in Figure 14.2. It shows that a distributed processing system shares the database processing chores among three sites connected through a communications network. Although the database resides at only one site (London), each site can access the data and update the database. The database is located on Computer A, a network computer known as the database server.

Figure 14.2 Distributed processing environment. Site 1: London, user Joe, Computer A, running the DBMS that holds the Employee database. Site 2: Cape Town, user Donna, Computer B. Site 3: Harare, user Victor, Computer C. The sites are linked by a communications network. Figure labels: Update payroll data; Generate payroll report; Database records are processed in different locations.

A distributed database, on the other hand, stores a logically related database over two or more physically independent sites. The sites are connected via a computer network. In contrast, the distributed processing system uses only a single-site database but shares the processing chores among several sites. In a distributed database system, a database is composed of several parts known as database fragments. The database fragments are located at different sites and can be replicated among various sites. An example of a distributed database environment is shown in Figure 14.3.

Figure 14.3 Distributed database environment. Site 1: London, user Alan, Computer A, DBMS, fragment E1. Site 2: Cape Town, user Betty, Computer B, DBMS, fragment E2. Site 3: Harare, user Victor, Computer C, DBMS, fragment E3. The sites are linked by a communications network.

The database in Figure 14.3 is divided into three database fragments (E1, E2 and E3) located at different sites. The computers are connected through a network system. In a fully distributed database, the users Alan, Betty and Victor do not need to know the name or location of each database fragment in order to access the database. Also, the users may be located at sites other than London, Cape Town or Harare, and still be able to access the database as a single logical unit.

As you examine and contrast Figures 14.2 and 14.3, you should keep the following points in mind:

Distributed processing does not require a distributed database, but a distributed database requires distributed processing.

Distributed processing may be based on a single database located on a single computer. For the management of distributed data to occur, copies or parts of the database processing functions must be distributed to all data storage sites.

Both distributed processing and distributed databases require a network of interconnected components.

14.4 Characteristics of Distributed Database Management Systems

A distributed database management system (DDBMS) governs the storage and processing of logically related data over interconnected computer systems in which both data and processing functions are distributed among several sites. A DBMS must have at least the following functions to be classified as distributed:

Application interface to interact with the end user or application programs and with other DBMSs within the distributed database

Validation to analyse data requests

Transformation to determine which data request components are distributed and which are local

Query optimisation to find the best access strategy (which database fragments must be accessed by the query, and how must data updates, if any, be synchronised?)

Mapping to determine the data location of local and remote fragments

I/O interface to read or write data from or to permanent local storage

Formatting to prepare the data for presentation to the end user or to an application program

Security to provide data privacy at both local and remote databases

Backup and recovery to ensure the availability and recoverability of the database in case of a failure

DB administration features for the database administrator

Concurrency control to manage simultaneous data access and to ensure data consistency across database fragments in the DDBMS

Transaction management to ensure that the data move from one consistent state to another. This activity includes the synchronisation of local and remote transactions as well as transactions across multiple distributed segments.

A fully distributed database management system must perform all of the functions of a centralised DBMS, as follows:

1 Receive an application's (or an end user's) request

2 Validate, analyse and decompose the request. The request may include mathematical and/or logical operations such as the following: Select all customers with balance greater than 1 000. The request may require data from only a single table, or it may require access to several tables

3 Map the request's logical-to-physical data components

4 Decompose the request into several disk I/O operations

5 Search for, locate, read and validate the data

6 Ensure database consistency, security and integrity

7 Validate the data for the conditions, if any, specified by the request

8 Present the selected data in the required format

In addition, a distributed DBMS must handle all necessary functions imposed by the distribution of data and processing. And it must perform those additional functions transparently to the end user. The DDBMS's transparent data access features are illustrated in Figure 14.4.

Figure 14.4 A fully distributed database management system. Site 1 and Site 2, with users Mary and Tom, are linked by a communications network and share distributed processing. Database fragment A1 and database fragment A2 together form a single logical database. SOURCE: Course Technology/Cengage Learning

The single logical database in Figure 14.4 consists of two database fragments, A1 and A2, located at sites 1 and 2, respectively. Mary can query the database as if it were a local database; so can Tom. Both users see only one logical database and do not need to know the names of the fragments. In fact, the end users do not even need to know that the database is divided into separate fragments, nor do they need to know where the fragments are located.

To better understand the different types of distributed database scenarios, let's first define the distributed database system's components.

14.5 DDBMS Components

The DDBMS must include at least the following components:

Computer workstations (sites or nodes) that form the network system. The distributed database system must be independent of the computer system hardware.

Network hardware and software components that reside in each workstation. The network components allow all sites to interact and exchange data. As the components (computers, operating systems, network hardware and so on) are likely to be supplied by different vendors, it is best to ensure that distributed database functions can be run on multiple platforms.

Communications media that carry the data from one workstation to another. The DDBMS must be communications-media-independent; that is, it must be able to support several types of communications media.

The transaction processor (TP), which is the software component found in each computer that requests data. The transaction processor receives and processes the application's data requests (remote and local). The TP is also known as the application processor (AP) or the transaction manager (TM).

The data processor (DP), which is the software component residing on each computer that stores and retrieves data located at the site. The DP is also known as the data manager (DM). A data processor may even be a centralised DBMS.

Figure 14.5 illustrates the placement of and interaction among components. The communication among TPs and DPs shown in Figure 14.5 is made possible through a specific set of rules, or protocols, used by the DDBMS.

Figure 14.5 Distributed database system management components. Users Melusi, Peter, Mary, Aneesha and Chantal work at sites attached to a communications network. Some sites run only a TP, some run both a TP and a DP, and two sites are dedicated data processors (DP only).

Note: Each TP can access data on any DP, and each DP handles all requests for local data from any TP.
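The division of labour between transaction processors and data processors can be sketched in a few lines of Python. This is only a minimal illustration, not how any particular DDBMS is built: the class names, the `select` method, the sites and the sample rows are all assumptions made for the example, and the network, protocols and concurrency control are left out entirely.

```python
from dataclasses import dataclass, field


@dataclass
class DataProcessor:
    """DP: stores and retrieves the rows held at one site."""
    site: str
    rows: list = field(default_factory=list)

    def select(self, predicate):
        # Handle a request for local data, whichever TP sent it.
        return [row for row in self.rows if predicate(row)]


@dataclass
class TransactionProcessor:
    """TP: receives a request and forwards it to the DPs it knows about."""
    site: str
    data_processors: list

    def select(self, predicate):
        # Ask every DP (local or remote) and merge the partial results.
        result = []
        for dp in self.data_processors:
            result.extend(dp.select(predicate))
        return result


# Hypothetical sites and rows, for illustration only.
dp_london = DataProcessor("London", [{"name": "Alan", "salary": 52000}])
dp_harare = DataProcessor("Harare", [{"name": "Victor", "salary": 47000}])
tp_cape_town = TransactionProcessor("Cape Town", [dp_london, dp_harare])

well_paid = tp_cape_town.select(lambda row: row["salary"] > 50000)
print(well_paid)
```

The user at the Cape Town TP issues one request and receives one merged answer; which DP supplied each row is hidden inside `TransactionProcessor.select`.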

The protocols determine how the distributed database system will:

Interface with the network to transport data and commands between DPs and TPs

Synchronise all data received from DPs (TP side) and route retrieved data to the appropriate TPs (DP side)

Ensure common database functions in a distributed system. Such functions include security, concurrency control, backup and recovery.

DPs and TPs can be added to the system without affecting the operation of the other components. A TP and a DP can reside on the same computer, allowing the end user to access local as well as remote data transparently. In theory, a DP can be an independent centralised DBMS with proper interfaces to support remote access from other independent DBMSs in the network.

14.6 Levels of Data and Process Distribution

Current database systems can be classified on the basis of how process distribution and data distribution are supported. For example, a DBMS may store data in a single site (centralised DB) or in multiple sites (distributed DB) and may support data processing at a single site or at multiple sites. Table 14.2 uses a simple matrix to classify database systems according to data and process distribution. These types of processes are discussed in the sections that follow.

Table 14.2 Database systems: levels of data and process distribution

                         Single-site data                             Multiple-site data
Single-site process      Host DBMS                                    Not applicable (requires multiple processes)
Multiple-site process    File server; Client/server DBMS (LAN DBMS)   Fully distributed; Client/server DDBMS

Figure 14.6 Single-site processing, single-site data (centralised). Dumb terminals T1, T2 and T3 connect through a front-end processor to the DBMS and its database. A remote dumb terminal communicates through a DSL or fibre line.

The end user must make a direct reference to the file server in order to access remote data.

All record- and file-locking activity is done at the end-user location.

All data selection, search and update functions take place at the workstation, thus requiring that entire files travel through the network for processing at the workstation. Such a requirement increases network traffic, slows response time and increases communication costs.

The inefficiency of the last condition can be illustrated easily. For example, suppose the file server computer stores a CUSTOMER table containing 10 000 data rows, 50 of which have balances greater than 1 000. If site A issues the SQL query:

SELECT *
FROM CUSTOMER
WHERE CUS_BALANCE > 1000;

All 10 000 CUSTOMER rows must travel through the network to be evaluated at site A.

A variation of the multiple-site processing, single-site data approach is known as client/server architecture. Client/server architecture is similar to that of the network file server except that all database processing is done at the server site, thus reducing network traffic. Although both the network file server and the client/server systems perform multiple-site processing, the latter's processing is distributed. Note that the network file server approach requires the database to be located at a single site. In contrast, the client/server architecture is capable of supporting data at multiple sites.

Online content: Appendix F, Client/Server Systems, is located on the online platform for this book.
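The difference in network traffic between the two approaches can be made concrete with a small experiment. The sketch below uses Python's built-in sqlite3 module and an in-memory database as a stand-in for the server; no real network is involved, so it only counts the rows that would have to be shipped. The CUS_CODE column and the balance values are assumptions made for the example; only CUSTOMER, CUS_BALANCE and the 10 000/50 figures come from the text.

```python
import sqlite3

# Build the CUSTOMER table from the example: 10 000 rows, 50 with a balance above 1 000.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE CUSTOMER (CUS_CODE INTEGER PRIMARY KEY, CUS_BALANCE REAL)")
conn.executemany(
    "INSERT INTO CUSTOMER (CUS_CODE, CUS_BALANCE) VALUES (?, ?)",
    [(code, 2000.0 if code <= 50 else 500.0) for code in range(1, 10001)],
)

# File server: every row travels to the workstation, which filters locally.
shipped_by_file_server = conn.execute("SELECT * FROM CUSTOMER").fetchall()
matches_at_workstation = [row for row in shipped_by_file_server if row[1] > 1000]

# Client/server: the server evaluates the condition and ships only the matches.
shipped_by_client_server = conn.execute(
    "SELECT * FROM CUSTOMER WHERE CUS_BALANCE > 1000"
).fetchall()

print(len(shipped_by_file_server), len(shipped_by_client_server))  # prints: 10000 50
conn.close()
```

Both approaches end up with the same 50 customers, but the file server ships 200 times as many rows to get there.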

14.6.3 Multiple-site Processing, Multiple-site Data (MPMD)

The multiple-site processing, multiple-site data (MPMD) scenario describes a fully distributed DBMS with support for multiple data processors and transaction processors at multiple sites. Depending on the level of support for different types of centralised DBMSs, DDBMSs are classified as either homogeneous or heterogeneous.

Homogeneous DDBMSs integrate multiple instances of the same DBMS over a network. Thus, the same DBMS will be running on different mainframes, minicomputers and microcomputers. In contrast, heterogeneous DDBMSs integrate different types of centralised DBMSs over a network. A fully heterogeneous DDBMS will support different DBMSs that may even support different data models (relational, hierarchical or network) running over a network.

No DDBMS currently provides full support for the scenario depicted in Figure 14.8 or for a fully heterogeneous environment. Some DDBMS implementations support several platforms, operating systems and networks, and allow remote data access to another DBMS. However, such DDBMSs are still subject to certain restrictions. For example:

Remote access is provided on a read-only basis and does not support write privileges.

Restrictions are placed on the number of remote tables that may be accessed in a single transaction.


Restrictions are placed on the number of distinct databases that may be accessed.

Restrictions are placed on the database model that may be accessed. Thus, access may be provided to relational databases but not to network or hierarchical databases.

Figure 14.8 Heterogeneous distributed database scenario

Platform          DBMS        Operating System       Network Communications Protocol
IBM 3090          DB2         MVS                    APPC LU 6.2
DEC/VAX           VAX rdb     MVS                    DECnet
IBM AS/400        SQL/400     OS/400                 3270
RISC computer     Informix    UNIX                   TCP/IP
Intel Xeon CPU    Oracle      Windows Server 2019    TCP/IP

The preceding list of restrictions is by no means exhaustive. DDBMS technology continues to change rapidly, and new features are added frequently. Managing data at multiple sites leads to a number of issues that must be addressed and understood. The next section will examine several key features of distributed database management systems.

14.7 Distributed Database Transparency Features

A distributed database system requires functional characteristics that can be grouped and described as transparency features. DDBMS transparency features have the common property of allowing the end user to feel like the database's only user. In other words, the user believes that he or she is working with a centralised DBMS; all complexities of a distributed database are hidden, or transparent, to the user.

The DDBMS transparency features are:

Distribution transparency, which allows a distributed database to be treated as a single logical database. If a DDBMS exhibits distribution transparency, the user does not need to know:
? That the data are partitioned, meaning that the tables' rows and columns are split vertically or horizontally and stored among multiple sites.
? That the data are geographically dispersed among multiple sites.
? That the data are replicated among multiple sites.

Transaction transparency, which allows a transaction to update data at several network sites. Transaction transparency ensures that the transaction will be either entirely completed or aborted, thus maintaining database integrity.

Failure transparency, which ensures that the system will continue to operate in the event of a node failure. Functions that were lost because of the failure will be picked up by another network node. This is a critical feature, particularly in organisations that depend on a Web presence as the backbone for maintaining trust in their business.

Performance transparency, which allows the system to perform as if it were a centralised DBMS. The system will not suffer any performance degradation due to its use on a network or due to the network's platform differences. Performance transparency also ensures that the system will find the most cost-effective path to access remote data. The system should be able to scale out in a transparent manner, or increase performance capacity by adding more transaction or data processing nodes, without affecting the overall performance of the system.

Heterogeneity transparency, which allows the integration of several different local DBMSs (relational, network and hierarchical) under a common, or global, schema. The DDBMS is responsible for translating the data requests from the global schema to the local DBMS schema.

Distribution, transaction and performance transparency features will be examined in greater detail in the next few sections.

14.8 Distribution Transparency

Distribution transparency allows a physically dispersed database to be managed as though it were a centralised database. The level of transparency supported by the DDBMS varies from system to system. Three levels of distribution transparency are recognised:

Fragmentation transparency is the highest level of transparency. The end user or programmer does not need to know that a database is partitioned. Therefore, neither fragment names nor fragment locations are specified prior to data access.

Location transparency exists when the end user or programmer must specify the database fragment names but does not need to specify where those fragments are located.

Local mapping transparency exists when the end user or programmer must specify both the fragment names and their locations.

Transparency features are summarised in Table 14.3.

Table 14.3 A summary of transparency features

If the SQL statement requires:        Then the DBMS supports           Level of distribution
Fragment name?    Location name?                                       transparency
Yes               Yes                 Local mapping transparency       Low
Yes               No                  Location transparency            Medium
No                No                  Fragmentation transparency       High

As you examine Table 14.3, you might ask why there is no reference to a situation in which the fragment name is No and the location name is Yes. The reason for not including that scenario is simple: you cannot have a location name that fails to reference an existing fragment. (If you don't need to specify a fragment name, its location is clearly irrelevant.)

To illustrate the use of various transparency levels, suppose you have an EMPLOYEE table containing the attributes EMP_NAME, EMP_DOB, EMP_ADDRESS, EMP_DEPARTMENT and EMP_SALARY. The EMPLOYEE data are distributed over three different locations: London, Cape Town and Harare. The table is divided by location; that is, London employee data are stored in fragment E1, Cape Town employee data are stored in fragment E2 and Harare employee data are stored in fragment E3. (See Figure 14.9.)

Fragment locations Distributed

DBMS

EMPLOYEE table

E1

Fragment

Location

E2

London

Now suppose

the

end user

wants to list

E3

Cape Town

all employees

Harare

with a date of birth

prior to

1 January,

1970.

To

focus on the transparency issues, also suppose the EMPLOYEE table is fragmented and each fragment is unique. The unique fragment condition indicates that each row is unique, regardless of the fragment in whichit is located. Finally, assume that no portion of the database is replicated at any other site on the network.

Depending on the level of distribution transparency support, you may examine three query cases.

Case 1: The Database Supports Fragmentation Transparency
The query conforms to a non-distributed database query format; that is, it does not specify fragment names or locations. The query reads:

    SELECT *
    FROM   EMPLOYEE
    WHERE  EMP_DOB < '01-JAN-1970';

Case 2: The Database Supports Location Transparency
Fragment names must be specified in the query, but fragment location is not specified. The query reads:

    SELECT *
    FROM   E1
    WHERE  EMP_DOB < '01-JAN-1970'
    UNION
    SELECT *
    FROM   E2
    WHERE  EMP_DOB < '01-JAN-1970'
    UNION
    SELECT *
    FROM   E3
    WHERE  EMP_DOB < '01-JAN-1970';

Case 3: The Database Supports Local Mapping Transparency
Both the fragment name and location must be specified in the query. Using pseudo-SQL:

    SELECT *
    FROM   E1 NODE LONDON
    WHERE  EMP_DOB < '01-JAN-1970'
    UNION
    SELECT *
    FROM   E2 NODE CAPE TOWN
    WHERE  EMP_DOB < '01-JAN-1970'
    UNION
    SELECT *
    FROM   E3 NODE HARARE
    WHERE  EMP_DOB < '01-JAN-1970';

Note
NODE indicates the location of the database fragment. NODE is used for illustration purposes and is not part of the standard SQL syntax.

As you examine the preceding query formats, you can see how distribution transparency affects the way end users and programmers interact with the database.
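The three query cases can be compared side by side with a small sketch. It only builds the query text a user would have to write at each transparency level; the function name `build_query`, the `FRAGMENTS` dictionary and the level labels are hypothetical names chosen for illustration, and `NODE` remains the pseudo-SQL keyword used above rather than real SQL.

```python
# Hypothetical catalogue for the EMPLOYEE example: fragment name -> site.
FRAGMENTS = {"E1": "LONDON", "E2": "CAPE TOWN", "E3": "HARARE"}


def build_query(level: str, predicate: str) -> str:
    """Return the query text the user must write at a given transparency level."""
    if level == "fragmentation":
        # Highest level: the user names only the global table.
        return f"SELECT * FROM EMPLOYEE WHERE {predicate};"
    if level == "location":
        # The user names every fragment, but not where it is stored.
        parts = [f"SELECT * FROM {frag} WHERE {predicate}" for frag in FRAGMENTS]
    elif level == "local_mapping":
        # Lowest level: the user names every fragment and its node.
        parts = [
            f"SELECT * FROM {frag} NODE {site} WHERE {predicate}"
            for frag, site in FRAGMENTS.items()
        ]
    else:
        raise ValueError(f"unknown transparency level: {level}")
    return "\nUNION\n".join(parts) + ";"


if __name__ == "__main__":
    for level in ("fragmentation", "location", "local_mapping"):
        print(f"-- {level} transparency")
        print(build_query(level, "EMP_DOB < '01-JAN-1970'"))
```

The lower the transparency level, the more of the catalogue's knowledge (fragment names, then node names) leaks into every query the user writes.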


Distribution transparency is supported by a distributed data dictionary (DDD), or a distributed data catalogue (DDC). The DDC contains the description of the entire database as seen by the database administrator. The database description, known as the distributed global schema, is the common database schema used by local TPs to translate user requests into subqueries (remote requests) that are processed by different DPs. The DDC is itself distributed, and it is replicated at the network nodes. Therefore, the DDC must maintain consistency through updating at all sites.

Keep in mind that some of the current DDBMS implementations impose limitations on the level of transparency support. For instance, you might be able to distribute a database, but not a table, across multiple sites. Such a condition indicates that the DDBMS supports location transparency but not fragmentation transparency.

14.9 Transaction Transparency

Transaction transparency is a DDBMS property that ensures that database transactions maintain the distributed database's integrity and consistency. Remember that a DDBMS database transaction can update data stored in many different computers connected in a network. Transaction transparency ensures that the transaction is completed only when all database sites involved in the transaction complete their part of the transaction.

Distributed database systems require complex mechanisms to manage transactions and to ensure the database's consistency and integrity. To understand how the transactions are managed, you should know the basic concepts governing remote requests, remote transactions, distributed transactions and distributed requests.

14.9.1 Distributed Requests and Distributed Transactions [2]

Whether or not a transaction is distributed, it is formed by one or more database requests. The basic difference between a non-distributed transaction and a distributed transaction is that the latter can update or request data from several different remote sites on a network. To better illustrate the distributed transaction concepts, let's begin by establishing the difference between remote and distributed transactions, using the BEGIN WORK and COMMIT WORK transaction format. Assume the existence of location transparency to avoid having to specify the data location.

Figure 14.10 A remote request
[Diagram: the TP at site A sends a single request over the network to the DP at site B, which holds the CUSTOMER table.]

    SELECT *
    FROM   CUSTOMER
    WHERE  CUS_COUNTRY = 'ZA';

Comment: The request is directed to the CUSTOMER table at site B.

[2] The details of distributed requests and transactions were originally described in David McGoveran and Colin White, 'Clarifying Client/Server', DBMS 3(14), November 1990, pp. 78-89.

A remote request, illustrated in Figure 14.10, lets a single SQL statement access the data that are to be processed by a single remote database processor. In other words, the SQL statement (or request) can reference data at only one remote site.

Similarly, a remote transaction, composed of several requests, accesses data at a single remote site. A remote transaction is illustrated in Figure 14.11.

Figure 14.11

A remote transaction
[Diagram: the TP at site A sends the transaction over the network to the DP at site B, which holds the INVOICE and PRODUCT tables.]

    BEGIN WORK;
    UPDATE PRODUCT
      SET   PROD_QTY = PROD_QTY - 1
      WHERE PROD_NUM = '231785';
    INSERT INTO INVOICE (CUS_NUM, INV_DATE, INV_TOTAL)
      VALUES ('100', '15-FEB-2015', 120.00);
    COMMIT WORK;

As you examine Figure 14.11, note the following remote transaction features:

- The transaction updates the PRODUCT and INVOICE tables (located at site B).
- The remote transaction is sent to and executed at the remote site B.
- The transaction can reference only one remote DP.
- Each SQL statement (or request) can reference only one (the same) remote DP at a time, and the entire transaction can reference and be executed at only one remote DP.

A distributed transaction allows a transaction to reference several different local or remote DP sites. Although each single request can reference only one local or remote DP site, the transaction as a whole can reference multiple DP sites because each request can reference a different site. The distributed transaction process is illustrated in Figure 14.12. Note the following features in Figure 14.12:

- The transaction references two remote sites (B and C).
- The first two requests (UPDATE PRODUCT and INSERT INTO INVOICE) are processed by the DP at the remote site C, and the last request (UPDATE CUSTOMER) is processed by the DP at the remote site B.
- Each request can access only one remote site at a time.

The third characteristic may create problems. For example, suppose the table PRODUCT is divided into two fragments, PROD1 and PROD2, located at sites B and C, respectively. Given that scenario, the preceding distributed transaction cannot be executed because the request:

    SELECT *
    FROM   PRODUCT
    WHERE  PROD_NUM = '231785';

cannot access data from more than one remote site. Therefore, the DBMS must be able to support a distributed request.

Figure 14.12 A distributed transaction
[Diagram: the TP at site A communicates over the network with the DP at site B, which holds the CUSTOMER table, and the DP at site C, which holds the INVOICE and PRODUCT tables.]

    BEGIN WORK;
    UPDATE PRODUCT
      SET   PROD_QTY = PROD_QTY - 1
      WHERE PROD_NUM = '231785';
    INSERT INTO INVOICE (CUS_NUM, INV_DATE, INV_TOTAL)
      VALUES ('100', '15-FEB-2019', 120.00);
    UPDATE CUSTOMER
      SET   CUS_BALANCE = CUS_BALANCE + 120
      WHERE CUS_NUM = '100';
    COMMIT WORK;

A distributed request lets a single SQL statement reference data located at several different local or remote DP sites. Because each request (SQL statement) can access data from more than one local or remote DP site, a transaction can access several sites. The ability to execute a distributed request provides fully distributed database processing capabilities because of the ability to:

- partition a database table into several fragments
- reference one or more of those fragments with only one request. In other words, there is fragmentation transparency.

The location and partition of the data should be transparent to the end user. Figure 14.13 illustrates a distributed request. As you examine Figure 14.13, note that the transaction uses a single SELECT statement to reference two tables, CUSTOMER and INVOICE. The two tables are located at two different sites, B and C.

The distributed request feature also allows a single request to reference a physically partitioned table. For example, suppose a CUSTOMER table is divided into two fragments, C1 and C2, located at sites B and C, respectively. Further suppose the end user wants to obtain a list of all customers whose balances exceed 250. The request is illustrated in Figure 14.14. Full fragmentation transparency support is provided only by a DDBMS that supports distributed requests.

Understanding the different types of database requests in distributed database systems helps you address the transaction transparency issue more effectively. Transaction transparency ensures that distributed transactions are treated as centralised transactions, ensuring serialisability of transactions. (Review Chapter 12, Managing Transactions and Concurrency, if necessary.) That is, the execution of concurrent transactions, whether or not they are distributed, will take the database from one consistent state to another.

Figure 14.13 A distributed request
[Diagram: the TP at site A communicates over the network with the DP at site B, which holds the CUSTOMER table, and the DP at site C, which holds the INVOICE and PRODUCT tables.]

    BEGIN WORK;
    SELECT CUSTOMER.CUS_NUM, INV_TOTAL
    FROM   CUSTOMER, INVOICE
    WHERE  CUSTOMER.CUS_NUM = '100'
    AND    INVOICE.CUS_NUM = CUSTOMER.CUS_NUM;
    COMMIT WORK;

Figure 14.14 Another distributed request
[Diagram: the TP at site A communicates over the network with the DP at site B, which holds fragment C1, and the DP at site C, which holds fragment C2.]

    SELECT *
    FROM   CUSTOMER
    WHERE  CUS_BALANCE > 250;
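The four kinds of work described in Section 14.9.1 differ only in how many remote sites a single request touches and how many sites the transaction touches overall. The following sketch makes that rule explicit. The function name `classify` and the representation of a transaction as a list of site sets are assumptions made for illustration; they are not part of any DDBMS interface.

```python
from typing import List, Set


def classify(requests: List[Set[str]]) -> str:
    """Classify a unit of work.

    `requests` holds one set of DP site names per SQL statement (request).
    """
    if not requests or any(not sites for sites in requests):
        raise ValueError("every request must reference at least one site")
    # A single statement that spans several sites needs distributed request support.
    if any(len(sites) > 1 for sites in requests):
        return "distributed request"
    # Each statement uses one site, but the statements use different sites.
    if len(set().union(*requests)) > 1:
        return "distributed transaction"
    # Everything runs at the same single site.
    return "remote request" if len(requests) == 1 else "remote transaction"


if __name__ == "__main__":
    print(classify([{"B"}]))                # Figure 14.10
    print(classify([{"B"}, {"B"}]))         # Figure 14.11
    print(classify([{"C"}, {"C"}, {"B"}]))  # Figure 14.12
    print(classify([{"B", "C"}]))           # Figures 14.13 and 14.14
```

Read this way, a fragmented table forces the last case: a SELECT against a table split over sites B and C is one statement with two sites, which is why full fragmentation transparency depends on distributed request support.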

14.9.2 Distributed Concurrency Control

Concurrency control becomes especially important in the distributed database environment because multi-site, multiple-process operations are more likely to create data inconsistencies and deadlocked transactions than single-site systems are. For example, the TP component of a DDBMS must ensure that all parts of the transaction are completed at all sites before a final COMMIT is issued to record the transaction.

Suppose each transaction operation was committed by each local DP, but one of the DPs could not commit the transaction's results. Such a scenario would yield the problems illustrated in Figure 14.15: the transaction(s) would yield an inconsistent database, with its inevitable integrity problems, because committed data cannot be uncommitted! The solution for the problem illustrated in Figure 14.15 is a two-phase commit protocol, which you will explore next.

Figure 14.15 The effect of a premature COMMIT
Site | Operations at the DP | Outcome
A | LOCK(X), WRITE(X), COMMIT | Data are committed
B | LOCK(Y), WRITE(Y), COMMIT | Data are committed
C | LOCK(Z), ..., ROLLBACK | Rollback at site C
Result: sites A and B cannot be rolled back.

14.9.3 Two-Phase Commit Protocol

Centralised databases require only one DP. All database operations take place at only one site, and the consequences of database operations are immediately known to the DBMS. In contrast, distributed databases make it possible for a transaction to access data at several sites. A final COMMIT must not be issued until all sites have committed their parts of the transaction. The two-phase commit protocol guarantees that, if a portion of a transaction operation cannot be committed, all changes made at the other sites participating in the transaction will be undone to maintain a consistent database state.
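The guarantee stated above can be illustrated with a minimal in-memory sketch. It assumes the commonly described shape of two-phase commit: a coordinator first asks every participating DP to prepare and vote, then issues the final COMMIT only if every vote is yes. The class `Participant`, its `can_commit` flag and the function `two_phase_commit` are hypothetical names for illustration. Logging, time-outs and recovery, which a real protocol needs, are left out.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Participant:
    """A DP taking part in a distributed transaction (illustrative only)."""
    name: str
    can_commit: bool = True  # stands in for "local work succeeded"
    state: str = "active"    # active -> prepared -> committed, or aborted

    def prepare(self) -> bool:
        # Phase 1: vote. A real DP would first write its log records.
        self.state = "prepared" if self.can_commit else "aborted"
        return self.can_commit

    def commit(self) -> None:
        self.state = "committed"

    def rollback(self) -> None:
        self.state = "aborted"


def two_phase_commit(participants: List[Participant]) -> bool:
    """Return True if the transaction committed at every site."""
    # Phase 1: every site votes before anything becomes permanent.
    votes = [p.prepare() for p in participants]
    # Phase 2: the final COMMIT is issued only on a unanimous yes.
    if all(votes):
        for p in participants:
            p.commit()
        return True
    for p in participants:
        p.rollback()
    return False


if __name__ == "__main__":
    sites = [Participant("A"), Participant("B"), Participant("C", can_commit=False)]
    print(two_phase_commit(sites), [p.state for p in sites])
```

In the scenario of Figure 14.15, site C cannot commit; with the vote taken first, sites A and B have not yet committed and can still be rolled back, so the database stays consistent.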

14.10 Performance and Failure Transparency

One of the most important functions of a database is its ability to make data available. Web-based distributed data systems demand high availability, which means not only that data are accessible but that requests are processed in a timely manner. For example, the average Google search has a sub-second response time. When was the last time you entered a Google query and waited more than a couple of seconds for the results?

Performance transparency allows a DDBMS to perform as if it were a centralised database. In other words, no performance degradation should be incurred due to data distribution. Failure transparency ensures that the system will continue to operate in the case of a node or network failure. Although these are two separate issues, they are interrelated in that a failing node or congested network path could cause performance problems. Therefore, both issues are addressed in this section.

The objective of a query optimisation routine is to minimise the total cost associated with the execution of a request. The costs associated with a request are a function of the:

- access time (I/O) cost involved in accessing the physical data stored on disk
- communication cost associated with the transmission of data among nodes in distributed database systems
- CPU time cost associated with the processing overhead of managing distributed transactions.

Although costs are often classified as either communication or processing costs, it is difficult to separate the two. Not all query optimisation algorithms use the same parameters, and not all algorithms assign the same weight to each parameter. For example, some algorithms minimise total time; others minimise the communication time; and still others do not factor in the CPU time, considering it insignificant relative to other cost sources.
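A weighted sum is one simple way to picture how different algorithms can weigh the same three cost sources differently. The sketch below is an assumption made for illustration, not the cost model of any particular optimiser: the class `AccessPlan`, the weight parameters and the numbers are all hypothetical.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class AccessPlan:
    """One candidate way of resolving a request (illustrative cost units)."""
    site: str
    io_cost: float    # access time (I/O)
    comm_cost: float  # communication between nodes
    cpu_cost: float   # CPU time


def total_cost(plan: AccessPlan, w_io: float = 1.0,
               w_comm: float = 1.0, w_cpu: float = 1.0) -> float:
    # Setting a weight to 0 models an algorithm that ignores that cost source.
    return w_io * plan.io_cost + w_comm * plan.comm_cost + w_cpu * plan.cpu_cost


def cheapest(plans: Iterable[AccessPlan], **weights: float) -> AccessPlan:
    return min(plans, key=lambda p: total_cost(p, **weights))


if __name__ == "__main__":
    plans = [
        AccessPlan("B", io_cost=10, comm_cost=50, cpu_cost=2),
        AccessPlan("C", io_cost=30, comm_cost=5, cpu_cost=2),
    ]
    print(cheapest(plans).site)              # all three costs weighted equally
    print(cheapest(plans, w_comm=0.0).site)  # communication cost ignored
```

With equal weights the plan at site C is cheaper (37 against 62 units); once communication cost is ignored, site B wins (12 against 32), so the weights alone can change which plan is chosen.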

Note
Chapter 13, Managing Database and SQL Performance, provides additional details about query optimisation.

Resolving data requests in a distributed data environment must take the following points into consideration:

- Data distribution. In a DDBMS, query translation is more complicated because the DDBMS must decide which fragment to access. (Distribution transparency was explained earlier in this chapter.) In this case, a TP executing a query must choose what fragments to access, create multiple data requests to the chosen remote DPs, combine the DP responses and present the data to the application.
- Data replication. In addition, the data may also be replicated at several different sites. The data replication makes the access problem even more complex because the database must ensure that all copies of the data are consistent. Therefore, an important characteristic of query optimisation in distributed database systems is that it must provide replica transparency. Replica transparency refers to the DDBMS's ability to hide multiple copies of data from the user. This ability is particularly important with data update operations. If a read-only request is being processed, it can be satisfied by accessing any available remote DP. However, processing a write request also involves synchronising all existing fragments to maintain data consistency. The two-phase commit protocol you learnt about in Section 14.9.3 ensures that the transaction will complete successfully. However, if data fragments are replicated at other sites, the DDBMS must also ensure the consistency of all the fragments; that is, all fragments should be mutually consistent. To accomplish this, a DP captures all changes and pushes them to each remote replica. This introduces delays in the system and basically means that not all data changes are immediately seen by all replicas.
- Network and node availability. The response time associated with remote sites cannot be easily predetermined because some nodes finish their part of the query in less time than others, and network path performance varies because of bandwidth and traffic loads. Hence, to achieve performance transparency, the DDBMS should consider issues such as network latency, the delay imposed by the amount of time required for a data packet to make a round trip from point A to point B; or network partitioning, the delay imposed when nodes become suddenly unavailable due to a network failure.

Carefully planning how to partition a distributed database and where to locate the database fragments can help ensure the performance and consistency of a distributed database. The following section discusses issues of distributed database design.

14.11 Distributed Database Design

Whether the database is centralised or distributed, the design principles and concepts described in Chapter 3, Relational Model Characteristics; Chapter 5, Data Modelling with Entity Relationship Diagrams; and Chapter 7, Normalising Database Designs, are still applicable. However, the design of a distributed database introduces three new issues:

- How to partition the database into fragments
- Which fragments to replicate
- Where to locate those fragments and replicas

Data fragmentation and data replication deal with the first two issues, and data allocation deals with the third issue.

14.11.1 Data Fragmentation

Data fragmentation allows you to break a single object into two or more segments or fragments. The object might be a user's database, a system database or a table. Each fragment can be stored at any site over a computer network. Information about data fragmentation is stored in the distributed data catalogue (DDC), from where it is accessed by the TP to process user requests.

Data fragmentation strategies, as discussed here, are based at the table level and consist of dividing a table into logical fragments. You will explore three types of data fragmentation strategies: horizontal, vertical and mixed. (Keep in mind that a fragmented table can always be recreated from its fragmented parts by a combination of unions and joins.)

- Horizontal fragmentation refers to the division of a relation into subsets (fragments) of tuples (rows). Each fragment is stored at a different node, and each fragment has unique rows. However, the unique rows all have the same attributes (columns). In short, each fragment represents the equivalent of a SELECT statement, with the WHERE clause on a single attribute.
- Vertical fragmentation refers to the division of a relation into attribute (column) subsets. Each subset (fragment) is stored at a different node, and each fragment has unique columns, with the exception of the key column, which is common to all fragments. This is the equivalent of the PROJECT statement in SQL.
- Mixed fragmentation refers to a combination of horizontal and vertical strategies. In other words, a table may be divided into several horizontal subsets (rows), each one having a subset of the attributes (columns).

To illustrate the fragmentation strategies, let's use the CUSTOMER table for the XYZ Company, depicted in Figure 14.16. The table contains the attributes CUS_NUM, CUS_NAME, CUS_ADDRESS, CUS_COUNTRY, CUS_LIMIT, CUS_BAL, CUS_RATING and CUS_DUE.

Figure 14.16 A sample customer table
Table name: CUSTOMER
CUS_NUM | CUS_NAME | CUS_ADDRESS | CUS_COUNTRY | CUS_LIMIT | CUS_BAL | CUS_RATING | CUS_DUE
10 | Sinex, Inc. | 12 Main St. | UK | 3500.00 | 2700.00 | 3 | 1245.00
11 | Martin Corp. | 321 Sunset Blvd. | SA | 6000.00 | 1200.00 | 1 | 0.00
12 | Mynux Corp. | 910 Eagle St. | UK | 4000.00 | 3500.00 | 3 | 3400.00
13 | BTBC, Inc. | Rue du Monde | SA | 6000.00 | 5890.00 | 3 | 1090.00
14 | Victory, Inc. | 123 Maple St. | SA | 1200.00 | 550.00 | 1 | 0.00
15 | NBCC Corp. | 909 High Ave. | NL | 2000.00 | 350.00 | 2 | 50.00

Online Content
The databases used to illustrate the material in this chapter are found on the online platform for this book.

Horizontal Fragmentation

There are various ways to partition a table horizontally:

- Round-robin partitioning. Rows are assigned to a given fragment in a round-robin fashion (F1, F2, F3, ..., Fn) to ensure an even distribution of rows among all fragments. However, this is not a good strategy if you require location awareness: the ability to determine which DP node will process a query based on the geospatial location of the requester. For example, you would want all queries from UK customers to be resolved from a fragment that stores only UK customers, and this fragment to be located in a node close to the UK.
- Range partitioning based on a partition key. A partition key is one or more attributes in a table that determine the fragment in which a row will be stored. For example, if you want to provide location awareness, a good partition key would be the customer country field. This is the most common and useful data partitioning strategy.

Suppose XYZ Company's corporate management requires information about its customers in all three countries, but company locations in each country (UK, SA and NL) require data regarding local customers only. Based on such requirements, you decide to distribute the data by country. Therefore, you define the horizontal fragments to conform to the structure shown in Table 14.4.

Table 14.4 Horizontal fragmentation of the customer table by country
Fragment Name | Location | Condition | Node Name | Customer Numbers | Number of rows
CUST_H1 | United Kingdom | CUS_COUNTRY = 'UK' | NAS | 10, 12 | 2
CUST_H2 | The Netherlands | CUS_COUNTRY = 'NL' | ATL | 15 | 1
CUST_H3 | South Africa | CUS_COUNTRY = 'SA' | TAM | 11, 13, 14 | 3

The partition key will be the CUS_COUNTRY field. Each horizontal fragment may have a different number of rows, but each fragment must have the same attributes. The resulting fragments yield the three tables depicted in Figure 14.17.

Figure 14.17

Table fragments in three locations

Table name: CUST_H1   Location: United Kingdom   Node: NAS
CUS_NUM | CUS_NAME | CUS_ADDRESS | CUS_COUNTRY | CUS_LIMIT | CUS_BAL | CUS_RATING | CUS_DUE
10 | Sinex, Inc. | 12 Main St. | UK | 3500.00 | 2700.00 | 3 | 1245.00
12 | Mynux Corp. | 910 Eagle St. | UK | 4000.00 | 3500.00 | 3 | 3400.00

Table name: CUST_H2   Location: The Netherlands   Node: ATL
CUS_NUM | CUS_NAME | CUS_ADDRESS | CUS_COUNTRY | CUS_LIMIT | CUS_BAL | CUS_RATING | CUS_DUE
15 | NBCC Corp. | 909 High Ave. | NL | 2000.00 | 350.00 | 2 | 50.00

Table name: CUST_H3   Location: South Africa   Node: TAM
CUS_NUM | CUS_NAME | CUS_ADDRESS | CUS_COUNTRY | CUS_LIMIT | CUS_BAL | CUS_RATING | CUS_DUE
11 | Martin Corp. | 321 Sunset Blvd. | SA | 6000.00 | 1200.00 | 1 | 0.00
13 | BTBC, Inc. | Rue du Monde | SA | 6000.00 | 5890.00 | 3 | 1090.00
14 | Victory, Inc. | 123 Maple St. | SA | 1200.00 | 550.00 | 1 | 0.00

Vertical Fragmentation

You may also divide the CUSTOMER relation into vertical fragments that are composed of a collection of attributes. For example, suppose the company is divided into two departments: the service department and the collections department. Each department is located in a separate building, and each has an interest in only a few of the CUSTOMER table's attributes. In this case, the fragments are defined as shown in Table 14.5.

Table 14.5 Vertical fragmentation of the CUSTOMER table
Fragment Name | Location | Node Name | Attribute Names
CUST_V1 | Service Bldg | SVC | CUS_NUM, CUS_NAME, CUS_ADDRESS, CUS_COUNTRY
CUST_V2 | Collection Bldg | ARC | CUS_NUM, CUS_LIMIT, CUS_BAL, CUS_RATING, CUS_DUE

Each vertical fragment must have the same number of rows, but the inclusion of the different attributes depends on the key column. The vertical fragmentation results are displayed in Figure 14.18. Note that the key attribute (CUS_NUM) is common to both fragments CUST_V1 and CUST_V2.

Figure 14.18 Vertically fragmented table contents

Table name: CUST_V1   Location: Service Building   Node: SVC
CUS_NUM | CUS_NAME | CUS_ADDRESS | CUS_COUNTRY
10 | Sinex, Inc. | 12 Main St. | UK
11 | Martin Corp. | 321 Sunset Blvd. | SA
12 | Mynux Corp. | 910 Eagle St. | UK
13 | BTBC, Inc. | Rue du Monde | SA
14 | Victory, Inc. | 123 Maple St. | SA
15 | NBCC Corp. | 909 High Ave. | NL

Table name: CUST_V2   Location: Collection Building   Node: ARC
CUS_NUM | CUS_LIMIT | CUS_BAL | CUS_RATING | CUS_DUE
10 | 3500.00 | 2700.00 | 3 | 1245.00
11 | 6000.00 | 1200.00 | 1 | 0.00
12 | 4000.00 | 3500.00 | 3 | 3400.00
13 | 6000.00 | 5890.00 | 3 | 1090.00
14 | 1200.00 | 550.00 | 1 | 0.00
15 | 2000.00 | 350.00 | 2 | 50.00
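The horizontal and vertical strategies, and the claim that a fragmented table can always be recreated by unions and joins, can be checked with a short sketch over a cut-down copy of the CUSTOMER rows from Figure 14.16. Rows are plain dictionaries and only five of the eight attributes are kept to save space. The helper names (`horizontal`, `round_robin`, `vertical` and the two `rebuild_*` functions) are hypothetical and are not part of any DDBMS.

```python
from typing import Dict, List, Sequence

Row = Dict[str, object]

# Cut-down CUSTOMER table (five of the eight attributes in Figure 14.16).
CUSTOMER: List[Row] = [
    {"CUS_NUM": 10, "CUS_NAME": "Sinex, Inc.", "CUS_COUNTRY": "UK", "CUS_LIMIT": 3500.0, "CUS_BAL": 2700.0},
    {"CUS_NUM": 11, "CUS_NAME": "Martin Corp.", "CUS_COUNTRY": "SA", "CUS_LIMIT": 6000.0, "CUS_BAL": 1200.0},
    {"CUS_NUM": 12, "CUS_NAME": "Mynux Corp.", "CUS_COUNTRY": "UK", "CUS_LIMIT": 4000.0, "CUS_BAL": 3500.0},
    {"CUS_NUM": 13, "CUS_NAME": "BTBC, Inc.", "CUS_COUNTRY": "SA", "CUS_LIMIT": 6000.0, "CUS_BAL": 5890.0},
    {"CUS_NUM": 14, "CUS_NAME": "Victory, Inc.", "CUS_COUNTRY": "SA", "CUS_LIMIT": 1200.0, "CUS_BAL": 550.0},
    {"CUS_NUM": 15, "CUS_NAME": "NBCC Corp.", "CUS_COUNTRY": "NL", "CUS_LIMIT": 2000.0, "CUS_BAL": 350.0},
]


def horizontal(rows: List[Row], partition_key: str) -> Dict[object, List[Row]]:
    """Key-based horizontal fragmentation: one fragment per partition key value."""
    fragments: Dict[object, List[Row]] = {}
    for row in rows:
        fragments.setdefault(row[partition_key], []).append(row)
    return fragments


def round_robin(rows: List[Row], n: int) -> List[List[Row]]:
    """Round-robin fragmentation: even spread, but no location awareness."""
    fragments: List[List[Row]] = [[] for _ in range(n)]
    for i, row in enumerate(rows):
        fragments[i % n].append(row)
    return fragments


def vertical(rows: List[Row], key: str, columns: Sequence[str]) -> List[Row]:
    """Vertical fragmentation: project the key column plus the chosen columns."""
    return [{c: row[c] for c in (key, *columns)} for row in rows]


def rebuild_from_horizontal(fragments: Dict[object, List[Row]], key: str) -> List[Row]:
    """Union of the horizontal fragments, ordered by the key."""
    return sorted((r for f in fragments.values() for r in f), key=lambda r: r[key])


def rebuild_from_vertical(left: List[Row], right: List[Row], key: str) -> List[Row]:
    """Join of two vertical fragments on the common key column."""
    index = {r[key]: r for r in right}
    return [{**l, **index[l[key]]} for l in left]


if __name__ == "__main__":
    by_country = horizontal(CUSTOMER, "CUS_COUNTRY")
    print({c: [r["CUS_NUM"] for r in rows] for c, rows in by_country.items()})
    v1 = vertical(CUSTOMER, "CUS_NUM", ["CUS_NAME", "CUS_COUNTRY"])
    v2 = vertical(CUSTOMER, "CUS_NUM", ["CUS_LIMIT", "CUS_BAL"])
    print(rebuild_from_horizontal(by_country, "CUS_NUM") == CUSTOMER)
    print(rebuild_from_vertical(v1, v2, "CUS_NUM") == CUSTOMER)
```

The join only works because both vertical fragments carry CUS_NUM, which is the reason the key column is repeated in CUST_V1 and CUST_V2.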

Mixed Fragmentation

The XYZ Company's structure requires that the CUSTOMER data be fragmented horizontally to accommodate the different company locations; within the locations, the data must be fragmented vertically to accommodate the different departments (service and collection). In short, the CUSTOMER table requires mixed fragmentation.

Mixed fragmentation requires a two-step procedure. First, horizontal fragmentation is introduced for each site based on the location within a country (CUS_COUNTRY). The horizontal fragmentation yields the subsets of customer tuples (horizontal fragments) that are located at each site. As the departments are located in different buildings, vertical fragmentation is used within each horizontal fragment to divide the attributes, thus meeting each department's information needs at each sub site. Mixed fragmentation yields the results displayed in Table 14.6.

Table 14.6 Mixed fragmentation of the CUSTOMER table

Fragment name  Location       Horizontal criteria  Node name  Resulting rows at site  Vertical criteria: attributes at each fragment
CUST_M1        UK-Service     CUS_COUNTRY = 'UK'   NAS-S      10, 12                  CUS_NUM, CUS_NAME, CUS_ADDRESS, CUS_COUNTRY
CUST_M2        UK-Collection  CUS_COUNTRY = 'UK'   NAS-C      10, 12                  CUS_NUM, CUS_LIMIT, CUS_BAL, CUS_RATING, CUS_DUE
CUST_M3        NL-Service     CUS_COUNTRY = 'NL'   ATL-S      15                      CUS_NUM, CUS_NAME, CUS_ADDRESS, CUS_COUNTRY
CUST_M4        NL-Collection  CUS_COUNTRY = 'NL'   ATL-C      15                      CUS_NUM, CUS_LIMIT, CUS_BAL, CUS_RATING, CUS_DUE
CUST_M5        SA-Service     CUS_COUNTRY = 'SA'   TAM-S      11, 13, 14              CUS_NUM, CUS_NAME, CUS_ADDRESS, CUS_COUNTRY
CUST_M6        SA-Collection  CUS_COUNTRY = 'SA'   TAM-C      11, 13, 14              CUS_NUM, CUS_LIMIT, CUS_BAL, CUS_RATING, CUS_DUE

Each fragment displayed in Table 14.6 contains customer data by country and, within each country, by department location, to fit each department's data requirements. The tables corresponding to the fragments listed in Table 14.6 are shown in Figure 14.19.

Figure 14.19 Table contents after the mixed fragmentation process

Table name: CUST_M1   Location: UK-Service   Node: NAS-S

CUS_NUM  CUS_NAME     CUS_ADDRESS    CUS_COUNTRY
10       Sinex, Inc.  12 Main St.    UK
12       Mynux Corp.  910 Eagle St.  UK

Table name: CUST_M2   Location: UK-Collection   Node: NAS-C

CUS_NUM  CUS_LIMIT  CUS_BAL  CUS_RATING  CUS_DUE
10       3500.00    2700.00  3           1245.00
12       4000.00    3500.00  3           3400.00

Table name: CUST_M3   Location: NL-Service   Node: ATL-S

CUS_NUM  CUS_NAME    CUS_ADDRESS    CUS_COUNTRY
15       NBCC Corp.  909 High Ave.  NL

Table name: CUST_M4   Location: NL-Collection   Node: ATL-C

CUS_NUM  CUS_LIMIT  CUS_BAL  CUS_RATING  CUS_DUE
15       2000.00    350.00   2           50.00

Table name: CUST_M5   Location: SA-Service   Node: TAM-S

CUS_NUM  CUS_NAME       CUS_ADDRESS       CUS_COUNTRY
11       Martin Corp.   321 Sunset Blvd.  SA
13       BTBC, Inc.     Rue du Monde      SA
14       Victory, Inc.  123 Maple St.     SA

Table name: CUST_M6   Location: SA-Collection   Node: TAM-C

CUS_NUM  CUS_LIMIT  CUS_BAL  CUS_RATING  CUS_DUE
11       6000.00    1200.00  1           0.00
13       6000.00    5890.00  3           1090.00
14       1200.00    550.00   1           0.00
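The two-step mixed fragmentation procedure can be sketched as a selection by country followed by two projections. This is an illustrative Python sketch under stated assumptions: the data come from the figures above, and the function name mixed_fragments is hypothetical rather than part of any DDBMS product.

```python
# CUSTOMER rows as used throughout this section.
COLUMNS = ("CUS_NUM", "CUS_NAME", "CUS_ADDRESS", "CUS_COUNTRY",
           "CUS_LIMIT", "CUS_BAL", "CUS_RATING", "CUS_DUE")
ROWS = [
    (10, "Sinex, Inc.", "12 Main St.", "UK", 3500.00, 2700.00, 3, 1245.00),
    (11, "Martin Corp.", "321 Sunset Blvd.", "SA", 6000.00, 1200.00, 1, 0.00),
    (12, "Mynux Corp.", "910 Eagle St.", "UK", 4000.00, 3500.00, 3, 3400.00),
    (13, "BTBC, Inc.", "Rue du Monde", "SA", 6000.00, 5890.00, 3, 1090.00),
    (14, "Victory, Inc.", "123 Maple St.", "SA", 1200.00, 550.00, 1, 0.00),
    (15, "NBCC Corp.", "909 High Ave.", "NL", 2000.00, 350.00, 2, 50.00),
]
CUSTOMER = [dict(zip(COLUMNS, row)) for row in ROWS]

SERVICE_ATTRS = ("CUS_NUM", "CUS_NAME", "CUS_ADDRESS", "CUS_COUNTRY")
COLLECTION_ATTRS = ("CUS_NUM", "CUS_LIMIT", "CUS_BAL", "CUS_RATING", "CUS_DUE")


def mixed_fragments(rows, country):
    """Hypothetical helper returning (service, collection) fragments for one country."""
    # Step 1: horizontal fragmentation, selecting rows by CUS_COUNTRY.
    subset = [r for r in rows if r["CUS_COUNTRY"] == country]
    # Step 2: vertical fragmentation of that subset, one projection per department.
    service = [{a: r[a] for a in SERVICE_ATTRS} for r in subset]
    collection = [{a: r[a] for a in COLLECTION_ATTRS} for r in subset]
    return service, collection


cust_m1, cust_m2 = mixed_fragments(CUSTOMER, "UK")
cust_m3, cust_m4 = mixed_fragments(CUSTOMER, "NL")
cust_m5, cust_m6 = mixed_fragments(CUSTOMER, "SA")

for name, frag in [("CUST_M1", cust_m1), ("CUST_M3", cust_m3), ("CUST_M5", cust_m5)]:
    print(name, [row["CUS_NUM"] for row in frag])
```

Each country yields one service fragment and one collection fragment, both of which keep CUS_NUM so that the rows can be matched up again.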

14.11.2 Data replication

Data replication refers to the storage of data copies at multiple sites served by a computer network. Fragment copies can be stored at several sites to serve specific information requirements. Since the existence of fragment copies can enhance data availability and response time, data copies can help to reduce communication and total query costs.

Suppose database A is divided into two fragments, A1 and A2. Within a replicated distributed database, the scenario depicted in Figure 14.20 is possible: fragment A1 is stored at sites S1 and S2, while fragment A2 is stored at sites S2 and S3.

Replicated data are subject to the mutual consistency rule. The mutual consistency rule requires that all copies of data fragments be identical. Therefore, to maintain data consistency among the replicas, the DDBMS must ensure that a database update is performed at all sites where replicas exist.

There are basically two styles of replication:

Push replication. After a data update, the originating DP node sends the changes to the replica nodes to ensure that data are immediately updated. This type of replication focuses on maintaining data consistency. However, it decreases data availability due to the latency involved in ensuring data consistency at all nodes.

Pull replication. After a data update, the originating DP node sends messages to the replica nodes to notify them of the update. The replica nodes decide when to apply the updates to their local fragment. In this type of replication, data updates propagate more slowly to the replicas. The focus is on maintaining data availability. However, this style of replication allows for temporary data inconsistencies.

Although replication has some benefits, it also imposes additional DDBMS processing overhead because each data copy must be maintained by the system. To illustrate the replica overhead imposed on a DDBMS, consider the processes that the DDBMS must perform to use the database:

Figure 14.20 Data replication (Site S1: DP, fragment A1; Site S2: DP, fragments A1 and A2; Site S3: DP, fragment A2)

If the database is fragmented, the DDBMS must decompose a query into subqueries to access the appropriate fragments.

If the database is replicated, the DDBMS must decide which copy to access. A READ operation selects the nearest copy to satisfy the transaction. A WRITE operation requires that all copies be selected and updated to satisfy the mutual consistency rule.

The TP sends a data request to each selected DP for execution.

The DP receives and executes each request and sends the data back to the TP.

The TP assembles the DP responses.

The problem becomes more complex when you consider additional factors such as network topology and communication throughputs.
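The copy-selection step described above (read the nearest copy, write every copy) can be sketched as follows. The fragment placement matches Figure 14.20; the DISTANCE values, the STORAGE dictionary and the read and write function names are assumptions made for this illustration, and a real DDBMS would also have to handle failures and concurrency.

```python
# Fragment placement from Figure 14.20.
PLACEMENT = {"A1": ["S1", "S2"], "A2": ["S2", "S3"]}

# Hypothetical cost of reaching each site from the requesting TP.
DISTANCE = {"S1": 5, "S2": 1, "S3": 3}

# Local storage of each DP: site name -> {fragment name: value}.
STORAGE = {site: {} for site in DISTANCE}


def write(fragment, value):
    """WRITE: update every copy so that all replicas stay identical."""
    targets = PLACEMENT[fragment]
    for site in targets:
        STORAGE[site][fragment] = value
    return targets


def read(fragment):
    """READ: select only the nearest copy of the fragment."""
    nearest = min(PLACEMENT[fragment], key=lambda site: DISTANCE[site])
    return nearest, STORAGE[nearest].get(fragment)


print(write("A1", "version 1"))   # sites that had to be updated
print(read("A1"))                 # site chosen for the read, and its value
```

The write touches both sites holding A1, while the read is served by a single site, which illustrates why replication tends to make reads cheaper and writes more expensive.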

Three replication scenarios exist: a database can be fully replicated, partially replicated, or unreplicated:

A fully replicated database stores multiple copies of each database fragment at multiple sites. In this case, all database fragments are replicated. A fully replicated database can be impractical due to the amount of overhead it imposes on the system.

A partially replicated database stores multiple copies of some database fragments at multiple sites. Most DDBMSs are able to handle the partially replicated database well.

An unreplicated database stores each database fragment at a single site. Therefore, there are no duplicate database fragments.

Several factors influence the decision to use data replication:

Database size. The amount of data replicated will have an impact on the storage requirements and the data transmission costs. Replicating large amounts of data requires a window of time and higher network bandwidth that could affect other applications.

Usage frequency. The frequency of data usage determines how frequently the data needs to be updated.

Costs. Costs include those for performance, software overhead, and management associated with synchronising transactions and their components versus fault-tolerance benefits that are associated with replicated data.

When the usage frequency of remotely located data is high and the database is large, data replication can reduce the cost of data requests. Data replication information is stored in the distributed data catalogue (DDC), whose contents are used by the TP to decide which copy of a database fragment to access. The data replication makes it possible to restore lost data.
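The push and pull replication styles described earlier in this section differ mainly in when a replica applies a change. The sketch below is a simplified, single-process illustration; the class names Replica and Origin and their methods are hypothetical, and no real network or DDBMS API is involved.

```python
class Replica:
    """A replica node holding one copy of a fragment."""

    def __init__(self):
        self.value = None
        self.pending = []          # update notices not yet applied (pull style)

    def notify(self, value):
        # Pull style: only record that an update exists.
        self.pending.append(value)

    def apply_pending(self):
        # The replica decides when to catch up with the originating node.
        if self.pending:
            self.value = self.pending[-1]
            self.pending.clear()


class Origin:
    """The originating DP node."""

    def __init__(self, replicas):
        self.value = None
        self.replicas = replicas

    def update_push(self, value):
        # Push: changes are sent and applied at every replica immediately.
        self.value = value
        for replica in self.replicas:
            replica.value = value

    def update_pull(self, value):
        # Pull: replicas are only notified; they apply the change later.
        self.value = value
        for replica in self.replicas:
            replica.notify(value)


replica = Replica()
origin = Origin([replica])

origin.update_push("v1")
print(replica.value)        # replica already holds the pushed value

origin.update_pull("v2")
print(replica.value)        # replica is temporarily stale
replica.apply_pending()
print(replica.value)        # replica has caught up
```

With push, the update is complete only when every replica has been changed, which favours consistency. With pull, the originating node returns at once and the replica is briefly inconsistent, which favours availability.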

14.11.3 Data allocation

Data allocation describes the process of deciding where to locate data. Data allocation strategies are as follows:

With centralised data allocation, the entire database is stored at one site.

With partitioned data allocation, the database is divided into several disjointed parts (fragments) and stored at several sites.

With replicated data allocation, copies of one or more database fragments are stored at several sites.

Data distribution over a computer network is achieved through data partition, data replication, or a combination of both. Data allocation is closely related to the way a database is divided or fragmented. Most data allocation studies focus on one issue: which data to locate where.

Data allocation algorithms take into consideration a variety of factors, including:

Performance and data availability goals.

Size, number of rows, and number of relations that an entity maintains with other entities.

Types of transactions to be applied to the database and the attributes accessed by each of those transactions.

Disconnected operation for mobile users.

Most algorithms include data such as network topology, network bandwidth and throughput, data size and location. Some algorithms include external data, such as network topology or network throughput. No optimal or universally accepted algorithm exists yet, and very few algorithms have been implemented to date.

14.12 The CAP theorem

In a 2000 symposium on distributed computing, Dr Eric Brewer stated in his presentation that in any highly distributed data system there are three commonly desirable properties: consistency, availability and partition tolerance. However, it is impossible for a system to provide all three properties at the same time.3 The initials CAP stand for the three desirable properties. Consider these three properties in more detail:

3 Dr Eric A. Brewer, University of California at Berkeley and Inktomi Corporation, presentation 'Towards Robust Distributed Systems' at the Principles of Distributed Computing, ACM Symposium, July 2000. This theorem was later proven by Seth Gilbert and Nancy Lynch of MIT in their paper 'Brewer's Conjecture and the Feasibility of Consistent, Available, Partition-Tolerant Web Services', ACM SIGACT News, vol. 33, Issue 2, pp. 51-59, 2002.

Consistency. In a distributed database, consistency takes a bigger role. All nodes should see the same data at the same time, which means that the replicas should be immediately updated. However, this involves dealing with latency and network partitioning delays, as you learnt in Section 14.10.

Availability. Simply speaking, a request is always fulfilled by the system. No received request is ever lost. If you are buying tickets online, you do not want the system to stop in the middle of the operation. This is a paramount requirement of all Web-centric organisations.

Partition tolerance. The system continues to operate even in the event of a node failure. This is the equivalent of failure transparency in distributed databases (see Section 14.7). The system will fail only if all nodes fail.

Although the CAP theorem focuses on highly distributed Web-based systems, its implications are widespread for all distributed systems, including databases. In Chapter 12, you learnt that the ACIDS properties (atomicity, consistency, isolation, durability and serialisability) ensure that all successful transactions result in a consistent database state, one in which all data operations always return the same results. For centralised and small distributed databases, latency is not an issue. As the business grows and the need for availability increases, database latency becomes a bigger problem. It is more difficult for a highly distributed database to ensure ACIDS transactions without paying a high price in network latency or data contention (delays imposed by concurrent data access).

For example, imagine that you are using Computicket or Webtickets to buy tickets for the Kaizer Chiefs-Orlando Pirates soccer game at the FNB Stadium in Johannesburg. You may spend a few minutes browsing the website and checking the seats available. At the same time, other customers from all over the world may be doing exactly the same thing. By the time you click the checkout button, the tickets you selected may already have been purchased by someone else! In this case, you will start again and select other tickets until you get the ones you want. The website is designed to work this way on purpose: the ticket seller prefers the small probability of having a few customers restart their transaction to having thousands of users waiting on locked data.

As this example shows, when dealing with highly distributed systems, some companies tend to forfeit the consistency and isolation components of the ACIDS properties to achieve higher availability. This trade-off between consistency and availability has generated a new type of distributed data system in which data are basically available, soft state, eventually consistent (BASE). BASE refers to a data consistency model in which data changes are not immediate but propagate slowly through the system until all replicas are eventually consistent.

In practice, the emergence of NoSQL distributed databases now provides a spectrum of consistency that ranges from the highly consistent (ACIDS) to the eventually consistent (BASE), as shown in Table 14.7. NoSQL databases provide a highly distributed database with eventual consistent support. NewSQL databases attempt to merge the best of relational and NoSQL data models. For example, Google Cloud Spanner provides a highly scalable distributed data service with support for consistent ACIDS transactions.

Table 14.7 Distributed database spectrum

DBMS Type         Consistency  Availability  Partition Tolerance  Transaction Model  Trade-off
Centralised DBMS  High         High          N/A                  ACIDS              No distributed data processing
Relational DBMS   High         Relaxed       High                 ACIDS (2PC)        Sacrifices availability to ensure consistency and isolation
NoSQL DDBMS       Relaxed      High          High                 BASE               Sacrifices consistency to ensure availability
NewSQL DDBMS      High         High          Relaxed              ACIDS              Sacrifices partition tolerance to ensure transaction consistency and availability
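The ticket-buying scenario used to motivate the availability trade-off can be sketched as browsing without locks plus a check at checkout time. This is only an illustration of the idea in Python; the TicketInventory class and its method names are hypothetical, and a real ticketing site would involve many replicas and network delays.

```python
class TicketInventory:
    """Seats are never locked while customers browse."""

    def __init__(self, seats):
        self.available = set(seats)

    def browse(self):
        # Each customer gets a snapshot that may be out of date by checkout time.
        return set(self.available)

    def checkout(self, seat):
        # The seat is checked only now; a stale choice simply fails.
        if seat in self.available:
            self.available.remove(seat)
            return True
        return False


inventory = TicketInventory({"A1", "A2"})

view_1 = inventory.browse()       # first customer sees A1 and A2
view_2 = inventory.browse()       # second customer sees the same seats

print(inventory.checkout("A1"))   # first customer buys A1
print(inventory.checkout("A1"))   # second customer is too late and must restart
print(inventory.checkout("A2"))   # second customer selects another seat
```

No customer ever waits for another customer's lock, so the site stays available; the price is that a few customers have to restart their selection.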

14.13 Database security

Maintaining data security in a DDBMS is far more complex than in a centralised DBMS, as the underlying network has also to be made secure. In addition, all of the database security features described in Chapter 10, Database Development Process, such as users and roles, still apply. Typically, DDBMS vendors will offer additional security features to support a distributed database. For example, Oracle provides specific authentication features for database links.4 Consider the following SQL statement:

CREATE PUBLIC DATABASE LINK travel USING 'travel';

This statement creates a public, non-authenticated link, i.e. a pointer, to the remote customer database travel. As the PUBLIC keyword is used, all users could access the remote database by referencing the link. To make the link secure, password authentication can be supplied when creating the actual link. For more specific information about security features, refer to the vendor-specific reference manual.

4 Oracle Database Administrator's Guide, 11g Release 2 (11.2), Part Number E25494-02. Available: https://docs.oracle.com/cd/E11882_01/nav/portal_4.htm

14.14 Distributed databases within the cloud

Current trends in distributed data systems cannot fail to discuss cloud computing. Cloud computing is a new style of delivering information technology (IT) infrastructure and applications over the Web. It provides an alternative model for organisations that do not wish to host their own databases or software. Instead, they rely on a third party cloud provider that provides a range of IT services through interconnected, virtualised computers and resources, often called the cloud infrastructure. Each third party cloud provider will have its own flexible pricing model for each service it provides, which can be negotiated with an organisation to provide a service level agreement. The main benefits of using the cloud are:

Cost-effectiveness. As the third party cloud provider uses a number of computers that are standardised, only one IT infrastructure is required, which is likely to be hosting services for many organisations, which reduces the cost to the individual organisation. Under the negotiated service level agreement, an organisation will also only pay for what it requires.

Latest software. Most third party cloud providers will ensure that their software is always the latest version available to remain competitive.

Scalable architecture. If the data requirements of the organisation change and/or expand, it is generally easy to increase the capacity of the underlying database within the cloud model.

Mobile access. Data and software can be accessed from anywhere, which allows greater flexibility for employees in terms of where they work.

note You

It is

willlearn

clear

For

more about

that

traditional

example,

hardware

a

resources

One solution

Column

is to

for

if

time

use

are for

such

stores depending

replica

addition,

upon

A further

example

distributed to

review

Singh,

has

Learning. that

any

store

a cloud

queries)

within

the

Web Technologies.

environment. and

operating

petrabytes

uses

associated

within a cloud at a given time.5

cloud.

defines

Bigtable

of

Bigtable

Current

data

to

NoSQL

by a row

across

store

as a parse,

map is indexed

its

structured

distributed,

key,

column

and is

often

often

The

CAP

All

to

Rights

P.,

key and a

Reserved. content

does

May not

not

be

(nodes

within

support

synchronous even

Computing and

copied, affect

Impact,

scanned, the

overall

or

duplicated, learning

the

73,

in experience.

whole

do

database can

can

it is

so the

which

queries

be changed

is

a

can be quickly,

to

2012,

replication

Latest pp.

a necessity

Computer

or in Cengage

part.

Due Learning

electronic reserves

cloud

tolerance properties

Abadi

stated

many applications

that

a

require

partition.6

Architectural

Concepts,

World

2011.

Society,

to

that the

only two

Dr Daniel

and

follow

of partition

support

because

Trends

104245,

infrastructures

property

be said In

cloud

when there is no network

Databases:

IEEE

SimpleDB,

1 but

cloud)

an AP system.

Technology,

In

data can be geographically

model

Web services,

So, a cloud as

Amazons

data

system

clients.

data.

relational

scalability,

using

is sacrificed

Cloud

underlying

database

or offline

Thus, the

Traditional

unlimited

essential.

Growing

materially

offer

is

stored

as document-orientated

performance.

to

Engineering Theorems

The for

and accessed

also

solution

is

to

a distributed

servers

or delete

environment.

referred

cannot

Sandhu, Science,

requests. optimised

multiple

update, insert

database

document

are referred

which is

on

high availability.

servers

data is

query,

NoSQL

ultimately

of

exist

each

stores

CouchDB, can

within a cloud

and

Instead,

Document

Apaches

Web service

of individual

of

suppressed

is

operates

Thus, consistency

S.,

Cengage deemed

that

data in tables.

database

As data is stored

availability

T. and

Shim,

2020

data

example,

Google

the ability to

indexed seeks

system

Academy

Copyright

using

CAP theorem

distributed

Editorial

data

with failure High

low latency.

6

for

and format.

which allows

automatically

computing

deal met.

5

in

upon service requirements

manage

systems

Earth.

of a vendor-specific

CAP theorem?

of the

and

Google,

storing

size

same

are offered

data store that

query

Cloud

can

of the

with replication,

data is

its

An example

copies

all users

non-relational

is

operate (through

and

stores:

map where the

move away from

databases. where

the

Google

to

requests

based

store

distributed

sorted

trying

Connectivity

stamp.

Document

used

as

multidimensional

differently

and

to

of servers.

17, Database

This is in contrast to a DDBMS

document

large-scale

all data

dynamically

and

not thousands,

Chapter

when

over

databases

stores

in

problems

control

data is consistent.

NoSQL

applications

persistent

has

are allocated

column

stores

hundreds, data

to ensure

services

will face

typically

resources

include

computing

DDBMS

DDBMS

where hardware

solutions

cloud

45(2),

rights, the

right

some to

third remove

212,

party additional

2012.

content

may content

be

suppressed at

any

time

from if

the

subsequent

eBook rights

and/or restrictions

eChapter(s). require

it.

When a third party manages all your data, how good is the security of that data? In 2019, big public cloud providers such as Amazon Web Services (AWS), Google Cloud, Microsoft Azure, IBM, Alibaba and Oracle are set to grow bigger. This growth makes them more of a target for malicious attacks. Oracle predicts that the number of security-related events will increase dramatically on a daily basis and that one solution will be to use cloud-based artificial intelligence to defend against these threats.7

14.15 C.J. Date's 12 commandments for distributed databases

No discussion of distributed databases is complete unless it includes C.J. Date's 12 commandments.8 Although no current DDBMS conforms to all of them, they do constitute a useful target. The 12 rules describe a fully distributed database and are as follows:

1 Local site independence. Each local site can act as an independent, autonomous, centralised DBMS. Each site is responsible for security, concurrency control, backup and recovery.

2 Central site independence. No site in the network relies on a central site or any other site. All sites have the same capabilities.

3 Failure independence. The system is not affected by node failures. The system is in continuous operation, even in the case of a node failure or an expansion of the network.

4 Location transparency. The user does not need to know the location of the data in order to retrieve those data.

5 Fragmentation transparency. The user sees only one logical database. Data fragmentation is transparent to the user. The user does not need to know the name of the database fragments in order to retrieve them.

6 Replication transparency. The user sees only one logical database. The DDBMS transparently selects the database fragment to access. To the user, the DDBMS manages all fragments transparently.

7 Distributed query processing. A distributed query may be executed at several different DP sites. Query optimisation is performed transparently by the DDBMS.

8 Distributed transaction processing. A transaction may update data at several different sites. The transaction is transparently executed at several different DP sites.

9 Hardware independence. The system must run on any hardware platform.

10 Operating system independence. The system must run on any operating system software platform.

11 Network independence. The system must run on any network platform.

12 Database independence. The system must support any vendor's database product.

7 Oracle's top 10 Cloud Predictions 2019 [online], available: www.oracle.com/assets/oracle-cloud-predictions-2019-5244106.p 2019

8 Date, C.J., 'Twelve Rules for a Distributed Database', Computer World, 2(23), pp. 77-81, 8 June, 1987.

suMMary A distributed database stores logically related data in two or more physically independent sites connected via a computer network. The database is divided into fragments, which can be horizontal (a set of rows) or vertical (a set of attributes). Each fragment can be allocated to a different

network

node.

Distributed processing is the division oflogical database processing among two or more network nodes. Distributed databases require distributed processing. A distributed database management system (DDBMS)

interconnected

governs the

computer

The main components (DP).

processing and storage of logically related data through systems.

The main components of a DDBMS are the transaction processor (TP) and the data processor (DP). The transaction processor component is the software that resides on each computer node that requests data. The data processor component is the software that resides on each computer that stores and retrieves data.

Current database systems can be classified by the extent to which they support processing and data distribution. Three major categories are used to classify distributed database systems: (1) single-site processing, single-site data (SPSD); (2) multiple-site processing, single-site data (MPSD); and (3) multiple-site processing, multiple-site data (MPMD).

A homogeneous distributed database system integrates only one particular type of DBMS over a computer network. A heterogeneous distributed database system integrates several different types of DBMSs over a computer network.

DDBMS characteristics are best described as a set of transparencies: distribution, transaction, failure, heterogeneity and performance. All transparencies share the common objective of making the distributed database behave as though it were a centralised database system; that is, the end user sees the data as part of a single logical centralised database and is unaware of the system's complexities.

A transaction is formed by one or more database requests. An undistributed transaction updates or requests data from a single site. A distributed transaction can update or request data from multiple sites.

Distributed concurrency control is required in a network of distributed databases. A two-phase COMMIT protocol is used to ensure that all parts of a transaction are completed.

A distributed DBMS evaluates every data request to find the optimum access path in a distributed database. The DDBMS must optimise the query to reduce access, communications and CPU costs associated with the query.

The design of a distributed database must consider the fragmentation and replication of data. The designer must also decide how to allocate each fragment or replica to obtain better overall response time and to ensure data availability to the end user.

A database can be replicated over several different sites on a computer network. The replication of the database fragments has the objective of improving data availability, thus decreasing access time. A database can be partially, fully, or not replicated. Data allocation strategies are designed to determine the location of the database fragments or replicas.

The CAP theorem states that a highly distributed data system has some desirable properties of consistency, availability and partition tolerance. However, a system can only provide two of these properties at a time.
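The two-phase COMMIT protocol mentioned in the summary can be illustrated with a minimal sketch. The class and function names below (`Subordinate`, `two_phase_commit`, the `can_commit` flag) are hypothetical and chosen only for illustration; a real DDBMS would also write log records and handle time-outs and node failures, which this sketch leaves out.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Subordinate:
    """Hypothetical node holding one part of a distributed transaction."""
    name: str
    can_commit: bool = True  # assumption: stands in for the node's local checks
    state: str = "INIT"

    def prepare(self) -> bool:
        # Phase 1: the node votes on whether its local part can be committed.
        self.state = "PREPARED" if self.can_commit else "ABORTED"
        return self.can_commit

    def commit(self) -> None:
        self.state = "COMMITTED"

    def abort(self) -> None:
        self.state = "ABORTED"


def two_phase_commit(subordinates: List[Subordinate]) -> str:
    """Coordinator logic: commit everywhere only if every node votes yes."""
    # Phase 1 (preparation): collect a vote from every subordinate.
    votes = [node.prepare() for node in subordinates]
    # Phase 2 (final decision): apply the same outcome at every node.
    if all(votes):
        for node in subordinates:
            node.commit()
        return "COMMITTED"
    for node in subordinates:
        node.abort()
    return "ABORTED"
```

The point of the design is that no node commits its part until the coordinator has heard a positive vote from all of them, so the transaction is either completed at every site or at none.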


Key Terms

application processor (AP)
basically available, soft state, eventually consistent (BASE)
centralised data allocation
client/server architecture
cloud computing
column stores
coordinator
data allocation
data fragmentation
data manager (DM)
data processor (DP)
data replication
database fragments
distributed data catalogue (DDC)
distributed data dictionary (DDD)
distributed database
distributed database management system (DDBMS)
distributed global schema
distributed processing
distributed request
distributed transaction
distribution transparency
document stores
DO-UNDO-REDO protocol
failure transparency
fragmentation transparency
fully heterogeneous DDBMS
fully replicated database
heterogeneity transparency
heterogeneous DDBMS
homogeneous DDBMS
horizontal fragmentation
local mapping transparency
location transparency
mixed fragmentation
multiple-site processing, multiple-site data (MPMD)
multiple-site processing, single-site data (MPSD)
mutual consistency rule
network latency
network partitioning
NewSQL
NoSQL
partially replicated database
partition key
partitioned data allocation
performance transparency
remote request
remote transaction
replica transparency
replicated data allocation
service level agreement
single-site processing, single-site data (SPSD)
subordinates
transaction manager (TM)
transaction processor (TP)
transaction transparency
two-phase commit protocol
unique fragment
unreplicated database
vertical fragmentation
write-ahead protocol

Further Reading

Jain, A., The Cloud DBA-Oracle: Managing Oracle Database in the Cloud. Apress, 2017.

Online Content

Answers to selected Review Questions and Problems for this chapter are contained in the online platform for this book.

Review Questions

1 Describe the evolution from centralised DBMS to distributed DBMSs.

2 List and discuss some of the factors that influenced the evolution of the DDBMS.

3 What are the advantages of the DDBMS?

4 What are the disadvantages of the DDBMS?

5 Explain the difference between a distributed database and distributed processing.

6 What is a fully distributed database management system?

7 What are the components of a DDBMS?

8 Explain the transparency features of a DDBMS.

9 Define and explain the different types of distribution transparency.

10 Describe the different types of database requests and transactions.

11 Explain the need for the two-phase commit protocol. Then describe the two phases.

12 What is the objective of query optimisation functions?

13 To which transparency feature are the query optimisation functions related?

14 What are the different types of query optimisation algorithms?

15 Describe the three data fragmentation strategies. Give some examples.

16 What is data replication, and what are the three replication strategies?

17 How does a BASE system differ from a traditional distributed database system?

18 What are the three properties of the CAP theorem?

19 What are the main benefits to an organisation of using a cloud infrastructure?

Problems

The following problem is based on the DDBMS scenario in Figure P14.1.

Figure P14.1 The DDBMS scenario for Problem 1

TABLES      FRAGMENTS   LOCATION
CUSTOMER    N/A         A
PRODUCT     PROD_A      A
            PROD_B      B
INVOICE     N/A         B
INV_LINE    N/A         B

[Network diagram: CUSTOMER and PROD_A are stored at Site A; INVOICE, INV_LINE and PROD_B are stored at Site B; Site C is also shown.]

1 Specify the minimum type(s) of operation(s) the database must support (remote request, remote transaction, distributed transaction, or distributed request) to perform the following operations:

At Site C

a   SELECT * FROM CUSTOMER;

b   SELECT * FROM INVOICE WHERE INV_TOT > 1000;

c   SELECT * FROM PRODUCT WHERE PROD_QOH < 10;

d   BEGIN WORK;
    UPDATE CUSTOMER SET CUS_BAL = CUS_BAL + 100 WHERE CUS_NUM = '10936';
    INSERT INTO INVOICE(INV_NUM, CUS_NUM, INV_DATE, INV_TOTAL) VALUES ('986391', '10936', '15-FEB-2019', 100);
    INSERT INTO LINE(INV_NUM, PROD_NUM, LINE_PRICE) VALUES ('986391', '1023', 100);
    UPDATE PRODUCT SET PROD_QOH = PROD_QOH - 1 WHERE PROD_NUM = '1023';
    COMMIT WORK;

e   BEGIN WORK;
    INSERT INTO CUSTOMER(CUS_NUM, CUS_NAME, CUS_ADDRESS, CUS_BAL) VALUES ('34210', 'Victor Ephanor', '143 Main St.', 0.00);
    INSERT INTO INVOICE(INV_NUM, CUS_NUM, INV_DATE, INV_TOTAL) VALUES ('986434', '34210', '10-AUG-2018', 2.00);
    COMMIT WORK;

At Site A

f   SELECT CUS_NUM, CUS_NAME, INV_TOTAL FROM CUSTOMER, INVOICE WHERE CUSTOMER.CUS_NUM = INVOICE.CUS_NUM;

g   SELECT * FROM INVOICE WHERE INV_TOTAL > 1000;

h   SELECT * FROM PRODUCT WHERE PROD_QOH < 10;

At Site B

i   SELECT * FROM CUSTOMER;

j   SELECT CUS_NAME, INV_TOTAL FROM CUSTOMER, INVOICE WHERE INV_TOTAL > 1000 AND CUSTOMER.CUS_NUM = INVOICE.CUS_NUM;

k   SELECT * FROM PRODUCT WHERE PROD_QOH < 10;

2 The following data structure and constraints exist for a magazine publishing company:

a   The company publishes one regional magazine in each country: France (FR), South Africa (SA), the Netherlands (NL) and the United Kingdom (UK).

b   The company has 300 000 customers (subscribers) distributed throughout the four countries listed in Part a.

c   On the first of each month, an annual subscription INVOICE is printed and sent to each customer whose subscription is due for renewal. The INVOICE entity contains a REGION attribute to indicate the country (FR, SA, NL, UK) in which the customer resides:

    CUSTOMER (CUS_NUM, CUS_NAME, CUS_ADDRESS, CUS_CITY, CUS_REGION, CUS_POSTCODE, CUS_SUBSDATE)
    INVOICE (INV_NUM, INV_REGION, CUS_NUM, INV_DATE, INV_TOTAL)

The company's management is aware of the problems associated with centralised management and has decided to decentralise management of the subscriptions into the company's four regional subsidiaries. Each subscription site will handle its own customer and invoice data. The management at company headquarters, however, will have access to customer and invoice data to generate annual reports and to issue ad hoc queries such as:

List all current customers by region.
List all new customers by region.
Report all invoices by customer and by region.

Given those requirements, how must you partition the database?

3 Given the scenario and the requirements in Problem 2, answer the following questions:

a   Which recommendations will you make regarding the type and characteristics of the required database system?

b   What type of data fragmentation is needed for each table?

c   Which criteria must be used to partition each database?

d   Design the database fragments. Show an example with node names, location, fragment names, attribute names and demonstration data.

e   What type of distributed database operations must be supported at each remote site?

f   What type of distributed database operations must be supported at the headquarters site?
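As an illustration of the horizontal fragmentation strategy named in the chapter, the sketch below splits a CUSTOMER table into one fragment per region and rebuilds a global view from the fragments. It is only one possible arrangement, not a model answer to the problems. Everything beyond the CUS_NUM, CUS_NAME and CUS_REGION attribute names is an assumption: the fragment names (`CUSTOMER_FR` and so on), the helper functions and the sample rows are made up, and a single in-memory SQLite database stands in for what would be separate sites in a real DDBMS.

```python
import sqlite3

REGIONS = ("FR", "SA", "NL", "UK")


def build_fragments(conn, rows):
    """Create one fragment table per region and route each row by CUS_REGION."""
    for region in REGIONS:
        conn.execute(
            f"CREATE TABLE CUSTOMER_{region} ("
            "CUS_NUM TEXT PRIMARY KEY, CUS_NAME TEXT, CUS_REGION TEXT)"
        )
    for cus_num, cus_name, cus_region in rows:
        if cus_region not in REGIONS:
            raise ValueError(f"unknown region: {cus_region}")
        conn.execute(
            f"INSERT INTO CUSTOMER_{cus_region} VALUES (?, ?, ?)",
            (cus_num, cus_name, cus_region),
        )


def create_global_view(conn):
    """Headquarters view: the union of all regional fragments."""
    union = " UNION ALL ".join(f"SELECT * FROM CUSTOMER_{r}" for r in REGIONS)
    conn.execute(f"CREATE VIEW CUSTOMER AS {union}")


# Made-up demonstration rows: (CUS_NUM, CUS_NAME, CUS_REGION).
sample = [
    ("1001", "A. Dupont", "FR"),
    ("1002", "B. Nkosi", "SA"),
    ("1003", "C. de Vries", "NL"),
    ("1004", "D. Smith", "UK"),
    ("1005", "E. Jones", "UK"),
]

conn = sqlite3.connect(":memory:")
build_fragments(conn, sample)
create_global_view(conn)

# An ad hoc headquarters query that spans every fragment.
counts = dict(
    conn.execute("SELECT CUS_REGION, COUNT(*) FROM CUSTOMER GROUP BY CUS_REGION")
)
```

Each regional site would work only with its own fragment, while a query against the `CUSTOMER` view behaves as if the table had never been split, which is the effect that fragmentation transparency aims for.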

Chapter 15 Databases for Business Intelligence

In this chapter, you will learn:

How business intelligence provides a comprehensive business decision support framework
About business intelligence architecture, its evolution and reporting styles
About the data warehouse life cycle
How to prepare data for the data warehouse using the Extraction, Transformation and Loading Process
How to develop star and snowflake schemas
About the role and functions of data analytics and data mining for decision-making purposes
About the characteristics and capabilities of online analytical processing (OLAP)
How SQL analytic functions are used to support data analytics
About data visualisation and how it supports business intelligence

Preview

Business intelligence (BI) is the collection of best practices and software tools developed to support business decision making in this age of globalisation, emerging markets, rapid change and increasing regulation. The complexity and range of information required to support business decisions has increased, and operational database structures were unable to support all of these requirements. Therefore, a new data storage facility, called a data warehouse, was developed. The data warehouse extracts its data from operational databases as well as from external sources, providing a more comprehensive data pool.

Additionally, new ways to analyse and present decision support data were developed. Online analytical processing (OLAP) provides advanced data analysis and visualisation tools, including multidimensional data analysis. This chapter explores the main concepts and components of business intelligence and decision support systems that gather, generate and present information for business decision makers, focusing especially on the use of data warehouses, data analytics and data visualisation.

15.1 The Need for Data Analysis

Organisations tend to grow and prosper as they gain a better understanding of their environment. Typically, business managers must be able to track daily transactions to evaluate how the business is performing. By tapping into the operational database, management can develop strategies to meet organisational goals. In addition, data analysis can provide information about short-term tactical evaluations and strategies such as: Are our sales promotions working? What market percentage are we controlling? Are we attracting new customers? Tactical and strategic decisions are also shaped by constant pressure from external and internal forces, including globalisation, the cultural and legal environment, and, perhaps most importantly, technology.

Given the many and varied competitive pressures, managers are always looking for a competitive advantage through product development and maintenance, service, market positioning, sales promotion and so on. Thanks to the internet, customers are more informed about the products they want and how much they are willing to pay. Technological advances allow customers to place orders from their smartphones while they commute to work. Decision makers can no longer wait a couple of days for a report to be generated; quick decisions must be made for the business to remain competitive. Every day, advertisements offer, for example, instant price matching, and the question is, How can a company survive on lower margins and still make a profit? The key is having the right data at the right time to support the decision-making process.

Different managerial levels have different decision support needs. For example, transaction-processing systems, based on operational databases, are tailored to serve the information needs of people who deal with short-term inventory, accounts payable and purchasing. Middle-level managers, general managers, vice presidents and presidents focus on strategic and tactical decision making. Those managers require summarised information designed to help them make decisions in a complex business environment.

Companies and software vendors addressed these multilevel decision support needs by creating autonomous applications for particular groups of users, for example, those in finance or customer relationship management. Applications were also developed for different industries such as education, healthcare and finance. The approach started to work well, but changes in the way in which business was conducted (for example, globalisation, expanding markets, mergers and acquisitions, increased regulation and new technologies) called for new ways of integrating and managing decision support across levels, sectors and geographical locations. This more comprehensive and integrated decision support framework became known as business intelligence.

15.2 Business Intelligence

Business intelligence (BI)1 is a term that describes a comprehensive, cohesive, and integrated set of tools and processes used to capture, collect, integrate, store and analyse data with the purpose of generating and presenting information to support business decision making. This intelligence is based on learning and understanding the facts about the business environment. BI is a framework that allows a business to transform data into information, information into knowledge, and knowledge into wisdom. BI has the potential to positively affect a company's culture by creating continuous business performance improvement through active decision support at all levels in an organisation. This business insight empowers users to make sound decisions based on the accumulated knowledge of the business.

BI's initial adopters were high-volume industries such as financial services, insurance and healthcare companies. As BI technology evolved, its usage spread to other industries such as telecommunications, retail/merchandising, manufacturing, media, government, and even education. Table 15.1 lists some companies that have implemented BI tools and shows how the tools have benefited the companies. You will learn about these tools later in the chapter.

1 In 1989, while working at Gartner Inc., Howard Dresner popularised BI as an umbrella term to describe a set of concepts and methods to improve business decision making by using fact-based support systems (www.computerworld.com/action/article.do?command=viewArticleBasic&articleId=266298).

Table 15.1 Solving business problems and adding value with BI tools

Company: CiCi's Enterprises. Eighth-largest pizza chain in the US; operates 650 pizza restaurants in 30 states. Source: Cognos Corp., www.cognos.com
Problem: Information access was cumbersome and time-consuming. Needed to increase accuracy in the creation of marketing budgets. Needed an easy, reliable and efficient way to access daily data.
Benefit: Provided accurate, timely budgets in less time. Provided analysts with access to data for decision-making purposes. Received in-depth view of product performance by store to reduce waste and increase profits.

Company: Nasdaq. Largest US electronic stock market trading organisation. Source: Oracle, www.oracle.com
Problem: Inability to provide real-time, ad hoc query and standard reporting for executives, business analysts and other users. Excessive storage costs for many terabytes of data.
Benefit: Reduced storage costs by moving to a multitier storage solution. Implemented new data warehouse centre with support for ad hoc query and reporting, and near real-time data access for end users.

Company: Pfizer. Global pharmaceutical company. Source: Oracle, www.oracle.com
Problem: Needed a way to control costs and adjust to tougher market conditions, international competition and increasing government regulations. Needed better analytical capabilities and flexible decision-making framework.
Benefit: Ability to get and integrate financial data from multiple sources in a reliable way. Streamlined, standards-based financial analysis to improve forecasting process. Faster and smarter decision making for business strategy formulation.

Company: Swisscom. Switzerland's leading telecommunications provider. Source: Microsoft, www.microsoft.com
Problem: Needed a tool to help employees monitor service-level compliance. Had a time-consuming process to generate performance reports. Needed a way to integrate data from 200 different systems.
Benefit: Ability to monitor performance using dashboard technology. Quick and easy access to real-time performance data. Managers have closer and better control over costs.

Implementing BI in an organisation involves capturing not only internal and external business data, but also the metadata, or knowledge about the data. In practice, BI is a complex proposition that requires a deep understanding and alignment of the business processes, business data and information needs of users at all levels in an organisation. (See Appendix L, Data Warehouse Implementation Factors, available on the online platform for this book.)

BI is not a product by itself, but a framework of concepts, practices, tools and technologies that help a business better understand its core capabilities, provide snapshots of the company situation and identify key opportunities to create competitive advantage. In general, BI provides a framework for:

1 Collecting and storing operational data

2 Aggregating the operational data into decision support data

3 Analysing decision support data to generate information

4 Presenting such information to the end user to support business decisions

5 Making business decisions, which in turn generate more data that are collected, stored, and so on (restarting the process)

6 Monitoring results to evaluate outcomes of the business decisions, which again provides more data to be collected, stored, and so on

7 Predicting future behaviours and outcomes with a high degree of accuracy.

The seven preceding points represent a system-wide view of the flow of data, processes and outcomes within the BI framework. In practice, the first point, collecting and storing operational data, does not fall into the realm of a BI system per se; rather, it is the function of an operational system. However, the BI system will use the operational data as input material from which information will be derived. The rest of the processes and outcomes explained in the preceding points are orientated towards generating knowledge, and they are the focus of the BI system. In the following section, you will learn about the basic BI architecture.

15.2.1 Business Intelligence Architecture

BI covers a range of technologies and applications to manage the entire data life cycle from acquisition to storage, transformation, integration, presentation, analysis, monitoring and archiving. BI functionality ranges from simple data gathering and transformation to very complex data analysis and presentation. BI architecture ranges from highly integrated single-vendor systems to loosely integrated, multivendor environments. However, some common functions are expected in most BI implementations.

Like any critical business IT infrastructure, the BI architecture is composed of data, people, processes, technology and the management of such components. Figure 15.1 depicts how all these components fit together within the BI framework.
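The first three points of the BI framework listed in Section 15.2 (collecting operational data, aggregating it into decision support data, and analysing it to generate information) can be sketched in a few lines. The field names, the sample invoice rows and the helper functions `aggregate` and `best_region` are hypothetical and serve only to make the flow concrete; a real BI system would read from operational databases and write to a data store.

```python
from collections import defaultdict

# Point 1: operational data, one row per invoice (made-up sample values).
operational_rows = [
    {"region": "FR", "month": "2019-01", "total": 120.0},
    {"region": "FR", "month": "2019-01", "total": 80.0},
    {"region": "UK", "month": "2019-01", "total": 150.0},
    {"region": "UK", "month": "2019-02", "total": 50.0},
]


def aggregate(rows):
    """Point 2: summarise operational rows into decision support data."""
    summary = defaultdict(float)
    for row in rows:
        summary[(row["region"], row["month"])] += row["total"]
    return dict(summary)


def best_region(summary, month):
    """Point 3: derive information, here the top-selling region in a month."""
    in_month = {
        region: total for (region, m), total in summary.items() if m == month
    }
    return max(in_month, key=in_month.get)


decision_support = aggregate(operational_rows)
top_in_january = best_region(decision_support, "2019-01")
```

The detailed rows stay in the operational system; the BI side works with the summarised totals, which is why the aggregation step sits between collection and analysis.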

Figure 15.1 Business intelligence framework

[Diagram: the business intelligence framework brings together people, processes, management and governance. External data and operational data pass through ETL (extraction, transformation and loading) into the data store (data warehouse or data mart), which feeds query and reporting, data visualisation, monitoring and alerting, and data analytics.]

Source: Course Technology/Cengage Learning

The general BI framework depicted in Figure 15.1 has six basic components that encompass the functionality required in most current-generation BI systems. You will learn more about these components later in this chapter. The components are briefly described in Table 15.2.

Table 15.2 Basic BI architectural components

Component: ETL tools
Description: Data extraction, transformation and loading (ETL) tools collect, filter, integrate and aggregate internal and external data to be saved into a data store optimised for decision support. Internal data are generated by the company during its day-to-day operations, such as product sales history, invoicing and payments. The external data sources provide data that cannot be found within the company but are relevant to the business, such as stock prices, market indicators, marketing information (such as demographics) and competitors' data. External data are generally located in external databases provided by industry groups or companies that market the data.

Component: Data store
Description: The data store is optimised for decision support and is generally represented by a data warehouse or a data mart. The data are stored in structures that are optimised for data analysis and query speed.

Component: Query and reporting
Description: This component performs data selection and retrieval, and is used by the data analyst to create queries that access the database and create the required reports. Depending on the implementation, the query and reporting tool accesses the operational database, or more commonly, the data store.

Component: Data visualisation
Description: This component presents data to the end user in a variety of meaningful and innovative ways. This tool helps the end user to select the most appropriate presentation format, such as summary reports, maps, pie or bar graphs, mixed graphs, or dashboards.

Component: Data monitoring and alerting
Description: This component allows real-time monitoring of business activities. The BI system will present the concise information in a single integrated view for the data analyst. This integrated view could include specific metrics about the system performance or business activities, such as number of orders placed in the past four hours, number of customer complaints by product by month, and total revenue by region. Alerts can be placed on a given metric; once the value of a metric goes below or above a certain baseline, the system will perform a given action, such as emailing shop floor managers, presenting visual alerts or starting an application.

Component: Data analytics
Description: This component performs data analysis and data-mining tasks using the data in the data store. This tool advises the user on how to build a reliable business data model. Business models are generated by special algorithms that identify and enhance the understanding of business situations and problems. Data analysis can be either explanatory or predictive. Explanatory analysis uses the existing data in the data store to discover relationships and their types, and predictive analysis creates statistical models of the data that allow predictions of future values and events.
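The alerting behaviour described for the data monitoring and alerting component (an action is performed once a metric goes below or above a baseline) might be sketched as follows. The `Alert` class, the `check_alerts` function and the metric names are hypothetical and do not correspond to any particular BI product; the action is passed in as an ordinary function so that it could send an email, raise a visual alert or start an application.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional


@dataclass
class Alert:
    """Hypothetical alert rule placed on a single metric."""
    metric: str
    action: Callable[[str, float], None]
    low: Optional[float] = None   # fire when the value drops below this baseline
    high: Optional[float] = None  # fire when the value rises above this baseline


def check_alerts(metrics: Dict[str, float], alerts: List[Alert]) -> List[str]:
    """Run every alert rule against the current metric values."""
    fired = []
    for alert in alerts:
        value = metrics.get(alert.metric)
        if value is None:
            continue  # no current reading for this metric
        below = alert.low is not None and value < alert.low
        above = alert.high is not None and value > alert.high
        if below or above:
            alert.action(alert.metric, value)
            fired.append(alert.metric)
    return fired


# Usage: record notifications in a list instead of sending real emails.
sent = []
rules = [
    Alert("orders_last_4h", action=lambda m, v: sent.append((m, v)), low=10),
    Alert("complaints_month", action=lambda m, v: sent.append((m, v)), high=5),
]
fired = check_alerts({"orders_last_4h": 4, "complaints_month": 3}, rules)
```

Keeping the baseline check separate from the action means the same rule can be reused with different responses, which matches the range of actions the table lists.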

Each BI component shown in Table 15.2 has generated a fast-growing market for specialised tools. Thanks to technological advancements, the components can interact with other components to form a truly open architecture. As a matter of fact, you can integrate multiple tools from different vendors into a single BI framework. Table 15.3 shows a sample of common BI tools and vendors.

Table 15.3 Sample of business intelligence tools

Tool: Dashboards and business activity monitoring
Description: Dashboards use Web-based technologies to present key business performance indicators or information in a single integrated view, generally using graphics that are clear, concise and easy to understand.
Sample vendors: Salesforce, IBM/Cognos, BusinessObjects, Information Builders, iDashboards, Tableau

Tool: Portals
Description: Portals provide a unified, single point of entry for information distribution. Portals are a Web-based technology that use a Web browser to integrate data from multiple sources into a single Web page. Many different types of BI functionality can be accessed through a portal.
Sample vendors: Oracle Portal, Actuate, Microsoft, SAP

Tool: Data analysis and reporting tools
Description: These advanced tools are used to query multiple and diverse data sources to create integrated reports.
Sample vendors: Microsoft Reporting Services, MicroStrategy, SAS Web Report Studio

Tool: Data-mining tools
Description: These tools provide advanced statistical analysis to uncover problems and opportunities hidden within business data.
Sample vendors: SAP, Teradata, MicroStrategy, Hadoop

Tool: Data warehouses (DW)
Description: The data warehouse is the foundation of a BI infrastructure. Data are captured from the production system and placed in the DW on a near real-time basis. BI provides company-wide integration of data and the capability to respond to business issues in a timely manner.
Sample vendors: Amazon Redshift, Oracle Exadata, IBM DB2, Azure

Tool: OLAP tools
Description: Online analytical processing provides multidimensional data analysis.
Sample vendors: IBM/Cognos, MicroStrategy, ioCube, Apache Kylin

Tool: Data visualisation
Description: These tools provide advanced visual analysis and techniques to enhance understanding and create additional insight into business data and its true meaning.
Sample vendors: Dundas, Tableau, QlikView, Actuate

As depicted in Figure 15.1, BI integrates people and processes using technology to add value to the business. Such value is derived from how end users apply such information in their daily activities, and particularly in their daily business decision making.

The focus of traditional information systems was on operational automation and reporting; in contrast, BI tools focus on the strategic and tactical use of information. To achieve this goal, BI recognises that technology alone is not enough. Therefore, BI uses an arrangement of best management practices to manage data as a corporate asset. One of the most recent developments in this area is the use of master data management techniques.

Master data management (MDM) is a collection of concepts, techniques and processes for the proper identification, definition and management of data elements within an organisation. MDM's main goal is to provide a comprehensive and consistent definition of all data within an organisation. MDM ensures that all company resources (people, procedures and IT systems) that work with data have uniform and consistent views of the company's data. An added benefit of this meticulous approach to data management and decision making is that it provides a framework for business governance. Governance is a method or process of government. In this case, BI provides a method for controlling and monitoring business health and for consistent decision making. Furthermore, having such governance creates accountability for business decisions.

In the present age of business flux, accountability is increasingly important. Had governance been as pivotal to business operations in previous years, crises precipitated by Enron, WorldCom, Arthur Andersen and the 2008 financial meltdown might have been avoided. Monitoring a business's health is crucial to understanding where the company is and where it is headed. To do this, BI makes extensive use of a special type of metrics known as key performance indicators.

Key performance indicators (KPIs) are quantifiable numeric or scale-based measurements that assess the company's effectiveness or success in reaching its strategic and operational goals. Many different KPIs are used by different industries. Some examples of KPIs are:

General. Year-to-year measurements of profit by line of business, same-store sales, product turnovers, product recalls, sales by promotion and sales by employee.

Finance. Earnings per share, profit margin, revenue per employee, percentage of sales to account receivable and assets to sales.

Human resources. Applicants to job openings, employee turnover and employee longevity.

Education. Graduation rates, number of incoming first-years, student retention rates, publication rates and teaching evaluation scores.

KPIs are determined after the main strategic, tactical and operational goals are defined for a business. To tie the KPI to the strategic master plan of an organisation, a KPI is compared to a desired goal within a specific time frame. For example, if you are in an academic environment, you might be interested in ways to measure student satisfaction or retention. In this case, a sample goal would be to increase the final exam grades of graduating high school seniors by autumn 2022. Another sample KPI would be to increase the returning student rate from first year to second year from 60 per cent to 75 per cent by 2022. In this case, such performance would be measured and monitored on a year-to-year basis, and plans to achieve such a goal would be set in place.

Although BI has an unquestionably important role in modern business operations, the manager must initiate the decision support process by asking the appropriate questions. The BI environment exists to support the manager; it does not replace the management function. If the manager fails to ask the appropriate questions, problems will not be identified and solved, and opportunities will be missed. In spite of the very powerful BI presence, the human component is still at the centre of business technology.

The main BI architectural components were illustrated in Figure 15.1 and further explained in Tables 15.2 and 15.3. However, the heart of the BI system is its advanced information generation and decision support capabilities. A BI system's advanced decision support functions come to life via its intuitive and informational user interface, and particularly its reporting capabilities. A modern BI system provides three distinctive reporting styles:

Advanced reporting. A BI system presents insightful information about the organisation in a variety of presentation formats. Furthermore, the reports provide interactive features that allow the end user to study the data from multiple points of view, from highly summarised to very detailed data. The reports present key actionable information used to support decision making.

Monitoring and alerting. After a decision has been made, the BI system offers ways to monitor the decision outcome. The BI system provides the end user with ways to define metrics and other key performance indicators to evaluate different aspects of an organisation. In addition, exceptions and alerts can be set to warn managers promptly about deviations or problem areas.

Advanced data analytics. A BI system provides tools to help the end user discover relationships, patterns and trends hidden within the organisation's data. These tools are used to create two types of data analysis: explanatory and predictive. Explanatory analysis provides ways to discover relationships, trends and patterns among data, while predictive analysis provides the end user with ways to create models that predict future outcomes.

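As a rough illustration of how a KPI can be compared with a desired goal within a time frame, and how an alert might be raised when performance deviates from plan, the following Python sketch uses the returning-student example from this section (60 per cent to 75 per cent by 2022). The class name, the function names, the 2020 start year and the straight-line "on plan" rule are all assumptions made for illustration; they are not taken from any particular BI product.

```python
from dataclasses import dataclass


@dataclass
class Kpi:
    """A KPI tied to a desired goal within a specific time frame."""
    name: str
    baseline: float      # measured value at the start of the time frame
    target: float        # desired goal
    start_year: int      # assumed start of the time frame (hypothetical)
    deadline_year: int   # year by which the goal should be reached


def progress(kpi, current):
    """Fraction of the baseline-to-target gap that has been closed so far."""
    gap = kpi.target - kpi.baseline
    if gap == 0:
        return 1.0
    return (current - kpi.baseline) / gap


def is_behind_plan(kpi, current, year):
    """Alert rule (an assumption): flag the KPI when progress lags a
    straight-line path from baseline to target."""
    if kpi.deadline_year <= kpi.start_year:
        raise ValueError("deadline_year must be after start_year")
    elapsed = (year - kpi.start_year) / (kpi.deadline_year - kpi.start_year)
    return progress(kpi, current) < elapsed


# Returning-student rate: 60 per cent rising to 75 per cent by 2022.
retention = Kpi("returning student rate (%)", 60.0, 75.0, 2020, 2022)

# Halfway through the time frame the measured rate is 66 per cent.
if is_behind_plan(retention, 66.0, 2021):
    print("ALERT:", retention.name, "is behind plan")
```

A real BI monitoring tool would normally read the measured values from the data warehouse and let the manager define the alert rule; the straight-line rule here is only one simple choice.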

Understanding the architectural components of a BI framework is the first step in properly implementing BI in an organisation. A good BI infrastructure promises many benefits to an organisation, as outlined in the next section.

15.2.2 Business Intelligence Benefits

As you have learnt in previous sections, a properly implemented BI architecture could provide a framework for continuous performance improvements and business decision making. Improved decision making is the main goal of BI, but BI provides other benefits:

Integrating architecture. Like any other IT project, BI has the potential of becoming the integrating umbrella for a disparate mix of IT systems within an organisation. This architecture could support all types of company-generated data from operational to executive, as well as diverse hardware such as mainframes, servers, desktops, laptops and mobile devices.

Common user interface for data reporting and analysis. BI front ends can provide up-to-the-minute consolidated information using a common interface for all company users. IT departments no longer have to provide multiple training options for diverse interfaces. End users benefit from similar or common interfaces in different devices that use multiple clever and insightful presentation formats.

Common data repository fosters single version of company data. In the past, multiple IT systems supported different aspects of an organisation's operations. Such systems collected and stored data in separate data stores. Keeping the data synchronised and up-to-date has always been difficult. BI provides a framework to integrate such data under a common environment and present a single version of the data.

Improved organisational performance. BI can provide competitive advantages in many different areas, from customer support to manufacturing processes. Such advantages can be reflected in added efficiency, reduced waste, increased sales, reduced employee and customer turnover, and most importantly, an increased bottom line for the business.

Achieving all these benefits takes a lot of human, financial and technological resources, not to mention time. BI benefits are not achieved overnight, but are the result of a focused company-wide effort that could take a long time. As a matter of fact, as you will learn in the next section, the BI field has evolved over a long period of time itself.

15.2.3 Business Intelligence Evolution

Providing useful information to end users has been a priority of IT systems since mainframe computing became an integral part of corporations. Business decision support has evolved over many decades. Following computer technology advances, business intelligence started with centralised reporting systems and evolved into today's highly integrated BI environments. Table 15.4 summarises the evolution of BI systems.


TABLE 15.4 Business intelligence evolution

System type: Traditional mainframe-based online transaction processing (OLTP)
Data source: Operational data
Data extraction/integration process: None. Reports read and summarise data directly from operational data
Data store: None. Temporary files used for reporting purposes
End user query tool: Very basic. Predefined reporting formats. Basic sorting, totalling and averaging
End user presentation tool: Very basic. Menu-driven, predefined reports, text and numbers only

System type: Managerial information system (MIS)
Data source: Operational data
Data extraction/integration process: Basic extraction and aggregation. Read, filter and summarise operational data into intermediate data store
Data store: Lightly aggregated data in RDBMS
End user query tool: Same as above, in addition to some ad hoc reporting using SQL
End user presentation tool: Same as above, in addition to some ad hoc columnar report definitions

System type: First-generation departmental decision support system (DSS)
Data source: Operational data; external data
Data extraction/integration process: Data extraction and integration process populates DSS data store. Run periodically
Data store: First DSS database generation. Usually RDBMS
End user query tool: Query tool with some analytical capabilities and predefined reports
End user presentation tool: Spreadsheet style. Advanced presentation tools with plotting and graphics capabilities

System type: First-generation BI
Data source: Operational data; external data
Data extraction/integration process: Advanced data extraction and integration. Access diverse data sources, filters, aggregations, classifications, scheduling and conflict resolution
Data store: Data warehouse. RDBMS technology. Optimised for query purposes. Star schema model
End user query tool: Same as above
End user presentation tool: Same as above, in addition to multidimensional presentation tools with drill-down capabilities

System type: Second-generation BI. Online analytical processing (OLAP)
Data source: Same as above
Data extraction/integration process: Same as above
Data store: Data warehouse stores data in MDBMS. Cubes with multiple dimensions
End user query tool: Adds support for end-user-based data analytics
End user presentation tool: Same as above, but uses cubes and multidimensional matrixes; limited by terms of cube size. Dashboards, scorecards, portals

System type: Third-generation BI. Mobile BI, cloud-based, Big Data
Data source: Same as above but includes social media, IoT and machine-generated data
Data extraction/integration process: Same as above
Data store: Same as above. Cloud-based. Hadoop and NoSQL databases
End user query tool: Advanced analytics
End user presentation tool: Mobile devices: iPhone, iPad, Pixel, Galaxy Note. Flexible interactions via data visualisation


Using Table 15.4 as a guide, you can trace business intelligence from the mainframe environment to the desktop and then to the more current cloud-based, mobile BI environments. (Chapter 17, Database Connectivity and Web Technologies, provides a detailed discussion of cloud-based systems.)

The precursor of the modern BI environment was the first-generation decision support system. A decision support system (DSS) is an arrangement of computerised tools used to assist managerial decision making. A DSS typically has a much narrower focus and reach than a BI solution. At first, decision support systems were the realm of a few selected managers in an organisation. Over time, and with the introduction of the desktop computer, decision support systems migrated to more agile platforms, such as commodity servers, high-end servers, appliances and cloud-based offerings. This evolution effectively changed the reach of decision support systems; BI is no longer limited to a small group of top-level managers with training in statistical modelling. Instead, BI is now available to all users in an organisation, from line managers to the shop floor to mobile agents in the field.

You can also use Table 15.4 to track the evolution of information dissemination styles used in business intelligence:

Starting in the late 1970s, the need for information distribution was filled by centralised reports running on mainframes, minicomputers or even central server environments. Such reports were predefined and took considerable time to process.

With the introduction of desktop computers in the 1980s, a new style of information distribution, the spreadsheet, emerged as the dominant format for decision support systems. In this environment, managers downloaded information from centralised data stores and manipulated the data in desktop spreadsheets.

As the use of spreadsheets multiplied, IT departments tried to manage the flow of data in a more formal way using enterprise reporting systems. These systems were developed in the early 1990s and basically integrated all data into an IT umbrella that started with the first-generation DSS. The systems still used spreadsheet-like features with which end users were familiar.

Once DSSs were established, the evolution of business intelligence flourished with the introduction of the data warehouse and online analytical processing systems (OLAPs) in the mid-1990s. You can find out more about OLAP in Section 15.7 of this chapter.

Rapid changes in information technology and the internet revolution led to the introduction of advanced BI systems such as Web-based dashboards in the early 2000s and mobile BI later in the decade. With mobile BI, end users access BI reports via native applications that run on a mobile smart device, such as an iPhone, Google Pixel or iPad.

Figure 15.2 depicts the evolution of BI information dissemination.


FIGURE 15.2 Evolution of BI information dissemination formats
Timeline: 1970s, 1980s, 1990s, 2000s, 2010s, Present
Formats shown: centralised reporting, spreadsheets, enterprise reporting, OLAP, dashboards, mobile BI, Big Data analytics/Hadoop/NoSQL/data visualisation
Credit: Oleksiy Mark/Shutterstock.com
SOURCE: Course Technology/Cengage Learning

Although still in its infancy, mobile BI technology is poised to have a significant impact on the way BI information is disseminated and processed. If the number of students using smartphones to communicate with friends, update their Facebook status and send tweets on Twitter is any indicator, you can expect the next generation of consumers and workers to be highly mobile. Leading corporations are therefore starting to push decision making to agents in the field to facilitate customer relationships, sales and ordering, and product support. Such mobile technologies are so portable and interactive that some users call them disruptive technologies.

BI information technology has evolved from centralised reporting styles to the current, mobile BI style in just over a decade. The rate of technological change is not slowing down; to the contrary, technology advancements are accelerating the adoption of BI to new levels. The next section illustrates some BI technology trends.

15.2.4 Business Intelligence Technology Trends

Several technological advances are driving the growth of business intelligence technologies. These advances create new generations of more affordable products and services that are faster and easier to use. In turn, such products and services open new markets and work as driving forces in the increasing adoption of business intelligence technologies within organisations. Some of the more remarkable technological trends are:

Data storage improvements. New data storage technologies, such as solid state drives (SSDs) and Serial Advanced Technology Attachment (SATA) drives, offer increased performance and larger capacity that make data storage faster and more affordable. Currently, you can buy single drives with a capacity approaching 16 terabytes.


Business intelligence appliances. Vendors now offer plug-and-play appliances optimised for data warehouse and BI applications. These new appliances offer improved price-performance ratios, simplified administration, rapid installation, scalability and fast integration. Some of these vendors include IBM, Netezza, EMC, Greenplum and Teradata Aster.

Business intelligence as a service. Vendors now offer data warehouses as a service. These cloud-based services allow any corporation to rapidly develop a data warehouse store without the need for hardware, software or extra personnel. These prepackaged services offer pay-as-you-go models for specific industries and capacities, and they provide an opportunity for organisations to pilot-test a BI project without incurring large time or cost commitments.

Big Data analytics. Organisations are turning to social media as the new source for information and knowledge to gain competitive advantages. The Big Data phenomenon has created a new market for data analytics.

Personal analytics. OLAP brought data analytics to the desktop of every end user in the organisation. Mobile BI is extending business decision making outside the walls of the organisation. BI can now be deployed to mobile users who are closer to customers. There is a growing trend for self-service, personalised data analytics.

Other vendors named for these services and tools include IBM, Microsoft, Oracle, SAP, Teradata, MicroStrategy and Tableau.

One constant in this relentless technological evolution is the need for better decision support data and the importance of understanding the difference between decision support data and operational data.

15.3 Decision Support Data

Although BI is used at strategic and tactical managerial levels within an organisation, its effectiveness depends on the quality of data gathered at the operational level. Yet operational data is seldom well suited to decision support tasks. The differences between operational data and decision support data are examined in the following section.

15.3.1 Operational Data vs Decision Support Data

Operational data and decision support data serve different purposes. Therefore, it is not surprising to learn that their formats and structures differ.

Most operational data are stored in a relational database, in which the structures (tables) tend to be highly normalised. Operational data storage is optimised to support transactions that represent daily operations. For example, each time an item is sold, it must be accounted for. Customer data, inventory data and so on are in a frequent update mode. To provide effective update performance, operational systems store data in many tables, each with a minimum number of fields. Thus, a simple sales transaction might be represented by five or more different tables (for example, INVOICE, INVOICE LINE, DISCOUNT, STORE and DEPARTMENT). Although such an arrangement is excellent in an operational database, it is not query-friendly. For example, to extract a simple invoice, you would have to join several tables.

Whereas operational data capture daily business transactions, decision support data give tactical and strategic business meaning to the operational data. From the data analyst's point of view, DSS data differ from operational data in three main areas: time span, granularity and dimensionality.


Time span. Operational data cover a short time frame. In contrast, decision support data tend to cover a longer time frame. Managers are seldom interested in a specific sales invoice to customer X; rather, they tend to focus on sales generated during the past month, the past year or the past five years.

Granularity (level of aggregation). Decision support data must be presented at different levels of aggregation, from highly summarised to near-atomic. For example, if managers must analyse sales by region, they must be able to access data showing the sales by region, by city within the region, by store within the city within the region, and so on. In that case, managers require summarised data to compare the regions, but they also need data in a structure that enables them to drill down, or decompose, the data into more atomic components (that is, finer-grained data at lower levels of aggregation). In contrast, when you roll up the data, you are aggregating the data to a higher level.

Dimensionality. Operational data focus on representing individual transactions rather than the effects of the transactions over time. In contrast, data analysts tend to include many data dimensions and are interested in how the data relate over those dimensions. For example, an analyst might want to know how product X fared relative to product Z during the past six months by region, province, city, store and customer. In this case, both place and time are part of the picture.

Figure 15.3 shows how decision support data can be examined from multiple dimensions (such as product, region and year), using a variety of filters to produce each dimension. The ability to analyse, extract and present information in meaningful ways is one of the differences between decision support data and transaction-at-a-time operational data.

FIGURE 15.3 Transforming operational data into decision support data
Operational data: Operational data have a narrow time span, low granularity and single focus. Such data are usually presented in tabular format, in which each row represents a single transaction. This format often makes it difficult to derive useful information.
Decision support data: Decision support system (DSS) data focus on a broader timespan, tend to have high levels of granularity, and can be examined in multiple dimensions. For example, note these possible aggregations: sales by product, region, agent, etc.; sales for all years or only a few selected years; sales for all products or only a few selected products.
Dimensions shown: Time, Product, Agent, Region, Sales

Online Content: The operational data in Figure 15.3 are found on the online platform for this book. The decision support data in Figure 15.3 show the output for the solution to Problem 2 at the end of this chapter.
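The roll-up and drill-down ideas discussed for Figure 15.3 can be sketched in a few lines of Python. The sample rows, field names and the aggregate function below are hypothetical and exist only for illustration; they are not the data set behind Figure 15.3. Each row stands for one operational sales transaction, and the same rows are summarised at different levels of aggregation and over different dimensions.

```python
from collections import defaultdict

# Hypothetical operational data: one row per sales transaction.
# Fields: region, city, store, product, year, amount (euros).
FIELDS = ("region", "city", "store", "product", "year")
SALES = [
    ("North", "Leeds", "S1", "X", 2019, 120.0),
    ("North", "Leeds", "S2", "X", 2019, 80.0),
    ("North", "York", "S3", "Z", 2020, 50.0),
    ("South", "Bath", "S4", "X", 2020, 200.0),
    ("South", "Bath", "S4", "Z", 2020, 25.0),
]


def aggregate(rows, dims):
    """Total the sales amount for each combination of the named dimensions."""
    positions = [FIELDS.index(d) for d in dims]
    totals = defaultdict(float)
    for row in rows:
        key = tuple(row[p] for p in positions)
        totals[key] += row[-1]   # the amount is the last field
    return dict(totals)


# Rolled up: one summary figure per region.
by_region = aggregate(SALES, ["region"])

# Drilled down one level: city within region.
by_city = aggregate(SALES, ["region", "city"])

# A different pair of dimensions: product over time.
by_product_year = aggregate(SALES, ["product", "year"])

for key, total in sorted(by_region.items()):
    print(key, total)
```

In a decision support database these summaries would typically be precomputed and stored rather than recalculated from every transaction on each query; the sketch only shows what the different levels of aggregation contain.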

From the designer's point of view, the differences between operational and decision support data are as follows:

Operational data represent transactions as they happen, in real time. Decision support data are a snapshot of the operational data at a given point in time. Therefore, decision support data are historic, representing a time slice of the operational data.

Operational and decision support data are different in terms of the transaction type and transaction volume. Whereas operational data are characterised by update transactions, DSS data are mainly characterised by query (read-only) transactions. Decision support data also require periodic updates to load new data that are summarised from the operational data. Finally, the transaction volume in operational data tends to be very high when compared with the low-to-medium levels found in decision support data.

Operational data are commonly stored in many tables, and the stored data represent the information about a given transaction only. Decision support data are generally stored in a few tables that store data derived from the operational data. The decision support data do not include the details of each operational transaction. Instead, decision support data represent transaction summaries; therefore, the decision support database stores data that are integrated, aggregated and summarised for decision support purposes.

The degree to which decision support data are summarised is very high when contrasted with operational data. Therefore, you will see a great deal of derived data in decision support databases. For example, rather than storing all 10 000 sales transactions for a given store on a given day, the decision support database might simply store the total number of units sold and the total sales euros generated during that day. Decision support data might be collected to monitor such aggregates as total sales for each store or for each product. The purpose of the summaries is simple: they are to be used to establish and evaluate sales trends, product sales comparisons, and so on, that serve decision needs. (How well are items selling? Should this product be discontinued? Has the advertising been effective as measured by increased sales?)

The data models that govern operational data and decision support data are different. The operational database's frequent and rapid data updates make data anomalies a potentially devastating problem. Therefore, the data requirements in a typical relational transaction (operational) system generally require normalised structures that yield many tables, each of which contains the minimum number of attributes. In contrast, the decision support database is not subject to such transaction updates, and the focus is on querying capability. Therefore, decision support databases tend to be non-normalised and include few tables, each of which contains a large number of attributes.

Query activity (frequency and complexity) in the operational database tends to be low to allow additional processing cycles for the more crucial update transactions. Therefore, queries against operational data typically are narrow in scope, low in complexity and speed-critical. In contrast, decision support data exist for the sole purpose of serving query requirements. Queries against decision support data typically are broad in scope, high in complexity and less speed-critical.

Finally, decision support data are characterised by large amounts of data. The large data volume is the result of two factors. First, data are stored in non-normalised structures that are likely to display many data redundancies and duplications. Second, the same data can be categorised in many different ways to represent different snapshots. For example, sales data might be stored in relation to product, store, customer, region and manager.

Table 15.5 summarises the differences between operational and decision support data from the database designer's point of view.

TABLE 15.5 Contrasting operational and decision support data characteristics

Characteristic         Operational Data                            Decision Support Data
Data currency          Current operations; real-time data          Historic data; snapshot of company data; time component (week/month/year)
Granularity            Atomic-detailed data                        Summarised data
Summarisation level    Low; some aggregate yields                  High; many aggregation levels
Data model             Highly normalised; mostly relational DBMS   Non-normalised; complex structures; some relational, but mostly multidimensional DBMS
Transaction type       Mostly updates                              Mostly query
Transaction volumes    High update volumes                         Periodic loads and summary calculations
Transaction speed      Updates are critical                        Retrievals are critical
Query activity         Low to medium                               High
Query scope            Narrow range                                Broad range
Query complexity       Simple to medium                            Very complex
Data volumes           Hundreds of gigabytes                       Hundreds of terabytes to petabytes

The many differences between operational data and decision support data are good indicators of the requirements of the decision support database, described in the next section.

15.3.2 Decision Support Database Requirements

A decision support database is a specialised DBMS tailored to provide fast answers to complex queries. There are four main requirements for a decision support database: the database schema, data extraction and loading, the end-user analytical interface and database size.

Database Schema

The decision support database schema must support complex (non-normalised) data representations. As noted earlier, the decision support database must contain data that are aggregated and summarised. In addition to meeting those requirements, the queries must be able to extract multidimensional time slices. If you are using an RDBMS, the conditions suggest using non-normalised and even duplicated data. To see why this must be true, take a look at the ten-year sales history for a single store containing a single department. At this point, the data are fully normalised within the single table, as shown in Table 15.6.

TABLE 15.6 Ten-year sales history for a single department, millions of euros

Year    Sales
2010     8 227
2011     9 109
2012    10 104
2013    11 553
2014    10 018
2015    11 875
2016    12 699
2017    14 875
2018    16 301
2019    19 986

This structure works well when you have only one store with only one department. However, it is very unlikely that such a simple environment has much need for a decision support database. One would suppose that a decision support database becomes a factor when dealing with more than one store, each of which has more than one department. To support all of the decision support requirements, the database must contain data for all of the stores and all of their departments, and the database must be able to support multidimensional queries that track sales by stores, by departments and over time. For simplicity, suppose there are only two stores (A and B) and two departments (1 and 2) within each store. Let's also change the time dimension to include yearly data. Table 15.7 shows the sales figures under the specified conditions. Only 2010, 2014 and 2019 are shown; ellipses (...) indicate that data values were omitted.

TABLE 15.7 Yearly sales summaries, two stores and two departments per store, millions of euros

Year    Store    Department    Sales
2010    A        1             1 985
2010    A        2             2 401
2010    B        1             1 879
2010    B        2             1 962
...     ...      ...           ...
2014    A        1             3 912
2014    A        2             4 158
2014    B        1             3 426
2014    B        2             1 203
...     ...      ...           ...
2019    A        1             7 683
2019    A        2             6 912
2019    B        1             3 768
2019    B        2             1 623
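As an illustration of the multidimensional queries this section describes (sales tracked by store, by department and over time), the following minimal sketch rolls up the figures listed in Table 15.7. The function name roll_up and the tuple layout are assumptions made for this example only; a decision support DBMS would normally do this work with its own aggregation facilities.

```python
from collections import defaultdict

# (year, store, department, sales in millions of euros), copied from Table 15.7.
SALES = [
    (2010, "A", 1, 1985), (2010, "A", 2, 2401),
    (2010, "B", 1, 1879), (2010, "B", 2, 1962),
    (2014, "A", 1, 3912), (2014, "A", 2, 4158),
    (2014, "B", 1, 3426), (2014, "B", 2, 1203),
    (2019, "A", 1, 7683), (2019, "A", 2, 6912),
    (2019, "B", 1, 3768), (2019, "B", 2, 1623),
]

# Position of each dimension inside a row (hypothetical layout for this sketch).
DIMENSIONS = {"year": 0, "store": 1, "department": 2}


def roll_up(rows, *dims):
    """Total the sales for each combination of the named dimensions."""
    totals = defaultdict(int)
    for row in rows:
        key = tuple(row[DIMENSIONS[d]] for d in dims)
        totals[key] += row[3]
    return dict(totals)


by_year = roll_up(SALES, "year")                 # one total per year
by_store_year = roll_up(SALES, "store", "year")  # one total per store and year
by_department = roll_up(SALES, "department")     # one total per department

for (year,), total in sorted(by_year.items()):
    print(year, total)
```

The same rows answer three different questions simply by changing the dimensions passed to roll_up, which is the kind of flexibility the decision support schema has to provide.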

If you examine Table 15.7, you can see that the number of rows and attributes already multiplies quickly and that the table exhibits multiple redundancies. Now suppose that the company has ten departments per store and 20 stores nationwide. And suppose you want to access yearly sales summaries. Now you are dealing with 200 rows and 12 monthly sales attributes per row. (Actually, there are 15 attributes per row if you add each store's sales total for each year.)

The decision support database schema must also be optimised for query (read-only) retrievals. To optimise query speed, the DBMS must support features such as bitmap indexes and data partitioning to increase search speed. In addition, the DBMS query optimiser must be enhanced to support the non-normalised and complex structures found in decision support databases.

Data Extraction and Filtering

The decision support database is created largely by extracting data from the operational database and by importing additional data from external sources. Thus, the DBMS must support advanced data extraction and filtering tools. To minimise the impact on the operational database, the data extraction capabilities should allow batch and scheduled data extraction. The data extraction capabilities should also support different data sources: flat files and hierarchical, network and relational databases, as well as multiple vendors. Data filtering capabilities must include the ability to check for inconsistent data or data validation rules. Finally, to filter and integrate the operational data into the decision support database, the DBMS must support advanced data integration, aggregation and classification.

Using data from multiple external sources also usually means having to solve data-formatting conflicts. For example, data such as dates and ID numbers may occur in different formats, measurements may be based on different scales, and the same data elements may have different names. In short, data must be filtered and purified to ensure that only the pertinent decision support data are stored in the database and that they are stored in a standard format.

Database Size

Decision support databases tend to be enormous; terabyte and petabyte ranges are not unusual. For example, in 2017, Wal-Mart had more than 40 petabytes of data in its data warehouses. Therefore, the DBMS must be capable of supporting very large databases (VLDBs). To support a VLDB adequately, the DBMS might be required to use advanced hardware, such as multiple disk arrays, and, even more importantly, to support multiple-processor technologies, such as a symmetric multiprocessor (SMP) or a massively parallel processor (MPP).

The complex information requirements and the ever-growing demand for sophisticated data analysis sparked the creation of a new type of data repository. This repository, called a data warehouse, contains data in formats that facilitate data extraction, data analysis and decision making.

15.4 The Data Warehouse

Bill Inmon, the acknowledged 'father' of the data warehouse, defines the term as an integrated, subject-oriented, time-variant, nonvolatile collection of data that provides support for decision making.2 To understand that definition, let's take a more detailed look at its components.

2 Inmon, B. and Kelley, C. The twelve rules of data warehouse for a client/server world, Data Management Review, 4(5), pp. 6-16, May 1994.

Integrated. The data warehouse is a centralised, consolidated database that integrates data derived from the entire organisation and from multiple sources with diverse formats. Data integration implies that all business entities, data elements, data characteristics and business metrics are described in the same way throughout the enterprise. Although this requirement sounds logical, you would be amazed to discover how many different measurements for sales performance can exist within an organisation; the same scenario holds true for any other business element. For instance, the status of an order might be indicated with text labels such as 'open', 'received', 'cancel' and 'closed' in one department and as 1, 2, 3 and 4 in another department. A student's status might be defined as undergraduate year 1, undergraduate year 2, undergraduate year 3 or postgraduate in the accounting department and as UG1, UG2, UG3 or PG in the computer information systems department. To avoid the potential format tangle, the data in the data warehouse must conform to a common format acceptable throughout the organisation. This integration can be time-consuming, but once accomplished, it enhances decision making and helps managers better understand the company's operations. This understanding can be translated into recognition of strategic business opportunities.

Subject-orientated. Data warehouse data are arranged and optimised to provide answers to questions coming from diverse functional areas within a company. Data warehouse data are organised and summarised by topic, such as sales, marketing, finance, distribution and transportation. For each topic, the data warehouse contains specific subjects of interest: products, customers, departments, regions, promotions and so on. This form of data organisation is quite different from the more functional or process-orientated organisation of typical transaction systems. For example, an invoicing system designer concentrates on designing normalised data structures (relational tables) to support the business process by storing invoice components in two tables: INVOICE and INVLINE. In contrast, the data warehouse has a subject orientation. Data warehouse designers focus specifically on the data rather than on the processes that modify the data. (After all, data warehouse data are not subject to numerous real-time data updates!) Therefore, instead of storing an invoice, the data warehouse stores its sales by product and sales by customer components, because decision support activities require the retrieval of sales summaries by product or customer.

Time-variant. In contrast to operational data, which focus on current transactions, warehouse data represent the flow of data through time. The data warehouse can even contain projected data generated through statistical and other models. It is also time-variant in the sense that once data are periodically uploaded to the data warehouse, all time-dependent aggregations are recomputed. For example, when data for previous weekly sales are uploaded to the data warehouse, the weekly, monthly, yearly and other time-dependent aggregates for products, customers, stores and other variables are also updated. Because data in a data warehouse constitute a snapshot of the company history as measured by its variables, the time component is crucial. The data warehouse contains a time ID that is used to generate summaries and aggregations by week, month, quarter, year and so on. Once the data enter the data warehouse, the time ID assigned to the data cannot be changed.

Non-volatile. Once data enter the data warehouse, they are never removed. Because the data in the warehouse represent the company's history, the operational data, representing the near-term history, are always added to it. Because data are never deleted and new data are continually added, the data warehouse is always growing. That is why the DSS DBMS must be able to support multigigabyte and even multiterabyte databases and multiprocessor hardware.

In contrast to Bill Inmon's definition, Ralph Kimball provided a more comprehensive description, saying the data warehouse was '. . . a copy of transaction data specifically structured for query and analysis'.3

In this definition, Kimball focuses only on the functionality of the data warehouse and not on how it should be developed. Bill Inmon's approach to the development of the data warehouse is often referred to as the Top Down approach and revolves around the creation of a large, centralised, enterprise-wide data warehouse, which is linked to a number of departmental databases known as data marts. (You will learn more about data marts in Section 15.4.2.) In contrast, Ralph Kimball adopts a Bottom Up approach to the development of the data warehouse, by first integrating several data marts and then ensuring a consistent view of the data within an organisation. A further comparison between the two approaches is in terms of how the data in the data warehouse are structured. Bill Inmon's method structures data using the relational model in third normal form, whereas Ralph Kimball's method creates multidimensional models, i.e. star schema (Section 15.5). Despite these differences, both approaches have been successfully implemented in large organisations. With the advent of Big Data, Ralph Kimball recognises that the enterprise data warehouse has to transform and expand to be able to facilitate big data analytics.4

Table 15.8 summarises the differences between data warehouses and operational databases.

3 Kimball, R. The Data Warehouse Lifecycle Toolkit: Practical Techniques for Building Data Warehouse and Business Intelligence Systems, 2nd Edition. John Wiley & Sons, 2008.
4 Kimball, R. The Evolving Role of the Enterprise Data Warehouse in the Era of Big Data Analytics. Available: www.kimballgroup.com/

TABLE 15.8 A comparison of data warehouse and operational database characteristics

Characteristic: Integrated
Operational database data: Similar data can have different representations or meanings. For example, ID numbers may be stored as ######-####-### or as #############, and a given condition may be labelled as T/F or 0/1 or Y/N. A sales value may be shown in thousands or in millions.
Data warehouse data: Provide a unified view of all data elements with a common definition and representation for all business units.

Characteristic: Subject-orientated
Operational database data: Data are stored with a functional, or process, orientation. For example, data may be stored for invoices, payments and credit amounts.
Data warehouse data: Data are stored with a subject orientation that facilitates multiple views of the data and facilitates decision making. For example, sales may be recorded by product, by division, by manager or by region.

Characteristic: Time-variant
Operational database data: Data are recorded as current transactions. For example, the sales data may be the sale of a product on a given date, such as 342.78 on 12-MAY-2013.
Data warehouse data: Data are recorded with a historical perspective in mind. Therefore, a time dimension is added to facilitate data analysis and various time comparisons.

Characteristic: Non-volatile
Operational database data: Data updates are frequent and common. For example, an inventory amount changes with each sale. Therefore, the data environment is fluid.
Data warehouse data: Data cannot be changed. Data are added only periodically from historical systems. Once the data are properly stored, no changes are allowed. Therefore, the data environment is relatively static.
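The 'Integrated' row of Table 15.8 can be made concrete with a small sketch. The code below maps the three flag encodings the table mentions (T/F, 0/1, Y/N) onto one representation and converts sales values reported in thousands or in millions to a single unit. The function names and the choice of a Python boolean as the common format are assumptions made for illustration, not something the chapter prescribes.

```python
# Encodings named in Table 15.8; anything else is rejected rather than guessed.
TRUE_VALUES = {"T", "1", "Y"}
FALSE_VALUES = {"F", "0", "N"}

# Multipliers for the two reporting scales mentioned in the table.
UNIT_FACTORS = {"thousands": 1_000, "millions": 1_000_000}


def standardise_flag(raw):
    """Map a T/F, 0/1 or Y/N value onto a single boolean representation."""
    text = str(raw).strip().upper()
    if text in TRUE_VALUES:
        return True
    if text in FALSE_VALUES:
        return False
    raise ValueError(f"unrecognised flag value: {raw!r}")


def standardise_sales(value, unit):
    """Convert a sales figure reported in thousands or millions to base units."""
    return value * UNIT_FACTORS[unit]


# Three hypothetical source systems reporting the same condition differently.
flags = [standardise_flag(v) for v in ("T", 1, "y")]
print(flags)
print(standardise_sales(342, "thousands"))
```

Raising an error on an unknown encoding, instead of silently defaulting, reflects the requirement that only data conforming to the common definition reach the warehouse.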

In summary, the data warehouse is usually a read-only database optimised for data analysis and query processing. Typically, data are extracted from various sources and are then transformed and integrated (in other words, passed through a data filter) before being loaded into the data warehouse. As mentioned, this process to extract, transform and load the aggregated data into the data warehouse is known as ETL. Figure 15.4 illustrates the ETL process to create a data warehouse from operational data.

FIGURE 15.4 Creating a data warehouse
[Figure: operational data pass through extraction, transformation (filter, transform, integrate, classify, aggregate, summarise) and loading into the data warehouse, which is integrated, subject-oriented, time-variant and non-volatile.]
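The ETL steps shown in Figure 15.4 can be sketched in a few lines. The example below is a minimal illustration only: the operational rows, the table name sales_by_month and the three function names are hypothetical, and an in-memory SQLite database stands in for the data warehouse.

```python
import sqlite3

# Hypothetical operational rows: (invoice_date, product, amount).
OPERATIONAL_ROWS = [
    ("2019-01-05", "widget", 120.0),
    ("2019-01-17", "widget", 80.0),
    ("2019-02-02", "gadget", 200.0),
    ("2019-02-09", "gadget", None),  # incomplete row, dropped by the filter
]


def extract(rows):
    """Extraction: copy the rows out of the operational source."""
    return list(rows)


def transform(rows):
    """Filter out incomplete rows, then aggregate amounts by month and product."""
    totals = {}
    for invoice_date, product, amount in rows:
        if amount is None:
            continue
        key = (invoice_date[:7], product)  # "YYYY-MM" acts as the time component
        totals[key] = totals.get(key, 0.0) + amount
    return [(month, product, total)
            for (month, product), total in sorted(totals.items())]


def load(conn, summary_rows):
    """Loading: append the summarised rows to the warehouse table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS sales_by_month "
        "(month TEXT, product TEXT, total REAL)"
    )
    conn.executemany("INSERT INTO sales_by_month VALUES (?, ?, ?)", summary_rows)
    conn.commit()


conn = sqlite3.connect(":memory:")
load(conn, transform(extract(OPERATIONAL_ROWS)))
warehouse_rows = conn.execute(
    "SELECT month, product, total FROM sales_by_month ORDER BY month, product"
).fetchall()
print(warehouse_rows)
```

Note that the load step only appends rows. This mirrors the non-volatile nature of the warehouse, where data are added periodically and not updated in place.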

Although the centralised and integrated data warehouse can be an attractive proposition that yields many benefits, managers may be reluctant to embrace this strategy. Creating a data warehouse requires time, money and considerable managerial effort. Therefore, it is not surprising that many companies begin their foray into data warehousing by focusing on more manageable data sets that are targeted to meet the special needs of small groups within the organisation. These smaller data stores are called data marts.

15.4.1 Twelve Rules that Define a Data Warehouse

In 1994, William H. Inmon and Chuck Kelley created 12 rules defining a data warehouse, which summarise many of the points made in this chapter about data warehouses.5

1 The data warehouse and operational environments are separated.
2 The data warehouse data are integrated.
3 The data warehouse contains historical data over a long time horizon.
4 The data warehouse data are snapshot data captured at a given point in time.
5 The data warehouse data are subject-oriented.
6 No online updates are allowed.
7 The data warehouse development life cycle differs from classical systems development. The data warehouse development is data-driven; the classical approach is process-driven.
8 The data warehouse contains data with several levels of detail: current detail data, old detail data, lightly summarised data and highly summarised data.
9 The data warehouse environment is characterised by read-only transactions to very large data sets. The operational environment is characterised by numerous update transactions to a few data entities at a time.
10 The data warehouse environment has a system that traces data sources, transformations and storage.
11 The data warehouse's metadata are a critical component of this environment. The metadata identify and define all data elements. The metadata provide the source, transformation, integration, storage, usage, relationships and history of each data element.
12 The data warehouse contains a chargeback mechanism for resource usage that enforces optimal use of the data by end users.

Note how those 12 rules capture the complete data warehouse life cycle, from its introduction as an entity separate from the operational data store to its components, functionality and management processes.

Most data warehouse implementations are based on the relational database model, and their market share suggests that their popularity will not fade anytime soon. Relational data warehouses use the star schema design technique to handle multidimensional data.

5 Inmon, B. and Kelley, C. The twelve rules of data warehouse for a client/server world, Data Management Review, 4(5), pp. 6-16, May 1994.

Online Content: Further considerations about data warehouse development can be found in Appendix L, Data Warehouse Implementation Factors, located on the online platform for this book.
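Rules 10 and 11 describe metadata that trace each data element's source, transformation and storage and record its history. One way to picture such a record is sketched below. The class ElementMetadata, its field names and the example values are all hypothetical and serve only to illustrate the idea; the INVOICE and INVLINE table names are borrowed from the invoicing example earlier in the chapter.

```python
from dataclasses import dataclass, field


@dataclass
class ElementMetadata:
    """Hypothetical metadata record for one warehouse data element."""
    name: str
    source: str
    transformation: str
    storage: str
    usage: str = ""
    relationships: list = field(default_factory=list)
    history: list = field(default_factory=list)

    def log(self, event):
        """Append an event to the element's history."""
        self.history.append(event)


def trace(catalogue, name):
    """Return the source -> transformation -> storage lineage of an element."""
    meta = catalogue[name]
    return f"{meta.source} -> {meta.transformation} -> {meta.storage}"


catalogue = {}
monthly_sales = ElementMetadata(
    name="monthly_sales",
    source="INVOICE and INVLINE tables",
    transformation="sum of line amounts per product and month",
    storage="sales_by_month table",
    usage="monthly sales reporting",
)
catalogue[monthly_sales.name] = monthly_sales
monthly_sales.log("initial load")

lineage = trace(catalogue, "monthly_sales")
print(lineage)
```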

15.4.2 Data Marts

A data mart is a small, single-subject data warehouse subset that provides decision support to a small group of people. A data mart could also be created from the data extracted from a larger data warehouse for the specific purpose of supporting faster data access to a target group or function.

Some organisations choose to implement data marts not only because of the lower cost and shorter implementation time, but also because of the current technological advances and inevitable people issues that make data marts attractive. Powerful computers can provide a customised DSS to small groups in ways that might not be possible with a centralised system. Also, a company's culture may predispose its employees to resist major changes, but they might quickly embrace relatively minor changes that lead to demonstrably improved decision support. In addition, people at different organisational levels are likely to require data with different summarisation, aggregation and presentation formats.

Data marts can serve as a test vehicle for companies exploring the potential benefits of data warehouses. By migrating gradually from data marts to data warehouses, a specific department's decision support needs can be addressed within a reasonable time frame (six months to one year),

as compared to the longer time frame usually required to implement a data warehouse (one to three years). Information technology (IT) departments also benefit from this approach because their personnel have the opportunity to learn the issues and develop the skills required to create a data warehouse.

The only difference between a data mart and a data warehouse is the size and scope of the problem being solved. Therefore, the problem definitions and data requirements are essentially the same for both.

15.4.3 Designing and Implementing a Data Warehouse

Organisation-wide information system development is subject to many constraints. Some of the constraints are based on available funding. Others are a function of management's view of the role played by an IS department and of the extent and depth of the information requirements. Add the constraints imposed by corporate culture, and you understand why no single formula can describe perfect data warehouse development. Therefore, rather than proposing a single data warehouse design and implementation methodology, this section will identify a few factors that appear to be common to data warehousing.

The Data Warehouse as an Active Decision Support Framework

Perhaps the first thing to remember is that a data warehouse is not a static database. Instead, it is a dynamic framework for decision support that is, almost by definition, always a work in progress. Because it is the foundation of a decision support infrastructure, the design and implementation of the data warehouse means that you are involved in the design and implementation of a complete database-system-development infrastructure for company-wide decision support. Although it is easy to focus on the data warehouse database as the central data repository, you must remember that the decision support infrastructure includes hardware, software, people and procedures, as well as data. Therefore, the data warehouse design and implementation must be examined in light of the entire infrastructure.

A Company-wide Effort That Requires User Involvement

Designing a data warehouse means being given an opportunity to help develop an integrated data model that captures the data that are considered to be essential to the organisation, from both end-user and business perspectives. Data warehouse data cross departmental lines and geographical boundaries. Because the data warehouse represents an attempt to model all of the organisation's data, you are likely to discover that organisational components (divisions, departments, support groups and so on) often have conflicting goals, and it certainly will be easy to find data inconsistencies and damaging redundancies. Information is power, and the control of its sources and uses is likely to trigger turf battles, end-user resistance and power struggles at all levels. Building the perfect data warehouse is not just a matter of knowing how to create a star schema; it requires managerial skills to deal with conflict resolution, mediation and arbitration. In short, the designer must:

• Involve end users in the process.
• Secure end users' commitment from the beginning.
• Create continuous end-user feedback.
• Manage end-user expectations.
• Establish procedures for conflict resolution.

Great managerial skills are not, of course, solely sufficient. The technical aspects of the data warehouse must be addressed as well. The old adage of input-process-output repeats itself here. The data warehouse designer must satisfy:

• Data integration and loading criteria.
• Data analysis capabilities with acceptable query performance.
• End-user data analysis needs.

The foremost technical concern in implementing a data warehouse is to provide end-user decision support with advanced data analysis capabilities: at the right moment, in the right format, with the right data, and at the right cost.

Apply Database Design Procedures

You learnt about the database life cycle and the database design process in Chapters 10 and 11, so perhaps it is wise to begin with a review of the traditional database design procedures. These design procedures must then be adapted to fit the data warehouse requirements. If you remember that the data warehouse derives its data from operational databases, you will understand why a solid foundation in operational database design is important. (It's difficult to produce good data warehouse data when the operational database data are corrupted.) Figure 15.5 depicts a simplified process for implementing the data warehouse.

FIGURE 15.5 Data warehouse design and implementation road map
[Figure: a road map in five stages: initial data gathering; design and mapping; loading and testing; building and testing; rollout and feedback.]



One of the defining must


key differences

the

business

be described

Identify

how

Check

all

existing

data

when this

in

order

to

has

been

The following

level,

to

how The

the

a data

data

the

of

in

which

customer

a

nature.

mortgage,

as asubject

associated.

Types

Operational any

Historical section

of detail required

within

the

may be the

does

basis users

data

in

warehouse

number

of a

the

organisation

or an hourly

need

basis?

The

to

general

require.

required

can

actually

15.6.2).

be created

be obtained

from

the

and the

ETL

and one option

processes

be

of how to create

the

customer

that

data

warehouse,

main source

This type

However,

systems

routines

of

the

data is

that

many

database. ready

for

extracted

from

the

and

external formats

Savings

and

may have

data from and

Loans,

can

be

each

of

both

a savings

account

In

order to

store the

of the same

customer

may be different.

these two instances

of sources

different

departments, customer

must be at a high

must be

warehouse. The

This data can come

key is to

determine

which

directly

from

operational

data is

warehouse. useful

store

are therefore

as routines

not just

old systems

It

useful

include:

organisation. within

is

warehouse. relevant,

15.4 illustrates,

a number

data

DBMS

of data into the

the

Figure

an operational from

often in

two

The same

may be extracted

in

fields

data

accurate,

a warehouse

of data.

files from

each

a successful

building

Data is

may have

in

to

high-quality,

Typically,

market.

numbers

within

data

archives,

DBMSs.

data

is

contained

warehouse.

a bank

different

critical

phase in

the

stock

For example,

is

process

many sources

preselected

also from

data.

and transformation

(ETL)

warehouse

often

data

will be included

archived

model

design issues

loading

the

from

the

within the

all

from

takes

but

data in

The

dimensional

ETL process

into

created

example

or application

as not

fields

into

of data

data.

DBMS

on a daily what the

of data

process

is loaded

is

but the

customer

relevant

can the

extraction

for

contradictory

and

sold

most time-consuming

systems

company,

stores

measure

For example,

than

the level

Loading

that

and loading

the

finer

will explore

required

data

operational

outside

modelled

a star schema.

warehouse

process

current

data.

were

that

business

transformation,

This is the select

transformation

of the

grain

completed

sections

Transformation, that

and accessible. developed

one

ensure

model using

Extraction, ensure

be

a week.

product

for

15.4.4 the extraction,

must

a sales

sold in

or granularity

design

process is the level

is to

to:

been

detail

design

that

systems.

dimensional

The

process

For example,

has

sources

database

business

of a particular

is to

source

defined. the

of

many

of thumb

Each

measures. that

the level

know

Only

detail

product

Identify

rule

in

business

particular

from the traditional

model.

to

this

perform

data

are

required

to load

such

as budgets

predictive

often

the

analytics

obsolete.

(discussed

Unique

data in the

data

warehouse

in

extraction

during the

first time load.

15

Internal

data.

spread

sheets.

External

Sources

the internet)

and

it

can

available.

In

may require

main issue

organisation

for comparing of external

marketing

addition,

the

unique

one-off

may choose

the business

data include

data

be available

An organisation


within the

data. Important

competitive.

is that

Data

that

has

been

and

constant

format

of the

performance

real-time

at any time

The

monitoring data

which

to enable

data feeds,

purchased.

external

or sales forecasts,

an organisation

newspapers

main

problem

is required

will be different

external

determine

the

to

and reports with

to

from

may exist in

be

(from data

when it is

internal

data

and

transformations. to

buy tools

to extract

data or may write individual

routines

in-house.

The

is cost.

Once the In

each

system

data has been determined,

case, is in

The

it

may be necessary

a different

process

especially

format

from

common

and

Storing for than

source

Address

names

the

house

There

example,

of

ready

being

within a DBMS no unique

could

be down

same

the

to

details,

address,

and

data

warehouse. the

(or

as

source

mapping.

extracted

cleans) the

as subject-oriented

775

data,

data and

data.

both

shown in

may

entry

error

record

names

Figure

has

and

15.6,

missing.

or the

fact

been

many unique A person

that

created.

addresses

the Two

challenges

may

person or

have has

more

more moved

people

may

may be spelt incorrectly.

which exist in two

separate

For

databases

within

data anomalies

Address Rogers

6 State

Rd,

Roy

Rogers

Clare

A. Peterson

Jane Smiley

2: Customer

Marketing

6 State

Road,

4

Street,

West

1214

Marketing

A Claire

14

Smiley, Jane

NW,

M

L100

F

L121

Warehouse

Addresses

West St,

is to ensure format

L100

M22

M33

Range

L333

Gender

M33

12 to 14 Range

problem

M

M23

R

an agreed

Location

table

Name

Peterson,

Gender North

West,Manchester,

Rogers,

provides

be

Sales table

Roy

to this

in the

Intelligence

from

known

found

also scrubs

warehouse

values

a data a new

and

Name

A solution

field

This is

anomalies

presented

and a data

key

name and address

Database 1: Customer

For example,

if the

warehouse.

data

Transformation

for

example

Business

include:

often

the two tables

15.6

Database

for

data

any

for

must be mapped into the

rule,

in the

source.

anomalies

updating

the

consider

FIgure

is

which

and, instead

under

field

eliminate

Databases

inconsistencies

address,

be stored

mapped aims to

format data

and addresses

designer. one

the

attribute

a transformation

an operational

ensures it is in a standardised

Name

apply

of data transformation

when it is from

Some

each selected

to

15

for

that

names

Warehouse

the

name

and

addresses

and

Supplier_iD

Male

L100

Z123

Female

L121

Z45

Female

L333

address

are

within the

Title

Location

broken

data

down

into

warehouse

their

could

component

parts.

be:

Mr

First_name

Roy

Middle_name Rodgers

Last_name

Street

No. or house

name

15

6

Address_line1

State

Road

Address_line2 Manchester

Country Postal

M23 4FR

code

United

Country

Kingdom

an organisation. Marketing the

address In

Customer

table

in

field,

address check

to

ensure

then

M23 4FR is

that,

when

both the

existing

Figure

15.6,

Customer Male

entered

problems

the

a person

may

stored

the

when

been

encoded

as M

and F

is to

agree

data values into

data and flag records

Different

Standards

Country the

For

if

which

Often,

one

standards

format

example,

metric,

are likely

of the

and

does

not

be flagged

at that

that

each

Road.

both records

already

address.

could also

exist

be used place

one

does,

if

analysis.

house

be flagged

of the

be in

Finally, it

This is a difficult

moved

should

component

Accuracy

should

for further

may have

at all. In

M22.

ensure

UK. Rules

name

should

suitable

standard

for

is

are the

from

on,

the

data from

differently

while in

the

on a format

a number

in

each

situation

and the

correct

for further

analysis.

Customer

for

each

It is

where fields

cannot

be standardised.

global

as

be stored

Europe

within

and

and then

exist

fields

data

stored

warehouse

these

cover

and

warehouse,

perform

In the

it is

that

Standards

or imperial,

to

data

data

mm/dd/yyyy)

data

Africa,

routines

the

the

systems.

databases. table

very important

to

the

South

source file into

file in

organisations.

opposed

two

Marketing

format.

in

of operations

of the

the correct

exist

to

UK,

agreed

data as it is transported

to

merging

date (dd/mm/yyyy

measurements

is

same

or a person again,

M23 to

Customer

entered

a postal code database

Manchester,

a person living

occur

pick up erroneous

country

of

both

and the

or not

e.g. Rd becomes

For example,

be inserted

case,

is

different

needs

format,

city

be incorrect

has

The solution

to transform

Multiple

code

Sales table

is

developer

with the

to

In this

field

it is

and Female.

currency,

sources.

record

typically

Gender

Sales table

routines

postal

warehouse

code in the

Customer

his address

Problems

Multiple encoding In

external

postal

been recorded.

encoding

both the case

Rd and the data

whether there is not already

may not have

Multiple

against a valid

and the

to check

in

each

appears in a standardised

entered,

The address

analyse.

as the

record

to

yet in

stored

data is

is necessary

appears

rows,

problems,

be checked

that

details

also

such

name and address

should

to

of three

Road is

order to resolve

part of the

Roy Rogers

a total

is

automatic

and

write

rules

also

the

type

of

measurements.

should

which

as

these

used

in

be in the

US?

conversions

of the

warehouse.

Missing values Often,

when

you

information

extract

may not

or fields

may not

being

terms

the

of the

to

the

example,

15

are

referential In

be flagged

source.

for

some

3, Relational database.

matches

the

applied

To to

extracted


to

be simply

if the

be completed,

value

then

missing

may simply

with

missing

contained

ignored.

made to

missing

from

ensure

each

All suppressed

key

extracted

that

Reserved. content

does

May not

integrity

value

different this

relationship

Rights

Characteristics,

Referential

If critical,

establish within

the

the

is

is to

error,

upon

not

the

critical,

the record

missing field

weight

mismatched

depends

the field

record

and

human

been

values

then

Sometimes,

age

due to

have

within

an alternative

not materially

states

a table

from

does

into

be

in

you learnt

to

copied, affect

not

will determine


referential

a foreign

systems

violations

happen,

the data

that

that

key

which it is related.

source

databases,

that

and then inserted

Learning. that

action

primary

combined

occur.

could

and an attempt

However,

Model

made on all data that is

to

value

deal

data

missing.

give their

in

containing

value

by going

is time-dependent,

not

extract

the

for

record

until

complete.

a relational

data is

missing

If the

to

may be

or data

How to

warehouse.

may be

may decline

values

system,

of sources.

data

values

people

integrity

Chapter

that

the

warehouse,

all cases. In addition,

a number

within the

could

original

waiting

all fields

field

BI function,

missing field

back

to

the

For example,

stage in the source

across

of the

into

collected.

at the input

selected

significance

been

be applicable

no data available from

individual

have

the

must

prior to insertion

of

data

status

the

be enforced

entry

data

referential columns

in

or an entry

checks

must be

warehouse.

constraints

warehouse key

a null

integrity

into

integrity

of foreign

must

have

Referential

of referential

a set

integrity

are

As

more likely

integrity

rules

when records

are

are first

warehouse.


It is

very important

handling In

missing

order

the

to

support

warehouse

The final

data into is,

very time

Once

the

often it is

is

the

update

(the

time

15.5

to

star

takes

unique

it

on the update

database

the

database.

schema relational

served

advanced

the

has four

In

from

volume

will need

of

to

a data

which the

about

data).

available

and is

once

data is

moved into

usually

done in two

and is

used initially

and transformation

routines

of data

and

be updated

processing

within

the

to load

data

stages. historical

being

required,

warehouse

organisations

or refreshed

business

at regular

and the

used

(that

the first

time

less

intervals.

scheduling

complex

than

of its

the

first

and, of course, there is less

systems

should

cycle

is

are less intricate

warehouse)

effect,

the

data

and networks,

be scheduled

used to

schema

relational

and therefore

during

How business

time

data.

load.

However,

the load

non-business

requirements

structures facts,

on

map multidimensional

creates

database.

ER and

the The

near star

normalisation

window

hours.

schema

did

decision support

equivalent

not

of a was

yield

data

multidimensional

developed

a database

because

structure

that

well.

an easily implemented

relational

components:

star

techniques,

analysis

yield

the

existing

modelling

Star schemas preserving

is

777

and rules for

(data

data

Intelligence

sCheMas

a relational

existing

which

Business

fixes

metadata

exactly

in

only

extraction

The star schema is a data modelling technique into

know

process

place

organisations

routines

on the

update

integrity

warehouse

for

intensive.

The

will put pressure

to

as loading,

and the large

and transformation

it takes

load,

is live,

dependent

activities.

The extraction

use)

warehouse

updated

essential

is known

time

as referential

data

Databases

analysis.

Due to

in

and resource

data

intelligence

being

the

be a very time-consuming

first

warehouse. not

such

within

it is

data

ETL process

as the

routines

documented

intelligence,

data can

data

systems

load is

are

effective

of the

known

the

older

enable

Loading

stage,

all transformation

etc.

business

to

stage

warehouse. The first

that

values

15

model for

which

dimensions,

the

multidimensional

operational

attributes

and

database

attribute

is

data built.

analysis

The

basic

while still

star

schema

hierarchies.

15.5.1 Facts

Facts are numeric measurements (values) that represent a specific business aspect or activity. For example, sales figures are numeric measurements that represent product and/or service sales. Facts commonly used in business data analysis are units, costs, prices and revenues. Facts are normally stored in a fact table that is the centre of the star schema. The fact table contains facts that are linked through their dimensions (covered in the next section).

Facts can also be computed or derived at run time. Such computed or derived facts are sometimes called metrics to differentiate them from stored facts. The fact table is updated periodically with data from operational databases.
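The difference between stored facts and metrics derived at run time can be sketched as follows. This is only an illustration using Python's standard sqlite3 module; the table name sales_fact, its columns and the sample rows are assumptions made for the example and do not come from the text.

```python
import sqlite3

# Hypothetical fact table: units, unit price and unit cost are stored facts.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE sales_fact ("
    " prod_id INTEGER, units INTEGER, unit_price REAL, unit_cost REAL)"
)
conn.executemany(
    "INSERT INTO sales_fact VALUES (?, ?, ?, ?)",
    [(1, 10, 5.0, 3.0), (2, 4, 20.0, 12.5)],
)

# Revenue and margin are not stored; they are derived when the query runs.
metrics = conn.execute(
    "SELECT prod_id,"
    "       units * unit_price AS revenue,"
    "       units * (unit_price - unit_cost) AS margin"
    " FROM sales_fact ORDER BY prod_id"
).fetchall()

for prod_id, revenue, margin in metrics:
    print(prod_id, revenue, margin)
```

Deriving such metrics in the query keeps the fact table small, at the price of recomputing them on every request; whether to store or derive a value is a design decision for the warehouse designer.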

15.5.2 Dimensions

Dimensions are qualifying characteristics that provide additional perspectives to a given fact. Recall that dimensions are of interest because DSS data are almost always viewed in relation to other data. For instance, sales might be compared by product from region to region and from one time period to the next. The kind of problem typically addressed by a DSS might be as follows: make a comparison of the sales of unit X by region for the first quarters of 2014 to 2018. In that example, sales have product, location and time dimensions. In effect, dimensions are the magnifying glass through which you study the facts. Such dimensions are normally stored in dimension tables. Figure 15.7 depicts a star schema for sales with product, location and time dimensions.

Figure 15.7 Simple star schema (a Sales fact of 125 000 linked to a Product dimension value of Apple iPad, a Location dimension value of Southeast and a Time dimension value of May 2018)

15.5.3 Attributes

Each dimension table contains attributes. Attributes are often used to search, filter or classify facts. Dimensions provide descriptive characteristics about the facts through their attributes. Therefore, the data warehouse designer must define common business attributes that will be used by the data analyst to narrow a search, group information or describe dimensions. Using a sales example, some possible attributes for each dimension are illustrated in Table 15.9.

Table 15.9 Possible attributes for sales dimensions

Dimension Name | Description | Possible Attributes
Location | Anything that provides a description of the location. Example: East London, Store 101, Eastern Cape and SA | Region, country, city, store and so on
Product | Anything that provides a description of the product sold. For example, hair care product, shampoo, Natural Essence brand, 150 ml bottle and blue liquid | Product type, product ID, brand, package, presentation, colour, size and so on
Time | Anything that provides a time frame for the sales fact. For example, the year 2018, the month of July, the date 29/07/2018, and the time 4:46 p.m. | Year, quarter, month, week, day, time of day, and so on

These product, location and time dimensions add a business perspective to the sales facts. The data analyst can now group the sales figures for a given product, in a given region and at a given time. The star schema, through its facts and dimensions, can provide the data in the required format when the data are needed. And it can do so without imposing the burden of the additional and unnecessary data (such as order number, purchase order number and status) that commonly exist in operational databases.

Conceptually, the sales example's multidimensional data model is best represented by a three-dimensional cube. Of course, this does not imply that there is a limit on the number of dimensions that can be associated to a fact table. There is no mathematical limit to the number of dimensions used. However, using a three-dimensional model makes it easy to visualise the problem. In this three-dimensional example, using multidimensional data analysis jargon, the cube illustrated in Figure 15.8 represents a view of sales dimensioned by product, location and time.

Figure 15.8 Three-dimensional view of sales (conceptual three-dimensional cube of sales by product, location and time; axes: Location, Product and Time; sales facts are stored in the intersection of each product, time and location dimension)

Note that each sales value stored in the cube in Figure 15.8 is associated with the location, product and time dimensions. However, keep in mind that this cube is only a conceptual representation of multidimensional data, and it does not show how the data are physically stored in a data warehouse.

Whatever the underlying database technology, one of the main features of multidimensional analysis is its ability to focus on specific slices of the cube. For example, the product manager may be interested in examining the sales of a product, while the store manager is interested in examining the sales made by a particular store. Using multidimensional jargon, the ability to focus on slices of the cube to perform a more detailed analysis is known as slice and dice. Figure 15.9 illustrates the slice-and-dice concept. As you look at Figure 15.9, note that each cut across the cube yields a slice. Intersecting slices produce small cubes that constitute the dice part of the slice-and-dice operation.

Figure 15.9 Slice-and-dice view of sales (cube axes: Location, Product and Time; the slices shown are the sales manager's view of sales data and the product manager's view of sales data)

To slice and dice, it must be possible to identify each slice of the cube. This is done by using the values of each attribute in a given dimension. For example, to use the location dimension, you might need to define a STORE_ID attribute in order to focus on a particular store.

Given the requirement for attribute values in a slice-and-dice environment, let's re-examine Table 15.9. Note that each attribute adds an additional perspective to the sales facts, thus setting the stage for finding new ways to search, classify and possibly aggregate information. For example, the location dimension adds a geographic perspective of where the sales took place: in which country, region, city, store and so on. All of the attributes are selected with the objective of providing decision support data to the end user so that he or she can study sales by each of the dimension's attributes.

Time is an especially important dimension. The time dimension provides a framework from which sales patterns can be analysed and, possibly, predicted. Also, the time dimension plays an important role when the data analyst is interested in looking at sales aggregates by quarter, month, week, and so on. Given the importance and universality of the time dimension from a data analysis perspective, many vendors have added automatic time dimension management features to their data warehousing products.
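The slice-and-dice idea can be sketched in a few lines of Python. The sketch below is only a conceptual illustration: the cube is modelled as a dictionary keyed by (product, location, time), and the product names, store names and sales values are invented for the example. It says nothing about how a data warehouse physically stores the data.

```python
# Hypothetical cube: each key is (product, location, time) and each value
# is the sales fact stored at that intersection.
DIMENSIONS = ("product", "location", "time")

cube = {
    ("Product A", "Store 101", "Q1"): 100,
    ("Product A", "Store 101", "Q2"): 120,
    ("Product A", "Store 102", "Q1"): 80,
    ("Product B", "Store 101", "Q1"): 50,
    ("Product B", "Store 102", "Q2"): 70,
}


def cut(cells, **fixed):
    """Keep only the cells whose dimension values match the fixed ones."""
    wanted = {DIMENSIONS.index(name): value for name, value in fixed.items()}
    return {
        key: amount
        for key, amount in cells.items()
        if all(key[pos] == value for pos, value in wanted.items())
    }


# Slice: fix one dimension (the view of a single store).
store_slice = cut(cube, location="Store 101")

# Dice: intersect slices by fixing a second dimension as well.
store_q1_dice = cut(store_slice, time="Q1")

print(sum(store_slice.values()))
print(sum(store_q1_dice.values()))
```

Fixing an attribute value, such as a particular store, is what identifies the slice; this is why each dimension needs attributes whose values can be named in a query.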

15.5.4 Attribute Hierarchies

Attributes within dimensions can be ordered in a well-defined attribute hierarchy. The attribute hierarchy provides a top-down data organisation that is used for two main purposes: aggregation and drill-down/roll-up data analysis. For example, Figure 15.10 shows how the location dimension attributes can be organised in a hierarchy by country, region, city and store.


Figure 15.10 Location attribute hierarchy (Country, Region, City, Store, from top to bottom; the attribute hierarchy allows the end user to perform drill-down and roll-up searches)
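Roll-up and drill-down along the location hierarchy of Figure 15.10 can be sketched as follows. This is a minimal Python illustration, not a description of how an OLAP tool is implemented; the sample rows, store names and amounts are assumptions made for the example.

```python
from collections import defaultdict

# Location attribute hierarchy, from the most aggregated level down.
HIERARCHY = ("country", "region", "city", "store")

# Hypothetical store-level sales rows.
sales = [
    {"country": "UK", "region": "North West", "city": "Manchester",
     "store": "Store 101", "amount": 120},
    {"country": "UK", "region": "North West", "city": "Manchester",
     "store": "Store 102", "amount": 80},
    {"country": "UK", "region": "North West", "city": "Liverpool",
     "store": "Store 201", "amount": 60},
    {"country": "UK", "region": "South East", "city": "London",
     "store": "Store 301", "amount": 200},
]


def aggregate(rows, level):
    """Total the sales at one level of the hierarchy (roll-up)."""
    depth = HIERARCHY.index(level) + 1
    totals = defaultdict(int)
    for row in rows:
        key = tuple(row[name] for name in HIERARCHY[:depth])
        totals[key] += row["amount"]
    return dict(totals)


def drill_down(rows, level, **path):
    """Restrict to one branch of the hierarchy, then total at a finer level."""
    branch = [r for r in rows if all(r[k] == v for k, v in path.items())]
    return aggregate(branch, level)


by_country = aggregate(sales, "country")
by_region = aggregate(sales, "region")
north_west_cities = drill_down(sales, "city", region="North West")

print(by_country)
print(by_region)
print(north_west_cities)
```

The ordered tuple HIERARCHY plays the role of the defined path: it tells the code which attributes to group by at each level, so the same rows can be aggregated upwards or decomposed downwards.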

The

attribute

hierarchy

warehouse. month-to-date

analyst

month

the

analyst

data

sales

spots

the

country

provides

For example,

can

or in

identifies

the

OLAP

aggregated attribute

for

example,

product

group

(dairy,

15.11

the

data

meaning

that

Q1,

Q3 and

cell contains

and

so

on) in

analyst the

In this

will see

data

are

Q4). Finally,

the total

all

which

example,

data

the

on the

y-axis.

by quarters

location

dimension

sales for each country

necessary

for

narrative

brand analyst

all

drill down

example,

that,

a specific until

the

to

to form

A, Brand sales

is

product,

thus

meaning

set to Quarter,

of products

set to Country,

on the

on).

the

products,

(x-axis)

sales

so

using

For

dimension

be based

B, and

is set to All

of an

dimensions.

product

facts,

and

a hierarchy.

can

studies

warehouse

be part

of the

using the

dimension

data

decomposed

attributes

(Brand

total

product

be

dimension

dimension

is initially

for a given

data

By doing within

allows the

descriptions

product

The time

(for

2013

The

be extended

are to

can be grouped

The

product

even

a data

the

drill down inside

year.

all regions

hierarchy data

may want to

product

the

in

in

does

performance?

previous

can

how

provide

store.

How

norm.

attribute

not

searches

query,

might decide to

of the

operation

the

roll-up

sales

were reflected

dimensions

in the

products

the

It is to

or on the

aggregated

to those

will identify

city to store, you

a scenario

dimensions.

the

that

different

products

to the

month-to-date

sales

below

merely

from

slow

meat,

illustrates

and location

Q2,

exist

and

answers

The data analyst

because

path

drill-down

of drill-down

operations.

attributes

can identify

March

possible

attributes

2019

performing

a defined

after you drill down from manager

that

is

is

at the

compared

This type

and roll-up

some

mind that the

so the

Figure

have

drill-down

hierarchy;

But keep in

time

to

the low

that

scenario

systems

to the

by region

region.

store

perform

looks

March 2019.

sales

whether

a particular

The just-described and

how

to

analyst

compare

decline for

see

determine

only

analyst

sales

March to

capability

a data

performance

a sharp

of

the

suppose

A, B and

ensuring

Cin

that

each

in a given quarter.


Figure 15.11 Attribute hierarchies in multidimensional analysis (time dimension hierarchy: Year, Quarter, Month, Week; product dimension: All products, By product type, One product; location dimension hierarchy: Country, Region, City, Store; the grid shows Product A, Product B and Product C by Q1, Q2, Q3 and Q4, with a total of quarters)

The simple scenario illustrated in Figure 15.11 provides the data analyst with three different information paths. On the product dimension (the y-axis), the data analyst can request to see all products, products grouped by type, or just one product. On the time dimension (the x-axis), the data analyst can request time-variant data at different levels of aggregation: year, quarter, month or week. Each sales value initially shows the total sales, by country, of each product. When a GUI is used, the data analyst clicks on the region cell to drill down to see data by region within the country. Clicking again on one of the region values gives the sales for each city in the region, and so forth.

As the preceding examples illustrate, attribute hierarchies determine how the data in the data warehouse are extracted and presented. The attribute hierarchy information is stored in the DBMS's data dictionary and is used by the OLAP tool to access the data warehouse properly. Once such access is ensured, query tools must be closely integrated with the data warehouse's metadata and they must support powerful analytical capabilities.

15.5.5 Star Schema Representation

Facts and dimensions are normally represented by physical tables in the data warehouse database. The fact table is related to each dimension table in a many-to-one (*:1) relationship.

In other words, many fact rows are related to each dimension row. Using the sales example, you can conclude that each product appears many times in the sales fact table.

Fact and dimension tables are related by foreign keys and are subject to the familiar primary key/foreign key constraints. The primary key on the 1 side, the dimension table, is stored as part of the primary key on the many side, the fact table. Because the fact table is related to many dimension tables, the primary key of the fact table is a composite primary key. Figure 15.12 illustrates the relationships among the sales fact table and the product, location and time dimension tables. To show you how easily the star schema can be expanded, a customer dimension has been added to the mix. Adding the customer dimension merely required including the CUST_ID in the SALES fact table and adding the CUSTOMER table to the database.

star schema for sales

TIME

LOCATION LOC_ID

1

1

LOC_DESCRIPTION

TIME_ID TIME_YEAR

COUNTRY_ID

TIME_QUARTER *

LOC_REGION

TIME_MONTH

SALES

*

LOC_CITY

TIME_DAY

TIME_ID

25 records

TIME_CLOCKTIME

LOC_ID *

CUSTOMER

365 records

CUST_ID

1

*

PROD_ID

CUST_ID

PRODUCT

1

SALES_QUANTITY

PROD_ID

CUST_LNAME

SALES_PRICE

CUST_FNAME

PROD_DESCRIPTION

SALES_TOTAL

CUST_INITIAL

PROD_TYPE_ID

3 000 000 records

CUST_DOB

PROD_BRAND

Daily sales aggregates by store,

125 records

customer product

and

PROD_COLOUR PROD_SIZE

PROD_PACKAGE PROD_PRICE

3000 records

The composite primary key for the SALES fact table is composed of TIME_ID, LOC_ID, CUST_ID and PROD_ID. Each record in the SALES fact table is uniquely identified by the combination of values for each of the fact table's foreign keys. By default, the fact table's primary key is always formed by combining the foreign keys pointing to the dimension tables to which they are related. In this case, each sales record represents each product sold to a specific customer, at a specific time and in a specific location. In this schema, the time dimension table represents daily periods, so the SALES fact table represents daily sales aggregates by product and by customer.
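The composite key described above can be tried out with a minimal sketch that uses Python's built-in sqlite3 module. The table and column names follow Figure 15.12, but the data types, the trimmed column lists and the sample rows are assumptions made only for illustration. The sketch shows that the fact table's primary key is the combination of its four foreign keys, so a second row with the same combination is rejected, as is a row that points to a non-existent dimension record.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite checks foreign keys only when enabled

# Dimension tables (trimmed to a few columns) and the SALES fact table.
conn.executescript("""
CREATE TABLE TIME     (TIME_ID INTEGER PRIMARY KEY, TIME_YEAR INTEGER, TIME_MONTH INTEGER, TIME_DAY INTEGER);
CREATE TABLE LOCATION (LOC_ID  INTEGER PRIMARY KEY, LOC_REGION TEXT, LOC_CITY TEXT);
CREATE TABLE CUSTOMER (CUST_ID INTEGER PRIMARY KEY, CUST_LNAME TEXT);
CREATE TABLE PRODUCT  (PROD_ID INTEGER PRIMARY KEY, PROD_DESCRIPTION TEXT, PROD_PRICE REAL);
CREATE TABLE SALES (
    TIME_ID        INTEGER NOT NULL REFERENCES TIME(TIME_ID),
    LOC_ID         INTEGER NOT NULL REFERENCES LOCATION(LOC_ID),
    CUST_ID        INTEGER NOT NULL REFERENCES CUSTOMER(CUST_ID),
    PROD_ID        INTEGER NOT NULL REFERENCES PRODUCT(PROD_ID),
    SALES_QUANTITY INTEGER,
    SALES_PRICE    REAL,
    SALES_TOTAL    REAL,
    PRIMARY KEY (TIME_ID, LOC_ID, CUST_ID, PROD_ID)  -- composite key built from the foreign keys
);
INSERT INTO TIME     VALUES (1, 2020, 1, 15);
INSERT INTO LOCATION VALUES (1, 'North', 'Leeds');
INSERT INTO CUSTOMER VALUES (1, 'Smith');
INSERT INTO PRODUCT  VALUES (1, 'Widget', 9.99);
INSERT INTO SALES    VALUES (1, 1, 1, 1, 3, 9.99, 29.97);
""")


def try_insert(row):
    """Return True if the SALES row is accepted, False if a constraint rejects it."""
    try:
        conn.execute("INSERT INTO SALES VALUES (?, ?, ?, ?, ?, ?, ?)", row)
        return True
    except sqlite3.IntegrityError:
        return False


# Same (TIME_ID, LOC_ID, CUST_ID, PROD_ID) combination as the existing row.
duplicate_accepted = try_insert((1, 1, 1, 1, 1, 9.99, 9.99))
# PROD_ID 999 does not exist in the PRODUCT dimension.
orphan_accepted = try_insert((1, 1, 1, 999, 1, 9.99, 9.99))
sales_rows = conn.execute("SELECT COUNT(*) FROM SALES").fetchone()[0]
```

Both inserts are expected to fail in this sketch, leaving a single row in SALES. A production data warehouse DBMS would declare the same constraints with its own data types and storage options.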

Since fact tables contain the actual values used in the decision support process, those values are repeated many times in the fact tables. Therefore, the fact tables are always the largest tables in the star schema. Since the dimension tables contain only non-repetitive information (all unique salespersons, all unique products, and so on), the dimension tables are always smaller than the fact tables.

In a typical star schema, each dimension record is related to thousands of fact records. For example, widget appears only once in the product dimension, but it has thousands of corresponding records in the SALES fact table. That characteristic of the star schema facilitates data retrieval functions because most of the time the decision support analyst will look at the facts through the dimension's attributes. Therefore, a DSS-optimised data warehouse DBMS first searches the smaller dimension tables before accessing the larger fact tables.

Data warehouses usually have many fact tables. Each fact table is designed to answer specific questions. For example, suppose you develop a new interest in orders while maintaining your original interest in sales. In that scenario, you should maintain an ORDERS fact table and a SALES fact table in the same data warehouse. If orders are considered to be an organisation's key interest, the ORDERS fact table should be the centre of a star schema that might have vendor, product and time dimensions. In that case, an interest in vendors yields a new vendor dimension, represented by a new VENDOR table in the database. The product dimension is represented by the same product table used in the initial sales star schema. However, given the interest in orders as well as sales, the time dimension now requires special attention. If the orders department uses the same time periods as the sales department, time can be represented by the same time table. If different time periods are used, you must create another table, perhaps named ORDER_TIME, to represent the time periods used by the orders department. In Figure 15.13, the orders star schema shares the product, vendor and time dimensions.

Figure 15.13 Orders star schema

PRODUCT (3000 records): PROD_ID, PROD_DESCRIPTION, PROD_TYPE_ID, PROD_BRAND, PROD_COLOUR, PROD_SIZE, PROD_PACKAGE, PROD_PRICE

TIME (365 records): TIME_ID, TIME_YEAR, TIME_QUARTER, TIME_MONTH, TIME_DAY, TIME_CLOCKTIME

VENDOR (50 records): VEND_ID, VEND_NAME, VEND_AREACODE, VEND_PHONE, VEND_EMAIL

ORDER (85 000 records; daily sales aggregates by product and vendor): TIME_ID, PROD_ID, VEND_ID, ORDER_QUANTITY, ORDER_PRICE, ORDER_AMOUNT

Each dimension table (PRODUCT, TIME, VENDOR) is on the 1 side of a 1:* relationship with the ORDER fact table.

Multiple fact tables can also be created for performance and semantic reasons. The following section will explain several performance-enhancing techniques that can be used within the star schema.

15.5.6 Star Schema Performance-Improving Techniques

The creation of a database that provides fast and accurate answers to data analysis queries is the data warehouse design's prime objective. Therefore, performance-enhancement actions might target query speed through the facilitation of SQL code as well as through better semantic representation of business dimensions. Four techniques are often used to optimise data warehouse design:

Normalising dimensional tables

Maintaining multiple fact tables to represent different aggregation levels

Denormalising fact tables

Partitioning and replicating tables

Normalising Dimensional Tables

Dimensional tables are normalised to achieve semantic simplicity and facilitate end-user navigation through the dimensions. For example, if the location dimension table contains transitive dependencies among region, province and city, you can revise those relationships to the 3NF (third normal form), as shown in Figure 15.14. (If necessary, review normalisation techniques in Chapter 7, Normalising Database Designs.) The star schema shown in Figure 15.14 is known as a snowflake schema. A snowflake schema is a type of star schema in which the dimension tables can have their own dimension tables. The snowflake schema is usually the result of normalising dimension tables.

By normalising the dimension tables, you simplify the data-filtering operations related to the dimensions. In this example, the COUNTRY, REGION, CITY and LOCATION tables contain very few records compared to the SALES fact table. Only the LOCATION table is directly related to the SALES fact table.

Figure 15.14 Normalised dimension tables
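To make the difference between the two designs concrete, the sketch below builds both a normalised (snowflake) location dimension and a single denormalised LOCATION table in SQLite, then aggregates sales by country in each. The key columns (COUNTRY_ID, REGION_ID, CITY_ID), the table name LOCATION_FLAT and the sample rows are assumptions for illustration only; Figure 15.14 itself is not reproduced here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Snowflake: LOCATION -> CITY -> REGION -> COUNTRY
CREATE TABLE COUNTRY  (COUNTRY_ID INTEGER PRIMARY KEY, COUNTRY_NAME TEXT);
CREATE TABLE REGION   (REGION_ID  INTEGER PRIMARY KEY, REGION_NAME TEXT,
                       COUNTRY_ID INTEGER REFERENCES COUNTRY(COUNTRY_ID));
CREATE TABLE CITY     (CITY_ID    INTEGER PRIMARY KEY, CITY_NAME TEXT,
                       REGION_ID  INTEGER REFERENCES REGION(REGION_ID));
CREATE TABLE LOCATION (LOC_ID     INTEGER PRIMARY KEY, LOC_DESCRIPTION TEXT,
                       CITY_ID    INTEGER REFERENCES CITY(CITY_ID));
-- Star: one denormalised location dimension (hypothetical name)
CREATE TABLE LOCATION_FLAT (LOC_ID INTEGER PRIMARY KEY, LOC_DESCRIPTION TEXT,
                            LOC_CITY TEXT, LOC_REGION TEXT, LOC_COUNTRY TEXT);
CREATE TABLE SALES (LOC_ID INTEGER, SALES_TOTAL REAL);

INSERT INTO COUNTRY  VALUES (1, 'UK'), (2, 'France');
INSERT INTO REGION   VALUES (1, 'North', 1), (2, 'South', 1), (3, 'Ile-de-France', 2);
INSERT INTO CITY     VALUES (1, 'Leeds', 1), (2, 'London', 2), (3, 'Paris', 3);
INSERT INTO LOCATION VALUES (1, 'Leeds store', 1), (2, 'London store', 2), (3, 'Paris store', 3);
INSERT INTO LOCATION_FLAT VALUES
    (1, 'Leeds store',  'Leeds',  'North',         'UK'),
    (2, 'London store', 'London', 'South',         'UK'),
    (3, 'Paris store',  'Paris',  'Ile-de-France', 'France');
INSERT INTO SALES VALUES (1, 100.0), (2, 200.0), (2, 50.0), (3, 400.0);
""")

# Snowflake: the query must walk the whole chain of dimension tables.
snowflake = conn.execute("""
    SELECT co.COUNTRY_NAME, SUM(s.SALES_TOTAL)
    FROM SALES s
    JOIN LOCATION l ON l.LOC_ID = s.LOC_ID
    JOIN CITY ci    ON ci.CITY_ID = l.CITY_ID
    JOIN REGION r   ON r.REGION_ID = ci.REGION_ID
    JOIN COUNTRY co ON co.COUNTRY_ID = r.COUNTRY_ID
    GROUP BY co.COUNTRY_NAME
    ORDER BY co.COUNTRY_NAME
""").fetchall()

# Star: a single join to the denormalised dimension gives the same answer.
star = conn.execute("""
    SELECT l.LOC_COUNTRY, SUM(s.SALES_TOTAL)
    FROM SALES s
    JOIN LOCATION_FLAT l ON l.LOC_ID = s.LOC_ID
    GROUP BY l.LOC_COUNTRY
    ORDER BY l.LOC_COUNTRY
""").fetchall()
```

Both queries return the same totals per country. The snowflake version keeps each geographic level in its own small table, at the cost of a longer join chain in every query that rolls sales up to a higher level.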

Note

Although using the dimension tables shown in Figure 15.14 gains structural simplicity, there is a price to pay for that simplicity. For example, if you want to aggregate the data by country, you must use a four-table join, thus increasing the complexity of the SQL statements. The star schema in Figure 15.12 uses a LOCATION dimension table that greatly facilitates data retrieval by eliminating multiple join operations. This is yet another example of the trade-offs that designers must consider.

Maintaining Multiple Fact Tables Representing Different Aggregation Levels

You can also speed up query operations by creating and maintaining multiple fact tables related to each level of aggregation (country, region and city) in the location dimension. These aggregate tables are precomputed at the data-loading phase rather than at run time. The purpose of this technique is to save processor cycles at run time, thereby speeding up data analysis. An end-user query tool optimised for decision analysis then properly accesses the summarised fact tables instead of computing the values by accessing a lower level of detail fact table. This technique is illustrated in Figure 15.15, which adds aggregate fact tables for country, region and city to the initial sales example.

Figure 15.15 Multiple fact tables

The data warehouse designer must identify which levels of aggregation to pre-compute and store in the database. These multiple aggregate fact tables are updated during each load cycle in batch mode. And because the objective is to minimise access and processing time, according to the expected frequency of use and the processing time required to calculate a given aggregation level at run time, the data warehouse designer must select which aggregation fact tables to create.
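The idea of pre-computing an aggregate fact table during the load cycle can be sketched as follows, again with Python's sqlite3 module. The aggregate table name SALES_REGION, the helper load_cycle and the sample rows are hypothetical; a real warehouse would normally refresh its aggregates with the DBMS's own batch or materialised-view facilities rather than with application code.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE LOCATION (LOC_ID INTEGER PRIMARY KEY, LOC_CITY TEXT, LOC_REGION TEXT);
CREATE TABLE SALES (TIME_ID INTEGER, LOC_ID INTEGER, SALES_TOTAL REAL);
-- Aggregate fact table at the region level (hypothetical name)
CREATE TABLE SALES_REGION (
    TIME_ID INTEGER, LOC_REGION TEXT, SALES_TOTAL REAL,
    PRIMARY KEY (TIME_ID, LOC_REGION)
);
INSERT INTO LOCATION VALUES (1, 'Leeds', 'North'), (2, 'York', 'North'), (3, 'London', 'South');
""")


def load_cycle(connection, detail_rows):
    """Batch load: add detail rows, then rebuild the region-level aggregate."""
    with connection:  # one transaction; commits on success, rolls back on error
        connection.executemany("INSERT INTO SALES VALUES (?, ?, ?)", detail_rows)
        connection.execute("DELETE FROM SALES_REGION")
        connection.execute("""
            INSERT INTO SALES_REGION
            SELECT s.TIME_ID, l.LOC_REGION, SUM(s.SALES_TOTAL)
            FROM SALES s JOIN LOCATION l ON l.LOC_ID = s.LOC_ID
            GROUP BY s.TIME_ID, l.LOC_REGION
        """)


load_cycle(conn, [(1, 1, 100.0), (1, 2, 50.0), (1, 3, 200.0), (2, 1, 25.0)])

# Run time: read the small pre-computed table instead of joining and summing detail rows.
precomputed = conn.execute("""
    SELECT LOC_REGION, SALES_TOTAL FROM SALES_REGION
    WHERE TIME_ID = 1 ORDER BY LOC_REGION
""").fetchall()

# For comparison only: the same answer computed from the detail fact table.
on_the_fly = conn.execute("""
    SELECT l.LOC_REGION, SUM(s.SALES_TOTAL)
    FROM SALES s JOIN LOCATION l ON l.LOC_ID = s.LOC_ID
    WHERE s.TIME_ID = 1
    GROUP BY l.LOC_REGION ORDER BY l.LOC_REGION
""").fetchall()
```

The run-time query touches only SALES_REGION, so its cost no longer grows with the number of detail rows. The price is the extra work and storage in every load cycle, which is why the designer has to choose which aggregation levels are worth pre-computing.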

Denormalising Fact Tables

Denormalising fact tables improves data access performance and saves data storage space. The latter objective, however, is becoming less of an issue. Data storage costs decrease almost daily, and DBMS limitations that restrict database and table size limits, record size limits and the maximum number of records in a single table, have far more negative effects than raw storage space costs.

Denormalisation improves performance by using a single record to store data that normally take many records. For example, to compute the total sales for all products in all regions, you might have to access the region sales aggregates and summarise all of the records in this table. If you have 300 000 product sales, you could be summarising at least 300 000 rows. Although this might not be a very taxing operation for a DBMS, a comparison of, for example, ten years' worth of previous sales begins to bog down the system. In such cases, it is useful to have special aggregate tables that are denormalised. For example, a YEAR_TOTALS table might contain the following fields: YEAR_ID, MONTH_1, MONTH_2 ... MONTH_12 and each year's total. Such tables can easily be used to serve as a basis for year-to-year comparisons at the top month level, the quarter level or the year level. Here again, design criteria, such as frequency of use and performance requirements, are evaluated against the possible overload placed on the DBMS to manage the denormalised relations.

Partitioning and Replicating Tables

Since table partitioning and replication were covered in detail in Chapter 14, Distributed Databases, these techniques are discussed here only as they specifically relate to the data warehouse. Table partitioning and replication are particularly important when a DSS is implemented in widely dispersed geographic areas.

Partitioning splits a table into subsets of rows or columns and places the subsets close to the client computer to improve data access time. Replication makes a copy of a table and places it in a different location, also to improve access time.

No matter which performance-enhancement scheme is used, time is the most common dimension used in business data analysis. Therefore, it is very common to have one fact table for each level of aggregation defined within the time dimension. For example, in the sales example, you might have five aggregate SALES fact tables: daily, weekly, monthly, quarterly and yearly. Those fact tables must have an implicit or explicit periodicity defined. Periodicity, usually expressed as current year only, previous years or all years, provides information about the timespan of the data stored in the table.

At the end of each year, daily sales for the current year are moved to another table that contains previous years' daily sales only. This table actually contains all sales records from the beginning of operations, with the exception of the current year. The data in the current year and previous years' tables thus represent the complete sales history of the company. The previous years' sales table can be replicated at several locations to avoid remote access to the historic sales data, which can cause slow response time. The possible size of this table is enough to intimidate all but the bravest of query optimisers. This is one case in which denormalisation would be of value!

In this section, you learnt how the star schema design technique allows you to model data optimised for business decision making. Business intelligence analytics tools use the data warehouse data as the raw materials to generate business knowledge.
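The denormalised YEAR_TOTALS table mentioned above can be sketched as a pivot of monthly rows into one record per year. The source table MONTHLY_SALES, its columns MONTH_NUM and SALES_TOTAL, the column name YEAR_TOTAL and the sample figures are assumptions for illustration; only YEAR_TOTALS, YEAR_ID and MONTH_1 to MONTH_12 come from the text.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE MONTHLY_SALES (YEAR_ID INTEGER, MONTH_NUM INTEGER, SALES_TOTAL REAL)")

# Hypothetical data: 12 monthly rows per year, 24 rows in total.
monthly_rows = [(2019, m, 100.0) for m in range(1, 13)] + \
               [(2020, m, 110.0) for m in range(1, 13)]
conn.executemany("INSERT INTO MONTHLY_SALES VALUES (?, ?, ?)", monthly_rows)

# One SUM(CASE ...) column per month turns twelve rows into twelve columns.
month_columns = ", ".join(
    f"SUM(CASE WHEN MONTH_NUM = {m} THEN SALES_TOTAL ELSE 0 END) AS MONTH_{m}"
    for m in range(1, 13)
)
conn.execute(f"""
    CREATE TABLE YEAR_TOTALS AS
    SELECT YEAR_ID, {month_columns}, SUM(SALES_TOTAL) AS YEAR_TOTAL
    FROM MONTHLY_SALES
    GROUP BY YEAR_ID
""")

# A year-to-year comparison now reads one record per year.
comparison = conn.execute(
    "SELECT YEAR_ID, MONTH_1, YEAR_TOTAL FROM YEAR_TOTALS ORDER BY YEAR_ID"
).fetchall()
year_rows = conn.execute("SELECT COUNT(*) FROM YEAR_TOTALS").fetchone()[0]
```

In this sketch the 24 monthly rows collapse into 2 denormalised records. The repeating MONTH_n columns would normally be a normalisation fault, which is exactly the trade-off the section describes: fewer rows to scan in exchange for a less flexible structure.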

15.6 Data Analytics

Data analytics is a subset of BI functionality that encompasses a wide range of mathematical, statistical and modelling techniques with the purpose of extracting knowledge from data. Data analytics is used at all levels within the BI framework, including queries and reporting, monitoring and alerting, and data visualisation. Hence, data analytics is a shared service that is crucial to what BI adds to an organisation. Data analytics represents what business managers really want from BI: the ability to extract actionable business insight from current events and foresee future problems or opportunities.

Data analytics discovers characteristics, relationships, dependencies or trends in the organisation's data, and then explains the discoveries and predicts future events based on the discoveries. In practice, data analytics is better understood as a continuous spectrum of knowledge acquisition that goes from discovery to explanation to prediction. The outcomes of data analytics then become part of the information framework on which decisions are built.

Based on the previous discussion, data analytics tools can be grouped into two separate (but closely related and often overlapping) areas:

Explanatory analytics focuses on discovering and explaining data characteristics and relationships based on existing data. Explanatory analytics uses statistical tools to formulate hypotheses, test them, and answer the how and why of such relationships; for example, how do past sales relate to previous customer promotions?

Predictive analytics focuses on predicting future data outcomes with a high degree of accuracy. Predictive analytics uses sophisticated statistical tools to help the end user create advanced models that answer questions about future data occurrences; for example, what would next month's sales be based on a given customer promotion?

You can think of explanatory analytics as explaining the past and present, while predictive analytics forecasts the future. However, you need to understand that both sciences work together; predictive analytics uses explanatory analytics as a stepping stone to create predictive models.

Data analytics has evolved over the years from simple statistical analysis of business data to dimensional analysis with OLAP tools, and then from data mining that discovers data patterns, relationships and trends to its current status of predictive analytics. The next sections illustrate the basic characteristics of data mining and predictive analytics.
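The two questions used as examples above can be illustrated with a very small sketch in plain Python. The promotion and sales figures are invented, and a correlation coefficient followed by a least-squares line is only one simple stand-in for the statistical tools the text refers to; real BI tools use far richer models.

```python
from math import sqrt

# Hypothetical history: promotion spend and the sales that followed, per month.
promo_spend = [1.0, 2.0, 3.0, 4.0, 5.0]
sales = [12.0, 15.0, 15.0, 19.0, 19.0]


def mean(values):
    return sum(values) / len(values)


def correlation(xs, ys):
    """Pearson correlation: how strongly past sales relate to past promotions."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


def fit_line(xs, ys):
    """Least-squares line y = intercept + slope * x."""
    mx, my = mean(xs), mean(ys)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sum((x - mx) ** 2 for x in xs)
    return slope, my - slope * mx


# Explanatory step: describe the relationship found in existing data.
r = correlation(promo_spend, sales)

# Predictive step: reuse that relationship to estimate next month's sales
# for a planned promotion spend of 6.0 (an assumed value).
slope, intercept = fit_line(promo_spend, sales)
forecast = intercept + slope * 6.0
```

The explanatory result (a correlation close to 1 for this invented data) is what justifies building the predictive line at all, which mirrors the point that predictive analytics uses explanatory analytics as a stepping stone.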

15.6.1 Data Mining

Data mining refers to analysing massive amounts of data to uncover hidden trends, patterns and relationships; to form computer models to simulate and explain the findings; and then to use such models to support business decision making. In other words, data mining focuses on the discovery and explanation stages of knowledge acquisition.

To put data mining in perspective, look at the pyramid in Figure 15.16, which represents how knowledge is extracted from data. Data form the pyramid base and represent what most organisations collect in their operational databases. The second level contains information that represents the purified and processed data. Information forms the basis for decision making and business understanding. Knowledge is found at the pyramid's apex and represents highly specialised information.

Figure 15.16 Extracting knowledge from data (a pyramid with data at the base, information in the middle and knowledge at the apex; processing rises from low at the base to high at the apex)

Current-generation data mining tools contain many design and application variations to fit specific business requirements. Depending on the problem domain, data mining tools focus on market niches such as banking, insurance, marketing, retailing, finance and healthcare. Within a given niche, data mining tools can use certain algorithms that are implemented in different ways and applied over different data.

In spite of the lack of precise standards, data mining is subject to four general phases:

1 Data preparation

2 Data analysis and classification

3 Knowledge acquisition

4 Prognosis.

In the data preparation phase, the main data sets to be used by the data mining operation are identified and cleansed of any data impurities. Because the data in the data warehouse are already integrated and filtered, the data warehouse is usually the target set for data mining operations.

The data analysis and classification phase studies the data to identify common data characteristics or patterns. During this phase, the data mining tool applies specific algorithms to find:

Data groupings, classifications, clusters, or sequences

Data dependencies, links, or relationships

Data patterns, trends and deviations.

The knowledge acquisition phase uses the results of the data analysis and classification phase. During the knowledge acquisition phase, the data mining tool (with possible intervention by the end user)

Chapter

selects

the

used in

appropriate

data

mining

regression

trees,

algorithms

also

neural

phase.

In

88 per

per

per

age

, 30

to

The

complete

model the

cent

Intelligence

algorithms

classification

and

data

to

optimise

791

and

visualisation.

Hybrid

decision

trees

any combination

and

to generate

data set. acquisition

findings

are

of data

mining findings

did

use

not

most common

induction,

in

Business

used

to

can

a particular

phase,

others

predict

future

continue

to the

behaviour

and

be:

credit

card in the

past

six

months

are

account. who

within the

bought

next four

25 000

of findings

can

presentation

phase

Figure

FIgure

algorithms

knowledge

mining

be used

for

and

a 60-inch

or larger

TV are

90 per

cent likely

to

buy

weeks.

credit

rating

, 3 and

credit

amount

.

25 000,

then

the

term is ten years.

set

prognosis

data

of customers

,5

can

Databases

The

rules

neighbour

that

of the target

at the

who

that

centre

or a visual

promotion.

cancel

and income

minimum loan

the

Examples

of customers

an entertainment If

phase,

stop

trees,

many of these

behaviour

tools

nearest

algorithms

may use

algorithms.

decision

and

genetic

the

mining

that

cent

cent likely

Eighty-two

reflects

outcomes.

acquisition

networks,

reasoning,

mining tool

data

business

Sixty-five

or knowledge on neural

example,

A data

many

prognosis forecast

for

model that

Although

based

memory-based exist,

networks.

a computer

modelling are

15

might project

15.17

15.17

be represented

interface

illustrates

the likely the

in

that is

a decision

used to outcome

different

tree,

project future of a new

phases

of the

a neural

network,

events

or results.

product

data

mining

rollout

a forecasting

For example,

or a new

marketing

techniques.

Figure 15.17 Data mining phases

Data sources: operational database, data warehouse

Data preparation phase: identify data set; clean data set; integrate data set

Data analysis and classification phase: classification analysis; clustering and sequence analysis; link analysis; trend and deviation analysis

Knowledge acquisition phase: select and apply algorithms (artificial neural networks, inductive logic, decision ensembles, classification and regression trees, nearest neighbour, visualisation, etc.)

Prognosis phase: prediction; forecasting; modelling
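The four phases in Figure 15.17 can be walked through with a toy example in plain Python. Everything here is an assumption made for illustration: the customer records are invented, the six-month inactivity threshold is arbitrary, and a simple cancellation rate per group stands in for the real knowledge acquisition algorithms (neural networks, decision trees and so on) that a data mining tool would apply.

```python
# Hypothetical credit card customers; months_inactive = months since last card use.
raw_records = [
    {"id": 1, "months_inactive": 7, "cancelled": True},
    {"id": 2, "months_inactive": 8, "cancelled": True},
    {"id": 3, "months_inactive": 6, "cancelled": True},
    {"id": 4, "months_inactive": 9, "cancelled": False},
    {"id": 5, "months_inactive": 0, "cancelled": False},
    {"id": 6, "months_inactive": 1, "cancelled": False},
    {"id": 7, "months_inactive": 2, "cancelled": True},
    {"id": 8, "months_inactive": 0, "cancelled": False},
    {"id": 9, "months_inactive": None, "cancelled": False},  # impure record
]

# 1 Data preparation: identify the data set and cleanse it of impurities.
clean = [r for r in raw_records if r["months_inactive"] is not None]

# 2 Data analysis and classification: group customers by a common characteristic.
inactive = [r for r in clean if r["months_inactive"] >= 6]
active = [r for r in clean if r["months_inactive"] < 6]


# 3 Knowledge acquisition: derive a simple finding for each group.
def cancel_rate(group):
    return sum(r["cancelled"] for r in group) / len(group)


findings = {"inactive": cancel_rate(inactive), "active": cancel_rate(active)}


# 4 Prognosis: use the findings to predict the behaviour of a new customer.
def likely_to_cancel(months_inactive, threshold=0.5):
    group = "inactive" if months_inactive >= 6 else "active"
    return findings[group] > threshold
```

With this invented data, the inactive group cancels far more often than the active group, so a new customer who has not used the card for eight months would be flagged as likely to cancel. The shape of the finding is the same as the percentage-based examples data mining tools report, only at toy scale.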

Because of

what

between that

of the

a customers

meaningful

data that

reduce

healthcare

fraud,

Guided.

The

or relationships.

Automated.

end

In this

uncover

hidden

techniques

data.

patterns,

customer

taBle

However, For

models.

such

as a customer

example,

15.10

software

data

mining

can

also

model

to a target in

that

could

be used

marketing

more detail.

basis

Table 15.10

to the

data.

and

multiple

and explains that

to

to

campaign.

known

mining

profile

as the

modes:

apply

applies

data

describes

a customer

be used

explain

automatically

section,

acceptance,

in two

to run

these

relationships

and

to

mining tool

in this

create

and

Clearly,

mining usually

practical

techniques

tool

information

a predictive

analytics

mining

could

response

the use of predictive

The

As you learnt

data

be run

explore

which

mining

can

car. analysis,

development

mining

boundaries relationship

customers

Fortunately,

by step to

data

and extracting

data

Data

the

a close

(In regression

helpful in finding

decides

up the

model

on the

product

on.

step

user

and relationships.

group.

data

end

sets

an explanatory

data

and

user

of tyres

correlation.)

so

outside

might find

managers.

improve

and

might fall tool

brand sales

idiot

mining tool

relationships.

behaviour,

warehouse

data

trends

predictive

explains

markets,

on discovering

example,

and the among

patterns,

mode, the end

significant

focus

For

the

In this

mining

mining has proven

buying

stock

guides

drink

data

some findings

a data

by the label

In fact,

mode, the

to find

methodologies

given

user

of cool

customer

analyse

example,

high regard

described

define

patterns

the

brand

results.

help

mining process, For

be held in

are commonly

more

among

not

data

expect.

favourite

might

relationships

of the

managers

relationship

yields

nature

business

describes

create

predict

a

advanced

future

customer

The next section

contains

a sample

of data

vendors.

asample of current data warehousing vendors

vendor

Product

Teradata

Teradatas

the

web Address EDW (Enterprise

market leaders.

innovations,

Data

Warehouse)

The company

and capabilities

is

has included

one

of

www.teradata.com

new tools,

such as Hadoop-based

technologies. Oracle

Oracle is

synonymous

warehouses. platform

with databases

Oracle

Exadata

that includes

flash

and Hybrid Columnar Amazon

Amazon data

MarkLogic

Copyright Editorial

review

2020 has

Data

Hub is

It is optimised

analytics

and interactive

analytics

Although

the term predictive

Learning. any

All suppressed

the

promise

bottom

line.


Redshift

is their

a Hadoop-based

for

fully

aws.amazon.com

managed

data storage

batch processing,

not

analytics

is

of predictive

be

copied, affect

used

overall

or

by

analytics predictive

scanned, the

offers

ways to

www.marklogic.co

queries.

Therefore,

materially

www.cloudera.com

advanced

SQL.

offers a NoSQL platform that

predictive

that

overheads

solution.

15.6.2

Cengage

I/O

cloud-based

semantic-based

deemed

for lower

way through

Amazon

perform

their

www.oracle.com

Web Services has led the

warehousing.

MarkLogic

improve

data

I/O.

solution.

of functionality,

with

for reduced

Enterprise

15

storage

now

an advanced

Compression

petabyte-scale Cloudera

and

Machine is

duplicated, learning

in experience.

many BI vendors

is

very

analytics

whole

or in Cengage

part.

Due Learning

attractive is

to

receiving

electronic reserves

to indicate for

rights, the

right

a lot

some to

third remove

many different

businesses

party additional

of

content

looking

marketing

may content

be

buzz;

suppressed at

any

time

levels

for

from if

the

subsequent

ways to vendors


Chapter

and

businesses

are dedicating

use

of advanced

mathematical,

high

degrees

learnt

earlier,

of accuracy.

use similar

and

answering models the

data

has

step

behaviours.

after

data

In fact,

origins

need to

profile

force

for the

based

evolution

on your

mining when

send

data

mining

their to

and

data

business

predictive

mining

while predictive

analytics

focuses

In

you

some

ways,

understand dropping

you

your

the term

data,

you

data

refers

to the

outcomes

with

analytics

mining focuses

on creating

can

on

actionable

of predictive

analytics

use the

mining

793

As you

predictive

Data

can think

Intelligence

analytics?

and

focus.

be traced

customer

modelling

back to

buying

and purchasing

what credit limit

to

the

banking

patterns

methodologies

information

a big

predictive

offer,

analytics

experiences. and

flyer

stimulus

and

in these

data to

and replacing

BI data

history,

a credit

which

offers

credit

as

predict

it

with the

card industries.

industries

used in

drive

and

customers

in

with the

as a

was one of the first

search

profile

many

received

and

loyalty

frequent

mining

Business

was

analytics

card

you are

today.

For

company

more likely

The

a critical

driving example,

can use data to accept,

and

offers.

Google

personalise customer

data In fact,

for

analytics

future

different

are

can

predict

determine

analytics

media sites.

and of

those

Predictive to

analytics

demographic

models to

to

once

Predictive

predict

with a slightly

events.

BI vendors

to

Databases

analytics.

of predictive customers

data,

and

mining;

most

between

but

BI area.

tools

capabilities.

of tools, of past

more alluring term predictive The

difference

behaviours

to this

modelling

predictive

sets

and what future

next logical

future

also

overlapping

predict

resources and

What is the

mining

the how

to

extensive statistical

15

way to

companies

affinity

card

Take

to

get

the

social

of the

Nowadays, keep

the

right

Companies

of

stored

on social

of organisations

and

credit

organisations

ones,

data

turned

ads as a way to increase

airline

many

media.

mountains

offered targeted

example

and

of

were used by all types

the

programs.

an attempt

harvest

that

Similar initiatives

up sales.

advent

card

use

which

to increase

industries

predictive

in turn

and

and

analytics

will increase

loyalty

and sales.6 Predictive

analytics employs mathematical and statistical algorithms, neural networks, artificial intelligence and other advanced modelling tools to create actionable predictive models based on available data. The algorithms used to build the predictive model are specific to certain types of problems and work with certain types of data. Therefore, it is important that the end user, who typically is trained in statistics and understands the business, applies the proper algorithms to the problem in hand. However, thanks to constant technology advances, modern BI tools automatically apply multiple algorithms to find the optimum model. Most predictive analytics models are used in areas such as customer relationships, customer service, customer retention, fraud detection, targeted marketing and optimised pricing. Predictive analytics can add value to an organisation in many different ways; for example, it can help optimise existing processes, identify hidden problems and anticipate future problems or opportunities. However, predictive analytics is not the secret sauce to fix all business problems. Managers should carefully monitor and evaluate the value of predictive analytics models to determine their return on investment.

So far, you have learnt about data warehouses and star schemas to model and store decision support data, and data analytics to extract knowledge from the data. A BI system uses all the previously mentioned components to provide decision support to all organisational users. In the next section, you will learn about a widely used BI style known as online analytical processing.

6 Analytics Insight. Available: www.analyticsinsight.net/

15.7 Online Analytical Processing

The need for more intensive decision support prompted the introduction of a new generation of tools. Those new tools, called online analytical processing (OLAP), create an advanced data analysis environment that supports decision making, business modelling and operations research. OLAP systems share three main characteristics. They:

Use multidimensional data analysis techniques
Provide advanced database support
Provide easy-to-use end-user interfaces.

Let's examine each of those characteristics.

15.7.1 Multidimensional Data Analysis Techniques

The most distinct characteristic of modern OLAP tools is their capacity for multidimensional analysis. In multidimensional analysis, data are processed and viewed as part of a multidimensional structure. This type of data analysis is particularly attractive to business decision makers because they tend to view business data as data that are related to other business data.

To better understand this view, let's examine how a business data analyst might investigate sales figures. In this case, he or she is probably interested in the sales figures as they relate to other business variables such as customers and time. In other words, customers and time are viewed as different dimensions of sales. Figure 15.18 illustrates how the operational (one-dimensional) view differs from the multidimensional view of sales.

As you examine Figure 15.18, note that the tabular (operational) view of sales data is not well suited to decision support because the relationship between INVOICE and LINE does not provide a business perspective of the sales data. On the other hand, the end user's view of sales data from a business perspective is more closely represented by the multidimensional view of sales than by the tabular view of separate tables. Note also that the multidimensional view allows end users to consolidate or aggregate data at different levels: total sales figures by customers and by date. Finally, the multidimensional view of data allows a business data analyst easily to switch business perspectives (dimensions) from sales by customer to sales by division, by region, and so on.

Multidimensional data analysis techniques are augmented by the following functions:

Advanced data presentation functions: 3-D graphics, pivot tables, crosstabs, data rotation and three-dimensional cubes. Such facilities are compatible with desktop spreadsheets, statistical packages and query and report-writer packages.

Advanced data aggregation, consolidation and classification functions that allow the data analyst to create multiple data aggregation levels, slice-and-dice data (see section 15.5.3), and drill-down and roll-up data across different dimensions and aggregation levels. For example, aggregating data across the time dimension (by week, month, quarter and year) allows the data analyst to drill down and roll up across time dimensions.

Advanced computational functions: Business-orientated variables (market share, period comparisons, sales margins, product margins and percentage changes), financial and accounting ratios (profitability, overhead, cost allocations and returns), and statistical and forecasting functions. These functions are provided automatically and the end user does not need to redefine their components each time they are accessed.

Advanced data modelling functions: Support for what-if scenarios, variable assessment, variable contributions to outcome, linear programming and other modelling tools.

Figure 15.18 Operational vs multidimensional view of sales

[Figure: database Ch15_Text. Operational view: tables DW_INVOICE (INV_NUM, INV_DATE, CUS_NAME, INV_TOTAL) and DW_LINE (INV_NUM, LINE_NUM, PROD_DESCRIPTION, LINE_PRICE, LINE_QUANTITY, LINE_AMOUNT). Multidimensional view of sales: a customer dimension (Dartonik, Summer Lake, Trydon) against a time dimension (15-May-19, 16-May-19), with totals for each. Sales are located in the intersection of a customer row and time column. Aggregations are provided for both dimensions.]
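The multidimensional view of sales described in this section can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the rows below are hypothetical sample data, and the function name `multidimensional_view` is invented for this sketch. It places each sale at the intersection of a customer and a date and keeps running totals along both dimensions.

```python
from collections import defaultdict

# Hypothetical invoice rows: (customer, invoice date, invoice total).
# Illustrative sample data, not a transcription of the figure.
invoices = [
    ("Dartonik", "15-May", 1400.00),
    ("Summer Lake", "15-May", 1800.00),
    ("Dartonik", "16-May", 1350.00),
    ("Summer Lake", "16-May", 3100.00),
    ("Trydon", "16-May", 400.00),
]


def multidimensional_view(rows):
    """Place each sale at the intersection of a customer row and a date column."""
    cells = defaultdict(float)        # (customer, date) -> sales
    by_customer = defaultdict(float)  # aggregation along the customer dimension
    by_date = defaultdict(float)      # aggregation along the time dimension
    for customer, date, total in rows:
        cells[(customer, date)] += total
        by_customer[customer] += total
        by_date[date] += total
    return dict(cells), dict(by_customer), dict(by_date)


cells, by_customer, by_date = multidimensional_view(invoices)
grand_total = sum(by_customer.values())
```

Switching the business perspective (for example, from sales by customer to sales by region) would amount to grouping on a different key, which is why the same data can serve several dimensions.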

Predictive modelling allows the system to build advanced statistical models to predict future values (business outcomes) with a high percentage of accuracy.

15.7.2 Advanced Database Support

To deliver efficient decision support, OLAP tools must have advanced data access features. Such features include:

Access to many different kinds of DBMSs, flat files and internal and external data sources

Access to aggregated data warehouse data as well as to the detail data found in operational databases

Advanced data navigation features such as drill-down and roll-up

Rapid and consistent query response times

The ability to map end-user requests, expressed in either business or model terms, to the appropriate data source and then to the proper data access language (usually SQL). The query code must be optimised to match the data source, regardless of whether the source is operational or data warehouse data

Support for very large databases. As already explained, the data warehouse can easily and quickly grow to multiple terabytes in size.

To provide a seamless interface, OLAP tools map the data elements from the data warehouse and from the operational database to their own data dictionaries. These metadata are used to translate end-user data analysis requests into the proper (optimised) query codes, which are then directed to the appropriate data source(s).
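The translation of end-user requests into query code through a data dictionary can be illustrated with a minimal Python sketch. Everything named here is an assumption made for the example: the dictionary contents, the table `DW_SALES`, its columns and the function `build_query` are hypothetical and do not describe any particular OLAP product.

```python
# Hypothetical data dictionary: business term -> (data source, SQL expression).
# All table and column names are assumptions made for this sketch.
DATA_DICTIONARY = {
    "sales": ("DW_SALES", "SUM(SALE_AMOUNT)"),
    "region": ("DW_SALES", "REGION_NAME"),
    "month": ("DW_SALES", "SALE_MONTH"),
}


def build_query(measure, dimensions):
    """Translate a request phrased in business terms into SQL text."""
    if not dimensions:
        raise ValueError("at least one dimension is required")
    source, measure_expr = DATA_DICTIONARY[measure]
    columns = []
    for term in dimensions:
        dim_source, column = DATA_DICTIONARY[term]
        if dim_source != source:
            raise ValueError(f"{term!r} is not available in {source}")
        columns.append(column)
    group_by = ", ".join(columns)
    return (
        f"SELECT {group_by}, {measure_expr} AS {measure.upper()} "
        f"FROM {source} GROUP BY {group_by}"
    )


sql = build_query("sales", ["region", "month"])
```

The end user only names "sales", "region" and "month"; the metadata decide which source is queried and which physical columns appear in the generated statement.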

15.7.3 Easy-to-Use End-User Interface

The end-user analytical interface is one of the most critical OLAP components. When properly implemented, an analytical interface permits the user to navigate the data in a way that simplifies and accelerates decision making or data analysis.

Advanced OLAP features become more useful when access to them is kept simple. OLAP tool vendors learnt this lesson early and have equipped their sophisticated data extraction and analysis tools with easy-to-use graphical interfaces. Many of the interface features are borrowed from previous generations of data analysis tools that are already familiar to end users.

Because many analysis and presentation functions are common to desktop spreadsheet packages, most OLAP vendors have closely integrated their systems with spreadsheets such as Microsoft Excel. Using the features available in graphical end-user interfaces, OLAP simply becomes another option within the spreadsheet menu bar, as shown in Figure 15.19. This seamless integration is an advantage for OLAP systems and for spreadsheet vendors because end users gain access to advanced data analysis features by using familiar programs and interfaces. Therefore, additional training and development costs are minimised.

Figure 15.19 Integration of OLAP with a spreadsheet program

15.7.4 OLAP Architecture

OLAP operational characteristics can be divided into three main modules:

Graphical user interface (GUI)
Analytical processing logic
Data processing logic.

Figure 15.20 illustrates OLAP's architectural components.

As Figure 15.20 illustrates, OLAP systems are designed to use both operational and data warehouse data. Although Figure 15.20 shows the OLAP system components located on a single computer, this single-user scenario is only one of many. In fact, one problem with the installation shown here is that each data analyst must have a powerful computer on which to store the OLAP system and perform all data processing locally. In addition, each analyst uses a separate copy of the data. Therefore, the data copies must be synchronised to ensure that analysts are working with the same data. In other words, each end user must have his or her own private copy (extract) of the data and programs, thus returning to the islands of information problems discussed in Chapter 1, The Database Approach. This approach does not provide the benefits of a single business image shared among all users.

A more common and practical architecture is one in which the OLAP GUI runs on client workstations, while the OLAP engine, or server, composed of the OLAP analytical processing logic and OLAP data-processing logic, runs on a shared computer. In that case, the OLAP server will be a front end to the data warehouse's decision support data. This front end or middle layer (because it sits between the data warehouse and the end-user GUI) accepts and processes the data-processing requests generated by the many end-user workstations.

Figure 15.20 OLAP architecture

[Figure: operational data and external data are loaded through ETL (extraction, transformation and loading) into the data warehouse. The OLAP engine, made up of analytical processing logic and data-processing logic, provides a front end to the data warehouse, with alternate direct access to operational and warehouse data. The OLAP GUI offers multiple interfaces and application plug-ins: advanced reporting, spreadsheet reports (Excel plug-in), Access reports (OLAP plug-in), dashboards and mobile BI. Source: Course Technology/Cengage Learning. Credit: Oleksiy Mark/Shutterstock.com]

Figure 15.21 illustrates an OLAP server with local miniature data marts. As illustrated in Figure 15.21, the OLAP system could merge the data warehouse and data mart approaches by storing extracts of the data warehouse at end-user workstations. The objective is to increase the speed of data access and data visualisation (the graphic representations of data trends and characteristics). The logic behind this approach is the assumption that most end users usually work with fairly small, stable data warehouse data subsets. For example, a sales analyst is most likely to work with sales data, whereas a customer representative is likely to work with customer data.

Whatever the arrangement of the OLAP components, one thing is certain: multidimensional data must be used. But how are multidimensional data best stored and managed? OLAP proponents are sharply divided. Some favour the use of relational databases to store the multidimensional data; others argue for the superiority of specialised multidimensional databases to store multidimensional data. The basic characteristics of each approach will be examined next.

Figure 15.21 OLAP server with local miniature data marts

[Figure: multiple OLAP clients (the OLAP GUIs of the Sales, Marketing, Manufacturing and Procurement departments) accessing the OLAP server, which contains the analytical processing logic and the data processing logic and draws on operational data and the data warehouse. Data are extracted from the data warehouse to local data marts (customers, marketing, production, vendors), which provides faster processing. Source: Course Technology/Cengage Learning]

15.7.5 Relational OLAP

Relational online analytical processing (ROLAP) provides OLAP functionality by using relational databases and familiar relational query tools to store and analyse multidimensional data. That approach builds on existing relational technologies and represents a natural extension to all of the companies that already use relational database management systems within their organisations. ROLAP adds the following extensions to traditional RDBMS technology:

Multidimensional data schema support within the RDBMS
Data access language and query performance that are optimised for multidimensional data
Support for very large databases (VLDBs).

Multidimensional Data Schema Support within the RDBMS

Relational technology uses normalised tables to store data. The reliance on normalisation as the design methodology for relational databases is seen as a stumbling block to its use in OLAP systems. Normalisation divides business entities into smaller pieces to produce the normalised tables. For example, sales data components might be stored in four or five different tables. The reason for using normalised tables is to reduce redundancies, thereby eliminating data anomalies and to facilitate data updates.

Unfortunately, for decision support purposes, it is easier to understand data when they are seen with respect to other data. Given that view of the data environment, this book has stressed that decision support data tend to be non-normalised, duplicated and pre-aggregated. These characteristics seem to preclude the use of standard relational design techniques and RDBMSs as the foundation for multidimensional data.

Fortunately for those heavily invested in relational technology, ROLAP uses a special design technique to enable RDBMS technology to support multidimensional data representations. This special design technique is known as a star schema, which is covered in detail in Section 15.5.

The star schema is designed to optimise data query operations rather than data update operations. Naturally, changing the data design foundation means that the tools used to access such data will have to change. End users who are familiar with the traditional relational query tools will discover that those tools do not work efficiently with the new star schema. However, ROLAP saves the day by adding support for the star schema when familiar query tools are used. ROLAP provides advanced data analysis functions and improves query optimisation and data visualisation methods.

Data Access Language and Query Performance Optimised for Multidimensional Data

Another criticism of relational databases is that SQL is not suited for performing advanced data analysis. Most decision support data requests require the use of multiple-pass SQL queries or multiple nested SQL statements. To answer this criticism, ROLAP extends SQL so that it can differentiate between access requirements for data warehouse data (based on the star schema) and operational data (normalised tables). In that way, a ROLAP system is able to generate the SQL code required to access the star schema data.

Query performance is also improved because the query optimiser is modified to identify the SQL code's intended query targets. For example, if the query target is the data warehouse, the optimiser passes the requests to the data warehouse. However, if the end user performs drill-down queries against operational data, the query optimiser identifies that operation and properly optimises the SQL requests before passing them through to the operational DBMS.

Another source of improved query performance is the use of advanced indexing techniques such as bitmapped indexes within relational databases. As you will recall from Chapter 11, Conceptual, Logical, and Physical Database Design, a bitmapped index is based on 0 and 1 bits to represent a given condition. For example, if the REGION attribute in Figure 15.3 has only four outcomes (North, South, East and West), those outcomes may be represented as shown in Table 15.11. (Only the first ten rows from Figure 15.3 are represented in Table 15.11. The 1 represents bit on and the 0 represents bit off. For example, to represent a row with a REGION attribute of East, only the East bit would be on. Note that each row must be represented in the index table.)

Table 15.11 Bitmap representation of region values

[Table: columns North, South, East and West; ten rows of 0 and 1 bits, with exactly one bit on in each row.]
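The idea behind a bitmapped index can be sketched in Python as follows. This is a simplified illustration, not the storage format of any DBMS: the ten REGION values are hypothetical, the function names are invented for the sketch, and each bitmap is held in a plain Python integer whose bit i stands for row i.

```python
# Hypothetical REGION column for ten rows (illustrative values only).
regions = ["East", "East", "North", "North", "North",
           "South", "South", "South", "West", "West"]


def build_bitmap_index(values):
    """Return {distinct value: int bitmap}; bit i is on when row i holds the value."""
    index = {}
    for row, value in enumerate(values):
        index[value] = index.get(value, 0) | (1 << row)
    return index


def matching_rows(bitmap, row_count):
    """List the row positions whose bit is on."""
    return [row for row in range(row_count) if (bitmap >> row) & 1]


index = build_bitmap_index(regions)
east_rows = matching_rows(index["East"], len(regions))
# A condition such as REGION IN ('East', 'West') becomes a bitwise OR.
east_or_west = matching_rows(index["East"] | index["West"], len(regions))
```

Because the attribute has only four possible values, four short bitmaps cover every row, and combined conditions reduce to bitwise operations on those bitmaps.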

As you examine Table 15.11, note that the index takes a minimum amount of space. Therefore, bitmapped indexes are more efficient at handling large amounts of data than are the indexes typically found in many relational databases. However, do keep in mind that bitmapped indexes are primarily used in situations where the number of possible values for an attribute (in other words, the attribute domain) is fairly small. For example, REGION has only four outcomes in this example. Marital status (married, single, widowed, divorced) would be another good bitmapped index candidate, as would gender (M or F).

Early examples of ROLAP tools are mainly traditional client/server products in which the end-user interface, the analytical processing and the data processing took place on different computers. Figure 15.22 shows the interaction of the client/server ROLAP components.

Figure 15.22 ROLAP client/server architecture

[Figure: ROLAP system. Several ROLAP GUI clients connect to the ROLAP server (ROLAP analytical processing logic and ROLAP data-processing logic), which reaches data warehouse data and operational data. The ROLAP server interprets end-user requests and builds complex SQL queries required to access the data warehouse. If an end user requests a drill-down operation, the ROLAP server builds the SQL code required to access the operational database. The GUI front end runs on the client computer and passes data-analysis requests to the ROLAP server. The GUI receives data replies from the ROLAP server and formats them according to the end user's presentation needs.]

Support for Very Large Databases

Recall that support for VLDBs is a requirement for DSS databases. Therefore, when the relational database is used in a DSS role, it must also be able to store very large amounts of data. Both the storage capability and the process of loading data into the database are crucial. Therefore, the RDBMS must have the proper tools to import, integrate and populate the data warehouse with data. Decision support data are normally loaded in bulk (batch) mode from the operational data. However, batch operations require that both the source and the destination databases be reserved (locked). The speed of the data-loading operations is important, especially when you realise that most operational systems run 24 hours a day, 7 days a week, 52 weeks a year. Therefore, the window of opportunity for maintenance and batch loading is open only briefly, typically during slack periods.

With an open client/server architecture, ROLAP provides advanced decision support capabilities that are scalable to the entire enterprise. Clearly, ROLAP is a logical choice for companies that already use relational databases for their operational data. Given the size of the relational database market, it is hardly surprising that most current RDBMS vendors have extended their products to support data warehouses.
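To make the ROLAP idea of generating SQL against a star schema more concrete, here is a small sketch using Python's standard `sqlite3` module. It is an assumption-laden toy, not how a commercial ROLAP server works: the tables `TIME_DIM` and `SALES_FACT`, their columns, the sample rows and the helper `sales_by` are all hypothetical.

```python
import sqlite3

# In-memory star schema: one fact table and one (hypothetical) time dimension.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE TIME_DIM (TIME_ID INTEGER PRIMARY KEY, DAY TEXT, MONTH TEXT);
    CREATE TABLE SALES_FACT (TIME_ID INTEGER REFERENCES TIME_DIM, AMOUNT REAL);
    INSERT INTO TIME_DIM VALUES (1, '2019-05-15', '2019-05'),
                                (2, '2019-05-16', '2019-05'),
                                (3, '2019-06-01', '2019-06');
    INSERT INTO SALES_FACT VALUES (1, 100.0), (1, 50.0), (2, 25.0), (3, 10.0);
""")


def sales_by(level, month=None):
    """Build and run the SQL for one aggregation level of the time dimension."""
    if level not in ("DAY", "MONTH"):   # whitelist: level is spliced into SQL text
        raise ValueError(level)
    sql = (f"SELECT t.{level}, SUM(f.AMOUNT) FROM SALES_FACT f "
           "JOIN TIME_DIM t ON t.TIME_ID = f.TIME_ID ")
    params = ()
    if month is not None:               # drill-down restricted to one month
        sql += "WHERE t.MONTH = ? "
        params = (month,)
    sql += f"GROUP BY t.{level} ORDER BY t.{level}"
    return conn.execute(sql, params).fetchall()


rolled_up = sales_by("MONTH")                    # roll-up: totals per month
drilled_down = sales_by("DAY", month="2019-05")  # drill-down into one month
```

The end user asks only for a level of the time dimension; the join between the fact table and the dimension table, and the grouping, are produced by the server-side code.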

15.7.6 Multidimensional OLAP

Multidimensional online analytical processing (MOLAP) extends OLAP functionality to multidimensional database management systems (MDBMSs). (An MDBMS uses special proprietary techniques to store data in matrix-like n-dimensional arrays.) MOLAP's premise is that multidimensional databases are best suited to manage, store and analyse multidimensional data. Most of the proprietary techniques used in MDBMSs are derived from engineering fields such as computer-aided design/computer-aided manufacturing (CAD/CAM) and geographic information systems (GIS).

Conceptually, MDBMS end users visualise the stored data as a three-dimensional cube known as a data cube. The location of each data value in the data cube is a function of the x-, y- and z-axes in a three-dimensional space. The x-, y- and z-axes represent the dimensions of the data value. The data cubes can grow to n number of dimensions, thus becoming hypercubes. Data cubes are created by extracting data from the operational databases or from the data warehouse. One important characteristic of data cubes is that they are static; that is, they are not subject to change and must be created before they can be used. Data cubes cannot be created by ad hoc queries. Instead, you query precreated cubes with defined axes; for example, a cube for sales will have the product, location and time dimensions, and you can query only those dimensions. Therefore, the data cube creation process is critical and requires in-depth front-end design work. The front-end design work may be well justified because MOLAP databases are known to be much faster than their ROLAP counterparts, especially when dealing with small to medium data sets. To speed up data access, data cubes are normally held in memory in what is called the cube cache. (A data cube is only a window to a predefined subset of data in the database. A data cube and a database are not the same thing.) Since MOLAP also benefits from a client/server infrastructure, the cube cache can be located at the MOLAP server, at the MOLAP client, or in both locations. Figure 15.23 shows the basic MOLAP architecture.
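The notion that a value's location in a data cube is a function of its position on the x-, y- and z-axes can be illustrated with a small Python sketch. It uses nested lists rather than any proprietary MDBMS storage technique, and the dimension members, amounts and function names are hypothetical.

```python
# Hypothetical dimension members for a three-dimensional sales cube.
products = ["Mouse", "Router"]
locations = ["North", "South"]
periods = ["Q1", "Q2", "Q3"]

# The cube is precreated with fixed axes; None marks an empty cell.
cube = [[[None for _ in periods] for _ in locations] for _ in products]


def load(product, location, period, amount):
    """Store a value at the cell addressed by its three coordinates."""
    x = products.index(product)
    y = locations.index(location)
    z = periods.index(period)
    cube[x][y][z] = amount


def cell(product, location, period):
    """Read the value located at (x, y, z)."""
    return cube[products.index(product)][locations.index(location)][periods.index(period)]


def slice_by_product(product):
    """Fix the product axis and return the location-by-period slice."""
    return cube[products.index(product)]


load("Mouse", "North", "Q1", 900.0)
load("Mouse", "South", "Q2", 450.0)
load("Router", "North", "Q3", 2050.0)
```

Note that the axes are fixed when the cube is created: querying by a dimension that is not one of the three axes would require building a new cube, which mirrors the static nature of data cubes described above.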

Figure 15.23 MOLAP client/server architecture

[Figure: MOLAP system. Several MOLAP GUI clients connect to the MOLAP server (MOLAP analytical processing logic and MOLAP data-processing logic), which works with the multidimensional database (MDBMS) and its data cube. The data cube is created within predefined dimensions from data warehouse data and operational data (RDBMS). The MOLAP engine receives data requests from end users and translates them into data cube requests that are passed to the MDBMS. The MOLAP GUI allows end users to interact with the MOLAP server and request data for analysis.]

Because the data cube is predefined with a set number of dimensions, the addition of a new dimension requires that the entire data cube be re-created. This re-creation process is a time-consuming operation. Therefore, when data cubes are created too often, the MDBMS loses some of its speed advantage over the relational database. And although MDBMSs have performance advantages over relational databases, the MDBMS is best suited to small and medium data sets. Scalability is somewhat limited because the size of the data cube is restricted to avoid lengthy data access times caused by having less work space (memory) available for the operating system and the application programs. In addition, the MDBMS makes use of proprietary data storage techniques that, in turn, require proprietary data access methods using a multidimensional query language.

804

part

VI

Database

Management

Multidimensional data analysis is also affected by how the database system handles sparsity. Sparsity is a measurement of the density of the data held in the data cube. Sparsity is computed by dividing the total number of actual values in the cube by the total number of cells in the cube. Since the data cube's dimensions are predefined, not all cells are populated. In other words, some cells are empty. Returning to the sales example, there may be many products that are not sold during a given time period in a given location. In fact, you will often find that fewer than 50 per cent of the data cube's cells are populated. In any case, multidimensional databases must handle sparsity effectively to reduce processing overhead and resource requirements.
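The sparsity definition above can be turned into a short calculation. The following Python sketch is illustrative only: the `cube_sparsity` function, the `sales_cube` dictionary and its product, location and period labels are assumptions made for this example and are not part of the SaleCo database.

```python
from math import prod


def cube_sparsity(cells, dimension_sizes):
    """Return populated cells divided by total cells, per the chapter's definition."""
    total_cells = prod(dimension_sizes)
    if total_cells == 0:
        raise ValueError("every dimension needs at least one member")
    # A cell counts as populated only if it actually holds a value.
    populated = sum(1 for value in cells.values() if value is not None)
    return populated / total_cells


# Hypothetical cube: 2 products x 2 locations x 2 periods = 8 cells,
# of which only 3 hold a sales value.
sales_cube = {
    ("P1", "North", "Q1"): 120.0,
    ("P1", "South", "Q1"): 80.0,
    ("P2", "North", "Q2"): 45.5,
}

ratio = cube_sparsity(sales_cube, dimension_sizes=(2, 2, 2))
print(f"{ratio:.3f}")
```

Under the chapter's definition, a result below 0.5 corresponds to the common case in which fewer than half of the cube's cells are populated.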

Note: You can read more about data sparsity and bitmapped indexes in Chapter 11, Conceptual, Logical, and Physical Database Design.

Relational proponents also argue that using proprietary solutions makes it difficult to integrate the MDBMS with other data sources and tools used within the enterprise. Although it takes a substantial investment of time and effort to integrate the new technology and the existing information systems architecture, MOLAP may be a good solution for those clients in which small-to medium-sized databases are the norm and application software speed is critical.

15.7.7 Relational vs Multidimensional OLAP

Table 15.12 summarises some OLAP and MOLAP pros and cons. Keep in mind that the selection of one or the other often depends on the evaluator's point of view. For example, a proper evaluation of OLAP must include price, supported hardware platforms, compatibility with the existing DBMS, programming requirements, performance and availability of administrative tools. The summary in Table 15.12 provides a useful starting point for comparison.

Table 15.12 Relational vs multidimensional OLAP

Characteristic | ROLAP | MOLAP
Schema | Uses star schema. Additional dimensions can be added dynamically. | Uses data cubes. Multidimensional arrays, row stores, column stores. Additional dimensions require re-creation of the data cube.
Database size | Medium to large | Large
Architecture | Client/server. Standards-based. Open. | Client/server. Open or proprietary depending on vendor.
Access | Supports ad hoc requests. Unlimited dimensions. | Limited to predefined dimensions. Proprietary access languages.
Speed | Good with small data sets; average for medium to large data sets. | Faster for large data sets with predefined dimensions.

Nevertheless, ROLAP and MOLAP vendors are working towards the integration of their respective solutions within a unified decision support framework. Many OLAP products are able to handle tabular and multi-dimensional data with the same ease. For example, if you are using Excel OLAP functionality, as shown in Figure 15.24, you can access relational OLAP data in a SQL server as well as cube (multidimensional data) in the local computer. In the meantime, many relational databases have successfully extended SQL to support OLAP tools.

15.8 SQL Analytic Functions

The proliferation of OLAP tools has fostered the development of SQL extensions to support multi-dimensional data analysis. Most SQL innovations are the result of vendor-centric product enhancements. However, many of the innovations have made their way into standard SQL. This section will introduce some of the new SQL extensions that have been created to support OLAP-type data manipulations. The SaleCo snowflake schema shown in Figure 15.24 demonstrates the use of the SQL extensions. Note that this snowflake schema has a central DWSALESFACT fact table and three dimension tables: DWCUSTOMER, DWPRODUCT and DWTIME. The central fact table represents daily sales by product and customer. However, as you examine the snowflake schema shown in Figure 15.24 more carefully, you see that the DWCUSTOMER and DWPRODUCT dimension tables have their own dimension tables: DWREGION and DWVENDOR.

[Figure 15.24 SaleCo snowflake schema]

Keep in mind that a database is at the core of all data warehouses. Therefore, all SQL commands (such as CREATE, INSERT, UPDATE, DELETE and SELECT) will work in the data warehouse as expected. However, most queries you run in a data warehouse tend to include data groupings and aggregations over multiple columns. That's why this section introduces two extensions to the GROUP BY clause that are particularly useful: ROLLUP and CUBE. In addition, you will learn about using materialised views to store pre-aggregated rows in the database.

Online Content: The script files used to populate the database and run the SQL commands are available on the online platform for this book.

Note: This section uses the Oracle RDBMS to demonstrate the use of SQL extensions to support OLAP functionality. If you use a different DBMS, consult the documentation to verify whether the vendor supports similar functionality and what the proper syntax is for your DBMS.

15.8.1 The ROLLUP Extension

The ROLLUP extension is used with the GROUP BY clause to generate aggregates by different dimensions. As you know, the GROUP BY clause generates only one aggregate for each new value combination of attributes listed in the GROUP BY clause. The ROLLUP extension goes one step further; it enables you to get a subtotal for each column listed except for the last one, which gets a grand total instead. The syntax of the GROUP BY ROLLUP is as follows:

    SELECT column1 [, column2, ...], aggregate_function(expression)
    FROM table1 [, table2, ...]
    [WHERE condition]
    GROUP BY ROLLUP (column1, column2 [, ...])
    [HAVING condition]
    [ORDER BY column1 [, column2, ...]]

The order of the column list within the GROUP BY ROLLUP is very important. The last column in the list generates a grand total. All other columns generate subtotals. For example, Figure 15.25 shows the use of the ROLLUP extension to generate subtotals by vendor and product.

[Figure 15.25 ROLLUP extension]

Note that Figure 15.25 shows the subtotals by vendor code and a grand total for all product codes. Contrast that with the normal GROUP BY clause that generates only the subtotals for each vendor and product combination rather than the subtotals by vendor and the grand total for all products. The ROLLUP extension is particularly useful when you want to obtain multiple nested subtotals for a dimension hierarchy. For example, within a location hierarchy, you can use ROLLUP to generate subtotals by region, province, city and store.
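For readers without access to a DBMS that supports ROLLUP, the grouping behaviour can be sketched in Python. This is an emulation under stated assumptions, not Oracle's implementation: the `rollup` function and the sample vendor and product rows are hypothetical. For ROLLUP (vendor, product), the sketch aggregates by (vendor, product), then by vendor alone, and finally over all rows, using None where SQL would show NULL in a rolled-up column.

```python
from collections import defaultdict


def rollup(rows, dims, measure):
    """Emulate GROUP BY ROLLUP(dims): sum `measure` over each prefix of dims."""
    result = []
    # depth = number of leading columns kept; depth 0 is the grand total.
    for depth in range(len(dims), -1, -1):
        totals = defaultdict(float)
        for row in rows:
            key = tuple(row[d] for d in dims[:depth])
            totals[key] += row[measure]
        for key in sorted(totals):
            # Pad rolled-up columns with None, like NULL in SQL output.
            padding = (None,) * (len(dims) - depth)
            result.append(key + padding + (totals[key],))
    return result


# Hypothetical sales rows (not taken from the SaleCo data).
sample_rows = [
    {"vendor": "V1", "product": "P1", "sales": 10.0},
    {"vendor": "V1", "product": "P2", "sales": 5.0},
    {"vendor": "V2", "product": "P1", "sales": 7.0},
    {"vendor": "V1", "product": "P1", "sales": 2.5},
]

rollup_rows = rollup(sample_rows, ["vendor", "product"], "sales")
for line in rollup_rows:
    print(line)
```

The sketch lists detail rows first, then vendor subtotals, then the grand total; that ordering is a simplification chosen here, and in SQL the row order would be controlled with ORDER BY.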

15.8.2 The CUBE Extension

The CUBE extension is also used with the GROUP BY clause to generate aggregates by the listed columns, including the last one. The CUBE extension enables you to get a subtotal for each column listed in the expression, in addition to a grand total for the last column listed. The syntax of the GROUP BY CUBE is as follows:

    SELECT column1 [, column2, ...], aggregate_function(expression)
    FROM table1 [, table2, ...]
    [WHERE condition]
    GROUP BY CUBE (column1, column2 [, ...])
    [HAVING condition]
    [ORDER BY column1 [, column2, ...]]

For example, Figure 15.26 shows the use of the CUBE extension to compute the sales subtotals by month and by product, as well as a grand total.

[Figure 15.26 CUBE extension]

In Figure 15.26, note that the CUBE extension generates the subtotals for each combination of month and product, in addition to subtotals by month and by product, as well as a grand total. The CUBE extension is particularly useful when you want to compute all possible subtotals within groupings based on multiple dimensions. Cross-tabulations are especially good candidates for application of the CUBE extension.
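The set of groupings that CUBE produces can also be sketched in Python. As before, this is a hypothetical emulation rather than the DBMS's own algorithm: the `cube` function and the month and product sample rows are assumptions for illustration. For n listed columns, the sketch aggregates over every subset of those columns, which gives 2^n groupings, and uses None where SQL would show NULL.

```python
from collections import defaultdict
from itertools import combinations


def cube(rows, dims, measure):
    """Emulate GROUP BY CUBE(dims): sum `measure` for every subset of dims."""
    result = {}
    for size in range(len(dims), -1, -1):
        for kept in combinations(dims, size):
            totals = defaultdict(float)
            for row in rows:
                # Columns outside the current subset collapse to None.
                key = tuple(row[d] if d in kept else None for d in dims)
                totals[key] += row[measure]
            result.update(totals)
    return result


# Hypothetical sales rows (not taken from the SaleCo data).
month_rows = [
    {"month": "Oct", "product": "P1", "sales": 10.0},
    {"month": "Oct", "product": "P2", "sales": 5.0},
    {"month": "Nov", "product": "P1", "sales": 7.0},
]

cube_totals = cube(month_rows, ["month", "product"], "sales")
for key, total in cube_totals.items():
    print(key, total)
```

Compared with the ROLLUP behaviour described in Section 15.8.1, the extra rows here are the subtotals by product alone, which is why CUBE suits cross-tabulations. The sketch assumes that the data itself contains no None values in the grouped columns.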

15.8.3 Materialised Views

The data warehouse normally contains fact tables that store specific measurements of interest to an organisation. Such measurements are organised by different dimensions. The vast majority of OLAP business analysis of everyday activities is based on comparisons of data that are aggregated at different levels, such as totals by vendor, by product and by store.

Since businesses normally use a predefined set of summaries for benchmarking, it is reasonable to predefine such summaries for future use by creating summary fact tables. However, creating multiple summary fact tables that use GROUP BY queries with multiple table joins could become a resource-intensive operation. In addition, data warehouses must also be able to maintain up-to-date summarised data at all times. So, what happens with the summary fact tables after new sales data have been added to the base fact tables? Under normal circumstances, the summary fact tables are re-created. This operation requires that the SQL code be run again to re-create all summary rows, even when only a few rows needed updating. Clearly, this is a time-consuming process.

To save query processing time, most database vendors have implemented additional functionality to manage aggregate summaries more efficiently. This new functionality resembles the standard SQL views for which the SQL code is predefined in the database. However, the added functionality difference is that the views also store the preaggregated rows, something like a summary table. For example, Microsoft SQL Server provides indexed views, while Oracle provides materialised views. This section explains the use of materialised views.

A materialised view is a dynamic table that contains not only the SQL query command to generate the rows, but also stores the actual rows. The materialised view is created the first time the query is run and the summary rows are stored in the table. The materialised view rows are automatically updated when the base tables are updated. That way, the data warehouse administrator creates the view but will not have to update the view. The use of materialised views is totally transparent to the end user. The OLAP end user can create OLAP queries, using the standard fact tables, and the DBMS query optimisation feature will automatically use the materialised views if those views provide better performance.

The basic syntax for the materialised view is as follows (note that the Oracle keyword is spelt MATERIALIZED):

    CREATE MATERIALIZED VIEW view_name
    BUILD {IMMEDIATE | DEFERRED}
    REFRESH {[FAST | COMPLETE | FORCE]} ON COMMIT
    [ENABLE QUERY REWRITE]
    AS select_query;

The BUILD clause indicates when the materialised view rows are actually populated. IMMEDIATE indicates that the materialised view rows are populated right after the command is entered. DEFERRED indicates that the materialised view rows are populated at a later time. Until then, the materialised view is in an unusable state. The DBMS provides a special routine that an administrator runs to populate materialised views.

The REFRESH clause lets you indicate when and how to update the materialised view when new rows are added to the base tables. FAST indicates that, whenever a change is made in the base tables, the materialised view updates only the affected rows. COMPLETE indicates that a complete update is made for all rows in the materialised view when the select query on which the view is based is rerun. FORCE indicates that the DBMS will first try to do a FAST update; otherwise, it will do a COMPLETE update. The ON COMMIT clause indicates that the updates to the materialised view will take place as part of the commit process of the underlying DML statement, that is, as part of the commit of the DML transaction that updated the base tables. The ENABLE QUERY REWRITE option allows the DBMS to use the materialised views in query optimisation.

To create materialised views, you must have specified privileges and you must complete specified prerequisite steps. As always, you need to defer to the DBMS documentation for the latest updates.
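The difference between the FAST and COMPLETE refresh options described above can be illustrated with a small Python model. This is a conceptual sketch only and does not reflect how Oracle implements materialised views: the `SummaryView` class, its `change_log` list (standing in for a materialised view log) and the sample rows are all hypothetical, and the model handles inserted rows only, not updates or deletes.

```python
from collections import defaultdict


class SummaryView:
    """Toy model of a materialised view holding total sales per product."""

    def __init__(self):
        self.base_rows = []   # stands in for the base fact table
        self.change_log = []  # stands in for a materialised view log
        self.totals = {}      # the stored, pre-aggregated rows

    def insert(self, product, amount):
        """Add a row to the base table and record it in the change log."""
        row = (product, amount)
        self.base_rows.append(row)
        self.change_log.append(row)

    def complete_refresh(self):
        """Rebuild every summary row by rescanning the whole base table."""
        totals = defaultdict(float)
        for product, amount in self.base_rows:
            totals[product] += amount
        self.totals = dict(totals)
        self.change_log.clear()

    def fast_refresh(self):
        """Update only the summary rows touched by the logged changes."""
        for product, amount in self.change_log:
            self.totals[product] = self.totals.get(product, 0.0) + amount
        self.change_log.clear()


view = SummaryView()
view.insert("P1", 10.0)
view.insert("P2", 4.0)
view.complete_refresh()   # initial population of the summary rows
view.insert("P1", 2.5)
view.fast_refresh()       # applies only the one logged change
print(view.totals)
```

In this model a fast refresh does work proportional to the number of logged changes, while a complete refresh rescans every base row. That contrast is the reason a change log on the base tables is useful when incremental refresh is wanted.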

In the case of Oracle versions 11g and 12c, you must create materialised view logs on the base tables of the materialised view. Figure 15.27 shows the code to create the SALES_MONTH_MV materialised view in the Oracle 11g RDBMS. Note that, in order to do this, you must have the appropriate database administration privileges set by the Oracle DBA (for example, by logging into Oracle as sysdba). If you do not have the appropriate privileges, then you will not be able to create a materialised view.

[Figure 15.27 Creating a materialised view in Oracle 11g using Oracle SQL*Plus as a DBA]

The materialised view in Figure 15.27 computes the monthly total units sold and the total sales aggregates by product. As you can see in Figure 15.27, the SALES_MONTH_MV materialised view is configured to update automatically after each change in the base tables. Note that the last row of SALES_MONTH_MV indicates that, during October 2015, the sales of product SM-18277 are three units, for a total of 20.97.

Although all of the examples in this section focus on SQL extensions to support OLAP reporting in an Oracle DBMS, you have seen just a small fraction of the many business intelligence features currently

provided by most DBMS vendors. For example, most vendors provide rich graphical user interfaces to manipulate, analyse and present the data in multiple formats. Figure 15.28 shows two sample screens, one for Oracle and one for Microsoft OLAP products.

[Figure 15.28 Sample OLAP applications: Oracle OLAP; Microsoft SQL Server Analysis Services]

15.9 Data Visualisation

We have all heard the saying, a picture is worth a thousand words, and this has never been more accurate than in data visualisation. Data visualisation is the process of abstracting data to provide a visual data representation that enhances the user's ability to comprehend the meaning of the data. The goal of data visualisation is to allow the user to quickly and efficiently see the data's big picture by identifying trends, patterns and relationships. Tables with hundreds, thousands, or millions of rows of data cannot be processed by the human mind in a meaningful way. Providing summarised tabular data to managers does not give them enough insight into the meaning of the data to make informed decisions. Data visualisation encodes the data into visually rich formats (mostly graphical) that provide at-a-glance insight into overall trends, patterns and possible relationships.

Data visualisation techniques range from simple to very complex, and many are familiar. Such techniques include pie charts, line graphs, bar charts, bubble charts, scatter plots, Gantt charts, heat maps, histograms, time series plots, steps charts, waterfall charts, bubble maps, donut charts and many more. The tools used in data visualisation range from a simple spreadsheet (such as Microsoft Excel) to advanced data visualisation software such as Tableau, Microsoft Power BI, Domo and Google Analytics.7

7 Pam Baker and Oliver Rist, The Best Data Visualization Tools of 2019, PC Magazine, July 24, 2018. Available: huk.pcmag.com/cloud-services/83744/the-best-data-visualization-tools.

Common productivity tools such as Microsoft Excel can often provide surprisingly powerful data visualisations. Excel has long included basic charting and PivotTable and PivotChart capabilities for visualising spreadsheet data. More recently, the introduction of the PowerPivot add-in has eliminated row and column data limitations and allows for the integration of data from multiple sources. This puts powerful data visualisation capabilities within reach of most business users.

For example, Figure 15.29 shows how Excel could be used to visualise sales data. The top of the report shows a simple summary table with totals of monthly sales by product. The bottom of the report shows a visual representation of the same data as a line plot of sales by product and month. Looking at the table, a manager might take a few minutes to figure out which products are the top sellers and which product sales are trending up or down. However, by looking at the line plot, those questions can often be answered quickly through a simple visual analysis. We can immediately deduce that there are three products that sell more than the rest and that, of those, two are trending down and one is trending up. What about the rest of the products? Their sales remain constant through the year.

[Figure 15.29 Microsoft Excel sales data report]

The above, albeit simple, example shows the power of data visualisation; it shows how end users can quickly gain insight into their data using a simple graphical representation.

15.9.1 The Need for Data Visualisation

From the previous discussion you might think that data visualisation is nothing new, and you are correct to a certain degree. After all, spreadsheets and graphics libraries have been around for a while. What has changed is the development of Big Data and business intelligence. The reality is that, in the current business climate, companies are trying to find a competitive edge by mining large amounts of data. Tools that facilitate and enhance the understanding of large amounts of data have become

This new data visualisation conveys at least two additional insights into the sales data:

Comparative sales volumes, as shown by the size of the bubbles. Larger total sales values produce larger bubbles.

Geographic market penetration, as shown by the density of the bubbles on the map. The visualisation makes it easier for a manager to identify the region (northeastern) that has the greatest sales penetration.

Furthermore, the manager could click on any of the bubbles in a given region to get more detailed sales information. Also, by clicking on the map, the end user can zoom in and zoom out to get more detail. The ability of the end user to drill down, drill up, filter, etc. against the data is one of the many advantages of the current breed of data visualisation tools.

Note: Data visualisation plays an important role in discovering and understanding the meaning of data, and new ways to visualise data are constantly being developed. For example, see the video from Dr Hans Rosling (www.youtube.com/watch?v=jbkSRLYSojo), in which he uses data visualisation to present the history of world population and public health over the past 200 years.

Another advantage of data visualisation is that it makes it easier to understand large amounts of data quickly. This is a very important issue in Big Data analysis, where the amounts of data are even larger!

However, data visualisation is just a tool, and not an end in itself. As a communication tool, it allows end users to explore data, gain insights and discover the meaning hidden in the data. As with any communication tool, an effective message has to be presented within a context. It is also important to understand that good data visualisation does not replace rigorous data analysis. As we have seen in this chapter, data has to be properly vetted, validated, processed (distilled into data points) and organised, because using bad data can lead to bad decisions. Data visualisation can be used with other tools such as statistics, data modelling and predictive modelling.

15.9.2 The Science of Data Visualisation

Data visualisation has its roots in the cognitive sciences. Cognitive science is the study of how the human brain receives, interprets, organises and processes information. Broadly speaking, cognitive science is a multidisciplinary science that includes psychology, neurology, neuroscience, linguistics, philosophy, anthropology and many other fields. Specifically, the science of data visualisation investigates how our brains process visual data and how our senses connect with the external world. This is visual communication.

To learn about how our brains process visual data, let's start with a simple exercise: looking at Figure 15.31, how many soccer balls are in Panel A? How many are in Panel B? Which answer was quicker/easier? Why? Almost all people would say Panel B, because the way that data is presented, with grouped objects, relates to the way the human brain is wired, and that makes it quicker to process.

[Figure 15.31 The power of visual communication]

What constitutes a good data visualisation? That is a difficult question to answer because data visualisation can be seen as both an art and a science. In other words, data visualisation is concerned with both form and function. Form means using the proper visual construct, and function means applying the correct data transformations. Remember that the purpose of data visualisation is to communicate the meaning of data easily.

Over the past few decades, plenty of research has been done on data visualisation. Data visualisation has evolved to become a very robust discipline. As a discipline, data visualisation can be studied as a group of visual communication techniques used to explore and discover data insights by applying:

Pattern recognition: Visually identifying trends, distribution and relationships.

Spatial awareness: Use of size and orientation to compare and relate data.

Aesthetics: Use of shapes and colours to highlight and contrast data composition and relationships.

In general, data visualisation uses five characteristics: shape, colour, size, position and grouping/order to convey and highlight the meaning of the data. When used correctly, data visualisation can tell the story behind the data. Here is another example that uses data visualisation to explore data and quickly provide some useful data insights. In this case, we are going to use vehicle crash data for the state of Iowa, available at https://catalog.data.gov/. The data set contains data on car accidents in the US State of Iowa from 2010 to early 2015. Figure 15.32 contains a visualisation of this data set using Tableau.

[Figure 15.32 Vehicle crash analysis]

Note: There are several public sources of large data sets that you could use to practise visualisations. Some of the most common sources are:

http://catalog.data.gov
http://data.worldbank.org
http://aws.amazon.com/datasets
https://data.medicare.gov
www.faa.gov/data_research/
https://data.world/
www.cdc.gov/nchs/data_access/

For some good examples of data visualisations, see the Centers for Disease Control and Prevention, Data Visualisation Gallery, at www.cdc.gov/nchs/data-visualization/

This visualisation includes three graphs (line, bar, heat map) and some filters. Looking at this visualisation, we can quickly determine that a significant number of car accidents involved single-occupant vehicles driving on two-lane roads where the speed limit is 90 km/h. We can also determine that the majority of accidents did not involve alcohol. Finally, we could also see that there seems to be a slight increase in vehicle crashes in the past four years. It is also important to note that, in order to do the visualisation, the data was previously processed: extracted, formatted, transformed, some formulas applied, etc. For example, we used several formulas to determine the number of occupants; if the BAC level was legal or illegal; to classify drivers as child, teenager, adult or senior; etc.

As you can see in these examples, you can't just start analysing data after you get the raw data set. You usually must dedicate some time to understand the data set and its problem domain. The next section introduces some basic notions on this topic.

15.9.3 Understanding the Data

Before you start with data visualisation, you need to understand the data. In general, there are two main types of data:

Qualitative: Describes qualities of the data. This type of data can be subdivided into two subtypes:

    Nominal: This is data that can be counted but not ordered or aggregated. Examples: Gender (male or female); student class (graduate or undergraduate).

    Ordinal: This is data that can be counted and ordered but not aggregated. Examples: Rate your teacher (excellent, good, fair, poor); family income (under R200 000, 200 001 to 400 000, 400 001 to 600 000, 600 001 or more).

Quantitative: Describes numeric facts or measures of the data. This type of data can be counted, ordered and aggregated. Statisticians refer to this data as interval and ratio data. Examples of quantitative data include age, GPA, number of accidents, etc.

You can think of qualitative data as being the dimensions on a star schema and quantitative data as being the facts of a star schema. This is important because it implies that you need to use the correct type of functions and operations with each data type, including the proper way to represent it visually.

As you have learnt before, data visualisation uses shape, colour, size, position and group/order to represent and highlight certain data characteristics. The way you visualise the data tells a story and has an impact on the end users. Some data visualisations can provide unknown insights and others can be a way to draw attention to an issue. As you can see in Figure 15.33, the same data can be presented in multiple ways. In Panel A, the x-axis is at the top instead of at the bottom, and the bar graphs purposely use a red colour to resonate with the title of the presentation. However, you could change the colour ... Notice that the same

bars to same

data

to

blue, and it

data

plot the

bar

graph

with the

x-axis

would have a different impact

can tell two

different

stories

at the

bottom

on the story

depending

on the

(Panel

B),

you are trying to

visualisation.

15


FIGURE 15.33 Infographics can have an impact beyond presenting the data

NOTE
If you would like to learn more about the fascinating discipline of data visualisation, Show Me the Numbers: Designing Tables and Graphs to Enlighten by Stephen Few and The Visual Display of Quantitative Information by Edward R. Tufte are good places to start.

SUMMARY

Business intelligence (BI) is a term for a comprehensive, cohesive and integrated set of applications used to capture, collect, integrate, store and analyse data with the purpose of generating and presenting information to support business decision making.

Decision support is a methodology (or a series of methodologies) designed to extract information from data and to use such information as a basis for decision making. A decision support system (DSS) refers to an arrangement of computerised tools used to assist managerial decision making within a business.

Operational data are not best suited for decision support. From the end-user point of view, DSS data differ from operational data in three main areas: time span, granularity and dimensionality.

The data warehouse is an integrated, subject-orientated, time-variant, non-volatile collection of data that provides support for decision making. The data warehouse is usually a read-only database optimised for data analysis and query processing. A data mart is a small, single-subject data warehouse subset that provides decision support to a small group of people.

Online analytical processing (OLAP) refers to an advanced data analysis environment that supports decision making, business modelling and operations research.

Relational online analytical processing (ROLAP) provides OLAP functionality by using relational databases and familiar relational query tools to store and analyse multidimensional data. Multidimensional online analytical processing (MOLAP) provides OLAP functionality by using multidimensional database management systems (MDBMSs) to store and analyse multidimensional data.

The star schema is a data modelling technique used to map multidimensional decision support data into a relational database with the purpose of performing advanced data analysis. The basic star schema has four components: facts, dimensions, attributes and attribute hierarchies. Facts are numeric measurements or values representing a specific business aspect or activity. Dimensions are general qualifying categories that provide additional perspectives to a given fact. Conceptually, the multidimensional data model is best represented by a three-dimensional cube. Attributes can be ordered in well-defined attribute hierarchies. The attribute hierarchy provides a top-down organisation that is used for two main purposes: to permit aggregation and to provide drill-down/roll-up data analysis.

Data analytics is a subset of BI functionality that provides advanced data analysis tools to extract knowledge from business data. Data analytics can be divided into explanatory and predictive analytics. Explanatory analytics focuses on discovering and explaining data characteristics and relationships. Predictive analytics focuses on creating models to predict future outcomes or events based on the existing data.

Data mining automates the analysis of operational data with the intention of finding previously unknown data characteristics, relationships, dependencies and/or trends. The data mining process has four phases: data preparation, data analysis and classification, knowledge acquisition and prognosis.

SQL has been enhanced with analytic functions that support OLAP type processing and data generation.

Data visualisation provides visual representations of data that enhance the user's ability to comprehend the meaning of the data.
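The star schema components summarised in this chapter (facts, dimensions, attributes and attribute hierarchies) can be illustrated with a small sketch. The following Python fragment is not taken from the chapter: the table names, keys and sales figures are hypothetical, and plain dictionaries stand in for relational tables. It shows one way that rolling up and drilling down along a time hierarchy (year, then month) amounts to re-aggregating the same fact at a different level.

```python
from collections import defaultdict

# Hypothetical dimension tables, keyed by surrogate key.
TIME_DIM = {
    1: {"year": 2019, "quarter": "Q4", "month": "Dec"},
    2: {"year": 2020, "quarter": "Q1", "month": "Jan"},
    3: {"year": 2020, "quarter": "Q1", "month": "Feb"},
}
PRODUCT_DIM = {
    10: {"category": "Hardware", "product": "Router"},
    11: {"category": "Software", "product": "Antivirus"},
}

# Hypothetical fact table: each row holds foreign keys into the
# dimensions plus one numeric fact (the sales amount).
SALES_FACT = [
    {"time_id": 1, "product_id": 10, "sales": 200.0},
    {"time_id": 2, "product_id": 10, "sales": 150.0},
    {"time_id": 2, "product_id": 11, "sales": 50.0},
    {"time_id": 3, "product_id": 11, "sales": 75.0},
]


def aggregate(facts, time_levels=(), product_levels=()):
    """Sum the sales fact, grouped by the chosen dimension attributes."""
    totals = defaultdict(float)
    for row in facts:
        t = TIME_DIM[row["time_id"]]
        p = PRODUCT_DIM[row["product_id"]]
        key = tuple(t[a] for a in time_levels) + tuple(p[a] for a in product_levels)
        totals[key] += row["sales"]
    return dict(totals)


# Roll-up: aggregate at the top of the time hierarchy.
by_year = aggregate(SALES_FACT, time_levels=("year",))
# Drill-down: add the next level of the same hierarchy.
by_year_month = aggregate(SALES_FACT, time_levels=("year", "month"))
# A different perspective on the same fact, through another dimension.
by_category = aggregate(SALES_FACT, product_levels=("category",))
```

In a relational implementation the same result would normally come from joining the fact table to its dimension tables and grouping by the chosen attributes; the dictionaries here only keep the sketch self-contained.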

KEY TERMS

attribute hierarchy
business intelligence (BI)
cube cache
dashboard
data cube
data extraction
data filtering
data mart
data mining
data store
data visualisation
data warehouse
decision support system (DSS)
dimension tables
dimensions
drill-down
explanatory analytics
extraction, transformation and loading (ETL)
facts
fact table
governance
Key Performance Indicators (KPI)
master data management (MDM)
materialised view
metrics
multidimensional database management system (MDBMS)
multidimensional online analytical processing (MOLAP)
online analytical processing (OLAP)
partitioning
periodicity
portal
relational online analytical processing (ROLAP)
replication
roll-up
slice and dice
snowflake schema
sparsity
star schema
very large databases (VLDBs)


FURTHER READING

Finlay, S. Artificial Intelligence and Machine Learning for Business: A No-Nonsense Guide to Data Driven Technologies, 2nd edition. Relativistic, 2017.
Inmon, W. Building the Data Warehouse, 4th edition. Wiley Publishing, 2005.
Kimball, R. The Data Warehouse Toolkit, 3rd edition. Wiley Publishing, 2013.
Witten, I. and Frank, E. Data Mining: Practical Machine Learning Tools and Techniques (Morgan Kaufmann Series in Data Management Systems), 2016.

REVIEW QUESTIONS

Online Content: Answers to selected Review Questions and Problems for this chapter are available on the online platform for this book.

1 What is business intelligence? Give some recent examples of BI usage, using the internet for assistance. What BI benefits have companies found?
2 Describe the BI framework. Illustrate the evolution of BI.
3 What are decision support systems, and what role do they play in the business environment?
4 Explain how the main components of the BI architecture interact to form a system. Describe the evolution of BI information dissemination formats.
5 What are the most relevant differences between operational and decision support data?
6 What is a data warehouse, and what are its main characteristics?
7 Give three examples of problems likely to be encountered when operational data are integrated into the data warehouse.

Use the following scenario to answer Questions 8-14. While working as a database analyst for a national sales organisation, you are asked to be part of its data warehouse project team.

8 Prepare a high-level summary of the main requirements for evaluating DBMS products for data warehousing.
9 Your data warehousing project group is arguing about prototyping a data warehouse before its implementation. The project group members are especially concerned about the need to acquire some data warehousing skills before implementing the enterprise-wide data warehouse. What would you recommend? Explain your recommendations.
10 Suppose you are selling the data warehouse idea to your users. How would you define multi-dimensional data analysis for them? How would you explain its advantages to them?
11 Before making a commitment, the data warehousing project group has invited you to provide an OLAP overview. The group's members are particularly concerned about the OLAP client/server architecture requirements and how OLAP will fit the existing environment. Your job is to explain to them the main OLAP client/server components and architectures.
12 One of your vendors recommends using an MDBMS. How would you explain this recommendation to your project leader?
13 The project group is ready to make a final decision, choosing between ROLAP and MOLAP. What should be the basis for this decision? Why?
14 The data warehouse project is in the design phase. Explain to your fellow designers how you would use a star schema in the design.
15 Trace the evolution of DSS from its origins to today's advanced analytical tools. Which major technologies influenced this evolution?
16 What is OLAP, and what are its main characteristics?
17 Explain ROLAP and give the reasons you would recommend its use in the relational database environment.
18 Explain the use of facts, dimensions and attributes in the star schema.
19 Explain multidimensional cubes and describe how the slice-and-dice technique fits into this model.
20 In the star schema context, what are attribute hierarchies and aggregation levels and what is their purpose?
21 Discuss the most common performance improvement techniques used in star schemas.
22 Explain some of the most important issues in data warehouse implementation.
23 What is data mining, and how does it differ from traditional DSS tools?
24 How does data mining work? Discuss the different phases in the data mining process.
25 Describe the characteristics of predictive analytics. What is the impact of Big Data in predictive analytics?
26 Describe data visualisation. What is the goal of data visualisation?
27 Is data visualisation only useful when used with Big Data? Explain and expand.
28 As a discipline, data visualisation can be studied as _______________ used to explore and discover data insights by applying: ______________, _________________ and _______________.
29 Describe the different types of data and how they map to star schemas and data analysis. Give some examples of the different data types.
30 Which five graphical data characteristics does data visualisation use to highlight and contrast data findings and convey a story?

PROBLEMS

Online Content: The databases used for this problem set are found on the online platform for this book. These databases are stored in Microsoft Access 2002 format. The databases, named 'Ch15_P1.mdb', 'Ch15_P3.mdb', and 'Ch15_P4.mdb', contain the data for Problems 1, 3 and 4, respectively. The data for Problem 2 are stored in Microsoft Excel format on the online platform for this book.

1 The university computer lab's director keeps track of lab usage, measured by the number of students using the lab. This particular function is important for budgeting purposes. The computer lab director assigns you the task of developing a data warehouse in which to keep track of the lab usage statistics. The main requirements for this database are to:
• Show the total number of users by different time periods.
• Show usage numbers by time period, by major and by student classification.
• Compare usage for different majors and different semesters.
Use the Ch15_P1.mdb database, which includes the following tables:
• USELOG contains the student lab access data.
• STUDENT is a dimension table containing student data.
Given the three bulleted requirements and using the Ch15_P1.mdb data, complete Problems 1a-1g.
a Define the main facts to be analysed. (Hint: These facts become the source for the design of the fact table.)
b Define and describe the appropriate dimensions. (Hint: These dimensions become the source for the design of the dimension tables.)
c Draw the lab usage star schema, using the fact and dimension structures you defined in Problems 1a and 1b.
d Define the attributes for each of the dimensions in Problem 1b.
e Recommend the appropriate attribute hierarchies.
f Implement your data warehouse design, using the star schema you created in Problem 1c and the attributes you defined in Problem 1d.
g Create the reports that will meet the requirements listed in this problem's introduction.

2 Victoria Ephanor manages a small product distribution company. Because the business is growing fast, she recognises that it is time to manage the vast information pool to help guide the accelerating growth. Ms Ephanor, who is familiar with spreadsheet software, currently employs a small sales force of four people. She asks you to develop a data warehouse application prototype that will enable her to study sales figures by year, region, salesperson and product. (This prototype is to be used as the basis for a future data warehouse database.) Using the data supplied in the Ch15-P2.xls file, complete the following seven problems:
a Identify the appropriate fact table components.
b Identify the appropriate dimension tables.
c Draw a star schema diagram for this data warehouse.
d Identify the attributes for the dimension tables that will be required to solve this problem.
e Using a Microsoft Excel spreadsheet (or any other spreadsheet capable of producing pivot tables), generate a pivot table to show the sales by product and by region. The end user must be able to specify the display of sales for any given year. (The sample output is shown in the first pivot table in Figure P15.1.)
f Using Problem 2e as your base, add a second pivot table (see Figure P15.1) to show the sales by salesperson and by region. The end user must be able to specify sales for a given year or for all years and for a given product or for all products.
g Create a 3-D bar graph to show sales by salesperson, by product, and by region. (See the sample output in Figure P15.2.)

FIGURE P15.1 Using a pivot table

FIGURE P15.2 3-D bar graph showing the relationships among the agent, product and region

3 David Suker, the inventory manager for a marketing research company, is interested in studying the use of supplies within the different company departments. Mr Suker has heard that his friend, Ms Ephanor, has developed a small spreadsheet-based data warehouse model (see Problem 2) that she uses to analyse sales data. Mr Suker is interested in developing a small data warehouse model like Ms Ephanor's so he can analyse orders by department and by product. He will use Microsoft Access as the data warehouse DBMS and Microsoft Excel as the analysis tool.
a Develop the order star schema.
b Identify the appropriate dimension attributes.
c Identify the attribute hierarchies required to support the model.
d Develop a crosstab report (in Microsoft Access), using a 3-D bar graph to show orders by product and by department. (The sample output is shown in Figure P15.3.)


FIGURE P15.3 Crosstab report: orders by product and department

4 ROBCOR, whose sample data are contained in the database named Ch15_P4.mdb, provides on-demand aviation charters, using a mix of different aircraft and aircraft types. Because ROBCOR has grown rapidly, it hires you to be its first database manager. (The company's database, developed by an outside consulting team, already has a charter database in place to help manage all of its operations.) Your first critical assignment is to develop a decision support system to analyse the charter data. (Review Problems 24-28 in Chapter 3, Relational Model Characteristics, in which the operations have been described.) The charter operations manager wants to be able to analyse charter data such as cost, hours flown, fuel used and revenue. She would also like to be able to drill down by pilot, type of aircraft and time periods. Given those requirements, complete the following:
a Create a star schema for the charter data.
b Define the dimensions and attributes for the charter operations star schema.
c Define the necessary attribute hierarchies.
d Implement the data warehouse design, using the design components you developed in Problems 4a-4c.
e Generate the reports that will illustrate that your data warehouse meets the specified information requirements.

Using the data provided in the SaleCo snowflake schema in Figure 15.24, solve the following problems.

Online Content: The script files used to populate the database are available on the online platform for this book. The script files assume an Oracle RDBMS. If you use a different DBMS, consult the documentation to verify whether the vendor supports similar functionality and what the proper syntax is for your DBMS.

5 What is the SQL command to list the total sales by customer and by product, with subtotals by customer and a grand total for all product sales? (Hint: Use the ROLLUP command.)
6 What is the SQL command to list the total sales by customer, month and product, with subtotals by customer and by month and a grand total for all product sales? (Hint: Use the ROLLUP command.)
7 What is the SQL command to list the total sales by region and customer, with subtotals by region and a grand total for all sales? (Hint: Use the ROLLUP command.)
8 What is the SQL command to list the total sales by month and product category, with subtotals by month and a grand total for all sales? (Hint: Use the ROLLUP command.)
9 What is the SQL command to list the number of product sales (number of rows) and total sales by month, with subtotals by month and a grand total for all sales? (Hint: Use the ROLLUP command.)
10 What is the SQL command to list the number of product sales (number of rows) and total sales by month and product category, with subtotals by month and product category and a grand total for all sales? (Hint: Use the ROLLUP command.)
11 What is the SQL command to list the number of product sales (number of rows) and total sales by month, product category and product, with subtotals by month and product category and a grand total for all sales? (Hint: Use the ROLLUP command.)
12 Using the answer to Problem 10 as your base, which command would you need to generate the same output but with subtotals in all columns? (Hint: Use the CUBE command.)
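Before writing the SQL for Problems 5-12, it may help to see what ROLLUP and CUBE compute. The sketch below is an illustration in Python, not a solution to any of the problems: the three sales rows are made up, and the function names are hypothetical. It assumes the usual SQL behaviour, in which ROLLUP over n grouping columns produces the n + 1 "prefix" groupings (down to the grand total), while CUBE produces every combination of the grouping columns. `None` plays the role that NULL plays in the subtotal rows of a SQL result.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical sales rows: (month, category, amount).
ROWS = [
    ("Jan", "Hardware", 100.0),
    ("Jan", "Software", 40.0),
    ("Feb", "Hardware", 60.0),
]


def grouped_totals(rows, grouping_sets, n_cols):
    """Sum amounts once per grouping set; None marks a rolled-up column."""
    totals = defaultdict(float)
    for *dims, amount in rows:
        for keep in grouping_sets:
            key = tuple(dims[i] if i in keep else None for i in range(n_cols))
            totals[key] += amount
    return dict(totals)


def rollup(rows, n_cols):
    # ROLLUP(c1, ..., cn) groups by the prefixes (c1..cn), (c1..cn-1), ..., ().
    sets = [tuple(range(k)) for k in range(n_cols, -1, -1)]
    return grouped_totals(rows, sets, n_cols)


def cube(rows, n_cols):
    # CUBE(c1, ..., cn) groups by every subset of the grouping columns.
    sets = [s for k in range(n_cols + 1) for s in combinations(range(n_cols), k)]
    return grouped_totals(rows, sets, n_cols)


rollup_result = rollup(ROWS, 2)  # detail rows, per-month subtotals, grand total
cube_result = cube(ROWS, 2)      # the same, plus per-category subtotals
```

With two grouping columns, the only difference between the two results is the extra set of subtotals for the second column, which is why the order of columns matters for ROLLUP but not for CUBE.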

13 Create your own data analysis and visualisation presentation. The purpose of this project is for you to search for a publicly available data set using the internet and create your own presentation using what you have learnt in this chapter.
a Search for a data set that interests you and download it. Some examples of public data sets sources are (see also Note on page 816):
www.data.gov
http://data.worldbank.org
http://aws.amazon.com/datasets
http://usgovxml.com/
https://data.medicare.gov/
www.faa.gov/data_research/
b Use any tool available to you to analyse the data. You can use tools such as Microsoft Excel Pivot Tables, Pivot Charts, or other free tools, such as Google Fusion tables, Tableau free trial, IBM Many Eyes, etc.
c Create a short presentation to explain some of your findings (what the data sources are, where the data comes from, what the data represents, etc.).

CHAPTER 16
Big Data and NoSQL

IN THIS CHAPTER, YOU WILL LEARN:
The role of Big Data in modern business
The primary characteristics of Big Data and how these go beyond the traditional 3 Vs
How the core components of the Hadoop framework operate
To identify the major components of the Hadoop ecosystem
To summarise the four major approaches of the NoSQL data model and how they differ from the relational model
To describe the characteristics of NewSQL databases
How to work with document databases using MongoDB
How to work with graph databases using Neo4j

PREVIEW In

Chapter

and the learn

2, Big

about

Data Data

Models,

these

issues

You will also learn developed, the

to

Hadoop

efforts

the

NoSQL

database

MongoDB

retrieving provide

and

it has faced

and

Neo4j.

Just as

storing

new

data

Copyright review

2020 has

Cengage deemed

Learning. that

any

All suppressed

Rights

Reserved. content

does

May not

not materially

is

copied, affect

the

tutorials model

overall

or

about

the

a standard

learn

about

data

model

chapter,

you

and continue

low-level in

approaches

databases

which try to

bridge

in

organisations

higher-level

databases

to be

technologies

component

the

non-relational

such

duplicated, learning

activities

with relational

data,

coding

scanned,

NoSQL In this

have developed,

column-oriented

key

such

of

as key-value

and

graph

databases.

gap

between

relational

NoSQL

products:

the

to

NoSQL

for

existing

MongoDB

in experience.

whole

or in Cengage

and the

part.

Due Learning

to

two

the

reserves

Neo4j, for

based

rights, the

right

some to

third remove

and,

and the

additional

perform old

data

data

and

Q and

R

respectively.

on it

party

to

Appendixes

decades

databases

tools

ability removing

Online

and

dominant

electronic

current

data,

databases.

has been

model

in

databases,

updating

as object-oriented

The relational

be

become you

database

database

challenges

emerging

NoSQL. basic

warehouses.

that

databases,

explore

hands-on

the

development.

you learn

developing

databases, NewSQL

specific

has Next,

you

The relational

Editorial

Data.

to

NoSQLs detail.

First,

Hadoop

about

management

to

greater

Data.

model to

systems

Finally,

much

Big

document

also learn

has led

about the technologies

Big

data

databases, You

in

address

address

were introduced

that

framework.

to

you

problem

content

may

that

development

have

content

during

evolved

be

suppressed at

any

time

to

from if

the

subsequent

time,

of data adapt

eBook rights

and/or restrictions

to

eChapter(s). require

it

CHAPTER

these

challenges

because new

Big

manipulation this

new

of dealing

many

of the

16.1

BIG DATA

Data

generally and

database

the

Velocity

Veracity

Data

ambiguity

from

now.

which

Web data,

the

created

large

Although

2020 has

The

Big

that

Data

lacks

velocity,

variety,

by a relational

as follows:

growth

Bigtable

Data.

are

sets

of data that

discussed

later

Web data

Data issues

any,

but

business

not

necessarily

has to

any

All suppressed

Rights

Reserved. content

does

the

However,

May not

not materially

be

copied, affect

cost the

scanned, the

overall

of social

created to

duplicated, learning

in experience.

deal

of the

or in Cengage

of

Due

to

electronic reserves

of

were

right

survived

businesses. among

the

quickly

As a first

to

followed,

Data problems.

created

growing

need

Cassandra

to

store

and

to

Big

Big

third remove

Data issues,

have increased

Data has been redefined Data, the mining

of its

some

of

in technology

and

rights,

dot-com

that

Big

perceptions

of

terms

the

set

variety.

structures,

companies

Facebook

Big

processing

Learning

Big Data

3 Vs.

changes

Volume

pundits

and

complex

address

with the

original

data, in

part.

to

among

After the

and

giant

data so that

Given the

whole

relational

Originally,

and Facebook

forefront

of automatically Veracity

or

Dynamo,

at the

and track 5 Vs.

what

Data five

current

velocity

into

a smaller growth

media

of the

Big Data. volume,

but the

into

More recently,

generate

all, of the

outweigh

new revenue.

Learning.

to

failed,

new technologies

been

too.

Big

the

characteristics.

significant

chapter),

have

that

combined

all three

companies

characteristics

Big

businesses

this

is

might not be considered

disagreement

3 Vs:

sources

consolidated

Amazon

of specificity

be considered

an extent

of the

audio

creating

in

had the

to

not

be considered

involve

start-up

in

and

for

and that

The success

pioneers

might

present

experienced

data store,

that

now

a data set to

video

Amazon

Big

This lack

data.

Web commerce

and

became

Data are

the

management

as

Google

characteristics.

as a combination

many Web-based

managing

the

16.1

have

that

term

field.

databases

management

Big Data is that there is some

graphics,

data

1990s,

of

Figure

media

Cengage deemed

Big

managing

social

generating

review

the

and

analyse

with it.

for

can be defined

characteristics

organisations

involving

NoSQL

of volume,

unsuitable

and

database

of

What was Big Data five years ago

of defining

of text,

like

opportunities

to the

the with

in

for

companies

(technologies

the

Although

associated

of these

storage

manipulate

in the

characteristics

data

with these

considered

problem

challenges

pressure

manage

the

of data

of the data to be stored

Big Data.

struggles

significant

and these Google

defining something

a combination

companies

other

in

as shown

experienced

feel

model.

created

be stored

associated

The key is that

burst in the

result,

displays

trends

development

generally

makes the

of the 5 Vs must be present for

new

bubble

emerging

to the

store,

arose

and

The latest

wave

827

of the data

values

adding to the

conceived

created

that that

to

possible,

leverage.

a new

efforts

relational

These characteristics

of data to

Similarly,

technology

Further

was

have led

of the

of data

data

describes

most urgent

Data

of characteristics

an extent

system.

of specific

now.

database

about

a set

increased

that

what is

NoSQL

the worth of the data to the business.

the lack

years

to

5 Vs) to

quantity

of Big

a set

the trustworthiness

to the

Big

one of the

is

of

from

term

Data and

arena. In each case the challenge

perceptions

Organisations

the variations in the structure

Value

value

Big

the speed at which data is entering the system

Variety

leads

(the

management

Volume

Notice

refers

value

create

assumptions

there

management

an ill-defined

wave

underlying

definition,

data

businesses

and requirements.

with the

a consistent

Big

to

Data is

wave of data represent

veracity

Copyright

Big

in the

changed

organisations

Data.

possibilities

challenges reject

for

is

dominant

advances

opportunities

challenges

Editorial

and remain

technological

16

Value

this

accuracy

party additional

content

may content

and

be

of the

data

any

time

data

in terms quality,

suppressed at

from if

the

subsequent

16

as

of must

eBook rights

and/or restrictions

eChapter(s). require

it

be verified before a business acts upon it.

Advances in technology have led to a vast array of user-generated and machine-generated data that can spur growth in specific areas.

FIGURE 16.1 Original view of Big Data (Big Data as a combination of Volume, Velocity and Variety)

For example, Disney World has introduced Magic Bands for park visitors to wear on their wrists. Each visitor's Magic Band is connected to much of the data that Disney stores about that individual. These bands use radio frequency identification (RFID) and near-field communications (NFC) to act as tickets for rides, hotel room keys, and even credit cards within the park. The bands can be tracked so that Disney systems can follow individuals as they move through the park, record with which Disney characters (who are also tracked) they interact, purchases made, wait time in lines, and more. Visitors can make reservations at a restaurant and order meals through a Disney app on their smartphones and, by tracking the Magic Bands, the restaurant staff know when the visitors arrive for their reservation, can track at which table they are seated, and deliver their meals within minutes of the guests sitting down. With the many cameras mounted throughout the park, Disney can also capture pictures and short videos of the visitors throughout their stay in the park to produce a personalised movie of their vacation experience, which can then be sold to the visitors as souvenirs.

All of this involves the capture of a constant stream of data from each band, processed in real time. Considering the tens of thousands of visitors to Disney World each day, each with their own Magic Band, the volume, velocity, and variety of the data are enormous.

16.1.1 Volume

Volume, the quantity of data to be stored, is a key characteristic of Big Data. The storage capacities associated with Big Data are extremely large. Table 16.1 provides definitions for units of data storage capacity.


TABLE 16.1 Storage capacity units

Term         Capacity        Abbreviation
Bit          0 or 1 value    b
Byte         8 bits          B
Kilobyte     1 024* bytes    KB
Megabyte     1 024 KB        MB
Gigabyte     1 024 MB        GB
Terabyte     1 024 GB        TB
Petabyte     1 024 TB        PB
Exabyte      1 024 PB        EB
Zettabyte    1 024 EB        ZB
Yottabyte    1 024 ZB        YB

*Note that because bits are binary in nature and are the basis on which all other storage values are based, all values for data storage units are defined in terms of powers of 2. For example, the prefix kilo typically means 1000; however, in data storage, a kilobyte = 2^10 = 1024 bytes.

Naturally, as the quantity of data needing to be stored increases, the need for larger storage devices also increases. When this occurs, systems can either scale up or scale out. Scaling up is keeping the same number of systems, but migrating each system to a larger system: for example, changing from a server with 16 CPU cores and a 1 TB storage system to a server with 64 CPU cores and a 100 TB storage system. Scaling up involves moving to larger and faster systems. However, there are limits to how large and fast a single system can be. Further, the costs of these high-powered systems increase at a dramatic rate. On the other hand, scaling out means that, when the workload exceeds the capacity of a server, the workload is spread out across a number of servers. This is also referred to as clustering: creating a cluster of low-cost servers to share a workload. This can help to reduce the overall cost of the computing resources, since it is cheaper to buy ten 100 TB storage systems than it is to buy a single 1 PB storage system. Make no mistake, organisations need storage capacities in these extreme sizes. Organisations such as eBay collect clickstream data that easily reaches petabytes in size, in addition to enterprise data warehouses that are spread over many servers.

Recall from Chapter 3 that one of the greatest advances represented by the relational model was the development of an RDBMS: a sophisticated database management system that could hide the complexity of the underlying data storage and manipulation from the user, so that the data always appears to be in tables. To carry out these functions, the DBMS acts as the brain of the database system and must maintain control over all of the data in the database. As discussed in Chapter 12, it is possible to distribute a relational database across multiple servers using replication and fragmentation. However, because the DBMS must act as a single point of control for all of the data in the database, distributing the database across multiple systems requires a high degree of communication and coordination across the systems. There are significant performance costs associated with this communication and coordination, and these costs increase as the number of nodes grows. This limits the degree to which a relational database can be scaled out as data volume grows, and it makes RDBMSs ill-suited for clusters.
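The unit definitions in Table 16.1 and the scale-out comparison above can be checked with a few lines of arithmetic. The following is a minimal sketch; the helper names `to_bytes` and `nodes_needed` are illustrative and do not come from the text.

```python
# Units from Table 16.1; each step up is a factor of 1 024 (2**10).
UNITS = ["B", "KB", "MB", "GB", "TB", "PB", "EB", "ZB", "YB"]


def to_bytes(value, unit):
    """Convert a whole number of the given unit to bytes using powers of 2."""
    return value * 1024 ** UNITS.index(unit)


def nodes_needed(total, total_unit, per_node, node_unit):
    """Smallest number of equal-sized nodes whose combined capacity covers the total."""
    total_bytes = to_bytes(total, total_unit)
    node_bytes = to_bytes(per_node, node_unit)
    # Integer ceiling division avoids floating-point rounding.
    return -(-total_bytes // node_bytes)


print(to_bytes(1, "KB"))                 # bytes in one kilobyte
print(nodes_needed(1, "PB", 100, "TB"))  # 100 TB systems needed to cover 1 PB
```

Under the binary definitions, ten 100 TB systems hold 1 000 TB, slightly less than 1 PB (1 024 TB), so the comparison in the text is best read as approximate.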

NOTE
Although some RDBMS products, such as SQL Server and Oracle Real Application Clusters, legitimately claim to support clusters, these clusters are limited in scope and generally rely on a single, shared data storage subsystem, such as a storage area network.

16.1.2 Velocity

Velocity, another key characteristic of Big Data, refers to the rate at which new data enters the system as well as the rate at which the data must be processed. In many ways, the issues of velocity mirror those of volume. For example, consider a web retailer such as Amazon. In the past, a retail store might capture only the data about the final transaction of a customer making a purchase. Today, a retailer like Amazon captures not only the final transaction, but also every click of the mouse in the searching, browsing, comparing and purchasing process. Instead of capturing one event (the final sale) in a 20-minute shopping experience, it might capture data on 30 events during that 20-minute time frame: a 30× increase in the velocity of the data. Other advances in technology, such as RFID, GPS and NFC, add new layers of data-gathering opportunities that often generate large amounts of data that must be stored in real time. For example, RFID tags can be used to track items for inventory and warehouse management. The tags do not require line-of-sight between the tag and the reader, and the reader can read hundreds of tags simultaneously while the products are still in boxes. This means that, instead of a single record for tracking a given quantity of a product being produced, each individual product is tracked, creating an increase of several orders of magnitude in the amount of data being delivered to the system at any one time.

In addition to the speed with which data is entering the system, for Big Data to be actionable, that data must be processed at a very rapid pace. The velocity of processing can be broken down into two categories:

Stream processing
Feedback loop processing

Stream processing focuses on input processing, and it requires analysis of the data stream as it enters the system. In some situations, large volumes of data can enter the system at such a rapid pace that it is not feasible to try to store all of the data. The data must be processed and filtered as it enters the system to determine which data to keep and which data to discard. For example, at the CERN Large Hadron Collider, the largest and most powerful particle accelerator in the world, experiments produce about 600 TB per second of raw data. Scientists have created algorithms to decide ahead of time which data will be kept. These algorithms are applied in a two-step process to filter the data down to only about 1 GB per second of data that will actually be stored.1

Feedback loop processing refers to the analysis of the data to produce actionable results. While stream processing could be thought of as focused on inputs, feedback loop processing can be thought of as focused on outputs. The process of capturing the data, processing it into usable information, and then acting on that information is a feedback loop. Feedback loop processing to provide immediate results requires analysing large amounts of data within just a few seconds so that the results of the analysis can become a part of the product delivered to the user in real time. Figure 16.2 shows a feedback loop for providing recommendations for book purchases. Not all feedback loops are used for inclusion of results within immediate data products. Feedback loop processing is also used to help organisations sift through terabytes and petabytes of data to inform decision makers and help them make faster strategic and tactical decisions. It is also a key component in data analytics.

1 CERN, Processing: What to record? http://home.web.cern.ch/about/computing/processing-what-record, August 20, 2015.

FIGURE 16.2 Feedback loop processing (data is captured about the user and about the book requested; the data is analysed to determine other books and products the user may like; a list of recommended items is added to the user request; the information requested by the user plus information on recommendations are returned to the user)
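The idea of filtering a stream in two steps as it arrives, rather than storing everything first, can be sketched with generators. This is only an illustration of the general technique: the event fields (`energy`, `kind`) and the thresholds are hypothetical and are not taken from CERN's actual filtering algorithms.

```python
def first_pass(events, min_energy):
    """Step 1: cheap filter applied to every event as it arrives."""
    for event in events:
        if event["energy"] >= min_energy:
            yield event


def second_pass(events, wanted_kinds):
    """Step 2: more selective filter applied only to survivors of step 1."""
    for event in events:
        if event["kind"] in wanted_kinds:
            yield event


def keep_from_stream(events, min_energy, wanted_kinds):
    """Chain both steps lazily, so discarded events are never stored."""
    return second_pass(first_pass(events, min_energy), wanted_kinds)


# Hypothetical sample data standing in for an incoming stream.
sample = [
    {"id": 1, "energy": 10, "kind": "muon"},
    {"id": 2, "energy": 80, "kind": "muon"},
    {"id": 3, "energy": 90, "kind": "noise"},
    {"id": 4, "energy": 55, "kind": "muon"},
]

kept = list(keep_from_stream(iter(sample), min_energy=50, wanted_kinds={"muon"}))
print([event["id"] for event in kept])
```

Because each step is a generator, an event that fails the first filter is dropped immediately and never reaches the second step or storage.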

16.1.3 Variety

In the Big Data context, variety refers to the vast array of formats and structures in which the data may be captured. Data can be considered to be structured, unstructured, or semi-structured. Structured data is data that has been organised to fit a predefined data model. Unstructured data is data that is not organised to fit into a predefined data model. Semi-structured data combines elements of both: some parts of the data fit a predefined model while other parts do not. Relational databases rely on structured data. A data model is created by the database designer based on the business rules, as discussed in Chapter 4. As data enters the database, the data are decomposed and routed for storage in the corresponding tables and columns as defined in the data model. Although much of the transactional data that organisations use works well in a structured environment, most of the data in the world is semi-structured or unstructured. Unstructured data includes maps, satellite images, emails, texts, tweets, videos, transcripts, and a whole host of other data forms. Over the decades that the relational model has been dominant, relational databases have evolved to address some forms of unstructured data. For example, most large-scale RDBMSs support a binary large object (BLOB) data type that allows the storage of unstructured objects like audio, video, and graphic data as a single, atomic value. One problem with BLOB data is that the semantic value of the data, the meaning that the object conveys, is inaccessible and uninterpretable by data processing.

Big Data requires that the data be captured in whatever format it naturally exists in, without any attempt to impose a data model or structure on the data. This is one of the key differences between processing data in a relational database and Big Data processing. Relational databases impose a structure on the data when the data is captured and stored. Big Data processing imposes a structure on the data as needed for applications as a part of retrieval and processing. One advantage of providing structure during retrieval and processing is the flexibility of being able to structure the data in different ways for different applications.
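Imposing structure at retrieval time rather than at capture time can be illustrated with a small sketch. The records, field names, and the two "views" below are hypothetical; the point is only that the same raw, semi-structured data is stored as it arrived and is then shaped differently for two applications.

```python
import json

# Raw records are kept exactly as captured; their fields are not uniform.
raw_records = [
    '{"user": "u1", "text": "Great phone", "tags": ["phone"], '
    '"geo": {"lat": 51.5, "lon": -0.1}}',
    '{"user": "u2", "text": "Battery died", "likes": 3}',
]


def as_author_view(line):
    """Structure for an application that only needs who said what."""
    doc = json.loads(line)
    return {"user": doc["user"], "text": doc["text"]}


def as_location_view(line):
    """Structure for a mapping application; records without 'geo' are skipped."""
    doc = json.loads(line)
    geo = doc.get("geo")
    if geo is None:
        return None
    return {"user": doc["user"], "lat": geo["lat"], "lon": geo["lon"]}


author_rows = [as_author_view(r) for r in raw_records]
location_rows = [
    row for row in (as_location_view(r) for r in raw_records) if row is not None
]

print(author_rows)
print(location_rows)
```

A relational design would instead have to decide up front which of these fields become columns, and records that do not fit would need to be transformed or rejected when they are stored.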

16.1.4 Veracity

Veracity refers to the trustworthiness of the data. Can decision makers reasonably rely on the accuracy of the data and the information generated from it? Veracity is becoming more important as businesses make decisions based on the data they collect. Due to the variety of Big Data, in terms of the different formats it takes, the quality of the data and the data source are less controllable. Uncertainty about the data can arise from several causes, such as having to capture only selected portions of the data due to high velocity. Also, in terms of sentiment analysis, customers' opinions and preferences can change over time, so comments at one point in time might not be suitable for action at another point in time. When utilising Big Data, it is important that the accuracy of the data is validated where possible.

16.1.5 Value

Value, also called viability, refers to the degree to which the data can be analysed to provide meaningful information that can add value to the organisation. Given the costs of storing and processing Big Data, it is of no use to a business unless it has value. In order to create value, Big Data analytics utilises advanced algorithms and predictive models to discover hidden patterns and trends amongst data. Information must be actionable and used to drive decision making across the business. For example, an insurance company collects data about customers through different contacts (surveys, phone calls, website usage, claims, etc.). After analysing the different data types, new and valuable knowledge about insured persons and objects at risk can be realised. By using Big Data analytics and looking at market trends and buying patterns, the insurance company can cross-sell other insurance products to current customers, increase turnover, and even perform analytical pricing of different insurance contracts.

NOTE
While the value of Big Data is linked to Big Data analytics, it is important to ensure that data is used both ethically and legally. The General Data Protection Regulation (GDPR) became a European Union legal requirement on 25 May 2018. Although this is a European legal requirement, GDPR includes major changes to the rights of an individual, which now apply internationally to all businesses and organisations that collect, process and exchange data and have to comply with its requirements. One of the most measurable actions is that data is not used unless explicit informed consent is given. GDPR Article 22 includes the right of a person not to be subject to automated decision making, such as profiling. A person who is subject to automated decision making also has the right to ask for an explanation of how the decision is reached. For example, if a person applied for a bank loan online and was automatically rejected, they would have a right to ask for a detailed explanation of the logic used by the algorithm that made the decision.2

2 Crockett, K., Goltz, S. and Garratt, M. GDPR Impact on Computational Intelligence Research, IEEE International Joint Conference on Artificial Neural Networks (IJCNN), DOI: 10.1109/IJCNN.2018.8489614, ISSN: 2161-4407, 2018.


16.1.6 Other Characteristics

Characterising Big Data with the 5 Vs is fairly standard. However, as the industry matures, other characteristics have been put forward as being equally important. Keeping with the spirit of the 5 Vs, these additional characteristics are typically presented as additional Vs. Variability refers to the changes in the meaning of the data based on context. While variety and variability are similar terms, they mean distinctly different things in Big Data. Variety is about differences in structure. Variability is about differences in meaning. Variability is especially relevant in areas such as sentiment analysis that attempt to understand the meanings of words. Sentiment analysis is a method of text analysis that attempts to determine whether a statement conveys a positive, negative, or neutral attitude about a topic. For example, consider the statements 'I just bought a new smartphone. I love it!' and 'The screen on my new smartphone shattered the first time I dropped it. I love it!' In the first statement, the presence of the phrase 'I love it' might help an algorithm correctly interpret the statement as expressing a positive attitude. However, the second statement uses sarcasm to express a negative attitude, so the presence of the phrase 'I love it' may cause the analysis to interpret the meaning of the phrase incorrectly.

The final characteristic of Big Data is visualisation. Visualisation is the ability to present the data graphically in such a way as to make it understandable. Volumes of data can leave decision makers awash in facts but with little understanding of what the facts mean. Visualisation is a way of presenting the facts so that decision makers can comprehend the meaning of the information to gain insights.

An argument could be made that these additional Vs are not necessarily characteristics of Big Data; or, perhaps more accurately, they are not characteristics of only Big Data. Visualisation was discussed and illustrated at length in Chapter 15 as an important tool in working with data warehouses, which are often maintained as structured data stores in RDBMS products. The important thing to remember is that these characteristics that play an important part in working with data in the relational model are universal and also apply to Big Data.

Big Data represents a new wave in data management challenges, but it does not mean that relational database technology is going away. Structured data that depends on ACIDS (atomicity, consistency, isolation, durability, and serialisability) transactions, as discussed in Chapter 12, will always be critical to business operations. Relational databases are still the best way of storing and managing this type of data. What has changed is that now, for the first time in decades, relational databases are not necessarily the best way for storing and managing all of an organisation's data. Since the rise of the relational model, the decision for data managers when faced with new storage requirements was not whether to use a relational database, but which relational DBMS to use. Now, the decision of whether to use a relational database at all is a real question. This has led to polyglot persistence: the coexistence of a variety of data storage and management technologies within an organisation's infrastructure.

Scaling up, as discussed, is often considered a viable option as relational databases grow. However, it has practical limits and cost considerations that make it unfeasible for many Big Data installations. Scaling out into clusters based on low-cost commodity servers is the dominant approach that organisations are currently pursuing for Big Data management. As a result, new technologies not based on the relational model have been developed.
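The sarcasm problem described under variability is easy to reproduce with a deliberately naive keyword-based classifier. The sketch below is an assumption-laden illustration, not a real sentiment analysis method: the word lists and the function name `naive_sentiment` are invented for this example.

```python
# Hypothetical, very small sentiment lexicons.
POSITIVE = {"love", "great", "excellent"}
NEGATIVE = {"hate", "terrible", "awful"}


def naive_sentiment(text):
    """Classify text by counting positive and negative keywords only."""
    words = [word.strip(".,!?").lower() for word in text.split()]
    score = sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


first = "I just bought a new smartphone. I love it!"
second = ("The screen on my new smartphone shattered the first time "
          "I dropped it. I love it!")

print(naive_sentiment(first))
print(naive_sentiment(second))
```

Both statements are labelled positive because the classifier sees only the word "love"; it has no way to use the context that makes the second statement sarcastic. The same words carry a different meaning, which is exactly what variability refers to.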

16.2 HADOOP

Big Data requires a different approach to distributed data storage that is designed for large-scale clusters. Although other implementation technologies are possible, Hadoop has become the de facto standard for most Big Data storage and processing. Hadoop is not a database. Hadoop is a Java-based framework for distributing and processing very large data sets across clusters of computers. While the Hadoop framework includes many parts, the two most important components are the Hadoop Distributed File System (HDFS) and MapReduce. HDFS is a low-level distributed file processing system, which means that it can be used directly for data storage. MapReduce is a programming model that supports processing large data sets in a highly distributed, parallel manner. While it is possible to use HDFS and MapReduce separately, the two technologies complement each other so that they work better together as a Hadoop system. Hadoop was engineered specifically to distribute and process enormous amounts of data across vast clusters of servers.

16.2.1 Hadoop Distributed File System

The Hadoop Distributed File System (HDFS) approach to distributing data is based on several key assumptions:

High volume. The volume of data in Big Data applications is expected to be in terabytes, petabytes, or larger. Hadoop assumes that files in the HDFS will be extremely large. Data in the HDFS is organised into physical blocks, just as in other types of file storage. For example, on a typical personal computer, file storage is organised into blocks that are often 512 bytes in size, depending on the hardware and operating system involved. Relational databases often aggregate these into database blocks. By default, Oracle organises data into 8 KB physical blocks. Hadoop, on the other hand, has a default block size of 64 MB (8000 times the size of an Oracle block!), and it can be configured to even larger values. As a result, the number of blocks per file is greatly reduced, simplifying the metadata overhead of tracking the blocks in each file.

Write-once, read-many. Using a write-once, read-many model simplifies concurrency issues and improves overall data throughput. Using this model, a file is created, written to the file system, and then closed. Once the file is closed, changes cannot be made to its contents. This improves overall system performance and works well for the types of tasks performed by many Big Data applications. Although existing contents of the file cannot be changed, recent advancements in the HDFS allow for files to have new data appended to the end of the file. This is a key advancement for NoSQL databases because it allows for database logs to be updated.

Streaming access. Unlike transaction processing systems, where queries often retrieve small pieces of data from several different tables, Big Data applications typically process entire files. Instead of optimising the file system to randomly access individual data elements, Hadoop is optimised for batch processing of entire files as a continuous stream of data.

Fault tolerance. Hadoop is designed to be distributed across thousands of low-cost, commodity computers. It is assumed that, with thousands of such devices, at any point in time, some will experience hardware errors. Therefore, the HDFS is designed to replicate data across many different devices so that when one device fails, the data is still available from another device. By default, Hadoop uses a replication factor of three, meaning that each block of data is stored on three different devices. Different replication factors can be specified for each file, if desired.

Hadoop uses several types of nodes. A node is just a computer that performs one or more types of tasks within the system. Within the HDFS, there are three types of nodes: the client node, the name node, and one or more data nodes, as depicted in Figure 16.3.
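The effect of block size and replication on a file can be worked through numerically. The following is a minimal sketch using the default figures quoted above (512-byte, 8 KB and 64 MB blocks, replication factor of three); the 1 GB file size and the helper names are assumptions chosen for illustration.

```python
KB = 1024
MB = 1024 * KB
GB = 1024 * MB


def blocks_for(file_size, block_size):
    """Number of blocks needed to hold a file (ceiling division)."""
    return -(-file_size // block_size)


def raw_storage(file_size, replication=3):
    """Total bytes consumed across the cluster once every block is replicated."""
    return file_size * replication


file_size = 1 * GB  # hypothetical file

print(blocks_for(file_size, 512))       # blocks at 512 bytes per block
print(blocks_for(file_size, 8 * KB))    # blocks at 8 KB per block
print(blocks_for(file_size, 64 * MB))   # blocks at 64 MB per block
print((64 * MB) // (8 * KB))            # ratio of the two block sizes
print(raw_storage(file_size) // GB)     # GB consumed with replication factor 3
```

With binary units the ratio between a 64 MB block and an 8 KB block works out to 8 192, which the text rounds to 8000. The much smaller block count at 64 MB is what keeps the per-file metadata small.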

FIGURE 16.3 Hadoop Distributed File System (HDFS) (a client node; a name node holding the metadata "File1: Blocks 1,3,4: r3" and "File2: Blocks 2,5,6: r3"; and Data Nodes 1 to 4, across which each of Blocks 1 to 6 is stored three times)

Data nodes store the actual file data within the HDFS. Recall that files in HDFS are broken into blocks and are replicated to ensure fault tolerance. As a result, each block is duplicated on more than one data node. Figure 16.3 shows the default replication factor of three, so each block appears on three data nodes.

The name node contains the metadata for the file system. There is typically only one name node within a HDFS cluster. The metadata is designed to be small, simple, and easily recoverable. Keeping the metadata small allows the name node to hold all of the metadata in memory to reduce disk accesses and improve system performance. This is important because there is only one name node, so contention for the name node is minimised. The metadata is composed primarily of the name of each file, the block numbers that comprise each file, and the desired replication factor for each file.

The client node makes requests to the file system, either to read files or to write new files, as needed to support the user application.
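The name node metadata shown in Figure 16.3 can be pictured as a small in-memory lookup structure. The sketch below is an illustrative model only, not how Hadoop is implemented: the file-to-block entries follow the figure, but the placement of blocks on particular data nodes (`dn1` to `dn4`) and the function names are assumptions.

```python
# File metadata as shown in Figure 16.3: block numbers and replication factor.
metadata = {
    "File1": {"blocks": [1, 3, 4], "replication": 3},
    "File2": {"blocks": [2, 5, 6], "replication": 3},
}

# Hypothetical placement of each block on three of the four data nodes.
block_locations = {
    1: ["dn1", "dn2", "dn3"],
    2: ["dn2", "dn3", "dn4"],
    3: ["dn1", "dn3", "dn4"],
    4: ["dn1", "dn2", "dn4"],
    5: ["dn1", "dn2", "dn3"],
    6: ["dn2", "dn3", "dn4"],
}


def locate(file_name, failed=frozenset()):
    """Return (block, available data nodes) pairs for every block of a file."""
    return [
        (block, [n for n in block_locations[block] if n not in failed])
        for block in metadata[file_name]["blocks"]
    ]


def under_replicated(failed):
    """Blocks with fewer available copies than the file's replication factor."""
    short = []
    for info in metadata.values():
        for block in info["blocks"]:
            available = [n for n in block_locations[block] if n not in failed]
            if len(available) < info["replication"]:
                short.append(block)
    return sorted(short)


print(locate("File1"))
print(locate("File1", failed={"dn1"}))
print(under_replicated({"dn1"}))
```

Because every block has three copies, losing any single data node in this model still leaves two copies of each affected block, which is the fault tolerance that replication is meant to provide.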

When a client

node needs to create

Adds the new file name to the

a new file, it communicates

The name

node:

metadata

Determines a new block number for the file Determines alist of which data nodes the block will be stored Passes that information

back to the

The client node contacts file

on that

data

nodes that

Copyright Editorial

review

the

second

data

has

Cengage deemed

Learning. that

any

the first

At the

next

data

node

node then

All suppressed

Rights

Reserved. content

does

the block. in

the list

contacts

May not

not materially

be

copied, affect

node.

data node specified

same time,

will be replicating

contacts

2020

node.

client

the

scanned, the

overall

the

by the name node and begins

node

sends

the

Asthe data is received and

next

or

client

duplicated, learning

begins

data

in experience.

sending

the

or in Cengage

part.

Due Learning

to

electronic reserves

node the list

writing the

of other

16

data

from the client node, the data node data

node in the list,

whole

data to this

and the

rights, the

right

some to

third remove

node

for

process

party additional

content

replication.

continues

may content

be

suppressed at

any

time

from if

This

with the

the

subsequent

eBook rights

and/or restrictions

eChapter(s). require

it
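The write path described above can be sketched in a few lines of Python. This is only an illustrative model, not the HDFS implementation: the class name NameNode, the function write_block, the node names and the round-robin placement rule are all assumptions made for the example. It shows the two ideas from the text: the name node keeps only metadata, and the file data is streamed from one data node to the next.

```python
from dataclasses import dataclass, field
from itertools import count


@dataclass
class NameNode:
    """Toy name node: it holds metadata only, never file contents."""
    data_nodes: list
    replication_factor: int = 3
    files: dict = field(default_factory=dict)            # file name -> block numbers
    block_locations: dict = field(default_factory=dict)  # block number -> data nodes
    _next_block: count = field(default_factory=lambda: count(1))

    def allocate_block(self, file_name):
        """Add the file to the metadata, pick a block number and target nodes."""
        block_no = next(self._next_block)
        self.files.setdefault(file_name, []).append(block_no)
        # Round-robin placement is an assumption for illustration only.
        start = (block_no - 1) % len(self.data_nodes)
        n = min(self.replication_factor, len(self.data_nodes))
        targets = [self.data_nodes[(start + i) % len(self.data_nodes)]
                   for i in range(n)]
        self.block_locations[block_no] = targets
        return block_no, targets


def write_block(storage, block_no, data, pipeline):
    """The first node stores the block, then forwards it down the pipeline."""
    if not pipeline:
        return
    first, rest = pipeline[0], pipeline[1:]
    storage.setdefault(first, {})[block_no] = data
    write_block(storage, block_no, data, rest)


name_node = NameNode(["dn1", "dn2", "dn3", "dn4"])
storage = {}  # data node name -> {block number: bytes}
block_no, targets = name_node.allocate_block("sales.log")
write_block(storage, block_no, b"first block of data", targets)
```

Note that the bytes pass only through write_block and the storage dictionary; the NameNode object never sees them, which mirrors the point that file data is not transmitted to the name node.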

Once the first block is written, the client node can get another block number and list of data nodes from the name node for the next block. When the entire file has been written, the client node informs the name node that the file is closed. It is important to note that at no time was any of the data file actually transmitted to the name node. This helps to reduce the data flow to the name node to avoid congestion that could slow system performance.

Similarly, if a client node needs to read a file, it contacts the name node to request the list of blocks associated with that file and the data nodes that hold them. Given that each block may appear on many data nodes, for each block the client attempts to retrieve the block from the data node that is closest to it on the network. Using this information, the client node reads the data directly from each of the nodes.

Periodically, each data node communicates with the name node. The data nodes send block reports and heartbeats. A block report is sent every six hours and informs the name node of which blocks are on that data node. Heartbeats are sent every three seconds. A heartbeat is used to let the name node know that the data node is still available. If a data node experiences a fault, due to hardware failure, power outage and so on, then the name node will not receive a heartbeat from that data node. As a result, the name node knows not to include that data node in lists to client nodes for reading or writing files. If the lack of a heartbeat from a data node causes a block to have fewer than the desired number of replicas, the name node can have a live data node initiate replicating the block on another data node.

Taken together, the components of HDFS provide a powerful, yet highly specialised distributed file system that works well for the specialised processing requirements of Big Data applications. Next, we will consider how MapReduce provides data processing to complement data storage of HDFS.

16.2.2 MapReduce

MapReduce is the computing framework used to process large data sets across clusters. Conceptually, MapReduce is easy to understand and follows the principle of divide and conquer. MapReduce takes a complex task, breaks it down into a collection of smaller subtasks, performs the subtasks all at the same time, and then combines the result of each subtask to produce a final result for the original task. As the name implies, it is a combination of a map function and a reduce function. A map function takes a collection of data and sorts and filters the data into a set of key-value pairs. The map function is performed by a program called a mapper. A reduce function takes a collection of key-value pairs, all with the same key value, and summarises them into a single result. The reduce function is performed by a program called a reducer. Recall that Hadoop is a Java-based platform; therefore, map and reduce functions are written as detailed, procedure-oriented Java programs.

Figure 16.4 provides a simple, conceptual illustration of MapReduce that determines the total number of units of each product that has been sold. The original data in Figure 16.4 is stored as key-value pairs, with the invoice number as the key and the remainder of the invoice data as a value. Remember, the data in Hadoop data storage does not constitute a relational database, so the data is not separated into tables and there is no form of normalisation that ensures that each fact is stored only once. Therefore, there is a great deal of duplication of data in the original data store. Note that, even in the small subset of data shown in the figure, there is redundant data about customer 10011, Leona Dunne. In the figure, the map function parses each invoice to find data about the products sold on that invoice. The result of the map function is a new list of key-value pairs in which the product code is the key and the line units are the value. The reduce function then takes that list of key-value pairs and combines them by summing the values associated with each key (product code) to produce the summary result.
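The map and reduce steps just described can be illustrated with a minimal, single-machine sketch in Python. It is not Hadoop code (Hadoop map and reduce functions are written in Java and run in parallel across nodes); the invoice numbers, product codes and function names below are hypothetical and exist only to show how key-value pairs flow from the map step to the reduce step.

```python
from collections import defaultdict

# Hypothetical source data: invoice number (key) -> invoice lines (value).
# Each line is a (product code, line units) pair.
invoices = {
    1001: [("P-A", 1), ("P-B", 2)],
    1002: [("P-A", 3)],
    1003: [("P-B", 1), ("P-C", 5)],
}


def map_invoice(invoice_no, lines):
    """Map: parse one invoice and emit a (product code, units) pair per line."""
    for prod_code, units in lines:
        yield prod_code, units


def group_by_key(pairs):
    """Collect the emitted values so that all pairs with the same key are together."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_units(prod_code, units_list):
    """Reduce: summarise all values for one key into a single result."""
    return prod_code, sum(units_list)


mapped = [pair
          for invoice_no, lines in invoices.items()
          for pair in map_invoice(invoice_no, lines)]
totals = dict(reduce_units(code, units)
              for code, units in group_by_key(mapped).items())
```

In a real cluster, each call to the map function would run next to the block holding that invoice, and only the much smaller key-value pairs would travel across the network to the reducers.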

FIGURE 16.4 MapReduce

As previously stated, the data sets used in Big Data applications are extremely large. Transferring entire files from multiple nodes to a central node for processing would require a tremendous amount of network bandwidth, and place an incredible processing burden on the central node. Therefore, instead of the computational program retrieving the data for processing in a central location, copies of the program are pushed to the nodes containing the data to be processed. Each copy of the program produces results that are then aggregated across nodes and sent back to the client. This mirrors the distribution of data in the HDFS. Typically, the Hadoop framework distributes a mapper for each block on each data node that must be processed. This can lead to a very large number of mappers. For example, if 1 TB of data is to be processed and the HDFS is using 64 MB blocks, that yields over 15 000 mapper programs. The number of reducers is configurable by the user, but best practices suggest about one reducer per data node.

NOTE
Best practices suggest that the number of mappers on a given node should be kept to 100 or less. However, there are cases of applications with simple map functions running as many as 300 mappers on a given node with satisfactory performance. Clearly, much depends on the computing resources available at each node.
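The figure of over 15 000 mappers for 1 TB of data can be checked with a short calculation. The sketch below assumes one mapper per block and shows both the decimal and the binary reading of "TB" and "MB", since the text does not say which is meant; the function name mapper_count is an illustrative choice.

```python
def mapper_count(data_bytes, block_bytes):
    """Blocks (and therefore mappers) needed, rounding any partial block up."""
    return -(-data_bytes // block_bytes)  # integer ceiling division


# Decimal units: 1 TB = 10**12 bytes, 64 MB = 64 * 10**6 bytes.
decimal_estimate = mapper_count(10**12, 64 * 10**6)

# Binary units: 1 TiB = 2**40 bytes, 64 MiB = 64 * 2**20 bytes.
binary_estimate = mapper_count(2**40, 64 * 2**20)

print(decimal_estimate, binary_estimate)  # 15625 16384
```

Either way the result is above 15 000, which is consistent with the estimate in the text.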

The implementation of MapReduce complements the structure of the HDFS, which is an important reason why they work so well together. Just as the HDFS structure is composed of a name node and several data nodes, MapReduce uses a job tracker (the actual name of the program is JobTracker) and several task trackers (the programs are named TaskTrackers). The job tracker acts as a central control for MapReduce processing, and it normally exists on the same server that is acting as the name node. Task tracker programs reside on the data nodes. One important feature of the MapReduce framework is that the user must write the Java code for the map and reduce functions, and must specify the input and output files to be read and written for the job that is being submitted. However, the job tracker will take care of locating the data, determining which nodes to use, dividing the job into tasks for the nodes, and managing failures of the nodes. All of this is done automatically without user intervention. When a user submits a MapReduce job for processing, the general process is as follows:

1 A client node (client application) submits a MapReduce job to the job tracker.
2 The job tracker communicates with the name node to determine which data nodes contain the blocks that should be processed for this job.
3 The job tracker determines which task trackers are available for work. Each task tracker can handle a set number of tasks. Remember, many MapReduce jobs from different users can be running on the Hadoop system simultaneously, so a data node may contain data that is being processed by multiple mappers from different jobs all at the same time. Therefore, the task tracker on that node might be busy running mappers for other jobs when this new request arrives. Because the data is replicated on multiple nodes, the job tracker may be able to select from multiple nodes for the same data.
4 The job tracker then contacts the task trackers on each of those nodes to begin mappers and reducers to complete that node's portion of the task.
5 The task tracker creates a new Java virtual machine (JVM) to run the map and reduce functions. This way, if a function fails or crashes, the entire task tracker is not halted.
6 The task tracker sends heartbeat messages to the job tracker to let the job tracker know that the task tracker is still working on the job (and about the availability of the task tracker for more jobs).
7 The job tracker monitors the heartbeat messages to determine whether a task tracker has failed. If so, the job tracker can reassign that portion of the task to another node.
8 When the entire job is finished, the job tracker changes status to indicate that the job is completed.
9 The client node periodically queries the job tracker until the job status is completed.

The Hadoop system uses batch processing. Batch processing is when a program runs from beginning to end, either completing the task or halting with an error, without any interaction with the user. Batch processing is often used when the computing task requires an extended period of time or a large portion of the system's processing capacity. Businesses often use batch processing to run year-end financial reports in the evenings when systems may be idle, and universities might use batch processing for student fee payment processing. Batch processing is not bad, but it has limitations. As a result, a number of complementary programs have been developed to improve the integration of Hadoop within the larger IT infrastructure. The next section discusses some of these programs.

16.2.3 Hadoop Ecosystem

Hadoop is widely used by organisations tapping into the potential of analysing extremely large data sets. Unfortunately, because Hadoop is a very low-level tool requiring considerable effort to create, manage and use, it presents quite a few obstacles. As a result, a host of related applications have grown up around Hadoop to attempt to make it easier to use and more accessible to users who are not skilled at complex Java programming. Figure 16.5 shows examples of some of these types of applications. Most organisations that use Hadoop also use a set of other related products that interact and complement each other to produce an entire ecosystem of applications and tools. Like any ecosystem, the interconnected pieces are constantly evolving and their relationships are changing, so it is a rather fluid situation. The following are some of the more popular components in a Hadoop ecosystem and how they relate to each other.

FIGURE 16.5 A sample of the Hadoop ecosystem (MapReduce simplification applications: Hive, Pig; data ingestion applications: Flume, Sqoop; direct query applications: HBase, Impala; core Hadoop components: MapReduce, Hadoop Distributed File System (HDFS))

MapReduce Simplification Applications

Creating MapReduce jobs requires significant programming skills. As the mapper and reducer programs become more complex, the skill requirements increase and the time to produce the programs becomes significant. These skills are beyond the capabilities of most data users. Therefore, applications to simplify the process of creating MapReduce jobs have been developed. Two of the most popular are Hive and Pig.

Hive is a data warehousing system that sits on top of HDFS. It is not a relational database, but it supports its own SQL-like language, called HiveQL, that mimics SQL commands to run ad hoc queries. HiveQL commands are processed by the Hive query engine into sets of MapReduce jobs. As a result, the underlying processing tends to be batch-oriented, producing jobs that are very scalable over extremely large sets of data. However, the batch nature of the jobs makes Hive a poor choice for jobs that only require a small subset of data to be returned very quickly.

Pig is a tool for compiling a high-level scripting language, named Pig Latin, into MapReduce jobs for executing in Hadoop. In concept it is similar to Hive in that it provides a means of producing MapReduce jobs without the burden of low-level Java programming. The primary difference is that Pig Latin is a scripting language, which means it is procedural, while HiveQL, like SQL, is declarative. Declarative languages allow the user to specify what they want, not how to get it. This is very useful for query processing. Procedural languages require the user to specify how the data is to be manipulated. This is very useful for performing data transformations. As a result, Pig is often used for producing data pipeline tasks that transform data in a series of steps. This is often seen in ETL (Extraction, Transformation and Loading) processes as described in Chapter 15.
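The difference between declarative and procedural styles can be seen by computing the same result both ways. The sketch below uses Python with standard SQL in an in-memory SQLite database as a stand-in; it is neither HiveQL nor Pig Latin, and the table name, column names and sample rows are hypothetical. The point is only the contrast: the query states what result is wanted, while the pipeline function spells out how to produce it step by step.

```python
import sqlite3

# Hypothetical invoice lines: (product code, units sold).
rows = [("P-A", 1), ("P-B", 2), ("P-A", 3), ("P-C", 5)]

# Declarative style: describe the result and let the engine decide how to get it.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE line (prod_code TEXT, units INTEGER)")
conn.executemany("INSERT INTO line VALUES (?, ?)", rows)
declarative = dict(conn.execute(
    "SELECT prod_code, SUM(units) FROM line GROUP BY prod_code"))
conn.close()


# Procedural style: specify each transformation step in order.
def pipeline(records):
    filtered = [r for r in records if r[1] > 0]        # step 1: drop empty lines
    grouped = {}
    for prod_code, units in filtered:                  # step 2: group by product
        grouped.setdefault(prod_code, []).append(units)
    return {code: sum(units) for code, units in grouped.items()}  # step 3: sum


procedural = pipeline(rows)
```

Both variables end up holding the same totals per product; what differs is who decides the order of the steps, the query engine or the author of the script.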

Data Ingestion Applications

One challenge faced by organisations that are taking advantage of Hadoop's massive data storage and data processing capabilities is the issue of actually getting data from their existing systems into the Hadoop cluster. To simplify this task, applications have been developed to ingest or gather this data into Hadoop.

Flume is a component for ingesting data into Hadoop. It is designed primarily for harvesting large sets of data from server log files, like clickstream data from Web server logs. It can be configured to import the data on a regular schedule or based on specified events. In addition to simply bringing the data into Hadoop, Flume contains a simple query-processing component so the possibility exists of performing some transformations on the data as it is being harvested. Typically, Flume would move the data into the HDFS, but it can also be configured to input the data directly into another component of the Hadoop ecosystem named HBase.


Sqoop is a more recent addition to the Hadoop ecosystem. It is a tool for converting data back and forth between a relational database and the HDFS. The name Sqoop (pronounced scoop, as in a scoop of ice cream) is an amalgam of SQL-to-Hadoop. In concept, Sqoop is similar to Flume in that it provides a way of bringing data into the HDFS. However, while Flume works primarily with log files, Sqoop works with relational databases such as Oracle, MySQL and SQL Server. Further, while Flume operates in one direction only, Sqoop can transfer data in both directions, into and out of HDFS. When transferring data from a relational database into HDFS, the data is imported one table at a time, with the process reading the table row by row. This is done in a highly parallelised manner using MapReduce, so the contents of the table will usually be distributed into several files with the rows stored in a delimited format. Once the data has been imported into HDFS, it can be processed by MapReduce jobs or using Hive. The resulting data can then be exported from HDFS back to the relational database, most often a traditional data warehouse.

Direct Query Applications

Direct query applications attempt to provide faster query access than is possible through MapReduce. These applications interact with HDFS directly, instead of going through the MapReduce processing layer.

HBase is a column-oriented NoSQL database designed to sit on top of the HDFS. One of HBase's primary characteristics is that it is highly distributed and designed to scale out easily. It does not support SQL or SQL-like languages, relying instead on lower-level languages such as Java for interaction. The system does not rely on MapReduce jobs, so it avoids the delays caused by batch processing, making it more suitable for fast processing involving smaller subsets of the data. HBase is very good at quickly processing sparse data sets. HBase is one of the more popular components of the Hadoop ecosystem and is used by Facebook for its messaging system. Column-oriented databases will be discussed in more detail in the next section.

Impala was the first SQL on Hadoop application. It was produced by Cloudera as a query engine that supports SQL queries that pull data directly from HDFS. Prior to Impala, if an organisation needed to make data from Hadoop available to analysts through an SQL interface, data would be extracted from HDFS and imported into a relational database. With Impala, analysts can write SQL queries directly against the data while it is still in HDFS. Impala makes heavy use of in-memory caching on data nodes. It is generally considered an appropriate tool for processing large amounts of data into a relatively small result set.

NOTE
Other than Impala, each of the components of the Hadoop ecosystem described in this section is an open-source, top-level project of the Apache Software Foundation. More information on each of these projects and many others is available at www.apache.org.

16.3 NoSQL DATABASES

NoSQL is the unfortunate name given to a broad array of non-relational database technologies that have developed to address the challenges represented by Big Data. (You may recall that we first introduced NoSQL in Chapter 2, Data Models.) The name is unfortunate in that it does not describe what the NoSQL technologies are, but rather what they are not. In fact, the name also does a poor job of explaining what the technologies are not! The name was chosen as a Twitter hashtag to simplify coordinating a meeting of developers to discuss ideas about the non-relational database technologies that were being developed by organisations like Google, Amazon and Facebook to deal with the problems they were encountering as their data sets reached enormous sizes.

The term NoSQL was never meant to imply that products in this category should never include support for SQL. In fact, many such products support query languages that mimic SQL in important ways. Although no one has yet produced a NoSQL system that implements standard SQL, given the large base of SQL users, the appeal of creating such a product is obvious. More recently, some industry observers have tried to interject that NoSQL could stand for Not Only SQL. In fact, if the requirement to be considered a NoSQL product is simply that languages beyond SQL are supported, then all of the traditional RDBMS products such as Oracle, Microsoft SQL Server, MySQL and Microsoft Access would qualify. Regardless, you are better off focusing on understanding the array of technologies to which the term refers than worrying about the name itself.

There are literally hundreds of products that can be considered as being under the broadly defined term NoSQL. Most of these fit roughly into one of four categories: key-value data stores, document databases, column-oriented databases and graph databases. Table 16.2 shows some popular NoSQL databases of each type. Although not all NoSQL databases have been produced as open-source software, most have been. As a result, NoSQL databases are generally perceived as a part of the open-source movement. Accordingly, they also tend to be associated with the Linux operating system. It makes sense from a cost standpoint that, if an organisation is going to create a cluster containing tens of thousands of nodes, the organisation does not want to purchase licences for Windows or MacOS for all of those nodes. The preference is to use a platform, such as Linux, that is freely available and highly customisable. Therefore, most of the NoSQL products run only in a Linux or Unix environment. The following sections discuss each of the major NoSQL approaches.

TABLE 16.2 NoSQL databases

NoSQL Category               Example Databases   Developer
Key-value databases          Dynamo              Amazon
                             Riak                Basho
                             Redis               Redis Labs
                             Voldemort           LinkedIn
Document databases           MongoDB             MongoDB, Inc.
                             CouchDB             Apache
                             OrientDB            OrientDB Ltd
                             RavenDB             Hibernating Rhinos
Column-oriented databases    HBase               Apache
                             Cassandra           Apache (originally Facebook)
                             Hypertable          Hypertable, Inc.
Graph databases              Neo4J               Neo4j
                             ArangoDB            ArangoDB, LLC
                             GraphBase           FactNexus

16.3.1 Key-Value Databases

Key-value (KV) databases are conceptually the simplest of the NoSQL data models. A KV database is a NoSQL database that stores data as a collection of key-value pairs. The key acts as an identifier for the value. The value can be anything such as text, an XML document, or an image. The database does not attempt to understand the contents of the value component or its meaning; the database simply stores whatever value is provided for the key.

It is the job of the applications that use the data to understand the meaning of the data in the value component. There are no foreign keys; in fact, relationships cannot be tracked among keys at all. This greatly simplifies the work that the DBMS must perform, making KV databases extremely fast and scalable for basic processing.

Key-value pairs are typically organised into buckets. A bucket can roughly be thought of as the KV database equivalent of a table. A bucket is a logical grouping of keys. Key values must be unique within a bucket, but they can be duplicated across buckets. All data operations are based on the bucket plus the key. In other words, it is possible to query the data based on the bucket + key combination. It is not possible to query based on anything in the value component of the key-value pair. All queries are performed by specifying the bucket and key. Operations on KV databases are rather simple: only get, store and delete operations are used. Get or fetch is used to retrieve the value component of the pair. Store is used to place a value in a key. If the bucket + key combination does not exist, then it is added as a new key-value pair. If the bucket + key combination does exist, then the existing value component is replaced with the new value. Delete is used to remove a key-value pair.

Figure 16.6 shows a customer bucket with three key-value pairs. Since the KV model does not allow queries based on data in the value component, it is not possible to query for a key-value pair based on customer last name, for example. In fact, the KV DBMS does not even know that there is such a thing as a customer last name because it does not understand the content of the value component. An application could issue a get command to have the KV DBMS return the key-value pair for bucket Customer and key 10011, but it would be up to the application to know how to parse the value component to find the customer's last name, first name and other characteristics. (One important note about Figure 16.6: although key-value pairs appear in tabular form in the figure, the tabular format is just a convenience to help visually distinguish the components. Actual key-value pairs are not stored in a table-like structure.)

FIGURE 16.6 Key-value database storage

Bucket = Customer
Key      Value
10010    LName Ramas FName Alfred Initial A Areacode 0161 Phone 844-2573 Balance 0
10011    LName Dunne FName Leona Initial K Areacode 0181 Phone 894-1238 Balance 0
10014    LName Orlando FName Myron Areacode 0161 Phone 222-1672 Balance 0

16.3.2 Document Databases

Document databases are conceptually similar to key-value databases, and they can almost be considered a subtype of KV databases. A document database is a NoSQL database that stores data in tagged documents in key-value pairs. Unlike a KV database, where the value component can contain any type of data, a document database always stores a document in the value component. The document can be in any encoded format, such as XML, JavaScript Object Notation (JSON), or Binary JSON (BSON). Another important difference is that, while KV databases do not attempt to understand the content of the value component, document databases do. Tags are named portions of a document. For example, a document may have tags to identify which text in the document represents the title, author and body of the document. Within the body of the document, there may be additional tags to indicate chapters and sections. Despite the use of tags in documents, document databases are considered schema-less, that is, they do not impose a predefined structure on the data that is stored.

For a document database, being schema-less means that, although all documents have tags, not all documents are required to have the same tags, so each document can have its own structure. The tags in a document database are extremely important because they are the basis for most of the additional capabilities that document databases have over KV databases. Tags inside the document are accessible to the DBMS, which makes sophisticated querying possible. Just as KV databases group key-value pairs into logical groups called buckets, document databases group documents into logical groups called collections. While a document may be retrieved by specifying the collection and key, it is also possible to query based on the contents of tags. For example, Figure 16.7 represents the same data from Figure 16.6, but in a tagged format for a document database. Because the DBMS is aware of the tags within the documents, it is possible to write queries that retrieve all of the documents where the Balance tag has the value 0. Document databases even support some aggregate functions such as summing or averaging balances in queries. You learn some basic document database operations later in this chapter, and a hands-on tutorial on MongoDB is included in Appendix Q, Working with MongoDB (available on the online platform).

FIGURE 16.7 Document database tagged format

Collection = Customer
Key      Document
10010    {LName: Ramas, FName: Alfred, Initial: A, Areacode: 0161, Phone: 844-2573, Balance: 0}
10011    {LName: Dunne, FName: Leona, Initial: K, Areacode: 0181, Phone: 894-1238, Balance: 0}
10014    {LName: Orlando, FName: Myron, Areacode: 0161, Phone: 222-1672, Balance: 0}

Document databases tend to operate on an implied assumption that a document is relatively self-contained, not a fragment of the data about a given topic. Relational databases decompose complex data in the business environment into a set of related tables. For example, data about orders may be decomposed into customer, invoice, line, and product tables. A document database would expect all of the data related to an order to be in a single order document. Therefore, each order document in an Orders collection would contain data on the customer, the order itself, and the products purchased in that order, all as a single self-contained document. Document databases do not store relationships as perceived in the relational model and generally have no support for join operations.
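To make the idea of a self-contained order document concrete, the following is a minimal sketch in plain JavaScript. The field names (customer, lines, pricePence and so on) and the sample values are assumptions made for illustration; they are not taken from the chapter's data files.

```javascript
// Hypothetical order document: every field name and value here is illustrative.
const order = {
  _id: 5001,
  date: "2020-03-20",
  customer: { code: 10010, lname: "Ramas", fname: "Alfred" },
  lines: [
    { product: "Claw hammer", units: 2, pricePence: 995 },
    { product: "Rat-tail file", units: 1, pricePence: 499 }
  ]
};

// Everything needed to total the order lives inside the one document,
// so no join against customer, line, or product tables is required.
function orderTotalPence(doc) {
  return doc.lines.reduce((sum, line) => sum + line.units * line.pricePence, 0);
}

console.log(orderTotalPence(order)); // 2489
```

In a relational design the same total would normally be computed by joining invoice, line, and product rows; here the application reads one document and works on it directly.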

16.3.3 Column-Oriented Databases

The term column-oriented database can refer to two different sets of technologies that are often confused with each other. In one sense, column-oriented database or columnar database can refer to traditional, relational database technologies that use column-centric storage instead of row-centric storage. Relational databases present data in logical tables; however, the data is actually stored in data blocks containing rows of data. All of the data for a given row is stored together in sequence, with many rows in the same data block. If a table has many rows of data, the rows will be spread across


many data blocks. Figure 16.8 illustrates a relational table with 10 rows of data that is physically stored across five data blocks. Row-centric storage minimises the number of disk reads necessary to retrieve a row of data. Retrieving one row of data requires accessing just one data block, as shown in Figure 16.8.

FIGURE 16.8 Comparison of row-centric and column-centric storage

CUSTOMER relational table

Cus_Code  Cus_LName  Cus_FName  Cus_City    Cus_Country
10010     Ramas      Alfred     Manchester  UK
10011     Dunne      Leona      Durban      SA
10012     Smith      Kathy      Paris       FR
10013     Olowski    Paul       Manchester  UK
10014     Orlando    Myron      NULL        NULL
10015     O'Brian    Amy        Durban      SA
10016     Brown      James      NULL        NULL
10017     Williams   George     Utrecht     NL
10018     Farriss    Anne       Cape Town   SA
10019     Smith      Olette     Manchester  UK

Row-centric storage

Block 1   10010,Ramas,Alfred,Manchester,UK
          10011,Dunne,Leona,Durban,SA
Block 2   10012,Smith,Kathy,Paris,FR
          10013,Olowski,Paul,Manchester,UK
Block 3   10014,Orlando,Myron,NULL,NULL
          10015,O'Brian,Amy,Durban,SA
Block 4   10016,Brown,James,NULL,NULL
          10017,Williams,George,Utrecht,NL
Block 5   10018,Farriss,Anne,Cape Town,SA
          10019,Smith,Olette,Manchester,UK

Column-centric storage

Block 1   10010,10011,10012,10013,10014
          10015,10016,10017,10018,10019
Block 2   Ramas,Dunne,Smith,Olowski,Orlando
          O'Brian,Brown,Williams,Farriss,Smith
Block 3   Alfred,Leona,Kathy,Paul,Myron
          Amy,James,George,Anne,Olette
Block 4   Manchester,Durban,Paris,Manchester,NULL
          Durban,NULL,Utrecht,Cape Town,Manchester
Block 5   UK,SA,FR,UK,NULL
          SA,NULL,NL,SA,UK

Remember, in transactional systems, normalisation is used to decompose complex data into related tables to reduce redundancy and to improve the speed of rapid manipulation of small sets of data. These manipulations tend to be row-oriented, so row-oriented storage works very well. However, in queries that retrieve a small set of columns across a large set of rows, a large number of disk accesses are required. For example, a query that wants to retrieve only the city and province of every customer will have to access every data block that contains a customer row to retrieve that data. In Figure 16.8, that would mean accessing five data blocks to get the city and province of every customer.

A column-oriented or columnar database stores the data in blocks by column instead of by row. A single customer's data will be spread across several blocks, but all of the data from a single column will be in just a few blocks. In Figure 16.8, all of the city data for customers will be stored together, just as all of the province data will be stored together. In that case, retrieving the city and province for every customer might require accessing only two data blocks. This type of column-centric storage works very well for databases that are primarily used to run queries over few columns but many rows, as is done in many reporting systems and data warehouses. Though Figure 16.8 shows only a few rows and data blocks, it is easy to imagine that the gains would be significant if the table size grew to millions or billions of rows across hundreds of thousands of data blocks. At the same time, column-centric storage would be very inefficient for processing transactions, since insert, update, and delete activities would be very disk intensive. It is worth noting that column-centric storage can be achieved within relational database technology, meaning that it still requires structured data and has the advantage of supporting SQL for queries.
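The block counts discussed for Figure 16.8 can be reproduced with a small simulation. This is only a sketch in plain JavaScript: the two-rows-per-block capacity mirrors the figure, while the one-block-per-column layout and the function names are simplifying assumptions, not a description of how any particular DBMS lays out storage.

```javascript
// CUSTOMER rows from Figure 16.8: code, last name, first name, city, country.
const rows = [
  [10010, "Ramas", "Alfred", "Manchester", "UK"],
  [10011, "Dunne", "Leona", "Durban", "SA"],
  [10012, "Smith", "Kathy", "Paris", "FR"],
  [10013, "Olowski", "Paul", "Manchester", "UK"],
  [10014, "Orlando", "Myron", null, null],
  [10015, "O'Brian", "Amy", "Durban", "SA"],
  [10016, "Brown", "James", null, null],
  [10017, "Williams", "George", "Utrecht", "NL"],
  [10018, "Farriss", "Anne", "Cape Town", "SA"],
  [10019, "Smith", "Olette", "Manchester", "UK"]
];
const ROWS_PER_BLOCK = 2; // matches the five row-centric blocks in the figure

// Row-centric layout: each block stores complete rows.
const rowBlocks = [];
for (let i = 0; i < rows.length; i += ROWS_PER_BLOCK) {
  rowBlocks.push(rows.slice(i, i + ROWS_PER_BLOCK));
}

// Column-centric layout (assumed): each block stores every value of one column.
const columnBlocks = rows[0].map((_, col) => rows.map((row) => row[col]));

// Read the requested column positions for every customer and count blocks touched.
function readRowCentric(cols) {
  const result = [];
  let blocksRead = 0;
  for (const block of rowBlocks) {
    blocksRead += 1; // every block holds part of the answer
    for (const row of block) result.push(cols.map((c) => row[c]));
  }
  return { blocksRead, result };
}

function readColumnCentric(cols) {
  const picked = cols.map((c) => columnBlocks[c]); // one block per requested column
  const result = picked[0].map((_, i) => picked.map((column) => column[i]));
  return { blocksRead: picked.length, result };
}

const CITY = 3;
const COUNTRY = 4;
console.log(readRowCentric([CITY, COUNTRY]).blocksRead);    // 5
console.log(readColumnCentric([CITY, COUNTRY]).blocksRead); // 2
```

Both functions return the same city and country pairs; only the number of blocks touched differs, which is the effect the text describes.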

The other use of the term column-oriented database, also called column family database, is to describe a type of NoSQL database that takes the concept of column-centric storage beyond the confines of the relational model. As NoSQL databases, these products do not require that the data conform to predefined structures, nor do they support SQL for queries. This database model originated with Google's Bigtable product. Other column-oriented database products include HBase, Hypertable, and Cassandra. Cassandra began as a project at Facebook, but Facebook released it to the open-source community, which has continued to develop Cassandra into one of the most popular column-oriented databases.

A column family database is a NoSQL database that organises data in key-value pairs with keys mapped to a set of columns in the value component. While column family databases use many of the same terms as relational databases, the terms don't mean quite the same things. Fortunately, the column family databases are conceptually simple and are conceptually close enough to the relational model that your understanding of the relational model can help you understand the column family model. A column is a key-value pair that is similar to a cell of data in a relational database. The key is the name of the column, and the value component is the data that is stored in that column. Therefore, cus_lname: Ramas is a column; cus_lname is the name of the column, and Ramas is the data value in the column. Similarly, cus_city: Cape Town is another column, with cus_city as the column name and Cape Town as the data value.

NOTE Even though column family databases do not (yet) support standard SQL, Cassandra developers created a Cassandra query language (CQL). It is similar to SQL in many respects and is one of the compelling reasons for adopting Cassandra.

As more columns are added, it becomes clear that some columns form natural groups, such as cus_fname, cus_lname, and cus_initial, which would logically group together to form a customer's name. Similarly, cus_street, cus_city, cus_province and cus_postcode would logically group together to form a customer's address. These groupings are used to create super columns. A super column is a group of columns that are logically related. Recall the discussion in Chapter 4 about simple and composite attributes in the entity relationship model. In many cases, super columns can be thought of as the composite attribute and the columns that compose the super column as the simple attributes. Just as not all simple attributes have to belong to a composite attribute, not all columns have to belong to a super column. Although this analogy is helpful in many contexts, it is not perfect. It is possible to group columns into a super column that logically belongs together for application processing reasons but does not conform to the relational idea of a composite attribute.

Row keys are created to identify objects in the environment. All of the columns or super columns that describe these objects are grouped together to create a column family; therefore, a column family is conceptually similar to a table in the relational model. Although a column family is similar in concept to a relational table, Figure 16.9 shows that it is structurally very different. Notice in Figure 16.9 that each row key in the column family can have different columns.
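The structure of a column family, with row keys mapped to sets of columns that may differ from row to row, can be sketched with ordinary JavaScript objects. The data follows Figure 16.9; the helper names getColumn and columnNames are hypothetical and are not part of Cassandra, HBase, or any other product's API.

```javascript
// Column family CUSTOMERS from Figure 16.9, sketched as row key -> columns.
// Each column is a name:value pair, and rows need not share the same columns.
const customers = new Map([
  [1, { City: "Manchester", Fname: "Alfred", Lname: "Ramas", Country: "UK" }],
  [2, { Balance: 345.86, Fname: "Kathy", Lname: "Smith" }],
  [3, { Company: "Local Markets, Inc.", Lname: "Dunne" }]
]);

// Hypothetical helper: read one column for a row key.
// Returns undefined when the row key or the column does not exist.
function getColumn(family, rowKey, columnName) {
  const row = family.get(rowKey);
  return row ? row[columnName] : undefined;
}

// Hypothetical helper: list the column names actually stored for a row key.
function columnNames(family, rowKey) {
  return Object.keys(family.get(rowKey) ?? {});
}

console.log(getColumn(customers, 1, "City"));    // Manchester
console.log(getColumn(customers, 2, "City"));    // undefined: row 2 has no City column
console.log(columnNames(customers, 3));          // [ 'Company', 'Lname' ]
```

Unlike a relational table, nothing forces row 2 to carry a City column or row 1 to carry a Balance column; a missing column is simply absent rather than stored as NULL.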

NOTE A column family can be composed of columns or super columns, but it cannot contain both.

FIGURE 16.9 Column family database

Column Family Name: CUSTOMERS

Rowkey 1   City: Manchester
           Fname: Alfred
           Lname: Ramas
           Country: UK

Rowkey 2   Balance: 345.86
           Fname: Kathy
           Lname: Smith

Rowkey 3   Company: Local Markets, Inc.
           Lname: Dunne

16.3.4 Graph Databases

A graph database is a NoSQL database based on graph theory to store data about relationship-rich environments. Graph theory is a mathematical and computer science field that models relationships, or edges, between objects called nodes. Modelling and storing data about relationships is the focus of graph databases. Graph theory is a well-established field of study going back hundreds of years. As a result, creating a database model based on graph theory immediately provides a rich source for algorithms and applications that have helped graph databases gain in sophistication very quickly. As it also happens that much of the data explosion over the past decade has involved data that is relationship-rich, graph databases have been poised to experience significant interest in the business environment.

Interest in graph databases originated in the area of social networks. Social networks include a wide range of applications beyond the typical Facebook, Twitter and Instagram examples that immediately come to mind. Dating websites, knowledge management, logistics and routing, master data management, and identity and access management are all areas that rely heavily on tracking complex relationships among objects. Of course, relational databases support relationships too. One of the great advances of the relational model was that relationships are easy to maintain. A relationship between a customer and an agent is as easy to implement in the relational model as adding a foreign key to create a common attribute, and the customer and agent rows are related by having the same value in the common attributes. If the customer changes to a different agent, then simply changing the value in the foreign key will change the relationship between the rows to maintain the integrity of the data. The relational model does all of these things very well. However, what if we want a "like" option so customers can "like" agents on our website? This would require a structural change to the database to add a new foreign key to support this second relationship. Next, what if the company wants to allow

customers on its website to "friend" each other so a customer can see which agents their friends like, or the friends of their friends? In social networking data, there can be dozens of different relationships among individuals that need to be tracked, and often the relationships are tracked many layers deep (e.g., friends, friends of friends, and friends of friends of friends). This results in a situation where the relationships become just as important as the data itself. This is the area where graph databases shine.

The primary components of graph databases are nodes, edges and properties, as shown in Figure 16.10. A node corresponds to the idea of a relational entity instance. The node is a specific instance of something we want to keep data about. Each node (circle) in Figure 16.10 represents a single agent. Properties are like attributes; they are the data that we need to store about the node. All agent nodes might have properties like first name and last name, but not all nodes are required to have the same properties. An edge is a relationship between nodes. Edges (shown as arrows in Figure 16.10) can be in one direction, or they can be bidirectional. For example, in Figure 16.10, the friends relationships are bidirectional, but the likes relationships are not. Note that edges can also have properties. In Figure 16.10, the date on which customer Alfred Ramas liked agent Alex Alby is recorded in the graph database. A query in a graph database is called a traversal. Instead of querying the database, the correct terminology would be traversing the graph. Graph databases excel at traversals that focus on relationships between nodes, such as shortest path and degree of connectedness.

Graph databases share some characteristics with other NoSQL databases in that graph databases do not force data to fit predefined structures, do not support SQL, and are optimised to provide velocity of processing, at least for relationship-intensive data. However, other key characteristics do not apply to graph databases. Graph databases do not scale out very well to clusters due to differences in aggregate awareness.
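A traversal such as "degree of connectedness" can be illustrated with a breadth-first search over a small in-memory graph. This is a sketch in plain JavaScript rather than a graph database query language. The node names come from Figure 16.10, but apart from Alfred Ramas liking agent Alex Alby (stated in the text), the edges below are assumptions invented for the example, as are the function names.

```javascript
// Edges as {from, to, label}. Only the first edge is stated in the text;
// the rest are illustrative assumptions.
const edges = [
  { from: "Alfred Ramas", to: "Alex Alby", label: "likes" },
  { from: "Alfred Ramas", to: "Leona Dunne", label: "friends" },
  { from: "Leona Dunne", to: "Kathy Smith", label: "friends" },
  { from: "Kathy Smith", to: "Paul Olowski", label: "friends" },
  { from: "Kathy Smith", to: "John Okon", label: "likes" }
];

// Nodes reachable from `node` over one edge with the given label.
// "friends" edges are treated as bidirectional; "likes" edges are one-way.
function neighbours(node, label) {
  const out = [];
  for (const e of edges) {
    if (e.label !== label) continue;
    if (e.from === node) out.push(e.to);
    else if (label === "friends" && e.to === node) out.push(e.from);
  }
  return out;
}

// Breadth-first traversal: number of "friends" hops between two people,
// or -1 when they are not connected through friends edges.
function degreesOfSeparation(start, goal) {
  const seen = new Set([start]);
  let frontier = [start];
  for (let depth = 0; frontier.length > 0; depth++) {
    if (frontier.includes(goal)) return depth;
    const next = [];
    for (const node of frontier) {
      for (const n of neighbours(node, "friends")) {
        if (!seen.has(n)) {
          seen.add(n);
          next.push(n);
        }
      }
    }
    frontier = next;
  }
  return -1;
}

// Agents liked by the direct friends of a customer.
function agentsLikedByFriends(customer) {
  return neighbours(customer, "friends").flatMap((f) => neighbours(f, "likes"));
}

console.log(degreesOfSeparation("Alfred Ramas", "Paul Olowski")); // 3
console.log(agentsLikedByFriends("Leona Dunne")); // [ 'Alex Alby', 'John Okon' ]
```

In a relational design each additional hop would typically mean another self-join; here the traversal simply follows edges outward one layer at a time.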

FIGURE 16.10 Graph database representation

[Diagram: customer and agent nodes, each with an ID, a Type, and properties such as Fname, Lname and Phone, connected by edges labelled likes, friends and assists. Each edge has its own ID, and edges can carry properties such as Date.]

16.3.5 Aggregate Awareness

Key-value, document and column family databases are aggregate aware. Aggregate aware means that the data is collected or aggregated around a central topic or entity. For example, a blog website might organise data around individual blog posts. All data related to each blog post are aggregated into a single denormalised collection that might include data about the blog post (title, content and date posted), the poster (user name and screen name), and all comments made on the post (comment content and commenter's user name and screen name). In a normalised, relational database, this same data might call for USER, BLOGPOST and COMMENT tables. Determining the best central entity for forming aggregates is one of the most important tasks in designing most NoSQL databases, and is determined by how the application will use the data.

The aggregate-aware database models achieve clustering efficiency by making each piece of data relatively independent. That allows a key-value pair to be stored on one node in the cluster without the DBMS needing to associate it with another key-value pair that may be on a different node on the cluster. The greater the number of nodes involved in a data operation, the greater the need for coordination and centralised control of resources. Separating independent pieces of data, often called shards, across nodes in the cluster is what allows NoSQL databases to scale out so effectively.
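The way independent aggregates can be spread across cluster nodes can be sketched as follows. This is a toy model in plain JavaScript, assuming a fixed three-node cluster and a simple string hash; real products use their own partitioning schemes, and the names hashKey, nodeFor, put and get are hypothetical.

```javascript
// Assumed cluster size; each "node" is modelled as an in-memory Map.
const NODE_COUNT = 3;
const cluster = Array.from({ length: NODE_COUNT }, () => new Map());

// Simple deterministic string hash (illustrative only).
function hashKey(key) {
  let h = 0;
  for (const ch of String(key)) {
    h = (h * 31 + ch.codePointAt(0)) >>> 0; // keep the value in 32-bit unsigned range
  }
  return h;
}

// The key alone decides which node owns the aggregate.
function nodeFor(key) {
  return hashKey(key) % NODE_COUNT;
}

function put(key, aggregate) {
  cluster[nodeFor(key)].set(key, aggregate);
}

// A read touches exactly one node because the whole aggregate lives there.
function get(key) {
  return cluster[nodeFor(key)].get(key);
}

// A blog post aggregate: post, poster and comments stored together.
put("post:42", {
  title: "Why NoSQL?",
  poster: { userName: "kbrown", screenName: "Kay" },
  comments: [{ userName: "asmith", screenName: "Al", content: "Nice post" }]
});

console.log(get("post:42").comments.length); // 1
```

Because the post, its poster and its comments travel together under one key, no coordination between nodes is needed to answer a request for that post.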

Graph databases, like relational databases, are aggregate ignorant. Aggregate-ignorant models do not organise the data into collections based on a central entity. Data about each topic is stored separately and joins are used to aggregate individual pieces of data as needed. Aggregate-ignorant databases, therefore, tend to be more flexible at allowing applications to combine data elements in a greater variety of ways. Graph databases specialise in highly related data, not independent pieces of data. As a result, graph databases tend to perform best in centralised or lightly clustered environments, similar to relational databases.

16.4 NewSQL DATABASES

Relational databases are the mainstay of organisational data, and NoSQL databases do not attempt to replace them for supporting line-of-business transactions. These transactions that support the day-to-day operations of business rely on ACIDS-compliant transactions and concurrency control, as discussed in Chapter 12. NoSQL databases (except graph databases that focus on specific relationship-rich domains) are concerned with the distribution of user-generated and machine-generated data over massive clusters. NewSQL databases try to bridge the gap between RDBMS and NoSQL. NewSQL databases attempt to provide ACIDS-compliant transactions over a highly distributed infrastructure. NewSQL databases are the latest technologies to appear in the data management arena to address Big Data problems. As a new category of data management products, NewSQL databases have not yet developed a track record of success and have been adopted by relatively few organisations.

NewSQL products, such as ClustrixDB and NuoDB, are designed from scratch as hybrid products that incorporate features of relational databases and NoSQL databases. Like RDBMSs, NewSQL databases support SQL as the primary interface and ACIDS-compliant transactions. Similar to NoSQL, NewSQL databases also support highly distributed clusters and key-value or column-oriented data stores.

As expected, no technology can perfectly provide the advantages of both RDBMS and NoSQL, so NewSQL has disadvantages (the CAP theorem covered in Chapter 14 still applies!). Principally, the disadvantages that have been discovered centre on NewSQL's heavy use of in-memory storage. Critics point to the fact that this can jeopardise the durability component of ACIDS. Further, the ability to handle vast data sets can be impacted by the reliance on in-memory structures because there are practical limits to the amount of data that can be held in memory. Although in theory NewSQL databases should be able to scale out significantly, in practice little has been done to scale beyond a few dozen data nodes. While this is a marked improvement over traditional RDBMS distribution, it is far from the hundreds of nodes used by NoSQL databases.

A few NoSQL databases have experienced success in niche markets by providing solutions to specific business needs. The following sections provide a brief introduction to two NoSQL databases, MongoDB and Neo4j. These databases provide a set of functionality not yet matched by traditional relational databases. You can find more detailed hands-on examples of these two widely used database products in Appendixes Q and R, respectively, available on the online platform.

16.5 WORKING WITH DOCUMENT DATABASES USING MongoDB

This section introduces you to MongoDB, a popular document database. Among the NoSQL databases currently available, MongoDB has been one of the most successful in penetrating the database market. Therefore, learning the basics of working with MongoDB can be quite useful for database professionals.

NOTE MongoDB is a product of MongoDB, Inc. In this book, we use the Community Server v.4.0.9 edition, which is open source and available free of charge from MongoDB, Inc. New versions are released regularly. This version of MongoDB is available for Windows, MacOS and Linux from the MongoDB website.

The name, MongoDB, comes from the word humongous, as its developers intended their new product to support extremely large data sets. It is designed for high availability, high scalability and high performance.

Online Content An expanded set of hands-on exercises using MongoDB can be found in Appendix Q, Working with MongoDB, available on the online platform.

As a document database, MongoDB is schema-less and aggregate aware. Recall that being schema-less means that not all documents are required to conform to the same structure, and the structure of documents does not have to be declared ahead of time. Aggregate aware means that the documents encapsulate all relevant data related to a central entity within the same document. Data is stored in documents, documents of a similar type are stored in collections, and related collections are stored in a database. To the users, the documents appear as JSON files, which makes them easy to read and easy to manipulate in a variety of programming languages. Recall that JavaScript Object Notation (JSON) is a data interchange format that represents data as a logical object. Objects are enclosed in curly brackets { } that contain key:value pairs. A single JSON object can contain many key:value pairs separated by commas. A simple JSON document to store data on a book might look like this:

{_id: 101, title: "Database Principles"}

This document contains two key:value pairs: _id is a key with 101 as the associated value, and title is a key with "Database Principles" as the associated value. The value component may have multiple values that would be appropriate for a given key. In the previous example, adding a key:value pair for authors could have values such as Coronel and Morris. When there are multiple values for a single key, an array is used. Arrays in JSON are placed inside square brackets [ ]. For example, the above document could be expanded to:

{_id: 101, title: "Database Principles", author: ["Coronel", "Morris", "Crockett", "Blewett"]}

When JSON documents are intended to be read by humans, they are often displayed with each key:value pair on a separate line to improve readability, such as:

{
  _id: 101,
  title: "Database Principles",
  author: ["Coronel", "Morris", "Crockett", "Blewett"]
}

MongoDB databases are composed of collections of documents. Each MongoDB server can host many databases. When connected to the MongoDB server, the first task is to specify with which database object you want to work. A list of the databases available on the server can be retrieved with the command:

show dbs

All data manipulation commands in MongoDB must be directed to a particular database. Creating a new database in MongoDB is as easy as issuing the use command.

use fact

The use command informs the server which database is to be the target of the commands that follow. If there is a database with the name specified, then that database will be used for the subsequent commands. If there is not a database with that name, then one is created automatically.

Online Content The documents for the fact database are available as a collection of JSON documents that can be directly imported into MongoDB. The file is named 'Ch16_Fact.json' and is available on the online platform.

16.5.1 Importing Documents in MongoDB

Remember that a MongoDB database is a collection of documents. The collection of documents we will use to illustrate a sample MongoDB query is based on a fact database and a patron collection. Free Access to Computer Technology (FACT) is a small library run by the Computer Information Systems department at Tiny College. The portion of the model that is being used here consists of documents with patron as the central entity. The documents have the following structure:

{_id: <system-generated ObjectID>,
 display: <the patron's full name as it will be displayed to users.>,
 fname: <patron's first name in all lowercase letters.>,
 lname: <patron's last name in all lowercase letters.>,
 type: <either faculty or student.>,
 age: <patron's age in years; used only if the patron is a student.>,
 checkouts: <an array of objects for the patron's checkout history.>
   [id: <an assigned number for this checkout object.>,
    year: <the year in which this checkout occurred.>,
    month: <the month in which this checkout occurred.>,
    day: <the day of the month in which this checkout occurred.>,
    book: <the book number of the book for this checkout.>,
    title: <the title of the book.>,
    pubyear: <the year the book was published.>,
    subject: <the subject of the book.>]
}

Notice that the patron collection contains information about each patron. Notice also that the checkouts subdocument is an array of objects with all the books that the patron has checked out. Finally, note that the patron's name is stored twice: once in display, with first name and last name together with capitalisation, and again with first name and last name in all lowercase letters in separate key:value pairs. The reason for this is that all searches in MongoDB are case sensitive by default, so storing the name twice facilitates searches.

NOTE The database can be created using the Ch16_Fact.json file by using the following command at an operating system command prompt (note that the command is for use at a command prompt in the OS, not inside the MongoDB shell).

mongoimport --db fact --collection patron --type json --file Ch16_Fact.json

Mongoimport is an executable program that is installed with MongoDB and is used to import data into a MongoDB database. The above command specifies that the imported documents should be placed in the fact database (if one does not exist, it will be created) and in the patron collection (if one does not exist, it will be created). Mongoimport can work with different file types such as CSV files and JSON files. The type parameter specifies that the imported documents are already in JSON format. The file parameter specifies the name of the file to be imported. If your copy of the Ch16_Fact.json file is not in the current directory for your command prompt, you will need to provide an appropriate path for the file location.


16.5.2 Example of a MongoDB Query Using find( )

Once the patron collection is imported, you are ready to query the MongoDB database. In order to manipulate collections, a MongoDB database uses methods. Methods are programmed functions to manipulate objects. Examples of such methods are createCollection( ), getName( ), insert( ), update( ), find( ), and so on. The find( ) method retrieves objects from a collection that match the restrictions provided.

The find( ) method has two parameters:

find({<query>}, {<projection>})

The <query> parameter specifies the criteria to retrieve the collection objects. The <projection> parameter is optional and specifies which key:value pairs to return. The value with each key in the projection object is either 0 (do not return) or 1 (return). For example, Figure 16.11 shows the code to retrieve the _id, display name, age and type for patrons that either have the last name barry and are faculty, or have the last name hays and are under 30 years old:

db.patron.find({$or: [
    {$and: [{lname: "barry"}, {type: "faculty"}]},
    {$and: [{lname: "hays"}, {age: {$lt: 30}}]}
]}, {display: 1, age: 1, type: 1}).pretty( )

FIGURE 16.11 Example of MongoDB document query

Notice also that this example uses the pretty( ) method. The pretty( ) method is a MongoDB method that is used to improve the readability of the documents by placing key:value pairs on separate lines.

MongoDB is a powerful document database that is being adopted by many organisations. It was originally designed to support Web-based operations and, as such, it draws heavily on JavaScript for the structure of its documents and for its query language.
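To make the roles of the query and projection documents more concrete, the following is a minimal sketch in plain JavaScript that mimics how the filter and projection of Figure 16.11 behave on an in-memory array. It does not use MongoDB at all: the matches and project helpers and the three sample patrons are hypothetical, and only the $or, $and and $lt operators and plain equality are handled.

```javascript
// Hypothetical sample documents standing in for the patron collection.
const patrons = [
  { _id: 1, display: "Ann Barry", lname: "barry", type: "faculty", age: 45 },
  { _id: 2, display: "Tom Hays", lname: "hays", type: "student", age: 22 },
  { _id: 3, display: "Sue Hays", lname: "hays", type: "faculty", age: 51 },
];

// Returns true when doc satisfies the query document.
// Supports only $or, $and, $lt and plain equality.
function matches(doc, query) {
  return Object.entries(query).every(([key, cond]) => {
    if (key === "$or") return cond.some((q) => matches(doc, q));
    if (key === "$and") return cond.every((q) => matches(doc, q));
    if (cond !== null && typeof cond === "object" && "$lt" in cond) {
      return doc[key] < cond.$lt;
    }
    return doc[key] === cond;
  });
}

// Keeps _id plus every key flagged with 1 in the projection document.
function project(doc, projection) {
  const out = { _id: doc._id };
  for (const [key, flag] of Object.entries(projection)) {
    if (flag === 1 && key in doc) out[key] = doc[key];
  }
  return out;
}

const query = { $or: [
  { $and: [{ lname: "barry" }, { type: "faculty" }] },
  { $and: [{ lname: "hays" }, { age: { $lt: 30 } }] },
] };
const projection = { display: 1, age: 1, type: 1 };

const result = patrons
  .filter((p) => matches(p, query))
  .map((p) => project(p, projection));
console.log(result);
```

With this sample data, the first two patrons satisfy one branch of the $or, while the third satisfies neither. The lname key is absent from the output because it is not flagged in the projection.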

NOTE We have introduced you, here, to the basic concepts of a MongoDB collection and how to query it using the find( ) method, but there is much more to learn if you are interested in pursuing a career in document databases. Appendix Q, Working with MongoDB, contains a more thorough tutorial on how to use this powerful document database and is located on the online platform of this book.

16.6 WORKING WITH GRAPH DATABASES USING Neo4j

Even though Neo4j is not yet as widely adopted as MongoDB, it has been one of the fastest growing NoSQL databases, with thousands of adopters including LinkedIn and Walmart. Neo4j is a graph database. Like relational databases, graph databases still work with concepts similar to entities and relationships. However, in relational databases, the focus is primarily on the entities. In graph databases, the focus is on the relationships.

Online Content An expanded set of hands-on exercises using Neo4j can be found in Appendix R, Working with Neo4j, available on the online platform of this book.

Graph databases are used in environments with complex relationships among entities. Graph databases, therefore, are heavily reliant on interdependence among their data, which is why they are the least able to scale out among the NoSQL database types. Consider an example of a social network such as LinkedIn that connects people. A person can be friends with many people, and those people can be friends with many other people. In terms of a relational model, we could represent this as a person entity with a many-to-many unary relationship. In implementation, we would create a bridge for the relationship and end up with a two-entity solution.

Imagine that the person table has 10 000 people (rows) in it, and each person has on average 30 friends. The bridge table will have 300 000 rows. A query to retrieve a person and the names of his or her friends would require two joins: one to link the person to their friends in the bridge table, and another to retrieve the names of those friends from the person table. A relational database can perform this query quickly. The problem comes when we look beyond that direct friend relationship. What if we want to know about friends of friends? This requires joining the bridge table to another copy of itself so that the friends of each friend can be included. Joining a 300 000-row table to itself is not trivial (there is a Cartesian product in the join, producing 90 billion rows). A relational DBMS can handle this volume, but it is starting to be slow. Now what if we want friends of friends of friends? Then the relational engine is contending with a Cartesian product of 2.7 × 10^16 rows!

As you can see, by the time we are working with six degrees of separation types of problems, relational database technology is unable to keep up. These types of highly interdependent queries about relationships, which could take hours to run in a relational database, can complete in seconds in a graph database. In fact, you often encounter the phrase "minutes to milliseconds" when adopters describe their use of graph databases. Queries about relationships are the forte of graph databases.
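The row counts quoted for the social network example can be checked with a few lines of arithmetic. The sketch below assumes only the figures given in the text (10 000 people with 30 friends each) and treats every self-join as a full Cartesian product, which is the worst case the text describes. The final figure, the number of friend-of-friend paths that start from one person, is an illustrative contrast and is not taken from the text.

```javascript
// Figures taken from the social network example in the text.
const people = 10000;
const avgFriends = 30;

// One bridge row per (person, friend) pair.
const bridgeRows = people * avgFriends;

// Worst-case Cartesian product sizes when the bridge table is
// combined with one, and then two, further copies of itself.
const twoCopies = bridgeRows * bridgeRows;
const threeCopies = twoCopies * bridgeRows;

// Illustrative contrast (an assumption, not from the text): paths of
// length two that start from a single person.
const pathsFromOnePerson = avgFriends * avgFriends;

console.log(bridgeRows);                   // 300000
console.log(twoCopies);                    // 90000000000 (90 billion)
console.log(threeCopies.toExponential(1)); // 2.7e+16
console.log(pathsFromOnePerson);           // 900
```

The gap between 90 billion candidate row pairs and 900 paths from a single starting person shows why a query that only follows existing relationships can have far less work to do.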

NOTE Neo4j is a product of Neo4j, Inc. There are multiple versions of Neo4j available. In this book, we use the Community Server edition, v.3.5.5, which is open source and available free of charge from the Neo4j, Inc. website. New versions are released regularly. This version of Neo4j is available for Windows (64-bit and 32-bit), MacOS and Linux.

Neo4j provides several interface options. It was originally designed with Java programming in mind, and optimised for interaction through a Java API. Later releases have included the options for a Neo4j command shell, similar to the MongoDB shell, a REST API for website interaction, and a graphical, browser-based interface for intuitive interactive sessions. In this section, you will use the Web browser interface.

16.6.1 Creating Nodes in Neo4j

NOTE An instance of Neo4j can have only one active database at a time. However, the data path for the database can be changed in the configuration before starting the database server. If the data path is changed to point at an empty directory, Neo4j automatically creates all needed database files in that path on start-up. By changing the data path before starting the server, multiple databases can be maintained, keeping each in a separate folder.

As you learnt earlier in the chapter, graph databases are composed of nodes and edges. Roughly speaking, nodes in a graph database correspond to entity instances in a relational database. In Neo4j, a label is the closest thing to the concept of a table in the relational model. A label is a tag that is used to associate a collection of nodes as being of the same type or belonging to the same group. Just as entity instances have values for attributes to describe the characteristics of that instance, a node has properties that describe the characteristics of that node. Unlike the relational model, graph databases are schema-less, so nodes with the same label are not required to have the same set of properties. In fact, nodes can have more than one label if they logically belong to more than one kind or type of node.

Consider an example of a club for food critics where members share reviews of area restaurants. Each member would be represented as a node. Each restaurant would be represented as a node. To distinguish the types of nodes, you can use labels. The nodes for members might get a Member label, and nodes for restaurants get a Restaurant label. Although members and restaurants are both nodes, the label makes it more convenient, both in code and in the minds of users and programmers, to distinguish between one kind of node and another.

The interactive, declarative query language in Neo4j is called Cypher. Cypher is declarative, like SQL, even though the syntax is very different. However, being a declarative language instead of an imperative language, Cypher is very easy to learn, and a few simple commands can be used to perform basic database processing. Nodes and relationships are created using a CREATE command. The following code creates a member node:

CREATE (:Member {mid: 1, fname: "Phillip", lname: "Stallings"})

NOTE Neo4j creates an internal ID field named <id> for every node and relationship; however, this field is for internal use within the database for storage algorithms. It is not intended to be, and should not be used as, a unique key.

The previous command creates a node with the Member label. That node was given the property mid with the value 1, the property fname with the value Phillip and the property lname with the value Stallings. The mid property is being used as a member ID field to identify the members. If there is not already a label named Member, it is created at the same time the node is.

16.6.2 Retrieving Node Data with MATCH and WHERE

Let's start by issuing a simple command to retrieve our single member node:

MATCH (m)
RETURN (m)

In this case, this command retrieves all of the nodes in the graph database. In this case, the only node is for Phillip Stallings, so that is the only node to display. If many nodes existed, we could have used a command such as the following to retrieve only the node for Phillip Stallings:

MATCH (m {fname: "Phillip"}), (m {lname: "Stallings"})
RETURN m

Here, the properties and values used as criteria were embedded in the node. Alternatively, the use of a WHERE clause allows for more complex criteria, such as using comparison operators other than equality. The previous command can be rewritten using a WHERE clause as follows:

MATCH (m)
WHERE m.fname = "Phillip" AND m.lname = "Stallings"
RETURN m

Online Content The 'Ch16_FCC.txt' file used in the following section is available on the online platform of this book.

The following section assumes that you have preloaded the food critics database using the Ch16_FCC.txt file, available online. This file contains a single, massive command that creates 78 members, 67 restaurants, 43 owners and 8 cuisines. The contents of the file should be copied and pasted into the editor bar in the Neo4j browser interface and executed using the play button in the interface. Because the browser interface is designed for interactive use, it does not support script files with multiple commands. Providing the code as a single command that includes many statements is therefore necessary if you are using the browser interface. The command may seem unfamiliar to you. To learn more about such commands, and about Neo4j, please refer to Appendix R, Working with Neo4j, available on the online platform.

16.6.3 Retrieving Relationship Data with MATCH and WHERE

Beyond retrieving nodes, it is possible to retrieve data based on the relationships between nodes. As stated earlier, focusing on relationships is the primary strength of graph databases. For example, the following command retrieves every member who has reviewed the restaurant Tofu for You and rated the restaurant a 4 on taste:

MATCH (m :Member) -[r :REVIEWED {taste: 4}]-> (res :Restaurant {name: "Tofu for You"})
RETURN m, r, res

When retrieving data based on a relationship, criteria for the direction of the relationship and any data characteristics of the relationship can be specified in the query. In this example, there are two nodes (m and res) and a relationship that joins them (r). In this case, we are matching all nodes that are members, the one node that is named Tofu for You, and all relationships that are labelled as REVIEWED and have a property named taste equal to the value 4. You could add comparisons and logical operators using the WHERE clause, as shown in the following command, with the results shown in Figure 16.12:

MATCH (m :Member) -[r :REVIEWED]-> (res :Restaurant)
WHERE (r.value > 4 OR r.taste > 4) AND res.state = "KY"
RETURN m, r, res

FIGURE 16.12 Neo4j query using MATCH/WHERE/RETURN

The command retrieves all members who have reviewed any restaurant in Johannesburg and rated the restaurant greater than 4 on value or taste. Notice that using the WHERE clause allows the use of inequalities, such as greater than, and logical operators.
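The idea behind a relationship pattern followed by a WHERE filter can be illustrated without a Neo4j server. The following sketch, in plain JavaScript, walks a tiny in-memory graph and keeps each (m, r, res) combination that fits the shape Member, REVIEWED, Restaurant and passes the filter. Everything in it is hypothetical: the matchReviews helper, the node ids and the ratings are invented for illustration and are not part of Cypher or of the Ch16_FCC.txt data.

```javascript
// Hypothetical miniature graph; ids, names and ratings are invented.
const nodes = new Map([
  [1, { labels: ["Member"], props: { fname: "Phillip", lname: "Stallings" } }],
  [2, { labels: ["Member"], props: { fname: "Ana", lname: "Lopez" } }],
  [10, { labels: ["Restaurant"], props: { name: "Tofu for You", state: "KY" } }],
  [11, { labels: ["Restaurant"], props: { name: "Grill House", state: "TN" } }],
]);
const relationships = [
  { type: "REVIEWED", from: 1, to: 10, props: { taste: 5, value: 3 } },
  { type: "REVIEWED", from: 2, to: 10, props: { taste: 4, value: 4 } },
  { type: "REVIEWED", from: 2, to: 11, props: { taste: 5, value: 5 } },
];

// Mimics (m :Member) -[r :REVIEWED]-> (res :Restaurant) plus a WHERE filter.
function matchReviews(where) {
  const rows = [];
  for (const r of relationships) {
    if (r.type !== "REVIEWED") continue;
    const m = nodes.get(r.from);   // start node of the relationship
    const res = nodes.get(r.to);   // end node of the relationship
    if (!m.labels.includes("Member")) continue;
    if (!res.labels.includes("Restaurant")) continue;
    if (where(m, r, res)) rows.push({ m, r, res });
  }
  return rows;
}

const rows = matchReviews(
  (m, r, res) =>
    (r.props.value > 4 || r.props.taste > 4) && res.props.state === "KY"
);
console.log(rows.map((row) => row.m.props.fname));
```

Only the first review survives: the second fails both rating comparisons, and the third belongs to a restaurant in a different state. The sketch starts from the relationships rather than from the nodes, which mirrors the emphasis that graph databases place on relationships.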

NOTE This section is just a very brief introduction to Neo4j, but there is much more to learn if you are interested in pursuing a career in graph databases. Appendix R, Working with Neo4j, contains a more thorough tutorial on how to use this powerful graph database and is located on the online platform of this book.

In Chapter 15, you learnt about data warehouses and star schemas for modelling and storing decision support data. In this chapter, you have added to that by exploring the vast stores of data that organisations are collecting in unstructured formats and the technologies that make that data available to users. Data analytics, discussed in Chapter 15, is used to extract knowledge from all of these sources of data (NoSQL databases, Hadoop data stores, and data warehouses) to provide decision support to all organisational users. Even though relational databases are still dominant for most business transactions, and will continue to be so for the foreseeable future, the growth of Big Data must be accommodated. There is too much value in the immense amounts of unstructured data available to organisations for them to ignore it. Database professionals must be informed about these new approaches to data management to ensure that the right tool is used for each job.

SUMMARY

Big Data is characterised by data of such volume, velocity and/or variety that the relational model struggles to adapt to it. Volume refers to the quantity of data that must be stored. Velocity refers to both the speed at which data is entering storage as well as the speed with which it must be processed. Variety refers to the lack of uniformity in the structure of the data being stored. As a result of Big Data, organisations are having to employ a variety of data storage solutions that include technologies in addition to relational databases, a situation referred to as polyglot persistence.

Volume, velocity, variety, veracity and value are collectively referred to as the 5 Vs of Big Data. However, these are not the only characteristics of Big Data to which data administrators must be sensitive. Additional Vs that have been suggested by the data management industry include variability and visualisation. Variability is the variation in the meaning of data that can occur over time. Further, visualisation is the requirement that the data must be able to be presented in a manner that makes it comprehensible to decision makers. Most of these additional Vs are not unique to Big Data. There are also concerns for data in relational databases.

The Hadoop framework has quickly emerged as a standard for the physical storage of Big Data. The primary components of the framework include the Hadoop Distributed File System (HDFS) and MapReduce. HDFS is a coordinated technology for reliably distributing data over a very large cluster of commodity servers. MapReduce is a complementary process for distributing data processing across distributed data. One of the key concepts for MapReduce is to move the computations to the data instead of moving the data to the computations. MapReduce works by combining the functions of map, which distributes subtasks to the cluster servers that hold data to be processed, and reduce, which combines the map results into a single result set. The Hadoop framework also supports an entire ecosystem of additional tools and technologies, such as Hive, Pig and Flume, which work together to produce a complex system of Big Data processing.

NoSQL is a broad term that refers to any of several non-relational database approaches to data management. Most NoSQL databases fall into one of four categories: key-value databases, document databases, column-oriented databases or graph databases. Due to the wide variability of products under the NoSQL umbrella, these categories are not necessarily all-encompassing, and many products can fit into multiple categories.

Key-value databases store data in key-value pairs. In a key-value pair, the value of the key must be known to the DBMS, but the data in the value component can be of any type, and the DBMS makes no attempt to understand the meaning of the data in it. These types of databases are very fast when the data is completely independent, and the application programs can be relied on to understand the meaning of the data.

Document databases also store data in key-value pairs, but the data in the value component is an encoded document. The document must be encoded using tags, such as in XML or JSON. The DBMS is aware of the tags in the documents, which makes querying on tags possible. Document databases expect documents to be self-contained and relatively independent of one another.

Column-oriented databases, also called column family databases, organise data into key-value pairs in which the value component is composed of a series of columns, which are themselves key-value pairs. Columns can be grouped into super columns, similar to a composite attribute in the relational model being composed of simple attributes. All objects of a similar type are identified as rows, given a row key, and placed within a column family. Rows within a column family are not required to have the same structure; that is, they are not required to have the same columns.

Graph databases are based on graph theory and represent data through nodes, edges and properties. A node is similar to an instance of an entity in the relational model. Edges are the relationships between nodes. Both nodes and edges can have properties, which are attributes that describe the corresponding node or edge. Graph databases excel at tracking data that is highly interrelated, such as social media data. Due to the many relationships among the nodes, it is difficult to distribute a graph database across a cluster in a highly distributed manner.

NewSQL databases attempt to integrate features of both RDBMS (providing ACID-compliant transactions) and NoSQL databases (using a highly distributed infrastructure).

MongoDB is a document database that stores documents in JSON format. The documents can be created, updated, deleted and queried using a JavaScript-like language, named MongoDB Query Language. Data retrieval is done primarily through the find( ) method.

Neo4j is a graph database that stores data as nodes and relationships, both of which can contain properties to describe them. Neo4j databases are queried using Cypher, a declarative language that shares many commonalities with SQL, but is still significantly different in many ways. Data retrieval is done primarily through the MATCH command to perform pattern matching.
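The division of labour between map and reduce described in this summary can be sketched in a few lines of plain JavaScript. This is only an illustration under stated assumptions: the three arrays stand in for data blocks held on different cluster servers, the mapBlock and reduceCounts functions are hypothetical names, and nothing here uses Hadoop itself.

```javascript
// Hypothetical data blocks, one per "server" in the cluster.
const blocks = [
  ["hays", "barry", "hays"],
  ["barry", "stallings"],
  ["hays"],
];

// Map: each block is processed where it lives and yields partial counts.
function mapBlock(block) {
  const partial = {};
  for (const key of block) partial[key] = (partial[key] || 0) + 1;
  return partial;
}

// Reduce: combine the partial results into a single result set.
function reduceCounts(partials) {
  const total = {};
  for (const partial of partials) {
    for (const [key, count] of Object.entries(partial)) {
      total[key] = (total[key] || 0) + count;
    }
  }
  return total;
}

const totals = reduceCounts(blocks.map(mapBlock));
console.log(totals); // { hays: 3, barry: 2, stallings: 1 }
```

Only the small partial counts travel to the reduce step. The blocks themselves stay where they are, which reflects the principle of moving the computation to the data.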

KEY TERMS

aggregate aware
aggregate ignorant
algorithm
batch processing
block report
BSON (Binary JSON)
bucket
collection
column-centric storage
column family
column family database
Cypher
document database
edge
feedback loop processing
find()
graph database
Hadoop Distributed File System (HDFS)
heartbeat
job tracker
JSON (JavaScript Object Notation)
key-value (KV) database
map
mapper
MapReduce
method
NewSQL
node
NoSQL
polyglot persistence
pretty()
properties
reduce
reducer
row-centric storage
scaling out
scaling up
semi-structured data
sentiment analysis
stream processing
structured data
super column
task tracker
traversal
unstructured data
value
variability
variety
velocity
veracity
visualisation
volume

REVIEW QUESTIONS

1 What is Big Data? Give a brief definition.
2 What are the traditional 3 Vs of Big Data? Briefly, define each.
3 Explain why companies like Google and Amazon were among the first to address the Big Data problem.
4 Explain the difference between scaling up and scaling out.
5 What is stream processing, and why is it sometimes necessary?
6 How is stream processing different from feedback loop processing?
7 Explain why veracity, value and visualisation can be said to apply to relational databases as well as Big Data.
8 What is polyglot persistence, and why is it considered a new approach?
9 What are the key assumptions made by the Hadoop Distributed File System approach?
10 What is the difference between a name node and a data node in HDFS?
11 Explain the basic steps in MapReduce processing.
12 Briefly explain how HDFS and MapReduce are complementary to each other.
13 What are the four basic categories of NoSQL databases?
14 How are the value components of a key-value database and a document database different?
15 Briefly explain the difference between row-centric and column-centric data storage.
16 What is the difference between a column and a super column in a column family database?
17 Explain why graph databases tend to struggle with scaling out.
18 Explain what it means for a database to be aggregate aware.

Chapter 17 Database Connectivity and Web Technologies

In this chapter, you will learn:

About the different database connectivity technologies
About the functionality and features of various database connectivity technologies: ODBC, OLE, ADO.NET and JDBC
Which services are provided by Web application servers
What Extensible Markup Language (XML) is and why it is important for Web database development
About cloud computing and how it enables the database-as-a-service model
About the Semantic Web and how it describes concepts in a way that computers can actually understand

Preview

Databases are the central repository for critical data generated by business applications, including newer channels such as the Web and mobile devices such as iPads, iPhones and Android phones. To be useful universally, the data must be available to all business users. Those users may access the data via a spreadsheet, a user-developed Visual Basic application, or a Web front end. In this chapter, you will learn about the architectures and Web technologies that applications use to connect to databases.

The internet has changed how organisations of all types operate. For example, buying goods and services via the internet has become commonplace. In today's environment, interconnectivity occurs not only between an application and the database, but also between applications interchanging messages and data. The Extensible Markup Language (XML) provides a standard way of exchanging unstructured and structured data between applications.

Companies that want to offer new services can now choose from a range of internet-based services within cloud computing environments. These provide a quick and cost-efficient way to provide new business services. Therefore, you learn how organisations can benefit from leveraging the database-as-a-service model and cloud-based services within their IT portfolio.

17.1 DATABASE CONNECTIVITY

Database connectivity refers to the mechanisms through which application programs connect and communicate with data repositories. Databases store data in persistent storage structures so that it can be retrieved at a later time for processing. As you have already learnt, the database management system (DBMS) functions as an intermediary between the data (stored in the database) and the end user's applications. Before learning about the various data connectivity options, it is important to review some important fundamentals you have learnt in this book:

DBMSs provide means to interact with the data in their databases. This could be in the form of administrative tools and data manipulation tools. DBMSs also provide a proprietary way for external application programs to connect to the database by means of an application programming interface. (See Chapter 1, The Database Approach.)

Modern DBMSs have the option to store data locally or distributed in multiple locations. Locally stored data resides in the same processing host as the DBMS. A distributed database stores data in multiple geographically distributed nodes with data management capability. (See Chapter 14, Distributed Databases.)

The database connectivity software we discuss in this chapter supports Structured Query Language (SQL) as the standard data manipulation language. However, depending on the type of database model, some database connectivity interfaces may support other proprietary data manipulation languages.

Database connectivity software works in a client/server architecture, in which processing tasks are split among multiple software layers. In this model, the multiple layers exchange control messages and data. (See Chapter 14, Distributed Databases, and Appendix F, Client/Server Systems, located on the online platform of this book, for more information on this topic.)

To better understand database connectivity software, we use client/server concepts in which an application is broken down into interconnected functional layers. In the case of database connectivity software, you could break down its basic functionality into three broad layers:

1 A data layer where the data resides. You can think of this layer as the actual data repository interface. This layer resides closest to the database itself and is normally provided by the DBMS vendor.

2 A middle layer that manages multiple connectivity and data transformation issues. This layer is in charge of dealing with data logic issues, data transformations, ways to talk to the database below it, and so on. This would also include translating multiple data manipulation languages to the native language supported by the specific data repository.

3 A top layer that interfaces with the actual external application. This mostly comes in the form of an application programming interface that publishes specific protocols for the external programs to interact with the data.
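The three layers listed above can be pictured as objects that each talk only to the layer directly beneath them. The sketch below is a deliberately simplified illustration in plain JavaScript and not a real connectivity interface: the class names DataLayer, MiddleLayer and ConnectivityApi, the "scan" command and the sample row are all hypothetical.

```javascript
// Data layer: stands in for a vendor-provided repository interface.
class DataLayer {
  constructor() {
    this.rows = [{ id: 1, name: "Phillip" }];
  }
  runNative(command) {
    // Only one hypothetical native command is understood here.
    if (command.op === "scan") return this.rows.filter(command.predicate);
    throw new Error("unsupported native command");
  }
}

// Middle layer: translates a generic request into the native form.
class MiddleLayer {
  constructor(dataLayer) {
    this.dataLayer = dataLayer;
  }
  execute(request) {
    const native = {
      op: "scan",
      predicate: (row) => row[request.field] === request.equals,
    };
    return this.dataLayer.runNative(native);
  }
}

// Top layer: the API that the external application calls.
class ConnectivityApi {
  constructor(middle) {
    this.middle = middle;
  }
  findBy(field, value) {
    return this.middle.execute({ field, equals: value });
  }
}

const api = new ConnectivityApi(new MiddleLayer(new DataLayer()));
const found = api.findBy("name", "Phillip");
console.log(found);
```

The application only ever sees findBy. Swapping the data layer for a different repository would, in this sketch, require changes to the middle layer's translation step but not to the calling code.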

From the previous discussion, you can understand why database connectivity software is also known as database middleware: it provides an interface between the application program and the database or data repository. The data repository, also known as the data source, represents the data management application, such as Oracle, SQL Server, IBM DB2 or NoSQL, that will be used to store the data generated by the application program. Ideally, a data source or data repository can be located anywhere and hold any type of data. Furthermore, the same database connectivity middleware can support multiple data sources at the same time. For example, the data source could be a relational database, a NoSQL database, a spreadsheet, a Microsoft Access database, or a text data file. This multidata-source-type capability is based on the support of well-established data access standards.

The need for standard database connectivity interfaces cannot be overstated. Just as SQL has become the de facto data manipulation language, a standard database connectivity interface is necessary for enabling applications to connect to data repositories. Although there are many ways to achieve database connectivity, this section covers only the following interfaces:

Native SQL connectivity (vendor provided)
Microsoft's Open Database Connectivity (ODBC), Data Access Objects (DAO) and Remote Data Objects (RDO)
Microsoft's Object Linking and Embedding for Database (OLE-DB)
Microsoft's ActiveX Data Objects (ADO.NET)
Oracle's Java Database Connectivity (JDBC)

The data connectivity interfaces illustrated here are the dominant players in the market and, more importantly, they enjoy the support of most database vendors. In fact, ODBC, OLE-DB and ADO.NET form the backbone of Microsoft's Universal Data Access (UDA) architecture, a collection of technologies used to access any type of data source and manage the data through a common interface. As you will see, Microsoft's database connectivity interfaces have evolved over time: each interface builds on top of the other, thus providing enhanced functionality, features, flexibility

market

OLE-DB,

As you builds

will

on top

support.
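The three-layer split described at the start of this section can be sketched in a few lines of Python. This is only an illustration under stated assumptions: an in-memory SQLite database stands in for the vendor's data repository, and the function names (`open_data_layer`, `run_request`, `cheap_products`) and the `product` table are hypothetical, not part of any connectivity standard.

```python
import sqlite3


def open_data_layer():
    """Data layer: the repository itself (SQLite stands in for a vendor DBMS)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE product (code TEXT PRIMARY KEY, price REAL)")
    conn.executemany(
        "INSERT INTO product VALUES (?, ?)",
        [("P1", 9.99), ("P2", 24.50), ("P3", 4.25)],
    )
    return conn


def run_request(conn, request):
    """Middle layer: translate a neutral request into the repository's
    native language (SQL in this sketch) and run it."""
    # Table and column names are checked against a fixed whitelist
    # before they are placed in the SQL text.
    allowed = {"product": {"code", "price"}}
    table = request["table"]
    columns = request["columns"]
    if table not in allowed or not set(columns) <= allowed[table]:
        raise ValueError("unknown table or column")
    sql = f"SELECT {', '.join(columns)} FROM {table} WHERE price <= ? ORDER BY code"
    return conn.execute(sql, (request["max_price"],)).fetchall()


def cheap_products(conn, max_price):
    """Top layer: the API published to the external application."""
    request = {"table": "product", "columns": ["code", "price"], "max_price": max_price}
    return run_request(conn, request)


conn = open_data_layer()
print(cheap_products(conn, 10.0))
```

The external program calls only `cheap_products`; it never sees SQL or the connection details, which is the role the text assigns to the top layer.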

17.1.1 Native SQL Connectivity

Most DBMS vendors provide their own methods for connecting to their databases, although they can support more standard interfaces as well. Native SQL connectivity refers to the connection interface that is provided by the database vendor and is unique to that vendor. The best example of that type of native interface is the Oracle RDBMS. To connect a client application to an Oracle database, you must install and configure Oracle's SQL*Net interface on the client computer. Figure 17.1 shows the configuration of the Oracle SQL*Net interface on the client computer.

Native database connectivity interfaces are optimised for their DBMS, and these interfaces support access to most, if not all, of the database features. However, maintaining multiple native interfaces for different databases can become a burden for the programmer. Therefore, the need for universal database connectivity arises. Usually, the native database connectivity interface provided by the vendor is not the only way to connect to a database; most current DBMS products support other database connectivity standards, the most common being ODBC.


FIGURE 17.1 Oracle native connectivity

17.1.2 ODBC, DAO and RDO

Developed in the early 1990s, Open Database Connectivity (ODBC) is Microsoft's implementation of a superset of the SQL Access Group Call Level Interface (CLI) standard for database access. ODBC is probably the most widely supported database connectivity interface. ODBC allows any Windows application to access relational data sources, using SQL via a standard application programming interface (API). The Webopedia online dictionary (www.webopedia.com) defines an API as a set of routines, protocols and tools for building software applications. A good API makes it easy to develop a program by providing all of the building blocks. A programmer puts the blocks together. Most operating environments, such as Microsoft Windows, provide an API so that programmers can write applications consistent with the operating environment. Although APIs are designed for programmers, they are ultimately good for users because they guarantee that all programs using a common API will have similar interfaces. That makes it easy for users to learn new programs.

ODBC was the first widely adopted database middleware standard, and it enjoyed rapid adoption in Windows applications. As programming languages evolved, ODBC did not provide significant functionality beyond the ability to execute SQL to manipulate relational style data. Therefore, programmers needed a better way to access data. To answer that need, Microsoft developed two other data access interfaces:

Data Access Objects (DAO) was Microsoft's first object-oriented API and was used primarily in Visual Basic. It allowed an application to interface directly with Microsoft Access. DAO is still widely used.

Remote Data Objects (RDO) was a higher-level object-oriented API that allowed developers to access remote data, primarily through ODBC data sources.

However, ActiveX Data Objects (ADO) is a move towards Microsoft's Universal Data Access (UDA) model, a new framework that Microsoft has proposed. UDA is designed to bring about a single uniform API that will allow access to relational and non-relational data.


Figure 17.2 illustrates how Windows applications can use ODBC, DAO and RDO to access local and remote relational data sources.

FIGURE 17.2 Using ODBC, DAO and RDO to access databases

As you can tell by examining Figure 17.2, client applications can use ODBC to access relational data sources. However, the DAO and RDO object interfaces provide more functionality. DAO and RDO make use of the underlying ODBC data services. ODBC, DAO and RDO are implemented as shared code that is dynamically linked to the Windows operating environment through dynamic link libraries (DLLs). DLLs are stored as files with the .dll extension. Running as a DLL, the code speeds up load and run times.

The basic ODBC architecture has three main components:

A high-level ODBC API through which application programs access ODBC functionality
A driver manager that is in charge of managing all database connections
An ODBC driver that communicates directly with the DBMS.
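The division of labour among the three ODBC components can be illustrated with a small simulation. The sketch below is not the ODBC API; the classes `SimulatedDriver` and `DriverManager`, their methods and the data source names are all hypothetical, and the drivers only return a string instead of contacting a DBMS.

```python
class SimulatedDriver:
    """Hypothetical driver: the only component that talks to 'its' DBMS."""

    def __init__(self, dbms_name):
        self.dbms_name = dbms_name

    def execute(self, sql):
        # A real driver would translate the call into the DBMS's native protocol.
        return f"[{self.dbms_name}] {sql}"


class DriverManager:
    """Keeps track of drivers and data sources, and routes each call."""

    def __init__(self):
        self._drivers = {}       # driver name -> driver object
        self._data_sources = {}  # data source name -> driver name

    def register_driver(self, name, driver):
        self._drivers[name] = driver

    def define_data_source(self, dsn, driver_name):
        if driver_name not in self._drivers:
            raise KeyError(f"no driver registered as {driver_name!r}")
        self._data_sources[dsn] = driver_name

    def execute(self, dsn, sql):
        # The application names only the data source; the manager picks the driver.
        driver = self._drivers[self._data_sources[dsn]]
        return driver.execute(sql)


manager = DriverManager()
manager.register_driver("oracle", SimulatedDriver("Oracle"))
manager.register_driver("access", SimulatedDriver("Access"))
manager.define_data_source("SalesDSN", "oracle")
manager.define_data_source("LocalDSN", "access")

print(manager.execute("SalesDSN", "SELECT * FROM invoice"))
print(manager.execute("LocalDSN", "SELECT * FROM customer"))
```

The application code is identical for both data sources; only the name it passes to the manager changes.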

Defining a data source is the first step in using ODBC. To define a data source, you must create a data source name (DSN) for the data source. To create a DSN, you need to provide:

An ODBC driver. You need to identify the driver to use to connect to the data source. The ODBC driver is normally provided by the database vendor, although Microsoft provides several drivers to connect to most common databases. For example, if you are using an Oracle DBMS, you will select the Oracle ODBC driver provided by Oracle or, if desired, the Microsoft-provided ODBC driver for Oracle.

A name. This is a unique name by which the data source will be known to ODBC and, therefore, to applications. ODBC offers two types of data sources: user and system. User data sources are available only to the user. System data sources are available to all users, including operating system services.

ODBC driver parameters. Most ODBC drivers require specific parameters to establish a connection to the database. For example, if you are using a Microsoft Access database, you must point to the location of the Microsoft Access (.mdb) file and, if necessary, provide a username and password. If you are using a DBMS server, you must provide the server name, the database name, and the username and password needed to connect to the database. Figure 17.3 shows the ODBC screens required to create a system ODBC data source for an Oracle DBMS. Note that some ODBC drivers use the native driver provided by the DBMS vendor.

FIGURE 17.3 Configuring an Oracle ODBC data source
[Figure callouts: Defining an ODBC system data source name (DSN) to connect to an Oracle DBMS. The Oracle ODBC driver uses the native SQL connectivity. If no user ID is provided, ODBC will prompt for the user ID and password at run time.]
SOURCE: Course Technology/Cengage Learning
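As a rough illustration of the parameters listed above for a DBMS server, the sketch below collects them in one record and builds an ODBC-style connection string from the data source name and credentials. `DSN`, `UID` and `PWD` are standard ODBC connection-string keywords; the class `ServerDataSource`, its field names and the sample values are assumptions made for this example.

```python
from dataclasses import dataclass


@dataclass
class ServerDataSource:
    """Hypothetical record of what a server-based data source needs."""
    dsn: str       # unique name by which the data source is known
    server: str    # server name
    database: str  # database name
    username: str
    password: str

    def missing_fields(self):
        # Report any parameter that was left empty.
        return [name for name, value in vars(self).items() if not value]


def connection_string(source):
    """Build a semicolon-separated KEYWORD=value string.
    Values containing ';' or braces are not handled in this sketch."""
    return f"DSN={source.dsn};UID={source.username};PWD={source.password}"


source = ServerDataSource("SalesDSN", "dbhost", "sales", "student", "secret")
print(source.missing_fields())
print(connection_string(source))
```

A library that speaks ODBC would typically accept a string of this shape when opening a connection; the server and database names are stored with the DSN definition itself, so they do not need to be repeated in the string.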


Once the ODBC data source is defined, application programmers can write to the ODBC API by issuing specific commands and providing the required parameters. The ODBC Driver Manager will properly route the calls to the appropriate data source. The ODBC API standard defines three levels of compliance: Core, Level-1 and Level-2, which provide increasing levels of functionality. For example, Level-1 may provide support for most SQL DDL and DML statements, including subqueries and aggregate functions, but no support for procedural SQL or cursors. The database vendors can choose which level to support. However, to interact with ODBC, the database vendor must implement all of the features indicated in that ODBC API support level.

Figure 17.4 shows how you could use Microsoft Excel to retrieve data from an Oracle RDBMS, using ODBC.

FIGURE 17.4 Microsoft Excel uses ODBC to connect to the Oracle database
[Figure: a client application (Excel) calls the ODBC interface and ODBC API; the ODBC driver manager and ODBC driver pass the request to the RDBMS server and its database. Numbered callouts 1 to 10 walk through choosing the ODBC data source in Excel, entering the authentication parameters, selecting the table and columns, setting filtering and sorting options, and returning the result set to the spreadsheet.]

As much of the functionality provided by these interfaces is oriented to accessing relational data sources, the use of the interfaces is limited when they are used with other data source types. With the advent of object-oriented programming languages, it has become more important to provide access to other non-relational data sources.

17.1.3 OLE-DB

Although ODBC, DAO and RDO were widely used, they did not provide support for non-relational data. To answer that need and to simplify data connectivity, Microsoft developed Object Linking and Embedding for Database (OLE-DB). Based on Microsoft's Component Object Model (COM), OLE-DB is database middleware that adds object-orientated functionality for access to relational and non-relational data. OLE-DB was the first part of Microsoft's strategy to provide a unified object-orientated framework for the development of next-generation applications.

OLE-DB is composed of a series of COM objects that provide low-level database connectivity for applications. Since OLE-DB is based on COM, the objects contain data and methods (also known as the interface). The OLE-DB model is better understood when you divide its functionality into two types of objects:

Consumers are objects (applications or processes) that request and use data. The data consumers request data by invoking the methods exposed by the data provider objects (public interface) and passing the required parameters.

Providers are objects that manage the connection with a data source and provide data to the consumers. Providers are divided into two categories: data providers and service providers.

Data providers provide data to other processes. Database vendors create data provider objects that expose the functionality of the underlying data source (relational, object-oriented, text, and so on).

Service providers provide additional functionality to consumers. The service provider is located between the data provider and the consumer. The service provider requests data from the data provider, transforms the data and then provides the transformed data to the data consumer. In other words, the service provider acts like a data consumer of the data provider and as a data provider for the data consumer (end-user application). For example, a service provider could offer cursor management services, transaction management services, query processing services and indexing services.
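The consumer, data provider and service provider roles can be mimicked with plain Python objects. This is a conceptual sketch only and does not use COM or the real OLE-DB interfaces; `TextFileProvider`, `SortingService`, `consumer_report` and the method name `get_rows` are hypothetical.

```python
class TextFileProvider:
    """Data provider: exposes rows from an underlying source
    (a list of comma-separated text lines in this sketch)."""

    def __init__(self, lines):
        self._lines = lines

    def get_rows(self):
        return [tuple(line.split(",")) for line in self._lines]


class SortingService:
    """Service provider: acts as a consumer of the data provider and as a
    provider for the end consumer, transforming the data in between."""

    def __init__(self, provider, key_index):
        self._provider = provider
        self._key_index = key_index

    def get_rows(self):
        rows = self._provider.get_rows()
        return sorted(rows, key=lambda row: row[self._key_index])


def consumer_report(provider):
    """Consumer: requests data through the provider's public interface."""
    return [f"{name}: {city}" for name, city in provider.get_rows()]


raw = TextFileProvider(["Smith,Leeds", "Adams,Durban", "Khan,Cardiff"])
by_name = SortingService(raw, key_index=0)

print(consumer_report(raw))      # data provider used directly
print(consumer_report(by_name))  # same consumer, service provider in between
```

Because both providers expose the same interface, the consumer does not need to know whether a service provider sits in the middle.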

As a common practice, many vendors provide OLE-DB objects to augment their ODBC support, effectively creating a shared object layer on top of their existing database connectivity (ODBC or native) through which applications can interact. The OLE-DB objects expose functionality about the database; for example, there are objects that deal with relational data, hierarchical data and flat-file text data. Additionally, the objects implement specific tasks, such as establishing a connection, executing a query, invoking a stored procedure, defining a transaction, or invoking an OLAP function. By using OLE-DB objects, the database vendor can choose which functionality to implement in a modular way, instead of being forced to include all of the functionality all of the time. Table 17.1 shows a sample of the object-orientated classes used by OLE-DB and some of the methods (interfaces) exposed by the objects.

TABLE 17.1 Sample OLE-DB classes and interfaces

Object class: Session
Usage: Used to create an OLE-DB session between a data consumer application and a data provider.
Sample interfaces: IGetDataSource, ISessionProperties

Object class: Command
Usage: Used to process commands to manipulate a data provider's data. Generally, the command object will create RowSet objects to hold the data returned by a data provider.
Sample interfaces: ICommandPrepare, ICommandProperties

Object class: RowSet
Usage: Used to hold the result set returned by a relational style database or a database that supports SQL. Represents a collection of rows in a tabular format.
Sample interfaces: IRowsetInfo, IRowsetFind, IRowsetScroll

OLE-DB provides additional capabilities for the applications accessing the data. However, it does not provide support for scripting languages, especially the ones used for Web development, such as Active Server Pages (ASP) and ActiveX. (A script is written in a programming language that is not compiled but is interpreted and executed at run time.) To provide that support, Microsoft developed a new object framework called ActiveX Data Objects (ADO). ADO provides a high-level application-orientated interface to interact with OLE-DB, DAO and RDO. ADO provides a unified interface to access data from any programming language that uses the underlying OLE-DB objects. Figure 17.5 illustrates the ADO/OLE-DB architecture, showing how it interacts with ODBC and native connectivity options.

FIGURE 17.5 OLE-DB architecture

ADO introduced a simpler object model that was composed of only a few interacting objects to provide the data manipulation services required by the applications. Sample objects in ADO are shown in Table 17.2.

TABLE 17.2 Sample ADO objects

Object class: Connection
Usage: Used to set up and establish a connection with a data source. ADO will connect to any OLE-DB data source. The data source can be of any type.

Object class: Command
Usage: Used to execute commands against a specific connection (data source).

Object class: Recordset
Usage: Contains the data generated by the execution of a command. It will also contain any new data to be written to the data source. The Recordset can be disconnected from the data source.

Object class: Fields
Usage: Contains a collection of Field descriptions for each column in the Recordset.

Although the ADO model is a tremendous improvement over the OLE-DB model, Microsoft is actively encouraging programmers to use its new data access framework, ADO.NET.
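The roles of the four ADO objects in Table 17.2 have rough counterparts in most database APIs. The sketch below uses Python's built-in sqlite3 module purely as an analogy; it is not ADO, and the mapping in the comments, the `customer` table and the sample row are assumptions made for illustration.

```python
import sqlite3

# Connection: set up and establish a connection with a data source.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customer (cus_code INTEGER, cus_lname TEXT)")
conn.execute("INSERT INTO customer VALUES (10010, 'Ramas')")

# Command: execute a command against that specific connection.
cursor = conn.execute("SELECT cus_code, cus_lname FROM customer")

# Fields: a description of each column in the result.
fields = [column[0] for column in cursor.description]

# Recordset: the data generated by the execution of the command.
recordset = cursor.fetchall()

# The fetched rows stay usable after the connection is closed,
# loosely like a disconnected Recordset.
conn.close()

print(fields)
print(recordset)
```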

17.1.4 ADO.NET

Based on ADO, ADO.NET is the data access component of Microsoft's .NET application development framework. The Microsoft .NET framework is a component-based platform for developing distributed, heterogeneous, interoperable applications aimed at manipulating any type of data over any network under any operating system and programming language. Comprehensive coverage of the .NET framework is beyond the scope of this book. Therefore, this section will only introduce the basic data access component of the .NET architecture, ADO.NET.

It is important to understand that the ADO.NET framework extends and enhances the functionality provided by the ADO/OLE-DB duo. ADO.NET introduced two new features critical for the development of distributed applications: DataSets and XML support.

To understand the importance of this new model, you should know that a DataSet is a disconnected memory-resident representation of the database. That is, the DataSet contains tables, columns, rows, relationships and constraints. Once the data are read from a data provider, the data are placed on a memory-resident DataSet. The DataSet is then disconnected from the data provider. The data consumer application interacts with the data in the DataSet object to make changes (inserts, updates and deletes) in the DataSet. Once the processing is done, the DataSet data are synchronised with the data source and the changes are made permanent.

The DataSet is internally stored in XML format (you will learn about XML later in this chapter), and the data in the DataSet can be made persistent as XML documents. This is critical in today's distributed environments. In short, you can think of the DataSet as an XML-based, in-memory database that represents the persistent data stored in the data source. Figure 17.6 illustrates the main components of the ADO.NET object model.

The ADO.NET framework consolidated all data access functionality under one integrated object model. In this object model, several objects interact with one another to perform specific data manipulation functions. Those objects can be grouped as data providers and consumers.

Data provider objects are provided by the database vendors. However, ADO.NET comes with two standard data providers: a data provider for OLE-DB data sources and a data provider for SQL Server. That way, ADO.NET can work with any previously supported database, including an ODBC database with an OLE-DB data provider. At the same time, ADO.NET includes a highly optimised data provider for SQL Server.

FIGURE 17.6 ADO.NET framework
[Figure: client applications (Access, Excel, Internet) act as data consumers of ADO.NET. The DataSet (XML) holds a DataTableCollection and a DataRelationCollection; each DataTable holds a DataColumnCollection, a DataRowCollection and a ConstraintCollection. The data provider objects are DataAdapter, DataReader, Command and Connection, which reach the database through OLE-DB.]

Whatever the data provider is, it must support a set of specific objects in order to manipulate the data in the data source. Some of those objects are shown in Figure 17.6. A brief description of the objects follows:

Connection. The Connection object defines the data source used, the name of the server, the database, and so on. This object enables the client application to open and close a connection to a database.

Command. The Command object represents a database command to be executed within a specified database connection. This object contains the actual SQL code or a stored procedure call to be run by the database. When a SELECT statement is executed, the Command object returns a set of rows and columns.

DataReader. The DataReader object is a specialised object that creates a read-only session with the database to retrieve data sequentially (forward only) in a rapid manner.

DataAdapter. The DataAdapter object is in charge of managing a DataSet object. This is the most specialised object in the ADO.NET framework. The DataAdapter object contains the following objects that aid in managing the data in the DataSet: SelectCommand, InsertCommand, UpdateCommand and DeleteCommand. The DataAdapter object uses those objects to populate and synchronise the data in the DataSet with the permanent data source data.

DataSet. The DataSet object is the in-memory representation of the data in the database. This object contains two main objects. The DataTableCollection object contains a collection of DataTable objects that make up the in-memory database, and the DataRelationCollection object contains a collection of objects describing the data relationships and ways to associate one row in a table to the related row in another table.

DataTable. The DataTable object represents the data in tabular format. This object has one important property: PrimaryKey, which allows the enforcement of entity integrity. In turn, the DataTable object is composed of three main objects:

DataColumnCollection contains one or more column descriptions. Each column description has properties such as column name, data type, nulls allowed, maximum value and minimum value.
DataRowCollection contains zero rows, one row or more than one row with data as described in the DataColumnCollection.
ConstraintCollection contains the definition of the constraints for the table. Two types of constraints are supported: ForeignKeyConstraint and UniqueConstraint.

As you can see, a DataSet is, in fact, a simple database with tables, rows and constraints. Even more important, the DataSet doesn't require a permanent connection to the data source. The DataAdapter uses the SelectCommand object to populate the DataSet from a data source. However, once the DataSet is populated, it is completely independent of the data source, which is why it's called disconnected.

Additionally, DataTable objects in a DataSet can come from different data sources. This means that you could have an EMPLOYEE table in an Oracle database and a SALES table in a SQL Server database. You could then create a DataSet that relates both tables as though they were located in the same database. In short, the DataSet object paves the way for truly heterogeneous distributed database support within applications.

The ADO.NET framework is optimised to work in disconnected environments. In a disconnected environment, applications exchange messages in request/reply format. The most common example of a disconnected system is the internet. Modern applications rely on the internet as the network platform and on the Web browser as the graphical user interface. In the next section, you will learn about how internet databases work.
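The EMPLOYEE and SALES example above can be imitated in a few lines. In this sketch two separate in-memory SQLite databases stand in for the Oracle and SQL Server databases, and a dictionary of row lists stands in for the DataSet; the column names, sample rows and the function `sales_by_employee` are assumptions made for illustration and are not part of ADO.NET.

```python
import sqlite3

# First data source (stand-in for the Oracle database).
hr_db = sqlite3.connect(":memory:")
hr_db.execute("CREATE TABLE employee (emp_id INTEGER PRIMARY KEY, emp_name TEXT)")
hr_db.executemany("INSERT INTO employee VALUES (?, ?)", [(1, "Ndlovu"), (2, "Smith")])

# Second data source (stand-in for the SQL Server database).
sales_db = sqlite3.connect(":memory:")
sales_db.execute("CREATE TABLE sales (sale_id INTEGER PRIMARY KEY, emp_id INTEGER, amount REAL)")
sales_db.executemany(
    "INSERT INTO sales VALUES (?, ?, ?)",
    [(100, 1, 250.0), (101, 1, 100.0), (102, 2, 75.5)],
)

# Populate one in-memory structure from both sources, then disconnect.
dataset = {
    "EMPLOYEE": hr_db.execute("SELECT emp_id, emp_name FROM employee").fetchall(),
    "SALES": sales_db.execute("SELECT sale_id, emp_id, amount FROM sales").fetchall(),
}
hr_db.close()
sales_db.close()


def sales_by_employee(data):
    """Relate the two tables on emp_id as though they came from one database."""
    names = dict(data["EMPLOYEE"])  # emp_id -> emp_name
    totals = {name: 0.0 for name in names.values()}
    for _sale_id, emp_id, amount in data["SALES"]:
        totals[names[emp_id]] += amount
    return totals


print(sales_by_employee(dataset))
```

The relationship between the two tables exists only in the application's memory; neither source database knows about the other.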

17.1.5 Java Database Connectivity (JDBC) Java is an object-oriented in

2010)

that

languages

runs

for

multiple its

Copyright Editorial

review

2020 has

which

Learning. that

any

without Java

machine

All suppressed

Rights

Reserved. content

Web

browser Sun

means that

architecture.

a virtual

Cengage deemed

of

development.

environments

portable

run in

on top

Web

environment,

programming language

does

code

is

May

not materially

be

copied, affect

normally in

scanned, the

overall

Java

the

or

duplicated, learning

stored

in

is

one

of the

Java

write a Java

in

or in Cengage

part.

common once,

system.

Due Learning

to

Java is a programming language developed by Sun Microsystems (acquired by Oracle). Sun Microsystems created Java as a 'write once, run anywhere' environment. That means that a programmer can write an application once and then run it in any environment without modification. The cross-platform capabilities of Java are based on its portable architecture. Java code is stored in pre-processed chunks known as applets that run in an environment in the host operating system. This environment has well-defined boundaries, and all interactivity with the host operating system is closely monitored. Sun provides Java run-time environments for most operating systems, from computers to handheld and mobile devices to TV set-top boxes. Another advantage of using Java is its on-demand architecture. When a Java application loads, it can dynamically download all its modules or required components via the internet.

When Java applications need to access data outside the Java run-time environment, they use pre-defined application programming interfaces. Java Database Connectivity (JDBC) is an application programming interface that allows a Java program to interact with a wide range of data sources, including relational databases, tabular data sources, spreadsheets and text files. JDBC allows a Java program to establish a connection with a data source, prepare and send the SQL code to the database server, and process the result set.

One main advantage of JDBC is that it allows a company to leverage its existing investment in technology and personnel training. JDBC allows programmers to use their SQL skills to manipulate the data in the company's databases. As a matter of fact, JDBC allows direct access to a database server or access via database middleware. Furthermore, JDBC provides a way to connect to databases through an ODBC driver. Figure 17.7 illustrates the basic JDBC architecture and the various database access styles.
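The three JDBC steps described above (establish a connection, prepare and send the SQL code, process the result set) follow a pattern that most database APIs share. The sketch below is not JDBC: it uses Python's built-in sqlite3 module as a stand-in, and the product table, its columns and its rows are hypothetical, created only so that the query has something to return.

```python
import sqlite3

# Step 1: establish a connection with a data source.
# An in-memory SQLite database stands in for the company's database server.
conn = sqlite3.connect(":memory:")

# Hypothetical sample table and rows, created only so the query has data.
conn.execute("CREATE TABLE product (p_code TEXT PRIMARY KEY, p_price REAL)")
conn.executemany(
    "INSERT INTO product VALUES (?, ?)",
    [("A100", 9.99), ("B200", 24.50), ("C300", 4.25)],
)

# Step 2: prepare the SQL code and send it to the database.
# The ? placeholder keeps the parameter value separate from the SQL text.
cursor = conn.execute(
    "SELECT p_code, p_price FROM product WHERE p_price > ? ORDER BY p_code",
    (5.0,),
)

# Step 3: process the result set, one row at a time.
rows = cursor.fetchall()
for p_code, p_price in rows:
    print(p_code, p_price)

conn.close()
```

A JDBC program would perform the same three steps through a driver for the target DBMS; only the API names differ.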

FIGURE 17.7 JDBC architecture
[Figure components: Java client application; JDBC API; JDBC driver manager; Java DB drivers; JDBC-ODBC bridge driver; ODBC; database middleware; databases.]
SOURCE: Course Technology/Cengage Learning

As you see in Figure 17.7, the database access architecture in JDBC is very similar to the ODBC/OLE-DB/ADO.NET architecture. All database access middleware shares similar components and functionality. One advantage of JDBC over other middleware is that it requires no configuration on the client side. The JDBC driver is automatically downloaded and installed as part of the Java applet download. Because Java is a Web-based technology, applications can connect to a database directly using a simple URL. Once the URL is invoked, the Java architecture comes into play, the necessary applets are downloaded to the client (including the database JDBC driver and all configuration information), and then the applets are executed securely in the client's run-time environment.

Every day, more and more companies are investing resources to develop and expand their Web presence and are finding ways to do more business on the internet. Such business generates increasing amounts of data to be stored in databases. Java and the .NET framework are part of the trend towards increasing reliance on the internet as a critical business resource. In fact, the internet is likely to become the development platform of the future.

17.1.6 PHP

PHP, or Hypertext Preprocessor, is a widely-used, general-purpose Server-Side Scripting Language, which means that it is interpreted by the Web server before the Web page is sent to the Web browser to be displayed. It is especially suited for Web development and is seen as an alternative to Microsoft's ASP. In January 2019, it was reported that PHP is used on 78.9 per cent¹ of all websites. One of the key reasons for this is that PHP is free to use and not difficult to learn. Another advantage is that PHP is supported by almost all operating systems today, including Windows, Mac OS X and Linux, with versions of the Apache and IIS Web servers.

PHP also supports a number of different databases (including MySQL and Oracle) through ODBC or specific extensions. For example, to connect to a MySQL database it is possible to use the MySQL extension module. PHP, along with MySQL, is usually used for small to medium websites that have simpler requirements and have a shorter development time. In the case of Oracle, PHP offers an extension for the Oracle Call Interface (OCI 11), an interface that allows an application to access Oracle databases through a series of function calls. A typical application using OCI must connect to the database, process one or more SQL statements and then disconnect from the database. OCI 11 gives the developer a higher degree of control over program execution and over all stages of SQL statement execution. In addition, OCI 11 allows cursors to be used to hold the data extracted by a query. Oracle also believes that OCI 11 is the most efficient way of connecting to the Oracle server.

17.2 Internet Database Connectivity

Millions of people all over the world use computers and Web browser software to access the internet, and Web database connectivity opens the door to new innovative services that:

Permit rapid responses to competitive pressures by bringing new services and products to market quickly.
Increase customer satisfaction through the creation of Web-based support services.
Allow anywhere, anytime data access using mobile smart devices via the internet.
Give fast and effective information dissemination through universal access from across the street or across the globe.

1 Comparison of the usage of PHP vs Java for websites, W3Techs.com, Technologies, Server-Side Languages, News, January 2019. Available: https://w3techs.com/technologies/comparison/pl-java,pl-php

Given those advantages, many IS departments face the need to create universal data access architectures, based on internet standards, to streamline operations and to facilitate decision making. Table 17.3 shows a sample of internet technology characteristics and the benefits they provide.

TABLE 17.3 Characteristics and benefits of internet technologies

Internet characteristic: Hardware and software independence
Benefits:
Savings in equipment/software acquisition
Ability to run on most existing equipment
Platform independence and portability
No need for multiple platform development

Internet characteristic: Common and simple user interface
Benefits:
Reduced training time and cost
Reduced end-user support cost
No need for multiple platform development

Internet characteristic: Location independence
Benefits:
Global access through internet infrastructure
Reduced requirements (and costs!) for dedicated connections

Internet characteristic: Rapid development at manageable costs
Benefits:
Availability of multiple development tools
Plug-and-play development tools (open standards)
More interactive development
Reduced development times
Relatively inexpensive tools
Free client access tools (Web browsers)
Low entry costs; frequent availability of free Web servers
Reduced costs of maintaining private networks
Distributed processing and scalability, using multiple servers

In the current business and global information environment, it is easy to see why many database professionals consider the DBMS connection to the internet to be a critical element in IS development. As you will learn in the following sections, database application development and, in particular, the creation and management of user interfaces and database connectivity are profoundly affected by the Web. However, having a Web-based database interface does not negate the database design and implementation issues that were addressed in the previous chapters. In the final analysis, whether you make a purchase by going online or by standing in line, the system-level transaction details are essentially the same, and they require the same basic database structures and relationships. If any immediate lesson is to be learnt, it is this: the effects of bad database design, implementation and management are multiplied in an environment in which transactions may be measured in millions per day, rather than in hundreds per day.

The internet is rapidly changing the way information is generated, accessed and distributed. At the core of this change is the Web's ability to access data in databases (local and remote), the simplicity of the interface, and cross-platform (heterogeneous) functionality. The Web has helped create a new information dissemination standard.

The following sections examine how Web-to-database middleware enables end users to interact with databases over the Web.


17.2.1 Web-to-Database Middleware: Server-side Extensions

In general, the Web server is the main hub through which all internet services are accessed. For example, when an end user uses a Web browser to query a database dynamically, the client browser requests a Web page. When the Web server receives the page request, it looks for the page on the hard disk; when it finds the page (for example, a stock quote, product catalogue information or an airfare listing), the server sends it back to the client.

Online Content: Client/server systems are covered in detail in Appendix F, Client/Server Systems, located on the online platform for this book.

Dynamic Web pages are at the heart of current generation websites. In this database query scenario, the Web server generates the Web page contents before it sends the page to the client Web browser. The only problem with the preceding query scenario is that the Web server must include the database query result on the page before it sends that page back to the client. Unfortunately, neither the Web browser nor the Web server knows how to connect to and read data from the database. Therefore, to support this type of request (database query), the Web server's capability must be extended so it can understand and process database requests. This job is done through a server-side extension.

A server-side extension is a program that interacts directly with the Web server to handle specific types of requests. In the preceding database query example, the server-side extension program retrieves the data from databases and passes the retrieved data to the Web server, which, in turn, sends the data to the client's browser for display purposes. The server-side extension makes it possible to retrieve and present the query results, but what's more important is that it provides its services to the Web server in a way that is totally transparent to the client browser. In short, the server-side extension adds significant functionality to the Web server and, therefore, to the internet.

A database server-side extension program is also known as Web-to-database middleware. Figure 17.8 shows the interaction between the browser, the Web server and the Web-to-database middleware.

As you examine Figure 17.8, trace the Web-to-database middleware actions:

1 The client browser sends a page request to the Web server.
2 The Web server receives and validates the request. In this case, the server passes the request to the Web-to-database middleware for processing. Generally, the requested page contains some type of scripting language to enable the database interaction.
3 The Web-to-database middleware reads, validates and executes the script. In this case, it connects to the database and passes the query, using the database connectivity layer.
4 The database server executes the query and passes the result back to the Web-to-database middleware.
5 The Web-to-database middleware compiles the result set, dynamically generates an HTML-formatted page that includes the data retrieved from the database, and sends it to the Web server.
6 The Web server returns the just-created HTML page, which now includes the query result, to the client browser.
7 The client browser displays the page on the local computer.
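The seven actions above can be traced in a few lines of code. The following is a minimal sketch, not a real Web server: the function names (web_server, middleware, database_server), the quote table and the /quotes page are all hypothetical, and an in-memory SQLite database stands in for the RDBMS reached through the connectivity layer.

```python
import sqlite3
from html import escape


def database_server(conn, sql):
    """Step 4: execute the query and pass the result back."""
    return conn.execute(sql).fetchall()


def middleware(conn, script_sql):
    """Steps 3 and 5: run the page's script, then build an HTML page."""
    rows = database_server(conn, script_sql)
    items = "".join(f"<li>{escape(str(r[0]))}</li>" for r in rows)
    return f"<html><body><ul>{items}</ul></body></html>"


def web_server(conn, pages, path):
    """Steps 2 and 6: validate the request and return the finished page."""
    if path not in pages:
        return "404 Not Found"
    return middleware(conn, pages[path])


# Hypothetical data: one table and one page whose script is a single query.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE quote (symbol TEXT)")
conn.executemany("INSERT INTO quote VALUES (?)", [("ABC",), ("XYZ",)])
pages = {"/quotes": "SELECT symbol FROM quote ORDER BY symbol"}

# Steps 1 and 7: the client requests a page and displays what comes back.
page = web_server(conn, pages, "/quotes")
print(page)
```

Note that the client only ever sees the finished HTML; the database work stays behind the Web server, which is what makes the extension transparent to the browser.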


FIGURE 17.8 Web-to-database middleware
[Figure: a client computer connects over a TCP/IP network (HTTP) to a server computer running the Web server and the Web-to-database middleware, which reaches the RDBMS server and its database through a connectivity layer (ADO.NET, ADO, OLE-DB, ODBC). Numbered callouts trace the page request from the browser to the database and the HTML-formatted query result back to the client.]

The interaction between the Web server and the Web-to-database middleware is crucial to the development of a successful internet database implementation. Therefore, the middleware must be well integrated with the other internet services and the components that are involved in its use.


17.2.2 Web Server Interfaces

Extending Web server functionality implies that the Web server and the Web-to-database middleware will properly communicate with each other. (Database professionals often use the word interoperate to indicate that each party to the communication can respond to the communications of the other. This book's use of communicate assumes interoperation.) If a Web server is to communicate successfully with an external program, both programs must use a standard way to exchange messages and to respond to requests. Currently, there are two well-defined Web server interfaces:

Common Gateway Interface (CGI)
Application programming interface (API).

The Common Gateway Interface (CGI) uses script files that perform specific functions based on the client's parameters that are passed to the Web server. The script file is a small program containing commands written in a programming language, usually Perl, C++, C# or Visual Basic. The script file's contents can be used to connect to the database and to retrieve data from it, using the parameters passed by the Web server. Next, the script converts the retrieved data to HTML format and passes the data to the Web server, which sends the HTML-formatted page to the client.

The main disadvantage of using CGI scripts is that the script file is an external program that is individually executed for each user request. That scenario decreases system performance. For example, if you have 200 concurrent requests, the script is loaded 200 different times, which takes significant CPU and memory resources away from the Web server. The language and method used to create the script can also affect system performance. For example, performance is degraded by using an interpreted language or by writing the script inefficiently.

An application programming interface (API) is a newer Web server interface standard that is more efficient and faster than a CGI script. APIs are more efficient because they are implemented as shared code or as dynamic link libraries (DLLs). That means the API is treated as part of the Web server program that is dynamically invoked when needed. APIs are faster than CGI scripts because the code resides in memory and there is no need to run an external program for each request. Instead, the same API serves all requests. Another advantage is that an API can use a shared connection to the database instead of creating a new one every time, as is the case with CGI scripts.

Although APIs are more efficient in handling requests, they have some disadvantages. Because the APIs share the same memory space as the Web server, an API error can bring down the server. The other disadvantage is that APIs are specific to the Web server and to the operating system.

The Web interface architecture is illustrated in Figure 17.9.

Regardless of the type of Web server interface used, the Web-to-database middleware program must be able to connect with the database. That connection can be accomplished in one of two ways:

Use the native SQL access middleware provided by the vendor. For example, you can use SQL*Net if you are using Oracle.
Use the services of general database connectivity standards such as ODBC, OLE-DB, ADO, ADO.NET or JDBC.
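The difference between the CGI and API models in their use of database connections can be illustrated with a small count. This sketch covers only the connection-reuse aspect; it does not start external processes, and the ConnectionCounter class and the two function names are hypothetical. SQLite in memory stands in for the database.

```python
import sqlite3


class ConnectionCounter:
    """Counts how many database connections have been opened."""

    def __init__(self):
        self.opened = 0

    def connect(self):
        self.opened += 1
        return sqlite3.connect(":memory:")


def cgi_style(counter, n_requests):
    # CGI model: each request starts from scratch and opens its own connection.
    for _ in range(n_requests):
        conn = counter.connect()
        conn.execute("SELECT 1").fetchone()
        conn.close()


def api_style(counter, n_requests):
    # API model: code stays loaded with the server and reuses one connection.
    conn = counter.connect()
    for _ in range(n_requests):
        conn.execute("SELECT 1").fetchone()
    conn.close()


cgi_counter, api_counter = ConnectionCounter(), ConnectionCounter()
cgi_style(cgi_counter, 200)
api_style(api_counter, 200)
print(cgi_counter.opened, api_counter.opened)
```

For the 200 requests used in the text's example, the CGI-style loop opens 200 connections while the API-style loop opens one. The trade-off noted above still applies: shared, in-process code is faster, but an error in it can affect the whole server.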


FIGURE 17.9 Web server CGI and API interfaces
[Figure: a client computer connects over a TCP/IP network to a server computer. The Web server reaches the database connectivity middleware (ADO.NET, ADO, OLE-DB, ODBC) either through an external CGI program or through an API (DLL call); the middleware connects to the RDBMS server and its database.]

17.2.3 The Web Browser

The Web browser is the application software such as Google Chrome, Apple Safari or Mozilla Firefox that lets end users navigate (browse) the Web. Each time the end user clicks a hyperlink, the browser generates an HTTP GET page request that is sent to the designated Web server, using the TCP/IP internet protocol.

The Web browser's job is to interpret the HTML code that it receives from the Web server and to present the different page components in a standard formatted way. Unfortunately, the browser's interpretation and presentation capabilities are not sufficient to develop Web-based applications.

The Web is a stateless system: at any given time, a Web server does not know the status of any of the clients communicating with it. That is, there is no open communication line between the server and each client accessing it, which of course is impractical in a worldwide Web! Instead, client and server computers interact in very short conversations that follow the request-reply model. For example, the browser is concerned only with the current page, so there is no way for the second page to know what was done in the first page. The only time the client and server computers communicate is when the client requests a page (when the user clicks a link) and the server sends the requested page to the client. Once the client receives the page and its components, the client/server communication is ended. Therefore, although you may be browsing a page and think that the communication is open, you are actually just browsing the HTML document stored in the local cache (temporary directory) of your browser. The server does not have any idea what the end user is doing with the document, what data is entered in a form, what option is selected and so on. On the Web, if you want to act on a client's selection, you need to jump to a new page (go back to the Web server), thus losing track of what was done before.

The Web browser, through its use of HTML, does not have computational abilities beyond formatting output text and accepting form field inputs. Even when the browser accepts form field data, there is no way to perform immediate data entry validation. Therefore, to perform such crucial processing in the client, the Web defers to other Web programming languages such as Java, JavaScript and VBScript. The browser resembles a dumb terminal that displays only data and can perform only rudimentary processing such as accepting form data inputs. To improve the capabilities on the Web browser side, you must use plug-ins and other client-side extensions. On the server side, Web application servers provide the necessary processing power.

17.2.4 Client-side Extensions

Client-side extensions add functionality to the Web browser. Although client-side extensions are available in various forms, the most commonly encountered extensions are:

Plug-ins
Java and JavaScript
ActiveX and VBScript

A plug-in is an external application that is automatically invoked by the browser when needed. Because it is an external application, the plug-in is operating-system-specific. The plug-in is associated with a data object, generally using the file extension, to allow the Web browser to handle data that are not originally supported. For example, if one of the page components is a .pdf document, the Web browser will receive the data, recognise it as a portable document format object and launch Adobe Acrobat Reader to present the document on the client computer.

JavaScript is a scripting language (one that enables the running of a series of commands or macros) that allows Web authors to design interactive sites. Because JavaScript is simpler to generate than Java, it is easier to learn. JavaScript code is embedded in Web pages. It is downloaded with the Web page and is executed when a specific event takes place, such as a mouse click on an object or a page being loaded from the server into memory.

ActiveX was Microsoft's alternative to Java. ActiveX was a specification for writing programs that ran inside Microsoft's client browser (Internet Explorer). Because ActiveX was Microsoft's, it was not truly cross-platform compatible. However, despite Microsoft's efforts, ActiveX was not generally supported. Microsoft dropped support for ActiveX, and in 2015 Microsoft released Edge, a replacement for Internet Explorer with no ActiveX support.


From the developer's point of view, using routines that permit data validation on the client side is an absolute necessity. For example, when data is entered on a Web form and no data validation is done on the client side, the entire data set must be sent to the Web server. That scenario requires the server to perform all data validation, thus wasting valuable CPU processing cycles. Therefore, client-side data input validation is one of the most basic requirements for Web applications. Most of the data validation routines are done in Java, JavaScript or VBScript.

17.2.5 Web Application Servers

A Web application server is a middleware application that expands the functionality of Web servers by linking them to a wide range of services, such as databases, directory systems and search engines. The Web application server also provides a consistent run-time environment for Web applications.

Web application servers can be used to perform the following:

Connect to and query a database from a Web page.
Present database data in a Web page using various formats.
Create dynamic Web search pages.
Create Web pages to insert, update and delete database data.
Enforce referential integrity in the application program logic.
Use simple and nested queries and programming logic to represent business rules.

Web application servers provide features such as:

An integrated development environment with session management and support for persistent application variables
Security and authentication of users through user IDs and passwords
Computational languages to represent and store business logic in the application server
Automatic generation of HTML pages integrated with Java, JavaScript, VBScript, ASP and so on
Performance and fault-tolerant features
Database access capabilities with transaction management
Access to multiple services, such as file transfers (FTP), database connectivity, email and directory services.

Examples of Web application servers include ColdFusion/JRun by Adobe, WebSphere Application Server by IBM, WebLogic Server by Oracle, Fusion by NetObjects, Visual Studio .NET by Microsoft and WebObjects by Apple. All Web application servers offer the ability to connect Web servers to multiple data sources and other services. They vary in their range of available features, robustness, scalability, compatibility with other Web and database tools, and extent of the development environment.
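Session management, listed among the features above, is how an application server works around the stateless request-reply model of the Web. The following is a minimal sketch under stated assumptions: the sessions dictionary, the handle_request function and the shopping-cart variable are hypothetical, and a real server would carry the session ID in a cookie or URL rather than a function argument.

```python
import secrets

# Server-side store: session ID -> that client's persistent variables.
sessions = {}


def handle_request(session_id, item=None):
    """Handle one independent request; return (session_id, cart)."""
    if session_id not in sessions:
        # Unknown or missing ID: start a new session for this client.
        session_id = secrets.token_hex(16)
        sessions[session_id] = {"cart": []}
    if item is not None:
        sessions[session_id]["cart"].append(item)
    return session_id, list(sessions[session_id]["cart"])


# First request carries no session ID, so the server issues one.
sid, cart = handle_request(None, "pencil")
# Later requests send the ID back (for example in a cookie).
sid, cart = handle_request(sid, "notebook")
print(cart)

# A different client gets its own, separate state.
other_sid, other_cart = handle_request(None)
```

Each call is still a separate request and reply; the only thing linking them is the ID the client returns, which lets the server look up what was done before.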


17.2.6 Web Database Development

Web database development deals with the process of interfacing databases with the Web browser; in short, how to create Web pages that access data in a database. As you learnt earlier in this chapter, multiple Web environments can be used to develop Web database applications. One of the most common Web application development environments is known as LAMP. LAMP is made up of the Linux operating system, the Apache Web server, the MySQL database and the PHP programming language (although Perl and Python can be used instead of PHP). It is often used within organisations that need an effective way of managing organisational data but do not have the time or money to invest in a large-scale, costly Web development project. LAMP allows Web developers to build efficient Web applications that are reliable and stable. Examining the components of LAMP will allow us to see why:

The Linux operating system is open source and can be used to offer cross-platform compatibility. This is important to enable your website to be used across all major browsers and any mobile device.
The Apache Web server is the leading platform in terms of its total number of domains. In 2018, 34.8 per cent of domains were hosted on Apache Web servers. This is because it allows, with PHP, the development of highly interactive Web applications. As of 2018, Microsoft's Web servers power the most sites: 40.65 per cent.²
MySQL databases can be used to store data for both simple and complex websites with varying degrees of database complexity. MySQL allows easy retrieval and capturing of data from the Web.
The programming language PHP is used to link all the components of LAMP. PHP allows the dynamic content of the website to be obtained through accessing data within the MySQL database.

The main benefits of LAMP are that it is easy to program and applications can be developed offline and then deployed onto the Web. Deployment is also relatively straightforward as PHP is easily integrated with the Apache Web server and MySQL. Despite the development of the LAMP components being independent, when combined they offer one of the best solutions for Web database development.

In order to illustrate the use of PHP to retrieve data through a simple query, let's examine a PHP code example. Because this is a database book, the examples focus only on the commands used to interface with the database, rather than the specifics of HTML code.

A Microsoft Access database named Orderdb is used to illustrate the Web-to-database interface examples. The Orderdb database, whose relational diagram is shown in Figure 17.10, was designed to track the purchase orders placed by users in a multidepartment company.

2 Web Server Survey. Available: http://news.netcraft.com/archives/category/web-server-survey/

FIGURE 17.10 The Orderdb relational diagram for the Web database development examples
SOURCE: Course Technology/Cengage Learning

The following examples will explain how to use PHP to create a simple Web page to list the VENDOR rows. The scripts used in these examples perform two basic tasks:

1 Query the database using standard SQL to retrieve a data set that contains all records in the VENDOR table.
2 Format the records generated in Step 1 in HTML so they are included in the Web page that is returned to the client browser.

The examples will use an ODBC data source named RobCor. The ODBC data source was defined using the operating system tools shown in Section 17.1.2. Figure 17.11 shows the PHP code to query the VENDOR table.
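The two tasks can also be sketched outside PHP. The Python code below is only an illustration under stated assumptions: an in-memory SQLite database stands in for the RobCor ODBC data source, and the VENDOR column names, the sample rows and the helper names fetch_vendors and rows_to_html are hypothetical.

```python
import sqlite3
from html import escape


def fetch_vendors(conn):
    """Task 1: run a standard SQL query and return every VENDOR row."""
    cur = conn.execute(
        "SELECT VEND_CODE, VEND_CONTACT, VEND_AREACODE "
        "FROM VENDOR ORDER BY VEND_CODE"
    )
    return cur.fetchall()


def rows_to_html(rows):
    """Task 2: format the rows as an HTML table for the client browser."""
    lines = [
        "<table>",
        "<tr><th>Code</th><th>Contact</th><th>Area code</th></tr>",
    ]
    for code, contact, area in rows:
        lines.append(
            f"<tr><td>{code}</td><td>{escape(contact)}</td>"
            f"<td>{escape(area)}</td></tr>"
        )
    lines.append("</table>")
    return "\n".join(lines)


# Assumption: an in-memory SQLite table replaces the ODBC data source.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE VENDOR ("
    "VEND_CODE INTEGER PRIMARY KEY, VEND_CONTACT TEXT, VEND_AREACODE TEXT)"
)
conn.executemany(
    "INSERT INTO VENDOR VALUES (?, ?, ?)",
    [(230, "Shelly K. Smithson", "7325"), (231, "James Johnson", "0181")],
)

page = rows_to_html(fetch_vendors(conn))
print(page)
```

The same separation (one step that talks to the database, one step that builds the page) is what the PHP functions described for Figure 17.11 carry out.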


FIGURE 17.11 PHP code to query the VENDOR table
SOURCE: Course Technology/Cengage Learning

In the figure, note that PHP uses multiple tags to query and display the data returned by the query. Take a closer look at the PHP functions:

• The ODBC_CONNECT function (line 11) opens a connection to the ODBC data source. A handle to this database is set in the $dbc variable.
• The ODBC_EXEC function (line 13) executes the SQL query stored in the $sql variable against the $dbc database connection. The query's result set is stored in the $rs variable.
• The WHILE loop (line 15) loops through the result set ($rs) and uses the ODBC_FETCH_ROW function to get one row at a time from the result set. Notice that PHP variables start with the dollar sign ($).


• The ODBC_RESULT function (lines 17-30) extracts the different values for each field to be displayed and stores them in variables. This function gets a column value from a row in the result set and stores it in a variable.
• The ECHO function (lines 32-47) outputs text to the Web page. You can also combine text (HTML code) and the PHP variables defined in the previous lines (lines 33-46) using the . delimiter.
• The ODBC_CLOSE function closes the database connection.

The previous examples are just two examples of the many ways Web pages can interface with databases. These examples only scratch the surface of the features that Web application servers provide. Current-generation systems involve more than just the development of Web-enabled database applications. They also require applications that can communicate with one another and with other systems not based on the Web. Clearly, systems must be able to exchange data in a standard-based format. That is the role of XML.

17.3 Extensible Markup Language (XML)

Companies are using the internet to create new types of systems that integrate their data to increase efficiency and reduce costs. Electronic commerce (e-commerce) enables organisations, whether they are public or private, for-profit or not-for-profit, to market and sell products and services to a global market of millions of users. E-commerce transactions (the sale of products or services) can take place between businesses (business-to-business or B2B) or between a business and a consumer (business-to-consumer or B2C).

Most e-commerce transactions take place between businesses. Since B2B e-commerce integrates business processes among companies, it requires the transfer of business information among different business entities. However, the way in which businesses represent, identify and use data tends to differ substantially from company to company. (Is a product code the same thing as an item ID?)

Until recently, it was expected that a purchase order travelling over the Web would be in the form of an HTML document. The HTML Web page displayed on the Web browser would include formatting as well as the order details. HTML tags describe how something looks on the Web page, such as typefaces and heading styles, and they often come in pairs to start and end formatting features. For example, the following tags in angle brackets would display FOR SALE in bold Arial font:

<strong><font face=Arial>FOR SALE</font></strong>

If an application needs to get the order data from the Web page, there is no easy way to extract the order details (such as the order number, date, customer, product number, quantity or price) from an HTML document. The HTML code can only describe how to display the order in a Web browser; it does not permit the manipulation of the order's data elements. To solve the problem, a new markup language, known as Extensible Markup Language (XML), was developed.

Extensible Markup Language (XML) is a metalanguage used to represent and manipulate data elements. XML is designed to facilitate the exchange of structured documents, such as orders and invoices, over the internet. The World Wide Web Consortium (W3C)3 published the first XML 1.0 standard definition in 1998. That standard sets the stage for giving XML the real-world appeal of being a true vendor-independent platform. Therefore, it is not surprising that XML is rapidly becoming the data exchange standard for e-commerce applications.

3 Visit the W3C page, at www.w3.org, for additional information about the efforts that have been made to develop the XML standard.

The XML metalanguage allows the definition of new tags, such as <ProdPrice>, to describe the data elements used in an XML document. Given that feature, XML is said to be an extensible language. XML is derived from the Standard Generalised Markup Language (SGML). SGML is an international standard for the publication and distribution of highly complex technical documents. For example, documents such as those used by the aviation industry and the military services are too complex and unwieldy for the Web. Just like HTML, which was also derived from SGML, an XML document is a text file, but it has a few very important additional characteristics, as follows:

• XML allows the definition of new tags to describe data elements, such as <ProductId>.
• XML is case sensitive: <ProductID> is not the same as <Productid>.
• XML tags must be well formed; that is, each opening tag has a corresponding closing tag. For example, the product identification would require the format <ProductId>2345-AA</ProductId>.
• XML tags must be properly nested. For example, a properly nested XML tag might look like this: <Product><ProductId>2345-AA</ProductId></Product>.
• You can use the <!-- and --> symbols to enter comments in the XML document.
• The XML and xml prefixes are reserved for XML only.

XML is not a new version or replacement for HTML. XML is concerned with the description and representation of the data, rather than the way the data are displayed. (Data display remains the job of HTML.) XML provides the semantics that facilitate the sharing, exchange and manipulation of structured documents over organisational boundaries. In short, XML and HTML perform complementary, rather than overlapping, functions. Extensible Hypertext Markup Language (XHTML) is the next generation of HTML based on the XML framework. The XHTML specification expands the HTML standard to include XML features. Although more powerful than HTML, XHTML requires strict adherence to syntax requirements.

As an illustration of the use of XML for data exchange, consider a B2B example in which Company A uses XML to exchange product data with Company B over the internet. Figure 17.12 shows the contents of the productlist.xml document.

FIGURE 17.12 Contents of the productlist.xml document
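The XML characteristics listed earlier (case sensitivity, matching closing tags, proper nesting) can be checked with any XML parser. The Python sketch below uses the standard library parser; the helper name is_well_formed and the test strings other than the ProductId example from the text are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET


def is_well_formed(text):
    """Return True if the parser accepts the text as well-formed XML."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


samples = {
    # Properly nested, every opening tag has a matching closing tag.
    "good": "<Product><ProductId>2345-AA</ProductId></Product>",
    # Closing tag differs only in case, so it does not match the opening tag.
    "bad_case": "<Product><ProductId>2345-AA</Productid></Product>",
    # Tags overlap instead of nesting.
    "bad_nesting": "<Product><ProductId>2345-AA</Product></ProductId>",
    # ProductId is never closed.
    "unclosed": "<Product><ProductId>2345-AA</Product>",
    # A comment inside the root element is allowed.
    "with_comment": "<Product><!-- note --><ProductId>2345-AA</ProductId></Product>",
}

results = {name: is_well_formed(doc) for name, doc in samples.items()}
for name, ok in results.items():
    print(f"{name}: {'well formed' if ok else 'rejected'}")
```

A parser only checks these syntax rules; it says nothing about whether the tags mean the same thing to both companies.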


The XML example shown in Figure 17.12 illustrates several important XML features, as follows:

• The first line represents the XML document declaration, and it is mandatory.
• Every XML document has a root element. In the example, the second line declares the ProductList root element.
• The root element contains child elements or sub-elements. In the example, line three declares Product as a child element of ProductList.
• Each element can contain sub-elements. For example, each Product element is composed of several child elements, represented by P_CODE, P_DESCRIPT, P_INDATE, P_ONHAND, P_MIN and P_PRICE.

Once Company B receives the productlist.xml document, it can process the document, assuming it understands the tags created by Company A. The meaning of the XML in the example shown in Figure 17.12 is fairly self-evident, but there is no easy way to validate the data or to check whether the data are complete. For example, you could encounter a P_INDATE value of 25/06/2019, but is that value correct? And what happens if Company B expects a Vendor element as well? How can companies share data descriptions about their business data elements? The next section will show how document type definitions and XML schemas are used to address those concerns.

17.3.1 Document Type Definitions (DTD) and XML Schemas

Companies that use B2B transactions must have a way to understand and validate one another's tags. One way to accomplish that task is through the use of Document Type Definitions. A Document Type Definition (DTD) is a file with a .dtd extension that describes XML elements; in effect, a DTD file provides the composition of the database's logical model and defines the syntax rules or valid tags for each type of XML document. (The DTD component is similar to having a public data dictionary for business data.) Companies that intend to engage in e-commerce business transactions must develop and share DTDs. Figure 17.13 shows the productlist.dtd document for the productlist.xml document shown earlier in Figure 17.12.

FIGURE 17.13 Contents of the productlist.dtd document

As you examine Figure 17.13, note that the productlist.dtd file provides definitions of the elements in the productlist.xml document. In particular, note that:

• The first line declares the ProductList root element.

• The ProductList root element has one child, the Product element.

• The plus (+) symbol indicates that Product occurs one or more times within ProductList.
• An asterisk (*) would mean that the child element occurs zero or more times.
• A question mark (?) would mean that the child element is optional.
• The second line describes the Product element.
• The question mark (?) after the P_INDATE and P_MIN indicates that they are optional child elements.
• Lines three to eight show that the Product element has six child sub-elements.
• The #PCDATA keyword represents the actual text data.

To be able to use a DTD file to define elements within an XML document, the DTD must be referenced from within that XML document. Figure 17.14 shows the productlistv2.xml document that includes the reference to the productlist.dtd in the second line.

FIGURE 17.14 Contents of the productlistv2.xml document

As you examine Figure 17.14, note that the P_INDATE and P_MIN elements do not appear in all Product definitions because they were declared to be optional elements. The DTD can be referenced by many XML documents of the same type. For example, if Company A routinely exchanges product data with Company B, it will need to create the DTD only once. All subsequent XML documents will refer to the DTD, and Company B will be able to verify the data being received.

To further demonstrate the use of XML and DTD for e-commerce business data exchanges, assume the case of two companies exchanging order data. Figure 17.15 shows the DTD and XML documents for that scenario. Although the use of DTDs is a great improvement for data sharing over the Web, a DTD provides only descriptive information for understanding how the elements (root, parent, child, mandatory or optional) relate to one another. A DTD provides limited additional semantic value, such as data type support or data validation rules. That information is very important for database administrators who are in charge of large e-commerce databases. To solve the DTD problem, the W3C published an XML schema standard to provide a better way to describe XML data.
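The occurrence rules described for productlist.dtd, and the limits of a DTD just noted, can be made concrete with a small Python sketch. It is not a DTD validator: PRODUCT_RULES and check_product_list are hypothetical helpers that hard-code the rules stated in the text (one or more Product elements inside ProductList, with P_INDATE and P_MIN optional), and the sample values are invented.

```python
import xml.etree.ElementTree as ET

# (minimum, maximum) occurrences of each child element inside Product,
# following the description of productlist.dtd in the text.
PRODUCT_RULES = {
    "P_CODE": (1, 1),
    "P_DESCRIPT": (1, 1),
    "P_INDATE": (0, 1),   # "?" in the DTD: optional
    "P_ONHAND": (1, 1),
    "P_MIN": (0, 1),      # "?" in the DTD: optional
    "P_PRICE": (1, 1),
}


def check_product_list(xml_text):
    """Return a list of rule violations; an empty list means the document passes."""
    root = ET.fromstring(xml_text)
    if root.tag != "ProductList":
        return [f"root element is {root.tag}, expected ProductList"]
    problems = []
    products = root.findall("Product")
    if not products:  # "+" means one or more
        problems.append("ProductList must contain at least one Product")
    for number, product in enumerate(products, start=1):
        for child in product:
            if child.tag not in PRODUCT_RULES:
                problems.append(f"Product {number}: unexpected element {child.tag}")
        for tag, (low, high) in PRODUCT_RULES.items():
            count = len(product.findall(tag))
            if not low <= count <= high:
                problems.append(f"Product {number}: {tag} occurs {count} time(s)")
    return problems


# Structurally valid, even though the date value is impossible:
# the rules treat element content as plain text, as #PCDATA does.
STRUCTURALLY_VALID = """<ProductList>
  <Product>
    <P_CODE>001278-AB</P_CODE>
    <P_DESCRIPT>Claw hammer</P_DESCRIPT>
    <P_INDATE>13/16/1987</P_INDATE>
    <P_ONHAND>23</P_ONHAND>
    <P_PRICE>10.23</P_PRICE>
  </Product>
</ProductList>"""

MISSING_PRICE = """<ProductList>
  <Product>
    <P_CODE>001278-AB</P_CODE>
    <P_DESCRIPT>Claw hammer</P_DESCRIPT>
    <P_ONHAND>23</P_ONHAND>
  </Product>
</ProductList>"""

print(check_product_list(STRUCTURALLY_VALID))
print(check_product_list(MISSING_PRICE))
```

The first document passes because only structure is checked; catching the impossible date requires the data type support that an XML schema adds.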


FIGURE 17.15 DTD and XML documents for the order data
OrderData.dtd (the + sign indicates one or more ORD_PRODS elements)
OrderData.xml (two ORD_PRODS elements in the XML document)

The XML schema is an advanced data definition language that is used to describe the structure (elements, data types, relationship types, ranges and default values) of XML data documents. One of the main advantages of an XML schema is that it more closely maps to database terminology and features. For example, an XML schema will be able to define common database types such as date, integer or decimal, minimum and maximum values, lists of valid values and required elements. Using the XML schema, a company would be able to check the data for out-of-range values, incorrect dates, values outside a list of valid values, and so on. For example, a university application must be able to specify that a grade point average (GPA) value be between zero and 4.0, and it must be able to detect an invalid birth date such as 13/16/1987. (There is no 16th month.) Many vendors are adopting this new standard and are supplying tools to translate DTD documents into XML Schema Definition (XSD) documents. It is widely expected that XML schemas will replace DTD as the method to describe XML data.

Unlike a DTD document, which uses a unique syntax, an XML schema definition (XSD) file uses a syntax that resembles an XML document. Figure 17.16 shows the XSD document for the OrderData XML document.
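The kind of value checks that an XML schema makes possible can be sketched in a few lines of Python. This is not an XSD processor; valid_gpa and valid_date are hypothetical helpers that imitate a decimal range restriction and a date type, and the day/month/year format is an assumption based on the dates used in this chapter.

```python
from datetime import datetime


def valid_gpa(value):
    """True if the text parses as a number between 0 and 4.0 inclusive."""
    try:
        gpa = float(value)
    except ValueError:
        return False
    return 0.0 <= gpa <= 4.0


def valid_date(value, fmt="%d/%m/%Y"):
    """True if the text is a real calendar date in the given format."""
    try:
        datetime.strptime(value, fmt)
        return True
    except ValueError:
        return False


for gpa in ("3.7", "4.5", "abc"):
    print(gpa, valid_gpa(gpa))
for date in ("25/06/2019", "13/16/1987"):
    print(date, valid_date(date))
```

In an XSD these rules are declared once in the schema, so every party that receives the document can apply the same checks without writing code like this.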


FIGURE 17.16 The XML schema document for the order data

The code shown in Figure 17.16 is a simplified version of the XML schema document. As you can see, the XML schema syntax is similar to the XML document syntax. In addition, the XML schema introduces additional semantic information for the OrderData XML document, such as string, date and decimal data types; required elements; and minimum and maximum cardinalities for the data elements.

17.3.2 XML Presentation

One of the main benefits of XML is that it separates data structure from its presentation and processing. By separating data and presentation, you are able to present the same data in different ways, which is similar to having views in SQL. The Extensible Style Language (XSL) specification provides the mechanism to display XML data. XSL is used to define the rules by which XML data are formatted and displayed. The XSL specification is divided into two parts: Extensible Style Language Transformations (XSLT) and XSL style sheets.

Extensible Style Language Transformations (XSLT) describe the general mechanism that is used to extract and process data from one XML document and enable its transformation within another document. Using XSLT, you can extract data from an XML document and convert it into a text file, an HTML Web page or a Web page that is formatted for a mobile device. What the user sees in those cases is actually a view (or HTML representation) of the actual XML data. XSLT can also be used to extract certain elements from an XML document, such as the product codes and product


prices, to create a product catalogue. XSLT can even be used to transform an XML document into another XML document.

XSL style sheets define the presentation rules applied to XML elements, something like presentation templates. The XSL style sheet describes the formatting options to apply to XML elements when they are displayed on a browser, smartphone, tablet screen and so on.

FIGURE 17.17 Framework for XML transformations
[Diagram: an XML document passes through XSL transformations (extract, convert) to produce a new XML document, and through XSL style sheets (apply formatting rules to XML elements) to produce HTML. The process can render different webpages for different purposes, such as one page for a web browser and another for a mobile device. XSLT can be used to transform one XML document into another XML document.]

Figure 17.17 illustrates the framework used by the different components to translate XML documents into viewable Web pages, an XML document or some other document.

To display the XML document with Windows Internet Explorer (IE) 5.0 or later, enter the URL of the XML document in the browser's address bar. Figure 17.18 is based on the productlist.xml document created earlier. As you examine Figure 17.18, note that IE shows the XML data in a colour-coded, collapsible, tree-like structure. (Actually, this is the IE default style sheet that is used to render XML documents.)
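The extract-and-convert idea behind XSLT can be imitated in Python to show what such a transformation produces. The sketch below is a stand-in and does not use XSLT itself: the function names to_catalogue and to_html, the Catalogue and Item element names and the sample product data are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

PRODUCT_XML = """<ProductList>
  <Product>
    <P_CODE>001278-AB</P_CODE>
    <P_DESCRIPT>Claw hammer</P_DESCRIPT>
    <P_PRICE>10.23</P_PRICE>
  </Product>
  <Product>
    <P_CODE>SRE-657UG</P_CODE>
    <P_DESCRIPT>Rat-tail file</P_DESCRIPT>
    <P_PRICE>2.36</P_PRICE>
  </Product>
</ProductList>"""


def to_catalogue(xml_text):
    """Extract code and price from each Product into a new XML document."""
    source = ET.fromstring(xml_text)
    catalogue = ET.Element("Catalogue")
    for product in source.findall("Product"):
        item = ET.SubElement(catalogue, "Item", code=product.findtext("P_CODE"))
        item.text = product.findtext("P_PRICE")
    return catalogue


def to_html(xml_text):
    """Present the same data as an HTML table, one row per product."""
    source = ET.fromstring(xml_text)
    table = ET.Element("table")
    for product in source.findall("Product"):
        row = ET.SubElement(table, "tr")
        for tag in ("P_CODE", "P_PRICE"):
            ET.SubElement(row, "td").text = product.findtext(tag)
    return ET.tostring(table, encoding="unicode")


print(ET.tostring(to_catalogue(PRODUCT_XML), encoding="unicode"))
print(to_html(PRODUCT_XML))
```

Both outputs come from the same source document, which is the point of separating data from presentation: the XML stays unchanged while each target gets its own view.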


FIGURE 17.18 Displaying XML documents

Internet Explorer also provides data binding of XML data to HTML documents. Figure 17.19 shows the HTML code that is used to bind an XML document to an HTML table. The example uses the <xml> tag to include the XML data in the HTML document, to later bind it to the HTML table. This example works only in IE 5.0 or later.


FIGURE 17.19 XML data binding

17.3.3 SQL/XML and XQuery

As you have just learnt, XML is used to transfer data from a Web-based application to the database and back again. SQL/XML and XQuery are two standard querying languages that are used to retrieve data from a relational database in the XML format. XQuery 1.0 is the W3C language designed for querying XML data and it is relatively similar to SQL, except it was designed to query semi-structured XML data. SQL/XML is an extension of SQL that is part of the ANSI/ISO SQL 2011 standard. This is because only small additions have been made to the standard SQL language. These additions include:

• XML publishing functions that can be incorporated directly into the SQL query. These functions include:
  - xmlelement( ), which creates an XML element with a specific name
  - xmlattributes( ), which creates a set of XML attributes from the columns within the specific database table(s)
  - xmlroot( ), which creates the root element of an XML document


  - xmlcomment( ), which allows an XML comment to be created
  - xmlpi( ), which allows the creation of an XML processing instruction
  - xmlparse( ), which parses a string as XML and returns the resulting XML value
  - xmlforest( ), which creates a list of XML elements from the columns within the specific database table(s)
  - xmlconcat( ), which combines a list of XML values into one that contains an XML forest
  - xmlagg( ), which aggregates a number of single XML values together to create a single XML forest.
• An XML datatype
• A set of rules to map relational data to XML.

Let's now look at an example. Consider the following SQL query:

SELECT V.VEND_CODE, V.VEND_CONTACT AS VENDOR_NAME, V.VEND_AREACODE
FROM VENDOR V;

Figure 17.20 shows the contents of the VENDOR table and the results of this query.

FIGURE 17.20 The contents of the VENDOR and PRODUCT tables and results of the query
Database name: Ch17_SaleCo

Table name: VENDOR
VEND_CODE  VEND_CONTACT        VEND_AREACODE  VEND_PHONE
230        Shelly K. Smithson  7325           555-1234
231        James Johnson       0181           123-4536
232        Khaya Sibiya        7325           224-2134
233        Lindiwe Molefe      0113           342-6567
234        Nijan Pillay        0181           123-3324
235        Henry Ortozo        0181           899-3425

Table name: PRODUCT
PROD_CODE  PROD_DESCRIPT                   PROD_PRICE  PROD_ON_HAND  VEND_CODE
001278-AB  Claw hammer                     10.23       23            232
123-21UUY  Houselite chain saw, 16 cm bar  150.09      4             235
QER-34256  Sledge hammer, 16 kg head       14.72       6             231
SRE-657UG  Rat-tail file                   2.36        15            232
ZZX/3245Q  Steel tape, 12 m length         5.36        8             235


Data returned by query
SELECT V.VEND_CODE, V.VEND_CONTACT AS VENDOR_NAME, V.VEND_AREACODE
FROM VENDOR V;

VEND_CODE  VENDOR_NAME         VEND_AREACODE
230        Shelly K. Smithson  7325
231        James Johnson       0181
232        Khaya Sibiya        7325
233        Lindiwe Molefe      0113
234        Nijan Pillay        0181
235        Henry Ortozo        0181

In order to display these results as XML, the xmlelement( ) function can be incorporated into the SQL statement like this:

SELECT XMLELEMENT(NAME "VENDOR",
         XMLELEMENT(NAME "VEND_CODE", V.VEND_CODE),
         XMLELEMENT(NAME "VENDOR_NAME", V.VEND_CONTACT),
         XMLELEMENT(NAME "VEND_AREACODE", V.VEND_AREACODE))
FROM VENDOR V;

Each row returned by the query corresponds to one VENDOR element, which is represented as:

<VENDOR>
  <VEND_CODE>230</VEND_CODE>
  <VENDOR_NAME>Shelly K. Smithson</VENDOR_NAME>
  <VEND_AREACODE>7325</VEND_AREACODE>
</VENDOR>

As you will have seen, the query we have just written is quite complicated. We could rewrite this query using the SQL/XML publishing function xmlforest( ), which creates a list of XML elements from the columns within the VENDOR table. The query would then look like this:

SELECT XMLELEMENT(NAME "VENDOR",
         XMLFOREST(V.VEND_CODE, V.VEND_CONTACT AS VENDOR_NAME, V.VEND_AREACODE))
FROM VENDOR V;

Producing XML from SQL queries that contain relational joins requires the use of more XML publishing functions, if we want to display the results in a way the user will understand. Suppose we wanted to list all the products that were associated with each vendor. (The contents of the PRODUCT table can be found in Figure 17.20.) This could be achieved using the following SQL query:

SELECT V.VEND_CODE, V.VEND_CONTACT AS VENDOR_NAME, P.PROD_CODE, P.PROD_DESCRIPT
FROM VENDOR V, PRODUCT P
WHERE V.VEND_CODE = P.VEND_CODE;

To represent the results of this query as XML, we want to show the vendor details once and then a list of the products that the vendor supplies. In SQL/XML this can be achieved using the publishing function xmlattributes( ) in a subquery that retrieves the products associated with each vendor. Subqueries in SQL/XML are only designed to return one row, so if multiple rows are to be returned they must be aggregated into one single value using the function xmlagg( ). The following SQL/XML query makes use of these publishing functions to display all products associated with each vendor:

SELECT XMLELEMENT(NAME VENDOR,
         XMLATTRIBUTES(V.VEND_CODE AS VEND_CODE),
         XMLFOREST(V.VEND_CONTACT AS VENDOR_NAME, V.VEND_AREACODE AS AREA),
         XMLELEMENT(NAME PRODUCT,
           (SELECT XMLAGG(XMLELEMENT(NAME PRODUCT,
                     XMLATTRIBUTES(P.PROD_CODE AS PROD_CODE),
                     XMLFOREST(P.PROD_DESCRIPT AS DESCRIPTION)))
            FROM PRODUCT P
            WHERE P.VEND_CODE = V.VEND_CODE)))
       AS "PRODUCTS RELATED TO VENDORS"
FROM VENDOR V;
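To see the shape of the XML that such a nested query describes, the same nesting can be built by hand. The Python sketch below does not use SQL/XML: an in-memory SQLite database holds a few rows taken from Figure 17.20, and the helper vendor_elements is a hypothetical function that plays the roles of xmlattributes( ), xmlforest( ) and xmlagg( ) in ordinary code.

```python
import sqlite3
import xml.etree.ElementTree as ET

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE VENDOR (VEND_CODE INTEGER PRIMARY KEY,
                     VEND_CONTACT TEXT, VEND_AREACODE TEXT);
CREATE TABLE PRODUCT (PROD_CODE TEXT PRIMARY KEY,
                      PROD_DESCRIPT TEXT, VEND_CODE INTEGER);
INSERT INTO VENDOR VALUES (231, 'James Johnson', '0181');
INSERT INTO VENDOR VALUES (232, 'Khaya Sibiya', '7325');
INSERT INTO PRODUCT VALUES ('001278-AB', 'Claw hammer', 232);
INSERT INTO PRODUCT VALUES ('SRE-657UG', 'Rat-tail file', 232);
INSERT INTO PRODUCT VALUES ('QER-34256', 'Sledge hammer, 16 kg head', 231);
""")


def vendor_elements(conn):
    """Build one VENDOR element per vendor, with its products nested inside."""
    vendors = []
    vendor_rows = conn.execute(
        "SELECT VEND_CODE, VEND_CONTACT, VEND_AREACODE "
        "FROM VENDOR ORDER BY VEND_CODE"
    ).fetchall()
    for code, contact, area in vendor_rows:
        # Vendor code becomes an attribute (the xmlattributes role).
        vendor = ET.Element("VENDOR", VEND_CODE=str(code))
        # Remaining columns become child elements (the xmlforest role).
        ET.SubElement(vendor, "VENDOR_NAME").text = contact
        ET.SubElement(vendor, "AREA").text = area
        wrapper = ET.SubElement(vendor, "PRODUCT")
        product_rows = conn.execute(
            "SELECT PROD_CODE, PROD_DESCRIPT FROM PRODUCT "
            "WHERE VEND_CODE = ? ORDER BY PROD_CODE",
            (code,),
        )
        # All matching product rows are gathered under one parent (the xmlagg role).
        for prod_code, descript in product_rows:
            product = ET.SubElement(wrapper, "PRODUCT", PROD_CODE=prod_code)
            ET.SubElement(product, "DESCRIPTION").text = descript
        vendors.append(vendor)
    return vendors


vendors = vendor_elements(conn)
for vendor in vendors:
    print(ET.tostring(vendor, encoding="unicode"))
```

Each vendor appears once, with its products collected beneath it, which is the result the correlated subquery and xmlagg( ) are meant to produce inside the DBMS.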

An alternative approach to SQL/XML is XQuery. XQuery is a language that can query, store, process and exchange structured or semi-structured XML data. XQuery is used in conjunction with XPath, which is used to navigate through elements and attributes in an XML document. XPath is a major component of W3C's XSLT standard. XQuery includes over 100 built-in functions, including functions for manipulating strings and comparing dates. The following is an example of an XQuery that retrieves a list of products which has been supplied by each vendor:

for $V in $VENDOR/ROW
return
  <VENDOR VEND_CODE="{$V/VEND_CODE}">
    <VEND_NAME>{ string($V/VEND_CONTACT) }</VEND_NAME>
    <PRODUCT>
    {
      for $P in $PRODUCT/ROW
      where $P/VEND_CODE = $V/VEND_CODE
      return <PRODUCT PROD_CODE="{$P/PROD_CODE}" PROD_DESCRIPT="{$P/PROD_DESCRIPT}"/>
    }
    </PRODUCT>
  </VENDOR>

The XQuery

performs

can see requires is that

it

can

Lets

FIGure ,?xml

query

consider

17.21

exactly

the same

more in-depth data

stored

a simpler

query

knowledge inside

the

example

as the last of

XML

database

by using

SQL/XML

programming. or directly

the

query that

welooked

One of the

main strengths

from

DVDStore.xml

an

XML

document

at, but as you of XQuery

source. in

Figure

17.21.

DVDstore.xml Document

version5"1.0"

,!Created

a

encoding5"ISO-8859-1"?.

by KAC --.

,dvdstore. ,dvd

category5"Children". ,title.ToyStory

,/title.

,year.2005,/year. ,price.9.00,/price. ,/dvd. ,dvd

category5"Action".

,title.Indiana

Jones

,/title.

,year.2001,/year. ,price.15.00,/price. ,/dvd. ,/dvdstore.

In order to extract data from XML documents, the doc( ) function is used to open the dvdstore.xml file as shown below:

    doc("dvdstore.xml")

In order to extract data elements, path expressions from XQuery are used. The following example illustrates how the title element would be extracted from the dvdstore.xml document:

    doc("dvdstore.xml")/dvdstore/dvd/title

Executing this function would display the following:

    <title>Toy Story</title>
    <title>Indiana Jones</title>

The path /dvdstore/dvd/title selects every title element that is a child of a dvd element under the top-level dvdstore element. If we wanted to extract elements based on a specific condition, for example to select the details of DVDs costing less than twelve rand, we would write:

    doc("dvdstore.xml")/dvdstore/dvd[price<12]


This function would return the following:

    <dvd category="Children">
      <title>Toy Story</title>
      <year>2005</year>
      <price>9.00</price>
    </dvd>

FLWOR expressions are a fundamental part of XQuery. FLWOR is an acronym for For .. Let .. Where .. Order By .. Return. A FLWOR expression must contain at least one For or Let clause and must end with a Return clause; the Where and Order By clauses are optional. An example of a FLWOR expression, used for retrieving all DVDs costing less than twelve rand, is shown below:

    for $y in doc("dvdstore.xml")/dvdstore/dvd
    where $y/price<12
    order by $y/title
    return $y/title

The for clause selects all dvd elements under the dvdstore element into a variable called $y. Then, the where clause selects only dvd elements with a price less than twelve rand. The order by clause states how the results are ordered; in this case the results are ordered alphabetically by title. The return clause states what should be returned, in this case the title elements. For comparison purposes, the FLWOR expression can be written as the following SQL query:

    SELECT d.title
    FROM dvd d
    WHERE d.price < 12;

An in-depth look at XQuery is beyond the scope of this book, but further reading can be found at the end of this chapter.
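For readers without an XQuery processor to hand, the path expressions and the FLWOR expression described above can be approximated with Python's standard xml.etree.ElementTree module. This is only a sketch: ElementTree implements a small subset of XPath and no XQuery at all, so the where and order by steps are written as ordinary Python. The constant DVDSTORE_XML and the functions all_titles and cheap_titles are illustrative names, and the XML declaration is omitted from the sample string for simplicity.

```python
import xml.etree.ElementTree as ET

# Contents of the dvdstore.xml document from Figure 17.21 (declaration omitted)
DVDSTORE_XML = """
<dvdstore>
  <dvd category="Children">
    <title>Toy Story</title>
    <year>2005</year>
    <price>9.00</price>
  </dvd>
  <dvd category="Action">
    <title>Indiana Jones</title>
    <year>2001</year>
    <price>15.00</price>
  </dvd>
</dvdstore>
"""


def all_titles(root):
    """Counterpart of the path expression /dvdstore/dvd/title."""
    # root is the dvdstore element, so the path is written relative to it
    return [title.text for title in root.findall("./dvd/title")]


def cheap_titles(root, limit=12.0):
    """Counterpart of the FLWOR expression: for / where / order by / return."""
    dvds = root.findall("./dvd")                                      # for $y in .../dvdstore/dvd
    cheap = [d for d in dvds if float(d.findtext("price")) < limit]   # where $y/price < 12
    cheap.sort(key=lambda d: d.findtext("title"))                     # order by $y/title
    return [d.findtext("title") for d in cheap]                       # return the title text


root = ET.fromstring(DVDSTORE_XML)

if __name__ == "__main__":
    print(all_titles(root))
    print(cheap_titles(root))
```

Note that the functions return the text of each title rather than the title elements themselves, which is a simplification of what the XQuery return clause produces.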

17.3.4 XML Applications

Now that you have some idea of what XML is, the next question is, how can you use it? What kinds of applications lend themselves particularly well to XML? This section will list some of the uses of XML. Keep in mind that the future use of XML is limited only by the imagination and creativity of developers, designers and programmers.

B2B exchanges. As noted earlier, XML enables the exchange of B2B data, providing the standard for all organisations that need to exchange data with partners, competitors, the government or customers. In particular, XML is positioned to replace Electronic Data Interchange (EDI) as the standard for the automation of the supply chain because it is less expensive and more flexible.

Legacy systems integration. XML provides the glue that integrates legacy system data with modern e-commerce Web systems. Web and XML technologies could be used to inject some new life in old but trusted legacy applications. Another example is the use of XML to import transaction data from multiple operational databases to a data warehouse database.

Web page development. XML provides several features that make it a good fit for certain Web development scenarios. For example, Web portals with large amounts of personalised data can use XML to pull data from multiple external sources (such as news, weather and stocks) and apply different presentation rules to format pages on desktop computers as well as mobile devices.

Database support. A DBMS that supports XML exchanges will be able to integrate with external systems (Web, mobile data, legacy systems, and so on) and thus enable the creation of new types of systems. These databases can import or export data in XML format or generate XML documents from SQL queries while still storing the data in their native data model format. Alternatively, a DBMS can support an XML data type to store XML data in its native format. The implications of these capabilities are far-reaching: you would even be able to store a hierarchical-like tree structure inside a relational structure. Of course, such activities would also require that the query language be extended to support queries on XML data.

Database metadictionaries. XML can also be used to create metadictionaries, or vocabularies, for databases. Examples of metadictionaries include HR-XML for the human resources industry, the metadata encoding and transmission standard (METS) from the Library of Congress, the clinical accounting information (CLAIM) data exchange standard for patient data exchange in electronic medical record systems, and the extensible business reporting language (XBRL) standard for exchanging business and financial information.

XML databases.4 Most databases on the market provide XML support in some shape or form. The approaches range from simple middleware XML software, to object databases with XML interfaces, to full XML database engines and servers. XML databases manage the storage of data in complex relationships. For example, an XML database would be well suited to store the contents of a book. The book's structure would dictate its database structure: a book typically consists of chapters, sections, paragraphs, figures, charts, footnotes, endnotes and so on. Examples of databases with XML support include Oracle, IBM DB2 and MS SQL Server. An example of a full XML database is Berkeley DB XML by Oracle (www.oracle.com/database/berkeley-db/xml.html).

XML services. Many companies are already working on the development of a new breed of services based on XML and Web technologies. These services promise to break down the interoperability barriers among systems and companies alike. XML provides the infrastructure that facilitates heterogeneous systems to work together across the desk, the street and the world. Services would use XML and other internet technologies to publish their interfaces. Other services, wanting to interact with existing services, would locate them and learn their vocabulary (service request and replies) to establish a conversation.

One area in which internet, Web, virtualisation and XML technologies work together in innovative ways to leverage IT services is cloud computing.

4 For a comprehensive analysis of XML database products, see XML Database Products by Ronald Bourret at www.rpbourret.com.

17.4 Cloud Computing Services

You have almost certainly heard about the cloud from the thousands of publications and TV ads that have used the term over the years, although it has represented different concepts. In the late 1980s, the term cloud was used by telecommunication companies to describe their data networks. In the late 1990s, during the peak of internet growth, the term cloud was used to depict the internet itself. Then, in 2006, Google and Amazon began using the term cloud computing to describe a new set of innovative Web-based services. Google, Yahoo, eBay and Amazon were the early adopters of this new computing paradigm. But what exactly is cloud computing?

According to the National Institute of Standards and Technology (NIST),5 cloud computing is a computing model for enabling ubiquitous, convenient, on-demand network access to a shared pool of configurable computer resources (e.g., networks, servers, storage, applications, and services) that can be rapidly provisioned and released with minimal management effort or service provider interaction. The term cloud services is used in this book to refer to the services provided by cloud computing. Cloud services allow any organisation to add information technology services such as applications, storage, servers, processing power, databases and infrastructure to its IT portfolio quickly and economically. Figure 17.22 shows a representation of cloud computing services on the internet.

FIGURE 17.22 Cloud services
[Figure: cloud services (email, desktop, storage, server, RDBMS, NoSQL) delivered over the internet by cloud service providers, whose offerings include content delivery, simple messaging, simple storage, simple queuing, relational DB, NoSQL DB and elastic compute.]
SOURCE: Course Technology/Cengage Learning

Cloud computing allows highly specialised, IT-savvy organisations such as Amazon, Google and Microsoft to build high-performance, fault-tolerant, flexible and scalable IT services. These services include applications, storage, servers, processing power, databases and email, which are delivered via the internet to individuals and organisations using a pay-as-you-go price model.

5 Recommendations of the National Institute of Standards and Technology, Peter Mell and Timothy Grance, Special Publication 800-145, September 2011.


For example, imagine that the chief technology officer of a non-profit organisation wants to add email services to the IT portfolio. A few years ago, this proposition would have implied building the email system's infrastructure from the ground up, including hardware, software, setup, configuration, operation and maintenance. However, in today's cloud computing era, you can use Google's G Suite or Microsoft Office 365 and get a scalable, flexible and more reliable email solution for a fraction of the cost. The best part is that you do not have to worry about the daily chores of managing and maintaining the IT infrastructure, such as OS updates, patches, security, fault tolerance and recovery. What used to take months or years to implement can now be done in a matter of minutes. If you need more space, you simply add another storage unit to your storage grid. If you need more processing power to handle last-minute orders during the busy Christmas season, you simply add more processing units to your servers. Even more importantly, you can scale down as easily as you scaled up. Once your usage subsides, you can go back to your previous levels of processing power or storage automatically, without additional intervention. The beauty of cloud computing is that you pay only for what you use.

Cloud computing is important for database technologies because it has the potential to become a game changer. Cloud computing eliminates financial and technological barriers so organisations can leverage database technologies in their business processes with minimal effort and cost. In fact, cloud services have the potential to turn basic IT services into commodity services such as electricity, gas and water, and to enable a revolution that could change not only the way that companies do business, but the IT business itself. As Nicholas Carr put it so vividly: "Cloud computing is for IT what the invention of the power grid was for electricity."6

The technologies that make cloud computing work have been around for a few years now; these technologies include the Web, messaging, virtualisation, remote desktop protocols, VPN and XML. However, cloud computing itself is still in the early years and needs to mature further before it can be widely adopted. Despite this, more and more organisations are tapping into cloud services and a pay-per-use model for their IT services. Currently, organisations can employ database services (relational or NoSQL) in the cloud. Instead of spending large amounts of cash buying hardware and software, you can log in to Amazon Web Services (AWS) or Microsoft Azure and have a relational database instance ready for use in a matter of minutes. Figure 17.23 depicts an administrator provisioning a relational database in Microsoft Azure and Amazon RDS, respectively.

6 Nicholas Carr, The Big Switch: Rewiring the World, from Edison to Google, W.W. Norton & Co., 2009.


Private cloud. This type of internal cloud is built by an organisation for the sole purpose of servicing its own needs. Private clouds are often used by large, geographically dispersed organisations to add agility and flexibility to internal IT services. The cloud infrastructure could be managed by internal IT staff or an external third party.

Community cloud. This type of cloud is built by and for a specific group of organisations that share a common trade, such as agencies of the federal government, the military or higher education. The cloud infrastructure could be managed by internal IT staff or an external third party.

Regardless of the implementation an organisation uses, most cloud services share a common set of core characteristics. These characteristics are explored in the next section.

17.4.2 Characteristics of Cloud Services

Cloud computing services share a set of guiding principles. The characteristics listed in this section are shared by prominent public cloud service providers such as Amazon, Google, Salesforce, SAP and Microsoft. The prevalent characteristics are:

Ubiquitous access via internet technologies. All cloud services use internet and Web technologies to provision, deliver and manage the services they provide. The basic requirement is that the device has access to the internet.

Shared infrastructure. The cloud service infrastructure is shared by multiple users. Sharing is made possible by Web and virtualisation technologies. Cloud services effectively provide an organisation with a virtual IT infrastructure, as if it were the only user of the infrastructure.

Lower costs and variable pricing. The initial costs of using cloud services tend to be significantly lower than building on-premise IT infrastructures. According to some studies,7 the savings could range from 35 per cent to 55 per cent, depending on company size, although more research is needed in this area. Because cloud services pricing is metered per volume and time utilisation, consumers benefit from lower and flexible pricing options. These options range from pay-as-you-go to fixed pricing based on minimum levels of service.

Flexible and scalable services. Cloud services are built on an infrastructure that is highly scalable, fault tolerant and very reliable. The services can scale up and down on demand according to resource demands.

Dynamic provisioning. The consumer can quickly provision any needed resources, including servers, processing power, storage and email, by accessing the Web management dashboard and then adding and removing services on demand. This process can also be automated via other services.

Service orientation. Cloud computing focuses on providing consumers with specific, well-defined services that use well-known interfaces. These interfaces hide the complexity from the end user, and can be delivered anytime and anywhere.

Managed operations. Cloud computing minimises the need for extensive and expensive in-house IT staff. The system is managed by the cloud provider. The consumer IT staff is free from routine management and maintenance tasks so they can focus on other tasks within the organisation. Managed operations apply to organisations that use public clouds and those that outsource cloud management to an external third party.

The preceding list is not exhaustive, but it is a starting point for understanding most cloud computing offerings. Although most companies move to cloud services because of cost savings, some companies move to them because they are the best way to gain access to specific IT resources that would otherwise be unavailable. Not all cloud services are the same; in fact, there are several different types, as explained in the next section.

7 The Compelling TCO Case for Cloud Computing in SMB and Mid-Market Enterprises: A 4-year total cost of ownership (TCO) perspective comparing cloud and on-premise business application development, Sanjeev Aggarwal, Partner; Laurie McCabe, Partner; Hurwitz & Associates, 2009.
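The metered pricing options mentioned in the characteristics above (pay-as-you-go versus fixed pricing based on a minimum level of service) come down to simple arithmetic. The sketch below is illustrative only: the rates, the included hours and the function names are invented for the example and do not reflect any provider's actual price list.

```python
def pay_as_you_go_cost(hours_used, rate_per_hour):
    """Metered pricing: the consumer pays only for the hours actually used."""
    return hours_used * rate_per_hour


def fixed_plan_cost(hours_used, monthly_fee, included_hours, overage_rate):
    """Fixed pricing: a flat fee covers a minimum level of service,
    and any usage beyond the included hours is billed at the overage rate."""
    extra_hours = max(0, hours_used - included_hours)
    return monthly_fee + extra_hours * overage_rate


def cheaper_plan(hours_used, rate_per_hour, monthly_fee, included_hours, overage_rate):
    """Return the name of the cheaper option for a given month's usage."""
    metered = pay_as_you_go_cost(hours_used, rate_per_hour)
    fixed = fixed_plan_cost(hours_used, monthly_fee, included_hours, overage_rate)
    return "pay-as-you-go" if metered <= fixed else "fixed"


if __name__ == "__main__":
    # Hypothetical prices: 2 per hour metered, or 300 per month with 200 hours
    # included and 1 per extra hour
    for hours in (100, 400):
        print(hours, cheaper_plan(hours, 2, 300, 200, 1))
```

With these invented figures, a light month favours pay-as-you-go and a heavy month favours the fixed plan, which is why providers typically offer both options.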

17.4.3 Types of Cloud Services

Cloud services come in different shapes and forms; no single type of service works for all organisations. In fact, cloud services often follow an à la carte model; consumers can choose multiple service options according to their individual needs. These services can build on top of one another to provide sophisticated solutions. Based on the types of services provided, cloud services can be classified by the following categories:

Software as a Service (SaaS). The cloud service provider offers turnkey applications that run in the cloud. Consumers can run the provider's applications internally in their organisations via the Web or any mobile device. The consumer can customise certain aspects of the application but cannot make changes to the application itself. The application is actually shared among users from multiple organisations. Examples of SaaS include Microsoft Office 365, Google Docs, Intuit's TurboTax Online and SCALA digital signage.

Platform as a Service (PaaS). The cloud service provider offers the capability to build and deploy consumer-created applications using the provider's cloud infrastructure. In this scenario, the consumer can build, deploy and manage applications using the provider's cloud tools, languages and interfaces. However, the consumer does not manage the underlying cloud infrastructure. Examples of PaaS include the Microsoft Azure platform with .NET and the Java development environment, and Google App Engine with Python or Java.

Infrastructure as a Service (IaaS). In this case, the cloud service provider offers consumers the ability to provision their own resources on demand; these resources include storage, servers, databases, processing units and even a complete virtualised desktop. The consumer can then add or remove the resources as needed. For example, a consumer can use AWS and provision a server computer that runs Linux and Apache Web server using 64 GB of RAM and 1 TB of storage.

Figure 17.24 illustrates a sample of the different types of cloud services; these services can be accessed from any computing device.

FIGURE 17.24 Types of cloud services
[Figure: servers, laptops, tablets, desktops and smart phones reach three layers of cloud services over the internet.]
Software as a Service: Microsoft Office 365; Google Docs, Google Email; Salesforce CRM Online; SAP Business ByDesign
Platform as a Service: Amazon Web Services, MS Azure Platform, Google App Engine; Amazon Relational Data Service, MS SQL Service, Amazon Simple DB Service
Infrastructure as a Service: Amazon Web Services: Amazon Elastic Computing Cloud 2 (EC2), Amazon Elastic MapReduce Service, Amazon Simple Storage Service (S3), Amazon Elastic Load Balancing Service
SOURCE: Course Technology/Cengage Learning

Cloud computing technologies have evolved in their sophistication and flexibility. The merging of new technologies has enabled the creation of new options such as desktop as a service, which effectively creates a virtual computer on the cloud that can be accessed from any device over the internet. For example, you can use a service such as VirtualBox (www.virtualbox.org) and get a Windows 10 desktop running in a matter of minutes. Moreover, you can access your virtual desktop over the Web for your personal use in a Web browser or using any Remote Desktop Protocol (RDP) application.

17.4.4 Cloud Services: Advantages and Disadvantages

Cloud computing has grown remarkably in the past few years. Companies of all sizes are enjoying the advantages of cloud computing, but its widespread adoption is still limited by several factors.8 Table 17.4 summarises the main advantages and disadvantages of cloud computing.

8 Cloud Computing Market Global Outlook 2019 | Global Opportunities, Challenges, Forecast and Strategies To 2028, Global Banking and Finance Review, 2019, Available: www.globalbankingandfinance.com/category/news/cloud-computing-market-outlook-2019-global-opportunities-challenges-forecast-and-strategies-to-2028/


TABLE 17.4 Advantages and disadvantages of cloud computing

Advantages

Low initial cost of entry. Cloud computing has lower costs of entry when compared with the alternative of building in-house.

Scalability/elasticity. It is easy to add and remove resources on demand.

Support for mobile computing. Cloud computing providers support multiple types of mobile computing devices.

Ubiquitous access. Consumers can access the cloud resources from anywhere at any time, as long as they have internet access.

High reliability and performance. Cloud providers build solid infrastructures that otherwise are difficult for the average organisation to leverage.

Fast provisioning. Resources can be provisioned on demand in a matter of minutes with minimal effort.

Managed infrastructure. Most cloud implementations are managed by dedicated internal or external staff. This allows the organisation's IT staff to focus on other areas.

Disadvantages

Issues of security, privacy and compliance. Trusting sensitive company data to external entities is difficult for most data-cautious organisations.

Hidden costs of implementation and operation. It is hard to estimate bandwidth and data migration costs.

Data migration is a difficult and lengthy process. Migrating large amounts of data to and from the cloud infrastructure can be difficult and time-consuming.

Complex licensing schemes. Organisations that implement cloud services are faced with complex licensing schemes and complicated service-level agreements.

Loss of ownership and control. Companies that use cloud services are no longer in complete control of their data. What is the responsibility of the cloud provider if data are breached? Can the vendor use your data without your consent?

Organisation culture. End users tend to be resistant to change. Do the savings justify being dependent on a single provider? Will the cloud provider be around in ten years?

Difficult integration with internal IT system. Configuring the cloud services to integrate transparently with internal authentication and other internal services could be a daunting task.

As the table shows, the top perceived benefit of cloud computing is the lower cost of entry. At the same time, the chief concern of cloud computing is data security and privacy, particularly in companies that deal with sensitive data and are subject to high levels of regulation and compliance.9 This concern leads to the perception that cloud services are mainly implemented in small to medium-sized companies where the risk of service loss is minimal. In fact, some companies that are subject to strict data security regulations tend to favour private clouds rather than public ones.10

One of the biggest growth segments in cloud services is mobile computing. For example, Netflix,11 the video-on-demand trailblazer, announced in 2011 that it had moved significant parts of its IT infrastructure to AWS. Netflix decided to move to the cloud because of the challenges of building IT infrastructure fast enough to keep up with its relentless growth.

9 Are security issues delaying adoption of cloud computing?, Ellen Messmer, Network World, April 27, 2009.
10 Lessons From FarmVille, Charles Babcock, InformationWeek, May 16, 2011.
11 NoSQL at Netflix, Yuri Israilevsky, Director of Cloud and Systems Infrastructure at Netflix, January 28, 2011, http://techblog.netflixx.com/2011/nosql-at-netflix.html.


note Cloud reality Cloud

Check: is the Cloud enterprise

service

types

outages

and sizes

infrastructure to

steal

providers.

thousands

of people

etc.).

and

all over

of dollars in lost

world,

downdetector.com.

have

retrieval

to

seen

in

data

service

such

and restore,

databases

and

to

manipulate

centre

millions

Facebook,

Twitter,

or cost

web services,

of all system

millions

go to

with a live

within reach

hackers

affect

degradation,

all

cloud

http://

outage

map.

development.

of any type

Cloud

of organisation.

over

queries,

indexing,

to

users.

end

SDS

Tabular

and

amounts

XML.

from

At the

of data

provides

cloud

access

have

simple

same

and

stable

expanded

management

ODBC

time,

while controlling

a relatively

vendors

software,

to

and

data

companies

costs

without

and reliable their

platform

business

to

offer

data management

companies

infrastructure

uses

to

stored

standard

of database

database

procedures,

Other features

of all sizes

personnel.

Services

(TDS)

as

data

for

communication

SQL networking for

servers

that

administrators triggers,

such

are available

data

encapsulate Data

uses a cluster

the internet

and exporting

without

This type

of

reporting

SQL

analytical

data

access

as SQL-Net

Server

backup

purposes.

and relational

Microsoft

Typically,

data

administrative

such

alarge

users.

and

synchronisation,

protocols,

provide

and

databases,

for

protocols.

Oracle

inside

the

protocol. interface.

data.

SDS is transparent interfaces

Programmers

asif the

data

disadvantage,

reliable

ADO.NET

evolved

(SDS) refers to a cloud computing-based

SDS typically

services

have

benefits:

programming

the

High level

affect

in

allowed

could

by provider

that is

computing

hardware,

use familiar

SQL data services Highly

at the

using

storage,

programming

database

potential

17

SQL

networking

A common

the

unique

protocols.

continue

data

and data importing

these

TCP/IP

that

(Instagram,

most common

technologies

services;

of in-house

available

Typically,

access

Cloud

functionality

as

are

Standard

remain

breach

services

problems

ever-growing

business

management.

of database

functions

of the

Such incidents

Other incidents

performance

management

processing

features.

relational

some

data

data

SQL data services

high costs

features

to data

manage

deploying

provides

Hosted

databases

data

better

management

provides

subset

most recent

of the

year.

to service interruptions

security

media

data loss,

alist

chapter,

remote

and

that

social

status

size,

this

SQL data services. service

in

iCloud

celebrities.

up-to-date

can find

ways to

developing

the typically

as the

every

universities

Data services

for

sacrificing for

such

well-known

as interruptions

a new dimension

advanced

are looking

public,

in large

service interruption,

of a companys brings

sQl

you

you

are reported

breaches

from

To see the

There,

computing

As

very

such

incidents

data

pictures

can cause

business.

Regardless

are

private

the

breach

from

Some

of

These incidents

17.4.5

security

of organisations,

ready?

write

of failure

embedded

as SQL

were stored locally instead

however, is that

offer the following and

to

such

scalable tolerance

data

a remote

for

are

Programmers Studio.NET

applications

of the

and

to connect

on the internet.

may not be supported

with in-house

a fraction

normally

Visual

location

data types

when compared

database

because

developers. and

code in their

ofin

some specialised

advantages

relational

application ADO.NET

to

One by SDS.

systems:

cost

distributed

and replicated

among

multiple

servers


Chapter

Dynamic

and automatic

Automated

data backup

Dynamic Cloud

creation

providers

running

in

a

scalability

such

matter

development business

data

for

the

service

17.5

Amazon,

Google

Microsoft

the

start

of the

the

and

you

work

The

patterns,

then

our

actually

is

relate

allow have

services

to

907

service

you

to

to

get

worry

your

about

to

be knowledgeable

database fault

enables

database

database

about

to

application to

create

deploy

the

technology

database

server

tolerance,

rapid

and allows them

use the

to relational

own

backups,

services

resources,

is free

access

as

2001, and

accurate

machine

a

one of the founders

mechanism

he published hypertext

could

be

of the

could

our

for

via

design

vision

best a SQL

and

SQL to

of

a very

in

future

of the

that

the

peoples

a

interactions,

management

the typical

way that WWW as:

machine-readable

thoughts,

powerful

through

WWW and director

concepts

of the

so intuitive state

become

working together

of the

describing

his initial

representation analysis

tool,

problems

seeing

which

beset

organisations.12

for

using a variety

the

majority

A traditional

the

Berners-Lee,

(W3C) In

information

easy

of

between

of formats

human

computer,

meanings

have the same

however,

two

including

beings can

concepts

images,

to

understand

only

understand

written

using

multimedia as they

the

syntax

different

and natural

comprehend

the

of the language

natural

languages

that

meaning.

In 2001, Tim Berners-Lee web in

by Tim

person an

of large

of language.

cannot

Web Technologies

storage.

technology

having

need

work and facilitating

WWW represents

semantics

and

do not

of cloud

Consortium

gave

which

and

WeB

between

management

language,

you

with the

The use of SQL data

information

still

understand.

space

patterns in the

Web

actually

interaction

information

Connectivity

applications.

Wide

can

better,

However,

Web was conceived

World

and

tasks.

A consumer at hand.

the seMantIC

computers

in

Even

with limited

rapidly.

is just

The Semantic

and

processes

minutes.

problem

included

of database

maintenance

high-quality

recovery

allocation

businesses

Database

balancing

and disaster

as

of

solutions for

develop

If

and

and routine

solution

load

17

which information

formally

is

given

defined

his idea

well-defined

of a Semantic

meaning,

better

to

Web

Web as An

enabling

extension

computers

of the

and

people

current to

work

cooperation.13 Today,

the

the

Semantic

WWW to share

However,

it is

maintains

its

your

not own

calendar)

reused

we had

you took

across

data relates

combination

to real

data

a specific

between

one

underlying

any

boundaries.

of data

from

different

The framework

allows

schemas

are

of data.

On a daily

bank accounts

applications

would

to

basis,

individuals

and view their

as each

be possible

ongoing

produce

without

such feature

database

it

Web is

which is a model that has a number

example,

a

application

know

what

use

own calendar. manages

you

were

and

doing

(via

photograph.

Sematic

world objects.

as

manage

Web of data,

The aim is to

applications,

and

holidays,

to link a

on the

and industry.14

integration

(rDF),

possible

work

referred

book

always

and

researchers

often

data. If

when

Research

Web is

photographs,

and led

a framework The framework sources

and

is based

of features

data from

by the

that

two

W3C in

allows

will establish develop

on the

resource

for

formats

Description

to

be

and for

modelling

how

Framework

data over the

applications

with

be shared

common

alanguage

for interchanging

different

collaboration

all data to

WWW. For

merged

even if the

different.

12 Berners-Lee, T. WWW: Past, Present, and Future, Computer, October 1996 (vol. 29 no. 10), pp. 69-77.

13 Berners-Lee, T., Hendler, J. and Lassila, O. The Semantic Web, Scientific American, May 2001.

14 W3C Semantic Web Activity. Available: www.w3.org/standards/semanticweb/

Cloud computing is a computing model that provides ubiquitous, on-demand access to a shared pool of configurable resources that can be rapidly provisioned.

SQL data services (SDS) refers to a cloud computing-based data management service that provides relational data storage, ubiquitous access and local management to companies of all sizes. This service enables rapid application development for businesses with limited information technology resources. SDS allows rapid deployment of business solutions using standard protocols and common programming interfaces.

The Semantic Web, often referred to as a Web of data, is a framework that allows all data on the WWW to be shared and reused across applications, without any boundaries.
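The idea of a Web of data can be made concrete with a small sketch. The example below is only an illustration under stated assumptions: it models data as subject-predicate-object triples in the spirit of RDF, using plain Python sets rather than a real RDF library. The identifiers (`ex:event1`, `ex:photo9`), the predicate names and the sample values are hypothetical. It shows how data held by two separate applications (a calendar and a photo store) could be merged and queried together even though neither was designed around the other's schema.

```python
# A toy "Web of data": each fact is a (subject, predicate, object) triple.
# All identifiers and predicate names below are hypothetical examples.

CALENDAR = {
    ("ex:event1", "ex:onDate", "2020-07-04"),
    ("ex:event1", "ex:title", "Beach trip"),
}

PHOTOS = {
    ("ex:photo9", "ex:takenOn", "2020-07-04"),
}


def merge_graphs(*graphs):
    """Merge any number of triple sets into one graph (set union)."""
    merged = set()
    for graph in graphs:
        merged.update(graph)
    return merged


def objects(graph, subject, predicate):
    """Return every object linked to `subject` by `predicate`."""
    return {o for s, p, o in graph if s == subject and p == predicate}


def activity_for_photo(graph, photo):
    """Find the titles of calendar events held on the day a photo was taken."""
    dates = objects(graph, photo, "ex:takenOn")
    events = {s for s, p, o in graph if p == "ex:onDate" and o in dates}
    return {title for e in events for title in objects(graph, e, "ex:title")}


if __name__ == "__main__":
    web_of_data = merge_graphs(CALENDAR, PHOTOS)
    print(activity_for_photo(web_of_data, "ex:photo9"))
```

Because every fact has the same three-part shape, merging is just a set union. The query then follows links across what were originally two separate data stores.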

Key Terms

ActiveX
ActiveX Data Objects (ADO)
ADO.NET
application programming interface (API)
Call Level Interface (CLI)
client-side extensions
cloud computing
cloud services
common cloud
Common Gateway Interface (CGI)
Data Access Objects (DAO)
data source name (DSN)
database middleware
DataSet
Document Type Definition (DTD)
dynamic link libraries (DLLs)
Extensible Markup Language (XML)
Infrastructure as a Service (IaaS)
Java
Java Database Connectivity (JDBC)
JavaScript
LAMP
Microsoft .NET framework
Object Linking and Embedding for Database (OLE-DB)
Open Database Connectivity (ODBC)
path expressions
Platform as a Service (PaaS)
plug-in
private cloud
public cloud
Remote Data Objects (RDO)
Resource Description Framework (RDF)
script
server-side extension
Software as a Service (SaaS)
SQL data services (SDS)
stateless system
tags
Universal Data Access (UDA)
VBScript
Web-to-database middleware
XML schema
XML schema definition (XSD)

Further Reading

Duckett, J., PHP & MySQL: Server-side Web Development. John Wiley & Sons, 2019.

Fawcett, J., Ayers, D. and Quin, L., Beginning XML, 5th revised edition. John Wiley & Sons, 2012.

Jain, A., The Cloud DBA-Oracle: Managing Oracle Database in the Cloud. Apress, 2017.

Online Content  Answers to selected Review Questions and Problems for this chapter are available on the online platform for this book.

Review Questions

1 Give some examples of database connectivity options and what they are used for.

2 What are ODBC, DAO and RDO? How are they related?

3 What is the difference between DAO and RDO?

4 What are the three basic components of the ODBC architecture?

5 Which steps are required to create an ODBC data source name?

6 What is OLE-DB used for, and how does it differ from ODBC?

7 Explain the OLE-DB model based on its two types of objects.

8 How does ADO complement OLE-DB?

9 What is ADO.NET, and what two new features make it important for application development?

10 What is a DataSet, and why is it considered to be disconnected?

11 What are Web server interfaces used for? Give some examples.

12 What does this statement mean: The Web is a stateless system. What implications does a stateless system have for database applications developers?

13 What is a Web application server, and how does it work from a database perspective?

14 What are scripts, and what is their function? (Think in terms of database applications development.)

15 What is XML, and why is it important?

16 What are Document Type Definition (DTD) documents, and what do they do?

17 What are XML Schema Definition (XSD) documents, and what do they do?

18 What is JDBC, and what is it used for?

19 What is cloud computing, and why is it a game changer?

20 Name and contrast the types of cloud computing implementation.

21 Name and describe the most prevalent characteristics of cloud computing services.

22 Using the internet, search for providers of cloud services. Then, classify the types of services they provide (SaaS, PaaS and IaaS).

23 Summarise the main advantages and disadvantages of cloud computing services.

24 Define SQL data services and list their advantages.

25 What is meant by the Semantic Web?

Online Content  The databases used in the Problems for this chapter can be found on the online platform for this book.

Problems

In the following exercises, you set up database connectivity using Microsoft Excel.

1 Use Microsoft Excel to connect to the Ch02_InsureCo Microsoft Access database, using ODBC, and retrieve all of the AGENTs.

2 Use Microsoft Excel to connect to the Ch02_InsureCo Microsoft Access database, using ODBC, and retrieve all of the CUSTOMERs.

3 Use Microsoft Excel to connect to the Ch02_InsureCo Microsoft Access database, using ODBC, and retrieve the customers whose AGENT_CODE is equal to 503.

4 Create an ODBC System Data Source Name Ch02_SaleCo, using the Control Panel, Administrative Tools, Data Sources (ODBC) option.

5 Use Microsoft Excel to list all of the invoice lines for Invoice 103, using the Ch02_SaleCo System DSN.

6 Create an ODBC System Data Source Name Ch02_TinyCollege, using the Control Panel, Administrative Tools, Data Sources (ODBC) option.

7 Use Microsoft Excel to list all classes taught in room KLR200, using the Ch02_TinyCollege System DSN.

8 Create a sample XML document and DTD for the exchange of customer data.

9 Create a sample XML document and DTD for the exchange of product and pricing data.

10 Create a sample XML document and DTD for the exchange of order data.

11 Create a sample XML document and DTD for the exchange of student transcript data. Use your college transcript as a sample.


Glossary

A

access plan
A set of instructions generated at application compilation time that is created and managed by a DBMS. The access plan predetermines how an application's query will access the database at run time.

ActiveX
Microsoft's alternative to Java. A specification for writing programs that will run inside the Microsoft client browser, Internet Explorer. Oriented mainly to Windows applications, it is not portable. It adds controls such as drop-down windows and calendars to Web pages.

ActiveX Data Objects (ADO)
A Microsoft object framework that provides a high-level, application-oriented interface to OLE-DB, DAO, and RDO. ADO provides a unified interface to access data from any programming language that uses the underlying OLE-DB objects.

ad hoc query
A spur-of-the-moment question.

ADO.NET
The data access component of Microsoft's .NET application development framework, which is a component-based platform for developing distributed, heterogeneous, interoperable applications aimed at manipulating any type of data over any network using any operating system and programming language.

aggregate aware
A data model that organises data around a central entity based on the way the data will be used.

aggregate ignorant
A data model that does not organise data around a central entity based on anticipated usage of the data.

algorithms
A process or set of operations in a calculation.

alias
An alternative name for a column or table in a SQL statement.

ALTER TABLE
The SQL command used to make changes to table structure. When the command is followed by a keyword (ADD or MODIFY), it adds a column or changes column characteristics.

American National Standards Institute (ANSI)
The group that accepted the DBTG recommendations and augmented database standards in 1975 through its SPARC committee.

analytical database
A database focused primarily on storing historical data and business metrics used exclusively for tactical or strategic decision making.

AND
The SQL logical operator used to link multiple conditional expressions in a WHERE or HAVING clause. It requires that all conditional expressions evaluate to true.

anonymous PL/SQL block
A PL/SQL block that has not been given a specific name.

application processor
See transaction processor (TP).

application programming interface (API)
Software through which programmers interact with middleware. An API allows the use of generic SQL code, thereby allowing client processes to be database server-independent.

atomic attribute
An attribute that cannot be further subdivided to produce meaningful components. For example, a person's last name attribute cannot be meaningfully subdivided.

atomic transaction property
A property that requires all parts of a transaction to be treated as a single, logical unit of work in which all operations must be completed (committed) to produce a consistent database.
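Several of the SQL-related entries in this part of the glossary (alias, ALTER TABLE, AND and the atomic transaction property) can be tried out in a few lines of code. The following is a minimal sketch using Python's built-in sqlite3 module. The PRODUCT and ACCOUNT tables, their columns and the sample rows are hypothetical and are not taken from the book's databases. SQLite supports ALTER TABLE ... ADD COLUMN but not the MODIFY keyword mentioned in the ALTER TABLE entry, so only ADD is shown.

```python
import sqlite3


def demo_alias_alter_and():
    """Show ALTER TABLE ... ADD, column/table aliases and the AND operator."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE product (p_code TEXT PRIMARY KEY, p_price REAL, p_qoh INTEGER)"
    )
    conn.executemany(
        "INSERT INTO product VALUES (?, ?, ?)",
        [("A1", 9.99, 20), ("B2", 45.00, 3), ("C3", 120.00, 50)],
    )
    # ALTER TABLE followed by ADD appends a new column to the table structure.
    conn.execute("ALTER TABLE product ADD COLUMN p_discount REAL DEFAULT 0")
    # "p" is a table alias; "code" and "price" are column aliases.
    # AND requires both conditions to be true for a row to be returned.
    rows = conn.execute(
        "SELECT p.p_code AS code, p.p_price AS price "
        "FROM product AS p "
        "WHERE p.p_price > 10 AND p.p_qoh > 10"
    ).fetchall()
    conn.close()
    return rows


def demo_atomic_transfer():
    """Show atomicity: a failed transfer leaves both balances unchanged."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE account ("
        "acc_id INTEGER PRIMARY KEY, "
        "balance REAL CHECK (balance >= 0))"
    )
    conn.executemany("INSERT INTO account VALUES (?, ?)", [(1, 100.0), (2, 50.0)])
    conn.commit()
    try:
        # The connection used as a context manager commits if the block
        # succeeds and rolls back if an exception is raised inside it.
        with conn:
            conn.execute("UPDATE account SET balance = balance + 500 WHERE acc_id = 2")
            # This second update would make the balance negative, so the
            # CHECK constraint rejects it and the whole unit of work is undone.
            conn.execute("UPDATE account SET balance = balance - 500 WHERE acc_id = 1")
    except sqlite3.IntegrityError:
        pass
    balances = conn.execute(
        "SELECT acc_id, balance FROM account ORDER BY acc_id"
    ).fetchall()
    conn.close()
    return balances


if __name__ == "__main__":
    print(demo_alias_alter_and())
    print(demo_atomic_transfer())
```

In the second function the credit to account 2 succeeds on its own, but because the matching debit fails, the rollback removes the credit as well. That all-or-nothing behaviour is what the atomic transaction property describes.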


atomicity
See atomic transaction property.

attribute
A characteristic of an entity or object. An attribute has a name and a data type.

attribute domain
See domain.

attribute hierarchy
A top-down data organisation that is used for two main purposes: aggregation and drill-down/roll-up data analysis.

audit log
A security feature of a database management system that automatically records a brief description of the database operations performed by all users.

authentication
The process through which a DBMS verifies that only registered users can access the database.

authorisation management
Procedures that protect and guarantee database security and integrity. Such procedures include user access management, view definition, DBMS access control, and DBMS usage monitoring.

automatic query optimisation
A method by which a DBMS finds the most efficient access path for the execution of a query.

AVG
A SQL aggregate function that outputs the mean average for a specified column or expression.

B

batch processing
A data processing method that runs data processing tasks from beginning to end without any user interaction.

batch update routine
A routine that pools transactions into a single batch to update a master table in a single operation.

BETWEEN
In SQL, a special comparison operator used to check whether a value is within a range of specified values.

Big Data
A movement to find new and better ways to manage large amounts of Web-generated data and derive business insight from it, while simultaneously providing high performance and scalability at a reasonable cost.

binary lock
A lock that has only two states: locked (1) and unlocked (0). If a data item is locked by a transaction, no other transaction can use that data item. See also lock.

binary relationship
An ER term for an association (relationship) between two entities. For example, PROFESSOR teaches COURSE.

bitmap index
An index that uses a bit array (0s and 1s) to represent the existence of a value or condition.

block report
In the Hadoop Distributed File System (HDFS), a report sent every six hours by the data node to the name node informing the name node which blocks are on that data node.

Boolean algebra
A branch of mathematics that uses the logical operators OR, AND, and NOT.

bottom-up design
A design philosophy that begins by identifying individual design components and then aggregates them into larger units. In database design, the process begins by defining attributes and then groups them into entities. Compare to top-down design.

boundaries
The external limits to which any proposed system is subjected. These limits include budgets, personnel, and existing hardware and software.

Boyce-Codd normal form (BCNF)
A special type of third normal form (3NF) in which every determinant is a candidate key. A table in BCNF must be in 3NF. See also determinant.

bridge entity
See composite entity.

BSON (Binary JSON)
A computer-readable format for data interchange that expands the JSON format to include additional data types including binary objects.

b-tree index
An ordered data structure organised as an upside-down tree.

bucket
In a key-value database, a logical collection of related key-value pairs.

business intelligence
A comprehensive, cohesive, and integrated set of tools and processes used to capture, collect, integrate, store, and analyse data with the purpose of generating and presenting information to support business decision making.

business rule
A description of a policy, procedure, or principle within an organisation. For example, a pilot cannot be on duty for more than 10 hours during a 24-hour period, or a professor may teach up to four classes during a semester.
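The AVG and BETWEEN entries in this part of the glossary describe behaviour that is easy to check directly. The sketch below uses Python's built-in sqlite3 module with a hypothetical PRODUCT table and made-up prices. It is intended only to show that AVG returns the mean of a column and that BETWEEN includes both ends of its range.

```python
import sqlite3


def avg_and_between():
    """Return the mean price and the product codes priced from 10 to 20 inclusive."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE product (p_code TEXT, p_price REAL)")
    conn.executemany(
        "INSERT INTO product VALUES (?, ?)",
        [("A1", 10.0), ("B2", 20.0), ("C3", 60.0)],
    )
    # AVG is an aggregate function: it collapses the column to one value.
    mean_price = conn.execute("SELECT AVG(p_price) FROM product").fetchone()[0]
    # BETWEEN 10 AND 20 is equivalent to p_price >= 10 AND p_price <= 20.
    in_range = [
        row[0]
        for row in conn.execute(
            "SELECT p_code FROM product "
            "WHERE p_price BETWEEN 10 AND 20 ORDER BY p_code"
        )
    ]
    conn.close()
    return mean_price, in_range


if __name__ == "__main__":
    print(avg_and_between())
```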

C

Call Level Interface (CLI)
A standard developed by the SQL Access Group for database access.

candidate key
A minimal superkey; that is, a key that does not contain a subset of attributes that is itself a superkey. See key.

cardinality
A property that assigns a specific value to connectivity and expresses the range of allowed entity occurrences associated with a single occurrence of the related entity.

cascading order sequence
A nested ordering sequence for a set of rows, such as a list in which all last names are alphabetically ordered and, within the last names, all first names are ordered.

centralised data allocation
A data allocation strategy in which the entire database is stored at one site. Also known as a centralised database.

centralised database
A database located at a single site.

centralised design
A process in which a single conceptual design is modelled to match an organisation's database requirements. It is typically used when a data component consists of a relatively small number of objects and procedures. Compare to decentralised design.

checkpoint
In transaction management, an operation in which the database management system writes all of its updated buffers to disk.

class
A collection of similar objects with shared structure (attributes) and behaviour (methods). A class encapsulates an object's data representation and a method's implementation. Classes are organised in a class hierarchy.

class diagram
A diagram used to represent data and their relationships in UML object notation.

class diagram notation
The set of symbols used in the creation of class diagrams.

class hierarchy
The organisation of classes in a hierarchical tree in which each parent class is a superclass and each child class is a subclass. See also inheritance.

client/server architecture
The arrangement of hardware and software components to form a system composed of clients, servers, and middleware. The client/server architecture features a user of resources, or a client, and a provider of resources, or a server.

client-side extensions
Extensions that add functionality to a Web browser. The most common extensions are plug-ins, Java, JavaScript, ActiveX, and VBScript.

closure
A property of relational operators that permits the use of relational algebra operators on existing tables (relations) to produce new relations.

cloud computing
A computing model that provides ubiquitous, on-demand access to a shared pool of configurable resources that can be rapidly provisioned.

cloud services
The services provided by cloud computing. Cloud services allow any organisation to quickly and economically add information technology services such as applications, storage, servers, processing power, databases, and infrastructure.

cohesivity
The strength of the relationships between a module's components. Module cohesivity must be high.

collections
In document databases, a logical storage unit that contains similar documents, roughly analogous to a table in a relational database.

column family
In a column family database, a collection of columns or super columns related to a collection of rows.

column family database
A NoSQL database model that organises data into key-value pairs, in which the value component is composed of a set of columns that vary by row.

column-centric storage
A physical data storage technique in which data is stored in blocks, which hold data from a single column across many rows.
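The column-centric storage entry can be illustrated without a database engine. The sketch below is a simplified, in-memory model and does not describe how any particular DBMS lays out disk blocks. It converts a list of row records into one list per column, so that an aggregate over a single column only has to read that column's values. The customer attributes and values are hypothetical.

```python
# Row-centric layout: each record keeps all of its attribute values together.
ROWS = [
    {"cus_code": 10010, "cus_lname": "Ramas", "cus_balance": 0.0},
    {"cus_code": 10011, "cus_lname": "Dunne", "cus_balance": 345.5},
    {"cus_code": 10012, "cus_lname": "Smith", "cus_balance": 536.75},
]


def to_column_store(rows):
    """Regroup row records so that each column's values are stored together."""
    columns = {}
    for row in rows:
        for name, value in row.items():
            columns.setdefault(name, []).append(value)
    return columns


def total_balance(columns):
    """Aggregate one column; the other columns are never touched."""
    return sum(columns["cus_balance"])


if __name__ == "__main__":
    store = to_column_store(ROWS)
    print(store["cus_lname"])
    print(total_balance(store))
```

The trade-off is that rebuilding one complete customer record now means reading from every column list. This is why the layout suits analytical queries over a few columns better than record-at-a-time processing.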

Glossary

COMMIT

The SQL command that permanently

conceptual

schema

saves

changes

conceptual

model,

data

Common server to

to

a database.

Gateway Interface

interface

perform

standard

specific

(CGI)

that

functions

See also conceptual

A Web

concurrency

uses

script

files

based

on

a clients

parameters.

completeness specifies

constraint

whether

each

entity

must also be a member The

completeness

Partial

be partial

not

Total completeness

be

some

members

means that

of at least

between

further

attribute

subdivided

and

example,

a phone

may be divided

simple

(615),

into

entitys

an exchange

code (2368).

primary

keys

Also known

two

primary

Compare

of the

1:M relationships. key

comprises

entities

as a bridge

that

entity.

it

key

computer-aided

(CASE)

systems

Development

conceptual data-modelling database

that

design

techniques

to

software-and

hardware-independent.

global

model

process. view

of an entire

2020 has

Cengage deemed

Learning. that

any

All suppressed

Rights

Reserved. content

does

May not

not

be

and

copied, affect

1:1, 1:M,

in a known

consistent

0.00 and 4.00.

because they help to

subquery

A subquery that executes outer

query.

both

provides describes

A query

optimiser

technique

COUNT A SQL aggregate function that outputs the number of rows containing not null values for a given column or expression, sometimes used in

objects are

optimiser

that uses an algorithm based on statistics about the objects being accessed, including number of rows, indexes available, index sparsity, and so on.

conjunction

with the

CREATE INDEX

a

indexes

the

on the

DISTINCT

clause.

A SQL command that creates basis of a selected

attribute

or

attributes.

details.

materially

database

each row in the

cost-based

both

a model of a

real-world

model

database

avoiding

include

objects are

The techniques

conceptual

Classifications

GPA must be between

once for

The output of the conceptual

The

main data objects,

of a

uses

create

as possible.

design

that

that represents

as realistically

with the

correlated

model

The techniques

A process

structure

a

real-world

hardware-independent.

conceptual

review

represents

as possible.

of the relationship

protocol.

create

software-and

database

Copyright

part or all of the

as realistically

data-modelling

Editorial

to

on a database.

coordinator The transaction processor (TP) node that coordinates the execution of atwo-phase COMMIT in a DDBMS. See also data processor (DP), transaction processor (TP), and two-phase commit

A process that uses

structure

conceptual

key.

engineering

techniques

entities.

Constraints are important ensure data integrity.

Life Cycle.

design

place

are working

The classification

students

See also linking

Tools used to automate

Systems

that takes

constraint Arestriction placed on data, usually expressed in the form of rules. For example, A

The at least

connects.

A multiple-attribute

A backup

consistent database state A database state in which all data integrity constraints are satisfied.

table.

composite

system

state. If not, the transaction will yield an inconsistent database that violates its integrity and business rules.

An entity designed to transform

M:N relationship

database

data integrity.

M:N.

begin For

as 615-898-2368

code

and a four-digit

entity

composite the

such

an area

attributes.

attribute.

composite an

additional

number

into

number (898), to

yield

multiprocessing

of

consistency A database condition in which all data integrity constraints are satisfied. To ensure consistency of a database, every transaction must

one

An attribute that can be

to

execution

subtype.

subtype.

composite

a

backup

connectivity

supertype

every supertype

must be a member

in

of the graphically.

A DBMS feature that

simultaneous

while one or more users

or total.

of any

transactions

concurrent

occurrence

one subtype.

can

means that

might

occurrence

of at least

constraint

completeness

occurrences

supertype

the

expressed

model.

control

coordinates

while preserving

A constraint that

A representation

usually


CREATE TABLE

A SQL command that creates

a table's

structures

attributes

given.

cross join (or

entity

the

and

product)

relationship

of two

a relational

the

product

reserved

corrected

data

is

not

that

uses

many

a three-pronged

sides

In

multidimensional

memory

area where

area that

blocks

to the

in

stores

RAM. fast

slower

the

cube

cache

number

of input/output

primary

and

OLAP, the

in

structure

SQL to

hold the

A cursor

are held.

speeding

data rows

used in returned

may be considered

memory

in

holding

which

query

columns

reserved

client

construct

up data

value in the

and rows.

memory

procedural

is

area in the

like

are

DBMS

to

store

held in

server,

before

querying

a graph

that

a

allows

database

query language

used in Neo4j

the

or facts

to reveal

Access

that

their

Visual

Objects

other

interface

The

relational-style

managing

entire

(DBA).

than

known

deciding

review

exposes

has

the

any

to

define the

subschema.

are

dependent

on

characteristics.

data resource,

have

functionality


it is

May

definition

as

and relationships. data

the

data

well as

A data

that

are

external

to

resource

metadata, data

about

contains

the

data.

and relationships. data

that

be

which


A data dictionary

are

external

to the

resource

DBMS.

dictionary.

data extraction

A process used to extract and an operational

from

a database.

scanned,

data

data from

database

and

in experience.

A named physical storage space that

a database's

data. It can reside in a different

directory on a hard disk or on one or more hard disks. All data in a database are stored in data files. A typical enterprise database is normally composed of several data files. A data file can contain rows

DBMS, the process

not

the

as well as their

validate

data file

resource

materially

Thus,

data definition

external data sources prior to their placement in a data warehouse.

authority

administrator

made to

not

data

that

Thus,

A DBMS component that

may also include

data fragments.


data.

may also include

characteristics

access

whether

database

been

the

about

DBMS. Also known as an information

dictionary

The person responsible

the

contains

stores

optimised

sources.

to locate


and

Also known as an information

A data abnormality in


administrator

A DBMS component

data dictionary

from

MS Access is

In a distributed

changes

2020

The language

manipulation

characteristics

stores

data anomaly inconsistent


an

as an information

where

be

dictionary.

(IRM).

data allocation


provides

or not. The DA has broader

Also

manager

databases

can be used to

(DA)

and responsibility

must

cannot

used to access

dBase

on which

data

the

computerised

of

that

DAO interface

data administrator for

and DAO

Jet data engine,

based.

so they

(DDL)

data storage

dictionary

end user.

An object-oriented

interface

programs.

programming

of the

(DAO)

MS FoxPro,

Basic

meaning they

query.

metadata, data

their

have not yet been

meaning to the

programming

MS Access,

data

database.

the Data

of each

A data condition in which data

and

physical

dictionary

application

a

not in the

D processed

data in

on its x-, y-, and

used,

schema,

data dictionary

Raw facts,

data

The location

are

a database structure,

stores

data

the

between

manipulate

are static,

they

by an ad hoc

representation

for

compared

minimising

operations

and

DBMS.

data dependence

A declarative

advantage

memory.

data definition language

an array

computer.

Cypher

memory

accessed

takes

memory, (I/O)

Data cubes

created

area of

stored,

Cursors

primary

data cube is based

be created

by a SQL query.

a reserved

output

most recently cache

The multidimensional used

z-axes. A special

address

database.

A shared, reserved

the

secondary

data cube

shared,

access.

cursor

but the in the

of the

data cubes

assists

all files

A buffer

secondary

multidimensional Using

moves,

in

or buffer cache

of a computer's

of the

relationship. cache

an employee

change

memory

tables.

A representation

diagram

to represent

cube

For example,

data cache

Foot notation

symbol

characteristics

A join that performs

Cartesian

Crow's

using


one or


more tables.


data filtering validate external data

A process used to extract and

data from data

an operational

sources

warehouse.

prior

See

that

allows

users fragment

object

to

can

placement

a system

be stored

in

a

senior

data processor

of a DDBMS

be broken

or fragments.

database,

supervising

programmers,

and troubleshooting

the program. Also known as a data manager (DM).

and

(DP)

The resident software

extraction.

A characteristic

a single

more segments

to their

data

data fragmentation

database


into

The object database,

at any

site

two

or

that

a DDBMS.

The

local

Each

data

on a computer

stores

the

and retrieves

DP is responsible

data in the

to that

might be a

or a table.

component

computer

data for

and

coordinating

data. See also transaction quality

validity,

the

access

processor (TP).

A comprehensive

accuracy,

through

managing

approach

and timeliness

to ensuring

of data.

network.

data redundancy

data inconsistency different

A condition in

versions

(inconsistent)

of the

is

storage

environment

different

by changes

in the

database

physical

data

the

data in the

and referential

data

management

data collection,

database

comply

functions

modification,

and listing.

that focuses

and retrieval. include

manager (DM)

all

on

Common

addition,

multiple

fragments

Data replication

sites

on a DDBMS.

is transparent

provides

fault

to the

tolerance

deletion,

filtered

See data processing (DP)

commands

that

allows

data in the

SELECT,

an end

database.

INSERT,

(DML) user

from the

external

and operational

data, and

query tool

The set of to

data warehouse An integrated, subject-oriented, time-variant, non-volatile collection of data that provides support for decision making, according

manipulate

The commands

UPDATE,

and

enhancements.

will be stored for access by the end-user for the business data model.

manipulation language

end

data store The component of the decision support system that acts as a database for storage of business data and business model data. The data in the data store have already been extracted and

data

manager.

the

(unnecessarily

data source name (DSN) A name that identifies and defines an ODBC data source.

constraints.

A process

storage,

with

at

of the

performance

database, a condition

integrity

management

data

redundant

which a data

The storage of duplicated

fragments

Duplication user.

In a relational

entity

data

in

data.

data replication

characteristics.

which

contains

duplicated)

A condition in which data

unaffected

data integrity in

data yield

A condition

results.

data independence access

same

which

DELETE,

include

COMMIT,

and

to

ROLLBACK.

Bill Inmon,

the acknowledged

father

of the

data

warehouse. data

mart

A small,

subset

that

group

of people.

data tools

provides

mining

decision

sources

support

data in a data and to

relationships

data

warehouse data

to a small

and

warehouse

proactively

identify

warehouse

and possible

database

anomalies.

administrator

responsible

data

model

complex

real-world

used in the Life


data structure.

database

design

(DP)

who

a department

has

Data

phase

data processing

2020

usually graphic, of a of the

evolved


into

managing


manager

technical

models are Database


human


Roles

second

resources,


for

(DBA)

planning,

The person

organising,

controlling,

and

database design The process that yields the description of the database structure and determines the database components. Database design is the

A DP specialist

supervisor. and

subject-oriented,

monitoring the centralised and shared corporate database. The DBA is the general manager of the database administration department.

Cycle.

include


A representation,

An integrated,

time-variant, non-volatile collection of data that provides support for decision making, according to Bill Inmon, the acknowledged father of the data warehouse.

A process that employs automated

to analyse

other

single-subject


phase


of the


Database


Life Cycle.


database design

development

The process of database

DataSet

and implementation.

In ADO.NET, a disconnected,

memory-resident

database fragment

The

A subset of a distributed

DataSet

representation contains

relationships, database.

Although

at

sites

different

set

the within

of all fragments

fragments

may be stored

a computer

network,

is treated

See also horizontal

as a single

fragmentation

the

database.

that

and vertical

study, and

clients

possible

tuning

requests

while

columns,

rows,

are

Activities to ensure

addressed

making optimum

as quickly

as

use of existing

resources.

Life Cycle (DBLC)

A cycle that traces

of a database

an information

history

system.

tables,

database.

constraints.

DBMS performance

the

fragmentation.

Database

and

of the

within

The cycle is divided

into

design, implementation

evaluation,

operation

A condition in

transactions the lock

six phases: initial

and loading,

testing

maintenance,

and

and

deadlock

deadly

which two

wait indefinitely on a previously

embrace.

for

the

locked

or more other

to release

data item.

Also

called

See also lock.

deadly embrace

See deadlock.

evolution.

decentralised database

management

collection

of programs

structure the

system

and

that

controls

(DBMS)

The

manages the

access

to the

conceptual

database

data

stored

an organisation's

in

database

middleware

Database

through

connect

and communicate

connectivity

which application

programs

large

with data repositories.

number

centralised database and

performance

procedures

time

of a database

an end-user minimum

query is

to

the

is,

processed

decision

response

to

ensure

by the

DBMS in the

The process of restoring

a previous

consistent

statement

in

database for the

an

security

security,

integrity,

a

that

system

management,

and recovery

An organisation

and

use

of the

the collection,

of data in

database

access

A type

to the

of lock

owner

works

online

for

batch

multiuser


storage,

Learning. that

any

of the lock

processes


to

and allows

the

database.

This

but is

unsuitable

for


(DSS)

An arrangement

See deferred update.


A process by which a table

from

a higher-level

dependency diagram dependencies (primary within a table.

that restricts

normal form

to


yields

A representation of all data key, partial, or transitive)

derived attribute An attribute that does not physically exist within the entity and is derived via an algorithm.
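A derived attribute such as an age can be computed on demand rather than stored. The following is a minimal sketch, assuming the age is derived from a stored birth date and a given current date; the function name `age_on` is hypothetical.

```python
from datetime import date


def age_on(birth_date: date, today: date) -> int:
    """Derive an age in whole years; the age itself is never stored."""
    years = today.year - birth_date.year
    # Subtract one year if this year's birthday has not yet occurred.
    if (today.month, today.day) < (birth_date.month, birth_date.day):
        years -= 1
    return years
```

Because the value is recomputed each time, it cannot become stale in the way a stored age column would.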

DBMSs.


Compare

a lower-level normal form, usually to increase processing speed. Denormalisation potentially yields data anomalies.

a database

one user at a time to access

lock


lock

procedures.

write technique.

is changed

of components

environment.

database-level

After

DELETE A SQL command that allows data rows to be deleted from a table.
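The DELETE command can be tried out with a short sketch that uses Python's standard `sqlite3` module and an in-memory database. The table `product` and its columns are hypothetical and serve only to illustrate the command.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE product (p_code TEXT PRIMARY KEY, p_price REAL)")
conn.executemany("INSERT INTO product VALUES (?, ?)",
                 [("A1", 5.0), ("B2", 12.5), ("C3", 40.0)])

# Delete only the rows that satisfy the WHERE condition.
cur = conn.execute("DELETE FROM product WHERE p_price < ?", (10.0,))
deleted = cur.rowcount  # number of rows removed by the statement

remaining = [row[0] for row in
             conn.execute("SELECT p_code FROM product ORDER BY p_code")]
```

Omitting the WHERE clause would delete every row in the table, so the condition deserves careful checking before the statement is run.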

The person responsible

backup,

and regulates

and

system

denormalisation

defines

only

of objects

deferred-write

or a transaction.

database.

database

requirements.

of

of a single SQL

program

officer

subsets

deferred update In transaction management, a condition in which transaction operations do not immediately update a physical database. Also called

state.

The equivalent

application

model

design.

support

deferred

database request

database

to

of computerised tools used to assist managerial decision making within a business.

that

of time.

database recovery

A process in which used

A set of activities

to reduce

system that

amount

database

tuning

designed

is

verification of the views, processes, and constraints, the subsets are then aggregated into a complete design. Such modular designs are typical of complex systems in which the data component has a relatively

database.

software

design design


For example, subtracting

the the

description provides

Age attribute birth

a precise,

design trap

with the

real

design trap is known

runs

value See

that

when a is

most

not

database that

Boyce-Codd

normal

values form

in that

whose

related

computer

systems

that the value

dimension used star

to

of attribute

tables

search,

schema.

filter,

or classify table

a given

that

dedicated

to

securing

disaster

subtype

are

availability

that

different

contains

Learning. that

following

and among

schema

The database

of a distributed

database

schema

as seen by the

administrator.

distributed processing Sharing the logical processing of a database over two or more sites connected by a network.

data processors

(DPs) in a distributed

distribution transparency A DDBMS feature that allows a distributed database to look like a single logical database to an end user.

a

failure.

subtype)

from

Document

In

that

and

any


one

the


pairs in


model

which the

of a tag-encoded

the syntax

rules

or valid tags for each type

of

domain In data modelling, the construct used to organise and describe an attribute's set of possible values.
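One common way to enforce a domain in an implemented table is a CHECK constraint. The sketch below uses Python's standard `sqlite3` module and assumes a hypothetical `student` table whose `stu_gpa` attribute has the domain 0.0 to 4.0; the names and the range are illustrative assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# The CHECK clause restricts stu_gpa to its assumed domain of 0.0 to 4.0.
conn.execute(
    "CREATE TABLE student (stu_id INTEGER PRIMARY KEY, "
    "stu_gpa REAL CHECK (stu_gpa BETWEEN 0.0 AND 4.0))"
)
conn.execute("INSERT INTO student VALUES (1, 3.2)")   # inside the domain

try:
    conn.execute("INSERT INTO student VALUES (2, 4.7)")  # outside the domain
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

count = conn.execute("SELECT COUNT(*) FROM student").fetchone()[0]
```

The out-of-domain value is refused by the DBMS itself, so the rule holds no matter which application issues the insert.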

See distributed

scanned,

A NoSQL database

XML document.

Also

(DDD).

(DDD)

materially

defines

(fragment

database.

data dictionary

databases

data in key-value

document type definition (DTD) A file with a .DTD extension that describes XML elements; in effect, a DTD file describes a document's composition and

A data

description

stores

value component is composed document.

another.

(DDC)

of a distributed

as a distributed


data

distributed transaction A database transaction that accesses data in several remote data processors (DPs) in a distributed database.

qualifying

integrity

a unique

distributed data dictionary data catalogue.

has

both

distributed

database.

entity set.

data catalogue

names, locations)

2020

a

processing

over interconnected

which are

global

several remote

A SQL clause that produces only a list

that

distributed

review

a

perspectives

(non-overlapping hierarchy,

DISTINCT

Copyright

data in

sites;

and

distributed request A database request that allows a single SQL statement to access data in

a one-to-many

additional

data

nonoverlapping

of values

Editorial

up.

The set of DBA activities

a specialisation

known

different

storage

sites.

distributed processing connected

B

A means

within

design,

or a database

subtype

dictionary

database

independent

tables.

provide

management

disjoint

is in

related

physically

fact.

disaster physical

facts

In a star schema

characteristics to

of attribute

B can be looked

with dimension

dimensions

determines

In a data warehouse, tables

The fact

relationship

A

the value

A logically

functions

description

the

knowing

database

independent

(BCNF).

a database

that

more

the

of logically

distributed

row.

The role of a key. In the context of

indicates

or

related

physically

several

governs

processing

determination

statement

more

database management system A DBMS that supports a database

database

table,

in two

across

DDBMS

several

other

A logically or

database

stored

distributed

common

computer.

determines

is

distributed (DDBMS)

identified

way that

in two

sites.

Any attribute in a specific row

directly also

a The

A single-user

on a personal

that

as a fan trap.

database

determinant

in world.

database

stored

distributed

activities

or incompletely

is

sites.

environment.

A problem that occurs is represented

consistent

that

and

of the

operating

is improperly

and therefore

distributed

by

date.

up-to-date,

description

an organisation's

relationship

current

A document that

detailed,

reviewed

desktop

the

of operations

thoroughly define

might be derived

date from


DO-UNDO-REDO a data

transactions log

protocol

processor

(DP)

A protocol

to roll

with the

help

back

used by

end-user

or roll forward

of a system's

create

transaction

from

entries.

To decompose

components, that

is,

aggregation.

data into

data

at lower

This approach

more atomic

levels

the

support

system

to focus

geographic

areas,

business

types,

and

in

entity

on specific and

so

on.

as tables,

indexes,

and

ERD.

users.

Permanently deletes an index

DROP TABLE

Permanently

The transaction

the permanence Transactions a system

have

failure

been

if the

consistent

completed

database

the

the

SQL

most

database.

Contrast

strategy

at run

information

with static

query

An entity

SQL

statement

is

generated

An environment

not known in

at run

a program required

can

time.

key

entity

the

to

is

that

are

entity

resulting

relationship

semantic

embedded

in

SQL

application

end-user


and

and

end-user


provide

any

additional

contained such

tool

presents

All suppressed

as

within COBOL,


relationships

in experience.

whole

model

with the

developed

by P.

A diagram models

entities,

model, a grouping

of

In a generalisation/specialisation of an entity the

subtypes

contain

supertype.

common

the

unique

The entity

characteristics

and

characteristics

of

entity.

In a generalisation/specialisation

a generic

entity

A join

condition tables.

Due

of the

to


that

of entity

operator

columns

Learning

type

characteristics

on an equality

part.

was

relationship

contains

Cengage

1:M, and level

diagram (ERD)

a subset

or in

A data

(1:1,

conceptual

In a relational

equijoin

data compiled

be

values.

and relations.

subtype

hierarchy,

A data analysis tool


table

value in a

no null

model (ERM)

The

entity supertype

tool.

Rights

table

entities.

common

selected

has

hierarchy,

each

languages

of a relational

supertype the

ColdFusion.

query


of extended

model.

SQL statements

presentation

organises

by the

that

ER

programming

C++, ASP, Java,

that

the

a specific

has a unique

at the

an entity

set

entity

application

in

1975.

related

The entity relationship the

concepts

content

of a

nodes.

from

virtual

an entity

occurrence.

key

(ER)

entities

depicts

entity

diagram

considered

See entity instance.

entity relationship

queries.

In a graph database, the representation

EER diagram (EERD)

the

describes

attributes,

between

each entity

and that

of ER diagrams.

Chen in

E relationship

abstract

actually

ER modelling,

relationship

help

that

edge

is

not

The property

M:N) among

environment,

SQL statements

ad hoc

In

occurrence

entity

SQL

but instead SQL

a single

the

which the

a dynamic

it is

in the

by combining

into

as an entity

guarantees

primary

optimisation.

advance,

In

generate

to respond

in

attribute.

ERD.

model that dynamic

also

or event for

and relationships

cluster

because

Also known

that

time,

about

data present

entity type used to

entities

entity integrity

be lost

durability.

The process of

access

up-to-date

proper

See

entities

interrelated object.

row.

state.

will not

has

for

concept,

cluster is formed

entity instance

property that indicates

query optimisation

determining using

be stored.

A virtual

or abstract

deletes a table (and its

of a database's

that

support

needs.

multiple

the final

dynamic

can

An entity

entity

data)

durability

provides

future

data

information

The overall company

which

represent

used to delete database

views,

desired

A person, place, thing,

multiple

DROP INDEX

in

database

entity cluster

such

access

See

up.

A SQL command

A data analysis tool used to

that

store.

expected

which

objects

data

representation,

of

is used primarily

a decision

DROP

queries

enterprise

drill down

also roll

query tool

the

contains

that links that

party additional

content

tables

compares

may content

the

subtypes.

be

specified


based


eventual

consistency

consistency

in

propagate

A model for database

which

through

updates

the

to the

system

so that

all

will be consistent

eventually.

exclusive lock

A lock that is reserved

transaction.

An exclusive

transaction item

requests

and no locks

lock

to

copies

An exclusive

transactions

to

lock

access

the

a

data

a data

data item does

not

existence

other

entities.

In

such

first

cannot

table

because

reference

the

a table

In

a subquery

explicit

cursor

hold

the

return

returns

or

model

entities.

operator

that

that

checks

entity

relationship

semantic entity

model;

that

facts

may

return

zero

constructs, subtypes,

such

and

entity relationship

extended relational that

includes

features

the

in

an inherently

facts linked

with

that

used


permits


to

Unlike other the


to

A


that

their

sales

measurements

figures

costs,

aspect

are

product

used in

units,

table.

business

represent

numeric

or service

business

prices,

data

and revenues.

is

not

expressed

loop

produce

Social


bank


even if

electronic reserves

a network

that

place,

rights, right

some to

third remove

defines

additional

thus

entities

Analysing stored data

a characteristic For example,

address,

constitute

party

entities, other

or numeric character

number, all

the

results.

or thing.

Security

when one entity

model.

processing

An alphabetic

the

among

in the

actionable

balance

occurs with other

an association

person,

data

A feature that allows of a DDBMS,

1:M relationships

of characters

manipulate

of a document's

be

dimension

a specific

A design trap that

feedback

to the

markup languages,

manipulation

through

table

fails.

field and

associated

operation

producing

in size automatically

represent

each

Facts commonly

is in two

increments.

data elements. XML

refers

and classified

For example,

fan trap

database

Markup Language (XML)

metalanguage

original

that

of data files to expand predefined

data

A fact table is in a one-to-many

represent

continuous

best

relational

In a DBMS environment,

Extensible


simpler

warehouse,

failure transparency

A model

models

extends

using

to the

model (ERDM)

environment.

of

the star schema

analysis include

node

object-oriented

of the

more

model.

data

focus,

In a data

In a data warehouse, the

sales.

supertypes,

clustering,

structural

ability

of adding

as entity

entity

(ER)

result

view

The specific representation

dimensions.

(values)

or

model

the

business

with a data subset

view of the

relationship

SQL, a cursor created

Sometimes referred to as the enhanced

relationship

Given its

view; the end users

contains

common

or activity.

entity

programmers

schema.

measurements

extended

over the

F

only one row.

(EERM)

data

of structured

environment.

when referencing

could

exchange

of an entity

statement

but

languages,

and invoices

works

schema

an external

exist.

any rows.

of a SQL

more rows,

orders

A manipulate

of a document's

the

environment.

fact table

In procedural

markup

The application

database

external

key

yet

other

as

model

data

global and

table.

output

two

first

SQL, a comparison

whether

not

over the

and

manipulation

such

an external

one or more related

must be created

of the

the

A property

an existence-dependent

EXISTS

does

of structured

internet.

more

must be created

that

can exist apart from a table

or

existence-dependent

existence-independent

Such

one

an environment,

existence-independent loaded

on

Unlike

the

documents

also

A property of an entity

depends

exchange

and invoices

to represent

XML facilitates

external

existence-dependent

used

elements.

elements.

allow

See

the

orders

Markup Language (XML)

XML permits

by any

database.

as

internet.

lock.

whose

such

metalanguage

when

update

are held on that

transaction.

to

data

XML facilitates

documents

by a

is issued

permission

other

that

will

Extensible

other

shared

elements.

database


or group of a

a person's

phone

number,

and

fields.


field-level

lock

transactions

to

require that

the

row.

A lock that allows concurrent access

use

of lock

data access

computer

same

of different

This type

multiuser

the

row

as long

fields

(attributes)

yields

the

but requires

running

as they

under

find()

A MongoDB

from

of

fully replicated

normal

depicted and

in tabular

a primary

flags

It

describes

format,

with

be used

All nonkey

dependent

a required

on the

absence

nulls in

key (FK) whose

another

fourth

or

must

whose

no

multivalued

as a single

database

values

must

and

memory location.

on graph

See

data

key.

full functional an attribute

it is

periodically

divided

It

systems

that

has

backup

dependent

types


that

Data

are

All suppressed

a

stores

data

ensures disaster

does

at their

be atomic

data.

model based

on relationship-rich

and

combined in

edges.

with

any

a SELECT

a

at the

of the

statement.


conceptual

that

over

data

scanned, the

overall

which a used in

changes

on the

in

database

the

the

design

level.

duplicated, learning

In the a signal

node the

in


Hadoop

name

to

selected

node

is

still

rows.

Distributed

node

to

File System

seconds notify

from the

the

name

node

available.

transparency

to integrate

one logical

experience.

to restrict

sent every three

to the

data

a system

models

or

A clause applied to the output of a

heterogeneity

a

management

affect

hardware

Therefore,

BY operation

heartbeat

management

copied,

on the

will have no effect

GROUP

data

and relational)

does

A condition in

depend

or

A system that

different

not

implementation.

(HDFS),

database

Reserved. content

by the

A SQL clause used to create frequency when

hardware

key.

support

Rights

determines

stored

said to

of nodes

functions

models

in

on a composite

of database

different

may even

Cengage

of the

network,

supports

deemed

model

A condition in which

DDBMS

(hierarchical,

network.

row.

A NoSQL database

theory

HAVING

dependence

different

A

H

into

failure.

heterogeneous

2020

updated

A full

key but not on any subset

to

of detail represented

a table's

as a collection

distributions

copy of an entire

is functionally

integrates

in

GROUP BY

of

database

of all data after a physical

integrity

A determines

B. The relationship

hardware independence

separate

database

of attribute

equivalent

of granularity

database

be null.

a distributed though

A complete

full recovery

A is

R, an

on an attribute

A DDBMS feature

to treat

saved

review

graph

A table that is in 3NF

even

on

The level

stored

key in

sets

of also

written as A → B.

level

aggregate

database

Copyright

values

primary

independent

See

Within a relation dependent

of attribute

dependent

granularity

the

more fragments.

full backup

Editorial

the

transparency

a system

systems

may to

sites.

database.

one value

lowest

match

multiple

allows

fully

Flags

attention

multiple copies

multiple

G

dependencies.

fragmentation

or

values.

stores at

dependence

B, and is

key.

a table.

normal form (4NF)

contains

B is

in

An attribute or attributes in one

values

table

attributes

by designers

by bringing

that

fragment

Aif and only if a given value groups

primary

In a DDBMS, the

B is functionally

exactly

alert end users to

or encode

prevent

of a value

foreign table

response,

conditions, to

database

database

attribute

a relation

no repeating

as

DDBMS and homogeneous

database

partially replicated

documents

stage in the

Special codes implemented

specified

two

The first

key identified. are

to trigger

that

(1NF)

process.

relation

and

each

functional

form

normalisation

the

of related records.

a collection.

first

such

microcomputers.

DDBMS.

most flexible

a high level

method to retrieve

systems,

and

See also heterogeneous

within

overhead.

A named collection

computer

minicomputers,

distributed

file

different

mainframes,

several

A feature that allows centralised

DBMSs

into

DDBMS.


heterogeneous integrates

DDBMS

different

management

A system that

types

of centralised

systems

over

heterogeneous

distributed

heterogeneous

DDBMS)

implicit

database

a network.

See

database

also fully

system

in

returns

(fully

and homogeneous

cursor

created

IN

DDBMS.

whose basis

for

An early database

concepts

and

subsequent

model is

in

model

basic

based

database

record

is the

root

relationship

is called

segment.

to the

tree

Each

segment

problem

has

DDBMS

only one type

of centralised

system

over a network.

DDBMS

and fully

database

system

A system

yielding

heterogeneous

different

be avoided.

relational

homonyms

existence

has

index

arises

a transaction-calculating

when

horizontal

fragmentation

database

generally

either

index

should

automatically

alerts

the

design

process rows.

in the

array

host language

be used in

The distributed that

breaks

See also

a table

database

its

primary

processing.

meaning.

Information

keys

are

the

primary

immediate

engineering

A methodology


dependent

A relationship

of corporate

parent

Rights

during


that

Also

strategic

making.

goals into

IE focuses

data instead

on the

of the

resource

that

helpful

description

processes.

the

relationship

primary

key


its


overall

the

execution,

commit


as the future

or


in

whole

engineering

basis for

of the

The

(IE) process

planning,

information

of an object

methods

ability

and

point.

experience.

(ISA)

developing,

and

systems.

In the object-oriented

inheritance

a transactions

architecture

of the information

ability

See data

to inherit

classes

data model,

the

data

structure

above it in the

class

hierarchy. See also class hierarchy.

contains

entity.

reaches

systems

serves

and

manager

(DA).

Inheritance

called

A database update that is

the transaction

All

output

are

in which

identifying

entity's

key of the

suppressed

(IE)

a company's

controlling

existence-dependent.

immediately

before

of

information

in tables.

or strong

update

performed

consists

into fragment

model, such identifiers

relationship

entities

that

as

A measure of how likely an index query

decision

In an ERM, unique names of each entity to

Cengage

used to

Also known

data and facilitates

translates

In the relational

because

deemed

are generally

data retrieval.

and row

transformed

information

relationship

has

key values

statements.

a strong

2020

of index

The result of processing raw data

administrator

identifying

review

the last

See index.

information

Any language that contains

SQL

identifiers


since

makes the

I


data,

key.

information

even

database

Indexes

selectivity

is to

fragmentation.

embedded

the

the

backup.

data and applications.

related

of data

A process that only backs up

changed

or full

key

index

user

See also synonym.

of unique

vertical

mapped

a set

are updating

up and facilitate

to reveal

instance.

over

control

results.

An ordered

speed

DDBMS).

or automatically

adjustments.

and

values.

A concurrency

ID values (pointers).

distributed

software

and

appropriate

subsets

used to check

of specified

management

heterogeneous

Homonyms

Some

for

to their

a list

The use of the same name to label attributes.

checks

operator

among

backup

incremental

an index

homonym

statement

retrievals

erroneous

incremental

See also heterogeneous

(fully

SQL

a 1:M

it.

that integrates

database

is

(aggregate) functions

data that

homogeneous

when the

value.

while other transactions

The top

below

that

summary

structure

segment

directly

the

This

a segment.

a value

inconsistent

formed

development.

on an upside-down

which each record

model

characteristics

one

SQL

In SQL, a comparison

whether

hierarchical

A cursor that is automatically

procedural

only


inheritance In the object-oriented data model, the ability of an object to inherit the data structure and methods of the classes above it in the class hierarchy.


inner join A join operation in which only rows that meet a given criterion are selected. The join criterion can be an equality condition (natural join or equijoin) or an inequality condition (theta join). The inner join is the most commonly used type of join. Contrast with outer join.

and stored

advantage
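The inner join entry can be illustrated with a small sketch. It uses Python's standard sqlite3 module and an in-memory database; the AGENT and CUSTOMER table layouts, the column names and the sample rows are assumptions made for illustration only. The first query is an equijoin (equality criterion) and the second is a theta join (inequality criterion). In both, rows that do not meet the criterion, including the customer with a null agent code, are left out.

```python
import sqlite3


def inner_join_demo():
    """Return (equijoin rows, theta-join rows) for two tiny sample tables."""
    conn = sqlite3.connect(":memory:")
    # Hypothetical sample schema and data, not taken from the book.
    conn.executescript("""
        CREATE TABLE agent (agent_code INTEGER PRIMARY KEY, agent_name TEXT);
        CREATE TABLE customer (
            cus_code   INTEGER PRIMARY KEY,
            cus_name   TEXT,
            agent_code INTEGER
        );
        INSERT INTO agent VALUES (501, 'Alby'), (502, 'Hahn'), (503, 'Okon');
        INSERT INTO customer VALUES
            (10010, 'Ramas', 502),
            (10011, 'Dunne', 501),
            (10012, 'Smith', NULL);
    """)
    # Equijoin: only rows whose agent codes are equal are selected.
    equi = conn.execute("""
        SELECT c.cus_name, a.agent_name
        FROM customer AS c
        INNER JOIN agent AS a ON c.agent_code = a.agent_code
        ORDER BY c.cus_code
    """).fetchall()
    # Theta join: the join criterion is an inequality.
    theta = conn.execute("""
        SELECT c.cus_name, a.agent_name
        FROM customer AS c
        INNER JOIN agent AS a ON c.agent_code < a.agent_code
        ORDER BY c.cus_code, a.agent_code
    """).fetchall()
    conn.close()
    return equi, theta


equi_rows, theta_rows = inner_join_demo()
print(equi_rows)
print(theta_rows)
```

The customer without an agent appears in neither result, which is the behaviour an outer join would change.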

compiled

criterion

create is

their

many

with

is its

applications

application

input/output

(I/O) request

that reads devices

A low-level

or writes data to

such

as

memory,

and from hard

operation

Java

sources,

video,

sources,

disks,

and

of

one

A SQL command that allows the insertion or

more

data

rows

into

a table

using

by

a

model

In

data abstraction a specific

internal

by the

model

requires

models

internal

adapts the

a designer

the

chosen

to

words,

match

the

conceptual

database

of

join

environment,

pools

and inconsistent

of an internal

to

data

has

created

text

key

duplicated,

managed

by

of a database

transaction

until the

first

several

ways.

primary

key (PK),

of

See also


The

values.

Notation)

A human-readable

data interchange

also

programming language

Microsystems software.


Java


that


runs

that

defines

not

be


concept

of

may be classified

on the

in

superkey,

candidate

key,

key,

and foreign

The attributes that form

key.


and


indicators

that

sales

earnings

Learning

attribute.


In business

or scale-based

a company's

in reaching

Examples by

per

(KPIs)

numeric

assess

goals.

turnovers, are

or

a primary

The attributes that form a primary prime

or success

operational

on top

applications

materially

based keys

quantifiable

effectiveness

review

similar

secondary

key performance

An object-oriented

Copyright

on

a Hadoop

key.

one

J

Editorial

See

measurements

Web browser

for

dependence;

intelligence,

of the

and report in

share

Object

key attribute

is not

A process based on repetition

Sun

place,

in

procedures.

by

jobs

An entity identifier

functional

iterative

developed

event takes

page

and values in a document.

key attributes

Java

embedded

with the

Columns that join two tables.

format

key.

and

is

an object.

monitor,

generally

ends.

steps

design

K

often

and

transactions

process

on

processing

columns

by

a value.

used by one transaction

other

code

downloaded

to

In the old file system

A property

available

developed

operator used to

of independent,

which a data item

distribute,

JSON (JavaScript

supported

departments.

isolation

click

column(s)

attributes

an attribute

files.

JavaScript

when a specific

data

environment.

to those

constructs

tabular

Web authors

and then

a

of data

A central control program used

accept,

join

A representation

allows

mouse

MapReduce

model.

of information

different

to

the internal

and constraints

In SQL, a comparison

whether

islands

of a database

other

as a

allows

databases,

and text

Websites.

job tracker

The

in

An

that

A scripting language

Web pages,

such

model

them

with a wide range

relational

that

and activated

of

database.

IS NULL check

In

implementation

using

a level

conceptual

representation

DBMS.

schema

model

modelling,

model for implementation.

characteristics

selected

the

that

DBMS

model is the

as seen

the

database

run

(JDBC)

interface

spreadsheets,

Netscape

in

internal

to

including

interactive

subquery.

main

developers

and then

Connectivity

to interact

JavaScript

INSERT

Java's

application

once

programming

program

computer

printers.

Web server.

to let

environments.

Java Database

outer join.

on the

ability

of

promotion,

strategic

KPI are sales

and

product

by employee,

share.


key-value

A data model based on a structure

composed in

of two

which

every

of values.

The

associative

has

data

and

data

also

value

enable

or set called

A NoSQL

component

granularity

take

place

database

page,

of key-value is unintelligible

The body

a specific

familiarity,

subject.

awareness,

information

as it

characteristic

of information Knowledge

and

applies

is that

and facts

understanding

to

new

implies

an

environment.

knowledge

manager

design

the

L left

outer join

join

that

those

yields

that

table.

In a pair of tables to be joined, all the

rows

no

matching

have

For example,

with AGENT including

ones that

row.

LIKE

In

whether

values

outer

outer

an attributes

string

pattern.

linking

table

a

requirements

and right

value

rows,

matches

logical

check

the

In the relational an

M:M relationship.

See

also

composite

model

mapping transparency

DDBMS in

which

A property

database

access

of a

requires

the

lost

end

know

fragments.

See

location which

database name

locations

lock

and location

requires

database

transaction

requires

See

(Fragment


model for

system,

such as DB2,

Access,

be changed (The

or Ingress.

which the

without

internal

a

affecting

model is

because it is unaffected which

the

in

software

storage

affect

devices

the internal

A concurrency

updates

by

is installed. or operating

model.)

control problem in

are lost

during

the

concurrent

of transactions.

mandatory

also local

transaction

a lock

used to translate

the internal

M

the user to know fragments.

be known.)

a particular

on

data

or Ingress.

transparency.

one

A device that guarantees unique use of a in

is

A condition in

model.

will not

execution

A property of a DDBMS in

access

not

of the

transparency.

data item

has

name

also location

of the

need

mapping

2020

the

transparency

only the

review

both

can

updates

which user to

DB2,

and is therefore

design

management

a change

systems

a

as

design to the

Oracle, IMS, Informix,

computer

such

Access,

DBMS

into

independence

Therefore,

entity.

local

design

conceptual

the

model for

system,

Logical

hardware-independent

model, a table that

the internal

selected

database

internal

a specified

used to translate

A stage in the design phase

conceptual

SQL Server,

outer join.

is

Oracle, IMS, Informix,

of the

selected

used to

into

phase

DBMS and is therefore design

matches the conceptual

the

locks.

design

management

software-dependent.

matching

operator

text

other

of CUSTOMER

have

join

SQL, a comparison

implements

Copyright

design

design

including

table,

design to the

Logical

database

that

CUSTOMER

do not

See also

table, in the

join

will yield all of the

the

AGENT

a left

in the left

A stage in the

conceptual

logical

a

and releasing

of the selected

SQL Server,

use. Locking can database,

The way a person views data.

software-dependent.

old knowledge.

for

(attribute).

matches the conceptual

requirements

be derived

to

data item

of lock

levels:

assigning

logical that

execution

the

A DBMS component that is

for

selected

Editorial

and field

data format

A key

can

following

logical

of

to lock

The level

at the

row,

lock

to

operations

use.

responsible

knowledge

from

own

after the

transactions

lock

DBMS.

about

other

their

the

model.

data as a collection

which the

lock is released

a value,

value

model is

databases

stores

a key

a corresponding

key-value

(KV)

model that

the

key

elements:

or attribute-value

Key-value

pairs in

data


prior


operation. to


data


A

access;


participation occurrence

in another

EMPLOYEE

works in

company's


have

entity.

For example,

a DIVISION.

without

in which

a corresponding

being

(A

an

person

assigned

cannot

to

a

division.)


A relationship must

occurrence

be an employee

the

whole

entity


many-to-many (M:N or *..*) relationship Associations among two or more entities

in

which

associated

entity

one

with

occurrence

many

and one occurrence

associated

map

with

is

of the related

many occurrences

data into

must

entity is

of the first

subtask

within

Mapper

a larger

entity.

contains

the

rows

stores

but

SQL the

created

first

rows

materialised when the

actual

the

the summary

MAX

and

to

rows.

time

the

is run

are stored in the table.

view rows

module

base tables

are

attribute

and

updated

metadata

in

a given

and relationships.

See

also

global,

data

method

In the object-oriented set

of instructions

Methods represent invoked

to

actions,

and are

linked

platform

for

characteristic

of interest

regardless

the

development

any type

of operating

that

of

distributed,

applications of data

over

in rows

and

(columns).

unit,

and is

a system.

(2)

An

which modules are

to

each transaction.

The

uses

database management A database management proprietary

system

attribute

value

and

all that

elements

be

defined

defined in the database

in

arrays of n dimensions

is

required in the

model

has


In

by database and

other

(MPSD)

words,


of online

processing analytical

database

system


with support

for

processors


A scenario

in

single-site which

data

multiple

processes

sharing a single data

elements

by at least

be

processing,

run on different computers repository.

transactions

all data

must be used

suppressed

management

multiple-site

multivalued attribute An attribute that can have many values for a single entity occurrence.
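A minimal sketch of one way to handle a multivalued attribute follows. It takes the degree string used in this glossary's EMP_DEGREE example ("BBA, MBA, PHD") and stores each degree as its own row in a separate table. The table layout, the employee number 101 and the name are assumptions for illustration, and splitting into a separate table is only one of the possible designs.

```python
import sqlite3


def load_degrees(raw="BBA, MBA, PHD"):
    """Split a multivalued degree string and store one row per degree."""
    conn = sqlite3.connect(":memory:")
    # Hypothetical schema: one EMPLOYEE row, many EMP_DEGREE rows.
    conn.executescript("""
        CREATE TABLE employee (emp_num INTEGER PRIMARY KEY, emp_name TEXT);
        CREATE TABLE emp_degree (
            emp_num INTEGER REFERENCES employee (emp_num),
            degree  TEXT,
            PRIMARY KEY (emp_num, degree)
        );
        INSERT INTO employee VALUES (101, 'Lewis');
    """)
    # The single string holds many values for one entity occurrence.
    degrees = [part.strip() for part in raw.split(",")]
    conn.executemany(
        "INSERT INTO emp_degree VALUES (?, ?)",
        [(101, degree) for degree in degrees],
    )
    rows = conn.execute(
        "SELECT degree FROM emp_degree WHERE emp_num = ? ORDER BY degree",
        (101,),
    ).fetchall()
    conn.close()
    return [row[0] for row in rows]


print(load_degrees())
```

Each degree can now be queried or counted on its own, which is awkward when all values share one string.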

one

transaction.

2020

known

column.

needed.

model,

online analytical

multiple data processors and transaction at multiple sites.

Defined as All that is needed is

is there

to store

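multiple-site processing, multiple-site data (MPMD) A scenario describing a fully distributed database management system with support for multiple data processors and transaction processors at multiple sites.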

network

and programming

a given

techniques

aimed

any

A SQL aggregate function that yields the

minimal data rule

review

several

attributes

produce

An extension

database

minimum

Copyright

of the

processing to multidimensional management systems.

language.

Editorial

into

to the

A component-based

interoperable

manipulating

data

to

multidimensional

heterogeneous,

must

of horizontal

data fragmentation,

divided

as an autonomous

(MOLAP)

all

one

(1) A design segment that can be

data in matrix-like as cubes.

In a data warehouse, numeric facts that

Microsoft .NET framework

there,

for

may be

unique timestamp

system

user.

MIN

A combination

multidimensional system (MDBMSs)

an action.

messages.

measure a business

at

elements

by at least

data model, a

perform

real-world

through

metrics end

transactions

all data

monotonicity A quality that ensures that timestamp values always increase. (The time-stamping approach to scheduling concurrent transactions assigns a global, unique timestamp to each transaction. The timestamp value produces an explicit order in which transactions are submitted to the DBMS.)

dictionary.

named

words,

column.

Data about data; that is, data about

characteristics

and

module coupling The extent to which modules are independent of one another.

updated.

value

other

database

must be used

has a subset

sometimes

A SQL aggregate function that yields the

maximum

In

information system component that handles a specific function, such as inventory, orders, or payroll.

The

are automatically

by model,

strategies

implemented

materialised

query

model

a table

each row

generate

The

required in the

needed.

transaction.

vertical

which

command

is

mixed fragmentation

pairs as a

A dynamic table that not only query

Defined as All that is needed is

is there

elements

be defined

database

job.

view

all that

defined in the

A program that performs a map function.

materialised

view is

a set of key-value

and

all data

of a related

The function in a MapReduce job that sorts

and filters

data

of an entity

occurrences

minimal data rule there,


For example, store

the

an EMP_DEGREE

string

different

BBA,

degrees

MBA,

requires

NoSQL

might

to indicate

three

A new generation of database

systems

held.

mutual consistency that

attribute

PHD

that

database

rule

all copies

A data replication

of data

fragments

rule

to

that

database

mutual exclusive rule one transaction on the

A condition in

at a time

same

can

own

an

exclusive

on the

management

traditional

relational

is

not

based

on the

management

traditional

relational

model.

NOT

which only

based

A new generation of database

systems

identical.

not

model.

NoSQL

be

is


A SQL logical operator that negates a given predicate.

lock

object.

null In SQL, the absence of an attribute value. Note that a null is not a blank.
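The difference between a null and a blank can be checked with a short sketch. It uses Python's standard sqlite3 module; the CUSTOMER table, the column names and the three sample rows are hypothetical. One row holds a null, one holds an empty string and one holds a real value. The sketch also shows that comparing with `= NULL` matches nothing, which is why SQL provides the IS NULL operator.

```python
import sqlite3


def null_vs_blank():
    """Return customer codes matched by IS NULL, by = '' and by = NULL."""
    conn = sqlite3.connect(":memory:")
    # Hypothetical table: cus_initial may be missing (null) or blank ('').
    conn.execute(
        "CREATE TABLE customer (cus_code INTEGER PRIMARY KEY, cus_initial TEXT)"
    )
    conn.executemany(
        "INSERT INTO customer VALUES (?, ?)",
        [(1, None), (2, ""), (3, "K")],  # None is stored as a SQL null
    )

    def codes(where):
        sql = f"SELECT cus_code FROM customer WHERE {where} ORDER BY cus_code"
        return [row[0] for row in conn.execute(sql)]

    is_null = codes("cus_initial IS NULL")   # absence of a value
    is_blank = codes("cus_initial = ''")     # a value that happens to be empty
    eq_null = codes("cus_initial = NULL")    # comparison with null is never true
    conn.close()
    return is_null, is_blank, eq_null


print(null_vs_blank())
```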

N natural join

A relational

by selecting their

only the

common

natural

identifier

implies, forms

of time

in

object

A generally

real-world

objects.

to

day-to-day

for

point

A to

a data

packet

point

end users

to

objects

name

and

network

model

late

1960s

of record with

that

adds

a round

types

an owner

and

part

of

next-generation

record

type

and

a

member

record

(O/R

in a 1:M relationship.

network nodes

partitioning

become

network

node single

The delay imposed

suddenly

unavailable

when

due to

by

a

the

failure.

relationship

which

the

entity

does not contain

primary

entity.

non-key

many relational relational

database

was the

provide

for the

a unified

development

of

management

based

championed

researchers,

response

many within

extended

The ERDM,

database

models

on the

system

of the

to

the

constitutes OODM.

object-oriented

an inherently

This models

simpler

relational

structure.

instance.

non-identifying

parent

A DBMS

model (ERDM).

best features

of a

OLE-DB

to

database

model includes

In a graph database, the representation entity

data.

that

accessing

applications.

DBMS)

relational

Object

middleware for

strategy

framework

Database

Component

database

non-relational

object/relational

type

for

functionality

Microsoft's

object-oriented

sets

with other

Microsoft's

OLE-DB is

represented

as predefined

on

object-oriented

A data model standard created in and relationships

of a real-world

embedded

and Embedding

Based

relational

as a collection

identity,

and the ability to interact

Model (COM),

by the amount

B.

data

a unique

and itself.

(OLE-DB)

first

the

has

Object Linking

vocabulary.

make

An abstract representation that

properties,

As its

business

The delay imposed

required

from

values

entity

key is familiar

of their

network latency trip

common

identifier) for

a natural part

with

attribute(s).

key (natural

accepted

O

operation that links tables

rows

key

of the

dependent

the

Also known

attribute

A relationship primary

object-oriented

in

(many

model

whose

basic

model (OODM) modelling

A data

structure

is an object.

side)

object-oriented database management (OODBMS) Data management software

key of the related

as a weak relationship.

See nonprime

data

to

attribute.

manage

data

in

an

object-oriented

system used

database

model.

non-prime

attribute

An attribute

that is

not part

of

one-to-many relationship

a key. normalisation to

entities

A process

so that

that

assigns

data redundancies

attributes

are reduced

entities

or


are


used

among two or more

by data

models.

one entity instance

many instances


that

relationship,

eliminated.


(1:M or 1..*) Associations


of the


with

entity.


a 1:M

is associated

related

content

In


one-to-one

(1:1 or 1..1) relationship

among

or

two

models. is

In

more

entities

that

a 1:1 relationship,

associated

with

only

Associations

are

one

used

entity

one instance

by

outer join

data

instance

of the

related

entity.

a table

analytical

support

system (DSS)

data analysis data

processing

analysis

making,

tools

techniques.

that

Decision

use

retained;

unmatched

Contrast

null.

that

modelling,

an advanced

supports and

transaction

systems

that

processing

operations

support

operations.

databases,

operational

that

support

retained;

unmatched

Contrast

overlapping The

developed

database

access

operational primarily

by

API to

database

to

support

operations.

hints

command

that

provide

(row)

disk

a

optimistic assumption

that

technique

most

for the the

SQL

such

pairs

related

are

table

are

See also left

outer

subtypes a condition

In a

in

which

supertype

can

each

appear

in

as a directly

A diskpage

system

locks

an entire

A diskpage

can

one or

more rows

and from

one or

partial

completeness in

not

which

a fixed

which

be

some

members

or

data for

more tables.

supertype

hierarchy,

occurrences

of any subtype.

In normalisation,

an attribute of the

diskpage,

contain

In a generalisation

dependency

(subset)

has

In this type of lock, the database

of a disk.

in

do not

be described of a disk.

section

partial

on the

operations

can

of a

as 4K, 8K, or 16K.

a condition

management,

based

database

in the

of the

section

management

database

inside

In transaction

control

which

page-level lock

day-to-day

embedded

block,

might

approach

unmatched

In permanent storage, the equivalent

size,

text.

a concurrency

are

one subtype.

addressable

applications.

as a transactional

are

values

hierarchy,

instance

page

Database

to

Special instructions

optimiser

table

P

database.

optimiser

all

with inner join.

entity

or

A database designed

Also known

or production

Microsoft

a company's

related

are

OLTP are known

(ODBC)

Windows

pairs

algebra JOIN operation that

which

specialisation

databases.

middleware

in the

(non-disjoint)

more than

databases,

Connectivity

in

left

day-to-day

transactional

Open Database

query

a company's

Databases

OLTP

(OLTP)

values

unmatched

with inner join.

a table

null.

all

join and right outer join.

decision

research.

online

algebra JOIN operation that

which

A relational

produces

multidimensional

OLAP creates

environment

business

(OLAP)

in

left

outer join

online

as

A relational

produces

is

dependent

primary

a condition

on only

a portion

key.

conflict. optional that

attribute

In

does not require

ER

modelling,

an attribute

a value; therefore,

it can be left

empty.

optional in

participation

which

one

entity

a corresponding

In ER modelling, a condition

occurrence entity

does

occurrence

The SQL logical

expressions

to

ORDER BY output

ascending


operator used to link

expressions

clause. It requires


in

in

database.

has

in

or

HAVING

conditional

A SQL clause that is useful for ordering query


(for

example,

order).


key

attributes

in


For

based

reserves

in the

CLASS,

on the

In partitioned

the

participants

databases,

determine

one or

the fragment

more in

will be stored.

data allocation

electronic

example,

teaches

a table that

that

to

database

See also fully

CLASS.

of dividing

fragments

not

is

and

partition

strategy

some

multiple sites.

a relationship.

relationship

partitioned

in

of only

at

PROFESSOR

which a row

of a SELECT

deemed

WHERE

A distributed

An ER term for entities that

PROFESSOR

multiple

copies

replicated

relationship

be true.

or descending

2020

a

only one of the

which

are stored

teaches

conditional

the

database

database

fragments

participate

a particular

relationship.

OR

replicated

participants

not require

in

partially

are stored


A data allocation

a database


into

at two


two

or

be

or


more

more sites.


partitioning subsets

The process of splitting a table into

of rows

performance that

allows

a system

database

tuning

perform

access

perform

as though

it

were

in

Activities that

more efficiently

make a

in terms

a table,

only, previous

a framework

of fact)

can

about the time span of data

usually

years,

expressed

as current

to

line

with

stored

breaks

standard

extensions

SQL statements

primary

that

is

stored

A block of code

and

and

physical

data format

method

to improve

documents

the

through

the

use

of

and indention.

key (PK)

In the relational

composed

of one

a row.

or

model, an

more

attributes

Also, a candidate

at the

that

key

See also key.

DBMS

prime attribute A key attribute; that is, an attribute that is part of a key or is the whole key. See also key attributes.

The way a computer sees

data.

physical

design

maps the

data

function

and

Because

of the types

hardware,

the

system

private cloud A form of cloud computing in which an internal cloud is built by an organisation to serve its own needs.

A stage of database design that

storage

of a database.

the

(statement

or false.

selected as a unique entity identifier.

server.

(stores)

find()

uniquely identifies

procedural

executed

true

year

or all years.

module (PSM)

which an assertion as either

of retrieved

identifier

persistent

in

mathematics to

MongoDB, a method that can be

the

readability

of a variety

technologies

infrastructure.

be verified

In

management

Used extensively in

provide

pretty()

of storage

The coexistence data

an organisation's

chained

Information

and

predicate logic

a

speed.

periodicity stored

to

persistence storage

within

A DDBMS feature

DBMS.

performance and

of data

transparency

centralised

polyglot

or columns.


access

these

are

supported

by the

of devices

data access

physical

design

software-dependent.

See

characteristics

characteristics

methods

supported

are

hardware-and

also

both physical

a

Procedural Language SQL (PL/SQL) A type of SQL that allows the use of procedural code and in which SQL statements are stored in a database as a single callable object that can be invoked by name.

by

as a single

model.

callable

object

that

can be invoked

by

name. physical independence physical

model

internal

can

be changed

affecting

procedures

the

model

described

the

hardware-and

Platform

as location,

data.

The

path,

physical

and format model is

are

both

as a Service (PaaS) provider

consumer-created

A model in which

can build and deploy

applications

using

the

In the

client-side, invoked

public.

providers

types

policies

application

browser

when

query

that is automatically needed

to

manage

for

statements

manage company

communication

and

of direction

operations

support

of the

that


to the

used


of


code.

A specific

by the

request

end user or the

DBMS.

to

language


SQL

issued

A nonprocedural language

by a DBMS

of a query


form

manipulation

query language

the

organisation's

objectives.

review

in the

data

application

are

through

A question or task asked by an end user of a

database

of data.

General

used to

Q

World Wide Web(WWW), a

external by the

specific


of an activity or process.

infrastructure.

plug-in


during

public cloud A form of computing in which the cloud infrastructure is built by a third-party organisation to sell cloud services to the general public.

software-dependent.

cloud service

be followed

properties In a graph database, the attributes or characteristics of a node or edge that are of interest to the users.

A model in which physical

such for

Series of steps to

the performance

characteristics

cloud

without

which the

model.

physical

the

A condition in

manipulate is


its

data.

that is

An example

SQL.


query optimiser SQL

queries

access

the

access

A DBMS process that analyses

and finds data.

the

The

query

or execution

query result returned

most

efficient

optimiser

plan for the

set

contents;

way to generates

the

query.

The collection

the

JOIN,

PRODUCT,

and

relational (RDBMS)

of data rows

by a query.

a relational

the

RAID

An acronym for Redundant

to

Disks.

create

virtual

individual

the

use

multiple

volumes)

RAID systems

fault

Array of

systems

disks (storage

disks.

improvement,

RAID

from

provide

tolerance,

and

disks

database.

several

a balance

A collection

of related (logically

connected)

data

security,

query

A nested query that joins

data

a single

entity

married to

type.

A relationship

For example,

an EMPLOYEE

of another

or a PART is

IBM

within

an EMPLOYEE

for

is

and summarises

produce

a single

reducer

the results

of

also

to

help

language

creates

and

to the

(SQL)

and

into

provide

concurrent

administration

A graphical

easy

data

application

entities,

representation the

of

attributes

and the relationships

in

model 1970,

users

designers

within

among

the

a

major

A program that performs a reduce

conceptual

based

on

and represents

data

relations. Each relation (table) is conceptually represented as a matrix of intersecting rows and columns. The relations are related to each other through the sharing of common entity characteristics (values in columns).
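As a minimal sketch of relations being related through a shared column value, the following uses Python's built-in sqlite3 module. The AGENT and CUSTOMER table layouts, column names, and sample rows are hypothetical and chosen only for illustration.

```python
import sqlite3

# In-memory database; table and column names are illustrative assumptions.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE agent (agent_code INTEGER PRIMARY KEY, agent_name TEXT);
    CREATE TABLE customer (
        cus_code   INTEGER PRIMARY KEY,
        cus_name   TEXT,
        agent_code INTEGER   -- common characteristic shared with AGENT
    );
    INSERT INTO agent VALUES (501, 'Alby'), (502, 'Hahn');
    INSERT INTO customer VALUES (10010, 'Ramas', 502), (10011, 'Dunne', 501);
""")

# The two relations are linked only by matching values in agent_code.
rows = conn.execute("""
    SELECT c.cus_name, a.agent_name
    FROM customer AS c
    JOIN agent AS a ON a.agent_code = c.agent_code
    ORDER BY c.cus_code
""").fetchall()
print(rows)
```

No physical pointer connects the two tables; the link exists only because both store the same agent_code values.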

result.

breakthrough

of its

model is

set theory

intersecting

by E. F. Codd of

because

The relational

as independent is conceptually

map functions

Developed

it represented

and

mathematical

The function in a MapReduce job that

collects

RDBMS

integrity,

databases

entities,

simplicity.

a component

PART.

reduce

and retrieve

entities.

a table

found

locate

dictionary

diagram

relational relationship

software

(queries)

A good

a data

and system

a relational

to itself.

recursive

RDBMS

requests

physically

a query

those

fields.

recursive

The

logical

data.

maintains

relational

record

DIFFERENCE,

programs.

between

two.

to

requested

through

performance

that

and

access,

are SELECT,

UNION,

DIVIDE.

a users

commands

R

main functions

INTERSECT,

database management system A collection of programs that manages

translates

Independent

eight

PROJECT,

the sharing (values in

function.

redundant the

transaction

transaction

systems

log

to

referential

kept

ensure

will not impair

that

the

integrity tables

matching

or a

Relations to

common

each


table.

have

an invalid

a null Even

relational relational

entry.

through

the

of a

relationship

(a value in a column).


relational


the

Arelationship higher.

table


An association degree

or participants

A set of mathematical principles manipulating

that

use

schema The organisation of a database as described by the database

relationship

Relations

sharing

functions

administrator.

model, an entity

as tables.

processing

processing

relational schema The organisation of a relational database as described by the database administrator.

a corresponding

database

other

basis for


have

characteristic

algebra

that form the

review

to

data.

must have either

are implemented

entity

relational


may not

In a relational

are related


key

Analytical

relational databases and familiar relational query tools to store and analyse multidimensional data.

of a disk

by which a

entry in the related

it is impossible

relation

failure

ability to recover

foreign

an attribute

attribute,

physical

online analytical

(ROLAP)

management

A condition

dependent

set.

the

relational

Multiple copies of

by database

DBMSs

entry though

logs


between entities.

The number of entities associated with a relationship.

degree can be unary, binary, ternary, or


Remote

Data Objects (RDO)

object-oriented remote

application

database

DAO

and

was optimised such

servers.

ODBC

as

for

to

RDO

direct

deal

A higher-level,

interface

used

uses

access

the

to

to

lower-level

Oracle, and

ROLLBACK

A SQL command that restores the

database

contents

after

databases.

with server-based

MS SQL Server,

access

RDO

the last

DB2.

request

a single

SQL

A DDBMS feature that allows

statement

remote

DP. See

remote

transaction

to

access

also remote

data in

data in

data

which

repeating

group

describing

a group

type

for

a single

example,

a car

interior,

In

key

bottom,

trim,

replica transparency the

existence

of

so

access

of the same For

colors

the

of data from the

user.

data allocation

strategy

which

in

fragments

are stored

replication

versions

place

access

copies

time

reserved cannot in

A data allocation

of one

at several

or

more database

different

words

Oracle

any

SQL, the

tables

right

outer

other

purpose.

is

For

cannot

In a pair of tables

yields

the

table.

CUSTOMER

used

with no

For with

to

matching

example,

AGENT

the

CUSTOMER row. join.

be used

to

scaling

be joined,


a right

outer

will yield

ones

any

A query

that

that

all

do not

join

of

of the

have


distributing

a cluster

up

when

optimisation

query optimisation

algorithm

uses

A query

preset rules

first,

involves

powerful

to

aggregate

and

making the

up the

by


migrating


the

structures

servers.

same

with data growth

structure

to

more

systems.

scheduler

matching

the

The DBMS

order in

execution

which concurrent

to

ensure

that

transaction

The scheduler

of database

sequence

component

operations

interleaves

operations

in

establishes

the

a specific

serialisability.

different

data is the exact

materially

storage

AGENT

a

used with the

data

data

of commodity

A method for dealing

are executed.

Rolling


even

A method for dealing with data growth

involves

scaling

See also left outer join and outer

BY clause


table,

page.

portion is calculated

out

across

schema

opposite

of drilling down the data. See also drill down.


same

to

S

example,

values in

In SQL, an OLAP extension

dimensions.


of the same

optimiser

multiplication

all of the rows in the right table,

ones

including

GROUP

rows

on the

technique

that

roll up

are

transactions

correct answer 17.

word INITIAL

join

the

rows,

of rows.

or columns.

including other

set

which

database lock in

concurrent

optimisation

that

a join that

different

rows

allows

query optimisation

Words used by a system that

for

of a given

rule-based

and to improve

and fault tolerance.

be used

name

Replication

locations

blocks,

rules of precedence Basic algebraic rules that specify the order in which operations are performed. For example, operations within parentheses are executed first, so in the equation 2 + (3 × 5), the multiplication portion is calculated first, making the correct answer 17.
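The precedence example, 2 + (3 × 5), can be checked with a short sketch. It evaluates the expression in Python and, as an assumption about how one might confirm the same rule in SQL, through an in-memory SQLite connection; the variable names are illustrative only.

```python
import sqlite3

# Parentheses are evaluated first: 3 * 5 = 15, then 2 + 15.
with_parentheses = 2 + (3 * 5)

# Moving the parentheses changes the order of evaluation and the result.
regrouped = (2 + 3) * 5

# The same expression evaluated by a SQL engine (SQLite, in memory).
conn = sqlite3.connect(":memory:")
sql_result = conn.execute("SELECT 2 + (3 * 5)").fetchone()[0]

print(with_parentheses, regrouped, sql_result)
```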

sites.

of a database.

in

in

points to determine the best approach to executing a query.

The process of creating and managing

duplicate to

copies

stored

mode based on the rule-based algorithm.

The DDBMS's ability to hide

replicated

A physical data storage

data is

A less restrictive

DBMS

rule-based

for its top,

on.

multiple copies

existed

statement.

all columns

lock

that

row-level trigger A trigger that is executed once for each row affected by the triggering SQL statement. A row-level trigger requires the use of the FOR EACH ROW keywords in the trigger declaration.
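A row-level trigger can be sketched with SQLite through Python's sqlite3 module. This is an illustration under stated assumptions rather than the syntax of any particular DBMS discussed in the book: the PRODUCT and PRICE_AUDIT tables, the trigger name, and the sample prices are hypothetical, and SQLite's trigger dialect differs in detail from Oracle PL/SQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE product (p_code TEXT PRIMARY KEY, p_price REAL);
    CREATE TABLE price_audit (p_code TEXT, old_price REAL, new_price REAL);

    -- Row-level trigger: the body runs once per updated row.
    CREATE TRIGGER trg_price_audit
    AFTER UPDATE OF p_price ON product
    FOR EACH ROW
    BEGIN
        INSERT INTO price_audit VALUES (OLD.p_code, OLD.p_price, NEW.p_price);
    END;

    INSERT INTO product VALUES ('A1', 10.0), ('B2', 20.0), ('C3', 30.0);
""")

# A single UPDATE statement that affects two of the three rows.
conn.execute("UPDATE product SET p_price = p_price * 2 WHERE p_price >= 20.0")

audit_rows = conn.execute(
    "SELECT p_code, old_price, new_price FROM price_audit ORDER BY p_code"
).fetchall()
print(audit_rows)
```

One UPDATE statement produces one audit row per affected product, which is the behaviour that distinguishes a row-level trigger from a statement-level trigger.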

request.

occurrence.

multiple

and

the

a characteristic entries

attribute have

to

also remote

a relation,

of multiple

can

requests)

DP. See

from

row-level

transaction.

by several

remote

which

condition

a single

A DDBMS feature that allows

(formed

a single

COMMIT

in

access

a transaction

to the

storage

technique hold

remote

table

row-centric

databases

schema A logical grouping of database objects, such as tables, indexes, views, and queries, that are related to each other. Usually, a schema belongs to a single user or application.

types

set theory

deals

design,

according

to

operational

with

sets,

basis for

in the

is in

normal form (2NF)

normalisation

process,

1NF and there

(dependencies

purposes.

For

customer

combination

example,

number

appropriate

table

See

also

equivalent

of a file

systems

SELECT

A SQL command or a subset

not likely

models

data

that

modelling

both

structure in

data

the

the

a table.

known

1981,

The

data from

by

the

a data

other

database.

See

The SDM, M. Hammer

world,

in

a single

to

in

which

computers

one

single-site

all processing

is

done

on a single

and all data are stored

local

on the

disk.

database

user

data (SPSD)

A database that supports

at a time.

An attribute that can have

snowflake schema A type of star schema in which dimension tables can have their own dimension

published and

Compare

slice and dice The ability to cut slices off a data cube (drill down or drill up) to perform a more detailed analysis.

real

relationships

as an object.

was developed

allows

the

from

on the

components.

single-valued attribute only one value.

values

The first of a series of data represented

meaningful

processing,

single-user only

yields

in

attribute.

CPU or host computer

type.

that

into

composite

A scenario

data model, the

and their

access

data

held

An attribute that cannot be

subdivided

host

used to retrieve

closely

are

A shared lock to

attribute

single-site

middle initial, match

record

model

more

transaction.

as model.

when a

to read

locks

used

relational

to

but the

tables.

semantic

permission

no exclusive

transactions

simple

key.

of rows

is

and is

in the

also exclusive lock.

key).

key),

will probably

row.

statement

are

name,

In the hierarchical

of all rows

primary

(primary

name, first

add and intranets.

dependencies

customers number

of last

and telephone

a relation

of things,

manipulation

A lock that is issued

and

read-only

A key used strictly for data retrieval

know

SELECT

which

are no partial

key

segment

in

in only part of the

secondary their

The second stage

extensions

Web servers

or groups

data

requests

database

by another

second

to

A part of mathematical science that

transaction

requirements.

Server-side

functionality

shared lock

The part of a system that defines the extent

of the

of requests.

Alogical each

scope

belongs

significant

as tables,

related

Usually, a schema

or application.

tables.

D.

The snowflake schema is usually the result of normalising dimension tables.
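A snowflake schema that results from normalising a dimension table might look like the following sketch, written with Python's sqlite3 module. The SALES_FACT, LOCATION, and REGION tables and their rows are hypothetical; the point is only that the LOCATION dimension has its own dimension table, REGION.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- REGION is a dimension of the LOCATION dimension (the "snowflake" part).
    CREATE TABLE region (region_id INTEGER PRIMARY KEY, region_name TEXT);
    CREATE TABLE location (
        loc_id    INTEGER PRIMARY KEY,
        city      TEXT,
        region_id INTEGER REFERENCES region(region_id)
    );
    -- Central fact table referencing the LOCATION dimension.
    CREATE TABLE sales_fact (
        sale_id INTEGER PRIMARY KEY,
        loc_id  INTEGER REFERENCES location(loc_id),
        amount  REAL
    );
    INSERT INTO region VALUES (1, 'North'), (2, 'South');
    INSERT INTO location VALUES (10, 'Leeds', 1), (11, 'York', 1), (12, 'Bath', 2);
    INSERT INTO sales_fact VALUES (1, 10, 100.0), (2, 11, 50.0), (3, 12, 75.0);
""")

# Aggregating by region requires walking fact -> dimension -> sub-dimension.
totals = conn.execute("""
    SELECT r.region_name, SUM(f.amount)
    FROM sales_fact AS f
    JOIN location AS l ON l.loc_id = f.loc_id
    JOIN region AS r ON r.region_id = l.region_id
    GROUP BY r.region_name
    ORDER BY r.region_name
""").fetchall()
print(totals)
```

In a plain star schema the region name would be stored directly in LOCATION; normalising it out removes that redundancy at the cost of an extra join.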

McLeod.

Software

semi-structured processed

to

sentiment that

data some

positive,

to

negative,

serialisability order

A method of text analysis

determine

a statement

or neutral

state

operations

that

would

creates

have

had been executed

server-side

extension


with the


process


same

produced

instances

if the


the


In

measurement

multidimensional of the

data analysis, a

data density

held in the

data

cube.

specific


which

applications

is low.

sparsity

that interacts

to handle

A model in

sparse data A case in which the number of table attributes is very large but the number of actual data

final

in a serial fashion.

A program

server


been

the

(SaaS)

software independence A property of any model or application that does not depend on the software used to implement it.

a

attitude.

transactions

directly

conveys

A property in which the selected

of transaction

database

if

as a Service

the cloud service provider offers turnkey that run in the cloud.

extent.

analysis

attempts

Data that have already been


specialisation top-down

hierarchy

process

specific

entity

supertype.

subtypes

A shared,

stores the

called

SQL

data

services

that

provide

amount

to

star

the

a given

database.

data

a central

stateless server

system

memory between

the

client

statement-level

more

if the

omitted.

This type

or after

the

static

in

tables.

clients

server.

A SQL trigger that is

of trigger

is

executed

statement

completes,

optimisation

which the

predetermined

A query

access

path to

at compilation

optimisation

SQL

SQL statements

do not

change

SQL in which the

while the


logic or

language.

of data inputs

about

which

before

storage.

A relationship

dependence

data

that

to

keep

occurs

in the

A data characteristic in database

thus requiring

that

schema

changes

affects

in all access

enable

whole

users to

create

database


COMMIT

protocol.

A query that is embedded (or nested)

another

or an inner


discard

relationship

subquery inside

application

is running.


code

The processing

using the two-phase

A style of embedded

Business

of SQL

procedural

decisions

which data to

(2)

form

of

subordinate In a DDBMS, a data processor (DP) node that participates in a distributed transaction

is with

dynamic query optimisation. static

the

by

code.

and table structures, perform various types of data manipulation and data administration, and query the database to extract useful information.

and is

Contrast

statements.

in

make

of commands

before

a database

time.

as indicated

program

Structured Query Language (SQL) A powerful and flexible relational database language composed

are once,

a value,

and

structured data Unstructured data that have been formatted to facilitate storage, use, and information generation.

state

ROW keywords

of procedural

structural independence A data characteristic in which changes in the database schema do not affect data access.
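Structural independence can be illustrated with a small sketch using Python's sqlite3 module: a query that names its columns keeps working after the table structure changes. The CUSTOMER table, its columns, and the added cus_phone column are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customer (cus_code INTEGER PRIMARY KEY, cus_name TEXT)")
conn.execute("INSERT INTO customer VALUES (10010, 'Ramas')")

# The application accesses data by column name, not by physical layout.
query = "SELECT cus_name FROM customer WHERE cus_code = 10010"
before = conn.execute(query).fetchall()

# Change the schema by adding a column the query never mentions.
conn.execute("ALTER TABLE customer ADD COLUMN cus_phone TEXT")
after = conn.execute(query).fetchall()

print(before, after)
```

The unchanged query returns the same result before and after the schema change, which is the property the definition describes.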

not reserve

communications

DBMS

best

programs.

a

which a Web of the

SQL

processing to

group

in its

DBMS-specific

which a change

case.

query

mode in

an open

and

data access,

represents table

the

(1) A named collection

on a server

structural

data into

dimension

Web does

FOR EACH

triggering

default

The

trigger

assumed

the

or

and the

used

used to

as a fact

A system in

maintain

are

The

determine

when two entities are existence-dependent; from a database design perspective, this relationship exists whenever the primary key of the related entity contains the primary key of the parent entity.
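From a design perspective, the condition that the related entity's primary key contains the parent's primary key can be sketched as follows with Python's sqlite3 module. The INVOICE and LINE tables and their columns are hypothetical examples, and foreign key enforcement is switched on explicitly because SQLite leaves it off by default.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite does not enforce FKs by default
conn.executescript("""
    CREATE TABLE invoice (inv_number INTEGER PRIMARY KEY);

    -- LINE's primary key contains INVOICE's primary key (inv_number),
    -- so a LINE row cannot be identified without its parent INVOICE.
    CREATE TABLE line (
        inv_number  INTEGER NOT NULL REFERENCES invoice(inv_number),
        line_number INTEGER NOT NULL,
        PRIMARY KEY (inv_number, line_number)
    );

    INSERT INTO invoice VALUES (1001);
    INSERT INTO line VALUES (1001, 1), (1001, 2);
""")

# A line for a non-existent invoice violates the foreign key constraint.
try:
    conn.execute("INSERT INTO line VALUES (9999, 1)")
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True

line_count = conn.execute("SELECT COUNT(*) FROM line").fetchone()[0]
print(orphan_rejected, line_count)
```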

set of

support

known

one

with it.

to

strong

output.

does not know the status

communicating

end.

Standards

statement

procedure

order

and

minimum

The star schema

with

in

answer

modelling technique

table

that returns

stored

minimum

decision

a relational

1:M relationship

the activity.

multidimensional

using

server

of the

A data

SQL statements

stream

correct

the

and specific

quality

schema

map

at the

A named

procedural

access,

a database.

to

function

another

the

using

describes

for

evaluate

data storage,

returns

of time,

A detailed that

requirements

management

about

statistics

that uses

strategy.

a RETURN

Activities to help

that

of resources

standards

Data

tuning

query

information

stored

relational

amount

instructions

or

and functions.

over the internet.

a SQL

in the least

stored

SQL statements

triggers

based query optimisation A query optimisation technique

uses these

access

memory area that

executed

(SDS)

SQL performance generate

then

cache.

services

management

statistical unique

of the subtypes.

including

procedure

entity

on grouping

reserved

most recently

procedures,

Also

based

statistically algorithm

more

a higher-level

is

and relationships

SQL cache

PL/SQL

lower-level,

from

Specialisation

characteristics

and

A hierarchy based on the

of identifying


query.

Also known

as a nested

query

query.


subschema the

In the network

database

seen

that

produce

the

database.

subtype

the

desired

table

programs from

the

each

data in

to

The attribute in the

that

determines

supertype

to

which

occurrence

of all values for a given

super column

column

only

entity

is related.

that is

composed

lock lock

any row

In

as a file

a command

of other related

scheme that allows

locks

to

access

an entire

a table.

table,

by transaction

markup languages inserted

document

in

should markup

Web browser

preventing

T2 while

for

such as HTML and XML,

a document

to

be formatted. languages

specify

Tags

how

are

used in

and interpreted

presenting

by a

data.

An attribute or attributes that uniquely each

entity

in

a table.

task

See key.

trackers

framework surrogate

key

generally

numeric

A system-assigned

primary

key,

tasks

responsible

object,

relationship;

MapReduce

to running

map and reduce

and auto-incremented.

The use of different names to identify

same

A program in the

on a node.

ternary synonym the

storage space

known

T1 is using the table.

server-side

superkey

Also

at a time

A table-level

the

columns.

identify

data.

A locking

access to

tag

database, a

of a group

related

one transaction

transaction

or expression.

In a column family

In a DBMS, a logical

group

group.

A SQL aggregate function that yields the sum

column

space

used

table-level entity

subtype

model, the portion of application

information

discriminator

supertype

SUM

by the

ternary relationship An ER term used to describe an association (relationship) between three entities. For example, a CONTRIBUTOR contributes money to a FUND from which a RECIPIENT receives money.

such as an entity, an attribute, or a relationship; synonyms should generally be avoided.

See also homonym.
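A ternary relationship among CONTRIBUTOR, FUND, and RECIPIENT can be sketched as a single associative table that references all three entities. This is one possible implementation, not necessarily the design used in the book; the FUNDING table name, the column names, and the sample row are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE contributor (con_id INTEGER PRIMARY KEY, con_name TEXT);
    CREATE TABLE fund (fund_id INTEGER PRIMARY KEY, fund_name TEXT);
    CREATE TABLE recipient (rec_id INTEGER PRIMARY KEY, rec_name TEXT);

    -- Hypothetical associative table linking the three entities at once.
    CREATE TABLE funding (
        con_id  INTEGER REFERENCES contributor(con_id),
        fund_id INTEGER REFERENCES fund(fund_id),
        rec_id  INTEGER REFERENCES recipient(rec_id),
        amount  REAL,
        PRIMARY KEY (con_id, fund_id, rec_id)
    );

    INSERT INTO contributor VALUES (1, 'Ana');
    INSERT INTO fund VALUES (1, 'Scholarship');
    INSERT INTO recipient VALUES (1, 'Ben');
    INSERT INTO funding VALUES (1, 1, 1, 500.0);
""")

# Each FUNDING row ties one contributor, one fund, and one recipient together.
flows = conn.execute("""
    SELECT c.con_name, f.fund_name, r.rec_name, g.amount
    FROM funding AS g
    JOIN contributor AS c ON c.con_id = g.con_id
    JOIN fund AS f ON f.fund_id = g.fund_id
    JOIN recipient AS r ON r.rec_id = g.rec_id
""").fetchall()
print(flows)
```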

theta join system

catalogue

A detailed system data

dictionary

that

describes

systems

administrator

for coordinating

all objects

in

inequality

a database.

join

The person responsible

an organisation's

need

for

systems

be

The process

traces The

the

SDLC

database

2NF

history provides

out

and

big

picture

to

development

evaluated.

each

A matrix composed and

columns

an entity

set in the

relational

that model.

rows

that

assigns

data

of time.

top-down

Also

called

a

relation.


design and


that is, it

management, concurrent unique

timestamp

to


the

right

history


to


can

of all

is tracked.

main structures

moves

rights,

data

A design philosophy

then

the

time-variant

when a company's

by defining

part.

attribute;

Data whose values are a

appointments

begins

Cengage

>=) in the

is functionally

a global

For example,

system

or in

>, ,