Database Principles: Fundamentals of Design, Implementation, and Management [3 ed.] 9781473768062



English Pages [965] Year 2020


Table of contents:
Contents
Part I: Database Systems
Chapter 1: The Database Approach
Chapter 2: Data Models
Chapter 3: Relational Model Characteristics
Chapter 4: Relational Algebra and Calculus
Part II: Design Concepts
Chapter 5: Data Modelling with Entity Relationship Diagrams
Chapter 6: Data Modelling Advanced Concepts
Chapter 7: Normalising Database Designs
Part III: Database Programming
Chapter 8: Beginning Structured Query Language
Chapter 9: Procedural Language SQL and Advanced SQL
Part IV: Database Design
Chapter 10: Database Development Process
Chapter 11: Conceptual, Logical, and Physical Database Design
Part V: Database Transactions and Performance Tuning
Chapter 12: Managing Transactions and Concurrency
Chapter 13: Managing Database and SQL Performance
Part VI: Database Management
Chapter 14: Distributed Databases
Chapter 15: Databases for Business Intelligence
Chapter 16: Big Data and NoSQL
Chapter 17: Database Connectivity and Web Technologies
Glossary
Index

Australia • Brazil • Mexico • South Africa • Singapore • United Kingdom • United States

Copyright 2020 Cengage Learning. All Rights Reserved. May not be copied, scanned, or duplicated, in whole or in part. Due to electronic rights, some third party content may be suppressed from the eBook and/or eChapter(s). Editorial review has deemed that any suppressed content does not materially affect the overall learning experience. Cengage Learning reserves the right to remove additional content at any time if subsequent rights restrictions require it.

This is an electronic version of the print textbook. Due to electronic rights restrictions, some third party content may be suppressed. Editorial review has deemed that any suppressed content does not materially affect the overall learning experience. The publisher reserves the right to remove content from this title at any time if subsequent rights restrictions require it. For valuable information on pricing, previous editions, changes to current editions, and alternate formats, please visit www.cengage.com/highered to search by ISBN#, author, title, or keyword for materials in your areas of interest.

Important Notice: Media content referenced within the product description or the product text may not be available in the eBook version.

Database Principles: Fundamentals of Design, Implementation, and Management
Third Edition

US Authors: Carlos Coronel, Steven Morris
Adapters: Keeley Crockett, Craig Blewett

Publisher: Marinda Louw
Marketing Manager: Anna Reading
Senior Content Project Manager: Sue Povey
Manufacturing Manager: Eyvett Davis
Typesetter: SPi-Global
Cover Designer: Simon Levy Associates
Cover Image(s): Vijay Kumar/Getty Images

© 2020 Cengage Learning, EMEA

Adapted from Database Systems: Design, Implementation, & Management, 13th Edition, by Carlos Coronel, Steven Morris. Copyright © Cengage Learning, Inc., 2019. All Rights Reserved.

ALL RIGHTS RESERVED. No part of this work may be reproduced, transmitted, stored, distributed or used in any form or by any means, electronic, mechanical, photocopying, recording or otherwise, without the prior written permission of Cengage Learning or under license in the U.K. from the Copyright Licensing Agency Ltd.

The Author(s) and the Adapter(s) have asserted the right under the Copyright Designs and Patents Act 1988 to be identified as Author(s) and Adapter(s) of this Work.

For product information and technology assistance, contact us at [email protected]

For permission to use material from this text or product and for permission queries, email [email protected]

British Library Cataloguing-in-Publication Data
A catalogue record for this book is available from the British Library.

ISBN: 978-1-4737-6804-8

Cengage Learning, EMEA
Cheriton House, North Way
Andover, Hampshire, SP10 5BE
United Kingdom

Cengage Learning is a leading provider of customized learning solutions with employees residing in nearly 40 different countries and sales in more than 125 countries around the world. Find your local representative at: www.cengage.co.uk.

Cengage Learning products are represented in Canada by Nelson Education, Ltd.

To learn more about Cengage platforms and services, register or access your online learning solution, or purchase materials for your course, visit www.cengage.com.

Printed in China at RR Donnelley
Print Number: 01
Print Year: 2020

Brief Contents

Part I: Database Systems
Chapter 1: The Database Approach
Chapter 2: Data Models
Chapter 3: Relational Model Characteristics
Chapter 4: Relational Algebra and Calculus

Part II: Design Concepts
Chapter 5: Data Modelling with Entity Relationship Diagrams
Chapter 6: Data Modelling Advanced Concepts
Chapter 7: Normalising Database Designs

Part III: Database Programming
Chapter 8: Beginning Structured Query Language
Chapter 9: Procedural Language SQL and Advanced SQL

Part IV: Database Design
Chapter 10: Database Development Process
Chapter 11: Conceptual, Logical, and Physical Database Design

Part V: Database Transactions and Performance Tuning
Chapter 12: Managing Transactions and Concurrency
Chapter 13: Managing Database and SQL Performance

Part VI: Database Management
Chapter 14: Distributed Databases
Chapter 15: Databases for Business Intelligence
Chapter 16: Big Data and NoSQL
Chapter 17: Database Connectivity and Web Technologies

Glossary
Index

Appendices (Available online)
Appendix A: Designing Databases with Visio Professional: A Tutorial
Appendix B: The University Lab: Conceptual, Logical, and Physical Database Design
Appendix C: Global Tickets Ltd: Conceptual, Logical, and Physical Database Design
Appendix D: Converting an ER Model into a Database Structure
Appendix E: Comparison of ER Modelling Notations
Appendix F: Client/Server Systems
Appendix G: Object-Orientated Databases
Appendix H: Databases in e-Commerce
Appendix I: The Hierarchical Database Model
Appendix J: The Network Database Model
Appendix K: Database Administration
Appendix L: Data Warehouse Implementation Factors
Appendix M: Creating a New Database Using Oracle 12c
Appendix N: A Guide to Using SQL Developer with Oracle 12c
Appendix O: Building a Simple Object-Relational Database Using Oracle Objects
Appendix P: Microsoft Access Tutorial
Appendix Q: Working with MongoDB
Appendix R: Working with Neo4j

Contents

Preface
Changes to the Third Edition
Acknowledgements
About the Authors
Walk Through Tour
Dedication
Teaching and Learning Support Resources

Part I: Database Systems
Business Vignette: The Relational Revolution: An Historical Journey

Chapter 1: The Database Approach
  Preview
  1.1 Data vs information
  1.2 Introducing the database and the DBMS
  1.3 Why database design is important
  1.4 Historical roots: files and data processing
  1.5 Problems with file system data management
  1.6 Database systems
  1.7 Preparing for your database professional career
  Summary
  Key terms
  Further reading
  Review questions
  Problems

Chapter 2: Data Models
  Preview
  2.1 The importance of data models
  2.2 Data model basic building blocks
  2.3 Business rules
  2.4 The evolution of data models
  2.5 Degrees of data abstraction
  Summary
  Key terms
  Further reading
  Review questions
  Problems

Chapter 3: Relational Model Characteristics
  Preview
  3.1 A logical view of data
  3.2 Keys
  3.3 Integrity rules
  3.4 The data dictionary and the system catalogue
  3.5 Relationships within the relational database
  3.6 Data redundancy revisited
  3.7 Indexes
  3.8 Codd's relational database rules
  Summary
  Key terms
  Further reading
  Review questions
  Problems

Chapter 4: Relational Algebra and Calculus
  Preview
  4.1 Relational operators
  4.2 Joins
  4.3 Constructing queries using relational algebraic expressions
  4.4 Relational calculus
  Summary
  Key terms
  Further reading
  Review questions
  Problems

Part II: Design Concepts
Business Vignette: Using Data to Improve the Lives of Children and Women

Chapter 5: Data Modelling with Entity Relationship Diagrams
  Preview
  5.1 The entity relationship (ER) model
  5.2 Developing an ER diagram
  5.3 Database design challenges: conflicting goals
  Summary
  Key terms
  Further reading
  Review questions
  Problems

Chapter 6: Data Modelling Advanced Concepts
  Preview
  6.1 The extended entity relationship model
  6.2 Entity clustering

  6.3 Entity integrity: selecting primary keys
  6.4 Design cases: learning flexible database design
  6.5 Data modelling checklist
  Summary
  Key terms
  Further reading
  Review questions
  Problems
  Case studies

Chapter 7: Normalising Database Designs
  Preview
  7.1 Database tables and normalisation
  7.2 The need for normalisation
  7.3 The normalisation process
  7.4 Improving the design
  7.5 Surrogate key considerations
  7.6 Higher-level normal forms
  7.7 Normalisation and database design
  7.8 Denormalisation
  Summary
  Key terms
  Further reading
  Review questions
  Problems

Part III: Database Programming
Business Vignette: Open Source Databases

Chapter 8: Beginning Structured Query Language
  Preview
  8.1 Introduction to SQL
  8.2 Data definition commands
  8.3 Data manipulation commands
  8.4 Select queries
  8.5 Advanced data definition commands
  8.6 Advanced select queries
  8.7 Virtual tables: creating a view
  8.8 Joining database tables
  Summary
  Key terms
  Further reading
  Review questions
  Problems

Chapter 9: Procedural Language SQL and Advanced SQL
  Preview
  9.1 Relational set operators
  9.2 SQL join operators
  9.3 Subqueries and correlated queries
  9.4 SQL functions
  9.5 Oracle sequences
  9.6 Updatable views
  9.7 Procedural SQL
  9.8 Embedded SQL
  Summary
  Key terms
  Further reading
  Review questions
  Problems
  Case

Part IV: Database Design
Business Vignette: EM-DAT: The International Disaster Database for Disaster Preparedness

Chapter 10: Database Development Process
  Preview
  10.1 The information system
  10.2 The systems development life cycle (SDLC)
  10.3 The database life cycle (DBLC)
  10.4 Database design strategies
  10.5 Centralised vs decentralised design
  10.6 Database administration
  Summary
  Key terms
  Further reading
  Review questions
  Problems

Chapter 11: Conceptual, Logical, and Physical Database Design
  Preview
  11.1 Conceptual design
  11.2 Logical database design
  11.3 Physical database design
  Summary
  Key terms
  Further reading
  Review questions
  Problems

Part V: Database Transactions and Performance Tuning
Business Vignette: From Data Warehouse to Data Lake

Chapter 12: Managing Transactions and Concurrency
  Preview
  12.1 What is a transaction?
  12.2 Concurrency control
  12.3 Concurrency control with locking methods
  12.4 Concurrency control with time stamping methods
  12.5 Concurrency control with optimistic methods
  12.6 ANSI levels of transaction isolation
  12.7 Database recovery management
  Summary
  Key terms
  Further reading
  Review questions
  Problems

Chapter 13: Managing Database and SQL Performance
  Preview
  13.1 Database performance-tuning concepts
  13.2 Query processing
  13.3 Indexes and query optimisation
  13.4 Optimiser choices
  13.5 SQL performance tuning
  13.6 Query formulation
  13.7 DBMS performance tuning
  13.8 Query optimisation example
  Summary
  Key terms
  Further reading
  Review questions
  Problems

Part VI: Database Management
Business Vignette: The Facebook-Cambridge Analytica Data Scandal and the GDPR

Chapter 14: Distributed Databases
  Preview
  14.1 The evolution of distributed database management systems
  14.2 DDBMS advantages and disadvantages
  14.3 Distributed processing and distributed databases

  14.4 Characteristics of distributed database management systems
  14.5 DDBMS components
  14.6 Levels of data and process distribution
  14.7 Distributed database transparency features
  14.8 Distribution transparency
  14.9 Transaction transparency
  14.10 Performance and failure transparency
  14.11 Distributed database design
  14.12 The CAP theorem
  14.13 Database security
  14.14 Distributed databases within the cloud
  14.15 C.J. Date's 12 commandments for distributed databases
  Summary
  Key terms
  Further reading
  Review questions
  Problems

Chapter 15: Databases for Business Intelligence
  Preview
  15.1 The need for data analysis
  15.2 Business intelligence
  15.3 Decision support data
  15.4 The data warehouse
  15.5 Star schemas
  15.6 Data analytics
  15.7 Online analytical processing
  15.8 SQL analytic functions
  15.9 Data visualisation
  Summary
  Key terms
  Further reading
  Review questions
  Problems

Chapter 16: Big Data and NoSQL
  Preview
  16.1 Big data
  16.2 Hadoop
  16.3 NoSQL databases
  16.4 NewSQL databases
  16.5 Working with document databases using MongoDB
  16.6 Working with graph databases using Neo4j
  Summary
  Key terms
  Review questions

Chapter 17: Database Connectivity and Web Technologies
  Preview
  17.1 Database connectivity
  17.2 Database internet connectivity
  17.3 Extensible markup language (XML)
  17.4 Cloud computing services
  17.5 The semantic web
  Summary
  Key terms
  Further reading
  Review questions
  Problems

Glossary
Index

Appendices (Available online)
Appendix A: Designing Databases with Visio Professional: A Tutorial
Appendix B: The University Lab: Conceptual, Logical, and Physical Database Design
Appendix C: Global Tickets Ltd: Conceptual, Logical, and Physical Database Design
Appendix D: Converting an ER Model into a Database Structure
Appendix E: Comparison of ER Modelling Notations
Appendix F: Client/Server Systems
Appendix G: Object-Orientated Databases
Appendix H: Databases in e-Commerce
Appendix I: The Hierarchical Database Model
Appendix J: The Network Database Model
Appendix K: Database Administration
Appendix L: Data Warehouse Implementation Factors
Appendix M: Creating a New Database Using Oracle 12c
Appendix N: A Guide to Using SQL Developer with Oracle 12c
Appendix O: Building a Simple Object-Relational Database Using Oracle Objects
Appendix P: Microsoft Access Tutorial
Appendix Q: Working with MongoDB
Appendix R: Working with Neo4j

Preface

We are excited to introduce the third edition of Database Principles: Fundamentals of Design, Implementation, and Management, which provides a solid and practical foundation for the design, implementation and management of database systems. This foundation is built on the notion that, while databases are very practical things, their successful creation depends on understanding the important concepts that define them.

This edition is suitable for a first course in databases at undergraduate level, and it is an ideal text not only for those studying introductory databases courses but also for those on postgraduate conversion courses. Providing comprehensive coverage of core database concepts, it contains essential material in the context of computer science and data science, and will also provide practical coverage in the areas of business technology and data analytics.

The Approach: A Continued Emphasis on Design

As the title suggests, Database Principles: Fundamentals of Design, Implementation, and Management covers three broad aspects of database systems. However, for several important reasons, special attention is given to database design:

The availability of excellent database software enables even database-inexperienced people to create databases and database applications. Unfortunately, the 'create without design' approach usually paves the way to a number of database disasters. In our experience, many, if not most, database system failures are traceable to poor design and cannot be solved with the help of even the best programmers and managers. Nor is better DBMS software likely to overcome problems created or magnified by poor design. Using an analogy, even the best bricklayers and carpenters can't create a good building from a bad blueprint.

Most difficult database system management problems seem to be triggered by poorly designed databases. It hardly seems worthwhile to use scarce resources to develop excellent and extensive database system management skills in order to exercise them on crises induced by poorly designed databases.

Design provides an excellent means of communication. Clients are more likely to get what they need when database system design is approached carefully and thoughtfully. In fact, clients may discover how their organisations really function once a good database design is completed.

Familiarity with database design techniques promotes understanding of current database technologies. For example, because data warehouses derive much of their data from operational databases, data warehouse concepts, structures, and procedures make more sense when the operational database's structure and implementation are understood.

Because the practical aspects of database design are stressed, we have covered design concepts and procedures in detail, making sure that the numerous end-of-chapter problems are sufficiently challenging that students can develop real and useful design skills. We also make sure that students understand the potential and actual conflicts between database design elegance, information requirements, and transaction processing speed. For example, it makes little sense to design databases that meet design elegance standards while they fail to meet end-user information requirements. Therefore, we explore the use of carefully defined trade-offs to ensure that the databases are capable of meeting end-user requirements while conforming to high design standards.

In keeping with the second edition, this third edition retains the use of UML (Unified Modelling Language) notation to produce entity relationship models for data modelling. Continual development by the Object Management Group has led to UML becoming an International Standard (ISO/IEC 19505-1 and 19505-2), which is continually reviewed; UML 2.5.1 is available as of 2017. However, organisations still use the Chen and Crow's Foot notations in order to maintain legacy systems, so it is important that familiarity with these approaches is maintained. Appendix E, Comparison of ER Modelling Notations, contains coverage of both these notations.

Changes to the Third Edition

In this third edition, we have added some new features and greater depth to strengthen the already strong database design coverage. Here are just a few of the highlights:

  • To support the continued growth of Big Data technology, we have added a new chapter, Chapter 16: Big Data and NoSQL. The chapter focuses on the characteristics of Big Data and the technologies that have been developed to support its use, including Hadoop, NoSQL and MongoDB.
  • Updated and expanded coverage of data visualisation tools and techniques in Chapter 15, Databases for Business Intelligence.
  • New appendix with hands-on exercises for querying MongoDB databases (Appendix Q).
  • An additional new appendix containing hands-on coverage of Neo4j, with exercises for querying graph databases (Appendix R).
  • New Business Vignettes to provide topical discussion points in the classroom.

Acknowledgements

The

publisher

feedback Emilia

on the

second

Mwim,

UNISA

Patricia Judy

acknowledges

Casper

Essop, Jakeman,

Andy

Davies,

Mick Ridley, Ray

Turner,

Mark

For this

Lecturer

Oxford

I

coverage

of relational

Last,

College

of Essex University of

Glamorgan

to

say

a special

School

of

Computing,

of experience

within

thanks Maths

the

to and

Pamela Digital

database

field

Quick,

who

Technology have

previously

at

been

worked

Manchester

very

valuable,

Metropolitan specifically

I have

Louw.

been lucky

Marinda

to

provided

work

fantastic

with a very support

patient,

in

supportive

answering

all

and

It


not least,


has

been

with you.

certainly

thank

you

to

my family

(my

ohana)

for

your

patience

and

support.

January

Copyright

the

professional

my emails.

Keeley

Editorial

as a

algebra.

working and

State

of Bradford

like

edition,

Marinda

a pleasure

invaluable

Greenwich

Brookes

would

Free

Regional

University

in the

third

provided

College

University

Her years

On this

who

of Technology

of the

of

Blackburn

University.

Publisher,

University

Peterborough

McPhee,

edition,

lecturers,

of Pretoria

University

University

Green,

Duncan

Senior

Central

University

Chris

following

UNISA

Macdonald,

Ismael

of the

editions:

University

Wessels,

Theo

contribution

and third

Alexander,

van Biljon,

the


Crockett 2020


About the Authors

Carlos

Coronel

Tennessee

is

State

Administrator,

courses

in

Web development,

Steven

completed

Morris Design

and

Dr Keeley

and

and

from

Rule Induction data

in

dialogue

such

Women

as

in

be a STEM

Dr Craig in

South

the

Activated

is the such is

founder

also

has


technology

literacy,


fields

at

as

a

and

Middle

Database

has

taught

data communications


been

which

at the

of

has

Women

in

Masters

explored

the

His PhD, in education (ACT)

companies

teaching

speaker

who is

digital


of

resulted to teaching

of numerous

using

articles

undertaking

many

is

and IEEE also

proud

to

schools.

and

Artificial

Technology

Intelligence

in the

to

development

with technology. books

running,

his innovative

for its

and journal

Systems

application

the

and natural

committee,

of Information

author

She leads

Keeley

in rural

Fuzzy

presence

IEEE

in

systems

systems,

papers

roles.

with technology,

entitled

database

international

science

Mathematics

1998 of

Leadership

approach

and is the

systems,

in

in the

other

technology,

model, a unique

has published

students.

conference

many

area

Systems

a BSc Degree (Hons)

fuzzy

volunteer

in computer

in the

Steven

field

a strong

Engineering

among

outreach

the

intelligence,

an active

PL/SQL,

of Computing,

postgraduate

established

She is

School

She gained

machine learning

and

He has taught

and

University.

within

artificial

University. SQL

journals.

in the

teaching

using

and teaching His

State

over 125 refereed

IEEE

Auburn

University.

subcommittee

acclaimed


Labs

Specialist,

Advanced

of several

Intelligence

Profiling

database

changing

with

undergraduate

Lab,

years.

of

multiple

various

and

PhD from

boards

has

both

of the

Teaching

our rapidly

suppressed

to

management.

an internationally

in

She

been researching 25

and

and a PhD in the field

with a passion for

over

in

Computer

Technology

Middle Tennessee

review

and journals.

Classroom

as computer

education

2020

for

Business

experience and

Programming

She has published

member

transaction

of

development,

Metropolitan

intelligence

has

Africa

database

1993,

conferences a

of

and

Computational

Psychological

Ambassador

Blewett

in

Domains.

systems.

being

years

Science

of MIS at

20 years

Computational

of

on the

Research

Adaptive

major international

roles

review

Data

Intelligence

into

College

Manager

Database

Manchester

UMIST in

for

the

design

Bachelor

serves

at

engineering

language

his

a Reader

from

Computational

research

database

and Principles

is

25 Web

Development,

Technology

Computation

over

for

levels.

currently

Crockett

Digital

and

and

Design,

articles,

Director

He has

and graduate

many

Lab

Administrator,

undergraduate

Analysis

Copyright

the

University.

Network

Database

Editorial

currently

covering

and

active

approaches

to

topics

living.

help

of Craig

He

change

world.


Walk-through Tour

[Sample pages: opening of Chapter 1, The Database Approach, showing the "In this chapter, you will learn" list, the Preview and the Business Vignette]

Business Vignettes illustrate the part topics with a genuine scenario and show how the subject integrates with the real world.

Chapter Previews set the scene for the chapter and provide an overview of the chapter's contents.

[Sample pages: a page from Chapter 1 with Note and Online Content boxes, and the opening of Chapter 3, Relational Model Characteristics]

Learning Objectives appear at the start of each chapter to help you monitor your understanding and progress through each chapter. Each chapter also ends with a summary section that recaps the key content for revision purposes.

Online Content boxes draw attention to relevant material on the online platform for this book.

Notes highlight important facts about the concepts introduced in the chapter.

[Sample pages: chapter Summary and Key Terms pages from Part I, Database Systems]

Summary Each chapter ends with a comprehensive summary that provides a thorough recap of the issues in each chapter, helping you to assess your understanding and revise key content.

Key Terms are listed at the end of the chapter and explained in full in a Glossary at the end of the book, enabling you to find explanations of key terms quickly.

[Sample pages: Further Reading, Review Questions and Problems pages from Chapter 1, The Database Approach]

Further Reading allows you to explore the subject further, and acts as a starting point for projects and assignments.

Problems become progressively more complex as students draw on the lessons learnt from the completion of preceding problems.

Review Questions help reinforce and test your knowledge and understanding, and provide a basis for group discussions and activities.

Dedication

To my son, Kona, whom I am so proud of, keep following your dreams.

To Craig, my best friend and patient husband. Thank you for supporting my crazy busy life, without you nothing would be possible.

In memory of my father, Frank Crockett, who inspired me to be the person I am today.

To my mother, Norma Crockett, who is the angel in my life. Thank you for always being there for me.

To my mother- and father-in-law Jackie and Bill Smith who have provided me with much love and support.

In memory of Leslie Crockett, a true gentleman and much-loved uncle.

To my family and friends, all of you whom have painted rainbows in my life. Much love and aloha to all.

Keeley


Crockett


Teaching & Learning Support Resources

Cengage's peer-reviewed content for higher and further education courses is accompanied by a range of digital teaching and learning support resources. The resources are carefully tailored to the specific needs of the instructor, student and the course. Examples of the kind of resources provided include:

A password-protected area for instructors with, for example, a test bank, PowerPoint slides and an instructor's manual.

An open-access area for students including, for example, online appendices, useful weblinks and glossary terms.

Lecturers: to discover the dedicated teaching digital support resources accompanying this textbook please register here for access: cengage.com/dashboard/#login

Students: to discover the dedicated learning digital support resources accompanying this textbook, please search for Database Principles: Fundamentals of Design, Implementation, and Management on: cengage.com

BE UNSTOPPABLE! Learn more at cengage.com


DATABASE PRINCIPLES


Part I

DATABASE SYSTEMS

1 The Database Approach
2 Data Models
3 Relational Model Characteristics
4 Relational Algebra and Calculus


BUSINESS VIGNETTE

THE RELATIONAL REVOLUTION
AN HISTORICAL JOURNEY

Until the late 1970s, databases stored large amounts of data in structures that were inflexible and difficult to navigate. Programmers needed to know what clients wanted to do with the data before the database was designed. Adding or changing the way the data were stored or analysed was time-consuming and expensive.

In 1970, Edgar "Ted" Codd, a mathematician employed by IBM, published a groundbreaking article entitled "A Relational Model of Data for Large Shared Data Banks". At the time, nobody realised that Codd's theories would spark a technological revolution on par with the development of personal computers and the internet. Don Chamberlin, co-inventor of SQL, the most popular database query language today, explains: "There was this guy Ted Codd who had some kind of strange mathematical notation, but nobody took it very seriously." Then Ted Codd organised a symposium, and Chamberlin listened as Codd reduced complicated five-page programs to one line. "And I said, 'Wow'," Chamberlin recalls.

The symposium convinced IBM to fund System R, a research project that built a prototype of a relational database, which would eventually lead to the creation of SQL and DB2. IBM, however, kept System R on the back burner for a number of years, which turned out to be a crucial decision, because the company had a vested interest in IMS, a reliable, high-end database system that had been released in 1968.

At about the same time as System R started up, two professors from the University of California at Berkeley, who had read Codd's work, established a similar project called Ingres. The competition between the two tight-knit groups fuelled a series of papers. Unaware of the market potential of this research, IBM allowed its staff to publish these papers. Among those reading the papers was Larry Ellison, who had just founded a small company called Software Development Laboratories. Recruiting programmers from System R and Ingres, and securing funding from the CIA and the Navy, Ellison was able to market the first SQL-based relational database in 1979, well before IBM. By 1983, the company (Software Development Laboratories) had released a portable version of the database, had grossed over 13 910 000 annually, and had changed its name to Oracle.


Spurred on by competition, IBM finally released SQL/DS, its first relational database, in 1980.1

In 2008, a group of leading database researchers met in Berkeley and issued a report declaring that the industry had reached an exciting turning point and was on the verge of another database revolution.2

In 2010, Oracle acquired MySQL as part of its acquisition of Sun. It has since maintained the free open-source MySQL Community Edition while providing several versions (Standard Edition, Enterprise Edition and Cluster Edition) for commercial customers. In 2019, the release of MySQL Document Store brought together the SQL and the NoSQL languages, enabling developers to link SQL relational tables to schema-less NoSQL databases.3 Oracle's latest offering is Oracle Database 19c, where the 'c' represents cloud; new versions now come out every year.

In our historical journey, we must also mention PostgreSQL, developed in 1986 as part of the POSTGRES project at the University of California at Berkeley. PostgreSQL4 is a free, open source, object-relational database that extends the traditional SQL language by allowing creation of new data types and functions, and the ability to write code in different programming languages. It is a strong competitor to MySQL, given that it has had over 33 years of active development.

Analysts, journalists and business leaders continually see new developments with data acquisition and its management, such as the explosion of unstructured data, the growing importance of business intelligence, and the emergence of cloud technologies, which may require the development of new database models. Although traditional relational databases meet rigorous standards for data integrity and consistency, they do not scale unstructured data as well as new database models such as NoSQL. NoSQL is also known as a non-relational database, which allows the storage and retrieval of unstructured data using a dynamic schema. A key question asked by database developers today is whether they need a NoSQL database or an SQL database for their application. For example, Twitter and Facebook, which do not require high levels of data consistency and integrity, have adopted NoSQL databases. In 2019, businesses are opting for SQL and NoSQL multiple database combinations, which suggests that one size does not fit all. As of March 2019, the most popular database management systems worldwide were Oracle, MySQL, Microsoft SQL and PostgreSQL.5

So, what is the future? Disruptive database technologies are required for business to remain competitive and the key is real-time data. Alternative database models such as cloud database platforms, which have the capability for real-time data analytics, are a certainty. Big data has a role to play as additional data sources must be processed using data pipelines, all in accordance with the new General Data Protection Regulation (GDPR) data regulations. The relational model will survive, but it will also adapt at unprecedented speed.

1 IBM and Oracle Trade Barbs over Databases, https://phys.org/news/2007-05-ibm-oracle-barbs-databases.html
2 Rakesh Agrawal et al., The Claremont Report on Database Research, http://db.cs.berkeley.edu/claremont/claremontreport08.pdf.
3 MySQL Editions, www.mysql.com/products/
4 PostgreSQL, www.postgresql.org/about/
5 Top 10 Databases for 2019, The Database Journal, www.databasejournal.com/features/oracle/slideshows/top-10-2019-databases.html


CHAPTER 1
The Database Approach

IN THIS CHAPTER, YOU WILL LEARN:

The difference between data and information
What a database is, what the different types of databases are, and why they are valuable assets for decision making
The importance of database design
How modern databases evolved from file systems
About flaws in file system data management
What the main components of the database system are and how a database system differs from a file system
The main functions of a database management system (DBMS)
The role of open source database systems
The importance of data governance and data quality

Preview

Good decisions require good information, which is derived from raw facts known as data. Data are likely to be managed most efficiently when they are stored in a database. In this chapter, you learn what a database is, what it does and why it yields better results than other data management methods. You will also learn about different types of databases and why database design is so important.

Databases evolved from computer file systems. Although file system data management is now largely outmoded, understanding the characteristics of file systems is important because they are the source of serious data management limitations. In this chapter, you will also learn how the database system approach helps eliminate most of the shortcomings of file system data management.

1.1 DATA VS INFORMATION

To understand what drives database design, you need to understand the difference between data and information. Data are raw facts. The word raw indicates that the facts have not yet been processed to reveal their meaning. For example, suppose that you want to know what the users of a computer lab think of its services. Typically, you would begin by surveying users to assess the computer lab's performance. Figure 1.1, Panel (a), shows the Web survey form that enables users to respond to your questions. When the survey form has been completed, the form's raw data are saved to a data repository, such as the one shown in Figure 1.1, Panel (b). Although you now have the facts in hand, they are not particularly useful in this format; reading page after page of zeros and ones is not likely to provide much insight. Therefore, you transform the raw data into a data summary like the one shown in Figure 1.1, Panel (c). It is now possible to get quick answers to questions such as, 'What is the composition of our lab's customer base?' In this case, you can quickly determine that most of your customers are first-year or second-year undergraduates (the two groups account for 38 per cent and 32 per cent of respondents). And, because graphics can enhance your ability to quickly extract meaning from data, you show the data summary bar graph in Figure 1.1, Panel (d).

FIGURE 1.1 Transforming raw data into information: (a) initial survey screen; (b) raw data; (c) information in summary format; (d) information in graphic format

Information is the result of processing raw data to reveal its meaning. Data processing may be as simple as organising data to reveal patterns or as complex as making forecasts or drawing inferences using statistical modelling. Such information can then be used as the foundation for decision making. For example, the data summary for each question on the survey form can point out the lab's strengths and weaknesses, helping you to make informed decisions to better meet the needs of lab customers.

Raw data must be properly formatted for storage, processing and presentation. For example, in Figure 1.1, Panel (c), the student classification is formatted to show the results based on the classifications undergraduate years 1 to 3, postgraduates and other. The respondents' yes/no responses may need to be converted to a Y/N format for data storage. More complex formatting is required when working with complex data types such as sounds, videos or images.

In this information age, production of accurate, relevant and timely information is the key to good decision making. In turn, good decision making is the key to business survival in a global market. We are now said to be entering the knowledge age.6 Data are the foundation of information, which is the bedrock of knowledge; that is, the body of information and facts about a specific subject. Knowledge implies familiarity, awareness and understanding of information as it applies to an environment. A key characteristic of knowledge is that new knowledge can be derived from old knowledge.

Let's summarise some key points:

Data constitute the building blocks of information.
Information is produced by processing data.
Information is used to reveal the meaning of data.
Accurate, relevant and timely information is the key to good decision making.
Good decision making is the key to organisational survival in a global environment.

Timely and useful information requires accurate data. Such data must be generated properly, and they must be stored in a format that is easy to access and process. And, like any basic resource, the data environment must be managed carefully. Data management is a discipline that focuses on the proper generation, storage and retrieval of data. Given the crucial role that data play, it should not surprise you that data management is a core activity for any business, government agency, service organisation or charity.
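The step from raw data to information described in this section can be illustrated with a short Python sketch. It is only an illustration under stated assumptions: the response values and the function name summarise are hypothetical and are not taken from the survey in Figure 1.1.

```python
from collections import Counter

# Hypothetical raw data: one student classification per survey respondent.
raw_responses = [
    "first-year", "second-year", "first-year", "postgraduate",
    "second-year", "first-year", "third-year", "first-year",
]


def summarise(responses):
    """Process raw responses into information: the percentage share of each category."""
    total = len(responses)
    if total == 0:
        return {}
    counts = Counter(responses)
    return {category: round(100 * n / total, 1) for category, n in counts.items()}


summary = summarise(raw_responses)

# Present the summary with the largest category first.
for category, share in sorted(summary.items(), key=lambda item: -item[1]):
    print(f"{category:<12} {share:5.1f} per cent")
```

The raw list on its own says little; the summary answers the question 'What is the composition of our customer base?' at a glance, which is the distinction between data and information drawn above.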

6 Peter Drucker coined the phrase 'knowledge worker' in 1959 in his book Landmarks of Tomorrow. In 1994, Ms Esther Dyson, Mr George Gilder, Dr George Keyworth and Dr Alvin Toffler introduced the concept of the knowledge age.

1.1.1 Data Quality and Data Governance

The quality of the data within the database is essential if the organisation is to make accurate short- and long-term business decisions. Data must be fit for purpose, and this often means that it can be used to develop new strategies which aim to increase the income generation of an organisation. Data quality can be examined at a number of different levels, including:

Accuracy: Is the data accurate and has it been obtained from a verifiable source?
Relevance: Is the data relevant to the organisation?
Completeness: Is the required data being stored?
Timeliness: Is the data updated frequently in order to meet the business requirements?

Uniqueness: Is the data unique without redundancy?
Unambiguous: Is the meaning of the data clear?

The above list is not exhaustive. Most countries have their own laws on the collecting, storage and processing of data, and organisations must adhere to them. One of the major changes is the General Data Protection Regulation (GDPR), which became a legal requirement for all organisations in Europe from 25 May 2018. GDPR includes detailed provisions to safeguard the rights of an individual. For example, Article 22 of the GDPR covers automated decision making, which includes profiling. Individuals have the right not to be subject to automated decision making unless explicit consent is given, and an individual who is subject to such a decision has the right to ask for an explanation of how the decision is reached. In South Africa, the Protection of Personal Information Act (POPIA) was signed into law in 2013. POPIA promotes the protection of personal information by public and private bodies.

Data governance is the term used to describe a strategy or methodology, defined by an organisation, which governs the availability, usability, integrity and security of its data. Each organisation will have its own data governance strategy. Creating a data governance strategy for implementation is a complex and time-consuming task that will involve many people working at different levels within the organisation. It will involve the development of a series of policies and procedures for managing data; for example, who owns the data and who is authorised to create, update and delete records. Master Data Management (MDM) is a component of a data governance strategy and provides its technological foundation. MDM ensures that all data complies with the policies and procedures, which allows the organisation to produce consistent and accurate reporting across all systems. Data quality tools, which include data profiling, auditing and monitoring, are often used as part of the process to keep track of the quality of data over time.

Once the strategy has been developed and put into operation, it should be regularly measured and monitored to ensure that the procedures are being followed and that the strategy is still relevant and up to date for the organisation.

1.2 INTRODUCING THE DATABASE AND THE DBMS

Efficient data management typically requires the use of a computer database. A database is a shared, integrated computer structure that stores a collection of:

End-user data, or raw facts of interest to the end user.
Metadata, or data about data, through which the end-user data are integrated and managed.

The metadata provides a description of the data characteristics and the set of relationships that link the data found within the database. In a sense, a database resembles a very well-organised electronic filing cabinet in which powerful software, known as a database management system, helps manage the cabinet's contents. A database management system (DBMS) is a collection of programs that manages the database structure and controls access to the data stored in the database.

1.2.1 Role and Advantages of the DBMS

Figure 1.2 illustrates that the DBMS serves as the intermediary between the user and the database. The DBMS receives all application requests and translates them into the complex operations required to fulfil those requests. The DBMS hides much of the database's internal complexity from the application programs and users. The application program might be written by a programmer using a programming language such as Python, Visual Basic, C++ or Java, or it might be created through a DBMS utility program.

FIGURE 1.2 The DBMS manages the interaction between the end user and the database
[Diagram: end users send application requests to the DBMS (database management system) and receive data in return; the DBMS manages the database structure, which holds metadata and end-user data such as Customers, Invoices and Products.]
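The distinction between end-user data and metadata can be made concrete with a minimal Python sketch that uses SQLite, the embedded DBMS shipped with Python's standard sqlite3 module. The table name customer, its columns and the sample row are assumptions made for illustration; they are not taken from Figure 1.2.

```python
import sqlite3

# An in-memory database: the DBMS (SQLite) manages both the structure and the data.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE customer ("
    " cust_id INTEGER PRIMARY KEY,"
    " cust_name TEXT NOT NULL,"
    " balance REAL)"
)
conn.execute(
    "INSERT INTO customer (cust_name, balance) VALUES (?, ?)",
    ("M. Cele", 350.0),
)

# End-user data: the raw facts of interest to the end user.
end_user_data = conn.execute(
    "SELECT cust_id, cust_name, balance FROM customer"
).fetchall()

# Metadata: data about the data. PRAGMA table_info returns one row per column
# in the form (cid, name, type, notnull, dflt_value, pk).
metadata = conn.execute("PRAGMA table_info(customer)").fetchall()
column_info = [(row[1], row[2]) for row in metadata]

print(end_user_data)
print(column_info)
```

Both requests go through the DBMS; the application never reads the underlying storage directly, which is the intermediary role described above.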

Having a DBMS between the end user's applications and the database offers some important advantages. First, the DBMS enables the data in the database to be shared among multiple applications or users. Second, the DBMS integrates the many different users' views of the data into a single all-encompassing data repository.

Because data are the crucial raw material from which information is derived, you need a good way of managing such data. As you will discover in this book, the DBMS helps make data management more efficient and effective. In particular, a DBMS provides advantages such as:

Improved data sharing. The DBMS helps create an environment in which end users have better access to more and better-managed data. Such access makes it possible for end users to respond quickly to changes in their environment.
Better data integration. Wider access to well-managed data promotes an integrated view of the organisation's operations and a clearer view of the big picture. It becomes much easier to see how actions in one segment of the company affect other segments.
Minimised data inconsistency. Data inconsistency exists when different versions of the same data appear in different places. For example, data inconsistency exists when a company's sales department stores a sales representative's name as 'Thobile Cele' and the company's personnel department stores that same person's name as 'Bathobile M. Cele', or when the company's regional sales office shows the price of product X as R390.00 in South African currency and its national sales office shows the same product's price as R350.00. The probability of data inconsistency is greatly reduced in a properly designed database.
Improved data access. The DBMS makes it possible to produce quick answers to ad hoc queries. From a database perspective, a query is a specific request for data manipulation (for example,


to read or update the data) issued to the DBMS. Simply put, a query is a question and an ad hoc query is a spur-of-the-moment question. The DBMS sends back an answer (called the query result set) to the application. For example, end users, when dealing with large amounts of sales data, might want quick answers to questions (ad hoc queries) such as:
   What was the volume of sales by product during the past six months?
   What is the sales bonus figure for each of our salespeople during the past three months?
   How many of our customers have credit balances of R5 000 (or 3 000) or more?
Improved decision making. Better-managed data and improved data access make it possible to generate better-quality information, on which better decisions are based.
Increased end-user productivity. The availability of data, combined with the tools that transform data into usable information, empowers end users to make quick, informed decisions that can be the difference between success and failure in the global economy.

The advantages of using a DBMS are not limited to the few just listed. In fact, you will discover many more advantages as you learn more about the technical details of databases and their proper design.
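An ad hoc query of the kind listed above (how many customers have a credit balance at or above some threshold) might look as follows. This is a sketch under stated assumptions: it uses SQLite through Python's standard sqlite3 module, and the customer table, its columns, the sample rows and the helper function name are all hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE customer (
        cust_id        INTEGER PRIMARY KEY,
        cust_name      TEXT,
        credit_balance REAL
    );
    INSERT INTO customer (cust_name, credit_balance) VALUES
        ('A. Dlamini', 7200.00),
        ('B. Naidoo',  1500.00),
        ('C. Botha',   5000.00);
    """
)


def customers_with_balance_at_least(connection, threshold):
    """Issue the query to the DBMS and return the single value in its result set."""
    row = connection.execute(
        "SELECT COUNT(*) FROM customer WHERE credit_balance >= ?",
        (threshold,),
    ).fetchone()
    return row[0]


count = customers_with_balance_at_least(conn, 5000)
print(count)
```

The threshold is passed as a parameter, so the same spur-of-the-moment question can be asked again with a different value without rewriting the query.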

1.2.2 Types of Databases

A DBMS can support many different types of databases. Databases can be classified according to the number of users supported, where the data are located, the type of data stored, the intended data usage and the degree to which the data are structured.

The number of users determines whether the database is classified as single-user or multi-user. A single-user database supports only one user at a time. In other words, if user A is using the database, users B and C must wait until user A is done. A single-user database that runs on a personal computer is called a desktop database. In contrast, a multi-user database supports multiple users at the same time. When the multi-user database supports a relatively small number of users (usually fewer than 50) or a specific department within an organisation, it is called a workgroup database. When the database is used by the entire organisation and supports many users (more than 50, usually hundreds) across many departments, the database is known as an enterprise database.

Location might also be used to classify the database. For example, a database that supports data located at a single site is called a centralised database. A database that supports data distributed across several different sites is called a distributed database. The extent to which a database can be distributed, and the way in which such distribution is managed, is addressed in detail in Chapter 14, Distributed Databases.

The most popular way of classifying databases today, however, is based on how they will be used and on the time sensitivity of the information gathered from them. For example, transactions such as product or service sales, payments and supply purchases reflect critical day-to-day operations. Such transactions must be recorded accurately and immediately. A database that is designed primarily to support a company's day-to-day operations is classified as an operational database, also referred to as an online transaction processing (OLTP), transactional or production database.

In contrast, analytical databases focus primarily on storing historical data and business metrics used exclusively for tactical or strategic decision making. Such analysis typically requires extensive data massaging (data manipulation) to produce information on which to base pricing decisions, sales forecasts, market positioning and so on. Analytical databases allow the end user to perform advanced data analysis of business data using sophisticated tools.

Typically, analytical databases comprise two main components: a data warehouse and an online analytical processing (OLAP) front end. The data warehouse is a specialised database that stores data in a format optimised for decision support. The data warehouse contains historical data obtained from operational databases as well as data from other external sources.

A data warehouse focuses primarily on storing data used to generate information required to make tactical or strategic decisions. Such decisions typically require extensive data massaging (data manipulation) to extract information to formulate pricing decisions, sales forecasts, market positioning, etc. Most decision-support data are based on historical data obtained from operational databases. Additionally, the data warehouse can store data derived from many sources. To make it easier to retrieve such data, the data warehouse structure is quite different from that of an operational or transactional database. The design, implementation and use of data warehouses are covered in detail in Chapter 15, Databases for Business Intelligence.

Online analytical processing is a set of tools that work together to provide an advanced data analysis environment for retrieving, processing and modelling data from the data warehouse. In recent times, this area of database application has grown in importance and usage, to the point at which it has evolved into its own discipline: business intelligence. The term business intelligence describes a comprehensive approach to capturing and processing business data with the purpose of generating information to support business decision making. (See Chapter 15, Databases for Business Intelligence.)

Databases can also be classified to reflect the degree to which the data are structured. Unstructured data are data that exist in their original (raw) state, that is, in the format in which they were collected. Therefore, unstructured data exist in a format that does not lend itself to the processing that yields information. Structured data are the result of taking unstructured data and formatting (structuring) such data to facilitate storage, use and the generation of information. You apply structure (format) based on the type of processing that you intend to perform on the data. Some data might not be ready (unstructured) for some types of processing, but they might be ready (structured) for other types of processing. For example, the data value 37890 might refer to a postal code, a sales value or a product code. If this value represents a postal code or a product code and is stored as text, you cannot perform mathematical computations with it. On the other hand, if this value represents a sales transaction, it needs to be formatted as numeric.

To illustrate the structure concept further, imagine a stack of printed paper invoices. If you want to merely store these invoices as images for future retrieval and display, you can scan them and save them in a graphic format. On the other hand, if you want to derive information such as monthly totals and average sales, such graphic storage would not be useful. Instead, you could store the invoice data in a (structured) spreadsheet format so that you can perform the requisite computations. Actually, most data you encounter are best classified as semi-structured. Semi-structured data are data that have already been processed to some extent. For example, if you look at a typical Web page, the data are presented to you in a prearranged format to convey some information.

The database types mentioned thus far focus on the storage and management of highly structured data. However, corporations are not limited to the use of structured data. They also use semi-structured and unstructured data. Just think of the valuable information that can be found on company emails, memos, documents such as procedures and rules, Web pages, and so on. Unstructured and semi-structured data storage and management needs are being addressed through a new generation of databases known as XML databases. Extensible Markup Language (XML) is a special language used to represent and manipulate data elements in a textual format. An XML database supports the storage and management of semi-structured XML data. XML databases will be discussed in more detail in Chapter 17, Database Connectivity and Web Technologies.

Table 1.1 compares features of several well-known database management systems.

12

PART I

Database

Systems

TABLE 1.1

Types of databases

1 Product

Number

Data Location

Of Users

XML

Data Usage

Multi-user

Single User

workgroup

X

X

X3

X

X

X

X

X

X

X

X3

X

X

X

X

X

X

X

MySQL

X

X

X

X

X

X

X

X

Oracle

X3

X

X

X

X

X

X

X

MS

enterprise

Centralised

Distributed

Operational

X

Analytical

X

X

Access MS SQL Server IBM

DB2

RDBMS

All the

database

commercial

DBMS, its system

applications

the

for

purpose,

any to

The there

MySQL look

general

main benefit of open source

define

and the

Perl

blocks for

the

most popular

PostgreSQL8

media

such

Over the

and

widespread

term

NoSQL9

Copyright Editorial

review

2020 has

PostGres

9

NoSQL

Cengage deemed

Learning. that

any

not

SQL) is

based

Available:

All suppressed

Rights

develop

the

source the

the

which

provided buy

by actual

database

database

database

will then

and

system

be released

for

A disadvantage

of open

Twitter grow

and

new breed

this

LinkedIn

exponentially

new

generally

on the traditional

relational

source

capture

is

a new

database

the

system

vast

as they

software

is

DBMS building

such

as

stick to the

organisations

is that it

to

does

not

systems.

as the

use

basis for the new

Social

media refers

interactions.

amounts

the

of

to

Websites

of data

about

specialised

end

database

has grown in sophistication

known

as

generation

model.

products

and

human

database

database

describe

basic

companies

on

and require

of

to

MySQL

and analysed.

of specialised

type

used

Web server,

commercial

always

LAMP

provides

technologies

anytime,

However,

The term

DBMS products

smaller

by large-scale

anywhere,

stack

management

vendor

ideal

product itself.

software.

Apache

software

database

required

and use the

You

a

NoSQL

database.

of database will learn

The

management

more about

NoSQL

NoSQL.

www.postgresql.org/ http://nosql-database.org/

Reserved. content

to

www.mysql.com/

Available: Available:

data

years, this

16 Big Data and

mysql.com

8

can

Linux,

this

of data are being stored

enable

Currently,

only

namely:

Together

makes them

durability

Instagram,

These

usage.

is

are

order

distribute

of the

World Wide Web and internet-based

that

past few

(Not

that

Chapter

users

support

use than large-scale This

great amounts

Facebook,

consumers.

systems.

7

generation,

Google,

and

systems

of the

ongoing

open source

quickly.

and

mobile technologies

as

users

Typically,

applications

Withthe emergence

Web and

choice,

any improvements,

software,

languages.

principles.

functionality

and

source

are easier to

database

the robust

social

is that

make

in

MySQL7 is an open

of their

software is that it is free to acquire development

open

websites.

database-centred

provide

and

MySQL)

a company

maintenance.

The idea

code

development

developing

fundamental

develop

in the

PHP/Python

MySQL and basic

and

1.1 (except

from

modify a database

product.

source

Table

public.

will be costs involved

used to

in

DBMS

and

in

investment

support

build

at the

shown

a significant

and ongoing users to

actual

the

systems

and require

which allows

improve

back

management

vendors


NOTE
Most of the database design, implementation and management issues addressed in this book are based on production (transactional) databases. The focus on production databases is based on two considerations. First, production databases are the databases most frequently encountered in common activities such as enrolling in a class, registering a car, buying a product or making a bank deposit or withdrawal. Second, data warehouse databases derive most of their data from production databases, and if production databases are poorly designed, the data warehouse databases based on them will lose their reliability and value as well.

1.3 WHY DATABASE DESIGN IS IMPORTANT

Database design refers to the activities that focus on the design of the database structure that will be used to store and manage end-user data. A database that meets all user requirements does not just happen; its structure must be designed carefully. In fact, database design is such a crucial aspect of working with databases that most of this book is dedicated to the development of good database design techniques. Even a good DBMS will perform poorly with a badly designed database.

Proper database design requires the designer to identify precisely the database's expected use. Designing a transactional database emphasises accurate and consistent data and operational speed. The design of a data warehouse database recognises the use of historical and aggregated data. Designing a database to be used in a centralised, single-user environment requires a different approach from that used in the design of a distributed, multi-user database. This book emphasises the design of transactional, centralised, single-user and multi-user databases. Chapters 14 and 15 also examine critical issues confronting the designer of distributed and data warehouse databases.

A well-designed database facilitates data management and generates accurate and valuable information. A poorly designed database is likely to become a breeding ground for difficult-to-trace errors that may lead to bad decision making, and bad decision making can lead to the failure of an organisation. Database design is simply too important to be left to luck. That's why university students study database design, why organisations of all types and sizes send personnel to database design seminars, and why database design consultants often make an excellent living.

1.4 HISTORICAL ROOTS: FILES AND DATA PROCESSING

Understanding what a database is, what it does and the proper way to use it can be clarified by considering what a database is not. A brief explanation of the evolution of file system data processing can be helpful in understanding the data access limitations that databases attempt to overcome. Understanding these limitations is relevant to database designers and developers because database technologies do not make these problems magically disappear; database technologies simply make it easier to create solutions that avoid these problems. Creating database designs that avoid the pitfalls of earlier systems requires that the designer understands these problems and how to avoid them; otherwise, the database technologies are no better (and are potentially even worse!) than the technologies and techniques they have replaced.


1.4.1 Manual File Systems

To be successful, an organisation must develop systems for handling core business tasks. Historically, such systems were often manual, paper-and-pencil systems. The papers within these systems were organised to facilitate the expected use of the data. Typically, this was accomplished through a system of file folders and filing cabinets. As long as a collection of data was relatively small and an organisation's business users had few reporting requirements, the manual system served its role well as a data repository. However, as organisations grew and as reporting requirements became more complex, keeping track of data in a manual file system became more difficult. Therefore, companies looked to computer technology for help.

1.4.2 Computerised File Systems

Generating reports from manual file systems was slow and cumbersome. In fact, some business managers faced government-imposed reporting requirements that led to weeks of intensive effort each quarter, even when a well-designed manual system was used. Therefore, a data processing (DP) specialist was hired to create a computer-based system that would track data and produce required reports. Initially, the computer files within the file system were similar to the manual files. A simple example of a customer data file for a small insurance company is shown in Figure 1.3. (You will discover later that the file structure shown in Figure 1.3, although typically found in early file systems, is unsatisfactory for a database.)

The description of computer files requires a specialised vocabulary. Every discipline develops its own terminology to enable its practitioners to communicate clearly. The basic file vocabulary shown in Table 1.2 will help you to understand subsequent discussions more easily.

Online Content
The databases used in the chapters are available on the online platform accompanying this book. Throughout the book, Online Content boxes highlight material related to chapter content located on the online platform. Please see the prelims for details on how to access these useful resources.

TABLE 1.2 Basic file terminology

Term | Definition

Data | Raw facts, such as a telephone number, a birth date, a customer name and a year-to-date (YTD) sales value. Data have little meaning unless they have been organised in some logical manner. The smallest piece of data that can be recognised by the computer is a single character, such as the letter A, the number 5 or a symbol such as /. A single character requires 1 byte of computer storage.

Field | A character or group of characters (alphabetic or numeric) that has a specific meaning. A field is used to define and store data.

Record | A logically connected set of one or more fields that describes a person, place or thing. For example, the fields that constitute a record for a customer named J. D. Rudd might consist of J. D. Rudd's name, address, phone number, date of birth, credit limit and unpaid balance.

File | A collection of related records. For example, a file might contain data about vendors of ROBCOR Company, or a file might contain the records for the students currently enrolled at Gigantic University.
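The vocabulary in Table 1.2 can be made concrete with a small Python sketch. The class name, the choice of three fields and every value below are assumptions made for illustration; Table 1.2 names the kinds of fields a customer record might hold but gives no values.

```python
from dataclasses import dataclass, astuple


@dataclass
class CustomerRecord:
    """A record: a logically connected set of fields describing one customer."""
    name: str            # field holding alphabetic characters
    phone: str           # field holding numeric characters and symbols
    credit_limit: float  # field holding a numeric value


# A file: a collection of related records (all values are invented).
customer_file = [
    CustomerRecord("J. D. Rudd", "555-0100", 1000.0),
    CustomerRecord("A. N. Other", "555-0101", 250.0),
]

# The smallest piece of data is a single character, stored here in 1 byte.
bytes_for_one_character = len("A".encode("ascii"))

# Number of fields in each record of this file.
field_count = len(astuple(customer_file[0]))
```

The nesting mirrors the table: characters make up fields, fields make up a record, and records make up a file.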

FIGURE 1.3 Contents of the CUSTOMER file

[Figure: a table of ten customer records with the columns C_NAME, C_PHONE, C_ADDRESS, C_POSTCODE, A_NAME, A_PHONE, TP, AMT and REN.]

C_NAME = Customer name
C_PHONE = Customer phone
C_ADDRESS = Customer address
C_POSTCODE = Customer postcode
A_NAME = Agent name
A_PHONE = Agent phone
TP = Insurance type
AMT = Insurance policy amount, in thousands of euro
REN = Insurance renewal date

Using the proper file terminology given in Table 1.2, you can identify the file components shown in Figure 1.3. The CUSTOMER file shown in Figure 1.3 contains ten records. Each record is composed of nine fields: C_NAME, C_PHONE, C_ADDRESS, C_POSTCODE, A_NAME, A_PHONE, TP, AMT and REN. The ten records are stored in a named file. Because the file in Figure 1.3 contains customer data, its filename is CUSTOMER.


When business users wanted data from the computerised file, they sent requests for the data to the DP specialist. For each request, the DP specialist had to create programs to retrieve the data from the file, manipulate it in whatever manner the user had requested, and present it as a printed report. If a request was for a report that had been run previously, the DP specialist could rerun the existing program and provide the printed results to the user. As other business users saw the new and innovative ways in which customer data were being reported, they wanted to be able to view their data in similar fashions. This generated more requests for the DP specialist to create more computerised files of other business data, which in turn meant that more data management programs had to be created, and more requests for reports. For example, the sales department at the insurance company created a file named SALES, which helped track daily sales efforts. The sales department's success was so obvious that the personnel department manager demanded access to the DP specialist to automate payroll processing and other personnel functions. Consequently, the DP specialist was asked to create the AGENT file shown in Figure 1.4. The data in the AGENT file were used to do electronic fund transfers (EFTs), keep track of taxes paid and summarise insurance coverage, among other tasks.

FIGURE 1.4 Contents of the AGENT file

A_NAME | A_PHONE | A_ADDRESS | POSTCODE | HIRED | YTD_PAY | YTD_IT | YTD_NI | YTD_SLS | DEP
Alex B. Alby | 0161-228-1249 | Deken Van Erpstraat 20, Best | 5492 | 01-Nov-2001 | 20 806.00 | 5 201.00 | 1 664.00 | 103 963.00 | 3
Nkita F. Brown | 27-21-410-7100 | West Quay Road, Waterfront, Cape Town | 8002 | 23-May-2004 | 25 230.00 | 6 308.00 | 2 018.00 | 108 844.00 | 0
Menzi T. Ndlovu | 0181-123-5589 | 452 Elm St., Parkview, Johannesburg | 2193 | 15-Jun-2003 | 18 169.00 | 4 542.00 | 1 453.00 | 99 548.00 | 2

A_NAME = Agent name
A_PHONE = Agent phone
A_ADDRESS = Agent address
POSTCODE = Agent postcode
HIRED = Agent date of hire
YTD_PAY = Year-to-date pay
YTD_IT = Year-to-date income tax paid
YTD_NI = Year-to-date national insurance paid
YTD_SLS = Year-to-date sales
DEP = Number of dependents

As the number of files increased, a small file system, like the one shown in Figure 1.5, evolved. Each file in the system used its own application program to store, retrieve and modify data. And each file was owned by the individual or the department that commissioned its creation.

As the file system grew, the demand for the DP specialist's programming skills grew even faster, and the DP specialist was authorised to hire additional programmers. The size of the file system also required a larger, more complex computer. The new computer and the additional programming staff caused the DP specialist to spend less time programming and more time managing technical and human resources. Therefore, the DP specialist's job evolved into that of a data processing (DP) manager, who supervised a DP department. In spite of these organisational changes, however, the DP department's primary activity remained programming, and the DP manager inevitably spent much time as a supervising senior programmer and program troubleshooter.


FIGURE 1.5 A simple file system

[Figure: a diagram showing the Sales department and the Personnel department, each with its own file management programs and file report programs, together with the CUSTOMER, SALES and AGENT files.]

1.5 PROBLEMS WITH FILE SYSTEM DATA MANAGEMENT

The file system method of organising and managing data was a definite improvement on a manual system, and the file system served a useful purpose in data management for over two decades, a very long timespan in the computer era. Nonetheless, many problems and limitations became evident in this approach. A critique of the file system method serves two major purposes:

Understanding the shortcomings of the file system enables you to understand the development of modern databases.

Many of the problems are not unique to file systems. Failure to understand such problems is likely to lead to their duplication in a database environment, even though database technology makes it easy to avoid them.

The following problems severely challenge the types of information that can be created from the data as well as the accuracy of the information:

Lengthy development times. The first and most glaring problem with the file system approach is that even the simplest data-retrieval task requires extensive programming. With the older file systems, programmers had to specify what must be done and how to do it. As you will learn in upcoming chapters, modern databases use a non-procedural data manipulation language that allows the user to specify what must be done without specifying how.

Difficulty in getting quick answers. The need to write programs to produce even the simplest reports makes ad hoc queries impossible. DP specialists who work with mature file systems often receive numerous requests for new reports. They are often forced to say that the report will be ready next week or even next month. If you need the information now, getting it next week or next month will not serve your information needs.


Complex system administration. System administration becomes more difficult as the number of files in the system expands. Even a simple file system with a few files requires creating and maintaining several file management programs (each file must have its own file management programs that allow the user to add, modify and delete records; to list the file contents; and to generate reports). Because ad hoc queries are not possible, the file reporting programs can multiply quickly. The problem is compounded by the fact that each department in the organisation owns its data by creating its own files.

Lack of security and limited data sharing. Another fault of a file system data repository is a lack of security and limited data sharing. Data sharing and security are closely related. Sharing data among multiple geographically dispersed users introduces a lot of security risks. In terms of creating data management and reporting programs, security and data-sharing features are difficult to program and are consequently often omitted in a file system environment. Such features include effective password protection, the ability to lock out parts of files or parts of the system itself, and other measures designed to safeguard data confidentiality. Even when an attempt is made to improve system and data security, the security devices tend to be limited in scope and effectiveness.

Extensive programming. Making changes to an existing file structure can be difficult in a file system environment. For example, changing just one field in the original CUSTOMER file would require a program that:

1 Reads a record from the original file.
2 Transforms the original data to conform to the new structure's storage requirements.
3 Writes the transformed data into the new file structure.
4 Repeats the preceding steps for each record in the original file.

In fact, any change to a file structure, no matter how minor, forces modifications in all of the programs that use the data in that file. Modifications are likely to produce errors (bugs), and additional time is spent using a debugging process to find those errors. Those limitations, in turn, lead to problems of structural and data dependence.

1.5.1 Structural and Data Dependence

A file system exhibits structural dependence; that is, access to a file is dependent on its structure. For example, adding a customer date-of-birth field to the CUSTOMER file shown in Figure 1.3 would require the four steps described in the previous section. Given this change, none of the previous programs will work with the new CUSTOMER file structure. Therefore, all of the file system programs must be modified to conform to the new file structure. In short, because the file system application programs are affected by change in the file structure, they exhibit structural dependence. Conversely, structural independence exists when it is possible to make changes in the file structure without affecting the application program's ability to access the data.

Even changes in the characteristics of data, such as changing a field from integer to decimal, require changes in all programs that access the file. Because all data access programs are subject to change when any of the file's data storage characteristics change (that is, changing the data type), the file system is said to exhibit data dependence. Conversely, data independence exists when it is possible to make changes in the data storage characteristics without affecting the program's ability to access the data.

The practical significance of data dependence is the difference between the logical data format (how the human being views the data) and the physical data format (how the computer sees the data). Any program that accesses a file system's file must tell the computer not only what to do, but also how to do it. Each program must contain lines that specify the opening of a specific file type, its record specification and its field definitions. Data dependence makes the file system extremely cumbersome from a programming and data management point of view.
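The four-step conversion program and the structural dependence it exposes can be sketched in Python. The fixed-width layouts, the field widths, the C_DOB field name and the sample values below are all assumptions made for illustration; they are not the layout of the CUSTOMER file in Figure 1.3.

```python
# Hypothetical fixed-width layouts: (field name, width in characters).
OLD_LAYOUT = [("C_NAME", 20), ("C_PHONE", 14)]
NEW_LAYOUT = [("C_NAME", 20), ("C_DOB", 11), ("C_PHONE", 14)]  # date of birth added


def parse_record(line, layout):
    """Slice one fixed-width line into fields according to a layout."""
    record, pos = {}, 0
    for name, width in layout:
        record[name] = line[pos:pos + width].strip()
        pos += width
    return record


def format_record(record, layout):
    """Write one record as a fixed-width line, padding missing fields with blanks."""
    return "".join(record.get(name, "").ljust(width)[:width] for name, width in layout)


def convert_file(old_lines):
    """Apply the four steps: read, transform, write, repeat for every record."""
    new_lines = []
    for line in old_lines:                                   # step 4: repeat
        record = parse_record(line, OLD_LAYOUT)              # step 1: read
        record["C_DOB"] = ""                                 # step 2: transform
        new_lines.append(format_record(record, NEW_LAYOUT))  # step 3: write
    return new_lines


# One invented customer record in the old structure.
old_file = [format_record({"C_NAME": "Customer One", "C_PHONE": "0161-555-0100"}, OLD_LAYOUT)]
new_file = convert_file(old_file)

# A report program that still assumes the old layout now reads the wrong positions.
stale = parse_record(new_file[0], OLD_LAYOUT)
fresh = parse_record(new_file[0], NEW_LAYOUT)
```

Here `stale["C_PHONE"]` no longer matches `fresh["C_PHONE"]`: the unmodified program slices the record at positions that belonged to the old structure, which is the sense in which every program that touches the file has to be revised after a structural change.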

1.5.2 Field Definitions and Naming Conventions

At first glance, the CUSTOMER file shown in Figure 1.3 appears to have served its purpose well: requested reports could usually be generated. But suppose you want to create a customer phone directory based on the data stored in the CUSTOMER file. Storing the customer name as a single field turns out to be a liability because the directory must break up the field contents to list the last names, first names and initials in alphabetical order. Or suppose you want to get a customer listing by area code. Including the area code in the phone number field is inefficient.

Similarly, producing a listing of customers by city is a more difficult task than is necessary. From the user's point of view, a much better (more flexible) record definition would be one that anticipates reporting requirements by breaking up fields into their component parts. Thus, the CUSTOMER file's fields might be listed as shown in Table 1.3.

TABLE 1.3 Sample customer file fields

Field | Contents | Sample entry
CUS_LNAME | Customer last name | Ramas
CUS_FNAME | Customer first name | Alfred
CUS_INITIAL | Customer initial | A
CUS_AREACODE | Customer area code | 1615
CUS_PHONE | Customer phone | 0161-234-5678
CUS_ADDRESS | Customer street or box number | 123 Green Meadow Lane
CUS_CITY | Customer city | East London
CUS_COUNTY | Customer county/district | Eastern Cape
CUS_POSTCODE | Customer postcode | 3001

Selecting proper field names is also important. For example, make sure that the field names are reasonably descriptive. In examining the file structure shown in Figure 1.3, it is not obvious that the field name REN represents the customer's insurance renewal date. Using the field name CUS_RENEW_DATE would be better for two reasons. First, the prefix CUS can be used as an indicator of the field's origin, which is the CUSTOMER file. Therefore, you know that the field in question yields a CUSTOMER property. Second, the RENEW_DATE portion of the field name is more descriptive of the field's contents. With proper naming conventions, the file structure becomes self-documenting. That is, by simply looking at the field names, you can determine which files the fields belong to and what information the fields are likely to contain.

Some software packages place restrictions on the length of field names, so it is wise to be as descriptive as possible within those restrictions. In addition, very long field names make it difficult to fit more than a few fields on a page, thus making output spacing a problem. For example, the field name CUSTOMER_INSURANCE_RENEWAL_DATE, while being self-documenting, is less desirable than CUS_RENEW_DATE.

Another problem in Figure 1.3's CUSTOMER file is the difficulty of finding desired data efficiently. The CUSTOMER file currently does not have a unique record identifier. For example, it is possible to have several customers named James G. Khumalo. Consequently, the addition of a CUS_ACCOUNT field that contains a unique customer account number would be appropriate.
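A short Python sketch shows why splitting fields into their component parts makes the phone directory and the area-code listing straightforward. The lower-case attribute names follow the field names in Table 1.3; the records, area codes and phone numbers are invented for illustration.

```python
from dataclasses import dataclass


@dataclass
class Customer:
    """One CUSTOMER record with the name and phone split into component fields."""
    cus_lname: str
    cus_fname: str
    cus_initial: str
    cus_areacode: str
    cus_phone: str


# Illustrative records; the area codes and phone numbers are invented.
customers = [
    Customer("Ramas", "Alfred", "A", "0161", "234-5678"),
    Customer("Khumalo", "James", "G", "0181", "555-0101"),
    Customer("Farriss", "Anne", "G", "0161", "555-0102"),
]


def phone_directory(records):
    """List customers alphabetically by last name, first name and initial."""
    ordered = sorted(records, key=lambda c: (c.cus_lname, c.cus_fname, c.cus_initial))
    return [f"{c.cus_lname}, {c.cus_fname} {c.cus_initial}." for c in ordered]


def listing_by_area_code(records, area_code):
    """Select customers in one area code without parsing the phone number."""
    return [c.cus_lname for c in records if c.cus_areacode == area_code]


directory = phone_directory(customers)
area_0161 = listing_by_area_code(customers, "0161")
```

With a single C_NAME or C_PHONE field, both functions would first have to pick each value apart, and any name or number that did not follow the expected pattern would be sorted or grouped incorrectly.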

The criticisms of field definitions and naming conventions shown in the file structure of Figure 1.3 are not unique to file systems. Because such conventions will prove to be important later, they are introduced early. You will revisit field definitions and naming conventions when you learn about database design in Chapter 5, Data Modelling with Entity Relationship Diagrams, and in Chapter 6, Data Modelling Advanced Concepts; and when you learn about database implementation issues in Chapter 11, Conceptual, Logical, and Physical Database Design. Regardless of the data environment, the design, whether it involves a file system or a database, must always reflect the designer's documentation needs and the end user's reporting and processing requirements. Both types of needs are best served by adhering to proper field definitions and naming conventions.

Online Content
Appendices A to R are available on the online platform accompanying this book.

NOTE
No naming convention can fit all requirements for all systems. Some words or phrases are reserved for the DBMS's internal use. For example, the name ORDER generates an error in some DBMSs. Similarly, your DBMS might interpret a hyphen (-) as a command to subtract. Therefore, the field CUS-NAME would be interpreted as a command to subtract the NAME field from the CUS field. Because neither field exists, you would get an error message. On the other hand, CUS_NAME would work fine because it uses an underscore.
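The behaviour described in the NOTE can be observed with SQLite through Python's standard sqlite3 module. SQLite is used here only as a convenient example of a DBMS and is an assumption of this sketch; other systems may accept or reject these names differently, and the table names are invented.

```python
import sqlite3


def run(sql):
    """Run one statement against a fresh in-memory database; report success or the error."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("CREATE TABLE customer (CUS_NAME TEXT)")
        conn.execute(sql)
        return "ok"
    except sqlite3.OperationalError as exc:
        return f"error: {exc}"
    finally:
        conn.close()


# An underscore is an ordinary character in a field name.
underscore = run("SELECT CUS_NAME FROM customer")

# A hyphen is read as subtraction: CUS minus NAME, and neither field exists.
hyphen = run("SELECT CUS-NAME FROM customer")

# ORDER is a reserved word in SQLite, so it cannot be used as a bare field name.
reserved = run("CREATE TABLE sales (ORDER TEXT)")
```

In this sketch `underscore` is `"ok"`, while `hyphen` and `reserved` both hold an error message, which matches the advice to prefer names such as CUS_NAME.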

1.5.3 Data Redundancy

The file system's structure and lack of security make it difficult to combine data from multiple sources. The organisational structure promotes the storage of the same basic data in different locations. (Database professionals use the term islands of information for such scattered data locations.) Because it is unlikely that data stored in different locations will always be updated consistently, the islands of information often contain different versions of the same data. For example, in Figures 1.3 and 1.4, the agent names and phone numbers occur in both the CUSTOMER and the AGENT files. You need only one correct copy of the agent names and phone numbers. Having them occur in more than one place produces data redundancy. Data redundancy exists when the same data are stored unnecessarily at different places.

Uncontrolled data redundancy sets the stage for:

Poor data security. Having multiple copies of data increases the chances of a copy of the data being susceptible to unauthorised access.

Data inconsistency. Data inconsistency exists when different and conflicting versions of the same data appear in different places. For example, suppose you change an agent's phone number or address in the AGENT file. If you forget to make corresponding changes in the CUSTOMER file, the files contain different data for the same agent. Reports will yield inconsistent results depending on which version of the data is used.

NOTE Data that display data inconsistency are also referred to as data that lack data integrity. Data integrity is defined as the condition in which all of the data in the database are consistent with the real-world events and conditions. In other words:

• Data are accurate; there are no data inconsistencies.

• Data are verifiable; the data will always yield consistent results.

• Data entry errors. Data entry errors are more likely to occur when complex entries (such as 12-digit phone numbers) are made in several different files and/or recur frequently in one or more files. In fact, the CUSTOMER file shown in Figure 1.3 contains just such an entry error: the third record in the CUSTOMER file has a transposed digit in the agent's phone number (27-12-410-7100 rather than 27-21-410-1700). It is possible to enter a non-existent sales agent's name and phone number into the CUSTOMER file, but customers are not likely to be impressed if the insurance agency supplies the name and phone number of an agent who does not exist. And should the personnel manager allow a non-existent agent to accrue bonuses and benefits? In fact, a data entry error such as an incorrectly spelled name or an incorrect phone number yields the same kind of data integrity problems.

• Data anomalies. The dictionary defines anomaly as an abnormality. Ideally, a field value change should be made in only a single place. Data redundancy, however, fosters an abnormal condition by forcing field value changes in many different locations. Look at the CUSTOMER file in Figure 1.3. If agent Nikita F. Brown decides to get married and move, the agent name, address and phone are likely to change. Instead of making just a single name and/or phone/address change in a single file (AGENT), you also must make the change each time that agent's name, phone number and address occur in the CUSTOMER file. You could be faced with the prospect of making hundreds of corrections, one for each of the customers served by that agent! The same problem occurs when an agent decides to quit. Each customer served by that agent must be assigned a new agent. Any change in any field value must be correctly made in many places to maintain data integrity. A data anomaly develops when all of the required changes in the redundant data are not made successfully. The data anomalies found in Figure 1.3 are commonly defined as follows:

  • Update anomalies. If agent Nikita F. Brown has a new phone number, that number must be entered in each of the CUSTOMER file records in which Ms Brown's phone number is shown. In this case, only three changes must be made. In a large file system, such changes might occur in hundreds or even thousands of records. Clearly, the potential for data inconsistencies is great.

  • Insertion anomalies. If only the CUSTOMER file existed, to add a new agent, you would also add a dummy customer data entry to reflect the new agent's addition. Again, the potential for creating data inconsistencies would be great.

  • Deletion anomalies. If you delete the customers Amy B. O'Brian, Saajidah Maharaj and Olette K. Snyman, then you will also delete Menzi T. Ndlovu's agent data. Clearly, this is not desirable.
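The update anomaly described in Section 1.5.3 can be sketched with a few in-memory records. Everything in this example is an assumption made for illustration: the customer labels, the phone numbers and the helper functions are invented and are not the contents of Figure 1.3. The point is only to contrast redundant storage of the agent's phone number with storing it once.

```python
def phones_on_file(customers, agent_name):
    """Return the distinct phone numbers recorded for one agent."""
    return {row["agent_phone"] for row in customers if row["agent"] == agent_name}


def update_agent_phone(customers, agent_name, new_phone):
    """Change every redundant copy of the agent's phone; return rows touched."""
    changed = 0
    for row in customers:
        if row["agent"] == agent_name:
            row["agent_phone"] = new_phone
            changed += 1
    return changed


# Redundant design: the agent's phone is repeated in every customer record.
customers = [
    {"customer": "Customer 1", "agent": "Nikita F. Brown", "agent_phone": "000-0001"},
    {"customer": "Customer 2", "agent": "Nikita F. Brown", "agent_phone": "000-0001"},
    {"customer": "Customer 3", "agent": "Nikita F. Brown", "agent_phone": "000-0001"},
    {"customer": "Customer 4", "agent": "Another Agent", "agent_phone": "000-0002"},
]

# Update anomaly: one copy is changed and the other copies are forgotten.
customers[0]["agent_phone"] = "000-9999"
inconsistent = phones_on_file(customers, "Nikita F. Brown")

# Repairing it means touching every redundant copy.
rows_changed = update_agent_phone(customers, "Nikita F. Brown", "000-9999")
consistent = phones_on_file(customers, "Nikita F. Brown")

# Non-redundant design: the phone is stored once and customers refer to the agent.
agents = {"A1": {"name": "Nikita F. Brown", "phone": "000-0001"}}
customer_refs = [{"customer": f"Customer {n}", "agent_id": "A1"} for n in (1, 2, 3)]
agents["A1"]["phone"] = "000-9999"  # a single change reaches every customer

print(sorted(inconsistent), rows_changed, sorted(consistent))
```

With the redundant layout, a forgotten record leaves two versions of the same fact on file; with a single stored copy there is nothing to forget.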

1.6 Database Systems

The problems inherent in file systems make using a database system very desirable. In traditional file systems, several files, such as the customer master file and the product master file, were stored separately. However, unlike the file system, the database consists of logically related data stored in a single logical data repository. (The 'logical' label reflects the fact that, although the data repository appears to be a single unit to the end user, its contents may actually be physically distributed among multiple data storage facilities and/or locations.) Since the database's data repository is a single logical unit, the database represents a major change in the way end-user data are stored, accessed and managed. The database's DBMS, shown in Figure 1.6, provides numerous advantages over file system management, shown in Figure 1.5, by making it possible to eliminate most of the file system's data inconsistency, data anomaly, data dependency and structural dependency problems. Better yet, the current generation of DBMS software stores not only the data structures, but also the relationships between those structures and the access paths to those structures, all in a central location. The current generation of DBMS software also takes care of defining, storing and managing all required access paths to those components.

Remember that the DBMS is just one of several crucial components of a database system. The DBMS may even be referred to as the database system's heart. However, just as it takes more than a heart to make a human being function, it takes more than a DBMS to make a database system function. In the sections that follow, you'll learn what a database system is, what its components are and how the DBMS fits into the database system picture.

FIGURE 1.6 Contrasting database and file systems
[Figure: two panels. 'A Database System': the Personnel, Sales and Accounting departments reach a single database (Employees, Customers, Sales, Inventory, Accounts) through the DBMS. 'A File System': the Personnel, Sales and Accounting departments each work with separate files (Employees, Customers, Sales, Inventory, Accounts).]

1.6.1 The Database System Environment

The term database system refers to an organisation of components that define and regulate the collection, storage, management and use of data within a database environment. From a general management point of view, the database system is composed of the five major parts shown in Figure 1.7: hardware, software, people, procedures and data.

Let's take a closer look at the five components shown in Figure 1.7:

Hardware. Hardware refers to all of the system's physical devices; for example, computers (microcomputers, mainframes, workstations and servers), storage devices, printers, network devices (hubs, switches, routers and fibre optics) and other devices (automated teller machines, ID readers, etc.).

FIGURE 1.7 The database system environment
[Figure: diagram relating procedures and standards, the system administrator, the database administrator, the database designer, analysts, programmers and end users to the hardware, application programs, DBMS utilities, DBMS and data.]

Software. Although the most readily identified software is the DBMS itself, to make the database system function fully, three types of software are needed: operating system software, DBMS software, and application programs and utilities:

• Operating system software manages all hardware components and makes it possible for all other software to run on the computers. Examples of operating system software include Microsoft Windows, Linux, Mac OS, UNIX and MVS.

• DBMS software manages the database within the database system. Some examples of DBMS software include Microsoft Access and SQL Server, Oracle Corporation's Oracle and IBM's DB2.

• Application programs and utility software are used to access and manipulate data in the DBMS and to manage the computer environment in which data access and manipulation take place. Application programs are most commonly used to access data found within the database, and to generate reports, tabulations and other information to facilitate decision making. Utilities are the software tools used to help manage the database system's computer components. For example, all of the major DBMS vendors now provide graphical user interfaces (GUIs) to help create database structures, control database access and monitor database operations.

People. This component includes all users of the database system. On the basis of primary job functions, five types of users can be identified in a database system: systems administrators, database administrators, database designers, systems analysts and programmers, and end users.

Each user type, described below, performs both unique and complementary functions:

• Systems administrators oversee the database system's general operations.

• Database administrators, also known as DBAs, manage the DBMS and ensure that the database is functioning properly.

Online Content The DBA's role is sufficiently important to warrant a detailed exploration in Appendix K, Database Administration, available on the online platform accompanying this book.

• Database designers design the database structure. They are, in effect, the database architects. If the database design is poor, even the best application programmers and the most dedicated DBAs cannot produce a useful database environment. As organisations strive to optimise their data resources, the database designer's job description has expanded to cover new dimensions and growing responsibilities.

• Systems analysts and programmers design and implement the application programs. They design and create the data entry screens, reports and procedures through which end users access and manipulate the database's data.

• End users are the people who use the application programs to run the organisation's daily operations. For example, sales clerks, supervisors, managers and directors are all classified as end users. High-level end users employ the information obtained from the database to make tactical and strategic business decisions.

Procedures. Procedures are the instructions and rules that govern the design and use of the database system. Procedures are a critical, although occasionally forgotten, component of the system. Procedures play an important role in a company because they enforce the standards by which business is conducted within the organisation and with customers. Procedures are also used to ensure that there is an organised way to monitor and audit both the data that enter the database and the information that is generated through the use of that data.

Data. The word data covers the collection of facts stored in the database. Since data are the raw material from which information is generated, the determination of which data are to be entered into the database and how those data are to be organised is a vital part of the database designer's job.

A database system adds a new dimension to an organisation's management structure. Just how complex this managerial structure is depends on the organisation's size, its functions and its corporate culture. Therefore, database systems can be created and managed at different levels of complexity and with varying adherence to precise standards. For example, compare a local gym membership system with an insurance claims system. The gym membership system may be managed by two people, the hardware used is probably a single microcomputer, the procedures are probably simple and the data volume tends to be low. The insurance claims system is likely to have at least one systems administrator, several full-time DBAs and many designers and programmers; the hardware probably includes several mainframes at multiple locations; the procedures are likely to be numerous, complex and rigorous; and the data volume tends to be high.

In addition to the different levels of database system complexity, managers must also take another important fact into account: database solutions must be cost-effective as well as tactically and strategically effective. Producing a million-rand solution to a thousand-rand problem is hardly an example of good database system selection or of good database design and management. Finally, the database technology already in use is likely to affect the selection of a database system.

1.6.2 DBMS Functions

A DBMS performs several important functions that guarantee the integrity and consistency of the data in the database. Most of those functions are transparent to end users, and most can be achieved only through the use of a DBMS. They include data dictionary management, data storage management, data transformation and presentation, security management, multi-user access control, backup and recovery management, data integrity management, database access languages and application programming interfaces, and database communication interfaces.

Data dictionary management. The DBMS stores definitions of the data elements and their relationships (metadata) in a data dictionary. In turn, all programs that access the data in the database work through the DBMS. The DBMS uses the data dictionary to look up the required data component structures and relationships, thus relieving you from having to code such complex relationships in each program. Additionally, any changes made in a database structure are automatically recorded in the data dictionary, thereby freeing you from having to modify all of the programs that access the changed structure. In other words, the DBMS provides data abstraction and it removes structural and data dependency from the system. (You will learn more about data abstraction in Chapter 2, Data Models.) For example, Figure 1.8 shows how Oracle's SQL Developer tool presents the data definition for the CUSTOMER table.

FIGURE 1.8 Illustrating metadata with Oracle's SQL Developer
[Figure: screenshot of the CUSTOMER table definition, with the column definitions labelled 'Metadata'.]
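The idea that a DBMS keeps the definitions of data elements alongside the data can be seen in miniature with SQLite, used here only as a convenient stand-in for the Oracle tooling in the figure. The CUSTOMER column names below are hypothetical and chosen for the example; PRAGMA table_info is SQLite's own way of reading back a table's stored definition.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical CUSTOMER definition; the column names are illustrative only.
conn.execute(
    """
    CREATE TABLE CUSTOMER (
        CUS_CODE  INTEGER PRIMARY KEY,
        CUS_LNAME TEXT NOT NULL,
        CUS_PHONE TEXT
    )
    """
)

# PRAGMA table_info returns one row per column:
# (cid, name, type, notnull, dflt_value, pk)
metadata = conn.execute("PRAGMA table_info(CUSTOMER)").fetchall()

for cid, name, col_type, notnull, default, pk in metadata:
    flags = []
    if notnull:
        flags.append("NOT NULL")
    if pk:
        flags.append("PRIMARY KEY")
    print(f"{name:<10} {col_type:<8} {' '.join(flags)}")

column_names = [row[1] for row in metadata]
```

A program that reads the definition this way does not need the column list hard-coded, which is the kind of structural independence the data dictionary is meant to provide.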

Data storage management. The DBMS creates and manages the complex structures required for data storage, thus relieving you of the difficult task of defining and programming the physical data characteristics. A modern DBMS system provides storage not only for the data, but also for related data entry forms or screen definitions, report definitions, data validation rules, procedural code, structures to handle video and picture formats, etc. Data storage management is also important for database performance tuning. Performance tuning relates to the activities that make the database perform more efficiently in terms of storage and access speed. Although the user sees the database as a single data storage unit, the DBMS actually stores the database in multiple physical data files (see Figure 1.9). Such data files may even be stored on different storage media. Therefore, the DBMS doesn't have to wait for one disk request to finish before the next one starts. In other words, the DBMS can fulfil database requests concurrently. Data storage management and performance tuning issues are addressed in Chapter 13, Managing Database and SQL Performance.

FIGURE 1.9 Illustrating data storage management with Oracle
[Figure: Oracle Enterprise Manager Express screenshot. Database Name: PRODORA. The Oracle Enterprise Manager Express GUI shows the data storage management characteristics for the PRODORA database. The PRODORA database is actually stored in six physical datafiles organised into six logical tablespaces located on the E: drive of the database server computer. The Oracle Enterprise Manager Express interface also shows the amount of storage space used by each of the datafiles.]

Data transformation and presentation. The DBMS transforms entered data to conform to required data structures. The DBMS relieves you of the chore of making a distinction between the logical data format and the physical data format. That is, the DBMS formats the physically retrieved data to make it conform to the user's logical expectations. For example, imagine an enterprise database used by a multinational company. An end user in South Africa would expect to enter data such as 11 July 2020 as 11/07/2020. In contrast, the same date would be entered in the United States as 07/11/2020. Regardless of the data presentation format, the DBMS must manage the date in the proper format for each country.
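The date example in the data transformation paragraph above can be sketched as follows. This is a minimal illustration rather than how any particular DBMS does it: the DATE_FORMATS table, the country codes and the two helper functions are assumptions made for the example. The idea is that one stored value is kept in a neutral form and converted to each user's expected presentation.

```python
from datetime import date, datetime

# Hypothetical presentation formats, keyed by an illustrative country code.
DATE_FORMATS = {
    "ZA": "%d/%m/%Y",  # day/month/year, as a South African user would type it
    "US": "%m/%d/%Y",  # month/day/year, as a United States user would type it
}


def parse_local(text, country):
    """Convert a locally formatted date string into one stored date value."""
    return datetime.strptime(text, DATE_FORMATS[country]).date()


def format_local(value, country):
    """Present a stored date value in the format a given country expects."""
    return value.strftime(DATE_FORMATS[country])


# 11 July 2020 as entered by a user in South Africa.
stored = parse_local("11/07/2020", "ZA")

za_view = format_local(stored, "ZA")
us_view = format_local(stored, "US")

print(stored.isoformat(), za_view, us_view)
```

Both users see the same stored date; only the presentation differs.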

Security management. The DBMS creates a security system that enforces user security and data privacy. Security rules determine which users can access the database, which data items each user can access, and which data operations (read, add, delete or modify) the user can perform. This is especially important in multi-user database systems where many users access the database simultaneously. All database users may be authenticated to the DBMS through a username and password or through biometric authentication such as a fingerprint scan. The DBMS uses this information to assign access privileges to various database components, such as queries and reports.

Online Content Appendix K, Database Administration, examines data security and privacy issues in greater detail and is available on the online platform accompanying this book.

Multi-user access control. To provide data integrity and data consistency, the DBMS uses sophisticated algorithms to ensure that multiple users can access the database concurrently without compromising the integrity of the database. Chapter 12, Managing Transactions and Concurrency, covers the details of the multi-user access control.

Backup and recovery management. The DBMS provides backup and data recovery to ensure data safety and integrity. Current DBMS systems provide special utilities that allow the DBA to perform routine and special backup and restore procedures. Recovery management deals with the recovery of the database after a failure, such as a bad sector in the disk or a power failure. Such capability is critical to preserving the database's integrity. Appendix K, Database Administration (see online platform), covers backup and recovery issues.

Data integrity management. The DBMS promotes and enforces integrity rules, thus minimising data redundancy and maximising data consistency. The data relationships stored in the data dictionary are used to enforce data integrity. Ensuring data integrity is especially important in transactional database systems. Data integrity and transaction management issues are addressed in Chapter 8, Beginning Structured Query Language, and Chapter 12, Managing Transactions and Concurrency.

Database access languages and application programming interfaces. The DBMS provides data access through a query language. A query language is a non-procedural language, one that lets the user specify what must be done without having to specify how it is to be done. Structured Query Language (SQL) is the de facto query language and data access standard supported by the majority of DBMS vendors. Chapter 8, Beginning Structured Query Language, and Chapter 9, Procedural Language SQL and Advanced SQL, address the use of SQL. The DBMS also provides application programming interfaces to procedural languages such as COBOL, C, Java, Visual Basic.NET and C#. The DBMS also provides administrative utilities used by the DBA and the database designer to create, implement, monitor and maintain the database.

Database communication interfaces. Current-generation DBMSs accept end-user requests via multiple, different network environments. For example, the DBMS might provide access to the database via the internet through the use of Web browsers such as Chrome, Firefox or Edge. In this environment, communications can be accomplished in several ways:

• End users can generate answers to queries by filling in screen forms through their preferred Web browser.

• The DBMS can automatically publish predefined reports on a website.

• The DBMS can connect to third-party systems to distribute information via email or other productivity applications.

Database communication interfaces are examined in greater detail in Chapter 14, Distributed Databases, in Chapter 17, Database Connectivity and Web Technologies, and in Appendix H, Databases in e-Commerce (see online platform).
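The distinction drawn above between a non-procedural query language and procedural code can be made concrete with a short sketch. It assumes SQLite through Python's sqlite3 module; the CUSTOMER table, its columns and the sample rows are invented for the example. The first query states what is wanted and leaves the how to the DBMS; the second does the same job by spelling out each step in the host language.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE CUSTOMER (CUS_NAME TEXT, AGENT_CODE INTEGER)")
conn.executemany(
    "INSERT INTO CUSTOMER VALUES (?, ?)",
    [("Customer A", 501), ("Customer B", 502), ("Customer C", 501)],
)

# Non-procedural: describe WHAT is wanted; the DBMS decides HOW to find it.
declarative = [
    row[0]
    for row in conn.execute(
        "SELECT CUS_NAME FROM CUSTOMER WHERE AGENT_CODE = ? ORDER BY CUS_NAME",
        (501,),
    )
]

# Procedural: the program itself walks the rows, tests each one and sorts.
procedural = []
for name, agent_code in conn.execute("SELECT CUS_NAME, AGENT_CODE FROM CUSTOMER"):
    if agent_code == 501:
        procedural.append(name)
procedural.sort()

print(declarative)
print(procedural)
```

Both approaches return the same customers, but only the SQL version keeps working unchanged if the DBMS later adds an index or changes how the table is stored.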

1.6.3 Managing the Database System: A Shift in Focus

The introduction of a database system provides a framework in which strict procedures and standards can be enforced. Consequently, the role of the human component changes from an emphasis on programming to a focus on the broader aspects of managing the organisation's data resources and on the administration of the complex database software itself.

The database system makes it possible to tackle far more sophisticated uses of the data resources as long as the database is designed accordingly. The kinds of data structures created within the database and the extent of the relationships among them play a powerful role in determining the effectiveness of the database system.

Although the database system yields considerable advantages over previous data management approaches, database systems do impose significant overheads. For example:

• Increased costs. Database systems require sophisticated hardware and software and highly skilled personnel. The cost of maintaining the hardware, software and personnel required to operate and manage a database system can be substantial.

• Management complexity. Database systems interface with many different technologies and have a significant impact on a company's resources and culture. The changes introduced by the adoption of a database system must be properly managed to ensure that they help advance the company's objectives. Given the fact that database systems hold crucial company data that are accessed from multiple sources, security issues must be assessed constantly.

• System maintenance. To maximise the efficiency of the database system, you must keep your system current. Therefore, you must perform frequent updates and apply the latest patches and security measures to all components. Since database technology advances rapidly, personnel training costs tend to be significant.

• Vendor dependence. Given the heavy investment in technology and personnel training, companies may be reluctant to change database vendors. As a consequence, vendors are less likely to offer pricing point advantages to existing customers and those customers may be limited in their choice of database system components.

1.7 Preparing for Your Database Professional Career

In this chapter, you were introduced to the concepts of data, information, databases and DBMSs. You also learnt that, regardless of what type of database you use (OLTP or OLAP), or what type of database environment you are working in (for example, Oracle, Microsoft or IBM), the success of a database system greatly depends on how well the database structure is designed. Throughout this book, you will learn the building blocks that lay the foundation for your career as a database professional. Understanding these building blocks and developing the skills to use them effectively will prepare you to work with databases at many different levels within an organisation. A small sample of such career opportunities is shown in Table 1.4.

TABLE 1.4 Database career opportunities

Job Title | Description | Sample Skills Required
Database developer | Creates and maintains database-based applications | Programming, database fundamentals, SQL
Database designer | Designs and maintains databases | Systems design, database design, SQL
Database administrator | Manages and maintains DBMS and databases | Database fundamentals, SQL, vendor courses
Database analyst | Develops databases for decision support reporting | SQL, query optimisation, data warehouses
Database architect | Designs and implements database environments (conceptual, logical and physical) | DBMS fundamentals, data modelling, SQL, hardware knowledge, etc.
Database consultant | Helps companies leverage database technologies to improve business processes and achieve specific goals | Database fundamentals, data modelling, database design, SQL, DBMS, hardware, vendor-specific technologies, etc.
Database security officer | Implements security policies for data administration | DBMS fundamentals, database administration, SQL, data security technologies, etc.
Cloud Computing Data Architect | Design and implement the infrastructure for next-generation cloud database systems | Internet technologies, cloud storage technologies, data security, performance tuning, large databases, etc.
Data Scientist | Analyze large amounts of varied data to generate insights, relationships, and predictable behaviors | Data analysis, statistics, advanced mathematics, SQL, programming, data mining, machine learning, data visualization

As you also learnt in this chapter, database technologies are constantly evolving to address new challenges such as large databases, semi-structured and unstructured data, increasing processing speed and lowering costs. While database technologies can change quickly, the fundamental concepts and skills do not. It is our goal that, after you learn the database essentials in this book, you will be ready to apply your knowledge and skills to work with traditional OLTP and OLAP systems as well as cutting-edge, complex database technologies such as:

• Very large databases (VLDB). Many vendors are addressing the need for databases that support large amounts of data, usually in the petabyte range. (A petabyte is more than 1 000 terabytes.) VLDB vendors include Oracle Exadata, IBM's Netezza, Greenplum, HP's Vertica and Teradata. VLDB are now being overtaken in market interest by Big Data databases.

• Big Data databases. Products such as Cassandra (Facebook) and Bigtable (Google) are using columnar database technologies to support the needs of database applications that manage large amounts of non-tabular data. See more about this topic in Chapter 2.

• In-memory databases. Most major database vendors also offer some type of in-memory database support to address the need for faster database processing. In-memory databases store most of their data in primary memory (RAM) rather than in slower secondary storage (hard disks). In-memory databases include IBM's solidDB and Oracle's TimesTen.

• Cloud databases. Companies can now use cloud database services to add database systems to their environment quickly, while simultaneously lowering the total cost of ownership of a new DBMS. A cloud database offers all the advantages of a local DBMS, but instead of residing within your organisation's network infrastructure, it resides on the internet. See more about this topic in Chapter 14.

We address some of these topics in this book, but not all; no single book can cover the entire realm of database technologies. This book's primary focus is to help you learn database fundamentals, develop your database design skills and master your SQL skills so you will have a head start in becoming a successful database professional. However, you first need to learn about the tools at your disposal. In the next chapter, you will learn different approaches to data management and how these approaches influence your designs.

SUMMARY

Data are raw facts. Information is the result of processing data to reveal its meaning. Accurate, relevant and timely information is the key to good decision making, and good decision making is the key to organisational survival in a global environment.

Data are usually stored in a database. To implement a database and to manage its contents, you need a database management system (DBMS). The DBMS serves as the intermediary between the user and the database. The database contains the data you have collected and data about data, known as metadata.

Database design defines the database structure. A well-designed database facilitates data management and generates accurate and valuable information. A poorly designed database can lead to bad decision making, and bad decision making can lead to the failure of an organisation.

Databases evolved from manual and then computerised file systems. In a file system, data are stored in independent files, each requiring its own data management programs. Although this method of data management is largely outmoded, understanding its characteristics makes database design easier to understand. Awareness of the problems of file systems can help you avoid similar problems with DBMSs.

Some limitations of file system data management are that it requires extensive programming, system administration can be complex and difficult, making changes to existing structures is difficult, and security features are likely to be inadequate. Also, independent files tend to contain redundant data, leading to problems of structural and data dependency.

Database management systems were developed to address the file system's inherent weaknesses. Rather than depositing data within independent files, a DBMS presents the database to the end user as a single data repository. This arrangement promotes data sharing, thus eliminating the potential problem of islands of information. In addition, the DBMS enforces data integrity, eliminates redundancy and promotes data security.

Open source DBMS software allows users to develop the database system for any purpose, look at the source code and make any improvements, which will then be released back to the general public. Open source DBMSs such as MySQL are currently free to acquire and use, making them ideal for smaller companies and organisations to develop database-centred applications quickly.

KEY TERMS

ad hoc query
analytical database
business intelligence
centralised database
data
data anomaly
data dependence
data dictionary
data governance
data inconsistency
data independence
data integrity
data management
data processing (DP) manager
data processing (DP) specialist
data quality
data redundancy
data warehouse
database
database design
database management system (DBMS)
database system
desktop database
distributed database
enterprise database
Extensible Markup Language (XML)
field
file
information
islands of information
knowledge
logical data format
metadata
multi-user database
NoSQL
online analytical processing (OLAP)
online transaction processing (OLTP)
open source
operational database
performance tuning
physical data format
production database
query
query language
query result set
record
semi-structured
single-user database
social media
structural dependence
structural independence
Structured Query Language (SQL)
transactional database
workgroup database
XML database

FURTHER READING

Codd, E.F. The Capabilities of Relational Database Management Systems. IBM Research Report, RJ3132, 1981.
Date, C.J. The Database Relational Model, A Retrospective Review and Analysis: a Historical Account and Assessment of E.F. Codd's Contribution to the Field of Database Technology. Addison-Wesley, 2000.
Date, C.J. An Introduction to Database Systems, 8th edition. Addison-Wesley, 2003.
Date, C.J. Date on Database: Writings 2000-2006. Apress, 2006.

Online Content: Answers to selected Review Questions and Problems for this chapter are available on the online platform accompanying this book.

REVIEW QUESTIONS

1 Discuss each of the following terms:
a data
b field
c record
d file
2 What is data redundancy and which characteristics of the file system can lead to it?
3 Discuss the lack of data independence in file systems.
4 What is a DBMS, and what are its functions?
5 What is structural independence, and why is it important?
6 Explain the difference between data and information.
7 What is the role of a DBMS, and what are its advantages?
8 List and describe the different types of databases.
9 What are the main components of a database system?
10 What is metadata?
11 Explain why database design is important.
12 What are the potential costs of implementing a database system?
13 Use examples to compare and contrast structured and unstructured data. Which type is more prevalent in a typical business environment?
14 What are the six levels on which the quality of data can be examined?
15 Explain what is meant by data governance.

PROBLEMS

Online Content: The file structures you see in this problem set are simulated in a Microsoft Access database named Ch01_Problems, available on the online platform for this book.

Given the file structure shown in Figure P1.1, answer Problems 1-4.

FIGURE P1.1 The file structure for Problems 1-4

PROJECT_CODE | PROJECT_MANAGER | MANAGER_PHONE | MANAGER_ADDRESS | PROJECT_BID_PRICE
21-5Z | Holly B. Naidu | 33-5-59200506 | 180 Boulevard Dr, Phoenix, 64700 | 13 179 975.00
25-2D | Jane D. Grant | 0181-898-9909 | 218 Clark Blvd., London, NW3 TRY | 9 787 037.00
25-5A | Menzi F. Zulu | 0181-227-1245 | 124 River Dr., Durban, 4001 | 25 458 005.00
25-9T | Holly B. Naidu | 33-5-59200506 | 180 Boulevard Dr, Phoenix, 64700 | 16 887 181.00
27-4Q | Menzi F. Zulu | 0181-227-1245 | 124 River Dr., Durban, 4001 | 8 078 124.00
29-2D | Holly B. Naidu | 33-5-59200506 | 180 Boulevard Dr, Phoenix, 64700 | 20 014 885.00
31-7P | William K. Moor | 39-064885889 | Via Valgia Silvilla 23, Roma, 00179 | 44 516 677.00

1 How many records does the file contain, and how many fields are there per record?
2 What problem would you encounter if you wanted to produce a listing by city? How would you solve this problem by altering the file structure?
3 If you wanted to produce a listing of the file contents by last name, area code, city, county or postal code, how would you alter the file structure?
4 What data redundancies do you detect, and how could those redundancies lead to anomalies?

FIGURE P1.2 The file structure for Problems 5-8

PROJ_NUM | PROJ_NAME | EMP_NUM | EMP_NAME | JOB_CODE | JOB_CHG_HOUR | PROJ_HOURS | EMP_PHONE
1 | Hurricane | 101 | John D. Dlamini | EE | 65.00 | 13.3 | 31-20-6226060
1 | Hurricane | 105 | David F. Schwann | CT | 40.00 | 16.2 | 0191-234-1123
1 | Hurricane | 110 | Anne R. Ramoras | CT | 40.00 | 14.3 | 34-934412463
2 | Coast | 101 | John D. Dlamini | EE | 65.00 | 19.8 | 31-20-6226060
2 | Coast | 108 | June H. Ndlovu | EE | 65.00 | 17.5 | 0161-554-7812
3 | Satellite | 110 | Anne R. Ramoras | CT | 42.00 | 11.6 | 34-934412463
3 | Satellite | 105 | David F. Schwann | CT | 6.00 | 23.4 | 0191-234-1123
3 | Satelite | 123 | Mary D. Chen | EE | 65.00 | 19.1 | 0181-233-5432
3 | Satellite | 112 | Allecia R. Smith | BE | 65.00 | 20.7 | 0181-678-6879

5 Identify and discuss the serious data redundancy problems exhibited by the file structure shown in Figure P1.2.
6 Looking at the EMP_NAME and EMP_PHONE contents in Figure P1.2, which change(s) would you recommend?
7 Identify the different data sources in the file you examined in Problem 5.
8 Given your answer to Problem 7, which new files should you create to help eliminate the data redundancies found in the file shown in Figure P1.2?

FIGURE P1.3 The file structure for Problems 9-10

BUILDING_CODE | ROOM_CODE | TEACHER_LNAME | TEACHER_FNAME | TEACHER_INITIAL | DAYS_TIME
KOM | 204E | Mbhato | Horace | G | MWF 8:00-8:50
KOM | 123 | Adam | Maria | L | MWF 8:00-8:50
LDB | 504 | Patroski | Donald | J | TTh 1:00-2:15
KOM | 34 | Hawkins | Anne | W | MWF 10:00-10:50
JKP | 225B | Risell | James | | TTh 9:00-10:15
LDB | 301 | Robertson | Jeanette | P | TTh 9:00-10:15
KOM | 204E | Adam | Maria | I | MWF 9:00-9:50
LDB | 504 | Mbhato | Horace | G | TTh 1:00-2:15
KOM | 34 | Adam | Maria | L | MWF 11:00-11:50
LDB | 504 | Patroski | Donald | J | MWF 2:00-2:50

9 Identify and discuss the serious data redundancy problems exhibited by the file structure shown in Figure P1.3. (The file is meant to be used as a teacher class assignment schedule. One of the many problems with data redundancy is the likely occurrence of data inconsistencies: two different initials have been entered for the teacher named Maria Adam.)
10 Given the file structure shown in Figure P1.3, which problem(s) might you encounter if building KOM were deleted?
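The inconsistency mentioned in Problem 9 (two different initials for Maria Adam) is the kind of error that can be found mechanically once the file is read as rows. The Python sketch below is only an illustration, not part of the problem set: the function name conflicting_initials is an assumption, and the rows hold just the three teacher columns as read from Figure P1.3.

```python
from collections import defaultdict

# (TEACHER_LNAME, TEACHER_FNAME, TEACHER_INITIAL) values as read from Figure P1.3.
# An empty string stands for a missing initial.
TEACHER_ROWS = [
    ("Mbhato", "Horace", "G"),
    ("Adam", "Maria", "L"),
    ("Patroski", "Donald", "J"),
    ("Hawkins", "Anne", "W"),
    ("Risell", "James", ""),
    ("Robertson", "Jeanette", "P"),
    ("Adam", "Maria", "I"),
    ("Mbhato", "Horace", "G"),
    ("Adam", "Maria", "L"),
    ("Patroski", "Donald", "J"),
]


def conflicting_initials(rows):
    """Return {(last, first): sorted initials} for names stored with more than one initial."""
    seen = defaultdict(set)
    for last, first, initial in rows:
        seen[(last, first)].add(initial)
    return {name: sorted(values) for name, values in seen.items() if len(values) > 1}


print(conflicting_initials(TEACHER_ROWS))
```

Because the teacher's name is repeated in every row that mentions her, nothing in the file itself prevents the copies from drifting apart; a check like this can only report the damage after the fact.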

CHAPTER 2 Data Models

IN THIS CHAPTER, YOU WILL LEARN:

Why data models are important
About the basic data-modelling building blocks
What business rules are and how they influence database design
How the major data models evolved
How data models can be classified by level of abstraction

PREVIEW

This chapter examines data modelling. Data modelling is the first step in the database design journey, serving as a bridge between real-world objects and the database that resides in the computer.

One of the most pressing problems of database design is that designers, programmers and end users see data in different ways. Consequently, different views of the same data can lead to database designs that do not reflect an organisation's actual operation, failing to meet end-user needs and data efficiency requirements. To avoid such failures, database designers must obtain a precise description of the nature of the data and the many uses of that data within the organisation. Communication among database designers, programmers and end users should be as free of ambiguities as possible. Data modelling clarifies such communications by reducing the complexities of database design to more easily understood abstractions that define entities and the relations among them.

First, you will learn what some of the basic data-modelling concepts are and how current data models developed from earlier models. Tracing the development of those database models will help you understand the database design and implementation issues that are addressed in the rest of this book. Second, you will be introduced to the entity relationship (ER) model, a common data modelling technique, and to the entity relationship diagram (ERD). There are a number of different notations used to draw ER diagrams. Whilst traditional ER model notations such as Chen and Crow's Foot notation are still being used, the unified modelling language (UML) notation is the new industry standard. Next, you will be introduced to the relational model and the object-oriented model. Then, you will be briefly introduced to emerging NoSQL data models, which need to manage very large social media data sets efficiently and effectively. Finally, you will learn how different degrees of data abstraction help reconcile varying views of the same data.

2.1 THE IMPORTANCE OF DATA MODELS

Traditionally, database designers relied on good judgement to help them develop a good design. Unfortunately, good judgement is often in the eye of the beholder, and it often develops after much trial and error. Fortunately, data models (relatively simple representations, usually graphical, of more complex real-world data structures), bolstered by powerful database design tools, have made it possible to diminish the potential for errors in database design substantially. In general terms, a model is an abstraction of a more complex real-world object or event. A model's main function is to help you understand the complexities of the real-world environment. Within the database environment, a data model represents data structures and their characteristics, relationships, constraints and transformations.

NOTE
The terms data model and database model are often used interchangeably. In this book, the term database model will be used to refer to the implementation of a data model in a specific database system.

Data models can facilitate interaction among the designer, the applications programmer and the end user. A well-developed data model can even foster improved understanding of the organisation for which the database design is developed. This important aspect of data modelling was summed up neatly by a client whose reaction was as follows: 'I created this business, I worked with this business for years, and this is the first time I've really understood how all the pieces really fit together.'

The importance of data modelling cannot be overstated. Data constitute the most basic information units employed by a system. Applications are created to manage data and to help transform data into information. But data are viewed in different ways by different people. For example, contrast the (data) view of a company manager with that of a company clerk. Although the manager and the clerk both work for the same company, the manager is more likely to have an enterprise-wide view of company data than the clerk.

Even different managers view data differently. For example, a company director is likely to take a universal view of the data because he or she must be able to tie the company's divisions to a common (database) vision. A purchasing manager in the same company is likely to have a more restricted view of the data, as is the company's inventory manager. In effect, each department manager works with a subset of the company's data. The inventory manager is more concerned about inventory levels, while the purchasing manager is more concerned about the cost of items and about personal/business relationships with the suppliers of those items.

Applications programmers have yet another view of data, being more concerned with data location, formatting and specific reporting requirements. Basically, applications programmers translate company policies and procedures from a variety of sources into appropriate interfaces, reports and query screens.

The different users and producers of data and information often reflect the 'blindfolded people and the elephant' analogy: the blindfolded person who felt the elephant's trunk had quite a different view of the elephant from those who felt the elephant's leg or tail. What is needed is the ability to see the whole elephant. Similarly, a house is not a random collection of rooms; if someone is going to build a house, he or she should first have the overall view that is provided by blueprints. Likewise, a sound data environment requires an overall database blueprint based on an appropriate data model.

When a good database blueprint is available, it does not matter that an applications programmer's view of the data is different from that of the manager and/or the end user. Conversely, when a good database blueprint is not available, problems are likely to ensue. For instance, an inventory management program or an order entry system may not fit into the overall set of operational requirements, thereby costing the company thousands (or even millions).

Keep in mind that a house blueprint is an abstraction; you cannot live in the blueprint. Similarly, the data model is an abstraction; you cannot draw the required data out of the data model. Just as you are not likely to build a good house without a blueprint, you are equally unlikely to create a good database without first selecting an appropriate data model.

2.2 DATA MODEL BASIC BUILDING BLOCKS

The basic building blocks of all data models are entities, attributes, relationships and constraints. An entity is anything (a person, a place, a thing or an event) about which data are to be collected and stored. An entity represents a particular type of object in the real world. Entities may be physical objects, such as customers or products; but entities may also be abstractions, such as flight routes or musical concerts.

An attribute is a characteristic of an entity. For example, a CUSTOMER entity would be described by attributes such as customer last name, customer first name, customer phone, customer address and customer credit limit. Attributes are the equivalent of fields in file systems.

A relationship describes an association among entities. For example, a relationship exists between customers and agents that can be described as follows: an agent can serve many customers, and each customer may be served by one agent. Data models use three types of relationships: one-to-many, many-to-many and one-to-one. Database designers usually use the shorthand notations 1:*, *:* and 1:1, respectively. The following examples illustrate the distinctions among the three:

One-to-many (1:*) relationship. A painter paints many different paintings, but each one of them is painted by only one painter. Thus, the painter (the one) is related to the paintings (the many). Therefore, database designers label the relationship PAINTER paints PAINTING as 1:*. (Note that entity names are often capitalised as a convention so they are easily distinguished.) Similarly, a customer (the one) may generate many invoices, but each invoice (the many) is generated by only a single customer. The CUSTOMER generates INVOICE relationship would also be labelled 1:*.

Many-to-many (*:*) relationship. An employee may learn many job skills, and each job skill may be learnt by many employees. Database designers label the relationship EMPLOYEE learns SKILL as *:*. Similarly, a student can take many classes and each class can be taken by many students, thus yielding the *:* relationship label for the relationship expressed by STUDENT takes CLASS.

One-to-one (1:1) relationship. A retail company's management structure may require that each of its stores be managed by a single employee. In turn, each store manager, who is an employee, manages only a single store. Therefore, the relationship EMPLOYEE manages STORE is labelled 1:1.

The preceding discussion identified each relationship in both directions; that is, relationships are bidirectional:

One CUSTOMER can generate many INVOICEs.
Each of the many INVOICEs is generated by only one CUSTOMER.
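The three relationship types can be made concrete with a small sketch. The Python example below uses the standard-library sqlite3 module and shows only one possible way of recording each relationship, not a design prescribed by this chapter. All table and column names (cus_id, enrol, manager_id and so on) are illustrative assumptions, as is the use of a separate pairing table for the *:* case.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled
conn.executescript("""
    -- 1:* : one CUSTOMER generates many INVOICEs; each INVOICE names exactly one CUSTOMER.
    CREATE TABLE customer (cus_id INTEGER PRIMARY KEY, cus_lname TEXT NOT NULL);
    CREATE TABLE invoice (
        inv_id INTEGER PRIMARY KEY,
        cus_id INTEGER NOT NULL REFERENCES customer (cus_id)
    );

    -- *:* : STUDENT takes CLASS, stored as one row per (student, class) pair.
    CREATE TABLE student (stu_id INTEGER PRIMARY KEY);
    CREATE TABLE class (class_id INTEGER PRIMARY KEY);
    CREATE TABLE enrol (
        stu_id   INTEGER NOT NULL REFERENCES student (stu_id),
        class_id INTEGER NOT NULL REFERENCES class (class_id),
        PRIMARY KEY (stu_id, class_id)
    );

    -- 1:1 : EMPLOYEE manages STORE; UNIQUE stops one employee managing two stores.
    CREATE TABLE employee (emp_id INTEGER PRIMARY KEY);
    CREATE TABLE store (
        store_id   INTEGER PRIMARY KEY,
        manager_id INTEGER NOT NULL UNIQUE REFERENCES employee (emp_id)
    );
""")

conn.execute("INSERT INTO customer VALUES (1, 'Naidu')")
conn.executemany("INSERT INTO invoice VALUES (?, ?)", [(1001, 1), (1002, 1)])
conn.execute("INSERT INTO employee VALUES (7)")
conn.execute("INSERT INTO store VALUES (1, 7)")
conn.commit()


def accepted(sql):
    """Run one INSERT; return True if it is stored, False if a constraint rejects it."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(sql)
        return True
    except sqlite3.IntegrityError:
        return False


invoice_count = conn.execute(
    "SELECT COUNT(*) FROM invoice WHERE cus_id = 1"
).fetchone()[0]
orphan_invoice = accepted("INSERT INTO invoice VALUES (1003, 99)")  # no customer 99 exists
second_store = accepted("INSERT INTO store VALUES (2, 7)")  # employee 7 already manages store 1

print(invoice_count, orphan_invoice, second_store)
```

Reading the sketch in both directions mirrors the bidirectional statements above: customer 1 has many invoices, while each invoice row can point to only one customer.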

A constraint is a restriction placed on the data. Constraints are important because they help to ensure data integrity. Constraints are normally expressed in the form of rules; for example:

The employee's salary must have values that are between 6 000 and 350 000.
A student's grade must be between 0 and 100.
Each class must have one and only one teacher.
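Constraints stated as rules can often be handed to the DBMS so that it rejects data that break them. The sketch below, again using Python's standard-library sqlite3 module, encodes the three example rules (salary range, grade range, one teacher per class). It is an assumption-laden illustration rather than this book's design: the table and column names are invented for the example, and CHECK and NOT NULL are simply one way to express such rules.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled
conn.executescript("""
    CREATE TABLE employee (
        emp_id     INTEGER PRIMARY KEY,
        emp_salary NUMERIC NOT NULL CHECK (emp_salary BETWEEN 6000 AND 350000)
    );
    CREATE TABLE teacher (teacher_id INTEGER PRIMARY KEY);
    CREATE TABLE class (
        class_id   INTEGER PRIMARY KEY,
        -- One NOT NULL teacher column: a class has one and only one teacher.
        teacher_id INTEGER NOT NULL REFERENCES teacher (teacher_id)
    );
    CREATE TABLE enrol (
        stu_id      INTEGER NOT NULL,
        class_id    INTEGER NOT NULL REFERENCES class (class_id),
        enrol_grade NUMERIC CHECK (enrol_grade BETWEEN 0 AND 100),
        PRIMARY KEY (stu_id, class_id)
    );
    INSERT INTO teacher VALUES (1);
""")


def accepted(sql, params=()):
    """Run one INSERT; return True if it is stored, False if a constraint rejects it."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(sql, params)
        return True
    except sqlite3.IntegrityError:
        return False


checks = {
    "salary 42 000": accepted("INSERT INTO employee VALUES (1, ?)", (42000,)),
    "salary 500": accepted("INSERT INTO employee VALUES (2, ?)", (500,)),
    "class with a teacher": accepted("INSERT INTO class VALUES (10, ?)", (1,)),
    "class with no teacher": accepted("INSERT INTO class VALUES (11, ?)", (None,)),
    "grade 88": accepted("INSERT INTO enrol VALUES (500, 10, ?)", (88,)),
    "grade 101": accepted("INSERT INTO enrol VALUES (501, 10, ?)", (101,)),
}
for label, ok in checks.items():
    print(f"{label}: {'accepted' if ok else 'rejected'}")
```

Each rejected row corresponds to one of the rules being violated, which is how constraints help to ensure data integrity.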

How do you properly identify entities, attributes, relationships and constraints? The first step is to clearly identify the business rules for the environment you are modelling.

2.3 BUSINESS RULES

When database designers go about selecting or determining the entities, attributes and relationships that will be used to build a data model, they may start by gaining a thorough understanding of what types of data are in an organisation, how the data are used and in which time frames they are used. But such data and information do not, by themselves, yield the required understanding of the total business. From a database point of view, the collection of data becomes meaningful only when it reflects properly defined business rules. A business rule is a brief, precise and unambiguous description of a policy, procedure or principle within a specific organisation. In a sense, business rules are misnamed: they apply to any organisation, large or small (a business, a government unit, a religious group or a research laboratory) that stores and uses data to generate information.

Business rules, derived from a detailed description of an organisation's operations, help to create and enforce actions within that organisation's environment. Business rules must be rendered in writing and updated to reflect any change in the organisation's operational environment.

Properly written business rules are used to define entities, attributes, relationships and constraints. Any time you see relationship statements such as 'an agent can serve many customers, and each customer may be served by only one agent', you are seeing business rules at work. You will see the application of business rules throughout this book, especially in the chapters devoted to data modelling and database design.

To be effective, business rules must be easy to understand and widely disseminated to ensure that every person in the organisation shares a common interpretation of the rules. Business rules describe, in simple language, the main and distinguishing characteristics of the data as viewed by the company. Examples of business rules are as follows:

A customer may generate many invoices.
An invoice is generated by only one customer.
A training session cannot be scheduled for fewer than ten employees or for more than 30 employees.

Note that those business rules establish entities, relationships and constraints. For example, the first two business rules establish two entities, CUSTOMER and INVOICE, and a 1:* relationship between those two entities. The third business rule establishes a constraint: no fewer than ten people and no more than 30 people; two entities, EMPLOYEE and TRAINING; and a relationship between EMPLOYEE and TRAINING.
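To see how the three example business rules might look once written down precisely, consider the following Python sketch. It is an illustration under stated assumptions, not a method given in the text: the names invoice_customer, invoices_by_customer and can_schedule are invented, and invoices, customers and employees are represented by simple identifiers.

```python
from collections import defaultdict

MIN_ATTENDEES = 10  # 'fewer than ten employees' is not allowed
MAX_ATTENDEES = 30  # 'more than 30 employees' is not allowed

# Rules 1 and 2: each invoice number maps to exactly one customer (a dictionary
# key holds a single value), while the same customer may appear many times.
invoice_customer = {1001: "C10", 1002: "C10", 1003: "C11"}


def invoices_by_customer(mapping):
    """Read the 1:* relationship from the customer side: customer -> list of invoices."""
    grouped = defaultdict(list)
    for invoice, customer in mapping.items():
        grouped[customer].append(invoice)
    return dict(grouped)


def can_schedule(employee_ids):
    """Rule 3: a training session needs between ten and 30 distinct employees."""
    return MIN_ATTENDEES <= len(set(employee_ids)) <= MAX_ATTENDEES


print(invoices_by_customer(invoice_customer))
print(can_schedule(range(9)), can_schedule(range(10)), can_schedule(range(31)))
```

The first two rules shape the structure of the data (which entity points to which), whereas the third is a limit that has to be checked whenever a session is scheduled.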

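2.3.1 Discovering Business Rules

The main sources of business rules are company managers, policy makers, department managers and written documentation, such as a company's procedures, standards or operations manuals. A faster and more direct source of business rules is direct interviews with end users. Unfortunately, because perceptions differ, end users sometimes are a less reliable source when it comes to specifying business rules. For example, a maintenance department mechanic may believe that any mechanic can initiate a maintenance procedure, when actually only mechanics with inspection authorisation can perform such a task. Such a distinction may seem trivial, but it can have major legal consequences. Although end users are crucial contributors to the development of business rules, it pays to verify end-user perceptions. Too often, interviews with several people who perform the same job yield very different perceptions of what the job components are. While such a discovery may point to 'management problems', that general diagnosis does not help the database designer. The database designer's job is to reconcile such differences and verify the results of the reconciliation to ensure that the business rules are appropriate and accurate.

The process of identifying and documenting business rules is essential to database design for several reasons:

They help standardise the company's view of data.
They can be a communications tool between users and designers.
They allow the designer to understand the nature, role and scope of the data.
They allow the designer to understand business processes.
They allow the designer to develop appropriate relationship participation rules and constraints, and to create an accurate data model.

Of course, not all business rules can be modelled. For example, a business rule that specifies that 'no pilot can fly more than ten hours within any 24-hour period' cannot be modelled. However, such a business rule can be enforced by application software.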
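As an illustration of a rule being enforced by application software, the Python sketch below checks the pilot rule against a list of flights. It is a minimal sketch under several assumptions that are not in the text: flights are given as (start, end) datetime pairs for a single pilot, flights do not overlap one another, and 'any 24-hour period' is read as any sliding 24-hour window. The function names are invented for the example.

```python
from datetime import datetime, timedelta

WINDOW = timedelta(hours=24)
MAX_FLYING = timedelta(hours=10)


def flown_within(flights, window_start):
    """Total flying time that falls inside the 24-hour window starting at window_start."""
    window_end = window_start + WINDOW
    total = timedelta()
    for start, end in flights:
        overlap = min(end, window_end) - max(start, window_start)
        if overlap > timedelta():
            total += overlap
    return total


def violates_rule(flights):
    """True if some 24-hour window contains more than ten hours of flying.

    The busiest window always either begins where a flight begins or ends
    where a flight ends, so only those windows need to be examined.
    """
    candidates = [start for start, _ in flights]
    candidates += [end - WINDOW for _, end in flights]
    return any(flown_within(flights, c) > MAX_FLYING for c in candidates)


day = datetime(2020, 1, 1)
within_limit = [
    (day + timedelta(hours=6), day + timedelta(hours=11)),
    (day + timedelta(hours=14), day + timedelta(hours=19)),
]
over_limit = within_limit + [(day + timedelta(hours=26), day + timedelta(hours=28))]

print(violates_rule(within_limit), violates_rule(over_limit))
```

The check depends on time spans that cut across many rows and change with every new flight, which suggests why a rule of this kind sits more naturally in program logic than in the structure of a data model.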

38

part

I

Database

Systems

perceptions differ, end users sometimes are a less reliable source when it comes to specifying business rules. For example, a maintenance department mechanic may believe that any mechanic can initiate a maintenance procedure, when actually only mechanics with inspection authorisation can perform such a task. Such a distinction may seem trivial, but it can have major legal consequences. Although end users are crucial contributors to the development of business rules, it pays to verify end-user perceptions. Too often, interviews with several people who perform the same job yield very different perceptions of what the job components are. While such a discovery may point to "management problems", that general diagnosis does not help the database designer. The database designer's job is to reconcile such differences and verify the results of the reconciliation to ensure that the business rules are appropriate and accurate.

The process of identifying and documenting business rules is essential to database design for several reasons:

- They help standardise the company's view of data.
- They can be a communications tool between users and designers.
- They allow the designer to understand the nature, role and scope of the data.
- They allow the designer to understand business processes.
- They allow the designer to develop appropriate relationship participation rules and constraints, and to create an accurate data model.

Of course, not all business rules can be modelled. For example, a business rule that specifies that "no pilot can fly more than ten hours within any 24-hour period" cannot be modelled. However, such a business rule can be enforced by application software.

2.3.2 Translating Business Rules into Data Model Components

Business rules set the stage for the proper identification of entities, attributes, relationships and constraints. In the real world, names are used to identify objects. If the business environment wants to keep track of the objects, there will be specific business rules for them. As a general rule, a noun in a business rule will translate into an entity in the model, and a verb (active or passive) associating nouns will translate into a relationship among the entities. For example, the business rule "a customer may generate many invoices" contains two nouns (customer and invoices) and a verb (generate) that associates the nouns. From this business rule, you could deduce that:

- Customer and invoice are objects of interest for the environment and should be represented by their respective entities.
- There is a "generate" relationship between customer and invoice.

To properly identify the type of relationship, you should consider that relationships are bidirectional; that is, they go both ways. For example, the business rule "a customer may generate many invoices" is complemented by the business rule "an invoice is generated by only one customer". In that case, the relationship is one-to-many (1:*). Customer is the "1" side, and invoice is the "many" side.

As a general rule, to properly identify the relationship type, you should ask two questions:

- How many instances of B are related to one instance of A?
- How many instances of A are related to one instance of B?
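The two questions above can be turned into a small decision rule. The following Python sketch is only an illustration: the function name `relationship_type` and the "one"/"many" answer strings are assumptions made for this example, not notation from the book.

```python
def relationship_type(a: str, b: str, b_per_a: str, a_per_b: str) -> str:
    """Classify the relationship between entities a and b.

    b_per_a: answer to "How many instances of B are related to one instance of A?"
    a_per_b: answer to "How many instances of A are related to one instance of B?"
    Both answers must be the string "one" or "many" (a hypothetical encoding).
    """
    valid = {"one", "many"}
    if b_per_a not in valid or a_per_b not in valid:
        raise ValueError("answers must be 'one' or 'many'")

    if b_per_a == "one" and a_per_b == "one":
        return "1:1"
    if b_per_a == "many" and a_per_b == "many":
        return "*:*"
    # Exactly one direction is "many": the entity that is related to
    # many instances of the other entity sits on the 1 side.
    one_side = a if b_per_a == "many" else b
    return f"1:* ({one_side} is the 1 side)"


# A customer may generate many invoices; an invoice is generated by one customer.
print(relationship_type("CUSTOMER", "INVOICE", b_per_a="many", a_per_b="one"))
```

Answering "many" to both questions, as in a student and class example, would yield `*:*` under the same rule.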

For example, you could identify the relationship between student and class by asking two questions:

- How many classes can one student enrol in? Answer: Many classes.
- How many students can enrol in one class? Answer: Many students.

Therefore, the relationship between student and class is many-to-many (*:*). You will have many opportunities to determine the relationships between entities as you proceed through this book, and soon the process will become second nature.

2.4 The Evolution of Data Models

The quest for better data management has led to several different models that attempt to resolve the file system's critical shortcomings. These models represent schools of thought as to what a database is, what it should do, the types of structures that it should employ, and the technology that would be used to implement these structures. This section gives an overview of the major data models in roughly chronological order. You will discover that many of the "new" database concepts and structures bear a remarkable resemblance to some of the "old" data model concepts and structures. Table 2.1 traces the evolution of the major data models.

Table 2.1 Evolution of major data models

Generation | Time | Data model | Examples | Comments
First | 1960s-1970s | File system | VMS/VSAM | Used mainly on IBM mainframe systems; managed records, not relationships
Second | 1970s | Hierarchical and network | IMS, ADABAS, IDS-II | Early database systems; navigational access
Third | Mid-1970s | Relational | DB2, Oracle, MS SQL Server, MySQL | Conceptual simplicity; entity relationship (ER) modelling and support for relational data modelling
Fourth | Mid-1980s | Object-oriented, Object/relational (O/R) | Versant, Objectivity/DB, DB2 UDB, Oracle 11g | Object/relational supports object data types; Star Schema support for data warehousing; Web databases become common
Fifth | Mid-1990s | XML Hybrid DBMS | dbXML, Tamino, DB2 UDB, Oracle 11g, MS SQL Server | Unstructured data support; O/R model supports XML documents; Hybrid DBMS adds object front end to relational databases; support large databases (terabyte size)
Emerging Models: NoSQL | Late 2000s to present | Key-value store, Column store | SimpleDB (Amazon), Bigtable (Google), Cassandra (Apache) | Distributed, highly scalable; high performance, fault tolerant; very large storage (petabytes); suited for sparse data; proprietary API

Online Content: The hierarchical and network models are largely of historical interest, yet they do still contain some elements and features that interest current database professionals. The technical details of those two models are discussed in detail in Appendices I and J, respectively, on the accompanying online platform. Appendix G is devoted to the object-orientated (OO) model. However, given the dominant market presence of the relational model, most of the book focuses on that model.

2.4.1 Hierarchical and Network Models

The hierarchical model was developed in the 1960s to manage large amounts of data for complex manufacturing projects, such as the Apollo rocket that landed on the moon in 1969. Its basic logical structure is represented by an upside-down tree. The hierarchical structure contains levels, or segments. A segment is the equivalent of a file system's record type. Within the hierarchy, a higher layer is perceived as the parent of the segment directly beneath it, which is called the child. The hierarchical model depicts a set of one-to-many (1:*) relationships between a parent and its children segments. (Each parent can have many children, but each child has only one parent.)

The network model was created to represent complex data relationships more effectively than the hierarchical model, to improve database performance and to impose a database standard. In the network model, the user perceives the network database as a collection of records in 1:* relationships. However, unlike the hierarchical model, the network model allows a record to have more than one parent. While the network database model is generally not used today, the definitions of standard database concepts that emerged with the network model are still used by modern data models:

- The schema is the conceptual organisation of the entire database as viewed by the database administrator.
- The subschema defines the portion of the database "seen" by the application programs that actually produce the desired information from the data within the database.
- A data manipulation language (DML) defines the environment in which data can be managed and is used to work with the data in the database.
- A schema data definition language (DDL) enables the database administrator to define the schema components.

As information needs grew and more sophisticated databases and applications were required, the network model became too cumbersome. The lack of ad hoc query capability put heavy pressure on programmers to generate the code required to produce even the simplest reports. Although the existing databases provided limited data independence, any structural change in the database could still produce havoc in all application programs that drew data from the database. Because of the disadvantages of the hierarchical and network models, they were largely replaced by the relational data model in the 1980s.

2.4.2 The Relational Model

The relational model was introduced by E.F. Codd (of IBM) in 1970 in his landmark paper "A Relational Model of Data for Large Shared Databanks".1 The relational model represented a major breakthrough for both users and designers. To use an analogy, the relational model produced an "automatic transmission" database to replace the "standard transmission" databases that preceded it. Its conceptual simplicity set the stage for a genuine database revolution.

1 Communications of the ACM, pp. 377-387, June 1970.

In 1970, Codd's work was considered ingenious but impractical. The relational model's conceptual simplicity was bought at the expense of computer overhead; computers at that time lacked the power to implement the relational model. Fortunately, computer power grew exponentially, as did operating system efficiency. Better yet, the cost of computers diminished rapidly as their power grew. Today, desktop and laptop computers, costing a fraction of what their mainframe ancestors did, can run relatively sophisticated relational database software provided by vendors such as Oracle, DB2, Informix, Ingres and other mainframe relational software vendors.

note The relational in

Chapter

relational

The

database

model

3, Relational

Model

model is

relational

system

hierarchical

model

Arguably

easier

the

relational relational

database

as

data in a way that Each

table

relations,

a

the

in the

implemented

to

in addition

Relational

the

a more detailed

Algebra

discussions

a

performs

in

sophisticated

same

and

Calculus.

relational

basic

discussion In fact,

most of the remaining

functions

to a host of other functions

of the

RDBMS

RDBMS

manages

of tables

in

database

provided

that

the

chapters.

by the

make the relational

is its

all of the

which

data

ability

to

physical

are

hide

the

details,

stored

and

complexities

while the

can

of the

user

manipulate

sees and

the

query

and logical.

consisting

each

through

RDBMS

advantage The

CUSTOMER

4,

to introduce

and implement.

a collection

matrix,

are related

contained

is

seems intuitive

is

example,

Chapter basis for

The

user.

designed

and in

understand

the

is

as the

will serve

DBMS systems, to

chapter

Characteristics,

model

most important

model from

in this

that it

(rDBMS).

and network

database

For

so important

database

management

presented

of a series

other

through

table

in

the

Figure

of row/column sharing

2.1

intersections.

of a field

might

which

contain

Tables,

is

a sales

common

agents

also

to

both

number

called entities.

that

is

also

AGENT table.

online Content Thischaptersdatabases canbefound onthe accompanying online platform Figure

for this

The common or her data is

sales are

Kubu

link

For example,

in the

between

agent

stored

even

in

though

the

table.

because

other,

minimum

level

you of

for

can easily

associate

Copyright review

2020 has

are stored

you

Dunne,

redundancy

and

CUSTOMER

enables

you to

tables

shown

in

for

can

the

data

to

eliminate

one table

most

that

tables

Bhengani.

between

and the

determine

CUSTOMER

Kubu

the

in

easily

sales

the tables

Dunnes

agent

is

which

model

redundancies

501,

are independent

The relational

of the

to his

representative

customer

AGENT_CODE

Although

tables.

match the customer

provides

commonly

found

a in

systems. The relationship

Editorial

data

AGENT_CODE

controlled

AGENT

and AGENT tables

customer

customer

of the

Ch02_InsureCo.

For example,

type

(1:1,

1:*

or *:*) is

depicted in Figure 2.2. Arelational the

contents

named

CUSTOMER

AGENT tables

of each

the

database

the

another

Bhengani,

matches the

file

book.

2.1 are found

attributes

Cengage deemed

Learning. that

any

within

All suppressed

Rights

those

Reserved. content

does

entities

May not

not materially

be

copied, affect

often

shown

in

a relational

diagram is a representation and the

scanned, the

overall

or

relationships

duplicated, learning

in experience.

whole

or in Cengage

between

part.

Due Learning

to

electronic reserves

schema,

an

example

of the relational those

rights, the

right

of

databases

which

is

entities,

entities.

some to

third remove

party additional

content


Figure 2.1 Linking relational tables

Database name: Ch02_InsureCo
Table name: AGENT (first six attributes)

AGENT_CODE | AGENT_LNAME | AGENT_FNAME | AGENT_INITIAL | AGENT_AREACODE | AGENT_PHONE
501 | Bhengani | Kubu | B | 0161 | 228-1249
502 | Mbaso | Lethiwe | F | 0181 | 882-1244
503 | Okon | John | T | 0181 | 123-5589

Link through

Table name:

CUSTOMER

CUS_

CUS_

CUS_

CODe

LNAMe

FNAMe

10010

Ramas

Alfred

10011

Dunne

Leona

10012

Du Toit

10013

Pieterse

10014

Orlando

10015

OBrian

Amy

10016

Brown

James

10017

CUS_

CUS_reNew_

AGeNT_

AreACODe

PHONe

DATe

CODe

A

0181

844-2573

05-Apr-2018

502

K

0161

894-1238

16-Jun-2018

501

0181

894-2285

29-Jan-2018

502

0181

894-2180

14-Oct-2019

502

0181

222-1672

28-Dec-2019

501

B

0161

442-3381

22-Sep-2019

503

G

0181

297-1228

25-Mar-2018

502

0181

290-2556

17-Jul-2019

503

iNiTiAL

W

Jaco

F

Myron

George

Padayachee

10019

CUS_

CUS_

Maelene

Williams

10018

AGENT_CODE

Moloi

Vinaya

G

0161

382-7185

03-Dec-2019

501

Mlilo

K

0181

297-3809

14-Mar-2019

503

In Figure 2.2, the relational diagram shows the connecting fields (in this case, AGENT_CODE) and the relationship type, 1:*. In this example, the CUSTOMER represents the "many" side because an AGENT can have many CUSTOMERs. The AGENT represents the "1" side because each CUSTOMER has only one AGENT.

A relational table stores a collection of related entities. In this respect, the relational database table resembles a file. However, there is one crucial difference between a table and a file: a table yields complete data and structural independence because it is a purely logical structure. How the data are physically stored in the database is of no concern to the user or the designer; the perception is what counts.

And this property of the relational database model, explored in depth in the next chapter, became the source of a real database revolution.

Another reason for the relational database model's rise to dominance is its powerful and flexible query language. Relational algebra, which was defined by Codd in 1971, was the basis for many relational query languages and will be introduced in more detail in Chapter 4, Relational Algebra and Calculus. For most relational database software, the query language used is known as Structured Query Language (SQL). SQL is a 4GL that allows the user to specify what must be done without specifying how it must be done. The RDBMS uses SQL to translate user queries into instructions for retrieving the requested data. SQL makes it possible to retrieve data with far less effort than any other database or file environment.
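To make the "what, not how" idea concrete, here is a minimal sketch using Python's built-in `sqlite3` module. It assumes a cut-down version of the AGENT and CUSTOMER tables of Figure 2.1 (only a few columns and two rows each, chosen for illustration); the helper name `agent_for_customer` is hypothetical. The query states which rows are wanted by matching AGENT_CODE values; how the rows are located is left to the database engine.

```python
import sqlite3

# In-memory database holding a small subset of the Figure 2.1 tables.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE AGENT (
    AGENT_CODE  INTEGER PRIMARY KEY,
    AGENT_LNAME TEXT,
    AGENT_FNAME TEXT
);
CREATE TABLE CUSTOMER (
    CUS_CODE   INTEGER PRIMARY KEY,
    CUS_LNAME  TEXT,
    CUS_FNAME  TEXT,
    AGENT_CODE INTEGER REFERENCES AGENT (AGENT_CODE)
);
INSERT INTO AGENT VALUES (501, 'Bhengani', 'Kubu');
INSERT INTO AGENT VALUES (502, 'Mbaso', 'Lethiwe');
INSERT INTO CUSTOMER VALUES (10010, 'Ramas', 'Alfred', 502);
INSERT INTO CUSTOMER VALUES (10011, 'Dunne', 'Leona', 501);
""")


def agent_for_customer(cus_lname):
    """Return (first name, last name) of the customer's agent, or None."""
    # The common AGENT_CODE attribute links the two independent tables.
    return conn.execute(
        """
        SELECT A.AGENT_FNAME, A.AGENT_LNAME
        FROM CUSTOMER AS C
        JOIN AGENT AS A ON C.AGENT_CODE = A.AGENT_CODE
        WHERE C.CUS_LNAME = ?
        """,
        (cus_lname,),
    ).fetchone()


print(agent_for_customer("Dunne"))
```

The same SELECT statement would work unchanged if the engine stored the rows differently on disk, which is the structural independence the text describes.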


Figure 2.2 Relational diagram: a relational class diagram

From an end-user perspective, any SQL-based relational database application involves three parts: a user interface, a set of tables stored in the database and the SQL engine. Each of these parts is explained below:

- The end-user interface. Basically, the interface allows the end user to interact with the data (by auto-generating SQL code). Each interface is a product of the software vendor's idea of meaningful interaction with the data. You can also design your own customised interface with the help of application generators that are now standard in the database software arena.
- A collection of tables stored in the database. In a relational database, all data are perceived to be stored in tables. The tables simply "present" the data to the end user in a way that is easy to understand. Each table is independent of the others. Rows in different tables are related, based on common values in common attributes.
- SQL engine. Largely hidden from the end user, the SQL engine executes all queries or data requests. Keep in mind that the SQL engine is part of the DBMS software. The end user uses SQL to create table structures and to perform data access and table maintenance. The SQL engine translates all of those requests into the instructions necessary to perform such tasks, largely behind the scenes and without the end user's knowledge. Hence, it's said that SQL is a declarative language that tells what must be done but not how it must be done. (You will learn more about the SQL engine in Chapter 13, Managing Database and SQL Performance.)

Because the RDBMS performs the behind-the-scenes tasks, it is not necessary to focus on the physical aspects of the database. Instead, the chapters that follow will concentrate on the logical portion of the relational database and its design. Furthermore, SQL is covered in detail in Chapter 8, Beginning Structured Query Language, and in Chapter 9, Procedural Language SQL and Advanced SQL.

2.4.3 The Entity Relationship Model

The conceptual simplicity of relational database technology triggered the demand for RDBMSs. In turn, the rapidly increasing transaction and information requirements created the need for more complex

database (For

implementation

example, Complex

2

model

features

that

graphically

activities

would

require

describe

them

a

widely

accepted

Chen first

and their relationships the

relational

the foundation

model

diagram

One of the

between

relationships over

among

data

When the

were illustrated:

were represented database

designers

including

one of the

1976;

that

notation

it

quickly

results.

network

prefer to

models,

it

Although

the

still lacked

the

use a graphical

(er)

was the

graphical

became

popular

database

model

representations

structures

tool in

which

model, or erM,

representation

because

and

of their

ERM

has

of entities

it complemented

combined

was and

how it

between

entities

was

(1:M),

to

provide

model,

most common

more

versions

using

simple

ERD,

of

the

Chens

which uses the

modelling

notation

such

(1:1).

notation

as n

Relationships line.

were

Foot

style

of relationships

relationship

Crows

Chen also

notation

types

and one-to-one

entities

data

Chens

three

through

versions

model,

in the

a relationship.

(M:N)

entities

graphical

of the

to

achieved

related

components. between

ER data

debate

were introduced,

many-to-many

to the

of the

a large

was different

model components

connected

this

This fuelled

originally

model database

made a distinction

early releases

own.

an entity

one-to-many

to

was that it clearly

However in the

basic data

adopted

successful

a kennel.)

database design. ER models are normally represented in an entity

Chens

by a diamond

design tools.

building

modelling.

The relational

attributes

associations

many.

database

than

Because it is easier to examine

designers

model in

structure

them.

have

what exactly

for representing

to indicate

to

yield and

design tool.

which uses graphical

of Peter

to

activities

Thus, the entity relationship

data

concepts.

(erD),

strengths

and the relationships

community

ER data

for tightly structured

relationship

allowed

for

the

more effective

design

hierarchical

database

standard

in a database

database

the

database

in text,

need for

detailed simplicity

over

are pictured.

introduced

the

more

conceptual

make it an effective

than to

Peter

creating

requires

was a vast improvement

entities and their relationships become

thus

a skyscraper

design

relational

structures,

building

Whilst

developed,

notation.

note One of the

more recent

Foot notation James such

the

was originally

Martin. In as n

symbol

of

legacy

UML,

invented

many

many

of Peter

used the

you

Chen. side

organisations

that

produce

larger

entity

UML

standard.

online

with

C. Finkelstein,


and

is

Foot

the

from is

simple

the

Crows

This is

modelling

and

notation

three-pronged

a general

shift

towards

particularly

but are vital to the Foot

The

by Clive Finkelstein2

derived there

true

organisation.

in

It is

notations.

Modelling Language (UML) has been used

diagrams

notation

willtherefore

model.

of using the

notation.

Crows

of the Unified class

Foot

is

have

been

emerging

be used to

developed

as the

as a part

industry

data

of the

modelling

model ERDs using relational

concepts.

Morein-depth coverage ofthe Crows Foot notationis providedin

E, Comparison

of ER

An Introduction

Addison-Wesley,

Foot

and software

Chens

method,

UML notation

Crows

made popular

Although

Crows

hardware

Although

design

book the

Crows

use the

both

as the

were used instead

relationship.

component models.

Content

Appendix

2

relationship

object-orientated

In this

still

known

and later

symbols

The label

on obsolete

are familiar

is

Everest

of the

today

Morerecently the class diagram to

notations

graphical

by

many

which are running

important

Chens

by Gordon

Foot notation,

to represent

systems

therefore

Crows

to indicate

used

use

versions

Modelling

to Information

Notations,

available

Engineering:

From

on the

Strategic

online

platform.

Planning

to Information

Systems.

1989.


note UML is and

an object-orientated

published

common

and

as

set

databases. model

the

Rather,

website

based

and

The

a language The

and

that

OMG

is

of an

Object

effort

(symbols

UML is

describes

Management

headed

and

constructs)

a set of diagrams

which

OMG

for

the

software

includes

to

for

that

More details

which

data are to

)

2

a

design

developing

can be used to

consortium

UML.

(OMG

develop

analysis,

or procedure

and symbols

not-for-profit

computing,

Group

by the

not a methodology

an international

object

by the

result

notations

mind that

of distributed

on the following

Earlier in this

collected

is

area

the

that can

is

setting

be found

on

www.uml.org/

The ER model is

box.

UML is

sponsored

UML is

Keep in

graphically.

in the

language

1997. diagrams

of systems.

a system

Entity.

in

of object-orientated

modeling

standards

modelling

a standard

stored.

name

generally

in

was defined

is represented

entity,

a noun,

is

capital

letters

and is

or EMPLOYEE

relational

an entity

An entity

of the

written

PAINTERS,

chapter,

components:

rather

model, an entity is

as an entity instance

ERD

in the

centre

written

of the

singular

Usually,

a relational

occurrence

about

by a rectangle,

in the

EMPLOYEES.

mapped to

or entity

in the

written

than

as anything

table.

also

rectangle. form:

known The

as an entity

entity

PAINTER

when applying

be

name

rather

the

than

ERD to the

Each row in the relational

table is

known

in the ER model.

note A collection

of like

entities

is known

Figure 2.3 as a collection depicts

entity

conform

Each

entity

example, a first

sets.

to that

is

name.

entity

Data

can

describe

written

connects

two

entities.

be illustrated:

next to

line. paints

2.3 shows

connectivities.

in the

Copyright Editorial

review

2020 has

ERD

Cengage deemed

Learning. that

any

many

some

All

examine

box.)

Reserved. content

entity

AGENT file in

speaking,

set,

the

and this

ERD

book

will

components.

characteristics

as an employee

does

May

basic (1:*)

number,

Diagrams,

data

of the

data.

of the

entity.

For

a last

name

and

explains

how

Most relationships

model, three

many-to-many

(*:*)

are represented

attributes

describe

of relationships

one-to-one

(1:1).

ERD

(The connectivities

by a relationship

an active

companys

types and

of relationships.

of the relationship,

ERDs that basic

use the UML

or vertically. just

not

among

Relationships

each

the

horizontally

Rights

of the

or passive

DEPARTMENTs

line

verb, is

has

that

written

on the

many EMPLOYEEs;

PAINTINGs.

basic

are immaterial;

suppressed

for

and its

Relationship

to label the types

The name

For example,

As you

may be presented

can think

set. Technically

as a substitute ERD

particular

such

Entity

Within the

one-to-many

entity

entities.

a PAINTER

Figure

each

related

relationship

entity

describes

associations

modellers use the term connectivity are

that

you

AGENT entity

any

attributes with

For example,

ERD.)

between

data

Modelling

use

discussing

will have

Relationships

associations among

5,

set. in the

designers when

by a set of attributes

(Chapter

Relationships.

ERD

practice

EMPLOYEE

in the

as an entity

agents (entities)

Unfortunately,

established

described

the

are included

of three

not materially

be

affect

scanned, the

overall

ERD in

to read

or

Figure

The location

remember

copied,

UML notation

duplicated, learning

in experience.

2.3,

and the

to illustrate note that order in

a 1:* relationship

whole

or in Cengage

part.

Due Learning

to

electronic reserves

rights, right

some to

the

the

third remove

relationships

entities

which the

from

the

these

1

party additional

content

and relationships

entities

are

side to the

may content

and

be

suppressed at

any

time

presented

*

from if

side.

the

subsequent


FIgure

Systems

2.3

the basic uMl erD A One-to-Many

(1..*)

Relationship:

each

PAINTING

A PAINTER

is

painted

can

paint

many

PAINTINGs:

by one PAINTER.

2 PAINTER

paints

PAINTING

c

1..1

A

Many-to-Many

(*..*)

each

0..*

Relationship:

SKILL

An

EMPLOYEE

can be learned

EMPLOYEE

by

learns

can learn

0..*

(1..1) each

Relationship:

STORE is

An EMPLOYEE

managed

EMPLOYEE

manages

and their

manages

STORE

c

Because

set in the

of participation

1

be aware that, typically,

associations.

of an entity

ER

the

an object

model.

in a relationship

one STORE:

by one EMPLOYEE.

1

You should

SKILLs:

SKILL

c

0..*

A One-to-Many

many

many EMPLOYEEs.

class is

Likewise,

is

UML class

diagram

a collection

an association

often referred

to

was developed of similar

is

similar

objects,

to

as multiplicities.

to

model object

a class is the

a relationship

The only

classes

equivalent

where

the

major difference

degree

between

a UML class and an ER entity is that a blank box is left in the drawing of the UML class to add the names of methods, which are required when developing object-orientated systems. However, from a data modelling perspective this does not affect the structure of the data, and you will use the UML notation to represent relational concepts only. Chapter 5, Data Modelling with Entity Relationship Diagrams, will introduce the concepts of both Crow's Foot notation and the Class Diagram notation in more detail.
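As a rough illustration of how a 1:* association such as "a PAINTER paints many PAINTINGs; each PAINTING is painted by one PAINTER" might be carried into relational tables, the sketch below uses Python's built-in `sqlite3` module. The column names (PAINTER_ID, PAINTING_TITLE and so on), the sample rows and the helper functions are assumptions made for this example and do not come from the figure.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when asked
conn.executescript("""
CREATE TABLE PAINTER (
    PAINTER_ID   INTEGER PRIMARY KEY,
    PAINTER_NAME TEXT NOT NULL
);
CREATE TABLE PAINTING (
    PAINTING_ID    INTEGER PRIMARY KEY,
    PAINTING_TITLE TEXT NOT NULL,
    -- NOT NULL plus the foreign key gives the 1..1 end: exactly one painter
    PAINTER_ID     INTEGER NOT NULL REFERENCES PAINTER (PAINTER_ID)
);
""")


def add_painter(painter_id, name):
    with conn:
        conn.execute("INSERT INTO PAINTER VALUES (?, ?)", (painter_id, name))


def add_painting(painting_id, title, painter_id):
    """Insert a painting; return False if the painter does not exist."""
    try:
        with conn:
            conn.execute(
                "INSERT INTO PAINTING VALUES (?, ?, ?)",
                (painting_id, title, painter_id),
            )
        return True
    except sqlite3.IntegrityError:
        return False


def count_paintings(painter_id):
    """The 0..* end: a painter may have zero, one or many paintings."""
    row = conn.execute(
        "SELECT COUNT(*) FROM PAINTING WHERE PAINTER_ID = ?", (painter_id,)
    ).fetchone()
    return row[0]


add_painter(1, "Painter A")
add_painter(2, "Painter B")
add_painting(10, "Work 1", 1)
add_painting(11, "Work 2", 1)
print(count_paintings(1), count_paintings(2), add_painting(12, "Orphan", 99))
```

The foreign key sits on the "many" side (PAINTING), which is one common way to realise the 1 side of the association in a relational design.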

Most database modelling tools let you select the UML model diagram option. Microsoft Visio Professional software was used to generate the UML class diagrams you will see in subsequent chapters.

Note: Many-to-many (*:*) relationships exist at a conceptual level, and you should know how to recognise them. However, you will learn in Chapter 3, Relational Model Characteristics, that *:* relationships are not appropriate in a relational model.

Online Content: For a more detailed description of the Chen, Crow's Foot and other ER model notation systems, see Appendix E, Comparison of ER Model Notations, available on the online platform.


Note

For the purposes of illustration, Figure 2.6 shows alternative Crow's Foot models of the UML ERDs in Figure 2.4.

Figure 2.4 The basic Crow's Foot ERD

As you examine the basic Crow's Foot ERD in Figure 2.4, note that the 1 is represented by a short line segment and the * is represented by the three-pronged Crow's Foot. As with UML, the entities and relationships may be presented horizontally or vertically and the order is again unimportant. Its exceptional visual simplicity makes the ER model the dominant database modelling and design tool. Nevertheless, the search for better data modelling tools continues as the data environment continues to evolve.

2.4.4 The Object-Orientated (OO) Model

Increasingly complex real-world problems demonstrated a need for a data model that more closely represented the real world. In the object-orientated data model (OODM), both data and their relationships are contained in a single structure known as an object. In turn, the OODM is the basis for the object-orientated database management system (OODBMS).


Online Content

This chapter introduces only basic OO concepts. You'll have a chance to examine object-orientated concepts and principles in detail in Appendix G, Object-Oriented Databases, which can be found on the online platform.

An OODM reflects a very different way to define and use entities. Like the relational model's entity, an object is described by its factual content. But quite unlike an entity, an object includes information about relationships between the facts within the object, as well as information about its relationships with other objects. Therefore, the facts within the object are given greater meaning. The OODM is said to be a semantic data model because semantic indicates meaning.

Subsequent OODM development has allowed an object also to contain all operations that can be performed on it, such as changing its data values, finding a specific data value and printing data values. As objects include data, various types of relationships and operational procedures, the object becomes self-contained, thus making the object, at least potentially, a basic building block for autonomous structures.

The OO data model is based on the following components:

An object is an abstraction of a real-world entity. In general terms, an object may be considered equivalent to an ER model's entity. More precisely, an object represents only one individual occurrence of an entity. (The object's semantic content is defined through several of the items in this list.)

Attributes describe the properties of an object. For example, a PERSON object includes the attributes Name, ID Number and Date of Birth.

Objects that share similar characteristics are grouped in classes. A class is a collection of similar objects with shared structure (attributes) and behaviour (methods). In a general sense, a class resembles the ER model's entity set. However, a class is different from an entity set in that it contains a set of procedures known as methods. A class's method represents a real-world action such as finding a selected PERSON's name, changing a PERSON's name or printing a PERSON's address. In other words, methods are the equivalent of procedures in traditional programming languages. In OO terms, methods define an object's behaviour.

Classes are organised in a class hierarchy. The class hierarchy resembles an upside-down tree in which each class has only one parent. For example, the CUSTOMER class and the EMPLOYEE class share a parent PERSON class. (Note the similarity to the hierarchical data model in this respect.)

Inheritance is the ability of an object within the class hierarchy to inherit the attributes and methods of the classes above it. For example, two classes, CUSTOMER and EMPLOYEE, can be created as subclasses from the class PERSON. In this case, CUSTOMER and EMPLOYEE will inherit all attributes and methods from PERSON.

To illustrate the difference between the OO data model and the ER model, examine the simple invoicing problem shown in Figure 2.5.

As you examine Figure 2.5, note that:

The OO data model represents an object as a box; all of the object's attributes and relationships to other objects are included within the object box. The object representation of the INVOICE includes all related objects within the same object box. Note that the connectivities (1:1 and 1:*) indicate the relationship of the related objects to the INVOICE. For example, the 1:1 next to the CUSTOMER object indicates that each INVOICE is related to only one CUSTOMER. The 1:* next to the LINE object indicates that each INVOICE must contain at least one LINE but can also contain many LINEs.

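Figure 2.5 A comparison of the OO model and the ER model

(Two panels: OO data model and ER model. The OO data model panel shows an INVOICE object box with the attributes INV_NUMBER, INV_DATE, INV_SHIP_DATE and INV_TOTAL, together with the related CUSTOMER (1) and LINE (*) objects.)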

The ER model uses three separate entities and two relationships to represent an invoice transaction. As customers can buy more than one item at a time, each invoice references one or more lines, one item per line. And because invoices are generated by customers, the data modelling requirements include a customer entity and a relationship between the customer and the invoice.

The OODM advances influenced many areas, from system modelling to programming. (Most contemporary programming languages have adopted OO concepts, including Java, Ruby, Perl, C# and Visual Studio.) The added semantics of the OODM allowed for a richer representation of complex objects. This in turn enabled applications to support increasingly complex objects in innovative ways.
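The OO components described in this section (objects, attributes, classes, methods, a class hierarchy and inheritance) can be sketched in a few lines of an OO programming language. The Python sketch below is only an illustration under stated assumptions: the class and attribute names follow the PERSON, CUSTOMER, EMPLOYEE, INVOICE and LINE examples in the text, while the LINE attributes, the method names and all sample values are hypothetical.

```python
from dataclasses import dataclass, field
from datetime import date


@dataclass
class Person:
    """Parent class: attributes and methods shared by every PERSON."""
    name: str
    id_number: str
    date_of_birth: date

    def change_name(self, new_name: str) -> None:
        # A method: the OO counterpart of a procedure acting on the object's data.
        self.name = new_name


@dataclass
class Customer(Person):
    """Subclass of Person: inherits all of its attributes and methods."""


@dataclass
class Employee(Person):
    """Subclass of Person: inherits all of its attributes and methods."""


@dataclass
class Line:
    """One item on an invoice (attributes are hypothetical)."""
    description: str
    units: int
    unit_price: float

    def amount(self) -> float:
        return self.units * self.unit_price


@dataclass
class Invoice:
    """An INVOICE object contains its related CUSTOMER (1:1) and LINEs (1:*)."""
    inv_number: int
    inv_date: date
    customer: Customer
    lines: list[Line] = field(default_factory=list)

    def __post_init__(self) -> None:
        # Enforce the 1:* connectivity: at least one LINE per INVOICE.
        if not self.lines:
            raise ValueError("an INVOICE must contain at least one LINE")

    def total(self) -> float:
        return sum(line.amount() for line in self.lines)


# Hypothetical sample data.
customer = Customer("A. Smith", "ID-001", date(1990, 1, 1))
customer.change_name("A. Jones")  # method inherited from Person
invoice = Invoice(1001, date(2020, 3, 1), customer,
                  [Line("Widget", 2, 4.5), Line("Gadget", 1, 10.0)])
print(invoice.total())
```

Note how the related CUSTOMER and LINE objects live inside the INVOICE object itself, which mirrors the single object box of the OO representation; an ER design would instead keep three separate entities linked by two relationships.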

Online Content

A useful comparison between the OO and ER model components can be found in Table G.3, located in Appendix G, Object-Orientated Databases, available on the online platform for this book.

Note

It is important to note that not all data models are created equal; some data models are better suited than others to some tasks. For example, conceptual models are better suited to high-level data modelling, while implementation models are better for managing stored data for implementation purposes. The entity relationship model is an example of a conceptual model, while the hierarchical and network models are examples of implementation models. At the same time, some models, such as the relational model and the OODM, could be used as both conceptual and implementation models.


2.4.5 Other Models

Facing the demand to support more complex data representations, the relational model's main vendors evolved the model further and created the extended relational data model (ERDM). The ERDM adds many of the OO model's features within the inherently simpler relational database structure. The ERDM gave birth to a new generation of relational databases that support OO features such as objects (encapsulated data and methods), extensible data types based on classes and inheritance. That's why a DBMS based on the ERDM is often described as an object relational database management system (ORDBMS).

Today, most relational database products can be classified as object relational, and they represent the dominant market share of OLTP and OLAP database applications. The success of the ORDBMS can be attributed to the model's conceptual simplicity, data integrity, easy-to-use query language, high transaction performance, high availability, security, scalability and expandability. In contrast, the OODBMS is popular in niche markets such as computer-aided drawing/computer-aided manufacturing (CAD/CAM), geographic information systems (GIS), telecommunications and multimedia, which require support for more complex objects.

From the start, the OO and relational data models were developed in response to different problems. The OO data model was created to address very specific engineering needs, not the wide-ranging needs of general data management tasks. The relational model was created with a focus on better data management based on a sound mathematical foundation. Given its focus on a smaller set of problem areas, it is not surprising that the OO market has not grown as rapidly as the relational data model market. However, large DBMS vendors such as Oracle readily promote their once relational DBMS now as object relational, with each new release adding new functionality. This gives organisations more choice and flexibility in the design and development of new database applications and in the integration with existing OO applications.

The use of complex objects received a boost with the internet revolution. When organisations integrated their business models with the internet, they realised its potential to access, distribute and exchange critical business information. This resulted in the widespread adoption of the internet as a business communication tool. Within this environment, Extensible Markup Language (XML) emerged as the de facto standard for the efficient and effective exchange of structured, semi-structured and unstructured data. Organisations that use XML data soon realised that they needed to manage large amounts of unstructured data such as word-processing documents, Web pages, emails and diagrams. To address this need, XML databases emerged to manage unstructured data within a native XML format. (See Chapter 17, Database Connectivity and Web Technologies.) At the same time, ORDBMSs added support for XML-based documents within their relational data structure. Due to its robust foundation in broadly applicable principles, the relational model is easily extended to include new classes of capabilities, such as objects and XML.

Modelling spatial data for use in applications such as route optimisation (an ambulance finding the quickest route to a patient) or urban planning requires yet another type of data model. Spatial data comprises objects such as cities or forests that exist in a multi-dimensional space. Storing such data in a relational database would simply take up too much space and queries would be too long and complex to manage. A spatial database management system (SDBMS) is a database system with additional capabilities for handling spatial data. An SDBMS includes spatial data types (SDTs) in its data model and query language, for example the ability to model objects (forests, cities or rivers) in space using types such as POINT, LINE and REGION. The POINT data type refers to the object's centre point in the multi-dimensional space; the LINE data type is used to represent connections in multi-dimensional space, e.g. rivers or roads; and the REGION data type is a representation of an extent, e.g. a lake in a 2-D space. In addition, an SDBMS supports spatial indexing, allowing the fast retrieval of objects in a specific area, and efficient algorithms for supporting spatial joins. SDBMSs are often used to support GIS applications, one of the most popular today being Google Earth.
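To make the POINT, LINE and REGION types a little more concrete, the following Python sketch models them as simple classes and answers a "which objects lie in this area?" query. It is a minimal illustration, not how any particular SDBMS works: the class names mirror the type names in the text, the REGION is simplified to an axis-aligned rectangle, the city names and coordinates are hypothetical, and the query is a plain linear scan where a real SDBMS would use a spatial index.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Point:
    """POINT: an object's centre point in 2-D space."""
    x: float
    y: float


@dataclass(frozen=True)
class Line:
    """LINE: a connection such as a river or road, as a sequence of points."""
    points: tuple[Point, ...]


@dataclass(frozen=True)
class Region:
    """REGION: an extent, simplified here to an axis-aligned rectangle."""
    min_x: float
    min_y: float
    max_x: float
    max_y: float

    def contains(self, p: Point) -> bool:
        return (self.min_x <= p.x <= self.max_x
                and self.min_y <= p.y <= self.max_y)


def points_in_region(named_points: dict[str, Point], region: Region) -> list[str]:
    """Return the names of all points inside the region (linear scan, no index)."""
    return sorted(name for name, p in named_points.items() if region.contains(p))


# Hypothetical cities represented by their centre points.
cities = {
    "Alphaville": Point(1.0, 1.0),
    "Betatown": Point(2.5, 3.0),
    "Gammaport": Point(9.0, 9.0),
}
road = Line((cities["Alphaville"], cities["Betatown"]))
search_area = Region(0.0, 0.0, 5.0, 5.0)
result = points_in_region(cities, search_area)
print(result)
```

The scan visits every stored point; spatial indexing exists precisely to avoid that cost when the number of objects is large.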


Although relational and object relational databases address most current data processing needs, a new generation of databases has emerged to address some very specific challenges found in some organisations.

2.4.6 Emerging Data Models: Big Data and NoSQL

Deriving usable business information from the mountains of Web data that organisations have accumulated over the years has become an imperative need. Web data in the form of browsing patterns, purchasing histories, customer preferences, behaviour patterns and social media data from sources such as Facebook, Twitter and LinkedIn have inundated organisations with combinations of structured and unstructured data. According to many studies, the rapid pace of data growth is the top challenge for organisations,³ with system performance and scalability as the next biggest challenges. Today's information technology (IT) managers are constantly balancing the need to manage this rapidly growing data with shrinking budgets. The need to manage and leverage all these converging trends (rapid data growth, performance, scalability and lower costs) has triggered a phenomenon called Big Data. Big Data refers to a movement to find new and better ways to manage large amounts of Web-generated data and derive business insight from it, while simultaneously providing high performance and scalability at a reasonable cost. (You will learn about NoSQL in more detail in Chapter 16, Big Data and NoSQL.)

The problem is that the relational approach does not always match the needs of organisations with Big Data challenges:

It is not always possible to fit unstructured, social media data into the conventional relational structure of rows and columns.

Adding millions of rows of multiformat (structured and non-structured) data on a daily basis will inevitably lead to the need for more storage, processing power and sophisticated data analysis tools that may not be available in the relational environment. Generally speaking, the type of high-volume implementations required in the RDBMS environment for the Big Data problem comes with a hefty price tag for expanding hardware, storage and software licences.

Data analysis based on OLAP tools has proven to be very successful in relational environments with highly structured data. However, mining for usable data in the vast amounts of unstructured data collected from Web sources requires a different approach.

There is no one-size-fits-all cure to data management needs (although many established database vendors will probably try to sell you on the idea). For some organisations, creating a highly scalable, fault-tolerant infrastructure for Big Data analysis could prove to be a matter of business survival. The business world has many examples of companies that leverage technology to gain a competitive advantage, and others that miss it. Just ask yourself how the business landscape would be different if:

MySpace had responded to Facebook's challenge in time.

Blockbuster had reacted to the Netflix business model sooner.

Barnes & Noble had developed a viable internet strategy before Amazon.

Therefore, it is not surprising that some organisations are turning to NoSQL databases to mine the wealth of information hidden in mountains of Web data and to gain a competitive advantage.

³ See www.gartner.com/en/newsroom/press-releases/2019-02-18-gartner-identifies-top-10-data-and-analytics-technolo, Gartner Identifies Top 10 Data and Analytics Technology Trends for 2019, February 2019.

Note

Does this mean that relational databases don't have a place in businesses? No, relational databases remain the preferred and dominant databases to support most day-to-day transactions and structured data analytics needs. Each DBMS technology has its areas of application, and the best approach is to use the best tool for the job. In perspective, in September 2019, relational databases were still significantly the most dominant.

2.4.7 NoSQL Databases

Every time you search for a product on Amazon, send messages to friends on Facebook, watch a video on YouTube or search for directions in Google Maps, you are using a NoSQL database. As with any new technology, the term NoSQL can be loosely applied to many different types of technologies. However, this chapter uses NoSQL to refer to a new generation of databases that address the specific challenges of the Big Data era and have the following general characteristics:

Not based on the relational model, hence the name NoSQL.

Supports distributed database architectures.

Provides high scalability, high availability and fault tolerance.

Supports very large amounts of sparse data.

Geared towards performance rather than transaction consistency.

Let's examine these characteristics in more detail.

NoSQL databases are not based on the relational model. In fact, there is no standard NoSQL data model. To the contrary, many different data models are grouped under the NoSQL umbrella, from document databases to graph stores, column stores and key-value stores. It is still too early to know which, if any, of these data models will survive and grow to become a dominant force in the database arena. However, the early success of products such as Amazon's SimpleDB, Google's Bigtable and Apache's Cassandra points to the key-value stores and column stores as the early leaders. The word stores indicates that these data models permanently store data in secondary storage, just like any other database. This added emphasis comes from the fact that these data models originated from programming languages (such as LISP), in which in-memory arrays of values are used to hold data.

The key-value data model is based on a structure composed of two data elements: a key and a value, in which every key has a corresponding value or set of values. The key-value data model is also referred to as the attribute-value or associative data model. To better understand the key-value model, look at the simple example in Figure 2.6.

Figure 2.6 shows the example of a small truck-driving company called Trucks-R-Us. Each of the three drivers has one or more certifications and other general information. Using this example, we can draw the following important points:

In the relational model, every row represents a single entity occurrence and every column represents an attribute of the entity occurrence. Each column has a defined data type.

In the key-value data model, each row represents one attribute of one entity instance. The key column points to an attribute and the value column contains the actual value for the attribute.

The data type of the value column is generally a long string to accommodate the variety of actual data types of the values placed in the column.

Figure 2.6 A simple key-value representation

(Trucks-R-Us data shown in two panels: data stored using the traditional relational model, and data stored using the key-value model. In the relational model, each row represents one entity instance, each column represents one attribute of the entity, and the values in a column are of the same data type. In the key-value model, each row represents one attribute/value of one entity instance, the key column could represent any entity's attribute, and the values in the value column could be of any data type, so the column is generally assigned a long string data type.)

SOURCE: Course Technology/Cengage Learning

To add a new entity attribute in the relational model, you need to modify the table definition. To add a new attribute in the key-value store, you add a row to the key-value store, which is why it is said to be schema-less. NoSQL databases do not store or enforce relationships among entities. The programmer is required to manage the relationships in the program code. Furthermore, all data and integrity validations must be done in the program code (although some implementations have been expanded to support metadata).

NoSQL databases use their own native application programming interface (API) with simple data access commands, such as put, read and delete. Because there is no declarative SQL-like syntax to retrieve data, the program code must take care of retrieving related data in the correct way. Indexing and searches can be difficult. Because the value column in the key-value data model could contain many different data types, it is often difficult to create indexes on the data. At the same time, searches can become very complex. As a matter of fact, you could use the key-value structure as a general data modelling technique when attributes are numerous but actual data values are scarce. The key-value data model is not exclusive to NoSQL databases; key-value data structures could also reside inside a relational database. However, because of the problems with maintaining relationships and integrity within the data, and the increased complexity of even simple queries, key-value structures would be a poor design for most structured business data.
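The behaviour described above (schema-less rows, simple put, read and delete commands, and program code that has to assemble related data itself) can be illustrated with a toy in-memory key-value store. This Python sketch is an assumption-laden illustration only: the class, its method signatures and the attribute names and values for driver 2732 are hypothetical and do not correspond to the API of any real NoSQL product.

```python
from typing import Optional


class KeyValueStore:
    """Toy in-memory key-value store (hypothetical API, for illustration only)."""

    def __init__(self) -> None:
        # Each row is identified by (entity id, key); every value is a string.
        self._rows: dict[tuple[str, str], str] = {}

    def put(self, entity_id: str, key: str, value: str) -> None:
        # Adding a new attribute is just adding a row: no table definition changes.
        self._rows[(entity_id, key)] = value

    def read(self, entity_id: str, key: str) -> Optional[str]:
        return self._rows.get((entity_id, key))

    def delete(self, entity_id: str, key: str) -> None:
        self._rows.pop((entity_id, key), None)

    def entity(self, entity_id: str) -> dict[str, str]:
        # No declarative query language: program code gathers the related rows.
        return {k: v for (eid, k), v in self._rows.items() if eid == entity_id}


store = KeyValueStore()
store.put("2732", "FNAME", "Sarah")      # hypothetical attribute values
store.put("2732", "CERT", "Hazmat")
store.put("2732", "PHONE", "555-0100")   # a brand-new attribute, no schema change
store.delete("2732", "CERT")
print(store.entity("2732"))
```

Because every value is stored as a string and nothing validates keys or values, any type checking, integrity rule or relationship between drivers and other entities would have to be written into the application code.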

NoSQL

database implementations,

such as Googles

Bigtable

and Apaches

Cassandra,

have

extended the key-value data model to group multiple key-value sets into column families or column stores. In addition, such implementations support features such as versioning using a date/time stamp. For example, Bigtable stores data in the syntax of [row, column, time, value], where row, column and

Copyright Editorial

review

2020 has

Cengage deemed

Learning. that

any

All suppressed

Rights

Reserved. content

does

May not

not materially

be

copied, affect

scanned, the

overall

or

duplicated, learning

in experience.

whole

or in Cengage

part.

Due Learning

to

electronic reserves

rights, the

right

some to

third remove

party additional

content

may content

be

suppressed at

any

time

from if

the

subsequent

eBook rights

and/or restrictions

eChapter(s). require

it

54

part

I

Database

value

Systems

are string

composed NoSQL is

2

that

data types

of (row, supports

they

nodes.

to

use

that

supports particular,

very large

but the number

any certification

very large they

exam,

possible

certificates

only four

data instances.

than

for

each

not required

driver,

there

extrapolate

500 possible tests, remembering NoSQL

provides

databases to the

are

distributed

tolerance

means that,

Most NoSQL of the

biggest

databases fault

problems

with

If the

a copy

In a relational back.

the system

NoSQL only

you

data.

sacrifice

is

need

of

to

of the

technology.

hottest

to

in

the

best

section

and disadvantages

which

data tool

briefly

goes

levels

about

of

this

means that

updates

of the

technologies

for

the

job

summarises

performance.

today.

But,

from

and

any

a data

other

update?

is rolled

Chapter

NoSQL

database

12,

databases

will propagate

consistency,

as you learnt

database

understanding

the evolution

Distributed

data are

after an update.

Whichever

by

One

availability

(See

Some

With eventual

data immediately

management.

high

during

to the

Fault

as normal.

or the transaction

topic.)

NoSQL

consistency.

be served

down

all.

of nodes

operating

ensure

can

more

downtime.

data consistency. to

be consistent

will be consistent.

database

in

high

and

Web origins,

will keep

request

network

to

more

all copies

trends

select

The following

advantages

items

attain

learn

all data copies across

many emerging

be able

to to

consistency,

be consistent

if the

it

are

to take

in the form

than transaction

nodes

the

there

patients

without

is

can take

however,

capacity and

volumes

and three

but is not required

enforcing

down,

happens

consistency

eventual

goes

high

drivers

with 15 000

add

fails,

multiple

are guaranteed

Concurrency,

and eventually

to one

one

and

what

is

at

most

of attributes

drivers

are three

True to its

to

rather

databases

data

updates

of a clinic

database

elements

of the

very

practice,

a few tests

ability

performance

distributed

However,

In

do it transparently

distributed

of data

transaction

called

in the

Bigtable)

number

example,

if there

points.

case

handle

preceding

data

as the

and to

value.

database

of some

which the

and fault tolerance.

high,

with the requested

databases

not guaranteed

it is

node

of the

a feature

nodes

is

labs

data is

databases

of distributed

can

in

case,

can take

such

are geared towards

make copies

Transactions

provide

of the

of very large

database,

NoSQL

Managing

through

one

automatically

tolerance.

node

if

demand

databases

Using the

for the

Web operations,

when the

databases

example

NoSQL

(Cassandra,

network

cases

all. In this

each patient

of

of them

in the research

possible

high availability

to support

database

this

that

high scalability,

designed

will be nine

the stored

budgets!

NoSQL

is low.

access

most recent

big advantages

a complex

that is, for

to take

used to the

several

small

data.

data

data instances

are

Now

sparse

of the

originated

on very

The key

to indicate

fact,

to form

of sparse

for

of actual

One

databases

amounts

but they

blank

In

servers

most started

are suited

data type.

be left

architecture.

NoSQL

and

can

architecture.

commodity

several

NoSQL

a date/time time

database

Web companies,

of data. In

is

where

a distributed

use low-cost

Remember

successful

time),

distributed

generally

are designed

and time

column,

the

of data

in

Chapter

technology pros

and

1,

you cons

use,

of each

models and provides

some

of each.
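The [row, column, time, value] layout mentioned for Bigtable, with a key of (row, column, time) in which time may be left blank to get the most recent value, can be sketched as follows. This Python sketch only imitates that addressing scheme: the class and method names are hypothetical, integers stand in for date/time stamps to keep the example short, and nothing here reflects Bigtable's actual API or storage design.

```python
from typing import Optional


class VersionedStore:
    """Toy store of versioned cells addressed by (row, column, time)."""

    def __init__(self) -> None:
        # (row, column) -> list of (time, value) pairs, kept sorted by time.
        self._cells: dict[tuple[str, str], list[tuple[int, str]]] = {}

    def put(self, row: str, column: str, time: int, value: str) -> None:
        versions = self._cells.setdefault((row, column), [])
        versions.append((time, value))
        versions.sort(key=lambda tv: tv[0])

    def read(self, row: str, column: str,
             time: Optional[int] = None) -> Optional[str]:
        versions = self._cells.get((row, column), [])
        if not versions:
            return None
        if time is None:
            # Time left blank: return the most recent stored value.
            return versions[-1][1]
        for t, v in versions:
            if t == time:
                return v
        return None


# Hypothetical data: two versions of one driver's certification.
cells = VersionedStore()
cells.put("2732", "CERT", 1, "Class B")
cells.put("2732", "CERT", 2, "Class A")
latest = cells.read("2732", "CERT")
first = cells.read("2732", "CERT", time=1)
print(latest, first)
```

Keeping older versions alongside the newest one is what allows a read to ask either for the current value or for the value as it stood at a given time.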

2.4.8 Data Models: A Summary

The evolution of DBMSs has always been driven by the search for new ways of modelling increasingly complex real-world data. A summary of the most commonly recognised data models is shown in Figure 2.7.

In the evolution of data models, there are some common characteristics that data models must have in order to be widely accepted:

A data model must show some degree of conceptual simplicity without compromising the semantic completeness of the database. It does not make sense to have a data model that is more difficult to conceptualise than the real world.

A data model must represent the real world as closely as possible. This goal is more easily realised by adding more semantics to the model's data representation. (Semantics concern the dynamic data behaviour, while data representation constitutes the static aspect of the real-world scenario.)

Representation of the real-world transformations (behaviour) must be in compliance with the consistency and integrity characteristics of any data model.

Figure 2.7 The evolution of data models

(Timeline ordered from least to most semantics in the data model. 1960 Hierarchical and 1969 Network: difficult to represent M:N relationships (hierarchical only); structural level dependency; no ad hoc queries (record-at-a-time access); access path predefined (navigational access). 1970 Relational: conceptual simplicity (structural independence); provides ad hoc queries (SQL); set-oriented access. 1976 Entity Relationship: easy to understand (more semantics); limited to conceptual modelling (no implementation component). 1978 Semantic, 1985 Object-Oriented and 1990 Extended Relational (O/R DBMS): more semantics in data model; support for complex objects; inheritance (class hierarchy); behaviour; unstructured data (XML). 1983: Internet is born. 2009 Big Data, NoSQL: addresses Big Data problem; less semantics in data model; based on schema-less key-value data model; best suited for large sparse data stores.)

SOURCE: Course Technology/Cengage Learning

Each new data model capitalised on the shortcomings of previous models. The network model replaced the hierarchical model because the former made it much easier to represent complex (many-to-many) relationships. In turn, the relational model offered several advantages over the hierarchical and network models through its simpler data representation, superior data independence and easy-to-use query language; it also emerged as the dominant data model for business applications. The OO data model introduced support for complex data within a rich semantic framework. The ERDM added many OO features to the relational model and allowed it to maintain strong market share within the business environment. In recent years, the Big Data phenomenon has stimulated the development of alternative ways to model, store and manage data that represents a break with traditional data management.

Note

It is important to note that not all data models are created equal; some data models are better suited than others for some tasks. For example, conceptual models are better suited to high-level data modelling, while implementation models are better for managing stored data for implementation purposes. The entity relationship model is an example of a conceptual model, while the hierarchical and network models are examples of implementation models. At the same time, some models, such as the relational model and the OODM, could be used as both conceptual and implementation models.

Table 2.2 summarises the advantages and disadvantages of the various database models.

Table 2.2 Advantages and disadvantages of various database models

Hierarchical
Data independence: Yes
Structural independence: No
Advantages
1. It promotes data sharing.
2. Parent/child relationship promotes conceptual simplicity.
3. Database security is provided and enforced by DBMS.
4. Parent/child relationship promotes data integrity.
5. It is efficient with 1:M relationships.
Disadvantages
1. Complex implementation requires knowledge of physical data storage characteristics.
2. Navigational system yields complex application development, management and use; requires knowledge of hierarchical path.
3. Changes in structure require changes in all application programs.
4. There are implementation limitations (no multiparent or M:N relationships).
5. There is no data definition or data manipulation language in the DBMS.
6. There is a lack of standards.

Network
Data independence: Yes
Structural independence: No
Advantages
1. Conceptual simplicity is at least equal to that of the hierarchical model.
2. It handles more relationship types, such as M:N and multiparent.
3. Data access is more flexible than in hierarchical and file system models.
4. Data owner/member relationship promotes data integrity.
5. There is conformance to standards.
6. It includes data definition language (DDL) and data manipulation language (DML) in DBMS.
Disadvantages
1. System complexity limits efficiency; it is still a navigational system.
2. Navigational system yields complex implementation, application development and management.
3. Structural changes require changes in all application programs.

Relational
Data independence: Yes
Structural independence: Yes
Advantages
1. Structural independence is promoted by the use of independent tables. Changes in a table's structure do not affect data access or application programs.
2. Tabular view substantially improves conceptual simplicity, thereby promoting easier database design, implementation, management and use.
3. Ad hoc query capability is based on SQL.
4. Powerful RDBMS isolates the end user from physical-level details and improves implementation and management simplicity.
Disadvantages
1. The RDBMS requires substantial hardware and system software overhead.
2. Conceptual simplicity gives relatively untrained people the tools to use a good system poorly and, if unchecked, it may produce the same data anomalies found in file systems.
3. It may promote islands of information problems as individuals and departments can easily develop their own applications.

Entity Relationship
Data independence: Yes
Structural independence: Yes
Advantages
1. Visual modelling yields exceptional conceptual simplicity.
2. Visual representation makes it an effective communication tool.
3. It is integrated with the dominant relational model.
Disadvantages
1. There is limited constraint representation.
2. There is limited relationship representation.
3. There is no data manipulation language.
4. Loss of information content occurs when attributes are removed from entities to avoid crowded displays. (This limitation has been addressed in subsequent graphical versions.)

Object-Orientated
Data independence: Yes
Structural independence: Yes
Advantages
1. Semantic content is added.
2. Visual representation includes semantic content.
3. Inheritance promotes data integrity.
Disadvantages
1. Slow development of standards caused vendors to supply their own enhancements, thus eliminating a widely accepted standard.
2. It is a complex navigational system.
3. There is a steep learning curve.
4. High system overhead slows transactions.

NoSQL
Data independence: Yes
Structural independence: Yes
Advantages
1. High scalability, availability and fault tolerance are provided.
2. It uses low-cost commodity hardware.
3. It supports Big Data.
4. Key-value model improves storage efficiency.
Disadvantages
1. Complex programming is required.
2. There is no relationship support; relationships are handled only by application code.
3. There is no transaction integrity support.
4. In terms of data consistency, it provides an eventually consistent model.

2.5 Degrees of Data Abstraction

In the early 1970s, the American National Standards Institute (ANSI) Standards Planning and Requirements Committee (SPARC) defined a framework for data modelling based on degrees of data abstraction. To illustrate the meaning of data abstraction, consider the example of automotive design. A car designer begins by drawing the concept of the car that is to be produced. Next, engineers design the details that help transfer the basic concept into a structure that can be produced. Finally, the engineering drawings are translated into production specifications to be used on the factory floor. As you can see, the process of producing the car begins at a high level of abstraction and proceeds to an ever-increasing level of detail. The factory floor process cannot proceed unless the engineering details are properly specified, and the engineering details cannot exist without the basic conceptual framework created by the designer. Designing a usable database follows the same basic process. That is, a database designer starts with an abstract view of the overall data environment and adds details as the design comes closer to implementation. Using levels of abstraction can also be very helpful in integrating multiple (and sometimes conflicting) views of data as seen at different levels of an organisation.

The ANSI/SPARC architecture (as it is often referred to) defines three levels of data abstraction: external, conceptual and internal. You can use this framework to better understand database models, as shown in Figure 2.8. In the figure, the ANSI/SPARC framework has been expanded with the addition of a physical model to address the physical-level implementation details of the internal model explicitly.

Figure 2.8 Data abstraction levels

[Figure: two end-user views map to two external models, which are integrated into the conceptual model (the designer's view). The conceptual model maps to the internal model (the DBMS view), which in turn maps to the physical model. Logical independence lies between the conceptual and internal models; physical independence lies between the internal and physical models.]

Degree of abstraction   Models                  Characteristics
High                    ER, Object-Orientated   Hardware-independent, software-independent
Medium                  Relational              Hardware-independent, software-dependent
Low                     Network, Hierarchical   Hardware-dependent, software-dependent


2.5.1 The External Model

The external model is the end users' view of the data environment. The term end users refers to people who use the application programs to manipulate the data and generate information. End users usually operate in an environment in which an application has a specific business unit focus. Companies are generally divided into several business units, such as sales, finance and marketing. Each business unit is subject to specific constraints and requirements, and each one uses a data subset of the overall data in the organisation. Therefore, end users working within those business units view their data subsets as separate from or external to those of other units within the organisation.

As data is being modelled, ER diagrams will be used to represent the external views. A specific representation of an external view is known as an external schema. To illustrate the external model's view, examine the data environment of Tiny University. Figure 2.9 (a) and (b) presents the external schemas for two Tiny University business units: student registration and class scheduling. Each external schema includes the appropriate entities, relationships, processes and constraints imposed by the business unit. Also note that, although the application views are isolated from each other, each view shares a common entity with the other view. For example, the registration and scheduling external schemas share the entities CLASS and COURSE.

Note the entity relationships represented in Figure 2.9. For example:

A LECTURER may teach many CLASSes, and each CLASS is taught by only one LECTURER; that is, there is a 1:* relationship between LECTURER and CLASS.

A CLASS may ENROL many students, and each student may ENROL in many CLASSes, thus creating a *:* relationship between STUDENT and CLASS. (You will learn about the precise nature of the ENROL entity in Chapter 5, Data Modelling with Entity Relationship Diagrams.)

Each COURSE may generate many CLASSes, but each CLASS references a single COURSE. For example, there may be several classes (sections) of a database course having a course code of CIS-420. One of those classes may be offered on Mondays, Wednesdays and Fridays from 8:00 a.m. to 8:50 a.m., another may be offered on Mondays, Wednesdays and Fridays from 1:00 p.m. to 1:50 p.m., while a third may be offered on Thursdays from 6:00 p.m. to 8:40 p.m. Yet all three classes have the course code CIS-420.

Finally, a CLASS requires one ROOM, but a ROOM may be scheduled for many CLASSes; that is, each classroom may be used for several classes: one at 9:00 a.m., one at 11:00 a.m., and one at 1:00 p.m., for example. In other words, there is a 1:* relationship between ROOM and CLASS.

The use of external views representing subsets of the database has some important advantages:

It makes it easy to identify specific data required to support each business unit's operations.

It makes the designer's job easy by providing feedback about the model's adequacy. Specifically, the model can be checked to ensure that it supports all processes as defined by their external models, as well as all operational requirements and constraints.

It helps to ensure security constraints in the database design. Damaging an entire database is more difficult when each business unit works with only a subset of data.

It makes application program development much simpler.
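One common way to give each business unit its own subset of the data in a relational DBMS is to define a view per unit. The sketch below is only an illustration of that idea using Python's built-in sqlite3 module; the table layouts, column names and sample rows are hypothetical assumptions and are not taken from the book's figures, and the registration view is simplified to classes and courses.

```python
import sqlite3

# Hypothetical, heavily simplified base tables; the column names are
# illustrative assumptions, not the book's actual schema.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE course   (crs_code TEXT PRIMARY KEY, crs_title TEXT);
CREATE TABLE room     (room_code TEXT PRIMARY KEY);
CREATE TABLE lecturer (lect_id INTEGER PRIMARY KEY, lect_name TEXT);
CREATE TABLE class (
    class_code INTEGER PRIMARY KEY,
    crs_code   TEXT REFERENCES course(crs_code),
    room_code  TEXT REFERENCES room(room_code),
    lect_id    INTEGER REFERENCES lecturer(lect_id),
    class_time TEXT
);

-- External view for the registration unit: classes and their courses only.
CREATE VIEW registration_view AS
    SELECT c.class_code, c.crs_code, k.crs_title, c.class_time
    FROM class c JOIN course k ON k.crs_code = c.crs_code;

-- External view for the scheduling unit: classes with rooms and lecturers.
CREATE VIEW scheduling_view AS
    SELECT c.class_code, c.crs_code, c.room_code, l.lect_name, c.class_time
    FROM class c JOIN lecturer l ON l.lect_id = c.lect_id;
""")
conn.execute("INSERT INTO course VALUES ('CIS-420', 'Database course')")
conn.execute("INSERT INTO room VALUES ('R101')")
conn.execute("INSERT INTO lecturer VALUES (1, 'A. Lecturer')")
conn.execute(
    "INSERT INTO class VALUES (10012, 'CIS-420', 'R101', 1, 'MWF 8:00-8:50')"
)


def columns(view):
    """Return the column names that a view exposes to its users."""
    cursor = conn.execute(f"SELECT * FROM {view}")
    return [description[0] for description in cursor.description]


registration_columns = columns("registration_view")
scheduling_columns = columns("scheduling_view")
```

Both views share CLASS and COURSE data, yet the registration view never exposes room or lecturer details, which mirrors the point that each unit works with only a subset of the data.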

Figure 2.9 External models for Tiny University

[Figure (a) Student registration: STUDENT (1..1) enrols_in ENROL (1..6); ENROL (1..35) is_taken_by CLASS (1..1); COURSE (1..1) generates CLASS (1..*). Notes: a student may take up to six classes per registration; a class is limited to 35 students.]

[Figure (b) Class scheduling: ROOM (1..1) is_used_for CLASS (1..*); COURSE (1..1) generates CLASS (1..*); LECTURER (1..1) teaches CLASS (1..3). Notes: a room may be used to teach many classes; each class is taught in only one room; each class is taught by one lecturer; a lecturer may teach up to three classes.]
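The constraints noted in Figure 2.9 (at most six classes per student, at most 35 students per class, at most three classes per lecturer, and one lecturer per class) can be expressed as simple checks. The following is a minimal sketch under the assumption that enrolments and teaching assignments are held as lists of pairs; the function name and data layout are hypothetical and only illustrate how such business rules might be verified.

```python
from collections import Counter

# Limits read from the notes in Figure 2.9.
MAX_CLASSES_PER_STUDENT = 6
MAX_STUDENTS_PER_CLASS = 35
MAX_CLASSES_PER_LECTURER = 3


def check_constraints(enrolments, teaching):
    """Return a sorted list of messages for violated business rules.

    enrolments: list of (student_id, class_code) pairs (one per ENROL row).
    teaching:   list of (lecturer_id, class_code) pairs.
    """
    problems = []
    per_student = Counter(student for student, _ in enrolments)
    per_class = Counter(class_code for _, class_code in enrolments)
    per_lecturer = Counter(lecturer for lecturer, _ in teaching)
    lecturers_per_class = Counter(class_code for _, class_code in teaching)

    for student, count in per_student.items():
        if count > MAX_CLASSES_PER_STUDENT:
            problems.append(f"student {student} takes {count} classes")
    for class_code, count in per_class.items():
        if count > MAX_STUDENTS_PER_CLASS:
            problems.append(f"class {class_code} has {count} students")
    for lecturer, count in per_lecturer.items():
        if count > MAX_CLASSES_PER_LECTURER:
            problems.append(f"lecturer {lecturer} teaches {count} classes")
    for class_code, count in lecturers_per_class.items():
        if count > 1:
            problems.append(f"class {class_code} has {count} lecturers")
    return sorted(problems)


# Example: one student enrolled in seven classes, one lecturer teaching four.
violations = check_constraints(
    [("S1", code) for code in range(7)],
    [("L1", code) for code in range(4)],
)
```

In a production database such rules would more likely be enforced by the DBMS or the application layer; the sketch only makes the cardinalities in the figure concrete.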

2.5.2 The Conceptual Model

Having identified the external views, a conceptual model is used, graphically represented by an ERD (Figure 2.10), to integrate all external views into a single view. The conceptual model represents a global view of the entire database. It is a representation of data as viewed by the entire organisation. That is, the conceptual model integrates all external views (entities, relationships, constraints and processes) into a single global view of the entire data in the enterprise, known as a conceptual schema. The conceptual schema is the basis for the identification and high-level description of the main data objects (avoiding any database model specific details).

The most widely used conceptual model is the ER model. Remember that the ER model is illustrated with the help of the ERD, which is, in effect, the basic database blueprint. The ERD is used to graphically represent the conceptual schema.

The conceptual model yields some very important advantages. First, it provides a relatively easily understood bird's-eye (macro-level) view of the data environment. For example, you can get a summary of Tiny University's data environment by examining the conceptual model presented in Figure 2.10.

Second, the conceptual model is independent of both software and hardware. Software independence means that the model does not depend on the DBMS software used to implement the model. Hardware independence means that the model does not depend on the hardware used in the implementation of the model. Therefore, changes in either the hardware or the DBMS software will have no effect on the database design at the conceptual level. Generally, the term logical design is used to refer to the task of creating a conceptual data model that could be implemented in any DBMS.

Figure 2.10 Conceptual model for Tiny University

[Figure: an ERD integrating the external views; only the relationship label enrols_in survives in this copy.]
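The integration of external views into one conceptual schema can be pictured as a union of the relationships each view contains. The sketch below assumes, purely for illustration, that each external schema from Figure 2.9 is encoded as a set of (entity, relationship, entity) triples; this encoding and the helper name are hypothetical and are not a notation used by the book.

```python
# Each external schema is written as a set of (entity, relationship, entity)
# triples read from Figure 2.9; the encoding is an illustrative assumption.
registration = {
    ("STUDENT", "enrols_in", "ENROL"),
    ("CLASS", "is_taken_by", "ENROL"),
    ("COURSE", "generates", "CLASS"),
}
scheduling = {
    ("ROOM", "is_used_for", "CLASS"),
    ("COURSE", "generates", "CLASS"),
    ("LECTURER", "teaches", "CLASS"),
}


def entities(schema):
    """Collect every entity name that appears in a schema."""
    return {name for left, _, right in schema for name in (left, right)}


# The conceptual schema is the single global view: the union of all views.
conceptual = registration | scheduling

# Entities that appear in both external views tie the views together.
shared = entities(registration) & entities(scheduling)
all_entities = sorted(entities(conceptual))
```

Here the shared entities are CLASS and COURSE, and the integrated schema contains the six entities LECTURER, COURSE, CLASS, STUDENT, ENROL and ROOM, with the duplicated "generates" relationship appearing only once.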

2.5.3 The Internal Model

Once a specific DBMS has been selected, the internal model maps the conceptual model to the DBMS. The internal model is the representation of the database as seen by the DBMS. In other words, the internal model requires the designer to match the conceptual model's characteristics and constraints to those of the selected implementation model. An internal schema depicts a specific representation of an internal model, using the database constructs supported by the chosen database.

Since this book focuses on the relational model, a relational database was chosen to implement the internal model. Therefore, the internal schema should map the conceptual model to the relational model constructs. In particular, the entities in the conceptual model are mapped to tables in the relational model. Likewise, since a relational database has been selected, the internal schema is expressed using SQL, the standard language for relational databases. In the case of the conceptual model for Tiny University depicted in Figure 2.10, the internal model was implemented by creating the tables LECTURER, COURSE, CLASS, STUDENT, ENROL and ROOM. A simplified version of the internal model for Tiny University is shown in Figures 2.11 (a) and (b).

The development of a detailed internal model is especially important to database designers who work with hierarchical or network models because those models require very precise specification of data storage location and data access paths. In contrast, the relational model requires less detail in its internal model because most RDBMSs handle data access path definition transparently; that is, the designer need not be aware of the data access path details. Nevertheless, even relational database software usually requires data storage location specification, especially in a mainframe environment. For example, DB2 requires that the data storage group, the location of the database within the storage group, and the location of the tables within the database be specified.

Because the internal model depends on specific database software, it is said to be software-dependent. Therefore, a change in the DBMS software requires that the internal model be changed to fit the characteristics and requirements of the implementation database model. When you can change the internal model without affecting the conceptual model, you have logical independence. However, the internal model is also hardware-independent, because it is unaffected by the choice of the computer on which the software is installed. Therefore, a change in storage devices or even a change in operating systems will not affect the internal model.

2.5.4 The Physical Model

The physical model operates at the lowest level of abstraction, describing the way data are saved on storage media such as disks or tapes. The physical model requires the definition of both the physical storage devices and the (physical) access methods required to reach the data within those storage devices, making it both software- and hardware-dependent. The storage structures used are dependent on the software (DBMS, operating system) and on the type of storage devices that the computer can handle. The precision required in the physical model's definition demands that database designers who work at this level have a detailed knowledge of the hardware and software used to implement the database design.

Early data models forced the database designer to take the details of the physical model's data storage requirements into account. However, the now-dominant relational model is aimed largely at the logical rather than the physical level; therefore, it does not require the physical-level details common to its predecessors.

Figure 2.11 An internal model for Tiny University
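Since the content of Figure 2.11 is not reproduced here, the following sketch shows one way the six Tiny University tables named in Section 2.5.3 might be declared in SQL. It is an assumption-laden illustration, run through Python's built-in sqlite3 module: all column names, data types and keys are hypothetical, and the book's own figure may differ. Each 1:* relationship is represented by a foreign key on the "many" side, and the *:* relationship between STUDENT and CLASS is resolved through ENROL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical columns; only the six table names come from the text.
conn.executescript("""
CREATE TABLE lecturer (lect_id INTEGER PRIMARY KEY, lect_name TEXT NOT NULL);
CREATE TABLE course   (crs_code TEXT PRIMARY KEY, crs_title TEXT NOT NULL);
CREATE TABLE room     (room_code TEXT PRIMARY KEY);
CREATE TABLE student  (stu_id INTEGER PRIMARY KEY, stu_name TEXT NOT NULL);

-- CLASS sits on the "many" side of three 1:* relationships.
CREATE TABLE class (
    class_code INTEGER PRIMARY KEY,
    crs_code   TEXT    NOT NULL REFERENCES course(crs_code),
    lect_id    INTEGER NOT NULL REFERENCES lecturer(lect_id),
    room_code  TEXT    NOT NULL REFERENCES room(room_code)
);

-- ENROL resolves the *:* relationship between STUDENT and CLASS.
CREATE TABLE enrol (
    stu_id     INTEGER NOT NULL REFERENCES student(stu_id),
    class_code INTEGER NOT NULL REFERENCES class(class_code),
    PRIMARY KEY (stu_id, class_code)
);
""")

# List the tables the DBMS now knows about.
tables = [
    row[0]
    for row in conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    )
]
```

The statements are specific to the chosen DBMS dialect (here SQLite), which is what makes the internal model software-dependent, while nothing in them refers to a particular computer or storage device.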

Although the relational model does not require the designer to be concerned about the data's physical storage characteristics, the implementation of a relational model may require physical-level fine-tuning for increased performance. Fine-tuning is especially important when very large databases are installed in a mainframe environment. Yet even such performance fine-tuning at the physical level does not require knowledge of physical data storage characteristics.

As noted earlier, the physical model is dependent on the DBMS, file level access methods and types of hardware storage devices supported by the operating system. When you can change the physical model without affecting the internal model, you have physical independence. Therefore, a change in storage devices or methods and even a change in operating system will not affect the internal model.

A summary of the levels of data abstraction is given in Table 2.3.
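Physical independence can be observed in a small experiment: adding an index changes how the DBMS reaches the data, but the table definition, the query text and the result stay the same. The sketch below uses Python's built-in sqlite3 module; the table, the index name and the sample rows are hypothetical, and an index is used here only as one example of a physical-level change.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE class (class_code INTEGER PRIMARY KEY, crs_code TEXT, room_code TEXT)"
)
conn.executemany(
    "INSERT INTO class VALUES (?, ?, ?)",
    [
        (10012, "CIS-420", "R101"),
        (10013, "CIS-420", "R102"),
        (10014, "ACCT-211", "R101"),
    ],
)

QUERY = "SELECT class_code FROM class WHERE crs_code = ? ORDER BY class_code"

# Result before any physical-level tuning.
before = conn.execute(QUERY, ("CIS-420",)).fetchall()

# Physical-level change: add an access structure. The table definition
# and the query text are left untouched.
conn.execute("CREATE INDEX idx_class_crs ON class (crs_code)")

# Same query, same answer; only the access method may have changed.
after = conn.execute(QUERY, ("CIS-420",)).fetchall()
```

Applications that issue the query need no modification after the index is created, which is the practical benefit of keeping the physical level separate from the levels above it.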

Table 2.3 Levels of data abstraction

Model        Degree of abstraction   Focus                                                  Independent of
External     High                    End-user views                                         Hardware and software
Conceptual                           Global view of data (independent of database model)   Hardware and software
Internal                             Specific database model                                Hardware
Physical     Low                     Storage and access methods                             Neither hardware nor software

Summary

A data model is a (relatively) simple abstraction of a complex real-world data environment. Database designers use data models to communicate with applications programmers and end users. The basic data-modelling components are entities, attributes, relationships and constraints. Business rules are used to identify and define the basic modelling components within a specific real-world environment.

The hierarchical and network data models were early data models that are no longer used, but some of the concepts are found in current data models.

The relational model is the current database implementation standard. In the relational model, the end user perceives the data as being stored in tables. Tables are related to each other by means of common values in common attributes. The entity relationship (ER) model is a popular graphical tool for data modelling that complements the relational model. The ER model allows database designers to visually present different views of the data as seen by database designers, programmers and end users and to integrate the data into a common framework.

The object-orientated data model (OODM) uses objects as the basic modelling structure. An object resembles an entity in that it includes the facts that define it. But unlike an entity, the object also includes information about relationships between the facts as well as relationships with other objects, thus giving its data more meaning.

The relational model has adopted many object-orientated (OO) extensions to become the extended relational data model (ERDM). At this point, the OODM is largely used in specialised engineering and scientific applications, while the ERDM is primarily geared to business applications. Although the most likely future scenario is an increasing merger of OODM and ERDM technologies, both are overshadowed by the need to develop internet access strategies for databases.

NoSQL databases are a new generation of databases that do not use the relational model and are geared to support the very specific needs of Big Data organisations. NoSQL databases offer distributed data stores that provide high scalability, availability and fault tolerance by sacrificing data consistency and shifting the burden of maintaining relationships and data integrity to the program code.

Data modelling requirements are a function of different data views (global vs local) and the level of data abstraction. The American National Standards Institute Standards Planning and Requirements Committee (ANSI/SPARC) describes three levels of data abstraction: external, conceptual and internal. There is also a fourth level of data abstraction (the physical level). This lowest level of data abstraction is concerned exclusively with physical storage methods.

Key Terms

American National Standards Institute (ANSI)
attribute
Big Data
business rule
class
class diagram
class hierarchy
conceptual model
conceptual schema
connectivity
constraint
Crow's Foot notation
data definition language (DDL)
data manipulation language (DML)
data models
entity
entity instance
entity occurrence
entity relationship (ER) model (ERM)
entity relationship diagram (ERD)
entity set
extended relational data model (ERDM)
external model
external schema
hardware independence
hierarchical model
inheritance
internal model
internal schema
logical design
logical independence
many-to-many (*:*) relationship
method
network model
NoSQL
object
object relational database management system (ORDBMS)
object-orientated data model (OODM)
object-orientated database management system (OODBMS)
one-to-many (1:*) relationship
one-to-one (1:1) relationship
physical independence
physical model
relational database management system (RDBMS)
relational diagram
relational model
relations
relationship
schema
semantic data model
software independence
subschema
table
Unified Modelling Language (UML)

Further Reading

Blaha, M. and Premerlani, W. Object-Oriented Modelling and Design for Database Applications. Prentice Hall, 1998.
Chen, P. 'The entity-relationship model: towards a unified view of data', ACM Transactions on Database Systems, 1(1), 1976.
Codd, E.F. 'A relational model of data for large shared databanks', Communications of the ACM, pp. 377-387, 1970.
Codd, E.F. 'A database sublanguage founded on relational calculus', Proceedings of the ACM SIGFIDET Conference on Data Description, Access and Control, pp. 35-68, 1971.
Codd, E.F. The Relational Model for Database Management, Version 2. Addison-Wesley, 1990.
Lausen, G. and Vossen, G. Models and Languages of Object Orientated Databases. Addison-Wesley, 1998.
ORACLE, Oracle NoSQL Database Documentation, [online] 2019. Available: https://docs.oracle.com/en/database/other-databases/nosql-database/index.html
Thalheim, B. Entity-Relationship Modelling: Foundations of Database Technology. Springer, 2000.

Online Content: Answers to selected Review Questions and Problems for this chapter can be found on the online platform for this book.

Review Questions

1 Discuss the importance of data modelling.
2 What is a business rule, and what is its purpose in data modelling?
3 How would you translate business rules into data model components?
4 Describe the basic features of the relational data model and discuss their importance to the end user and the designer.
5 Explain how the entity relationship (ER) model helped produce a more structured relational database design environment.
6 Use the scenario described by 'A customer can make many payments, but each payment is made by only one customer' as the basis for an entity relationship diagram (ERD) presentation. Show your answer using UML class diagram notation.
7 Why is an object said to have greater semantic content than an entity?
8 What is the difference between an object and a class in the object-orientated data model (OODM)?
9 How would you model Question 6 with an OODM? (Use Figure 2.7 as your guide.)
10 What is an ERDM, and what role does it play in the modern (production) database environment?
11 What is a relationship, and which three types of relationships exist?
12 Give an example of each of the three types of relationships.
13 What is a table, and what role does it play in the relational model?
14 What is a relational diagram? Give an example.
15 What is connectivity? Draw ERDs to illustrate connectivity.
16 Describe the Big Data phenomenon.
17 What is sparse data? Give an example.
18 Define and describe the basic characteristics of a NoSQL database.
19 Describe the key-value data model.
20 Using the example of a medical clinic with patients and tests, provide a simple representation of how to model this example using the relational model and how it would be represented using the key-value modelling technique.
21 What is logical independence?
22 What is physical independence?

PROBLEMS

Use the contents of Figure 2.3 on p. 46 to work Problems 1-5.

1 Write the business rule(s) that govern the relationship between AGENT and CUSTOMER.

2 Given the business rule(s) you wrote in Problem 1, create a basic UML class ERD.

3 If the relationship between AGENT and CUSTOMER were implemented in a hierarchical model, what would the hierarchical structure look like? Label the structure fully, identifying the root segment and the Level 1 segment.

4 If the relationship between AGENT and CUSTOMER were implemented in a network model, what would the network model look like? (Identify the record types and the set.)

5 Using the ERD you drew in Problem 2, create the equivalent OO model. (Use Figure 2.7 on p. 55 as your guide.)

Using Figure P2.1 as your guide, answer Problem 6. The DealCo Class ERD shows the initial entities and attributes for the DealCo stores, located in two regions of the country.

FIGURE P2.1 The DealCo class ERD

6 Identify each relationship type and write all of the business rules.

Using Figure P2.2 as your guide, answer Problems 7-9. The Tiny University class ERD shows the initial entities and attributes for Tiny University.

7 Identify each relationship type and write all of the business rules.

8 A hospital patient receives medications that have been ordered by a particular doctor. Because the patient often receives several medications per day, there is a 1:* relationship between PATIENT and ORDER. Similarly, each order can include several medications, creating a 1:* relationship between ORDER and MEDICATION.

a Identify the business rules for PATIENT, ORDER and MEDICATION.

b Create an ERD that depicts a relational database model to capture these business rules.

9 United Broke Artists (UBA) is a broker for not-so-famous artists. UBA maintains a small database to track painters, paintings and galleries. A painting is created by a particular artist and then exhibited in a particular gallery. A gallery can exhibit many paintings, but each painting can be exhibited in only one gallery. Similarly, a painting is created by a single painter, but each painter can create many paintings. Using PAINTER, PAINTING and GALLERY, in terms of a relational database:

a Which tables would you create, and what would the table components be?

b How might the (independent) tables be related to one another?

c Draw the complete ERD.

FIGURE P2.2 The Tiny University class ERD

10 Using the ERD from Problem 9, create the relational schema. (Create an appropriate collection of attributes for each of the entities. Make sure you use the appropriate naming conventions to name the attributes.)

11 Describe the relationships (identify the business rules) depicted in the ERD shown in Figure P2.3.

12 Convert the ERD from Problem 11 into a UML class diagram.

13 Describe the relationships shown in the ERD in Figure P2.4.

14 Create a UML ERD for each of the following descriptions. (Note: The word many merely means more than one in the database modelling environment.)

a Each of the MegaCo Corporation's divisions is composed of many departments. Each of those departments has many employees assigned to it, but each employee works for only one department. Each department is managed by one employee, and each of those managers can manage only one department at a time.

b During a period of time, a customer can rent many DVDs from the BigVid store. Each of the BigVid's DVDs can be rented to many customers during that period of time.

c An airliner can be assigned to fly many flights, but each flight is flown by only one airliner.

d The KwikTite Corporation operates many factories. Each factory is located in a region. Each region can be home to many of KwikTite's factories. Each factory employs many employees, but each of those employees is employed by only one factory.

e An employee may have earned many degrees, and each degree may have been earned by many employees.

FIGURE P2.3 The Crow's Foot ERD for Problem 11 (entities: LECTURER, CLASS, STUDENT; relationships: Teaches, Advises)

FIGURE P2.4 The UML ERD for Problem 13

NOTE
Many-to-many (*:*) relationships exist at a conceptual level, and you should know how to recognise them. However, you will learn in Chapter 3, Relational Model Characteristics, that *:* relationships are not appropriate in a relational model.

CHAPTER 3 Relational Model Characteristics

IN THIS CHAPTER, YOU WILL LEARN:

That the relational database model takes a logical view of data

That the relational model's basic components are relations implemented through tables in a relational DBMS

How relations are organised in tables composed of rows (tuples) and columns (attributes)

Key terminology used in describing relations

About the role of the data dictionary, and the system catalogue

How data redundancy is handled in the relational database model

Why indexing is important

PREVIEW

In Chapter 2, Data Models, you learnt that the relational data model's structural and data independence allow you to examine the model's logical structure without considering the physical aspects of data storage and retrieval. You also learnt that the ERM may be used to depict entities and their relationships graphically through an ERD. In this chapter, you will learn some important details about the relational model's logical structure and more about how the ERD can be used to design a relational database.

You will learn how the relational database's basic data components fit into a logical construct known as a table. You will discover that one important reason for the relational database model's simplicity is that its tables can be treated as logical rather than physical units. You will also learn how the independent tables within the database can be related to one another.

After learning about tables, their components and their relationships, you will be introduced to the basic concepts that shape the design of tables. Because the table is such an integral part of relational database design, you will also learn the characteristics of well-designed and poorly designed tables.

Finally, you will be introduced to some basic concepts that will become your gateway to the next few chapters. For example, you will examine different kinds of relationships and the way in which those relationships might be handled in the relational database environment.


NOTE
The relational model, introduced by E.F. Codd in 1970, is based on predicate logic and set theory.

Predicate logic, used extensively in mathematics, provides a framework in which an assertion (statement of fact) can be verified as either true or false. For example, suppose that a student with a student ID of 12345678 is named Cela Nkosi. This assertion can easily be demonstrated to be true or false.

Set theory is a mathematical science that deals with sets, or groups of things, and is used as the basis for data manipulation in the relational model. For example, assume that set A contains three numbers, 16, 24 and 77. This set is represented as A(16, 24, 77). Furthermore, set B contains four numbers, 44, 77, 90 and 11, and so is represented as B(44, 77, 90, 11). Given this information, you can conclude that the intersection of A and B yields a result set with a single number, 77. This result can be expressed as A ∩ B = 77. In other words, A and B share a common value, 77.

Based on these concepts, the relational model has three well-defined components:

A logical data structure represented by the relational table, where data are stored (Sections 3.1, 3.2 and 3.4).

A set of integrity rules to enforce that the data are and remain consistent over time (Sections 3.3, 3.5, 3.6 and 3.7).

A set of operations that define how data are manipulated (Chapter 4, Relational Algebra and Calculus).
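The set-theory example in the note above can be checked in a few lines. The following is a minimal sketch in Python (a language chosen here purely for illustration; the text itself uses none), in which the variable names A, B and common are assumptions that mirror the sets described in the note.

```python
# Sets A and B from the note, written as Python sets.
A = {16, 24, 77}
B = {44, 77, 90, 11}

# The & operator computes the intersection: the values present in both sets.
common = A & B

# Display the shared value(s).
print(common)
```

The result is itself a set, which is why relational operations on tables always return another table-like result rather than a bare value.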


3.1 A LOGICAL VIEW OF DATA

In Chapter 1, The Database Approach, you learnt that a database stores and manages both data and metadata. You also learnt that the DBMS manages and controls access to the data and the database structure. Such an arrangement, placing the DBMS between the application and the database, eliminates most of the file system's inherent limitations. The result of such flexibility, however, is a far more complex physical structure. In fact, the database structures required by both the hierarchical and network database models often become complicated enough to diminish efficient database design. The relational data model changed all of that by allowing the designer to focus on the logical representation of the data and their relationships, rather than on the physical storage details. To use an automotive analogy, the relational database uses an automatic transmission to relieve you of the need to manipulate clutch pedals and gear levers. In short, the relational model enables you to view data logically rather than physically.

The practical significance of taking the logical view is that it serves as a reminder of the simple file concept of data storage. Although the use of a table, quite unlike that of a file, has the advantages of structural and data independence, a table does resemble a file from a conceptual point of view. Since you can think of related records as being stored in independent tables, the relational database model is much easier to understand than its hierarchical and network database predecessors. Greater logical simplicity tends to yield simpler and more effective database design methodologies.

As the table plays such a prominent role in the relational model, it deserves a closer look. Therefore, our discussion begins with an exploration of the details of table structure and contents.

NOTE
Relational database terminology is very precise. Unfortunately, file system terminology sometimes creeps into the database environment. Thus, rows are sometimes referred to as records and columns are sometimes labelled as fields. Occasionally, tables are labelled files. Technically speaking, this substitution of terms is not always appropriate; the database table is a logical rather than a physical concept, and the terms file, record and field describe physical concepts. Nevertheless, as long as you recognise that the table is actually a logical rather than a physical construct, you may (at the conceptual level) think of table rows as records and of table columns as fields. In fact, many database software vendors still use this familiar file system terminology.

3.1.1 Tables and Their Characteristics

The logical view of the relational database is facilitated by the creation of data relationships based on a logical construct known as a table. A table is perceived as a two-dimensional structure composed of rows and columns. As far as the table's user is concerned, a table contains a group of related entities, that is, an entity set; for that reason, the terms entity set and table are often used interchangeably. A table is also called a relation because the relational model's creator, E.F. Codd, used the term relation as a synonym for table. You can think of a table as a persistent relation, that is, a relation whose contents can be permanently saved for future use. Within the relational model, columns of tables are referred to as attributes and rows of tables are known as tuples.

NOTE
The concept of a relation is modelled on a mathematical construct and therefore must follow a certain restricted set of rules. For example, every relation within the database must have a distinct name. In mathematics, a relation is formally defined as:

Given a number of sets D1, D2, ..., Dn (which are not necessarily distinct), R is a relation on these n sets if it is a set of tuples each of which has its first element from D1, second element from D2 and so on.

Let's examine this formal definition with an example. Assume we have two sets (n = 2), one representing the last names of students (STU_LNAME) and one the department codes (DEPT_CODE) where they are enrolled.

STU_LNAME = {Ndlovu, Smithson, Le Roux, Ismail}

DEPT_CODE = {BIOL, CIS, EDU}

Then a relation R over the two sets can be defined as:

R = {(Ndlovu, BIOL), (Smithson, CIS), (Le Roux, EDU), (Ismail, EDU)}

So, as you can see, a relation is simply a set of ordered pairs.

Table 3.1 lists the properties of a relation, that is, the rules that tables must conform to.

TABLE 3.1 Properties of a relation

1 A table is perceived as a two-dimensional structure composed of rows and columns.

2 Each table row (tuple) represents a single entity occurrence within the entity set and must be distinct. Duplicate rows are not allowed in a relation.

3 Each table column represents an attribute, and each column has a distinct name.

4 Each cell or column/row intersection in a relation should contain only an atomic value, that is, a single data value. Multiple values are not allowed in the cells of a relation.

5 All values in a column must conform to the same data format. For example, if the attribute is assigned an integer data format, all values in the column representing that attribute must be integers.

6 Each column has a specific range of values known as the attribute domain.

7 The order of the rows and columns is immaterial to the DBMS.

8 Each table must have an attribute or a combination of attributes that uniquely identifies each row.

Figure 3.1 shows two tables: LECTURER and COURSE. The table LECTURER conforms to all of the rules listed in Table 3.1 and hence constitutes a relation. The table COURSE, however, is not a relation because the column COURSE_NAME contains multiple values. For example, CRS_CODE CIS-420 is associated with three COURSE_NAME values: Database Design and Implementation, Introduction to Databases and Data Modelling: An Introduction.
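The formal definition of a relation given in the note above can be tried out directly. The sketch below is illustrative only: it uses Python, and the helper name is_relation_on is an assumption introduced here, not a term from the text. It treats R as a set of tuples and checks that each tuple takes its first element from STU_LNAME and its second from DEPT_CODE, which is the same as saying that R is a subset of the Cartesian product of the two sets.

```python
from itertools import product

# The two sets (domains) from the note.
STU_LNAME = {"Ndlovu", "Smithson", "Le Roux", "Ismail"}
DEPT_CODE = {"BIOL", "CIS", "EDU"}

# The relation R from the note: a set of ordered pairs.
R = {
    ("Ndlovu", "BIOL"),
    ("Smithson", "CIS"),
    ("Le Roux", "EDU"),
    ("Ismail", "EDU"),
}


def is_relation_on(tuples, *domains):
    """Return True if every tuple draws its i-th element from the i-th domain.

    Hypothetical helper: it tests whether `tuples` is a subset of the
    Cartesian product of the given domains.
    """
    return set(tuples) <= set(product(*domains))


print(is_relation_on(R, STU_LNAME, DEPT_CODE))
```

Because R is stored as a Python set, duplicate tuples cannot occur and the tuples have no inherent order, which matches rules 2 and 7 of Table 3.1.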

FIGURE 3.1(a) The relation LECTURER

Table name: LECTURER

EMP_NUM  LECTURER_OFFICE  LECTURER_EXTENSION  LECTURER_HIGH_DEGREE
103      DRE 156          6783                PhD
104      DRE 102          5561                MA
105      KLR 229D         8665                PhD
106      KLR 126          3899                PhD
110      AAK 160          3412                PhD
114      KLR 211          4436                PhD
155      AAK 201          4440                PhD

FIGURE 3.1(b) The non-relational table COURSE

Table name: COURSE

CRS_CODE  COURSE_NAME
CIS-220   Introduction to Computer Science
          Assembly Language Programming
CIS-420   Database Design and Implementation
          Introduction to Databases
          Data Modelling: An Introduction
QM-261    Intro. to Statistics
          Statistical Applications

Applying the concepts of relations to database models allows us to define a relational schema for each entity. A relational schema is a textual representation of the database tables, where each table is described by its name followed by the list of its attributes in parentheses.

NOTE
A relational schema R can be formally defined as R = {a1, a2, ..., an} where a1...an is a set of attributes belonging to the relation.

For example, consider the database table LECTURER in Figure 3.1. The relational schema for LECTURER can be written as:

LECTURER(EMP_NUM, LECTURER_OFFICE, LECTURER_EXTENSION, LECTURER_HIGH_DEGREE)
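The textual form of a relational schema is simple enough to generate mechanically. The following Python sketch is offered only as an illustration; the function name schema_text and the variable lecturer_attributes are assumptions made for this example. It builds the schema text from a table name and an ordered list of attribute names.

```python
def schema_text(table_name, attributes):
    """Render a relational schema as NAME(attr1, attr2, ...).

    Hypothetical helper: the table name is followed by the list of
    attribute names in parentheses, as described in the text.
    """
    return f"{table_name}({', '.join(attributes)})"


# Attribute names of the LECTURER relation shown in Figure 3.1.
lecturer_attributes = [
    "EMP_NUM",
    "LECTURER_OFFICE",
    "LECTURER_EXTENSION",
    "LECTURER_HIGH_DEGREE",
]

print(schema_text("LECTURER", lecturer_attributes))
```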

3.1.2 Attributes and Domains

Each attribute is a named column within the relational table and draws its values from a domain. A domain is the set of possible values for this attribute. For example, an attribute called STU_CLASS, which stores the student's classification whilst at university, may have the following domain {UG1, UG2, UG3, PG, Other}, which means that STU_CLASS can only have one of these values within the database. The domain of values for an attribute should contain only atomic values and any one value should not be divisible into components. In addition, no attributes with more than one value are allowed. (These are often referred to as multi-valued attributes.) For example, the value of STU_CLASS could not be UG1 and UG2 at the same time. Each domain is also defined by its data type, for example, character string, number, date, etc.

The fundamental principle of the relational model is that relating different entities to one another is achieved by comparisons of their values. A pair of attribute values can only be meaningfully compared if their values are drawn from the same domains. For example, the columns STU_POSTCODE and LECT_POSTCODE may be in two different relational tables, but would share the common domain of all postal codes and could be compared. In contrast, it would be nonsense to try to match the attribute STU_NAME with STU_CLASS, even though the domains are defined by the same data type (character string).
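The two ideas in this section, that a value must come from its attribute's domain and that attributes are comparable only when they share a domain, can be sketched as follows. This is a hedged illustration in Python: the helper names in_domain and comparable, and the domain labels such as "postal_code", are assumptions invented for the example and do not come from the text.

```python
# Domain of STU_CLASS as given in the text.
STU_CLASS_DOMAIN = frozenset({"UG1", "UG2", "UG3", "PG", "Other"})

# Hypothetical mapping from attribute name to the name of its domain.
ATTRIBUTE_DOMAIN = {
    "STU_POSTCODE": "postal_code",
    "LECT_POSTCODE": "postal_code",
    "STU_NAME": "person_name",
    "STU_CLASS": "student_class",
}


def in_domain(value, domain):
    """Accept only a single (atomic) string value that belongs to the domain."""
    return isinstance(value, str) and value in domain


def comparable(attribute_a, attribute_b):
    """Two attributes can be meaningfully compared only if they share a domain."""
    return ATTRIBUTE_DOMAIN[attribute_a] == ATTRIBUTE_DOMAIN[attribute_b]


print(in_domain("UG2", STU_CLASS_DOMAIN))
print(comparable("STU_POSTCODE", "LECT_POSTCODE"))
print(comparable("STU_NAME", "STU_CLASS"))
```

Note that STU_NAME and STU_CLASS would both be stored as character strings, yet the sketch still refuses to compare them, because sharing a data type is not the same as sharing a domain.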

3.1.3 Degree and Cardinality

Degree and cardinality are two important properties of the relational model. A relation with N columns and M rows is said to be of degree N and cardinality M. The degree of a relation is the number of its attributes and the cardinality of a relation is the number of its tuples. The product of a relation's degree and cardinality is the number of attribute values it contains. Figure 3.2 shows the relational table DEPARTMENT with a degree of 4 and a cardinality of 4. The product of the degree and cardinality of DEPARTMENT is 16 (4 * 4) and, as you can see in Figure 3.2, it contains 16 attribute values.
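Degree and cardinality are easy to compute once a table is held as a list of rows. The sketch below, written in Python purely for illustration, stores the DEPARTMENT rows as read from Figure 3.2; the variable names are assumptions made for this example.

```python
# Column names and rows of the DEPARTMENT relation (Figure 3.2).
DEPARTMENT_COLUMNS = ("DEPT_CODE", "DEPT_NAME", "DEPT_ADDRESS", "DEPT_EXTENSION")
DEPARTMENT_ROWS = [
    ("ACCT", "Accounting", "KLR 211, Box 52", "3119"),
    ("ART", "Fine Arts", "BBG 185, Box 128", "2278"),
    ("BIOL", "Biology", "AAK 230, Box 415", "4117"),
    ("CIS", "Computer Info. Systems", "KLR 333, Box 56", "3245"),
]

# Degree is the number of attributes (columns).
degree = len(DEPARTMENT_COLUMNS)

# Cardinality is the number of tuples (rows).
cardinality = len(DEPARTMENT_ROWS)

# Counting every cell gives the number of attribute values.
value_count = sum(len(row) for row in DEPARTMENT_ROWS)

print(degree, cardinality, value_count)
```

Counting the cells directly and multiplying degree by cardinality give the same figure, which is the point made in the paragraph above.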

FIGURE 3.2 Degree and cardinality of the DEPARTMENT relation

Table name: DEPARTMENT

DEPT_CODE  DEPT_NAME               DEPT_ADDRESS       DEPT_EXTENSION
ACCT       Accounting              KLR 211, Box 52    3119
ART        Fine Arts               BBG 185, Box 128   2278
BIOL       Biology                 AAK 230, Box 415   4117
CIS        Computer Info. Systems  KLR 333, Box 56    3245

Cardinality = 4

Degree = 4

NOTE
The word relation, also known as a dataset in Microsoft Access, is based on the mathematical set theory from which Codd derived his model. Since the relational model uses attribute values to establish relationships among tables, many database users incorrectly assume that the term relation refers to such relationships. Many then incorrectly conclude that only the relational model permits the use of relationships.

3.1.4 Summary of Relational Characteristics

You will discover that the table view of data makes it easy to spot and define entity relationships, thereby greatly simplifying the task of database design. The tables shown in Figure 3.3 illustrate how the properties of a relation listed in Table 3.1 can be applied to a database table.

FIGURE 3.3 STUDENT table attribute values

Database name: Ch03_TinyUniversity

Table name: STUDENT

STU_NUM  STU_LNAME  STU_FNAME  STU_INIT  STU_DOB      STU_HRS  STU_CLASS  STU_GPA  STU_TRANSFER  DEPT_CODE  STU_PHONE  LECT_NUM
321452   Ndlovu     Amehlo     C         12-Feb-1999  42       UG3        2.84     No            BIOL       2134       205
324257   Smithson   Anne       K         15-Nov-2000  81       UG2        3.27     Yes           CIS        2256       222
324258   Le Roux    Dan                  23-Aug-2000  36       UG3        2.26     Yes           ACCT       2256       228
324269   Oblonski   Walter     H         16-Sep-1996  66       UG2        3.09     No            CIS        2114       222
324273   Smith      John       D         30-Dec-1998  102      PG         2.11     Yes           ENGL       2231       199
324274   Katinga    Raphael    P         21-Oct-1999  114      PG         3.15     No            ACCT       2267       228
324291   Ismail     Hemalika   T         08-Apr-1999  120      PG         3.87     No            EDU        2267       311
324299   Smith      John       B         30-Nov-2001  15       UG1        2.92     No            ACCT       2315       230

STU_DOB = Student date of birth
STU_HRS = Credit hours earned
STU_CLASS = Student classification
STU_GPA = Grade point average
STU_PHONE = 4-digit campus phone extension
LECT_NUM = Number of the lecturer who is the student's advisor

Using the STUDENT table shown in Figure 3.3, you can draw the following conclusions corresponding to the points in Table 3.1:

1 The STUDENT table shown in Figure 3.3 is perceived to be a two-dimensional structure composed of eight rows (tuples) and twelve columns (attributes). The cardinality of STUDENT is therefore 8 and the degree is 12. You can also describe the table as being composed of eight records and twelve fields.

2 Each row in the STUDENT table describes a single entity occurrence within the entity set. (The entity set is represented by the STUDENT table.) Note that the row (entity or record) defined by STU_NUM = 321452 in Figure 3.3 defines the characteristics (attributes or fields) of a student named Amehlo C. Ndlovu. For example, row 3 describes a student named Dan le Roux. Similarly, row 4 describes a student named Walter H. Oblonski. Given the table contents, the STUDENT entity set includes eight distinct entities (rows).

3 Each column represents an attribute, and each column has a distinct name.

4 All of the values in a column match the entity's characteristics. For example, the grade point average (STU_GPA) column contains only STU_GPA entries for each of the table rows. Data must be classified according to their format and function. Although various DBMSs can support different data types, most support at least the following:

a Numeric. Numeric data are data on which you can perform meaningful arithmetic procedures. For example, STU_HRS and STU_GPA in Figure 3.3 are numeric attributes. On the other hand, STU_PHONE is not a numeric attribute because adding or subtracting phone numbers does not yield an arithmetically meaningful result.

b Character. Character data, also known as text data or string data, can contain any character or symbol not intended for mathematical manipulation. In Figure 3.3, for example, STU_LNAME, STU_FNAME, STU_INIT, STU_CLASS and STU_PHONE are character, text or string attributes.

c Date. Date attributes contain calendar dates stored in a special format known as the Julian date format. In Figure 3.3, STU_DOB is a date attribute.

d Logical. Logical data can have only a true or false (yes or no) condition. For example, is a student a university transfer? In Figure 3.3, the STU_TRANSFER attribute uses a logical data format. Most, but not all, relational database software packages support the logical data format. Microsoft Access uses the label Yes/No data type to indicate a logical data type, whereas Oracle uses a data type known as Boolean, which can have values TRUE, FALSE and NULL.

5 The column's range of permissible values is known as its domain. Because the STU_GPA values are limited to the range 0 to 4, inclusive, the domain is [0,4].

6 The order of rows and columns is immaterial to the user.

7 Each table must have a primary key. In general terms, the primary key (PK) is an attribute (or a combination of attributes) that uniquely identifies any given row. In this case, STU_NUM (the student number) is the primary key. Using the data presented in Figure 3.3, observe that a student's last name (STU_LNAME) would not be a good primary key because it is possible to find several students whose last name is Smith. Even the combination of the last name and first name (STU_FNAME) would not be an appropriate primary key because, as Figure 3.3 shows, it is quite possible to find more than one student named John Smith.
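The reasoning in point 7 can be tested against the data. The sketch below uses Python for illustration only; the function name uniquely_identifies is an assumption made for this example, and only three of the twelve STUDENT columns from Figure 3.3 are copied in to keep it short. It checks whether a candidate attribute, or combination of attributes, takes a different value in every row.

```python
# Three columns of the STUDENT table from Figure 3.3.
COLUMNS = ("STU_NUM", "STU_LNAME", "STU_FNAME")
STUDENTS = [
    (321452, "Ndlovu", "Amehlo"),
    (324257, "Smithson", "Anne"),
    (324258, "Le Roux", "Dan"),
    (324269, "Oblonski", "Walter"),
    (324273, "Smith", "John"),
    (324274, "Katinga", "Raphael"),
    (324291, "Ismail", "Hemalika"),
    (324299, "Smith", "John"),
]


def uniquely_identifies(rows, columns, key):
    """Return True if no two rows share the same values for the key attributes."""
    positions = [columns.index(name) for name in key]
    seen = set()
    for row in rows:
        key_value = tuple(row[p] for p in positions)
        if key_value in seen:
            return False  # a second row with the same key value was found
        seen.add(key_value)
    return True


print(uniquely_identifies(STUDENTS, COLUMNS, ("STU_NUM",)))
print(uniquely_identifies(STUDENTS, COLUMNS, ("STU_LNAME",)))
print(uniquely_identifies(STUDENTS, COLUMNS, ("STU_LNAME", "STU_FNAME")))
```

A check of this kind can only rule a candidate out. Passing it for the rows currently stored does not prove that an attribute will stay unique for all future rows, so choosing a primary key remains a design decision.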

Online Content
All of the databases used to illustrate the material in this chapter are found on the online platform for this book. The database names used in the folder match the database names used in the figures. For example, the source of the tables shown in Figure 3.3 is the 'Ch03_TinyUniversity' database.

3.2 KEYS

A key consists of one or more attributes that determine other attributes. (For example, an invoice number identifies all of the invoice attributes, such as the invoice date and the customer name.) One type of key, the primary key, has already been introduced. Given the structure of the STUDENT table shown in Figure 3.3, defining and describing the primary key seems simple enough. However, because the primary key plays such an important role in the relational environment, we will examine the primary key's properties more carefully. There are several other kinds of keys that warrant attention. In this section, you will also become acquainted with superkeys, candidate keys and secondary keys.

The key's role is based on a concept known as determination. In the context of a database table, the statement 'A determines B' indicates that if you know the value of attribute A, you can look up (determine) the value of attribute B. For example, knowing the STU_NUM in the STUDENT table (see Figure 3.3) means that you are able to look up (determine) that student's last name, grade point average, phone number and so on. The shorthand notation for 'A determines B' is A → B. If A determines B, C and D, you write A → B, C, D. Therefore, using the attributes of the STUDENT table in Figure 3.3, you can represent the statement 'STU_NUM determines STU_LNAME' by writing:

STU_NUM → STU_LNAME

In fact, the STU_NUM value in the STUDENT table determines all of the student's attribute values. For example, you can write:

STU_NUM → STU_LNAME, STU_FNAME, STU_INIT

and

STU_NUM → STU_LNAME, STU_FNAME, STU_INIT, STU_DOB, STU_TRANSFER

In contrast, STU_NUM is not determined by STU_LNAME because it is quite possible for several students to have the last name Smith.

The principle of determination is very important because it is used in the definition of a central relational database concept known as functional dependence. The term functional dependence can be defined most easily this way: the attribute B is functionally dependent on A if A determines B. More precisely:

The attribute B is functionally dependent on the attribute A if each value in column A determines one and only one value in column B.

Using the contents of the STUDENT table in Figure 3.3, it is appropriate to say that STU_PHONE is functionally dependent on STU_NUM. For example, the STU_NUM value 321452 determines the STU_PHONE value 2134. On the other hand, STU_NUM is not functionally dependent on STU_PHONE

CHAPTER

because the STU_PHONE value 2267 is associated with two STU_NUM values: 324274 and 324291. (Apparently, some students share a phone.) Similarly, the STU_NUM value 324273 determines the STU_LNAME value Smith. But the STU_NUM value is not functionally dependent on STU_LNAME because more than one student may have the last name Smith.

The functional dependence definition can be generalised to cover the case in which the determining attribute values occur more than once in a table. Functional dependence can then be defined this way:1

Attribute A determines attribute B (that is, B is functionally dependent on A) if all of the rows in the table that agree in value for attribute A also agree in value for attribute B.

1 ISO-ANSI Working Draft Database Language/SQL Foundation (SQL3), Part 2, 29 August, 1994. This source was provided through the courtesy of Dr David Hatherly.

Be careful when defining the dependency's direction. For example, Tiny University determines its student classification based on hours completed; these are shown in Table 3.2.

TABLE 3.2 Student classification

Hours completed    Classification
Fewer than 30      UG1
30-59              UG2
60-89              UG3
90 or more         PG

Therefore, you can write:

STU_HRS → STU_CLASS

However, the specific number of hours is not dependent on the classification. It is quite possible to find a third-year undergraduate (UG3) with 62 completed hours or one with 84 completed hours. In other words, the classification (STU_CLASS) does not determine one and only one value for completed hours (STU_HRS).

Keep in mind that it might take more than a single attribute to define functional dependence; that is, a key may be composed of more than one attribute. Such a multi-attribute key is known as a composite key. Any attribute that is part of a key is known as a key attribute. For instance, in the STUDENT table, the student's last name would not be sufficient to serve as a key. On the other hand, the combination of last name, first name, initial and home phone is very likely to produce unique matches for the remaining attributes. For example, you can write:

STU_LNAME, STU_FNAME, STU_INIT, STU_PHONE → STU_HRS, STU_CLASS

or

STU_LNAME, STU_FNAME, STU_INIT, STU_PHONE → STU_HRS, STU_CLASS, STU_GPA

or

STU_LNAME, STU_FNAME, STU_INIT, STU_PHONE → STU_HRS, STU_CLASS, STU_GPA, STU_DOB

Given the possible existence of a composite key, the notion of functional dependence can be further refined by specifying full functional dependence:

If the attribute (B) is functionally dependent on a composite key (A) but not on any subset of that composite key, the attribute (B) is fully functionally dependent on (A).

Within the broad key classification, several specialised keys can be defined. For example, a superkey is any key that uniquely identifies each row. In short, the superkey functionally determines all of the row's attributes. In the STUDENT table, the superkey could be any of the following:

STU_NUM
STU_NUM, STU_LNAME
STU_NUM, STU_LNAME, STU_INIT

In fact, STU_NUM, with or without additional attributes, can be a superkey even when the additional attributes are redundant.

A candidate key can be described as a superkey without redundancies, that is, a minimal superkey. Using this distinction, note that the composite key

STU_NUM, STU_LNAME

is a superkey, but it is not a candidate key because STU_NUM by itself is a candidate key! The combination

STU_LNAME, STU_FNAME, STU_INIT, STU_PHONE

might also be a candidate key, as long as you discount the possibility that two students share the same last name, first name, initial and phone number.

If the student's ID number had been included as one of the attributes in the STUDENT table in Figure 3.3, perhaps named STU_ID, both it and STU_NUM would have been candidate keys, because either one would uniquely identify each student. In that case, the selection of STU_NUM as the primary key would be driven by the designer's choice or by end-user requirements. In short, the primary key is the candidate key chosen to be the unique row identifier. Note, incidentally, that a primary key is a superkey as well as a candidate key.

Within a table, each primary key value must be unique to ensure that each row is uniquely identified by the primary key. In that case, the table is said to exhibit entity integrity. To maintain entity integrity, a null value (that is, no data entry at all) is not permitted in the primary key.
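The ideas of determination, superkey and candidate key can be tried out on a handful of rows. The sketch below is only an illustration in Python, not part of the original text: the helper names (determines, is_superkey, candidate_keys) are hypothetical, and the sample rows are assumptions apart from the values quoted in this section (321452 with phone 2134, the shared phone 2267, and 324273 Smith). Keep in mind that a check against sample rows can only refute a dependency; it cannot prove that the dependency holds for all future data.

```python
from itertools import combinations

# Sample STUDENT rows. Only 321452/2134, the shared phone 2267 and
# 324273/Smith come from the text; every other value is illustrative.
rows = [
    {"STU_NUM": 321452, "STU_LNAME": "Bowser", "STU_PHONE": "2134"},
    {"STU_NUM": 324273, "STU_LNAME": "Smith", "STU_PHONE": "2231"},
    {"STU_NUM": 324274, "STU_LNAME": "Katinga", "STU_PHONE": "2267"},
    {"STU_NUM": 324291, "STU_LNAME": "Robertson", "STU_PHONE": "2267"},
    {"STU_NUM": 324299, "STU_LNAME": "Smith", "STU_PHONE": "2315"},
]


def determines(rows, lhs, rhs):
    """Return True if rows that agree on lhs also agree on rhs (lhs -> rhs)."""
    seen = {}
    for row in rows:
        left = tuple(row[a] for a in lhs)
        right = tuple(row[a] for a in rhs)
        # setdefault stores the first right-hand value seen for this left value
        if seen.setdefault(left, right) != right:
            return False
    return True


def is_superkey(rows, attrs):
    """Return True if the attribute combination is unique across the rows."""
    values = [tuple(row[a] for a in attrs) for row in rows]
    return len(set(values)) == len(values)


def candidate_keys(rows):
    """Return the minimal superkeys found in the sample rows."""
    attrs = sorted(rows[0])
    found = []
    for size in range(1, len(attrs) + 1):
        for combo in combinations(attrs, size):
            # Skip any combination that contains a smaller key already found
            redundant = any(set(key) <= set(combo) for key in found)
            if not redundant and is_superkey(rows, combo):
                found.append(combo)
    return found
```

With these rows, determines(rows, ["STU_NUM"], ["STU_PHONE"]) holds, while the reverse direction fails because phone 2267 is shared. STU_NUM together with STU_LNAME is a superkey but not a candidate key, since STU_NUM alone is already unique.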

NOTE
A null does not mean a zero or a space. Pressing the keyboard's space bar creates a blank (or a space). A null is created when you press the keyboard's Enter key without making a prior entry of any kind. In other words, a null is no value at all.
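The difference between a null, a zero and a space can be observed directly in a small database. The following is a minimal sketch using Python's built-in sqlite3 module; the EMP_BONUS column and the three rows are hypothetical and serve only to show the behaviour. The handling of nulls by COUNT and AVG shown here is that of SQLite; other products may differ in detail.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE EMPLOYEE ("
    "EMP_NUM INTEGER PRIMARY KEY, EMP_INITIAL TEXT, EMP_BONUS REAL)"
)
conn.executemany(
    "INSERT INTO EMPLOYEE VALUES (?, ?, ?)",
    [
        (1, "K", 100.0),
        (2, " ", 0.0),    # a space and a zero are both real values
        (3, None, None),  # None is stored as NULL: no value at all
    ],
)

# Only the third row has a null initial; the space in row 2 is not a null.
null_initials = conn.execute(
    "SELECT COUNT(*) FROM EMPLOYEE WHERE EMP_INITIAL IS NULL"
).fetchone()[0]

# Comparing with '= NULL' never matches, because a null equals nothing.
equals_null = conn.execute(
    "SELECT COUNT(*) FROM EMPLOYEE WHERE EMP_INITIAL = NULL"
).fetchone()[0]

# COUNT(column) and AVG(column) skip nulls, so the average is taken
# over two rows (100.0 and 0.0), not three.
total_rows, counted, average = conn.execute(
    "SELECT COUNT(*), COUNT(EMP_BONUS), AVG(EMP_BONUS) FROM EMPLOYEE"
).fetchone()
```

Here the zero bonus lowers the average, whereas the null bonus is simply left out of the calculation, which is one reason why nulls can produce surprising results in aggregate functions.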

Nulls can never be part of a primary key, and they should be avoided, to the greatest extent reasonably possible, in other attributes, too. There are rare cases in which nulls cannot be reasonably avoided when you are working with non-key attributes. For example, one of an EMPLOYEE table's attributes is likely to be the EMP_INITIAL. However, some employees do not have a middle initial. Therefore, some of the EMP_INITIAL values may be null. You will also discover later in this section that there may be situations in which a null exists because of the nature of the relationship between two entities. In any case, even if nulls cannot always be avoided, they must be used sparingly. In fact, the existence of nulls in a table is often an indication of poor database design.

Nulls, if used improperly, can create problems because they have many different meanings. For example, a null can represent:

An unknown attribute value.
A known, but missing, attribute value.
A not applicable condition.

Depending on the sophistication of the application development software, nulls can create problems when functions such as COUNT, AVERAGE and SUM are used. In addition, nulls can create logical problems when relational tables are linked.

Controlled redundancy makes the relational database work. Tables within the database share common attributes that enable the tables to be linked together. For example, note that the PRODUCT and VENDOR tables in Figure 3.4 share a common attribute named VEND_CODE. And note that the PRODUCT table's VEND_CODE value 232 occurs more than once, as does the VEND_CODE value 235. Because the PRODUCT table is related to the VENDOR table through these VEND_CODE values, the multiple occurrence of the values is required to make the 1:* relationship between VENDOR and PRODUCT work. Each VEND_CODE value in the VENDOR table is unique; the VENDOR is the 1 side in the VENDOR-PRODUCT relationship. But any given VEND_CODE value from the VENDOR table may occur more than once in the PRODUCT table, thus providing evidence that PRODUCT is the * side of the VENDOR-PRODUCT relationship. In database terms, the multiple occurrences of the VEND_CODE values in the PRODUCT table are not redundant because they are required to make the relationship work. You should recall from Chapter 2, Data Models, that data redundancy exists only when there is unnecessary duplication of attribute values.

As you examine Figure 3.4, note that the VEND_CODE value in one table can be used to point to the corresponding value in the other table. For example, the VEND_CODE value 235 in the PRODUCT table points to vendor Henry Ortozo in the VENDOR table. Consequently, you discover that the product Houselite chain saw, 16 cm bar is delivered by Henry Ortozo and that he can be contacted by calling 0181-899-3425. The same connection can be made for the product Steel tape, 12 m length in the PRODUCT table.

Remember the naming convention: the prefix PROD was used in Figure 3.4 to indicate that the attributes belong to the PRODUCT table. Therefore, the prefix VEND in the PRODUCT table's VEND_CODE indicates that VEND_CODE points to some other table in the database. In this case, the VEND prefix is used to point to the VENDOR table in the database.

A relational database can also be represented by a relational schema. As defined in section 3.1.1, the primary key attribute(s) is (are) underlined in a relational schema. You will see such schemas in Chapter 7, Normalising Database Designs. For example, the relational schema for Figure 3.4 would be shown as:

VENDOR (VEND_CODE, VEND_CONTACT, VEND_AREACODE, VEND_PHONE)
PRODUCT (PROD_CODE, PROD_DESCRIPT, PROD_PRICE, PROD_ON_HAND, VEND_CODE*)

FIGURE 3.4 An example of a simple relational database

Database name: Ch03_SaleCo

Table name: PRODUCT
Primary key: PROD_CODE
Foreign key: VEND_CODE

PROD_CODE   PROD_DESCRIPT                    PROD_PRICE   PROD_ON_HAND   VEND_CODE
001278-AB   Claw hammer                      10.23        23             232
123-21UUY   Houselite chain saw, 16 cm bar   150.09       4              235
QER-34256   Sledge hammer, 16 kg head        14.72        6              231
SRE-657UG   Rat-tail file                    2.36         15             232
ZZX/3245Q   Steel tape, 12 m length          5.36         8              235

link: VEND_CODE

Table name: VENDOR
Primary key: VEND_CODE
Foreign key: none

VEND_CODE   VEND_CONTACT         VEND_AREACODE   VEND_PHONE
230         Shelly K. Smithson   7325            555-1234
231         James Johnson        0181            123-4536
232         Khaya Sibiya         7325            224-2134
233         Lindiwe Molefe       0113            342-6567
234         Nijan Pillay         0181            123-3324
235         Henry Ortozo         0181            899-3425
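The way a VEND_CODE value in PRODUCT points to a row in VENDOR can be sketched with Python's built-in sqlite3 module. This is an illustrative sketch only: the rows are those read from Figure 3.4, the price and on-hand columns are left out for brevity, and the column types are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE VENDOR (
    VEND_CODE     INTEGER PRIMARY KEY,
    VEND_CONTACT  TEXT,
    VEND_AREACODE TEXT,
    VEND_PHONE    TEXT
);
CREATE TABLE PRODUCT (
    PROD_CODE     TEXT PRIMARY KEY,
    PROD_DESCRIPT TEXT,
    VEND_CODE     INTEGER REFERENCES VENDOR (VEND_CODE)  -- the shared attribute
);
""")
conn.executemany("INSERT INTO VENDOR VALUES (?, ?, ?, ?)", [
    (230, "Shelly K. Smithson", "7325", "555-1234"),
    (231, "James Johnson", "0181", "123-4536"),
    (232, "Khaya Sibiya", "7325", "224-2134"),
    (233, "Lindiwe Molefe", "0113", "342-6567"),
    (234, "Nijan Pillay", "0181", "123-3324"),
    (235, "Henry Ortozo", "0181", "899-3425"),
])
conn.executemany("INSERT INTO PRODUCT VALUES (?, ?, ?)", [
    ("001278-AB", "Claw hammer", 232),
    ("123-21UUY", "Houselite chain saw, 16 cm bar", 235),
    ("QER-34256", "Sledge hammer, 16 kg head", 231),
    ("SRE-657UG", "Rat-tail file", 232),
    ("ZZX/3245Q", "Steel tape, 12 m length", 235),
])

# Follow the VEND_CODE value of one product to its vendor row.
contact = conn.execute("""
    SELECT V.VEND_CONTACT, V.VEND_AREACODE, V.VEND_PHONE
    FROM PRODUCT P JOIN VENDOR V ON V.VEND_CODE = P.VEND_CODE
    WHERE P.PROD_CODE = ?
""", ("123-21UUY",)).fetchone()

# The same VEND_CODE may occur many times on the PRODUCT (*) side.
products_per_vendor = dict(conn.execute(
    "SELECT VEND_CODE, COUNT(*) FROM PRODUCT GROUP BY VEND_CODE"
))
```

The repeated VEND_CODE values 232 and 235 in PRODUCT are the controlled redundancy that makes the join possible; each of them resolves to exactly one VENDOR row.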

The link between the PRODUCT and VENDOR tables in Figure 3.4 can also be represented by the relational diagram shown in Figure 3.5. In this case, the link is indicated by the line that connects the VENDOR and PRODUCT tables.

FIGURE 3.5 The UML entity relationship diagram for the Ch03_SaleCo database

The relationship line in Figure 3.5 is created when two tables share an attribute with common values. More specifically, the primary key of one table (VENDOR) appears as the foreign key in a related table (PRODUCT).

A foreign key (FK) is an attribute whose values match the primary key values in the related table. For example, in Figure 3.5, the VEND_CODE is the primary key in the VENDOR table and it occurs as a foreign key in the PRODUCT table. Because the VENDOR table is not linked to a third table, the VENDOR table shown in Figure 3.4 does not contain a foreign key.

If the foreign key contains either matching values or nulls, the table(s) that make(s) use of that foreign key is (are) said to exhibit referential integrity. In other words, referential integrity means that, if the foreign key contains a value, that value refers to an existing valid tuple (row) in another relation. Note that referential integrity is maintained between the PRODUCT and VENDOR tables shown in Figure 3.4.

Finally, a secondary key is defined as a key that is used strictly for data retrieval purposes. Suppose customer data are stored in a CUSTOMER table in which the customer number is the primary key. Do you suppose that most customers will remember their number? Data retrieval for a customer can be facilitated when the customer's last name and phone number are used. In that case, the primary key is the customer number; the secondary key is the combination of the customer's last name and phone number. Keep in mind that a secondary key does not necessarily yield a unique outcome. For example, a customer's last name and home telephone number could yield several matches if several members of the Smith family were living at a residence and sharing a phone line. Similarly, the combination of last name and postal code could yield dozens of matches, which could then be searched for a specific name match.

A secondary key's effectiveness in narrowing down a search depends on how restrictive that secondary key is. For instance, although the secondary key CUS_CITY is legitimate from a database point of view, the attribute values New York or Paris are not likely to produce a usable return unless you want to examine millions of possible matches. (Of course, CUS_CITY is a better secondary key than

CUS_COUNTRY.) Table

3.3 summarises the different relational database keys.

TABLE 3.3 Relational database keys

Key type        Definition
Superkey        An attribute (or combination of attributes) that uniquely identifies each row in a table.
Candidate key   A minimal superkey. A superkey that does not contain a subset of attributes that is itself a superkey.
Primary key     A candidate key selected to uniquely identify all other attribute values in any given row. Cannot contain null entries.
Secondary key   An attribute (or combination of attributes) used strictly for data retrieval purposes.
Foreign key     An attribute (or combination of attributes) in one table whose values must either match the primary key in another table or be null.

3.3 INTEGRITY RULES

Relational database integrity rules are very important to good database design. Many (but by no means all) RDBMSs enforce integrity rules automatically. However, it is much safer to make sure that your application design conforms to the entity and referential integrity rules mentioned in this chapter. Those rules are summarised in Table 3.4. The integrity rules summarised in Table 3.4 are illustrated in Figure 3.6.

TABLE 3.4 Integrity rules

Entity integrity        Description
Requirement             All primary key entries are unique, and no part of a primary key may be null.
Purpose                 Each row will have a unique identity, and foreign key values can properly reference primary key values.
Example                 No invoice can have a duplicate number, nor can it be null. In short, all invoices are uniquely identified by their invoice number.

Referential integrity   Description
Requirement             A foreign key may have either a null entry (as long as it is not a part of its table's primary key) or an entry that matches the primary key value in a table to which it is related. (Every non-null foreign key value must reference an existing primary key value.)
Purpose                 It is possible for an attribute NOT to have a corresponding value, but it will be impossible to have an invalid entry. The enforcement of the referential integrity rule makes it impossible to delete a row in one table whose primary key has mandatory matching foreign key values in another table.
Example                 A customer might not yet have an assigned sales representative (number), but it will be impossible to have an invalid sales representative (number).

Note the following features of Figure 3.6.

1 Entity integrity. The CUSTOMER table's primary key is CUS_CODE. The CUSTOMER primary key column has no null entries, and all entries are unique. Similarly, the AGENT table's primary key is AGENT_CODE, and this primary key column also is free of null entries.

2 Referential integrity. The CUSTOMER table contains a foreign key AGENT_CODE, which links entries in the CUSTOMER table to the AGENT table. The CUS_CODE row that is identified by the (primary key) number 10013 contains a null entry in its AGENT_CODE foreign key, because Mr Jaco Pieterse does not yet have a sales representative assigned to him. The remaining AGENT_CODE entries in the CUSTOMER table all match the AGENT_CODE entries in the AGENT table.

To avoid nulls, some designers use special codes, known as flags, to indicate the absence of some value. Using Figure 3.6 as an example, the code -99 could be used as the AGENT_CODE entry of the fourth row of the CUSTOMER table to indicate that customer Jaco Pieterse does not yet have an agent assigned to him. If such a flag is used, the AGENT table must contain a dummy row with an AGENT_CODE value of -99. Thus, the AGENT table's first record might contain the values shown in Table 3.5.

TABLE 3.5 A dummy variable value used as a flag

AGENT_CODE   AGENT_AREACODE   AGENT_PHONE   AGENT_LNAME   AGENT_YTD_SALES
-99          0000             000-0000      None          0.00
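The need for the dummy row can be illustrated with a short sketch, again using Python's sqlite3 module. The table layout is reduced to a few columns, the column types are assumptions, and the rejected value -98 is a hypothetical flag chosen only to show what happens when no matching dummy row exists.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite checks foreign keys only on request
conn.executescript("""
CREATE TABLE AGENT (
    AGENT_CODE  INTEGER PRIMARY KEY,
    AGENT_LNAME TEXT
);
CREATE TABLE CUSTOMER (
    CUS_CODE   INTEGER PRIMARY KEY,
    CUS_LNAME  TEXT,
    AGENT_CODE INTEGER NOT NULL REFERENCES AGENT (AGENT_CODE)  -- nulls not allowed
);
""")
conn.executemany("INSERT INTO AGENT VALUES (?, ?)", [
    (-99, "None"),  # the dummy row that the flag value points to
    (501, "Bhengani"),
    (502, "Mbaso"),
    (503, "Okon"),
])
conn.executemany("INSERT INTO CUSTOMER VALUES (?, ?, ?)", [
    (10010, "Ramas", 502),
    (10013, "Pieterse", -99),  # flag instead of a null: no agent assigned yet
])

# Customers without a real agent are found by searching for the flag.
unassigned = [row[0] for row in conn.execute(
    "SELECT CUS_LNAME FROM CUSTOMER WHERE AGENT_CODE = -99"
)]

# A flag value with no matching AGENT row violates referential integrity.
try:
    conn.execute("INSERT INTO CUSTOMER VALUES (?, ?, ?)", (10014, "Orlando", -98))
    flag_without_dummy_row_rejected = False
except sqlite3.IntegrityError:
    flag_without_dummy_row_rejected = True
```

The flag keeps the foreign key column free of nulls, but every query that summarises agents must then remember to exclude the dummy row.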

Chapter 5, Data Modelling with Entity Relationship Diagrams, discusses several ways in which nulls may be handled.

FIGURE 3.6 An illustration of integrity rules

Database name: Ch03_InsureCo

Table name: CUSTOMER
Primary key: CUS_CODE
Foreign key: AGENT_CODE

CUS_CODE   CUS_LNAME    CUS_FNAME   CUS_INITIAL   CUS_AREACODE   CUS_PHONE   CUS_RENEW_DATE   AGENT_CODE
10010      Ramas        Alfred      A             0181           844-2573    12-Mar-19        502
10011      Dunne        Leona       K             0161           894-1238    23-May-18        501
10012      Du Toit      Marlene     W             0181           894-2285    05-Jan-19        502
10013      Pieterse     Jaco        F             0181           894-2180    20-Sep-19
10014      Orlando      Myron                     0181           222-1672    04-Dec-18        501
10015      O'Brian      Amy         B             0161           442-3381    29-Aug-19        503
10016      Brown        James       G             0181           297-1228    01-Mar-19        502
10017      Williams     George                    0181           290-2556    23-Jun-19        503
10018      Padayachee   Vinaya      G             1061           382-7185    09-Nov-19        501
10019      Moloi        Mlilo       K             0181           297-3809    18-Feb-19        503

Table name: AGENT
Primary key: AGENT_CODE
Foreign key: none

AGENT_CODE   AGENT_LNAME   AGENT_AREACODE   AGENT_PHONE   AGENT_YTD_SLS
501          Bhengani      0161             228-1249      1 371 008.46
502          Mbaso         0181             882-1244      3 923 932.59
503          Okon          0181             123-5589      2 444 244.52

Other integrity rules that can be enforced in the relational model are the NOT NULL and UNIQUE constraints. The NOT NULL constraint can be placed on a column to ensure that every row in the table has a value for that column. The UNIQUE constraint is a restriction placed on a column to ensure that no duplicate values exist for that column.
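The integrity rules of Table 3.4, together with the NOT NULL and UNIQUE constraints, can be watched in action with a small sketch in Python's sqlite3 module. It uses a few rows from Figure 3.6; the rejected helper, the UNIQUE constraint on AGENT_PHONE, the agent name Example and the invalid values are assumptions made for the illustration. Two SQLite-specific details are also assumed: foreign key checking must be switched on with a PRAGMA, and NOT NULL is stated explicitly on the primary key columns because SQLite does not otherwise reject nulls there.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # referential integrity is off by default
conn.executescript("""
CREATE TABLE AGENT (
    AGENT_CODE  CHAR(3) NOT NULL PRIMARY KEY,
    AGENT_LNAME VARCHAR2(20) NOT NULL,
    AGENT_PHONE CHAR(8) UNIQUE
);
CREATE TABLE CUSTOMER (
    CUS_CODE   CHAR(5) NOT NULL PRIMARY KEY,
    CUS_LNAME  VARCHAR2(20) NOT NULL,
    AGENT_CODE CHAR(3) REFERENCES AGENT (AGENT_CODE)  -- may be null
);
""")
conn.executemany("INSERT INTO AGENT VALUES (?, ?, ?)", [
    ("501", "Bhengani", "228-1249"),
    ("502", "Mbaso", "882-1244"),
    ("503", "Okon", "123-5589"),
])
conn.executemany("INSERT INTO CUSTOMER VALUES (?, ?, ?)", [
    ("10010", "Ramas", "502"),
    ("10013", "Pieterse", None),  # null foreign key: no agent assigned yet
])


def rejected(sql, params=()):
    """Return True if the DBMS refuses the statement on integrity grounds."""
    try:
        conn.execute(sql, params)
    except sqlite3.IntegrityError:
        return True
    return False


new_customer = "INSERT INTO CUSTOMER VALUES (?, ?, ?)"
checks = {
    # Entity integrity: primary key values are unique and never null.
    "duplicate primary key": rejected(new_customer, ("10010", "Dunne", "501")),
    "null primary key": rejected(new_customer, (None, "Dunne", "501")),
    # Referential integrity: a non-null foreign key must match an AGENT row,
    # and a referenced AGENT row cannot be deleted.
    "invalid foreign key": rejected(new_customer, ("10011", "Dunne", "999")),
    "delete referenced row": rejected("DELETE FROM AGENT WHERE AGENT_CODE = '502'"),
    # NOT NULL and UNIQUE constraints on ordinary columns.
    "missing required value": rejected(new_customer, ("10012", None, "502")),
    "duplicate unique value": rejected(
        "INSERT INTO AGENT VALUES (?, ?, ?)", ("504", "Example", "228-1249")
    ),
}

# The null foreign key of customer 10013 was accepted.
null_fk_rows = conn.execute(
    "SELECT COUNT(*) FROM CUSTOMER WHERE AGENT_CODE IS NULL"
).fetchone()[0]
```

Every entry in checks should come back as rejected, while the null AGENT_CODE of customer 10013 is stored without complaint, matching the requirement column of Table 3.4.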

3.4 THE DATA DICTIONARY AND THE SYSTEM CATALOGUE

The data dictionary provides a detailed accounting of all tables found within the user/designer-created database. Thus, the data dictionary contains at least all of the attribute names and characteristics for each table in the system. In short, the data dictionary contains metadata: data about data. Using the small database presented in Figure 3.6, you might picture its data dictionary as shown in Table 3.6.


TABLE 3.6 A sample data dictionary

Table Name   Attribute Name   Contents                          Type           Format         Domain              Required   PK or FK   FK Referenced Table
CUSTOMER     CUS_CODE         Customer account code             CHAR(5)        99999          10000-99999         Y          PK
             CUS_LNAME        Customer last name                VARCHAR2(20)   Xxxxxxxx                           Y
             CUS_FNAME        Customer first name               VARCHAR2(20)   Xxxxxxxx                           Y
             CUS_INITIAL      Customer initial                  CHAR(1)        X
             CUS_RENEW_DATE   Customer insurance renewal date   DATE           dd-mmm-yyyy
             AGENT_CODE       Agent code                        CHAR(3)        999            100-999                        FK         AGENT
AGENT        AGENT_CODE       Agent code                        CHAR(3)        999                                Y          PK
             AGENT_AREACODE   Agent area code                   CHAR(4)        999                                Y
             AGENT_PHONE      Agent telephone number            CHAR(14)       999-9999                           Y
             AGENT_LNAME      Agent last name                   VARCHAR2(20)   Xxxxxxxx                           Y
             AGENT_YTD_SLS    Agent year-to-date sales          NUMBER(9,2)    9 999 999.99   0.00-9 999 999.99   Y

FK = Foreign key
PK = Primary key
CHAR = Fixed character length data (1-255 characters)
VARCHAR2 = Variable character length data (1-4 000 characters)
NUMBER = Numeric data (NUMBER(9,2) is used to specify numbers with two decimal places and up to nine digits, including the decimal places. Some RDBMSs permit the use of a MONEY or CURRENCY data type.)

NOTE
Telephone area codes are always composed of digits 0-9. Because area codes are not used arithmetically, they are most efficiently stored as character data. Also, the area codes are always composed of a maximum of four digits. Therefore, the area code data type is defined as CHAR(4). On the other hand, names do not conform to a standard length. Therefore, the customer first names are defined as VARCHAR2(20), thus indicating that up to 20 characters may be used to store the names. Character data are shown as left-justified.

NOTE
The data dictionary in Table 3.6 is an example of the human view of the entities, attributes and relationships. The purpose of this data dictionary is to ensure that all members of database design and implementation teams use the same table and attribute names and characteristics. The DBMS's internally stored data dictionary contains additional information about relationship types, entity and referential integrity checks and enforcement, and index types and components. This additional information is generated during the database implementation stage.
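The internally stored metadata of a DBMS can usually be read back with ordinary queries. The sketch below shows one way to do this in SQLite through Python's sqlite3 module, deriving a small data dictionary from the catalogue; sqlite_master, PRAGMA table_info and PRAGMA foreign_key_list are SQLite features, and other RDBMSs expose their catalogues under different names. The describe helper and the reduced two-table layout are assumptions made for the illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE AGENT (
    AGENT_CODE  CHAR(3) NOT NULL PRIMARY KEY,
    AGENT_LNAME VARCHAR2(20) NOT NULL
);
CREATE TABLE CUSTOMER (
    CUS_CODE   CHAR(5) NOT NULL PRIMARY KEY,
    CUS_LNAME  VARCHAR2(20) NOT NULL,
    AGENT_CODE CHAR(3) REFERENCES AGENT (AGENT_CODE)
);
""")

# sqlite_master is SQLite's catalogue table; it is queried like any other table.
table_names = [row[0] for row in conn.execute(
    "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
)]


def describe(table):
    """Build data dictionary entries for one table from the catalogue."""
    # PRAGMA table_info yields (cid, name, type, notnull, dflt_value, pk)
    return [
        {"attribute": name, "type": col_type,
         "required": bool(notnull), "pk": bool(pk)}
        for _cid, name, col_type, notnull, _default, pk
        in conn.execute(f"PRAGMA table_info({table})")
    ]


data_dictionary = {table: describe(table) for table in table_names}

# PRAGMA foreign_key_list yields (id, seq, table, from, to, ...);
# keep the FK column, the referenced table and the referenced column.
foreign_keys = [
    (row[3], row[2], row[4])
    for row in conn.execute("PRAGMA foreign_key_list(CUSTOMER)")
]
```

The resulting data_dictionary holds the same kind of information as the Type, Required and PK or FK columns of Table 3.6, but it is produced by the DBMS itself rather than maintained by hand.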

The data dictionary is sometimes described as the database designer's database because it records the design decisions about tables and their structures.

Like the data dictionary, the system catalogue contains metadata. The system catalogue can be described as a detailed system data dictionary that describes all objects within the database, including data about table names, the table's creator and creation date, the number of columns in each table, the data type corresponding to each column, index filenames, index creators, authorised users and access privileges. Because the system catalogue contains all required data dictionary information, the terms system catalogue and data dictionary are often used interchangeably. In fact, current relational database software generally provides only a system catalogue, from which the designer's data dictionary information may be derived. The system catalogue is actually a system-created database whose tables store the user/designer-created database characteristics and contents. Therefore, the system catalogue tables can be queried just like any user/designer-created table.

In effect, the system catalogue automatically produces database documentation. As new tables are added to the database, that documentation also allows the RDBMS to check for and eliminate homonyms and synonyms. In general terms, homonyms are similar-sounding words with different meanings, such as sun and son, or identically spelled words with different meanings, such as fair (meaning just) and fair (meaning festival). In a database context, the word homonym indicates the use of the same attribute name to label different attributes. For example, you might use C_NAME to label a customer name attribute in a CUSTOMER table and also use C_NAME to label a consultant name attribute in a CONSULTANT table. To lessen confusion, you should avoid database homonyms; the data dictionary is very useful in this regard.

In a database context, a synonym is the opposite of a homonym and indicates the use of different names to describe the same attribute. For example, car and auto refer to the same object. Synonyms must be avoided. You will discover why using synonyms is a bad idea when you work through Problem 33 at the end of this chapter.

3.5 RELATIONSHIPS WITHIN THE RELATIONAL DATABASE

You already know that relationships are classified as one-to-one (1:1), one-to-many (1:*), and many-to-many (*:*). This section explores those relationships further, to help you apply them properly when you start developing database designs, focusing on the following points:

The 1:* relationship is the relational modelling ideal. Therefore, this relationship type should be the norm in any relational database design.
The 1:1 relationship should be rare in any relational database design.
*:* relationships cannot be implemented as such in the relational model. Later in this section, you will see how any *:* relationship can be changed into two 1:* relationships.

NOTE
The UML class diagram represents relationships as associations among objects and can use the multiplicity element to represent *:* relationships directly. However, you will also learn how an association class is used to represent a *:* association between two classes in Chapter 5, Data Modelling with Entity Relationship Diagrams.


3.5.1 The 1:* Relationship

The 1:* relationship is the relational database norm. To see how such a relationship is modelled and implemented, consider the PAINTER paints PAINTING example that was used in Chapter 2. Compare the data models in Figure 3.7 with its implementation in Figure 3.8.

FIGURE 3.7 The 1:* relationship between PAINTER and PAINTING

As you examine the PAINTER and PAINTING table contents in Figure 3.8, note the following features:

Each painting is painted by one and only one painter, but each painter could have painted many paintings. Note that painter 123 (Onele P. Najeke) has three paintings stored in the PAINTING table.
There is only one row in the PAINTER table for any given row in the PAINTING table, but there may be many rows in the PAINTING table for any given row in the PAINTER table.

FIGURE 3.8 The implemented 1:* relationship between PAINTER and PAINTING

Database name: Ch03_Museum

Table name: PAINTER
Primary key: PAINTER_NUM
Foreign key: none

PAINTER_NUM  PAINTER_LNAME  PAINTER_FNAME  PAINTER_INITIAL
123          Najeke         Onele          P
126          Itero          Julio          G

Table name: PAINTING
Primary key: PAINTING_NUM
Foreign key: PAINTER_NUM

PAINTING_NUM  PAINTING_TITLE            PAINTER_NUM
1338          Dawn Thunder              123
1339          Vanilla Roses To Nowhere  123
1340          Tired Flounders           126
1341          Hasty Exit                123
1342          Plastic Paradise          126

As we are using the UML notation, it is worth pointing out some of the different terminology that you may see when representing relationships amongst entities. In UML, relationships are also known as associations among entities. Associations have several characteristics:

• Association name. Each association has a name. Normally, the name of the association is written over the association line. In Figure 3.7, the association name paints is displayed.

• Association direction. Associations also have a direction, represented by an arrow pointing in the direction in which the relationship flows. In this example, the arrow is pointing towards the PAINTING entity.

• Role name. The participating entities in the relationship can also have role names. A role name expresses the role played by a given entity (class) in the relationship. Figure 3.7 does not show role names, as the association name paints is written on the association line instead. In the example seen in Figure 3.7, the two role names would be paints and is_painted_by. The role names can alternatively be shown instead of the relationship name, for example: a PAINTER paints a PAINTING, and each PAINTING is_painted_by a PAINTER. As we are concentrating in this book on modelling relational concepts, we shall not use role names in any relationships between entities.

• Multiplicity. Multiplicity refers to the number of instances of one entity (class) that are associated with one instance of a related entity (class). Multiplicity in the UML model provides the same information as the connectivity, cardinality and relationship participation constructs in the ER model. For example: one (and only one) PAINTER generates one to many PAINTINGs, and one PAINTING belongs to one and only one PAINTER.

NOTE The one-to-many (1:*) relationship is easily implemented in the relational model by putting the primary key of the 1 side in the table of the many side as a foreign key.

The 1:* relationship is found in any database environment. Students in a typical college or university will discover that each COURSE can generate many CLASSes but that each CLASS refers to only one COURSE. For example, an Accounting II course might yield two classes: one offered on Mondays, Wednesdays and Fridays (MWF) from 10:00 a.m. to 10:50 a.m. and one offered on Thursdays (Th) from 6:00 p.m. to 8:40 p.m. Therefore, this 1:* relationship between COURSE and CLASS might be described this way:

• Each COURSE can have many CLASSes, but each CLASS references only one COURSE.

• There will be only one row in the COURSE table for any given row in the CLASS table, but there can be many rows in the CLASS table for any given row in the COURSE table.

Figure 3.9 maps the ERM (Entity Relationship Model) for the 1:* relationship between COURSE and CLASS.

FIGURE 3.9 The 1:* relationship between COURSE and CLASS

The 1:* relationship between COURSE and CLASS is further illustrated in Figure 3.10.

FIGURE 3.10 The implemented 1:* relationship between COURSE and CLASS

Database name: Ch03_TinyUniversity

Table name: COURSE
Primary key: CRS_CODE
Foreign key: none

CRS_CODE  DEPT_CODE  CRS_DESCRIPTION                     CRS_CREDIT
ACCT-211  ACCT       Accounting I                        3
ACCT-212  ACCT       Accounting II                       3
CIS-220   CIS        Introduction to Computer Science    3
CIS-420   CIS        Database Design and Implementation  4
QM-261    CIS        Introduction to Statistics          3
QM-362    CIS        Statistical Applications            4

Table name: CLASS
Primary key: CLASS_CODE
Foreign key: CRS_CODE

CLASS_CODE  CRS_CODE  CLASS_SECTION  CLASS_TIME            CLASS_ROOM  LECT_NUM
10012       ACCT-211  1              MWF 8:00-8:50 a.m.    BUS311      105
10013       ACCT-211  2              MWF 9:00-9:50 a.m.    BUS200      105
10014       ACCT-211  3              TTh 2:30-3:45 p.m.    BUS252      342
10015       ACCT-212  1              MWF 10:00-10:50 a.m.  BUS311      301
10016       ACCT-212  2              Th 6:00-8:40 p.m.     BUS252      301
10017       CIS-220   1              MWF 9:00-9:50 a.m.    KLR209      228
10018       CIS-220   2              MWF 9:00-9:50 a.m.    KLR211      114
10019       CIS-220   3              MWF 10:00-10:50 a.m.  KLR209      228
10020       CIS-420   1              W 6:00-8:40 p.m.      KLR209      162
10021       QM-261    1              MWF 8:00-8:50 a.m.    KLR200      114
10022       QM-261    2              TTh 1:00-2:15 p.m.    KLR200      114
10023       QM-362    1              MWF 11:00-11:50 a.m.  KLR200      162
10024       QM-362    2              TTh 2:30-3:45 p.m.    KLR200      162

Using Figure 3.10, take a minute to review some important terminology. Note that CLASS_CODE in the CLASS table uniquely identifies each row. Therefore, CLASS_CODE has been chosen to be the primary key. However, the combination CRS_CODE and CLASS_SECTION will also uniquely identify each row in the CLASS table. In other words, the composite key composed of CRS_CODE and CLASS_SECTION is a candidate key. Any candidate key must have the not null and unique constraints enforced. (You will see how this is done when you learn SQL in Chapter 8.)

Note in Figure 3.8, for example, that the PAINTER table's primary key, PAINTER_NUM, is included in the PAINTING table as a foreign key. Similarly, in Figure 3.10, the COURSE table's primary key, CRS_CODE, is included in the CLASS table as a foreign key.
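The terminology above can be tried out in a small sketch. The example below is only an illustration, not the book's own implementation: it assumes Python's built-in sqlite3 module as a stand-in DBMS, reuses two CLASS rows from Figure 3.10, and attempts two hypothetical rows (class codes 10099 and 10100, course HIST-101) that do not appear in the figure. The primary key, the not null and unique constraints on the candidate key, and the foreign key are each declared explicitly.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite checks foreign keys only when asked to

conn.executescript("""
CREATE TABLE COURSE (
    CRS_CODE        TEXT PRIMARY KEY,
    DEPT_CODE       TEXT NOT NULL,
    CRS_DESCRIPTION TEXT NOT NULL,
    CRS_CREDIT      INTEGER NOT NULL
);
CREATE TABLE CLASS (
    CLASS_CODE    INTEGER PRIMARY KEY,                         -- chosen primary key
    CRS_CODE      TEXT NOT NULL REFERENCES COURSE (CRS_CODE),  -- foreign key (many side)
    CLASS_SECTION INTEGER NOT NULL,
    CLASS_TIME    TEXT,
    UNIQUE (CRS_CODE, CLASS_SECTION)                           -- candidate key
);
INSERT INTO COURSE VALUES ('ACCT-212', 'ACCT', 'Accounting II', 3);
INSERT INTO CLASS VALUES (10015, 'ACCT-212', 1, 'MWF 10:00-10:50 a.m.');
INSERT INTO CLASS VALUES (10016, 'ACCT-212', 2, 'Th 6:00-8:40 p.m.');
""")


def try_insert(row):
    """Attempt to add a CLASS row and report whether the constraints allowed it."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute("INSERT INTO CLASS VALUES (?, ?, ?, ?)", row)
        return "accepted"
    except sqlite3.IntegrityError:
        return "rejected"


# Hypothetical rows, used only to exercise the constraints.
duplicate_candidate_key = try_insert((10099, "ACCT-212", 2, "F 1:00-1:50 p.m."))
unknown_course = try_insert((10100, "HIST-101", 1, "MWF 8:00-8:50 a.m."))

classes_for_course = conn.execute(
    "SELECT COUNT(*) FROM CLASS WHERE CRS_CODE = 'ACCT-212'"
).fetchone()[0]
print(duplicate_candidate_key, unknown_course, classes_for_course)
```

The first hypothetical row repeats an existing CRS_CODE and CLASS_SECTION combination, so the unique constraint on the candidate key refuses it. The second refers to a course that has no row in COURSE, so the foreign key refuses it.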

3.5.2 The 1:1 Relationship

As the 1:1 label implies, in this relationship, one entity can be related to only one other entity, and vice versa. For example, one department chair (a lecturer) can chair only one department and one department can have only one department chair. The entities LECTURER and DEPARTMENT thus exhibit a 1:1 relationship. (You might argue that not all lecturers chair a department and lecturers cannot be required to chair a department. That is, the relationship between the two entities is optional. However, at this stage of the discussion, you should focus your attention on the basic 1:1 relationship. Optional relationships will be addressed in Chapter 5.) The basic 1:1 relationship is modelled in Figure 3.11, and its implementation is shown in Figure 3.12.

FIGURE 3.11 The 1:1 relationship between LECTURER and DEPARTMENT

As you examine the tables in Figure 3.12, note that there are several important features:

• Each lecturer is a Tiny University employee. Therefore, the lecturer identification is through the EMP_NUM. (However, note that not all employees are LECTURERS; there's another optional relationship.)

• The 1:1 LECTURER chairs DEPARTMENT relationship is implemented by having the EMP_NUM as a foreign key in the DEPARTMENT table. Note that the 1:1 relationship is treated as a special case of the 1:* relationship in which the many side is restricted to a single occurrence. In this case, DEPARTMENT contains the EMP_NUM as a foreign key to indicate that it is the department that has a chair.

• Also note that the LECTURER table contains the DEPT_CODE foreign key to implement the 1:* DEPARTMENT employs LECTURER relationship. This is a good example of how two entities can participate in two (or even more) relationships simultaneously.
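The features listed above can be sketched in a few lines. The example is a hedged illustration rather than the book's implementation: it assumes Python's sqlite3 module and uses only a handful of rows from Figure 3.12. It also assumes that a UNIQUE constraint on DEPARTMENT.EMP_NUM is an acceptable way to restrict the many side to a single occurrence; the text itself does not prescribe that mechanism.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")

conn.executescript("""
CREATE TABLE DEPARTMENT (
    DEPT_CODE TEXT PRIMARY KEY,
    DEPT_NAME TEXT NOT NULL,
    -- 1:1 "chairs": a foreign key that may appear at most once (UNIQUE is our assumption)
    EMP_NUM   INTEGER UNIQUE REFERENCES LECTURER (EMP_NUM)
);
CREATE TABLE LECTURER (
    EMP_NUM   INTEGER PRIMARY KEY,
    -- 1:* "employs": the same department may appear on many lecturer rows
    DEPT_CODE TEXT NOT NULL REFERENCES DEPARTMENT (DEPT_CODE)
);
-- Departments start without a chair, reflecting the optional relationship.
INSERT INTO DEPARTMENT VALUES ('ACCT', 'Accounting', NULL);
INSERT INTO DEPARTMENT VALUES ('CIS', 'Computer Info. Systems', NULL);
INSERT INTO LECTURER VALUES (105, 'ACCT');
INSERT INTO LECTURER VALUES (114, 'ACCT');
INSERT INTO LECTURER VALUES (209, 'CIS');
UPDATE DEPARTMENT SET EMP_NUM = 114 WHERE DEPT_CODE = 'ACCT';
UPDATE DEPARTMENT SET EMP_NUM = 209 WHERE DEPT_CODE = 'CIS';
""")

# Lecturer 114 already chairs ACCT; asking the same lecturer to chair CIS must fail.
try:
    with conn:
        conn.execute("UPDATE DEPARTMENT SET EMP_NUM = 114 WHERE DEPT_CODE = 'CIS'")
    second_chair = "accepted"
except sqlite3.IntegrityError:
    second_chair = "rejected"

acct_lecturers = conn.execute(
    "SELECT COUNT(*) FROM LECTURER WHERE DEPT_CODE = 'ACCT'"
).fetchone()[0]
cis_chair = conn.execute(
    "SELECT EMP_NUM FROM DEPARTMENT WHERE DEPT_CODE = 'CIS'"
).fetchone()[0]
print(second_chair, acct_lecturers, cis_chair)
```

The two tables take part in two relationships at once: DEPT_CODE repeats freely in LECTURER (employs), while EMP_NUM can occur only once in DEPARTMENT (chairs).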

Online Content If you open the 'Ch03_TinyUniversity' database available on the online platform accompanying this book, you'll see that the STUDENT and CLASS entities still use LECT_NUM as their foreign key. LECT_NUM and EMP_NUM are labels for the same attribute, which is an example of the use of synonyms, or different names for the same attribute.

FIGURE 3.12 The implemented 1:1 relationship between LECTURER and DEPARTMENT

Database name: Ch03_TinyUniversity

Table name: LECTURER
Primary key: EMP_NUM
Foreign key: DEPT_CODE

EMP_NUM  DEPT_CODE  LECT_OFFICE  LECT_EXTENSION  LECT_HIGH_DEGREE
103      HIST       DRE 156      6783            PhD
104      ENG        DRE 102      5561            MA
105      ACCT       KLR 229D     8665            PhD
106      MKT/MGT    KLR 126      3899            PhD
110      BIOL       AAK 160      3412            PhD
114      ACCT       KLR 211      4436            PhD
155      MATH       AAK 201      4440            PhD
160      ENG        DRE 102      2248            PhD
162      CIS        KLR 203E     2359            PhD
191      MKT/MGT    KLR 409B     4016            DBA
195      PSYCH      AAK 297      3550            PhD
209      CIS        KLR 333      3421            PhD
228      CIS        KLR 300      3000            PhD
297      MATH       AAK 194      1145            PhD
299      ECON/FIN   KLR 284      2851            PhD
301      ACCT       KLR 244      4683            PhD
335      ENG        DRE 208      2000            PhD
342      SOC        BBG 208      5514            PhD
387      BIOL       AAK 230      8665            PhD
401      HIST       DRE 156      6783            MA
425      ECON/FIN   KLR 284      2851            MBA
435      ART        BBG 185      2278            PhD

The 1:* DEPARTMENT employs LECTURER relationship is implemented through the placement of the DEPT_CODE foreign key in the LECTURER table.

Table name: DEPARTMENT
Primary key: DEPT_CODE
Foreign key: EMP_NUM

DEPT_CODE  DEPT_NAME               SCHOOL_CODE  EMP_NUM  DEPT_ADDRESS      DEPT_EXTENSION
ACCT       Accounting              BUS          114      KLR 211, Box 52   3119
ART        Fine Arts               A&SCI        435      BBG 185, Box 128  2278
BIOL       Biology                 A&SCI        387      AAK 230, Box 415  4117
CIS        Computer Info. Systems  BUS          209      KLR 333, Box 56   3245
ECON/FIN   Economics/Finance       BUS          299      KLR 284, Box 63   3126
ENG        English                 A&SCI        160      DRE 102, Box 223  1004
HIST       History                 A&SCI        103      DRE 156, Box 284  1867
MATH       Mathematics             A&SCI        297      AAK 194, Box 422  4234
MKT/MGT    Marketing/Management    BUS          106      KLR 126, Box 55   3342
PSYCH      Psychology              A&SCI        195      AAK 297, Box 438  4110
SOC        Sociology               A&SCI        342      BBG 208, Box 132  2008

The 1:1 LECTURER chairs DEPARTMENT relationship is implemented through the placement of the EMP_NUM foreign key in the DEPARTMENT table.

The preceding LECTURER chairs DEPARTMENT example illustrates a proper 1:1 relationship. In fact, the use of a 1:1 relationship ensures that two entity sets are not placed in the same table when they should not be. However, the existence of a 1:1 relationship sometimes means that the entity components were not defined properly. It could indicate that the two entities actually belong in the same table!

As rare as 1:1 relationships should be, certain conditions absolutely require their use. For example, suppose you manage the database for a company that employs pilots, accountants, mechanics, clerks, salespeople, service personnel and more. Pilots have many attributes that the other employees don't have, such as licences, medical certificates, flight experience records, dates of flight proficiency checks and proof of required periodic medical checks. If you put all of the pilot-specific attributes in the EMPLOYEE table, you will have several nulls in that table for all employees who are not pilots. To avoid the proliferation of nulls, it is better to split the pilot attributes into a separate table (PILOT) that is linked to the EMPLOYEE table in a 1:1 relationship. Since pilots have many attributes that are shared by all employees, such as name, date of birth and date of first employment, those attributes would be stored in the EMPLOYEE table.
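A minimal sketch of the EMPLOYEE and PILOT split follows, assuming Python's sqlite3 module. Every column name other than EMP_NUM (EMP_LNAME, EMP_HIREDATE, PIL_LICENCE, PIL_MED_DATE) and every data value is a hypothetical placeholder; the actual Ch03_AviaCo structure is not shown in this section. Making EMP_NUM both the primary key and a foreign key of PILOT is one common way to obtain a 1:1 link, and is an assumption here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")

conn.executescript("""
CREATE TABLE EMPLOYEE (              -- attributes shared by all employees
    EMP_NUM      INTEGER PRIMARY KEY,
    EMP_LNAME    TEXT NOT NULL,      -- hypothetical column
    EMP_HIREDATE TEXT NOT NULL       -- hypothetical column
);
CREATE TABLE PILOT (                 -- pilot-only attributes, so no nulls for non-pilots
    EMP_NUM      INTEGER PRIMARY KEY REFERENCES EMPLOYEE (EMP_NUM),
    PIL_LICENCE  TEXT NOT NULL,      -- hypothetical column
    PIL_MED_DATE TEXT NOT NULL       -- hypothetical column
);
-- Placeholder rows: employee 1 is a pilot, employee 2 is not.
INSERT INTO EMPLOYEE VALUES (1, 'Pilot A', '2015-03-01');
INSERT INTO EMPLOYEE VALUES (2, 'Clerk B', '2018-09-15');
INSERT INTO PILOT VALUES (1, 'ATP', '2020-01-10');
""")


def try_add_pilot(row):
    """Attempt to add a PILOT row and report whether the constraints allowed it."""
    try:
        with conn:
            conn.execute("INSERT INTO PILOT VALUES (?, ?, ?)", row)
        return "accepted"
    except sqlite3.IntegrityError:
        return "rejected"


second_row_same_employee = try_add_pilot((1, "COM", "2021-05-05"))  # breaks 1:1
pilot_without_employee = try_add_pilot((99, "ATP", "2021-05-05"))   # breaks the foreign key

pilots = conn.execute("""
    SELECT E.EMP_NUM, E.EMP_LNAME, P.PIL_LICENCE
    FROM EMPLOYEE E JOIN PILOT P ON P.EMP_NUM = E.EMP_NUM
""").fetchall()
print(pilots, second_row_same_employee, pilot_without_employee)
```

Because EMP_NUM is the primary key of PILOT, an employee can own at most one pilot row, and because it is also a foreign key, every pilot row must belong to an existing employee.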

Online Content If you look at the 'Ch03_AviaCo' database on the online platform for this book, you will see the implementation of the 1:1 PILOT to EMPLOYEE relationship. This type of relationship will be examined in detail in Chapter 6, Data Modelling Advanced Concepts.

3.5.3 The *:* Relationship

A many-to-many (*:*) relationship is a more troublesome proposition in the relational environment. Traditionally in data modelling, the *:* relationship can be implemented by breaking it up to produce a set of 1:* relationships.

To explore the many-to-many (*:*) relationship, consider a rather typical college environment in which each STUDENT can take many CLASSes and each CLASS can contain many STUDENTs. The ERD model in Figure 3.13 shows this *:* relationship.

FIGURE 3.13 The *:* relationship between STUDENT and CLASS

Note the features of the ERD in Figure 3.13:

• Each CLASS can have many STUDENTs, and each STUDENT can take many CLASSes.

• There can be many rows in the CLASS table for any given row in the STUDENT table, and there can be many rows in the STUDENT table for any given row in the CLASS table.

To examine the *:* relationship more closely, imagine a small university with two students, each of whom takes three classes. Table 3.7 shows the enrolment data for the two students.

TABLE 3.7 Sample student enrolment data

Student's Last Name  Selected Classes
Ndlovu               Accounting 1, ACCT-211, code 10014
                     Intro to Computer Science, CIS-220, code 10018
                     Intro to Statistics, QM-261, code 10021
Smithson             Accounting 1, ACCT-211, code 10014
                     Intro to Computer Science, CIS-220, code 10018
                     Intro to Statistics, QM-261, code 10021

Although the *:* relationship is logically reflected in Figure 3.13, it should not be implemented as shown in Figure 3.14 for two good reasons:

• The tables create many redundancies. For example, note that the STU_NUM values occur many times in the STUDENT table. In a real-world situation, additional student attributes such as address, classification, major and home phone would also be contained in the STUDENT table, and each of those attribute values would be repeated in each of the records shown here. Similarly, the CLASS table contains many duplications: each student taking the class generates a CLASS record. The problem would be even worse if the CLASS table included such attributes as credit hours and course description. Those redundancies lead to the anomalies discussed in Chapter 1.

• Given the structure and contents of the two tables, the relational operations become very complex and are likely to lead to system efficiency errors and output errors.

FIGURE 3.14 The *:* relationship between STUDENT and CLASS

Database name: Ch03_CollegeTry

Table name: STUDENT
Primary key: STU_NUM
Foreign key: none

STU_NUM  STU_LNAME  CLASS_CODE
321452   Ndlovu     10014
321452   Ndlovu     10018
321452   Ndlovu     10021
324257   Smithson   10014
324257   Smithson   10018
324257   Smithson   10021

Table name: CLASS
Primary key: CLASS_CODE
Foreign key: STU_NUM

CLASS_CODE  STU_NUM  CRS_CODE  CLASS_SECTION  CLASS_TIME          CLASS_ROOM  PROF_NUM
10014       321452   ACCT-211  3              TTh 2:30-3:45 p.m.  BUS252      342
10014       324257   ACCT-211  3              TTh 2:30-3:45 p.m.  BUS252      342
10018       321452   CIS-220   2              MWF 9:00-9:50 a.m.  KLR211      114
10018       324257   CIS-220   2              MWF 9:00-9:50 a.m.  KLR211      114
10021       321452   QM-261    1              MWF 8:00-8:50 a.m.  KLR200      114
10021       324257   QM-261    1              MWF 8:00-8:50 a.m.  KLR200      114

Fortunately, the problems inherent in the many-to-many (*:*) relationship can easily be avoided by creating a composite entity or bridge entity. Because such a table is used to link the tables that originally were related in a *:* relationship, the composite entity structure includes, as foreign keys, at least the primary keys of the tables that are to be linked. The database designer has two main options when defining a composite table's primary key: use the combination of those foreign keys or create a new primary key.

NOTE In UML class diagrams, the multiplicity element can represent *:* relationships directly. Instead of using a composite entity, an association class is used to represent the association between two entities. We will explore the concept of an association class further in Chapter 5, Data Modelling with Entity Relationship Diagrams.

Remember that each entity in the ERD is represented by a table. Therefore, you can create the composite ENROL table shown in Figure 3.15 to link the tables CLASS and STUDENT. In this example, the ENROL table's primary key is the combination of its foreign keys CLASS_CODE and STU_NUM. But the designer could have decided to create a single-attribute new primary key such as ENROL_LINE, using a different line value to identify each ENROL table row uniquely. (Microsoft Access users might use the Autonumber data type to generate such line values automatically.)

FIGURE 3.15 Converting the *:* relationship into two 1:* relationships

Database name: Ch03_CollegeTry2

Table name: STUDENT
Primary key: STU_NUM
Foreign key: none

STU_NUM  STU_LNAME
321452   Ndlovu
324257   Smithson

Table name: ENROL
Primary key: CLASS_CODE + STU_NUM
Foreign keys: CLASS_CODE, STU_NUM

CLASS_CODE  STU_NUM  ENROLL_GRADE
10014       321452   C
10014       324257   B
10018       321452   A
10018       324257   B
10021       321452   C
10021       324257   C

Table name: CLASS
Primary key: CLASS_CODE
Foreign key: CRS_CODE

CLASS_CODE  CRS_CODE  CLASS_SECTION  CLASS_TIME          CLASS_ROOM  PROF_NUM
10014       ACCT-211  3              TTh 2:30-3:45 p.m.  BUS252      342
10018       CIS-220   2              MWF 9:00-9:50 a.m.  KLR211      114
10021       QM-261    1              MWF 8:00-8:50 a.m.  KLR200      114

Because the ENROL table in Figure 3.15 links two tables, STUDENT and CLASS, it is also called a linking table. In other words, a linking table is the implementation of a composite entity.
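The linking table of Figure 3.15 can be sketched as follows, assuming Python's sqlite3 module as the DBMS. The rows are taken from the figure, the CLASS table is trimmed to three columns for brevity, and the composite primary key is the first of the two options described above. Treat it as an illustration rather than the book's own code.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")

conn.executescript("""
CREATE TABLE STUDENT (
    STU_NUM   INTEGER PRIMARY KEY,
    STU_LNAME TEXT NOT NULL
);
CREATE TABLE CLASS (
    CLASS_CODE    INTEGER PRIMARY KEY,
    CRS_CODE      TEXT NOT NULL,
    CLASS_SECTION INTEGER NOT NULL
);
CREATE TABLE ENROL (                               -- composite (bridge) entity
    CLASS_CODE   INTEGER NOT NULL REFERENCES CLASS (CLASS_CODE),
    STU_NUM      INTEGER NOT NULL REFERENCES STUDENT (STU_NUM),
    ENROLL_GRADE TEXT,
    PRIMARY KEY (CLASS_CODE, STU_NUM)              -- combination of the foreign keys
);
""")

with conn:
    conn.executemany("INSERT INTO STUDENT VALUES (?, ?)",
                     [(321452, "Ndlovu"), (324257, "Smithson")])
    conn.executemany("INSERT INTO CLASS VALUES (?, ?, ?)",
                     [(10014, "ACCT-211", 3), (10018, "CIS-220", 2), (10021, "QM-261", 1)])
    conn.executemany("INSERT INTO ENROL VALUES (?, ?, ?)",
                     [(10014, 321452, "C"), (10014, 324257, "B"),
                      (10018, 321452, "A"), (10018, 324257, "B"),
                      (10021, 321452, "C"), (10021, 324257, "C")])

# Each student and each class is now stored once; ENROL carries the pairings.
ndlovu_grades = conn.execute("""
    SELECT E.CLASS_CODE, C.CRS_CODE, E.ENROLL_GRADE
    FROM ENROL E
    JOIN CLASS C   ON C.CLASS_CODE = E.CLASS_CODE
    JOIN STUDENT S ON S.STU_NUM = E.STU_NUM
    WHERE S.STU_LNAME = 'Ndlovu'
    ORDER BY E.CLASS_CODE
""").fetchall()

# Enrolling the same student in the same class twice violates the composite key.
try:
    with conn:
        conn.execute("INSERT INTO ENROL VALUES (10014, 321452, 'A')")
    duplicate_enrolment = "accepted"
except sqlite3.IntegrityError:
    duplicate_enrolment = "rejected"

student_rows = conn.execute("SELECT COUNT(*) FROM STUDENT").fetchone()[0]
print(ndlovu_grades, duplicate_enrolment, student_rows)
```

Had the designer preferred the second option, a single-attribute key such as ENROL_LINE, a separate unique constraint on the CLASS_CODE and STU_NUM pair would still be needed to prevent duplicate enrolments.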

NOTE In addition to the linking attributes, the composite ENROL table can also contain such relevant attributes as the grade earned in the course. In fact, a composite table can contain any number of attributes that the designer wants to track. Keep in mind that the composite entity, although it is implemented as an actual table, is conceptually a logical entity that was created as a means to an end: to eliminate the potential for multiple redundancies in the original *:* relationship.

The linking table (ENROL) shown in Figure 3.15 yields the required *:* to 1:* conversion. Observe that the composite entity represented by the ENROL table must contain at least the primary keys of the CLASS and STUDENT tables (CLASS_CODE and STU_NUM, respectively) for which it serves as a connector. Also note that the STUDENT and CLASS tables now contain only one row per entity. The linking ENROL table contains multiple occurrences of the foreign key values, but those controlled redundancies are incapable of producing anomalies as long as referential integrity is enforced. Additional attributes may be assigned as needed. In this case, ENROL_GRADE is selected to satisfy a reporting requirement. Also note that the ENROL table's primary key consists of the two attributes CLASS_CODE and STU_NUM, because both the class code and the student number are needed to define a particular student's grade. Naturally, the conversion is reflected in the ERM, too. The revised relationship is shown in Figure 3.16.

FIGURE 3.16 Changing the *:* relationship to two 1:* relationships

As you examine Figure 3.16, note that the composite entity named ENROL represents the linking table between STUDENT and CLASS.

The 1:* relationship between COURSE and CLASS was first illustrated in Figure 3.9 and Figure 3.10. With the help of this relationship, you can increase the amount of available information, even as you control the database's redundancies. Thus, Figure 3.16 can be expanded to include the 1:* relationship between COURSE and CLASS shown in Figure 3.17. Note that the model is able to handle multiple sections of a CLASS while controlling redundancies by making sure that all of the COURSE data common to each CLASS are kept in the COURSE table.

FIGURE 3.17 The expanded entity relationship model

[Figure: COURSE (1..1) has (1..*) CLASS; STUDENT (1..1) registers (1..*) ENROL; ENROL (1..*) shows_in (1..1) CLASS]

The ERD will be examined in greater detail in Chapter 5 to show you how it is used to design more complex databases. The ERD will also be used as the basis for the development and implementation of a realistic database design in Appendices B and C (see the online platform for this book).

3.6 DATA REDUNDANCY REVISITED

In Chapter 1 you learnt that data redundancy leads to data anomalies. Those anomalies can destroy the effectiveness of the database. You also learnt that the relational database makes it possible to control data redundancies by using common attributes that are shared by tables, called foreign keys.

The proper use of foreign keys is crucial to exercising data redundancy control. However, it is worth emphasising that, in the strictest sense, the use of foreign keys does not eliminate data redundancies, because foreign key values can be repeated many times. Nevertheless, the proper use of foreign keys minimises data redundancies, thus minimising the chance that destructive data anomalies will develop.
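As a small, hedged illustration of controlled redundancy, the sketch below reloads the PAINTER and PAINTING rows of Figure 3.8 using Python's sqlite3 module (an assumption; any relational DBMS would serve). The foreign key value 123 is repeated, yet the painter's descriptive data is stored only once, and the DBMS refuses a delete that would leave paintings without a painter.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")

conn.executescript("""
CREATE TABLE PAINTER (
    PAINTER_NUM     INTEGER PRIMARY KEY,
    PAINTER_LNAME   TEXT NOT NULL,
    PAINTER_FNAME   TEXT NOT NULL,
    PAINTER_INITIAL TEXT
);
CREATE TABLE PAINTING (
    PAINTING_NUM   INTEGER PRIMARY KEY,
    PAINTING_TITLE TEXT NOT NULL,
    PAINTER_NUM    INTEGER NOT NULL REFERENCES PAINTER (PAINTER_NUM)
);
INSERT INTO PAINTER VALUES (123, 'Najeke', 'Onele', 'P');
INSERT INTO PAINTER VALUES (126, 'Itero', 'Julio', 'G');
INSERT INTO PAINTING VALUES (1338, 'Dawn Thunder', 123);
INSERT INTO PAINTING VALUES (1339, 'Vanilla Roses To Nowhere', 123);
INSERT INTO PAINTING VALUES (1340, 'Tired Flounders', 126);
INSERT INTO PAINTING VALUES (1341, 'Hasty Exit', 123);
INSERT INTO PAINTING VALUES (1342, 'Plastic Paradise', 126);
""")

# The foreign key value repeats (a controlled redundancy) ...
fk_occurrences = conn.execute(
    "SELECT COUNT(*) FROM PAINTING WHERE PAINTER_NUM = 123"
).fetchone()[0]

# ... but the painter's name exists in exactly one row.
name_copies = conn.execute(
    "SELECT COUNT(*) FROM PAINTER WHERE PAINTER_LNAME = 'Najeke'"
).fetchone()[0]

# Referential integrity blocks a delete that would orphan three paintings.
try:
    with conn:
        conn.execute("DELETE FROM PAINTER WHERE PAINTER_NUM = 123")
    delete_painter = "accepted"
except sqlite3.IntegrityError:
    delete_painter = "rejected"

print(fk_occurrences, name_copies, delete_painter)
```

Only the short key value is duplicated, so a change to the painter's name would touch a single row, which is the sense in which foreign keys minimise rather than eliminate redundancy.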

NOTE The real test of redundancy is not how many copies of a given attribute are stored, but whether the elimination of an attribute will eliminate information. Therefore, if you delete an attribute and the original information can still be generated through relational algebra, the inclusion of that attribute would be redundant. Given that view of redundancy, proper foreign keys are clearly not redundant in spite of their multiple occurrences in a table. However, even when you use this less restrictive view of redundancy, keep in mind that controlled redundancies are often designed as part of the system to ensure transaction speed and/or information requirements. Exclusive reliance on relational algebra to produce required information may lead to elegant designs that fail the test of practicality.

You will learn in Chapter 5 that database designers must reconcile three often contradictory requirements: design elegance, processing speed and information requirements. And you will learn in Chapter 15, Databases for Business Intelligence, that proper data warehousing design requires carefully defined and controlled data redundancies to function properly. Regardless of how you describe data redundancies, the potential for damage is limited by proper implementation and careful control.

As important as data redundancy control is, there are times when the level of data redundancy must actually be increased to make the database serve crucial information purposes. You will learn about such redundancies in Chapter 15. And there are times when data redundancies seem to exist to preserve the historical accuracy of data. For example, consider a small invoicing system. The system includes the CUSTOMER, who may buy one or more PRODUCTs, thus generating an INVOICE. Because a customer may buy more than one product at a time, an invoice may contain several invoice LINEs, each providing details about the purchased product. The PRODUCT table should contain the product price to provide a consistent pricing input for each product that appears on the invoice. The tables that are part of such a system are shown in Figure 3.18. The system's ERD is shown in Figure 3.19.

FIGURE 3.18 Database

Asmall invoicing

name:

3 Relational

Model

Characteristics

system Table

Ch03_SaleCo

name:

Foreign

Primary key: CUS_CODE

CUSTOMER

key: none

CUS_CODE

CUS_LNAME

CUS_FNAME

CUS_INITIAL

CUS_AREACODE

CUS_PHONE

10010

Ramas

Alfred

A

0181

844-2573

10011

Dunne

Leona

K

0161

894-1238

10012

Du Toit

0181

894-2285

10013

Pieterse

0181

894-2180

10014

Orlando

0181

222-1672

10015

OBrian

Amy

B

0161

442-3381

10016

Brown

James

G

0181

297-1228

0181

290-2556

10017

Marlene

George

Moloi

10019

Table

F

Myron

Padayachee

10018

W

Jaco

Williams

99

Vinaya

G

0161

382-7185

Mlilo

K

0181

297-3809

3

name: INVOICE

Foreign

Primary key: INV_NUMBER INV_NUMBER

key: CUS_CODE

CUS_CODE

INV_DATE

1001

10014

08-Dec-19

1002

10011

08-Dec-19

1003

10012

08-Dec-19

1004

10011

09-Dec-19

Table name: LINE Primary

key: INV_NUMBER

1 LINE_NUMBER

Foreign

key: INV_NUMBER,

PROD_CODE

INV_NUMBER

Copyright Editorial

review

LINE_PRICE

LINE_NUMBER

PROD_CODE

LINE_UNITS

1001

1

123-21UUY

1

1001

2

SRE-657UG

3

2.36

1002

1

QER-34256

2

14.72

1003

1

ZZX/3245Q

1

5.36

1003

2

SRE-657UG

1

2.36

1003

3

001278-AB

1

10.23

1004

1

001278-AB

1

10.23

1004

2

SRE-657UG

2

2.36

2020 has

Cengage deemed

Learning. that

any

All suppressed

Rights

Reserved. content

does

May not

not materially

be

copied, affect

scanned, the

overall

or

duplicated, learning

in experience.

whole

or in Cengage

part.

Due Learning

to

electronic reserves

rights, the

right

some to

150.09

third remove

party additional

content

may content

be

suppressed at

any

time

from if

the

subsequent

eBook rights

and/or restrictions

eChapter(s). require

it.

100

PART I

Table

3

Database

name:

Systems

PRODUCT

Primary

key:

PROD_CODE

Foreign

key:

none

PROD_CODE

PROD_DESCRIPT

001278-AB

Claw

123-21UUY

Houselite

QER-34256

Sledge

SRE-657UG

Rat-tail file

ZZX/3245Q

Steel tape,

FIGURE 3.19

PROD_PRICE

PROD_ON_HAND

VEND_CODE

23

232

4

235

6

231

2.36

15

232

5.36

8

235

10.23

hammer chain

saw,

hammer,

16 cm

150.09

bar

14.72

16 kg head

12 mlength

The ClassERDfor the invoicing system
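The four tables in Figure 3.18 can be loaded into a small test database and followed from CUSTOMER through INVOICE and LINE to PRODUCT. The sketch below is only an illustration, assuming Python's built-in sqlite3 module and a subset of the figure's rows; the function names and the later price of 160.00 are hypothetical. It also shows that a total computed from LINE_PRICE is unaffected when PROD_PRICE changes afterwards.

```python
import sqlite3

# In-memory SQLite database holding a subset of the Figure 3.18 rows.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE CUSTOMER (CUS_CODE INTEGER PRIMARY KEY, CUS_LNAME TEXT, CUS_FNAME TEXT);
CREATE TABLE INVOICE  (INV_NUMBER INTEGER PRIMARY KEY,
                       CUS_CODE INTEGER REFERENCES CUSTOMER (CUS_CODE),
                       INV_DATE TEXT);
CREATE TABLE PRODUCT  (PROD_CODE TEXT PRIMARY KEY, PROD_DESCRIPT TEXT, PROD_PRICE REAL);
CREATE TABLE LINE     (INV_NUMBER INTEGER REFERENCES INVOICE (INV_NUMBER),
                       LINE_NUMBER INTEGER,
                       PROD_CODE TEXT REFERENCES PRODUCT (PROD_CODE),
                       LINE_UNITS INTEGER,
                       LINE_PRICE REAL,
                       PRIMARY KEY (INV_NUMBER, LINE_NUMBER));
INSERT INTO CUSTOMER VALUES (10014, 'Orlando', 'Myron');
INSERT INTO INVOICE  VALUES (1001, 10014, '2019-12-08');
INSERT INTO PRODUCT  VALUES ('123-21UUY', 'Houselite chain saw, 16 cm bar', 150.09);
INSERT INTO PRODUCT  VALUES ('SRE-657UG', 'Rat-tail file', 2.36);
INSERT INTO LINE     VALUES (1001, 1, '123-21UUY', 1, 150.09);
INSERT INTO LINE     VALUES (1001, 2, 'SRE-657UG', 3, 2.36);
""")


def trace_customer(cus_code):
    """Follow CUSTOMER -> INVOICE -> LINE -> PRODUCT for one customer."""
    return conn.execute(
        """
        SELECT I.INV_NUMBER, L.LINE_NUMBER, P.PROD_DESCRIPT, L.LINE_UNITS, L.LINE_PRICE
        FROM CUSTOMER C
        JOIN INVOICE I ON I.CUS_CODE = C.CUS_CODE
        JOIN LINE L ON L.INV_NUMBER = I.INV_NUMBER
        JOIN PRODUCT P ON P.PROD_CODE = L.PROD_CODE
        WHERE C.CUS_CODE = ?
        ORDER BY I.INV_NUMBER, L.LINE_NUMBER
        """,
        (cus_code,),
    ).fetchall()


def invoice_total(inv_number):
    """Sum LINE_UNITS * LINE_PRICE over the invoice's lines (before taxes)."""
    (total,) = conn.execute(
        "SELECT SUM(LINE_UNITS * LINE_PRICE) FROM LINE WHERE INV_NUMBER = ?",
        (inv_number,),
    ).fetchone()
    return round(total, 2)


lines = trace_customer(10014)
total_before = invoice_total(1001)

# Hypothetical later price change in PRODUCT; the stored LINE rows are untouched.
conn.execute("UPDATE PRODUCT SET PROD_PRICE = 160.00 WHERE PROD_CODE = '123-21UUY'")
total_after = invoice_total(1001)
```

Because the total is read from the prices stored in LINE, total_before and total_after are the same value; a total recomputed from the current PROD_PRICE would not be.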

As you examine the tables in the invoicing system in Figure 3.18 and the relationships depicted in Figure 3.19, note that you can keep track of typical sales information. For example, by tracing the relationships among the four tables, you discover that customer 10014 (Myron Orlando) bought two items on 8 December 2019 that were written to invoice number 1001: one Houselite chain saw with a 16 cm bar and three rat-tail files. (Note: Trace the CUS_CODE number 10014 in the CUSTOMER table to the matching CUS_CODE value in the INVOICE table. Next, take the INV_NUMBER 1001 and trace it to the first two rows in the LINE table; then match the two PROD_CODE values in LINE with the PROD_CODE values in PRODUCT.) Application software will be used to write the correct bill by multiplying each invoice line item's LINE_UNITS by its LINE_PRICE, adding the results, applying appropriate taxes, etc. Later, other application software might use the same technique to write sales reports that track and compare sales by week, month or year.

As you examine the sales transactions in Figure 3.18, you might reasonably suppose that the product price billed to the customer is derived from the PRODUCT table because that's where the product


data are stored. But why does that same product price occur again in the LINE table? Isn't that a data redundancy? It certainly appears to be. But this time, the apparent redundancy is crucial to the system's success. Copying the product price from the PRODUCT table to the LINE table maintains the historical accuracy of the transactions. Suppose, for instance, that you fail to write the LINE_PRICE in the LINE table and that you use the PROD_PRICE from the PRODUCT table to calculate the sales revenue. Now suppose that the PRODUCT table's PROD_PRICE changes. This price change will be properly reflected in all subsequent sales revenue calculations. Unfortunately, the calculations of past sales revenues will also reflect the new product price, which was not in effect when the transaction took place! As a result, the revenue calculations for all past transactions will be incorrect, thus eliminating the possibility of making proper sales comparisons over time. On the other hand, if the price data are copied from the PRODUCT table and stored with the transaction in the LINE table, that price will always accurately reflect the transaction that took place at that time. You will discover that such planned redundancies are common in good database design.

Finally, you might wonder why the LINE_NUMBER attribute was used in the LINE table in Figure 3.18. Wouldn't the combination of INV_NUMBER and PROD_CODE be a sufficient composite primary key and, therefore, isn't the LINE_NUMBER redundant? Yes, the LINE_NUMBER is redundant, but this redundancy is quite commonly created by invoicing software that generates such line numbers automatically. In this case, the redundancy is not necessary. But given its automatic generation, the redundancy is not a source of anomalies. The inclusion of LINE_NUMBER also adds another benefit: the order of the retrieved invoicing data will always match the order in which the data were entered. If product codes are used as part of the primary key, indexing will arrange those product codes as soon as the invoice is completed and the data are stored. You can imagine the potential confusion when a customer calls and says, 'The second item on my invoice has an incorrect price', and you are looking at an invoice whose lines show a different order from those on the customer's copy!

3.7 INDEXES

Suppose you want to locate a particular book in a library. Does it make sense to look through every book in the library until you find the one you want? Of course not; you use the library's catalogue, which is indexed by title, topic and author. The index (in either a manual or a computer system) points you to the book's location, thereby making retrieval of the book a quick and simple matter. An index is an orderly arrangement used to logically access rows in a table.

Or suppose you want to find a topic, such as ER model, in this book. Does it make sense to read through every page until you stumble across the topic? Of course not; it is much simpler to go to the book's index, look up the phrase ER model, and read the page references that point you to the appropriate page(s). In each case, an index is used to locate a needed item quickly.

Indexes in the relational database environment work like the indexes described in the preceding paragraphs. From a conceptual point of view, an index is composed of an index key and a set of pointers. The index key is, in effect, the index's reference point. More formally, an index is an ordered arrangement of keys and pointers. Each key points to the location of the data identified by the key.

For example, suppose you want to look up all of the paintings created by a given painter in the Ch03_Museum database in Figure 3.8. Without an index, you must read each row in the PAINTING table and see if the PAINTER_NUM matches the requested painter. However, if you index the PAINTING table and use the index key PAINTER_NUM, you merely need to look up the appropriate PAINTER_NUM in the index and find the matching pointers. Conceptually speaking, the index would resemble the presentation depicted in Figure 3.20.

FIGURE 3.20 Components of an index

PAINTING table index

PAINTER_NUM (index key)   Pointers to the PAINTING table rows
123                       1, 2, 4
126                       3, 5

SOURCE: Course Technology/Cengage Learning

As you examine Figure 3.20, note that the first PAINTER_NUM index key value (123) is found in records 1, 2 and 4 of the PAINTING table. The second PAINTER_NUM index key value (126) is found in records 3 and 5 of the PAINTING table.
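The idea of an index key with a set of pointers can be sketched in a few lines of Python. This is a conceptual model only, not how any particular DBMS stores its indexes; the row numbers and PAINTER_NUM values follow Figure 3.20, while the function names are hypothetical.

```python
from collections import defaultdict

# Row number -> PAINTER_NUM, consistent with the pointers shown in Figure 3.20.
painting_rows = {1: 123, 2: 123, 3: 126, 4: 123, 5: 126}


def build_index(rows):
    """Return an ordered mapping of index key -> list of row pointers."""
    index = defaultdict(list)
    for row_number, painter_num in sorted(rows.items()):
        index[painter_num].append(row_number)
    return dict(sorted(index.items()))


def full_scan(rows, painter_num):
    """Without an index: read every row and test its PAINTER_NUM."""
    return [n for n, p in sorted(rows.items()) if p == painter_num]


def index_lookup(index, painter_num):
    """With an index: go straight to the key and read its pointers."""
    return index.get(painter_num, [])


painter_index = build_index(painting_rows)
```

Both full_scan(painting_rows, 123) and index_lookup(painter_index, 123) return the same row numbers; the difference is that the lookup does not have to visit every row.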

DBMSs use indexes for many different purposes. You just learnt that an index can be used to retrieve data more efficiently. But indexes can also be used by a DBMS to retrieve data ordered by a specific attribute or attributes. For example, creating an index on a customer's last name will allow you to retrieve the customer data alphabetically ordered by the customer's last name. Also, an index key can be composed of one or more attributes. For example, in Figure 3.18, you can create an index on VEND_CODE and PROD_CODE to retrieve all rows in the PRODUCT table ordered by vendor and, within vendor, ordered by product.

Indexes play an important role in DBMSs for the implementation of primary keys. When you define a table's primary key, the DBMS automatically creates a unique index on the primary key column(s) you declared. For example, in Figure 3.18, when you declare CUS_CODE to be the primary key of the CUSTOMER table, the DBMS automatically creates a unique index on that attribute. A unique index, as its name implies, is an index in which the index key can have only one pointer value (row) associated with it. (The index in Figure 3.20 is not a unique index because the PAINTER_NUM has multiple pointer values associated with it. For example, painter number 123 points to three rows, 1, 2 and 4, in the PAINTING table.)
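One way to see the automatic unique index is to inspect a test database after declaring a primary key. The sketch below assumes SQLite through Python's sqlite3 module; other DBMSs expose the same information through their own catalogue views. CUS_CODE is declared as TEXT here because SQLite treats an INTEGER PRIMARY KEY as the row identifier itself and builds no separate index for it, which is a SQLite-specific detail rather than part of the relational model.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE CUSTOMER (
        CUS_CODE  TEXT NOT NULL PRIMARY KEY,
        CUS_LNAME TEXT
    )
    """
)

# PRAGMA index_list returns one row per index: (seq, name, unique, origin, partial).
indexes = conn.execute("PRAGMA index_list('CUSTOMER')").fetchall()
pk_indexes = [row for row in indexes if row[2] == 1 and row[3] == "pk"]

conn.execute("INSERT INTO CUSTOMER VALUES ('10010', 'Ramas')")
try:
    # A second row with the same key would need a second pointer for '10010'.
    conn.execute("INSERT INTO CUSTOMER VALUES ('10010', 'Dunne')")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True
```

No CREATE INDEX statement was issued, yet pk_indexes contains one unique index, and the duplicate key value is refused.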

Indexes are crucial in speeding up data access. They can be used to facilitate searching, sorting and even joining tables. The improvement in data access speed occurs because an index is an ordered set of values that contains the index key and pointers. A table can have many indexes, but each index is associated with only one table. The index key can have multiple attributes (composite index). Creating an index is easy. You will learn in Chapter 8 that a simple SQL command will produce any required index.
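As a preview of that command, the sketch below creates a composite index on VEND_CODE and PROD_CODE for the PRODUCT rows of Figure 3.18. It assumes SQLite through Python's sqlite3 module, and the index name PROD_VEND_NDX is a hypothetical choice; the exact CREATE INDEX options vary between DBMSs.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE PRODUCT (PROD_CODE TEXT PRIMARY KEY, PROD_DESCRIPT TEXT, VEND_CODE INTEGER)"
)
conn.executemany(
    "INSERT INTO PRODUCT VALUES (?, ?, ?)",
    [
        ("001278-AB", "Claw hammer", 232),
        ("123-21UUY", "Houselite chain saw, 16 cm bar", 235),
        ("QER-34256", "Sledge hammer, 16 kg head", 231),
        ("SRE-657UG", "Rat-tail file", 232),
        ("ZZX/3245Q", "Steel tape, 12 m length", 235),
    ],
)

# Composite index: the index key is made up of two attributes.
conn.execute("CREATE INDEX PROD_VEND_NDX ON PRODUCT (VEND_CODE, PROD_CODE)")

# Rows ordered by vendor and, within vendor, by product.
ordered = conn.execute(
    "SELECT VEND_CODE, PROD_CODE FROM PRODUCT ORDER BY VEND_CODE, PROD_CODE"
).fetchall()

# Names of all indexes now attached to PRODUCT.
index_names = [row[1] for row in conn.execute("PRAGMA index_list('PRODUCT')")]
```

The index belongs to the PRODUCT table only, while PRODUCT itself now carries more than one index: the one created here and the one that supports its primary key.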

NOTE You will learn more about how indexes can be applied to improve data access and retrieval in Chapter 11, Conceptual, Logical, and Physical Database Design.

3.8 CODD'S RELATIONAL DATABASE RULES

In 1985, Dr E.F. Codd published a list of 12 rules to define a relational database system.² The reason Dr Codd published the list was his concern that many vendors were marketing products as relational even though those products did not meet minimum relational standards. Dr Codd's list, shown in Table 3.8, serves as a frame of reference for what a truly relational database should be. Bear in mind that even the dominant database vendors do not fully support all 12 rules.

TABLE 3.8 Dr Codd's 12 relational database rules

Rule  Rule Name
      Description

1     Information
      All information in a relational database must be logically represented as column values in rows within tables.

2     Guaranteed Access
      Every value in a table is guaranteed to be accessible through a combination of table name, primary key value and column name.

3     Systematic Treatment of Nulls
      Nulls must be represented and treated in a systematic way, independent of data type.

4     Dynamic Online Catalogue Based on the Relational Model
      The metadata must be stored and managed as ordinary data, that is, in tables within the database. Such data must be available to authorised users, using the standard database relational language.

5     Comprehensive Data Sub-language
      The relational database may support many languages. However, it must support one well-defined, declarative language with support for data definition, view definition, data manipulation (interactive and by program), integrity constraints, authorisation and transaction management (begin, commit and rollback).

6     View Updating
      Any view that is theoretically updatable must be updatable through the system.

7     High-Level Insert, Update and Delete
      The database must support set-level inserts, updates and deletes.

8     Physical Data Independence
      Application programs and ad hoc facilities are logically unaffected when physical access methods or storage structures are changed.

9     Logical Data Independence
      Application programs and ad hoc facilities are logically unaffected when changes are made to the table structures that preserve the original table values (changing order of column or inserting columns).

10    Integrity Independence
      All relational integrity constraints must be definable in the relational language and stored in the system catalogue, not at the application level.

11    Distribution Independence
      The end users and application programs are unaware of and unaffected by the data location (distributed vs local databases).

12    Non-Subversion
      If the system supports low-level access to the data, there must not be a way to bypass the integrity rules of the database.

      Rule Zero
      All preceding rules are based on the notion that, in order for a database to be considered relational, it must use its relational facilities exclusively to manage the database.

² Codd, E.F., 'Is Your DBMS Really Relational?' and 'Does Your DBMS Run by the Rules?', Computerworld, 14 October and 21 October, 1985.
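Rule 4 is easy to observe in practice, because most systems let you query their catalogue with ordinary SELECT statements. The sketch below assumes SQLite, whose catalogue is exposed as the sqlite_master table; the column names PAINTER_LNAME and PAINTING_NUM are hypothetical. It illustrates the idea behind the rule and makes no claim about how fully SQLite satisfies the other rules.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE PAINTER (PAINTER_NUM INTEGER PRIMARY KEY, PAINTER_LNAME TEXT)")
conn.execute(
    "CREATE TABLE PAINTING ("
    "PAINTING_NUM INTEGER PRIMARY KEY, "
    "PAINTER_NUM INTEGER REFERENCES PAINTER (PAINTER_NUM))"
)

# The metadata is read with the same language that is used for user data.
catalogue = conn.execute(
    "SELECT type, name FROM sqlite_master WHERE type = 'table' ORDER BY name"
).fetchall()
```

The result lists the two tables just created, retrieved as rows and columns like any other data.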

SUMMARY

Tables are the basic building blocks of a relational database. A grouping of related entities, known as an entity set, is stored in a table. Conceptually speaking, the relational table is composed of intersecting rows (tuples) and columns. Each row represents a single entity, and each column represents the characteristics (attributes) of the entities.

Keys are central to the use of relational tables. Keys define functional dependencies; that is, other attributes are dependent on the key and can, therefore, be found if the key value is known. A key can be classified as a superkey, a candidate key, a primary key, a secondary key or a foreign key.

Each table row must have a primary key. The primary key is an attribute or a combination of attributes that uniquely identifies all remaining attributes found in any given row. Because a primary key must be unique, no null values are allowed if entity integrity is to be maintained.

Although the tables are independent, they can be linked by common attributes. Thus, the primary key of one table can appear as the foreign key in another table to which it is linked. Referential integrity dictates that the foreign key must contain values that match the primary key in the related table or must contain nulls.

Once you know the relational database basics, you can concentrate on design. Good design begins by identifying appropriate entities and attributes, and the relationships among the entities. Those relationships (1:1, 1:* and *:*) can be represented using ERDs. The use of ERDs allows you to create and evaluate simple logical design. The 1:* relationships are most easily incorporated in a good design; you just have to make sure that the primary key of the '1' is included in the table of the 'many'.
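The integrity rules summarised above can be tried out on a small 1:* pair of tables. The sketch below assumes SQLite through Python's sqlite3 module, where foreign key checking has to be switched on with PRAGMA foreign_keys; the helper name rejected and the invoice numbers 2001 and 2002 are hypothetical. CUS_CODE is declared as TEXT NOT NULL because SQLite would otherwise accept a null in a non-integer primary key column.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite checks foreign keys only when asked
conn.executescript("""
CREATE TABLE CUSTOMER (CUS_CODE TEXT NOT NULL PRIMARY KEY, CUS_LNAME TEXT);
-- The primary key of the '1' side (CUSTOMER) appears in the 'many' side (INVOICE).
CREATE TABLE INVOICE  (INV_NUMBER INTEGER PRIMARY KEY,
                       CUS_CODE TEXT REFERENCES CUSTOMER (CUS_CODE),
                       INV_DATE TEXT);
INSERT INTO CUSTOMER VALUES ('10014', 'Orlando');
INSERT INTO INVOICE  VALUES (1001, '10014', '2019-12-08');
""")


def rejected(sql):
    """Return True if the DBMS refuses the statement on integrity grounds."""
    try:
        conn.execute(sql)
        conn.commit()
        return False
    except sqlite3.IntegrityError:
        conn.rollback()
        return True


# Entity integrity: a primary key may be neither null nor duplicated.
null_pk = rejected("INSERT INTO CUSTOMER VALUES (NULL, 'Nobody')")
duplicate_pk = rejected("INSERT INTO CUSTOMER VALUES ('10014', 'Duplicate')")

# Referential integrity: a foreign key must match a primary key value or be null.
orphan_fk = rejected("INSERT INTO INVOICE VALUES (2001, '99999', '2019-12-08')")
null_fk = rejected("INSERT INTO INVOICE VALUES (2002, NULL, '2019-12-09')")
```

The first three statements are refused, while the invoice with a null CUS_CODE is accepted, which matches the rule that a foreign key either matches a primary key value or contains a null.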

KEY TERMS

associations
association class
attribute domain
bridge entity
candidate key
cardinality
composite entity
composite key
data dictionary
determination
domain
entity integrity
flags
foreign key (FK)
full functional dependence
functional dependence
homonyms
index
index key
key
key attribute
linking table
multiplicity
null
predicate logic
primary key (PK)
referential integrity
relation
relational schema
secondary key
superkey
synonym
system catalogue
tuple
unique index

FURTHER READING

Codd, E.F. The Relational Model for Database Management: Version 2. Addison-Wesley, 1990.

Codd, E.F. 'Relational completeness of data base sublanguages' (presented at Courant Computer Science Symposia Series 6, Data Base Systems, New York City, NY, 24–25 May, 1971). IBM Research Report RJ987 (6 March 1972). Republished in Randall J. Rustin (ed.), Data Base Systems: Courant Computer Science Symposia Series 6. Prentice-Hall, 1972.

Date, C.J. The Relational Database Dictionary. O'Reilly, 2006.

Date, C.J. and Darwen, H. Databases, Types and the Relational Model. Addison-Wesley, 2006.

Date, C.J. Date on Database: Writings 2000–2006. APress, 2006.

Date, C.J. Database in Depth: The Relational Model for Practitioners. O'Reilly, 2005.

Online Content All of the databases used in the questions and problems are available on the online platform accompanying this book. The database names used in the folder match the database names used in the figures. For example, the source of the tables shown in Figure Q3.1 is the 'Ch03_CollegeQue' database. Answers to selected Review Questions and Problems for this chapter are also available on the online platform.

REVIEW QUESTIONS

1 What is the difference between a database and a table?

2 What does it mean to say that a database displays both entity integrity and referential integrity?

3 Why are entity integrity and referential integrity important in a database?

4 What can a NULL value represent?

5 What is the domain of an attribute?

6 Create the basic ERD using UML notation for the database shown in Figure Q3.1.

FIGURE Q3.1 The Ch03_CollegeQue database tables

Database name: Ch03_CollegeQue

Table name: STUDENT

STU_CODE  LECT_CODE
100278
128569    2
512272    4
531235    2
531268
553427    1

Table name: LECTURER

LECT_CODE  DEPT_CODE
1          2
2          6
3          6
4          4

7 Create the basic ERD using UML notation for the database shown in Figure Q3.2.

FIGURE Q3.2 The Ch03_TravelQue database tables

Database name: Ch03_TravelQue

Table name: CUSTOMER

CUS_CODE  CUS_LNAME  CUS_EMAIL          CUS_MOBILE
24563     GARNETT    [email protected]  08703345671
24565     MWBAU      [email protected]  08734566664

Table name: BOOKING

BOOKING_NO  PACKAGE_ID  BOOK_TOTAL_COST  BOOK_PAID  BOOK_DEP_DATE
24563       9910001     956.00           Y          06-Jan-19
24565       9910001     895.00           N          07-Sep-19
24563       9910003     3056.00          N          05-Oct-19

Table name: PACKAGE_HOLIDAY

PACKAGE_ID  PACK_DESTINATION  PACK_OPERATOR   PACK_DURATION
9910001     Spain             Riveria Travel  7
9910002     USA               Mouse Holidays  14
9910003     Australia         Wallaby Tours   21

8 Suppose you have the ERD shown in Figure Q3.3. How would you convert this model into an ERD that displays only 1:* relationships? (Make sure you create the revised ERD.)

FIGURE Q3.3 The UML Class ERD for question 8

[DRIVER] 1..* ---- 1..* [TRUCK]

During some time interval, a DRIVER can drive many TRUCKs and any TRUCK can be driven by many DRIVERs.

9 What are homonyms and synonyms, and why should they be avoided in database design?

10 How would you implement a 1:* relationship in a database composed of two tables? Give an example.

11 Identify and describe the components of the table shown in Figure Q3.4, using correct terminology. Use your knowledge of naming conventions to identify the table's probable foreign key(s).

FIGURE Q3.4 The Ch03_NoComp database EMPLOYEE table

Database name: Ch03_NoComp

Table name: EMPLOYEE

EMP_NUM  EMP_LNAME   EMP_INITIAL  EMP_FNAME  DEPT_CODE  JOB_CODE
11234    Friedman    K            Robert     MKTG       12
11238    Zulu        D            Cela       MKTG       12
11241    Fontein                  Juliette   INFS       5
11242    Theron      J            Emma       ENG        9
11245    Smithson    B            Bernard    INFS       6
11248    Washington  G            Oleta      ENGR       8
11256    McBride                  Randall    ENGR       8
11257    Mazibuko    D            Fikile     MKTG       14
11258    Smith       W            William    MKTG       14
11260    Ratula      A            Katrina    INFS       5

12 Suppose you are using the database composed of the two tables shown in Figure Q3.5.

a Identify the primary keys.

b Identify the foreign keys.

c Create the ERM.

FIGURE Q3.5 The Ch03_Theatre database tables

Database name: Ch03_Theatre

Table name: DIRECTOR

DIR_NUM  DIR_LNAME   DIR_DOB
100      Broadway    12-Jan-75
101      Hollywoody  18-Nov-63
102      Goofy       21-Jun-72

Table name: PLAY

PLAY_CODE  PLAY_NAME                       DIR_NUM
1001       Cat On a Cold, Bare Roof        102
1002       Hold the Mayo, Pass the Bread   101
1003       I Never Promised You Coffee     102
1004       Silly Putty Goes To Washington  100
1005       See No Sound, Hear No Sight     101
1006       Starstruck in Biloxi            102
1007       Stranger In Parrot Ice          101

d Suppose you wanted quick lookup capability to get a listing of all plays directed by a given director. Which table would be the basis for the INDEX table, and what would be the index key?

e What would be the conceptual view of the INDEX table that is described in Part d? Depict the contents of the conceptual INDEX table.

13 Suppose you are using the database to enable a museum to find the location of artefacts around the world. The database is composed of the three tables shown in Figure Q3.6.

a Identify the primary keys.

b Identify the foreign keys.

c Create the ERM.

FIGURE Q3.6 Artefact Museum Database

Table name: ARTEFACT

ARTEFACT_TRACK_ID  ARTEFACT_DESCRIPTION                  ARTEFACT_AGE  ARTEFACT_VALUE  ARTEFACT_LOCATION_ID
10034              Greywacke Statue Tribute to Isis      664–525 BC    6000000         78343
10039              The Golden Rhinoceros of Mapungubwe   1075–1220     12100000        56432
10056              Pinner Qing Dynasty Vase              18th Century  85900000        23412
19002              Rosetta Stone                         181 BC                        23412

Table name: LOCATION

ARTEFACT_LOCATION_ID  ARTEFACT_COUNTRY
78343                 FRANCE
56432                 USA
23412                 LONDON

d Suppose the museum database was to be expanded to include details of a curator who could be contacted for a request to see an artefact. The details that need to be stored are CURATOR_NO, CURATOR_NAME and CURATOR_CONTACT. A curator may be responsible for more than one location. Modify your ERM to include this information.

PROBLEMS

Use the database shown in Figure P3.1 to work Problems 1-6. Note that the database is composed of four tables that reflect these relationships:

An EMPLOYEE has only one JOB_CODE, but a JOB_CODE can be held by many EMPLOYEEs.

An EMPLOYEE can participate in many PLANs, and any PLAN can be assigned to many EMPLOYEEs.

Note also that the *:* relationship has been broken down into two 1:* relationships for which the BENEFIT table serves as the composite or bridge entity.

FIGURE P3.1 The Ch03_BeneCo database tables

Database name: Ch03_BeneCo

Table name: EMPLOYEE

EMP_CODE  EMP_LNAME  JOB_CODE
14        Rudell     2
15        Arendse    1
16        Ruellardo  1
17        Smith      3
20        Smith      2

Table name: BENEFIT

EMP_CODE  PLAN_CODE
15        2
15        3
16        1
17        1
17        3
17        4
20        3

Table name: JOB

JOB_CODE  JOB_DESCRIPTION
1         Clerical
2         Technical
3         Managerial

Table name: PLAN

PLAN_CODE  PLAN_DESCRIPTION
1          Term life
2          Stock purchase
3          Long-term disability
4          Dental

1 For each table in the database, identify the primary key and the foreign key(s). If a table does not have a foreign key, write None in the space provided.

Table     Primary Key  Foreign Key(s)
EMPLOYEE
BENEFIT
JOB
PLAN

2 Create the ERD using UML notation to show the relationship between EMPLOYEE and JOB.

3 Do the tables exhibit entity integrity? Answer yes or no; then explain your answer.

Table     Entity Integrity  Explanation
EMPLOYEE
BENEFIT
JOB
PLAN

4 Do the tables exhibit referential integrity? Answer yes or no; then explain your answer. Write NA (not applicable) if the table does not have a foreign key.

Table     Referential Integrity  Explanation
EMPLOYEE
BENEFIT
JOB
PLAN

5 Create the ERD using Crow's Foot notation to show the relationships among EMPLOYEE, BENEFIT, JOB and PLAN.

6 Create the ERD using UML class diagram notation to show the relationships among EMPLOYEE, BENEFIT, JOB and PLAN.

Use the database shown in Figure P3.2 to answer Problems 7-13.

FIGURE P3.2 The Ch03_StoreCo database tables

Database name: Ch03_StoreCo

Table name: EMPLOYEE

EMP_CODE  EMP_TITLE  EMP_LNAME   EMP_FNAME  EMP_INITIAL  EMP_DOB    STORE_CODE
1         Mr         Govender    Adimoolam  W            21-May-70  3
2         Ms         Ratula      Nancy                   09-Feb-75  2
3         Ms         Greenboro   Lottie     R            02-Oct-67  4
4         Mrs        Rumpersfro  Jennie     S            01-Jun-77  5
5         Mr         Smith       Robert     L            23-Nov-65  3
6         Mr         Renselaer   Cary       A            25-Dec-71  1
7         Mr         Ogallo      Roberto    S            31-Jul-68  3
8         Ms         Van Blerk   Elandri    I            10-Sep-74  1
9         Mr         Eindsmar    Jack       W            19-Apr-61  2
10        Mrs        Jones       Rose       R            06-Mar-72  4
11        Mr         Broderick   Tom                     21-Oct-78  3
12        Mr         Washington  Alan       Y            08-Sep-80  2
13        Mr         Smith       Peter      N            25-Aug-70  3
14        Ms         Smith       Sherry     H            25-May-72  4
15        Mr         Olenko      Howard     U            24-May-70  5
16        Mr         Archialo    Barry      V            03-Sep-66  5
17        Ms         Grimaldo    Jeanine    K            12-Nov-76  4
18        Mr         Rosenberg   Andrew     D            24-Jan-77  4
19        Mr         Bophela     Ingwe      F            03-Oct-74  4
20        Mr         Mckee       Robert     S            06-Mar-76  1
21        Ms         Baumann     Jennifer   A            11-Dec-80  3

Table name: STORE

STORE_CODE  STORE_NAME         STORE_YTD_SALES  REGION_CODE  EMP_CODE
1           Access Junction    792 730.05       2            8
2           Database Corner    1 123 370.04     2            12
3           Tuple Charge       779 558.74       1            7
4           Attribute Alley    746 209.16       2            3
5           Primary Key Point  2 314 777.78     1            15

Table name: REGION

REGION_CODE  REGION_DESCRIPT
1            East
2            West

7 For each table, identify the primary key and the foreign key(s). If a table does not have a foreign key, write None in the space provided.

Table     Primary Key  Foreign Key(s)
EMPLOYEE
STORE
REGION

8 Do the tables exhibit entity integrity? Answer yes or no; then explain your answer.

Table     Entity Integrity  Explanation
EMPLOYEE
STORE
REGION

9 Do the tables exhibit referential integrity? Answer yes or no; then explain your answer. Write NA (not applicable) if the table does not have a foreign key.

Table     Referential Integrity  Explanation
EMPLOYEE
STORE
REGION

10 Describe the type(s) of relationship(s) between STORE and REGION.

11 Create the ERD using UML notation to show the relationship between STORE and REGION.

12 Describe the type(s) of relationship(s) between EMPLOYEE and STORE. (Hint: Each store employs many employees, one of whom manages the store.)

13 Create the ERD using UML notation to show the relationships among EMPLOYEE, STORE and REGION.

Use the database shown in Figure P3.3 to answer Problems 14-18.

FIGURE P3.3 The Ch03_CheapCo database tables

Database name: Ch03_CheapCo

Table name: PRODUCT
Primary key: PROD_CODE
Foreign key: VEND_CODE

PROD_CODE  PROD_DESCRIPTION             PROD_STOCK_DATE  PROD_ON_HAND  PROD_PRICE  VEND_CODE
12-WW/P2   18 cm power saw blade        07-Apr-16        12            10.94       123
1QQ23-55   6 cm wood screw, 100         19-Mar-16        123           13.55       123
231-78-W   PVC pipe, 8 cm, 2.44 m       07-Dec-15        45            17.01       121
33564/U    Rat-tail file, 0.5 cm, fine  08-Mar-16        18            10.94       123
AR/3/TYR   Cordless drill, 0.6 cm       29-Nov-15        8             136.33      121
DT-34-WW   Philips screwdriver pack     20-Dec-15        11            118.40      123
EE3-67/W   Sledge hammer, 7 kg          25-Feb-16        9             114.21      121
ER-56/DF   Houselite chain saw, 40 cm   28-Dec-15        7             1186.04     125
FRE-TRY9   Jigsaw, 30 cm blade          12-Aug-15        67            11.15       125
SE-67-89   Jigsaw, 20 cm blade          11-Oct-15        34            11.07       125
ZW-QR/AV   Hardware cloth, 0.6 cm.      23-Apr-16        14            110.26      123
ZX-WR/FR   Claw hammer                  01-Mar-16        15            17.07       121

Table name: VENDOR
Primary key: VEND_CODE
Foreign key: none

VEND_CODE  VEND_NAME                VEND_CONTACT         VEND_AREACODE  VEND_PHONE
120        Bargain Snapper, Inc.    Melanie T. Travis    0181           899-1234
121        Cut n Glow Co.           Henry J. Olero       0181           342-9896
122        Rip & Rattle Supply Co.  Anne R. Morrins      0113           225-1127
123        Tools R Us               Juliette G. McHenry  0161           546-7894
124        Trowel & Dowel, Inc.     George F. Frederick  0113           453-4567
125        Bow & Wow Tools          Bill S. Sedwick      0113           324-9988

14 For each table, identify the primary key and the foreign key(s). If a table does not have a foreign key, write None in the space provided.

and/or restrictions

eChapter(s). require

it

Table      Primary Key    Foreign Key(s)
PRODUCT
VENDOR

15 Do the tables exhibit entity integrity? Answer yes or no; then explain your answer.

Table      Entity Integrity    Explanation
PRODUCT
VENDOR

16 Do the tables exhibit referential integrity? Answer yes or no; then explain your answer. Write NA (not applicable) if the table does not have a foreign key.

Table      Referential Integrity    Explanation
PRODUCT
VENDOR

17 Create the ERD using UML notation for this database.

18 Create the data dictionary for this database.

Use the database shown in Figure P3.4 to answer Problems 19-24.

FIGURE P3.4 The Ch03_TransCo database tables
Database name: Ch03_TransCo

Table name: TRUCK
Primary key: TRUCK_NUM
Foreign key: BASE_CODE, TYPE_CODE

TRUCK_NUM  BASE_CODE  TYPE_CODE  TRUCK_KM    TRUCK_BUY_DATE  TRUCK_SERIAL_NUM
1001       501        1          32 123.50   23-Sep-13       AA-322-12212-W11
1002       502        1          76 984.30   05-Feb-12       AC-342-22134-Q23
1003       501        2          12 346.60   11-Nov-13       AC-445-78656-Z99
1004                  1          2 894.30    06-Jan-14       WQ-112-23144-T34
1005       503        2          45 673.10   01-Mar-13       FR-998-32245-W12
1006       501        2          193 245.70  15-Jul-10       AD-456-00845-R45
1007       502        3          32 012.30   17-Oct-11       AA-341-96573-Z84
1008       502        3          44 213.60   07-Aug-12       DR-559-22189-D33
1009       503        2          10 932.90   12-Feb-14       DE-887-98456-E94

Table name: BASE
Primary key: BASE_CODE
Foreign key: none

BASE_CODE  BASE_CITY  BASE_PROVINCE  BASE_AREA_CODE  BASE_PHONE  BASE_MANAGER
501        Polokwane  Limpopo        0700            123-4567    Sibusiso Balisa
502        Cape Town  Western Cape   7100            234-5678    Clementine Daniels
503        Best       North Brabant  4567            345-6789    Maria J. Talindo
504        Durban     KwaZulu-Natal  4001            456-7890    Pragasen Khan

Table name: TYPE
Primary key: TYPE_CODE
Foreign key: none

TYPE_CODE  TYPE_DESCRIPTION
1          Single box, double-axle
2          Single box, single-axle
3          Tandem trailer, single-axle

19 For each table, identify the primary key and the foreign key(s). If a table does not have a foreign key, write None in the space provided.

Table      Primary Key    Foreign Key(s)
TRUCK
BASE
TYPE

20 Do the tables exhibit entity integrity? Answer yes or no; then explain your answer.

Table      Entity Integrity    Explanation
TRUCK
BASE
TYPE

21 Do the tables exhibit referential integrity? Answer yes or no; then explain your answer. Write NA (not applicable) if the table does not have a foreign key.

Table      Referential Integrity    Explanation
TRUCK
BASE
TYPE

22 Identify the TRUCK table's candidate key(s).

23 For each table, identify a superkey and a secondary key.

Table      Superkey    Secondary Key
TRUCK
BASE
TYPE

24 Create the ERD using UML notation for this database.

FIGURE P3.5 The Ch03_AviaCo database tables
Database name: Ch03_AviaCo

Table name: CHARTER

CHAR_TRIP  CHAR_DATE  CHAR_PILOT  CHAR_COPILOT  AC_NUMBER  CHAR_DESTINATION  CHAR_DISTANCE  CHAR_HOURS_FLOWN  CHAR_HOURS_WAIT  CHAR_FUEL_GALLONS  CHAR_OIL_QTS  CUS_CODE
10001      05-Feb-20  104                       2289L      CDG               936.00         5.1               2.2              354.1              1             10011
10002      05-Feb-20  101                       2778V      BNA               320.00         1.6               0                72.6               0             10016
10003      05-Feb-20  105         109           4278Y      LHR               1 574.00       7.8               0                339.8              2             10014
10004      06-Feb-20  106                       1484P      CPT               472.00         2.9               4.9              97.2               1             10019
10005      06-Feb-20  101                       2289L      CDG               1 023.00       5.7               3.5              397.7              2             10011
10006      06-Feb-20  109                       4278Y      CPT               472.00         2.6               5.2              117.1              0             10017
10007      06-Feb-20  104         105           2778V      LHR               1 574.00       7.9               0                348.4              2             10012
10008      07-Feb-20  106                       1484P      TYS               644.00         4.1               0                140.6              1             10014
10009      07-Feb-20  105                       2289L      LHR               1 574.00       6.6               23.4             459.9              0             10017
10010      07-Feb-20  109                       4278Y      CDG               998.00         6.2               3.2              279.7              0             10016
10011      07-Feb-20  101         104           1484P      BNA               352.00         1.9               5.3              66.4               1             10012
10012      08-Feb-20  101                       2778V      MOB               884.00         4.8               4.2              215.1              0             10010
10013      08-Feb-20  105                       4278Y      TYS               644.00         3.9               4.5              174.3              1             10011
10014      09-Feb-20  106                       4278Y      CDG               936.00         6.1               2.1              302.6              0             10017
10015      09-Feb-20  104         101           2289L      LHR               1 645.00       6.7               0                459.5              2             10016
10016      09-Feb-20  109         105           2778V      MQY               312.00         1.5               0                67.2               0             10011
10017      10-Feb-20  101                       1484P      CPT               508.00         3.1               0                105.5              0             10014
10018      10-Feb-20  105         104           4278Y      TYS               644.00         3.8               4.5              167.4              0             10017

The destinations are indicated by standard three-letter airport codes. For example, CDG = PARIS CHARLES DE GAULLE, FRANCE, LHR = LONDON HEATHROW, UNITED KINGDOM AND CPT = CAPE TOWN INTERNATIONAL, SOUTH AFRICA.

Table name: AIRCRAFT

AC_NUMBER  MOD_CODE  AC_TTAF   AC_TTEL   AC_TTER
1484P      PA23-250  1 833.10  1 833.10  101.80
2289L      C-90A     4 243.80  768.90    1 123.40
2778V      PA31-350  7 992.90  1 513.10  789.50
4278Y      PA31-350  2 147.30  622.10    243.20

AC_TTAF = Aircraft total time, airframe (hours)
AC_TTEL = Total time, left engine (hours)
AC_TTER = Total time, right engine (hours)

In a fully developed system, such attribute values would be updated by application software when the CHARTER table entries are posted.

Table name: MODEL

MOD_CODE  MOD_MANUFACTURER  MOD_NAME          MOD_SEATS  MOD_CHG_MILE
C-90A     Beechcraft        KingAir           8          1.67
PA23-250  Piper             Aztec             6          1.20
PA31-350  Piper             Navajo Chieftain  10         1.47

Customers are charged per round-trip mile, using the MOD_CHG_MILE rate. The MOD_SEATS attribute gives the total number of seats in the airplane, including the pilot and copilot seats. Therefore, a PA31-350 trip that is flown by a pilot and copilot has eight passenger seats available.

Table name: PILOT

EMP_NUM  PIL_LICENCE  PIL_RATINGS                 PIL_MED_TYPE  PIL_MED_DATE  PIL_PT135_DATE
101      ATP          ATP/SEL/MEL/Instr/CFII      1             20-Jan-20     11-Jan-20
104      ATP          ATP/SEL/MEL/Instr           1             18-Dec-19     17-Jan-20
105      COM          COMM/SEL/MEL/Instr/CFI      2             05-Jan-20     02-Jan-20
106      COM          COMM/SEL/MEL/Instr          2             10-Dec-19     02-Feb-20
109      COM          ATP/SEL/MEL/SES/Instr/CFII  1             22-Jan-20     15-Jan-20

The pilot licences shown in the PILOT table include the ATP = Airline Transport Pilot and COM = Commercial Pilot. Businesses that operate on demand air services are governed by Part 135 of the Federal Air Regulations (FARs) that are enforced by the Federal Aviation Administration (FAA). Such businesses are known as Part 135 operators. Part 135 operations require that pilots successfully complete flight proficiency checks every six months. The Part 135 flight proficiency check date is recorded in PIL_PT135_DATE. To fly commercially, pilots must have at least a commercial licence and a second-class medical certificate (PIL_MED_TYPE = 2).

The PIL_RATINGs include:
SEL = Single engine, land
MEL = Multi-engine, land
SES = Single engine, sea
Instr. = Instrument
CFI = Certified flight instructor
CFII = Certified flight instructor, instrument

Table name: EMPLOYEE

EMP_NUM  EMP_TITLE  EMP_LNAME   EMP_FNAME  EMP_INITIAL  EMP_DOB    EMP_HIRE_DATE
100      Mr.        Nkosi       Cela       D            15-Jun-52  15-Mar-98
101      Ms.        Naude       Amahle     G            19-Mar-75  25-Apr-96
102      Mr.        Vandam      Rhett                   14-Nov-68  18-May-03
103      Ms.        Jones       Anne       M            11-May-84  26-Jul-09
104      Mr.        Lange       John       P            12-Jul-81  20-Aug-00
105      Mr.        Williams    Robert     D            14-Mar-85  19-Jun-13
106      Mrs.       Duzak       Jeanine    K            12-Feb-78  13-Mar-99
107      Mr.        Diante      Jorge      D            01-May-85  02-Jul-07
108      Mr.        Wiesenbach  Paul       R            14-Feb-76  03-Jun-03
109      Ms.        Travis      Elizabeth  K            18-Jun-71  14-Feb-16
110      Mrs.       Genkazi     Leighla    W            19-May-80  29-Jun-10

Table name: CUSTOMER

CUS_CODE  CUS_LNAME   CUS_FNAME  CUS_INITIAL  CUS_AREACODE  CUS_PHONE  CUS_BALANCE
10010     Ramas       Alfred     A            0181          844-2573   €0.00
10011     Dunne       Leona      K            0161          894-1238   €0.00
10012     Smith       Kathy      W            0181          894-2285   €559.73
10013     Pieterse    Jaco       F            0181          894-2180   €802.09
10014     Orlando     Myron                   0181          222-1672   €420.15
10015     O'Brian     Amy        B            0161          442-3381   €633.19
10016     Brown       James      G            0181          297-1228   €0.00
10017     Williams    George                  0181          290-2556   €0.00
10018     Padayachee  Vinaya     G            0161          382-7185   €0.00
10019     Smith       Olette     K            0178          297-3809   €283.33

Use the database shown in Figure P3.5 to answer Problems 25-28.

ROBCOR is an aircraft charter company that supplies on-demand charter flight services using a fleet of four aircraft. Aircraft are identified by a unique registration number. Therefore, the aircraft registration number is an appropriate primary key for the AIRCRAFT table. The nulls in the CHARTER table's CHAR_COPILOT column indicate that a copilot is not required for some charter trips or for some aircraft. (Federal Aviation Administration (FAA) rules require a copilot on jet aircraft and on aircraft having a gross take-off weight over 5 500 kg. None of the aircraft in the AIRCRAFT table are governed by this requirement; however, some customers may require the presence of a copilot for insurance reasons.) All charter trips are recorded in the CHARTER table.


NOTE
Earlier in the chapter it was stated that it is best to avoid homonyms and synonyms. In this problem, both the pilot and the copilot are pilots in the PILOT table, but EMP_NUM cannot be used for both in the CHARTER table. Therefore, the synonyms CHAR_PILOT and CHAR_COPILOT were used in the CHARTER table.

Although the solution works in this case, it is very restrictive and it generates nulls when a copilot is not required. Worse, such nulls proliferate as crew requirements change. For example, if the AviaCo charter company grows and starts using larger aircraft, crew requirements may increase to include flight engineers and load masters. The CHARTER table would then have to be modified to include the additional crew assignments; such attributes as CHAR_FLT_ENGINEER and CHAR_LOADMASTER would have to be added to the CHARTER table. Given this change, each time a smaller aircraft flew a charter trip without the number of crew members required in larger aircraft, the missing crew members would yield additional nulls in the CHARTER table.

You will have a chance to correct those design shortcomings in Problem 27. The problem illustrates two important points:
Don't use synonyms. If your design requires the use of synonyms, revise the design!
To the greatest possible extent, design the database to accommodate growth without requiring structural changes in the database tables. Plan ahead and try to anticipate the effects of change on the database.

25 For each table, where possible, identify:
a The primary key.
b A superkey.
c A candidate key.
d The foreign key(s).
e A secondary key.

26 Create the ERD using UML notation. (Hint: Look at the table contents. You will discover that an AIRCRAFT can fly many CHARTER trips, but each CHARTER trip is flown by one AIRCRAFT, that a MODEL references many AIRCRAFT, but each AIRCRAFT references a single MODEL, etc.)

27 Modify the ERD you created in Problem 26 to eliminate the problems created by the use of synonyms. (Hint: Modify the CHARTER table structure by eliminating the CHAR_PILOT and CHAR_COPILOT attributes; then create a composite table named CREW to link the CHARTER and EMPLOYEE tables. Some crew members, such as flight attendants, may not be pilots. That's why the EMPLOYEE table enters into this relationship.)

28 Create the ERD using UML notation for the design you revised in Problem 27. (After you have had a chance to revise the design, your instructor will show you the results of the design change, using a copy of the revised database named Ch03_AviaCo_2).


CHAPTER 4 Relational Algebra and Calculus

IN THIS CHAPTER, YOU WILL LEARN:
What is meant by relational algebra and relational calculus
How to manipulate database tables using relational set operators
How the DBMS supports the key relational operators: select, project and join
The different types of joins
How to write queries using relational algebra expressions
About tuple and domain relational calculus

PREVIEW Relational

algebra

databases

and

relational of

how

both

it

and relational

model.

Codd

proposed

actually

be

components

and

of formal

is

language.

a

Predicate

basis

for

Once

we have

Query

Copyright review

from

which

required theory,

Language,

you

model.

relations and

manipulation

is relatively

easy

will learn

how

the

understand. calculus,

modified.

SQL

such In

in

the

These which

can

to

logic

as the and

set

important

This is

usually

as SQL (Structured DML languages

both

8, Beginning be used

Set theory used

next

a relation. such

in

a database.

provide

as SQL use alimited

Chapter

commands

or false.

predicate

(DML)

to

as a result.

and is

database, within

data to

a collection

a framework

on relations

data

key

as a procedural

of things,

Together,

and relational

often

which allows

as either true

language

by any DML. Languages are

in

modify

of the

algebra is

provides

or groups

described

one

described

mathematics,

operations

be

that

new relations

and is

the

independently

should

Relational

produce

defining

have

the

basic

implementation Structured

accomplish

Query relational

tasks.

Cengage deemed

relational

the

algebra

and

that

sets,

performing

data

relational

of relational

has

the for

specified

a high-level

operations

2020

in

basis

with

modelled

data

of a relation,

manner.

in

be

basis for relational basis for

we identified

can be verified

deals

how to retrieve

Language),

algebra

Editorial

is

using

stemmed

of fact)

that

2

set theory

extensively

(statement

an ideal

consideration achieved

used

and

as the

do this,

concept

relations

logic

1971 should

to

Chapter

was the

on these

manipulation

provide

data

that,

in a structured

on predicate

science

data

model

logic,

mathematical

theory

In

acting

based

which an assertion is

minimally.

mathematical

Codd in

the

and

database

operations

The algebra

that

used

of the relational within the

are the

by E.F.

would

mathematically

be stored

calculus

were proposed


Although language algebra that

is

both

gain

provides

us to

be used to

is

not an easy language of the

with aformal retrieve

express

the

form

relational

can

operators

queries

using

both tuple

also

be

and

how

relational

modify

same

data

of how the

which

they

by the

can

be used

complete

to

database

First, data.

you

Then,

will explore

study

the

relational

and the

mathematics

relational

calculus

a relationally

you

to

Essentially,

operates,

if any query that

manipulate

necessary

and tuple

we have

language.

Finally,

it is

operations.

algebra

means that

query

expressions.

relational

understand,

Relational

is relationally

expressed

to

manipulation

a relational

data.

queries,

algebraic

and domain

basic

description

and

We say a query language

algebraic

using

algebra

an understanding

necessary

language.

write

relational

to

complete

can

query

can be written in relational will learn you

about

will learn

how to

the

basic

about

write

how

simple

to

queries

calculus.

4.1 RELATIONAL OPERATORS

Relational algebra defines the theoretical way of manipulating table contents through a number of relational operators. Codd originally defined eight relational operators, called SELECT (or RESTRICT), PROJECT, JOIN, PRODUCT, INTERSECT, UNION, DIFFERENCE and DIVIDE. The most important operators are SELECT, PROJECT and JOIN, which can be used to formulate relational algebra expressions to answer many user queries. The relational operators have the property of closure; that is, relational algebra operators are used on existing tables to produce new tables. The relational operators are classed as being unary or binary. Unary operators, such as SELECT and PROJECT, can be applied to one relation, whilst binary operators such as JOIN are applied on two relations.

In Chapter 3, Relational Model Characteristics, we learnt about a number of important concepts and properties of relations that are essential for understanding the relational model. In this chapter, we will build on these concepts to understand how relational algebra can be used to write queries. Within Chapter 3, we modelled a relation on a mathematical construct, which had to abide by a set of rules (Table 3.1). When applying relational operators to relations, we have to follow these rules in addition to those defined for each relational operator. In the following sections you will learn about the theory associated with common relational operators and view some practical examples. Remember that the term relation is a synonym for table.
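The closure property described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: a relation is modelled as a list of dictionaries, the helper names `select` and `project` are hypothetical, and the rows are the AC_NUMBER and MOD_CODE values of the AIRCRAFT table from Figure P3.5. Because each operator returns a new relation, the result of one operator can be passed straight to the next.

```python
# Assumption: a relation is represented as a list of dicts, one dict per tuple.
AIRCRAFT = [
    {"AC_NUMBER": "1484P", "MOD_CODE": "PA23-250"},
    {"AC_NUMBER": "2289L", "MOD_CODE": "C-90A"},
    {"AC_NUMBER": "2778V", "MOD_CODE": "PA31-350"},
    {"AC_NUMBER": "4278Y", "MOD_CODE": "PA31-350"},
]


def select(relation, predicate):
    """Unary operator: return a new relation with the tuples that satisfy predicate."""
    return [row for row in relation if predicate(row)]


def project(relation, attributes):
    """Unary operator: return a new relation with only the named attributes,
    dropping any duplicate tuples that result."""
    seen, result = set(), []
    for row in relation:
        values = tuple(row[name] for name in attributes)
        if values not in seen:
            seen.add(values)
            result.append(dict(zip(attributes, values)))
    return result


# Closure: the output of select is itself a relation, so project can consume it.
navajos = select(AIRCRAFT, lambda row: row["MOD_CODE"] == "PA31-350")
registrations = project(navajos, ["AC_NUMBER"])
print(registrations)
```

Both helpers take one relation as input, which is what makes them unary; a binary operator such as JOIN would take two relations and still return a single new relation.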

NOTE
To be considered minimally relational, the DBMS must support the key relational operators SELECT, PROJECT and JOIN. Very few DBMSs are capable of supporting all eight relational operators.

A NOTE ON SET THEORY
Set theory is one of the most fundamental concepts in mathematics.¹ The theory is based on the idea that elements have membership in a set. Given two sets, A and B, we say that A is a member of B, which can be written as $A \in B$. Alternatively, we can say that the set B contains A as its element. The elements of a set can be numbers, the names of students who enrolled in a course or the flight numbers of all the flights operated by an airline. Each set is then determined by its elements and each element in a set is unique. Venn diagrams² are a way of visually representing sets. Supposing we have the following two sets:

Set A = Students who take the Databases unit {Sarah, Phinda, Paul, Hamzah, Mikla}
Set B = Students who take the Programming unit {Paul, Mikla, Asanda, Kiki, Craig}

Some of the students in set A appear also in set B and vice versa. We can represent these facts using a Venn diagram as shown in Figure 4.1.

¹ Karel Hrbacek and Thomas Jech, Introduction to Set Theory, third edn. Marcel Dekker, Inc., 1999.
² John Venn (1880) On the Diagrammatic and Mechanical Representation of Propositions and Reasonings. Dublin Philosophical Magazine and Journal of Science 9(59): 1-18.

FIGURE 4.1 A simple Venn diagram

In Figure 4.1, the two circles represent the two sets A and B. The students who appear in both sets are Paul and Mikla, who take both the Database and Programming units. These will go in the overlapping sections of the two circles. Sarah, Phinda and Hamzah only take the Database unit, so these go only in the left-hand circle, whilst Asanda, Kiki and Craig only take the Programming unit and only appear in the right-hand circle. We will be using Venn diagrams throughout this chapter to illustrate the three relational set operators: union, intersection and difference.
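The two student sets can also be explored with Python's built-in `set` type. The sketch below is illustrative only; the variable names are assumptions, and the members are the ones listed for sets A and B above. The overlap of the two circles in the Venn diagram corresponds to the intersection, and each non-overlapping part corresponds to a difference.

```python
# Set A: students who take the Databases unit.
databases = {"Sarah", "Phinda", "Paul", "Hamzah", "Mikla"}
# Set B: students who take the Programming unit.
programming = {"Paul", "Mikla", "Asanda", "Kiki", "Craig"}

# Overlapping section of the two circles: students in both sets.
both = databases & programming
# Left-hand circle only: in A but not in B.
only_databases = databases - programming
# Right-hand circle only: in B but not in A.
only_programming = programming - databases
# Everything covered by either circle.
everyone = databases | programming

# Membership test: is "Paul" an element of set A?
print("Paul" in databases)
print(sorted(both))
print(sorted(only_databases))
print(sorted(only_programming))
print(len(everyone))
```

Because each element of a set is unique, Paul and Mikla are counted once in the union even though they appear in both sets.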

4.1.1 Selection

The relational operator SELECT, also known as RESTRICT, can be used to list all of the row values, or it can return only those row values that match a specified criterion. In other words, SELECT returns a horizontal subset of a relation. The SELECT operator, denoted by $\sigma_\theta$, is formally defined as:

$\sigma_\theta(R)$ or $\sigma_{\langle\text{criterion}\rangle}(\text{RELATION})$

where $\sigma_\theta(R)$ is the set of specified tuples of the relation $R$ and $\theta$ is the predicate (or criterion) to extract the required tuples.
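As a rough illustration of this definition, the predicate can be modelled as a Python function that returns true or false for each tuple. The helper name `select` and the list-of-dictionaries representation are assumptions of this sketch; the rows are those of the PRODUCT relation in Figure 4.2 (b).

```python
# The PRODUCT relation of Figure 4.2 (b), one dict per tuple.
PRODUCT = [
    {"P_CODE": 123456, "P_DESCRIPT": "Flashlight", "PRICE": 4.16},
    {"P_CODE": 123457, "P_DESCRIPT": "Lamp", "PRICE": 19.87},
    {"P_CODE": 123458, "P_DESCRIPT": "Box Fan", "PRICE": 8.68},
    {"P_CODE": 213345, "P_DESCRIPT": "9 v battery", "PRICE": 1.52},
    {"P_CODE": 254467, "P_DESCRIPT": "100 W bulb", "PRICE": 1.16},
    {"P_CODE": 311452, "P_DESCRIPT": "Powerdrill", "PRICE": 27.64},
]


def select(relation, predicate=lambda row: True):
    """Return the horizontal subset of relation whose tuples satisfy predicate.
    With no predicate, every tuple is returned."""
    return [row for row in relation if predicate(row)]


all_rows = select(PRODUCT)                                         # no criterion
under_two = select(PRODUCT, lambda row: row["PRICE"] < 2.00)       # price < 2.00
flashlight = select(PRODUCT, lambda row: row["P_CODE"] == 123456)  # p_code = 123456

print([row["P_CODE"] for row in under_two])
print(flashlight)
```

Every result keeps all three attributes; only the number of tuples changes, which is what "horizontal subset" means.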

NOTE
The Euro, denoted as €, became the official currency of 12 European member states in 2002. Today the Euro is used by more than 175 million Europeans in 19 of 28 EU member countries, as well as some countries that are not formally members of the EU.

Figure 4.2 (a) shows visually how rows within a relation are selected. An example of a relation that contains information about products which are sold in a store is shown in Figure 4.2 (b). Figure 4.2 (c) shows the effects of selecting all rows with no criteria. The criterion specified in Figure 4.2 (d) selects only those rows where the price is less than 2.00 and, in Figure 4.2 (e), only the row containing P_CODE 123456 is displayed.

FIGURE 4.2 The SELECT operator
Database name: Ch04_Relational_DB_Operators

Figure 4.2 (a) SELECTION
[Diagram of a table with Column 1, Column 2 and Row 1 to Row 5]

Figure 4.2 (b) The PRODUCT Relation

P_CODE  P_DESCRIPT   PRICE
123456  Flashlight   4.16
123457  Lamp         19.87
123458  Box Fan      8.68
213345  9 v battery  1.52
254467  100 W bulb   1.16
311452  Powerdrill   27.64

Figure 4.2 (c) $\sigma(\text{PRODUCT})$

P_CODE  P_DESCRIPT   PRICE
123456  Flashlight   4.16
123457  Lamp         19.87
123458  Box Fan      8.68
213345  9 v battery  1.52
254467  100 W bulb   1.16
311452  Powerdrill   27.64

Figure 4.2 (d) $\sigma_{\text{price} < 2.00}(\text{PRODUCT})$

P_CODE  P_DESCRIPT   PRICE
213345  9 v battery  1.52
254467  100 W bulb   1.16

Figure 4.2 (e) $\sigma_{\text{p\_code} = 123456}(\text{PRODUCT})$

P_CODE  P_DESCRIPT  PRICE
123456  Flashlight  4.16

It is also possible to create more complex criteria by using the logical operators AND, OR and NOT. Figure 4.3 illustrates the use of the logical AND operator using the COURSE relation, which stores information about courses offered at Tiny University. Figure 4.3 (b) shows the new relation, which contains only the tuples where the DEPT_CODE is CIS and the CRS_CREDIT value is 4.
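A compound criterion of this kind can be sketched in Python with the `and`, `or` and `not` operators. The representation below is an assumption of the example (a list of dictionaries, with the course descriptions left out for brevity); the course codes, department codes and credit values are those of the COURSE relation in Figure 4.3. Only the AND case appears in the figure; the OR and NOT selections are added here purely to show the other two logical operators.

```python
# COURSE relation of Figure 4.3 (a); CRS_DESCRIPTION is omitted for brevity.
COURSE = [
    {"CRS_CODE": "ACCT-211", "DEPT_CODE": "ACCT", "CRS_CREDIT": 3},
    {"CRS_CODE": "ACCT-212", "DEPT_CODE": "ACCT", "CRS_CREDIT": 3},
    {"CRS_CODE": "CIS-220", "DEPT_CODE": "CIS", "CRS_CREDIT": 3},
    {"CRS_CODE": "CIS-420", "DEPT_CODE": "CIS", "CRS_CREDIT": 4},
    {"CRS_CODE": "QM-261", "DEPT_CODE": "CIS", "CRS_CREDIT": 3},
    {"CRS_CODE": "QM-362", "DEPT_CODE": "CIS", "CRS_CREDIT": 4},
]

# dept_code = CIS AND crs_credit = 4 (the selection shown in Figure 4.3 (b)).
cis_and_four = [c for c in COURSE
                if c["DEPT_CODE"] == "CIS" and c["CRS_CREDIT"] == 4]

# dept_code = ACCT OR crs_credit = 4 (illustrative only).
acct_or_four = [c for c in COURSE
                if c["DEPT_CODE"] == "ACCT" or c["CRS_CREDIT"] == 4]

# NOT dept_code = CIS (illustrative only).
not_cis = [c for c in COURSE if not c["DEPT_CODE"] == "CIS"]

print([c["CRS_CODE"] for c in cis_and_four])
print([c["CRS_CODE"] for c in acct_or_four])
print([c["CRS_CODE"] for c in not_cis])
```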

Online Content
All of the databases used to illustrate the material in this chapter are found on the online platform for this book. The database names used in the folder match the database names used in the figures.

FIGURE 4.3 Selecting from the COURSE relation
Database name: Ch04_TinyUniversity

Figure 4.3 (a) the COURSE Relation

CRS_CODE  DEPT_CODE  CRS_DESCRIPTION                     CRS_CREDIT
ACCT-211  ACCT       Accounting I                        3
ACCT-212  ACCT       Accounting II                       3
CIS-220   CIS        Introduction to Computer Science    3
CIS-420   CIS        Database Design and Implementation  4
QM-261    CIS        Intro. to Statistics                3
QM-362    CIS        Statistical Applications            4

Figure 4.3 (b) $\sigma_{\text{dept\_code} = \text{CIS AND crs\_credit} = 4}(\text{COURSE})$

CRS_CODE  DEPT_CODE  CRS_DESCRIPTION                     CRS_CREDIT
CIS-420   CIS        Database Design and Implementation  4
QM-362    CIS        Statistical Applications            4

4.1.2 Projection

The PROJECT operator returns all values for selected attributes. In other words, PROJECT returns a vertical subset of a relation excluding any duplicates. The PROJECT operator, denoted by $\Pi$, is formally defined as:

$\Pi_{a_1 \ldots a_n}(R)$ or $\Pi_{\langle\text{List of attributes}\rangle}(\text{Relation})$

where the projection of the relation $R$, denoted by $\Pi_{a_1 \ldots a_n}(R)$, is the set of specified attributes $a_1 \ldots a_n$ of the relation $R$. Figure 4.4 (a) shows visually how columns within a relation are selected. Figure 4.4 (b) shows a relation that stores information about the products which are sold in a store. Figure 4.4 (c) shows the effect of applying the PROJECT relational operator on the PRODUCT relation, to create a new relation containing only the attribute PRICE. The two further examples in Figure 4.4 (d) and (e) illustrate the PROJECT operator. Notice that the order of attributes is maintained in the resulting relations.

FIGURE 4.4 The PROJECT operator

Database name: Ch04_Relational_DB_Operators

Figure 4.4 (a) PROJECTION (diagram: Column 1 and Column 2 selected from a relation with Rows 1 to 5)

Figure 4.4 (b) The PRODUCT relation

P_CODE  P_DESCRIPT   PRICE
123456  Flashlight    4.16
123457  Lamp         19.87
123458  Box Fan       8.68
213345  9 v battery   1.52
254467  100 W bulb    1.16
311452  Powerdrill   27.64

Figure 4.4 (c) Π price (PRODUCT)

PRICE
 4.16
19.87
 8.68
 1.52
 1.16
27.64

Figure 4.4 (d) Π p_descript, price (PRODUCT)

P_DESCRIPT   PRICE
Flashlight    4.16
Lamp         19.87
Box Fan       8.68
9 v battery   1.52
100 W bulb    1.16
Powerdrill   27.64

Figure 4.4 (e) Π p_code, price (PRODUCT)

P_CODE  PRICE
123456   4.16
123457  19.87
123458   8.68
213345   1.52
254467   1.16
311452  27.64
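The PROJECT operation can be sketched in a few lines of Python. This is only an illustration and not part of the original text: modelling a relation as a list of dicts and the function name `project` are assumptions made for the example. The data is the PRODUCT relation of Figure 4.4 (b).

```python
# A relation is modelled here as a list of dicts, one dict per tuple (an assumption).
PRODUCT = [
    {"P_CODE": "123456", "P_DESCRIPT": "Flashlight", "PRICE": 4.16},
    {"P_CODE": "123457", "P_DESCRIPT": "Lamp", "PRICE": 19.87},
    {"P_CODE": "123458", "P_DESCRIPT": "Box Fan", "PRICE": 8.68},
    {"P_CODE": "213345", "P_DESCRIPT": "9 v battery", "PRICE": 1.52},
    {"P_CODE": "254467", "P_DESCRIPT": "100 W bulb", "PRICE": 1.16},
    {"P_CODE": "311452", "P_DESCRIPT": "Powerdrill", "PRICE": 27.64},
]


def project(relation, attributes):
    """Return the vertical subset of `relation` over `attributes`, without duplicates."""
    seen = set()
    result = []
    for row in relation:
        key = tuple(row[name] for name in attributes)
        if key not in seen:  # PROJECT excludes duplicate tuples
            seen.add(key)
            result.append(dict(zip(attributes, key)))
    return result


prices = project(PRODUCT, ["PRICE"])                     # as in Figure 4.4 (c)
descript_price = project(PRODUCT, ["P_DESCRIPT", "PRICE"])  # as in Figure 4.4 (d)
code_price = project(PRODUCT, ["P_CODE", "PRICE"])       # as in Figure 4.4 (e)
```

The attribute list is passed in the order wanted, so the order of attributes in the result follows the order given in the projection.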

4.1.3 UNION

The UNION operator combines all tuples from two relations, excluding duplicate tuples. The relations must have the same attribute characteristics (the columns and domains must be identical) to be used in the UNION. When two or more tables share the same number of columns, i.e. have the same degree, and when they share the same (or compatible) domains, they are said to be union-compatible. The UNION set operator, denoted by ∪, is formally defined as:

The union of relations R1 (a1, a2,..., an) and R2 (b1, b2,..., bn), denoted R1 ∪ R2, with degree n, is the relation R3 (c1, c2,..., cn) where for each i (i = 1, 2..n), ai and bi must have compatible domains.

The degree of R3 is the same as that of R1 and R2. However, if a and b are the cardinalities of R1 and R2 respectively, the cardinality of R3 is a + b only if R1 and R2 have no tuples in common, since duplicate tuples are included only once; otherwise it is less than a + b.

Figure 4.5 (a) visually shows R1 ∪ R2. Figures 4.5 (b) to (d) show the effect of the UNION operator on relations PRODUCT1 and PRODUCT2. Both PRODUCT1 and PRODUCT2 are union-compatible as they have the same degree and share the same domains.

FIGURE 4.5 The UNION operator

Database name: Ch04_Relational_DB_Operators

Figure 4.5 (a) R1 Union R2 (diagram of R1 and R2)

Figure 4.5 (b) The UNION_PRODUCT1 relation

P_CODE  P_DESCRIPT   PRICE
123456  Flashlight    4.16
123457  Lamp         19.87
123458  Box Fan       8.68
213345  9 v battery   1.52
254467  100 W bulb    1.16
311452  Powerdrill   27.64

Figure 4.5 (c) The UNION_PRODUCT2 relation

P_CODE  P_DESCRIPT   PRICE
345678  Microwave   126.40
345679  Dishwasher  395.00

Figure 4.5 (d) Result of UNION_PRODUCT1 ∪ UNION_PRODUCT2

P_CODE  P_DESCRIPT   PRICE
123456  Flashlight    4.16
123457  Lamp         19.87
123458  Box Fan       8.68
213345  9 v battery   1.52
254467  100 W bulb    1.16
311452  Powerdrill   27.64
345678  Microwave   126.40
345679  Dishwasher  395.00
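The UNION operation and its effect on cardinality can be sketched as follows. This is a hedged illustration, not the book's own code: relations are modelled as lists of tuples, the function name `union` is an assumption, and only the degree is checked (checking that domains are compatible is left out of the sketch).

```python
def union(r1, r2):
    """Return R1 ∪ R2 for relations given as lists of equal-length tuples."""
    degrees = {len(row) for row in r1 + r2}
    if len(degrees) > 1:
        raise ValueError("not union-compatible: the relations differ in degree")
    result = []
    for row in r1 + r2:
        if row not in result:  # a tuple found in both operands is kept only once
            result.append(row)
    return result


UNION_PRODUCT1 = [
    ("123456", "Flashlight", 4.16),
    ("123457", "Lamp", 19.87),
    ("123458", "Box Fan", 8.68),
    ("213345", "9 v battery", 1.52),
    ("254467", "100 W bulb", 1.16),
    ("311452", "Powerdrill", 27.64),
]
UNION_PRODUCT2 = [
    ("345678", "Microwave", 126.40),
    ("345679", "Dishwasher", 395.00),
]

# No tuples in common: cardinality is 6 + 2 = 8.
combined = union(UNION_PRODUCT1, UNION_PRODUCT2)

# One tuple in common: cardinality is 2 + 2 - 1 = 3.
with_duplicate = union(
    [("ACCT-211", "ACCT", "Accounting I", 3), ("ACCT-212", "ACCT", "Accounting II", 3)],
    [("ACCT-211", "ACCT", "Accounting I", 3), ("CIS-430", "CIS", "Advanced Databases", 6)],
)
```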

Figure 4.6 shows the effects of the UNION operator when two relations contain duplicate tuples. Notice that only one additional tuple has been added in Figure 4.6 (c), as CRS_CODE = ACCT-211 already exists in the COURSE relation.

FIGURE 4.6 The Union operator

Database name: Ch04_TinyUniversity

Figure 4.6 (a) The COURSE relation

CRS_CODE  DEPT_CODE  CRS_DESCRIPTION                     CRS_CREDIT
ACCT-211  ACCT       Accounting I                        3
ACCT-212  ACCT       Accounting II                       3
CIS-220   CIS        Introduction to Computer Science    3
CIS-420   CIS        Database Design and Implementation  4
QM-261    CIS        Intro. to Statistics                3
QM-362    CIS        Statistical Applications            4

Figure 4.6 (b) The COURSE2 relation

CRS_CODE  DEPT_CODE  CRS_DESCRIPTION     CRS_CREDIT
ACCT-211  ACCT       Accounting I        3
CIS-430   CIS        Advanced Databases  6

Figure 4.6 (c) Result of COURSE ∪ COURSE2

CRS_CODE  DEPT_CODE  CRS_DESCRIPTION                     CRS_CREDIT
ACCT-211  ACCT       Accounting I                        3
ACCT-212  ACCT       Accounting II                       3
CIS-220   CIS        Introduction to Computer Science    3
CIS-420   CIS        Database Design and Implementation  4
QM-261    CIS        Intro. to Statistics                3
QM-362    CIS        Statistical Applications            4
CIS-430   CIS        Advanced Databases                  6

If two relations are not union-compatible, then the UNION operator cannot be applied as the results would be invalid. For example, applying the UNION operator to the relation COURSE in Figure 4.6 (a) and the relation CLASS in Figure 4.7 (a) is not allowed (COURSE ∪ CLASS). In order to get around this problem, the PROJECT operator could be used to restrict the columns in each relation over a common attribute. In the example, both relations COURSE and CLASS have a common attribute CRS_CODE. We could therefore write Π CRS_CODE (COURSE) ∪ Π CRS_CODE (CLASS) to obtain the resulting relation shown in Figure 4.7 (b).

FIGURE 4.7 The Union operator: not union-compatible example

Database name: Ch04_TinyUniversity

Figure 4.7 (a) The CLASS relation

CLASS_CODE  CRS_CODE  CLASS_SECTION  CLASS_TIME            CLASS_ROOM  LECTURER_NUM
10012       ACCT-211  1              MWF 8:00-8:50 a.m.    BUS311      105
10013       ACCT-211  2              MWF 9:00-9:50 a.m.    BUS200      105
10014       ACCT-211  3              TTh 2:30-3:45 p.m.    BUS252      342
10015       ACCT-212  1              MWF 10:00-10:50 a.m.  BUS311      301
10016       ACCT-212  2              Th 6:00-8:40 p.m.     BUS252      301
10017       CIS-220   1              MWF 9:00-9:50 a.m.    KLR209      228
10018       CIS-220   2              MWF 9:00-9:50 a.m.    KLR211      114
10019       CIS-220   3              MWF 10:00-10:50 a.m.  KLR209      228
10020       CIS-420   1              W 6:00-8:40 p.m.      KLR209      162
10021       QM-261    1              MWF 8:00-8:50 a.m.    KLR200      114
10022       QM-261    2              TTh 1:00-2:15 p.m.    KLR200      114
10023       QM-362    1              MWF 11:00-11:50 a.m.  KLR200      162
10024       QM-362    2              TTh 2:30-3:45 p.m.    KLR200      162

Figure 4.7 (b) Result of Π CRS_CODE (COURSE) ∪ Π CRS_CODE (CLASS)

CRS_CODE
ACCT-211
ACCT-212
CIS-220
CIS-420
QM-261
QM-362
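The workaround of projecting both relations onto a common attribute before taking the union can be sketched as below. This is an illustrative sketch only: the helper name `project_column` is an assumption, and the CLASS rows keep just their first three attributes to keep the example short.

```python
COURSE_ATTRS = ("CRS_CODE", "DEPT_CODE", "CRS_DESCRIPTION", "CRS_CREDIT")
COURSE = [
    ("ACCT-211", "ACCT", "Accounting I", 3),
    ("ACCT-212", "ACCT", "Accounting II", 3),
    ("CIS-220", "CIS", "Introduction to Computer Science", 3),
    ("CIS-420", "CIS", "Database Design and Implementation", 4),
    ("QM-261", "CIS", "Intro. to Statistics", 3),
    ("QM-362", "CIS", "Statistical Applications", 4),
]

# Only three of the six CLASS attributes are kept here (a simplification).
CLASS_ATTRS = ("CLASS_CODE", "CRS_CODE", "CLASS_SECTION")
CLASS = [
    (10012, "ACCT-211", 1), (10013, "ACCT-211", 2), (10014, "ACCT-211", 3),
    (10015, "ACCT-212", 1), (10016, "ACCT-212", 2),
    (10017, "CIS-220", 1), (10018, "CIS-220", 2), (10019, "CIS-220", 3),
    (10020, "CIS-420", 1),
    (10021, "QM-261", 1), (10022, "QM-261", 2),
    (10023, "QM-362", 1), (10024, "QM-362", 2),
]


def project_column(attrs, rows, name):
    """Project a relation onto one attribute; the result is a set of 1-tuples."""
    index = attrs.index(name)
    return {(row[index],) for row in rows}


# COURSE and CLASS differ in degree, so they are not union-compatible as they stand.
course_codes = project_column(COURSE_ATTRS, COURSE, "CRS_CODE")
class_codes = project_column(CLASS_ATTRS, CLASS, "CRS_CODE")

# After projection both operands have degree 1, so the union is allowed.
result = sorted(course_codes | class_codes)
```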

4.1.4 INTERSECT

The INTERSECT operator, denoted as ∩, returns only the tuples that appear in both relations. As was true in the case of UNION, the tables must be union-compatible to give valid results. For example, you cannot use INTERSECT if one of the attributes in the first table is numeric and the corresponding one in the second table is character-based. The INTERSECT operator is formally defined as:

The intersect of relations R1 (a1, a2,..., an) and R2 (b1, b2,..., bn), denoted R1 ∩ R2, with degree n, is the relation R3 (c1, c2,..., cn) that includes only those tuples of R1 that also appear in R2, where for each i (i = 1, 2..n), ai and bi must have compatible domains.

Figure 4.8 (a) visually shows R1 ∩ R2.

The effect of applying the INTERSECT operator to the first name column (F_NAME) in two relations is shown in Figure 4.8 (d). Only Kuhle and Jorge appear in the final relation as they are the only two F_NAMEs that appear in both INTERSECT_RELATION_1 and INTERSECT_RELATION_2.

FIGURE 4.8 The INTERSECT operator

Database name: Ch04_Relational_DB_Operators

Figure 4.8 (a) R1 INTERSECT R2 (diagram of R1 and R2)

Figure 4.8 (b) The INTERSECT_RELATION_1 relation

F_NAME
George
Kuhle
Elaine
Piet
Jorge

Figure 4.8 (c) The INTERSECT_RELATION_2 relation

F_NAME
Kuhle
William
Jorge
Dennis

Figure 4.8 (d) Result of INTERSECT_RELATION_1 ∩ INTERSECT_RELATION_2

F_NAME
Kuhle
Jorge
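A small Python sketch of INTERSECT, under the assumption that a relation is modelled as a set of tuples. The function name `intersect` is hypothetical, and the contents of the two relations are read from Figure 4.8.

```python
def intersect(r1, r2):
    """Return R1 ∩ R2: the tuples of r1 that also appear in r2."""
    return {row for row in r1 if row in r2}


INTERSECT_RELATION_1 = {("George",), ("Kuhle",), ("Elaine",), ("Piet",), ("Jorge",)}
INTERSECT_RELATION_2 = {("Kuhle",), ("William",), ("Jorge",), ("Dennis",)}

common_names = intersect(INTERSECT_RELATION_1, INTERSECT_RELATION_2)

# A numeric value never equals a character-based one, so an intersection over
# attributes with incompatible domains finds no matching tuples.
mismatched = intersect({(1,), (2,)}, {("1",), ("2",)})
```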

4.1.5 DIFFERENCE

The DIFFERENCE operator returns all tuples in one relation that are not found in the other relation; that is, it subtracts one relation from the other. The DIFFERENCE operator also requires that the two relations must be union-compatible. The DIFFERENCE operator is formally defined as:

The difference of relations R1 (a1, a2,..., am) and R2 (b1, b2,..., bm), denoted R1 − R2, with degree m, is the relation R3 (c1, c2,..., cm) that includes all tuples that are in R1 but not in R2, where for each i (i = 1, 2..m), ai and bi must have compatible domains.

Figure 4.9 (a) shows how R1 − R2 can be visualised. The effect of applying the DIFFERENCE operator to two relations is shown in Figure 4.9. The resulting relation in Figure 4.9 (d) shows only George, Elaine and Piet, as these are the values of F_NAME that appear in DIFF_RELATION_1 and not in DIFF_RELATION_2. Note that the order of the relations is important in the DIFFERENCE operator, i.e. A − B will not give the same result as B − A.

FIGURE 4.9 The DIFFERENCE operator

Database name: Ch04_Relational_DB_Operators

Figure 4.9 (a) R1 − R2 (diagram of R1 and R2)

Figure 4.9 (b) The DIFF_RELATION_1 relation

F_NAME
George
Kuhle
Elaine
Piet
Jorge

Figure 4.9 (c) The DIFF_RELATION_2 relation

F_NAME
Kuhle
William
Jorge
Dennis

Figure 4.9 (d) Result of DIFF_RELATION_1 − DIFF_RELATION_2

F_NAME
George
Elaine
Piet
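The point that the order of the operands matters can be checked with a short sketch. As an assumption for illustration, relations are modelled as Python sets of tuples and the function name `difference` is made up for this example; the data follows Figure 4.9.

```python
def difference(r1, r2):
    """Return R1 − R2: the tuples of r1 that are not found in r2."""
    return {row for row in r1 if row not in r2}


DIFF_RELATION_1 = {("George",), ("Kuhle",), ("Elaine",), ("Piet",), ("Jorge",)}
DIFF_RELATION_2 = {("Kuhle",), ("William",), ("Jorge",), ("Dennis",)}

a_minus_b = difference(DIFF_RELATION_1, DIFF_RELATION_2)
b_minus_a = difference(DIFF_RELATION_2, DIFF_RELATION_1)
```

Here `a_minus_b` holds George, Elaine and Piet, while `b_minus_a` holds William and Dennis, so swapping the operands changes the result.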

4.1.6 CARTESIAN PRODUCT

The CARTESIAN PRODUCT is usually written as R1 × R2, with the new resulting relation R3 containing all the attributes that are present in both R1 and R2 along with all the possible combinations of tuples from R1 and R2. It can be formally defined as:

The CARTESIAN PRODUCT of two relations R1 (a1, a2,..., an) with cardinality i and R2 (b1, b2,..., bm) with cardinality j is a relation R3 with degree k = n + m, cardinality i * j and attributes (a1, a2,..., an, b1, b2,..., bm). This can be denoted as R3 = R1 × R2.

Therefore, if one relation has six rows and four attributes, and the other relation has three rows and two attributes, the new relation is composed of 6 × 3 = 18 rows and 4 + 2 = 6 attributes, i.e. the cardinality would be 18 tuples and the degree would be 6.

Figure 4.10 (c) shows how the CARTESIAN PRODUCT combines the PRODUCT and LOCATION relations shown in Figures 4.10 (a) and (b) respectively. You can see in Figure 4.10 (c) that the result of PRODUCT × LOCATION is a new relation with a degree of 6 (3 + 3) and a cardinality of 18 (6 × 3). The CARTESIAN PRODUCT by itself is not a very useful operation, as it creates many tuples that have no association with each other. However, if it is used in conjunction with the RESTRICT (SELECT) operator, it becomes a very important operator known as a JOIN.

FIGURE 4.10 The CARTESIAN PRODUCT

Database name: Ch04_Relational_DB_Operators

Figure 4.10 (a) The PRODUCT relation

P_CODE  P_DESCRIPT   PRICE
123456  Flashlight    4.16
123457  Lamp         19.87
123458  Box Fan       8.68
213345  9 v battery   1.52
254467  100 W bulb    1.06
311452  Powerdrill   27.64

Figure 4.10 (b) The LOCATION relation

STORE  AISLE  SHELF
23     W      5
24     K      9
25     Z      6

Figure 4.10 (c) PRODUCT × LOCATION

P_CODE  P_DESCRIPT   PRICE  STORE  AISLE  SHELF
123456  Flashlight    4.16  23     W      5
123456  Flashlight    4.16  24     K      9
123456  Flashlight    4.16  25     Z      6
123457  Lamp         19.87  23     W      5
123457  Lamp         19.87  25     Z      6
123457  Lamp         19.87  24     K      9
123458  Box Fan      10.99  23     W      5
123458  Box Fan      10.99  24     K      9
123458  Box Fan      10.99  25     Z      6
213345  9 v battery   1.52  23     W      5
213345  9 v battery   1.52  24     K      9
213345  9 v battery   1.52  25     Z      6
254467  100 W bulb    1.16  23     W      5
254467  100 W bulb    1.16  24     K      9
254467  100 W bulb    1.16  25     Z      6
311452  Powerdrill   27.64  24     W      5
311452  Powerdrill   27.64  25     K      9
311452  Powerdrill   27.64  26     Z      6
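The degree and cardinality rules for the CARTESIAN PRODUCT (k = n + m and i * j) can be checked with a minimal sketch. The representation (a tuple of attribute names plus a list of row tuples) and the function name `cartesian_product` are assumptions for illustration; the data is taken from Figures 4.10 (a) and (b).

```python
from itertools import product

PRODUCT_ATTRS = ("P_CODE", "P_DESCRIPT", "PRICE")
PRODUCT = [
    ("123456", "Flashlight", 4.16),
    ("123457", "Lamp", 19.87),
    ("123458", "Box Fan", 8.68),
    ("213345", "9 v battery", 1.52),
    ("254467", "100 W bulb", 1.06),
    ("311452", "Powerdrill", 27.64),
]
LOCATION_ATTRS = ("STORE", "AISLE", "SHELF")
LOCATION = [(23, "W", 5), (24, "K", 9), (25, "Z", 6)]


def cartesian_product(attrs1, rows1, attrs2, rows2):
    """Return (attributes, rows) of R1 × R2: every row of R1 paired with every row of R2."""
    attrs = attrs1 + attrs2
    rows = [row1 + row2 for row1, row2 in product(rows1, rows2)]
    return attrs, rows


attrs, rows = cartesian_product(PRODUCT_ATTRS, PRODUCT, LOCATION_ATTRS, LOCATION)
degree = len(attrs)       # 3 + 3
cardinality = len(rows)   # 6 * 3
```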

4.1.7 DIVISION

The DIVISION operation produces a new relation by selecting the tuples in one relation, R1, that match every row in another relation, R2. It is essentially the inverse of the CARTESIAN PRODUCT operation, just like the arithmetic divide is the inverse of multiplication. DIVISION, denoted by R1 ÷ R2, can be formally defined as:

The DIVISION of two relations R1 (a1, a2,..., an) with cardinality i and R2 (b1, b2,..., bm) with cardinality j is a relation R3 with degree k = n − m and cardinality at most i ÷ j (the cardinality is exactly i ÷ j only when R1 is the CARTESIAN PRODUCT of R3 and R2).

Using the example shown in Figure 4.11, note that:

Table 1 (Figure 4.11 (a)) is divided by Table 2 (Figure 4.11 (b)) to produce Table 3 (Figure 4.11 (c)). Tables 1 and 2 both contain the column CODE but do not share LOC.

To be included in the resulting Table 3, a value in the unshared column (LOC) must be associated in Table 1 with every value of CODE in the dividing Table 2. The only value associated with both A and B is 5.

FIGURE 4.11 The DIVISION operator

Database name: Ch04_Relational_DB_Operators

Figure 4.11 (a) Division Table 1

CODE  LOC
A     5
A     9
A     4
B     5
B     3
C     6
D     7
D     8
E     8

Figure 4.11 (b) Division Table 2

CODE
A
B

Figure 4.11 (c) Result of Division Table 1 ÷ Division Table 2

LOC
5
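The division in Figure 4.11 can be reproduced with a short sketch. It is deliberately limited to the shape of this example (a two-column dividend and a one-column divisor), and the function name `divide` is an assumption rather than anything defined in the text.

```python
# Table 1 as a set of (CODE, LOC) pairs and Table 2 as a set of CODE values.
TABLE_1 = {
    ("A", 5), ("A", 9), ("A", 4),
    ("B", 5), ("B", 3),
    ("C", 6),
    ("D", 7), ("D", 8),
    ("E", 8),
}
TABLE_2 = {"A", "B"}


def divide(dividend, divisor):
    """Return the LOC values that are paired in `dividend` with every CODE in `divisor`."""
    candidates = {loc for _code, loc in dividend}
    return {
        loc
        for loc in candidates
        if all((code, loc) in dividend for code in divisor)
    }


table_3 = divide(TABLE_1, TABLE_2)
```

LOC 5 is the only value that appears with both A and B, so `table_3` contains just that value, in line with Figure 4.11 (c).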

4.2 JOINS

The JOIN operation is one of the essential operations of relational algebra. It is a binary operation that allows the user to combine two relations in a specified way. JOIN operations are the real power behind the relational database, allowing the use of independent tables linked by common attributes. The JOIN of two relations R1 and R2 is a restriction on their Cartesian product R1 × R2 to meet a specified criterion. The join itself is defined on an attribute a of R1 and an attribute b of R2, where the attributes a and b share the same domain. A JOIN operator may be formally defined as:

The join of two relations R1 (a1, a2,..., an) and R2 (b1, b2,..., bm) is a relation R3 with degree k = n + m and attributes (a1, a2,..., an, b1, b2,..., bm) that satisfy a specific join condition.

In this section we will look at a number of different kinds of join operations including the THETA JOIN, EQUIJOIN, NATURAL JOIN, LEFT OUTER JOIN and RIGHT OUTER JOIN.

4.2.1 Theta Join and Equijoin

One of the most commonly used joins is known as an equijoin, which links tables on the basis of an equality condition that compares specified columns of each table. The outcome of the equijoin does not eliminate duplicate columns, and the condition or criterion used to join the tables must be explicitly defined. The equijoin takes its name from the equality comparison operator (=) used in the condition. If any other comparison operator is used, the join is called a theta join, denoted with the symbol θ (θ-join). So, theta represents a predicate that consists of one of the comparison operators {=, <, ≤, ≥, ≠, >}. The equijoin is therefore one special type of theta join:

Let R1 (a1, a2,..., an) and R2 (b1, b2,..., bm) be relations that may have different schemas. Then the θ-join of R1 and R2 is denoted as R1 ⋈ θ R2 and the equijoin is denoted as R1 ⋈ R1.a = R2.b R2.

It is also possible to express both the θ-join and the equijoin in terms of the restriction and Cartesian product operations. So, for example, the equijoin R1 ⋈ R1.a = R2.b R2 may also be written as σ R1.a = R2.b (R1 × R2). Looking at the θ-join and the equijoin in this way allows us to create some simple rules, which will allow us to compute such joins on any two relations:

1 Compute R1 × R2. This first performs a Cartesian product to form all possible combinations of the rows of R1 and R2.

2 Restrict the Cartesian product to only those rows where the values in certain columns match.
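The two rules above can be sketched in Python as a Cartesian product followed by a restriction. This is a hedged illustration: the function name `theta_join`, the list-of-dicts representation and the tiny relations R1 (a) and R2 (b) are all assumptions made for the example, and the attribute names of the two relations are assumed to be distinct.

```python
import operator
from itertools import product


def theta_join(r1, r2, left, op, right):
    """Compute a θ-join as a restriction of R1 × R2 on `left op right`."""
    # Rule 1: Cartesian product, forming every combination of rows.
    combined = [{**row1, **row2} for row1, row2 in product(r1, r2)]
    # Rule 2: keep only the rows that satisfy the join condition.
    return [row for row in combined if op(row[left], row[right])]


# Hypothetical relations, not taken from the text.
R1 = [{"a": 1}, {"a": 2}, {"a": 3}]
R2 = [{"b": 2}, {"b": 3}]

equijoin = theta_join(R1, R2, "a", operator.eq, "b")    # condition R1.a = R2.b
less_than = theta_join(R1, R2, "a", operator.lt, "b")   # condition R1.a < R2.b
```

Passing the comparison as a function (`operator.eq`, `operator.lt` and so on) mirrors the idea that theta stands for any one of the comparison operators, with the equijoin as the special case of equality.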

For example, suppose we wish to find out which students take classes in each department at Tiny University. To answer this query, we must join together the two relations STUDENT-2 and DEPARTMENT-2 shown in Figure 4.12 (a) and (b). Following the two rules stated above, this will first involve finding the Cartesian product of the STUDENT-2 and DEPARTMENT-2 relations, shown in Figure 4.12 (c). Then, we need to restrict the resulting relation in Figure 4.12 (c) to only those tuples that satisfy the join condition on the common column DEPT_CODE, which is found in both relations (Figure 4.12 (d)). In this case, this would be where STUDENT.DEPT_CODE = DEPARTMENT.DEPT_CODE. This query, which we will call STUDENT_IN_DEPT, can be written in relational algebra as:

STUDENT_IN_DEPT = σ STUDENT.DEPT_CODE = DEPARTMENT.DEPT_CODE (STUDENT × DEPARTMENT)

FIGURE 4.12 Equijoin example

Database name: Ch04_TinyUniversity

Figure 4.12 (a) The STUDENT-2 relation

STU_NUM  STU_LNAME  STU_FNAME  STU_DOB            DEPT_CODE
321452   Ndlovu     Amehlo     12 February 1992   BIOL
324257   Smithson   Anne       15 November 1997   CIS
324258   Le Roux    Dan        23 August 1986     ACCT
324269   Oblonski   Walter     16 September 1997  CIS
324273   Smith      John       30 December 1975   ENGL

Figure 4.12 (b) The DEPARTMENT-2 relation

DEPT_CODE  DEPT_NAME
ACCT       Accounting
BIOL       Biology
CIS        Computer Info. Systems
ENGL       English

Figure 4.12 (c) The Cartesian product (STUDENT × DEPARTMENT)

STU_NUM  STU_LNAME  STU_FNAME  STU_DOB            S.DEPT_CODE  D.DEPT_CODE  DEPT_NAME
321452   Ndlovu     Amehlo     12 February 1992   BIOL         ACCT         Accounting
321452   Ndlovu     Amehlo     12 February 1992   BIOL         BIOL         Biology
321452   Ndlovu     Amehlo     12 February 1992   BIOL         CIS          Computer Info. Systems
321452   Ndlovu     Amehlo     12 February 1992   BIOL         ENGL         English
324257   Smithson   Anne       15 November 1997   CIS          ACCT         Accounting
324257   Smithson   Anne       15 November 1997   CIS          BIOL         Biology
324257   Smithson   Anne       15 November 1997   CIS          CIS          Computer Info. Systems
324257   Smithson   Anne       15 November 1997   CIS          ENGL         English
324258   Le Roux    Dan        23 August 1986     ACCT         ACCT         Accounting
324258   Le Roux    Dan        23 August 1986     ACCT         BIOL         Biology
324258   Le Roux    Dan        23 August 1986     ACCT         CIS          Computer Info. Systems
324258   Le Roux    Dan        23 August 1986     ACCT         ENGL         English
324269   Oblonski   Walter     16 September 1993  CIS          ACCT         Accounting
324269   Oblonski   Walter     16 September 1993  CIS          BIOL         Biology
324269   Oblonski   Walter     16 September 1993  CIS          CIS          Computer Info. Systems
324269   Oblonski   Walter     16 September 1993  CIS          ENGL         English
324273   Smith      John       30 December 1975   ENGL         ACCT         Accounting
324273   Smith      John       30 December 1975   ENGL         BIOL         Biology
324273   Smith      John       30 December 1975   ENGL         CIS          Computer Info. Systems
324273   Smith      John       30 December 1975   ENGL         ENGL         English

Figure 4.12 (d) The final relation STUDENT_IN_DEPT = σ STUDENT.DEPT_CODE = DEPARTMENT.DEPT_CODE (STUDENT × DEPARTMENT)

STU_NUM  STU_LNAME  STU_FNAME  STU_DOB            S.DEPT_CODE  D.DEPT_CODE  DEPT_NAME
321452   Ndlovu     Amehlo     12 February 1992   BIOL         BIOL         Biology
324257   Smithson   Anne       15 November 1997   CIS          CIS          Computer Info. Systems
324258   Le Roux    Dan        23 August 1986     ACCT         ACCT         Accounting
324269   Oblonski   Walter     16 September 1993  CIS          CIS          Computer Info. Systems
324273   Smith      John       30 December 1975   ENGL         ENGL         English

Notice in Figure 4.12 (c) that there are two columns called DEPT_CODE. This is due to the fact that STUDENT-2 and DEPARTMENT-2 both contain a column of the same name. In this case DEPT_CODE also shares the same domain and provides referential integrity between the two relations. In order to distinguish between them, a prefix of S and D has been added to the name of these columns, i.e. S.DEPT_CODE and D.DEPT_CODE, to make them easier to read. You can also see these two common columns again in the resulting relation in Figure 4.12 (d), as the equijoin does not eliminate duplicate columns. Ideally, it would be far better not to show duplicate columns in the resulting relation. As equijoins are so common, an operator called the natural join was defined.
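The STUDENT_IN_DEPT equijoin can be sketched as follows to show why both DEPT_CODE columns survive. This is an illustrative sketch only: the helper name `qualify` is an assumption, the S and D prefixes are applied to every attribute for simplicity, and STU_FNAME and STU_DOB are left out to keep the example short.

```python
STUDENT = [
    {"STU_NUM": "321452", "STU_LNAME": "Ndlovu", "DEPT_CODE": "BIOL"},
    {"STU_NUM": "324257", "STU_LNAME": "Smithson", "DEPT_CODE": "CIS"},
    {"STU_NUM": "324258", "STU_LNAME": "Le Roux", "DEPT_CODE": "ACCT"},
    {"STU_NUM": "324269", "STU_LNAME": "Oblonski", "DEPT_CODE": "CIS"},
    {"STU_NUM": "324273", "STU_LNAME": "Smith", "DEPT_CODE": "ENGL"},
]
DEPARTMENT = [
    {"DEPT_CODE": "ACCT", "DEPT_NAME": "Accounting"},
    {"DEPT_CODE": "BIOL", "DEPT_NAME": "Biology"},
    {"DEPT_CODE": "CIS", "DEPT_NAME": "Computer Info. Systems"},
    {"DEPT_CODE": "ENGL", "DEPT_NAME": "English"},
]


def qualify(row, prefix):
    """Prefix every attribute name so that same-named columns stay distinct."""
    return {f"{prefix}.{name}": value for name, value in row.items()}


# Step 1: Cartesian product STUDENT × DEPARTMENT (5 * 4 = 20 tuples).
cartesian = [
    {**qualify(student, "S"), **qualify(department, "D")}
    for student in STUDENT
    for department in DEPARTMENT
]

# Step 2: restrict to the tuples where the two DEPT_CODE values are equal.
student_in_dept = [
    row for row in cartesian if row["S.DEPT_CODE"] == row["D.DEPT_CODE"]
]
```

Each resulting row still carries both `S.DEPT_CODE` and `D.DEPT_CODE`, which is the duplication that the natural join removes.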

4.2.2 The Natural Join

The natural join operation is the most common variant of the joins. The natural join operation requires that the two operand relations must have at least one common attribute, i.e. attributes that share the same domain. The common column(s) is (are) referred to as the join column(s). The natural join is in fact an equijoin; however, in addition, we drop the duplicate attributes, so the resulting relation contains one column fewer than that of the equijoin for each join column.

Let R1 be a relation having attributes (a1, a2,..., an, y), and R2 be another relation having attributes (b1, b2,..., bm, y), where y is a set of common attributes (join column(s)) that share the same domain. The natural join operator is defined as:

The natural join of R1 and R2, denoted R1 ⋈ R2, consists of combining the tuples of R1 and R2 to build a new relation R3, such that if R1Tuple ∈ R1, R2Tuple ∈ R2, and R1Tuple.y = R2Tuple.y, then R3Tuple = (R1Tuple.a1,..., R1Tuple.an, R1Tuple.y, R2Tuple.b1,..., R2Tuple.bm).

Note that the common set of attributes y appears only once in R3; the notation R1Tuple.a1 corresponds to the a1 attribute value of a tuple of R1.

Although this definition appears to be quite complicated, computing the natural join of two relations is quite straightforward and is the result of a three-stage process:

1 Compute R1 × R2. This first performs a Cartesian product to form all possible combinations of the rows of R1 and R2.

2 Select those tuples where R1Tuple.y = R2Tuple.y. Only the rows where the attribute values in the join column(s) are equal are selected.

3 Perform a PROJECT operation on either R1.y or R2.y to the result of step (2), and call it y in the final relation. This is to ensure that the final relation contains a single copy of each attribute in the joining column, thereby eliminating duplicate columns. For example, if we joined the STUDENT and DEPARTMENT tables on the DEPT_CODE joining column, we would only want one column called DEPT_CODE in our final relation. Finally, project the rest of the attributes in R1 and R2 except y, and drop the prefixes R1 and R2 in the final relation.

Let us now apply these steps to an example. Figure 4.13 shows two relations called CUSTOMER and AGENT that will be used to illustrate the natural join operator.

FIGURE 4.13 The CUSTOMER and AGENT relations

Database name: Ch04_Relational_DB_Operators

Relation: CUSTOMER

CUS_CODE  CUS_LNAME  CUS_POSTCODE  AGENT_CODE
1132445   Strydom    4001          231
1217782   Adares     7550          125
1312243   Nokwe      678954        167
1321242   Reddy      2094          125
1542311   Smithson   1401          421
1657399   Vanloo     67543W        231

Relation: AGENT

AGENT_CODE  AGENT_PHONE
125         01812439887
167         01813426778
231         01812431124
333         01131234445

1 First, compute the Cartesian product of CUSTOMER and AGENT, i.e. CUSTOMER × AGENT. This operation will produce the results shown in Figure 4.14.

FIGURE 4.14 Step 1: CUSTOMER × AGENT

Database name: Ch04_Relational_DB_Operators

C.CUS_CODE  C.CUS_LNAME  C.CUS_POSTCODE  C.AGENT_CODE  A.AGENT_CODE  A.AGENT_PHONE
1132445     Strydom      4001            231           125           01812439887
1132445     Strydom      4001            231           167           01813426778
1132445     Strydom      4001            231           231           01812431124
1132445     Strydom      4001            231           333           01131234445
1217782     Adares       7550            125           125           01812439887
1217782     Adares       7550            125           167           01813426778
1217782     Adares       7550            125           231           01812431124
1217782     Adares       7550            125           333           01131234445
1312243     Nokwe        678954          167           125           01812439887
1312243     Nokwe        678954          167           167           01813426778
1312243     Nokwe        678954          167           231           01812431124
1312243     Nokwe        678954          167           333           01131234445
1321242     Reddy        2094            125           125           01812439887
1321242     Reddy        2094            125           167           01813426778
1321242     Reddy        2094            125           231           01812431124
1321242     Reddy        2094            125           333           01131234445
1542311     Smithson     1401            421           125           01812439887
1542311     Smithson     1401            421           167           01813426778
1542311     Smithson     1401            421           231           01812431124
1542311     Smithson     1401            421           333           01131234445
1657399     Vanloo       67543W          231           125           01812439887
1657399     Vanloo       67543W          231           167           01813426778
1657399     Vanloo       67543W          231           231           01812431124
1657399     Vanloo       67543W          231           333           01131234445

Notice in Figure 4.14 that each of the attributes in the relation has been prefixed with the starting letter of each relation. Therefore C.AGENT_CODE refers to the AGENT_CODE column from the CUSTOMER relation, whilst A.AGENT_CODE refers to the AGENT_CODE column in the AGENT relation.

2 Select those tuples where R1Tuple.y = R2Tuple.y. To perform this step we must first identify the join column in the result of Step 1. In our example this is AGENT_CODE, as it appears in both relations. We SELECT only the rows for which the AGENT_CODE values are equal, i.e. C.AGENT_CODE = A.AGENT_CODE. Figure 4.15 shows the results of Step 2.

3 Perform a PROJECT operation on either C.AGENT_CODE or A.AGENT_CODE to the result of Step 2 so that only one AGENT_CODE column appears in the final relation. Then project the rest of the attributes (CUS_CODE, CUS_LNAME, CUS_POSTCODE, AGENT_PHONE) and drop the prefixes C and A in the final relation. The final relation is shown in Figure 4.16.

FIGURE 4.15 Step 2: Selecting rows where values in the join column match

Database name: Ch04_Relational_DB_Operators

Relation CUSTOMER × AGENT

This is the same 24-tuple relation as in Figure 4.14, with the joining columns C.AGENT_CODE and A.AGENT_CODE highlighted. The tuples shaded in blue are those where C.AGENT_CODE = A.AGENT_CODE. These are then selected to produce the results of Step 2.

C.CUS_CODE  C.CUS_LNAME  C.CUS_POSTCODE  C.AGENT_CODE  A.AGENT_CODE  A.AGENT_PHONE
1132445     Strydom      4001            231           231           01812431124
1217782     Adares       7550            125           125           01812439887
1312243     Nokwe        678954          167           167           01813426778
1321242     Reddy        2094            125           125           01812439887
1657399     Vanloo       67543W          231           231           01812431124

Copyright Editorial

match

review

2020 has

Cengage deemed

Learning. that

any

All suppressed

Rights

Reserved. content

does

May not

not materially

be

copied, affect

scanned, the

overall

or

duplicated, learning

in experience.

whole

or in Cengage

part.

Due Learning

to

electronic reserves

rights, the

right

some to

third remove

party additional

content

may content

be

suppressed at

any

time

from if

the

subsequent

eBook rights

and/or restrictions

eChapter(s). require

it

CHAPTER

FIGURE 4.16 Step 3: Final relation
Database name: Ch04_Relational_DB_Operators
Relation: CUSTOMER |X| AGENT

CUS_CODE  CUS_LNAME  CUS_POSTCODE  AGENT_CODE  AGENT_PHONE
1132445   Strydom    4001          231         01812431124
1217782   Adares     7550          125         01812439887
1312243   Nokwe      678954        167         01813426778
1321242   Reddy      2094          125         01812439887
1657399   Vanloo     67543W        231         01812431124

Note a few crucial features of the natural join operation:

If no match is made between the tuples in the relations, the new relation does not include the unmatched tuple. In that case, neither AGENT_CODE 421 nor the customer whose last name is Smithson is included. Smithson's AGENT_CODE 421 does not match any entry in the AGENT table.

The column on which the join was made (that is, AGENT_CODE) occurs only once in the new table.

If the same AGENT_CODE were to occur several times in the AGENT table, a customer would be listed for each match. For example, if the AGENT_CODE 167 were to occur three times in the AGENT table, the customer named Nokwe, who is associated with AGENT_CODE 167, would occur three times in the resulting table. (A good AGENT table cannot, of course, yield such a result because it would contain unique primary key values.)
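The three steps of the natural join can be traced in a few lines of Python. This is only an illustrative sketch: relations are modelled as lists of dictionaries, the function name natural_join_steps is hypothetical, and the CUSTOMER and AGENT rows are copied from Figure 4.15.

```python
from itertools import product

# Rows taken from Figure 4.15; representing a relation as a list of dicts is an assumption.
CUSTOMER = [
    {"CUS_CODE": 1132445, "CUS_LNAME": "Strydom", "CUS_POSTCODE": "4001", "AGENT_CODE": 231},
    {"CUS_CODE": 1217782, "CUS_LNAME": "Adares", "CUS_POSTCODE": "7550", "AGENT_CODE": 125},
    {"CUS_CODE": 1312243, "CUS_LNAME": "Nokwe", "CUS_POSTCODE": "678954", "AGENT_CODE": 167},
    {"CUS_CODE": 1321242, "CUS_LNAME": "Reddy", "CUS_POSTCODE": "2094", "AGENT_CODE": 125},
    {"CUS_CODE": 1542311, "CUS_LNAME": "Smithson", "CUS_POSTCODE": "1401", "AGENT_CODE": 421},
    {"CUS_CODE": 1657399, "CUS_LNAME": "Vanloo", "CUS_POSTCODE": "67543W", "AGENT_CODE": 231},
]
AGENT = [
    {"AGENT_CODE": 125, "AGENT_PHONE": "01812439887"},
    {"AGENT_CODE": 167, "AGENT_PHONE": "01813426778"},
    {"AGENT_CODE": 231, "AGENT_PHONE": "01812431124"},
    {"AGENT_CODE": 333, "AGENT_PHONE": "01131234445"},
]


def natural_join_steps(r1, r2, y, p1, p2):
    """Return the results of Steps 1, 2 and 3 of a natural join on column y."""
    # Step 1: Cartesian product, prefixing every attribute with its relation letter.
    product_rows = [
        {**{f"{p1}.{k}": v for k, v in t1.items()},
         **{f"{p2}.{k}": v for k, v in t2.items()}}
        for t1, t2 in product(r1, r2)
    ]
    # Step 2: keep only the rows whose join-column values are equal.
    matched = [t for t in product_rows if t[f"{p1}.{y}"] == t[f"{p2}.{y}"]]
    # Step 3: keep one copy of the join column and drop the prefixes.
    final = []
    for t in matched:
        row = {}
        for name, value in t.items():
            prefix, attribute = name.split(".", 1)
            if attribute == y and prefix == p2:
                continue  # skip the duplicate join column from the second relation
            row[attribute] = value
        final.append(row)
    return product_rows, matched, final


product_rows, matched, final = natural_join_steps(CUSTOMER, AGENT, "AGENT_CODE", "C", "A")
print(len(product_rows), len(matched), len(final))
```

The product has 6 x 4 = 24 rows, of which five survive Step 2; Smithson (AGENT_CODE 421) and agent 333 have no partner and so do not appear in the final relation.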

4.2.3 The Outer Join

When using the theta join and the natural join, it is possible that some tuples do not have identical values for the common attributes. As a result, the data from these tuples will be lost in the joined relation. If we require that all the tuples from the original tables are to be shown in the resulting relation, then it is necessary to perform a join which keeps all the tuples in relation R1 or R2 that have no corresponding values in the other relation. This type of join is known as an outer join.

There are three types of outer join:

Left outer join keeps data from the left-hand relation.
Right outer join keeps data from the right-hand relation.
Full outer join keeps the data from both relations.

As you will see, the steps for computing an outer join are very similar to those for computing a natural join, except that we also include data from the left or right side of the relation, depending on whether we are determining a left or right outer join. The stages in performing a left outer join are:

1 Compute R1 × R2. This first performs a Cartesian product to form all possible combinations of the rows of R1 and R2.

2 Select those tuples where R1Tuple.y = R2Tuple.y. Only the rows where the attribute values in the joining column(s) are equal are selected.

3 Select those tuples in R1 that do not have matching values in R2, so R1Tuple.y ≠ R2Tuple.y for every tuple in R2. In these tuples, the attributes from the second relation, R2, will have null values.

4 Perform a PROJECT operation on either R1.y or R2.y to the result of Step 2, and call it simply y in the final relation (for the unmatched tuples from Step 3, keep R1.y, as R2.y is null). This is to ensure that the final relation has a single copy of each attribute, thereby eliminating duplicate columns. Finally, project the rest of the attributes in R1 and R2, except y, and drop the prefix R1 and R2 in the final relation.

For example, consider the relations CUSTOMER and AGENT used in Figure 4.14. A left outer join for CUSTOMER and AGENT will return all of the tuples in the CUSTOMER relation, including those that do not have a matching value in the AGENT relation. The result of this join is shown in Figure 4.17. Notice that there is no AGENT_PHONE for the customer Smithson and a value of NULL has been entered in the field.

FIGURE 4.17 Left outer join: CUSTOMER ⟕ AGENT
Database name: Ch04_Relational_DB_Operators

CUS_CODE  CUS_LNAME  CUS_POSTCODE  AGENT_CODE  AGENT_PHONE
1132445   Strydom    4001          231         01812431124
1217782   Adares     7550          125         01812439887
1312243   Nokwe      678954        167         01813426778
1321242   Reddy      2094          125         01812439887
1657399   Vanloo     67543W        231         01812431124
1542311   Smithson   1401          421         NULL

A right outer join for CUSTOMER and AGENT returns all of the tuples in the AGENT relation, including those which do not have matching values in the CUSTOMER relation. The result of this join is shown in Figure 4.18.

FIGURE 4.18 Right outer join: CUSTOMER ⟖ AGENT
Database name: Ch04_Relational_DB_Operators

CUS_CODE  CUS_LNAME  CUS_POSTCODE  AGENT_CODE  AGENT_PHONE
1132445   Strydom    4001          231         01812431124
1217782   Adares     7550          125         01812439887
1312243   Nokwe      678954        167         01813426778
1321242   Reddy      2094          125         01812439887
1657399   Vanloo     67543W        231         01812431124
NULL      NULL       NULL          333         01131234445

So, regardless of the type of outer join, the matched pairs would be retained and any unmatched values in the other relation would be left null. Figures 4.17 and 4.18 have shown examples of outer joins.

Outer joins are especially useful when you are trying to determine what value(s) in related tables cause(s) referential integrity problems. Such problems are created when foreign key values do not match the primary key values in the related table(s). In fact, if you are asked to convert large spreadsheets or other non-database data into relational database tables, you will discover that the outer joins save you vast amounts of time and uncounted headaches when you encounter referential integrity errors after the conversions.

You may wonder why the outer joins are labelled left and right. The labels refer to the order in which the tables are listed in the SQL command. Chapter 8 will explore such joins.
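A left outer join can be sketched in Python as follows. This is a minimal illustration under stated assumptions: relations are lists of dictionaries, None stands in for NULL, the helper name left_outer_join is hypothetical, and the data is taken from Figures 4.17 and 4.18. A right outer join is obtained here simply by swapping the two arguments.

```python
# Rows taken from Figures 4.17 and 4.18; the list-of-dicts representation is an assumption.
CUSTOMER = [
    {"CUS_CODE": 1132445, "CUS_LNAME": "Strydom", "CUS_POSTCODE": "4001", "AGENT_CODE": 231},
    {"CUS_CODE": 1217782, "CUS_LNAME": "Adares", "CUS_POSTCODE": "7550", "AGENT_CODE": 125},
    {"CUS_CODE": 1312243, "CUS_LNAME": "Nokwe", "CUS_POSTCODE": "678954", "AGENT_CODE": 167},
    {"CUS_CODE": 1321242, "CUS_LNAME": "Reddy", "CUS_POSTCODE": "2094", "AGENT_CODE": 125},
    {"CUS_CODE": 1542311, "CUS_LNAME": "Smithson", "CUS_POSTCODE": "1401", "AGENT_CODE": 421},
    {"CUS_CODE": 1657399, "CUS_LNAME": "Vanloo", "CUS_POSTCODE": "67543W", "AGENT_CODE": 231},
]
AGENT = [
    {"AGENT_CODE": 125, "AGENT_PHONE": "01812439887"},
    {"AGENT_CODE": 167, "AGENT_PHONE": "01813426778"},
    {"AGENT_CODE": 231, "AGENT_PHONE": "01812431124"},
    {"AGENT_CODE": 333, "AGENT_PHONE": "01131234445"},
]


def left_outer_join(r1, r2, y):
    """Keep every tuple of r1; pad with None (NULL) where r2 has no match on y."""
    # Attributes contributed by r2, excluding the join column (assumes r2 is non-empty).
    r2_attributes = [a for a in r2[0] if a != y]
    result = []
    for t1 in r1:
        matches = [t2 for t2 in r2 if t2[y] == t1[y]]
        if matches:
            # Matched tuples: one output row per matching tuple in r2.
            for t2 in matches:
                result.append({**t1, **{a: t2[a] for a in r2_attributes}})
        else:
            # Unmatched tuple: attributes from r2 are set to NULL.
            result.append({**t1, **{a: None for a in r2_attributes}})
    return result


left = left_outer_join(CUSTOMER, AGENT, "AGENT_CODE")   # keeps Smithson
right = left_outer_join(AGENT, CUSTOMER, "AGENT_CODE")  # keeps agent 333

for row in left:
    print(row["CUS_LNAME"], row["AGENT_CODE"], row["AGENT_PHONE"])
```

The rows come out in the order of the left-hand relation rather than the order printed in the figures; since a relation is a set of tuples, the order carries no meaning.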

4.3 CONSTRUCTING QUERIES USING RELATIONAL ALGEBRAIC EXPRESSIONS

The main purpose of relational algebra is to provide a way to create queries that manipulate the relations (tables) in a database. Queries are used to tell the DBMS how to build the required relation in terms of other relations. Relational algebra is classed as a procedural language, whilst relational calculus is non-procedural and provides a notation for formulating the definition of the required relation in terms of other relations. In 1972, Codd proposed one form of relational calculus, known as tuple relational calculus, and this was later followed by domain relational calculus in 1977 (Lacroix and Pirotte). Relational algebra is based on set properties and operations, whilst relational calculus is based on predicate logic; the two are equivalent in terms of expressive power. However, both provide the mathematical base for SQL.

The previous section described the characteristics of the main relational operators. In this section, we will be focusing on writing relational algebraic expressions to formulate queries. For those who are interested in the mathematical definitions, there is a section on relational calculus at the end of this chapter.

4.3.1 Building Queries

During the lifetime of a database, users will ask many different kinds of queries; some will be asked over and over again, whilst others will be asked on the spur of the moment. The task of building a query often involves breaking the query down into a number of smaller steps and specifying the relational operators required at each step. Further operations are then applied to the results obtained in the previous steps, so a query generates a set of intermediate relations.

There are often a number of slightly different ways to represent a query using relational algebraic expressions. Equivalent expressions will always produce the same results, but the order in which the individual steps are executed can be very important in determining the efficiency of the query. It is worth pointing out that most DBMSs include a query optimiser. The job of the query optimiser is to analyse the query and find the most efficient way to access the data. You will discover more about the query optimiser in Chapter 13, Managing Database and SQL Performance.
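To make the point about order of execution concrete, the sketch below evaluates one query in two equivalent ways and counts the intermediate tuples each plan produces. It is an illustration only: the query ("find the agent phone number for the customer named Nokwe"), the helper name cross and the list-of-dicts representation are assumptions, and the data is a cut-down copy of the CUSTOMER and AGENT relations in Figure 4.15.

```python
from itertools import product

# Cut-down copies of the relations in Figure 4.15 (CUS_CODE and CUS_POSTCODE omitted for brevity).
CUSTOMER = [
    {"CUS_LNAME": "Strydom", "AGENT_CODE": 231},
    {"CUS_LNAME": "Adares", "AGENT_CODE": 125},
    {"CUS_LNAME": "Nokwe", "AGENT_CODE": 167},
    {"CUS_LNAME": "Reddy", "AGENT_CODE": 125},
    {"CUS_LNAME": "Smithson", "AGENT_CODE": 421},
    {"CUS_LNAME": "Vanloo", "AGENT_CODE": 231},
]
AGENT = [
    {"AGENT_CODE": 125, "AGENT_PHONE": "01812439887"},
    {"AGENT_CODE": 167, "AGENT_PHONE": "01813426778"},
    {"AGENT_CODE": 231, "AGENT_PHONE": "01812431124"},
    {"AGENT_CODE": 333, "AGENT_PHONE": "01131234445"},
]


def cross(customers, agents):
    """Cartesian product with C. and A. prefixes on the attribute names."""
    return [
        {**{"C." + k: v for k, v in c.items()},
         **{"A." + k: v for k, v in a.items()}}
        for c, a in product(customers, agents)
    ]


# Plan A: build the full product first, then apply both conditions.
plan_a_product = cross(CUSTOMER, AGENT)
plan_a = [
    (t["C.CUS_LNAME"], t["A.AGENT_PHONE"])
    for t in plan_a_product
    if t["C.AGENT_CODE"] == t["A.AGENT_CODE"] and t["C.CUS_LNAME"] == "Nokwe"
]

# Plan B: restrict CUSTOMER first, then build a much smaller product.
nokwe_only = [c for c in CUSTOMER if c["CUS_LNAME"] == "Nokwe"]
plan_b_product = cross(nokwe_only, AGENT)
plan_b = [
    (t["C.CUS_LNAME"], t["A.AGENT_PHONE"])
    for t in plan_b_product
    if t["C.AGENT_CODE"] == t["A.AGENT_CODE"]
]

print(plan_a == plan_b, len(plan_a_product), len(plan_b_product))
```

Both plans return the same single tuple, but Plan A builds 24 intermediate tuples where Plan B builds only 4. Choosing between such equivalent plans is the kind of decision a query optimiser makes.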

In order to build a query using a relational algebraic expression, you should take the following steps:

1 List all the attributes we need to give the answer.
2 Select all the relations we need, based on the list of attributes.
3 Specify the relational operators and the intermediate results that are needed.

To learn how to build queries following these steps, we will now look at some examples based on a small database that stores information about the maintenance of cars (the ERD is shown in Figure 4.19). Each car is required to undergo an inspection each year to test whether it is roadworthy. After each inspection, a new maintenance record is created and any repairs that are needed are recorded. If a car needs a repair, then the EVALUATION is set to FAIL until all the repairs are completed and then it is set to PASS. A repair can require parts to be purchased and fitted. The tables representing this database are shown in Figure 4.20.

FIGURE 4.19 The car inspection ERD
Entities and attributes shown in the diagram:
CAR: REGISTRATION {PK}, CAR_MAKE, CAR_MODEL, MODEL_YEAR, LICENCE_NO
MAINTENANCE_RECORD: INSPECTION_CODE {PK}, REGISTRATION {FK}, INSPECTION_DATE, EVALUATION
REPAIR: INSPECTION_CODE {PK} {FK}, PART_NO {PK} {FK}
PART: PART_NO {PK}, PART_NAME, PART_COST
Relationship labels: requires, is_for, requires. Multiplicities shown: 1..1 and 0..*.

FIGURE 4.20 The car inspection database
Database name: Ch04_Car_Inspection

Table name: CAR

REGISTRATION  CAR_MAKE    CAR_MODEL  CAR_COLOUR  MODEL_YEAR  LICENCE_NO
3679MR82      Toyota      Corolla    Blue        2016        1967fr89768
E-TS865       Nissan      Micra      Red         2004        1973Smith121
PE57UVP       Peugeot     508        Blue        2017        1990bty3212
PISE567       Volkswagen  Eos        Lime        2016        DF-678-WV
ROMA482       Volkswagen  Golf GT    Black       2017        AQ-123-AV
Z-BA975       Peugeot     208        Black       2017        1980vrt7312

Table name: PART

PART_NO  PART_NAME       PART_COST
12390    Paint sealants  19.95
12391    Wiper           14.95
12392    Brake pads      24.99
12393    Brake discs     49.54
12395    Spark plugs     0.99
12396    Airbag          24.95
12397    Tyres           25.00

Table name: MAINTENANCE_RECORD

INSPECTION_CODE  REGISTRATION  INSPECTION_DATE  EVALUATION
100036           PE57UVP       10/05/2018       FAIL
100390           ROMA482       01/09/2018
106750           E-TS865       01/03/2016       PASS
122456           Z-BA975       03/10/2018       FAIL
145678           PISE567       30/09/2017       PASS
200450           E-TS865       21/02/2015       PASS
200456           E-TS865       01/04/2017       FAIL

Table name: REPAIR

INSPECTION_CODE  PART_NO
106750           12396
106750           12397
100036           12393
200450           12391
100036           12397
200450           12392
200456           12397

Example 1

Consider the following query asked by a user:

List all information about cars where the model year is after 2016.

To answer this query, you must first interpret that the user only wants to see information on cars where the MODEL_YEAR > 2016. 'List all information about cars' means list all the attributes in the relation CAR. Using the relational operator SELECT, we can write this query as a relational algebraic expression as:

σ model_year > 2016 (CAR)

The resulting relation is shown in Figure 4.21.

FIGURE 4.21 Result of σ model_year > 2016 (CAR)

REGISTRATION  CAR_MAKE    CAR_MODEL  CAR_COLOUR  MODEL_YEAR  LICENCE_NO
PE57UVP       Peugeot     508        Blue        2017        1990bty3212
ROMA482       Volkswagen  Golf GT    Black       2017        AQ-123-AV
Z-BA975       Peugeot     208        Black       2017        1980vrt7312

Example 2

Supposing the mechanic at the garage wishes to find out information about the parts in stock. The following query is asked:

Display all the names and their prices of the parts where the cost of the part is greater than 20.00.

This query requires only specific attributes to be displayed, PART_NAME and PART_COST. Both are obviously in the relation PART. We will also need to restrict the rows of this relation to those where PART_COST > 20.00 using the SELECT operator. The relational algebraic expression for this query is:

Π part_name, part_cost (σ part_cost > 20.00 (PART))

The resulting relation is shown in Figure 4.22.

FIGURE 4.22 Result of Π part_name, part_cost (σ part_cost > 20.00 (PART))

PART_NAME    PART_COST
Brake Pads   24.99
Brake Discs  49.54
Airbag       24.95

Example 3

The final example will also use the natural join operator and show how we can write expressions when data is required from a number of different tables. Consider a more complex query:

List the car registration and model details and part numbers for all cars where the model year is 2017, where an inspection was carried out after 01/03/2018 which resulted in a part being required for a repair.

This query will have to be broken down into a number of different stages, each one having a set of intermediate results. The first part of the query states that we need the attributes REGISTRATION and CAR_MODEL, which are located in the CAR relation. Also, we are only interested in cars whose MODEL_YEAR is 2017. This information can be written using the following relational algebraic expression:

Π registration, car_model (σ model_year = 2017 (CAR))

The result of applying this statement to the CAR table is shown in Figure 4.23.

FIGURE 4.23 Result of Π registration, car_model (σ model_year = 2017 (CAR))

REGISTRATION  CAR_MODEL
PE57UVP       508
ROMA482       Golf GT
Z-BA975       208

The next part of this query requires information about inspections that were carried out after 01/03/2018. Information about inspections is stored in the MAINTENANCE_RECORD relation. The query is not asking for any specific attributes, so we will assume that information about all attributes in the MAINTENANCE_RECORD relation is required (this means the values of all attributes). However, we must restrict the tuples by only selecting those inspections where the INSPECTION_DATE > 01/03/2018. This second part of the query can be written as:

σ inspection_date > 01/03/2018 (MAINTENANCE_RECORD)

The result of applying this expression to the MAINTENANCE_RECORD table is shown in Figure 4.24.

FIGURE 4.24 Result of σ inspection_date > 01/03/2018 (MAINTENANCE_RECORD)

INSPECTION_CODE  REGISTRATION  INSPECTION_DATE  EVALUATION
100036           PE57UVP       10/05/2018       FAIL
100390           ROMA482       01/09/2018
122456           Z-BA975       03/10/2018       FAIL

We now have relational algebraic expressions for the first two parts of the query. The next stage is to join the rows from the resulting tables shown in Figures 4.23 and 4.24. This join operation is the natural join, with the common column REGISTRATION now being in both the CAR and MAINTENANCE_RECORD relations. This can be written as:

TempR = Π registration, car_model (σ model_year = 2017 (CAR)) |X| σ inspection_date > 01/03/2018 (MAINTENANCE_RECORD)

where TempR is a relation which stores the intermediate results. The result of the natural join operation, shown using the three steps, is shown in Figure 4.25. Notice that the attributes have been prefixed with the letters M and C to show which relations they were originally from (MAINTENANCE_RECORD and CAR respectively).
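The SELECT and PROJECT expressions used in Examples 1 to 3 can be tried out with a short Python sketch. The helper names select and project and the list-of-dicts representation are assumptions made for illustration; the CAR rows are copied from Figure 4.20, and only four of the PART rows are included to keep the block short.

```python
# CAR rows from Figure 4.20; PART holds only a subset of the rows in that figure.
CAR = [
    {"REGISTRATION": "3679MR82", "CAR_MAKE": "Toyota", "CAR_MODEL": "Corolla",
     "CAR_COLOUR": "Blue", "MODEL_YEAR": 2016, "LICENCE_NO": "1967fr89768"},
    {"REGISTRATION": "E-TS865", "CAR_MAKE": "Nissan", "CAR_MODEL": "Micra",
     "CAR_COLOUR": "Red", "MODEL_YEAR": 2004, "LICENCE_NO": "1973Smith121"},
    {"REGISTRATION": "PE57UVP", "CAR_MAKE": "Peugeot", "CAR_MODEL": "508",
     "CAR_COLOUR": "Blue", "MODEL_YEAR": 2017, "LICENCE_NO": "1990bty3212"},
    {"REGISTRATION": "PISE567", "CAR_MAKE": "Volkswagen", "CAR_MODEL": "Eos",
     "CAR_COLOUR": "Lime", "MODEL_YEAR": 2016, "LICENCE_NO": "DF-678-WV"},
    {"REGISTRATION": "ROMA482", "CAR_MAKE": "Volkswagen", "CAR_MODEL": "Golf GT",
     "CAR_COLOUR": "Black", "MODEL_YEAR": 2017, "LICENCE_NO": "AQ-123-AV"},
    {"REGISTRATION": "Z-BA975", "CAR_MAKE": "Peugeot", "CAR_MODEL": "208",
     "CAR_COLOUR": "Black", "MODEL_YEAR": 2017, "LICENCE_NO": "1980vrt7312"},
]
PART = [
    {"PART_NO": 12392, "PART_NAME": "Brake pads", "PART_COST": 24.99},
    {"PART_NO": 12393, "PART_NAME": "Brake discs", "PART_COST": 49.54},
    {"PART_NO": 12395, "PART_NAME": "Spark plugs", "PART_COST": 0.99},
    {"PART_NO": 12396, "PART_NAME": "Airbag", "PART_COST": 24.95},
]


def select(relation, predicate):
    """SELECT (sigma): keep the tuples for which the predicate is true."""
    return [t for t in relation if predicate(t)]


def project(relation, attributes):
    """PROJECT (pi): keep only the named attributes and drop duplicate tuples."""
    seen = set()
    result = []
    for t in relation:
        key = tuple(t[a] for a in attributes)
        if key not in seen:
            seen.add(key)
            result.append(dict(zip(attributes, key)))
    return result


# Example 1: sigma model_year > 2016 (CAR)
example_1 = select(CAR, lambda t: t["MODEL_YEAR"] > 2016)

# Example 2: pi part_name, part_cost (sigma part_cost > 20.00 (PART))
example_2 = project(select(PART, lambda t: t["PART_COST"] > 20.00),
                    ["PART_NAME", "PART_COST"])

# Example 3, first part: pi registration, car_model (sigma model_year = 2017 (CAR))
example_3a = project(select(CAR, lambda t: t["MODEL_YEAR"] == 2017),
                     ["REGISTRATION", "CAR_MODEL"])

print([t["REGISTRATION"] for t in example_1])
print(example_2)
print(example_3a)
```

Note that the inner SELECT runs before the outer PROJECT, mirroring the way the nested algebraic expressions are read from the inside out.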

FIGURE 4.25 The TempR relation

Step 1: Compute the Cartesian product: MAINTENANCE_RECORD X CAR.

M.INSPECTION_CODE  M.REGISTRATION  M.INSPECTION_DATE  M.EVALUATION  C.REGISTRATION  C.CAR_MODEL
100036             PE57UVP         10/05/2018         FAIL          PE57UVP         508
100036             PE57UVP         10/05/2018         FAIL          ROMA482         Golf GT
100036             PE57UVP         10/05/2018         FAIL          Z-BA975         208
100390             ROMA482         01/09/2018                       PE57UVP         508
100390             ROMA482         01/09/2018                       ROMA482         Golf GT
100390             ROMA482         01/09/2018                       Z-BA975         208
122456             Z-BA975         03/10/2018         FAIL          PE57UVP         508
122456             Z-BA975         03/10/2018         FAIL          ROMA482         Golf GT
122456             Z-BA975         03/10/2018         FAIL          Z-BA975         208

Step 2: SELECT only the rows for which the REGISTRATION values are equal, i.e. M.REGISTRATION = C.REGISTRATION. (Joining columns: M.REGISTRATION and C.REGISTRATION.)

M.INSPECTION_CODE  M.REGISTRATION  M.INSPECTION_DATE  M.EVALUATION  C.REGISTRATION  C.CAR_MODEL
100036             PE57UVP         10/05/2018         FAIL          PE57UVP         508
100390             ROMA482         01/09/2018                       ROMA482         Golf GT
122456             Z-BA975         03/10/2018         FAIL          Z-BA975         208

Step 3: Perform a PROJECT on either C.REGISTRATION or M.REGISTRATION to the result of Step 2 and drop the prefixes C and M in the final relation. The table below shows the relation TempR, which has been created as a result of Step 3.

INSPECTION_CODE  REGISTRATION  INSPECTION_DATE  EVALUATION  CAR_MODEL
100036           PE57UVP       10/05/2018       FAIL        508
100390           ROMA482       01/09/2018                   Golf GT
122456           Z-BA975       03/10/2018       FAIL        208

The next part of the query requires the information we have obtained so far to be restricted even further by only displaying information for cars where a part was needed for a repair. To find out this information we have to look to see if there is a PART_NO in the REPAIR relation, which corresponds to a specific INSPECTION_CODE in the MAINTENANCE_RECORD relation. The relation TempR already stores the intermediate results from the first part of our query, so we must now connect TempR to the REPAIR relation using a natural join on the INSPECTION_CODE column. This can be written as the expression:

QueryResult = TempR |X| REPAIR

Figure 4.26 shows the result of performing this natural join operation and stores the results in a relation called QueryResult.

FIGURE 4.26 The QueryResult relation

The relation TempR

INSPECTION_CODE  REGISTRATION  INSPECTION_DATE  EVALUATION  CAR_MODEL
100036           PE57UVP       10/05/2018       FAIL        508
100390           ROMA482       01/09/2018                   Golf GT
122456           Z-BA975       03/10/2018       FAIL        208

The relation REPAIR

INSPECTION_CODE  PART_NO
106750           12396
106750           12397
100036           12393
200450           12391
100036           12397
200450           12392
200456           12397

QueryResult = TempR |X| REPAIR

INSPECTION_CODE  REGISTRATION  INSPECTION_DATE  EVALUATION  CAR_MODEL  PART_NO
100036           PE57UVP       10/05/2018       FAIL        508        12393
100036           PE57UVP       10/05/2018       FAIL        508        12397

Finally, the original query requested that we only list the car registration, model details and part numbers. This requires us to perform a PROJECT operation on the intermediate results in the QueryResult relation using the following expression:

Π registration, car_model, part_no (QueryResult)

The final results of the query are shown in Figure 4.27.

FIGURE 4.27 Solution to example 3

REGISTRATION  CAR_MODEL  PART_NO
PE57UVP       508        12393
PE57UVP       508        12397

As you can see, it is possible to solve a complex query by breaking down the query into a number of smaller relational algebra expressions. The full expression for example 3 can be written as:

Π registration, car_model, part_no ((REPAIR) |X| (Π registration, car_model (σ model_year = 2017 (CAR)) |X| σ inspection_date > 01/03/2018 (MAINTENANCE_RECORD)))
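The full expression for example 3 can be checked end to end with a small Python sketch. Everything here is illustrative: the helper names select, project and natural_join are hypothetical, relations are lists of dictionaries, dates written as DD/MM/YYYY in the figures are converted to datetime.date values, the blank EVALUATION is stored as None, and CAR is reduced to the three attributes the query uses. The data comes from Figure 4.20.

```python
from datetime import date

# Data from Figure 4.20; CAR is cut down to the attributes needed by the query.
CAR = [
    {"REGISTRATION": "3679MR82", "CAR_MODEL": "Corolla", "MODEL_YEAR": 2016},
    {"REGISTRATION": "E-TS865", "CAR_MODEL": "Micra", "MODEL_YEAR": 2004},
    {"REGISTRATION": "PE57UVP", "CAR_MODEL": "508", "MODEL_YEAR": 2017},
    {"REGISTRATION": "PISE567", "CAR_MODEL": "Eos", "MODEL_YEAR": 2016},
    {"REGISTRATION": "ROMA482", "CAR_MODEL": "Golf GT", "MODEL_YEAR": 2017},
    {"REGISTRATION": "Z-BA975", "CAR_MODEL": "208", "MODEL_YEAR": 2017},
]
MAINTENANCE_RECORD = [
    {"INSPECTION_CODE": 100036, "REGISTRATION": "PE57UVP", "INSPECTION_DATE": date(2018, 5, 10), "EVALUATION": "FAIL"},
    {"INSPECTION_CODE": 100390, "REGISTRATION": "ROMA482", "INSPECTION_DATE": date(2018, 9, 1), "EVALUATION": None},
    {"INSPECTION_CODE": 106750, "REGISTRATION": "E-TS865", "INSPECTION_DATE": date(2016, 3, 1), "EVALUATION": "PASS"},
    {"INSPECTION_CODE": 122456, "REGISTRATION": "Z-BA975", "INSPECTION_DATE": date(2018, 10, 3), "EVALUATION": "FAIL"},
    {"INSPECTION_CODE": 145678, "REGISTRATION": "PISE567", "INSPECTION_DATE": date(2017, 9, 30), "EVALUATION": "PASS"},
    {"INSPECTION_CODE": 200450, "REGISTRATION": "E-TS865", "INSPECTION_DATE": date(2015, 2, 21), "EVALUATION": "PASS"},
    {"INSPECTION_CODE": 200456, "REGISTRATION": "E-TS865", "INSPECTION_DATE": date(2017, 4, 1), "EVALUATION": "FAIL"},
]
REPAIR = [
    {"INSPECTION_CODE": 106750, "PART_NO": 12396},
    {"INSPECTION_CODE": 106750, "PART_NO": 12397},
    {"INSPECTION_CODE": 100036, "PART_NO": 12393},
    {"INSPECTION_CODE": 200450, "PART_NO": 12391},
    {"INSPECTION_CODE": 100036, "PART_NO": 12397},
    {"INSPECTION_CODE": 200450, "PART_NO": 12392},
    {"INSPECTION_CODE": 200456, "PART_NO": 12397},
]


def select(relation, predicate):
    """SELECT (sigma): keep the tuples for which the predicate is true."""
    return [t for t in relation if predicate(t)]


def project(relation, attributes):
    """PROJECT (pi): keep only the named attributes and drop duplicate tuples."""
    seen, result = set(), []
    for t in relation:
        key = tuple(t[a] for a in attributes)
        if key not in seen:
            seen.add(key)
            result.append(dict(zip(attributes, key)))
    return result


def natural_join(r1, r2):
    """Natural join on all attribute names the two relations have in common."""
    if not r1 or not r2:
        return []
    common = [a for a in r1[0] if a in r2[0]]
    return [{**t1, **t2} for t1 in r1 for t2 in r2
            if all(t1[a] == t2[a] for a in common)]


# TempR: cars with model year 2017 joined to inspections after 01/03/2018.
temp_r = natural_join(
    project(select(CAR, lambda t: t["MODEL_YEAR"] == 2017), ["REGISTRATION", "CAR_MODEL"]),
    select(MAINTENANCE_RECORD, lambda t: t["INSPECTION_DATE"] > date(2018, 3, 1)),
)
# QueryResult: TempR joined to REPAIR on INSPECTION_CODE.
query_result = natural_join(temp_r, REPAIR)
# Final projection onto the three requested attributes.
answer = project(query_result, ["REGISTRATION", "CAR_MODEL", "PART_NO"])
print(answer)
```

The intermediate relation temp_r holds three tuples, as in Figure 4.25, and only inspection 100036 has matching rows in REPAIR, so the answer contains the two tuples shown in Figure 4.27.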

4.4 RELATIONAL CALCULUS

Relational calculus is a formal language based upon a branch of mathematical logic called predicate calculus. There are two types of relational calculus: tuple relational calculus and domain relational calculus. Relational calculus allows users to describe what they want, rather than how to compute it. In addition, relational calculus underlines the appearance of Structured Query Language (SQL), which you will learn about in Chapter 8. Domain relational calculus is different from tuple relational calculus as it uses domain variables that take on values from an attribute domain, rather than values for an entire tuple. In the following sections you will learn more about these two types of relational calculus.

NOTE
A NOTE ON PREDICATE CALCULUS

First-order logic or predicate calculus is a precise language that can be used to express queries. Predicates are words that describe certain relations and properties. In logic, a predicate has the form:

name_of_predicate(arguments)

Consider the following statements:

student(Alex)
studies(Alex, Database Systems)

In these two statements, student and studies are the names of the predicates. The statement student(Alex) has a value TRUE if Alex is a student, and a value FALSE if Alex is not a student. Variables are used if we want to express the property of being a student, and not refer to a specific individual. So the above statements become:

student(x)
studies(x,y)

The expression student(x) is now referred to as a predicate expression. It has no predetermined truth value as the value of x is currently unknown. Variables in a predicate expression can take values within a certain domain. The domain of a predicate variable is the set of all values that can be substituted in the place of the variable. When writing expressions in predicate

P(x)represents

a pr