English Pages [965] Year 2020
Australia, Brazil, Mexico, South Africa, Singapore, United Kingdom, United States

Copyright 2020 Cengage Learning. All Rights Reserved. May not be copied, scanned, or duplicated, in whole or in part. Due to electronic rights, some third party content may be suppressed from the eBook and/or eChapter(s). Editorial review has deemed that any suppressed content does not materially affect the overall learning experience. Cengage Learning reserves the right to remove additional content at any time if subsequent rights restrictions require it.
This is an electronic version of the print textbook. Due to electronic rights restrictions, some third party content may be suppressed. Editorial review has deemed that any suppressed content does not materially affect the overall learning experience. The publisher reserves the right to remove content from this title at any time if subsequent rights restrictions require it. For valuable information on pricing, previous editions, changes to current editions, and alternate formats, please visit www.cengage.com/highered to search by ISBN#, author, title, or keyword for materials in your areas of interest.

Important Notice: Media content referenced within the product description or the product text may not be available in the eBook version.
Database Principles: Fundamentals of Design, Implementation, and Management, Third Edition

US Authors: Carlos Coronel, Steven Morris
Adapters: Keeley Crockett, Craig Blewett

Publisher: Marinda Louw
Marketing Manager: Anna Reading
Senior Content Project Manager: Sue Povey
Manufacturing Manager: Eyvett Davis
Typesetter: SPi-Global
Cover Designer: Simon Levy Associates
Cover Image(s): Vijay Kumar/Getty Images

© 2020, Cengage Learning EMEA

Adapted from Database Systems: Design, Implementation, & Management, 13th Edition, by Carlos Coronel and Steven Morris. Copyright Cengage Learning, Inc., 2019. All Rights Reserved.

ALL RIGHTS RESERVED. No part of this work may be reproduced, transmitted, stored, distributed or used in any form or by any means, electronic, mechanical, photocopying, recording or otherwise, without the prior written permission of Cengage Learning or under license in the U.K. from the Copyright Licensing Agency Ltd.

The Author(s) and the Adapter(s) have asserted the right under the Copyright Designs and Patents Act 1988 to be identified as Author(s) and Adapter(s) of this Work.

For product information and technology assistance, contact us at [email protected]
For permission to use material from this text or product and for permission queries, email [email protected]

British Library Cataloguing-in-Publication Data
A catalogue record for this book is available from the British Library.

ISBN: 978-1-4737-6804-8

Cengage Learning, EMEA
Cheriton House, North Way
Andover, Hampshire, SP10 5BE
United Kingdom

Cengage Learning is a leading provider of customized learning solutions with employees residing in nearly 40 different countries and sales in more than 125 countries around the world. Find your local representative at: www.cengage.co.uk.

Cengage Learning products are represented in Canada by Nelson Education, Ltd.

To learn more about Cengage platforms and services, register or access your online learning solution, or purchase materials for your course, visit www.cengage.com.

Printed in China by RR Donnelley
Print Number: 01
Print Year: 2020
Brief Contents

Part I Database Systems
1 The Database Approach
2 Data Models
3 Relational Model Characteristics
4 Relational Algebra and Calculus

Part II Design Concepts
5 Data Modelling with Entity Relationship Diagrams
6 Data Modelling Advanced Concepts
7 Normalising Database Designs

Part III Database Programming
8 Beginning Structured Query Language
9 Procedural Language SQL and Advanced SQL

Part IV Database Design
10 Database Development Process
11 Conceptual, Logical, and Physical Database Design

Part V Database Transactions and Performance Tuning
12 Managing Transactions and Concurrency
13 Managing Database and SQL Performance

Part VI Database Management
14 Distributed Databases
15 Databases for Business Intelligence
16 Big Data and NoSQL
17 Database Connectivity and Web Technologies

Glossary
Index

Appendices (Available online)
Appendix A: Designing Databases with Visio Professional: A Tutorial
Appendix B: The University Lab: Conceptual, Logical, and Physical Database Design
Appendix C: Global Tickets Ltd: Conceptual, Logical, and Physical Database Design
Appendix D: Converting an ER Model into a Database Structure
Appendix E: Comparison of ER Modelling Notations
Appendix F: Client/Server Systems
Appendix G: Object-Orientated Databases
Appendix H: Databases in e-Commerce
Appendix I: The Hierarchical Database Model
Appendix J: The Network Database Model
Appendix K: Database Administration
Appendix L: Data Warehouse Implementation Factors
Appendix M: Creating a New Database Using Oracle 12c
Appendix N: A Guide to Using SQL Developer with Oracle 12c
Appendix O: Building a Simple Object-Relational Database Using Oracle Objects
Appendix P: Microsoft Access Tutorial
Appendix Q: Working with MongoDB
Appendix R: Working with Neo4j
Contents

Preface
Changes to the Third Edition
Acknowledgements
About the Authors
Walk Through Tour
Dedication
Teaching and Learning Support Resources

Part I Database Systems
Business Vignette: The Relational Revolution: An Historical Journey

1 The Database Approach
  Preview
  1.1 Data vs information
  1.2 Introducing the database and the DBMS
  1.3 Why database design is important
  1.4 Historical roots: files and data processing
  1.5 Problems with file system data management
  1.6 Database systems
  1.7 Preparing for your database professional career
  Summary
  Key terms
  Further reading
  Review questions
  Problems

2 Data Models
  Preview
  2.1 The importance of data models
  2.2 Data model basic building blocks
  2.3 Business rules
  2.4 The evolution of data models
  2.5 Degrees of data abstraction
  Summary
  Key terms
  Further reading
  Review questions
  Problems
3 Relational Model Characteristics
  Preview
  3.1 A logical view of data
  3.2 Keys
  3.3 Integrity rules
  3.4 The data dictionary and the system catalogue
  3.5 Relationships within the relational database
  3.6 Data redundancy revisited
  3.7 Indexes
  3.8 Codd's relational database rules
  Summary
  Key terms
  Further reading
  Review questions
  Problems

4 Relational Algebra and Calculus
  Preview
  4.1 Relational operators
  4.2 Joins
  4.3 Constructing queries using relational algebraic expressions
  4.4 Relational calculus
  Summary
  Key terms
  Further reading
  Review questions
  Problems

Part II Design Concepts
Business Vignette: Using Data to Improve the Lives of Children and Women

5 Data Modelling with Entity Relationship Diagrams
  Preview
  5.1 The entity relationship (ER) model
  5.2 Developing an ER diagram
  5.3 Database design challenges: conflicting goals
  Summary
  Key terms
  Further reading
  Review questions
  Problems

6 Data Modelling Advanced Concepts
  Preview
  6.1 The extended entity relationship model
  6.2 Entity clustering
  6.3 Entity integrity: selecting primary keys
  6.4 Design cases: learning flexible database design
  6.5 Data modelling checklist
  Summary
  Key terms
  Further reading
  Review questions
  Problems
  Case studies

7 Normalising Database Designs
  Preview
  7.1 Database tables and normalisation
  7.2 The need for normalisation
  7.3 The normalisation process
  7.4 Improving the design
  7.5 Surrogate key considerations
  7.6 Higher-level normal forms
  7.7 Normalisation and database design
  7.8 Denormalisation
  Summary
  Key terms
  Further reading
  Review questions
  Problems

Part III Database Programming
Business Vignette: Open Source Databases

8 Beginning Structured Query Language
  Preview
  8.1 Introduction to SQL
  8.2 Data definition commands
  8.3 Data manipulation commands
  8.4 Select queries
  8.5 Advanced data definition commands
  8.6 Advanced select queries
  8.7 Virtual tables: creating a view
  8.8 Joining database tables
  Summary
  Key terms
  Further reading
  Review questions
  Problems
9 Procedural Language SQL and Advanced SQL
  Preview
  9.1 Relational set operators
  9.2 SQL join operators
  9.3 Subqueries and correlated queries
  9.4 SQL functions
  9.5 Oracle sequences
  9.6 Updatable views
  9.7 Procedural SQL
  9.8 Embedded SQL
  Summary
  Key terms
  Further reading
  Review questions
  Problems
  Case

Part IV Database Design
Business Vignette: EM-DAT: The International Disaster Database for Disaster Preparedness

10 Database Development Process
  Preview
  10.1 The information system
  10.2 The systems development life cycle (SDLC)
  10.3 The database life cycle (DBLC)
  10.4 Database design strategies
  10.5 Centralised vs decentralised design
  10.6 Database administration
  Summary
  Key terms
  Further reading
  Review questions
  Problems

11 Conceptual, Logical, and Physical Database Design
  Preview
  11.1 Conceptual design
  11.2 Logical database design
  11.3 Physical database design
  Summary
  Key terms
  Further reading
  Review questions
  Problems
Part V Database Transactions and Performance Tuning
Business Vignette: From Data Warehouse to Data Lake

12 Managing Transactions and Concurrency
  Preview
  12.1 What is a transaction?
  12.2 Concurrency control
  12.3 Concurrency control with locking methods
  12.4 Concurrency control with time stamping methods
  12.5 Concurrency control with optimistic methods
  12.6 ANSI levels of transaction isolation
  12.7 Database recovery management
  Summary
  Key terms
  Further reading
  Review questions
  Problems

13 Managing Database and SQL Performance
  Preview
  13.1 Database performance-tuning concepts
  13.2 Query processing
  13.3 Indexes and query optimisation
  13.4 Optimiser choices
  13.5 SQL performance tuning
  13.6 Query formulation
  13.7 DBMS performance tuning
  13.8 Query optimisation example
  Summary
  Key terms
  Further reading
  Review questions
  Problems

Part VI Database Management
Business Vignette: The Facebook-Cambridge Analytica Data Scandal and the GDPR

14 Distributed Databases
  Preview
  14.1 The evolution of distributed database management systems
  14.2 DDBMS advantages and disadvantages
  14.3 Distributed processing and distributed databases
  14.4 Characteristics of distributed database management systems
  14.5 DDBMS components
  14.6 Levels of data and process distribution
  14.7 Distributed database transparency features
  14.8 Distribution transparency
  14.9 Transaction transparency
  14.10 Performance and failure transparency
  14.11 Distributed database design
  14.12 The CAP theorem
  14.13 Database security
  14.14 Distributed databases within the cloud
  14.15 C.J. Date's 12 commandments for distributed databases
  Summary
  Key terms
  Further reading
  Review questions
  Problems

15 Databases for Business Intelligence
  Preview
  15.1 The need for data analysis
  15.2 Business intelligence
  15.3 Decision support data
  15.4 The data warehouse
  15.5 Star schemas
  15.6 Data analytics
  15.7 Online analytical processing
  15.8 SQL analytic functions
  15.9 Data visualisation
  Summary
  Key terms
  Further reading
  Review questions
  Problems

16 Big Data and NoSQL
  Preview
  16.1 Big data
  16.2 Hadoop
  16.3 NoSQL databases
  16.4 NewSQL databases
  16.5 Working with document databases using MongoDB
  16.6 Working with graph databases using Neo4j
  Summary
  Key terms
  Review questions
17 Database Connectivity and Web Technologies
  Preview
  17.1 Database connectivity
  17.2 Database internet connectivity
  17.3 Extensible markup language (XML)
  17.4 Cloud computing services
  17.5 The semantic web
  Summary
  Key terms
  Further reading
  Review questions
  Problems

Glossary
Index

Appendices (Available online)
Appendix A: Designing Databases with Visio Professional: A Tutorial
Appendix B: The University Lab: Conceptual, Logical, and Physical Database Design
Appendix C: Global Tickets Ltd: Conceptual, Logical, and Physical Database Design
Appendix D: Converting an ER Model into a Database Structure
Appendix E: Comparison of ER Modelling Notations
Appendix F: Client/Server Systems
Appendix G: Object-Orientated Databases
Appendix H: Databases in e-Commerce
Appendix I: The Hierarchical Database Model
Appendix J: The Network Database Model
Appendix K: Database Administration
Appendix L: Data Warehouse Implementation Factors
Appendix M: Creating a New Database Using Oracle 12c
Appendix N: A Guide to Using SQL Developer with Oracle 12c
Appendix O: Building a Simple Object-Relational Database Using Oracle Objects
Appendix P: Microsoft Access Tutorial
Appendix Q: Working with MongoDB
Appendix R: Working with Neo4j
Preface

We are excited to introduce the third edition of Database Principles, which provides a solid and practical foundation for the design, implementation and management of database systems. This foundation is built on the notion that, while databases are very practical things, their successful creation depends on understanding the important concepts that define them.

This edition is suitable for a first course in databases at undergraduate level and will also provide essential material for conversion postgraduate courses. Providing comprehensive coverage of core database concepts, it is an ideal text not only for those studying database systems in the context of computer science, but also for those on courses in the areas of business technology, data science and data analytics.

The Approach: Continued Emphasis on Design

As the title suggests, Database Principles: Fundamentals of Design, Implementation, and Management covers three broad aspects of database systems. However, for several important reasons, special attention is given to database design.

The availability of excellent database software enables even database-inexperienced people to create databases and database applications. Unfortunately, the "create without design" approach usually paves the way to any of a number of database disasters. In our experience, many, if not most, database system failures are traceable to poor design and cannot be solved with the help of even the best programmers and managers. Nor is better DBMS software likely to overcome problems created or magnified by poor design. Using an analogy, even the best bricklayers and carpenters can't create a good building from a bad blueprint.

Most difficult database system management problems seem to be triggered by poorly designed databases. It hardly seems worthwhile to use scarce resources to develop excellent and extensive database system management skills in order to exercise them on crises induced by poorly designed databases.

Design provides an excellent means of communication. Clients are more likely to get what they need when database system design is approached carefully and thoughtfully. In fact, clients may discover how their organisations really function once a good database design is completed.

Familiarity with database design techniques promotes one's understanding of current database technologies. For example, because data warehouses derive much of their data from operational databases, data warehouse concepts, structures and procedures make more sense when the operational database's structure and implementation are understood.

Because the practical aspects of database design are stressed, we have covered design concepts and procedures in detail, making sure that the numerous end-of-chapter problems are sufficiently challenging that students can develop real and useful design skills. We also make sure that students understand the potential and actual conflicts between database design elegance, information requirements and transaction processing speed. For example, it makes little sense to design databases that meet design elegance standards while they fail to meet end-user information requirements. Therefore, we explore the use of carefully defined trade-offs in order to ensure that the databases are capable of meeting end-user requirements while conforming to high design standards.

In keeping with the second edition, this third edition retains the use of UML (Unified Modelling Language) notation to produce entity relationship models within data modelling. UML has continued to be used as the design standard: continual development by the Object Management Group has led to UML becoming an International Standard (ISO/IEC 19505-1 and 19505-2), which is continually reviewed, and as of 2017 UML 2.5.1 is available. However, organisations still use both Crow's Foot and Chen notations in order to maintain legacy systems, so it is important that familiarity with these approaches to data modelling is maintained. Appendix E, Comparison of ER Modelling Notations, contains coverage of both of these notations.
Changes to the Third Edition

In this third edition, we have added some new features and greater depth to strengthen the already strong coverage of database design. Here are just a few highlights:

New and updated Business Vignettes provide topical discussion points for the classroom.

To support the continued growth of Big Data technology, we have added a new chapter, Chapter 16: Big Data and NoSQL. The chapter focuses on the characteristics of Big Data and the technologies that have been developed to support its use, including Hadoop and MongoDB.

Expanded coverage of data visualisation tools and techniques in Chapter 15, Databases for Business Intelligence.

A new appendix containing hands-on exercises for querying MongoDB (Appendix Q).

An additional appendix providing coverage of Neo4j, with hands-on exercises for querying graph databases (Appendix R).
Acknowledgements

The publisher acknowledges the contribution of the following lecturers, who provided invaluable feedback on the second and third editions: Emilia Mwim, Patricia Alexander, Judy van Biljon, Casper Wessels, Ismael Essop, Theo Jakeman, Chris Macdonald, Duncan McPhee, Andy Davies, Mick Ridley, Ray Turner and Mark Green, from institutions including UNISA, University of Pretoria, Central University of Technology, Free State, Peterborough Regional College, Blackburn College, University of Essex, University of Glamorgan, University of Bradford, University of Greenwich and Oxford Brookes University.

For this third edition, I would like to say a special thanks to Pamela Quick, who previously worked as a Senior Lecturer at Manchester Metropolitan University in the School of Computing, Maths and Digital Technology. Her years of experience within the database field have been very valuable, specifically in the coverage of relational algebra.

On this edition, I have been lucky to work with a very patient, supportive and professional Publisher, Marinda Louw. Marinda provided fantastic support in answering all my emails. It has certainly been a pleasure working with you.

Last, and not least, thank you to my family (my ohana) for your patience and support.

Keeley Crockett
January 2020
About the Authors

Carlos Coronel is currently the Lab Director for the College of Business Computer Labs at Middle Tennessee State University. He has over 25 years of experience in various fields as a Database Administrator, Network Administrator, Web Manager and Technology Specialist. He has taught courses in Web development, database design and development, and data communications at the undergraduate and graduate levels.

Steven Morris completed his Bachelor, Masters and PhD at Auburn University. He has taught undergraduate and graduate courses in areas including database design and development, Advanced SQL, PL/SQL programming, systems analysis and design, and Principles of MIS at Middle Tennessee State University. He has published many articles in the area of information systems.

Dr Keeley Crockett is a Reader in Computational Intelligence in the School of Computing, Mathematics and Digital Technology at Manchester Metropolitan University. She gained a BSc Degree (Hons) in Computation from UMIST in 1993, and a PhD in the field of machine learning from Manchester Metropolitan University in 1998, entitled Fuzzy Rule Induction from Data Domains. She leads the Computational Intelligence Lab, which has established a strong international presence for its research into Adaptive Psychological Profiling, fuzzy systems and natural language dialogue systems. She has published over 125 refereed papers in major international conferences and journals. She is an active volunteer within the IEEE, serving in multiple roles, including in IEEE Women in Engineering, and is proud to be a STEM Ambassador.

Dr Craig Blewett is the founder of the Activated Classroom Teaching (ACT) model, a unique approach to teaching with technology. His PhD, in education technology, explored teaching using technology and resulted in his innovative approach to teaching. He has taught in South Africa, is the author of numerous books and journal articles, and is an internationally acclaimed speaker with a passion for helping to change teaching in our rapidly changing world.
WALK-THROUGH TOUR
Business Vignettes illustrate the part topics with a genuine scenario and show how the subject integrates with the real world.
Chapter Previews set the scene for the chapter and provide an overview of the chapter's contents.
Learning Objectives appear at the start of each chapter to help you monitor your understanding and progress through each chapter. Each chapter also ends with a summary section that recaps the key content for revision purposes.
Online Content boxes draw attention to relevant material on the online platform for this book.
Notes highlight important facts about the concepts introduced in the chapter.
Summary: Each chapter ends with a comprehensive summary that provides a thorough recap of the issues in each chapter, helping you to assess your understanding and revise key content.
Key Terms are listed at the end of the chapter and explained in full in a Glossary at the end of the book, enabling you to find explanations of key terms quickly.
Further Reading allows you to explore the subject further, and acts as a starting point for projects and assignments.
Problems become progressively more complex as students draw on the lessons learnt from the completion of preceding problems.
Review Questions help reinforce and test your knowledge and understanding, and provide a basis for group discussions and activities.
DEDICATION
To Craig, you are my best friend and patient husband. Thank you for supporting my crazy busy life; without you, nothing would be possible.
To my son, Kona, of whom I am so proud: keep following your dreams.
To my mother, Norma Crockett, who is the angel in my life. Thank you for always being there for me.
In memory of my father, Frank Crockett, who inspired me to be the person I am today.
To my mother- and father-in-law, Jackie and Bill Smith, who have provided me with much love and support.
In memory of Leslie Crockett, a true gentleman and much-loved uncle.
To my family and friends, all of you whom have painted rainbows in my life. Much love and aloha to you all.
Keeley Crockett
Teaching & Learning Support Resources
Cengage's peer-reviewed content for higher education courses is accompanied by a range of digital teaching and learning support resources. The resources are carefully tailored to the specific needs of the instructor, student and the course. Examples of the kind of resources provided include:
A password-protected area for instructors with, for example, a test bank, PowerPoint slides and an instructor's manual.
An open-access area for students including, for example, online appendices, useful weblinks and glossary terms.
Lecturers: to discover the dedicated teaching digital support resources accompanying this textbook, please register here for access: cengage.com/dashboard/#login
Students: to discover the dedicated learning digital support resources accompanying this textbook, please search for Database Principles: Fundamentals of Design, Implementation, and Management on: cengage.com
BE UNSTOPPABLE! Learn more at cengage.com
DATABASE PRINCIPLES
PART I
DATABASE SYSTEMS
1 The Database Approach
2 Data Models
3 Relational Model Characteristics
4 Relational Algebra and Calculus
BUSINESS VIGNETTE
THE RELATIONAL REVOLUTION: AN HISTORICAL JOURNEY
Until the late 1970s, databases stored large amounts of data in structures that were inflexible and difficult to navigate. Programmers needed to know what clients wanted to do with the data before the database was designed. Adding or changing the way the data were stored or analysed was time-consuming and expensive.
In 1970, Edgar 'Ted' Codd, a mathematician employed by IBM, published a groundbreaking article entitled 'A Relational Model of Data for Large Shared Data Banks'. At the time, nobody realised that Codd's theories would spark a technological revolution on par with the development of personal computers and the internet. Don Chamberlin, co-inventor of SQL, the most popular database query language today, explains: 'There was this guy Ted Codd who had some kind of strange mathematical notation, but nobody took it very seriously.' Then Ted Codd organised a symposium, and Chamberlin listened as Codd reduced complicated five-page programs to one line. 'And I said, "Wow",' Chamberlin recalls.
The symposium convinced IBM to fund System R, a research project that built a prototype of a relational database, which would eventually lead to the creation of SQL and DB2. IBM, however, kept System R on the back burner for a number of years, which turned out to be a crucial decision, because the company had a vested interest in IMS, a reliable, high-end database system that had been released in 1968.
At about the same time as System R started up, two professors from the University of California at Berkeley, who had read Codd's work, established a similar project called Ingres. The competition between the two tight-knit groups fuelled a series of papers. Unaware of the market potential of this research, IBM allowed its staff to publish these papers. Among those reading the papers was Larry Ellison, who had just founded a small company called Software Development Laboratories. Recruiting programmers from System R and Ingres, and securing funding from the CIA and the Navy, Ellison was able to market the first SQL-based relational database in 1979, well before IBM. By 1983, the company (Software Development Laboratories) had released a portable version of the database, had grossed over 13 910 000 annually, and had changed its name to Oracle.
Spurred on by competition, IBM finally released SQL/DS, its first relational database, in 1980.¹ In 2008, a group of leading database researchers met in Berkeley and issued a report declaring that the industry had reached an exciting turning point and was on the verge of another database revolution.²
In 2010, Oracle acquired MySQL as part of its acquisition of Sun. It has since maintained the free open-source MySQL Community Edition while providing several versions (Standard Edition, Enterprise Edition and Cluster Edition) for commercial customers. In 2019, the release of MySQL Document Store brought together the SQL and the NoSQL languages, enabling developers to link SQL relational tables to schema-less NoSQL databases.³ Oracle's latest offering is Oracle Database 19c, where the 'c' represents cloud; new versions now come out every year.
In our historical journey, we must also mention PostgreSQL, developed in 1986 as part of the POSTGRES project at the University of California at Berkeley. PostgreSQL⁴ is a free, open source, object-relational database that extends the traditional SQL language by allowing the creation of new datatypes and functions, and the ability to write code in different programming languages. It is a strong competitor to MySQL, given that it has had over 33 years of active development.
Analysts, journalists and business leaders continually see new developments with data acquisition and its management, such as the explosion of unstructured data, the growing importance of business intelligence, and the emergence of cloud technologies, which may require the development of new database models. Although traditional relational databases meet rigorous standards for data integrity and consistency, they do not scale to unstructured data as well as new database models such as NoSQL. NoSQL is also known as a non-relational database, which allows the storage and retrieval of unstructured data using a dynamic schema. A key question asked by database developers today is whether they need a NoSQL database or an SQL database for their application. For example, Twitter and Facebook, which do not require high levels of data consistency and integrity, have adopted NoSQL databases. In 2019, businesses are opting for combinations of multiple SQL and NoSQL databases, which suggests that one size does not fit all. As of March 2019, the most popular database management systems worldwide were Oracle, MySQL, Microsoft SQL Server and PostgreSQL.⁵
So, what is the future? Disruptive database technologies are required for business to remain competitive, and the key is real-time data. Alternative database models such as cloud database platforms, which have the capability for real-time data analytics, are certain to play a part. Big data has a role to play as additional data sources must be processed using data pipelines, all in accordance with the new General Data Protection Regulation (GDPR) data regulations. The relational model will survive, but it will also adapt at unprecedented speed.
1 'IBM and Oracle Trade Barbs over Databases', https://phys.org/news/2007-05-ibm-oracle-barbs-databases.html
2 Rakesh Agrawal et al., 'The Claremont Report on Database Research', http://db.cs.berkeley.edu/claremont/claremontreport08.pdf
3 MySQL Editions, www.mysql.com/products/
4 PostgreSQL, www.postgresql.org/about/
5 'Top 10 Databases for 2019', The Database Journal, www.databasejournal.com/features/oracle/slideshows/top-10-2019-databases.html
CHAPTER 1
The Database Approach
IN THIS CHAPTER, YOU WILL LEARN:
The difference between data and information
What a database is, what the different types of databases are, and why they are valuable assets for decision making
The importance of database design
How modern databases evolved from file systems
About flaws in file system data management
What the main components of a database system are and how a database system differs from a file system
The main functions of a database management system (DBMS)
The role of open source database systems
The importance of data governance and data quality
PREVIEW
Good decisions require good information, which is derived from raw facts known as data. Data are likely to be managed most efficiently when they are stored in a database. In this chapter, you learn what a database is, what it does and why it yields better results than other data management methods. You will also learn about different types of databases and why database design is so important.
Databases evolved from computer file systems. Although file system data management is now largely outmoded, understanding the characteristics of file systems is important because they are the source of serious data management limitations. In this chapter, you will also learn how the database management system approach helps eliminate most of the shortcomings of file system data management.
1.1
1
DATA VSINFORmATION
To understand what drives database design, you need to understand the difference between data and information. Data are raw facts. The word raw indicates that the facts have not yet been processed to reveal their meaning. For example, suppose that you want to know what the users of a computer lab think of its services. Typically, you would begin by surveying users to assess the computer lab's performance. Figure 1.1, Panel (a), shows the Web survey form that enables users to respond to your questions. When the survey form has been completed, the form's raw data are saved to a data repository, such as the one shown in Figure 1.1, Panel (b). Although you now have the facts in hand, they are not particularly useful in this format: reading page after page of zeros and ones is not likely to provide much insight. Therefore, you transform the raw data into a data summary like the one shown in Figure 1.1, Panel (c). It is now possible to get quick answers to questions such as, 'What is the composition of our lab's customer base?' In this case, you can quickly determine that most of your customers are second-year undergraduates and first-year undergraduates (the two largest groups in the summary, at 38 per cent and 32 per cent). And, because graphics can enhance your ability to quickly extract meaning from data, you show the data summary bar graph in Figure 1.1, Panel (d).

FIGURE 1.1 Transforming raw data into information
(a) initial survey screen
(b) raw data
(c) information in summary format
(d) information in graphic format
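The step from raw data to a data summary can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the response codes (UG1, UG2, UG3, PG, OTHER) and the ten sample responses are invented here and are not the data behind Figure 1.1, and summarise is a hypothetical helper name.

```python
from collections import Counter

# Hypothetical raw survey data: one classification code per respondent.
# These values are invented for illustration; they are not the data in Figure 1.1.
raw_responses = ["UG1", "UG2", "UG1", "PG", "UG3", "UG2", "UG1", "OTHER", "UG2", "UG1"]


def summarise(responses):
    """Turn raw facts into information: each category's share in per cent."""
    counts = Counter(responses)
    total = len(responses)
    return {category: round(100 * count / total, 1) for category, count in counts.items()}


summary = summarise(raw_responses)

# Text version of a bar graph: one '#' per 5 percentage points.
for category, share in sorted(summary.items(), key=lambda item: item[1], reverse=True):
    print(f"{category:<6} {share:5.1f}%  {'#' * int(share // 5)}")
```

The list of codes plays the role of the raw data in Panel (b), the dictionary of percentages corresponds to the summary in Panel (c), and the printed bars stand in for the graph in Panel (d).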
Information is the result of processing raw data to reveal its meaning. Data processing may be as simple as organising data to reveal patterns or as complex as making forecasts or drawing inferences using statistical modelling. Such information can then be used as the foundation for decision making. For example, the data summary for each question on the survey form can point out the lab's strengths and weaknesses, helping you to make informed decisions to better meet the needs of lab customers.
Raw data must be properly formatted for storage, processing and presentation. For example, the student classification in Figure 1.1, Panel (c) is formatted to show the results based on the classifications undergraduates years 1 to 3, postgraduates and other. The respondents' yes/no responses may need to be converted to a Y/N format for data storage. More complex formatting is required when working with complex data types such as sounds, videos or images.
In this information age, production of accurate, relevant and timely information is the key to good decision making. In turn, good decision making is the key to business survival in a global market. We are now said to be entering the knowledge age.6 Data are the foundation of information, which is the bedrock of knowledge, that is, the body of information and facts about a specific subject. Knowledge implies familiarity, awareness and understanding of information as it applies to an environment. A key characteristic of knowledge is that new knowledge can be derived from old knowledge.
Let's summarise some key points:
• Data constitute the building blocks of information.
• Information is produced by processing data.
• Information is used to reveal the meaning of data.
• Accurate, relevant and timely information is the key to good decision making.
• Good decision making is the key to organisational survival in a global environment.
Timely and useful information requires accurate data. Such data must be generated properly, and they must be stored in a format that is easy to access and process. And, like any basic resource, the data environment must be managed carefully. Data management is a discipline that focuses on the proper generation, storage and retrieval of data. Given the crucial role that data play, it should not surprise you that data management is a core activity for any business, government agency, service organisation or charity.
6 Peter Drucker coined the phrase 'knowledge worker' in 1959 in his book Landmarks of Tomorrow. In 1994, Ms Esther Dyson, Mr George Gilder, Dr George Keyworth and Dr Alvin Toffler introduced the concept of the 'knowledge age'.

1.1.1 Data Quality and Data Governance
The quality of the data within the database is essential if the organisation is to make accurate short- and long-term business decisions. Data must be fit for purpose, and this often means that it can be used to develop new strategies which aim to increase the income generation of an organisation. Data quality can be examined at a number of different levels, including:
• Accuracy: Is the data accurate and has it been obtained from a verifiable source?
• Completeness: Is the required data being stored?
• Timeliness: Is the data updated frequently in order to meet the business requirements?
• Relevance: Is the data relevant to the organisation?
• Uniqueness: Is the data unique, without redundancy?
• Unambiguous: Is the meaning of the data clear?
The above list is not exhaustive.
Data governance is the term used to describe a strategy or methodology defined by an organisation to safeguard data quality. Each organisation will have its own data governance strategy, which includes the development of a series of policies and procedures for managing the availability, usability, integrity and security of the data within the organisation. For example, the strategy defines who owns the data, who is authorised to create, update and delete records, and how compliance, auditing and monitoring are handled.
Most countries will have their own laws which govern the collecting, storage and processing of personal data and which organisations must adhere to. One of the major changes to law in Europe is the General Data Protection Regulation (GDPR), which became a legal requirement for all organisations from 25 May 2018. GDPR includes detailed rights of the data subject. For example, Article 22 of the GDPR includes the right of an individual not to be subject to automated decision making, such as profiling, unless explicit consent has been given. Individuals who are subject to automated decision making have the right to ask for an explanation of how the decision is reached. In South Africa, the Protection of Personal Information Act (POPIA) was signed into law in 2013. POPIA promotes the protection of personal information by public and private bodies.
Creating a data governance strategy is a complex and time-consuming task and will involve many people working at different levels within the organisation. Once the strategy has been developed and put into operation, all systems must utilise appropriate technology to ensure that all data complies with the policies and procedures. Master Data Management (MDM) is a component of a data governance strategy which provides the technological foundation for its implementation. MDM ensures that data is consistent and up to date across the organisation, which will allow accurate reporting.
Once in place, the strategy should be regularly monitored and measured, every several months, to ensure that the procedures are being followed and that it is still relevant to the purpose of the organisation. Data profiling and data quality tools are often used as part of the continual monitoring process to keep track of the quality of data over time. Data profiling produces statistical and mathematical information on the structure and quality of the data.

1.2 INTRODUCING THE DATABASE AND THE DBMS
Efficient data management typically requires the use of a computer database. A database is a shared, integrated computer structure that stores a collection of:
• End-user data, or raw facts of interest to the end user.
• Metadata, or data about data, through which the end-user data are integrated and managed.
The metadata provide a description of the data characteristics and the set of relationships that link the data found within the database. In a sense, a database resembles a very well-organised electronic filing cabinet in which powerful software, known as a database management system, helps manage the cabinet's contents. A database management system (DBMS) is a collection of programs that manages the database structure and controls access to the data stored in the database.

1.2.1 Role and Advantages of the DBMS
Figure 1.2 illustrates that the DBMS serves as the intermediary between the user and the database. The DBMS receives all application requests and translates them into the complex operations required to fulfil those requests. The DBMS hides much of the database's internal complexity from the application programs and users. The application program might be written by a programmer using a programming language such as Python, Visual Basic, C++ or Java, or it might be created through a DBMS utility program.

FIGURE 1.2 The DBMS manages the interaction between the end user and the database
[Diagram: end users send application requests to the DBMS (database management system) and receive data in return; behind the DBMS, the database structure holds the metadata and the end-user data (Customers, Invoices, Products).]
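The distinction between end-user data and metadata can be made concrete with a small sketch. It assumes SQLite, accessed through Python's standard sqlite3 module, as a stand-in DBMS; the customer table, its column names and the sample row are hypothetical and are not taken from Figure 1.2.

```python
import sqlite3

# An in-memory SQLite database stands in for the DBMS in this sketch.
conn = sqlite3.connect(":memory:")

# Defining a table creates metadata: the DBMS records the structure.
conn.execute(
    "CREATE TABLE customer (cus_id INTEGER PRIMARY KEY, cus_name TEXT NOT NULL)"
)

# Inserting a row adds end-user data: a raw fact of interest to the end user.
conn.execute("INSERT INTO customer (cus_name) VALUES ('Thobile Cele')")

# The application asks the DBMS for end-user data ...
end_user_data = conn.execute("SELECT cus_id, cus_name FROM customer").fetchall()

# ... and can also ask for the metadata that describes that data.
metadata = conn.execute(
    "SELECT name, sql FROM sqlite_master WHERE type = 'table'"
).fetchall()
columns = [row[1] for row in conn.execute("PRAGMA table_info(customer)")]

print("End-user data:", end_user_data)
print("Table definition:", metadata[0][1])
print("Column names:", columns)
```

In both cases the program issues a request and the DBMS translates it into operations on the stored files, so the application never reads those files directly.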
Having a DBMS between the end user's applications and the database offers some important advantages. First, the DBMS enables the data in the database to be shared among multiple applications or users. Second, the DBMS integrates the many different users' views of the data into a single all-encompassing data repository.
Because data are the crucial raw material from which information is derived, you need a good way of managing such data. As you will discover in this book, the DBMS helps make data management more efficient and effective. In particular, a DBMS provides advantages such as:
• Improved data sharing. The DBMS helps create an environment in which end users have better access to more and better-managed data. Such access makes it possible for end users to respond quickly to changes in their environment.
• Better data integration. Wider access to well-managed data promotes an integrated view of the organisation's operations and a clearer view of the big picture. It becomes much easier to see how actions in one segment of the company affect other segments.
• Minimised data inconsistency. Data inconsistency exists when different versions of the same data appear in different places. For example, data inconsistency exists when a company's sales department stores a sales representative's name as 'Thobile Cele' and the company's personnel department stores that same person's name as 'Bathobile M. Cele', or when the company's regional sales office shows the price of product 'X' as R390.00 (South African currency) and its national sales office shows the same product's price as R350.00. The probability of data inconsistency is greatly reduced in a properly designed database.
• Improved data access. The DBMS makes it possible to produce quick answers to ad hoc queries.
From a database perspective, a query is a specific request for data manipulation (for example,
to read or update the data) issued to the DBMS. Simply put, a query is a question and an ad hoc query is a spur-of-the-moment question. The DBMS sends back an answer (called the query result set) to the application. For example, end users, when dealing with large amounts of sales data, might want quick answers to questions (ad hoc queries) such as:
  • What was the volume of sales by product during the past six months?
  • What is the sales bonus figure for each of our salespeople during the past three months?
  • How many of our customers have credit balances of R5 000 (or 3 000) or more?
• Improved decision making. Better-managed data and improved data access make it possible to generate better-quality information, on which better decisions are based.
• Increased end-user productivity. The availability of data, combined with the tools that transform data into usable information, empowers end users to make quick, informed decisions that can be the difference between success and failure in the global economy.
The advantages of using a DBMS are not limited to the few just listed. In fact, you will discover many more advantages as you learn more about the technical details of databases and their proper design.
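To make the idea of an ad hoc query and its result set concrete, the following sketch poses the credit balance question to a small database. It assumes SQLite through Python's sqlite3 module as the DBMS; the customer table, its columns, the four sample rows and the helper customers_with_balance_at_least are all hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customer (cus_name TEXT, cus_balance REAL)")
conn.executemany(
    "INSERT INTO customer VALUES (?, ?)",
    [("A", 7200.0), ("B", 1500.0), ("C", 5000.0), ("D", 4999.99)],
)


def customers_with_balance_at_least(connection, threshold):
    """Issue an ad hoc query to the DBMS and return the query result set."""
    cursor = connection.execute(
        "SELECT cus_name, cus_balance FROM customer "
        "WHERE cus_balance >= ? ORDER BY cus_name",
        (threshold,),
    )
    return cursor.fetchall()


# "How many of our customers have credit balances of R5 000 or more?"
result_set = customers_with_balance_at_least(conn, 5000)
print(len(result_set), "customers:", result_set)
```

The question is phrased once, at the moment it arises, and the DBMS does the work of locating the matching rows; no report program has to be written in advance.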
1.2.2 Types of Databases
A DBMS can support many different types of databases. Databases can be classified according to the number of users supported, where the data are located, the type of data stored, the intended data usage and the degree to which the data are structured.
The number of users determines whether the database is classified as single-user or multi-user. A single-user database supports only one user at a time. In other words, if user A is using the database, users B and C must wait until user A is done. A single-user database that runs on a personal computer is called a desktop database. In contrast, a multi-user database supports multiple users at the same time. When the multi-user database supports a relatively small number of users (usually fewer than 50) or a specific department within an organisation, it is called a workgroup database. When the database is used by the entire organisation and supports many users (more than 50, usually hundreds) across many departments, the database is known as an enterprise database.
Location might also be used to classify the database. For example, a database that supports data located at a single site is called a centralised database. A database that supports data distributed across several different sites is called a distributed database. The extent to which a database can be distributed, and the way in which such distribution is managed, is addressed in detail in Chapter 14, Distributed Databases.
The most popular way of classifying databases today, however, is based on how they will be used and on the time sensitivity of the information gathered from them. For example, transactions such as product or service sales, payments and supply purchases reflect critical day-to-day operations. Such transactions must be recorded accurately and immediately. A database that is designed primarily to support a company's day-to-day operations is classified as an operational database, also referred to as an online transaction processing (OLTP), transactional or production database.
Analytical databases focus primarily on storing historical data and business metrics used exclusively for tactical or strategic decision making. Such analysis typically requires extensive 'data massaging' (data manipulation) to produce information on which to base pricing decisions, sales forecasts, market strategies and so on. Analytical databases allow the end user to perform advanced analysis of business data using sophisticated tools.
Typically, analytical databases comprise two main components: a data warehouse and an online analytical processing (OLAP) front end. The data warehouse is a specialised database that stores data in a format optimised for decision support. The data warehouse contains historical data obtained from the operational databases as well as data from other external sources. Online analytical processing is a set of tools that work together to provide an advanced data analysis environment for retrieving, processing and modelling data from the data warehouse. In recent times, this area of database application has grown in importance and usage, to the point that it has evolved into its own discipline: business intelligence. The term business intelligence describes a comprehensive approach to capturing and processing business data with the purpose of generating information to support business decision making. (See Chapter 15, Databases for Business Intelligence.)
In contrast to an operational database, a data warehouse focuses primarily on storing data used to generate information required to make tactical or strategic business decisions. Such decisions typically require extensive 'data massaging' (data manipulation) to extract information to formulate pricing decisions, sales forecasts and market positioning. Most decisions supported by data warehouse data are based on historical data obtained from operational databases. Additionally, the data warehouse can store data derived from many sources. To make it easier to retrieve such data, the data warehouse structure is quite different from that of an operational or transactional database. The design, implementation and use of data warehouses are covered in detail in Chapter 15, Databases for Business Intelligence.
Databases can also be classified to reflect the degree to which the data are structured. Unstructured data are data that exist in their original (raw) state, that is, in the format in which they were collected. Therefore, unstructured data exist in a format that does not lend itself to the processing that yields information. Structured data are the result of formatting unstructured data to facilitate storage, use and the generation of information. You apply structure (format) based on the type of processing that you intend to perform on the data. Some data might not be ready (unstructured) for some types of processing, but they might be ready (structured) for other types of processing. For example, the data value 37890 might refer to a postal code, a sales value or a product code. If this value represents a postal code or a product code and is stored as text, you cannot perform mathematical computations with it. On the other hand, if this value represents a sales transaction, it needs to be formatted as numeric.
To illustrate the structure concept further, imagine a stack of printed paper invoices. If you want to merely store these invoices as images for future retrieval and display, you can scan them and save them in a graphic format. On the other hand, if you want to derive information such as monthly totals and average sales, such graphic storage would not be useful. Instead, you could store the invoice data in a (structured) spreadsheet format so that you can perform the requisite computations. Actually, most data you encounter are best classified as semi-structured. Semi-structured data are data that have already been processed to some extent. For example, if you look at a typical Web page, the data are presented to you in a prearranged format to convey some information.
The database types mentioned thus far focus on the storage and management of highly structured data. However, corporations are not limited to the use of structured data. They also use semi-structured and unstructured data. Just think of the valuable information that can be found in company emails, memos, documents such as procedures and rules, Web pages, etc. Unstructured and semi-structured data storage and management needs are being addressed through a new generation of databases known as XML databases. Extensible Markup Language (XML) is a special language used to represent and manipulate data elements in a textual format. An XML database supports the storage and management of semi-structured XML data. XML databases will be discussed in more detail in Chapter 16, Database Connectivity and Web Technologies.
Table 1.1 compares features of several well-known database management systems.
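TABLE 1.1 Types of databases
Column groups: Number of Users (Single User; Multi-user: Workgroup, Enterprise), Data Location (Centralised, Distributed), Data Usage (Operational, Analytical), XML

Product | Single User | Workgroup | Enterprise | Centralised | Distributed | Operational | Analytical | XML
MS Access | (five X marks in the source; column positions not recoverable)
MS SQL Server | X3 | X | X | X | X | X | X | X
IBM DB2 | X3 | X | X | X | X | X | X | X
MySQL | X | X | X | X | X | X | X | X
Oracle RDBMS | X3 | X | X | X | X | X | X | X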
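The point made earlier in this section about the value 37890, and about XML as a textual format for semi-structured data, can be sketched in Python as follows. The variable names and the small invoice document are hypothetical; the XML is parsed with the standard xml.etree.ElementTree module.

```python
import xml.etree.ElementTree as ET

raw_value = "37890"

# Structured as text: suitable for a postal code or a product code.
postal_code = raw_value
concatenated = postal_code + "0"     # string operation, not arithmetic

# Structured as a number: suitable for a sales value.
sales_value = float(raw_value)
total = sales_value + 110.0          # arithmetic is now meaningful

# Semi-structured data: a hypothetical invoice in a textual XML format.
xml_text = "<invoice><number>1001</number><amount>37890</amount></invoice>"
invoice = ET.fromstring(xml_text)
amount = float(invoice.findtext("amount"))

print(concatenated, total, amount)
```

The same five characters support different processing depending on the structure applied to them, which is why the intended use of the data should be decided before its format is fixed.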
All the database management systems shown in Table 1.1 (except MySQL) are commercial DBMS products provided by large-scale vendors and require a significant investment in the actual DBMS, its ongoing support and maintenance. MySQL7 is an open source database management system. The main benefit of open source software is that it is free to acquire and use the product itself. However, there will be costs involved in the development of applications and in ongoing support. The idea of open source software is that the source code is made available to the general public, which allows users to develop, improve, modify and distribute a database product for any purpose. Any improvements will then be released back to the public. A disadvantage of open source database software is that it does not always stick to fundamental database principles or provide the robust functionality and durability of commercial products. Typically, open source database systems are easier to use than large-scale commercial DBMS products, which makes them an ideal choice for smaller organisations. Together, Linux, the Apache Web server, MySQL and the PHP/Python/Perl languages, known as the LAMP stack, provide the basic building blocks for developing database-centred websites. Currently, the most popular open source database systems are MySQL and PostgreSQL8.
With the emergence of the World Wide Web and internet-based technologies as the basis for the new social media generation, great amounts of data are being stored and analysed. Social media refers to Web and mobile technologies that enable anytime, anywhere, always on human interactions. Websites such as Google, Facebook, Twitter, Instagram and LinkedIn capture vast amounts of data about end users and consumers. These data grow exponentially and require the use of specialised database systems. Over the past few years, this new breed of specialised database has grown in sophistication and widespread usage. Currently, this new type of database is known as a NoSQL database. The term NoSQL9 (Not only SQL) is generally used to describe a new generation of database management systems that is not based on the traditional relational database model. You will learn more about NoSQL in Chapter 16 Big Data and NoSQL.

7 mysql.com. Available: www.mysql.com/
8 PostGres. Available: www.postgresql.org/
9 NoSQL. Available: http://nosql-database.org/
NOTE
Most of the database design, implementation and management issues addressed in this book are based on production (transactional) databases. The focus on production databases is based on two considerations. First, production databases are the databases most frequently encountered in common activities such as enrolling in a class, registering a car, buying a product or making a bank deposit or withdrawal. Second, data warehouse databases derive most of their data from production databases and if production databases are poorly designed, the data warehouse databases based on them will lose their reliability and value as well.

1.3 WHY DATABASE DESIGN IS IMPORTANT
Database design refers to the activities that focus on the design of the database structure that will be used to store and manage end-user data. A database that meets all user requirements does not just happen; its structure must be designed carefully. In fact, database design is such a crucial aspect of working with databases that most of this book is dedicated to the development of good database design techniques. Even a good DBMS will perform poorly with a badly designed database.
Proper database design requires the database designer to identify precisely the database's expected use. Designing a transactional database emphasises accurate and consistent data and operational speed. The design of a data warehouse database recognises the use of historical and aggregated data. Designing a database to be used in a centralised, single-user environment requires a different approach from that used in the design of a distributed, multi-user database. This book emphasises the design of transactional, centralised, single-user and multi-user databases. Chapters 14 and 15 also examine critical issues confronting the designer of distributed and data warehouse databases.
A well-designed database facilitates data management and generates accurate and valuable information. A poorly designed database is likely to become a breeding ground for difficult-to-trace errors that may lead to bad decision making, and bad decision making can lead to the failure of an organisation. Database design is simply too important to be left to luck. That's why university students study database design, why organisations of all types and sizes send personnel to database design seminars and why database design consultants often make an excellent living.

1.4 HISTORICAL ROOTS: FILES AND DATA PROCESSING
Understanding what a database is, what it does and the proper way to use it can be clarified by considering what a database is not. A brief explanation of the evolution of file system data processing can be helpful in understanding the data access limitations that databases attempt to overcome. Understanding these limitations is relevant to database designers and developers because database technologies do not make these problems magically disappear; database technologies simply make it easier to create solutions that avoid these problems. Creating database designs that avoid the pitfalls of earlier systems requires that the designer understands these problems and how to avoid them; otherwise, the database technologies are no better (and are potentially even worse!) than the technologies and techniques they have replaced.
1.4.1 Manual File Systems
To be successful, an organisation must develop systems for handling core business tasks. Historically, such systems were often manual, paper-and-pencil systems. The papers within these systems were organised to facilitate the expected use of the data. Typically, this was accomplished through a system of file folders and filing cabinets. As long as a collection of data was relatively small and an organisation's business users had few reporting requirements, the manual system served its role well as a data repository. However, as organisations grew and as reporting requirements became more complex, keeping track of data in a manual file system became more difficult. Therefore, companies looked to computer technology for help.
1.4.2 Computerised File Systems
Generating reports from manual file systems was slow and cumbersome. In fact, some business managers faced government-imposed reporting requirements that led to weeks of intensive effort each quarter, even when a well-designed manual system was used. Therefore, a data processing (DP) specialist was hired to create a computer-based system that would track data and produce required reports.
Initially, the computer files within the file system were similar to the manual files. A simple example of a customer data file for a small insurance company is shown in Figure 1.3. (You will discover later that the file structure shown in Figure 1.3, although typically found in early file systems, is unsatisfactory for a database.) The description of computer files requires a specialised vocabulary. Every discipline develops its own terminology to enable its practitioners to communicate clearly. The basic file vocabulary shown in Table 1.2 will help you to understand subsequent discussions more easily.
Online Content
The databases used in the chapters are available on the online platform accompanying this book. Throughout the book, Online Content boxes highlight material related to chapter content located on the online platform. Please see the prelims for details on how to access these useful online resources.

TABLE 1.2 Basic file terminology
Term | Definition
Data | Raw facts, such as a telephone number, a birth date, a customer name and a year-to-date (YTD) sales value. Data have little meaning unless they have been organised in some logical manner. The smallest piece of data that can be recognised by the computer is a single character, such as the letter A, the number 5 or a symbol such as /. A single character requires 1 byte of computer storage.
Field | A character or group of characters (alphabetic or numeric) that has a specific meaning. A field is used to define and store data.
Record | A logically connected set of one or more fields that describes a person, place or thing. For example, the fields that constitute a record for a customer named J. D. Rudd might consist of J. D. Rudd's name, address, phone number, date of birth, credit limit and unpaid balance.
File | A collection of related records. For example, a file might contain data about vendors of ROBCOR Company, or a file might contain the records for the students currently enrolled at Gigantic University.
FIGURE 1.3 Contents of the CUSTOMER file
Columns: C_NAME | C_PHONE | C_ADDRESS | C_POSTCODE | A_NAME | A_PHONE | TP | AMT | REN

C_NAME = Customer name
C_PHONE = Customer phone
C_ADDRESS = Customer address
C_POSTCODE = Customer postcode
A_NAME = Agent name
A_PHONE = Agent phone
TP = Insurance type
AMT = Insurance policy amount, in thousands of euro
REN = Insurance renewal date

Using the proper file terminology given in Table 1.2, you can identify the file components shown in Figure 1.3. The CUSTOMER file shown in Figure 1.3 contains ten records. Each record is composed of nine fields: C_NAME, C_PHONE, C_ADDRESS, C_POSTCODE, A_NAME, A_PHONE, TP, AMT and REN. The ten records are stored in a named file. Because the file in Figure 1.3 contains customer data, its filename is CUSTOMER.
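The vocabulary of Table 1.2 maps naturally onto simple data structures. The following Python sketch is illustrative only: the class name CustomerRecord and the sample values are assumptions made for this example, not a reproduction of the Figure 1.3 data. It models a record as a set of nine named fields and a file as a named collection of such records.

```python
from dataclasses import dataclass, fields


@dataclass
class CustomerRecord:
    """One record: a logically connected set of fields describing a customer."""
    C_NAME: str
    C_PHONE: str
    C_ADDRESS: str
    C_POSTCODE: str
    A_NAME: str
    A_PHONE: str
    TP: str
    AMT: float  # insurance policy amount, in thousands of euro
    REN: str    # insurance renewal date


# A file is a named collection of related records.
# The values below are made up for illustration.
CUSTOMER = [
    CustomerRecord("J. D. Rudd", "0181-555-0100", "1 Example Rd", "N6 1AA",
                   "Alex B. Alby", "0161-228-1249", "T1", 100.00, "05-Apr-2018"),
]

field_names = [f.name for f in fields(CustomerRecord)]
print(len(field_names), "fields per record:", ", ".join(field_names))
print(len(CUSTOMER), "record(s) in the CUSTOMER file")
```

Each attribute corresponds to a field, each CustomerRecord instance to a record, and the CUSTOMER list to the file.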
When business users wanted data from the computerised file, they sent requests for the data to the DP specialist. For each request, the DP specialist had to create programs to retrieve the data from the file, manipulate it in whatever manner the user had requested, and present it as a printed report. If a request was for a report that had been run previously, the DP specialist could rerun the existing program and provide the printed results to the user. As other business users saw the new and innovative ways in which customer data were being reported, they wanted to be able to view their data in similar fashions. This generated more requests for the DP specialist to create more computerised files of other business data, which in turn meant that more data management programs had to be created, and more requests for reports. For example, the sales department at the insurance company created a file named SALES, which helped track daily sales efforts. The sales department's success was so obvious that the personnel department manager demanded access to the DP specialist to automate payroll processing and other personnel functions. Consequently, the DP specialist was asked to create the AGENT file shown in Figure 1.4. The data in the AGENT file were used to do electronic fund transfers (EFTs), keep track of taxes paid and summarise insurance coverage, among other tasks.

FIGURE 1.4 Contents of the AGENT file

A_NAME | A_PHONE | A_ADDRESS | POSTCODE | HIRED | YTD_PAY | YTD_IT | YTD_NI | YTD_SLS | DEP
Alex B. Alby | 0161-228-1249 | Deken Van Erpstraat 20, Best | 5492 | 01-Nov-2001 | 20 806.00 | 5 201.00 | 1 664.00 | 103 963.00 | 3
Nkita F. Brown | 27-21-410-7100 | West Quay Road, Waterfront, Cape Town | 8002 | 23-May-2004 | 25 230.00 | 6 308.00 | 2 018.00 | 108 844.00 | 0
Menzi T. Ndlovu | 0181-123-5589 | 452 Elm St., Parkview, Johannesburg | 2193 | 15-Jun-2003 | 18 169.00 | 4 542.00 | 1 453.00 | 99 548.00 | 2

A_NAME = Agent name
A_PHONE = Agent phone
A_ADDRESS = Agent address
POSTCODE = Agent postcode
HIRED = Agent date of hire
YTD_PAY = Year-to-date pay
YTD_IT = Year-to-date income tax paid
YTD_NI = Year-to-date national insurance paid
YTD_SLS = Year-to-date sales
DEP = Number of dependents

As the number of files increased, a small file system, like the one shown in Figure 1.5, evolved. Each file in the system used its own application programs to store, retrieve and modify data. And each file was owned by the individual or the department that commissioned its creation.

As the file system grew, the demand for the DP specialist's programming skills grew even faster, and the DP specialist was authorised to hire additional programmers. The size of the file system also required a larger, more complex computer. The new computer and the additional programming staff caused the DP specialist to spend less time programming and more time managing technical and human resources. Therefore, the DP specialist's job evolved into that of a data processing (DP) manager, who supervised a DP department. In spite of these organisational changes, however, the DP department's primary activity remained programming, and the DP manager inevitably spent much time as a supervising senior programmer and program troubleshooter.
FIGURE 1.5 A simple file system
(Diagram: the Sales department uses file management programs and file report programs to work with the CUSTOMER file and the SALES file; the Personnel department uses its own file management programs and file report programs to work with the AGENT file.)

1.5 PROBLEMS WITH FILE SYSTEM DATA MANAGEMENT
The file system method of organising and managing data was a definite improvement on a manual system and served a useful purpose in data management for over two decades, a very long timespan in the computer era. Nonetheless, many problems and limitations became evident in this approach. A critique of the file system method serves two major purposes:

• Understanding the shortcomings of the file system enables you to understand the development of modern databases.
• Many of the problems are not unique to file systems. Failure to understand such problems is likely to lead to their duplication in a database environment, even though database technology makes it easy to avoid them.

The following problems severely challenge the types of information that can be created from the data as well as the accuracy of the information:
Lengthy development times. The first and most glaring problem with the file system approach is that even the simplest data-retrieval task requires extensive programming. With the older file systems, programmers had to specify what must be done and how to do it. As you will learn in upcoming chapters, modern databases use a non-procedural data manipulation language that allows the user to specify what must be done without specifying how.
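The contrast between specifying how and specifying what can be sketched in Python. The example is hypothetical: the sample rows and the table name customer are invented for illustration, and SQLite (through Python's standard sqlite3 module) stands in for the kind of non-procedural language the text refers to. The first function spells out every retrieval step, as a file system program would; the second states only the desired result.

```python
import sqlite3

# Made-up sample data: (customer name, insurance type, policy amount).
ROWS = [("Ramas", "T1", 100.00), ("Olowski", "S2", 150.00), ("Farriss", "T1", 250.00)]


def names_procedural(rows, policy_type):
    """File system style: the program says HOW (loop, test, collect, sort)."""
    result = []
    for name, tp, _amt in rows:
        if tp == policy_type:
            result.append(name)
    result.sort()
    return result


def names_declarative(rows, policy_type):
    """Database style: the query says WHAT is wanted; the DBMS decides how."""
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE customer (c_name TEXT, tp TEXT, amt REAL)")
    con.executemany("INSERT INTO customer VALUES (?, ?, ?)", rows)
    cur = con.execute(
        "SELECT c_name FROM customer WHERE tp = ? ORDER BY c_name", (policy_type,)
    )
    names = [row[0] for row in cur]
    con.close()
    return names


print(names_procedural(ROWS, "T1"))
print(names_declarative(ROWS, "T1"))
```

Both functions return the same list; the difference is that a new question in the second style needs only a new query, not a new program.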
Difficulty in getting quick answers. The need to write programs to produce even the simplest reports makes ad hoc queries impossible. DP specialists who work with mature file systems often receive numerous requests for new reports. They are often forced to say that the report will be ready next week or even next month. If you need the information now, getting it next week or next month will not serve your information needs.
Complex system administration. System administration becomes more difficult as the number of files in the system expands. Even a simple file system with a few files requires creating and maintaining several file management programs (each file must have its own file management programs that allow the user to add, modify and delete records; to list the file contents; and to generate reports). Because ad hoc queries are not possible, the file reporting programs can multiply quickly. The problem is compounded by the fact that each department in the organisation owns its data by creating its own files.

Lack of security and limited data sharing. Another fault of a file system data repository is a lack of security and limited data sharing. Data sharing and security are closely related. Sharing data among multiple geographically dispersed users introduces a lot of security risks. In terms of creating data management and reporting programs, security and data-sharing features are difficult to program and are, consequently, often omitted in a file system environment. Such features include effective password protection, the ability to lock out parts of files or parts of the system itself, and other measures designed to safeguard data confidentiality. Even when an attempt is made to improve system and data security, the security devices tend to be limited in scope and effectiveness.

Extensive programming. Making changes to an existing file structure can be difficult in a file system environment. For example, changing just one field in the original CUSTOMER file would require a program that:

1 Reads a record from the original file.
2 Transforms the original data to conform to the new structure's storage requirements.
3 Writes the transformed data into the new file structure.
4 Repeats the preceding steps for each record in the original file.

In fact, any change to a file structure, no matter how minor, forces modifications in all of the programs that use the data in that file. Modifications are likely to produce errors (bugs), and additional time is spent using a debugging process to find those errors. Those limitations, in turn, lead to problems of structural and data dependence.

1.5.1 Structural and Data Dependence

A file system exhibits structural dependence; that is, access to a file is dependent on its structure. For example, adding a customer date-of-birth field to the CUSTOMER file shown in Figure 1.3 would require the steps described in the previous section. Given this change, none of the previous programs will work with the new CUSTOMER file structure. Therefore, all of the file system programs must be modified to conform to the new file structure. In short, because the file system application programs are affected by change in the file structure, they exhibit structural dependence. Conversely, structural independence exists when it is possible to make changes in the file structure without affecting the application program's ability to access the data.

Even changes in the characteristics of data, such as changing a field from integer to decimal, require changes in all the programs that access the file. Because all data access programs are subject to change when any of the file's data storage characteristics change (that is, changing the data type), the file system is said to exhibit data dependence. Conversely, data independence exists when it is possible to make changes in the data storage characteristics without affecting the program's ability to access the data.

The practical significance of data dependence is the difference between the logical data format (how the human being views the data) and the physical data format (how the computer sees the data). Any program that accesses a file system's file must tell the computer not only what to do, but also how to do it. Consequently, each program must contain lines that specify the opening of a specific file type, its record specification and its field definitions. Data dependence makes the file system extremely cumbersome from a programming and data management point of view.
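The conversion steps, and the structural dependence they expose, can be sketched as follows. This is a minimal sketch under stated assumptions: the comma-separated layout, the C_DOB field name and the sample rows are invented for the example, and in-memory text buffers stand in for files on disk.

```python
import csv
import io

OLD_FIELDS = ["C_NAME", "C_PHONE"]
NEW_FIELDS = ["C_NAME", "C_PHONE", "C_DOB"]  # C_DOB is a hypothetical new field

# Sample records, made up for illustration.
old_file = io.StringIO("Ramas,0181-555-0101\nOlowski,0181-555-0102\n")


def convert(old, new):
    """Read each record, transform it to the new structure, write it out."""
    writer = csv.writer(new, lineterminator="\n")
    for record in csv.reader(old):       # read a record; repeat for every record
        transformed = record + [""]      # transform: add an empty date of birth
        writer.writerow(transformed)     # write in the new structure


def list_customers(file, field_names):
    """A report program that depends on the exact number of fields per record."""
    report = []
    for record in csv.reader(file):
        if len(record) != len(field_names):
            raise ValueError("record does not match the expected file structure")
        report.append(dict(zip(field_names, record)))
    return report


new_file = io.StringIO()
convert(old_file, new_file)

new_file.seek(0)
try:
    list_customers(new_file, OLD_FIELDS)   # the old program, unchanged
    old_program_works = True
except ValueError:
    old_program_works = False              # structural dependence in action

new_file.seek(0)
report = list_customers(new_file, NEW_FIELDS)  # the program after modification
print(old_program_works, report[0])
```

The report program fails against the converted file until its own field list is edited, which is the essence of structural dependence: every program that touches the file carries a private copy of the file's structure.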
1.5.2 Field Definitions and Naming Conventions
At first glance, the CUSTOMER file shown in Figure 1.3 appears to have served its purpose well: requested reports could usually be generated. But suppose you want to create a customer phone directory based on the data stored in the CUSTOMER file. Storing the customer name as a single field turns out to be a liability because the directory must break up the field contents to list the last names, first names and initials in alphabetical order. Or suppose you want to get a customer listing by area code. Including the area code in the phone number field is inefficient. Similarly, producing a listing of customers by city is a more difficult task than is necessary. From the user's point of view, a much better (more flexible) record definition would be one that anticipates reporting requirements by breaking up fields into their component parts. Thus, the CUSTOMER file's fields might be listed as shown in Table 1.3.

TABLE 1.3 Sample customer file fields

Field | Contents | Sample entry
CUS_LNAME | Customer last name | Ramas
CUS_FNAME | Customer first name | Alfred
CUS_INITIAL | Customer initial | A
CUS_AREACODE | Customer area code | 1615
CUS_PHONE | Customer phone | 0161-234-5678
CUS_ADDRESS | Customer street address or box number | 123 Green Meadow Lane
CUS_CITY | Customer city | East London
CUS_COUNTY | Customer county/district | Eastern Cape
CUS_POSTCODE | Customer postcode | 3001

Selecting proper field names is also important. For example, make sure that the field names are reasonably descriptive. In examining the file structure shown in Figure 1.3, it is not obvious that the field name REN represents the customer's insurance renewal date. Using the field name CUS_RENEW_DATE would be better for two reasons. First, the prefix CUS can be used as an indicator of the field's origin, which is the CUSTOMER file. Therefore, you know that the field in question yields a CUSTOMER property. Second, the RENEW_DATE portion of the field name is more descriptive of the field's contents. With proper naming conventions, the file structure becomes self-documenting. That is, by simply looking at the field names, you can determine which files the fields belong to and what information the fields are likely to contain.

Some software packages place restrictions on the length of field names, so it is wise to be as descriptive as possible within those restrictions. In addition, very long field names make it difficult to fit more than a few fields on a page, thus making output spacing a problem. For example, the field name CUSTOMER_INSURANCE_RENEWAL_DATE, while being self-documenting, is less desirable than CUS_RENEW_DATE.

Another problem in Figure 1.3's CUSTOMER file is the difficulty of finding desired data efficiently. The CUSTOMER file currently does not have a unique record identifier. For example, it is possible to have several customers named James G. Khumalo. Consequently, the addition of a CUS_ACCOUNT field that contains a unique customer account number would be appropriate.
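To see why a single name field is a liability, consider the following sketch. It is illustrative only: the parsing rule (the last word is the last name) is an assumption that real names often violate, and the three customer names are used purely as sample values. Sorting a phone directory requires parsing the combined field, whereas separate CUS_LNAME, CUS_FNAME and CUS_INITIAL fields can be sorted directly.

```python
# Single-field design: the whole name is stored in C_NAME.
single_field = ["Alfred A. Ramas", "Paul F. Olowski", "Anne G. Farriss"]


def last_name_guess(c_name):
    """Fragile parsing: assumes the last word is the last name."""
    return c_name.split()[-1]


directory_from_single = sorted(single_field, key=last_name_guess)

# Component-field design: each part of the name has its own field.
component_fields = [
    {"CUS_LNAME": "Ramas", "CUS_FNAME": "Alfred", "CUS_INITIAL": "A"},
    {"CUS_LNAME": "Olowski", "CUS_FNAME": "Paul", "CUS_INITIAL": "F"},
    {"CUS_LNAME": "Farriss", "CUS_FNAME": "Anne", "CUS_INITIAL": "G"},
]
directory_from_components = sorted(
    component_fields,
    key=lambda r: (r["CUS_LNAME"], r["CUS_FNAME"], r["CUS_INITIAL"]),
)

print(directory_from_single)
print([r["CUS_LNAME"] for r in directory_from_components])
```

The first approach works only as long as every stored name follows the assumed pattern; the second needs no parsing at all, which is the flexibility the component-field record definition is meant to provide.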
The criticisms of field definitions and naming conventions shown in the file structure of Figure 1.3 are not unique to file systems. Because such conventions will prove to be important later, they are introduced early. You will revisit field definitions and naming conventions when you learn about database design in Chapter 5, Data Modelling with Entity Relationship Diagrams, and in Chapter 6, Data Modelling Advanced Concepts; and when you learn about database implementation issues in Chapter 11, Conceptual, Logical, and Physical Database Design. Regardless of the data environment, the design, whether it involves a file system or a database, must always reflect the designer's documentation needs and the end users' reporting and processing requirements. Both types of needs are best served by adhering to proper field definitions and naming conventions.

Online Content
Appendices A to R are available on the online platform accompanying this book.

NOTE
No naming convention can fit all requirements for all systems. Some words or phrases are reserved for the DBMS's internal use. For example, the name ORDER generates an error in some DBMSs. Similarly, your DBMS might interpret a hyphen (-) as a command to subtract. Therefore, the field CUS-NAME would be interpreted as a command to subtract the NAME field from the CUS field. Because neither field exists, you would get an error message. On the other hand, CUS_NAME would work fine because it uses an underscore.
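The behaviour described in the note can be tried out with SQLite, used here only as one convenient example of a DBMS; other systems may accept or reject different names. The table name customer, the helper try_sql and the sample row are assumptions made for this sketch.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE customer (CUS_NAME TEXT)")
con.execute("INSERT INTO customer VALUES ('Ramas')")


def try_sql(statement):
    """Run a statement and report whether the DBMS accepted it."""
    try:
        con.execute(statement)
        return "ok"
    except sqlite3.OperationalError as exc:
        return f"error: {exc}"


results = {
    # An underscore is an ordinary identifier character.
    "underscore": try_sql("SELECT CUS_NAME FROM customer"),
    # A hyphen is read as subtraction: CUS minus NAME, and neither field exists.
    "hyphen": try_sql("SELECT CUS-NAME FROM customer"),
    # ORDER is a reserved word, so it cannot be used as a plain field name.
    "reserved": try_sql("CREATE TABLE t (ORDER TEXT)"),
}
for label, outcome in results.items():
    print(label, "->", outcome)
```

Only the underscore version is accepted; the other two statements are rejected before any data are touched.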
1.5.3 Data Redundancy

The file system's structure and lack of security make it difficult to combine data from multiple sources. The organisational structure promotes the storage of the same basic data in different locations. (Database professionals use the term islands of information for such scattered data locations.) As it is unlikely that data stored in different locations will always be updated consistently, the islands of information often contain different versions of the same data. For example, in Figures 1.3 and 1.4, the agent names and phone numbers occur in both the CUSTOMER and the AGENT files. You need only one correct copy of the agent names and phone numbers. Having them occur in more than one place produces data redundancy. Data redundancy exists when the same data are stored unnecessarily at different places.

Uncontrolled data redundancy sets the stage for:

Poor data security. Having multiple copies of data increases the chances of a copy of the data being susceptible to unauthorised access.

Data inconsistency. Data inconsistency exists when different and conflicting versions of the same data appear in different places. For example, suppose you change an agent's phone number or address in the AGENT file. If you forget to make corresponding changes in the CUSTOMER file, the files contain different data for the same agent. Reports will yield inconsistent results depending on which version of the data is used.
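The inconsistency scenario just described can be sketched in a few lines of Python. The dictionaries below are simplified stand-ins for the CUSTOMER and AGENT files, the customer-to-agent pairing is chosen for illustration, and the new phone number is invented. The agent's phone number is changed in the AGENT data only, and a simple check then reveals the conflicting versions.

```python
# Simplified stand-ins for the two files; each keeps its own copy of the phone.
agent_file = {"Alex B. Alby": "0161-228-1249"}
customer_file = [
    {"C_NAME": "Paul F. Olowski", "A_NAME": "Alex B. Alby", "A_PHONE": "0161-228-1249"},
    {"C_NAME": "Anne G. Farriss", "A_NAME": "Alex B. Alby", "A_PHONE": "0161-228-1249"},
]

# The agent's number changes, but only the AGENT file is updated.
agent_file["Alex B. Alby"] = "0161-555-0199"  # hypothetical new number


def find_inconsistencies(customers, agents):
    """Return customer records whose agent phone disagrees with the AGENT file."""
    return [
        rec for rec in customers
        if agents.get(rec["A_NAME"]) != rec["A_PHONE"]
    ]


conflicts = find_inconsistencies(customer_file, agent_file)
print(len(conflicts), "customer record(s) hold a stale agent phone number")
```

Every redundant copy is a place where such a mismatch can arise, and a report built from the CUSTOMER data would now show a different number from one built from the AGENT data.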
NOTE
Data that display data inconsistency are also referred to as data that lack data integrity. Data integrity is defined as the condition in which all of the data in the database are consistent with the real-world events and conditions. In other words, data integrity means that:
• Data are accurate; there are no data inconsistencies.
• Data are verifiable; the data will always yield consistent results.

Data entry errors are more likely to occur when complex entries (such as 12-digit phone numbers) are made in several different files and/or recur frequently in one or more files. In fact, the CUSTOMER file shown in Figure 1.3 contains just such an entry error: the third record in the CUSTOMER file has a transposed digit in the agent's phone number (27-12-410-7100 rather than 27-21-410-7100).

It is possible to enter a non-existent sales agent's name and phone number into the CUSTOMER file, but customers are not likely to be impressed if the insurance agency supplies the name and phone number of an agent who does not exist. And should the personnel manager allow a non-existent agent to accrue bonuses and benefits? In fact, a data entry error such as an incorrectly spelled name or an incorrect phone number yields the same kind of data integrity problems.

Data anomalies. The dictionary defines anomaly as an abnormality. Ideally, a field value change should be made in only a single place. Data redundancy, however, fosters an abnormal condition by forcing field value changes in many different locations. Look at the CUSTOMER file in Figure 1.3. If agent Nikita F. Brown decides to get married and move, the agent name, address and phone number are likely to change. Instead of making just a single name and/or phone/address change in a single file (AGENT), you also must make the change each time that agent's name, phone number and address occur in the CUSTOMER file. You could be faced with the prospect of making hundreds of corrections, one for each of the customers served by that agent! The same problem occurs when an agent decides to quit. Each customer served by that agent must be assigned a new agent. Any change in any field value must be correctly made in many places to maintain data integrity. A data anomaly develops when all of the required changes in the redundant data are not made successfully. The data anomalies found in Figure 1.3 are commonly defined as follows:

• Update anomalies. If agent Nikita F. Brown has a new phone number, that number must be entered in each of the CUSTOMER file records in which Ms Brown's phone number is shown. In this case, only three changes must be made. In a large file system, such changes might occur in hundreds or even thousands of records. Clearly, the potential for data inconsistencies is great.
• Insertion anomalies. If only the CUSTOMER file existed, to add a new agent, you would also add a dummy customer data entry to reflect the new agent's addition. Again, the potential for creating data inconsistencies would be great.
• Deletion anomalies. If you delete the customers Amy B. O'Brian, Saajidah Maharaj and Olette K. Snyman, you will also delete Menzi T. Ndlovu's agent data. Clearly, this is not desirable.

1.6 DATABASE SYSTEMS

The problems inherent in file systems make using a database system very desirable. In a traditional file system, several files, such as the customer master file and the product file, were stored separately. However, unlike the file system, the database consists of logically related data stored in a single logical data repository. (The logical label reflects the fact that, although the data repository appears to be a single unit to the end user, its contents may actually be physically distributed among multiple data storage facilities and/or locations.) Since the database's data repository is a single logical unit, the database represents a major change in the way end-user data are stored, accessed and managed. The database's DBMS, shown in Figure 1.6, provides numerous advantages over file system management, shown in Figure 1.5, by making it possible to eliminate most of the file system's data inconsistency, data anomaly, data dependency and structural dependency problems. Better yet, the current generation of DBMS software stores not only the data structures, but also the relationships between those structures and the access paths to those structures, all in a central location. The current generation of DBMS software also takes care of defining, storing and managing all required access paths to those components.

Remember that the DBMS is just one of several crucial components of a database system. The DBMS may even be referred to as the database system's heart. However, just as it takes more than a heart to make a human being function, it takes more than a DBMS to make a database system function. In the sections that follow, you'll learn what a database system is, what its components are and how the DBMS fits into the database system picture.

FIGURE 1.6 Contrasting database and file systems
(Diagram: in a database system, the Personnel, Sales and Accounting departments all reach a single database, holding Employees, Customers, Sales, Inventory and Accounts data, through the DBMS; in a file system, each department works directly with its own separate files.)
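One way to picture the single logical data repository is a small relational sketch. It is illustrative only: the table and column names, the two sample customers and the new phone number are assumptions for the example, and SQLite stands in for a DBMS. Because each agent's phone number is stored once and customers merely refer to the agent, a phone change is a single update that every customer listing reflects, and removing customers does not remove the agent.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
    CREATE TABLE agent (
        a_name  TEXT PRIMARY KEY,
        a_phone TEXT NOT NULL
    );
    CREATE TABLE customer (
        c_name TEXT PRIMARY KEY,
        a_name TEXT NOT NULL REFERENCES agent (a_name)
    );
    INSERT INTO agent VALUES ('Menzi T. Ndlovu', '0181-123-5589');
    INSERT INTO customer VALUES ('Saajidah Maharaj', 'Menzi T. Ndlovu');
    INSERT INTO customer VALUES ('Olette K. Snyman', 'Menzi T. Ndlovu');
""")

# One update in one place, however many customers the agent serves.
con.execute(
    "UPDATE agent SET a_phone = ? WHERE a_name = ?",
    ("0181-555-0123", "Menzi T. Ndlovu"),  # hypothetical new number
)

listing = con.execute("""
    SELECT c.c_name, a.a_phone
    FROM customer AS c JOIN agent AS a ON a.a_name = c.a_name
    ORDER BY c.c_name
""").fetchall()

# Deleting every customer leaves the agent's data intact.
con.execute("DELETE FROM customer")
agents_left = con.execute("SELECT COUNT(*) FROM agent").fetchone()[0]
print(listing, agents_left)
```

The update and deletion anomalies of the single CUSTOMER file do not arise here, because no fact about an agent is stored in more than one place.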
1.6.1 The Database System Environment

The term database system refers to an organisation of components that define and regulate the collection, storage, management and use of data within a database environment. From a general management point of view, the database system is composed of the five major parts shown in Figure 1.7: hardware, software, people, procedures and data.
CHAPTER
Lets take
a closer look
Hardware.
Hardware
(microcomputers, devices ID
refers
to
switches,
components
shown
all of the systems
mainframes,
(hubs,
readers,
at the five
workstations
routers
and fibre
in Figure
devices
and servers),
storage
and
other
Database
Approach
23
1.7:
physical
optics)
1 The
for example, devices,
devices
1
computers
printers,
(automated
network
teller
machines,
etc.).
FIguRE 1.7
The database system environment writes
Procedures
and
and standards
supervises
enforces Database Analysts
System
administrator
Database
administrator manages
designer
designs End
Hardware
Programmers
users
Application DBMS
programs
use
utilities
write DBMS
access Data
Software.
Although
the
most readily
identified
software
is the
DBMS itself,
to
make the
database
system function fully, three types of software are needed: operating system software, software, and application programs and utilities: ? Operating system software all other
software
Microsoft
to run
on the
Windows, Linux,
? DBMS software software
manages all hardware components computers.
Microsoft
and makesit possible for
of operating
system
software
include
Mac OS, UNIX and MVS.
manages the database
include
Examples
DBMS
Access
within the database system. Some examples
and
SQL Server,
Oracle
Corporations
of DBMS
Oracle and IBMs
DB2. ? Application and to
programs and utility software
manage the
computer
are used to access and
environment
in
manipulate data in the DBMS
which data access
and
manipulation
take
place.
Application programs are most commonly used to access data found within the database, and to generate reports, tabulations and other information to facilitate decision making. Utilities are the software tools used to help manage the database systems computer components. For example, all of the major DBMS vendors now provide graphical user interfaces (GUIs) to help create People.
Copyright review
2020 has
Cengage deemed
Learning. that
structures,
This component
functions,
Editorial
database
any
five types
All suppressed
Rights
Reserved. content
does
control
includes
database
all users
access
of the
and
database
monitor system.
database On the
operations.
basis of primary job
of users can beidentified in a database system: systems
May not
not materially
be
copied, affect
scanned, the
overall
or
duplicated, learning
in experience.
whole
or in Cengage
part.
Due Learning
to
electronic reserves
rights, the
right
some to
third remove
party additional
content
administrators,
may content
be
suppressed at
any
time
from if
the
subsequent
eBook rights
and/or restrictions
eChapter(s). require
it
24
PART I
Database
1
Systems
database
administrators,
Each
type,
?
user
Systems
?
database
described
administrators
Database
Database
designers
database
their
?
is
data resources,
Systems
?
design
produce
dimensions
the poor,
even the database
the
database
design
and create the
data entry
access
and
manipulate
are the
operations.
tactical
Procedures.
system.
business
used to
play
entered
into
designers
managerial
with varying with
tends
to
mainframes
In
Copyright Editorial
review
2020 has
at
addition
Cengage deemed
be low.
tends to the
any
the
book.
architects.
most to
to
application
dedicated
optimise
cover
new
programs.
programs
the
to
run
and
obtained
that
They
which end users
organisations
directors from
generated
daily
are all classified
the
database
to
as
make
of facts
they
and audit the
and use of the component
enforce
with customers.
monitor
both the
determination
are to
be organised
of is
standards are
data that
by also
enter the
data.
database.
the
the
of the
Procedures
use of that
stored in the
generated,
design
forgotten,
because
and
through
data
govern the
occasionally
a company
way to
All suppressed
Rights
to
does
May not
an
organisations
on the
can be created
Since which
data are the
data
a vital
part
are to of the
gym
be
database
the
size, its functions
at different levels
managed compare
system
the
system
alocal
may
procedures
is likely
to
and programmers;
procedures
structure.
organisations
membership
claims
many designers
management
and
For example,
microcomputer,
The insurance
are likely
to
have
the
be are
and its
gym
managed
how
corporate
of complexity membership by two
probably
simple
at least
one systems
hardware
probably
be numerous,
Just
complex
and
system
people,
the
and the
data
administrator,
includes
several
and rigorous;
and
be high. levels
account:
Reserved. content
depends
The
locations;
to
standards.
a single
different
into
is
system.
probably
multiple
fact
Learning. that
has expanded
through
and rules although
dimension
precise
DBAs and
volume
important
this
database
strive
managers
organisation
is
systems
to
claims
used is
data
the
and the
the information
how those
structure
adherence
several full-time
the
effect,
and procedures
supervisors,
collection
a new
database
an insurance
volume
the
that is
and
adds
Therefore,
hardware
application
is an organised
database
accompanying
programmers
description
reports
role in
within
which information
system
this
culture.
the
job.
A database complex
the
that
data.
a critical,
an important
conducted
from
ensure
As organisations
and implement
screens,
clerks,
are
and the information
material
and
decisions.
Data. The word data covers the raw
DBMS
are, in
application
job
design
are the instructions
ensure that there
database
sales
Procedures
is
They
environment.
designers
who use the
business
Procedures
Procedures
which
the
and end users.
functions:
operations.
on the online platform
best
end users employ
strategic
system.
general
manage
structure.
databases
people
High-level
and
database
the
For example,
end users.
and programmers,
complementary
responsibilities.
programmers
users
DBAs,
database
and
End
and
systems
available
a useful
and growing analysts
as
Administration,
design
cannot
database
known
analysts
unique
The DBA's role is sufficiently important to warrant a detailed exploration in
K, Database
DBAs
systems both
properly.
Appendix
If the
the
also
is functioning
Online Content
?
performs
oversee
administrators,
database
designers,
below,
not materially
be
of database
database
system
solutions
in experience.
whole
complexity,
managers
must be cost-effective
must also take
as
party additional
content
another
well as tactically
and
CHAPTER
strategically
effective.
example the
of good
database
Producing
database
a
million-rand
system
technology
already
selection in
solution
to
or of good
use is likely
to
a thousand-rand
database
affect
the
design
selection
1 The
problem and
Database
is
hardly
management.
of a database
Approach
25
an
1
Finally,
system.
1.6.2 DBMS Functions
A DBMS performs several important functions that guarantee the integrity and consistency of the data in the database. Most of those functions are transparent to end users, and most can be achieved only through the use of a DBMS. They include data dictionary management, data storage management, data transformation and presentation, security management, multi-user access control, backup and recovery management, data integrity management, database access languages and application programming interfaces, and database communication interfaces.
Data dictionary management. The DBMS stores definitions of the data elements and their relationships (metadata) in a data dictionary. In turn, all programs that access the data in the database work through the DBMS. The DBMS uses the data dictionary to look up the required data component structures and relationships, thus relieving you from having to code such complex relationships in each program. Additionally, any changes made in a database structure are automatically recorded in the data dictionary, thereby freeing you from having to modify all of the programs that access the changed structure. In other words, the DBMS provides data abstraction, and it removes structural and data dependency from the system. (You will learn more about data abstraction in Chapter 2, Data Models.) For example, Figure 1.8 shows an example of how Oracle's SQL Developer tool presents the data definition for the CUSTOMER table.
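The idea of a data dictionary can be tried out on a small scale. The following is a minimal sketch, assuming SQLite (through Python's standard sqlite3 module) as a stand-in for the Oracle tooling named in the text; the CUSTOMER column names and the describe_table helper are hypothetical and chosen only for illustration.

```python
import sqlite3

def describe_table(conn, table_name):
    """Return (column name, declared type) pairs read from SQLite's catalog."""
    # Each PRAGMA table_info row is (cid, name, type, notnull, dflt_value, pk).
    rows = conn.execute(f"PRAGMA table_info({table_name})").fetchall()
    return [(row[1], row[2]) for row in rows]

conn = sqlite3.connect(":memory:")

# Hypothetical CUSTOMER table; the column names are illustrative only.
conn.execute(
    "CREATE TABLE CUSTOMER ("
    " CUS_CODE INTEGER PRIMARY KEY,"
    " CUS_LNAME TEXT NOT NULL,"
    " CUS_BALANCE REAL)"
)
customer_columns = describe_table(conn, "CUSTOMER")

# A structural change is recorded in the catalog without any extra bookkeeping.
conn.execute("ALTER TABLE CUSTOMER ADD COLUMN CUS_PHONE TEXT")
customer_columns_after = describe_table(conn, "CUSTOMER")

print(customer_columns)
print(customer_columns_after)
```

A program that asks the catalog for the table structure, rather than hard-coding it, keeps working after the column is added, which is the kind of structural independence the paragraph describes.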
FIGURE 1.8 Illustrating metadata with Oracle's SQL Developer
[Figure callout: Metadata]
Data storage management. The DBMS creates and manages the complex structures required for data storage, thus relieving you of the difficult task of defining and programming the physical data characteristics. A modern DBMS system provides storage not only for the data, but also for related data entry forms or screen definitions, report definitions, data validation rules, procedural code, structures to handle video and picture formats, etc. Data storage management is also important for database performance tuning. Performance tuning relates to the activities that make the database perform more efficiently in terms of storage and access speed. Although the user sees the database as a single data storage unit, the DBMS actually stores the database in multiple physical data files (see Figure 1.9). Such data files may even be stored on different storage media. Therefore, the DBMS doesn't have to wait for one disk request to finish before the next one starts. In other words, the DBMS can fulfil database requests concurrently. Data storage management and performance tuning issues are addressed in Chapter 13, Managing Database and SQL Performance.
FIGURE 1.9 Illustrating data storage management with Oracle
[Figure callouts: Database Name: PRODORA. The PRODORA database is actually stored in six physical datafiles organised into six logical tablespaces located on the E: drive of the database server computer. The Oracle Enterprise Manager Express interface also shows the amount of space used by each of the datafiles. The Oracle Enterprise Manager Express GUI shows the data storage management characteristics for the PRODORA database.]
Data transformation and presentation. The DBMS transforms entered data to conform to required data structures. The DBMS relieves you of the chore of making a distinction between the logical data format and the physical data format. That is, the DBMS formats the physically retrieved data to make it conform to the user's logical expectations. For example, imagine an enterprise database used by a multinational company. An end user in South Africa would expect to enter data such as 11 July 2020 as 11/07/2020. In contrast, the same date would be entered in the United States as 07/11/2020. Regardless of the data presentation format, the DBMS must manage the date in the proper format for each country.
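The date example can be made concrete with a short sketch. It assumes one stored value and two presentation formats; the DATE_FORMATS mapping, the country codes and the helper names are hypothetical, and a real DBMS would handle this through its own locale or session settings rather than application code.

```python
from datetime import date, datetime

# Hypothetical mapping from a country code to its local presentation format.
DATE_FORMATS = {"ZA": "%d/%m/%Y", "US": "%m/%d/%Y"}

def parse_local(text, country):
    """Convert a locally formatted date string into one stored date value."""
    return datetime.strptime(text, DATE_FORMATS[country]).date()

def present(stored, country):
    """Format the single stored date the way users in that country expect."""
    return stored.strftime(DATE_FORMATS[country])

# A South African user enters 11 July 2020 in day/month/year order.
stored = parse_local("11/07/2020", "ZA")

za_view = present(stored, "ZA")
us_view = present(stored, "US")
print(stored.isoformat(), za_view, us_view)
```

The stored value is the same in both cases; only the logical presentation differs, which is the distinction between physical and logical data format drawn above.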
Security management. The DBMS creates a security system that enforces user security and data privacy. Security rules determine which users can access the database, which data items each user can access and which data operations (read, add, delete or modify) the user can perform. This is especially important in multi-user database systems where many users can access the database simultaneously. All database users may be authenticated to the DBMS through a username and password or through biometric authentication such as a fingerprint scan. The DBMS uses this information to assign access privileges to various database components, such as queries and reports.
Online Content: Appendix K, Database Administration, examines data security and privacy issues in greater detail and is available on the online platform accompanying this book.
Multi-user access control. To provide data integrity and data consistency, the DBMS uses sophisticated algorithms to ensure that multiple users can access the database concurrently without compromising the integrity of the database. Chapter 12, Managing Transactions and Concurrency, covers the details of multi-user access control.
Backup and recovery management. The DBMS provides backup and data recovery to ensure data safety and integrity. Current DBMS systems provide special utilities that allow the DBA to perform routine and special backup and restore procedures. Recovery management deals with the recovery of the database after a failure, such as a bad sector in the disk or a power failure. Such capability is critical to preserving the database's integrity. Appendix K, Database Administration, covers backup and recovery issues (see online platform).
Data integrity management. The DBMS promotes and enforces integrity rules, thus minimising data redundancy and maximising data consistency. The data relationships stored in the data dictionary are used to enforce data integrity. Ensuring data integrity is especially important in transactional database systems. Data integrity and transaction management issues are addressed in Chapter 8, Beginning Structured Query Language, and Chapter 12, Managing Transactions and Concurrency.
Database access languages and application programming interfaces. The DBMS provides data access through a query language. A query language is a non-procedural language, one that lets the user specify what must be done without having to specify how it is to be done. Structured Query Language (SQL) is the de facto query language and data access standard supported by the majority of DBMS vendors. Chapter 8, Beginning Structured Query Language, and Chapter 9, Procedural Language SQL and Advanced SQL, address the use of SQL. The DBMS also provides application programming interfaces to procedural languages such as COBOL, C, Java, Visual Basic.NET and C#. The DBMS also provides administrative utilities used by the DBA and the database designer to create, implement, monitor and maintain the database.
Database communication interfaces. Current-generation DBMSs accept end-user requests via multiple, different network environments. For example, the DBMS might provide access to the database via the internet through the use of Web browsers such as Chrome, Firefox or Edge. In this environment, communications can be accomplished in several ways:
End users can generate answers to queries by filling in screen forms through their preferred Web browser.
The DBMS can automatically publish predefined reports on a website.
The DBMS can connect to third-party systems to distribute information via email or other productivity applications.
Database communication interfaces are examined in greater detail in Chapter 14, Distributed Databases, in Chapter 17, Database Connectivity and Web Technologies, and in Appendix H, Databases in e-Commerce (see online platform).
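The data integrity management function described above can be illustrated with a minimal sketch. It assumes SQLite via Python's sqlite3 module; the CUSTOMER and INVOICE tables, their columns and the try_insert helper are hypothetical. The rules are declared once in the table definitions, and the DBMS, not the application, rejects rows that break them.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite enforces foreign keys only when this setting is switched on.
conn.execute("PRAGMA foreign_keys = ON")

# Hypothetical tables: every invoice must belong to an existing customer,
# and an invoice total may not be negative.
conn.execute(
    "CREATE TABLE CUSTOMER (CUS_CODE INTEGER PRIMARY KEY, CUS_LNAME TEXT NOT NULL)"
)
conn.execute(
    "CREATE TABLE INVOICE ("
    " INV_NUMBER INTEGER PRIMARY KEY,"
    " CUS_CODE INTEGER NOT NULL REFERENCES CUSTOMER (CUS_CODE),"
    " INV_TOTAL REAL CHECK (INV_TOTAL >= 0))"
)
conn.execute("INSERT INTO CUSTOMER VALUES (10010, 'Ramas')")
conn.execute("INSERT INTO INVOICE VALUES (1001, 10010, 250.00)")

def try_insert(sql):
    """Run an INSERT and report whether the DBMS accepted it."""
    try:
        conn.execute(sql)
        return True
    except sqlite3.IntegrityError:
        return False

# Refers to a customer that does not exist: breaks the relationship rule.
orphan_accepted = try_insert("INSERT INTO INVOICE VALUES (1002, 99999, 10.00)")
# Negative total: breaks the CHECK rule.
negative_accepted = try_insert("INSERT INTO INVOICE VALUES (1003, 10010, -5.00)")

invoice_count = conn.execute("SELECT COUNT(*) FROM INVOICE").fetchone()[0]
print(orphan_accepted, negative_accepted, invoice_count)
```

The final SELECT also shows the non-procedural style of SQL: the statement says what is wanted (a count of invoices) and leaves the how to the DBMS.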
1.6.3 Managing the Database System: A Shift in Focus
The introduction of a database system provides a framework in which strict procedures and standards can be enforced. Consequently, the role of the human component changes from an emphasis on programming to a focus on the broader aspects of managing the organisation's data resources and on the administration of the complex database software itself.
The database system makes it possible to tackle far more sophisticated uses of the data resources as long as the database is designed accordingly. The kinds of data structures created within the database and the extent of the relationships among them play a powerful role in determining the effectiveness of the database system.
Although the database system yields considerable advantages over previous data management approaches, database systems do impose significant overheads. For example:
Increased costs. Database systems require sophisticated hardware and software and highly skilled personnel. The cost of maintaining the hardware, software and personnel required to operate and manage a database system can be substantial.
Management complexity. Database systems interface with many different technologies and have a significant impact on a company's resources and culture. The changes introduced by the adoption of a database system must be properly managed to ensure that they help advance the company's objectives. Given the fact that database systems hold crucial company data that are accessed from multiple sources, security issues must be assessed constantly.
System maintenance. To maximise the efficiency of the database system, you must keep your system current. Therefore, you must perform frequent updates and apply the latest patches and security measures to all components. Since database technology advances rapidly, personnel training costs tend to be significant.
Vendor dependence. Given the heavy investment in technology and personnel training, companies may be reluctant to change database vendors. As a consequence, vendors are less likely to offer pricing point advantages to existing customers and those customers may be limited in their choice of database system components.

1.7 Preparing for Your Database Professional Career
In this chapter, you were introduced to the concepts of data, information, databases and DBMSs. You also learnt that, regardless of what type of database you use (OLTP or OLAP), or what type of database environment you are working in (for example, Oracle, Microsoft or IBM), the success of a database system greatly depends on how well the database structure is designed. Throughout this book, you will learn the building blocks that lay the foundation for your career as a database professional. Understanding these building blocks and developing the skills to use them effectively will prepare you to work with databases at many different levels within an organisation. A small sample of such career opportunities is shown in Table 1.4.
TABLE 1.4 Database career opportunities
Job Title | Description | Sample Skills Required
Database developer | Creates and maintains database-based applications | Programming, database fundamentals, SQL
Database designer | Designs and maintains databases | Systems design, database design, SQL
Database administrator | Manages and maintains DBMS and databases | Database fundamentals, SQL, vendor courses
Database analyst | Develops databases for decision support reporting | SQL, query optimisation, data warehouses, data lakes
Database architect | Designs and implements database environments (conceptual, logical and physical) | DBMS fundamentals, data modelling, SQL, hardware knowledge
Database consultant | Helps companies leverage database technologies to improve business processes and achieve specific goals | Database fundamentals, data modelling, database design, SQL, DBMS, hardware, vendor-specific technologies
Database security officer | Implements security policies for data administration | DBMS fundamentals, database administration, SQL, data security technologies
Cloud Computing Data Architect | Design and implement the infrastructure for next-generation cloud database systems | Internet technologies, cloud storage technologies, data security, performance tuning, large databases, etc.
Data Scientist | Analyze large amounts of varied data to generate insights, relationships, and predictable behaviors | Data analysis, statistics, advanced mathematics, SQL, programming, data mining, machine learning, data visualization

As you also learnt in this chapter, database technologies are constantly evolving to address new challenges such as large databases, semi-structured and unstructured data, increasing processing speed and lowering costs. While database technologies can change quickly, the fundamental concepts and skills do not. It is our goal that, after you learn the database essentials in this book, you will be ready to apply your knowledge and skills to work with traditional OLTP and OLAP systems as well as cutting-edge, complex database technologies such as:
Very large databases (VLDB). Many vendors are addressing the need for databases that support
large amounts of data, usually in the petabyte range. (A petabyte is more than 1 000 terabytes.) VLDB vendors include Oracle Exadata, IBM's Netezza, Greenplum, HP's Vertica and Teradata. VLDB are now being overtaken in market interest by Big Data databases.
Big Data databases. Products such as Cassandra (Facebook) and Bigtable (Google) are using columnar database technologies to support the needs of database applications that manage large amounts of non-tabular data. See more about this topic in Chapter 2.
In-memory databases. Most major database vendors also offer some type of in-memory database support to address the need for faster database processing. In-memory databases store most of their data in primary memory (RAM) rather than in slower secondary storage (hard disks). In-memory databases include IBM's solidDB and Oracle's TimesTen.
Cloud databases. Companies can now use cloud database services to add database systems to their environment quickly, while simultaneously lowering the total cost of ownership of a new DBMS. A cloud database offers all the advantages of a local DBMS, but instead of residing within your organisation's network infrastructure, it resides on the internet. See more about this topic in Chapter 14.
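Of the technologies listed above, the in-memory idea is the easiest to try on a desktop. The sketch below is only an illustration under stated assumptions: it uses SQLite's in-memory mode through Python's sqlite3 module, which is an embedded engine and not one of the commercial products named in the text, and the READING table and its columns are hypothetical.

```python
import sqlite3

# The special name ":memory:" asks SQLite to keep the whole database in RAM;
# nothing is written to secondary storage and the data vanish on close.
conn = sqlite3.connect(":memory:")

# Hypothetical table of sensor readings, used only for illustration.
conn.execute("CREATE TABLE READING (SENSOR_ID INTEGER, READ_VALUE REAL)")
conn.executemany(
    "INSERT INTO READING VALUES (?, ?)",
    [(1, 20.5), (1, 21.5), (2, 18.0)],
)

# The query interface is the same as for a disk-based database.
averages = dict(
    conn.execute(
        "SELECT SENSOR_ID, AVG(READ_VALUE) FROM READING GROUP BY SENSOR_ID"
    )
)
print(averages)
```

The trade-off is the one the text implies: access is fast because no disk request is involved, but durability has to be provided some other way.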
We address some of these topics in this book, but not all; no single book can cover the entire realm of database technologies. This book's primary focus is to help you learn database fundamentals, develop your database design skills and master your SQL skills so you will have a head start in becoming a successful database professional. However, you first need to learn about the tools at your disposal. In the next chapter, you will learn different approaches to data management and how these approaches influence your designs.
SUMMARY
Data are raw facts. Information is the result of processing data to reveal its meaning. Accurate, relevant and timely information is the key to good decision making, and good decision making is the key to organisational survival in a global environment.
Data are usually stored in a database. To implement a database and to manage its contents, you need a database management system (DBMS). The DBMS serves as the intermediary between the user and the database. The database contains the data you have collected and data about data, known as metadata.
Database design defines the database structure. A well-designed database facilitates data management and generates accurate and valuable information. A poorly designed database can lead to bad decision making, and bad decision making can lead to the failure of an organisation.
Databases evolved from manual and then computerised file systems. In a file system, data are stored in independent files, each requiring its own data management programs. Although this method of data management is largely outmoded, understanding its characteristics makes database design easier to understand. Awareness of the problems of file systems can help you avoid similar problems with DBMSs.
Some limitations of file system data management are that it requires extensive programming, system administration can be complex and difficult, making changes to existing structures is difficult, and security features are likely to be inadequate. Also, independent files tend to contain redundant data, leading to problems of structural and data dependency.
Database management systems were developed to address the file system's inherent weaknesses. Rather than depositing data within independent files, a DBMS presents the database to the end user as a single data repository. This arrangement promotes data sharing, thus eliminating the potential problem of islands of information. In addition, the DBMS enforces data integrity, eliminates redundancy and promotes data security.
Open source DBMS software allows users to develop the database system for any purpose, look at the source code and make any improvements, which will then be released back to the general public. Open source DBMSs such as MySQL are currently free to acquire and use, making them ideal for smaller companies and organisations to develop database-centred applications quickly.
KEY TERMS
ad hoc query
analytical database
business intelligence
centralised database
data
data anomaly
data dependence
data dictionary
data governance
data inconsistency
data independence
data integrity
data management
data processing (DP) manager
data processing (DP) specialist
data quality
data redundancy
data warehouse
database
database design
database management system (DBMS)
database system
desktop database
distributed database
enterprise database
Extensible Markup Language (XML)
field
file
information
islands of information
knowledge
logical data format
metadata
multi-user database
NoSQL
online analytical processing (OLAP)
online transaction processing (OLTP)
open source
operational database
performance tuning
physical data format
production database
query
query language
query result set
record
semi-structured
single-user database
social media
structural dependence
structural independence
Structured Query Language (SQL)
transactional database
workgroup database
XML database
FURTHER READING
Codd, E.F. The Capabilities of Relational Database Management Systems. IBM Research Report, RJ3132, 1981.
Date, C.J. The Database Relational Model, A Retrospective Review and Analysis: a Historical Account and Assessment of E.F. Codd's Contribution to the Field of Database Technology. Addison-Wesley, 2000.
Date, C.J. An Introduction to Database Systems, 8th edition. Addison-Wesley, 2003.
Date, C.J. Date on Database: Writings 2000-2006. Apress, 2006.

REVIEW QUESTIONS
Online Content: Answers to selected Review Questions and Problems for this chapter are available on the online platform accompanying this book.
1 Discuss each of the following terms:
a data
b field
c record
d file
2 What is data redundancy and which characteristics of the file system can lead to it?
3 Discuss the lack of data independence in file systems.
4 What is a DBMS, and what are its functions?
5 What is structural independence, and why is it important?
6 Explain the difference between data and information.
7 What is the role of a DBMS, and what are its advantages?
8 List and describe the different types of databases.
9 What are the main components of a database system?
10 What is metadata?
11 Explain why database design is important.
12 What are the potential costs of implementing a database system?
13 Use examples to compare and contrast structured and unstructured data. Which type is more prevalent in a typical business environment?
14 What are the six levels on which the quality of data can be examined?
15 Explain what is meant by data governance.
PROBLEMS
Online Content: The file structures you see in this problem set are simulated in a Microsoft Access database named Ch01_Problems, available on the online platform for this book.

Given the file structure shown in Figure P1.1, answer Problems 1-4.
1 How many records does the file contain, and how many fields are there per record?
2 What problem would you encounter if you wanted to produce a listing by city? How would you solve this problem by altering the file structure?

FIGURE P1.1 The file structure for Problems 1-4
PROJECT_CODE | PROJECT_MANAGER | MANAGER_PHONE | MANAGER_ADDRESS | PROJECT_BID_PRICE
21-5Z | Holly B. Naidu | 33-5-59200506 | 180 Boulevard Dr, Phoenix, 64700 | 13 179 975.00
25-2D | Jane D. Grant | 0181-898-9909 | 218 Clark Blvd., London, NW3 TRY | 9 787 037.00
25-5A | Menzi F. Zulu | 0181-227-1245 | 124 River Dr., Durban, 4001 | 25 458 005.00
25-9T | Holly B. Naidu | 33-5-59200506 | 180 Boulevard Dr, Phoenix, 64700 | 16 887 181.00
27-4Q | Menzi F. Zulu | 0181-227-1245 | 124 River Dr., Durban, 4001 | 8 078 124.00
29-2D | Holly B. Naidu | 33-5-59200506 | 180 Boulevard Dr, Phoenix, 64700 | 20 014 885.00
31-7P | William K. Moor | 39-064885889 | Via Valgia Silvilla 23, Roma, 00179 | 44 516 677.00

3 If you wanted to produce a listing of the file contents by last name, area code, city, county or postal code, how would you alter the file structure?
4 What data redundancies do you detect, and how could those redundancies lead to anomalies?

FIGURE P1.2 The file structure for Problems 5-8
PROJ_NUM | PROJ_NAME | EMP_NUM | EMP_NAME | JOB_CODE | JOB_CHG_HOUR | PROJ_HOURS | EMP_PHONE
1 | Hurricane | 101 | John D. Dlamini | EE | 65.00 | 13.3 | 31-20-6226060
1 | Hurricane | 105 | David F. Schwann | CT | 40.00 | 16.2 | 0191-234-1123
1 | Hurricane | 110 | Anne R. Ramoras | CT | 40.00 | 14.3 | 34-934412463
2 | Coast | 101 | John D. Dlamini | EE | 65.00 | 19.8 | 31-20-6226060
2 | Coast | 108 | June H. Ndlovu | EE | 65.00 | 17.5 | 0161-554-7812
3 | Satellite | 110 | Anne R. Ramoras | CT | 42.00 | 11.6 | 34-934412463
3 | Satellite | 105 | David F. Schwann | CT | 6.00 | 23.4 | 0191-234-1123
3 | Satelite | 123 | Mary D. Chen | EE | 65.00 | 19.1 | 0181-233-5432
3 | Satellite | 112 | Allecia R. Smith | BE | 65.00 | 20.7 | 0181-678-6879

5 Identify and discuss the serious data redundancy problems exhibited by the file structure shown in Figure P1.2.
6 Looking at the EMP_NAME and EMP_PHONE contents in Figure P1.2, which change(s) would you recommend?
7 Identify the different data sources in the file you examined in Problem 5.
8 Given your answer to Problem 7, which new files should you create to help eliminate the data redundancies found in the file shown in Figure P1.2?

FIGURE P1.3 The file structure for Problems 9-10
BUILDING_CODE | ROOM_CODE | TEACHER_LNAME | TEACHER_FNAME | TEACHER_INITIAL | DAYS_TIME
KOM | 204E | Mbhato | Horace | G | MWF 8:00-8:50
KOM | 123 | Adam | Maria | L | MWF 8:00-8:50
LDB | 504 | Patroski | Donald | J | TTh 1:00-2:15
KOM | 34 | Hawkins | Anne | W | MWF 10:00-10:50
JKP | 225B | Risell | James | | TTh 9:00-10:15
LDB | 301 | Robertson | Jeanette | P | TTh 9:00-10:15
KOM | 204E | Adam | Maria | I | MWF 9:00-9:50
LDB | 504 | Mbhato | Horace | G | TTh 1:00-2:15
KOM | 34 | Adam | Maria | L | MWF 11:00-11:50
LDB | 504 | Patroski | Donald | J | MWF 2:00-2:50

9 Identify and discuss the serious data redundancy problems exhibited by the file structure shown in Figure P1.3. (The file is meant to be used as a teacher class assignment schedule. One of the many problems with data redundancy is the likely occurrence of data inconsistencies: two different initials have been entered for the teacher named Maria Adam.)
10 Given the file structure shown in Figure P1.3, which problem(s) might you encounter if building KOM were deleted?
Chapter 2 Data Models
In this chapter, you will learn:
Why data models are important
About the basic data-modelling building blocks
What business rules are and how they influence database design
How the major data models evolved
How data models can be classified by level of abstraction
Preview This chapter
examines
design journey, resides
in the
end
most pressing
users
see
data in
data can lead to database failing
to
meet end-user
database the
uses
designers, Data
First,
you
database
data number
notation. are
still
about
Finally,
you views
known
be
the
not materially
be
systems,
emerging
copied, affect
model
how
data
different
these
as Chen and
model
object and
degrees
to a are
diagrams.
Crows
Within (UML)
Next, model.
being
a
Foot notation
standard.
how it is
of data
how
There
language
relational
media data sets
and
of those
(ERD).
modelling
new industry
and the
social
draw
are
to
them.
will be introduced
diagram
to
and
and implementation
you
unified
such
UML is the
NoSQL
will also learn
used
the
ER model notations
object-oriented
same
are
among
development
design
Second,
data
database
complexities
concepts
database
failures,
as possible.
relations
the
relationship
to
of the among
real-world
and the
same
operation,
such
of ambiguities
Tracing
book.
that
introduced
manage very large
of the
the
entity
actual
nature
data-modelling
models.
of this
systems
entities
basic
as the
briefly
of the
be as free
of the
To avoid
Communication
by reducing
earlier
you understand
in legacy
need to
description
define
of the
from
notation
will
to the
varying
that
programmers
views
an organisations
organisation.
that
designers,
different
requirements.
communications
in the rest
Whilst traditional
will learn
Editorial
model
common
the
some
will help
you
be introduced
current
what
technique
ER
efficiency
a precise
within
developed
are addressed
of
database
and the database
design is that
do not reflect
data
obtain
such
will learn
chapter
and
abstractions
models
modelling
this
step in the
objects
Consequently,
and end users should
clarifies
models
that
of database ways.
data
understood
data
issues
must
of that
modelling
current
modelling is the first real-world
different
designs that
programmers
more easily
Data
between
problems
needs
designers
many
modelling.
as a bridge
computer.
One of the and
data
serving
used
you
will
Then,
you
to fulfil
the
efficiently
and effectively.
abstraction
help reconcile
data.
2.1 The Importance of Data Models
Traditionally, database designers relied on good judgement to help them develop a good design. Unfortunately, good judgement is often in the eye of the beholder, and it often develops after much trial and error. Fortunately, data models (relatively simple representations, usually graphical, of more complex real-world data structures), bolstered by powerful database design tools, have made it possible to diminish the potential for errors in database design substantially.

In general terms, a model is an abstraction of a more complex real-world object or event. A model's main function is to help you understand the complexities of the real-world environment. Within the database environment, a data model represents data structures and their characteristics, relationships, constraints and transformations.
Note: The terms data model and database model are often used interchangeably. In this book, the term database model will be used to refer to the implementation of a data model in a specific database system.

Data models can facilitate interaction among the designer, the applications programmer and the end user. A well-developed data model can even foster improved understanding of the organisation for which the database design is developed. This important aspect of data modelling was summed up neatly by a client whose reaction was as follows: 'I created this business, I worked with this business for years, and this is the first time I've really understood how all the pieces really fit together.'
The importance of data modelling cannot be overstated. Data constitute the most basic information units employed by a system. Applications are created to manage data and to help transform data into information. But data are viewed in different ways by different people. For example, contrast the (data) view of a company manager with that of a company clerk. Although the manager and the clerk both work for the same company, the manager is more likely to have an enterprise-wide view of company data than the clerk.

Even different managers view data differently. For example, a company director is likely to take a universal view of the data because he or she must be able to tie the company's divisions to a common (database) vision. A purchasing manager in the same company is likely to have a more restricted view of the data, as is the company's inventory manager. In effect, each department manager works with a subset of the company's data. The inventory manager is more concerned about inventory levels, while the purchasing manager is more concerned about the cost of items and about personal/business relationships with the suppliers of those items.

Applications programmers have yet another view of data, being more concerned with data location, formatting and specific reporting requirements. Basically, applications programmers translate company policies and procedures from a variety of sources into appropriate interfaces, reports and query screens.

The different users and producers of data and information often reflect the 'blindfolded people and the elephant' analogy: the blindfolded person who felt the elephant's trunk had quite a different view of the elephant from those who felt the elephant's leg or tail. What is needed is the ability to see the whole elephant. Similarly, a house is not a random collection of rooms; if someone is going to build a house, he or she should first have the overall view that is provided by blueprints. Likewise, a sound data environment requires an overall database blueprint based on an appropriate data model. When a good
database blueprint is available, it does not matter that an applications programmer's view of the data is different from that of the manager and/or the end user. Conversely, when a good database blueprint is not available, problems are likely to ensue. For instance, an inventory management program or an order entry system may not fit into the overall set of operational requirements, thereby costing the company thousands (or even millions).

Keep in mind that a house blueprint is an abstraction; you cannot live in the blueprint. Similarly, the data model is an abstraction; you cannot draw the required data out of the data model. Just as you are not likely to build a good house without a blueprint, you are equally unlikely to create a good database without first selecting an appropriate data model.

2.2 Data Model Basic Building Blocks

The basic building blocks of all data models are entities, attributes, relationships and constraints. An entity is anything (a person, a place, a thing or an event) about which data are to be collected and stored. An entity represents a particular type of object in the real world. Entities may be physical objects, such as customers or products; but entities may also be abstractions, such as flight routes or musical concerts.
An attribute is a characteristic of an entity. For example, a CUSTOMER entity would be described by attributes such as customer last name, customer first name, customer phone, customer address and customer credit limit. Attributes are the equivalent of fields in file systems.

A relationship describes an association among entities. For example, a relationship exists between customers and agents that can be described as follows: an agent can serve many customers, and each customer may be served by one agent. Data models use three types of relationships: one-to-many, many-to-many and one-to-one. Database designers usually use the shorthand notations 1:*, *:* and 1:1, respectively. The following examples illustrate the distinctions among the three:

• One-to-many (1:*) relationship. A painter paints many different paintings, but each one of them is painted by only one painter. Thus, the painter (the one) is related to the paintings (the many). Therefore, database designers label the relationship PAINTER paints PAINTING as 1:*. (Note that entity names are often capitalised as a convention so they are easily distinguished.) Similarly, a customer (the one) may generate many invoices, but each invoice (the many) is generated by only a single customer. The CUSTOMER generates INVOICE relationship would also be labelled 1:*.
• Many-to-many (*:*) relationship. An employee may learn many job skills, and each job skill may be learnt by many employees. Database designers label the relationship EMPLOYEE learns SKILL as *:*. Similarly, a student can take many classes and each class can be taken by many students, thus yielding the *:* relationship label for the relationship expressed by STUDENT takes CLASS.
• One-to-one (1:1) relationship. A retail company's management structure may require that each of its stores be managed by a single employee. In turn, each store manager, who is an employee, manages only a single store. Therefore, the relationship EMPLOYEE manages STORE is labelled 1:1.

The preceding discussion identified each relationship in both directions; that is, relationships are bidirectional:

• One CUSTOMER can generate many INVOICEs.
• Each of the many INVOICEs is generated by only one CUSTOMER.

A constraint is a restriction placed on the data. Constraints are important because they help to ensure data integrity. Constraints are normally expressed in the form of rules; for example:

• The employee's salary must have values that are between 6 000 and 350 000.
• A student's grade must be between 0 and 100.
• Each class must have one and only one teacher.
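As a rough illustration of how such rules can be handed to a DBMS, the sketch below expresses the three example constraints as column constraints in SQLite, using Python's standard sqlite3 module. The table names (employee, student_grade, class), the column names and the try_insert helper are all assumptions made for this example; they do not come from the book's databases, and a real design would model teachers and students as entities of their own.

```python
import sqlite3

# Illustrative schema only: table and column names are assumptions,
# not taken from the book's databases.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE employee (
    emp_id  INTEGER PRIMARY KEY,
    salary  REAL NOT NULL CHECK (salary BETWEEN 6000 AND 350000)
);
CREATE TABLE student_grade (
    student_id INTEGER PRIMARY KEY,
    grade      INTEGER NOT NULL CHECK (grade BETWEEN 0 AND 100)
);
CREATE TABLE class (
    class_id   INTEGER PRIMARY KEY,
    -- a single NOT NULL column: each class has one and only one teacher
    teacher_id INTEGER NOT NULL
);
""")


def try_insert(sql, params):
    """Return True if the row is accepted, False if a constraint rejects it."""
    try:
        with conn:  # commit on success, roll back on error
            conn.execute(sql, params)
        return True
    except sqlite3.IntegrityError:
        return False


# One row that satisfies its rule, then one violation of each rule.
print(try_insert("INSERT INTO employee VALUES (?, ?)", (1, 50000)))
print(try_insert("INSERT INTO employee VALUES (?, ?)", (2, 5000)))
print(try_insert("INSERT INTO student_grade VALUES (?, ?)", (1, 101)))
print(try_insert("INSERT INTO class VALUES (?, ?)", (1, None)))
```

Declaring the rules in the schema means every application that writes to these tables is held to the same constraints, which is the sense in which constraints help to ensure data integrity.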
How do you properly identify entities, attributes, relationships and constraints? The first step is to clearly identify the business rules of the environment you are modelling.

2.3 Business Rules

When database designers go about selecting or determining the entities, attributes and relationships that will be used to build a data model, they may start by gaining a thorough understanding of what types of data are in an organisation, how the data are used and in which time frames they are used. But such data and information do not, by themselves, yield the required understanding of the total business. From a database point of view, the collection of data becomes meaningful only when it reflects properly defined business rules. A business rule is a brief, precise and unambiguous description of a policy, procedure or principle within a specific organisation. In a sense, business rules are misnamed: they apply to any organisation, large or small (a business, a government unit, a religious group or a research laboratory), that stores and uses data to generate information.

Business rules, derived from a detailed description of an organisation's operations, help to create and enforce actions within that organisation's environment. Business rules must be rendered in writing and updated to reflect any change in the organisation's operational environment.

Properly written business rules are used to define entities, attributes, relationships and constraints. Any time you see relationship statements such as 'an agent can serve many customers, and each customer may be served by one agent', you are seeing business rules at work. You will see the application of business rules throughout this book, especially in the chapters devoted to data modelling and database design.

To be effective, business rules must be easy to understand and widely disseminated to ensure that every person in the organisation shares a common interpretation of the rules. Business rules describe, in simple language, the main and distinguishing characteristics of the data as viewed by the company. Examples of business rules are as follows:

• A customer may generate many invoices.
• An invoice is generated by only one customer.
• A training session cannot be scheduled for fewer than ten employees or for more than 30 employees.

Note that those business rules establish entities, relationships and constraints. For example, the first two business rules establish two entities, CUSTOMER and INVOICE, and a 1:* relationship between those two entities. The third business rule establishes a constraint: no fewer than ten people and no more than 30 people; two entities, EMPLOYEE and TRAINING; and a relationship between EMPLOYEE and TRAINING.

2.3.1 Discovering Business Rules

The main sources of business rules are company managers, policy makers, department managers and written documentation, such as a company's procedures, standards or operations manuals. A faster and more direct source of business rules is direct interviews with end users. Unfortunately, because perceptions differ, end users are sometimes a less reliable source when it comes to specifying business rules. For example, a maintenance department mechanic may believe that any mechanic can initiate a maintenance procedure, when actually only mechanics with inspection authorisation can perform such a task. Such a distinction may seem trivial, but it can have major legal consequences. Although end users are crucial contributors to the development of business rules, it pays to verify end-user perceptions. Too often, interviews with several people who perform the same job yield very different perceptions of what the job components are. While such a discovery may point to 'management problems', that general diagnosis does not help the database designer. The database designer's job is to reconcile such differences and verify the results of the reconciliation to ensure that the business rules are appropriate and accurate.

The process of identifying and documenting business rules is essential to database design for several reasons:

• They help standardise the company's view of data.
• They can be a communications tool between users and designers.
• They allow the designer to understand the nature, role and scope of the data.
• They allow the designer to understand business processes.
• They allow the designer to develop appropriate relationship participation rules and constraints and to create an accurate data model.

Of course, not all business rules can be modelled. For example, a business rule that specifies that 'no pilot can fly more than ten hours within any 24-hour period' cannot be modelled. However, such a business rule can be enforced by application software.

2.3.2 Translating Business Rules into Data Model Components

Business rules set the stage for the proper identification of entities, attributes, relationships and constraints. In the real world, names are used to identify objects. If the business environment wants to keep track of the objects, there will be specific business rules for them. As a general rule, a noun in a business rule will translate into an entity in the model, and a verb (active or passive) associating nouns will translate into a relationship among the entities. For example, the business rule 'a customer may generate many invoices' contains two nouns (customer and invoices) and a verb (generate) that associates the nouns. From this business rule, you could deduce that:

• Customer and invoice are objects of interest for the environment and should be represented by their respective entities.
• There is a 'generate' relationship between customer and invoice.

To properly identify the type of relationship, you should consider that relationships are bidirectional; that is, they go both ways. For example, the business rule 'a customer may generate many invoices' is complemented by the business rule 'an invoice is generated by only one customer'. In that case, the relationship is one-to-many (1:*). Customer is the 1 side, and invoice is the many side.

As a general rule, to properly identify the relationship type, you should ask two questions:

• How many instances of B are related to one instance of A?
• How many instances of A are related to one instance of B?
For example, you could identify the relationship between student and class by asking two questions:

• How many classes can one student enrol in? Answer: Many classes.
• How many students can enrol in one class? Answer: Many students.
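The two-question test amounts to a small decision procedure, which can be sketched as follows. This is only an illustration: the function name relationship_type and the convention of answering each question with the string 'one' or 'many' are assumptions made for the example, and the labels follow the 1:*, *:* and 1:1 shorthand used in this chapter.

```python
def relationship_type(b_per_a: str, a_per_b: str) -> str:
    """Classify a relationship between entities A and B.

    b_per_a: answer to 'How many instances of B are related to one instance of A?'
    a_per_b: answer to 'How many instances of A are related to one instance of B?'
    Each answer is the string 'one' or 'many' (a convention assumed here).
    """
    answers = {"one", "many"}
    if b_per_a not in answers or a_per_b not in answers:
        raise ValueError("each answer must be 'one' or 'many'")
    if b_per_a == "many" and a_per_b == "many":
        return "*:*"   # many-to-many
    if b_per_a == "one" and a_per_b == "one":
        return "1:1"   # one-to-one
    return "1:*"       # one-to-many: exactly one answer is 'many'


# A = student, B = class: many classes per student, many students per class.
print(relationship_type("many", "many"))
# A = customer, B = invoice: many invoices per customer, one customer per invoice.
print(relationship_type("many", "one"))
# A = employee, B = store: one store per manager, one manager per store.
print(relationship_type("one", "one"))
```

For a 1:* result the function does not say which entity is the 1 side; that is the entity whose instances each relate to many instances of the other (the customer, in the customer and invoice example).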
Therefore, the relationship between student and class is many-to-many (*:*). You will have many opportunities to determine the relationships between entities as you proceed through this book, and soon the process will become second nature.

2.4 The Evolution of Data Models

The quest for better data management has led to several different models that attempt to resolve the file system's critical shortcomings. These models represent schools of thought as to what a database is, what it should do, the types of structures that it should employ, and the technology that would be used to implement these structures. This section gives an overview of the major data models in roughly chronological order. You will discover that many of the 'new' database concepts and structures bear a remarkable resemblance to some of the 'old' data model concepts and structures. Table 2.1 traces the evolution of the major data models.

Table 2.1 Evolution of major data models

| Generation | Time | Data Model | Examples | Comments |
|---|---|---|---|---|
| First | 1960s-1970s | File system | VMS/VSAM | Used mainly on IBM mainframe systems. Managed records, not relationships. |
| Second | 1970s | Hierarchical and network | IMS, ADABAS, IDS-II | Early database systems. Navigational access. |
| Third | Mid-1970s | Relational | DB2, Oracle, MS SQL Server, MySQL | Conceptual simplicity. Entity relationship (ER) modelling and support for relational data modelling. |
| Fourth | Mid-1980s | Object-oriented; Object/relational (O/R) | Versant, Objectivity/DB, DB2 UDB, Oracle 11g | Object/relational supports object data types. Star Schema support for data warehousing. Web databases become common. |
| Fifth | Mid-1990s | XML Hybrid DBMS | dbXML, Tamino, DB2 UDB, Oracle 11g, MS SQL Server | Unstructured data support. O/R model supports XML documents. Hybrid DBMS adds object front end to relational databases. Support large databases (terabyte size). |
| Emerging Models: NoSQL | Late 2000s to present | Key-value store; Column store | SimpleDB (Amazon), Bigtable (Google), Cassandra (Apache) | Distributed, highly scalable. High performance, fault tolerant. Very large storage (petabytes). Suited for sparse data. Proprietary API. |
Online Content: The hierarchical and network models are largely of historical interest, yet they do still contain some elements and features that interest current database professionals. The technical details of those two models are discussed in detail in Appendices I and J, respectively, on the accompanying online platform. Appendix G is devoted to the object-orientated (OO) model. However, given the dominant market presence of the relational model, most of the book focuses on that model.

2.4.1 Hierarchical and Network Models

The hierarchical model was developed in the 1960s to manage large amounts of data for complex manufacturing projects, such as the Apollo rocket that landed on the moon in 1969. The hierarchical model's basic logical structure is represented by an upside-down tree. The hierarchical structure contains levels, or segments. A segment is the equivalent of a file system's record type. Within the hierarchy, a higher layer is perceived as the parent of the segment directly beneath it, which is called the child. The hierarchical model depicts a set of one-to-many (1:*) relationships between a parent and its children segments. (Each parent can have many children, but each child has only one parent.)

The network model was created to represent complex data relationships more effectively than the hierarchical model, to improve database performance and to impose a database standard. In the network model, the user perceives the network database as a collection of records in 1:* relationships. However, unlike the hierarchical model, the network model allows a record to have more than one parent. While the network database model is generally not used today, the definitions of standard database concepts that emerged with the network model are still used by modern data models:

• The schema is the conceptual organisation of the entire database as viewed by the database administrator.
• The subschema defines the portion of the database 'seen' by the application programs that actually produce the desired information from the data within the database.
• A data manipulation language (DML) defines the environment in which data can be managed and is used to work with the data in the database.
• A schema data definition language (DDL) enables the database administrator to define the schema components.

As information needs grew and more sophisticated databases and applications were required, the network model became too cumbersome. The lack of ad hoc query capability put heavy pressure on programmers to generate the code required to produce even the simplest reports. Although the existing databases provided limited data independence, any structural change in the database could still produce havoc in all application programs that drew data from the database. Because of the disadvantages of the hierarchical and network models, they were largely replaced by the relational data model in the 1980s.

2.4.2 The Relational Model

The relational model was introduced by E.F. Codd (of IBM) in 1970 in his landmark paper 'A Relational Model of Data for Large Shared Databanks'.1 The relational model represented a major breakthrough for both users and designers. To use an analogy, the relational model produced an 'automatic transmission' database to replace the 'standard transmission' databases that preceded it. Its conceptual simplicity set the stage for a genuine database revolution.

1 E.F. Codd, 'A Relational Model of Data for Large Shared Databanks', Communications of the ACM, pp. 377-387, June 1970.

In 1970, Codd's work was considered ingenious but impractical. The relational model's conceptual simplicity was bought at the expense of computer overhead; computers at that time lacked the power to implement the relational model. Fortunately, computer power grew exponentially, as did operating system efficiency. Better yet, the cost of computers diminished rapidly as their power grew. Today even desktop and laptop computers, costing a fraction of what their mainframe ancestors did, can run relatively sophisticated relational database software provided by vendors such as Oracle, DB2, Informix, Ingres and other mainframe relational software vendors.
Note: The relational database model presented in this chapter is designed to introduce a more detailed discussion in Chapter 3, Relational Model Characteristics, and in Chapter 4, Relational Algebra and Calculus. In fact, the relational model is so important that it will serve as the basis for discussions in most of the remaining chapters.

The relational data model is implemented through a very sophisticated relational database management system (RDBMS). The RDBMS performs the same basic functions provided by the hierarchical and network DBMS systems, in addition to a host of other functions that make the relational data model easier to understand and implement.

Arguably the most important advantage of the RDBMS is its ability to hide the complexities of the relational model from the user. The RDBMS manages all of the physical details, while the user sees the relational database as a collection of tables in which data are stored and can manipulate and query the data in a way that seems intuitive and logical.

Each table is a matrix consisting of a series of row/column intersections. Tables, also called relations, are related to each other through the sharing of a field which is common to both entities. For example, the CUSTOMER table in Figure 2.1 might contain a sales agent's number that is also contained in the AGENT table.

Online Content: This chapter's databases can be found on the accompanying online platform for this book. For example, the contents of the CUSTOMER and AGENT tables shown in Figure 2.1 are found in the database named Ch02_InsureCo.

The common link between the CUSTOMER and AGENT tables enables you to match the customer to his or her sales agent, even though the customer data are stored in one table and the sales representative data are stored in another table. For example, you can easily determine that customer Dunne's agent is Kubu Bhengani, because for customer Dunne, the CUSTOMER table's AGENT_CODE is 501, which matches the AGENT table's AGENT_CODE for Kubu Bhengani. Although the tables are independent of each other, you can easily associate the data between tables. The relational model provides a minimum level of controlled redundancy to eliminate most of the redundancies commonly found in file systems.

The relationship type (1:1, 1:* or *:*) is often shown in a relational schema, an example of which is depicted in Figure 2.2. A relational diagram is a representation of the relational database's entities, the attributes within those entities and the relationships between those entities.
Figure 2.1 Linking relational tables

Database name: Ch02_InsureCo

Table name: AGENT (first six attributes)

| AGENT_CODE | AGENT_LNAME | AGENT_FNAME | AGENT_INITIAL | AGENT_AREACODE | AGENT_PHONE |
|---|---|---|---|---|---|
| 501 | Bhengani | Kubu | B | 0161 | 228-1249 |
| 502 | Mbaso | Lethiwe | F | 0181 | 882-1244 |
| 503 | Okon | John | T | 0181 | 123-5589 |
Link through AGENT_CODE

Table name: CUSTOMER

| CUS_CODE | CUS_LNAME | CUS_FNAME | CUS_INITIAL | CUS_AREACODE | CUS_PHONE | CUS_RENEW_DATE | AGENT_CODE |
|---|---|---|---|---|---|---|---|
| 10010 | Ramas | Alfred | A | 0181 | 844-2573 | 05-Apr-2018 | 502 |
| 10011 | Dunne | Leona | K | 0161 | 894-1238 | 16-Jun-2018 | 501 |
| 10012 | Du Toit | Maelene | W | 0181 | 894-2285 | 29-Jan-2018 | 502 |
| 10013 | Pieterse | Jaco | F | 0181 | 894-2180 | 14-Oct-2019 | 502 |
| 10014 | Orlando | Myron | | 0181 | 222-1672 | 28-Dec-2019 | 501 |
| 10015 | O'Brian | Amy | B | 0161 | 442-3381 | 22-Sep-2019 | 503 |
| 10016 | Brown | James | G | 0181 | 297-1228 | 25-Mar-2018 | 502 |
| 10017 | Williams | George | | 0181 | 290-2556 | 17-Jul-2019 | 503 |
| 10018 | Moloi | Vinaya | G | 0161 | 382-7185 | 03-Dec-2019 | 501 |
| 10019 | Padayachee | Mlilo | K | 0181 | 297-3809 | 14-Mar-2019 | 503 |
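The link through AGENT_CODE can be tried out with a small sketch. The example below uses Python's standard sqlite3 module and loads only a few rows and columns from Figure 2.1; the reduced column set, the lower-case table names and the helper functions agent_for and customers_of are assumptions made for illustration and are not the structure of the Ch02_InsureCo database itself.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only on request
conn.executescript("""
CREATE TABLE agent (
    agent_code  INTEGER PRIMARY KEY,
    agent_lname TEXT NOT NULL,
    agent_fname TEXT NOT NULL
);
CREATE TABLE customer (
    cus_code   INTEGER PRIMARY KEY,
    cus_lname  TEXT NOT NULL,
    cus_fname  TEXT NOT NULL,
    -- the common field that links each customer to one agent
    agent_code INTEGER NOT NULL REFERENCES agent (agent_code)
);
""")
conn.executemany(
    "INSERT INTO agent VALUES (?, ?, ?)",
    [(501, "Bhengani", "Kubu"), (502, "Mbaso", "Lethiwe"), (503, "Okon", "John")],
)
conn.executemany(
    "INSERT INTO customer VALUES (?, ?, ?, ?)",
    [
        (10010, "Ramas", "Alfred", 502),
        (10011, "Dunne", "Leona", 501),
        (10015, "O'Brian", "Amy", 503),
        (10016, "Brown", "James", 502),
    ],
)


def agent_for(cus_lname):
    """Return (first name, last name) of the customer's agent, or None."""
    return conn.execute(
        """SELECT a.agent_fname, a.agent_lname
           FROM customer AS c
           JOIN agent AS a ON a.agent_code = c.agent_code
           WHERE c.cus_lname = ?""",
        (cus_lname,),
    ).fetchone()


def customers_of(agent_code):
    """Return the last names of all customers served by one agent (the many side)."""
    rows = conn.execute(
        "SELECT cus_lname FROM customer WHERE agent_code = ? ORDER BY cus_code",
        (agent_code,),
    )
    return [lname for (lname,) in rows]


print(agent_for("Dunne"))
print(customers_of(502))
```

The two tables are stored independently, yet matching values in the shared AGENT_CODE field are enough to answer questions that span both of them.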
In Figure 2.2, the relational diagram shows the connecting fields (in this case, AGENT_CODE) and the relationship type, 1:*. In this example, the CUSTOMER represents the many side because an AGENT can have many CUSTOMERs. The AGENT represents the 1 side because each CUSTOMER has only one AGENT.

A relational table stores a collection of related entities. In this respect, the relational database table resembles a file. However, there is one crucial difference between a table and a file: a table yields complete data and structural independence because it is a purely logical structure. How the data are physically stored in the database is of no concern to the user or the designer; the perception is what counts. And this property of the relational database model, explored in depth in the next chapter, became the source of a real database revolution.

Another reason for the relational database model's rise to dominance is its powerful and flexible query language. Relational algebra, which was defined by Codd in 1971, was the basis for many relational query languages and will be introduced in more detail in Chapter 4, Relational Algebra and Calculus. For most relational database software, the query language used is known as Structured Query Language (SQL). SQL is a fourth-generation language (4GL) that allows the user to specify what must be done without specifying how it must be done. The RDBMS uses SQL to translate user queries into instructions for retrieving the requested data. SQL makes it possible to retrieve data with far less effort than any other database or file environment.
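The difference between stating what is wanted and spelling out how to get it can be shown with a short sketch. It uses Python's standard sqlite3 module and four customer rows taken from Figure 2.1; the cut-down customer table and the function names declarative and procedural are assumptions made for this example.

```python
import sqlite3

# A few (cus_code, cus_lname, agent_code) rows from Figure 2.1.
rows = [
    (10010, "Ramas", 502),
    (10011, "Dunne", 501),
    (10015, "O'Brian", 503),
    (10016, "Brown", 502),
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE customer (cus_code INTEGER PRIMARY KEY, cus_lname TEXT, agent_code INTEGER)"
)
conn.executemany("INSERT INTO customer VALUES (?, ?, ?)", rows)


def declarative(agent_code):
    """State WHAT is wanted; the SQL engine decides how to retrieve it."""
    sql = "SELECT cus_lname FROM customer WHERE agent_code = ? ORDER BY cus_lname"
    return [lname for (lname,) in conn.execute(sql, (agent_code,))]


def procedural(agent_code):
    """Spell out HOW: scan every record, test it, collect and sort the matches."""
    matches = []
    for _cus_code, lname, agent in rows:
        if agent == agent_code:
            matches.append(lname)
    matches.sort()
    return matches


# Both approaches return the same customers for agent 502.
print(declarative(502) == procedural(502))
```

In the procedural version the program owns the scanning and sorting logic, so any change in how the data are stored would mean rewriting it. The SQL statement, by contrast, stays the same while the engine chooses the access path.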
Figure 2.2 Relational diagram: a relational class diagram
From an end-user perspective, any SQL-based relational database application involves three parts: a user interface, a set of tables stored in the database and the SQL engine. Each of these parts is explained below:

• The end-user interface. Basically, the interface allows the end user to interact with the data (by auto-generating SQL code). Each interface is a product of the software vendor's idea of meaningful interaction with the data. You can also design your own customised interface with the help of application generators that are now standard in the database software arena.
• A collection of tables stored in the database. In a relational database, all data are perceived to be stored in tables. The tables simply 'present' the data to the end user in a way that is easy to understand. Each table is independent from another. Rows in different tables are related, based on common values in common attributes.
• SQL engine. Largely hidden from the end user, the SQL engine executes all queries or data requests. Keep in mind that the SQL engine is part of the DBMS software. The end user uses SQL to create table structures and to perform data access and table maintenance. The SQL engine translates all of those requests into the instructions necessary to perform such tasks, largely behind the scenes and without the end user's knowledge. Hence, it is said that SQL is a declarative language that tells what must be done but not how it must be done. (You will learn more about the SQL engine in Chapter 13, Managing Database and SQL Performance.)

Because the RDBMS performs the behind-the-scenes tasks, it is not necessary to focus on the physical aspects of the database. Instead, the chapters that follow will concentrate on the logical portion of the relational database and its design. Furthermore, SQL is covered in detail in Chapter 8, Beginning Structured Query Language, and in Chapter 9, Procedural Language SQL and Advanced SQL.

2.4.3 The Entity Relationship Model

The conceptual simplicity of relational database technology triggered the demand for RDBMSs. In turn, the rapidly increasing transaction and information requirements created the need for more complex
database (For
implementation
example, Complex
2
model
features
that
graphically
activities
would
require
describe
them
a
widely
accepted
Chen first
and their relationships the
relational
the foundation
model
diagram
One of the
between
relationships over
among
data
When the
were illustrated:
were represented database
designers
including
one of the
1976;
that
notation
it
quickly
results.
network
prefer to
models,
it
Although
the
still lacked
the
use a graphical
(er)
was the
graphical
became
popular
database
model
representations
structures
tool in
which
model, or erM,
representation
because
and
of their
ERM
has
of entities
it complemented
combined
was and
how it
between
entities
was
(1:M),
to
provide
model,
most common
more
versions
using
simple
ERD,
of
the
Chens
which uses the
modelling
notation
such
(1:1).
notation
as n
Relationships line.
were
Foot
style
of relationships
relationship
Crows
Chen also
notation
types
and one-to-one
entities
data
Chens
three
through
versions
model,
in the
a relationship.
(M:N)
entities
graphical
of the
to
achieved
related
components. between
ER data
debate
were introduced,
many-to-many
to the
of the
a large
was different
model components
connected
this
This fuelled
originally
model database
made a distinction
early releases
own.
an entity
one-to-many
to
was that it clearly
However in the
basic data
adopted
successful
a kennel.)
database design. ER models are normally represented in an entity
Chens
by a diamond
design tools.
building
modelling.
The relational
attributes
associations
many.
database
than
Because it is easier to examine
designers
model in
structure
them.
have
what exactly
for representing
to indicate
to
yield and
design tool.
which uses graphical
of Peter
to
activities
Thus, the entity relationship
data
concepts.
(erD),
strengths
and the relationships
community
ER data
for tightly structured
relationship
allowed
for
the
more effective
design
hierarchical
database
standard
in a database
database
the
database
in text,
need for
detailed simplicity
over
are pictured.
introduced
the
more
conceptual
make it an effective
than to
Peter
creating
requires
was a vast improvement
entities and their relationships become
thus
a skyscraper
design
relational
structures,
building
Whilst
developed,
notation.
NOTE
One of the more recent versions of Peter Chen's notation is known as the Crow's Foot notation. The Crow's Foot notation was originally invented by Gordon Everest, and was made popular by Clive Finkelstein² and later James Martin. In Crow's Foot notation, graphical symbols were used instead of the simple notation, such as 'n' to indicate 'many', used by Chen. The label 'Crow's Foot' is derived from the three-pronged symbol used to represent the 'many' side of the relationship. Although there is a general shift towards the use of UML, many organisations today still use the Crow's Foot notation. This is particularly true of the many legacy systems which are running on obsolete hardware and software but are vital to the organisation. It is therefore important that you are familiar with both Chen's and Crow's Foot modelling notations.
More recently, the class diagram component of the Unified Modelling Language (UML) has been used to produce entity relationship models. Although class diagrams have been developed as a part of the larger UML object-orientated design method, the notation is emerging as the industry data modelling standard. In this book the UML class diagram notation will therefore be used to model ERDs using relational concepts.

ONLINE CONTENT
More in-depth coverage of the Crow's Foot notation is provided in Appendix E, Comparison of ER Modelling Notations, available on the online platform.

² C. Finkelstein, An Introduction to Information Engineering: From Strategic Planning to Information Systems. Addison-Wesley, 1989.
NOTE
UML is an object-orientated modelling language sponsored by the Object Management Group (OMG) and published as a standard in 1997. UML is the result of an effort headed by the OMG to develop a common set of notations (symbols and constructs) for the analysis, design and modelling of software systems. The OMG is an international not-for-profit computing consortium setting standards in the area of distributed object computing, which includes UML. UML is a language that describes a set of diagrams and symbols that can be used to model a system graphically. Keep in mind that UML is not a methodology or procedure for developing databases. More details can be found on the following website: www.uml.org/

The ER model is based on the following components:
• Entity. Earlier in this chapter, an entity was defined as anything about which data are to be collected and stored. An entity is represented in the ERD by a rectangle, also known as an entity box. The name of the entity, a noun, is written in the centre of the rectangle. The entity name is generally written in capital letters and is written in the singular form: PAINTER rather than PAINTERS, and EMPLOYEE rather than EMPLOYEES. Usually, when applying the ERD to the relational model, an entity is mapped to a relational table. Each row in the relational table is known as an entity instance or entity occurrence in the ER model.

NOTE
A collection of like entities is known as an entity set. For example, you can think of the AGENT file in Figure 2.3 as a collection of three agents (entities) in the AGENT entity set. Technically speaking, the ERD depicts entity sets. Unfortunately, ERD designers use the word entity as a substitute for entity set, and this book will conform to that established practice when discussing any ERD and its components.

Each entity is described by a set of attributes that describe particular characteristics of the entity. For example, the entity EMPLOYEE will have attributes such as an employee number, a last name and a first name. (Chapter 5, Data Modelling with Entity Relationship Diagrams, explains how attributes are included in the ERD.)
• Relationships. Relationships describe associations among data. Most relationships describe associations between two entities. Within the basic ER model, three types of relationships can be illustrated: one-to-many (1:*), many-to-many (*:*) and one-to-one (1:1). ER modellers use the term connectivity to label the types of relationships. Relationships are represented by a relationship line that connects the two related entities. The name of the relationship, an active or passive verb, is written on the line. For example, a PAINTER paints many PAINTINGs; a DEPARTMENT has many EMPLOYEEs. (The connectivities are written next to each entity box.)
Figure 2.3 shows the basic ERDs that use the UML notation to illustrate these three connectivities. As you examine the basic ERD in Figure 2.3, note that entities and relationships may be presented horizontally or vertically. The location of the entities and the order in which the entities are presented are immaterial; just remember to read a 1:* relationship from the 1 side to the * side.
FIGURE 2.3 The basic UML ERD
A One-to-Many (1..*) Relationship: a PAINTER can paint many PAINTINGs; each PAINTING is painted by one PAINTER.
    PAINTER 1..1 paints 0..* PAINTING
A Many-to-Many (*..*) Relationship: an EMPLOYEE can learn many SKILLs; each SKILL can be learned by many EMPLOYEEs.
    EMPLOYEE 0..* learns 0..* SKILL
A One-to-One (1..1) Relationship: an EMPLOYEE manages one STORE; each STORE is managed by one EMPLOYEE.
    EMPLOYEE 1 manages 1 STORE

Because the UML class diagram was developed to model object classes and their associations, you should be aware that, typically, an object class is a collection of similar objects; a class is therefore the equivalent of an entity set in the ER model. Likewise, an association is similar to a relationship, where the degree of participation of an entity in a relationship is often referred to as multiplicities. The only major difference between a UML class and an ER entity is that a blank box is left in the drawing of the UML class to add the names of methods, which are required when developing object-orientated systems. However, from a data modelling perspective this does not affect the structure of the data, and you will use the UML notation to represent relational concepts only. Chapter 5, Data Modelling with Entity Relationship Diagrams, will introduce the concepts of both Crow's Foot notation and the Class Diagram notation in more detail.
Most database modelling tools let you select the UML model diagram option. Microsoft Visio Professional software was used to generate the UML class diagrams you will see in subsequent chapters.
NOTE
Many-to-many (*:*) relationships exist at a conceptual level, and you should know how to recognise them. However, you will learn in Chapter 3, Relational Model Characteristics, that *:* relationships are not appropriate in a relational model.
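As a rough illustration of how the connectivities above might surface in relational tables, the sketch below uses Python's built-in sqlite3 module. It assumes one common technique, which Chapter 3 treats properly: a 1:* relationship becomes a foreign key on the * side, and a *:* relationship is replaced by a linking table. All table names, column names and rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE painter  (painter_id INTEGER PRIMARY KEY, name TEXT NOT NULL);
-- 1:* : each PAINTING row carries the key of its single PAINTER.
CREATE TABLE painting (
    painting_id INTEGER PRIMARY KEY,
    title       TEXT NOT NULL,
    painter_id  INTEGER NOT NULL REFERENCES painter(painter_id));
CREATE TABLE employee (emp_id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE skill    (skill_id INTEGER PRIMARY KEY, name TEXT NOT NULL);
-- *:* : replaced by a linking table that holds two 1:* relationships.
CREATE TABLE employee_skill (
    emp_id   INTEGER NOT NULL REFERENCES employee(emp_id),
    skill_id INTEGER NOT NULL REFERENCES skill(skill_id),
    PRIMARY KEY (emp_id, skill_id));
""")

conn.execute("INSERT INTO painter VALUES (1, 'Ross')")
conn.executemany("INSERT INTO painting VALUES (?, ?, ?)",
                 [(10, "Dawn", 1), (11, "Dusk", 1)])
conn.executemany("INSERT INTO employee VALUES (?, ?)", [(1, "Ann"), (2, "Bo")])
conn.executemany("INSERT INTO skill VALUES (?, ?)", [(1, "SQL"), (2, "UML")])
conn.executemany("INSERT INTO employee_skill VALUES (?, ?)",
                 [(1, 1), (1, 2), (2, 1)])

# One PAINTER paints many PAINTINGs.
paintings_by_ross = conn.execute(
    "SELECT COUNT(*) FROM painting WHERE painter_id = 1").fetchone()[0]

# An EMPLOYEE learns many SKILLs, and a SKILL is learned by many EMPLOYEEs.
skills_of_ann = [r[0] for r in conn.execute(
    "SELECT s.name FROM skill s JOIN employee_skill es "
    "ON es.skill_id = s.skill_id WHERE es.emp_id = 1 ORDER BY s.name")]
learners_of_sql = [r[0] for r in conn.execute(
    "SELECT e.name FROM employee e JOIN employee_skill es "
    "ON es.emp_id = e.emp_id WHERE es.skill_id = 1 ORDER BY e.name")]
```

The linking table can be read in either direction, which is how the *:* association is recovered from two 1:* relationships.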
ONLINE CONTENT
For a more detailed description of the Chen, Crow's Foot and other ER model notation systems, see Appendix E, Comparison of ER Model Notations, available on the online platform.
NOTE
For the purposes of illustration, Figure 2.4 shows alternative Crow's Foot models of the UML ERDs in Figure 2.3.

FIGURE 2.4 The basic Crow's Foot ERD

As you examine the basic Crow's Foot ERD in Figure 2.4, note that the 1 is represented by a short line segment and the * is represented by the three-pronged Crow's Foot. As with the UML notation, entities and relationships may be presented horizontally or vertically, and the order is again unimportant.
Its exceptional visual simplicity makes the ER model the dominant database modelling and design tool. Nevertheless, the search for better data modelling tools continues as the data environment continues to evolve.

2.4.4 The Object-Orientated (OO) Model
Increasingly complex real-world problems demonstrated a need for a data model that more closely represented the real world. In the object-orientated data model (OODM), both data and their relationships are contained in a single structure known as an object. In turn, the OODM is the basis for the object-orientated database management system (OODBMS).
ONLINE CONTENT
This chapter introduces only basic OO concepts. You'll have a chance to examine object-orientated concepts and principles in detail in Appendix G, Object-Oriented Databases, which can be found on the online platform.

An OODM reflects a different way to define and use entities. Like the relational model's entity, an object is described by its factual content. But quite unlike an entity, an object includes information about relationships between the facts within the object, as well as information about its relationships with other objects. Therefore, the facts within the object are given greater meaning. The OODM is said to be a semantic data model because semantic indicates meaning.
Subsequent OODM development has allowed an object to also contain all operations that can be performed on it, such as changing its data values, finding a specific data value and printing data values. As objects include data, various types of relationships and operational procedures, the object becomes self-contained, thus making the object, at least potentially, a basic building block for autonomous structures.
The OO data model is based on the following components:
• An object is an abstraction of a real-world entity. In general terms, an object may be considered equivalent to an ER model's entity. More precisely, an object represents only one individual occurrence of an entity. (The object's semantic content is defined through several of the items in this list.)
• Attributes describe the properties of an object. For example, a PERSON object includes the attributes Name, ID Number and Date of Birth.
• Objects that share similar characteristics are grouped in classes. A class is a collection of similar objects with shared structure (attributes) and behaviour (methods). In a general sense, a class resembles the ER model's entity set. However, a class is different from an entity set in that it contains a set of procedures known as methods. A class's method represents a real-world action such as finding a selected PERSON's name, changing a PERSON's name or printing a PERSON's address. In other words, methods are the equivalent of procedures in traditional programming languages. In OO terms, methods define an object's behaviour.
• Classes are organised in a class hierarchy. The class hierarchy resembles an upside-down tree in which each class has only one parent. For example, the CUSTOMER class and the EMPLOYEE class share a parent PERSON class. (Note the similarity to the hierarchical data model in this respect.)
• Inheritance is the ability of an object within the class hierarchy to inherit the attributes and methods of the classes above it. For example, two classes, CUSTOMER and EMPLOYEE, can be created as subclasses from the class PERSON. In this case, CUSTOMER and EMPLOYEE will inherit all attributes and methods from PERSON.
To illustrate the difference between the OO data model and the ER model, examine the simple invoicing problem shown in Figure 2.5. As you examine Figure 2.5, note that:
• The OO data model represents an object as a box; all of the object's attributes and relationships to other objects are included within the object box. (Some OO data models do not include the methods in their graphical representation.)
• The object representation of the INVOICE includes all related objects within the same object box. Note that the connectivities (1:1 and 1:*) indicate the relationship of the related objects to the INVOICE. For example, the 1:1 next to the CUSTOMER object indicates that each INVOICE is related to only one CUSTOMER. The 1:* next to the LINE object indicates that each INVOICE must contain at least one LINE but can also contain many LINEs.
FIGURE 2.5 A comparison of the OO model and the ER model
OO data model: an INVOICE object (INV_NUMBER, INV_DATE, INV_SHIP_DATE, INV_TOTAL) that contains CUSTOMER (1) and LINE (*).
ER model: separate INVOICE, CUSTOMER and LINE entities.

The ER model uses three separate entities and two relationships to represent an invoice transaction. As customers can buy more than one item at a time, each invoice references one or more lines, one item per line. And because invoices are generated by customers, the data modelling requirements include a customer entity and a relationship between the customer and the invoice.
The OODM advances influenced many areas, from system modelling to programming. (Most contemporary programming languages have adopted OO concepts, including Java, Ruby, Perl, C# and Visual Studio.) The added semantics of the OODM allowed for a richer representation of complex objects. This in turn enabled applications to support increasingly complex objects in innovative ways.
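The OO components described in this section (attributes, methods, a class hierarchy with inheritance, and an INVOICE object that contains its related objects) can be sketched in Python as follows. This is only an illustration under stated assumptions: the class layout, the method names change_name, describe and total, and all sample values are hypothetical and are not taken from Figure 2.5 beyond the INVOICE, CUSTOMER and LINE names.

```python
from dataclasses import dataclass, field
from datetime import date


@dataclass
class Person:
    # Attributes describe the properties of the object.
    name: str
    id_number: str
    date_of_birth: date

    # Methods define the object's behaviour.
    def change_name(self, new_name: str) -> None:
        self.name = new_name

    def describe(self) -> str:
        return f"{self.name} ({self.id_number})"


@dataclass
class Customer(Person):
    """Subclass: inherits every attribute and method of Person."""


@dataclass
class Employee(Person):
    """Subclass: inherits every attribute and method of Person."""


@dataclass
class Line:
    description: str
    amount: float


@dataclass
class Invoice:
    inv_number: int
    customer: Customer                               # 1:1, exactly one CUSTOMER
    lines: list[Line] = field(default_factory=list)  # 1:*, one or more LINEs

    def total(self) -> float:
        return sum(line.amount for line in self.lines)


buyer = Customer("Ana Lopez", "C-100", date(1990, 5, 17))
buyer.change_name("Ana Ruiz")  # method inherited from Person
invoice = Invoice(1001, buyer, [Line("Brush", 4.5), Line("Canvas", 12.0)])
print(buyer.describe(), invoice.total())
```

Note how the invoice object carries its customer and its lines inside itself, whereas the ER model would keep them as three separate entities linked by two relationships.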
ONLINE CONTENT
A useful comparison between the OO and ER model components can be found in Table G.3, located in Appendix G, Object-Orientated Databases, available on the online platform for this book.

It is important to note that not all data models are created equal; some data models are better suited than others to some tasks. For example, conceptual models are better suited to high-level data modelling, while implementation models are better for managing stored data for implementation purposes. The entity relationship model is an example of a conceptual model, while the hierarchical and network models are examples of implementation models. At the same time, some models, such as the relational model and the OODM, could be used as both conceptual and implementation models.
2.4.5 Other Models
Facing the demand to support more complex data representations, the relational model's main vendors evolved the model further and created the extended relational data model (ERDM). The ERDM adds many of the OO model's features within the inherently simpler relational database structure. The ERDM gave birth to a new generation of relational databases that support OO features such as objects (encapsulated data and methods), extensible data types based on classes and inheritance. That's why a DBMS based on the ERDM is often described as an object relational database management system (ORDBMS).
Today, most relational database products can be classified as object relational, and they represent the dominant market share of OLTP and OLAP database applications. The success of the ORDBMS can be attributed to the model's conceptual simplicity, data integrity, easy-to-use query language, high transaction performance, high availability, security, scalability and expandability. In contrast, the OODBMS is popular in niche markets such as computer-aided drawing/computer-aided manufacturing (CAD/CAM), geographic information systems (GIS), telecommunications and multimedia, which require support for more complex objects.
From the start, the OO and relational data models were developed in response to different problems. The OO data model was created to address very specific engineering needs, not the wide-ranging needs of general data management tasks. The relational model was created with a focus on better data management based on a sound mathematical foundation. Given its focus on a smaller set of problem areas, it is not surprising that the OO market has not grown as rapidly as the relational data model market. However, large DBMS vendors such as Oracle readily promote their once relational DBMS now as object relational, with each new release adding new functionality. This gives organisations more choice and flexibility in the design and development of new database applications and in the integration with existing OO applications.
The use of complex objects received a boost with the internet revolution. When organisations integrated their business models with the internet, they realised its potential to access, distribute and exchange critical business information. This resulted in the widespread adoption of the internet as a business communication tool. Within this environment, Extensible Markup Language (XML) emerged as the de facto standard for the efficient and effective exchange of structured, semi-structured and unstructured data. Organisations that use XML data soon realised that they needed to manage large amounts of unstructured data such as word-processing documents, Web pages, emails and diagrams. To address this need, XML databases emerged to manage unstructured data within a native XML format. (See Chapter 17, Database Connectivity and Web Technologies.) At the same time, ORDBMSs added support for XML-based documents within their relational data structure. Due to its robust foundation in broadly applicable principles, the relational model is easily extended to include new classes of capabilities, such as objects and XML.
Modelling spatial data for use in applications such as route optimisation (an ambulance finding the quickest route to a patient) or urban planning requires yet another type of data model. Spatial data comprises objects such as cities or forests that exist in a multi-dimensional space. Storing such data in a relational database would simply take up too much space, and queries would be too long and complex to manage. A spatial database management system (SDBMS) is a database system with additional capabilities for handling spatial data. An SDBMS includes spatial data types (SDTs) in its data model and query language, for example, the ability to model objects (forests, cities or rivers) in space using types such as POINT, LINE and REGION. The POINT data type refers to the object's centre point in the multi-dimensional space; the LINE data type is used to represent connections in multi-dimensional space, e.g. rivers or roads; and the REGION data type is a representation of an extent, e.g. a lake in a 2-D space. In addition, an SDBMS supports spatial indexing, allowing the fast retrieval of objects in a specific area, and efficient algorithms for supporting spatial joins. SDBMSs are often used to support GIS applications, one of the most popular today being Google Earth.
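The three spatial data types can be pictured with a small Python sketch. It is a deliberately simplified illustration, not how any particular SDBMS is implemented: a REGION is assumed to be an axis-aligned rectangle in 2-D space, the "objects in a specific area" query is a plain linear scan rather than a spatial index, and every name and coordinate is hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Point:
    """POINT: the centre point of an object, here in 2-D space."""
    x: float
    y: float


@dataclass(frozen=True)
class Line:
    """LINE: a connection such as a river or road, as a chain of points."""
    points: tuple[Point, ...]


@dataclass(frozen=True)
class Region:
    """REGION: an extent; simplified here to an axis-aligned rectangle."""
    lower_left: Point
    upper_right: Point

    def contains(self, p: Point) -> bool:
        return (self.lower_left.x <= p.x <= self.upper_right.x
                and self.lower_left.y <= p.y <= self.upper_right.y)


cities = {
    "Alpha": Point(1.0, 1.0),
    "Beta": Point(5.0, 2.0),
    "Gamma": Point(9.0, 9.0),
}
river = Line((Point(0.0, 0.0), Point(4.0, 3.0), Point(8.0, 3.0)))
search_area = Region(Point(0.0, 0.0), Point(6.0, 4.0))

# Retrieve the objects that fall in a specific area (linear scan; a real
# SDBMS would use a spatial index to avoid testing every object).
cities_in_area = sorted(
    name for name, centre in cities.items() if search_area.contains(centre)
)
river_points_in_area = sum(search_area.contains(p) for p in river.points)
```

A spatial index serves the same kind of query without visiting every stored object, which is what makes area retrieval fast on large data sets.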
2.4.6 Emerging Data Models: Big Data and NoSQL
Although relational and object relational databases address most current data processing needs, a new generation of databases has emerged to address some very specific challenges found in some organisations.
Deriving usable business information from the mountains of Web data that organisations have accumulated over the years has become an imperative need. Web data in the form of browsing patterns, purchasing histories, customer preferences, behaviour patterns and social media data from sources such as Facebook, Twitter and LinkedIn have inundated organisations with combinations of structured and unstructured data. According to many studies, the rapid pace of data growth is the top challenge for organisations,³ with system performance and scalability as the next biggest challenges. Today's information technology (IT) managers are constantly balancing the need to manage this rapidly growing data with shrinking budgets. The need to manage and leverage all these converging trends (rapid data growth, performance, scalability and lower costs) has triggered a phenomenon called Big Data. Big Data refers to a movement to find new and better ways to manage large amounts of Web-generated data and derive business insight from it, while simultaneously providing high performance and scalability at a reasonable cost. (You will learn about Big Data in more detail in Chapter 16, Big Data and NoSQL.)
The problem is that the relational approach does not always match the needs of organisations with Big Data challenges:
• It is not always possible to fit unstructured, social media data into the conventional relational structure of rows and columns.
• Adding millions of rows of multiformat (structured and non-structured) data on a daily basis will inevitably lead to the need for more storage, processing power and sophisticated data analysis tools that may not be available in the relational environment. Generally speaking, the type of high-volume implementations required in the RDBMS environment for the Big Data problem come with a hefty price tag for expanding hardware, storage and software licences.
• Data analysis based on OLAP tools has proven to be very successful in relational environments with highly structured data. However, mining for usable data in the vast amounts of unstructured data collected from Web sources requires a different approach.
There is no one-size-fits-all cure to data management needs (although many established database vendors will probably try to sell you on the idea). For some organisations, creating a highly scalable, fault-tolerant infrastructure for Big Data analysis could prove to be a matter of business survival. The business world has many examples of companies that leverage technology to gain a competitive advantage, and others that miss it. Just ask yourself how the business landscape would be different if:
• MySpace had responded to Facebook's challenge in time.
• Blockbuster had reacted to the Netflix business model sooner.
• Barnes & Noble had developed a viable internet strategy before Amazon.
Therefore, it is not surprising that some organisations are turning to NoSQL databases to mine the wealth of information hidden in mountains of Web data and to gain a competitive advantage.

³ See www.gartner.com/en/newsroom/press-releases/2019-02-18-gartner-identifies-top-10-data-and-analytics-technolo, Gartner Identifies Top 10 Data and Analytics Technology Trends for 2019, February 2019.
NOTE
Does this mean that relational databases don't have a place in organisations with Big Data challenges? No, relational databases remain the preferred and dominant databases to support most day-to-day transactions and structured data analytics needs. Each DBMS technology has its areas of application, and the best approach is to use the best tool for the job. In September 2019, relational databases were still significantly the most dominant DBMS technology.

2.4.7 NoSQL Databases
Every time you search for a product on Amazon, send messages to friends on Facebook, watch a video on YouTube or search for directions in Google Maps, you are using a NoSQL database. As with any new technology, the term NoSQL can be loosely applied to many different types of technologies. However, this chapter uses NoSQL to refer to a new generation of databases that address the specific challenges of the Big Data era and have the following general characteristics:
• Not based on the relational model, hence the name NoSQL.
• Supports distributed database architectures.
• Provides high scalability, high availability and fault tolerance.
• Supports very large amounts of sparse data.
• Geared towards performance rather than transaction consistency.
Let's examine these characteristics in more detail. NoSQL databases are not based on the relational model. In fact, there is no standard NoSQL data model. To the contrary, many different data models are grouped under the NoSQL umbrella, from document databases to graph stores, column stores and key-value stores. It is still too early to know which, if any, of these data models will survive and grow to become a dominant force in the database arena. However, the early success of products such as Amazon's SimpleDB, Google's Bigtable and Apache's Cassandra points to the key-value stores and column stores as the early leaders. The word stores indicates that these data models permanently store data in secondary storage, just like any other database. This added emphasis comes from the fact that these data structures originated in programming languages (such as LISP), in which in-memory arrays of values are used to hold data.
The key-value data model is based on a structure composed of two data elements: a key and a value, in which every key has a corresponding value or set of values. The key-value data model is also referred to as the attribute-value or associative data model. To better understand the key-value model, look at the simple example in Figure 2.6.
Figure 2.6 shows the example of a small truck-driving company called Trucks-R-Us. Each of the three drivers has one or more certifications and other general information. Using this example, we can draw the following important points:
• In the relational model, every row represents a single entity occurrence and every column represents an attribute of the entity occurrence. Each column has a defined data type.
• In the key-value data model, each row represents one attribute of one entity instance. The key column points to an attribute and the value column contains the actual value for the attribute.
• The data type of the value column is generally a long string to accommodate the variety of actual data types of the values placed in the column.

FIGURE 2.6 A simple key-value representation
Trucks-R-Us: data stored using the traditional relational model, and the same data stored using the key-value model.
In the relational model:
• Each row represents one entity instance.
• Each column represents one attribute of the entity.
• The values in a column are of the same data type.
In the key-value model:
• Each row represents one attribute/value of one entity instance (for example, Driver 2732).
• The key column could represent any entity's attribute.
• The values in the value column could be of any data type and therefore it is generally assigned a long string data type.
SOURCE: Course Technology/Cengage Learning
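The contrast drawn in Figure 2.6 can be sketched as a small transformation from relational-style rows to key-value rows. The driver attributes and values below are hypothetical (only the driver number 2732 appears in the figure), and the helper name to_key_value is an assumption made for this illustration.

```python
# Relational view: one row per entity instance, one column per attribute.
# The attribute names and values are invented for illustration.
drivers = [
    {"driver_id": 2732, "last_name": "Doe", "hire_year": 2015,
     "certification": None},
    {"driver_id": 2735, "last_name": "Roe", "hire_year": 2018,
     "certification": "Hazmat"},
]


def to_key_value(rows, id_column):
    """Return one (entity id, key, value) triple per non-empty attribute."""
    triples = []
    for row in rows:
        for column, value in row.items():
            if column == id_column or value is None:
                continue  # missing values simply produce no row (sparse data)
            # Every value is stored as a string, whatever its original type.
            triples.append((row[id_column], column, str(value)))
    return triples


key_value_rows = to_key_value(drivers, "driver_id")
for entity_id, key, value in key_value_rows:
    print(entity_id, key, value)
```

Two relational rows become five key-value rows here: the missing certification for driver 2732 takes up no space at all, and the hire year loses its numeric type once it is stored as a string.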
To add a new entity attribute in the relational model, you need to modify the table definition. To add a new attribute in the key-value store, you add a row to the key-value store, which is whyit is said to beschema-less. NoSQL databases do not store or enforce relationships among entities. The programmer is required to manage the relationships in the program code. Furthermore, all data and integrity validations
must be done in the
expanded to support
program
code (although
some implementations
have been
metadata).
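The contrast between the two layouts can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the driver records, the attribute names and the helper functions `to_key_value` and `get_entity` are hypothetical and do not come from Figure 2.6 or from any particular NoSQL product.

```python
from typing import Any

# Hypothetical sample data: one relational-style row per driver.
DRIVERS = [
    {"driver_id": 2732, "first_name": "Sara", "birth_year": 1989, "certified": True},
    {"driver_id": 2733, "first_name": "Tom", "birth_year": 1975, "certified": None},
]


def to_key_value(rows: list[dict[str, Any]], id_field: str) -> list[tuple[Any, str, str]]:
    """Flatten rows into (entity_id, key, value) triples; every value becomes a string."""
    triples = []
    for row in rows:
        for column, value in row.items():
            if column == id_field or value is None:
                continue  # a missing value simply produces no row
            triples.append((row[id_field], column, str(value)))
    return triples


def get_entity(triples: list[tuple[Any, str, str]], entity_id: Any) -> dict[str, str]:
    """Reassemble one entity; the program code does this work, not the store."""
    return {key: value for eid, key, value in triples if eid == entity_id}


store = to_key_value(DRIVERS, "driver_id")

# Adding a new attribute is just another row; no table definition changes.
store.append((2732, "licence_class", "C"))

# Type handling lives in the program: the store only holds strings.
birth_year = int(get_entity(store, 2732)["birth_year"])
```

Note that the conversion back to an integer happens in the calling code, which mirrors the point that validation is the programmer's responsibility in this model.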
NoSQL databases use their own native application programming interface (API) with simple data access commands, such as put, read and delete. Because there is no declarative SQL-like syntax to retrieve data, the program code must take care of retrieving related data in the correct way. Indexing and searches can be difficult. Because the value column in the key-value data model could contain many different data types, it is often difficult to create indexes on the data. At the same time, searches can become very complex.
As a matter of fact, you could use the key-value structure as a general data modelling technique when attributes are numerous but actual data values are scarce. The key-value data model is not exclusive to NoSQL databases; actually, key-value data structures could reside inside a relational database. However, because of the problems with maintaining relationships and integrity within the data, and the increased complexity of even simple queries, key-value structures would be a poor design for most structured business data.
Several NoSQL database implementations, such as Google's Bigtable and Apache's Cassandra, have extended the key-value data model to group multiple key-value sets into column families or column stores. In addition, such implementations support features such as versioning using a date/time stamp. For example, Bigtable stores data in the syntax of [row, column, time, value], where row, column and
value are string data types and time is a date/time data type. The key used to access the data is composed of (row, column, time), where time can be left blank to indicate the most recent stored value.
NoSQL supports distributed database architectures. One of the big advantages of NoSQL databases is that they generally use a distributed architecture. In fact, several of them (Cassandra, Bigtable) are designed to use low-cost commodity servers to form a complex network of distributed database nodes. Remember that several NoSQL databases originated in the research labs of some of the most successful Web companies, and most started on very small budgets!
NoSQL supports very large amounts of sparse data. NoSQL databases can handle very high volumes of data. In particular, they are suited to sparse data, that is, for cases in which the number of attributes is very large but the number of actual data instances is low. Using the preceding example, drivers can take any certification exam, but they are not required to take all. In this case, if there are three drivers and three possible certificates for each driver, there will be nine possible data points. In practice, however, there are only four data instances. Now extrapolate this example to the case of a clinic with 15 000 patients and more than 500 possible tests, remembering that each patient can take a few tests but is not required to take all.
NoSQL provides high scalability, high availability and fault tolerance. True to its Web origins, NoSQL databases are designed to support Web operations, such as the ability to add capacity in the form of nodes to the distributed database when the demand is high, and to do it transparently and without downtime. Fault tolerance means that, if one of the nodes in the distributed database fails, it will keep operating as normal.
Most NoSQL databases are geared towards performance rather than transaction consistency. One of the biggest problems with very large distributed databases is enforcing data consistency. Distributed databases automatically make copies of data elements at multiple nodes to ensure high availability and fault tolerance. If the node with the requested data goes down, the request can be served from any other node with a copy of the data. However, what happens if the network goes down during a data update? In a relational database, transaction updates are guaranteed to be consistent or the transaction is rolled back. NoSQL databases sacrifice consistency in order to attain high levels of performance. (See Chapter 12, Managing Transactions and Concurrency, to learn more about this topic.) Some NoSQL databases provide a feature called eventual consistency, which means that updates to the database will propagate through the system and eventually all data copies will be consistent. With eventual consistency, data are not guaranteed to be consistent across all copies of the data immediately after an update.
NoSQL is one of the hottest items in database technology today. But, as you learnt in Chapter 1, it is only one of many emerging trends in data management. Whichever database technology you use, you need to be able to select the best tool for the job by understanding the advantages and disadvantages of each technology. The following section briefly summarises the evolution of data models and provides some pros and cons of each.
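The sparse-data arithmetic in the driver and clinic examples can be checked with a short Python sketch. The counts (three drivers, three certificates, four actual values, 15 000 patients, 500 tests) come from the text; the function name and the individual driver, certificate and date values are hypothetical and serve only as an illustration.

```python
def possible_data_points(entities: int, attributes: int) -> int:
    """Number of cells a fully populated entity-by-attribute grid would hold."""
    return entities * attributes


# Driver example from the text: 3 drivers, 3 certificates, 4 actual values.
drivers_possible = possible_data_points(3, 3)
drivers_density = 4 / drivers_possible

# Clinic example: 15 000 patients and (at least) 500 possible tests.
clinic_possible = possible_data_points(15_000, 500)

# A key-value layout stores only the values that actually exist.
# The driver names, certificate names and dates below are hypothetical.
certifications = {
    ("driver_1", "cert_A"): "2019-03-01",
    ("driver_1", "cert_B"): "2019-06-15",
    ("driver_2", "cert_A"): "2018-11-20",
    ("driver_3", "cert_C"): "2020-01-10",
}
stored_rows = len(certifications)
```

With 500 tests the clinic grid already has 7.5 million possible cells, so a layout that stores one row per actual value avoids reserving space for the overwhelming majority that are never filled.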
2.4.8 Data Models: A Summary
The evolution of DBMSs has always been driven by the search for new ways of modelling increasingly complex real-world data. A summary of the most commonly recognised data models is shown in Figure 2.7.
In the evolution of data models, there are some common characteristics that data models must have in order to be widely accepted:
A data model must show some degree of conceptual simplicity without compromising the semantic completeness of the database. It does not make sense to have a data model that is more difficult to conceptualise than the real world.
A data model must represent the real world as closely as possible. This goal is more easily realised by adding more semantics to the model's data representation. (Semantics concern the dynamic data behaviour, while data representation constitutes the static aspect of the real-world scenario.)
Representation of the real-world transformations (behaviour) must be in compliance with the consistency and integrity characteristics of any data model.
FIGURE 2.7
The evolution of data models
(Semantics in the data model increase from least, for the earliest models, to most, for the object-orientated and extended relational models.)
1960 Hierarchical and 1969 Network: difficult to represent M:N relationships (hierarchical only); structural level dependency; no ad hoc queries (record-at-a-time access); access path predefined (navigational access).
1970 Relational: conceptual simplicity (structural independence); provides ad hoc queries (SQL); set-oriented access.
1976 Entity Relationship: easy to understand (more semantics); limited to conceptual modelling (no implementation component).
1978 Semantic, 1985 Object-Oriented and 1990 Extended Relational (O/R DBMS): more semantics in data model; support for complex objects; inheritance (class hierarchy); behaviour; unstructured data (XML). The figure also marks 1983, when the Internet is born.
2009 Big Data, NoSQL: addresses the Big Data problem; less semantics in data model; based on schema-less key-value data model; best suited for large sparse data stores.
SOURCE: Course Technology/Cengage Learning

Each new data model capitalised on the shortcomings of previous models. The network model replaced the hierarchical model because the former made it much easier to represent complex (many-to-many) relationships. In turn, the relational model offered several advantages over the hierarchical and network models through its simpler data representation, superior data independence and easy-to-use query language; it also emerged as the dominant data model for business applications. The OO data model introduced support for complex data within a rich semantic framework. The ERDM added many OO features to the relational model and allowed it to maintain strong market share within the business environment. In recent years, the Big Data phenomenon has stimulated the development of alternative ways to model, store and manage data that represents a break with traditional data management.
It is important to note that not all data models are created equal; some data models are better suited than others for some tasks. For example, conceptual models are better suited to high-level data modelling, while implementation models are better for managing stored data for implementation purposes. The entity relationship model is an example of a conceptual model, while the hierarchical and network models are examples of implementation models. At the same time, some models, such as the relational model and the OODM, could be used as both conceptual and implementation models. Table 2.2 summarises the advantages and disadvantages of the various database models.
TABLE 2.2 Advantages and disadvantages of various database models

Data model: Hierarchical
Data independence: Yes. Structural independence: No.
Advantages
1. It promotes data sharing.
2. Parent/child relationship promotes conceptual simplicity.
3. Database security is provided and enforced by DBMS.
4. Parent/child relationship promotes data integrity.
5. It is efficient with 1:M relationships.
Disadvantages
1. Complex implementation requires knowledge of physical data storage characteristics.
2. Navigational system yields complex application development, management and use; requires knowledge of hierarchical path.
3. Changes in structure require changes in all application programs.
4. There are implementation limitations (no multiparent or M:N relationships).
5. There is no data definition or data manipulation language in the DBMS.
6. There is a lack of standards.

Data model: Network
Data independence: Yes. Structural independence: No.
Advantages
1. Conceptual simplicity is at least equal to that of the hierarchical model.
2. It handles more relationship types, such as M:N and multiparent.
3. Data access is more flexible than in hierarchical and file system models.
4. Data owner/member relationship promotes data integrity.
5. There is conformance to standards.
6. It includes data definition language (DDL) and data manipulation language (DML) in DBMS.
Disadvantages
1. System complexity limits efficiency; it is still a navigational system.
2. Navigational system yields complex implementation, application development and management.
3. Structural changes require changes in all application programs.

Data model: Relational
Data independence: Yes. Structural independence: Yes.
Advantages
1. Structural independence is promoted by the use of independent tables. Changes in a table's structure do not affect data access or application programs.
2. Tabular view substantially improves conceptual simplicity, thereby promoting easier database design, implementation, management and use.
3. Ad hoc query capability is based on SQL.
4. Powerful RDBMS isolates the end user from physical-level details and improves implementation and management simplicity.
Disadvantages
1. The RDBMS requires substantial hardware and system software overhead.
2. Conceptual simplicity gives relatively untrained people the tools to use a good system poorly and, if unchecked, it may produce the same data anomalies found in file systems.
3. It may promote islands of information problems as individuals and departments can easily develop their own applications.
Data model: Entity Relationship
Data independence: Yes. Structural independence: Yes.
Advantages
1. Visual modelling yields exceptional conceptual simplicity.
2. Visual representation makes it an effective communication tool.
3. It is integrated with the dominant relational model.
Disadvantages
1. There is limited constraint representation.
2. There is limited relationship representation.
3. There is no data manipulation language.
4. Loss of information content occurs when attributes are removed from entities to avoid crowded displays. (This limitation has been addressed in subsequent graphical versions.)

Data model: Object-Orientated
Data independence: Yes. Structural independence: Yes.
Advantages
1. Semantic content is added.
2. Visual representation includes semantic content.
3. Inheritance promotes data integrity.
Disadvantages
1. Slow development of standards caused vendors to supply their own enhancements, thus eliminating a widely accepted standard.
2. It is a complex navigational system.
3. There is a steep learning curve.
4. High system overhead slows transactions.

Data model: NoSQL
Data independence: Yes. Structural independence: Yes.
Advantages
1. High scalability, availability and fault tolerance are provided.
2. It uses low-cost commodity hardware.
3. It supports Big Data.
4. Key-value model improves storage efficiency.
Disadvantages
1. Complex programming is required.
2. There is no relationship support; relationships are handled only by application code.
3. There is no transaction integrity support.
4. In terms of data consistency, it provides an eventually consistent model.
2.5 Degrees of Data Abstraction
In the early 1970s, the American National Standards Institute (ANSI) Standards Planning and Requirements Committee (SPARC) defined a framework for data modelling based on degrees of data abstraction.
To illustrate the meaning of data abstraction, consider the example of automotive design. A car designer begins by drawing the concept of the car to be produced. Next, engineers design the details that help transfer the basic concept into a structure that can be produced. Finally, the engineering drawings are translated into production specifications to be used on the factory floor. As you can see, the process of producing the car begins at a high level of abstraction and proceeds to an ever-increasing level of detail. The factory floor process cannot proceed unless the engineering details are properly specified, and the engineering details cannot exist without the basic conceptual framework created by the designer. Designing a usable database follows the same basic process. That is, a database designer starts with an abstract view of the overall data environment and adds details as the design comes closer to implementation. Using levels of abstraction can also be very helpful in integrating multiple (and sometimes conflicting) views of data as seen at different levels of an organisation.
The ANSI/SPARC architecture (as it is often referred to) defines three levels of data abstraction: external, conceptual and internal. You can use this framework to better understand database models, as shown in Figure 2.8. In the figure, the ANSI/SPARC framework has been expanded with the addition of a physical model to address physical-level implementation details of the internal model explicitly.

FIGURE 2.8 Data abstraction levels
End-user views: each end-user view corresponds to an external model.
Conceptual model (designer's view): high degree of abstraction; example: ER; hardware-independent and software-independent.
Logical independence lies between the conceptual model and the internal model.
Internal model (DBMS view): medium degree of abstraction; examples: relational, object-orientated; hardware-independent and software-dependent.
Physical independence lies between the internal model and the physical model.
Physical model: low degree of abstraction; examples: network, hierarchical; hardware-dependent and software-dependent.
2.5.1 The External Model
The external model is the end users' view of the data environment. The term end users refers to people who use the application programs to manipulate the data and generate information. End users usually operate in an environment in which an application has a specific business unit focus. Companies are generally divided into several business units, such as sales, finance and marketing. Each business unit is subject to specific constraints and requirements, and each one uses a data subset of the overall data in the organisation. Therefore, end users working within those business units view their data subsets as separate from or external to those of other units within the organisation.
As data is being modelled, ER diagrams will be used to represent the external views. A specific representation of an external view is known as an external schema. To illustrate the external model's view, examine the data environment of Tiny University. Figure 2.9 (a) and (b) presents the external schemas for two Tiny University business units: student registration and class scheduling. Each external schema includes the appropriate entities, relationships, processes and constraints imposed by the business unit. Also note that, although the application views are isolated from each other, each view shares a common entity with the other view. For example, the registration and scheduling external schemas share the entities CLASS and COURSE.
Note the entity relationships represented in Figure 2.9. For example:
A LECTURER may teach many CLASSes, and each CLASS is taught by only one LECTURER; that is, there is a 1:* relationship between LECTURER and CLASS.
A CLASS may ENROL many students, and each student may ENROL in many CLASSes, thus creating a *:* relationship between STUDENT and CLASS. (You will learn about the precise nature of the ENROL entity in Chapter 5, Data Modelling with Entity Relationship Diagrams.)
Each COURSE may generate many CLASSes, but each CLASS references a single COURSE. For example, there may be several classes (sections) of a database course having a course code of CIS-420. One of those classes may be offered on Mondays, Wednesdays and Fridays from 8:00 a.m. to 8:50 a.m., another may be offered on Mondays, Wednesdays and Fridays from 1:00 p.m. to 1:50 p.m., while a third may be offered on Thursdays from 6:00 p.m. to 8:40 p.m. Yet all three classes have the course code CIS-420.
Finally, a CLASS requires one ROOM, but a ROOM may be scheduled for many CLASSes; that is, each classroom may be used for several classes: one at 9:00 a.m., one at 11:00 a.m., and one at 1:00 p.m., for example. In other words, there is a 1:* relationship between ROOM and CLASS.
The use of external views representing subsets of the database has some important advantages:
It makes it easy to identify specific data required to support each business unit's operations.
It makes the designer's job easy by providing feedback about the model's adequacy. Specifically, the model can be checked to ensure that it supports all processes as defined by their external models, as well as all operational requirements and constraints.
It helps to ensure security constraints in the database design. Damaging an entire database is more difficult when each business unit works with only a subset of data.
It makes application program development much simpler.
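The 1:* and *:* relationships described above can be sketched with plain Python data structures. This is a minimal illustration, not the book's notation: the lecturer, student and class identifiers are hypothetical, and only the course code CIS-420 with its three sections comes from the text.

```python
from collections import defaultdict

# Hypothetical sample data. Each CLASS maps to exactly one LECTURER (1:*),
# and every class here is a section of the course CIS-420.
CLASS_LECTURER = {"CIS-420-1": "L01", "CIS-420-2": "L01", "CIS-420-3": "L02"}
CLASS_COURSE = {class_code: "CIS-420" for class_code in CLASS_LECTURER}

# ENROL pairs resolve the *:* relationship between STUDENT and CLASS.
ENROL = {("S1", "CIS-420-1"), ("S1", "CIS-420-3"), ("S2", "CIS-420-1")}


def classes_by_lecturer(class_lecturer: dict[str, str]) -> dict[str, set[str]]:
    """Invert CLASS -> LECTURER to see the 'many' side of the 1:* relationship."""
    result: dict[str, set[str]] = defaultdict(set)
    for class_code, lecturer in class_lecturer.items():
        result[lecturer].add(class_code)
    return dict(result)


def classes_of(student: str, enrol: set[tuple[str, str]]) -> set[str]:
    """All classes a student is enrolled in."""
    return {class_code for stu, class_code in enrol if stu == student}


def students_in(class_code: str, enrol: set[tuple[str, str]]) -> set[str]:
    """All students enrolled in a class."""
    return {stu for stu, code in enrol if code == class_code}
```

Because a dictionary key can hold only one value, the `CLASS_LECTURER` mapping cannot assign two lecturers to one class, whereas the set of ENROL pairs lets both sides repeat freely.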
FIGURE 2.9 External models for Tiny University
(a) Student registration
Entities: STUDENT, ENROL, CLASS, COURSE.
STUDENT (1..1) enrols_in ENROL (1..6): a student may take up to six classes per registration.
ENROL (1..35) is_taken_by CLASS (1..1): a class is limited to 35 students.
COURSE (1..1) generates CLASS (1..*).
(b) Class scheduling
Entities: ROOM, CLASS, COURSE, LECTURER.
ROOM (1..1) is_used_for CLASS (1..*): a room may be used to teach many classes; each class is taught in only one room.
COURSE (1..1) generates CLASS (1..*).
LECTURER (1..1) teaches CLASS (1..3): each class is taught by one lecturer; a lecturer may teach up to three classes.
2.5.2 The Conceptual Model
Having identified the external views, a conceptual model is used, graphically represented by an ERD (Figure 2.10), to integrate all external views into a single view. The conceptual model represents a global view of the entire database. It is a representation of data as viewed by the entire organisation. That is, the conceptual model integrates all external views (entities, relationships, constraints and processes) into a single global view of the entire data in the enterprise, known as a conceptual schema. The conceptual schema is the basis for the identification and high-level description of the main data objects (avoiding any database model specific details).
The most widely used conceptual model is the ER model. Remember that the ER model is illustrated with the help of the ERD, which is, in effect, the basic database blueprint. The ERD is used to graphically represent the conceptual schema.
The conceptual model yields some very important advantages. First, it provides a relatively easily understood bird's-eye (macro-level) view of the data environment. For example, you can get a summary of Tiny University's data environment by examining the conceptual model presented in Figure 2.10.
Second, the conceptual model is independent of both software and hardware. Software independence means that the model does not depend on the DBMS software used to implement the model. Hardware independence means that the model does not depend on the hardware used in the implementation of the model. Therefore, changes in either the hardware or the DBMS software will have no effect on the database design at the conceptual level. Generally, the term logical design is used to refer to the task of creating a conceptual data model that could be implemented in any DBMS.

FIGURE 2.10 A conceptual model for Tiny University
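The integration of external views into one conceptual schema can be sketched as a set union. In this minimal Python illustration, the entity names are those of the two Tiny University external schemas, while the constant names and the `integrate` helper are hypothetical. A real conceptual schema also integrates relationships, constraints and processes, which this sketch leaves out.

```python
# Entities that appear in each external schema of Tiny University.
REGISTRATION_VIEW = frozenset({"STUDENT", "ENROL", "CLASS", "COURSE"})
SCHEDULING_VIEW = frozenset({"ROOM", "CLASS", "COURSE", "LECTURER"})


def integrate(*external_views: frozenset[str]) -> set[str]:
    """Merge the entities of all external views into one global entity set."""
    conceptual: set[str] = set()
    for view in external_views:
        conceptual |= view
    return conceptual


conceptual_entities = integrate(REGISTRATION_VIEW, SCHEDULING_VIEW)

# Entities common to both views are the points where the views connect.
shared_entities = REGISTRATION_VIEW & SCHEDULING_VIEW
```

Each external view remains a subset of the integrated result, which is what allows a business unit to keep working with only its own portion of the data.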
2.5.3 The Internal Model
Once a specific DBMS has been selected, the internal model maps the conceptual model to the DBMS. The internal model is the representation of the database as seen by the DBMS. In other words, the internal model requires the designer to match the conceptual model's characteristics and constraints to those of the selected implementation model. An internal schema depicts a specific representation of an internal model, using the database constructs supported by the chosen database.
Since this book focuses on the relational model, a relational database was chosen to implement the internal model. Therefore, the internal schema should map the conceptual model to the relational model constructs. In particular, the entities in the conceptual model are mapped to tables in the relational model. Likewise, since a relational database has been selected, the internal schema is expressed using SQL, the standard language for relational databases. In the case of the conceptual model for Tiny University depicted in Figure 2.10, the internal model was implemented by creating the tables LECTURER, COURSE, CLASS, STUDENT, ENROL and ROOM. A simplified version of the internal model for Tiny University is shown in Figures 2.11 (a) and (b).
The development of a detailed internal model is especially important to database designers who work with hierarchical or network models because those models require very precise specification of data storage location and data access paths. In contrast, the relational model requires less detail in its internal model because most RDBMSs handle data access path definition transparently; that is, the designer need not be aware of the data access path details. Nevertheless, even relational database software usually requires data storage location specification, especially in a mainframe environment. For example, DB2 requires that the data storage group, the location of the database within the storage group, and the location of the tables within the database be specified.
Because the internal model depends on specific database software, it is said to be software-dependent. Therefore, a change in the DBMS software requires that the internal model be changed to fit the characteristics and requirements of the implementation database model. When you can change the internal model without affecting the conceptual model, you have logical independence. However, the internal model is still hardware-independent, because it is unaffected by the choice of the computer on which the software is installed. Therefore, a change in storage devices or even a change in operating systems will not affect the internal model.

2.5.4 The Physical Model
The physical model operates at the lowest level of abstraction, describing the way data are saved on storage media such as disks or tapes. The physical model requires the definition of both the physical storage devices and the (physical) access methods required to reach the data within those storage devices, making it both software- and hardware-dependent. The storage structures used are dependent on the software (DBMS, operating system) and on the type of storage devices that the computer can handle. The precision required in the physical model's definition demands that database designers who work at this level have a detailed knowledge of the hardware and software used to implement the database design.
Early data models forced the database designer to take the details of the physical model's data storage requirements into account. However, the now-dominant relational model is aimed largely at the logical rather than the physical level; therefore, it does not require the physical-level details common to its predecessors.
FIGURE 2.11 An internal model for Tiny University
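An internal schema of the kind Figure 2.11 depicts might be expressed in SQL roughly as follows. This is a sketch under stated assumptions: the six table names come from the text, but every column name, data type and key choice is hypothetical, and SQLite (through Python's standard `sqlite3` module) merely stands in for whichever relational DBMS is selected.

```python
import sqlite3

# Table names are from the text; all columns and keys are hypothetical.
DDL = """
CREATE TABLE LECTURER (LECT_ID INTEGER PRIMARY KEY, LECT_NAME TEXT NOT NULL);
CREATE TABLE COURSE   (COURSE_CODE TEXT PRIMARY KEY, COURSE_TITLE TEXT NOT NULL);
CREATE TABLE ROOM     (ROOM_CODE TEXT PRIMARY KEY);
CREATE TABLE CLASS (
    CLASS_CODE  INTEGER PRIMARY KEY,
    COURSE_CODE TEXT    NOT NULL REFERENCES COURSE (COURSE_CODE),
    LECT_ID     INTEGER NOT NULL REFERENCES LECTURER (LECT_ID),
    ROOM_CODE   TEXT    NOT NULL REFERENCES ROOM (ROOM_CODE),
    CLASS_TIME  TEXT
);
CREATE TABLE STUDENT (STU_ID INTEGER PRIMARY KEY, STU_NAME TEXT NOT NULL);
CREATE TABLE ENROL (
    STU_ID     INTEGER NOT NULL REFERENCES STUDENT (STU_ID),
    CLASS_CODE INTEGER NOT NULL REFERENCES CLASS (CLASS_CODE),
    PRIMARY KEY (STU_ID, CLASS_CODE)
);
"""


def build_schema() -> sqlite3.Connection:
    """Create the six tables in a fresh in-memory database."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(DDL)
    return conn


def table_names(conn: sqlite3.Connection) -> list[str]:
    """List the tables the DBMS now knows about, in alphabetical order."""
    rows = conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    )
    return [row[0] for row in rows]
```

The ENROL table carries a composite key built from the two foreign keys, which is one common way to implement a *:* relationship between STUDENT and CLASS in relational tables.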
Although the relational model does not require the designer to be concerned about the data's physical storage characteristics, the implementation of a relational model may require physical-level fine-tuning for increased performance. Fine-tuning is especially important when very large databases are installed in a mainframe environment. Yet even such performance fine-tuning at the physical level does not require knowledge of physical data storage characteristics.
As noted earlier, the physical model is dependent on the DBMS, file level access methods and types of hardware storage devices supported by the operating system. When you can change the physical model without affecting the internal model, you have physical independence. Therefore, a change in storage devices or methods and even a change in operating system will not affect the internal model.
A summary of the levels of data abstraction is given in Table 2.3.
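Physical independence can be illustrated with a small experiment: change something about how the data are accessed and confirm that queries written against the tables still return the same answer. The sketch below uses SQLite through Python's `sqlite3` module. Treating an index as a physical-level access method is an assumption made for illustration, and the table contents, the index name and the helper function are hypothetical.

```python
import sqlite3


def enrolled_students(conn: sqlite3.Connection, class_code: int) -> list[int]:
    """Query written purely against the table definition."""
    rows = conn.execute(
        "SELECT STU_ID FROM ENROL WHERE CLASS_CODE = ? ORDER BY STU_ID",
        (class_code,),
    )
    return [row[0] for row in rows]


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE ENROL (STU_ID INTEGER, CLASS_CODE INTEGER)")
conn.executemany("INSERT INTO ENROL VALUES (?, ?)", [(1, 10), (2, 10), (3, 11)])

before = enrolled_students(conn, 10)

# Physical-level tuning: add an access structure without touching the table.
conn.execute("CREATE INDEX ENROL_CLASS_IDX ON ENROL (CLASS_CODE)")

after = enrolled_students(conn, 10)
```

The query text is identical before and after the index is created; only the way the DBMS reaches the rows may differ.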
TABLE 2.3 Levels of data abstraction

Model        Degree of abstraction   Focus                                                  Independent of
External     High                    End-user views                                         Hardware and software
Conceptual                           Global view of data (independent of database model)    Hardware and software
Internal                             Specific database model                                Hardware
Physical     Low                     Storage and access methods                             Neither hardware nor software
SUMMARY

A data model is a (relatively) simple abstraction of a complex real-world data environment. Database designers use data models to communicate with applications programmers and end users. The basic data-modelling components are entities, attributes, relationships and constraints. Business rules are used to identify and define the basic modelling components within a specific real-world environment.

The hierarchical and network data models were early data models that are no longer used, but some of the concepts are found in current data models.

The relational model is the current database implementation standard. In the relational model, the end user perceives the data as being stored in tables. Tables are related to each other by means of common values in common attributes. The entity relationship (ER) model is a popular graphical tool for data modelling that complements the relational model. The ER model allows database designers, programmers and end users to visually present different views of the data and to integrate the data into a common framework.

The object-orientated data model (OODM) uses objects as the basic modelling structure. An object resembles an entity in that it includes the facts that define it. But unlike an entity, the object also includes information about relationships between the facts as well as relationships with other objects, thus giving its data more meaning.

The relational model has adopted many object-orientated (OO) extensions to become the extended relational data model (ERDM). At this point, the OODM is largely used in specialised engineering and scientific applications, while the ERDM is primarily geared to business applications. Although the most likely future scenario is an increasing merger of OODM and ERDM technologies, both are overshadowed by the need to develop internet access strategies for databases.

NoSQL databases are a new generation of databases that do not use the relational model and are geared to support the very specific needs of Big Data organisations. NoSQL databases offer distributed data stores that provide high scalability, availability and fault tolerance by sacrificing data consistency and shifting the burden of maintaining relationships and data integrity to the program code.

Data modelling requirements are a function of different data views (global vs local) and the level of data abstraction. The American National Standards Institute Standards Planning and Requirements Committee (ANSI/SPARC) describes three levels of data abstraction: external, conceptual and internal. There is also a fourth level of data abstraction (the physical level). This lowest level of data abstraction is concerned exclusively with physical storage methods.
KEY TERMS

American National Standards Institute (ANSI)
attribute
Big Data
business rule
class
class diagram
class hierarchy
conceptual model
conceptual schema
connectivity
constraint
Crow's Foot notation
data definition language (DDL)
data manipulation language (DML)
data models
entity
entity instance
entity occurrence
entity relationship (ER) model (ERM)
entity relationship diagram (ERD)
entity set
extended relational data model (ERDM)
external model
external schema
hardware independence
hierarchical model
inheritance
internal model
internal schema
logical design
logical independence
many-to-many (*:*) relationship
method
network model
NoSQL
object
object-orientated data model (OODM)
object-orientated database management system (OODBMS)
object relational database management system (ORDBMS)
one-to-many (1:*) relationship
one-to-one (1:1) relationship
physical independence
physical model
relational database management system (RDBMS)
relational diagram
relational model
relations
relationship
schema
semantic data model
software independence
subschema
table
Unified Modelling Language (UML)

FURTHER READING

Blaha, M. and Premerlani, W. Object-Oriented Modelling and Design for Database Applications. Prentice Hall, 1998.
Chen, P. 'The entity-relationship model: towards a unified view of data', ACM Transactions on Database Systems, 1(1), 1976.
Codd, E.F. 'A relational model of data for large shared databanks', Communications of the ACM, pp. 377-387, 1970.
Codd, E.F. 'A database sublanguage founded on relational calculus', Proceedings of the ACM SIGFIDET Conference on Data Description, Access and Control, pp. 35-68, 1971.
Codd, E.F. The Relational Model for Database Management, Version 2. Addison-Wesley, 1990.
Lausen, G. and Vossen, G. Models and Languages of Object Orientated Databases. Addison-Wesley, 1998.
ORACLE, Oracle NoSQL Database Documentation, 2019 [online]. Available: https://docs.oracle.com/en/database/other-databases/nosql-database/index.html
Thalheim, B. Entity-Relationship Modelling: Foundations of Database Technology. Springer, 2000.
ONLINE CONTENT Answers to selected Review Questions and Problems for this chapter can be found on the online platform for this book.

REVIEW QUESTIONS

1 Discuss the importance of data modelling.
2 What is a business rule, and what is its purpose in data modelling?
3 How would you translate business rules into data model components?
4 Describe the basic features of the relational data model and discuss their importance to the end user and the designer.
5 Explain how the entity relationship (ER) model helped produce a more structured relational database design environment.
6 Use the scenario described by 'A customer can make many payments, but each payment is made by only one customer' as the basis for an entity relationship diagram (ERD) presentation. Show your answer using UML class diagram notation.
7 Why is an object said to have greater semantic content than an entity?
8 What is the difference between an object and a class in the object-orientated data model (OODM)?
9 How would you model Question 6 with an OODM? (Use Figure 2.7 as your guide.)
10 What is an ERDM, and what role does it play in the modern (production) database environment?
11 What is a relationship, and which three types of relationships exist?
12 Give an example of each of the three types of relationships.
13 What is a table, and what role does it play in the relational model?
14 What is a relational diagram? Give an example.
15 What is connectivity? Draw ERDs to illustrate connectivity.
16 Describe the Big Data phenomenon.
17 What is sparse data? Give an example.
18 Define and describe the basic characteristics of a NoSQL database.
19 Describe the key-value data model.
20 Using the example of a medical clinic with patients and tests, provide a simple representation of how to model this example using the relational model and how it would be represented using the key-value data modelling technique.
21 What is logical independence?
22 What is physical independence?
PROBLEMS

Use the contents of Figure 2.3 on p. 46 to work Problems 1-5.

1 Write the business rule(s) that govern the relationship between AGENT and CUSTOMER.
2 Given the business rule(s) you wrote in Problem 1, create a basic UML class ERD.
3 If the relationship between AGENT and CUSTOMER were implemented in a hierarchical model, what would the hierarchical structure look like? Label the structure fully, identifying the root segment and the Level 1 segment.
4 If the relationship between AGENT and CUSTOMER were implemented in a network model, what would the network model look like? (Identify the record types and the set.)
5 Using the ERD you drew in Problem 2, create the equivalent OO model. (Use Figure 2.7 on p. 55 as your guide.)

Using Figure P2.1 as your guide, answer Problem 6. The DealCo class ERD shows the initial entities and attributes for the DealCo stores, located in two regions of the country.

FIGURE P2.1 The DealCo class ERD
[Figure not reproduced.]

6 Identify each relationship type and write all of the business rules.

Using Figure P2.2 as your guide, answer Problems 7-9. The Tiny University class ERD shows the initial entities and attributes for Tiny University.

FIGURE P2.2 The Tiny University class ERD
[Figure not reproduced. Recoverable label: ENROL_GRADE.]

7 Identify each relationship type and write all of the business rules.
8 A hospital patient receives medications that have been ordered by a particular doctor. Because the patient often receives several medications per day, there is a 1:* relationship between PATIENT and ORDER. Similarly, each order can include several medications, creating a 1:* relationship between ORDER and MEDICATION.
  a Identify the business rules for PATIENT, ORDER and MEDICATION.
  b Create an ERD that depicts a relational database model to capture these business rules.
9 United Broke Artists (UBA) is a broker for not-so-famous artists. UBA maintains a small database to track painters, paintings and galleries. A painting is created by a particular artist and then exhibited in a particular gallery. A gallery can exhibit many paintings, but each painting can be exhibited in only one gallery. Similarly, a painting is created by a single painter, but each painter can create many paintings. Using PAINTER, PAINTING and GALLERY, in terms of a relational database:
  a Which tables would you create, and what would the table components be?
  b How might the (independent) tables be related to one another?
  c Draw the complete ERD.
10 Using the ERD from Problem 9, create the relational schema. (Create an appropriate collection of attributes for each of the entities. Make sure you use the appropriate naming conventions to name the attributes.)
11 Describe the relationships (identify the business rules) depicted in the ERD shown in Figure P2.3.
12 Convert the ERD from Problem 11 into a UML class diagram.
13 Describe the relationships shown in the ERD in Figure P2.4.
14 Create a UML ERD for each of the following descriptions. (Note: The word many merely means more than one in the database modelling environment.)
  a Each of the MegaCo Corporation's divisions is composed of many departments. Each of those departments has many employees assigned to it, but each employee works for only one department. Each department is managed by one employee, and each of those managers can manage only one department at a time.
  b During a period of time, a customer can rent many DVDs from the BigVid store. Each of the BigVid's DVDs can be rented to many customers during that period of time.
  c An airliner can be assigned to fly many flights, but each flight is flown by only one airliner.
  d The KwikTite Corporation operates many factories. Each factory is located in a region. Each region can be home to many of KwikTite's factories. Each factory employs many employees, but each of those employees is employed by only one factory.
  e An employee may have earned many degrees, and each degree may have been earned by many employees.

FIGURE P2.3 The Crow's Foot ERD for Problem 11
[Figure not reproduced. Recoverable labels: entities LECTURER, CLASS and STUDENT; relationships Teaches and Advises.]

FIGURE P2.4 The UML ERD for Problem 13
[Figure not reproduced.]

NOTE Many-to-many (*:*) relationships exist at a conceptual level, and you should know how to recognise them. However, you will learn in Chapter 3, Relational Model Characteristics, that *:* relationships are not appropriate in a relational model.
CHAPTER 3 Relational Model Characteristics

IN THIS CHAPTER, YOU WILL LEARN:

That the relational database model takes a logical view of data
That the relational model's basic components are relations implemented through tables in a relational DBMS
How relations are organised in tables composed of rows (tuples) and columns (attributes)
Key terminology used in describing relations
About the role of the data dictionary, and the system catalogue
How data redundancy is handled in the relational database model
Why indexing is important

PREVIEW

In Chapter 2, Data Models, you learnt that the relational data model's structural and data independence allow you to examine the model's logical structure without considering the physical aspects of data storage and retrieval. You also learnt that ERDs may be used to depict entities and their relationships graphically. In this chapter, you will learn some important details about the relational model's logical structure and more about how the ERD can be used to design a relational database.

You will learn how the relational database's basic data components fit into a logical construct known as a table. You will discover that one important reason for the relational database model's simplicity is that its tables can be treated as logical rather than physical units. You will also learn how the independent tables within the database can be related to one another.

After learning about tables, their components and their relationships, you will be introduced to the basic concepts that shape the design of tables. Because the table is such an integral part of relational database design, you will also learn the characteristics of well-designed and poorly designed tables.

Finally, you will be introduced to some basic concepts that will become your gateway to the next few chapters. For example, you will examine different kinds of relationships and the way those relationships might be handled in the relational database environment.
NOTE The relational model, introduced by E.F. Codd in 1970, is based on predicate logic and set theory.

Predicate logic, used extensively in mathematics, provides a framework in which an assertion (statement of fact) can be verified as either true or false. For example, assume that a student with a student ID of 12345678 is named Cela Nkosi. This assertion can easily be demonstrated to be true or false.

Set theory is a mathematical science that deals with sets, or groups of things, and is used as the basis for data manipulation in the relational model. For example, assume that set A contains three numbers, 16, 24 and 77. This set is represented as A(16, 24, 77). Furthermore, set B contains four numbers, 44, 77, 90 and 11, and so is represented as B(44, 77, 90, 11). Given this information, you can conclude that the intersection of A and B yields a result set with a single number, 77. This result can be expressed as A ∩ B = 77. In other words, A and B share a common value, 77.

Based on these concepts, the relational model has three well-defined components:

1 A logical data structure represented by the relational table, where data are stored (Sections 3.1, 3.2 and 3.4).
2 A set of integrity rules to enforce that the data are and remain consistent over time (Sections 3.3, 3.5, 3.6 and 3.7).
3 A set of operations that define how data are manipulated (Chapter 4, Relational Algebra and Calculus).
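The set example in the note above can be checked with a few lines of Python. This is only an illustrative sketch: the sets A and B come from the note, while the helper name is_member is an assumption introduced here to stand in for a predicate (an assertion that is either true or false).

```python
# Sets A and B as given in the note.
A = {16, 24, 77}
B = {44, 77, 90, 11}

# The intersection keeps only the values that appear in both sets.
common = A & B
print(common)

# Hypothetical helper: a predicate is an assertion that is either True or False.
def is_member(value, collection):
    return value in collection

print(is_member(77, A), is_member(16, B))
```

The intersection operator mirrors the expression A ∩ B in the note, and the predicate shows the true-or-false character of an assertion about membership.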
3.1 A LOGICAL VIEW OF DATA

In Chapter 1, The Database Approach, you learnt that a database stores and manages both data and metadata. You also learnt that the DBMS manages and controls access to the data and the database structure. Such an arrangement (placing the DBMS between the application and the database) eliminates most of the file system's inherent limitations. The result of such flexibility, however, is a far more complex physical structure. In fact, the database structures required by both the hierarchical and network database models often become complicated enough to diminish efficient database design. The relational data model changed all of that by allowing the designer to focus on the logical representation of the data and their relationships, rather than on the physical storage details. To use an automotive analogy, the relational database uses an automatic transmission to relieve you of the need to manipulate clutch pedals and gear levers. In short, the relational model enables you to view data logically rather than physically.

The practical significance of taking the logical view is that it serves as a reminder of the simple file concept of data storage. Although the use of a table, quite unlike that of a file, has the advantages of structural and data independence, a table does resemble a file from a conceptual point of view. Since you can think of related records as being stored in independent tables, the relational database model is much easier to understand than its hierarchical and network database predecessors. Greater logical simplicity tends to yield simpler and more effective database design methodologies.

As the table plays such a prominent role in the relational model, it deserves a closer look. Therefore, our discussion begins with an exploration of the details of table structure and contents.
NOTE Relational database terminology is very precise. Unfortunately, file system terminology sometimes creeps into the database environment. Thus, rows are sometimes referred to as records and columns are sometimes labelled as fields. Occasionally, tables are labelled files. Technically speaking, this substitution of terms is not always appropriate; the database table is a logical rather than a physical concept, and the terms file, record and field describe physical concepts. Nevertheless, as long as you recognise that the table is actually a logical rather than a physical construct, you may (at the conceptual level) think of table rows as records and of table columns as fields. In fact, many database software vendors still use this familiar file system terminology.
3.1.1 Tables and Their Characteristics

The logical view of the relational database is facilitated by the creation of data relationships based on a logical construct known as a table. A table is perceived as a two-dimensional structure composed of rows and columns. As far as the table's user is concerned, a table contains a group of related entities, that is, an entity set; for that reason, the terms entity set and table are often used interchangeably. A table is also called a relation because the relational model's creator, E.F. Codd, used the term relation as a synonym for table. You can think of a table as a persistent relation, that is, a relation whose contents can be permanently saved for future use. Within the relational model, columns of tables are referred to as attributes and rows of tables are known as tuples.
NOTE The concept of a relation is modelled on a mathematical construct and therefore must follow a certain restricted set of rules. For example, every relation within the database must have a distinct name. In mathematics, a relation is formally defined as:

Given a number of sets D1, D2, ..., Dn (which are not necessarily distinct), R is a relation on these n sets if it is a set of tuples each of which has its first element from D1, second element from D2 and so on.

Let's examine this formal definition with an example. Assume we have two sets (n = 2), one representing the last names of students (STU_LNAME) and one representing the codes of the departments (DEPT_CODE) where they are enrolled:

STU_LNAME = {Ndlovu, Smithson, Le Roux, Ismail}
DEPT_CODE = {BIOL, CIS, EDU}

Then a relation R can be defined over the sets as:

R = {(Ndlovu, BIOL), (Smithson, CIS), (Le Roux, EDU), (Ismail, EDU)}

So, as you can see, a relation is simply a set of ordered pairs.

Table 3.1 shows the properties of a relation that tables must conform to.

TABLE 3.1 Properties of a relation

1 A table is perceived as a two-dimensional structure composed of rows and columns.
2 Each table row (tuple) represents a single entity occurrence within the entity set and must be distinct. Duplicate rows are not allowed in a relation.
3 Each table column represents an attribute, and each column has a distinct name.
4 Each cell or column/row intersection in a relation should contain only an atomic value, that is, a single data value. Multiple values are not allowed in the cells of a relation.
5 All values in a column must conform to the same data format. For example, if the attribute is assigned an integer data format, all values in the column representing that attribute must be integers.
6 Each column has a specific range of values known as the attribute domain.
7 The order of the rows and columns is immaterial to the DBMS.
8 Each table must have an attribute or a combination of attributes that uniquely identifies each row.

Figure 3.1 shows two tables: LECTURER and COURSE. The table LECTURER conforms to all of the rules listed in Table 3.1 and hence is a relation. The table COURSE, however, is not a relation because the column COURSE_NAME contains multiple values. For example, CRS_CODE CIS-420 is associated with three COURSE_NAME values: Database Design and Implementation; Introduction to Databases; and Data Modelling: An Introduction.
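The formal definition of a relation in the note above can be illustrated with a short Python sketch. The two sets and the relation R are taken from the text; the function name is_relation_over is hypothetical and is used here only to express the idea that every tuple must draw its first element from the first set, its second element from the second set, and so on.

```python
from itertools import product

# The two sets (domains) from the note.
STU_LNAME = {"Ndlovu", "Smithson", "Le Roux", "Ismail"}
DEPT_CODE = {"BIOL", "CIS", "EDU"}

# The relation R from the note: a set of ordered pairs.
R = {
    ("Ndlovu", "BIOL"),
    ("Smithson", "CIS"),
    ("Le Roux", "EDU"),
    ("Ismail", "EDU"),
}

def is_relation_over(relation, *domains):
    """Return True if each tuple takes its i-th element from the i-th domain.

    This is equivalent to checking that the relation is a subset of the
    Cartesian product of the domains.
    """
    return relation <= set(product(*domains))

print(is_relation_over(R, STU_LNAME, DEPT_CODE))
print(is_relation_over({("Ndlovu", "MATH")}, STU_LNAME, DEPT_CODE))
```

Because R is stored as a Python set, duplicate tuples cannot occur, which matches the rule that duplicate rows are not allowed in a relation.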
FIGURE 3.1(a) The relation LECTURER
Table name: LECTURER

EMP_NUM   LECTURER_OFFICE   LECTURER_EXTENSION   LECTURER_HIGH_DEGREE
103       DRE 156           6783                 PhD
104       DRE 102           5561                 MA
105       KLR 229D          8665                 PhD
106       KLR 126           3899                 PhD
110       AAK 160           3412                 PhD
114       KLR 211           4436                 PhD
155       AAK 201           4440                 PhD

FIGURE 3.1(b) The non-relational table COURSE
Table name: COURSE
[Table not fully recoverable. Columns: CRS_CODE and COURSE_NAME. Course codes shown: CIS-220, CIS-420 and QM-261. CIS-420 is listed with three course names: Database Design and Implementation; Introduction to Databases; Data Modelling: An Introduction. Other course names shown: Introduction to Computer Science; Assembly Language Programming; Intro. to Statistics; Statistical Applications.]

Applying the concepts of relations to database models allows us to define a relational schema for each entity. A relational schema is a textual representation of the database tables, where each table is described by its name followed by the list of its attributes in parentheses.
NOTE A relational schema R can be formally defined as R = {a1, a2, ..., an} where a1...an is a set of attributes belonging to the relation.

For example, consider the database table LECTURER in Figure 3.1. The relational schema for LECTURER can be written as:

LECTURER(EMP_NUM, LECTURER_OFFICE, LECTURER_EXTENSION, LECTURER_HIGH_DEGREE)
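As a small illustration of the notation, the following Python sketch builds the textual schema from a table name and an attribute list. The function name format_schema is an assumption made for this example; it is not part of any DBMS.

```python
def format_schema(table_name, attributes):
    """Write a relational schema as NAME(attr1, attr2, ...)."""
    return f"{table_name}({', '.join(attributes)})"

# Attribute names taken from the LECTURER table in Figure 3.1.
lecturer_attributes = [
    "EMP_NUM",
    "LECTURER_OFFICE",
    "LECTURER_EXTENSION",
    "LECTURER_HIGH_DEGREE",
]

schema = format_schema("LECTURER", lecturer_attributes)
print(schema)
```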
3.1.2 Attributes and Domains

Each attribute is a named column within the relational table and draws its values from a domain. A domain is the set of possible values for this attribute. For example, an attribute called STU_CLASS, which stores the student's classification whilst at university, may have the following domain {UG1, UG2, UG3, PG, Other}, which means that STU_CLASS can only have one of these values within the database. The domain of values for an attribute should contain only atomic values and any one value should not be divisible into components. In addition, no attributes with more than one value are allowed. (These are often referred to as multi-valued attributes.) For example, the value of STU_CLASS could not be UG1 and UG2 at the same time. Each domain is also defined by its data type, for example, character string, number, date, etc.

The fundamental principle of the relational model is that relating different entities to one another is achieved by comparisons of their values. A pair of attribute values can only be meaningfully compared if their values are drawn from the same domains. For example, the columns STU_POSTCODE and LECT_POSTCODE may be in two different relational tables, but would share the common domain of all postal codes and could be compared. In contrast, it would be nonsense to try to match the attribute STU_NAME with STU_CLASS, even though the domains are defined by the same data type (character string).
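The idea of a domain can be sketched in Python as a fixed set of permitted values. The domain below is the STU_CLASS domain given in the text; the function name in_domain and the decision to reject anything that is not a single string are assumptions made for this illustration.

```python
# Domain of STU_CLASS as stated in the text.
STU_CLASS_DOMAIN = frozenset({"UG1", "UG2", "UG3", "PG", "Other"})

def in_domain(value, domain):
    """Return True only for a single (atomic) string drawn from the domain.

    A tuple or list such as ("UG1", "UG2") would be a multi-valued
    attribute, so it is rejected before the membership test.
    """
    return isinstance(value, str) and value in domain

print(in_domain("UG1", STU_CLASS_DOMAIN))
print(in_domain("UG4", STU_CLASS_DOMAIN))
print(in_domain(("UG1", "UG2"), STU_CLASS_DOMAIN))
```

Note that the data type alone does not define the domain: "UG4" is a character string, yet it lies outside the STU_CLASS domain.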
3.1.3 Degree and Cardinality

Degree and cardinality are two important properties of the relational model. A relation with N columns and M rows is said to be of degree N and cardinality M. The degree of a relation is the number of its attributes and the cardinality of a relation is the number of its tuples. The product of a relation's degree and cardinality is the number of attribute values it contains. Figure 3.2 shows the relational table DEPARTMENT with a degree of 4 and a cardinality of 4. The product of the degree and cardinality of DEPARTMENT is 16 (4 * 4) and, as you can see in Figure 3.2, it contains 16 attribute values.
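The arithmetic can be verified with a minimal Python sketch. The column names and rows are those of the DEPARTMENT table in Figure 3.2; representing the table as a tuple of column names plus a list of row tuples is simply a convenient assumption for the example.

```python
# Column names and rows of the DEPARTMENT table from Figure 3.2.
DEPARTMENT_COLUMNS = ("DEPT_CODE", "DEPT_NAME", "DEPT_ADDRESS", "DEPT_EXTENSION")
DEPARTMENT_ROWS = [
    ("ACCT", "Accounting", "KLR 211, Box 52", "3119"),
    ("ART", "Fine Arts", "BBG 185, Box 128", "2278"),
    ("BIOL", "Biology", "AAK 230, Box 415", "4117"),
    ("CIS", "Computer Info. Systems", "KLR 333, Box 56", "3245"),
]

degree = len(DEPARTMENT_COLUMNS)      # number of attributes (columns)
cardinality = len(DEPARTMENT_ROWS)    # number of tuples (rows)

# Counting the stored values directly should agree with degree * cardinality.
value_count = sum(len(row) for row in DEPARTMENT_ROWS)

print(degree, cardinality, value_count)
```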
FIGURE 3.2 Degree and cardinality of the DEPARTMENT relation
Table name: DEPARTMENT

DEPT_CODE   DEPT_NAME                DEPT_ADDRESS       DEPT_EXTENSION
ACCT        Accounting               KLR 211, Box 52    3119
ART         Fine Arts                BBG 185, Box 128   2278
BIOL        Biology                  AAK 230, Box 415   4117
CIS         Computer Info. Systems   KLR 333, Box 56    3245

Degree = 4 (columns); Cardinality = 4 (rows)
NOTE The word relation, also known as a dataset in Microsoft Access, is based on the mathematical set theory from which Codd derived his model. Since the relational model uses attribute values to establish relationships among tables, many database users incorrectly assume that the term relation refers to such relationships. Many then incorrectly conclude that only the relational model permits the use of relationships.

3.1.4 Summary of Relational Characteristics

You will discover that the table view of data makes it easy to spot and define entity relationships, thereby greatly simplifying the task of database design. The tables shown in Figure 3.3 illustrate how the properties of a relation listed in Table 3.1 can be applied to a database table.

FIGURE 3.3 STUDENT table attribute values
Database name: Ch03_TinyUniversity
Table name: STUDENT

STU_NUM  STU_LNAME  STU_FNAME  STU_INIT  STU_DOB      STU_HRS  STU_CLASS  STU_GPA  STU_TRANSFER  DEPT_CODE  STU_PHONE  LECT_NUM
321452   Ndlovu     Amehlo     C         12-Feb-1999  42       UG3        2.84     No            BIOL       2134       205
324257   Smithson   Anne       K         15-Nov-2000  81       UG2        3.27     Yes           CIS        2256       222
324258   Le Roux    Dan                  23-Aug-2000  36       UG3        2.26     Yes           ACCT       2256       228
324269   Oblonski   Walter     H         16-Sep-1996  66       UG2        3.09     No            CIS        2114       222
324273   Smith      John       D         30-Dec-1998  102      PG         2.11     Yes           ENGL       2231       199
324274   Katinga    Raphael    P         21-Oct-1999  114      PG         3.15     No            ACCT       2267       228
324291   Ismail     Hemalika   T         08-Apr-1999  120      PG         3.87     No            EDU        2267       311
324299   Smith      John       B         30-Nov-2001  15       UG1        2.92     No            ACCT       2315       230

STU_DOB = Student date of birth
STU_HRS = Credit hours earned
STU_CLASS = Student classification
STU_GPA = Grade point average
STU_PHONE = 4-digit campus phone extension
LECT_NUM = Number of the lecturer who is the student's advisor
Using the STUDENT table shown in Figure 3.3, you can draw the following conclusions corresponding to the points in Table 3.1:
1 The STUDENT table shown in Figure 3.3 is perceived to be a two-dimensional structure composed of eight rows (tuples) and twelve columns (attributes). You can also describe the table as being composed of eight records and twelve fields. The cardinality of STUDENT is therefore 8 and the degree is 12.
2 Each row in the STUDENT table describes a single entity occurrence within the entity set. (The entity set is represented by the STUDENT table.) Note that the row (entity or record) defined by STU_NUM = 321452 defines the characteristics (attributes or fields) of a student named Amehlo C. Ndlovu. For example, row 4 in Figure 3.3 describes a student named Walter H. Oblonski. Similarly, row 3 describes a student named Dan le Roux. Given the table contents, the STUDENT entity set includes eight distinct entities (rows).
3 Each column represents an attribute, and each column has a distinct name.
4 All of the values in a column match the entity's characteristics. For example, the grade point average (STU_GPA) column contains only STU_GPA entries for each of the table rows. Data must be classified according to their format and function. Although various DBMSs can support different data types, most support at least the following:
a Numeric. Numeric data are data on which you can perform meaningful arithmetic procedures. For example, STU_HRS and STU_GPA in Figure 3.3 are numeric attributes. On the other hand, STU_PHONE is not a numeric attribute because adding or subtracting phone numbers does not yield an arithmetically meaningful result.
b Character. Character data, also known as text data or string data, can contain any character symbol not intended for mathematical manipulation. In Figure 3.3, for example, STU_LNAME, STU_FNAME, STU_INIT, STU_CLASS and STU_PHONE are character, text or string attributes.
c Date. Date attributes contain calendar dates stored in a special format known as the Julian date format. In Figure 3.3, STU_DOB is a date attribute.
d Logical. Logical data can have only a true or false (yes or no) condition. For example, is a student a university transfer? In Figure 3.3, the STU_TRANSFER attribute uses a logical data format. Most, but not all, relational database software packages support the logical data format. (Microsoft Access uses the label Yes/No data type to indicate a logical data type, whereas Oracle uses a data type known as Boolean, which can have values TRUE, FALSE and NULL.)
5 The column's range of permissible values is known as its domain. Because the STU_GPA values are limited to the range 0-4, inclusive, the domain is [0,4].
6 The order of rows and columns is immaterial to the user.
7 Each table must have a primary key. In general terms, the primary key (PK) is an attribute (or a combination of attributes) that uniquely identifies any given row. In this case, STU_NUM (the student number) is the primary key. Using the data presented in Figure 3.3, observe that a student's last name (STU_LNAME) would not be a good primary key because it is possible to find several students whose last name is Smith. Even the combination of the last name and first name (STU_FNAME) would not be an appropriate primary key because, as Figure 3.3 shows, it is quite possible to find more than one student named John Smith.
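The table characteristics listed above (data types, a domain and a primary key) can be sketched with Python's built-in sqlite3 module. This is only an illustration under stated assumptions: the column subset, the SQLite type names and the sample student numbers 1001 to 1003 are made up for the example and are not taken from Figure 3.3.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE STUDENT (
        STU_NUM      INTEGER PRIMARY KEY,                    -- primary key
        STU_LNAME    TEXT NOT NULL,                          -- character data
        STU_FNAME    TEXT NOT NULL,                          -- character data
        STU_HRS      INTEGER,                                -- numeric data
        STU_GPA      REAL CHECK (STU_GPA BETWEEN 0 AND 4),   -- domain [0,4]
        STU_TRANSFER INTEGER CHECK (STU_TRANSFER IN (0, 1))  -- logical data
    )
    """
)


def try_insert(row):
    """Return True if the row is accepted, False if a constraint rejects it."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute("INSERT INTO STUDENT VALUES (?, ?, ?, ?, ?, ?)", row)
        return True
    except sqlite3.IntegrityError:
        return False


# Hypothetical rows: two students may share a name, so the name cannot be the key.
first_smith = try_insert((1001, "Smith", "John", 36, 2.9, 0))
second_smith = try_insert((1002, "Smith", "John", 72, 3.4, 1))
duplicate_key = try_insert((1001, "Jones", "Ann", 12, 3.0, 0))   # STU_NUM reused
out_of_domain = try_insert((1003, "Jones", "Ann", 12, 4.5, 0))   # GPA outside [0,4]

print(first_smith, second_smith, duplicate_key, out_of_domain)
```

Both John Smith rows are accepted because STU_NUM, not the name, identifies a row; the reused STU_NUM and the out-of-range STU_GPA are rejected.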
Online Content
All of the databases used to illustrate the material in this chapter are found on the online platform for this book. The database names used in the folder match the database names used in the figures. For example, the source of the tables shown in Figure 3.3 is the 'Ch03_TinyUniversity' database.
3.2 KEYS
A key consists of one or more attributes that determine other attributes. (For example, an invoice number identifies all of the invoice attributes, such as the invoice date and the customer name.) One type of key, the primary key, has already been introduced. Given the structure of the STUDENT table shown in Figure 3.3, defining and describing the primary key seems simple enough. However, because the primary key plays such an important role in the relational environment, we will examine the primary key's properties more carefully. There are several other kinds of keys that warrant attention. In this section, you will also become acquainted with superkeys, candidate keys and secondary keys.
The key's role is based on a concept known as determination. In the context of a database table, the statement 'A determines B' indicates that if you know the value of attribute A, you can look up (determine) the value of attribute B. For example, knowing the STU_NUM in the STUDENT table (see Figure 3.3) means that you are able to look up (determine) that student's last name, grade point average, phone number and so on. The shorthand notation for 'A determines B' is A → B. If A determines B, C and D, you write A → B, C, D. Therefore, using the attributes of the STUDENT table in Figure 3.3, you can represent the statement 'STU_NUM determines STU_LNAME' by writing:
STU_NUM → STU_LNAME
In fact, the STU_NUM value in the STUDENT table determines all of the student's attribute values. For example, you can write:
STU_NUM → STU_LNAME, STU_FNAME, STU_INIT
and
STU_NUM → STU_LNAME, STU_FNAME, STU_INIT, STU_DOB, STU_TRANSFER
In contrast, STU_NUM is not determined by STU_LNAME because it is quite possible for several students to have the last name Smith.
The principle of determination is very important because it is used in the definition of a central relational database concept known as functional dependence. The term functional dependence can be defined most easily this way: the attribute B is functionally dependent on A if A determines B. More precisely:
The attribute B is functionally dependent on the attribute A if each value in column A determines one and only one value in column B.
Using the contents of the STUDENT table in Figure 3.3, it is appropriate to say that STU_PHONE is functionally dependent on STU_NUM. For example, the STU_NUM value 321452 determines the STU_PHONE value 2134. On the other hand, STU_NUM is not functionally dependent on STU_PHONE
because the STU_PHONE value 2267 is associated with two STU_NUM values: 324274 and 324291. (Apparently, some students share a phone.) Similarly, the STU_NUM value 324273 determines the STU_LNAME value Smith. But the STU_NUM value is not functionally dependent on STU_LNAME because more than one student may have the last name Smith.
The functional dependence definition can be generalised to cover the case in which the determining attribute values occur more than once in a table. Functional dependence can then be defined this way:1
Attribute A determines attribute B (that is, B is functionally dependent on A) if all of the rows in the table that agree in value for attribute A also agree in value for attribute B.
1 ISO-ANSI Working Draft Database Language/SQL Foundation (SQL3), Part 2, 29 August, 1994. This source was provided through the courtesy of Dr David Hatherly.
Be careful when defining the dependency's direction. For example, Tiny University determines its student classification based on hours completed; these are shown in Table 3.2.
TABLE 3.2 Student classification
Hours completed    Classification
Fewer than 30      UG1
30-59              UG2
60-89              UG3
90 or more         PG
Therefore, you can write:
STU_HRS → STU_CLASS
However, the specific number of hours is not dependent on the classification. It is quite possible to find a third-year undergraduate (UG3) with 62 completed hours or one with 84 completed hours. In other words, the classification (STU_CLASS) does not determine one and only one value for completed hours (STU_HRS).
Keep in mind that it might take more than a single attribute to define functional dependence; that is, a key may be composed of more than one attribute. Such a multi-attribute key is known as a composite key. Any attribute that is part of a key is known as a key attribute. For instance, in the STUDENT table, the student's last name would not be sufficient to serve as a key. On the other hand, the combination of last name, first name, initial and home phone is very likely to produce unique matches for the remaining attributes. For example, you can write:
STU_LNAME, STU_FNAME, STU_INIT, STU_PHONE → STU_HRS, STU_CLASS
or
STU_LNAME, STU_FNAME, STU_INIT, STU_PHONE → STU_HRS, STU_CLASS, STU_GPA
or
STU_LNAME, STU_FNAME, STU_INIT, STU_PHONE → STU_HRS, STU_CLASS, STU_GPA, STU_DOB
Given the possible existence of a composite key, the notion of functional dependence can be further refined by specifying full functional dependence:
If the attribute (B) is functionally dependent on a composite key (A) but not on any subset of that composite key, the attribute (B) is fully functionally dependent on (A).
Within the broad key classification, several specialised keys can be defined. For example, a superkey is any key that uniquely identifies each row. In short, the superkey functionally determines all of the row's attributes. In the STUDENT table, the superkey could be any of the following:
STU_NUM
STU_NUM, STU_LNAME
STU_NUM, STU_LNAME, STU_INIT
In fact, STU_NUM, with or without additional attributes, can be a superkey even when the additional attributes are redundant.
A candidate key can be described as a superkey without redundancies, that is, a minimal superkey. Using this distinction, note that the composite key
STU_NUM, STU_LNAME
is a superkey, but it is not a candidate key because STU_NUM by itself is a candidate key! The combination
STU_LNAME, STU_FNAME, STU_INIT, STU_PHONE
might also be a candidate key, as long as you discount the possibility that two students share the same last name, first name, initial and phone number.
If the student's ID number had been included as one of the attributes in the STUDENT table in Figure 3.3, perhaps named STU_ID, both it and STU_NUM would have been candidate keys because either would uniquely identify each student. In that case, the selection of STU_NUM as the primary key would be driven by the designer's choice or by end-user requirements. In short, the primary key is the candidate key chosen to be the unique row identifier. Note, incidentally, that a primary key is a superkey as well as a candidate key.
Within a table, each primary key value must be unique to ensure that each row is uniquely identified by the primary key. In that case, the table is said to exhibit entity integrity. To maintain entity integrity, a null value (that is, no data entry at all) is not permitted in the primary key.
NOTE
A null does not mean a zero or a space. Pressing the keyboard's space bar creates a blank (or a space). A null is created when you press the keyboard's Enter key without making a prior entry of any kind. In other words, a null is no value at all.
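The generalised definition of functional dependence given earlier in this section (rows that agree on A must also agree on B) and the superkey and candidate key definitions can be checked mechanically against sample rows. The following is a minimal sketch, not a complete tool. The function names are hypothetical, and the sample rows are assumptions: only the STU_NUM and STU_PHONE pairs quoted in the text (321452 with 2134; 324274 and 324291 sharing 2267) and the Smith surname for 324273 come from the chapter; the other names, the phone number 2231 and the hours are invented for illustration.

```python
from itertools import combinations


def determines(rows, lhs, rhs):
    """True if all rows that agree on the lhs attributes also agree on rhs."""
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in lhs)
        value = tuple(row[a] for a in rhs)
        # setdefault stores the first value seen for this key and returns it.
        if seen.setdefault(key, value) != value:
            return False
    return True


def is_superkey(rows, attrs):
    """True if no two rows share the same values for attrs."""
    keys = [tuple(row[a] for a in attrs) for row in rows]
    return len(set(keys)) == len(keys)


def candidate_keys(rows, attributes):
    """Minimal superkeys of the sample, found by trying small combinations first."""
    found = []
    for size in range(1, len(attributes) + 1):
        for combo in combinations(attributes, size):
            contains_smaller_key = any(set(k) <= set(combo) for k in found)
            if not contains_smaller_key and is_superkey(rows, combo):
                found.append(combo)
    return found


# Sample rows (partly hypothetical, see the note above).
rows = [
    {"STU_NUM": 321452, "STU_LNAME": "Khumalo", "STU_PHONE": "2134", "STU_HRS": 42, "STU_CLASS": "UG2"},
    {"STU_NUM": 324273, "STU_LNAME": "Smith", "STU_PHONE": "2231", "STU_HRS": 62, "STU_CLASS": "UG3"},
    {"STU_NUM": 324274, "STU_LNAME": "Smith", "STU_PHONE": "2267", "STU_HRS": 84, "STU_CLASS": "UG3"},
    {"STU_NUM": 324291, "STU_LNAME": "Jones", "STU_PHONE": "2267", "STU_HRS": 25, "STU_CLASS": "UG1"},
]

print(determines(rows, ["STU_NUM"], ["STU_PHONE"]))   # STU_NUM -> STU_PHONE holds
print(determines(rows, ["STU_PHONE"], ["STU_NUM"]))   # 2267 maps to two students
print(determines(rows, ["STU_HRS"], ["STU_CLASS"]))   # hours determine the class
print(determines(rows, ["STU_CLASS"], ["STU_HRS"]))   # UG3 covers 62 and 84 hours
print(candidate_keys(rows, ("STU_NUM", "STU_LNAME", "STU_PHONE")))
```

A check like this can only refute a dependency or a key: uniqueness in a small sample does not prove that an attribute is a key for every possible table state, which is why key selection remains a design decision.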
Nulls can never be part of a primary key, and they should be avoided, to the greatest extent possible, in other attributes too. There are rare cases in which nulls cannot be reasonably avoided when you are working with non-key attributes. For example, one of an EMPLOYEE table's attributes is likely to be the EMP_INITIAL. However, some employees do not have a middle initial. Therefore, some of the EMP_INITIAL values may be null. You will also discover later in this section that there may be situations in which a null exists because of the nature of the relationship between two entities. In any case, even if nulls cannot always be avoided, they must be used sparingly. In fact, the existence of nulls in a table is often an indication of poor database design.
Nulls, if used improperly, can create problems because they have many different meanings. For example, a null can represent:
An unknown attribute value.
A known, but missing, attribute value.
A not applicable condition.
Depending on the sophistication of the application development software, nulls can create problems when functions such as COUNT, AVERAGE and SUM are used. In addition, nulls can create logical problems when relational tables are linked.
Controlled redundancy makes the relational database work. Tables within the database share common attributes that enable the tables to be linked together. For example, note that the PRODUCT and VENDOR tables in Figure 3.4 share a common attribute named VEND_CODE. And note that the PRODUCT table's VEND_CODE value 232 occurs more than once, as does the VEND_CODE value 235. Because the PRODUCT table is related to the VENDOR table through these VEND_CODE values, the multiple occurrence of the values is required to make the 1:* relationship between VENDOR and PRODUCT work. Each VEND_CODE value in the VENDOR table is unique; the VENDOR is the 1 side in the VENDOR-PRODUCT relationship. But any given VEND_CODE value from the VENDOR table may occur more than once in the PRODUCT table, thus providing evidence that PRODUCT is the * side of the VENDOR-PRODUCT relationship. In database terms, the multiple occurrences of the VEND_CODE values in the PRODUCT table are not redundant because they are required to make the relationship work. You should recall from Chapter 2, Data Models, that data redundancy exists only when there is unnecessary duplication of attribute values.
As you examine Figure 3.4, note that the VEND_CODE value in one table can be used to point to the corresponding value in the other table. For example, the VEND_CODE value 235 in the PRODUCT table points to vendor Henry Ortozo in the VENDOR table. Consequently, you discover that the product Houselite chain saw, 16 cm bar is delivered by Henry Ortozo and that he can be contacted by calling 0181-899-3425. The same connection can be made for the product Steel tape, 12 m length in the PRODUCT table.
Remember the naming convention: the prefix PROD was used in Figure 3.4 to indicate that the attributes belong to the PRODUCT table. Therefore, the prefix VEND in the PRODUCT table's VEND_CODE indicates that VEND_CODE points to some other table in the database. In this case, the VEND prefix is used to point to the VENDOR table in the database.
The tables in the database can also be represented by a relational schema. As defined in section 3.1.1, the primary key attribute(s) in a relational schema is (are) underlined. You will see such schemas in Chapter 7, Normalising Database Designs. For example, the relational schema for Figure 3.4 would be shown as:
VENDOR (VEND_CODE, VEND_CONTACT, VEND_AREACODE, VEND_PHONE)
PRODUCT (PROD_CODE, PROD_DESCRIPT, PROD_PRICE, PROD_ON_HAND, VEND_CODE*)
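The relational schema above maps fairly directly onto table definitions, and the shared VEND_CODE attribute is what lets a query follow the link from a product to its vendor. The sketch below uses Python's sqlite3 module and is illustrative only: the SQLite column types are assumptions, the helper name vendor_for is hypothetical, and only a few rows are loaded, transcribed from Figure 3.4 as best they can be read.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite does not enforce FKs by default
conn.executescript(
    """
    CREATE TABLE VENDOR (
        VEND_CODE     INTEGER PRIMARY KEY,
        VEND_CONTACT  TEXT,
        VEND_AREACODE TEXT,
        VEND_PHONE    TEXT
    );
    CREATE TABLE PRODUCT (
        PROD_CODE     TEXT NOT NULL PRIMARY KEY,
        PROD_DESCRIPT TEXT,
        PROD_PRICE    REAL,
        PROD_ON_HAND  INTEGER,
        VEND_CODE     INTEGER REFERENCES VENDOR (VEND_CODE)  -- foreign key
    );
    """
)
conn.executemany(
    "INSERT INTO VENDOR VALUES (?, ?, ?, ?)",
    [
        (232, "Khaya Sibiya", "7325", "224-2134"),
        (235, "Henry Ortozo", "0181", "899-3425"),
    ],
)
conn.executemany(
    "INSERT INTO PRODUCT VALUES (?, ?, ?, ?, ?)",
    [
        ("001278-AB", "Claw hammer", 10.23, 23, 232),
        ("123-21UUY", "Houselite chain saw, 16 cm bar", 150.09, 4, 235),
        ("ZZX/3245Q", "Steel tape, 12 m length", 5.36, 8, 235),
    ],
)


def vendor_for(prod_code):
    """Follow VEND_CODE from a PRODUCT row to the matching VENDOR row."""
    return conn.execute(
        """
        SELECT V.VEND_CONTACT, V.VEND_AREACODE || '-' || V.VEND_PHONE
        FROM PRODUCT AS P
        JOIN VENDOR AS V ON V.VEND_CODE = P.VEND_CODE
        WHERE P.PROD_CODE = ?
        """,
        (prod_code,),
    ).fetchone()


# The same VEND_CODE may appear many times in PRODUCT (the * side).
products_per_vendor = dict(
    conn.execute("SELECT VEND_CODE, COUNT(*) FROM PRODUCT GROUP BY VEND_CODE")
)

print(vendor_for("ZZX/3245Q"))
print(products_per_vendor)
```

The repeated value 235 in PRODUCT is the controlled redundancy described above: it is required for the join to work, so it is not counted as unnecessary duplication.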
FIGURE 3.4 An example of a simple relational database
Database name: Ch03_SaleCo
Table name: PRODUCT
Primary key: PROD_CODE
Foreign key: VEND_CODE
PROD_CODE   PROD_DESCRIPT                    PROD_PRICE  PROD_ON_HAND  VEND_CODE
001278-AB   Claw hammer                           10.23            23        232
123-21UUY   Houselite chain saw, 16 cm bar       150.09             4        235
QER-34256   Sledge hammer, 16 kg head             14.72             6        231
SRE-657UG   Rat-tail file                          2.36            15        232
ZZX/3245Q   Steel tape, 12 m length                5.36             8        235
(The VEND_CODE column is the link between the two tables.)
Table name: VENDOR
Primary key: VEND_CODE
Foreign key: none
VEND_CODE  VEND_CONTACT        VEND_AREACODE  VEND_PHONE
230        Shelly K. Smithson  7325           555-1234
231        James Johnson       0181           123-4536
232        Khaya Sibiya        7325           224-2134
233        Lindiwe Molefe      0113           342-6567
234        Nijan Pillay        0181           123-3324
235        Henry Ortozo        0181           899-3425
The link between the PRODUCT and VENDOR tables in Figure 3.4 can also be represented by the relational diagram shown in Figure 3.5. In this case, the link is indicated by the line that connects the VENDOR and PRODUCT tables.
FIGURE 3.5 The UML entity relationship diagram for the Ch03_SaleCo database
The relationship line in Figure 3.5 is created when two tables share an attribute with common values. More specifically, the primary key of one table (VENDOR) appears as the foreign key in a related table (PRODUCT). A foreign key (FK) is an attribute whose values match the primary key values in the related table. For example, in Figure 3.5, the VEND_CODE is the primary key in the VENDOR table, and it occurs as a foreign key in the PRODUCT table. Because the VENDOR table is not linked to a third table, the VENDOR table shown in Figure 3.4 does not contain a foreign key.
If the foreign key contains either matching values or nulls, the table(s) that make(s) use of that foreign key is (are) said to exhibit referential integrity. In other words, referential integrity means that, if the foreign key contains a value, that value refers to an existing valid tuple (row) in another relation. Note that referential integrity is maintained between the PRODUCT and VENDOR tables shown in Figure 3.4.
Finally, a secondary key is defined as a key that is used strictly for data retrieval purposes. Suppose customer data are stored in a CUSTOMER table in which the customer number is the primary key. Do you suppose that most customers will remember their number? Data retrieval for a customer can be facilitated when the customer's last name and phone number are used. In that case, the primary key is the customer number; the secondary key is the combination of the customer's last name and phone number. Keep in mind that a secondary key does not necessarily yield a unique outcome. For example, a customer's last name and home telephone number could yield several matches if several Smith family members were living at a residence with only one phone line. The combination of the last name and postal code is a secondary key that could yield dozens of matches, which could then be searched for a specific match.
A secondary key's effectiveness in narrowing down a search depends on how restrictive that secondary key is. For instance, although the secondary key CUS_CITY is legitimate from a database point of view, the attribute values New York or Paris are not likely to produce a usable return unless you want to examine millions of possible matches. (Of course, CUS_CITY is a better secondary key than CUS_COUNTRY.)
Table 3.3 summarises the different relational database keys.
TABLE 3.3 Relational database keys
Key type        Definition
Superkey        An attribute (or combination of attributes) that uniquely identifies each row in a table.
Candidate key   A minimal superkey. A superkey that does not contain a subset of attributes that is itself a superkey.
Primary key     A candidate key selected to uniquely identify all other attribute values in any given row. Cannot contain null entries.
Secondary key   An attribute (or combination of attributes) used strictly for data retrieval purposes.
Foreign key     An attribute (or combination of attributes) in one table whose values must either match the primary key in another table or be null.
3.3 INTEGRITY RULES
Relational database integrity rules are very important to good database design. Many (but by no means all) RDBMSs enforce integrity rules automatically. However, it is much safer to make sure that your application design conforms to the entity and referential integrity rules mentioned in this chapter. Those rules are summarised in Table 3.4. The integrity rules summarised in Table 3.4 are illustrated in Figure 3.6.
TABLE 3.4 Integrity rules
Entity integrity
Requirement: All primary key entries are unique, and no part of a primary key may be null.
Purpose: Each row will have a unique identity, and foreign key values can properly reference primary key values.
Example: No invoice can have a duplicate number, nor can it be null. In short, all invoices are uniquely identified by their invoice number.
Referential integrity
Requirement: A foreign key may have either a null entry (as long as it is not a part of its table's primary key) or an entry that matches the primary key value in a table to which it is related. (Every non-null foreign key value must reference an existing primary key value.)
Purpose: It is possible for an attribute NOT to have a corresponding value, but it will be impossible to have an invalid entry. The enforcement of the referential integrity rule makes it impossible to delete a row in one table whose primary key has mandatory, matching, foreign key values in another table.
Example: A customer might not yet have an assigned sales representative (number), but it will be impossible to have an invalid sales representative (number).
Note the features of Figure 3.6 at the top of the next page.
1 Entity integrity. The CUSTOMER table's primary key is CUS_CODE. The CUSTOMER primary key column has no null entries, and all entries are unique. Similarly, the AGENT table's primary key is AGENT_CODE, and this primary key column also is free of null entries.
2 Referential integrity. The CUSTOMER table contains a foreign key AGENT_CODE, which links entries in the CUSTOMER table to the AGENT table. The CUS_CODE row that is identified by the (primary key) number 10013 contains a null entry in its AGENT_CODE foreign key, because Mr Jaco Pieterse does not yet have a sales representative assigned to him. The remaining AGENT_CODE entries in the CUSTOMER table all match the AGENT_CODE entries in the AGENT table.
To avoid nulls, some designers use special codes, known as flags, to indicate the absence of some value. Using Figure 3.6 as an example, the code -99 could be used as the AGENT_CODE entry of the fourth row of the CUSTOMER table to indicate that customer Jaco Pieterse does not yet have an agent assigned to him. If such a flag is used, the AGENT table must contain a dummy row with an AGENT_CODE value of -99. Thus, the AGENT table's first record might contain the values shown in Table 3.5.
TABLE 3.5 A dummy variable value used as a flag
AGENT_CODE  AGENT_AREACODE  AGENT_PHONE  AGENT_LNAME  AGENT_YTD_SALES
-99         0000            000-0000     None         0.00
Chapter 5, Data Modelling with Entity Relationship Diagrams, discusses several ways in which nulls may be handled.
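Part of the reason designers reach for flags is that nulls change the results of aggregate functions and comparisons. The short sqlite3 sketch below illustrates this behaviour in SQLite; other DBMSs follow the same general SQL rules but may differ in detail. The EMPLOYEE rows and the EMP_BONUS column are hypothetical and exist only for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE EMPLOYEE ("
    " EMP_NUM INTEGER PRIMARY KEY,"
    " EMP_INITIAL TEXT,"   # may be null: not every employee has a middle initial
    " EMP_BONUS REAL)"     # hypothetical numeric column, may be null
)
conn.executemany(
    "INSERT INTO EMPLOYEE VALUES (?, ?, ?)",
    [(1, "A", 100.0), (2, None, None), (3, "C", 200.0)],
)

# COUNT(*) counts rows; COUNT(column), AVG and SUM skip null values.
total_rows, initials, avg_bonus, sum_bonus = conn.execute(
    "SELECT COUNT(*), COUNT(EMP_INITIAL), AVG(EMP_BONUS), SUM(EMP_BONUS)"
    " FROM EMPLOYEE"
).fetchone()

# A null never equals anything, not even another null, so '= NULL' finds no rows.
equals_null = conn.execute(
    "SELECT COUNT(*) FROM EMPLOYEE WHERE EMP_INITIAL = NULL"
).fetchone()[0]
is_null = conn.execute(
    "SELECT COUNT(*) FROM EMPLOYEE WHERE EMP_INITIAL IS NULL"
).fetchone()[0]

print(total_rows, initials, avg_bonus, sum_bonus, equals_null, is_null)
```

The average here is computed over two rows rather than three, which may or may not be what a report intends; this is the kind of ambiguity the text warns about.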
FIGURE 3.6 An illustration of integrity rules
Database name: Ch03_InsureCo
Table name: CUSTOMER
Primary key: CUS_CODE
Foreign key: AGENT_CODE
CUS_CODE  CUS_LNAME   CUS_FNAME  CUS_INITIAL  CUS_AREACODE  CUS_PHONE  CUS_RENEW_DATE  AGENT_CODE
10010     Ramas       Alfred     A            0181          844-2573   12-Mar-19       502
10011     Dunne       Leona      K            0161          894-1238   23-May-18       501
10012     Du Toit     Marlene    W            0181          894-2285   05-Jan-19       502
10013     Pieterse    Jaco       F            0181          894-2180   20-Sep-19
10014     Orlando     Myron                   0181          222-1672   04-Dec-18       501
10015     O'Brian     Amy        B            0161          442-3381   29-Aug-19       503
10016     Brown       James      G            0181          297-1228   01-Mar-19       502
10017     Williams    George                  0181          290-2556   23-Jun-19       503
10018     Padayachee  Vinaya     G            1061          382-7185   09-Nov-19       501
10019     Moloi       Mlilo      K            0181          297-3809   18-Feb-19       503
Table name: AGENT
Primary key: AGENT_CODE
Foreign key: none
AGENT_CODE  AGENT_LNAME  AGENT_AREACODE  AGENT_PHONE  AGENT_YTD_SLS
501         Bhengani     0161            228-1249     1 371 008.46
502         Mbaso        0181            882-1244     3 923 932.59
503         Okon         0181            123-5589     2 444 244.52
Other integrity rules that can be enforced in the relational model are the NOT NULL and UNIQUE constraints. The NOT NULL constraint can be placed on a column to ensure that every row in the table has a value for that column. The UNIQUE constraint is a restriction placed on a column to ensure that no duplicate values exist for that column.
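The entity integrity, referential integrity, NOT NULL and UNIQUE rules can all be observed in a small sqlite3 sketch based on the CUSTOMER and AGENT tables of Figure 3.6. This is an illustration under assumptions rather than the book's implementation: only a few columns are kept, the column types are SQLite's, the helper name attempt is hypothetical, and placing UNIQUE on AGENT_PHONE is an assumption made purely to demonstrate the constraint.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves FK enforcement off by default
conn.executescript(
    """
    CREATE TABLE AGENT (
        AGENT_CODE  TEXT NOT NULL PRIMARY KEY,
        AGENT_LNAME TEXT NOT NULL,
        AGENT_PHONE TEXT UNIQUE               -- assumed for illustration
    );
    CREATE TABLE CUSTOMER (
        CUS_CODE   TEXT NOT NULL PRIMARY KEY, -- NOT NULL stated explicitly for SQLite
        CUS_LNAME  TEXT NOT NULL,
        AGENT_CODE TEXT REFERENCES AGENT (AGENT_CODE)  -- nullable foreign key
    );
    """
)


def attempt(sql, params=()):
    """Run one statement; report whether an integrity rule rejected it."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(sql, params)
        return "ok"
    except sqlite3.IntegrityError:
        return "rejected"


results = {
    "agent 501": attempt("INSERT INTO AGENT VALUES ('501', 'Bhengani', '228-1249')"),
    "agent 502": attempt("INSERT INTO AGENT VALUES ('502', 'Mbaso', '882-1244')"),
    "customer with agent": attempt("INSERT INTO CUSTOMER VALUES ('10010', 'Ramas', '502')"),
    # Referential integrity allows a null foreign key (no agent assigned yet).
    "customer without agent": attempt("INSERT INTO CUSTOMER VALUES ('10013', 'Pieterse', NULL)"),
    # Entity integrity: primary key values must be unique and not null.
    "duplicate primary key": attempt("INSERT INTO CUSTOMER VALUES ('10010', 'Dunne', '501')"),
    "null primary key": attempt("INSERT INTO CUSTOMER VALUES (NULL, 'Dunne', '501')"),
    # Referential integrity: a non-null foreign key must match an existing agent.
    "unknown agent": attempt("INSERT INTO CUSTOMER VALUES ('10020', 'Dunne', '999')"),
    "delete referenced agent": attempt("DELETE FROM AGENT WHERE AGENT_CODE = '502'"),
    # NOT NULL and UNIQUE constraints.
    "missing last name": attempt("INSERT INTO CUSTOMER VALUES ('10021', NULL, '501')"),
    "duplicate phone": attempt("INSERT INTO AGENT VALUES ('503', 'Okon', '228-1249')"),
}

for name, outcome in results.items():
    print(f"{name}: {outcome}")
```

The first four statements succeed and the remaining six are rejected, which matches the requirements listed in Table 3.4.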
3.4 THE DATA DICTIONARY AND THE SYSTEM CATALOGUE
The data dictionary provides a detailed accounting of all tables found within the user/designer-created database. Thus, the data dictionary contains at least all of the attribute names and characteristics for each table in the system. In short, the data dictionary contains metadata, that is, data about data. Using the small database presented in Figure 3.6, you might picture its data dictionary as shown in Table 3.6.
TABLE 3.6 A sample data dictionary
Table Name  Attribute Name   Contents                         Type          Format        Domain             Required  PK or FK  FK Referenced Table
CUSTOMER    CUS_CODE         Customer account code            CHAR(5)       99999         10000-99999        Y         PK
            CUS_LNAME        Customer last name               VARCHAR2(20)  Xxxxxxxx                         Y
            CUS_FNAME        Customer first name              VARCHAR2(20)  Xxxxxxxx                         Y
            CUS_INITIAL      Customer initial                 CHAR(1)       X
            CUS_RENEW_DATE   Customer insurance renewal date  DATE          dd-mmm-yyyy
            AGENT_CODE       Agent code                       CHAR(3)       999           100-999                      FK        AGENT
AGENT       AGENT_CODE       Agent code                       CHAR(3)       999                              Y         PK
            AGENT_AREACODE   Agent area code                  CHAR(4)       999                              Y
            AGENT_PHONE      Agent telephone number           CHAR(14)      999-9999                         Y
            AGENT_LNAME      Agent last name                  VARCHAR2(20)  Xxxxxxxx                         Y
            AGENT_YTD_SLS    Agent year-to-date sales         NUMBER(9,2)   9 999 999.99  0.00-9 999 999.99  Y
FK = Foreign key
PK = Primary key
CHAR = Fixed character length data (1-255 characters)
VARCHAR2 = Variable character length data (1-4 000 characters)
NUMBER = Numeric data (NUMBER(9,2) is used to specify numbers with two decimal places and up to nine digits, including the decimal places. Some RDBMSs permit the use of a MONEY or CURRENCY data type.)
NOTE
Telephone area codes are always composed of digits 0-9. Because area codes are not used arithmetically, they are most efficiently stored as character data. Also, the area codes are always composed of a maximum of four digits. Therefore, the area code data type is defined as CHAR(4). On the other hand, names do not conform to a standard length. Therefore, the customer first names are defined as VARCHAR2(20), thus indicating that up to 20 characters may be used to store the names. Character data are shown as left-justified.
NOTE
The data dictionary in Table 3.6 is an example of the human view of the entities, attributes and relationships. The purpose of this data dictionary is to ensure that all members of database design and implementation teams use the same table and attribute names and characteristics. The DBMS's internally stored data dictionary contains additional information about relationship types, entity and referential integrity checks and enforcement, and index types and components. This additional information is generated during the database implementation stage.
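The internally stored dictionary is itself queryable in most relational systems. As a hedged illustration, SQLite exposes its catalogue through the sqlite_master table and the PRAGMA table_info command; other DBMSs use different catalogue tables and views, so the names below apply to SQLite only. The table definitions are a cut-down assumption based on Table 3.6, and the helper name describe is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE AGENT (
        AGENT_CODE  CHAR(3) NOT NULL PRIMARY KEY,
        AGENT_LNAME VARCHAR2(20) NOT NULL
    );
    CREATE TABLE CUSTOMER (
        CUS_CODE   CHAR(5) NOT NULL PRIMARY KEY,
        CUS_LNAME  VARCHAR2(20) NOT NULL,
        AGENT_CODE CHAR(3) REFERENCES AGENT (AGENT_CODE)
    );
    """
)

# sqlite_master is SQLite's system catalogue: one row per table, index, view or trigger.
tables = [
    name
    for (name,) in conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    )
]


def describe(table):
    """Return (column, declared type, required?, part of primary key?) per column."""
    # Table names cannot be bound as parameters; only pass trusted names here.
    info = conn.execute(f"PRAGMA table_info({table})")
    return [
        (name, col_type, bool(notnull), bool(pk))
        for _cid, name, col_type, notnull, _default, pk in info
    ]


print(tables)
for table in tables:
    for column in describe(table):
        print(table, column)
```

The output is a machine-generated counterpart of the attribute name, type, required and PK columns that Table 3.6 records by hand.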
CHAPTER
The data the
dictionary
design
is sometimes
decisions
about
described
tables
Like the data dictionary, the system described
as a detailed
data about table the
data type
access
system
database
catalogue
information store
effect,
the
such
of the
same
in
in
is
very
to
describe
the
must be avoided. 33 at the
end
3.5
same
of the
content.
users
and
information,
fact,
current
designers
the
the
3
relational
data
database
Therefore,
database
allows the
homonyms
dictionary
whose tables
system
spelled
words
with
catalogue
As new
check for
For
table
the
confusion,
meanings,
word
example,
and also
use
tables
eliminate
such
homonym
you
might
C_NAME
you should
and
words with different
different
context,
attributes.
To lessen
documentation.
RDBMS to
are similar-sounding
a database
For
example,
of a homonym car
and
why using synonyms
avoid
as fair
indicates use
the
C_NAME
to
to label
a consultant
database
homonyms;
is
auto
and indicates
refer
a bad idea
to the
the use of different
same
when you
The data dictionary is sometimes described as the database designer's database because it records the design decisions about tables and their structures.

Like the data dictionary, the system catalogue contains metadata. The system catalogue can be described as a detailed system data dictionary that describes all objects within the database, including data about table names, the table's creator and creation date, the number of columns in each table, the data type corresponding to each column, index filenames, index creators, authorised users and access privileges. Since the system catalogue contains all required data dictionary information, the terms system catalogue and data dictionary are often used interchangeably. In fact, current relational database software generally provides only a system catalogue, from which the designer's data dictionary information may be derived. The system catalogue is actually a system-created database whose tables store the user/designer-created database characteristics and contents. Therefore, the system catalogue tables can be queried just like any user/designer-created table.

In effect, the system catalogue automatically produces database documentation. As new tables are added to the database, that documentation also allows the RDBMS to check for and eliminate homonyms and synonyms. In general terms, homonyms are similar-sounding words with different meanings, such as sun and son, or identically spelled words with different meanings, such as fair (meaning just) and fair (meaning festival). In a database context, the word homonym indicates the use of the same attribute name to label different attributes. For example, you might use the same name to label a customer name attribute in a CUSTOMER table and also to label a consultant name attribute in a CONSULTANT table. To lessen confusion, you should avoid database homonyms; the data dictionary is very useful in this regard.

In a database context, a synonym is the opposite of a homonym and indicates the use of different names to describe the same attribute, just as two different words can refer to the same object. Synonyms must be avoided. You will discover why when you work through the problems for this chapter.

3.5 RELATIONSHIPS WITHIN THE RELATIONAL DATABASE

You already know that relationships are classified as one-to-one (1:1), one-to-many (1:*), and many-to-many (*:*). This section explores those relationships further, to help you apply them properly when you start developing database designs, focusing on the following points:

The 1:* relationship is the relational modelling ideal. Therefore, this relationship type should be the norm in any relational database design.

The 1:1 relationship should be rare in any relational database design.

*:* relationships cannot be implemented as such in the relational model. Later in this section, you will see how any *:* relationship can be changed into two 1:* relationships.
NOTE
The UML class diagram represents relationships as associations among objects and can use the multiplicity element to represent *:* relationships directly. However, you will also learn how an association class is used to represent a *:* association between two classes in Chapter 5, Data Modelling with Entity Relationship Diagrams.
3.5.1 The 1:* Relationship

The 1:* relationship is the relational database norm. To see how such a relationship is modelled and implemented, consider the PAINTER paints PAINTING example that was used in Chapter 2. Compare the data models in Figure 3.7 with its implementation in Figure 3.8.

FIGURE 3.7 The 1:* relationship between PAINTER and PAINTING

As you examine the PAINTER and PAINTING table contents in Figure 3.8, note the following features:

Each painting is painted by one and only one painter, but each painter could have painted many paintings. Note that painter 123 (Onele P. Najeke) has three paintings stored in the PAINTING table.

There is only one row in the PAINTER table for any given row in the PAINTING table, but there may be many rows in the PAINTING table for any given row in the PAINTER table.
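The two features above can be sketched with Python's built-in sqlite3 module. This is only an illustrative sketch, not the book's own implementation: the table and column names and the rows follow Figure 3.8, the PAINTER table is trimmed to two columns for brevity, and painter number 999 is a hypothetical value used to show a rejected row.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite checks foreign keys only when enabled

conn.executescript("""
CREATE TABLE PAINTER (
    PAINTER_NUM   INTEGER PRIMARY KEY,
    PAINTER_LNAME TEXT NOT NULL
);
CREATE TABLE PAINTING (
    PAINTING_NUM   INTEGER PRIMARY KEY,
    PAINTING_TITLE TEXT NOT NULL,
    -- primary key of the "1" side, placed on the "many" side as a foreign key
    PAINTER_NUM    INTEGER NOT NULL REFERENCES PAINTER (PAINTER_NUM)
);
""")

conn.executemany("INSERT INTO PAINTER VALUES (?, ?)",
                 [(123, "Najeke"), (126, "Itero")])
conn.executemany("INSERT INTO PAINTING VALUES (?, ?, ?)", [
    (1338, "Dawn Thunder", 123),
    (1339, "Vanilla Roses To Nowhere", 123),
    (1340, "Tired Flounders", 126),
    (1341, "Hasty Exit", 123),
    (1342, "Plastic Paradise", 126),
])

# Many PAINTING rows may point at one PAINTER row.
paintings_per_painter = dict(conn.execute(
    "SELECT PAINTER_NUM, COUNT(*) FROM PAINTING GROUP BY PAINTER_NUM"))

# A painting cannot reference a painter that does not exist (999 is hypothetical).
try:
    conn.execute("INSERT INTO PAINTING VALUES (1343, 'Untitled', 999)")
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True
```

Because PAINTER_NUM in PAINTING is a single, non-null column, each painting row can name exactly one painter, while nothing limits how many painting rows share the same painter number.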
FIGURE 3.8 The implemented 1:* relationship between PAINTER and PAINTING

Database name: Ch03_Museum

Table name: PAINTER
Primary key: PAINTER_NUM
Foreign key: none

PAINTER_NUM  PAINTER_LNAME  PAINTER_FNAME  PAINTER_INITIAL
123          Najeke         Onele          P
126          Itero          Julio          G

Table name: PAINTING
Primary key: PAINTING_NUM
Foreign key: PAINTER_NUM

PAINTING_NUM  PAINTING_TITLE            PAINTER_NUM
1338          Dawn Thunder              123
1339          Vanilla Roses To Nowhere  123
1340          Tired Flounders           126
1341          Hasty Exit                123
1342          Plastic Paradise          126

As we are using the UML notation, it is worth pointing out some of the different terminology that you may see when representing relationships amongst entities. In UML, relationships are also known as associations among entities. Associations have several characteristics:
Association name. Each association has a name. Normally, the name of an association is written over the association line. The association name is used to represent the relationship between the two entities.

Association direction. Associations also have a direction, represented by an arrow pointing towards the direction in which the relationship flows. In this example the arrow is shown pointing towards the PAINTING entity.

Role name. The participating entities in the relationship can have role names instead of an association name. A role name expresses the role played by a given entity (class) in the relationship. In the example seen in Figure 3.7, the role names would be paints and is_painted_by. The role names can alternatively be written as the association name, for example: a PAINTER paints a PAINTING, and each PAINTING is_painted_by a PAINTER. Figure 3.7 does not show role names, as the association name paints is displayed instead. As we are concentrating in this book on modelling relational concepts, we shall not use role names in modelling any relationships between entities.

Multiplicity. Multiplicity refers to the number of instances of one entity (class) that are associated with one instance of a related entity (class). Multiplicity in the UML model provides the same information as the connectivity, cardinality and relationship participation constructs in the ER model. For example: one (and only one) PAINTER generates one to many PAINTINGs, and one PAINTING belongs to one and only one PAINTER.

NOTE
The one-to-many (1:*) relationship is easily implemented in the relational model by putting the primary key of the 1 side in the table of the many side as a foreign key.

The 1:* relationship is found in any database environment. Students in a typical college or university will discover that each COURSE can generate many CLASSes but that each CLASS refers to only one COURSE. For example, an Accounting II course might yield two classes: one offered on Mondays, Wednesdays and Fridays (MWF) from 10:00 a.m. to 10:50 a.m. and one offered on Thursdays (Th) from 6:00 p.m. to 8:40 p.m. Therefore, the 1:* relationship between COURSE and CLASS might be described this way:

Each COURSE can have many CLASSes, but each CLASS references only one COURSE.

There will be only one row in the COURSE table for any given row in the CLASS table, but there can be many rows in the CLASS table for any given row in the COURSE table.

Figure 3.9 maps the ERM (Entity Relationship Model) for the 1:* relationship between COURSE and CLASS.
FIGURE 3.9 The 1:* relationship between COURSE and CLASS

The 1:* relationship between COURSE and CLASS is further illustrated in Figure 3.10.

FIGURE 3.10 The implemented 1:* relationship between COURSE and CLASS

Database name: Ch03_TinyUniversity

Table name: COURSE
Primary key: CRS_CODE
Foreign key: none

CRS_CODE  DEPT_CODE  CRS_DESCRIPTION                     CRS_CREDIT
ACCT-211  ACCT       Accounting I                        3
ACCT-212  ACCT       Accounting II                       3
CIS-220   CIS        Introduction to Computer Science    3
CIS-420   CIS        Database Design and Implementation  4
QM-261    CIS        Introduction to Statistics          3
QM-362    CIS        Statistical Applications            4

Table name: CLASS
Primary key: CLASS_CODE
Foreign key: CRS_CODE

CLASS_CODE  CRS_CODE  CLASS_SECTION  CLASS_TIME            CLASS_ROOM  LECT_NUM
10012       ACCT-211  1              MWF 8:00-8:50 a.m.    BUS311      105
10013       ACCT-211  2              MWF 9:00-9:50 a.m.    BUS200      105
10014       ACCT-211  3              TTh 2:30-3:45 p.m.    BUS252      342
10015       ACCT-212  1              MWF 10:00-10:50 a.m.  BUS311      301
10016       ACCT-212  2              Th 6:00-8:40 p.m.     BUS252      301
10017       CIS-220   1              MWF 9:00-9:50 a.m.    KLR209      228
10018       CIS-220   2              MWF 9:00-9:50 a.m.    KLR211      114
10019       CIS-220   3              MWF 10:00-10:50 a.m.  KLR209      228
10020       CIS-420   1              W 6:00-8:40 p.m.      KLR209      162
10021       QM-261    1              MWF 8:00-8:50 a.m.    KLR200      114
10022       QM-261    2              TTh 1:00-2:15 p.m.    KLR200      114
10023       QM-362    1              MWF 11:00-11:50 a.m.  KLR200      162
10024       QM-362    2              TTh 2:30-3:45 p.m.    KLR200      162
Using Figure 3.10, take a minute to review some important terminology. Note that CLASS_CODE in the CLASS table uniquely identifies each row. Therefore, CLASS_CODE has been chosen to be the primary key. However, the combination CRS_CODE and CLASS_SECTION will also uniquely identify each row in the CLASS table. In other words, the composite key composed of CRS_CODE and CLASS_SECTION is a candidate key. Any candidate key must have the not null and unique constraints enforced. (You will see how this is done when you learn SQL in Chapter 8.)

Note in Figure 3.8, for example, that the PAINTER table's primary key, PAINTER_NUM, is included in the PAINTING table as a foreign key. Similarly, in Figure 3.10, the COURSE table's primary key, CRS_CODE, is included in the CLASS table as a foreign key.
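The not null and unique requirement on a candidate key can be sketched with Python's built-in sqlite3 module. This is a minimal illustration under stated assumptions, not the SQL presented later in the book: only three CLASS columns are kept, the two valid rows come from Figure 3.10, and class code 10099 is a hypothetical value used to provoke a violation.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
CREATE TABLE CLASS (
    CLASS_CODE    INTEGER PRIMARY KEY,        -- chosen primary key
    CRS_CODE      TEXT    NOT NULL,
    CLASS_SECTION INTEGER NOT NULL,
    UNIQUE (CRS_CODE, CLASS_SECTION)          -- composite candidate key
)
""")

conn.execute("INSERT INTO CLASS VALUES (10012, 'ACCT-211', 1)")
conn.execute("INSERT INTO CLASS VALUES (10013, 'ACCT-211', 2)")

# A new class code (hypothetical 10099) that repeats an existing
# CRS_CODE + CLASS_SECTION combination violates the candidate key.
try:
    conn.execute("INSERT INTO CLASS VALUES (10099, 'ACCT-211', 1)")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

class_count = conn.execute("SELECT COUNT(*) FROM CLASS").fetchone()[0]
```

The primary key already guarantees uniqueness of CLASS_CODE; the extra UNIQUE and NOT NULL declarations give the composite key the same guarantees, which is what qualifies it as a candidate key.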
3.5.2 The 1:1 Relationship

As the 1:1 label implies, in this relationship, one entity can be related to only one other entity, and vice versa. For example, one department chair, a lecturer, can chair only one department, and one department can have only one department chair. The entities LECTURER and DEPARTMENT thus exhibit a 1:1 relationship. (You might argue that not all lecturers chair a department and lecturers cannot be required to chair a department. That is, the relationship between the two entities is optional. However, at this stage of the discussion, you should focus your attention on the basic 1:1 relationship. Optional relationships will be addressed in Chapter 5.) The basic 1:1 relationship is modelled in Figure 3.11, and its implementation is shown in Figure 3.12.

FIGURE 3.11 The 1:1 relationship between LECTURER and DEPARTMENT

As you examine the tables in Figure 3.12, note that there are several important features:

Each lecturer is a Tiny University employee. Therefore, the lecturer identification is through the EMP_NUM. (However, note that not all employees are lecturers; there's another optional relationship.)

The 1:1 LECTURER chairs DEPARTMENT relationship is implemented by having the EMP_NUM as a foreign key in the DEPARTMENT table. Note that the 1:1 relationship is treated as a special case of the 1:* relationship in which the many side is restricted to a single occurrence. In this case, DEPARTMENT contains the EMP_NUM as a foreign key to indicate that it is the department that has a chair.

Also note that the LECTURER table contains the DEPT_CODE foreign key to implement the 1:* DEPARTMENT employs LECTURER relationship. This is a good example of how two entities can participate in two (or even more) relationships simultaneously.

Online Content
If you open the 'Ch03_TinyUniversity' database available on the online platform accompanying this book you'll see that the STUDENT and CLASS entities still use LECT_NUM as their foreign key. LECT_NUM and EMP_NUM are labels for the same attribute, which is an example of the use of synonyms or different names for the same attribute.
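The idea that a 1:1 relationship is a 1:* relationship whose many side is restricted to a single occurrence can be sketched with Python's built-in sqlite3 module. This is an illustrative sketch under assumptions, not the book's implementation: the UNIQUE constraint is one common way to express the restriction, the tables are trimmed to the key columns, and the LECTURER.DEPT_CODE foreign key is left undeclared to keep the example free of circular references. The lecturer and department values follow Figure 3.12.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite checks foreign keys only when enabled

conn.executescript("""
CREATE TABLE LECTURER (
    EMP_NUM   INTEGER PRIMARY KEY,
    DEPT_CODE TEXT NOT NULL          -- 1:* DEPARTMENT employs LECTURER
);
CREATE TABLE DEPARTMENT (
    DEPT_CODE TEXT PRIMARY KEY,
    -- 1:1 LECTURER chairs DEPARTMENT: a foreign key that may not repeat
    EMP_NUM   INTEGER UNIQUE REFERENCES LECTURER (EMP_NUM)
);
""")

conn.executemany("INSERT INTO LECTURER VALUES (?, ?)",
                 [(114, "ACCT"), (209, "CIS"), (228, "CIS")])
conn.executemany("INSERT INTO DEPARTMENT VALUES (?, ?)",
                 [("ACCT", 114), ("CIS", 209)])

# Lecturer 209 already chairs CIS, so making 209 the ACCT chair as well
# breaks the single-occurrence restriction.
try:
    conn.execute("UPDATE DEPARTMENT SET EMP_NUM = 209 WHERE DEPT_CODE = 'ACCT'")
    second_chair_rejected = False
except sqlite3.IntegrityError:
    second_chair_rejected = True

acct_chair = conn.execute(
    "SELECT EMP_NUM FROM DEPARTMENT WHERE DEPT_CODE = 'ACCT'").fetchone()[0]
```

A department can hold only one chair because EMP_NUM is a single column in its row, and a lecturer can chair only one department because that column's values may not repeat.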
FIGURE 3.12 The implemented 1:1 relationship between LECTURER and DEPARTMENT

Database name: Ch03_TinyUniversity

Table name: LECTURER
Primary key: EMP_NUM
Foreign key: DEPT_CODE

EMP_NUM  DEPT_CODE  LECT_OFFICE  LECT_EXTENSION  LECT_HIGH_DEGREE
103      HIST       DRE 156      6783            PhD
104      ENG        DRE 102      5561            MA
105      ACCT       KLR 229D     8665            PhD
106      MKT/MGT    KLR 126      3899            PhD
110      BIOL       AAK 160      3412            PhD
114      ACCT       KLR 211      4436            PhD
155      MATH       AAK 201      4440            PhD
160      ENG        DRE 102      2248            PhD
162      CIS        KLR 203E     2359            PhD
191      MKT/MGT    KLR 409B     4016            DBA
195      PSYCH      AAK 297      3550            PhD
209      CIS        KLR 333      3421            PhD
228      CIS        KLR 300      3000            PhD
297      MATH       AAK 194      1145            PhD
299      ECON/FIN   KLR 284      2851            PhD
301      ACCT       KLR 244      4683            PhD
335      ENG        DRE 208      2000            PhD
342      SOC        BBG 208      5514            PhD
387      BIOL       AAK 230      8665            PhD
401      HIST       DRE 156      6783            MA
425      ECON/FIN   KLR 284      2851            MBA
435      ART        BBG 185      2278            PhD

The 1:* DEPARTMENT employs LECTURER relationship is implemented through the placement of the DEPT_CODE foreign key in the LECTURER table.

The 1:1 LECTURER chairs DEPARTMENT relationship is implemented through the placement of the EMP_NUM foreign key in the DEPARTMENT table.

Table name: DEPARTMENT
Primary key: DEPT_CODE
Foreign key: EMP_NUM

DEPT_CODE  DEPT_NAME               SCHOOL_CODE  EMP_NUM  DEPT_ADDRESS      DEPT_EXTENSION
ACCT       Accounting              BUS          114      KLR 211, Box 52   3119
ART        Fine Arts               A&SCI        435      BBG 185, Box 128  2278
BIOL       Biology                 A&SCI        387      AAK 230, Box 415  4117
CIS        Computer Info. Systems  BUS          209      KLR 333, Box 56   3245
ECON/FIN   Economics/Finance       BUS          299      KLR 284, Box 63   3126
ENG        English                 A&SCI        160      DRE 102, Box 223  1004
HIST       History                 A&SCI        103      DRE 156, Box 284  1867
MATH       Mathematics             A&SCI        297      AAK 194, Box 422  4234
MKT/MGT    Marketing/Management    BUS          106      KLR 126, Box 55   3342
PSYCH      Psychology              A&SCI        195      AAK 297, Box 438  4110
SOC        Sociology               A&SCI        342      BBG 208, Box 132  2008
The preceding LECTURER chairs DEPARTMENT example illustrates a proper 1:1 relationship. In fact, the use of a 1:1 relationship ensures that two entity sets are not placed in the same table when they should not be. However, the existence of a 1:1 relationship sometimes means that the entity components were not defined properly. It could indicate that the two entities actually belong in the same table!

As rare as 1:1 relationships should be, certain conditions absolutely require their use. For example, suppose you manage the database for a company that employs pilots, accountants, mechanics, clerks, salespeople, service personnel and more. Pilots have many attributes that the other employees don't have, such as licences, medical certificates, flight experience records, dates of flight proficiency checks and proof of required periodic medical checks. If you put all of the pilot-specific attributes in the EMPLOYEE table, you will have several nulls in that table for all employees who are not pilots. To avoid the proliferation of nulls, it is better to split the pilot attributes into a separate table (PILOT) that is linked to the EMPLOYEE table in a 1:1 relationship. Since pilots have many attributes that are shared by all employees, such as name, date of birth and date of first employment, those attributes would be stored in the EMPLOYEE table.
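The EMPLOYEE and PILOT split can be sketched with Python's built-in sqlite3 module. This is a hedged illustration only: the table names come from the text, but every column name, employee number, surname and licence value below is hypothetical, and the real Ch03_AviaCo database may be organised differently.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite checks foreign keys only when enabled

conn.executescript("""
CREATE TABLE EMPLOYEE (
    EMP_NUM   INTEGER PRIMARY KEY,
    EMP_LNAME TEXT NOT NULL               -- attributes shared by all employees
);
CREATE TABLE PILOT (
    -- the primary key doubles as the foreign key, so each EMPLOYEE row
    -- can have at most one PILOT row: a 1:1 relationship
    EMP_NUM     INTEGER PRIMARY KEY REFERENCES EMPLOYEE (EMP_NUM),
    PIL_LICENCE TEXT NOT NULL             -- pilot-only attribute
);
""")

# Hypothetical rows: three employees, only one of whom is a pilot.
conn.executemany("INSERT INTO EMPLOYEE VALUES (?, ?)",
                 [(1, "Khumalo"), (2, "Naidoo"), (3, "Botha")])
conn.execute("INSERT INTO PILOT VALUES (1, 'ATP')")

# Pilot-only attributes are stored once per pilot, with no nulls.
pilot_nulls = conn.execute(
    "SELECT COUNT(*) FROM PILOT WHERE PIL_LICENCE IS NULL").fetchone()[0]

# Employees without a matching PILOT row are simply not pilots.
non_pilots = conn.execute("""
    SELECT COUNT(*)
    FROM EMPLOYEE AS e LEFT JOIN PILOT AS p ON e.EMP_NUM = p.EMP_NUM
    WHERE p.EMP_NUM IS NULL
""").fetchone()[0]
```

Had PIL_LICENCE been a column of EMPLOYEE instead, the two non-pilot rows would each have carried a null in it; with the split, that column exists only where it applies.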
Online Content
If you look at the 'Ch03_AviaCo' database on the online platform for this book, you will see the implementation of the 1:1 PILOT to EMPLOYEE relationship. This type of relationship will be examined in detail in Chapter 6, Data Modelling Advanced Concepts.
3.5.3 The *:* Relationship

A many-to-many (*:*) relationship is a more troublesome proposition in the relational environment. Traditionally in data modelling the *:* relationship can be implemented by breaking it up to produce a set of 1:* relationships.

To explore the many-to-many (*:*) relationship, consider a rather typical college environment in which each STUDENT can take many CLASSes and each CLASS can contain many STUDENTs. The ERD model in Figure 3.13 shows this *:* relationship.

FIGURE 3.13 The *:* relationship between STUDENT and CLASS
Note the features of the ERD in Figure 3.13:

Each CLASS can have many STUDENTs, and each STUDENT can take many CLASSes.

There can be many rows in the CLASS table for any given row in the STUDENT table, and there can be many rows in the STUDENT table for any given row in the CLASS table.

To examine the *:* relationship more closely, imagine a small university with two students, each of whom takes three classes. Table 3.7 shows the enrolment data for the two students.

TABLE 3.7 Sample student enrolment data

Student's Last Name  Selected Classes
Ndlovu               Accounting 1, ACCT-211, code 10014
                     Intro to Computer Science, CIS-220, code 10018
                     Intro to Statistics, QM-261, code 10021
Smithson             Accounting 1, ACCT-211, code 10014
                     Intro to Computer Science, CIS-220, code 10018
                     Intro to Statistics, QM-261, code 10021

Although the *:* relationship is logically reflected in Figure 3.13, it should not be implemented as shown in Figure 3.14 for two good reasons:

The tables create many redundancies. For example, note that the STU_NUM values occur many times in the STUDENT table. In a real-world situation, additional student attributes such as address, classification, major and home phone would also be contained in the STUDENT table, and each of those attribute values would be repeated in each of the records shown here. Similarly, the CLASS table contains many duplications: each student taking the class generates a CLASS record. The problem would be even worse if the CLASS table included such attributes as credit hours and course description. Those redundancies lead to the anomalies discussed in Chapter 1.

Given the structure and contents of the two tables, the relational operations become very complex and are likely to lead to system efficiency errors and output errors.
FIGURE 3.14 The *:* relationship between STUDENT and CLASS

Database name: Ch03_CollegeTry

Table name: STUDENT
Primary key: STU_NUM
Foreign key: none

STU_NUM  STU_LNAME  CLASS_CODE
321452   Ndlovu     10014
321452   Ndlovu     10018
321452   Ndlovu     10021
324257   Smithson   10014
324257   Smithson   10018
324257   Smithson   10021

Table name: CLASS
Primary key: CLASS_CODE
Foreign key: STU_NUM

CLASS_CODE  STU_NUM  CRS_CODE  CLASS_SECTION  CLASS_TIME          CLASS_ROOM  PROF_NUM
10014       321452   ACCT-211  3              TTh 2:30-3:45 p.m.  BUS252      342
10014       324257   ACCT-211  3              TTh 2:30-3:45 p.m.  BUS252      342
10018       321452   CIS-220   2              MWF 9:00-9:50 a.m.  KLR211      114
10018       324257   CIS-220   2              MWF 9:00-9:50 a.m.  KLR211      114
10021       321452   QM-261    1              MWF 8:00-8:50 a.m.  KLR200      114
10021       324257   QM-261    1              MWF 8:00-8:50 a.m.  KLR200      114

Fortunately, the problems inherent in the many-to-many (*:*) relationship can easily be avoided by creating a composite entity or bridge entity. Because such a table is used to link the tables that originally were related in a *:* relationship, the composite entity structure includes as foreign keys at least the primary keys of the tables that are to be linked. The database designer has two main options when defining a composite table's primary key: use the combination of those foreign keys or create a new primary key.
NOTE
In UML class diagrams, the multiplicity element can represent *:* relationships directly. Instead of using a composite entity, an association class is used to represent the association between two entities. We will explore the concept of an association class further in Chapter 5, Data Modelling with Entity Relationship Diagrams.
Remember that each entity in the ERD is represented by a table. Therefore, you can create the composite ENROL table shown in Figure 3.15 to link the tables CLASS and STUDENT. In this example, the ENROL table's primary key is the combination of its foreign keys CLASS_CODE and STU_NUM. But the designer could have decided to create a single-attribute new primary key such as ENROL_LINE, using a different line value to identify each ENROL table row uniquely. (Microsoft Access users might use the Autonumber data type to generate such line values automatically.)

FIGURE 3.15 Converting the *:* relationship into two 1:* relationships

Database name: Ch03_CollegeTry2

Table name: STUDENT
Primary key: STU_NUM
Foreign key: none

STU_NUM  STU_LNAME
321452   Ndlovu
324257   Smithson

Table name: ENROL
Primary key: CLASS_CODE + STU_NUM
Foreign keys: CLASS_CODE, STU_NUM

CLASS_CODE  STU_NUM  ENROLL_GRADE
10014       321452   C
10014       324257   B
10018       321452   A
10018       324257   B
10021       321452   C
10021       324257   C

Table name: CLASS
Primary key: CLASS_CODE
Foreign key: CRS_CODE

CLASS_CODE  CRS_CODE  CLASS_SECTION  CLASS_TIME          CLASS_ROOM  PROF_NUM
10014       ACCT-211  3              TTh 2:30-3:45 p.m.  BUS252      342
10018       CIS-220   2              MWF 9:00-9:50 a.m.  KLR211      114
10021       QM-261    1              MWF 8:00-8:50 a.m.  KLR200      114

Because the ENROL table in Figure 3.15 links two tables, STUDENT and CLASS, it is also called a linking table. In other words, a linking table is the implementation of a composite entity.
NOTE
In addition to the linking attributes, the composite ENROL table can also contain such relevant attributes as the grade earned in the course. In fact, a composite table can contain any number of attributes that the designer wants to track. Keep in mind that the composite entity, although it is implemented as an actual table, is conceptually a logical entity that was created as a means to an end: to eliminate the potential for multiple redundancies in the original *:* relationship.
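The composite ENROL table can be sketched with Python's built-in sqlite3 module. This is a minimal sketch, not the book's own implementation: the table and column names and the rows follow Figure 3.15, the CLASS table is trimmed to two columns, and student number 999999 is a hypothetical value used to show a rejected enrolment.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite checks foreign keys only when enabled

conn.executescript("""
CREATE TABLE STUDENT (STU_NUM INTEGER PRIMARY KEY, STU_LNAME TEXT NOT NULL);
CREATE TABLE CLASS   (CLASS_CODE INTEGER PRIMARY KEY, CRS_CODE TEXT NOT NULL);
CREATE TABLE ENROL (
    CLASS_CODE   INTEGER NOT NULL REFERENCES CLASS (CLASS_CODE),
    STU_NUM      INTEGER NOT NULL REFERENCES STUDENT (STU_NUM),
    ENROLL_GRADE TEXT,
    PRIMARY KEY (CLASS_CODE, STU_NUM)   -- combination of the two foreign keys
);
""")

conn.executemany("INSERT INTO STUDENT VALUES (?, ?)",
                 [(321452, "Ndlovu"), (324257, "Smithson")])
conn.executemany("INSERT INTO CLASS VALUES (?, ?)",
                 [(10014, "ACCT-211"), (10018, "CIS-220"), (10021, "QM-261")])
conn.executemany("INSERT INTO ENROL VALUES (?, ?, ?)", [
    (10014, 321452, "C"), (10014, 324257, "B"),
    (10018, 321452, "A"), (10018, 324257, "B"),
    (10021, 321452, "C"), (10021, 324257, "C"),
])

# Each student and each class is stored once; ENROL carries the pairings.
ndlovu_grades = conn.execute("""
    SELECT c.CRS_CODE, e.ENROLL_GRADE
    FROM ENROL AS e
    JOIN CLASS   AS c ON c.CLASS_CODE = e.CLASS_CODE
    JOIN STUDENT AS s ON s.STU_NUM    = e.STU_NUM
    WHERE s.STU_LNAME = 'Ndlovu'
    ORDER BY e.CLASS_CODE
""").fetchall()

# Referential integrity: an enrolment for an unknown student is refused.
try:
    conn.execute("INSERT INTO ENROL VALUES (10014, 999999, 'A')")
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True
```

The foreign key values repeat inside ENROL, but with referential integrity enforced those repeats cannot drift away from the single STUDENT and CLASS rows they refer to.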
The linking table (ENROL) shown in Figure 3.15 yields the required *:* to 1:* conversion. Observe that the composite entity represented by the ENROL table must contain at least the primary keys of the CLASS and STUDENT tables (CLASS_CODE and STU_NUM, respectively) for which it serves as a connector. Also note that the STUDENT and CLASS tables now contain only one row per entity. The linking ENROL table contains multiple occurrences of the foreign key values, but those controlled redundancies are incapable of producing anomalies as long as referential integrity is enforced. Additional attributes may be assigned as needed. In this case, ENROL_GRADE is selected to satisfy a reporting requirement. Also note that the ENROL table's primary key consists of the two attributes CLASS_CODE and STU_NUM, because both the class code and the student number are needed to define a particular student's grade. Naturally, the conversion is reflected in the ERM, too. The revised relationship is shown in Figure 3.16.

FIGURE 3.16 Changing the *:* relationship to two 1:* relationships

As you examine Figure 3.16, note that the composite entity named ENROL represents the linking table between STUDENT and CLASS.

The 1:* relationship between COURSE and CLASS was first illustrated in Figure 3.9 and Figure 3.10. With the help of this relationship, you can increase the amount of available information, even as you control the database's redundancies. Thus, Figure 3.16 can be expanded to include the 1:* relationship between COURSE and CLASS shown in Figure 3.17. Note that the model is able to handle multiple sections of a CLASS while controlling redundancies by making sure that all of the COURSE data common to each CLASS are kept in the COURSE table.

FIGURE 3.17 The expanded entity relationship model
[Diagram: COURSE (1..1) has CLASS (1..*); STUDENT (1..1) registers ENROL (1..*); CLASS (1..1) shows_in ENROL (1..*)]
The ERD will be examined in greater detail in Chapter 5 to show you how it is used to design more complex databases. The ERD will also be used as the basis for the development and implementation of a realistic database design in Appendices B and C (see the online platform for this book).

3.6 DATA REDUNDANCY REVISITED
In Chapter 1 you learnt that data redundancy leads to data anomalies. Those anomalies can destroy the effectiveness of the database. You also learnt that the relational database makes it possible to control data redundancies by using common attributes that are shared by tables, called foreign keys.

The proper use of foreign keys is crucial to exercising data redundancy control. However, it is worth emphasising that, in the strictest sense, the use of foreign keys does not eliminate data redundancies, because foreign key values can be repeated many times. Nevertheless, the proper use of foreign keys minimises data redundancies, thus minimising the chance that destructive data anomalies will develop.

NOTE
The real test of redundancy is not how many copies of a given attribute are stored, but whether the elimination of an attribute will eliminate information. Therefore, if you delete an attribute and the original information can still be generated through relational algebra, the inclusion of that attribute would be redundant. Given that view of redundancy, proper foreign keys are clearly not redundant in spite of their multiple occurrences in a table. However, even when you use this less restrictive view of redundancy, keep in mind that controlled redundancies are often designed as part of the system to ensure transaction speed and/or information requirements. Exclusive reliance on relational algebra to produce required information may lead to elegant designs that fail the test of practicality.
You will learn in Chapter 5 that database designers must reconcile three often contradictory requirements: design elegance, processing speed and information requirements. And you will learn in Chapter 15, Databases for Business Intelligence, that proper data warehousing design requires carefully defined and controlled data redundancies to function properly. Regardless of how you describe data redundancies, the potential for damage is limited by proper implementation and careful control.

As important as data redundancy control is, there are times when the level of data redundancy must actually be increased to make the database serve crucial information purposes. You will learn about such redundancies in Chapter 15. And there are times when data redundancies seem to exist to preserve the historical accuracy of data. For example, consider a small invoicing system. The system includes the CUSTOMER, who may buy one or more PRODUCTs, thus generating an INVOICE. Because a customer may buy more than one product at a time, an invoice may contain several invoice LINEs, each providing details about the purchased product. The PRODUCT table should contain the product price to provide a consistent pricing input for each product that appears on the invoice. The tables that are part of such a system are shown in Figure 3.18. The system's class ERD is shown in Figure 3.19.
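Why a seemingly redundant copy of a price can preserve historical accuracy can be sketched in a few lines of Python. This is an illustration under assumptions rather than the book's own example: the product code, the price of 2.36 and the three units sold follow the rat-tail file rows of Figure 3.18, while the later price of 2.99 is a hypothetical value.

```python
from decimal import Decimal

# Current price list, keyed by PROD_CODE (plays the role of PRODUCT.PROD_PRICE).
product_price = {"SRE-657UG": Decimal("2.36")}

# At the time of sale the price is copied into the invoice line
# (plays the role of LINE.LINE_PRICE).
line = {
    "PROD_CODE": "SRE-657UG",
    "LINE_UNITS": 3,
    "LINE_PRICE": product_price["SRE-657UG"],
}

# Later the product price changes (2.99 is a hypothetical new price).
product_price["SRE-657UG"] = Decimal("2.99")

# Revenue from the stored copy still reflects what was actually billed.
stored_revenue = line["LINE_UNITS"] * line["LINE_PRICE"]

# Revenue recomputed from the current price list no longer does.
recomputed_revenue = line["LINE_UNITS"] * product_price[line["PROD_CODE"]]
```

Removing the copied price would therefore lose information about the past sale, which is the sense in which this kind of redundancy is deliberate rather than accidental.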
FIGURE 3.18 A small invoicing system

Database name: Ch03_SaleCo

Table name: CUSTOMER
Primary key: CUS_CODE
Foreign key: none

CUS_CODE  CUS_LNAME   CUS_FNAME  CUS_INITIAL  CUS_AREACODE  CUS_PHONE
10010     Ramas       Alfred     A            0181          844-2573
10011     Dunne       Leona      K            0161          894-1238
10012     Du Toit     Marlene    W            0181          894-2285
10013     Pieterse    Jaco       F            0181          894-2180
10014     Orlando     Myron                   0181          222-1672
10015     OBrian      Amy        B            0161          442-3381
10016     Brown       James      G            0181          297-1228
10017     Williams    George                  0181          290-2556
10018     Padayachee  Vinaya     G            0161          382-7185
10019     Moloi       Mlilo      K            0181          297-3809

Table name: INVOICE
Primary key: INV_NUMBER
Foreign key: CUS_CODE

INV_NUMBER  CUS_CODE  INV_DATE
1001        10014     08-Dec-19
1002        10011     08-Dec-19
1003        10012     08-Dec-19
1004        10011     09-Dec-19

Table name: LINE
Primary key: INV_NUMBER + LINE_NUMBER
Foreign key: INV_NUMBER, PROD_CODE

INV_NUMBER  LINE_NUMBER  PROD_CODE  LINE_UNITS  LINE_PRICE
1001        1            123-21UUY  1           150.09
1001        2            SRE-657UG  3           2.36
1002        1            QER-34256  2           14.72
1003        1            ZZX/3245Q  1           5.36
1003        2            SRE-657UG  1           2.36
1003        3            001278-AB  1           10.23
1004        1            001278-AB  1           10.23
1004        2            SRE-657UG  2           2.36

Table name: PRODUCT
Primary key: PROD_CODE
Foreign key: none

PROD_CODE  PROD_DESCRIPT                   PROD_PRICE  PROD_ON_HAND  VEND_CODE
001278-AB  Claw hammer                     10.23       23            232
123-21UUY  Houselite chain saw, 16 cm bar  150.09      4             235
QER-34256  Sledge hammer, 16 kg head       14.72       6             231
SRE-657UG  Rat-tail file                   2.36        15            232
ZZX/3245Q  Steel tape, 12 m length         5.36        8             235

FIGURE 3.19 The Class ERD for the invoicing system
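The way application software might total each invoice from the LINE rows of Figure 3.18 can be sketched in Python. This is a minimal sketch under assumptions: the rows are copied from the LINE table in Figure 3.18, taxes are left out, and Decimal is used in place of floating point so that currency sums stay exact.

```python
from decimal import Decimal

# (INV_NUMBER, LINE_NUMBER, PROD_CODE, LINE_UNITS, LINE_PRICE) from Figure 3.18
line_rows = [
    (1001, 1, "123-21UUY", 1, "150.09"),
    (1001, 2, "SRE-657UG", 3, "2.36"),
    (1002, 1, "QER-34256", 2, "14.72"),
    (1003, 1, "ZZX/3245Q", 1, "5.36"),
    (1003, 2, "SRE-657UG", 1, "2.36"),
    (1003, 3, "001278-AB", 1, "10.23"),
    (1004, 1, "001278-AB", 1, "10.23"),
    (1004, 2, "SRE-657UG", 2, "2.36"),
]

# Multiply each line's units by its price and add the results per invoice.
invoice_totals = {}
for inv_number, _line_number, _prod_code, units, price in line_rows:
    subtotal = units * Decimal(price)
    invoice_totals[inv_number] = invoice_totals.get(inv_number, Decimal("0")) + subtotal
```

For invoice 1001 this gives one chain saw at 150.09 plus three rat-tail files at 2.36 each, before any taxes are applied.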
As you examine the tables in the invoicing system in Figure 3.18 and the relationships depicted in Figure 3.19, note that you can keep track of typical sales information. For example, by tracing the relationships among the four tables, you discover that customer 10014 (Myron Orlando) bought two items on 8 December 2019 that were written to invoice number 1001: one Houselite chain saw with a 16 cm bar and three rat-tail files. (Note: Trace the CUS_CODE number 10014 in the CUSTOMER table to the matching CUS_CODE value in the INVOICE table. Next, take the INV_NUMBER 1001 and trace it to the first two rows in the LINE table; then match the two PROD_CODE values in LINE with the PROD_CODE values in PRODUCT.) Application software will be used to write the correct bill by multiplying each invoice line item's LINE_UNITS by its LINE_PRICE, adding the results, applying appropriate taxes, etc. Later, other application software might use the same technique to write sales reports that track and compare sales by week, month or year.

As you examine the sales transactions in Figure 3.18, you might reasonably suppose that the product price billed to the customer is derived from the PRODUCT table because that's where the product
data are stored. But why does that same product price occur again in the LINE table? Isn't that a data redundancy? It certainly appears to be. But this time, the apparent redundancy is crucial to the system's success. Copying the product price from the PRODUCT table to the LINE table maintains the historical accuracy of the transactions. Suppose, for instance, that you fail to write the LINE_PRICE in the LINE table and that you use the PROD_PRICE from the PRODUCT table to calculate the sales revenue. Now suppose that the PRODUCT table's PROD_PRICE changes. This price change will be properly reflected in all subsequent sales revenue calculations. Unfortunately, the calculations of past sales revenues will also reflect the new product price, which was not in effect when the transaction took place! As a result, the revenue calculations for all past transactions will be incorrect, thus eliminating the possibility of making proper sales comparisons over time. On the other hand, if the price data are copied from the PRODUCT table and stored with the transaction in the LINE table, that price will always accurately reflect the transaction that took place at that time. You will discover that such planned 'redundancies' are common in good database design.
Finally, you might wonder why the LINE_NUMBER attribute was used in the LINE table in Figure 3.18. Wouldn't the combination of INV_NUMBER and PROD_CODE be a sufficient composite primary key and, therefore, isn't the LINE_NUMBER redundant? Yes, the LINE_NUMBER is redundant, but this redundancy is quite commonly created by invoicing software that generates such line numbers automatically. In this case, the redundancy is not necessary. But given its automatic generation, the redundancy is not a source of anomalies. The inclusion of LINE_NUMBER also adds another benefit: the order of the retrieved invoicing data will always match the order in which the data were entered. If product codes are used as part of the primary key, indexing will arrange those product codes as soon as the invoice is completed and the data are stored. You can imagine the potential confusion when a customer calls and says, 'The second item on my invoice has an incorrect price', and you are looking at an invoice whose lines show a different order from those on the customer's copy!

3.7 INDEXES
Suppose you want to locate a particular book in a library. Does it make sense to look through every book in the library until you find the one you want? Of course not; you use the library's catalogue, which is indexed by title, topic and author. The index (in either a manual or a computer system) points you to the book's location, thereby making retrieval of the book a quick and simple matter. An index is an orderly arrangement used to logically access rows in a table.
Or suppose you want to find a topic, such as 'ER model', in this book. Does it make sense to read through every page until you stumble across the topic? Of course not; it is much simpler to go to the book's index, look up the phrase 'ER model', and read the references that point you to the appropriate page(s). In each case, an index is used to locate a needed item quickly.
Indexes in the relational database environment work like the indexes described in the preceding paragraphs. From a conceptual point of view, an index is composed of an index key and a set of pointers. The index key is, in effect, the index's reference point. More formally, an index is an ordered arrangement of keys and pointers. Each key points to the location of the data identified by the key.
For example, suppose you want to look up all of the paintings created by a given painter in the Ch03_Museum database in Figure 3.8. Without an index, you must read each row in the PAINTING table and see if the PAINTER_NUM matches the requested painter. However, if you index the PAINTING table and use the index key PAINTER_NUM, you merely need to look up the appropriate PAINTER_NUM in the index and find the matching pointers. Conceptually speaking, the index would resemble the presentation depicted in Figure 3.20.
FIGURE 3.20 Components of an index

PAINTING table index
PAINTER_NUM (index key)   Pointers to the PAINTING table rows
123                       1, 2, 4
126                       3, 5

SOURCE: Course Technology/Cengage Learning
As you examine Figure 3.20, note that the first PAINTER_NUM index key value (123) is found in records 1, 2 and 4 of the PAINTING table. The second PAINTER_NUM index key value (126) is found in records 3 and 5 of the PAINTING table.
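The key-and-pointer arrangement in Figure 3.20 can be sketched in a few lines of Python. This is only a conceptual illustration, not how a DBMS stores an index internally: the row numbers and PAINTER_NUM values are taken from the figure, while the names painting_rows and build_index are hypothetical.

```python
from collections import defaultdict

# (row number, PAINTER_NUM) pairs mirroring Figure 3.20.
# Only the index key column is modelled; other PAINTING columns are omitted.
painting_rows = [(1, 123), (2, 123), (3, 126), (4, 123), (5, 126)]


def build_index(rows):
    """Map each index key (PAINTER_NUM) to the row numbers that contain it."""
    index = defaultdict(list)
    for row_number, painter_num in rows:
        index[painter_num].append(row_number)
    # An index is an ordered arrangement of keys, so sort by key.
    return dict(sorted(index.items()))


painter_index = build_index(painting_rows)

# Look up painter 123 through the index instead of scanning every row.
pointers_for_123 = painter_index[123]
print(painter_index)
print(pointers_for_123)
```

With the index in place, finding a painter's paintings is a single key lookup followed by reading only the rows the pointers name.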
DBMSs use indexes for many different purposes. You just learnt that an index can be used to retrieve data more efficiently. But indexes can also be used by a DBMS to retrieve data ordered by a specific attribute or attributes. For example, creating an index on a customer's last name will allow you to retrieve the customer data alphabetically ordered by the customer's last name. Also, an index key can be composed of one or more attributes. For example, in Figure 3.18, you can create an index on VEND_CODE and PROD_CODE to retrieve all rows in the PRODUCT table ordered by vendor and, within vendor, ordered by product.
Indexes play an important role in DBMSs for the implementation of primary keys. When you define a table's primary key, the DBMS automatically creates a unique index on the primary key column(s) you declared. For example, in Figure 3.18, when you declare CUS_CODE to be the primary key of the CUSTOMER table, the DBMS automatically creates a unique index on that attribute. A unique index, as its name implies, is an index in which the index key can have only one pointer value (row) associated with it. (The index in Figure 3.20 is not a unique index because the PAINTER_NUM has multiple pointer values associated with it. For example, painter number 123 points to three rows, 1, 2 and 4, in the PAINTING table.)
Indexes are crucial in speeding up data access. They can be used to facilitate searching, sorting and even joining tables. The improvement in data access speed occurs because an index is an ordered set of values that contains the index key and pointers. A table can have many indexes, but each index is associated with only one table. The index key can have multiple attributes (composite index). Creating an index is easy. You will learn in Chapter 8 that a simple SQL command will produce any required index.
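As a preview of what such a command can look like, the following sketch uses Python's built-in sqlite3 module with an in-memory database. It is an illustration under stated assumptions rather than the book's own example: the index names PROD_VEND_NDX and PROD_CODE_NDX are hypothetical, the PRODUCT table is cut down to three columns, and SQLite stands in for whichever DBMS you use.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE PRODUCT (PROD_CODE TEXT, PROD_DESCRIPT TEXT, VEND_CODE INTEGER)"
)
conn.executemany(
    "INSERT INTO PRODUCT VALUES (?, ?, ?)",
    [
        ("ZZX/3245Q", "Steel tape", 235),
        ("001278-AB", "Claw hammer", 232),
        ("123-21UUY", "Houselite chain saw", 235),
        ("SRE-657UG", "Rat-tail file", 232),
    ],
)

# Composite index: the index key is made up of two attributes.
conn.execute("CREATE INDEX PROD_VEND_NDX ON PRODUCT (VEND_CODE, PROD_CODE)")
# Unique index: each index key value may point to only one row.
conn.execute("CREATE UNIQUE INDEX PROD_CODE_NDX ON PRODUCT (PROD_CODE)")

# Rows ordered by vendor and, within vendor, by product.
ordered = conn.execute(
    "SELECT VEND_CODE, PROD_CODE FROM PRODUCT ORDER BY VEND_CODE, PROD_CODE"
).fetchall()

# A second row with an existing PROD_CODE violates the unique index.
try:
    conn.execute("INSERT INTO PRODUCT VALUES ('001278-AB', 'Duplicate', 231)")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True
```

The ORDER BY clause is what guarantees the row order; the composite index simply gives the DBMS an already ordered structure it may use to satisfy that request.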
NOTE You will learn more about how indexes can be applied to improve data access and retrieval in Chapter 11, Conceptual, Logical, and Physical Database Design.
3.8 CODD'S RELATIONAL DATABASE RULES
In 1985, Dr E.F. Codd published a list of 12 rules to define a relational database system.[2] The reason Dr Codd published the list was his concern that many vendors were marketing products as 'relational' even though those products did not meet minimum relational standards. Dr Codd's list, shown in Table 3.8, serves as a frame of reference for what a truly relational database should be. Bear in mind that even the dominant database vendors do not fully support all 12 rules.

TABLE 3.8 Dr Codd's 12 relational database rules

Rule   Rule Name   Description
1   Information   All information in a relational database must be logically represented as column values in rows within tables.
2   Guaranteed Access   Every value in a table is guaranteed to be accessible through a combination of table name, primary key value and column name.
3   Systematic Treatment of Nulls   Nulls must be represented and treated in a systematic way, independent of data type.
4   Dynamic Online Catalogue Based on the Relational Model   The metadata must be stored and managed as ordinary data, that is, in tables within the database. Such data must be available to authorised users, using the standard database relational language.
5   Comprehensive Data Sub-language   The relational database may support many languages. However, it must support one well-defined, declarative language with support for data definition, view definition, data manipulation (interactive and by program), integrity constraints, authorisation and transaction management (begin, commit and rollback).
6   View Updating   Any view that is theoretically updatable must be updatable through the system.
7   High-Level Insert, Update and Delete   The database must support set-level inserts, updates and deletes.
8   Physical Data Independence   Application programs and ad hoc facilities are logically unaffected when physical access methods or storage structures are changed.
9   Logical Data Independence   Application programs and ad hoc facilities are logically unaffected when changes are made to the table structures that preserve the original table values (changing order of column or inserting columns).
10   Integrity Independence   All relational integrity constraints must be definable in the relational language and stored in the system catalogue, not at the application level.
11   Distribution Independence   The end users and application programs are unaware and unaffected by the data location (distributed vs local databases).
12   Non-Subversion   If the system supports low-level access to the data, there must not be a way to bypass the integrity rules of the database.
Rule Zero   All preceding rules are based on the notion that, in order for a database to be considered relational, it must use its relational facilities exclusively to manage the database.

[2] Codd, E.F., 'Is Your DBMS Really Relational?' and 'Does Your DBMS Run by the Rules?', Computerworld, 14 October and 21 October, 1985.
SUMMARY
Tables are the basic building blocks of a relational database. A grouping of related entities, known as an entity set, is stored in a table. Conceptually speaking, the relational table is composed of intersecting rows (tuples) and columns. Each row represents a single entity, and each column represents the characteristics (attributes) of the entities.
Keys are central to the use of relational tables. Keys define functional dependencies; that is, other attributes are dependent on the key and can, therefore, be found if the key value is known. A key can be classified as a superkey, a candidate key, a primary key, a secondary key or a foreign key.
Each table row must have a primary key. The primary key is an attribute or a combination of attributes that uniquely identifies all remaining attributes found in any given row. Because a primary key must be unique, no null values are allowed if entity integrity is to be maintained.
Although the tables are independent, they can be linked by common attributes. Thus, the primary key of one table can appear as the foreign key in another table to which it is linked. Referential integrity dictates that the foreign key must contain values that match the primary key in the related table or must contain nulls.
Once you know the relational database basics, you can concentrate on design. Good design begins by identifying appropriate entities and attributes, and the relationships among the entities. Those relationships (1:1, 1:* and *:*) can be represented using ERDs. The use of ERDs allows you to create and evaluate simple logical design. The 1:* relationships are most easily incorporated in a good design; you just have to make sure that the primary key of the '1' is included in the table of the 'many'.
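The two integrity rules summarised above can be expressed as simple checks over rows held in memory. The sketch below is a hypothetical illustration, not part of any DBMS: the function names are invented, tables are modelled as lists of dictionaries, and the sample product codes P1 to P4 and vendor code 999 are made up for the demonstration.

```python
def has_entity_integrity(rows, pk):
    """True if every primary key value is present (not None) and unique."""
    keys = [row.get(pk) for row in rows]
    return None not in keys and len(keys) == len(set(keys))


def has_referential_integrity(child_rows, fk, parent_rows, pk):
    """True if every foreign key value is null or matches a parent primary key."""
    parent_keys = {row[pk] for row in parent_rows}
    return all(row[fk] is None or row[fk] in parent_keys for row in child_rows)


# Hypothetical sample data: vendor codes follow the chapter's style,
# product codes are invented.
vendors = [{"VEND_CODE": 231}, {"VEND_CODE": 232}, {"VEND_CODE": 235}]
products = [
    {"PROD_CODE": "P1", "VEND_CODE": 232},
    {"PROD_CODE": "P2", "VEND_CODE": None},  # a null foreign key is allowed
    {"PROD_CODE": "P3", "VEND_CODE": 235},
]
# Vendor 999 does not exist, so this extra row breaks referential integrity.
orphaned = products + [{"PROD_CODE": "P4", "VEND_CODE": 999}]

print(has_entity_integrity(products, "PROD_CODE"))
print(has_referential_integrity(products, "VEND_CODE", vendors, "VEND_CODE"))
print(has_referential_integrity(orphaned, "VEND_CODE", vendors, "VEND_CODE"))
```

In practice the DBMS enforces both rules through primary key and foreign key declarations; the functions only make the definitions concrete.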
KEY TERMS
associations
association class
attribute domain
bridge entity
candidate key
cardinality
composite entity
composite key
data dictionary
determination
domain
entity integrity
flags
foreign key (FK)
full functional dependence
functional dependence
homonyms
index
index key
key
key attribute
linking table
multiplicity
null
predicate logic
primary key (PK)
referential integrity
relation
relational schema
secondary key
superkey
synonym
system catalogue
tuple
unique index
FURTHER READING
Codd, E.F. 'Relational completeness of data base sublanguages' (presented at Courant Computer Science Symposia Series 6, New York City, NY, 24-25 May, 1971). IBM Research Report RJ987 (6 March, 1972). Republished in Randall J. Rustin (ed.), Data Base Systems: Courant Computer Science Symposia Series 6. Prentice-Hall, 1972.
Codd, E.F. The Relational Model for Database Management: Version 2. Addison-Wesley, 1990.
Date, C.J. The Relational Database Dictionary. O'Reilly, 2006.
Date, C.J. and Darwen, H. Databases, Types and the Relational Model. Addison-Wesley, 2006.
Date, C.J. Date on Database: Writings 2000-2006. APress, 2006.
Date, C.J. Database in Depth: The Relational Model for Practitioners. O'Reilly, 2005.
Online Content All of the databases used in the questions and problems are available on the online platform accompanying this book. The database names used in the folder match the database names used in the figures. For example, the source of the tables shown in Figure Q3.1 is the 'Ch03_CollegeQue' database. Answers to selected Review Questions and Problems for this chapter are also available on the online platform.

REVIEW QUESTIONS
1 What is the difference between a database and a table?
2 What does it mean to say that a database displays both entity integrity and referential integrity?
3 Why are entity integrity and referential integrity important in a database?
4 What can a NULL value represent?
5 What is the domain of an attribute?
6 Create the basic ERD using UML notation for the database shown in Figure Q3.1.

FIGURE Q3.1 The Ch03_CollegeQue database tables
Database name: Ch03_CollegeQue

Table name: STUDENT
STU_CODE   LECT_CODE
100278
128569     2
512272     4
531235     2
531268
553427     1

Table name: LECTURER
LECT_CODE   DEPT_CODE
1           2
2           6
3           6
4           4

7 Create the basic ERD using UML notation for the database shown in Figure Q3.2.
FIGURE Q3.2 The Ch03_TravelQue database tables
Database name: Ch03_TravelQue

Table name: CUSTOMER
CUS_CODE   CUS_LNAME   CUS_EMAIL           CUS_MOBILE
24563      GARNETT     [email protected]   08703345671
24565      MWBAU       [email protected]   08734566664

Table name: BOOKING
BOOKING_NO   PACKAGE_ID   BOOK_TOTAL_COST   BOOK_PAID   BOOK_DEP_DATE
24563        9910001      956.00            Y           06-Jan-19
24565        9910001      895.00            N           07-Sep-19
24563        9910003      3056.00           N           05-Oct-19

Table name: PACKAGE_HOLIDAY
PACKAGE_ID   PACK_DESTINATION   PACK_OPERATOR    PACK_DURATION
9910001      Spain              Riveria Travel   7
9910002      USA                Mouse Holidays   14
9910003      Australia          Wallaby Tours    21
8 Suppose you have the ERD shown in Figure Q3.3. How would you convert this model into an ERD that displays only 1:* relationships? (Make sure you create the revised ERD.)

FIGURE Q3.3 The UML Class ERD for question 8
TRUCK 1..* ---- 1..* DRIVER
During some time interval, a DRIVER can drive many TRUCKs and any TRUCK can be driven by many DRIVERs.

9 What are homonyms and synonyms, and why should they be avoided in database design?
10 How would you implement a 1:* relationship in a database composed of two tables? Give an example.
11 Identify and describe the components of the table shown in Figure Q3.4, using correct terminology. Use your knowledge of naming conventions to identify the table's probable foreign key(s).
FIGURE Q3.4 The Ch03_NoComp database EMPLOYEE table
Database name: Ch03_NoComp

Table name: EMPLOYEE
EMP_NUM   EMP_LNAME    EMP_INITIAL   EMP_FNAME   DEPT_CODE   JOB_CODE
11234     Friedman     K             Robert      MKTG        12
11238     Zulu         D             Cela        MKTG        12
11241     Fontein                    Juliette    INFS        5
11242     Theron       J             Emma        ENG         9
11245     Smithson     B             Bernard     INFS        6
11248     Washington   G             Oleta       ENGR        8
11256     McBride                    Randall     ENGR        8
11257     Mazibuko     D             Fikile      MKTG        14
11258     Smith        W             William     MKTG        14
11260     Ratula       A             Katrina     INFS        5

12 Suppose you are using the database composed of the two tables shown in Figure Q3.5.
a Identify the primary keys.
b Identify the foreign keys.
c Create the ERM.
d Suppose you wanted quick lookup capability to get a listing of all plays directed by a given director. Which table would be the basis for the INDEX table, and what would be the index key?
e What would be the conceptual view of the INDEX table that is described in Part d? Depict the contents of the conceptual INDEX table.

FIGURE Q3.5 The Ch03_Theatre database tables
Database name: Ch03_Theatre

Table name: DIRECTOR
DIR_NUM   DIR_LNAME    DIR_DOB
100       Broadway     12-Jan-75
101       Hollywoody   18-Nov-63
102       Goofy        21-Jun-72

Table name: PLAY
PLAY_CODE   PLAY_NAME                       DIR_NUM
1001        Cat On a Cold, Bare Roof        102
1002        Hold the Mayo, Pass the Bread   101
1003        I Never Promised You Coffee     102
1004        Silly Putty Goes To Washington  100
1005        See No Sound, Hear No Sight     101
1006        Starstruck in Biloxi            102
1007        Stranger In Parrot Ice          101

13 Suppose you are using the database to enable a museum to find the location of artefacts around the world. The database is composed of the three tables shown in Figure Q3.6.
a Identify the primary keys.
b Identify the foreign keys.
c Create the ERM.
d Suppose the museum database was to be expanded to include details of a curator that could be contacted to request to see an artefact. The details that need to be stored are CURATOR_NO, CURATOR_NAME and CURATOR_CONTACT. A curator may be responsible for more than one location. Modify your ERM to include this information.

FIGURE Q3.6 Artefact Museum Database

Table name: ARTEFACT
ARTEFACT_TRACK_ID   ARTEFACT_DESCRIPTION                  ARTEFACT_AGE   ARTEFACT_VALUE   ARTEFACT_LOCATION_ID
10034               Greywacke Statue Tribute to Isis      664-525 BC     6000000          78343
10039               The Golden Rhinoceros of Mapungubwe   1075-1220      12100000         56432
10056               Pinner Qing Dynasty Vase              18th Century   85900000         23412
19002               Rosetta Stone                         181 BC                          23412

Table name: LOCATION
ARTEFACT_LOCATION_ID   ARTEFACT_COUNTRY
78343                  FRANCE
56432                  USA
23412                  LONDON
PROBLEMS
Use the database shown in Figure P3.1 to work Problems 1-6. Note that the database is composed of four tables that reflect these relationships:
An EMPLOYEE has only one JOB_CODE, but a JOB_CODE can be held by many EMPLOYEEs.
An EMPLOYEE can participate in many PLANs, and any PLAN can be assigned to many EMPLOYEEs.
Note also that the *:* relationship has been broken down into two 1:* relationships for which the BENEFIT table serves as the composite or bridge entity.

FIGURE P3.1 The Ch03_BeneCo database tables
Database name: Ch03_BeneCo

Table name: EMPLOYEE
EMP_CODE   EMP_LNAME   JOB_CODE
14         Rudell      2
15         Arendse     1
16         Ruellardo   1
17         Smith       3
20         Smith       2

Table name: JOB
JOB_CODE   JOB_DESCRIPTION
1          Clerical
2          Technical
3          Managerial

Table name: BENEFIT
EMP_CODE   PLAN_CODE
15         2
15         3
16         1
17         1
17         3
17         4
20         3

Table name: PLAN
PLAN_CODE   PLAN_DESCRIPTION
1           Term life
2           Stock purchase
3           Long-term disability
4           Dental

1 For each table in the database, identify the primary key and the foreign key(s). If a table does not have a foreign key, write None in the space provided.

Table      Primary Key   Foreign Key(s)
EMPLOYEE
BENEFIT
JOB
PLAN

2 Create the ERD using UML notation to show the relationship between EMPLOYEE and JOB.
3 Do the tables exhibit entity integrity? Answer yes or no; then explain your answer.

Table      Entity Integrity   Explanation
EMPLOYEE
BENEFIT
JOB
PLAN
4 Do the tables exhibit referential integrity? Answer yes or no; then explain your answer. Write NA (not applicable) if the table does not have a foreign key.

Table      Referential Integrity   Explanation
EMPLOYEE
BENEFIT
JOB
PLAN

5 Create the ERD using Crow's Foot notation to show the relationships among EMPLOYEE, BENEFIT, JOB and PLAN.
6 Create the ERD using UML class diagram notation to show the relationships among EMPLOYEE, BENEFIT, JOB and PLAN.

Use the database shown in Figure P3.2 to answer Problems 7-13.
FIGURE P3.2 The Ch03_StoreCo database tables
Database name: Ch03_StoreCo

Table name: EMPLOYEE
EMP_CODE   EMP_TITLE   EMP_LNAME    EMP_FNAME   EMP_INITIAL   EMP_DOB     STORE_CODE
1          Mr          Govender     Adimoolam   W             21-May-70   3
2          Ms          Ratula       Nancy                     09-Feb-75   2
3          Ms          Greenboro    Lottie      R             02-Oct-67   4
4          Mrs         Rumpersfro   Jennie      S             01-Jun-77   5
5          Mr          Smith        Robert      L             23-Nov-65   3
6          Mr          Renselaer    Cary        A             25-Dec-71   1
7          Mr          Ogallo       Roberto     S             31-Jul-68   3
8          Ms          Van Blerk    Elandri     I             10-Sep-74   1
9          Mr          Eindsmar     Jack        W             19-Apr-61   2
10         Mrs         Jones        Rose        R             06-Mar-72   4
11         Mr          Broderick    Tom                       21-Oct-78   3
12         Mr          Washington   Alan        Y             08-Sep-80   2
13         Mr          Smith        Peter       N             25-Aug-70   3
14         Ms          Smith        Sherry      H             25-May-72   4
15         Mr          Olenko       Howard      U             24-May-70   5
16         Mr          Archialo     Barry       V             03-Sep-66   5
17         Ms          Grimaldo     Jeanine     K             12-Nov-76   4
18         Mr          Rosenberg    Andrew      D             24-Jan-77   4
19         Mr          Bophela      Ingwe       F             03-Oct-74   4
20         Mr          Mckee        Robert      S             06-Mar-76   1
21         Ms          Baumann      Jennifer    A             11-Dec-80   3

Table name: STORE
STORE_CODE   STORE_NAME          STORE_YTD_SALES   REGION_CODE   EMP_CODE
1            Access Junction     792 730.05        2             8
2            Database Corner     1 123 370.04      2             12
3            Tuple Charge        779 558.74        1             7
4            Attribute Alley     746 209.16        2             3
5            Primary Key Point   2 314 777.78      1             15

Table name: REGION
REGION_CODE   REGION_DESCRIPT
1             East
2             West
7 For each table, identify the primary key and the foreign key(s). If a table does not have a foreign key, write None in the space provided.

Table      Primary Key   Foreign Key(s)
EMPLOYEE
STORE
REGION

8 Do the tables exhibit entity integrity? Answer yes or no; then explain your answer.

Table      Entity Integrity   Explanation
EMPLOYEE
STORE
REGION

9 Do the tables exhibit referential integrity? Answer yes or no; then explain your answer. Write NA (not applicable) if the table does not have a foreign key.

Table      Referential Integrity   Explanation
EMPLOYEE
STORE
REGION

10 Describe the type(s) of relationship(s) between STORE and REGION.
11 Create the ERD using UML notation to show the relationship between STORE and REGION.
12 Describe the type(s) of relationship(s) between EMPLOYEE and STORE. (Hint: Each store employs many employees, one of whom manages the store.)
13 Create the ERD using UML notation to show the relationships among EMPLOYEE, STORE and REGION.

Use the database shown in Figure P3.3 to answer Problems 14-18.
FIGURE P3.3 The Ch03_CheapCo database tables
Database name: Ch03_CheapCo

Table name: PRODUCT
Primary key: PROD_CODE
Foreign key: VEND_CODE

PROD_CODE   PROD_DESCRIPTION             PROD_STOCK_DATE   PROD_ON_HAND   PROD_PRICE   VEND_CODE
12-WW/P2    18 cm power saw blade        07-Apr-16         12             10.94        123
1QQ23-55    6 cm wood screw, 100         19-Mar-16         123            13.55        123
231-78-W    PVC pipe, 8 cm, 2.44 m       07-Dec-15         45             17.01        121
33564/U     Rat-tail file, 0.5 cm, fine  08-Mar-16         18             10.94        123
AR/3/TYR    Cordless drill, 0.6 cm       29-Nov-15         8              136.33       121
DT-34-WW    Philips screwdriver pack     20-Dec-15         11             118.40       123
EE3-67/W    Sledge hammer, 7 kg          25-Feb-16         9              114.21       121
ER-56/DF    Houselite chain saw, 40 cm   28-Dec-15         7              1186.04      125
FRE-TRY9    Jigsaw, 30 cm blade          12-Aug-15         67             11.15        125
SE-67-89    Jigsaw, 20 cm blade          11-Oct-15         34             11.07        125
ZW-QR/AV    Hardware cloth, 0.6 cm       23-Apr-16         14             110.26       123
ZX-WR/FR    Claw hammer                  01-Mar-16         15             17.07        121

Table name: VENDOR
Primary key: VEND_CODE
Foreign key: none

VEND_CODE   VEND_NAME                 VEND_CONTACT          VEND_AREACODE   VEND_PHONE
120         Bargain Snapper, Inc.     Melanie T. Travis     0181            899-1234
121         Cut'n Glow Co.            Henry J. Olero        0181            342-9896
122         Rip & Rattle Supply Co.   Anne R. Morrins       0113            225-1127
123         Tools 'R Us               Juliette G. McHenry   0161            546-7894
124         Trowel & Dowel, Inc.      George F. Frederick   0113            453-4567
125         Bow & Wow Tools, Inc.     Bill S. Sedwick       0113            324-9988

14 For each table, identify the primary key and the foreign key(s). If a table does not have a foreign key, write None in the space provided.

Table     Primary Key   Foreign Key(s)
PRODUCT
VENDOR

15 Do the tables exhibit entity integrity? Answer yes or no; then explain your answer.

Table     Entity Integrity   Explanation
PRODUCT
VENDOR

16 Do the tables exhibit referential integrity? Answer yes or no; then explain your answer. Write NA (not applicable) if the table does not have a foreign key.

Table     Referential Integrity   Explanation
PRODUCT
VENDOR

17 Create the ERD using UML notation for this database.
18 Create the data dictionary for this database.
Use the database shown in Figure P3.4 to answer Problems 19-24.

FIGURE P3.4 The Ch03_TransCo database tables
Database name: Ch03_TransCo

Table name: TRUCK
Primary key: TRUCK_NUM
Foreign key: BASE_CODE, TYPE_CODE

TRUCK_NUM   BASE_CODE   TYPE_CODE   TRUCK_KM     TRUCK_BUY_DATE   TRUCK_SERIAL_NUM
1001        501         1           32 123.50    23-Sep-13        AA-322-12212-W11
1002        502         1           76 984.30    05-Feb-12        AC-342-22134-Q23
1003        501         2           12 346.60    11-Nov-13        AC-445-78656-Z99
1004                    1           2 894.30     06-Jan-14        WQ-112-23144-T34
1005        503         2           45 673.10    01-Mar-13        FR-998-32245-W12
1006        501         2           193 245.70   15-Jul-10        AD-456-00845-R45
1007        502         3           32 012.30    17-Oct-11        AA-341-96573-Z84
1008        502         3           44 213.60    07-Aug-12        DR-559-22189-D33
1009        503         2           10 932.90    12-Feb-14        DE-887-98456-E94
Table name: BASE
Primary key: BASE_CODE
Foreign key: none

BASE_CODE  BASE_CITY  BASE_PROVINCE  BASE_AREA_CODE  BASE_PHONE  BASE_MANAGER
501        Polokwane  Limpopo        0700            123-4567    Sibusiso Balisa
502        Cape Town  Western Cape   7100            234-5678    Clementine Daniels
503        Best       North Brabant  4567            345-6789    Maria J. Talindo
504        Durban     KwaZulu-Natal  4001            456-7890    Pragasen Khan

Table name: TYPE
Primary key: TYPE_CODE
Foreign key: none

TYPE_CODE  TYPE_DESCRIPTION
1          Single box, double-axle
2          Single box, single-axle
3          Tandem trailer, single-axle

19 For each table, identify the primary key and the foreign key(s). If a table does not have a foreign key, write None in the space provided.

Table  Primary Key  Foreign Key(s)
TRUCK
BASE
TYPE

20 Do the tables exhibit entity integrity? Answer yes or no; then explain your answer.

Table  Entity Integrity  Explanation
TRUCK
BASE
TYPE

21 Do the tables exhibit referential integrity? Answer yes or no; then explain your answer. Write NA (not applicable) if the table does not have a foreign key.

Table  Referential Integrity  Explanation
TRUCK
BASE
TYPE
22 Identify the TRUCK table's candidate key(s).

23 For each table, identify a superkey and a secondary key.

Table  Superkey  Secondary Key
TRUCK
BASE
TYPE

24 Create the ERD using UML notation for this database.
FIGURE P3.5 The Ch03_AviaCo database tables

Database name: Ch03_AviaCo

Table name: CHARTER

CHAR_TRIP  CHAR_DATE  CHAR_PILOT  CHAR_COPILOT  AC_NUMBER  CHAR_DESTINATION  CHAR_DISTANCE  CHAR_HOURS_FLOWN  CHAR_HOURS_WAIT  CHAR_FUEL_GALLONS  CHAR_OIL_QTS  CUS_CODE
10001      05-Feb-20  104                       2289L      CDG               936.00         5.1               2.2              354.1              1             10011
10002      05-Feb-20  101                       2778V      BNA               320.00         1.6               0                72.6               0             10016
10003      05-Feb-20  105         109           4278Y      LHR               1 574.00       7.8               0                339.8              2             10014
10004      06-Feb-20  106                       1484P      CPT               472.00         2.9               4.9              97.2               1             10019
10005      06-Feb-20  101                       2289L      CDG               1 023.00       5.7               3.5              397.7              2             10011
10006      06-Feb-20  109                       4278Y      CPT               472.00         2.6               5.2              117.1              0             10017
10007      06-Feb-20  104         105           2778V      LHR               1 574.00       7.9               0                348.4              2             10012
10008      07-Feb-20  106                       1484P      TYS               644.00         4.1               0                140.6              1             10014
10009      07-Feb-20  105                       2289L      LHR               1 574.00       6.6               23.4             459.9              0             10017
10010      07-Feb-20  109                       4278Y      CDG               998.00         6.2               3.2              279.7              0             10016
10011      07-Feb-20  101         104           1484P      BNA               352.00         1.9               5.3              66.4               1             10012
10012      08-Feb-20  101                       2778V      MOB               884.00         4.8               4.2              215.1              0             10010
10013      08-Feb-20  105                       4278Y      TYS               644.00         3.9               4.5              174.3              1             10011
10014      09-Feb-20  106                       4278Y      CDG               936.00         6.1               2.1              302.6              0             10017
10015      09-Feb-20  104         101           2289L      LHR               1 645.00       6.7               0                459.5              2             10016
10016      09-Feb-20  109         105           2778V      MQY               312.00         1.5               0                67.2               0             10011
10017      10-Feb-20  101                       1484P      CPT               508.00         3.1               0                105.5              0             10014
10018      10-Feb-20  105         104           4278Y      TYS               644.00         3.8               4.5              167.4              0             10017

The destinations are indicated by standard three-letter airport codes. For example,
CDG = PARIS CHARLES DE GAULLE, FRANCE, CPT = CAPE TOWN INTERNATIONAL, SOUTH AFRICA AND LHR = LONDON HEATHROW, UNITED KINGDOM
Table name: AIRCRAFT

AC_NUMBER  MOD_CODE  AC_TTAF   AC_TTEL   AC_TTER
1484P      PA23-250  1 833.10  1 833.10  101.80
2289L      C-90A     4 243.80  768.90    1 123.40
2778V      PA31-350  7 992.90  1 513.10  789.50
4278Y      PA31-350  2 147.30  622.10    243.20

AC_TTAF = Aircraft total time, airframe (hours)
AC_TTEL = Total time, left engine (hours)
AC_TTER = Total time, right engine (hours)
In a fully developed system, such attribute values would be updated by application software when the CHARTER table entries are posted.

Table name: MODEL

MOD_CODE  MOD_MANUFACTURER  MOD_NAME          MOD_SEATS  MOD_CHG_MILE
C-90A     Beechcraft        KingAir           8          1.67
PA23-250  Piper             Aztec             6          1.20
PA31-350  Piper             Navajo Chieftain  10         1.47

Customers are charged per round-trip mile, using the MOD_CHG_MILE rate. The MOD_SEATS gives the total number of seats in the airplane, including the pilot and copilot seats. Therefore a PA31-350 trip that is flown by a pilot and copilot has eight passenger seats available.

Table name: PILOT

EMP_NUM  PIL_LICENCE  PIL_RATINGS                 PIL_MED_TYPE  PIL_MED_DATE  PIL_PT135_DATE
101      ATP          ATP/SEL/MEL/Instr/CFII      1             20-Jan-20     11-Jan-20
104      ATP          ATP/SEL/MEL/Instr           1             18-Dec-19     17-Jan-20
105      COM          COMM/SEL/MEL/Instr/CFI      2             05-Jan-20     02-Jan-20
106      COM          COMM/SEL/MEL/Instr          2             10-Dec-19     02-Feb-20
109      COM          ATP/SEL/MEL/SES/Instr/CFII  1             22-Jan-20     15-Jan-20

The pilot licences shown in the PILOT table include the ATP = Airline Transport Pilot and COM = Commercial Pilot. Businesses that operate on demand air services are governed by Part 135 of the Federal Air Regulations (FARs) that are enforced by the Federal Aviation Administration (FAA). Such businesses are known as Part 135 operators. Part 135 operations require that pilots successfully complete flight proficiency checks every six months. The Part 135 flight proficiency check date is recorded in PIL_PT135_DATE. To fly commercially, pilots must have at least a commercial licence and a second-class medical certificate (PIL_MED_TYPE = 2).

The PIL_RATINGS include:
SEL = Single engine, land
MEL = Multi-engine, land
SES = Single engine, sea
Instr. = Instrument
CFI = Certified flight instructor
CFII = Certified flight instructor, instrument
Table name: EMPLOYEE

EMP_NUM  EMP_TITLE  EMP_LNAME   EMP_FNAME  EMP_INITIAL  EMP_DOB    EMP_HIRE_DATE
100      Mr.        Nkosi       Cela       D            15-Jun-52  15-Mar-98
101      Ms.        Naude       Amahle     G            19-Mar-75  25-Apr-96
102      Mr.        Vandam      Rhett                   14-Nov-68  18-May-03
103      Ms.        Jones       Anne       M            11-May-84  26-Jul-09
104      Mr.        Lange       John       P            12-Jul-81  20-Aug-00
105      Mr.        Williams    Robert     D            14-Mar-85  19-Jun-13
106      Mrs.       Duzak       Jeanine    K            12-Feb-78  13-Mar-99
107      Mr.        Diante      Jorge      D            01-May-85  02-Jul-07
108      Mr.        Wiesenbach  Paul       R            14-Feb-76  03-Jun-03
109      Ms.        Travis      Elizabeth  K            18-Jun-71  14-Feb-16
110      Mrs.       Genkazi     Leighla    W            19-May-80  29-Jun-10

Table name: CUSTOMER

CUS_CODE  CUS_LNAME   CUS_FNAME  CUS_INITIAL  CUS_AREACODE  CUS_PHONE  CUS_BALANCE
10010     Ramas       Alfred     A            0181          844-2573   0.00
10011     Dunne       Leona      K            0161          894-1238   0.00
10012     Smith       Kathy      W            0181          894-2285   559.73
10013     Pieterse    Jaco       F            0181          894-2180   802.09
10014     Orlando     Myron                   0181          222-1672   420.15
10015     O'Brian     Amy        B            0161          442-3381   633.19
10016     Brown       James      G            0181          297-1228   0.00
10017     Williams    George                  0181          290-2556   0.00
10018     Padayachee  Vinaya     G            0161          382-7185   0.00
10019     Smith       Olette     K            0178          297-3809   283.33

Use the database shown in Figure P3.5 to answer Problems 25-28.
ROBCOR is an aircraft charter company that supplies on-demand charter flight services using a fleet of four aircraft. Aircraft are identified by a unique registration number. Therefore, the aircraft registration number is an appropriate primary key for the AIRCRAFT table. The nulls in the CHARTER table's CHAR_COPILOT column indicate that a copilot is not required for some charter trips or for some aircraft. (Federal Aviation Administration (FAA) rules require a copilot on jet aircraft and on aircraft having a gross take-off weight over 5 500 kg. None of the aircraft in the AIRCRAFT table are governed by this requirement; however, some customers may require the presence of a copilot for insurance reasons.) All charter trips are recorded in the CHARTER table.
NOTE
Earlier in the chapter it was stated that it is best to avoid homonyms and synonyms. In this problem, both the pilot and the copilot are pilots in the PILOT table, but EMP_NUM cannot be used for both in the CHARTER table. Therefore, the synonyms CHAR_PILOT and CHAR_COPILOT were used in the CHARTER table.

Although the solution works in this case, it is very restrictive and it generates nulls when a copilot is not required. Worse, such nulls proliferate as crew requirements change. For example, if the AviaCo charter company grows and starts using larger aircraft, crew requirements may increase to include flight engineers and load masters. The CHARTER table would then have to be modified to include the additional crew assignments; such attributes as CHAR_FLT_ENGINEER and CHAR_LOADMASTER would have to be added to the CHARTER table. Given this change, each time a smaller aircraft flew a charter trip without the number of crew members required in larger aircraft, the missing crew members would yield additional nulls in the CHARTER table.

You will have a chance to correct those design shortcomings in Problem 27. The problem illustrates two important points:
1 Don't use synonyms. If your design requires the use of synonyms, revise the design!
2 To the greatest possible extent, design the database to accommodate growth without requiring structural changes in the database tables. Plan ahead and try to anticipate the effects of change on the database.

25 For each table, where possible, identify:
a The primary key.
b A superkey.
c A candidate key.
d The foreign key(s).
e A secondary key.

26 Create the ERD using UML notation. (Hint: Look at the table contents. You will discover that an AIRCRAFT can fly many CHARTER trips, but each CHARTER trip is flown by one AIRCRAFT, that a MODEL references many AIRCRAFT, but each AIRCRAFT references a single MODEL, etc.)

27 Modify the ERD you created in Problem 26 to eliminate the problems created by the use of synonyms. (Hint: Modify the CHARTER table structure by eliminating the CHAR_PILOT and CHAR_COPILOT attributes; then create a composite table named CREW to link the CHARTER and EMPLOYEE tables. Some crew members, such as flight attendants, may not be pilots. That's why the EMPLOYEE table enters into this relationship.)

28 Create the ERD using UML notation for the design you revised in Problem 27. (After you have had a chance to revise the design, your instructor will show you the results of the design change, using a copy of the revised database named Ch03_AviaCo_2.)
CHAPTER 4 Relational Algebra and Calculus

IN THIS CHAPTER, YOU WILL LEARN:
What is meant by relational algebra and relational calculus
How to manipulate database tables using the relational set operators
How the DBMS supports the key relational operators: select, project and join
The different types of joins
How to write queries using relational algebra expressions
About tuple and domain relational calculus
PREVIEW

Relational algebra and relational calculus are the mathematical basis for relational databases and were proposed by E.F. Codd in 1971 as the basis for relational languages. Both stemmed from the mathematical theory on which the relational model is based: predicate logic and set theory. Predicate logic, used extensively in mathematics, provides a framework in which an assertion (statement of fact) can be verified as either true or false. Set theory is a mathematical science that deals with sets, or groups of things, and is used as the basis for data manipulation in the relational model. Together, these are key components of the relational model.

In Chapter 2, we identified that a data model should provide a way in which data can be stored in a structured manner. Once we have modelled data using the concept of a relation, the next important consideration is how that data should actually be modified. To do this, Codd proposed relational algebra, a collection of operations acting on relations that produce new relations as a result. Relational algebra is often described as a procedural language and defines the basic operations that should, minimally, be supported by any data manipulation language (DML), independently of its implementation. DML languages such as SQL (Structured Query Language) use a limited set of high-level commands to accomplish these data manipulation tasks. In Chapter 8, Beginning Structured Query Language, you will learn how SQL can be used to retrieve and modify data within a database.

Although relational algebra is not an easy language to understand, it provides us with a formal description of how a relational database operates and with an understanding of the mathematics necessary to manipulate data. Relational calculus is also a formal query language that can be used to retrieve data, and both relational algebra and relational calculus can be used to express the same queries. We say a query language is relationally complete if any query that can be written in relational calculus can also be expressed in that language. Essentially, this means that relational algebra is a relationally complete language.

First, you will study the basic relational operators and how they can be used to manipulate data. Then, you will learn how to write queries using relational algebraic expressions. Finally, you will explore both tuple and domain relational calculus and learn how to write simple queries.
4.1 RELATIONAL OPERATORS
Relational algebra defines the theoretical way of manipulating table contents through a number of relational operators. Codd originally defined eight relational operators, called SELECT (or RESTRICT), PROJECT, JOIN, PRODUCT, INTERSECT, UNION, DIFFERENCE and DIVIDE. The most important operators are SELECT, PROJECT and JOIN, which can be used to formulate relational algebra expressions to answer many user queries. The relational operators have the property of closure; that is, relational algebra operators are used on existing tables to produce new tables. The relational operators are classed as being unary or binary. Unary operators, such as SELECT and PROJECT, can be applied to one relation, whilst binary operators such as JOIN are applied on two relations. In Chapter 3, Relational Model Characteristics, we learnt about a number of important concepts and properties of relations that are essential for understanding the relational model. In this chapter, we will build on these concepts to understand how relational algebra can be used to write queries.

Within Chapter 3, we modelled a relation on a mathematical construct, which had to abide by a set of rules (Table 3.1). When applying relational operators to relations, we have to follow these rules in addition to those defined for each relational operator. In the following sections you will learn about the theory associated with common relational operators and view some practical examples. Remember that the term relation is a synonym for table.

NOTE
To be considered minimally relational, the DBMS must support the key relational operators SELECT, PROJECT and JOIN. Very few DBMSs are capable of supporting all eight relational operators.
A NOTE ON SET THEORY
Set theory is one of the most fundamental concepts in mathematics.1 The theory is based on the idea that elements have membership in a set. Given two sets, A and B, we say that A is a member of B, which can be written as $A \in B$. Alternatively, we can say that the set B contains A as its element. The elements of a set can be numbers, the names of students who enrolled in a course or the flight numbers of all the flights operated by an airline. Each set is then determined by its elements and each element in a set is unique. Venn diagrams2 are a way of visually representing sets. Supposing we have the following two sets:

Set A = Students who take the Databases unit {Sarah, Phinda, Paul, Asanda, Mikla}
Set B = Students who take the Programming unit {Paul, Mikla, Hamzah, Kiki, Craig}

Some of the students in set A appear also in set B and vice versa. We can represent these facts using a Venn diagram as shown in Figure 4.1.

1 Karel Hrbacek and Thomas Jech, Introduction to Set Theory, third edn. Marcel Dekker, Inc., 1999.
2 John Venn (1880) On the Diagrammatic and Mechanical Representation of Propositions and Reasonings. Dublin Philosophical Magazine and Journal of Science 9(59): 1-18.

FIGURE 4.1 A simple Venn diagram

In Figure 4.1, the two circles represent the two sets A and B. The students who take both the Database and Programming units are Paul and Mikla. These students appear in both sets and will go in the overlapping sections of the two circles. Sarah, Asanda and Phinda only take the Database unit, so these go only in the left-hand circle, whilst Kiki, Hamzah and Craig only take the Programming unit and only appear in the right-hand circle. We will be using Venn diagrams throughout this chapter to illustrate the three relational set operators: union, intersection and difference.
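
The regions of the Venn diagram can also be computed directly. The following is a minimal sketch in Python using the built-in set type; the variable names are illustrative assumptions and do not come from the text, while the student names are the two sets listed above.

```python
# Sets A and B exactly as listed in the text (variable names are illustrative).
databases = {"Sarah", "Phinda", "Paul", "Asanda", "Mikla"}      # Set A
programming = {"Paul", "Mikla", "Hamzah", "Kiki", "Craig"}      # Set B

# Overlapping section of the two circles: students in both sets.
both_units = databases & programming

# Left-hand circle only, and right-hand circle only.
databases_only = databases - programming
programming_only = programming - databases

# Everything covered by either circle.
either_unit = databases | programming

print("Both units:      ", sorted(both_units))
print("Databases only:  ", sorted(databases_only))
print("Programming only:", sorted(programming_only))
print("Either unit:     ", sorted(either_unit))
```

Because each element of a set is unique, Paul and Mikla are counted once in the union even though they appear in both lists.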
4.1.1 Selection

The relational operator SELECT, also known as RESTRICT, can be used to list all of the row values, or it can return only those row values that match a specified criterion. In other words, SELECT returns a horizontal subset of a relation. The SELECT operator, denoted by $\sigma_{\theta}$, is formally defined as:

$\sigma_{\theta}(R)$ or $\sigma_{\langle\text{criterion}\rangle}(\text{RELATION})$

where $\sigma_{\theta}(R)$ is the set of specified tuples of the relation R and $\theta$ is the predicate (or criterion) to extract the required tuples.

NOTE
The Euro, denoted as €, became the official currency of 12 European member states in 2002. Today the Euro is used by more than 175 million Europeans in 19 of 28 EU member countries, as well as some countries that are not formally members of the EU.

Figure 4.2 (a) shows visually how rows within a relation are selected. An example of a relation that contains information about products which are sold in a store is shown in Figure 4.2 (b). Figure 4.2 (c) shows the effects of selecting all rows with no criteria. The criterion specified in Figure 4.2 (d) selects only those rows where the price is less than 2.00 and, in Figure 4.2 (e), only the row containing P_CODE = 123456 is displayed.

FIGURE 4.2 The SELECT operator

Database name: Ch04_Relational_DB_Operators

Figure 4.2 (a) SELECTION (diagram of a relation with Rows 1 to 5 and Columns 1 and 2)

Figure 4.2 (b) The PRODUCT Relation

P_CODE  P_DESCRIPT   PRICE
123456  Flashlight   4.16
123457  Lamp         19.87
123458  Box Fan      8.68
213345  9 v battery  1.52
254467  100 W bulb   1.16
311452  Powerdrill   27.64

Figure 4.2 (c) $\sigma(\text{PRODUCT})$

P_CODE  P_DESCRIPT   PRICE
123456  Flashlight   4.16
123457  Lamp         19.87
123458  Box Fan      8.68
213345  9 v battery  1.52
254467  100 W bulb   1.16
311452  Powerdrill   27.64

Figure 4.2 (d) $\sigma_{\text{price} < 2.00}(\text{PRODUCT})$

P_CODE  P_DESCRIPT   PRICE
213345  9 v battery  1.52
254467  100 W bulb   1.16

Figure 4.2 (e) $\sigma_{\text{p\_code} = 123456}(\text{PRODUCT})$

P_CODE  P_DESCRIPT  PRICE
123456  Flashlight  4.16

It is also possible to create more complex criteria by using the logical operators AND, OR and NOT. Figure 4.3 illustrates the use of the logical operator AND using the COURSE relation, which stores information about the courses offered at Tiny University. Figure 4.3 (b) shows the new relation, which contains only the tuples where the DEPT_CODE is CIS and the CRS_CREDIT value is 4.

Online Content
All of the databases used to illustrate the material in this chapter are found on the online platform for this book. The database names used in the folder match the database names used in the figures.
FIGURE 4.3 Selecting from the COURSE relation

Database name: Ch04_TinyUniversity

Figure 4.3 (a) The COURSE Relation

CRS_CODE  DEPT_CODE  CRS_DESCRIPTION                     CRS_CREDIT
ACCT-211  ACCT       Accounting I                        3
ACCT-212  ACCT       Accounting II                       3
CIS-220   CIS        Introduction to Computer Science    3
CIS-420   CIS        Database Design and Implementation  4
QM-261    CIS        Intro. to Statistics                3
QM-362    CIS        Statistical Applications            4

Figure 4.3 (b) $\sigma_{\text{dept\_code} = \text{CIS AND crs\_credit} = 4}(\text{COURSE})$

CRS_CODE  DEPT_CODE  CRS_DESCRIPTION                     CRS_CREDIT
CIS-420   CIS        Database Design and Implementation  4
QM-362    CIS        Statistical Applications            4
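
The selections in Figures 4.2 and 4.3 can be imitated in a few lines of code. The sketch below is one possible Python rendering, not an implementation used by any DBMS: a relation is assumed to be a list of dictionaries, and the function name `select` and the lambda predicates are illustrative choices. The data values are taken from the PRODUCT and COURSE relations in the figures.

```python
from typing import Any, Callable, Dict, List

Row = Dict[str, Any]
Relation = List[Row]


def select(relation: Relation,
           predicate: Callable[[Row], bool] = lambda row: True) -> Relation:
    """Return the horizontal subset of rows for which the predicate holds.

    With no predicate, every row is returned, as in Figure 4.2 (c).
    """
    return [row for row in relation if predicate(row)]


PRODUCT: Relation = [
    {"P_CODE": 123456, "P_DESCRIPT": "Flashlight", "PRICE": 4.16},
    {"P_CODE": 123457, "P_DESCRIPT": "Lamp", "PRICE": 19.87},
    {"P_CODE": 123458, "P_DESCRIPT": "Box Fan", "PRICE": 8.68},
    {"P_CODE": 213345, "P_DESCRIPT": "9 v battery", "PRICE": 1.52},
    {"P_CODE": 254467, "P_DESCRIPT": "100 W bulb", "PRICE": 1.16},
    {"P_CODE": 311452, "P_DESCRIPT": "Powerdrill", "PRICE": 27.64},
]

COURSE: Relation = [
    {"CRS_CODE": "ACCT-211", "DEPT_CODE": "ACCT", "CRS_CREDIT": 3},
    {"CRS_CODE": "ACCT-212", "DEPT_CODE": "ACCT", "CRS_CREDIT": 3},
    {"CRS_CODE": "CIS-220", "DEPT_CODE": "CIS", "CRS_CREDIT": 3},
    {"CRS_CODE": "CIS-420", "DEPT_CODE": "CIS", "CRS_CREDIT": 4},
    {"CRS_CODE": "QM-261", "DEPT_CODE": "CIS", "CRS_CREDIT": 3},
    {"CRS_CODE": "QM-362", "DEPT_CODE": "CIS", "CRS_CREDIT": 4},
]

all_products = select(PRODUCT)                                    # Figure 4.2 (c)
cheap = select(PRODUCT, lambda r: r["PRICE"] < 2.00)              # Figure 4.2 (d)
flashlight = select(PRODUCT, lambda r: r["P_CODE"] == 123456)     # Figure 4.2 (e)
cis_four_credit = select(                                         # Figure 4.3 (b)
    COURSE, lambda r: r["DEPT_CODE"] == "CIS" and r["CRS_CREDIT"] == 4
)

print([r["P_CODE"] for r in cheap])
print([r["CRS_CODE"] for r in cis_four_credit])
```

The result of `select` is itself a list of rows with the same columns, so it can be passed to another operator. This mirrors the closure property described in Section 4.1.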
4.1.2 Projection

The PROJECT operator returns all values for selected attributes. In other words, PROJECT returns a vertical subset of a relation excluding any duplicates. The PROJECT operator, denoted by $\Pi$, is formally defined as:

$\Pi_{a_1 \ldots a_n}(R)$ or $\Pi_{\langle\text{List of attributes}\rangle}(\text{Relation})$

where the projection of the relation R, denoted by $\Pi_{a_1 \ldots a_n}(R)$, is the set of specified attributes $a_1 \ldots a_n$ of the relation R. Figure 4.4 (a) shows visually how columns within a relation are selected. Figure 4.4 (b) shows a relation that stores information about the products which are sold in a store. Figure 4.4 (c) shows the effect of applying the PROJECT operator on the PRODUCT relation, to create a new relation containing only the attribute PRICE. The two further examples in Figure 4.4 (d) and (e) illustrate the PROJECT relational operator. Notice that the order of attributes is maintained in the resulting relations.
FIGURE 4.4 The PROJECT operator

Database name: Ch04_Relational_DB_Operators

Figure 4.4 (a) PROJECTION (diagram of a relation with Rows 1 to 5 and Columns 1 and 2)

Figure 4.4 (b) The PRODUCT relation

P_CODE  P_DESCRIPT   PRICE
123456  Flashlight   4.16
123457  Lamp         19.87
123458  Box Fan      8.68
213345  9 v battery  1.52
254467  100 W bulb   1.16
311452  Powerdrill   27.64

Figure 4.4 (c) $\Pi_{\text{price}}(\text{PRODUCT})$

PRICE
4.16
19.87
8.68
1.52
1.16
27.64

Figure 4.4 (d) $\Pi_{\text{p\_descript, price}}(\text{PRODUCT})$

P_DESCRIPT   PRICE
Flashlight   4.16
Lamp         19.87
Box Fan      8.68
9 v battery  1.52
100 W bulb   1.16
Powerdrill   27.64

Figure 4.4 (e) $\Pi_{\text{p\_code, price}}(\text{PRODUCT})$

P_CODE  PRICE
123456  4.16
123457  19.87
123458  8.68
213345  1.52
254467  1.16
311452  27.64
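
The projections in Figure 4.4 can be sketched in Python as follows. This is an illustrative sketch only: a relation is assumed to be a list of dictionaries and `project` is a hypothetical helper name. The function keeps the requested attribute order and drops duplicate rows, as the definition above requires. The small COURSE extract is taken from Figure 4.3 (a) and is included only to show duplicate removal.

```python
from typing import Any, Dict, List, Sequence

Row = Dict[str, Any]
Relation = List[Row]


def project(relation: Relation, attributes: Sequence[str]) -> Relation:
    """Return the vertical subset of a relation, excluding duplicate rows."""
    seen = set()
    result: Relation = []
    for row in relation:
        values = tuple(row[name] for name in attributes)
        if values not in seen:          # skip rows already produced
            seen.add(values)
            result.append(dict(zip(attributes, values)))
    return result


PRODUCT: Relation = [
    {"P_CODE": 123456, "P_DESCRIPT": "Flashlight", "PRICE": 4.16},
    {"P_CODE": 123457, "P_DESCRIPT": "Lamp", "PRICE": 19.87},
    {"P_CODE": 123458, "P_DESCRIPT": "Box Fan", "PRICE": 8.68},
    {"P_CODE": 213345, "P_DESCRIPT": "9 v battery", "PRICE": 1.52},
    {"P_CODE": 254467, "P_DESCRIPT": "100 W bulb", "PRICE": 1.16},
    {"P_CODE": 311452, "P_DESCRIPT": "Powerdrill", "PRICE": 27.64},
]

# Two columns of the COURSE relation, enough to show duplicate removal.
COURSE: Relation = [
    {"CRS_CODE": "ACCT-211", "DEPT_CODE": "ACCT"},
    {"CRS_CODE": "ACCT-212", "DEPT_CODE": "ACCT"},
    {"CRS_CODE": "CIS-220", "DEPT_CODE": "CIS"},
    {"CRS_CODE": "CIS-420", "DEPT_CODE": "CIS"},
    {"CRS_CODE": "QM-261", "DEPT_CODE": "CIS"},
    {"CRS_CODE": "QM-362", "DEPT_CODE": "CIS"},
]

prices = project(PRODUCT, ["PRICE"])                    # Figure 4.4 (c)
code_and_price = project(PRODUCT, ["P_CODE", "PRICE"])  # Figure 4.4 (e)
departments = project(COURSE, ["DEPT_CODE"])            # duplicates removed

print(prices)
print(departments)
```

Projecting COURSE over DEPT_CODE yields two rows rather than six, because the repeated ACCT and CIS values collapse into one tuple each.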
4.1.3 UNION

The UNION set operator combines all tuples from two relations, excluding duplicate tuples. The relations must have the same attribute characteristics (the columns and domains must be identical) to be used in the UNION. When two or more tables share the same number of columns, i.e. have the same degree, and when they share the same (or compatible) domains, they are said to be union-compatible. The UNION operator, denoted by $\cup$, is formally defined as:

The union of relations $R_1(a_1, a_2, \ldots, a_n)$ and $R_2(b_1, b_2, \ldots, b_n)$, denoted $R_1 \cup R_2$ with degree n, is the relation $R_3(c_1, c_2, \ldots, c_n)$ where for each i ($i = 1, 2..n$), $a_i$ and $b_i$ must have compatible domains.

The degree of $R_3$ is the same as that of $R_1$ and $R_2$. However, the cardinality of $R_3$ is $a + b$, where a and b are the cardinalities of $R_1$ and $R_2$ respectively, only if there are no duplicate tuples in $R_1$ and $R_2$; a tuple that appears in both relations is included only once.

Figure 4.5 (a) visually shows $R_1 \cup R_2$. Figure 4.5 (b) to (d) shows the effect of the UNION operator on relations PRODUCT1 and PRODUCT2. Both PRODUCT1 and PRODUCT2 are union-compatible as they have the same degree and share the same domains.

FIGURE 4.5 The UNION operator

Database name: Ch04_Relational_DB_Operators

Figure 4.5 (a) R1 Union R2 (Venn diagram of R1 and R2)

Figure 4.5 (b) The UNION_PRODUCT1 relation

P_CODE  P_DESCRIPT   PRICE
123456  Flashlight   4.16
123457  Lamp         19.87
123458  Box Fan      8.68
213345  9 v battery  1.52
254467  100 W bulb   1.16
311452  Powerdrill   27.64

Figure 4.5 (c) The UNION_PRODUCT2 relation

P_CODE  P_DESCRIPT  PRICE
345678  Microwave   126.40
345679  Dishwasher  395.00

Figure 4.5 (d) Result of UNION_PRODUCT1 $\cup$ UNION_PRODUCT2

P_CODE  P_DESCRIPT   PRICE
123456  Flashlight   4.16
123457  Lamp         19.87
123458  Box Fan      8.68
213345  9 v battery  1.52
254467  100 W bulb   1.16
311452  Powerdrill   27.64
345678  Microwave    126.40
345679  Dishwasher   395.00
Figure 4.6 shows the effects of the UNION operator when two relations contain duplicate tuples. Notice that only one additional tuple has been added in Figure 4.6 (c), as CRS_CODE = ACCT-211 already exists in the COURSE_RELATION.

FIGURE 4.6 The Union operator COURSE $\cup$ COURSE2

Database name: Ch04_TinyUniversity

Figure 4.6 (a) The COURSE_RELATION

CRS_CODE  DEPT_CODE  CRS_DESCRIPTION                     CRS_CREDIT
ACCT-211  ACCT       Accounting I                        3
ACCT-212  ACCT       Accounting II                       3
CIS-220   CIS        Introduction to Computer Science    3
CIS-420   CIS        Database Design and Implementation  4
QM-261    CIS        Intro. to Statistics                3
QM-362    CIS        Statistical Applications            4

Figure 4.6 (b) The COURSE2_RELATION

CRS_CODE  DEPT_CODE  CRS_DESCRIPTION     CRS_CREDIT
ACCT-211  ACCT       Accounting I        3
CIS-430   CIS        Advanced Databases  6

Figure 4.6 (c) Result of COURSE $\cup$ COURSE2

CRS_CODE  DEPT_CODE  CRS_DESCRIPTION                     CRS_CREDIT
ACCT-211  ACCT       Accounting I                        3
ACCT-212  ACCT       Accounting II                       3
CIS-220   CIS        Introduction to Computer Science    3
CIS-420   CIS        Database Design and Implementation  4
QM-261    CIS        Intro. to Statistics                3
QM-362    CIS        Statistical Applications            4
CIS-430   CIS        Advanced Databases                  6

If two relations are not union-compatible, then the UNION operator cannot be applied as the results would be invalid. For example, applying the UNION operator to the relation COURSE in Figure 4.6 (a) and the relation CLASS in Figure 4.7 (a) is not allowed (COURSE $\cup$ CLASS). In order to get around this problem, the PROJECT operator could be used to restrict the columns in each relation and obtain a UNION over a common attribute. In the example, both relations COURSE and CLASS have a common attribute, CRS_CODE. We could therefore write $\Pi_{\text{CRS\_CODE}}(\text{COURSE}) \cup \Pi_{\text{CRS\_CODE}}(\text{CLASS})$ to obtain the resulting relation shown in Figure 4.7 (b).
FIGURE 4.7 The Union operator not union-compatible example

Database name: Ch04_TinyUniversity

Figure 4.7 (a) The CLASS_RELATION

CLASS_CODE  CRS_CODE  CLASS_SECTION  CLASS_TIME            CLASS_ROOM  LECTURER_NUM
10012       ACCT-211  1              MWF 8:00-8:50 a.m.    BUS311      105
10013       ACCT-211  2              MWF 9:00-9:50 a.m.    BUS200      105
10014       ACCT-211  3              TTh 2:30-3:45 p.m.    BUS252      342
10015       ACCT-212  1              MWF 10:00-10:50 a.m.  BUS311      301
10016       ACCT-212  2              Th 6:00-8:40 p.m.     BUS252      301
10017       CIS-220   1              MWF 9:00-9:50 a.m.    KLR209      228
10018       CIS-220   2              MWF 9:00-9:50 a.m.    KLR211      114
10019       CIS-220   3              MWF 10:00-10:50 a.m.  KLR209      228
10020       CIS-420   1              W 6:00-8:40 p.m.      KLR209      162
10021       QM-261    1              MWF 8:00-8:50 a.m.    KLR200      114
10022       QM-261    2              TTh 1:00-2:15 p.m.    KLR200      114
10023       QM-362    1              MWF 11:00-11:50 a.m.  KLR200      162
10024       QM-362    2              TTh 2:30-3:45 p.m.    KLR200      162

Figure 4.7 (b) Result of $\Pi_{\text{CRS\_CODE}}(\text{COURSE}) \cup \Pi_{\text{CRS\_CODE}}(\text{CLASS})$

CRS_CODE
ACCT-211
ACCT-212
CIS-220
CIS-420
QM-261
QM-362
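
The union-compatibility rule and the project-then-union workaround can be sketched in Python. Everything named here is an assumption made for illustration: the `Relation` type, the helper functions, and in particular the use of Python value types as a rough stand-in for attribute domains. The CLASS relation is abbreviated to three of its rows from Figure 4.7 (a).

```python
from typing import FrozenSet, List, NamedTuple, Set, Tuple


class Relation(NamedTuple):
    attributes: Tuple[str, ...]   # heading: attribute names in order
    tuples: FrozenSet[tuple]      # body: a set, so duplicates cannot occur


def column_types(rel: Relation) -> List[Set[type]]:
    """Python types seen in each column (a crude stand-in for domains)."""
    return [{type(t[i]) for t in rel.tuples} for i in range(len(rel.attributes))]


def union_compatible(r1: Relation, r2: Relation) -> bool:
    """Same degree and, column by column, the same value types."""
    if len(r1.attributes) != len(r2.attributes):
        return False
    return column_types(r1) == column_types(r2)


def union(r1: Relation, r2: Relation) -> Relation:
    if not union_compatible(r1, r2):
        raise ValueError("relations are not union-compatible")
    return Relation(r1.attributes, r1.tuples | r2.tuples)


def project(rel: Relation, names: Tuple[str, ...]) -> Relation:
    positions = [rel.attributes.index(n) for n in names]
    return Relation(names, frozenset(tuple(t[i] for i in positions) for t in rel.tuples))


COURSE = Relation(
    ("CRS_CODE", "DEPT_CODE", "CRS_DESCRIPTION", "CRS_CREDIT"),
    frozenset({
        ("ACCT-211", "ACCT", "Accounting I", 3),
        ("ACCT-212", "ACCT", "Accounting II", 3),
        ("CIS-220", "CIS", "Introduction to Computer Science", 3),
        ("CIS-420", "CIS", "Database Design and Implementation", 4),
        ("QM-261", "CIS", "Intro. to Statistics", 3),
        ("QM-362", "CIS", "Statistical Applications", 4),
    }),
)
COURSE2 = Relation(
    COURSE.attributes,
    frozenset({
        ("ACCT-211", "ACCT", "Accounting I", 3),
        ("CIS-430", "CIS", "Advanced Databases", 6),
    }),
)
# Abbreviated CLASS relation: three rows only, degree 6.
CLASS = Relation(
    ("CLASS_CODE", "CRS_CODE", "CLASS_SECTION", "CLASS_TIME", "CLASS_ROOM", "LECTURER_NUM"),
    frozenset({
        (10012, "ACCT-211", 1, "MWF 8:00-8:50 a.m.", "BUS311", 105),
        (10017, "CIS-220", 1, "MWF 9:00-9:50 a.m.", "KLR209", 228),
        (10023, "QM-362", 1, "MWF 11:00-11:50 a.m.", "KLR200", 162),
    }),
)

combined = union(COURSE, COURSE2)            # 6 + 2 tuples, one shared
course_codes = union(project(COURSE, ("CRS_CODE",)),
                     project(CLASS, ("CRS_CODE",)))

print(len(combined.tuples))
print(sorted(t[0] for t in course_codes.tuples))
```

Calling `union(COURSE, CLASS)` directly raises `ValueError` in this sketch, because the two relations have degrees 4 and 6.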
4.1.4 INTERSECT

The INTERSECT operator, denoted as ∩, returns only the tuples that appear in both relations. As was true in the case of UNION, the tables must be union-compatible to give valid results. For example, you cannot use INTERSECT if one of the attributes in the first table is numeric and the corresponding one in the second table is character-based. The INTERSECT operator is formally defined as:

The intersect of relations R1(a1, a2,..., an) and R2(b1, b2,..., bn), denoted R1 ∩ R2, with degree n, is the relation R3(c1, c2,..., cn) that includes only those tuples of R1 that also appear in R2, where for each i (i = 1, 2..n), ai and bi must have compatible domains.

Figure 4.8 (a) visually shows R1 ∩ R2.
The effect of applying the INTERSECT operator to the first name column (F_NAME) in two relations, INTERSECT_RELATION_1 and INTERSECT_RELATION_2, is shown in Figure 4.8 (d). Only Kuhle and Jorge appear in the final relation, as they are the only two F_NAMEs that appear in both relations.

FIGURE 4.8 The INTERSECT operator
Database name: Ch04_Relational_DB_Operators

Figure 4.8 (a) R1 INTERSECT R2

Figure 4.8 (b) The INTERSECT_RELATION_1 relation

F_NAME
George
Kuhle
Elaine
Piet
Jorge

Figure 4.8 (c) The INTERSECT_RELATION_2 relation

F_NAME
Kuhle
William
Jorge
Dennis

Figure 4.8 (d) Result of INTERSECT_RELATION_1 ∩ INTERSECT_RELATION_2

F_NAME
Kuhle
Jorge
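The INTERSECT result in Figure 4.8 (d) can be reproduced with a minimal Python sketch. It assumes a relation can be modelled as a set of tuples; the helper names `union_compatible` and `intersect` are hypothetical, and the compatibility check (same degree, same Python type per position) is only a rough stand-in for compatible domains.

```python
# F_NAME values taken from Figure 4.8; each relation is a set of 1-tuples.
INTERSECT_RELATION_1 = {("George",), ("Kuhle",), ("Elaine",), ("Piet",), ("Jorge",)}
INTERSECT_RELATION_2 = {("Kuhle",), ("William",), ("Jorge",), ("Dennis",)}


def union_compatible(r1, r2):
    """Hypothetical check: same degree and same Python type in each position."""
    return all(
        len(t1) == len(t2) and all(type(a) is type(b) for a, b in zip(t1, t2))
        for t1 in r1
        for t2 in r2
    )


def intersect(r1, r2):
    """Return only the tuples of r1 that also appear in r2 (R1 ∩ R2)."""
    if not union_compatible(r1, r2):
        raise ValueError("relations are not union-compatible")
    return r1 & r2


result = intersect(INTERSECT_RELATION_1, INTERSECT_RELATION_2)
print(sorted(result))
```

Passing a numeric column against a character-based one, for example `intersect({(1,)}, {("1",)})`, raises `ValueError` in this sketch, mirroring the union-compatibility requirement.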
4.1.5 DIFFERENCE

The DIFFERENCE operator returns all tuples in one relation that are not found in the other relation; that is, it subtracts one relation from the other. The DIFFERENCE operator also requires that the two relations must be union-compatible. The DIFFERENCE operator is formally defined as:

The difference of relations R1(a1, a2,..., am) and R2(b1, b2,..., bm), denoted R1 - R2, with degree m, is the relation R3(c1, c2,..., cm) that includes all tuples that are in R1 but not in R2, where for each i (i = 1, 2..m), ai and bi must have compatible domains.

Figure 4.9 (a) shows how R1 - R2 can be visualised. The effect of applying the DIFFERENCE operator to two relations, DIFF_RELATION_1 and DIFF_RELATION_2, is shown in Figure 4.9. The resulting relation in Figure 4.9 (d) shows only George, Elaine and Piet, as these are the only values of F_NAME that appear in DIFF_RELATION_1 and not in DIFF_RELATION_2. Note that A - B will not give the same result as B - A, i.e. the order of the relations is important in the DIFFERENCE operator.
FIGURE 4.9 The DIFFERENCE operator
Database name: Ch04_Relational_DB_Operators

Figure 4.9 (a) R1 - R2

Figure 4.9 (b) The DIFF_RELATION_1 relation

F_NAME
George
Kuhle
Elaine
Piet
Jorge

Figure 4.9 (c) The DIFF_RELATION_2 relation

F_NAME
Kuhle
William
Jorge
Dennis

Figure 4.9 (d) Result of DIFF_RELATION_1 - DIFF_RELATION_2

F_NAME
George
Elaine
Piet
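That the order of the operands matters can be checked with a short Python sketch. As before, modelling a relation as a set of tuples is an assumption made for illustration, and `difference` is a hypothetical helper name.

```python
# F_NAME values taken from Figure 4.9; each relation is a set of 1-tuples.
DIFF_RELATION_1 = {("George",), ("Kuhle",), ("Elaine",), ("Piet",), ("Jorge",)}
DIFF_RELATION_2 = {("Kuhle",), ("William",), ("Jorge",), ("Dennis",)}


def difference(r1, r2):
    """Return the tuples that are in r1 but not in r2 (R1 - R2)."""
    return r1 - r2


a_minus_b = difference(DIFF_RELATION_1, DIFF_RELATION_2)
b_minus_a = difference(DIFF_RELATION_2, DIFF_RELATION_1)

print(sorted(a_minus_b))  # George, Elaine and Piet, as in Figure 4.9 (d)
print(sorted(b_minus_a))  # William and Dennis: a different relation
```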
4.1.6 CARTESIAN PRODUCT

The CARTESIAN PRODUCT is usually written as R1 × R2, with the new resulting relation R3 containing all the attributes that are present in both R1 and R2 along with all the possible combinations of tuples from both R1 and R2. It can be formally defined as:

The CARTESIAN PRODUCT of two relations R1(a1, a2,..., an) with cardinality i and R2(b1, b2,..., bm) with cardinality j is a relation R3 with degree k = n + m, cardinality i*j and attributes (a1, a2,..., an, b1, b2,..., bm). This can be denoted as R3 = R1 × R2.

Therefore, if one relation has six rows and four attributes, and the other relation has three rows and two attributes, the new relation is composed of 6 × 3 = 18 rows, i.e. the cardinality would be 18 tuples, and the degree would be 6, i.e. the 4 + 2 = 6 attributes.

Figure 4.10 (c) shows how the CARTESIAN PRODUCT operation combines the relations PRODUCT and LOCATION shown in Figures 4.10 (a) and (b) respectively. You can see in Figure 4.10 (c) that the result of PRODUCT × LOCATION is a new relation with a cardinality of 18 (6 × 3) and a degree of 6 (3 + 3). The CARTESIAN PRODUCT used by itself is not a very useful operation, as it creates a new relation with many tuples that have no association with each other. However, if it is used in conjunction with the RESTRICT (SELECT) operator, it becomes a very important operator known as a JOIN.
FIGURE 4.10 The CARTESIAN PRODUCT
Database name: Ch04_Relational_DB_Operators

Figure 4.10 (a) The PRODUCT relation

P_CODE  P_DESCRIPT   PRICE
123456  Flashlight   4.16
123457  Lamp         19.87
123458  Box Fan      8.68
213345  9 v battery  1.52
254467  100 W bulb   1.06
311452  Powerdrill   27.64

Figure 4.10 (b) The LOCATION relation

STORE  AISLE  SHELF
23     W      5
24     K      9
25     Z      6

Figure 4.10 (c) PRODUCT × LOCATION

P_CODE  P_DESCRIPT   PRICE  STORE  AISLE  SHELF
123456  Flashlight   4.16   23     W      5
123456  Flashlight   4.16   24     K      9
123456  Flashlight   4.16   25     Z      6
123457  Lamp         19.87  23     W      5
123457  Lamp         19.87  24     K      9
123457  Lamp         19.87  25     Z      6
123458  Box Fan      10.99  23     W      5
123458  Box Fan      10.99  24     K      9
123458  Box Fan      10.99  25     Z      6
213345  9 v battery  1.52   23     W      5
213345  9 v battery  1.52   24     K      9
213345  9 v battery  1.52   25     Z      6
254467  100 W bulb   1.16   23     W      5
254467  100 W bulb   1.16   24     K      9
254467  100 W bulb   1.16   25     Z      6
311452  Powerdrill   27.64  23     W      5
311452  Powerdrill   27.64  24     K      9
311452  Powerdrill   27.64  25     Z      6
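The degree and cardinality rules for the CARTESIAN PRODUCT (k = n + m and i*j) can be verified with a small Python sketch. It assumes each relation is held as a tuple of attribute names plus a list of row tuples, uses the values of Figure 4.10 (a) and (b), and the function name `cartesian_product` is hypothetical.

```python
from itertools import product

# Relations from Figure 4.10 (a) and (b): attribute names plus rows.
PRODUCT_ATTRS = ("P_CODE", "P_DESCRIPT", "PRICE")
PRODUCT_ROWS = [
    (123456, "Flashlight", 4.16),
    (123457, "Lamp", 19.87),
    (123458, "Box Fan", 8.68),
    (213345, "9 v battery", 1.52),
    (254467, "100 W bulb", 1.06),
    (311452, "Powerdrill", 27.64),
]
LOCATION_ATTRS = ("STORE", "AISLE", "SHELF")
LOCATION_ROWS = [(23, "W", 5), (24, "K", 9), (25, "Z", 6)]


def cartesian_product(attrs1, rows1, attrs2, rows2):
    """Pair every tuple of R1 with every tuple of R2 (R1 × R2)."""
    attrs = attrs1 + attrs2
    rows = [t1 + t2 for t1, t2 in product(rows1, rows2)]
    return attrs, rows


attrs, rows = cartesian_product(
    PRODUCT_ATTRS, PRODUCT_ROWS, LOCATION_ATTRS, LOCATION_ROWS
)
degree = len(attrs)      # n + m = 3 + 3
cardinality = len(rows)  # i * j = 6 * 3
print(degree, cardinality)
```

Every PRODUCT row is paired with every LOCATION row regardless of meaning, which is why the operator is rarely useful until a restriction is applied to its result.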
4.1.7 DIVISION

The DIVISION operation produces a new relation by selecting the tuples in one relation, R1, that match every row in another relation, R2. It is essentially the inverse of the CARTESIAN PRODUCT operation, just like the arithmetic divide is the inverse of multiplication. DIVISION, denoted by R1 ÷ R2, can be formally defined as:

The DIVISION of two relations R1(a1, a2,..., an) with cardinality i and R2(b1, b2,..., bm) with cardinality j is a relation R3 with degree k = n - m and cardinality of at most i ÷ j (exactly i ÷ j when R1 is itself the CARTESIAN PRODUCT of R3 and R2).

Using the example shown in Figure 4.11, note that:

Table 1 (Figure 4.11 (a)) is divided by Table 2 (Figure 4.11 (b)) to produce Table 3 (Figure 4.11 (c)). Tables 1 and 2 both contain the column CODE but do not share LOC.

To be included in the resulting Table 3, a value in the unshared column (LOC) must be associated in Table 1 with every CODE value in the dividing Table 2. The only value associated with both A and B is 5.

FIGURE 4.11 The DIVISION operator
Database name: Ch04_Relational_DB_Operators

Figure 4.11 (a) Division Table 1

CODE  LOC
A     5
A     9
A     4
B     5
B     3
C     6
D     7
D     8
E     8

Figure 4.11 (b) Division Table 2

CODE
A
B

Figure 4.11 (c) Result of Division Table 1 ÷ Division Table 2

LOC
5
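The DIVISION in Figure 4.11 can be sketched in Python as follows. The sketch assumes Table 1 is a set of (CODE, LOC) pairs and Table 2 a set of CODE values; the function name `divide` is hypothetical and only handles this two-column case.

```python
# Values taken from Figure 4.11 (a) and (b).
TABLE_1 = {
    ("A", 5), ("A", 9), ("A", 4),
    ("B", 5), ("B", 3),
    ("C", 6),
    ("D", 7), ("D", 8),
    ("E", 8),
}
TABLE_2 = {"A", "B"}


def divide(r1, r2):
    """Return the LOC values paired in r1 with every CODE value in r2."""
    candidates = {loc for _code, loc in r1}
    return {
        loc
        for loc in candidates
        if all((code, loc) in r1 for code in r2)
    }


table_3 = divide(TABLE_1, TABLE_2)
print(table_3)
```

LOC 5 is the only value that occurs with both A and B, so it is the only member of the result, as in Figure 4.11 (c).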
4.2 JOINS

The JOIN operation is one of the essential operations of relational algebra. It is a binary operation that allows the user to combine two relations in a specified way. JOIN operations are the real power behind the relational database, allowing the use of independent tables linked by common attributes. The JOIN of two relations R1 and R2 is a restriction on their Cartesian product R1 × R2 to meet a specified criterion. The join itself is defined on an attribute a of R1 and an attribute b of R2 where the attributes a and b share the same domain. A JOIN operator may be formally defined as:

The join of two relations R1(a1, a2,..., an) and R2(b1, b2,..., bm) is a relation R3 with degree k = n + m and attributes (a1, a2,..., an, b1, b2,..., bm) whose tuples satisfy a specific join condition.

In this section we will look at a number of different kinds of join operations including the THETA JOIN, EQUIJOIN, NATURAL JOIN, LEFT OUTER JOIN and RIGHT OUTER JOIN.
4.2.1 Theta Join and Equijoin

One of the most commonly used joins is known as an equijoin, which links tables on the basis of an equality condition that compares specified columns of each table. The outcome of the equijoin does not eliminate duplicate columns, and the condition or criterion used to join the tables must be explicitly defined. The equijoin takes its name from the equality comparison operator (=) used in the condition. If any other comparison operator is used the join is called a theta join, denoted with the symbol θ (θ-join). So, theta represents a predicate that consists of one of the comparison operators {=, <, <=, >=, ≠, >}. The equijoin is therefore one special type of theta join:

Let R1(a1, a2,..., an) and R2(b1, b2,..., bm) be relations that may have different schemas. Then the θ-join of R1 and R2 is denoted as R1 |×|_θ R2 and the equijoin is denoted as R1 |×|_{R1.a = R2.b} R2.

It is also possible to express both the θ-join and the equijoin in terms of the restriction and Cartesian product operations. So, for example, the equijoin R1 |×|_{R1.a = R2.b} R2 may also be written as σ_{R1.a = R2.b}(R1 × R2). Looking at the θ-join and the equijoin in this way allows us to create some simple rules, which will allow us to compute such joins on any two relations:

1 Compute R1 × R2. This first performs a Cartesian product to form all possible combinations of the rows of R1 and R2.

2 Restrict the Cartesian product to only those rows where the values in certain columns match.
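The two rules above can be sketched in Python as a generic θ-join. The sketch assumes relations are lists of dictionaries and takes the comparison operator from the standard `operator` module; the function name `theta_join`, the `R1.`/`R2.` prefixes and the sample relations are hypothetical and used only for illustration.

```python
import operator
from itertools import product


def theta_join(r1, r2, attr1, theta, attr2):
    """Restrict R1 × R2 to the pairs for which theta(t1[attr1], t2[attr2]) holds."""
    joined = []
    for t1, t2 in product(r1, r2):       # rule 1: Cartesian product
        if theta(t1[attr1], t2[attr2]):  # rule 2: restriction
            row = {f"R1.{k}": v for k, v in t1.items()}
            row.update({f"R2.{k}": v for k, v in t2.items()})
            joined.append(row)
    return joined


# Hypothetical single-attribute relations, used only for illustration.
R1 = [{"a": 1}, {"a": 2}, {"a": 3}]
R2 = [{"b": 2}, {"b": 3}]

equijoin = theta_join(R1, R2, "a", operator.eq, "b")   # theta is =
less_than = theta_join(R1, R2, "a", operator.lt, "b")  # theta is <
print(equijoin)
print(less_than)
```

Only the predicate changes between the two calls, which is the sense in which the equijoin is a special case of the θ-join.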
For example, suppose we wish to find out all students who take classes in each department at Tiny University. To answer this query, we must join together the two relations STUDENT-2 and DEPARTMENT-2 shown in Figure 4.12 (a) and (b). Following the two rules stated above, this will first involve finding the Cartesian product of the STUDENT-2 and DEPARTMENT-2 relations shown in Figure 4.12 (c). Then, we need to restrict the resulting relation in Figure 4.12 (c) to only those tuples that satisfy the join condition on the common column DEPT_CODE, which is found in both relations (Figure 4.12 (d)). In this case, this would be where STUDENT.DEPT_CODE = DEPARTMENT.DEPT_CODE. This query, which we will call STUDENT_IN_DEPT, can be written in relational algebra as:

STUDENT_IN_DEPT = σ_{STUDENT.DEPT_CODE = DEPARTMENT.DEPT_CODE}(STUDENT × DEPARTMENT)
FIGURE 4.12 Equijoin example
Database name: Ch04_TinyUniversity

Figure 4.12 (a) The STUDENT-2 relation

STU_NUM  STU_LNAME  STU_FNAME  STU_DOB            DEPT_CODE
321452   Ndlovu     Amehlo     12 February 1992   BIOL
324257   Smithson   Anne       15 November 1997   CIS
324258   Le Roux    Dan        23 August 1986     ACCT
324269   Oblonski   Walter     16 September 1997  CIS
324273   Smith      John       30 December 1975   ENGL

Figure 4.12 (b) The DEPARTMENT-2 relation

DEPT_CODE  DEPT_NAME
ACCT       Accounting
BIOL       Biology
CIS        Computer Info. Systems
ENGL       English

Figure 4.12 (c) The Cartesian product (STUDENT × DEPARTMENT)

STU_NUM  STU_LNAME  STU_FNAME  STU_DOB            S.DEPT_CODE  D.DEPT_CODE  DEPT_NAME
321452   Ndlovu     Amehlo     12 February 1992   BIOL         ACCT         Accounting
321452   Ndlovu     Amehlo     12 February 1992   BIOL         BIOL         Biology
321452   Ndlovu     Amehlo     12 February 1992   BIOL         CIS          Computer Info. Systems
321452   Ndlovu     Amehlo     12 February 1992   BIOL         ENGL         English
324257   Smithson   Anne       15 November 1997   CIS          ACCT         Accounting
324257   Smithson   Anne       15 November 1997   CIS          BIOL         Biology
324257   Smithson   Anne       15 November 1997   CIS          CIS          Computer Info. Systems
324257   Smithson   Anne       15 November 1997   CIS          ENGL         English
324258   Le Roux    Dan        23 August 1986     ACCT         ACCT         Accounting
324258   Le Roux    Dan        23 August 1986     ACCT         BIOL         Biology
324258   Le Roux    Dan        23 August 1986     ACCT         CIS          Computer Info. Systems
324258   Le Roux    Dan        23 August 1986     ACCT         ENGL         English
324269   Oblonski   Walter     16 September 1993  CIS          ACCT         Accounting
324269   Oblonski   Walter     16 September 1993  CIS          BIOL         Biology
324269   Oblonski   Walter     16 September 1993  CIS          CIS          Computer Info. Systems
324269   Oblonski   Walter     16 September 1993  CIS          ENGL         English
324273   Smith      John       30 December 1975   ENGL         ACCT         Accounting
324273   Smith      John       30 December 1975   ENGL         BIOL         Biology
324273   Smith      John       30 December 1975   ENGL         CIS          Computer Info. Systems
324273   Smith      John       30 December 1975   ENGL         ENGL         English

Figure 4.12 (d) The final relation STUDENT_IN_DEPT = σ_{STUDENT.DEPT_CODE = DEPARTMENT.DEPT_CODE}(STUDENT × DEPARTMENT)

STU_NUM  STU_LNAME  STU_FNAME  STU_DOB            S.DEPT_CODE  D.DEPT_CODE  DEPT_NAME
321452   Ndlovu     Amehlo     12 February 1992   BIOL         BIOL         Biology
324257   Smithson   Anne       15 November 1997   CIS          CIS          Computer Info. Systems
324258   Le Roux    Dan        23 August 1986     ACCT         ACCT         Accounting
324269   Oblonski   Walter     16 September 1993  CIS          CIS          Computer Info. Systems
324273   Smith      John       30 December 1975   ENGL         ENGL         English

Notice in Figure 4.12 (c) that there are two columns called DEPT_CODE. This is due to the fact that STUDENT-2 and DEPARTMENT-2 both contain a column of the same name. In this case DEPT_CODE also shares the same domain and provides referential integrity between the two relations. In order to distinguish between them, a prefix of S and D has been added to the name of these columns, i.e. S.DEPT_CODE and D.DEPT_CODE, to make them easier to read. You can also see these two common columns again in the resulting relation in Figure 4.12 (d), as the equijoin does not eliminate duplicate columns. Ideally, it would be far better not to show duplicate columns in the resulting relation. As equijoins are so common, an operator called the natural join was defined for this purpose.
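The STUDENT_IN_DEPT query can be traced with a short Python sketch. It assumes relations are lists of dictionaries and, for brevity, keeps only three STUDENT-2 attributes; the function name `equijoin` and the `S`/`D` prefix arguments are hypothetical.

```python
from itertools import product

# Abridged from Figure 4.12 (a) and (b); STU_FNAME and STU_DOB are omitted.
STUDENT = [
    {"STU_NUM": 321452, "STU_LNAME": "Ndlovu", "DEPT_CODE": "BIOL"},
    {"STU_NUM": 324257, "STU_LNAME": "Smithson", "DEPT_CODE": "CIS"},
    {"STU_NUM": 324258, "STU_LNAME": "Le Roux", "DEPT_CODE": "ACCT"},
    {"STU_NUM": 324269, "STU_LNAME": "Oblonski", "DEPT_CODE": "CIS"},
    {"STU_NUM": 324273, "STU_LNAME": "Smith", "DEPT_CODE": "ENGL"},
]
DEPARTMENT = [
    {"DEPT_CODE": "ACCT", "DEPT_NAME": "Accounting"},
    {"DEPT_CODE": "BIOL", "DEPT_NAME": "Biology"},
    {"DEPT_CODE": "CIS", "DEPT_NAME": "Computer Info. Systems"},
    {"DEPT_CODE": "ENGL", "DEPT_NAME": "English"},
]


def equijoin(r1, prefix1, r2, prefix2, column):
    """Restrict r1 × r2 to rows whose `column` values are equal.

    Both copies of the join column are kept, distinguished by a prefix.
    """
    result = []
    for t1, t2 in product(r1, r2):
        if t1[column] == t2[column]:
            row = {f"{prefix1}.{k}": v for k, v in t1.items()}
            row.update({f"{prefix2}.{k}": v for k, v in t2.items()})
            result.append(row)
    return result


STUDENT_IN_DEPT = equijoin(STUDENT, "S", DEPARTMENT, "D", "DEPT_CODE")
for row in STUDENT_IN_DEPT:
    print(row["S.STU_LNAME"], row["S.DEPT_CODE"], row["D.DEPT_CODE"], row["D.DEPT_NAME"])
```

Of the 20 combinations in the Cartesian product, five survive the restriction, and each still carries both S.DEPT_CODE and D.DEPT_CODE.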
4.2.2 The Natural Join

The natural join operation is the most common variant of the joins. The natural join operation requires that the two operand relations must have at least one common attribute, i.e. attributes that share the same domain. The common column(s) is (are) referred to as the join column(s). The natural join is in fact an equijoin; however, in addition, we drop the duplicate attributes, so the resulting relation contains one fewer column per join column than that of the equijoin.

Let R1 be a relation having attributes (a1, a2,..., an, y) and R2 be another relation having attributes (b1, b2,..., bm, y), where y is a set of common attributes (join column(s)) that share the same domain. The natural join operator is defined as:

The natural join of R1 and R2, denoted R1 |×| R2, consists of combining the tuples of R1 and R2 to build a new relation R3, such that if R1Tuple ∈ R1, R2Tuple ∈ R2, and R1Tuple.y = R2Tuple.y, then R3Tuple = R1Tuple.a1,..., R1Tuple.an, R1Tuple.y, R2Tuple.b1,..., R2Tuple.bm.

Note that the common set of attributes y appears only once in R3; the notation R1Tuple.a1 corresponds to the a1 attribute value of a tuple of R1.

Although this definition appears to be quite complicated, the steps required to compute the natural join of two relations are quite straightforward and form a three-stage process:

1 Compute R1 × R2. This first performs a Cartesian product to form all possible combinations of the rows of R1 and R2.

2 Select those tuples where R1Tuple.y = R2Tuple.y. Only the rows are selected where the attribute values in the join column(s) are equal.

3 Perform a PROJECT operation on either R1.y or R2.y to the result of step (2), and call it y in the final relation. This is to ensure that the final relation results in a single copy of each attribute in the joining column, thereby eliminating duplicate columns. For example, if we joined the STUDENT and DEPARTMENT tables on the DEPT_CODE joining column, we would only want one column called DEPT_CODE in our final relation. Finally, project the rest of the attributes in R1 and R2 except y and drop the prefix R1 and R2 in the final relation.

Let us now apply these steps to an example. Figure 4.13 shows two relations called CUSTOMER and AGENT that will be used to illustrate the natural join operator.

FIGURE 4.13 The CUSTOMER and AGENT relations
Database name: Ch04_Relational_DB_Operators

Relation: CUSTOMER

CUS_CODE  CUS_LNAME  CUS_POSTCODE  AGENT_CODE
1132445   Strydom    4001          231
1217782   Adares     7550          125
1312243   Nokwe      678954        167
1321242   Reddy      2094          125
1542311   Smithson   1401          421
1657399   Vanloo     67543W        231

Relation: AGENT

AGENT_CODE  AGENT_PHONE
125         01812439887
167         01813426778
231         01812431124
333         01131234445

1 First, compute the Cartesian product of CUSTOMER and AGENT, i.e. CUSTOMER × AGENT. This operation will produce the results shown in Figure 4.14.
FIGURE 4.14 Step 1: CUSTOMER × AGENT
Database name: Ch04_Relational_DB_Operators

C.CUS_CODE  C.CUS_LNAME  C.CUS_POSTCODE  C.AGENT_CODE  A.AGENT_CODE  A.AGENT_PHONE
1132445     Strydom      4001            231           125           01812439887
1132445     Strydom      4001            231           167           01813426778
1132445     Strydom      4001            231           231           01812431124
1132445     Strydom      4001            231           333           01131234445
1217782     Adares       7550            125           125           01812439887
1217782     Adares       7550            125           167           01813426778
1217782     Adares       7550            125           231           01812431124
1217782     Adares       7550            125           333           01131234445
1312243     Nokwe        678954          167           125           01812439887
1312243     Nokwe        678954          167           167           01813426778
1312243     Nokwe        678954          167           231           01812431124
1312243     Nokwe        678954          167           333           01131234445
1321242     Reddy        2094            125           125           01812439887
1321242     Reddy        2094            125           167           01813426778
1321242     Reddy        2094            125           231           01812431124
1321242     Reddy        2094            125           333           01131234445
1542311     Smithson     1401            421           125           01812439887
1542311     Smithson     1401            421           167           01813426778
1542311     Smithson     1401            421           231           01812431124
1542311     Smithson     1401            421           333           01131234445
1657399     Vanloo       67543W          231           125           01812439887
1657399     Vanloo       67543W          231           167           01813426778
1657399     Vanloo       67543W          231           231           01812431124
1657399     Vanloo       67543W          231           333           01131234445

Notice in Figure 4.14 that in Step 1 we have prefixed each of the attributes with the starting letter of each relation. C.AGENT_CODE refers to the AGENT_CODE column from the CUSTOMER relation, whilst A.AGENT_CODE refers to the AGENT_CODE column in the AGENT relation.

2 Select those tuples where R1Tuple.y = R2Tuple.y. To perform this step we must first identify the join column, which appears in both relations. In this example the join column is AGENT_CODE, as it appears in both the CUSTOMER and AGENT relations. Therefore we SELECT only the rows for which the AGENT_CODE values are equal, i.e. C.AGENT_CODE = A.AGENT_CODE. Figure 4.15 shows the results of Step 2.

3 Perform a PROJECT operation on either C.AGENT_CODE or A.AGENT_CODE to the result of Step 2 so that only one AGENT_CODE column appears in the final relation. Then project the rest of the attributes (CUS_CODE, CUS_LNAME, CUS_POSTCODE, AGENT_PHONE) and drop the prefix C and A in the final relation. The final relation is shown in Figure 4.16.
FIGURE 4.15 Step 2: Selecting rows where values in the join column match
Database name: Ch04_Relational_DB_Operators

The tuples of the relation CUSTOMER × AGENT (Figure 4.14) in which the joining columns agree, that is, where C.AGENT_CODE = A.AGENT_CODE, are selected to produce the results of Step 2.

C.CUS_CODE  C.CUS_LNAME  C.CUS_POSTCODE  C.AGENT_CODE  A.AGENT_CODE  A.AGENT_PHONE
1132445     Strydom      4001            231           231           01812431124
1217782     Adares       7550            125           125           01812439887
1312243     Nokwe        678954          167           167           01813426778
1321242     Reddy        2094            125           125           01812439887
1657399     Vanloo       67543W          231           231           01812431124
FIGURE 4.16 Step 3: Final relation CUSTOMER |×| AGENT
Database name: Ch04_Relational_DB_Operators

CUS_CODE  CUS_LNAME  CUS_POSTCODE  AGENT_CODE  AGENT_PHONE
1132445   Strydom    4001          231         01812431124
1217782   Adares     7550          125         01812439887
1312243   Nokwe      678954        167         01813426778
1321242   Reddy      2094          125         01812439887
1657399   Vanloo     67543W        231         01812431124

Note a few crucial features of the natural join operation:

If no match is made between the tuples in the relation, the new relation does not include the unmatched tuple. In that case, neither AGENT_CODE 421 nor the customer whose last name is Smithson is included. Smithson's AGENT_CODE 421 does not match any entry in the AGENT table.

The column on which the join was made, that is, AGENT_CODE, occurs only once in the new table.

If the same AGENT_CODE were to occur several times in the AGENT table, a customer would be listed for each match. For example, if the AGENT_CODE 167 were to occur three times in the AGENT table, the customer named Nokwe, who is associated with AGENT_CODE 167, would occur three times in the resulting table. (A good AGENT table cannot, of course, yield such a result because it would contain unique primary key values.)
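The three stages of the natural join can be followed in a minimal Python sketch built on the CUSTOMER and AGENT relations of Figure 4.13. It assumes relations are non-empty lists of dictionaries with uniform keys; the function name `natural_join` is hypothetical, and merging the two dictionaries stands in for the PROJECT step because a shared key can only be stored once.

```python
from itertools import product

CUSTOMER_ATTRS = ("CUS_CODE", "CUS_LNAME", "CUS_POSTCODE", "AGENT_CODE")
CUSTOMER = [dict(zip(CUSTOMER_ATTRS, row)) for row in [
    (1132445, "Strydom", "4001", 231),
    (1217782, "Adares", "7550", 125),
    (1312243, "Nokwe", "678954", 167),
    (1321242, "Reddy", "2094", 125),
    (1542311, "Smithson", "1401", 421),
    (1657399, "Vanloo", "67543W", 231),
]]
AGENT_ATTRS = ("AGENT_CODE", "AGENT_PHONE")
AGENT = [dict(zip(AGENT_ATTRS, row)) for row in [
    (125, "01812439887"),
    (167, "01813426778"),
    (231, "01812431124"),
    (333, "01131234445"),
]]


def natural_join(r1, r2):
    """Join r1 and r2 on every attribute name they share."""
    common = [k for k in r1[0] if k in r2[0]]      # the join column(s) y
    result = []
    for t1, t2 in product(r1, r2):                 # stage 1: Cartesian product
        if all(t1[k] == t2[k] for k in common):    # stage 2: select equal y
            result.append({**t1, **t2})            # stage 3: one copy of y
    return result


joined = natural_join(CUSTOMER, AGENT)
for row in joined:
    print(row)
```

Smithson (AGENT_CODE 421) and agent 333 have no partner and therefore do not appear, matching Figure 4.16.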
4.2.3 The Outer Join

When using the theta join and the natural join, it is possible that some tuples do not have identical values for the common attributes. As a result, the data from these tuples will be lost in the joined relation. If we require that all the tuples from the original tables are to be shown in the resulting relation, then it is necessary to have a join which keeps all the tuples in relation R1 that have no corresponding values in R2. In these tuples, the attributes in the second relation will have null values. This type of join is known as the outer join, commonly denoted by the symbols ⟕ (left), ⟖ (right) and ⟗ (full).

There are three types of outer join:

Left outer join, which keeps data from the left-hand relation
Right outer join, which keeps data from the right-hand relation
Full outer join, which keeps data from both relations

As you will see, the steps for computing an outer join are very similar to those of performing a natural join, except that we also include the unmatched tuples from the left or right side of the relation, depending on whether we are performing a left or right outer join. The stages in determining a left outer join are:

1 Compute R1 × R2. This first performs a Cartesian product to form all possible combinations of the rows of R1 and R2.

2 Select those tuples where R1Tuple.y = R2Tuple.y. Only the rows are selected where the attribute values in the join column(s) are equal.

3 Select those tuples in R1 that do not have matching values in R2, i.e. those for which no tuple of R2 satisfies R1Tuple.y = R2Tuple.y. In these tuples the attributes of R2 are set to null.

4 Perform a PROJECT operation on either R1.y or R2.y to the result of Steps 2 and 3, and call it simply y in the final relation. This is to ensure that the final relation results in a single copy of each attribute in the joining column, thereby eliminating duplicate columns. Finally, project the rest of the attributes of R1 and R2, except y, and drop the prefix R1 and R2 in the final relation.

For example, consider the relations CUSTOMER and AGENT, defined in Figure 4.13. A left outer join, CUSTOMER ⟕ AGENT, will return all of the tuples in the CUSTOMER relation, including those that do not have a matching value in the AGENT relation. The result of this join is shown in Figure 4.17. Notice that there is no AGENT_PHONE for the customer Smithson and a value of NULL has been entered in the AGENT_PHONE field. A right outer join, CUSTOMER ⟖ AGENT, returns all of the tuples in the AGENT relation, including those which do not have matching values in the CUSTOMER relation. The result of this join is shown in Figure 4.18.

FIGURE 4.17 Left outer join: CUSTOMER ⟕ AGENT
Database name: Ch04_Relational_DB_Operators

CUS_CODE  CUS_LNAME  CUS_POSTCODE  AGENT_CODE  AGENT_PHONE
1132445   Strydom    4001          231         01812431124
1217782   Adares     7550          125         01812439887
1312243   Nokwe      678954        167         01813426778
1321242   Reddy      2094          125         01812439887
1657399   Vanloo     67543W        231         01812431124
1542311   Smithson   1401          421         NULL

FIGURE 4.18 Right outer join: CUSTOMER ⟖ AGENT
Database name: Ch04_Relational_DB_Operators

CUS_CODE  CUS_LNAME  CUS_POSTCODE  AGENT_CODE  AGENT_PHONE
1132445   Strydom    4001          231         01812431124
1217782   Adares     7550          125         01812439887
1312243   Nokwe      678954        167         01813426778
1321242   Reddy      2094          125         01812439887
1657399   Vanloo     67543W        231         01812431124
NULL      NULL       NULL          333         01131234445
So, regardless of the type of outer join, the examples in Figures 4.17 and 4.18 have shown that the matched pairs would be retained and any unmatched values in the other relation would be left null.

Outer joins are especially useful when you are trying to determine what value(s) in related tables cause(s) referential integrity problems. Such problems are created when foreign key values do not match the primary key values in the related table(s). In fact, if you are asked to convert large spreadsheets or other non-database data into relational database tables, you will discover that the outer joins save you vast amounts of time and uncounted headaches when you encounter referential integrity errors after the conversions.

You may wonder why the outer joins are labelled left and right. The labels refer to the order in which the tables are listed in the SQL command. Chapter 8 will explore such joins.
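The results in Figures 4.17 and 4.18 can be reproduced with a short Python sketch. It assumes relations are non-empty lists of dictionaries and uses `None` to stand for NULL; instead of building the full Cartesian product it looks up the matches for each left-hand tuple directly, which is a shortcut rather than the staged procedure. The function name `left_outer_join` is hypothetical.

```python
CUSTOMER_ATTRS = ("CUS_CODE", "CUS_LNAME", "CUS_POSTCODE", "AGENT_CODE")
CUSTOMER = [dict(zip(CUSTOMER_ATTRS, row)) for row in [
    (1132445, "Strydom", "4001", 231),
    (1217782, "Adares", "7550", 125),
    (1312243, "Nokwe", "678954", 167),
    (1321242, "Reddy", "2094", 125),
    (1542311, "Smithson", "1401", 421),
    (1657399, "Vanloo", "67543W", 231),
]]
AGENT = [
    {"AGENT_CODE": 125, "AGENT_PHONE": "01812439887"},
    {"AGENT_CODE": 167, "AGENT_PHONE": "01813426778"},
    {"AGENT_CODE": 231, "AGENT_PHONE": "01812431124"},
    {"AGENT_CODE": 333, "AGENT_PHONE": "01131234445"},
]


def left_outer_join(left, right, column):
    """Keep every tuple of `left`; pad unmatched ones with None (NULL)."""
    padding = {k: None for k in right[0] if k != column}
    result = []
    for row in left:
        matches = [other for other in right if other[column] == row[column]]
        if matches:
            result.extend({**row, **other} for other in matches)
        else:
            result.append({**row, **padding})
    return result


left_result = left_outer_join(CUSTOMER, AGENT, "AGENT_CODE")
# A right outer join is the mirror image, so the operands are swapped.
right_result = left_outer_join(AGENT, CUSTOMER, "AGENT_CODE")

for row in left_result:
    print(row["CUS_LNAME"], row["AGENT_CODE"], row["AGENT_PHONE"])
```

Rows whose padded attributes are `None` are exactly the ones that point at a referential integrity problem, such as Smithson's AGENT_CODE 421.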
4
4.3 CONSTRUCTING QUERIES USING RELATIONAL ALGEBRAIC EXPRESSIONS

The main purpose of relational algebra is to provide a notation for formulating queries in a database. The operations of relational algebra that you have just read about in the previous section are used to tell the DBMS how to build a required relation in terms of other relations (tables). In this section, we will examine how to use these operators to represent real queries as relational algebraic expressions.

During the lifetime of a database, users will ask many different kinds of queries. Some queries will be asked over and over again, whilst others will be asked on the spur of the moment. The job of writing relational algebraic expressions involves breaking a query down into a number of smaller steps, where each step generates a set of intermediate results that are then used in the following step of the query.

It is worth pointing out that, when building queries, the order of execution of the individual steps is very important. Generally, the results of a query will always be the same no matter what the order of execution is, but the efficiency of the query is determined by the order of execution of its steps. The task of the query optimiser is to analyse the query and find the most efficient way to access and manipulate the required data. You will discover more about the query optimiser in Chapter 13, Managing Database Performance.

In 1972, Codd proposed relational calculus, a non-procedural language based on predicate logic, as a base for query languages for relational databases. Codd proposed one form, known as tuple relational calculus, and this was later followed in 1977 by domain relational calculus (Lacroix and Pirotte). Both relational algebra and relational calculus are equivalent to each other in terms of expressive power. However, relational algebra is classed as a procedural language, whilst relational calculus is non-procedural. In this section we will be focusing on relational algebra. For those who are interested in the mathematical definitions behind relational calculus, there is a section at the end of this chapter.

4.3.1 Building Queries
In order to build a query using a relational algebraic expression, you should take the following steps:
1 List all the attributes we need to give the answer.
2 Select all the relations we need, based on the list of attributes.
3 Specify the relational operators and the intermediate results that are needed.

To learn how to build queries following these steps, we will now look at some examples based on a small database that stores information about the maintenance of cars (the ERD is shown in Figure 4.19). Each year a car is required to undergo an inspection to test whether it is roadworthy. After each inspection, a new maintenance record is created and any repairs that are needed are recorded. If a car needs a repair, then the EVALUATION is set to FAIL until all the repairs are completed, and then it is set to PASS. A repair can require parts to be purchased and fitted. The tables representing this database are shown in Figure 4.20.

FIGURE 4.19 The car inspection ERD
[ERD with four entities: CAR (REGISTRATION {PK}, CAR_MAKE, CAR_MODEL, MODEL_YEAR, LICENCE_NO); MAINTENANCE_RECORD (INSPECTION_CODE {PK}, REGISTRATION {FK}, INSPECTION_DATE, EVALUATION); REPAIR (INSPECTION_CODE {PK}{FK}, PART_NO {PK}{FK}); PART (PART_NO {PK}, PART_NAME, PART_COST). Relationships: CAR requires MAINTENANCE_RECORD; MAINTENANCE_RECORD requires REPAIR; REPAIR is_for PART.]
FIGURE 4.20 The car inspection database
Database name: Ch04_Car_Inspection

Table name: CAR
REGISTRATION  CAR_MAKE    CAR_MODEL  CAR_COLOUR  MODEL_YEAR  LICENCE_NO
3679MR82      Toyota      Corolla    Blue        2016        1967fr89768
E-TS865       Nissan      Micra      Red         2004        1973Smith121
PE57UVP       Peugeot     508        Blue        2017        1990bty3212
PISE567       Volkswagen  Eos        Lime        2016        DF-678-WV
ROMA482       Volkswagen  Golf GT    Black       2017        AQ-123-AV
Z-BA975       Peugeot     208        Black       2017        1980vrt7312

Table name: PART
PART_NO  PART_NAME       PART_COST
12390    Paint sealants  19.95
12391    Wiper           14.95
12392    Brake pads      24.99
12393    Brake discs     49.54
12395    Spark plugs     0.99
12396    Airbag          24.95
12397    Tyres           25.00

Table name: MAINTENANCE_RECORD
INSPECTION_CODE  REGISTRATION  INSPECTION_DATE  EVALUATION
100036           PE57UVP       10/05/2018       FAIL
100390           ROMA482       01/09/2018
106750           E-TS865       01/03/2016       PASS
122456           Z-BA975       03/10/2018       FAIL
145678           PISE567       30/09/2017       PASS
200450           E-TS865       21/02/2015       PASS
200456           E-TS865       01/04/2017       FAIL

Table name: REPAIR
INSPECTION_CODE  PART_NO
106750           12396
106750           12397
100036           12393
200450           12391
100036           12397
200450           12392
200456           12397

Example 1
Consider the following query asked by a user:
List all information about cars where the model year is after 2016.
To answer this query, you must first interpret the query. The user wants to see all information on cars, which means list all the attributes in the relation CAR. The query only asked about cars where the attribute MODEL_YEAR > 2016. Using the relational operator SELECT, we can write this query as a relational algebraic expression as:
$\sigma_{\text{model\_year} > 2016}(\text{CAR})$

The resulting relation is shown in Figure 4.21.

FIGURE 4.21 Result of $\sigma_{\text{model\_year} > 2016}(\text{CAR})$
REGISTRATION  CAR_MAKE    CAR_MODEL  CAR_COLOUR  MODEL_YEAR  LICENCE_NO
PE57UVP       Peugeot     508        Blue        2017        1990bty3212
ROMA482       Volkswagen  Golf GT    Black       2017        AQ-123-AV
Z-BA975       Peugeot     208        Black       2017        1980vrt7312

Example 2
Supposing the mechanic at the garage wishes to find out information about the names of the parts in stock and their prices. The following query is asked:
Display all the names and prices of parts where the cost of the part is greater than 20.00.
This query requires only specific attributes to be displayed, PART_NAME and PART_COST. Both are in the relation PART. We will also need to restrict the rows to those where PART_COST > 20.00 using the SELECT operator. The relational algebraic expression for this query is:

$\Pi_{\text{part\_name, part\_cost}}(\sigma_{\text{part\_cost} > 20.00}(\text{PART}))$

The resulting relation is shown in Figure 4.22.

FIGURE 4.22 Result of $\Pi_{\text{part\_name, part\_cost}}(\sigma_{\text{part\_cost} > 20.00}(\text{PART}))$
PART_NAME    PART_COST
Brake Pads   24.99
Brake Discs  49.54
Airbag       24.95
Tyres        25.00

Example 3
The final example will also use the natural join operator and show how we can write expressions when data is required from a number of different tables. Consider the following, more complex query:
List all the car registration and model details and part numbers for all cars where the model year is 2017 and which have an inspection that was carried out after 01/03/2018, which resulted in a part being required for a repair.
This query will have to be broken down into a number of different stages, each one having a set of intermediate results. The first part of the query states that we need the attributes REGISTRATION and CAR_MODEL, which are located in the CAR relation. Also, we are only interested in cars whose MODEL_YEAR is 2017. This information can be written using the following relational algebraic expression:
$\Pi_{\text{registration, car\_model}}(\sigma_{\text{model\_year} = 2017}(\text{CAR}))$

The result of applying this statement to the CAR table is shown in Figure 4.23.

FIGURE 4.23 Result of $\Pi_{\text{registration, car\_model}}(\sigma_{\text{model\_year} = 2017}(\text{CAR}))$
REGISTRATION  CAR_MODEL
PE57UVP       508
ROMA482       Golf GT
Z-BA975       208

The next part of the query requires information about inspections that were carried out after 01/03/2018. Information about inspections is stored in the MAINTENANCE_RECORD relation. The query is not asking for any specific attributes, so we will assume that this means selecting the values of all attributes in the MAINTENANCE_RECORD relation. However, we must restrict the tuples by only selecting those inspections where INSPECTION_DATE > 01/03/2018. This second part of the query can be written as:

$\sigma_{\text{inspection\_date} > 01/03/2018}(\text{MAINTENANCE\_RECORD})$

The result of applying this expression to the MAINTENANCE_RECORD table is shown in Figure 4.24.

FIGURE 4.24 Result of $\sigma_{\text{inspection\_date} > 01/03/2018}(\text{MAINTENANCE\_RECORD})$
INSPECTION_CODE  REGISTRATION  INSPECTION_DATE  EVALUATION
100036           PE57UVP       10/05/2018       FAIL
100390           ROMA482       01/09/2018
122456           Z-BA975       03/10/2018       FAIL

We now have relational algebraic expressions for the first two parts of the query. The next stage is to join the rows from the resulting tables shown in Figures 4.23 and 4.24. This join operation is the natural join, with the common column in both the CAR and MAINTENANCE_RECORD relations being REGISTRATION. This can now be written as:

$\text{TempR} = \Pi_{\text{registration, car\_model}}(\sigma_{\text{model\_year} = 2017}(\text{CAR})) \bowtie \sigma_{\text{inspection\_date} > 01/03/2018}(\text{MAINTENANCE\_RECORD})$

where TempR is a relation which stores the intermediate results. The result of the natural join operation, TempR, is shown using the three steps in Figure 4.25. Notice that the attributes have been prefixed with the letters M and C to show which relations they were originally from (MAINTENANCE_RECORD and CAR respectively).
FIGURE 4.25 The TempR relation

Step 1: Compute the Cartesian product: MAINTENANCE_RECORD X CAR.
M.INSPECTION_CODE  M.REGISTRATION  M.INSPECTION_DATE  M.EVALUATION  C.REGISTRATION  C.CAR_MODEL
100036             PE57UVP         10/05/2018         FAIL          PE57UVP         508
100036             PE57UVP         10/05/2018         FAIL          ROMA482         Golf GT
100036             PE57UVP         10/05/2018         FAIL          Z-BA975         208
100390             ROMA482         01/09/2018                       PE57UVP         508
100390             ROMA482         01/09/2018                       ROMA482         Golf GT
100390             ROMA482         01/09/2018                       Z-BA975         208
122456             Z-BA975         03/10/2018         FAIL          PE57UVP         508
122456             Z-BA975         03/10/2018         FAIL          ROMA482         Golf GT
122456             Z-BA975         03/10/2018         FAIL          Z-BA975         208

Step 2: SELECT only the rows for which the REGISTRATION values are equal, i.e. M.REGISTRATION = C.REGISTRATION (the joining columns).
M.INSPECTION_CODE  M.REGISTRATION  M.INSPECTION_DATE  M.EVALUATION  C.REGISTRATION  C.CAR_MODEL
100036             PE57UVP         10/05/2018         FAIL          PE57UVP         508
100390             ROMA482         01/09/2018                       ROMA482         Golf GT
122456             Z-BA975         03/10/2018         FAIL          Z-BA975         208

Step 3: Perform a PROJECT on either C.REGISTRATION or M.REGISTRATION to the result of Step 2 and drop the prefixes C and M in the final relation. The table below shows the relation TempR, which has been created as a result of Step 3.
INSPECTION_CODE  REGISTRATION  INSPECTION_DATE  EVALUATION  CAR_MODEL
100036           PE57UVP       10/05/2018       FAIL        508
100390           ROMA482       01/09/2018                   Golf GT
122456           Z-BA975       03/10/2018       FAIL        208
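The three steps in Figure 4.25 can be sketched in Python as follows. This is an illustrative sketch rather than how a DBMS would evaluate the join: the helper names (`prefixed`, `drop_prefix`) are hypothetical, the rows are those of Figures 4.23 and 4.24, and the blank EVALUATION for inspection 100390 is represented as `None`, which is an assumption.

```python
from itertools import product

# Rows of Figure 4.24 (inspections carried out after 01/03/2018).
# None stands in for the blank EVALUATION value (an assumption).
maintenance = [
    {"INSPECTION_CODE": 100036, "REGISTRATION": "PE57UVP",
     "INSPECTION_DATE": "10/05/2018", "EVALUATION": "FAIL"},
    {"INSPECTION_CODE": 100390, "REGISTRATION": "ROMA482",
     "INSPECTION_DATE": "01/09/2018", "EVALUATION": None},
    {"INSPECTION_CODE": 122456, "REGISTRATION": "Z-BA975",
     "INSPECTION_DATE": "03/10/2018", "EVALUATION": "FAIL"},
]

# Rows of Figure 4.23 (cars with model year 2017).
cars = [
    {"REGISTRATION": "PE57UVP", "CAR_MODEL": "508"},
    {"REGISTRATION": "ROMA482", "CAR_MODEL": "Golf GT"},
    {"REGISTRATION": "Z-BA975", "CAR_MODEL": "208"},
]


def prefixed(row, prefix):
    """Prefix every attribute name, e.g. REGISTRATION -> M.REGISTRATION."""
    return {f"{prefix}.{name}": value for name, value in row.items()}


def drop_prefix(row):
    """Drop the duplicate join column and strip the M./C. prefixes."""
    result = {}
    for name, value in row.items():
        if name == "C.REGISTRATION":
            continue  # keep only one copy of the joining column
        result[name.split(".", 1)[1]] = value
    return result


# Step 1: Cartesian product of MAINTENANCE_RECORD and CAR.
step1 = [{**prefixed(m, "M"), **prefixed(c, "C")}
         for m, c in product(maintenance, cars)]

# Step 2: SELECT the rows whose REGISTRATION values are equal.
step2 = [row for row in step1
         if row["M.REGISTRATION"] == row["C.REGISTRATION"]]

# Step 3: PROJECT away the duplicate column and drop the prefixes.
temp_r = [drop_prefix(row) for row in step2]

for row in temp_r:
    print(row)
```

The Cartesian product has nine rows, the selection keeps three of them, and the projection leaves the five attributes of TempR.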
The next part of the query requires the information we have obtained so far to be restricted even further by only displaying information for cars where a part was needed for a repair. To find out this information we have to look to see if there is a PART_NO in the REPAIR relation which corresponds to a specific INSPECTION_CODE in the MAINTENANCE_RECORD relation. The relation TempR already stores the intermediate results from the first part of our query, so we must now connect TempR to the REPAIR relation using a natural join on the INSPECTION_CODE column. This can be written as the expression:

$\text{QueryResult} = \text{TempR} \bowtie \text{REPAIR}$
Figure 4.26 shows the result of performing this natural join operation and stores the results in a relation called QueryResult.

FIGURE 4.26 The QueryResult relation

The relation TempR
INSPECTION_CODE  REGISTRATION  INSPECTION_DATE  EVALUATION  CAR_MODEL
100036           PE57UVP       10/05/2018       FAIL        508
100390           ROMA482       01/09/2018                   Golf GT
122456           Z-BA975       03/10/2018       FAIL        208

The relation REPAIR
INSPECTION_CODE  PART_NO
106750           12396
106750           12397
100036           12393
200450           12391
100036           12397
200450           12392
200456           12397

$\text{QueryResult} = \text{TempR} \bowtie \text{REPAIR}$
INSPECTION_CODE  REGISTRATION  INSPECTION_DATE  EVALUATION  CAR_MODEL  PART_NO
100036           PE57UVP       10/05/2018       FAIL        508        12393
100036           PE57UVP       10/05/2018       FAIL        508        12397
Finally, the original query requested that we only list the car registration, model details and part numbers. This requires us to perform a PROJECT operation on the intermediate results in the QueryResult relation using the following expression:

$\Pi_{\text{registration, car\_model, part\_no}}(\text{QueryResult})$

The final results of the query are shown in Figure 4.27.

FIGURE 4.27 Solution to example 3
REGISTRATION  CAR_MODEL  PART_NO
PE57UVP       508        12393
PE57UVP       508        12397

As you can see, it is possible to solve a complex query by breaking down the query into a number of smaller relational algebra expressions. The full expression for example 3 can be written as:

$\Pi_{\text{registration, car\_model, part\_no}}(\text{REPAIR} \bowtie (\Pi_{\text{registration, car\_model}}(\sigma_{\text{model\_year} = 2017}(\text{CAR})) \bowtie \sigma_{\text{inspection\_date} > 01/03/2018}(\text{MAINTENANCE\_RECORD})))$
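The whole of example 3 can be replayed with three small Python functions standing in for SELECT, PROJECT and natural JOIN. This is a minimal sketch under stated assumptions: the function names are hypothetical, relations are modelled as lists of dictionaries, dates are read as day/month/year, the model year is 2017 as in the worked steps, and only the CAR attributes needed by the query are included.

```python
from datetime import date


def rows(names, tuples):
    """Turn positional tuples into attribute-name dictionaries."""
    return [dict(zip(names, t)) for t in tuples]


def select(relation, predicate):
    """SELECT: keep the tuples that satisfy the predicate."""
    return [row for row in relation if predicate(row)]


def project(relation, attributes):
    """PROJECT: keep the listed attributes and drop duplicate tuples."""
    seen, result = set(), []
    for row in relation:
        key = tuple(row[a] for a in attributes)
        if key not in seen:
            seen.add(key)
            result.append(dict(zip(attributes, key)))
    return result


def natural_join(left, right):
    """NATURAL JOIN: match tuples on every attribute name both sides share."""
    if not left or not right:
        return []
    common = [a for a in left[0] if a in right[0]]
    return [{**l, **r} for l in left for r in right
            if all(l[a] == r[a] for a in common)]


CAR = rows(("REGISTRATION", "CAR_MODEL", "MODEL_YEAR"), [
    ("3679MR82", "Corolla", 2016), ("E-TS865", "Micra", 2004),
    ("PE57UVP", "508", 2017), ("PISE567", "Eos", 2016),
    ("ROMA482", "Golf GT", 2017), ("Z-BA975", "208", 2017),
])
MAINTENANCE_RECORD = rows(
    ("INSPECTION_CODE", "REGISTRATION", "INSPECTION_DATE", "EVALUATION"), [
        (100036, "PE57UVP", date(2018, 5, 10), "FAIL"),
        (100390, "ROMA482", date(2018, 9, 1), None),  # blank in the figure
        (106750, "E-TS865", date(2016, 3, 1), "PASS"),
        (122456, "Z-BA975", date(2018, 10, 3), "FAIL"),
        (145678, "PISE567", date(2017, 9, 30), "PASS"),
        (200450, "E-TS865", date(2015, 2, 21), "PASS"),
        (200456, "E-TS865", date(2017, 4, 1), "FAIL"),
    ])
REPAIR = rows(("INSPECTION_CODE", "PART_NO"), [
    (106750, 12396), (106750, 12397), (100036, 12393), (200450, 12391),
    (100036, 12397), (200450, 12392), (200456, 12397),
])

# Stage 1: registration and model of cars whose model year is 2017.
cars_2017 = project(select(CAR, lambda r: r["MODEL_YEAR"] == 2017),
                    ["REGISTRATION", "CAR_MODEL"])
# Stage 2: inspections carried out after 01/03/2018.
recent = select(MAINTENANCE_RECORD,
                lambda r: r["INSPECTION_DATE"] > date(2018, 3, 1))
# Stage 3: TempR joins the two intermediate results on REGISTRATION.
temp_r = natural_join(cars_2017, recent)
# Stage 4: QueryResult joins TempR to REPAIR on INSPECTION_CODE.
query_result = natural_join(temp_r, REPAIR)
# Stage 5: final projection.
answer = project(query_result, ["REGISTRATION", "CAR_MODEL", "PART_NO"])

for row in answer:
    print(row)
```

Each variable corresponds to one intermediate relation in the worked example, so the stages can be inspected one at a time and compared with Figures 4.23 to 4.27.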
4.4 RELATIONAL CALCULUS

Relational calculus is a formal language based upon a branch of mathematical logic called predicate calculus. There are two types of relational calculus: tuple relational calculus and domain relational calculus. Tuple relational calculus is a precise language that allows users to describe what they want, rather than how to compute it. In addition, it underlies Structured Query Language (SQL), which you will learn more about in Chapter 8. Domain relational calculus is different from tuple relational calculus as it uses domain variables that take on values from an attribute domain, rather than values for an entire tuple. In the following sections you will learn about these two types of relational calculus.
NOTE
A NOTE ON PREDICATE CALCULUS
First-order logic or predicate calculus can be used to express queries. Predicates are words that describe certain relations and properties. In logic, a predicate has the form:
name_of_predicate(arguments)
Consider the following statements:
student(Alex)
studies(Alex, Database Systems)
In these two statements, student and studies are the names of the predicates. The statement student(Alex) has a value TRUE if Alex is a student, and a value FALSE if Alex is not a student. Variables are used if we want to express the property of being a student, and not refer to a specific individual. So the above statements become:
student(x)
studies(x,y)
The expression student(x) is now referred to as a predicate expression. It has no predetermined truth value as the value of x is currently unknown. Variables in a predicate expression can take values within a certain domain. The domain of a predicate variable is the set of all values that can be substituted in the place of the variable.
When writing expressions in predicate calculus, we use a capital letter as the name of the predicate. For example, P(x) represents a predicate with one variable x. When x has a value we can say whether the expression is true or false. Every predicate has what is known as a Truth Set, which is defined as:
$\{x \in D \mid P(x)\}$
So, a truth set of a predicate P(x) with a domain D is the set of all elements of D that make P(x) true when substituted for x. For example, consider the following predicate, lecturer(x). The domain would be all people and the truth set would be all lecturers.
A formula in predicate calculus can comprise:
Set of comparison operators: $<$, $\leq$, $>$, $\geq$, $=$
Set of connectives: and ($\land$), or ($\lor$), not ($\lnot$)
Implication ($\Rightarrow$), where $x \Rightarrow y$ means: if x is true, then y is true.
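The idea of a truth set can be illustrated with a Python set comprehension, which reads almost the same as the set-builder definition. This is a small sketch only: the `people` dictionary is a hypothetical domain invented for illustration, and `truth_set` and `implies` are hypothetical helper names.

```python
# Hypothetical domain D: a few people and their roles (illustration only).
people = {"Alex": "student", "Maria": "lecturer", "Sam": "lecturer"}


def lecturer(x):
    """Predicate lecturer(x): TRUE when x is a lecturer."""
    return people[x] == "lecturer"


def student(x):
    """Predicate student(x): TRUE when x is a student."""
    return people[x] == "student"


def truth_set(domain, predicate):
    """All elements of the domain that make the predicate true."""
    return {x for x in domain if predicate(x)}


def implies(x, y):
    """Implication: false only when x is true and y is false."""
    return (not x) or y


domain = set(people)
print(truth_set(domain, lecturer))
print(truth_set(domain, student))
```

The predicate has no truth value of its own until a value from the domain is substituted for x, which is why `truth_set` takes the domain as an explicit argument.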
4.4.1 Tuple Relational Calculus

Tuple relational calculus is a non-procedural query language which is used to describe what information is required from the database without giving a specific method for obtaining that information. When specifying a query in tuple relational calculus we say only which attributes are to be retrieved and not how the query is to be executed. This is in contrast to relational algebra, which provides a procedural way of writing the query and incorporates a strategy for executing the query through the way in which the operations are ordered.

Relational algebra and tuple relational calculus can both be used to express the same queries, which means that we have a relationally complete query language. We say a query language is relationally complete if any query that can be written in relational algebraic form can also be expressed by the query language. Most relational query languages such as SQL are not only relationally complete, but also contain additional features like aggregate functions that allow more complex queries to be written.

In tuple relational calculus, we specify a number of tuple variables where each tuple variable ranges over a database table. The values of the tuple variables are the actual tuples in the table. A query in the tuple relational calculus is expressed as:

$\{t \mid P(t)\}$

which represents the set of tuples, t, for which predicate, P, is true. Therefore, the results of this query are all tuples that satisfy the condition represented by predicate P. For example, consider the car inspection database in Figure 4.20. If we wanted to write the following query:
Find all cars with a model_year >= 2018
using tuple relational calculus we would write:

$\{t \mid t \in \text{CAR} \land t.\text{MODEL\_YEAR} \geq 2018\}$

This query means return the set of tuples, t, where t belongs to the CAR relation and the model_year for t is greater than or equal to 2018.

A query or expression in tuple relational calculus can also be written in the extended form:

$\{t_1.A_1, t_2.A_2, \ldots, t_n.A_n \mid P(t_1, \ldots, t_n, t_{n+1}, \ldots, t_{n+m})\}$

where:
$t_1, \ldots, t_n, t_{n+1}, \ldots, t_{n+m}$ are tuple variables,
$A_1, \ldots, A_n$ are attributes of the relation on which $t_i$ ranges,
P is a predicate.

A formula in tuple relational calculus consists of predicate calculus atoms. An atom has one of the following forms:
(i) R(t), where t is a tuple variable and R is a relation name.
(ii) t.A oper s.B, where t and s are tuple variables, A and B are attributes and oper is a comparison operator.
(iii) t.A oper const, where t is a tuple variable, A is an attribute, oper is a comparison operator, and const is a constant.
Each of these types of atoms evaluates to either TRUE or FALSE for a specific combination of tuples. Every atom has a truth value.
Tuple relational calculus formulae are either an atom, or atoms or other formulae connected via the logical Boolean operators AND, OR, NOT ($\land$, $\lor$, $\lnot$).

Existential and Universal Quantifiers
Tuple relational calculus formulae can contain existential ($\exists$) and universal ($\forall$) quantifiers. The role of these quantifiers is to constrain the variables of tuples in a single relation. Any variable that is not bound by a quantifier is said to be free. A tuple relational calculus expression may contain at most one free variable. Consider the following two expressions:

$\exists t \in R\,(P(t))$ reads that there exists a tuple t in relation R such that predicate P(t) is true.
$\forall t \in R\,(P(t))$ reads that P is true for all tuples t in relation R.

The existential quantifier ($\exists$) states that a formula must be true for at least one instance, while the universal quantifier ($\forall$) states that the formula must be true for all instances.
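The two quantifiers map naturally onto Python's built-in `any` and `all`. The sketch below is illustrative only: `exists` and `for_all` are hypothetical helper names, and the relation holds a few rows taken from the PART table in Figure 4.20.

```python
from collections import namedtuple

Part = namedtuple("Part", "part_no part_name part_cost")

# A few rows of the PART relation from Figure 4.20.
PART = [
    Part(12392, "Brake pads", 24.99),
    Part(12393, "Brake discs", 49.54),
    Part(12395, "Spark plugs", 0.99),
    Part(12396, "Airbag", 24.95),
    Part(12397, "Tyres", 25.00),
]


def exists(relation, predicate):
    """Existential quantifier: true if at least one tuple satisfies P."""
    return any(predicate(t) for t in relation)


def for_all(relation, predicate):
    """Universal quantifier: true only if every tuple satisfies P."""
    return all(predicate(t) for t in relation)


# There exists a part costing more than 40.
some_expensive = exists(PART, lambda t: t.part_cost > 40)
# Not every part costs more than 20 (spark plugs cost 0.99).
all_over_20 = for_all(PART, lambda t: t.part_cost > 20)
# Every part has a positive cost.
all_positive = for_all(PART, lambda t: t.part_cost > 0)

print(some_expensive, all_over_20, all_positive)
```

Note that over an empty relation `exists` is false and `for_all` is true, which matches the usual logical convention for quantifiers over an empty domain.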
4.4.2 Building a Tuple Relational Calculus Expression

To specify a tuple relational calculus expression, take the following steps:
(i) Specify the range relation R of each tuple variable t, in the form of R(t).
(ii) Specify a condition to select particular combinations of tuples.
(iii) Specify a set of attributes which are to be retrieved.

To learn how to build tuple relational calculus expressions, we will look at some examples based on a simple small database that stores information about customers at a bank. Customers can withdraw money and deposit money at any branch of the bank. The ERD is shown in Figure 4.28 and the relations (tables) representing this database are shown in Figure 4.29.
FIGURE 4.28
[ERD with four entities: CUSTOMER (CUS_ACCNO {PK}, CUS_LNAME, CUS_FNAME, CUS_BALANCE); BRANCH (BRANCH_NO {PK}, BRANCH_NAME, BRANCH_CITY); WITHDRAWAL (WITH_TRANS_NO {PK}, WITH_DATE, WITH_AMOUNT, CUS_ACCNO {FK1}, BRANCH_NO {FK2}); DEPOSIT (DEP_TRANS_NO {PK}, DEP_DATE, DEP_AMOUNT, CUS_ACCNO {FK1}, BRANCH_NO {FK2}). Each CUSTOMER (1..1) makes 0..* WITHDRAWAL and 0..* DEPOSIT; each BRANCH (1..1) is related to 0..* WITHDRAWAL and 0..* DEPOSIT.]
FIGURE 4.29

Relation: CUSTOMER
CUS_ACCNO  CUS_LNAME  CUS_FNAME  CUS_BALANCE
2465454    Emerson    Percy      1034
1012345    Adares     Constance  1865

Relation: BRANCH
BRANCH_NO  BRANCH_NAME  BRANCH_CITY
125        Monsuir      London
333        FirstStep    Paris
231        Cross_St     Rome

Relation: WITHDRAWAL
WITH_TRANS_NO  WITH_DATE  WITH_AMOUNT  CUS_ACCNO  BRANCH_NO
48887211       01-Jul-18  50           2465454    125
48867666       02-Jul-18  100          1012345    333
64446566       18-Jul-18  200          2465454    125
64443229       20-Jul-18  400          2465454    231

Relation: DEPOSIT
DEP_TRANS_NO  DEP_DATE   DEP_AMOUNT  CUS_ACCNO  BRANCH_NO
90000034      30-Jun-18  1000        2465454    125
90000780      30-Jun-18  1400        1012345    333

Example 1
Suppose we wanted to find all withdrawals of 200 or more. We would write the following expression:

$\{w \mid w \in \text{WITHDRAWAL} \land w.\text{WITH\_AMOUNT} \geq 200\}$
This expression gives us all attributes from the WITHDRAWAL relation, but suppose we only want the last names of customers who have withdrawn 200 or more. CUS_LNAME exists in the CUSTOMER relation, which means we will have to perform a join on the CUSTOMER and WITHDRAWAL relations. The attribute CUS_ACCNO appears in both CUSTOMER and WITHDRAWAL and is used to join the two relations together as shown in the expression below:

$\{c.\text{CUS\_LNAME} \mid c \in \text{CUSTOMER} \land (\exists w)(w \in \text{WITHDRAWAL} \land c.\text{CUS\_ACCNO} = w.\text{CUS\_ACCNO} \land w.\text{WITH\_AMOUNT} \geq 200)\}$

In English, the above expression would read: display the last names of all customers such that there exists a tuple in the WITHDRAWAL relation for which the values of the CUS_ACCNO attribute in WITHDRAWAL and CUSTOMER are equal, and for which the value of the WITH_AMOUNT attribute is greater than or equal to 200.
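A set comprehension with `any` reads very much like this kind of tuple relational calculus expression: the customer tuple is the free variable, and the withdrawal tuple is bound by the existential quantifier. The sketch below uses the data from Figure 4.29; the namedtuple classes and variable names are hypothetical and chosen only for illustration.

```python
from collections import namedtuple

Customer = namedtuple("Customer", "CUS_ACCNO CUS_LNAME CUS_FNAME CUS_BALANCE")
Withdrawal = namedtuple(
    "Withdrawal", "WITH_TRANS_NO WITH_DATE WITH_AMOUNT CUS_ACCNO BRANCH_NO")

# Data from Figure 4.29.
CUSTOMER = [
    Customer(2465454, "Emerson", "Percy", 1034),
    Customer(1012345, "Adares", "Constance", 1865),
]
WITHDRAWAL = [
    Withdrawal(48887211, "01-Jul-18", 50, 2465454, 125),
    Withdrawal(48867666, "02-Jul-18", 100, 1012345, 333),
    Withdrawal(64446566, "18-Jul-18", 200, 2465454, 125),
    Withdrawal(64443229, "20-Jul-18", 400, 2465454, 231),
]

# All withdrawal tuples with an amount of 200 or more.
large_withdrawals = [w for w in WITHDRAWAL if w.WITH_AMOUNT >= 200]

# Last names of customers for whom there exists a withdrawal tuple w
# with a matching CUS_ACCNO and WITH_AMOUNT >= 200.
last_names = {
    c.CUS_LNAME
    for c in CUSTOMER
    if any(w.CUS_ACCNO == c.CUS_ACCNO and w.WITH_AMOUNT >= 200
           for w in WITHDRAWAL)
}

print(large_withdrawals)
print(last_names)
```

The comprehension states what is wanted and leaves the iteration strategy to Python, which mirrors the non-procedural character of the calculus, although Python still evaluates it as nested loops.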
Example 2
Find all customers having made a deposit from a branch in London.

$\{c \mid c \in \text{CUSTOMER} \land (\exists b)(b \in \text{BRANCH} \land (\exists d)(d \in \text{DEPOSIT} \land (c.\text{CUS\_ACCNO} = d.\text{CUS\_ACCNO}) \land (b.\text{BRANCH\_NO} = d.\text{BRANCH\_NO}) \land (b.\text{BRANCH\_CITY} = \text{'London'})))\}$

The above expression relies on joins existing between the CUSTOMER, DEPOSIT and BRANCH relations. As can be seen from Figure 4.29, the common column between DEPOSIT and CUSTOMER is CUS_ACCNO and the common attribute between DEPOSIT and BRANCH is BRANCH_NO.
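The nested existential quantifiers in Example 2 can be sketched in Python with nested `any` calls, one per bound tuple variable. This is an illustration under stated assumptions: the class and variable names are hypothetical, the data is taken from Figure 4.29, and branch 125 is assumed to be the London branch.

```python
from collections import namedtuple

Customer = namedtuple("Customer", "CUS_ACCNO CUS_LNAME CUS_FNAME CUS_BALANCE")
Branch = namedtuple("Branch", "BRANCH_NO BRANCH_NAME BRANCH_CITY")
Deposit = namedtuple(
    "Deposit", "DEP_TRANS_NO DEP_DATE DEP_AMOUNT CUS_ACCNO BRANCH_NO")

CUSTOMER = [
    Customer(2465454, "Emerson", "Percy", 1034),
    Customer(1012345, "Adares", "Constance", 1865),
]
BRANCH = [
    Branch(125, "Monsuir", "London"),  # city of branch 125 is an assumption
    Branch(333, "FirstStep", "Paris"),
    Branch(231, "Cross_St", "Rome"),
]
DEPOSIT = [
    Deposit(90000034, "30-Jun-18", 1000, 2465454, 125),
    Deposit(90000780, "30-Jun-18", 1400, 1012345, 333),
]

# c is the free variable; b and d are bound by the two quantifiers.
london_depositors = [
    c for c in CUSTOMER
    if any(
        any(c.CUS_ACCNO == d.CUS_ACCNO
            and b.BRANCH_NO == d.BRANCH_NO
            and b.BRANCH_CITY == "London"
            for d in DEPOSIT)
        for b in BRANCH)
]

for c in london_depositors:
    print(c)
```

Because c is the only free variable, the result contains whole CUSTOMER tuples, just as the calculus expression returns c rather than individual attributes.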
NOTE
Safety of Expressions
It is possible to write tuple calculus expressions that generate infinite relations. For example, the expression

$\{t \mid \lnot(t \in R)\}$

results in an infinite number of tuples if the domain of any attribute of relation R is infinite. In order to solve this problem, the set of allowable expressions is restricted to safe expressions. An expression $\{t \mid P(t)\}$ in the tuple relational calculus is classed as safe if every component of t appears in one of the relations, tuples or constants that appear in formula P. For example, consider the following expression:

$\{t \mid \lnot(t \in \text{CUSTOMER})\}$

This expression is NOT safe as it reads: display all tuples that are NOT in the CUSTOMER relation. It is possible to have a customer tuple that does not appear in CUSTOMER.
4.4.3 Domain Relational Calculus

Domain relational calculus is classed as a non-procedural query language that is seen to be equivalent in power to tuple relational calculus. However, domain relational calculus is different from tuple relational calculus in that it uses domain variables that take on values from an attribute domain, rather than values for an entire tuple. A general expression in domain relational calculus is of the form:

$\{\langle x_1, x_2, \ldots, x_n \rangle \mid P(x_1, x_2, \ldots, x_n)\}$

where $x_1, x_2, \ldots, x_n$ represent domain variables and P represents formulae composed of atoms, as was the case in tuple relational calculus.

A formula in domain relational calculus is recursively defined, starting with simple atomic formulas that involve getting tuples from relations and making comparisons of attribute values. Bigger formulae are created using the logical connectives AND, OR and NOT. Formulae are constructed using the following rules:
(i) an atomic formula;
(ii) $\lnot p$, $p \land q$, $p \lor q$, where p and q are formulas;
(iii) $\exists X\,(p(X))$, where X is a domain variable;
(iv) $\forall X\,(p(X))$, where X is a domain variable.
The use of quantifiers $\exists x$ and $\forall x$ in a formula is said to bind x. A variable that is not bound is said to be free. This means that, when writing expressions in domain relational calculus, the variables $x_1, x_2, \ldots, x_n$ must be the only free variables in the formulae $P(x_1, x_2, \ldots, x_n)$.

Let us take a look at some examples using the simple banking database shown in Figures 4.28 and 4.29.

Example 1
Find all customers with a balance greater than 500.

$\{\langle \text{CUS\_ACCNO}, \text{CUS\_LNAME}, \text{CUS\_BALANCE} \rangle \mid (\langle \text{CUS\_ACCNO}, \text{CUS\_LNAME}, \text{CUS\_BALANCE} \rangle \in \text{CUSTOMER}) \land \text{CUS\_BALANCE} > 500\}$

In this formula, the condition $\langle \text{CUS\_ACCNO}, \text{CUS\_LNAME}, \text{CUS\_BALANCE} \rangle \in \text{CUSTOMER}$ ensures that the domain variables CUS_ACCNO, CUS_LNAME and CUS_BALANCE are bound to the fields of the same CUSTOMER tuple. The term $\langle \text{CUS\_ACCNO}, \text{CUS\_LNAME}, \text{CUS\_BALANCE} \rangle$ to the left of | means that every tuple that satisfies CUS_BALANCE > 500 should be included in the result set.

Example 2
Find all customers with a balance greater than 500 and who have deposited money at branch 125.

$\{\langle \text{CUS\_ACCNO}, \text{CUS\_LNAME}, \text{CUS\_BALANCE} \rangle \mid ((\langle \text{CUS\_ACCNO}, \text{CUS\_LNAME}, \text{CUS\_BALANCE} \rangle \in \text{CUSTOMER}) \land \text{CUS\_BALANCE} > 500) \land \exists\, \text{DEPOSIT.BRANCH\_NO}\,((\text{DEPOSIT.BRANCH\_NO} \in \text{DEPOSIT}) \land \text{DEPOSIT.BRANCH\_NO} = \text{CUSTOMER.BRANCH\_NO} \land \text{CUSTOMER.BRANCH\_NO} = 125)\}$

In this example, the existential quantifier $\exists$ has been used to find a tuple in DEPOSIT that joins with the CUSTOMER tuple.

Example 3
List the branches where there have been no deposits.

$\{\langle \text{BRANCH\_NO}, \text{BRANCH\_NAME}, \text{BRANCH\_CITY} \rangle \mid (\langle \text{BRANCH\_NO}, \text{BRANCH\_NAME}, \text{BRANCH\_CITY} \rangle \in \text{BRANCH}) \land \lnot(\exists\, \text{DEPOSIT.BRANCH\_NO})((\text{DEPOSIT.BRANCH\_NO} \in \text{DEPOSIT}) \land (\text{DEPOSIT.BRANCH\_NO} = \text{BRANCH.BRANCH\_NO}))\}$
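Domain variables range over individual attribute values, which in Python corresponds to unpacking each tuple into separate names. The sketch below works through Examples 1 and 3 on the data of Figure 4.29. It is illustrative only: relations are plain tuples, the variable names are hypothetical, and the city of branch 125 is assumed to be London.

```python
# Relations from Figure 4.29 as plain tuples (attribute order as in the figure).
# CUSTOMER: (CUS_ACCNO, CUS_LNAME, CUS_FNAME, CUS_BALANCE)
CUSTOMER = [
    (2465454, "Emerson", "Percy", 1034),
    (1012345, "Adares", "Constance", 1865),
]
# BRANCH: (BRANCH_NO, BRANCH_NAME, BRANCH_CITY)
BRANCH = [
    (125, "Monsuir", "London"),  # city of branch 125 is an assumption
    (333, "FirstStep", "Paris"),
    (231, "Cross_St", "Rome"),
]
# DEPOSIT: (DEP_TRANS_NO, DEP_DATE, DEP_AMOUNT, CUS_ACCNO, BRANCH_NO)
DEPOSIT = [
    (90000034, "30-Jun-18", 1000, 2465454, 125),
    (90000780, "30-Jun-18", 1400, 1012345, 333),
]

# Example 1: one domain variable per attribute; only three are returned.
rich_customers = {
    (acc_no, lname, balance)
    for (acc_no, lname, _fname, balance) in CUSTOMER
    if balance > 500
}

# Example 3: branches for which NO deposit tuple has a matching BRANCH_NO.
branches_without_deposits = {
    (branch_no, name, city)
    for (branch_no, name, city) in BRANCH
    if not any(dep_branch == branch_no
               for (_no, _date, _amount, _acc, dep_branch) in DEPOSIT)
}

print(rich_customers)
print(branches_without_deposits)
```

The negated `any` plays the role of the negated existential quantifier: a branch is kept only when no DEPOSIT tuple refers to it.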
SUMMARY

One of the key components of the relational database model is the definition of formal manipulation languages that allow data to be stored within the database and retrieved in a structured manner.

Relational algebra and relational calculus form the mathematical basis for relational databases. Relational algebra is a collection of operations which act on relations to produce new relations as a result. Relational algebra provides a notation for specifying the required relation in terms of other relations.

Both relational algebra and relational calculus (tuple and domain) are formally equivalent to each other. They provide the basis for formulating real queries, as they are the basis of languages such as SQL.

The relational model supports eight relational operators originally defined by Codd. These are known as SELECT (or RESTRICT), PROJECT, JOIN, PRODUCT, INTERSECT, UNION, DIFFERENCE and DIVIDE. The most commonly used operators are SELECT, PROJECT and JOIN. A summary of these operators is shown in Table 4.1.
User queries can be written as a relational algebraic expression. In order to write such an expression, the following steps should be taken:
• List all the attributes we need to give the answer.
• Select all the relations we need, based on the list of attributes.
• Specify the relational operators and the intermediate results that are needed.
Relational calculus is a formal language based upon a branch of mathematical logic called predicate calculus. Relational calculus allows users to describe what they want, rather than how to compute it. Expressions in tuple relational calculus return the tuples for which a given predicate is true. Domain relational calculus is different from tuple relational calculus as it uses domain variables that take on values from an attribute domain. Relational calculus underpins the Structured Query Language (SQL).

TABLE 4.1 Summary of relational operators
Operator            Symbol   Description
SELECT              σ        Selects a subset of tuples from a relation.
PROJECT             Π        Selects a subset of columns from a relation.
DIFFERENCE          −        Selects tuples in Relation1 but not in Relation2*.
UNION               ∪        Selects tuples in Relation1 or in Relation2, excluding duplicate tuples*.
INTERSECT           ∩        Selects tuples in Relation1 and Relation2*.
CARTESIAN PRODUCT   X        Computes all the possible combinations of tuples.
THETA JOIN          θ        Allows two relations to be combined using one of the comparison operators {=, <, ≤, >, ≥}. When the operator is = the operator is known as an EQUIJOIN.
NATURAL JOIN        |X|      A version of the EQUIJOIN which selects those tuples where Relation1Tuple.Y = Relation2Tuple.Y. Y is a set of common attributes to both relations which must share the same domain. Duplicate columns are removed.
OUTER JOIN                   Based on the θ-JOIN and natural JOIN, the OUTER JOIN in addition selects all the tuples in Relation1 that have no corresponding values in the relation Relation2.
DIVIDE              ÷        Selects tuples in Relation1 that match every row in Relation2.
EXISTENTIAL         ∃        A formula must be true for at least one instance.
UNIVERSAL           ∀        The formula must be true for all instances.
* In the case of these operators, relations must be union-compatible.
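The operators summarised in Table 4.1 can be tried out in a few lines of Python. The following is only a minimal sketch, not the book's notation: it assumes a relation is modelled as a set of rows, so that duplicate tuples disappear automatically, and the relations A, B and DEPT, their attributes and the helper function names are all hypothetical.

```python
from typing import Callable, FrozenSet, Iterable, Tuple

# Assumption: a row is a frozenset of (attribute, value) pairs so it is hashable;
# a relation is a frozenset of rows, which rules out duplicate tuples.
Row = FrozenSet[Tuple[str, object]]
Relation = FrozenSet[Row]


def relation(rows: Iterable[dict]) -> Relation:
    """Build a relation from a list of attribute/value dictionaries."""
    return frozenset(frozenset(r.items()) for r in rows)


def select(r: Relation, predicate: Callable[[dict], bool]) -> Relation:
    """SELECT (RESTRICT): keep the tuples for which the predicate is true."""
    return frozenset(row for row in r if predicate(dict(row)))


def project(r: Relation, attrs: Iterable[str]) -> Relation:
    """PROJECT: keep only the named columns; duplicates collapse."""
    wanted = set(attrs)
    return frozenset(frozenset((a, v) for a, v in row if a in wanted) for row in r)


def natural_join(r1: Relation, r2: Relation) -> Relation:
    """NATURAL JOIN: combine tuples that agree on all common attributes."""
    out = set()
    for a in r1:
        da = dict(a)
        for b in r2:
            db = dict(b)
            if all(da[k] == db[k] for k in da.keys() & db.keys()):
                out.add(frozenset({**da, **db}.items()))
    return frozenset(out)


# Hypothetical sample data; A and B are union-compatible.
A = relation([{"code": 1, "name": "Ann"}, {"code": 2, "name": "Bob"}])
B = relation([{"code": 2, "name": "Bob"}, {"code": 3, "name": "Cy"}])
DEPT = relation([{"code": 1, "dept": "HR"}, {"code": 3, "dept": "IT"}])

union = A | B        # UNION
intersect = A & B    # INTERSECT
difference = A - B   # DIFFERENCE

# Algebra: project name over (select code >= 2 over (A union B)).
names = project(select(A | B, lambda t: t["code"] >= 2), ["name"])
joined = natural_join(A | B, DEPT)

# Calculus flavour: describe the wanted tuples with a predicate
# rather than a sequence of operator applications.
calc = {dict(t)["name"] for t in A | B if dict(t)["code"] >= 2}
```

The set comprehension at the end states what is wanted, which is the spirit of relational calculus, while `project(select(...))` spells out how to compute the same answer step by step.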
KEY TERMS
closure
left outer join
COURSE_RELATION
natural join
SELECT
DIFFERENCE
predicate calculus
set theory
DIVISION
predicate expression
theta join
domain
PROJECT
tuple relational calculus
domain relational calculus
relational algebra
UNION
equijoin
relational algebraic expression
union-compatible
INTERSECT
RESTRICT
join column(s)
safe expression
right outer join
FURTHER READING
Codd, E.F. A Relational Model of Data for Large Shared Data Banks. CACM 13, No. 6, June 1970. Republished in Milestones of Research: Selected Papers 1958-1982. CACM 25th Anniversary Issue, CACM 26, No. 1, January 1983.
Codd, E.F. Relational completeness of Data Base Sublanguages. Database Systems: pp. 65-98, Prentice Hall and IBM Research Report RJ 987, San Jose, California, 1972.
Date, C. J. An Introduction to Database Systems, 8th edition. Addison-Wesley, 2004.
Dietrich, S. Understanding Relational Database Query Languages, 1st edition. Prentice Hall, 2001.
Hrbacek, K. and Jech, T. Introduction to Set Theory, 3rd edition. Marcel Dekker, Inc., 1999.
Lacroix, M. and Pirotte, A. Domain-Oriented Relational Languages. Proceedings of the 3rd International Conference on Very Large Databases, pp. 370-378, 1977.
Venn, J. On the Diagrammatic and Mechanical Representation of Propositions and Reasonings. Dublin Philosophical Magazine and Journal of Science 9(59): 1-18, 1880.
Online Content
Answers to selected Review Questions and Problems for this chapter are contained on the online platform accompanying this book.

REVIEW QUESTIONS
1 What are the main operations of relational algebra?
2 What is the Cartesian product? Illustrate your answer with an example.
3 What is the difference between PROJECTION and SELECTION?
4 Explain the difference between the natural join and the outer join.
5 What is the difference between tuple relational calculus and domain relational calculus?
6 Use the small database shown in Figure Q4.1 to illustrate the difference between a natural join, an equijoin and an outer join.
FIGURE Q4.1 The Ch04_UniversityQue database tables
Database name: Ch04_UniversityQue

Table name: STUDENT
STU_CODE   LECT_CODE
100278
128569     2
512272     4
531235     2
531268
553427     1

Table name: LECTURER
LECT_CODE   DEPT_CODE
1           2
2           6
3           6
4           4
Online Content
All of the databases used in the questions and problems are found on the online platform for this book. The database names used in the folder match the database names used in the figures. For example, the source of the tables shown in Figure Q4.1 is the 'Ch04_UniversityQue' database.

7 Using the relations shown in Figure Q4.2, compute the following relational algebra expressions:
a TOUR_UK TOUR_EUROPE
b TOUR_UK BOOKING
c TOUR_UK TOUR_EUROPE
d TOUR_UK TOUR_EUROPE
e TOUR_EUROPE TOUR_UK
f TOUR_UK X TOUR_EUROPE
g σ price_band = 'P2' (TREK_UK)
h Π tour_name, price_band (TREK_EUROPE)
i Π tour_name (σ price_band = 'P2' (TREK_UK))
j TREK_UK |X| BOOKING
k TREK_EUROPE |X| BOOKING
l BOOKING TREK_EUROPE
m Π tour_name, price_band (σ tour_no = 'A1' or tour_no = 'A2' (TREK_UK |X| TREK_EUROPE))
8 Using the relations shown in Figure Q4.2, compute the following tuple relational calculus expressions:
a Find all bookings with a rating of S6.
b List the tour names offered by TREK_UK and TREK_EUROPE.
9 Using the relations shown in Figure Q4.2, compute the following domain relational calculus expressions:
a Find all bookings with a rating of S7.
b List the tours from TREK_UK that have not yet been booked.

FIGURE Q4.2 The Ch04_Tours database tables
Database name: Ch04_Tours

Table name: TREK_UK
TOUR_NO   TOUR_NAME      PRICE_BAND
A1        TREK PERU      P2
A2        TREK ANDES     P2
A3        TREK EVEREST   P3
A4        TREK K2        P5

Table name: TREK_EUROPE
TOUR_NO   TOUR_NAME      PRICE_BAND
A3        TREK EVEREST   P3
A1        TREK K2        P4
A2        TREK ALPS      P9

Table name: BOOKING
TOUR_NAME    CUSTOMER_NO   RATING
TREK ANDES   C2            S5
TREK K2      C3            S6
TREK K2      C4            S7

Use Figure Q4.3 to answer Questions 10-14.

FIGURE Q4.3 The Ch04_Vending database tables
Database name: Ch03_VendingCo
Table name: BOOTH
Table name: MACHINE

10 Write the relational algebra formula to apply a UNION relational operator to the tables shown in Figure Q4.3.
11 Create the table that results from applying a UNION relational operator to the tables shown in Figure Q4.3.
12 Write the relational algebra formula to apply an INTERSECT relational operator to the tables shown in Figure Q4.3.
13 Create the table that results from applying an INTERSECT relational operator to the tables shown in Figure Q4.3.
14 Using the tables in Figure Q4.3, create the table that results from MACHINE DIFFERENCE BOOTH.
PROBLEMS
The four relations shown in Figure P4.1 represent tables in a database which contains information about customers' eating habits. The database tables store information about customers and the types of restaurants that they frequently visit. In addition, for each restaurant, the type of cuisine that is served is recorded.
Use the relations shown in Figure P4.1 to write relational algebraic expressions for the following queries in Problems 1-5:
1 Display all information about restaurants where the restaurant price is equal to .
2 Find all the customers who frequently visit McDonalds.
3 List the names of all restaurants where it is possible to have fine dining.
4 Show the names of all customers who went to Claridges before 10 January 2008 or have spent more than 250 on the last bill.
5 Find the names and phone numbers of all customers who have visited fast food restaurants more than 40 times.
Use the relations shown in Figure P4.1 to compute the following relational algebra expressions:
6 RESTAURANT X CUSINE
7 CUSTOMER |X| VISIT
8 CUSTOMER |X| VISIT |X| RESTAURANT
Hint: When trying to solve this problem, see how you can use your answer from Problem 7.
9 RESTAURANT VISIT
10 Π cus_lname (σ rest_name = 'MacDonalds' (CUSTOMER |X| VISIT))
11 Π rest_name, last_bill_amount (VISIT) |X| (σ cus_lname = 'Dunnes' (CUSTOMER))

FIGURE P4.1 The Ch04_Restaurant_Guide database tables
Database name: Ch04_Restaurant_Guide

Table name: CUSINE
TYPE            CATEGORY
American        FAST FOOD
French          FINE DINING
Chinese         BUFFET
South African   FINE DINING

Table name: CUSTOMER
CUS_CODE   CUS_LNAME   CUS_PHONE
10010      Ramas       844-2573
10011      Dunne       894-1238
10012      Smith       894-2285

Table name: RESTAURANT
REST_NAME     REST_LOCATION   REST_PRICE   REST_TYPE
McDonalds     The Hague                    American
Claridges     London                       French
Pompidou      Paris                        French
The Islands   Cape Town                    South African
Frankies      Milan                        American

Table name: VISIT
CUS_CODE   REST_NAME     NO_TIMES_VISITED   DATE_LAST_VISITED   LAST_BILL_AMOUNT
10010      The Islands   10                 02/01/2018          146.78
10011      McDonalds     87                 30/12/2017          7.98
10011      Claridges     1                  01/01/2018          520.22
10012      Pompidou      5                  03/01/2017          68.75
10012      McDonalds     32                 04/01/2018          12.75

Figure P4.2 shows a set of database relations that store information about student assessments at Tiny University. Use the relations shown in Figure P4.2 to write relational algebraic expressions for the following queries in Problems 12-20:
12 STUDENT-1 STUDENT-2
13 STUDENT-1 STUDENT-2
14 STUDENT-1 STUDENT-2
15 STUDENT-1 |X| ASSESSMENT
16 STUDENT-2 ASSESSMENTS
17 Π stu_lname (STUDENT-1 |X| (σ exam_mark > 60 (ASSESSMENT)))
18 Π class_name (CLASS) |X| ((ASSESSMENT) |X| (σ stu_lname = 'Vos' (STUDENT-1)))
19 Write a relational algebraic expression to find out the names of all students in STUDENT-1 who scored less than 60 in the Java_Prog exam.
20 To obtain a merit in a class, students must achieve 65 or over in both coursework and exam marks. Write a relational algebraic expression to show the names and numbers of all students in STUDENT-1 who have achieved a merit in their classes.

FIGURE P4.2 The Ch04_Student_Assess database tables
Database name: Ch04_Student_Assess

Table name: STUDENT-1
STU_NUM   STU_LNAME   CRS_CODE
321452    Vos         Comp-600
324257    Smith       Eng-534
324258    Oblonski    Comp-600

Table name: STUDENT-2
STU_NUM   STU_LNAME   CRS_CODE
324258    Oblonski    Comp-600
324787    Swithety    Comp-600

Table name: CLASS
CLASS_CODE   CLASS_NAME
12           Databases
43           Info_Sys
46           Java_Prog

Table name: ASSESSMENT
STU_NUM   CLASS_CODE   EXAM_MARK   COURSE_WORK_MARK
321452    12           60          70
321452    46           50          60
324258    46           65          65
324457    43           0           70
21 Use the following relational schema to write relational algebra expressions for the following queries:
a Show the names of all authors who have published books after 1st January 2019.
b List the ISBNs of all books in stock.
c Show all the stores in Belgium.
d Find the ISBN of all stores that carry a non-zero quantity of every book in the BOOK relation.
e Find the name and address of all stores that do not carry any books by Cornell.
Relational schema
BOOK(ISBN, Author_name, Title, Publisher, Publish Date, Pages, Notes)
STORE(Store_No, Store_Name, Street, Country, Postcode)
STOCK(ISBN, Store_No, Price, Quantity)
22 Use the following relational schema to write relational algebra expressions for the following queries:
a Show the Reservation_No and Total_cost of all flights that were paid before 21 December 2020.
b List the last name of passengers travelling on flight number VO345.
c Find the efficiency ratings of all planes, including in your answer the airline name for each plane.
d List the Passport_No of passengers sitting in seats 36C, 38F and 42D on Flight_No V0667.
Relational schema
PASSENGER(Passenger_ID, Passenger_firstname, Passenger_lastname, Passport_No, Date_of_Birth)
FLIGHT(Flight_No, Airline_Name, Plane_Type)
RESERVATION(Reservation_No, passenger_ID, Flight_No, Seat_No, Flight_date, Date_paid, Total_Cost)
PLANE(Plane_type, Traveller_Capacity, Efficiency_Rating)
23 Repeat Problem 7, but compute the expressions using domain relational calculus.
24 Using the relations shown in Figure P4.2, compute the following tuple relational calculus expressions:
a Find the names of all students who are studying Comp-600.
b List all students whose course work and exam marks are both greater than 50.
c List all students who are studying the class Java_Prog and have taken the assessment.
PART II
DESIGN CONCEPTS
5 Data Modelling with Entity Relationship Diagrams
6 Data Modelling Advanced Concepts
7 Normalising Database Designs
BUSINESS VIGNETTE
USING DATA TO IMPROVE THE LIVES OF CHILDREN AND WOMEN
Over the past 20 years, UNICEF¹ has assisted charities, organisations and governments by providing data, analytics and insights to help improve the welfare of children and women worldwide. It is harnessing the Big Data available to it to make data work for children. In 2017, UNICEF released the Data for Children Strategic Framework, which has allowed it to expand its commitments in three areas that are essential for good data work: coordination, strategic planning and knowledge sharing.² UNICEF currently holds data assets that have been generated from household surveys, global data advocacy and data provided by individual countries; the framework provides an opportunity to build a new data landscape to work within the data governance frameworks of individual countries and provide a gateway to reliable and open data and analysis on the situation of children and women worldwide.² UNICEF Data and Analytics teams work to ensure that the data collected is statistically sound by using Multiple Indicator Cluster Surveys (MICS).³ Global databases are used to track children and women, and new methodologies and monitoring tools have been designed to enable successful data gathering on issues such as low birth weight, education and child labour. UNICEF harnesses the power of a modern data warehouse to make data more accessible through interoperability, and data visualisation is achieved through the use of interactive maps and graphs. The ultimate aim is to put data into action.

1 UNICEF, available: https://data.unicef.org/about-us/
2 Data for Children Strategic Framework, available: https://data.unicef.org/resources/data-children-strategic-framework/
3 Multiple Indicator Cluster Surveys, available: http://mics.unicef.org/
One example
project
Uganda through project
the
was to
impact
tackle
to the
has been concerned
use of near real-time the
scorecards,
allowing
health
of care.4 This, coupled including
SMS
Action
Ministry
of
of the
are
successfully
awareness
Copyright Editorial
review
2020 has
Cengage
to
near
Learning. that
any
to
quality,
be
Swaziland
and report
using
and impact
collected
the
colour-coded of their
through
and the
and
The aim of the
areas
collection
reach
measured
Kenya,
community.4
in rural
data
feedback
provide
delivery
many sources,
ability for the community
real-time
data
and
children
The
to raise tip
there
for
feedback
education
virus
in
2013
data received schools
built
to
from
or the impact
Education
Management
an
adaptive
EMIS,
and periods
when
Framework
has
Strategic
has
75 counties
of the
data
Lebanon.4
together
history
Children
Zika
data
and
to
no
MEHE
virus
analytics
severe
South
up to
was used to and
analyse
data
maternal,
social
prevention of
mining
newborn
and
To raise
media
a data-informed
the impact
and
distress
America.
develop
provide
demonstrate
support
caused in
UNICEF teamed
awareness
of the iceberg
community
for
at least
and
came
in
use in 355 schools.4
Data
in
refugees
of the
was and
attendance,
anonymised
data
needs
UNICEF
Brazil, the
Facebook
Brazil.
the
In
child
UNICEF
the
time,
2016,
UNICEFs
and robust
and
At this
to
and poor-quality
determine
a childs
of children.
within
clean
and
Today, this system is in where
are just
education
(MEHE)
During
to track
campaign
studies
provide
to
delivered.
measures,
Zika
to
Education
Lebanon.
women
of prevention
case
Using
deemed
of
communications
available:
being in
the lives
about
UNICEF
was impossible
examples
well-being
conversations
4
the
community
impact
services
mobile
but due to the inadequate
werent in school. many
the
it
(EMIS)
improved
databases
health
enabled
understand
Higher
a way for schools
were and
These
has
time
enabled and
services
System
which provided
public
to
to children,
educational
affected
facilities
has enabled
Education
of schools,
There
Action
also
number
Information
they
has
free education
alimited
child health in from the
solutions.
Data for
provide
monitoring
of decentralised
Data for
with near-real
messaging,
to recommend
The
problems
communities.
with
data and feedback
strategies.
well architected
purposes.
and
child
health
in
East
Africa,
https://data.unicef.org/wp-content/uploads/2018/01/From-Insight-to-Action-November-2017.pdf
CHAPTER 5
Data Modelling with Entity Relationship Diagrams

IN THIS CHAPTER, YOU WILL LEARN:
The main characteristics of entity relationship components
How relationships between entities are defined and refined, and how those relationships are incorporated into the database design process
How ERD components affect database design and implementation
That real-world database design often requires the reconciliation of conflicting goals

PREVIEW
This chapter expands coverage of the data modelling aspect of database design. Data modelling is the first step in the database design journey, serving as a bridge between real-world objects and the database model that is implemented in the computer. Therefore, the importance of data modelling details, expressed graphically through entity relationship diagrams (ERDs), cannot be overstated.
Most of the basic concepts and definitions used in the entity relationship model (ERM) were introduced in Chapter 2, Data Models. For example, the basic components of entities and relationships and their representation should now be familiar to you. This chapter goes much deeper and broader, analysing the graphic depiction of relationships among the entities and showing how those depictions help you summarise the wealth of data required to implement a successful design.
Throughout this chapter, two case studies will be used to illustrate the different types of relationships amongst entities. One case study is based on an international travel company called ILoveHolidays, which owns a number of travel agents around the world. The second case study, known as Tiny University, is based on the internal structure of a university.
Finally, the chapter illustrates how conflicting goals can be a challenge in database design, possibly requiring you to make design compromises.
NOTE
As this book generally focuses on the relational model, you might be tempted to conclude that the ERM is exclusively a relational tool. Actually, conceptual models such as the ERM can be used to understand and design the data requirements of an organisation. Therefore, the ERM is independent of the database type. Conceptual models are used in the conceptual design of databases, while relational models are used in the logical design of databases. However, since you are now familiar with the relational model from Chapter 3, the relational model is used extensively in this chapter to explain the ER constructs and the way they are used to develop database designs.
5.1 THE ENTITY RELATIONSHIP (ER) MODEL
You should remember from Chapter 2, Data Models, and Chapter 3, Relational Model Characteristics, that the ERM forms the basis of an ERD. The ERD represents the conceptual database as viewed by the end user. ERDs depict the database's main components: entities, attributes and relationships. Because an entity represents a real-world object, the words entity and object are often used interchangeably. Thus, the entities (objects) of the ILoveHolidays database design developed in this chapter include customers, employees, bookings, hotels and flights. The order in which the ERD components are covered in the chapter is dictated by the way the modelling tools are used to develop ERDs that can form the basis for successful database design and implementation.
Let's start by introducing a simple ERD in Figure 5.1, which has been created to model information about the recipes and their ingredients within a cookery book. By looking more closely at the ERD, we can find out some basic information about the database design. In Figure 5.1, BOOK, RECIPE, RECIPE_INGREDIENT and INGREDIENT are entities. The relationships that exist between these entities would be identified as:
• a BOOK can contain at least one RECIPE, but may contain many RECIPEs
• a RECIPE requires at least one RECIPE_INGREDIENT, but may have many RECIPE_INGREDIENTs
• one INGREDIENT can be found in a number of RECIPE_INGREDIENTs but may not appear in any RECIPE_INGREDIENT.
You can also see in Figure 5.1 that each entity has a number of attributes. For example, BOOK contains an attribute called ISBN, which has the notation {PK} next to it. {PK} is used to denote an attribute that is the PRIMARY KEY of an entity, which identifies each instance of that entity. In this example, a book's ISBN is used to identify each different book uniquely. Likewise, FK is used to denote a FOREIGN KEY. We will use examples such as the one shown in Figure 5.1 to illustrate all the concepts of entity relationship modelling that you will learn in this chapter.

FIGURE 5.1 A recipe ERD
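The "at least one" and "may not appear in any" wording used for the recipe ERD can be checked mechanically against sample data. The sketch below is an illustration only: apart from the entity names and ISBN, every attribute name (RECIPE_ID, INGREDIENT_ID) and every value is a hypothetical stand-in, since the attribute lists of Figure 5.1 are not reproduced here.

```python
from collections import Counter

# Hypothetical instances of the four entities; only the entity names and
# the ISBN attribute come from the text, everything else is assumed.
books = [{"ISBN": "111"}, {"ISBN": "222"}]
recipes = [{"RECIPE_ID": 1, "ISBN": "111"}, {"RECIPE_ID": 2, "ISBN": "111"}]
recipe_ingredients = [{"RECIPE_ID": 1, "INGREDIENT_ID": 10}]
ingredients = [{"INGREDIENT_ID": 10}, {"INGREDIENT_ID": 11}]


def missing_children(parents, children, key):
    """Return the parent key values that have no matching child row."""
    used = Counter(child[key] for child in children)
    return [parent[key] for parent in parents if used[parent[key]] == 0]


# "A BOOK can contain at least one RECIPE": any ISBN listed here breaks the rule.
books_without_recipes = missing_children(books, recipes, "ISBN")

# "A RECIPE requires at least one RECIPE_INGREDIENT": same check one level down.
recipes_without_ingredients = missing_children(recipes, recipe_ingredients, "RECIPE_ID")

# An INGREDIENT may appear in no RECIPE_INGREDIENT at all, so a non-empty
# result here is allowed by the model rather than an error.
unused_ingredients = missing_children(ingredients, recipe_ingredients, "INGREDIENT_ID")
```

With this sample data the first two lists are non-empty, which flags rows that violate the mandatory side of a relationship, while the third list simply reports optional participation.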
In Chapter 2, you learnt about the different notations used within ERDs, including the traditional Crow's Foot notation and the more contemporary UML notation. Within this chapter, we will continue to use UML notation to model ERDs, using relational concepts and terminology.

Online Content
For a more detailed description of the Chen, Crow's Foot and other ER model notation systems, see Appendix E, Comparison of ER Modelling Notations, available on the online platform for this book.

5.1.1 Entities
An entity is an object of interest to the end user. In Chapter 2, you learnt that, at the ER modelling level, an entity actually refers to the entity set and not to a single entity occurrence. In other words, the word entity in the ERM corresponds to a table and not to a row in the relational environment. The ERM refers to a specific table row as an entity instance or entity occurrence. In UML notation, an entity is represented by a box that is subdivided into three parts. The top part is used to name the entity. The entity name, a noun, is usually written in capital letters. The middle part is used to name and describe the attributes. The bottom part is used to list the methods. Methods are only used when designing object-relational or object-orientated database models and therefore will be left blank in the examples within this book.
note One component
database some
of this
However, an entity
of
UML is the
modelling. class
diagram
it is important is referred
The
UML
UML
be shown
aware
ERDs
standards
to
you
similar
to the function
book for
modelling
will be described that
in
UML
are
For
see in this reflected
However,
another,
formats. in
but it
are
capabilities.
vendor
presentation
which is
in this
the
of the
entities
using relational terminology
is
ER diagram
in relational
and their relationships, terminology
different.
uses
and concepts.
For
example,
in
UML,
as a class.
These
software
you
diagram
modelling
diagram,
adopted
notation,
that
to
class
standards.
class
The notation
in
any
although
most of the
example,
chapter
the
to the
commercial
the software
software entity
adhere
that
name
database
details
generates
may be
generally
accepted modelling
UML
modelling
software
that
has
do not vary significantly
from
one
such
ERDs lets
boldfaced
and the
you select
entity
name
various box
may
colour.
5.1.2 Attributes
Attributes are characteristics of entities. For example, the TRAVEL_AGENT entity includes the attributes AGENT_ID, AGENT_NAME and AGENT_ADDRESS. In the UML model, the attributes are written in the attribute box below the entity rectangle (see Figure 5.2).
As you examine Figure 5.2, note that AGENT_ID, AGENT_NAME and AGENT_ADDRESS will require data entries, because of the assumption that all travel agents have an ID, name and address. However, if the travel agent has just been established, it might not have a phone number, email address and manager yet.

FIGURE 5.2 The attributes of the TRAVEL_AGENT entity
Travel_Agent
AGENT_ID {PK}
AGENT_NAME
AGENT_ADDRESS
AGENT_PHONE
AGENT_EMAIL
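The difference between attributes that require a data entry and those that may be missing can be sketched in code. The class below is a hypothetical illustration, not part of the book's design: the Python field names mirror the attributes in Figure 5.2, while the data types and the sample values are assumptions.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class TravelAgent:
    # Required attributes: every travel agent is assumed to have these.
    agent_id: int                       # AGENT_ID {PK}; the int type is an assumption
    agent_name: str                     # AGENT_NAME
    agent_address: str                  # AGENT_ADDRESS
    # Optional attributes: a newly established agent may not have them yet.
    agent_phone: Optional[str] = None   # AGENT_PHONE
    agent_email: Optional[str] = None   # AGENT_EMAIL

    def __post_init__(self) -> None:
        # Reject missing or empty values for the required attributes.
        for attr in ("agent_id", "agent_name", "agent_address"):
            value = getattr(self, attr)
            if value is None or value == "":
                raise ValueError(f"{attr} requires a data entry")


# A newly established agent with no phone number or email address yet.
new_agent = TravelAgent(1, "Sun Tours", "1 High Street")
```

In a relational implementation the same idea would typically be expressed by declaring the required columns as not accepting nulls and leaving the optional ones nullable.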
Online Content
Microsoft Visio Professional was used to generate both the Crow's Foot ERDs and UML class diagrams in this and subsequent chapters. Appendix A, Designing Databases with Visio Professional: A Tutorial, available on the accompanying online platform, shows you how to create ERD models like the ones in this chapter.
Domains
Attributes have a domain. As you learnt in Chapter 3, a domain is the attribute's set of possible values. For example, the domain for the (numeric) attribute grade point average (GPA) is written (0,4) because the lowest possible GPA value is 0 and the highest possible value is 4. The domain for the (character) attribute GENDER consists of only two possibilities: M or F (or some other equivalent code). The domain for a company's date of hire attribute consists of all dates that fit in a range (for example, company startup date to current date).
Attributes may share a domain. For instance, an employee of a travel agency may also be a customer of the travel agency and share the same domain of all possible addresses. In fact, the data dictionary may let a newly declared attribute inherit the characteristics of an existing attribute if the same attribute name is used. For example, the TRAVEL_AGENT and EMPLOYEE entities may each have an attribute named ADDRESS.

Identifiers (Primary Keys)
The ERM uses identifiers to uniquely identify each entity instance. In the relational model, such identifiers are mapped to primary keys in tables. Identifiers are underlined in the ERD. Key attributes are also underlined when writing the relational schema, using the notation introduced in Chapter 3.
TABLE NAME (KEY_ATTRIBUTE 1, ATTRIBUTE 2, ATTRIBUTE 3, ... ATTRIBUTE K)
For example, a CAR entity may be represented by:
CAR(CAR_REG, MOD_CODE, CAR_YEAR, CAR_COLOUR)
(REG is the standard acronym for vehicle registration number.)
170
PARt
II
Design
Concepts
Composite Primary Keys
Ideally, a primary key is composed of only a single attribute. For example, the table in Figure 5.3 uses a single-attribute primary key named PAYMENT_NO. However, it is possible to use a composite key, that is, a primary key composed of more than one attribute. For instance, the ILoveHolidays database administrator may decide to identify each PAYMENT entity instance (occurrence) by using a composite primary key composed of the combination of CUST_NO and INVOICE_NO instead of using PAYMENT_NO. Either approach uniquely identifies each entity instance. Given the current structure of the PAYMENT table shown in Figure 5.3, PAYMENT_NO is the primary key, and the combination of CUST_NO and INVOICE_NO is a proper candidate key. If the PAYMENT_NO attribute is deleted from the PAYMENT entity, the candidate key (CUST_NO and INVOICE_NO) becomes an acceptable composite primary key.

FIGURE 5.3 The PAYMENT (entity) components and contents
PAYMENT_NO  INVOICE_NO  CUST_NO  AMOUNT_PAID  PAYMENT_TYPE      DATE_PAID
152675687   631304      152001   500          VISA              03-Apr-19
152342111   631304      152002   500          VISA              03-May-18
152887222   631304      152003   1000         VISA              03-June-19
152228445   712344      152010   350          American Express  24-May-19
152987877   712344      152011   550          VISA              01-Jul-19
152344223   901234      152132   2000         MasterCard        06-Jun-19
152334534   091234      152167   4329         MasterCard        02-Aug-19

If the PAYMENT_NO in Figure 5.3 is used as the primary key, the PAYMENT entity may be represented in shorthand form by:
PAYMENT(PAYMENT_NO, CUST_NO, INVOICE_NO, AMOUNT_PAID, PAYMENT_TYPE, DATE_PAID)
On the other hand, if PAYMENT_NO is deleted and the composite primary key is the combination of CUST_NO and INVOICE_NO, the PAYMENT entity may be represented by:
PAYMENT(CUST_NO, INVOICE_NO, AMOUNT_PAID, PAYMENT_TYPE, DATE_PAID)
Note that both key attributes are underlined in the entity notation.
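The composite primary key version of PAYMENT can be tried out with a minimal sketch. It assumes Python's built-in sqlite3 module and an in-memory database, neither of which is used in the text; the column types and the helper name try_insert are illustrative assumptions, and the two sample rows are copied from Figure 5.3.

```python
import sqlite3

# In-memory database; SQLite is an assumption made for this sketch only.
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE PAYMENT (
        CUST_NO      INTEGER NOT NULL,
        INVOICE_NO   INTEGER NOT NULL,
        AMOUNT_PAID  NUMERIC,
        PAYMENT_TYPE TEXT,
        DATE_PAID    TEXT,
        PRIMARY KEY (CUST_NO, INVOICE_NO)  -- composite primary key
    )
""")

def try_insert(row):
    """Hypothetical helper: return True if the row is accepted, False if a key rule rejects it."""
    try:
        conn.execute("INSERT INTO PAYMENT VALUES (?, ?, ?, ?, ?)", row)
        return True
    except sqlite3.IntegrityError:
        return False

# Two rows from Figure 5.3: same INVOICE_NO, different CUST_NO, so both are accepted.
first_accepted = try_insert((152001, 631304, 500, "VISA", "03-Apr-19"))
second_accepted = try_insert((152002, 631304, 500, "VISA", "03-May-18"))

# Repeating an existing (CUST_NO, INVOICE_NO) combination violates the composite key.
duplicate_accepted = try_insert((152001, 631304, 750, "VISA", "04-Apr-19"))
row_count = conn.execute("SELECT COUNT(*) FROM PAYMENT").fetchone()[0]
```

Neither column has to be unique on its own; only the combination must be, which is what makes the pair a candidate key.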
Composite and Simple Attributes
Attributes are classified as simple or composite. A composite attribute, not to be confused with a composite key, is an attribute that can be further subdivided to yield additional attributes. For example, the attribute ADDRESS can be subdivided into street, city, state and postal code. Similarly, the attribute PHONE_NUMBER can be subdivided into area code and exchange number. A simple attribute is an attribute that cannot be subdivided. For example, age, gender and marital status would be classified as simple attributes. To facilitate detailed queries, it is usually appropriate to change composite attributes into a series of simple attributes.

Single-valued Attributes
A single-valued attribute is an attribute that can have only a single value. For example, a person can have only one ID number and a manufactured part can have only one serial number. Keep in mind that a single-valued attribute is not necessarily a simple attribute. For instance, a part's serial number, such as SE-08-02-189935, is single-valued, but it is a composite attribute because it can be subdivided into the region in which the part was produced (SE), the plant within that region (08), the shift within the plant (02) and the part number (189935).

Multivalued Attributes
Multivalued attributes are attributes that can have many values. For instance, a person may have several university degrees or a household may have several different phones, each with its own number. Similarly, a car's colour may be subdivided into many colours (that is, colours for the roof, body and trim).
The ERD in Figure 5.4 contains all of the components introduced thus far. In the UML notation there is no support for primary keys. However, the notation {PK} can be added after the attribute(s) within the entity so that the primary key can be easily determined.

FIGURE 5.4 The multivalued attribute in an entity

Resolving Multivalued Attribute Problems
Although the conceptual model can handle *:* relationships and multivalued attributes, you should not implement them in the RDBMS. Remember from Chapter 3, Relational Model Characteristics, that in the relational table, each column/row intersection represents a single data value. So if multivalued attributes exist, the designer must decide on one of two possible courses of action:
1 Within the original entity, create several new attributes, one for each of the original multivalued attribute's components. For example, the CAR entity's attribute CAR_COLOUR can be split to create the new attributes CAR_TOPCOLOUR, CAR_BODYCOLOUR and CAR_TRIMCOLOUR, which are then assigned to the CAR entity, as shown in Figure 5.5.

FIGURE 5.5 Splitting the multivalued attribute into new attributes
Although this solution seems to work, its adoption can lead to major structural problems in the table. For example, if additional colour components, such as a logo colour, are added for some cars, the table structure must be modified to accommodate the new colour section. In that case, cars that do not have such colour sections generate nulls for the non-existing components, or their colour entries for those sections are entered as N/A to indicate not applicable. (Imagine how the solution in Figure 5.5 would cause problems when it is applied to an employee entity containing employee degrees and certifications. If some employees have ten degrees and certifications while most have fewer or none, the number of degree/certification attributes would be ten and most of those attribute values would be null for most of the employees.) In short, although you have seen solution 1 applied, it is not an acceptable solution.
2 Create a new entity composed of the original multivalued attribute's components. (See Figure 5.6.) The new (independent) CAR_COLOUR entity is then related to the original CAR entity in a 1:* relationship. Note that such a change allows the designer to define colour for different sections of the car (see Table 5.1).

FIGURE 5.6 A new entity set composed of a multivalued attribute's components

TABLE 5.1 Components of the multivalued attribute
Section   Colour
Top       White
Body      Blue
Trim      Gold
Interior  Blue

Using the approach illustrated in Table 5.1, you even get a fringe benefit: you are now able to assign as many colours as necessary without having to change the table structure. Note that the ERMs in Figure 5.5 (a) and (b) reflect the components listed in Table 5.1. This is the preferred way to deal with multivalued attributes. Creating a new entity in a 1:* relationship with the original entity yields several benefits: it is a more flexible, expandable solution, and it is compatible with the relational model!

Derived Attributes
Finally, an attribute may be classified as a derived attribute. A derived attribute is an attribute whose value is calculated (derived) from other attributes. The derived attribute need not be physically stored within the database; instead, it can be derived by using an algorithm. For example, an employee's age, EMP_AGE, may be found by computing the integer value of the difference between the current date and the EMP_DOB. If you use Microsoft Access, you would use INT((DATE() - EMP_DOB)/365). If you use Oracle, you would use SYSDATE instead of DATE(). (You are assuming, of course, that the EMP_DOB was stored in the Julian date format.) Similarly, the total cost of an order line can be derived by multiplying the quantity ordered by the unit price. Or the estimated average speed can be derived by dividing trip distance by the time spent en route. In UML, derived attributes are prefixed with a /, which can be seen on the attribute EMP_AGE in Figure 5.7.

FIGURE 5.7 Depiction of a derived attribute

Derived attributes are sometimes referred to as computed attributes. A derived attribute computation could be as simple as adding two attribute values located on the same row, or it could be the result of aggregating the sum of values located on many table rows (from the same table or from a different table). The decision to store derived attributes in database tables depends on the processing requirements and the constraints placed on a particular application. The designer should be able to balance the design in accordance with such constraints. Table 5.2 shows the advantages and disadvantages of storing (or not storing) derived attributes in the database.

TABLE 5.2 Advantages and disadvantages of storing derived attributes
Derived attribute: Stored
  Advantage: Saves CPU processing cycles. Data value is readily available. Can be used to keep track of historical data.
  Disadvantage: Requires constant maintenance to ensure derived value is current, especially if any values used in the calculation change.
Derived attribute: Not stored
  Advantage: Saves storage space. Computation always yields current value.
  Disadvantage: Uses CPU processing cycles. Adds coding complexity to queries.
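As a rough illustration of a derived attribute that is computed rather than stored, the following Python sketch derives EMP_AGE from EMP_DOB in two ways. The function names are hypothetical. The first function mirrors the days-divided-by-365 arithmetic quoted above; the second is a birthday-aware alternative added here for comparison and is not taken from the text.

```python
from datetime import date

def emp_age_days_div_365(emp_dob: date, today: date) -> int:
    """Integer part of (today - EMP_DOB) / 365, as in the formula quoted above."""
    return (today - emp_dob).days // 365

def emp_age_by_birthday(emp_dob: date, today: date) -> int:
    """Whole years completed, compared on (month, day) so leap days do not accumulate."""
    years = today.year - emp_dob.year
    if (today.month, today.day) < (emp_dob.month, emp_dob.day):
        years -= 1  # this year's birthday has not been reached yet
    return years

# Hypothetical employee, evaluated the day before a 30th birthday.
dob = date(1990, 6, 15)
day_before_birthday = date(2020, 6, 14)

approx_age = emp_age_days_div_365(dob, day_before_birthday)  # leap days push this to 30
exact_age = emp_age_by_birthday(dob, day_before_birthday)    # still 29
```

Because the value is recomputed on every call, it always reflects the current date. That is the "computation always yields current value" advantage from Table 5.2, paid for with processing cycles and a little extra coding.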
5.1.3 Relationships
A relationship is an association between entities. The entities that participate in a relationship are also known as participants. You should recall from Chapter 2, Data Models, that each relationship is identified by a name that is descriptive of the relationship. The relationship name is an active or passive verb; for example, a STUDENT takes a CLASS, a LECTURER teaches a CLASS, a DEPARTMENT employs a LECTURER, a DIVISION is managed by an EMPLOYEE, a CUSTOMER makes a BOOKING and an AIRCRAFT is flown by a CREW.
Relationships between entities always operate in both directions. That is, to define the relationship between the entities named CUSTOMER and INVOICE, you would specify that:
A CUSTOMER may generate many INVOICEs.
Each INVOICE is generated by one CUSTOMER.
Because you know both directions of the relationship between CUSTOMER and INVOICE, it is easy to see that this relationship can be classified as 1:*.
The relationship classification is difficult to establish if you know only one side of the relationship. For example, if you specify that:
A DIVISION is managed by one EMPLOYEE.
you don't know if the relationship is 1:1 or 1:*. Therefore, you should ask the question 'Can an employee manage more than one division?' If the answer is yes, the relationship is 1:*, and the second part of the relationship is then written as:
An EMPLOYEE may manage many DIVISIONs.
If an employee cannot manage more than one division, the relationship is 1:1, and the second part of the relationship is then written as:
An EMPLOYEE may manage only one DIVISION.
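The two-direction test can be sketched in Python. The function name classify and the sample identifiers (E1, D1 and so on) are hypothetical. The sketch only reports what a set of sample facts shows; the real classification still comes from the business rules, because sample data may simply not contain the case in which an employee manages a second division.

```python
from collections import defaultdict

def classify(manages):
    """Classify a relationship from (employee, division) facts by checking both directions."""
    divisions_per_employee = defaultdict(set)
    employees_per_division = defaultdict(set)
    for employee, division in manages:
        divisions_per_employee[employee].add(division)
        employees_per_division[division].add(employee)

    # Direction 1: can an EMPLOYEE manage more than one DIVISION?
    many_divisions = any(len(d) > 1 for d in divisions_per_employee.values())
    # Direction 2: can a DIVISION be managed by more than one EMPLOYEE?
    many_employees = any(len(e) > 1 for e in employees_per_division.values())

    if many_divisions and many_employees:
        return "*:*"
    if many_divisions or many_employees:
        return "1:*"
    return "1:1"

one_to_one = classify([("E1", "D1"), ("E2", "D2")])
one_to_many = classify([("E1", "D1"), ("E1", "D2"), ("E2", "D3")])
```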
Note
In UML class diagrams the relationship name is often referred to as an association name. Normally, the association name is written over the association line. Associations also have a direction, represented by an arrow pointing in the direction in which the relationship flows. Alternatively, the association name may be replaced with role names. A role name expresses the role played by a given entity (class) in the relationship. Each relationship is usually described by two role names which represent the relationship as seen by each class; for example:
A CUSTOMER generates an INVOICE and each INVOICE belongs to a CUSTOMER.
A VENDOR supplies a PRODUCT and each PRODUCT is supplied by a VENDOR.
In this chapter, all relationships will be described using the singular association name, as it is the same as the relationship name used in traditional relational modelling.

5.1.4 Multiplicity
You learnt in Chapter 2 that entity relationships may be classified as one-to-one, one-to-many, or many-to-many. Multiplicity is the main constraint that exists on a relationship, which enables us to define the number of participants in that relationship. So, multiplicity refers to the number of instances of one entity that are associated with one instance of a related entity. Figure 5.8 illustrates how Visio shows multiplicity on an ERD using UML notation.
FIGURE 5.8 Multiplicity in an ERD
LECTURER (1..1) teaches (1..4) CLASS
Relationship name: teaches. Here the arrow indicates the direction of the relationship.
Multiplicities: 1..1 next to LECTURER, 1..4 next to CLASS.
As you examine Figure 5.8, notice that the multiplicities represent the number of occurrences in the related entity. For example, the multiplicity (1..4) written next to the CLASS entity in the LECTURER teaches CLASS relationship indicates that the LECTURER table's primary key value occurs at least once and no more than four times as foreign key values in the CLASS table. If the multiplicity had been written as (1..*), there would be no upper limit to the number of classes a lecturer might teach. Similarly, the multiplicity (1..1) written next to the LECTURER entity indicates that each class is taught by one and only one lecturer. That is, each CLASS entity occurrence is associated with one and only one entity occurrence in LECTURER.
If you examine multiplicity further, you will see that each numerical range actually describes two important constraints: participation and cardinality. The word cardinality is a common term used in traditional entity relationship modelling, and is used to express the maximum number of entity occurrences associated with one occurrence of the related entity. Participation determines whether all occurrences of an entity participate in the relationship or not. So, the multiplicity (1..4), written next to the CLASS entity in Figure 5.8, can be interpreted as follows:
The 1 represents the participation and indicates that all lecturers must participate in the relationship and that it is mandatory.
The 4 represents the cardinality, and indicates that one lecturer must teach at least one and up to four classes.
You will learn more about relationship participation in Section 5.1.8.
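The (1..4) multiplicity next to CLASS can be read as a count rule: every LECTURER key must appear between one and four times as a foreign key value in CLASS. The Python sketch below checks that rule against sample data. The function name and the lecturer and class codes are hypothetical and do not come from the text.

```python
from collections import Counter

def multiplicity_violations(lecturer_ids, class_rows, low=1, high=4):
    """Return {lecturer_id: class_count} for lecturers outside the low..high range.

    class_rows holds (class_code, lecturer_id) pairs, i.e. the foreign key
    values found in the CLASS table.
    """
    taught = Counter(lecturer_id for _, lecturer_id in class_rows)
    return {
        lecturer_id: taught[lecturer_id]  # a Counter returns 0 for missing keys
        for lecturer_id in lecturer_ids
        if not (low <= taught[lecturer_id] <= high)
    }

lecturers = ["L1", "L2", "L3"]
classes = [
    ("C1", "L1"), ("C2", "L1"),
    ("C3", "L2"), ("C4", "L2"), ("C5", "L2"), ("C6", "L2"), ("C7", "L2"),
]

violations = multiplicity_violations(lecturers, classes)
```

In this sample, L2 breaks the upper bound (the cardinality) with five classes and L3 breaks the lower bound (the participation) with none.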
Note
Traditional ERDs use cardinality to represent the number of occurrences of associated entities. When using the Chen and Crow's Foot notations, cardinality is indicated by placing the appropriate numbers beside the entities, using the format (x,y). The first value represents the minimum number of associated entities, while the second value represents the maximum number of associated entities. When MS Visio was used to draw Crow's Foot ERDs, the software tool did not allow specific numbers to be used; instead, symbols were used to represent cardinalities of zero, one or many, and the numeric cardinality has to be written as text on the ERD.

Multiplicities in the UML notation can be used to specify such numbers in a very concise way. Knowing the minimum and maximum number of entity occurrences is very useful at the application software level. For example, in the Tiny University case study, the University may want to ensure that a class is not taught unless it has at least ten students enrolled. Similarly, if the classroom can hold only 30 students, the application software should use that cardinality to limit enrolment in the class. However, keep in mind that the DBMS cannot handle the implementation of the cardinalities at the table level; that capability is provided by the application software or by triggers. You will learn how to create and execute triggers in Chapter 9, Procedural Language SQL and Advanced SQL.
Multiplicities are established by very concise statements known as business rules. Such rules, derived from a precise and detailed description of an organisation's data environment, also establish the ERM's entities, attributes, relationships, cardinalities and constraints. (Business rules were introduced in Chapter 2.)
Online content
Since the careful definition of complete and accurate business rules is crucial to good design, their derivation is examined in detail in Appendix B, where you will undertake a real-life design exercise for a university lab. The modelling skills you are learning in this chapter are applied in the development of a real database design in Appendix C (Global Tickets Ltd e-commerce database). In Appendices B and C you will be taken through all stages in the database design process, from conceptual design and verification to logical and physical database design and implementation. (Both appendices are available on the online platform accompanying this book.)
Since business rules define the ERM's components, making sure that all appropriate business rules are identified is an important part of a database designer's job.
5.1.5 Existence Dependence
An entity is said to be existence-dependent if it can exist in the database only when it is associated with another related entity occurrence. In implementation terms, an entity is existence-dependent if it has a mandatory foreign key, that is, a foreign key attribute that cannot be null. For example, if an XYZ Corporation employee wants to claim one or more dependents for tax-withholding purposes, the relationship EMPLOYEE claims DEPENDENT would be appropriate. In that case, the DEPENDENT entity is clearly existence-dependent on the EMPLOYEE entity, because it is impossible for the dependent to exist apart from the EMPLOYEE in the XYZ Corporation database.
If an entity can exist apart from one or more related entities, it is said to be existence-independent. (Sometimes designers refer to such an entity as a strong or regular entity.) For example, suppose that the XYZ Corporation uses parts to produce its products. Further, suppose that some of those parts are produced in-house and other parts are bought from vendors. In that scenario, it is quite possible for a PART to exist independently from a VENDOR in the relationship PART is supplied by VENDOR. (After all, at least some of the parts are not supplied by a vendor.) Therefore, PART is existence-independent from VENDOR.

5.1.6 Relationship Strength
The concept of relationship strength is based on how the primary key of a related entity is defined. To implement a relationship, the primary key of one entity appears as a foreign key in the related entity. For example, the 1:* relationship between VENDOR and PRODUCT in Chapter 3, Figure 3.5, is implemented by using the VEND_CODE primary key in VENDOR as a foreign key in PRODUCT. There are times when the foreign key also is a primary key component in the related entity. For example, in Figure 5.6, the CAR entity primary key (CAR_REG) appears as both a primary key component and a foreign key in the CAR_COLOUR entity. In this section, you will learn how different relationship strength decisions affect primary key arrangement in database design.

Weak (Non-identifying) Relationships
A weak relationship, also known as a non-identifying relationship, exists if the PK of the related entity does not contain a PK component of the parent entity. By default, relationships are established by having the PK of the parent entity appear as a FK on the related entity. For example, suppose that the TRAVEL_AGENT and EMPLOYEE entities are defined as:
TRAVEL_AGENT(AGENT_ID, AGENT_NAME, AGENT_ADDRESS, AGENT_PHONE, AGENT_EMAIL)
EMPLOYEE(EMP_ID, AGENT_ID, EMP_LNAME, EMP_FNAME, EMP_PHONE, EMP_GRADE, PAYROLL_NO)
In this case, a weak relationship exists between TRAVEL_AGENT and EMPLOYEE because the EMP_ID is the EMPLOYEE entity's PK, while the AGENT_ID in EMPLOYEE is only an FK. In this example, the EMPLOYEE PK did not inherit the PK component from the TRAVEL_AGENT entity. Figure 5.9 shows the weak relationship between TRAVEL_AGENT and EMPLOYEE.
By examining Figure 5.9, you will see that the UML notation does not make a distinction between weak and strong relationships. UML class diagrams do not require the foreign key attribute to be added to the many side of the 1:* relationship. However, because the focus here is on the use of UML class diagrams to model relational databases, foreign key attributes are shown in the class diagrams by adding {FK} after the attribute name.

FIGURE 5.9 A weak non-identifying relationship between TRAVEL_AGENT and EMPLOYEE
TRAVEL_AGENT (1..1) employs (1..*) EMPLOYEE
TRAVEL_AGENT: AGENT_ID {PK}, AGENT_NAME, AGENT_ADDRESS, AGENT_PHONE, AGENT_EMAIL
EMPLOYEE: EMP_ID {PK}, AGENT_ID {FK1}, PAYROLL_NO, EMP_LNAME, EMP_FNAME, EMP_PHONE, EMP_GRADE
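A minimal sketch of the weak relationship, assuming Python's built-in sqlite3 module and a shortened column list (both are assumptions made for illustration). AGENT_ID is declared as a mandatory foreign key in EMPLOYEE, which makes EMPLOYEE existence-dependent, but it is not part of the EMPLOYEE primary key, which is what makes the relationship non-identifying.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when this is switched on
conn.executescript("""
    CREATE TABLE TRAVEL_AGENT (
        AGENT_ID   INTEGER PRIMARY KEY,
        AGENT_NAME TEXT NOT NULL
    );
    CREATE TABLE EMPLOYEE (
        EMP_ID     INTEGER PRIMARY KEY,                                   -- PK holds no parent column
        AGENT_ID   INTEGER NOT NULL REFERENCES TRAVEL_AGENT (AGENT_ID),   -- mandatory FK only
        EMP_LNAME  TEXT,
        PAYROLL_NO TEXT
    );
""")

conn.execute("INSERT INTO TRAVEL_AGENT VALUES (9, 'VILLANOVO Paris')")
conn.execute("INSERT INTO EMPLOYEE VALUES (1239909, 9, 'Meniur', 'NW445T')")

# A mandatory FK cannot be left null, so this employee row is rejected.
try:
    conn.execute("INSERT INTO EMPLOYEE VALUES (1239986, NULL, 'Vos', 'NW211Q')")
    null_fk_rejected = False
except sqlite3.IntegrityError:
    null_fk_rejected = True

# PRAGMA table_info marks primary key columns with a non-zero value in its last field.
pk_columns = [row[1] for row in conn.execute("PRAGMA table_info(EMPLOYEE)") if row[5] > 0]
```

The pk_columns list contains only EMP_ID, which matches the definition of a weak relationship given above.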
Note
If you are used to looking at relational diagrams such as the ones produced by Microsoft Access, you expect to see the relationship line in the relational diagram drawn from the PK to the FK. However, the relational diagram convention is not necessarily reflected in the ERD. In an ERD, the focus is on the entities and the relationships between them, rather than on the way those relationships are anchored graphically. You will discover that the placement of the relationship lines in a complex ERD that includes both horizontally and vertically placed entities is largely dictated by the designer's decision to improve the readability of the design. (Remember that the ERD is used for communication between the designer(s) and end users.) In fact, if database design tools such as Visio Professional are used, the FKs are established by the software after the relationship between the entities has been defined, so it is impossible to anchor the relationship line on the FK attribute until the FK has been created. (This feature ensures that the FK characteristics always match those of the PK to which the FK points.) You can, of course, move the relationship line after the relationship has been completed so that it points to the FK, but that decision reflects a choice rather than a necessity.

An example of the weak relationship is shown in Figure 5.10.

FIGURE 5.10 Weak (non-identifying) relationship between TRAVEL_AGENT and EMPLOYEE
Database name: CH05_Travel_Agent

Table name: Travel_AGENT
Primary Key: AGENT_ID
AGENT_ID  AGENT_NAME       AGENT_ADDRESS                                                           AGENT_PHONE    AGENT_EMAIL
1         Timeless Travel  Upper Keys Business Village, Cannock, Staffordshire, WS12 2HA, UK       0800 333 2233  [email protected]
8         FlightLite       Anansi Park, Durbanville, 7550, Cape Town, SA                           0860232425     [email protected]
9         VILLANOVO Paris  244 Rue De Rivoli 75001; 222 Rue De Rivoli 75001 Paris, France          33170809753    [email protected]

Table name: EMPLOYEE
Primary Key: EMP_ID
EMP_ID   AGENT_ID  EMP_LNAME  EMP_FNAME  EMP_PHONE  EMP_GRADE       PAYROLL_NO
1239909  9         Meniur     Adele      044573322  Manager         NW445T
1239986  9         Vos        Astrid     049989900  Deputy Manager  NW211Q
1344255  9         Marin      Gaston     046656671  Staff           NW887L
1556743  9         Vulstrek   Henry      043343322  Staff           NW667P
4000566  1         Khoza      Buhle      087632343  Staff           CW990U
4000768  1         Fenyang    Abri       084544477  Staff           CW211R
4005655  1         Xu         Chang      088765676  Manager         CW223V
5009323  8         Lefu       Mosa       081231133  Manager         TY334Z
Strong (Identifying) Relationships
A strong relationship, also known as an identifying relationship, exists when the PK of the related entity contains a PK component of the parent entity. For example, the definitions of the TRAVEL_AGENT and EMPLOYEE entities:
TRAVEL_AGENT(AGENT_ID, AGENT_NAME, AGENT_ADDRESS, AGENT_PHONE, AGENT_EMAIL)
EMPLOYEE(AGENT_ID, PAYROLL_NO, EMP_LNAME, EMP_FNAME, EMP_PHONE, EMP_GRADE)
indicate that a strong relationship exists between TRAVEL_AGENT and EMPLOYEE because the EMPLOYEE entity's composite PK is composed of AGENT_ID + PAYROLL_NO. (Note that the AGENT_ID in EMPLOYEE is also the FK to the TRAVEL_AGENT entity.)
Whether the relationship between TRAVEL_AGENT and EMPLOYEE is strong or weak depends on how the EMPLOYEE entity's primary key is defined. Figure 5.11 shows the strong relationship between TRAVEL_AGENT and EMPLOYEE.

Online content
All of the databases used to illustrate the material in this chapter are available on the online platform accompanying this book.
FIGURE 5.11 Strong (identifying) relationship between TRAVEL_AGENT and EMPLOYEE
TRAVEL_AGENT (1..1) employs (1..*) EMPLOYEE
TRAVEL_AGENT: AGENT_ID {PK}, AGENT_NAME, AGENT_ADDRESS, AGENT_PHONE, AGENT_EMAIL
EMPLOYEE: AGENT_ID {PK} {FK1}, PAYROLL_NO {PK}, EMP_LNAME, EMP_FNAME, EMP_PHONE, EMP_GRADE
Database name: CH05_Travel_Agent

Table name: Travel_AGENT
Primary Key: AGENT_ID
AGENT_ID  AGENT_NAME       AGENT_ADDRESS                                                           AGENT_PHONE    AGENT_EMAIL
1         Timeless Travel  Upper Keys Business Village, Cannock, Staffordshire, WS12 2HA, UK       0800 333 2233  [email protected]
8         FlightLite       Anansi Park, Durbanville, 7550, Cape Town, SA                           0860232425     [email protected]
9         VILLANOVO Paris  244 Rue De Rivoli 75001; 222 Rue De Rivoli 75001 Paris, France          33170809753    [email protected]

Table name: EMPLOYEE
Primary Key: AGENT_ID and PAYROLL_NO
Foreign Key: AGENT_ID
AGENT_ID  PAYROLL_NO  EMP_ID   EMP_LNAME  EMP_FNAME  EMP_PHONE  EMP_GRADE
9         NW445T      1239909  Meniur     Adele      044573322  Manager
9         NW211Q      1239986  Vos        Astrid     049989900  Deputy Manager
9         NW887L      1344255  Marin      Gaston     046656671  Staff
9         NW667P      1556743  Vulstrek   Henry      043343322  Staff
1         CW990U      4000566  Khoza      Buhle      087632343  Staff
1         CW211R      4000768  Fenyang    Abri       084544477  Staff
1         CW223V      4005655  Xu         Chang      088765676  Manager
8         TY334Z      5009323  Lefu       Mosa       081231133  Manager
Keep in mind that the order in which the tables are created and loaded is very important. For example, in the TRAVEL_AGENT employs EMPLOYEE relationship (Figure 5.11), the TRAVEL_AGENT table must be created before the EMPLOYEE table. After all, it would not be acceptable to have the EMPLOYEE table's foreign key reference a TRAVEL_AGENT table that did not yet exist. In fact, you must load the data of the 1 side first in a 1:* relationship to avoid the possibility of referential integrity errors, regardless of whether the relationships are weak or strong. In some DBMSs, this sequencing problem does not crop up until the data are loaded into the tables.
Remember that the nature of the relationship is often determined by the database designer, who must use professional judgement to determine which relationship type and strength will best suit the database transaction, efficiency and information requirements. That point will often be emphasised in detail!
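The loading order can be demonstrated with a short sketch that again assumes Python's built-in sqlite3 module (an assumption; the text does not name a DBMS here). EMPLOYEE uses the composite PK of the strong relationship, with a shortened column list, and load_employee is a hypothetical helper. The employee row for agent 8 is refused until the parent TRAVEL_AGENT row exists.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # referential integrity checks are off by default in SQLite
conn.executescript("""
    CREATE TABLE TRAVEL_AGENT (
        AGENT_ID   INTEGER PRIMARY KEY,
        AGENT_NAME TEXT
    );
    CREATE TABLE EMPLOYEE (
        AGENT_ID   INTEGER NOT NULL REFERENCES TRAVEL_AGENT (AGENT_ID),
        PAYROLL_NO TEXT NOT NULL,
        EMP_LNAME  TEXT,
        PRIMARY KEY (AGENT_ID, PAYROLL_NO)   -- strong relationship: parent PK inside child PK
    );
""")

def load_employee(agent_id, payroll_no, lname):
    """Hypothetical loader: True if the row is stored, False on an integrity error."""
    try:
        conn.execute("INSERT INTO EMPLOYEE VALUES (?, ?, ?)", (agent_id, payroll_no, lname))
        return True
    except sqlite3.IntegrityError:
        return False

# Many side first: the FK points at an agent that has not been loaded yet.
loaded_too_early = load_employee(8, "TY334Z", "Lefu")

# 1 side first, then the many side: the same row is now accepted.
conn.execute("INSERT INTO TRAVEL_AGENT VALUES (8, 'FlightLite')")
loaded_after_parent = load_employee(8, "TY334Z", "Lefu")
```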
5.1.7 Weak Entities
A weak entity is one that meets two conditions:
1 It is existence-dependent; that is, it cannot exist without the entity with which it has a relationship.
2 It has a primary key that is partially or totally derived from the parent entity in the relationship.
For example, a company insurance policy insures an employee and his/her dependants. For the purpose of describing an insurance policy, an EMPLOYEE may or may not have a DEPENDANT, but the DEPENDANT must be associated with an EMPLOYEE. Moreover, the DEPENDANT cannot exist without the EMPLOYEE; that is, a person cannot get insurance coverage at the XYZ Corporation as a dependant unless s(he) happens to be a dependant of an employee working for the XYZ Corporation. DEPENDANT is the weak entity in the relationship EMPLOYEE has DEPENDANT. An example of the weak entity is shown in Figure 5.12.
As you examine the ERD in Figure 5.12, you will notice that there is no diagrammatic distinction between strong and weak entities when using the UML notation. A strong (identifying) relationship indicates that the related entity is weak. Such a relationship means that both conditions for the weak entity definition have been met: the related entity is existence-dependent, and the PK of the related entity contains a PK component of the parent entity.

FIGURE 5.12 A weak entity in an ERD
EMPLOYEE (1..1) has (0..*) DEPENDANT
EMPLOYEE (strong entity): EMP_NUM {PK}, EMP_LNAME, EMP_INITIAL, EMP_FNAME, EMP_DOB, EMP_HIREDATE
DEPENDANT (weak entity): EMP_NUM {PK} {FK1}, DEP_NUM {PK}, DEP_FNAME, DEP_DOB
182
PARt
II
Design
Concepts
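The two weak-entity conditions can be made concrete with a minimal sketch. It again assumes Python's sqlite3 module, keeps only a few of the attributes named in the EMPLOYEE has DEPENDANT example, and borrows sample values from the chapter's DEPENDANT data; the employee number 9999 is invented for the test. EMP_NUM appears in DEPENDANT both as part of the composite primary key (condition 2) and as a mandatory foreign key (condition 1).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript(
    """
    CREATE TABLE EMPLOYEE (
        EMP_NUM   INTEGER PRIMARY KEY,
        EMP_LNAME TEXT NOT NULL
    );
    CREATE TABLE DEPENDANT (
        EMP_NUM   INTEGER NOT NULL REFERENCES EMPLOYEE (EMP_NUM),  -- inherited key part
        DEP_NUM   INTEGER NOT NULL,
        DEP_FNAME TEXT,
        PRIMARY KEY (EMP_NUM, DEP_NUM)
    );
    INSERT INTO EMPLOYEE  VALUES (1001, 'De Lange');
    INSERT INTO DEPENDANT VALUES (1001, 1, 'Annelise');
    INSERT INTO DEPENDANT VALUES (1001, 2, 'Jorge');
    """
)


def try_insert(sql, params):
    """Return True if the row was accepted, False if a constraint rejected it."""
    try:
        with conn:
            conn.execute(sql, params)
        return True
    except sqlite3.IntegrityError:
        return False


# A dependant cannot exist without its employee (9999 is not an EMPLOYEE row).
orphan_accepted = try_insert(
    "INSERT INTO DEPENDANT VALUES (?, ?, ?)", (9999, 1, "Nobody")
)
names = [
    row[0]
    for row in conn.execute(
        "SELECT DEP_FNAME FROM DEPENDANT WHERE EMP_NUM = 1001 ORDER BY DEP_NUM"
    )
]
print(orphan_accepted, names)
```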
Remember that the weak entity inherits part of its primary key from its strong counterpart. For example, at least part of the DEPENDANT entity's key shown in Figure 5.12 was inherited from the EMPLOYEE entity:

EMPLOYEE (EMP_NUM, EMP_LNAME, EMP_FNAME, EMP_INITIAL, EMP_DOB, EMP_HIREDATE)
DEPENDANT (EMP_NUM, DEP_NUM, DEP_FNAME, DEP_DOB)

Figure 5.13 illustrates the implementation of the relationship between the weak entity (DEPENDANT) and its parent or strong counterpart (EMPLOYEE). Note that DEPENDANT's primary key is composed of two attributes, EMP_NUM and DEP_NUM, and that EMP_NUM was inherited from EMPLOYEE. Given this scenario, and with the help of this relationship, you can determine: Linda J. De Lange claims two dependants, Annelise and Jorge.

FIGURE 5.13 A weak entity in a strong relationship
Database name: CH05_ShortCo

Table name: EMPLOYEE
Primary key: EMP_NUM

EMP_NUM  EMP_LNAME   EMP_FNAME  EMP_INITIAL  EMP_DOB    EMP_HIREDATE
1001     De Lange    Linda      J            12-Mar-74  25-May-07
1002     Smithson    William    K            23-Nov-80  28-May-07
1003     Washington  Herman     H            15-Aug-78  28-May-07
1004     Chen        Lydia      B            23-Mar-84  15-Oct-08
1005     Johnson     Melanie                 28-Sep-76  20-Dec-08
1006     Khumalo     Mandla     G            12-Jul-89  05-Jan-12
1007     O'Donnell   Peter      D            10-Jun-81  23-Jun-12
1008     Brzenski    Barbara    A            12-Feb-80  01-Nov-13

Table name: DEPENDANT
Primary keys: EMP_NUM and DEP_NUM
Foreign key: EMP_NUM

EMP_NUM  DEP_NUM  DEP_FNAME   DEP_DOB
1001     1        Annelise    05-Dec-07
1001     2        Jorge       30-Sep-12
1003     1        Suzanne     25-Jan-14
1006     1        Nonhlanhla  25-May-11
1008     1        Michael     19-Feb-05
1008     2        George      27-Jun-08
1008     3        Katherine   18-Aug-13

Keep in mind that the database designer usually determines whether an entity can be described as weak based on the business rules. An examination of the relationship between TRAVEL_AGENT and EMPLOYEE in Figure 5.10 may cause you to conclude that EMPLOYEE is a weak entity to TRAVEL_AGENT. After all, if you examine Figure 5.10, it seems clear that an employee cannot exist without being employed by a travel agency; so there is existence dependency. For example, employee Mosa Lefu cannot be an employee unless he is attached to a travel agent, which in this case is the travel agent called FlightLite. Note that the EMPLOYEE table's primary key is EMP_ID, which is not derived from the TRAVEL_AGENT parent entity. That is, EMPLOYEE may be represented by:

EMPLOYEE(EMP_ID, AGENT_ID, EMP_LNAME, EMP_FNAME, EMP_PHONE, EMP_GRADE, PAYROLL_NO)

The second weak entity requirement has not been met; therefore, by definition, the EMPLOYEE entity in Figure 5.10 may not be classified as weak. On the other hand, if the EMPLOYEE entity's primary key had been defined as a composite key, composed of the combination AGENT_ID and EMP_ID, EMPLOYEE could be represented by:

EMPLOYEE(AGENT_ID, EMP_ID, PAYROLL_NO, EMP_LNAME, EMP_FNAME, EMP_PHONE, EMP_GRADE)

In that case, illustrated in Figure 5.11, the EMPLOYEE primary key is partially derived from TRAVEL_AGENT because AGENT_ID is the primary key of the TRAVEL_AGENT table. Given this decision, EMPLOYEE is a weak entity by definition. (In Visio Professional UML terms, the relationship between TRAVEL_AGENT and EMPLOYEE is classified as strong, or identifying.) In any case, EMPLOYEE is always existence-dependent on TRAVEL_AGENT, whether or not it is defined as weak.

5.1.8 Relationship participation
Participation in an entity relationship is either optional or mandatory. Optional participation means that one entity occurrence does not require a corresponding entity occurrence in a particular relationship. For example, consider the relationship between the two entities BOOKING and FLIGHT in Figure 5.14. In the BOOKING consists of FLIGHT relationship, at least some bookings may not require a flight. (Remember that each entity is implemented as a table.) In other words, an entity occurrence (row) in the BOOKING table does not necessarily require the existence of a corresponding entity occurrence in the FLIGHT table. Therefore, the FLIGHT entity is considered to be optional to the BOOKING entity. In UML notation, an optional relationship between entities is shown by a 0..1 or 0..* multiplicity. The existence of an optional entity indicates that the minimum cardinality is 0 for the optional entity. (The term optionality is used to label any condition in which one or more optional relationships exist.)

FIGURE 5.14 An optional FLIGHT entity in the relationship BOOKING consists of FLIGHT
[UML class diagram: BOOKING (BOOKING_NO {PK}, EMP_ID {FK1}, CUST_NO {FK2}, BOOK_STATUS_CODE {FK3}, EVENT_ID {FK4}, HOTEL_ID {FK5}, FLIGHT_NO {FK6}, BOOK_TOTAL_COST) 0..* consists_of 0..1 FLIGHT (FLIGHT_NO {PK}, AIRLINE, FLIGHT_DEPART_AIRPORT, FLIGHT_ARRIVE_AIRPORT, FLIGHT_DEPART_TIME, FLIGHT_ARRIVE_TIME, FLIGHT_COST)]
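Optional participation usually shows up in a table as a foreign key that is allowed to be null. The sketch below assumes Python's sqlite3 module and keeps only two columns per table; the flight row ('XX100', 'Example Air') and the text data type chosen for FLIGHT_NO are invented for illustration. Booking 204200 has no flight, which is exactly the 0..1 multiplicity on the FLIGHT side.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript(
    """
    CREATE TABLE FLIGHT (
        FLIGHT_NO TEXT PRIMARY KEY,
        AIRLINE   TEXT
    );
    CREATE TABLE BOOKING (
        BOOKING_NO INTEGER PRIMARY KEY,
        FLIGHT_NO  TEXT REFERENCES FLIGHT (FLIGHT_NO)  -- nullable: FLIGHT is optional
    );
    INSERT INTO FLIGHT  VALUES ('XX100', 'Example Air');  -- invented sample row
    INSERT INTO BOOKING VALUES (204200, NULL);            -- booking without a flight
    INSERT INTO BOOKING VALUES (301200, 'XX100');         -- booking with a flight
    """
)

# A LEFT JOIN keeps bookings that have no corresponding FLIGHT occurrence.
rows = conn.execute(
    """
    SELECT B.BOOKING_NO, F.AIRLINE
    FROM BOOKING B
    LEFT JOIN FLIGHT F ON F.FLIGHT_NO = B.FLIGHT_NO
    ORDER BY B.BOOKING_NO
    """
).fetchall()
without_flight = sum(1 for _, airline in rows if airline is None)
print(rows, without_flight)
```

Declaring the column NOT NULL instead would turn FLIGHT into a mandatory participant for every booking.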
NOTE
Remember that the burden of establishing the relationship is always placed on the entity that contains the foreign key. In most cases, that will be the entity on the many side of the relationship.

Mandatory participation means that one entity occurrence requires a corresponding entity occurrence in a particular relationship. If no optionality symbol is depicted with the entity, the entity exists in a mandatory relationship with the related entity. The existence of a mandatory relationship indicates that the minimum cardinality is 1 for the mandatory entity.

NOTE
You may be tempted to conclude that relationships are weak when they occur between entities in an optional relationship and that relationships are strong when they occur between entities in a mandatory relationship. However, this conclusion is not warranted. Keep in mind that relationship participation and relationship strength do not describe the same thing. You are likely to encounter a strong relationship when one entity is optional to another. For example, the relationship between EMPLOYEE and DEPENDANT is clearly a strong one, but DEPENDANT is clearly optional to EMPLOYEE. After all, you cannot require employees to have dependants. And it is just as possible for a weak relationship to be established when one entity is mandatory to another. The relationship strength depends on how the PK of the related entity is formulated, while the relationship participation depends on how the business rule is written. For example, the business rules 'Each part must be supplied by a vendor' and 'A part may or may not be supplied by a vendor' create different optionalities for the same entities! Failure to understand this distinction may lead to poor design decisions that cause major problems when table rows are inserted or deleted.

Since relationship participation turns out to be an important component of the database design process, let's examine a few more scenarios. Suppose Tiny University employs some lecturers who conduct research without teaching classes. If you examine the LECTURER teaches CLASS relationship, it is quite possible for a LECTURER not to teach a CLASS. Therefore, CLASS is optional to LECTURER. On the other hand, a CLASS must be taught by a LECTURER. Therefore, LECTURER is mandatory to CLASS. Note that the ERD model shown in Figure 5.15 shows the multiplicity next to CLASS to be (0..3), thus indicating that a lecturer may teach no classes at all or as many as three classes. And each CLASS table row will reference one and only one LECTURER row, assuming each class is taught by one and only one lecturer, represented by the (1..1) multiplicity next to the LECTURER table.

FIGURE 5.15 An optional CLASS entity in the relationship LECTURER teaches CLASS
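The LECTURER teaches CLASS multiplicities can be sketched as constraints. The example assumes Python's sqlite3 module, and the column names LECT_ID and CLASS_CODE are hypothetical. The (1..1) side becomes a NOT NULL foreign key in CLASS. The upper bound of three classes has no direct column constraint, so this sketch uses a trigger, which is one possible approach among several. The lower bound of zero needs nothing: a lecturer with no classes simply has no CLASS rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript(
    """
    CREATE TABLE LECTURER (LECT_ID INTEGER PRIMARY KEY);
    CREATE TABLE CLASS (
        CLASS_CODE INTEGER PRIMARY KEY,
        -- LECTURER is mandatory to CLASS: the foreign key may not be null
        LECT_ID    INTEGER NOT NULL REFERENCES LECTURER (LECT_ID)
    );
    -- Upper bound of the (0..3) multiplicity next to CLASS
    CREATE TRIGGER CLASS_MAX_THREE BEFORE INSERT ON CLASS
    WHEN (SELECT COUNT(*) FROM CLASS WHERE LECT_ID = NEW.LECT_ID) >= 3
    BEGIN
        SELECT RAISE(ABORT, 'lecturer already teaches three classes');
    END;
    INSERT INTO LECTURER VALUES (1);
    INSERT INTO LECTURER VALUES (2);  -- a researcher who teaches no classes
    """
)


def add_class(class_code, lect_id):
    """Return True if the CLASS row was accepted, False if it was rejected."""
    try:
        with conn:
            conn.execute("INSERT INTO CLASS VALUES (?, ?)", (class_code, lect_id))
        return True
    except sqlite3.DatabaseError:
        return False


results = [add_class(code, 1) for code in (10, 11, 12, 13)]  # fourth class is one too many
no_lecturer = add_class(14, None)                            # a class needs a lecturer
classes_of_2 = conn.execute(
    "SELECT COUNT(*) FROM CLASS WHERE LECT_ID = 2"
).fetchone()[0]
print(results, no_lecturer, classes_of_2)
```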
Failure to understand the distinction between mandatory and optional participation in relationships may yield designs in which awkward (and unnecessary) temporary rows (entity instances) must be created just to accommodate the creation of required entities. Therefore, it is important that you clearly understand the concepts of mandatory and optional participation.

It is also important to understand that the semantics of a problem may determine the type of participation in a relationship. For example, suppose Tiny University offers several courses; each course has several classes. Note again the distinction between class and course in this discussion: a CLASS constitutes a specific offering (or section) of a COURSE. (Typically, courses are listed in the university's course catalogue, while classes are listed in the class schedules that students use to register for their classes.)

Analysing the CLASS entity's contribution to the COURSE generates CLASS relationship, it is easy to see that a CLASS cannot exist without a COURSE. Therefore, you can conclude that the COURSE entity is mandatory in the relationship. Two scenarios for the CLASS entity may be written, shown in Figures 5.16 and 5.17. The different scenarios are a function of the semantics of the problem; that is, they depend on how the relationship is defined:
1 CLASS is optional. It is possible for the department to create the entity COURSE first and then create the CLASS entity after making the teaching assignments. In the real world, such a scenario is very likely; there may be courses for which sections (classes) have not yet been defined. In fact, some courses are taught only once a year and do not generate classes each semester.
2 CLASS is mandatory. This condition is created by the constraint that is imposed by the semantics of the statement 'Each COURSE generates one or more CLASSes'. In ER terms, each COURSE in the generates relationship must have at least one CLASS. Therefore, a CLASS must be created as the COURSE is created in order to comply with the semantics of the problem.

FIGURE 5.16 CLASS is optional to COURSE

FIGURE 5.17 COURSE and CLASS in a mandatory relationship

Keep in mind the practical aspects of the scenario presented in Figure 5.17. Given the semantics of this relationship, the system should not accept a course that is not associated with at least one class. Is such a rigid environment desirable from an operational point of view? For example, when a new COURSE is created, the database first updates the COURSE table, thereby inserting a COURSE entity that does not yet have a CLASS associated with it. Naturally, the apparent problem seems to be solved when CLASS entities are inserted into the corresponding CLASS table. However, because of the mandatory relationship, the system will be in temporary violation of the business rule constraint. For practical purposes, it would be desirable to classify the CLASS as optional in order to produce a more flexible design.

Finally, as you examine the scenarios presented in Figures 5.16 and 5.17, keep in mind the role of the DBMS. To maintain data integrity, the DBMS must ensure that the many side (CLASS) is associated with a COURSE through the foreign key rules.

When you create a relationship in MS Visio using UML, the default relationship will be optional on both sides. Table 5.3 shows the various multiplicities that are supported by the UML notation.
TABLE 5.3 Multiplicity

Multiplicity  Description
0..1          A minimum of zero and a maximum of one instance of this class are associated with an instance of the other related class (indicates an optional class).
0..*          A minimum of zero and a maximum of many instances of this class are associated with an instance of the other related class (indicates an optional class).
1..1          A minimum of one and a maximum of one instance of this class are associated with an instance of the other related class (indicates a mandatory class).
1..*          A minimum of one and a maximum of many instances of this class are associated with an instance of the other related class (indicates a mandatory class).
1             Exactly one instance of this class is associated with an instance of the other related class. Equivalent to 1..1.
*             Many instances of this class are associated with an instance of the other related class. Equivalent to 0..*.

ONLINE CONTENT
To learn how to define relationships properly with the help of MS Visio, see Appendix A, Designing Databases with Visio Professional: A Tutorial, available on the online platform for this book.

5.1.9 Relationship degree
A relationship degree indicates the number of entities or participants associated with a relationship. A unary relationship exists when an association is maintained within a single entity. A binary relationship exists when two entities are associated. A ternary relationship exists when three entities are associated. Although higher degrees exist, they are rare and are not specifically named. (For example, an association of four entities is described simply as a four-degree relationship.) Figure 5.18 shows these types of relationship degrees using UML notation.
Unary relationships
In the case of the unary relationship shown in Figure 5.18, an employee within the EMPLOYEE entity is the manager for one or more employees within that entity. In this case, the existence of the manages relationship means that EMPLOYEE requires another EMPLOYEE to be the manager; that is, EMPLOYEE has a relationship with itself. Such a relationship is known as a recursive relationship. The different cases of recursive relationships will be explored in Section 5.1.10.

Binary relationships
A binary relationship exists when two entities are associated in a relationship. Binary relationships are most common. In fact, to simplify the conceptual design, whenever possible, most higher-order (ternary and higher) relationships are decomposed into appropriate equivalent binary relationships. In Figure 5.18, the relationship a LECTURER teaches one or more CLASSes represents a binary relationship.
FIGURE 5.18 Three types of relationship degree
[UML diagrams.
Unary relationship: EMPLOYEE 1..1 manages 0..* EMPLOYEE.
Binary relationship: LECTURER 1..1 teaches 0..* CLASS.
Ternary relationship: DOCTOR 1..1 writes 0..* PRESCRIPTION; PATIENT 1..1 receives 0..* PRESCRIPTION; DRUG 1..1 appears_in 0..* PRESCRIPTION.]
Ternary and higher-order relationships
Although most relationships are binary, the use of ternary and higher-order relationships does allow the designer some latitude regarding the semantics of a problem. A ternary relationship implies an association among three different entities. For example, note the relationships (and their consequences) in Figure 5.18, which are represented by the following business rules:
A DOCTOR writes one or more PRESCRIPTIONs.
A PATIENT may receive one or more PRESCRIPTIONs.
A DRUG may appear on one or more PRESCRIPTIONs.
(To simplify this example, assume that the business rule states that each prescription contains only one drug. In short, if a doctor prescribes more than one drug, a separate prescription must be written for each drug.)
The reason why this is a ternary relationship and not three binary relationships is that the PRESCRIPTION entity reflects a single event or object that simultaneously associates all three parent entities (DOCTOR, PATIENT and DRUG).

FIGURE 5.19 The implementation of a ternary relationship
Database name: Ch05_Clinic

Table name: DRUG
Primary key: DRUG_CODE

DRUG_CODE  DRUG_NAME               DRUG_PRICE
AF15       Afgapan-15              25.00
AF25       Afgapan-25              35.00
DRO        Droalene Chloride       111.89
DRZ        Druzocholar Cryptolene  18.99
KO15       Koliabar Oxyhexalene    65.75
OLE        Oleander-Drizapan       123.95
TRYP       Tryptolac Heptadimetric 79.45

Table name: PATIENT
Primary key: PAT_NUM

PAT_NUM  PAT_TITLE  PAT_LNAME   PAT_FNAME  PAT_INITIAL  PAT_DOB      PAT_AREACODE  PAT_PHONE
100      Mr         Dlamini     Phindile   D            15-Jun-1952  0181          324-5456
101      Ms         Lewis       Rhonda     G            19-Mar-2015  0181          324-4472
102      Mr         Vandam      Rhett                   14-Nov-1968  0879          675-8993
103      Ms         Jones       Anne       M            16-Oct-1984  0181          898-3456
104      Mr         Lange       John       P            08-Nov-1981  0879          504-4430
105      Mr         Mthembu     Nsizwa     D            14-Mar-1985  0181          890-3220
106      Mrs        Smith       Jeanine    K            12-Feb-2013  0181          324-7883
107      Mr         Diante      Jorge      D            21-Aug-1984  0181          890-4567
108      Mr         Wiesenbach  Paul       R            14-Feb-1976  0181          897-4358
109      Mr         Smith       George     K            18-Jun-1971  0879          504-3339
110      Mrs        Genkazi     Leighla    W            19-May-1980  0879          569-0093
111      Mr         Washington  Rupert     E            03-Jan-1976  0181          890-4925
112      Mr         Johnson     Edward     E            14-May-1971  0181          898-4387
113      Ms         Gounden     Melanie    P            15-Sep-1980  0181          324-9006
114      Ms         Brandon     Marie      G            02-Nov-1942  0879          882-0845
115      Mrs        Saranda     Hermine    R            25-Jul-1982  0181          324-5505
116      Mr         Smith       George     A            08-Nov-1975  0181          890-2984

Table name: DOCTOR
Primary key: DOC_ID

DOC_ID  DOC_LNAME   DOC_FNAME  DOC_INITIAL  DOC_SPECIALTY
29827   Ndosi       Sipho      J            Dermatology
32445   Jorgensen   Annelise   G            Neurology
33456   Jali        Phakamile  A            Urology
33989   LeGrande    George                  Paediatrics
34409   Washington  Dennis     F            Orthopaedics
36221   McPherson   Katye      H            Dermatology
36712   Dreifag     Herman     G            Psychiatry
38995   Minh        Tran                    Neurology
40004   Chin        Ming       D            Orthopaedics
40028   Cele        Denise     L            Gynaecology

Table name: PRESCRIPTION
Primary key: DOC_ID, PAT_NUM, DRUG_CODE, PRES_DATE
Foreign keys: DOC_ID, PAT_NUM and DRUG_CODE

DOC_ID  PAT_NUM  DRUG_CODE  PRES_DOSAGE                                     PRES_DATE
32445   102      DRZ        two tablets every four hours, 50 tablets total  12-Nov-19
32445   113      OLE        one teaspoon with each meal, 250 ml total       14-Nov-19
34409   101      KO15       one tablet every six hours, 30 tablets total    14-Nov-19
36221   109      DRO        two tablets with every meal, 60 tablets total   14-Nov-19
38995   107      KO15       one tablet every six hours, 30 tablets total    14-Nov-19

As you examine the table contents in Figure 5.19, note that it is possible to track all transactions. For instance, you can tell that the first prescription was written by doctor 32445 for patient 102, using the drug DRZ on 12 November 2019.
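The way a single PRESCRIPTION row ties the three parent entities together can be sketched as follows. The example assumes Python's sqlite3 module, keeps only a name column in each parent table, and stores the date as ISO text, which is a choice made for this sketch and not something the chapter prescribes. The sample values are taken from the first prescription in the chapter's clinic data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript(
    """
    CREATE TABLE DOCTOR  (DOC_ID    INTEGER PRIMARY KEY, DOC_LNAME TEXT);
    CREATE TABLE PATIENT (PAT_NUM   INTEGER PRIMARY KEY, PAT_LNAME TEXT);
    CREATE TABLE DRUG    (DRUG_CODE TEXT    PRIMARY KEY, DRUG_NAME TEXT);

    -- One row is one event that involves all three parents at the same time.
    CREATE TABLE PRESCRIPTION (
        DOC_ID    INTEGER NOT NULL REFERENCES DOCTOR (DOC_ID),
        PAT_NUM   INTEGER NOT NULL REFERENCES PATIENT (PAT_NUM),
        DRUG_CODE TEXT    NOT NULL REFERENCES DRUG (DRUG_CODE),
        PRES_DATE TEXT    NOT NULL,
        PRIMARY KEY (DOC_ID, PAT_NUM, DRUG_CODE, PRES_DATE)
    );

    INSERT INTO DOCTOR  VALUES (32445, 'Jorgensen');
    INSERT INTO PATIENT VALUES (102, 'Vandam');
    INSERT INTO DRUG    VALUES ('DRZ', 'Druzocholar Cryptolene');
    INSERT INTO PRESCRIPTION VALUES (32445, 102, 'DRZ', '2019-11-12');
    """
)

# Tracking a transaction means following all three foreign keys from one row.
row = conn.execute(
    """
    SELECT D.DOC_LNAME, P.PAT_LNAME, G.DRUG_NAME, R.PRES_DATE
    FROM PRESCRIPTION R
    JOIN DOCTOR  D ON D.DOC_ID    = R.DOC_ID
    JOIN PATIENT P ON P.PAT_NUM   = R.PAT_NUM
    JOIN DRUG    G ON G.DRUG_CODE = R.DRUG_CODE
    """
).fetchone()
print(row)
```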
5.1.10 Recursive relationships
As was previously mentioned, a recursive relationship is one in which a relationship can exist between occurrences of the same entity set. (Naturally, such a condition is found within a unary relationship.) For example, a 1:* unary relationship can be expressed by 'an EMPLOYEE may manage many EMPLOYEEs, and each EMPLOYEE is managed by one EMPLOYEE'. As long as polygamy is not legal, a 1:1 unary relationship may be expressed by 'an EMPLOYEE may be married to one and only one other EMPLOYEE'. Finally, the *:* unary relationship may be expressed by 'a COURSE may be a prerequisite to many other COURSEs, and each COURSE may have many other COURSEs as prerequisites'. Those relationships are shown in Figure 5.20.

FIGURE 5.20 An ER representation of recursive relationships
The 1:1 relationship shown in Figure 5.20 can be implemented in the single table shown in Figure 5.21. Note that you can determine that Nishok Singh is married to Vediga Singh, who is married to Nishok Singh. Anne Jones is married to Anton Shapiro, who is married to Anne Jones.

FIGURE 5.21 The 1:1 recursive relationship EMPLOYEE is married to EMPLOYEE
Database name: Ch05_PartCo
Table name: EMPLOYEE_V1

EMP_NUM  EMP_LNAME  EMP_FNAME  EMP_SPOUSE
345      Singh      Nishok     347
346      Jones      Anne       349
347      Singh      Vediga     345
348      Delaney    Robert
349      Shapiro    Anton      346

FIGURE 5.22 Another unary relationship: PART contains PART
Database name: Ch05_PartCo
Table name: PART_V1

PART_CODE  PART_DESCRIPTION           PART_IN_STOCK  PART_UNITS_NEEDED  PART_OF_PART
AA21-6     2.5 cm washer, 1.0 mm rim  432            4                  C-130
AB-121     Cotter pin, copper         1034           2                  C-130
C-130      Rotor assembly             36
E129       2.5 cm steel shank         128            1                  C-130
X10        10.25 cm rotor blade       345            4                  C-130
X34AW      2.5 cm hex nut             879            2                  C-130
Unary relationships are common in manufacturing industries. For example, Figure 5.22 illustrates that a rotor assembly (C-130) is composed of many parts, but each part is used to create only one rotor assembly. Figure 5.22 indicates that a rotor assembly is composed of four 2.5 cm washers, two cotter pins, one 2.5 cm steel shank, four 10.25 cm rotor blades and two 2.5 cm hex nuts. The relationship implemented in Figure 5.22 thus enables you to track each part within each rotor assembly.

If a part can be used to assemble several different kinds of other parts and is itself composed of many parts, two tables are required to implement the PART contains PART relationship. Figure 5.23 illustrates such an environment. Parts tracking is increasingly important as managers become more aware of the legal ramifications of producing more complex output. In fact, in many industries, especially those involving aviation, full parts tracking is mandatory.

FIGURE 5.23 Implementation of the *:* recursive PART contains PART relationship
Database name: Ch05_PartCo

Table name: COMPONENT

COMP_CODE  PART_CODE  COMP_PARTS_NEEDED
C-130      AA21-6     4
C-130      AB-121     2
C-130      E129       1
C-131A2    E129       1
C-130      X10        4
C-131A2    X10        1
C-130      X34AW      2
C-131A2    X34AW      2

Table name: PART

PART_CODE  PART_DESCRIPTION           PART_IN_STOCK
AA21-6     2.5 cm washer, 1.0 mm rim  432
AB-121     Cotter pin, copper         1 034
C-130      Rotor assembly             36
E129       2.5 cm steel shank         128
X10        10.25 cm rotor blade       345
X34AW      2.5 cm hex nut             879
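The two-table implementation of PART contains PART can be explored with a short sketch, assuming Python's sqlite3 module. Declaring both COMPONENT columns as foreign keys to PART is an assumption of this example (the figure does not state the constraints), and only the C-130 rows are loaded so that every referenced part exists in the PART table. The query answers the typical bill-of-materials question: which parts, and how many of each, go into one assembly?

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript(
    """
    CREATE TABLE PART (
        PART_CODE        TEXT PRIMARY KEY,
        PART_DESCRIPTION TEXT
    );
    -- Both columns point back at PART, which makes the *:* relationship recursive.
    CREATE TABLE COMPONENT (
        COMP_CODE         TEXT    NOT NULL REFERENCES PART (PART_CODE),
        PART_CODE         TEXT    NOT NULL REFERENCES PART (PART_CODE),
        COMP_PARTS_NEEDED INTEGER NOT NULL,
        PRIMARY KEY (COMP_CODE, PART_CODE)
    );
    """
)
parts = [
    ("AA21-6", "2.5 cm washer, 1.0 mm rim"),
    ("AB-121", "Cotter pin, copper"),
    ("C-130", "Rotor assembly"),
    ("E129", "2.5 cm steel shank"),
    ("X10", "10.25 cm rotor blade"),
    ("X34AW", "2.5 cm hex nut"),
]
components = [
    ("C-130", "AA21-6", 4),
    ("C-130", "AB-121", 2),
    ("C-130", "E129", 1),
    ("C-130", "X10", 4),
    ("C-130", "X34AW", 2),
]
with conn:
    conn.executemany("INSERT INTO PART VALUES (?, ?)", parts)
    conn.executemany("INSERT INTO COMPONENT VALUES (?, ?, ?)", components)

# Parts list for one rotor assembly, with the units needed of each part.
bill = conn.execute(
    """
    SELECT P.PART_DESCRIPTION, C.COMP_PARTS_NEEDED
    FROM COMPONENT C
    JOIN PART P ON P.PART_CODE = C.PART_CODE
    WHERE C.COMP_CODE = 'C-130'
    ORDER BY C.PART_CODE
    """
).fetchall()
total_units = sum(needed for _, needed in bill)
print(bill, total_units)
```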
The *:* recursive relationship might be more familiar in a school environment. For instance, note how the *:* COURSE requires COURSE relationship illustrated in Figure 5.20 is implemented in Figure 5.24. In this example, MATH-243 is a prerequisite to QM-261 and QM-362, while both MATH-243 and QM-261 are prerequisites to QM-362.

FIGURE 5.24 Implementation of the *:* recursive COURSE requires COURSE relationship
Database name: Ch05_TinyUniversity

Table name: COURSE

CRS_CODE  DEPT_CODE  CRS_DESCRIPTION                     CRS_CREDIT
ACCT-211  ACCT       Accounting I                        3
ACCT-212  ACCT       Accounting II                       3
CIS-220   CIS        Intro. to Computer Science          3
CIS-420   CIS        Database Design and Implementation  4
MATH-243  MATH       Mathematics for Managers            3
QM-261    CIS        Intro. to Statistics                3
QM-362    CIS        Statistical Applications

Table name: PREREQ

CRS_CODE  PRE_TAKE
CIS-420   CIS-220
QM-261    MATH-243
QM-362    MATH-243
QM-362    QM-261

Finally, the 1:* recursive relationship EMPLOYEE manages EMPLOYEE, shown in Figure 5.20, is implemented in Figure 5.25.

FIGURE 5.25 Implementation of the 1:* EMPLOYEE manages EMPLOYEE recursive relationship
Database name: Ch05_PartCo
Table name: EMPLOYEE_V2

EMP_CODE  EMP_LNAME  EMP_MANAGER
101       Mazwai     102
102       Orincona
103       Jones      102
104       Malherbe   102
105       Robertson  102
106       Deltona    102
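A 1:* recursive relationship such as EMPLOYEE manages EMPLOYEE is usually queried with a self-join. The sketch below assumes Python's sqlite3 module and declares EMP_MANAGER as a nullable foreign key that points back at the same table, which is an assumption about the constraints behind the EMPLOYEE_V2 data. Note that the manager's own row is loaded first, in line with the earlier remark about loading the 1 side before the many side.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.execute(
    """
    CREATE TABLE EMPLOYEE_V2 (
        EMP_CODE    INTEGER PRIMARY KEY,
        EMP_LNAME   TEXT NOT NULL,
        -- nullable: the top manager has no manager of their own
        EMP_MANAGER INTEGER REFERENCES EMPLOYEE_V2 (EMP_CODE)
    )
    """
)
employees = [
    (102, "Orincona", None),  # loaded first so the other rows can reference it
    (101, "Mazwai", 102),
    (103, "Jones", 102),
    (104, "Malherbe", 102),
    (105, "Robertson", 102),
    (106, "Deltona", 102),
]
with conn:
    conn.executemany("INSERT INTO EMPLOYEE_V2 VALUES (?, ?, ?)", employees)

# Self-join: E is the employee, M is the same table playing the manager role.
pairs = conn.execute(
    """
    SELECT E.EMP_LNAME, M.EMP_LNAME
    FROM EMPLOYEE_V2 E
    LEFT JOIN EMPLOYEE_V2 M ON M.EMP_CODE = E.EMP_MANAGER
    ORDER BY E.EMP_CODE
    """
).fetchall()
reports_to_orincona = sum(1 for _, manager in pairs if manager == "Orincona")
print(pairs, reports_to_orincona)
```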
5.1.11 Composite entities
You should recall from Chapter 3, Relational Model Characteristics, that the relational model generally requires the use of 1:* relationships. (You should also recall that the 1..1 relationship has its place, but it should be used with caution and proper justification.) If *:* relationships are encountered, you must create a bridge between the entities that display such relationships. Recall that the bridge entity (also known as a composite entity) is composed of the primary keys of each of the entities to be connected. (An example of such a bridge is shown in Figure 5.26.)

FIGURE 5.26 Converting the *:* relationship into two 1:* relationships
Database name: CH05_Travel_Agent

Table name: BOOKING

BOOKING_NO  EMP_ID   CUST_NO  BOOK_STATUS_CODE  EVENT_ID  HOTEL_ID  FLIGHT_NO  BOOK_TOTAL_COST  BOOKING_DATE
204200      1239986  101      1                                                225              06/04/2019
301200      1239986  102      1                                                90               04/02/2019
401211      4000768  1099     2                                                185              25/05/2019

Table name: TOUR_BOOKING

TOUR_ID  BOOKING_NO  TOUR_DATE
1001     401211      06/07/2019
1002     401211      08/07/2019
1004     204200      03/08/2019
1005     301200      07/09/2019
1001     301200      28/09/2019
Table name: TOUR

TOUR_ID  TOUR_NAME                      TOUR_PRICE_ADULT  TOUR_PRICE_CHILD  TOUR_PRICE_CON
1001     The Total London Experience    120               99                99
1002     London Gems                    65                50                55
1003     Big Bus City Tour              26                20                20
1004     Paris Night Tour               125               100               115
1005     Nairobi National Park Day Tour 20                10                20

TOUR_ID  TOUR_DESCRIPTION
1001     See the changing of the guards at Buckingham Palace, Covent Garden, the London Eye, St Paul's Cathedral, Westminster Abbey, the river Thames and more. Meeting Point: 4 Fountain Square. Daily at 08:45 a.m.
1002     Visit the Tower of London and the Crown Jewels and go on a boat cruise on the River Thames.
1003     See nine attractions, including Buckingham Palace. Hop on and off in nine different places. Receive details of stops when booking. Daily at 1:00 p.m.
1004     Enjoy dinner at the 58 Tour Eiffel restaurant, located on the first floor of the Eiffel Tower, then take a relaxing scenic river cruise of the Seine. Departs daily at 6:15 p.m.
1005     Pick up from location/locations to be advised. Arrive at the Nairobi National Park for the game drive/park formalities. Go on escorted Safari Walk.
As you examine Figure 5.26, note that the composite entity TOUR_BOOKING is existence-dependent on the other two entities; the composition of the TOUR_BOOKING entity is based on the primary keys of the entities that are connected by the composite entity. The composite entity may also contain additional attributes that play no role in the connective process. For example, although the entity must be composed of at least the BOOKING and TOUR primary keys, it may also include any additional attributes, in this case the tour date, which is the date on which that instance of the tour will take place for a specific booking.

Finally, keep in mind that the TOUR_BOOKING table's primary key (TOUR_ID and BOOKING_NO), which uniquely identifies each of its rows, is composed entirely of the primary keys of the TOUR and BOOKING tables. Therefore, no null entries are possible in the TOUR_BOOKING table's key attributes.

Implementing the small database shown in Figure 5.26 requires that you define the relationships clearly. Specifically, you must know the 1 and the * sides of each relationship, and you must know whether the relationships are mandatory or optional. For example, note the following points:

A TOUR may exist even though no bookings have currently been made for it. Therefore, if you examine Figure 5.27, an optional multiplicity (0..*) should appear on the BOOKING side of the *:* relationship between BOOKING and TOUR.

FIGURE 5.27 The *:* relationship between BOOKING and TOUR
BOOKING 0..* may_contain 0..* TOUR
You might argue that, for a tour to exist, at least one BOOKING must be made. Therefore, TOUR is mandatory to BOOKING from a purely conceptual point of view. However, when a new tour is first offered, it will not have had the opportunity to be booked. Therefore, at least initially, TOUR is optional to BOOKING. Note that the practical considerations in the data environment help dictate the use of optionalities. If TOUR is not optional to BOOKING from a database point of view, a booking must be made for the tour to allow it to be included in the database. But that's not how the process actually works. In short, the optionality reflects practice.

The ERD in Figure 5.28 shows that the *:* relationship between BOOKING and TOUR has been decomposed into two 1:* relationships through TOUR_BOOKING. In Figure 5.28, the optionalities have been transferred to TOUR_BOOKING. In other words, it now becomes possible for a TOUR not to occur in TOUR_BOOKING if no customer has actually booked that tour. Because a tour need not occur in TOUR_BOOKING, the TOUR_BOOKING entity becomes optional to TOUR. And because the TOUR_BOOKING entity is created before any bookings have been made, the TOUR_BOOKING entity is also optional to BOOKING.

FIGURE 5.28 A composite entity in an ERD
BOOKING: BOOKING_NO {PK}, EMP_ID {FK1}, CUST_NO {FK2}, BOOK_STATUS_CODE {FK3}, EVENT_ID {FK4}, HOTEL_ID {FK5}, FLIGHT_NO {FK6}, BOOK_TOTAL_COST
TOUR_BOOKING: TOUR_ID {PK}{FK1}, BOOKING_NO {PK}{FK2}, TOUR_DATE
TOUR: TOUR_ID {PK}, TOUR_NAME, TOUR_DESCRIPTION, TOUR_PRICE_ADULT, TOUR_PRICE_CHILD, TOUR_PRICE_CON
Relationships (has, may_contain): BOOKING 1..1 to 0..* TOUR_BOOKING; TOUR 1..1 to 0..* TOUR_BOOKING
NOTE
In a UML class diagram, an association class is used to represent a *:* association between two entities. The association class exists within the context of the associated entities and, as in the ER model, can have its own attributes. Figure 5.29 shows the use of an association class, TOUR_BOOKING, to represent the *:* relationship between BOOKING and TOUR. Note the multiplicities (0..*) on each side of the relationship, which indicate that the participation of both BOOKING and TOUR is optional.

FIGURE 5.29 An association class
BOOKING: BOOKING_NO {PK}, EMP_ID {FK1}, CUST_NO {FK2}, BOOK_STATUS_CODE {FK3}, EVENT_ID {FK4}, HOTEL_ID {FK5}, FLIGHT_NO {FK6}, BOOK_TOTAL_COST
TOUR: TOUR_ID {PK}, TOUR_NAME, TOUR_DESCRIPTION, TOUR_PRICE_ADULT, TOUR_PRICE_CHILD, TOUR_PRICE_CON
TOUR_BOOKING (association class): TOUR_ID {PK}{FK1}, BOOKING_NO {PK}{FK2}, TOUR_DATE
Association: BOOKING 0..* to 0..* TOUR
As customers make bookings for specific tours, they will be entered into the TOUR_BOOKING entity. Naturally, if a customer books more than one tour, the booking number will appear more than once in TOUR_BOOKING. For example, note that BOOKING_NO = 401211 occurs twice in the TOUR_BOOKING table in Figure 5.26. On the other hand, each customer booking occurs only once in the BOOKING table. (Note that the BOOKING table in Figure 5.26 has only one BOOKING_NO = 401211 entry.) Therefore, in Figure 5.28, the relationship between BOOKING and TOUR_BOOKING is shown to be 1:*, with the * (shown as the multiplicity (0..*)) on the TOUR_BOOKING side.

If you examine Figure 5.26, you will see that a tour can occur more than once in the TOUR_BOOKING table. For example, TOUR_ID = 1001 occurs twice in the TOUR_BOOKING table. However, TOUR_ID = 1001 occurs only once in the TOUR table to reflect that the relationship between TOUR_BOOKING and TOUR is 1:*. Note that, in Figure 5.28, the * (shown as the multiplicity (0..*)) is located on the TOUR_BOOKING side, while the 1 (shown as the multiplicity (1..1)) is located on the TOUR side.
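The occurrence counts discussed for Figure 5.26 can be checked with a short script. What follows is only a minimal sketch, assuming Python's built-in sqlite3 module and simplified column types, neither of which appears in the text. BOOKING and TOUR are reduced to their key columns, the TOUR_BOOKING rows are those of Figure 5.26, and the two rejected rows at the end are hypothetical test data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when enabled

conn.executescript("""
CREATE TABLE BOOKING (BOOKING_NO INTEGER PRIMARY KEY);
CREATE TABLE TOUR    (TOUR_ID    INTEGER PRIMARY KEY);
-- Composite entity: its PK combines the two parent PKs,
-- so neither key column can hold a null.
CREATE TABLE TOUR_BOOKING (
    TOUR_ID    INTEGER NOT NULL REFERENCES TOUR(TOUR_ID),
    BOOKING_NO INTEGER NOT NULL REFERENCES BOOKING(BOOKING_NO),
    TOUR_DATE  TEXT,
    PRIMARY KEY (TOUR_ID, BOOKING_NO)
);
""")

conn.executemany("INSERT INTO BOOKING VALUES (?)",
                 [(204200,), (301200,), (401211,)])
conn.executemany("INSERT INTO TOUR VALUES (?)",
                 [(t,) for t in (1001, 1002, 1003, 1004, 1005)])
conn.executemany("INSERT INTO TOUR_BOOKING VALUES (?, ?, ?)", [
    (1001, 401211, "06/07/2019"),
    (1002, 401211, "08/07/2019"),
    (1004, 204200, "03/08/2019"),
    (1005, 301200, "07/09/2019"),
    (1001, 301200, "28/09/2019"),
])


def occurrences(column, value):
    """Count the TOUR_BOOKING rows in which `column` equals `value`."""
    assert column in ("TOUR_ID", "BOOKING_NO")  # whitelist the column name
    sql = f"SELECT COUNT(*) FROM TOUR_BOOKING WHERE {column} = ?"
    return conn.execute(sql, (value,)).fetchone()[0]


def try_insert(row):
    """Return True if the row is accepted, False if a key constraint rejects it."""
    try:
        conn.execute("INSERT INTO TOUR_BOOKING VALUES (?, ?, ?)", row)
        return True
    except sqlite3.IntegrityError:
        return False


booking_count = occurrences("BOOKING_NO", 401211)  # the * side of BOOKING
tour_count = occurrences("TOUR_ID", 1001)          # the * side of TOUR

# Optional participation: tours that never occur in TOUR_BOOKING.
unbooked = [row[0] for row in conn.execute("""
    SELECT T.TOUR_ID
    FROM TOUR T LEFT JOIN TOUR_BOOKING TB ON TB.TOUR_ID = T.TOUR_ID
    WHERE TB.TOUR_ID IS NULL
    ORDER BY T.TOUR_ID""")]

duplicate_accepted = try_insert((1001, 401211, "01/01/2020"))  # repeats a PK
orphan_accepted = try_insert((9999, 401211, "01/01/2020"))     # unknown tour

print(booking_count, tour_count, unbooked, duplicate_accepted, orphan_accepted)
```

The composite primary key is what prevents the same tour from being recorded twice for one booking, and the foreign keys are what make TOUR_BOOKING existence-dependent on its two parents.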
5.2 DEVELOPING AN ER DIAGRAM
The process of database design is an iterative rather than a linear or sequential process. The verb iterate means to do again or repeatedly. An iterative process is thus one based on repetition of processes and procedures. Building an ERD usually involves the following activities:

• Create a detailed narrative of the organisation's description of operations.
• Identify the business rules based on the descriptions of operations.
• Identify all main entities from the business rules.
• Identify all main relationships between entities from the business rules.
• Develop an initial ERD.
• Determine the multiplicities and the participation of all relationships. Remember, participation involves identifying whether a relationship can be optional or mandatory for each entity.
• Identify the primary and foreign keys.
• Identify all attributes.
• Revise and review the ERD.

During the review process, it is likely that additional objects, attributes and relationships will be uncovered. Therefore, the basic ERM will be modified to incorporate the newly discovered ER components. Subsequently, another round of reviews may yield additional components or clarification of the existing diagram. The process is repeated until the end users and designers agree that the ERD is a fair representation of the organisation's activities and functions.

During the design process, the database designer does not depend simply on interviews to help define entities, attributes and relationships. A surprising amount of information can be gathered by examining the business forms and reports that an organisation uses in its daily operations. In this section, we will use two case studies, Tiny University and ILoveHolidays, to show the iterative process involved in creating an ERD.

5.2.1 Tiny University Case Study
To start constructing an ERD, an initial interview is required with the Tiny University administrators. The interview process yields the following business rules:

1. Tiny University (TU) is divided into several schools: a school of business, a school of arts and sciences, a school of education, and a school of applied sciences. Each school is administered by a dean, who is a lecturer who has reached the grade of professor (LECT_GRADE has a value = PROF). Keep in mind that each dean can administer only one school. Therefore, a 1:1 relationship exists between LECTURER and SCHOOL. Note that the multiplicity can be expressed by (1..1) for the entity LECTURER and by (0..1) for the entity SCHOOL. (The smallest number of deans per school is one, as is the largest number, and each dean is assigned to only one school.) However, not all lecturers are deans, so we need to ensure that the entity SCHOOL has optional participation.

2. Each school is composed of several departments. For example, the school of business has an accounting department, a management/marketing department, an economics/finance department and a computer information systems department. Note again the cardinality rules: the smallest number of departments operated by a school is one, and the largest number of departments is indeterminate (*). On the other hand, each department belongs to only a single school; thus, the multiplicity is expressed by (1..1). That is, the minimum number of schools that a department belongs to is one, as is the maximum number. Figure 5.30 illustrates these first two business rules.
FIGURE 5.30 The first Tiny University ERD segment

NOTE
It is again appropriate to evaluate the reason for maintaining the 1:1 relationship between LECTURER and SCHOOL. It is worth repeating that the existence of 1:1 relationships often indicates a misidentification of attributes as entities. In this case, the 1:1 relationship could easily be eliminated by storing the dean's attributes in the SCHOOL entity. This solution would also make it easier to answer the queries, who is the dean? and what are that dean's credentials? The downside of this solution is that it requires the duplication of data that are already stored in the LECTURER table, thus setting the stage for anomalies. However, because each school is run by a single dean, the problem of data duplication is rather minor. The selection of one approach over another often depends on information requirements, transaction speed, and the database designer's professional judgement. In short, do not use 1:1 relationships lightly, and make sure that each 1:1 relationship within the database design is defensible.

3. Each department may offer courses. For example, the management/marketing department offers courses such as Introduction to Management, Principles of Marketing, and Production Management. The ERD segment for this condition is shown in Figure 5.31. Note that this relationship is based on the way Tiny University operates. If, for example, Tiny University had some departments that were classified as research only, those departments would not offer courses; therefore, the COURSE entity would be optional to the DEPARTMENT entity.

FIGURE 5.31 The second Tiny University ERD segment
4. A CLASS is a section of a COURSE. That is, a department may offer several sections (classes) of the same database course. Each of those classes is taught by a lecturer at a given time in a given place. In short, a 1:* relationship exists between COURSE and CLASS. However, because a course may exist in Tiny University's course catalogue even when it is not offered as a class in a current class schedule, CLASS is optional to COURSE. Therefore, the relationship between COURSE and CLASS can look like that shown in Figure 5.32.

FIGURE 5.32 The third Tiny University ERD segment

5. Each department may have lecturers assigned to it. One of the lecturers whose grade is a professor (LECT_GRADE = PROF) chairs the department. Only one of the lecturers can chair the department to which (s)he is assigned, and no lecturer is required to accept the chair position. Therefore, DEPARTMENT is optional to LECTURER in the chairs relationship. Those relationships are summarised in the ER segments shown in Figure 5.33.

FIGURE 5.33 The fourth Tiny University ERD segment
LECTURER: LECT_NUM {PK}, DEPT_CODE {FK}, LECT_SPECIALITY, LECT_GRADE, LECT_LNAME, LECT_FNAME, LECT_INITIAL, LECT_EMAIL
DEPARTMENT: DEPT_CODE {PK}, LECT_NUM {FK1}, DEPT_NAME
Relationships: employs (DEPARTMENT 1..1 to 0..* LECTURER); chairs (LECTURER 1..1 to 0..1 DEPARTMENT)
6. Each lecturer may teach up to four classes; each class is a section of a course. A lecturer may also be on a research contract and teach no classes at all. The ERD segments in Figure 5.34 depict those conditions.

FIGURE 5.34 The fifth Tiny University ERD segment

7. A student may enrol in several classes, but (s)he takes each class only once during any given enrolment period. For example, during the current enrolment period, a student may decide to take five classes (Statistics, Accounting, English, Database and History), but that student would never be enrolled in the same Statistics class five times during the enrolment period! Each student may enrol in up to six classes, and each class may have up to 35 students, thus creating a *:* relationship between STUDENT and CLASS. Because a CLASS can initially exist (at the start of the enrolment period) even though no students have enrolled in it, STUDENT is optional to CLASS in the *:* relationship. This *:* relationship must be divided into two 1:* relationships through the use of the ENROL entity, shown in the ERD segment in Figure 5.35. But note that optional participation is shown next to ENROL. If a class exists that has no students enrolled in it, that class never occurs in the ENROL table. Note also that the ENROL entity is weak: it is existence-dependent, and its (composite) PK is composed of the PKs of the STUDENT and CLASS entities. You can add the multiplicities (0..6) and (0..35) next to the ENROL entity to reflect the business rule constraints as shown in Figure 5.35.

FIGURE 5.35 The sixth Tiny University ERD segment
STUDENT: STU_NUM {PK}, STU_FNAME, STU_LNAME, STU_INITIAL, STU_EMAIL
ENROL: STU_NUM {PK}{FK1}, CLASS_CODE {PK}{FK2}, ENROL_DATE, ENROL_GRADE
CLASS: CLASS_CODE {PK}, CLASS_TIME
Relationships: is_written_in (STUDENT 1..1 to 0..6 ENROL); is_found_in (CLASS 1..1 to 0..35 ENROL)
8. Each department has several (hopefully many) students whose major is offered by that department. However, each student has only a single major and is, therefore, associated with a single department. (See Figure 5.36.) However, in the Tiny University environment, it is possible, at least for a while, for a student not to declare a major field of study. Such a student would not be associated with a department; therefore, DEPARTMENT is optional to STUDENT. It is worth repeating that the relationships between entities and the entities themselves reflect the organisation's operating environment. That is, the business rules define the ERD components.

FIGURE 5.36 The seventh Tiny University ERD segment

9. Each student has an advisor in his or her department; each advisor counsels several students. An advisor is also a lecturer, but not all lecturers advise students. Therefore, STUDENT is optional to LECTURER in the LECTURER advises STUDENT relationship. (See Figure 5.37.)

FIGURE 5.37 The eighth Tiny University ERD segment
LECTURER: LECT_NUM {PK}, LECT_SPECIALITY, LECT_RANK, LECT_LNAME, LECT_FNAME, LECT_INITIAL, LECT_EMAIL, LECT_GRADE
STUDENT: STU_NUM {PK}, LECT_NUM {FK1}, STU_FNAME, STU_LNAME, STU_INITIAL, STU_EMAIL
Relationship: advises (LECTURER 1..1 to 0..* STUDENT)
10. When you examine the CLASS entity in Figure 5.38, you'll note that this entity contains a ROOM_CODE attribute. Given the naming conventions, it is clear that ROOM_CODE is a FK to another entity. Clearly, because a class is taught in a room, it is reasonable to assume that the ROOM_CODE in CLASS is the FK to an entity named ROOM. In turn, each room is located in a building. So the last Tiny University ERD is created by observing that a BUILDING can contain many ROOMs, but each ROOM is found in a single BUILDING. (See Figure 5.38.) In this ERD segment, it is clear that some buildings do not contain (class) rooms. For example, a storage building might not contain any named rooms at all.

FIGURE 5.38 The ninth Tiny University ERD segment
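Business rule 7 puts upper bounds on enrolment: (0..6) classes per student and (0..35) students per class. Primary and foreign keys cannot express such limits on their own, so one option is a trigger on the composite entity. The following is a minimal sketch, assuming Python's built-in sqlite3 module; the trigger name, the error messages and the sample student and class numbers are illustrative and do not come from the text, and only key columns are kept.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when enabled

conn.executescript("""
CREATE TABLE STUDENT (STU_NUM    INTEGER PRIMARY KEY);
CREATE TABLE CLASS   (CLASS_CODE INTEGER PRIMARY KEY);
-- Weak (composite) entity: PK built from the PKs of STUDENT and CLASS.
CREATE TABLE ENROL (
    STU_NUM    INTEGER NOT NULL REFERENCES STUDENT(STU_NUM),
    CLASS_CODE INTEGER NOT NULL REFERENCES CLASS(CLASS_CODE),
    PRIMARY KEY (STU_NUM, CLASS_CODE)
);
-- Hypothetical trigger enforcing the (0..6) and (0..35) multiplicities.
CREATE TRIGGER enrol_limits BEFORE INSERT ON ENROL
BEGIN
    SELECT RAISE(ABORT, 'student already has 6 classes')
    WHERE (SELECT COUNT(*) FROM ENROL WHERE STU_NUM = NEW.STU_NUM) >= 6;
    SELECT RAISE(ABORT, 'class already has 35 students')
    WHERE (SELECT COUNT(*) FROM ENROL WHERE CLASS_CODE = NEW.CLASS_CODE) >= 35;
END;
""")

# Illustrative data: students 1..36 and classes 101..107.
conn.executemany("INSERT INTO STUDENT VALUES (?)", [(n,) for n in range(1, 37)])
conn.executemany("INSERT INTO CLASS VALUES (?)", [(c,) for c in range(101, 108)])


def enrol(stu_num, class_code):
    """Try to enrol a student; return 'ok' or the constraint message."""
    try:
        conn.execute("INSERT INTO ENROL VALUES (?, ?)", (stu_num, class_code))
        return "ok"
    except sqlite3.IntegrityError as exc:
        return str(exc)


first_six = [enrol(1, c) for c in range(101, 107)]  # student 1 takes six classes
seventh = enrol(1, 107)                             # a seventh class is refused

enrol(2, 101)
repeat = enrol(2, 101)        # same class twice: the composite PK rejects it

for stu in range(3, 36):      # students 3..35 bring class 101 up to 35 students
    enrol(stu, 101)
full_class = enrol(36, 101)   # a 36th student is refused

print(first_six, seventh, repeat, full_class, sep="\n")
```

The composite primary key covers the rule that a student takes each class only once; the trigger covers the two upper bounds. A production design might instead check these limits in application code or in a stored procedure.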
Using the preceding summary, you can identify the following entities:

SCHOOL          COURSE
DEPARTMENT      CLASS
LECTURER        STUDENT
BUILDING        ROOM
ENROL (the bridge entity between STUDENT and CLASS)

Once you have discovered the relevant entities, you can define the initial set of relationships among them. Next, you describe the entity attributes. Identifying the attributes of the entities helps you better understand the relationships among entities. Table 5.4 summarises the ERM's components, and names the entities and their relations.
TABLE 5.4 Components of the ERM

Entity        Relationship    Connectivity    Entity
SCHOOL        operates        1..*            DEPARTMENT
DEPARTMENT    has             1..*            STUDENT
DEPARTMENT    employs         1..*            LECTURER
DEPARTMENT    offers          1..*            COURSE
COURSE        generates       1..*            CLASS
LECTURER      is dean of      1..1            SCHOOL
LECTURER      chairs          1..1            DEPARTMENT
LECTURER      teaches         1..*            CLASS
LECTURER      advises         1..*            STUDENT
STUDENT       enrols in       1..*            CLASS
BUILDING      contains        1..*            ROOM
ROOM          is used for     1..*            CLASS

Note: ENROL is the composite entity that implements the relationship STUDENT enrols in CLASS.
You must also define the connectivity and cardinality for the just-discovered relations by querying the end user extensively. Having defined the ERM's components, you can now draw the ERD, depicted in Figure 5.39. Actually, the entity attributes and their domains should also be displayed in the conceptual diagram, or ERD. However, to avoid crowding the diagram, the entity attributes may be depicted separately.
FIGURE 5.39 The completed Tiny University ERD segment
LECTURER: LECT_NUM {PK}, DEPT_CODE {FK}, LECT_SPECIALITY, LECT_RANK, LECT_LNAME, LECT_FNAME, LECT_INITIAL, LECT_EMAIL, LECT_GRADE
SCHOOL: SCHOOL_CODE {PK}, LECT_NUM {FK1}, SCHOOL_NAME
DEPARTMENT: DEPT_CODE {PK}, SCHOOL_CODE {FK1}, LECT_NUM {FK2}, DEPT_NAME
STUDENT: STU_NUM {PK}, DEPT_CODE {FK1}, LECT_NUM {FK2}, STU_FNAME, STU_LNAME, STU_INITIAL, STU_EMAIL
COURSE: CRS_CODE {PK}, DEPT_CODE {FK1}, CRS_TITLE, CRS_DESCRIPTION, CRS_CREDITES
CLASS: CLASS_CODE {PK}, CLASS_SECTION, CLASS_TIME, CRS_CODE {FK1}, LECT_CODE {FK2}, ROOM_CODE {FK3}
ENROL: STU_NUM {PK}{FK1}, CLASS_CODE {PK}{FK2}, ENROL_GRADE
ROOM: ROOM_CODE {PK}, BLDG_CODE {FK1}, ROOM_TYPE
BUILDING: BLDG_CODE {PK}, BLDG_NAME, BLDG_LOCATION
Relationships:
is_dean_of      LECTURER 1..1 to 0..1 SCHOOL
operates        SCHOOL 1..1 to 1..* DEPARTMENT
employs         DEPARTMENT 1..1 to 0..* LECTURER
chairs          LECTURER 1..1 to 0..1 DEPARTMENT
has             DEPARTMENT 1..1 to 0..* STUDENT
advises         LECTURER 1..1 to 0..* STUDENT
teaches         LECTURER 1..1 to 0..* CLASS
offers          DEPARTMENT 1..1 to 0..* COURSE
generates       COURSE 1..1 to 0..* CLASS
is_written_in   STUDENT 1..1 to 0..* ENROL
is_found_in     CLASS 1..1 to 0..* ENROL
is_used_for     ROOM 1..1 to 0..* CLASS
contains        BUILDING 1..1 to 0..* ROOM
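The is_dean_of relationship in Figure 5.39 is 1:1 with optional participation on the SCHOOL side: every school has exactly one dean, a lecturer can be dean of at most one school, and most lecturers are not deans at all. One common way to implement this is to place the foreign key in SCHOOL and declare it both NOT NULL and UNIQUE. The following is a minimal sketch, assuming Python's built-in sqlite3 module; the lecturer names, the school codes and the non-PROF grade value are hypothetical sample data, and the rule that a dean must hold the grade PROF is not enforced here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when enabled

conn.executescript("""
CREATE TABLE LECTURER (
    LECT_NUM   INTEGER PRIMARY KEY,
    LECT_LNAME TEXT NOT NULL,
    LECT_GRADE TEXT NOT NULL
);
CREATE TABLE SCHOOL (
    SCHOOL_CODE TEXT PRIMARY KEY,
    SCHOOL_NAME TEXT NOT NULL,
    -- NOT NULL: a school must have a dean (1..1 on the LECTURER side).
    -- UNIQUE:   a lecturer is dean of at most one school (0..1 on the SCHOOL side).
    LECT_NUM    INTEGER NOT NULL UNIQUE REFERENCES LECTURER(LECT_NUM)
);
""")

# Hypothetical sample lecturers.
conn.executemany("INSERT INTO LECTURER VALUES (?, ?, ?)", [
    (1, "Okoro", "PROF"),
    (2, "Smith", "PROF"),
    (3, "Lee", "SENIOR"),
])


def add_school(code, name, dean):
    """Try to insert a school; return 'ok' or the constraint message."""
    try:
        conn.execute("INSERT INTO SCHOOL VALUES (?, ?, ?)", (code, name, dean))
        return "ok"
    except sqlite3.IntegrityError as exc:
        return str(exc)


first_dean = add_school("BUS", "School of Business", 1)
same_dean_again = add_school("EDU", "School of Education", 1)      # UNIQUE fails
no_dean = add_school("ART", "School of Arts and Sciences", None)   # NOT NULL fails

# Optional participation: lecturers who are not dean of any school.
non_deans = [row[0] for row in conn.execute("""
    SELECT L.LECT_NUM
    FROM LECTURER L LEFT JOIN SCHOOL S ON S.LECT_NUM = L.LECT_NUM
    WHERE S.SCHOOL_CODE IS NULL
    ORDER BY L.LECT_NUM""")]

print(first_dean, same_dean_again, no_dean, non_deans, sep="\n")
```

Placing the key in SCHOOL rather than in LECTURER avoids a column that would be null for every lecturer who is not a dean.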
5.2.2 ILoveHolidays

ILoveHolidays is a small international company that owns a number of independent travel agencies in a number of countries. The travel agencies specialise in booking complete holidays, hotels, flights, tours and one-off events. They also offer information to customers on attractions and places of interest in a number of cities worldwide. From interviews with various stakeholders and employees, the following business rules have been established:

1. Each travel agent has a number of employees who each have an associated grade (Manager, Deputy Manager or Staff). Each travel agent must have one employee who takes the role of Manager. Therefore a 1:* mandatory relationship exists between TRAVEL_AGENT and EMPLOYEE. Figure 5.40 illustrates this first business rule.

FIGURE 5.40 Segment 1: the TRAVEL_AGENT-EMPLOYEE relationship
TRAVEL_AGENT: AGENT_ID {PK}, AGENT_NAME, AGENT_ADDRESS, AGENT_PHONE, AGENT_EMAIL
EMPLOYEE: EMP_ID {PK}, PAYROLL_NO, AGENT_ID {FK1}, EMP_LNAME, EMP_FNAME, EMP_PHONE, EMP_GRADE
Relationship: employs (TRAVEL_AGENT 1..1 to 1..* EMPLOYEE)

2. Each employee may make bookings on behalf of customers when they visit one of the travel agencies. However, some employees, such as the Manager, may be confined to back office duties and may not make a booking. This is why BOOKING is optional to EMPLOYEE. A booking can only exist if it has been made by an employee. The ERD segment in Figure 5.41 shows the 1:* relationship that exists between EMPLOYEE and BOOKING.

FIGURE 5.41 Segment 2: the EMPLOYEE-BOOKING relationship
EMPLOYEE: EMP_ID {PK}, PAYROLL_NO, AGENT_ID {FK1}, EMP_LNAME, EMP_FNAME, EMP_PHONE, EMP_GRADE
BOOKING: BOOKING_NO {PK}, EMP_ID {FK1}, CUST_NO {FK2}, BOOK_STATUS_CODE {FK3}, EVENT_ID {FK4}, HOTEL_ID {FK5}, FLIGHT_NO {FK6}, BOOK_TOTAL_COST, BOOKING_DATE
Relationship: makes (EMPLOYEE 1..1 to 0..* BOOKING)
3. A customer can make one or more bookings at any of the travel agencies. Each booking must be for at least one customer (himself or herself) but could also include other people, such as family members or friends. In this scenario the customer becomes the lead traveller and is responsible for the overall booking. A booking may include 1 or more party members. This is represented by the 1:* relationship between BOOKING and PARTY_MEMBERS, where PARTY_MEMBERS is optional to BOOKING. At the same time, a PARTY_MEMBER is for only one customer; the relationship between PARTY_MEMBERS and CUSTOMER is therefore *:1. Figure 5.42 shows the relationships between the CUSTOMER, BOOKING and PARTY_MEMBERS entities.

FIGURE 5.42 Segment 3: the CUSTOMER-BOOKING-PARTY_MEMBERS relationship
CUSTOMER: CUST_NO {PK}, CUST_FNAME, CUST_LNAME, CUST_ADDRESS, CUST_DOB, CUST_PHONE, CUST_EMAIL
PARTY_MEMBERS: CUST_NO {PK}{FK1}, BOOKING_NO {PK}{FK2}, PARTY_LNAME, PARTY_FNAME, PARTY_DOB, OUT_SEAT_NO, IN_SEAT_NO
BOOKING: BOOKING_NO {PK}, EMP_ID {FK1}, CUST_NO {FK2}, BOOK_STATUS_CODE {FK3}, EVENT_ID {FK4}, HOTEL_ID {FK5}, FLIGHT_NO {FK6}, BOOK_TOTAL_COST, BOOKING_DATE
Relationships: makes (CUSTOMER to BOOKING), Party (CUSTOMER to PARTY_MEMBERS), has (BOOKING to PARTY_MEMBERS). Each parent side is 1..1; the child sides carry the multiplicities 0..*, 0..* and 1..*.
4. Each BOOKING is assigned a BOOKING_STATUS_CODE. These codes allow the travel agencies to track the status of the booking. When a customer makes a booking, he or she must pay a deposit. The booking status code is then set to Deposit Paid. Once the cost of the booking is paid in full, the booking status code changes to Fully Paid. Booking status codes also exist for a booking that is cancelled and for a booking that is complete, which is set after the customer has completed his or her travel plans. Figure 5.43 shows the 1:* relationship between BOOKING_STATUS_CODE and BOOKING, where BOOKING is optional to BOOKING_STATUS_CODE.

FIGURE 5.43 Segment 4: the BOOKING-BOOKING_STATUS_CODE relationship
BOOKING_STATUS_CODE: BOOKING_STATUS_CODE {PK}, DESCRIPTION
BOOKING: BOOKING_NO {PK}, EMP_ID {FK1}, CUST_NO {FK2}, BOOKING_STATUS_CODE {FK3}, EVENT_ID {FK4}, HOTEL_ID {FK5}, FLIGHT_NO {FK6}, BOOKING_TOTAL_COST, BOOKING_DATE
Relationship: BOOKING_STATUS_CODE 1..1 to 0..* BOOKING
5. ILoveHolidays also sells tickets for a number of events such as the Monaco Grand Prix or the Wimbledon Tennis Championships. Figure 5.44 shows the relationship between EVENT and BOOKING. Both sides of the relationship allow for optional participation. That is because a booking may or may not be for an event and an event may or may not be booked by a customer. Regardless, the travel agencies will keep details of all events offered within their database.

FIGURE 5.44 Segment 5: the BOOKING-EVENT relationship
BOOKING: BOOKING_NO {PK}, EMP_ID {FK1}, CUST_NO {FK2}, BOOKING_STATUS_CODE {FK3}, EVENT_ID {FK4}, HOTEL_ID {FK5}, FLIGHT_NO {FK6}, BOOKING_TOTAL_COST, BOOKING_DATE
EVENT: EVENT_ID {PK}, EVENT_DESCRIPTION, EVENT_PRICE_ADULT, EVENT_PRICE_CHILD, EVENT_PRICE_CON, EVENT_DATE
Relationship: may_contain (BOOKING 0..* to 0..1 EVENT)
6. A booking made by a customer may be for a number of tours. For example, a customer visiting Sydney in Australia may book separate tours to visit the Sydney Opera House, see the Blue Mountains, and spend a day on Bondi Beach. Details of all tours offered by the travel agencies are stored in the TOUR table (entity), and therefore a TOUR can exist without a BOOKING being made. The initial relationship between BOOKING and TOUR is *:*, but this relationship must be divided into two 1:* relationships through the use of the TOUR_BOOKING entity, as shown in Figure 5.45. Note that optional participation is shown next to TOUR_BOOKING. If a tour exists that no one ever books, that tour can never appear in the TOUR_BOOKING table. TOUR_BOOKING is also an example of a WEAK entity: it is existence-dependent and has a composite PK composed of the PKs of the TOUR and BOOKING entities.

FIGURE 5.45 Segment 6: the BOOKING-TOUR_BOOKING-TOUR relationship
BOOKING: BOOKING_NO {PK}, EMP_ID {FK1}, CUST_NO {FK2}, BOOKING_STATUS_CODE {FK3}, EVENT_ID {FK4}, HOTEL_ID {FK5}, FLIGHT_NO {FK6}, BOOKING_TOTAL_COST, BOOKING_DATE
TOUR_BOOKING: TOUR_ID {PK}{FK1}, BOOKING_NO {PK}{FK2}, TOUR_DATE, TOTAL_TOUR_COST
TOUR: TOUR_ID {PK}, TOUR_NAME, TOUR_DESCRIPTION, TOUR_PRICE_ADULT, TOUR_PRICE_CHILD, TOUR_PRICE_CON
Relationships (has, may_contain): BOOKING 1..1 to 0..* TOUR_BOOKING; TOUR 1..1 to 0..* TOUR_BOOKING
7. Figure 5.46 shows the relationship between TOUR, ATTRACT_TOUR and ATTRACTION. A tour may comprise visits to a number of attractions and at the same time different combinations of attractions may be offered on different tours. This means that, initially, a *:* relationship existed between TOUR and ATTRACTION, which needed to be resolved by the addition of the weak entity ATTRACT_TOUR. An attraction may exist without belonging to a tour and therefore the travel agencies would be able to provide information to the customer about the attraction such as travel instructions. This is a specific requirement of ILoveHolidays in order to provide additional help to customers and exceed expectations. Note that ATTRACT_TOUR has a composite PK comprising the PKs from TOUR and ATTRACTION.

FIGURE 5.46 Segment 7: the TOUR-ATTRACT_TOUR-ATTRACTION relationship
TOUR: TOUR_ID {PK}, TOUR_NAME, TOUR_DESCRIPTION, TOUR_PRICE_ADULT, TOUR_PRICE_CHILD, TOUR_PRICE_CON
ATTRACT_TOUR: TOUR_ID {PK}{FK1}, ATTRACTION_NO {PK}{FK2}
ATTRACTION: ATTRACTION_NO {PK}, CITY_ID {FK}, ATTRACT_TYPE, ATTRACT_NAME, ATTRACT_WEBSITE, ATTRACT_PHONE, ATTRACT_OPENING_TIME, ATTRACT_CLOSING_TIME, ATTRACT_ADDRESS, ATTRACT_TRAVEL_INSTRUCT, ATTRACT_COST_ADULT, ATTRACT_COST_CHILD, ATTRACT_COST_CON
Relationships: may_contain (TOUR 1..1 to 0..* ATTRACT_TOUR); may_be_visited (ATTRACTION 1..1 to 0..* ATTRACT_TOUR)
8 Each booking may be for one hotel. Hotels that exist in the HOTEL table may never be booked or may be booked on multiple occasions. This business rule is illustrated in Figure 5.47, which shows the relationship between the BOOKING and HOTEL entities. Note that both sides of the relationship are optional.
Figure 5.47 Segment 8: the BOOKING HOTEL relationship
[UML class diagram: BOOKING (0..*) consists_of (0..1) HOTEL]
BOOKING: BOOKING_NO {PK}, EMP_ID {FK1}, CUST_NO {FK2}, BOOKING_STATUS_CODE {FK3}, EVENT_ID {FK4}, HOTEL_ID {FK5}, FLIGHT_NO {FK6}, BOOKING_TOTAL_COST, BOOKING_DATE
HOTEL: HOTEL_ID {PK}, HOTEL_NAME, HOTEL_STARS, HOTEL_PHONE, HOTEL_EMAIL, HOTEL_ADDRESS, CITY_ID {FK}, DOUBLE_ROOM_PRICE, FAMILY_ROOM_PRICE, SINGLE_ROOM_PRICE, HOTEL_NO_NIGHTS, HOTEL_DATE
9 A booking may be for one specific flight and a specific flight may be on many bookings. The relationship between BOOKING and FLIGHT is therefore *:1 with both sides being optional, as can be seen in Figure 5.48.
Figure 5.48 Segment 9: the BOOKING FLIGHT relationship
[UML class diagram: BOOKING (0..*) consists_of (0..1) FLIGHT]
BOOKING: BOOKING_NO {PK}, EMP_ID {FK1}, CUST_NO {FK2}, BOOKING_STATUS_CODE {FK3}, EVENT_ID {FK4}, HOTEL_ID {FK5}, FLIGHT_NO {FK6}, BOOKING_TOTAL_COST, BOOKING_DATE
FLIGHT: FLIGHT_NO {PK}, FLIGHT_AIRLINE, FLIGHT_DEPART_AIRPORT, FLIGHT_ARRIVE_AIRPORT, FLIGHT_DEPART_DATETIME, FLIGHT_ARRIVE_DATETIME, FLIGHT_COST
10 In order for employees to search for attractions in any given city, ILoveHolidays wishes to store details of what attractions exist in each city. An attraction exists in one and only one city whilst a city may have any number of attractions. The relationship between ATTRACTION and CITY is shown in Figure 5.49.
Figure 5.49 Segment 9: the ATTRACTION CITY relationship
[UML class diagram: ATTRACTION (0..*) exists_in (1..1) CITY]
ATTRACTION: ATTRACTION_NO {PK}, CITY_ID {FK}, ATTRACT_TYPE, ATTRACT_NAME, ATTRACT_WEBSITE, ATTRACT_PHONE, ATTRACT_OPENING_TIME, ATTRACT_CLOSING_TIME, ATTRACT_ADDRESS, ATTRACT_TRAVEL_INSTRUCT, ATTRACT_COST_ADULT, ATTRACT_COST_CHILD, ATTRACT_COST_CON
CITY: CITY_ID {PK}, COUNTRY_ID {FK}, CITY_NAME, LOCAL_WEBSITE
11 In order to deal with more detailed enquiries from customers about which cities to visit in a given country (for example, based upon the number of attractions in each city), it is necessary to model a relationship between cities and their associated country. One country has one or more cities whilst a city can only exist in one country (see Figure 5.50).

Figure 5.50 Segment 10: the CITY COUNTRY relationship
[UML class diagram: COUNTRY (1..1) has (1..*) CITY]
CITY: CITY_ID {PK}, COUNTRY_ID {FK}, CITY_NAME, LOCAL_WEBSITE
COUNTRY: COUNTRY_ID {PK}, COUNTRY_NAME, TOURISM_WEB_SITE, MAIN_LANGUAGE
12 Each city where a customer would like to stay will hopefully have a selection of hotels that are available. To allow the travel agencies to search for hotels by city, two entities, HOTEL and CITY, are required. Figure 5.51 shows the 1:* relationship between CITY and HOTEL. Notice that HOTEL is optional in the relationship as a city may not have any hotels that are deemed good enough for the travel agencies to recommend (and therefore would not be included in the HOTEL table).

Figure 5.51 Segment 11: the HOTEL CITY relationship
[UML class diagram: HOTEL (0..*) exists_in (1..1) CITY]
HOTEL: HOTEL_ID {PK}, HOTEL_NAME, HOTEL_STARS, HOTEL_PHONE, HOTEL_EMAIL, HOTEL_ADDRESS, CITY_ID {FK}, DOUBLE_ROOM_PRICE, FAMILY_ROOM_PRICE, SINGLE_ROOM_PRICE, HOTEL_NO_NIGHTS, HOTEL_DATE
CITY: CITY_ID {PK}, COUNTRY_ID {FK}, CITY_NAME, LOCAL_WEBSITE
13 A customer makes at least one or, in some cases, many payments in order to pay off the total cost of their booking. The relationship between CUSTOMER and PAYMENT (shown in Figure 5.52) is mandatory on both sides as a customer must make at least one payment and one payment must be associated with a CUSTOMER.
Figure 5.52 Segment 12: the CUSTOMER PAYMENT relationship
[UML class diagram: CUSTOMER (1..1) provides (1..*) PAYMENT]
CUSTOMER: CUST_NO {PK}, CUST_FNAME, CUST_LNAME, CUST_ADDRESS, CUST_DOB, CUST_PHONE, CUST_EMAIL, OUT_SEAT_NO, IN_SEAT_NO
PAYMENT: PAYMENT_NO {PK}, CUST_NO {FK1}, INVOICE_NO {FK2}, AMOUNT_PAID, PAYMENT_TYPE, DATE_PAID
14 A booking will generate at least one invoice but may generate many invoices. This will depend on whether the customer chooses to pay for his or her booking all at once. In this case, only one invoice will be produced. Otherwise, several invoices may need to be generated for a specific booking. The 1:* mandatory relationship between BOOKING and INVOICE can be seen in Figure 5.53.

Figure 5.53 Segment 13: the BOOKING INVOICE relationship
[UML class diagram: BOOKING (1..1) generates (1..*) INVOICE]
BOOKING: BOOKING_NO {PK}, EMP_ID {FK1}, CUST_NO {FK2}, BOOKING_STATUS_CODE {FK3}, EVENT_ID {FK4}, HOTEL_ID {FK5}, FLIGHT_NO {FK6}, BOOKING_TOTAL_COST, BOOKING_DATE
INVOICE: INVOICE_NO {PK}, BOOKING_NO {FK1}, INVOICE_DATE, INVOICE_BALANCE
15 Figure 5.54 shows the 1:* relationship between INVOICE and PAYMENT. Similar to the relationship between BOOKING and INVOICE, PAYMENT and INVOICE also participate in a mandatory relationship. One invoice may be paid by a number of payments to reduce the balance whilst one payment is assigned on one invoice.

Figure 5.54 Segment 14: the INVOICE PAYMENT relationship
[UML class diagram: INVOICE (1..1) is_paid_by (1..*) PAYMENT]
INVOICE: INVOICE_NO {PK}, BOOKING_NO {FK1}, INVOICE_DATE, INVOICE_BALANCE
PAYMENT: PAYMENT_NO {PK}, CUST_NO {FK1}, INVOICE_NO {FK2}, AMOUNT_PAID, PAYMENT_TYPE, DATE_PAID
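The mandatory participation described for INVOICE and PAYMENT can only partly be declared in a relational schema. The following is a minimal sketch in Python using the standard sqlite3 module, assuming simplified INVOICE and PAYMENT tables that keep only a few of the attributes named in Figure 5.54; the sample rows and the helper function names are invented for illustration. A NOT NULL foreign key ensures that every payment is assigned to an invoice, while the opposite rule (every invoice has at least one payment) is checked with a query, because a foreign key alone cannot express it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite does not enforce FKs unless asked
conn.executescript("""
CREATE TABLE INVOICE (
    INVOICE_NO      INTEGER PRIMARY KEY,
    INVOICE_BALANCE REAL NOT NULL
);
CREATE TABLE PAYMENT (
    PAYMENT_NO  INTEGER PRIMARY KEY,
    -- NOT NULL + REFERENCES: a payment must be assigned to one invoice (1..1)
    INVOICE_NO  INTEGER NOT NULL REFERENCES INVOICE (INVOICE_NO),
    AMOUNT_PAID REAL NOT NULL
);
""")

# Invented sample data: invoice 1 has a payment, invoice 2 has none yet.
conn.execute("INSERT INTO INVOICE VALUES (1, 0.0)")
conn.execute("INSERT INTO INVOICE VALUES (2, 250.0)")
conn.execute("INSERT INTO PAYMENT VALUES (10, 1, 400.0)")


def payment_without_invoice_rejected(conn):
    """Try to store a payment for a non-existent invoice."""
    try:
        conn.execute("INSERT INTO PAYMENT VALUES (11, 99, 50.0)")
    except sqlite3.IntegrityError:
        return True
    return False


def invoices_without_payment(conn):
    """List invoices that violate the 1..* side of the relationship."""
    rows = conn.execute("""
        SELECT i.INVOICE_NO
        FROM INVOICE AS i
        LEFT JOIN PAYMENT AS p ON p.INVOICE_NO = i.INVOICE_NO
        WHERE p.PAYMENT_NO IS NULL
        ORDER BY i.INVOICE_NO
    """).fetchall()
    return [r[0] for r in rows]


rejected = payment_without_invoice_rejected(conn)
unpaid = invoices_without_payment(conn)
print(rejected, unpaid)
```

In this sketch invoice 2 is reported because no payment refers to it. In a real system such a rule might instead be enforced by application logic or by a trigger.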
We have now completed all the segments of the ILoveHolidays conceptual ERD. Now that we have defined all the components, we can draw the completed ERD as shown in Figure 5.55.

Figure 5.55 Final ILoveHolidays ERD
[UML class diagram combining all segments. Entities shown: TRAVEL_AGENT, EMPLOYEE, CUSTOMER, PAYMENT, BOOKING_STATUS_CODE, PARTY_MEMBERS, INVOICE, BOOKING, EVENT, FLIGHT, HOTEL, TOUR_BOOKING, TOUR, ATTRACT_TOUR, ATTRACTION, CITY and COUNTRY, with the attributes and multiplicities given in the individual segment figures.]
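Two of the modelling decisions in the completed ERD translate directly into table definitions: the *:* relationship between TOUR and ATTRACTION resolved through ATTRACT_TOUR with its composite PK, and the optional (0..1) participation of HOTEL in a BOOKING. The sketch below uses Python's standard sqlite3 module and is only an illustration under stated assumptions: each table is cut down to a couple of the attributes named in the figures, and the tour, attraction and hotel names are invented sample data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite does not enforce FKs unless asked
conn.executescript("""
CREATE TABLE TOUR (
    TOUR_ID   INTEGER PRIMARY KEY,
    TOUR_NAME TEXT NOT NULL
);
CREATE TABLE ATTRACTION (
    ATTRACTION_NO INTEGER PRIMARY KEY,
    ATTRACT_NAME  TEXT NOT NULL
);
-- Composite entity: its PK is made up of the PKs of TOUR and ATTRACTION.
CREATE TABLE ATTRACT_TOUR (
    TOUR_ID       INTEGER NOT NULL REFERENCES TOUR (TOUR_ID),
    ATTRACTION_NO INTEGER NOT NULL REFERENCES ATTRACTION (ATTRACTION_NO),
    PRIMARY KEY (TOUR_ID, ATTRACTION_NO)
);
CREATE TABLE HOTEL (
    HOTEL_ID   INTEGER PRIMARY KEY,
    HOTEL_NAME TEXT NOT NULL
);
CREATE TABLE BOOKING (
    BOOKING_NO INTEGER PRIMARY KEY,
    -- Nullable FK: a booking may be for one hotel or for none (0..1).
    HOTEL_ID   INTEGER REFERENCES HOTEL (HOTEL_ID)
);
""")

# Invented sample data for illustration only.
conn.executemany("INSERT INTO TOUR VALUES (?, ?)",
                 [(1, "Old Town Walk"), (2, "River Cruise")])
conn.executemany("INSERT INTO ATTRACTION VALUES (?, ?)",
                 [(100, "Castle"), (101, "Cathedral"), (102, "Museum")])
conn.executemany("INSERT INTO ATTRACT_TOUR VALUES (?, ?)",
                 [(1, 100), (1, 101), (2, 101)])
conn.execute("INSERT INTO HOTEL VALUES (7, 'Harbour Hotel')")
conn.executemany("INSERT INTO BOOKING VALUES (?, ?)", [(500, 7), (501, None)])

# One attraction can appear on several tours ...
tours_visiting_101 = [r[0] for r in conn.execute(
    "SELECT TOUR_ID FROM ATTRACT_TOUR WHERE ATTRACTION_NO = 101 ORDER BY TOUR_ID")]

# ... and an attraction may exist without belonging to any tour.
attractions_on_no_tour = [r[0] for r in conn.execute("""
    SELECT ATTRACTION_NO FROM ATTRACTION
    WHERE ATTRACTION_NO NOT IN (SELECT ATTRACTION_NO FROM ATTRACT_TOUR)
    ORDER BY ATTRACTION_NO""")]

# The composite PK rejects the same tour/attraction pairing twice.
try:
    conn.execute("INSERT INTO ATTRACT_TOUR VALUES (1, 100)")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

bookings_without_hotel = [r[0] for r in conn.execute(
    "SELECT BOOKING_NO FROM BOOKING WHERE HOTEL_ID IS NULL")]

print(tours_visiting_101, attractions_on_no_tour,
      duplicate_rejected, bookings_without_hotel)
```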
5.3 DATABASE DESIGN CHALLENGES: CONFLICTING GOALS

Database designers often need to make design compromises that are triggered by conflicting goals, such as adherence to design standards (design elegance), processing speed and information requirements.

The database design must conform to design standards. Such standards have guided you in developing logical structures that minimise data redundancies, thereby minimising the likelihood that destructive data anomalies will occur. You have also learnt how standards prescribed avoiding nulls to the greatest extent possible. In fact, you have learnt that design standards govern the presentation of all components within the database design. In short, design standards allow you to work with well-defined components and to evaluate the interaction of those components with some precision. Without design standards, it is nearly impossible to formulate a proper design process, to evaluate an existing design, or to trace the likely logical impact of changes in design.
In many organisations, particularly those generating large numbers of transactions, high processing speeds are often a top priority in database design. High processing speed means minimal access time, which may be achieved by minimising the number and complexity of logically desirable relationships. For example, a perfect design might use a 1:1 relationship to avoid nulls, while a higher-transaction-speed design might combine the two tables to avoid the use of an additional relationship, using dummy entries to avoid the nulls. If the focus is on data-retrieval speed, you might also be forced to include derived attributes in the design.

The quest for timely information might be the focus of database design. Complex information requirements may dictate data transformations, and they may expand the number of entities and attributes within the design. Therefore, the database may have to sacrifice some of its clean design structures and/or some of its high transaction speed to ensure maximum information generation. For example, suppose that a detailed sales report must be generated periodically. The sales report includes all invoice subtotals, taxes and totals; even the invoice lines include subtotals. If the sales report includes hundreds of thousands (or even millions) of invoices, computing the totals, taxes and subtotals is likely to take some time. If those computations had been made and the results had been stored as derived attributes in the INVOICE and LINE tables at the time of the transaction, the real-time transaction speed might have declined, but that loss of speed would only be noticeable if there had been many simultaneous transactions. The cost of a slight loss of transaction speed at the front end and the addition of multiple derived attributes is likely to pay off when the sales reports are generated (not to mention the fact that it will be simpler to generate the queries).

Another issue that needs to be borne in mind, if derived values are used to improve performance, is data integrity. Should the values from which the derived value is calculated change, then triggers need to be in place to ensure that the derived values are automatically updated. Failing to do this would result in data integrity issues. As a rule, you should first strive for a design that has integrity before attempting to denormalise the design for performance reasons. Once a normalised design is in place, issues around improving performance by merging tables, including derived values, etc., can be included.

A design that meets all logical requirements and design conventions is an important goal. However, if this perfect design fails to meet the customer's transaction speed and/or information requirements, the designer will not have done a proper job from the end user's point of view. Compromises are a fact of life in the real world of database design.

Even as the designer focuses on the entities, attributes, relationships and constraints, he or she should begin thinking about end-user requirements such as performance, security, shared access and data integrity. The designer must consider processing requirements and verify that all update, retrieval and deletion options are available. Finally, a design is of little value unless the end product is capable of delivering all specified query and reporting requirements.
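The point about keeping derived values consistent through triggers can be illustrated with a small sketch. It uses Python's standard sqlite3 module and assumes hypothetical column names (INV_NUMBER, INV_SUBTOTAL, LINE_NUMBER, LINE_AMOUNT) and invented sample amounts held as whole pence; none of these come from the text, which names only the INVOICE and LINE tables. Three triggers recompute the stored subtotal whenever an invoice line is inserted, updated or deleted.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE INVOICE (
    INV_NUMBER   INTEGER PRIMARY KEY,
    INV_SUBTOTAL INTEGER NOT NULL DEFAULT 0   -- derived: sum of the line amounts
);
CREATE TABLE LINE (
    INV_NUMBER  INTEGER NOT NULL REFERENCES INVOICE (INV_NUMBER),
    LINE_NUMBER INTEGER NOT NULL,
    LINE_AMOUNT INTEGER NOT NULL,              -- amount in pence
    PRIMARY KEY (INV_NUMBER, LINE_NUMBER)
);
CREATE TRIGGER LINE_AFTER_INSERT AFTER INSERT ON LINE
BEGIN
    UPDATE INVOICE
    SET INV_SUBTOTAL = (SELECT COALESCE(SUM(LINE_AMOUNT), 0)
                        FROM LINE WHERE LINE.INV_NUMBER = NEW.INV_NUMBER)
    WHERE INV_NUMBER = NEW.INV_NUMBER;
END;
CREATE TRIGGER LINE_AFTER_UPDATE AFTER UPDATE ON LINE
BEGIN
    -- Recompute both invoices in case the line was moved to another invoice.
    UPDATE INVOICE
    SET INV_SUBTOTAL = (SELECT COALESCE(SUM(LINE_AMOUNT), 0)
                        FROM LINE WHERE LINE.INV_NUMBER = INVOICE.INV_NUMBER)
    WHERE INV_NUMBER IN (OLD.INV_NUMBER, NEW.INV_NUMBER);
END;
CREATE TRIGGER LINE_AFTER_DELETE AFTER DELETE ON LINE
BEGIN
    UPDATE INVOICE
    SET INV_SUBTOTAL = (SELECT COALESCE(SUM(LINE_AMOUNT), 0)
                        FROM LINE WHERE LINE.INV_NUMBER = OLD.INV_NUMBER)
    WHERE INV_NUMBER = OLD.INV_NUMBER;
END;
""")


def stored_subtotal(conn, inv_number):
    """Read the derived attribute without recomputing it."""
    return conn.execute(
        "SELECT INV_SUBTOTAL FROM INVOICE WHERE INV_NUMBER = ?",
        (inv_number,)).fetchone()[0]


conn.execute("INSERT INTO INVOICE (INV_NUMBER) VALUES (1001)")
conn.executemany("INSERT INTO LINE VALUES (1001, ?, ?)", [(1, 1000), (2, 550)])
after_insert = stored_subtotal(conn, 1001)

conn.execute(
    "UPDATE LINE SET LINE_AMOUNT = 750 WHERE INV_NUMBER = 1001 AND LINE_NUMBER = 2")
after_update = stored_subtotal(conn, 1001)

conn.execute("DELETE FROM LINE WHERE INV_NUMBER = 1001 AND LINE_NUMBER = 1")
after_delete = stored_subtotal(conn, 1001)

print(after_insert, after_update, after_delete)
```

A report can then read INV_SUBTOTAL directly instead of summing the lines, at the cost of one extra UPDATE per line transaction, which is the trade-off described above.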
You are quite likely to discover that even the best design process produces an ERD that requires further changes mandated by operational requirements. Such changes should not discourage you from using the process. ER modelling is essential in the development of a sound design that is capable of meeting the demands of adjustment and growth. Using ERDs yields perhaps the richest bonus of all: a thorough understanding of how an organisation really functions.

There are occasional design and implementation problems that do not yield clean implementation solutions. To get a sense of the design and implementation choices a database designer faces, let's revisit the 1:1 recursive relationship EMPLOYEE is married to EMPLOYEE, first examined in Figure 5.21. Figure 5.56 shows three different ways of implementing such a relationship.

Figure 5.56 Various implementations of the 1:1 recursive relationship
Database name: Ch05_PartCo

First implementation
Table name: EMPLOYEE_V1

EMP_NUM  EMP_LNAME  EMP_FNAME  EMP_SPOUSE
345      Singh      Nishok     347
346      Jones      Anne       349
347      Singh      Vediga     345
348      Delaney    Robert
349      Shapiro    Anton      346

Second implementation
Table name: EMPLOYEE

EMP_NUM  EMP_LNAME  EMP_FNAME
345      Singh      Nishok
346      Jones      Anne
347      Singh      Vediga
348      Delaney    Robert
349      Shapiro    Anton

Table name: MARRIED_V1

EMP_NUM  EMP_SPOUSE
345      347
346      349
347      345
349      346

Third implementation
Table name: MARRIAGE

MAR_NUM  MAR_DATE
1        04-Mar-13
2        02-Feb-09

Table name: MARPART

MAR_NUM  EMP_NUM
1        345
1        347
2        346
2        349

Table name: EMPLOYEE

EMP_NUM  EMP_LNAME  EMP_FNAME
345      Singh      Nishok
346      Jones      Anne
347      Singh      Vediga
348      Delaney    Robert
349      Shapiro    Anton
As you examine the EMPLOYEE_V1 table in Figure 5.56, note that this table is likely to yield data anomalies. For example, if Anne Jones divorces Anton Shapiro, two records must be updated by setting the respective EMP_SPOUSE values to null to properly reflect that change. If only one record is updated, inconsistent data occur. The problem becomes even worse if several of the divorced employees then marry each other. In addition, that implementation also produces undesirable nulls for employees who are not married to other employees in the company.

Another approach would be to create a new entity shown as MARRIED_V1 in a 1:* relationship with EMPLOYEE. (See Figure 5.56, second implementation.) This second implementation does eliminate the nulls for employees who are not married to somebody working for the same company. (Such employees would not be entered in the MARRIED_V1 table.) However, this approach still yields possible duplicate values. For example, the marriage between employees 345 and 347 may still appear twice, once as 345 347 and once as 347 345. (Since each of those permutations is unique the first time it appears, the creation of a unique index will not solve the problem.)
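The permutation problem in the second implementation can be reproduced in a few lines. The following sketch uses Python's standard sqlite3 module and the MARRIED_V1 rows from Figure 5.56; the composite primary key and the counting queries are assumptions added for illustration. Both orderings of each couple are accepted as unique rows, so the table stores four rows for two marriages.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE MARRIED_V1 (
        EMP_NUM    INTEGER NOT NULL,
        EMP_SPOUSE INTEGER NOT NULL,
        PRIMARY KEY (EMP_NUM, EMP_SPOUSE)   -- unique per ordered pair only
    )
""")

# Rows as shown in Figure 5.56: each marriage appears in both orders.
rows = [(345, 347), (346, 349), (347, 345), (349, 346)]
conn.executemany("INSERT INTO MARRIED_V1 VALUES (?, ?)", rows)

stored_rows = conn.execute("SELECT COUNT(*) FROM MARRIED_V1").fetchone()[0]

# Put each pair into a fixed order to count the marriages actually recorded.
distinct_marriages = conn.execute("""
    SELECT COUNT(*) FROM (
        SELECT DISTINCT MIN(EMP_NUM, EMP_SPOUSE) AS partner_a,
                        MAX(EMP_NUM, EMP_SPOUSE) AS partner_b
        FROM MARRIED_V1
    )
""").fetchone()[0]

print(stored_rows, distinct_marriages)
```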
As you can see, the first two implementations yield several problems:

Both solutions use synonyms. The EMPLOYEE_V1 table uses EMP_NUM and EMP_SPOUSE to refer to an employee. The MARRIED_V1 table uses the same synonyms.

Both solutions are likely to produce inconsistent data. For example, it is possible to enter employee 345 as married to employee 347 and to enter employee 348 as married to employee 345.

Both solutions allow data entries to show one employee married to several other employees. For example, it is possible to have data pairs such as 345 347 and 348 347 and 349 347 that will not violate entity integrity requirements because they are all unique.

A third approach would be to have two new entities, MARRIAGE and MARPART, in a 1:* relationship. MARPART contains the EMP_NUM foreign key to EMPLOYEE. (See the UML class diagram in Figure 5.38.) This third approach would be the preferred solution in a relational environment. But even this approach requires some fine-tuning. For example, to ensure that an employee occurs only once in any given marriage, you would have to use a unique index on the EMP_NUM attribute in the MARPART table.

As you can see, a recursive 1:1 relationship yields many different solutions with varying degrees of effectiveness and adherence to basic design principles. Your job as a database designer is to use your professional judgement to yield a solution that meets the requirements imposed by business rules, processing requirements and basic design principles.

Finally, document, document and document! Put all design activities in writing. Then review what you've written. Documentation not only helps you stay on track during the design process, but also enables you (or those following you) to pick up the design thread when it comes time to modify the design. Although the need for documentation should be obvious, one of the most vexing problems in database and systems analysis work is that the put it in writing rule is often not observed in all of the design and implementation stages. The development of organisational documentation standards is a very important aspect of ensuring data compatibility and coherence.
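The third implementation of the recursive relationship, together with the unique index on EMP_NUM in MARPART, might be sketched as follows. This is an illustration using Python's standard sqlite3 module with the rows from Figure 5.56; the index name MARPART_EMP_UQ, the ISO date format and the spouse_of helper are assumptions made for the example and are not part of the text.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite does not enforce FKs unless asked
conn.executescript("""
CREATE TABLE EMPLOYEE (
    EMP_NUM   INTEGER PRIMARY KEY,
    EMP_LNAME TEXT NOT NULL,
    EMP_FNAME TEXT NOT NULL
);
CREATE TABLE MARRIAGE (
    MAR_NUM  INTEGER PRIMARY KEY,
    MAR_DATE TEXT NOT NULL
);
CREATE TABLE MARPART (
    MAR_NUM INTEGER NOT NULL REFERENCES MARRIAGE (MAR_NUM),
    EMP_NUM INTEGER NOT NULL REFERENCES EMPLOYEE (EMP_NUM),
    PRIMARY KEY (MAR_NUM, EMP_NUM)
);
-- The fine-tuning mentioned in the text: each employee may appear only once.
CREATE UNIQUE INDEX MARPART_EMP_UQ ON MARPART (EMP_NUM);
""")

conn.executemany("INSERT INTO EMPLOYEE VALUES (?, ?, ?)", [
    (345, "Singh", "Nishok"), (346, "Jones", "Anne"), (347, "Singh", "Vediga"),
    (348, "Delaney", "Robert"), (349, "Shapiro", "Anton")])
conn.executemany("INSERT INTO MARRIAGE VALUES (?, ?)",
                 [(1, "2013-03-04"), (2, "2009-02-02")])
conn.executemany("INSERT INTO MARPART VALUES (?, ?)",
                 [(1, 345), (1, 347), (2, 346), (2, 349)])


def spouse_of(conn, emp_num):
    """Return the other participant in the employee's marriage, if any."""
    row = conn.execute("""
        SELECT other.EMP_NUM
        FROM MARPART AS me
        JOIN MARPART AS other
          ON other.MAR_NUM = me.MAR_NUM AND other.EMP_NUM <> me.EMP_NUM
        WHERE me.EMP_NUM = ?
    """, (emp_num,)).fetchone()
    return row[0] if row else None


# Employee 345 is already in marriage 1, so adding 345 to marriage 2 must fail.
try:
    conn.execute("INSERT INTO MARPART VALUES (2, 345)")
    second_marriage_rejected = False
except sqlite3.IntegrityError:
    second_marriage_rejected = True

print(spouse_of(conn, 345), spouse_of(conn, 348), second_marriage_rejected)
```

Unmarried employees such as 348 simply have no MARPART row, so no nulls are stored. Note that the unique index by itself does not limit a marriage to two participants; that rule would need a further constraint or application logic.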
SUMMARY

The ERM uses ERDs to represent the conceptual database as viewed by the end user. The ERM's main components are entities, relationships and attributes. The ERD also includes connectivity and cardinality notations. An ERD can also show relationship strength, relationship participation (optional or mandatory) and degree of relationship (unary, binary, ternary, etc.).

Multiplicity is the main constraint that exists on a relationship, and refers to the number of instances of one entity that are associated with one instance of a related entity. Multiplicity enables us to define two important constraints, known as participation and cardinality. Participation determines whether all occurrences of an entity participate in the relationship or not. Participation is either mandatory or optional. Cardinality describes the specific number of entity occurrences associated with an occurrence of a related entity. Multiplicities are usually based on business rules.

In the ERM, a *:* relationship is valid at the conceptual level. However, when implementing the ERM in a relational database, the *:* relationship must be mapped to a set of 1:* relationships through a composite entity.

ERDs may be based on many different ERMs. However, regardless of which model is selected, the modelling logic remains the same. Because no ERM can accurately portray all real-world data and action constraints, application software must be used to augment the implementation of at least some of the business rules.
Database designers, no matter how well they are able to produce designs that conform to all applicable modelling conventions, are often forced to make design compromises. Those compromises are required when end users have vital transaction speed and/or information requirements that prevent the use of perfect modelling logic and adherence to all modelling conventions. Therefore, database designers must use their professional judgement to determine how and to what extent the modelling conventions are subject to modification. To ensure that their professional judgements are sound, database designers must have detailed and in-depth knowledge of data-modelling conventions. It is also important to document the design process from beginning to end, which helps keep the design process on track and allows for easy modifications in the future.

The steps in creating an entity relationship model are:

1 Create a detailed narrative of the organisation's description of operations.
2 Identify the business rules based on the descriptions of operations.
3 Identify all main entities from the business rules.
4 Identify all main relationships between entities from the business rules.
5 Develop an initial ERD.
6 Determine the multiplicities and the participation of all relationships. Remember, participation involves identifying whether a relationship can be optional or mandatory for each entity.
7 Identify the primary and foreign keys.
8 Identify all attributes.
9 Revise and review the ERD.
KEY TERMS

association
binary relationship
cardinality
class
composite attribute
composite key
derived attribute
existence-dependent
existence-independent
identifiers
identifying relationship
iterative process
mandatory participation
multiplicity
multivalued attribute
non-identifying relationship
optional participation
participants
participation
recursive relationship
relationship degree
simple attribute
single-valued attribute
strong relationship
ternary relationship
unary relationship
weak entity
weak relationship
Online content: Answers to selected Review Questions and Problems for this chapter are available on the online platform accompanying this book.
REVIEW QUESTIONS

1 Which two conditions must be met before an entity can be classified as a weak entity? Give an example of a weak entity.

2 What is a strong (or identifying) relationship?

3 Given the business rule an employee may have many degrees, discuss its effect on attributes, entities and relationships. (Hint: Remember what a multivalued attribute is and how it might be implemented.)

4 What is a composite entity and when is it used?

Figure Q5.1 The conceptual model for question 5
[Figure not reproduced]

5 Suppose you are working within the framework of the conceptual model shown in Figure Q5.1. Given the conceptual model in Figure Q5.1:
a Write the business rules that are reflected in it.
b Identify all of the cardinalities.

6 What is a recursive relationship? Give an example.

7 How would you (graphically) identify each of the following ERM components in a UML model:
a an entity?
b the multiplicity (0:*)?

8 Discuss the difference between a composite key and a composite attribute. How would each be indicated in an ERD?

9 What two courses of action are available to a designer when he or she encounters a multivalued attribute?

10 What is a derived attribute? Give an example.

11 How is a relationship between entities indicated in an ERD? Give an example using UML notation.

12 Discuss two ways in which the 1:* relationship between COURSE and CLASS can be implemented. (Hint: Think about relationship strength.)

13 How is a composite entity represented in an ERD, and what is its function? Illustrate using the UML notation.

14 Which three (often conflicting) database requirements must be addressed in database design?

15 Briefly, but precisely, explain the difference between single-valued attributes and simple attributes. Give an example of each.

16 What are multivalued attributes, and how can they be handled within the database design?

Questions 17-20 are based on the ERD in Figure Q5.2.

Figure Q5.2 The ERD for questions 17-20
[Figure not reproduced]

17 Write the ten cardinalities (multiplicities) that are appropriate for this ERD.

18 Write the business rules reflected in this ERD.

19 Which two attributes must be contained in the composite entity between STORE and PRODUCT? Use proper terminology in your answer.

20 Describe precisely the composition of the DEPENDANT weak entity's primary key. Use proper terminology in your answer.

21 The local city youth league needs a database system to help track children who sign up to play soccer. Data needs to be kept on each team, the children who will play on each team, and their parents. Also, data needs to be kept on the coaches for each team.

Draw a data model with the entities and attributes described here.

Entities required: Team, Player, Coach, and Parent.

Attributes required:
Team: Team ID number, Team name, and Team colours
Player: Player ID number, Player first name, Player last name, and Player age
Coach: Coach ID number, Coach first name, Coach last name, and Coach home phone number
Parent: Parent ID number, Parent last name, Parent first name, Home phone number, and Home address (Street, City, Province, and Postal code)

The following relationships must be defined:
Team is related to Player.
Team is related to Coach.
Player is related to Parent.

Connectivities and participations are defined as follows:
A Team may or may not have a Player.
A Player must have a Team.
A Team may have many Players.
A Player has only one Team.
A Team may or may not have a Coach.
A Coach must have a Team.
A Team may have many Coaches.
A Coach has only one Team.
A Player must have a Parent.
A Parent must have a Player.
A Player may have many Parents.
A Parent may have many Players.
220
PARt
II
Design
Concepts
PROBLEMS

1 Using the following business rules, create the appropriate ERD using UML notation.
a A company operates many departments.
b Each department employs one or more employees.
c Each of the employees may or may not have one or more dependants.
d Each employee may or may not have an employment history.

2 Using the following business rules, create the appropriate ERD using UML notation:
a A football team has at least 11 players and may have up to 40 players.
b Each player may or may not play one or more games.
c A minimum of 11 players and a maximum of 14 players may participate in one game.
d A player may or may not score one or more goals.
e Each game may have zero or more goals.

3 Using the following business rules, create an initial ERD using UML notation:
a A musician makes at least one recording, but may over a period of time make many recordings.
b One recording consists of at least three or more tracks.
c A track can appear on more than one recording.
4
Revise the ERD you developed in Problem 3 and resolve any *:* relationships.
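As a rough illustration of what resolving a *:* relationship means in practice, the sketch below uses Python's built-in sqlite3 module to place a bridge (composite) table between two tables based on the recording and track rules in Problem 3. The table names, column names and sample rows are assumptions made for this example only; they are not part of the problem text, and other designs are possible.

```python
import sqlite3

# In-memory database; all table and column names below are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite only enforces foreign keys when enabled
conn.executescript("""
CREATE TABLE recording (rec_id INTEGER PRIMARY KEY, rec_title TEXT NOT NULL);
CREATE TABLE track (track_id INTEGER PRIMARY KEY, track_title TEXT NOT NULL);
-- Bridge entity: one row per appearance of a track on a recording.
-- Its primary key is made up of the primary keys of the two parent tables.
CREATE TABLE recording_track (
    rec_id INTEGER NOT NULL REFERENCES recording(rec_id),
    track_id INTEGER NOT NULL REFERENCES track(track_id),
    PRIMARY KEY (rec_id, track_id)
);
""")

# Invented sample data: two recordings that share the same three tracks.
conn.executemany("INSERT INTO recording VALUES (?, ?)",
                 [(1, "First Album"), (2, "Greatest Hits")])
conn.executemany("INSERT INTO track VALUES (?, ?)",
                 [(10, "Opening"), (11, "Interlude"), (12, "Finale")])
conn.executemany("INSERT INTO recording_track VALUES (?, ?)",
                 [(1, 10), (1, 11), (1, 12), (2, 10), (2, 11), (2, 12)])


def recordings_for_track(track_id):
    """Return the ids of every recording on which the given track appears."""
    rows = conn.execute(
        "SELECT rec_id FROM recording_track WHERE track_id = ? ORDER BY rec_id",
        (track_id,))
    return [row[0] for row in rows]


def tracks_on_recording(rec_id):
    """Return the ids of every track on the given recording."""
    rows = conn.execute(
        "SELECT track_id FROM recording_track WHERE rec_id = ? ORDER BY track_id",
        (rec_id,))
    return [row[0] for row in rows]
```

Each of the two original tables now takes part only in 1:* relationships with the bridge table, which is the general pattern behind resolving a *:* relationship.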
5 The Hudson Engineering Group (HEG) has contacted you to create a conceptual model whose application will meet the expected database requirements for the company's training programme. The HEG administrator gives you the description (see below) of the training group's operating environment. (Hint: Some of the following sentences identify the volume of data rather than cardinalities. Can you tell which ones?)

The HEG has 12 instructors and can handle up to 30 trainees per class. HEG offers five advanced technology courses, each of which may generate several classes. If a class has fewer than ten trainees, it will be cancelled. Therefore, it is possible for a course not to generate any classes. Each class is taught by one instructor. Each instructor may teach up to two classes or may be assigned to do research only. Each trainee may take up to two classes per year. Given that information, do the following:
a Define all of the entities and relationships. (Use Table 5.4 as your guide.)
b Describe the relationship between instructor and class in terms of cardinality, participation and existence-dependence.

6 Use the following business rules to create an ERD using UML notation. Write all appropriate multiplicities in the ERD.
a A department employs many employees, but each employee is employed by one department.
b Some employees, known as rovers, are not assigned to any department.
c A division operates many departments, but each department is operated by one division.
d An employee may be assigned many projects, and a project may have many employees assigned to it.
e A project must have at least one employee assigned to it.
f One of the employees manages each department, and each department is managed by only one employee.
g One of the employees runs each division, and each division is run by only one employee.
7 During peak periods, Temporary Employment Corporation (TEC) places temporary workers in companies. TEC's manager gives you the following description of the business:

TEC has a file of candidates who are willing to work.

If the candidate has worked before, that candidate has a specific job history. (Naturally, no job history exists if the candidate has never worked.) Each time the candidate worked, one additional job history record was created.

Each candidate has earned several qualifications. Each qualification may be earned by more than one candidate. (For example, it is possible for more than one candidate to have earned a BA degree or a Microsoft Network Certification. And clearly, a candidate may have earned both a BA and a Microsoft Network Certification.)

TEC also has a list of companies that request temporaries.

Each time a company requests a temporary employee, TEC makes an entry in the Openings folder. That folder contains an opening number, a company name, required qualifications, a starting date, an anticipated ending date, and hourly pay.

Each opening requires only one specific or main qualification.

When a candidate matches the qualification, he or she is given the job and an entry is made in the Placement Record folder. That folder contains an opening number, a candidate number, the total hours worked, etc. In addition, an entry is made in the job history for the candidate.

An opening can be filled by many candidates, and a candidate can fill many openings.

TEC uses special codes to describe a candidate's qualifications for an opening. The list of codes is shown in the table below.

CODE           DESCRIPTION
SEC-45         Secretarial work, at least 45 words per minute
SEC-60         Secretarial work, at least 60 words per minute
CLERK          General clerking work
PRG-PY         Programmer, Python
PRG-C++        Programmer, C++
DBA-ORA        Database Administrator, Oracle
DBA-DB2        Database Administrator, IBM DB2
DBA-SQLSERV    Database Administrator, MS SQL Server
SYS-1          Systems Analyst, level 1
SYS-2          Systems Analyst, level 2
NW-NOV         Network Administrator, Novell experience
WD-CF          Web Developer, ColdFusion

TEC's management wants to keep track of the following entities:
COMPANY
OPENING
QUALIFICATION
CANDIDATE
JOB_HISTORY
PLACEMENT

Given that information, do the following:
a Draw the ERD using UML notation for this enterprise.
b Identify all possible relationships.
c Identify the multiplicities (including the mandatory/optional dependencies) for each relationship.
d Resolve all *:* relationships.
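To make the matching rule in the TEC description more concrete (each opening requires one main qualification, and a candidate holding that qualification can be placed), here is a minimal Python sketch. The qualification codes are taken from the table in the problem; the candidate identifiers, opening numbers and function name are invented for illustration and are assumptions, not part of the exercise.

```python
# Hypothetical sample data. Only the qualification codes come from the TEC table.
candidate_qualifications = {
    "C1": {"SEC-45", "CLERK"},
    "C2": {"DBA-ORA", "PRG-PY"},
    "C3": {"PRG-PY"},
}

# Each opening requires exactly one specific (main) qualification.
opening_required_qualification = {
    101: "PRG-PY",
    102: "SEC-60",
}


def matching_candidates(opening_number):
    """Return, in sorted order, the candidates who hold the opening's qualification."""
    required = opening_required_qualification[opening_number]
    return sorted(
        candidate
        for candidate, qualifications in candidate_qualifications.items()
        if required in qualifications
    )
```

With this sample data, opening 101 is matched by two candidates, while opening 102 is matched by none. The sketch only reflects that a candidate can hold several qualifications and a qualification can be held by several candidates.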
8 The Gauteng Netball Conference (GNC) is an amateur netball association. Each town in the province has one team as its representative. Each team has a maximum of 14 players and a minimum of 11 players. Each team also has up to three coaches (offensive, defensive and physical training coaches). During the season, each team plays two games (home and visitor) against each of the other teams. Given those conditions, do the following:
a Identify the multiplicities of each relationship.
b Identify the type of dependency that exists between TOWN and TEAM.
c Identify the cardinality between teams and players and between teams and town.
d Identify the dependency between coach and team and between team and player.
e Draw the ERD to represent the GNC database.

9 Automata Inc. produces specialty vehicles by contract. The company operates several departments, each of which builds a particular vehicle, such as a limousine, a truck, a van or an RV.

When a new vehicle is built, the department places an order with the purchasing department to request specific components. Automata's purchasing department is interested in creating a database to keep track of orders and to accelerate the process of delivering materials.

The order received by the purchasing department may contain several different items. An inventory is maintained so that the most frequently requested items are delivered almost immediately. When an order comes in, it is checked to determine whether the requested item is in inventory. If an item is not in inventory, it must be ordered from a supplier. Each item may have several suppliers.

Using that functional description of the processes encountered at Automata's purchasing department, do the following:
a Identify all of the main entities.
b Identify all of the relations and multiplicities among entities.
c Identify the type of existence dependency in all relations.
d Give at least two examples of the types of reports that can be obtained from the database.
10 Create an ERD based on the UML notation, using the following requirements:

An INVOICE is written by a SALESREP. Each sales representative can write many invoices, but each invoice is written by a single sales representative.

The INVOICE is written for a single CUSTOMER. However, each customer can have many invoices.

An INVOICE can include many detail lines (LINE), which describe the products bought by the customer.

The product information is stored in a PRODUCT entity.

The product's vendor information is found in a VENDOR entity.

NOTE
Limit your ERD to the entities and relationships based on the business rules shown here. In other words, do not add realism to your design by expanding or refining the business rules. However, make sure you include the attributes that would permit the model to be successfully implemented.

11 Using the following brief summary of business rules for the ROBCOR catering service, draw the fully labelled ERD. Make sure you include all appropriate entities, relationships, connectivities and cardinalities.

Each dinner is based on a single entrée, but each entrée can be served at many dinners. A guest can attend many dinners, and each dinner can be attended by many guests. Each dinner invitation can be mailed to many guests, and each guest can receive many invitations.

12 Create an ERD using the UML notation that can be implemented for a medical clinic, using at least the following business rules:

A patient can make many appointments with one or more doctors in the clinic, and a doctor can accept appointments with many patients. However, each appointment is made with only one doctor, and each appointment references a single patient.

Emergency cases do not require an appointment. However, for appointment management purposes, an emergency is entered in the appointment book as 'unscheduled'.

If kept, an appointment yields a visit with the doctor specified in the appointment. The visit yields a diagnosis and, when appropriate, treatment.

With each visit, the patient's records are updated to provide a medical history.

Each visit creates a bill. Each patient visit is billed by one doctor, and each doctor can bill many patients.

Each bill must be paid. However, a bill may be paid in many instalments, and a payment may cover more than one bill.

A patient may pay the bill directly, or the bill may be the basis for a claim submitted to an insurance company.

If the bill is paid by an insurance company, the deductible is submitted to the patient for payment.
13 Tiny University is so pleased with your design and implementation of its student registration/tracking system that it wants you to expand the design to include the database for its car pool. A brief description of operations follows:

Faculty members may use the vehicles owned by Tiny University for officially sanctioned travel. For example, the vehicles may be used by faculty members to travel to off-campus learning centres, to travel to locations at which research papers are presented, to transport students to officially sanctioned locations, and to travel for public service purposes. The vehicles used for such purposes are managed by Tiny University's TFBS (Travel Far But Slowly) Centre.

Using reservation forms, each department can reserve vehicles for its faculty, who are responsible for filling out the appropriate trip completion form at the end of a trip. The reservation form includes the expected departure date, vehicle type required, destination, and name of the authorised faculty member. The faculty member who arrives to pick up a vehicle must sign a checkout form to log out the vehicle and pick up a trip completion form. (The TFBS employee who releases the vehicle for use also signs the checkout form.) The faculty member's trip completion form includes the faculty member's identification code, the vehicle's identification, the odometer readings at the start and end of the trip, maintenance complaints (if any), litres of fuel purchased (if any), and the Tiny University credit card number used to pay for the fuel. If fuel is purchased, the credit card receipt must be stapled to the trip completion form. Upon receipt of the trip completion form, the faculty member's department is billed at a mileage rate based on the vehicle type (sedan, station wagon, panel van, minivan or minibus) used. (Hint: Do not use more entities than are necessary. Remember the difference between attributes and entities!)

All vehicle maintenance is performed by TFBS. Each time a vehicle requires maintenance, a maintenance log entry is completed on a prenumbered maintenance log form. The maintenance log form includes the vehicle identification, a brief description of the type of maintenance required, the initial log entry date, the date on which the maintenance was completed, and the identification of the mechanic who released the vehicle back into service. (Only mechanics who have an inspection authorisation may release a vehicle back into service.)

As soon as the log form has been initiated, the log form's number is transferred to a maintenance detail form; the log form's number is also forwarded to the parts department manager, who fills out a parts usage form on which the maintenance log number is recorded. The maintenance detail form contains separate lines for each maintenance item performed, for the parts used and for identification of the mechanic who performed the maintenance item. When all maintenance items have been completed, the maintenance detail form is stapled to the maintenance log form, the maintenance log form's completion date is filled out, and the mechanic who releases the vehicle back into service signs the form. The stapled forms are then filed, to be used later as the source for various maintenance reports.

TFBS maintains a parts inventory, including oil, oil filters, air filters, and belts of various types. The parts inventory is checked daily to monitor parts usage and to reorder parts that reach the minimum quantity on hand level. To track parts usage, the parts manager requires each mechanic to sign out the parts that are used to perform each vehicle's maintenance; the parts manager records the maintenance log number under which the part is used.

Each month, TFBS issues a set of reports. The reports include the mileage driven by vehicle, by department, and by faculty members within a department. In addition, various revenue reports are generated by vehicle and department. A detailed parts usage report is also filed each month. Finally, a vehicle maintenance summary is created each month.

Given that brief summary of operations, draw the appropriate (and fully labelled) ERD. Use the UML notation to indicate entities, relationships and multiplicities.
14 Using the following information, produce an ERD, based on the UML model, that can be implemented. Make sure you include all appropriate entities, relationships and multiplicities:

EverFail company is in the quick oil change and lube business. Although customers bring in their cars for what is described as quick oil changes, EverFail also replaces windshield wipers, oil filters and air filters, subject to customer approval. The invoice contains the charges for the oil and all parts used, and a standard labour charge. When the invoice is presented to customers, they pay cash, use a credit card or write a cheque. EverFail does not extend credit. EverFail's database is to be designed to keep track of all components in all transactions.

Given the high parts usage of the business operations, EverFail must maintain careful control of its parts (oil, wipers, oil filters and air filters) inventory. Therefore, if parts reach their minimum on-hand quantity, the parts in low supply must be reordered from an appropriate vendor. EverFail maintains a vendor list, which contains vendors actually used and potential vendors.

Periodically, based on the date of the car's service, EverFail mails updates to customers. EverFail also tracks each customer's car mileage.

15 Create a complete ERD that can be implemented in the relational model using the following description of operations.

Hot Water (HW) is a small start-up company that sells spas. HW does not carry any stock. A few spas are set up in a simple warehouse so that customers can see some of the models available, but any products sold must be ordered at the time of the sale:

HW can get spas from several different manufacturers.

Each manufacturer produces one or more different brands of spas.

Each and every brand is produced by only one manufacturer.

Every brand has one or more models.

Every model is produced as part of a brand. For example, Meerkat Bay Spas is a manufacturer that produces Big Blue Meerkat spas, a premium-level brand, and Lazy Lizard spas, an entry-level brand. The Big Blue Meerkat brand offers several models, including the BBI-6, an 81-jet spa with two 6 hp motors, and the BBI-10, a 102-jet spa with three 6 hp motors.

Every manufacturer is identified by a manufacturer code. The company name, address, area code, phone number and account number are kept in the system for every manufacturer.

For each brand, the brand name and brand level (premium, mid-level, or entry-level) are kept in the system.

For each model, the model number, number of jets, number of motors, horsepower per motor, suggested retail price, HW retail price, dry weight, water capacity, and seating capacity must be kept in the system.

16 United Helpers is a non-profit organisation that provides aid to people after natural disasters. Based on the following brief description of operations, create the appropriate fully labelled ERD:

Volunteers carry out the tasks of the organisation. The name, address and telephone number are tracked for each volunteer. Each volunteer may be assigned to several tasks, and some tasks require many volunteers. A volunteer might be in the system without having been assigned a task yet. It is possible to have tasks that no one has been assigned. When a volunteer is assigned to a task, the system should track the start time and end time of that assignment.

Each task has a task code, task description, task type and task status. For example, there may be a task with task code 101, a description of answer the telephone, a type of recurring, and a status of ongoing. Another task might have a code of 102, a description of prepare 5 000 packages of basic medical supplies, a type of packing, and a status of open.

For all tasks of type packing, there is a packing list that specifies the contents of the packages. There are many packing lists to produce different packages, such as basic medical packages, child-care packages and food packages. Each packing list has an ID number, a packing list name, and a packing list description, which describes the items that should make up the package. Every packing task is associated with only one packing list. A packing list may not be associated with any tasks, or it may be associated with many tasks. Tasks that are not packing tasks are not associated with any packing list.

Packing tasks result in the creation of packages. Each individual package of supplies produced by the organisation is tracked, and each package is assigned an ID number. The date the package was created and its total weight are recorded. A given package is associated with only one task. Some tasks (such as answer the phones) will not produce any packages, while other tasks (such as prepare 5 000 packages of basic medical supplies) will be associated with many packages.

The packing list describes the ideal contents of each package, but it is not always possible to include the ideal number of each item. Therefore, the actual items included in each package should be tracked. A package can contain many different items, and a given item can be used in many different packages.

Each item that the organisation provides has an item ID number, item description, item value, and item quantity on hand stored in the system. Along with tracking the actual items that are placed in each package, the quantity of each item placed in the package must be tracked as well. For example, a packing list may state that basic medical packages should include 100 bandages, 4 bottles of iodine, and 4 bottles of hydrogen peroxide. However, because of the limited supply of items, a given package may include only 10 bandages, 1 bottle of iodine, and no hydrogen peroxide. The fact that the package includes bandages and iodine needs to be recorded along with the quantity of each item included. It is possible for the organisation to have items that have not been included in any package yet, but every package will contain at least one item.

17 Luxury-Oriented Scenic Tours (LOST) provides guided tours to groups of visitors to the Cape Town area. In recent years, LOST has grown quickly and is having difficulty keeping up with all of the various information needs of the company. The company's operations are as follows:

LOST offers many different tours. For each tour, the tour name, approximate length (in hours), and fee charged is needed. Guides are identified by an employee ID, but the system should also record a guide's name, home address, and date of hire. Guides take a test to be qualified to lead specific tours. It is important to know which guides are qualified to lead which tours and the date that they completed the qualification test for each tour. A guide may be qualified to lead many different tours. A tour can have many different qualified guides. New guides may or may not be qualified to lead any tours, just as a new tour may or may not have any qualified guides.

Every tour must be designed to visit at least three locations. For each location, a name, type, and official description are kept. Some locations (such as Table Mountain) are visited by more than one tour, while others (such as District Six) are visited by a single tour. All locations are visited by at least one tour. The order in which the tour visits each location should be tracked as well.

When a tour is actually given, that is referred to as an outing. LOST schedules outings well in advance so they can be advertised and so employees can understand their upcoming work schedules. A tour can have many scheduled outings, although newly designed tours may not have any outings scheduled. Each outing is for a single tour and is scheduled for a particular date and time. All outings must be associated with a tour. All tours at LOST are guided tours, so a guide must be assigned to each outing. Each outing has one and only one guide. Guides are occasionally asked to lead an outing of a tour even if they are not officially qualified to lead that tour. Newly hired guides may never have been scheduled to lead any outings. Tourists, called clients by LOST, pay to join a scheduled outing. For each client, the name and telephone number are recorded. Clients may sign up to join many different outings, and each outing can have many clients. Information is kept only on clients who have signed up for at least one outing, although newly scheduled outings may not have any clients signed up yet.

a Create an ERD to support LOST operations.
b The operations provided state that it is possible for a guide to lead an outing of a tour even if the guide is not officially qualified to lead outings of that tour. Imagine that the business rules instead specified that a guide is never, under any circumstances, allowed to lead an outing unless he or she is qualified to lead outings of that tour. How could the data model in Part a be modified to enforce this new constraint?
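One way to see how a data model, rather than application code, can enforce a rule such as "a guide may lead an outing only if qualified for that tour" is sketched below using Python's built-in sqlite3 module. It is only one possible approach, shown under assumptions: the table names, column names and sample rows are hypothetical, and the key idea (an outing referencing a guide and tour pair that must exist in a qualification table) is offered as an illustration rather than as the expected answer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite only enforces foreign keys when enabled
conn.executescript("""
CREATE TABLE guide (guide_id INTEGER PRIMARY KEY);
CREATE TABLE tour (tour_id INTEGER PRIMARY KEY);
-- One row per (guide, tour) pair for which the guide passed the test.
CREATE TABLE qualification (
    guide_id INTEGER NOT NULL REFERENCES guide(guide_id),
    tour_id INTEGER NOT NULL REFERENCES tour(tour_id),
    qual_date TEXT NOT NULL,
    PRIMARY KEY (guide_id, tour_id)
);
-- The outing references the qualification pair, not guide and tour separately,
-- so an unqualified guide cannot be assigned to an outing of that tour.
CREATE TABLE outing (
    outing_id INTEGER PRIMARY KEY,
    guide_id INTEGER NOT NULL,
    tour_id INTEGER NOT NULL,
    outing_start TEXT NOT NULL,
    FOREIGN KEY (guide_id, tour_id) REFERENCES qualification(guide_id, tour_id)
);
INSERT INTO guide VALUES (1), (2);
INSERT INTO tour VALUES (100);
INSERT INTO qualification VALUES (1, 100, '2020-01-15');
""")


def schedule_outing(outing_id, guide_id, tour_id, outing_start):
    """Try to schedule an outing; return False if the database rejects it."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute("INSERT INTO outing VALUES (?, ?, ?, ?)",
                         (outing_id, guide_id, tour_id, outing_start))
        return True
    except sqlite3.IntegrityError:
        return False
```

In this sample data, guide 1 is qualified for tour 100 and guide 2 is not, so scheduling an outing of tour 100 succeeds for guide 1 and is rejected for guide 2.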
note Problems
18 and
of translating ERD that
19
may be used
a description can
about the
generic
components
mind that are
many
handled
existence
18
business,
Web-based
get
away
designed
Use the following decals
for
a bad
to
models
through
available,
is
The
review
2020 has
Cengage deemed
Learning. that
includes
any
card
All
Rights
Reserved. content
does
May not
not materially
be
copied, affect
the
order
overall
or
a few items
the
also
such
constraints
Problem
than
per
must
must
can easily be
of transactions
you
keep in
18 deals
design, rather than
day,
on
made that the
ever.
(You
but the
might
problems
of
increases.)
Company to complete ships
and
this
cars)
(www.rc_models.com).
products at the
duplicated, learning
and to
customers
exercise.
and add-on
Models
on the
and
decals
are
in experience.
whole
CC
or in Cengage
pulled
CC
Bank,
Bank is
not
part
part.
Due Learning
to
electronic reserves
rights, the
right
at
to
third remove
is
in the
for
additional
content
may content
shipping
be
container. Company
database.)
suppressed at
the
shipment.
RC_Models
RC_Models
party
not
are not charged
inventory
which
of the
some
orders
his or her transactions, from
is enclosed
to the
card. If a product
(Back
completes
invoice
The printed invoice
The
pay by credit discretion.
When a customer
listed
(Note:
scanned,
in
more important
models (aircraft,
website
are transmitted
account.
only
instead,
of the
of an
discussions
is to separate You
operations
the argument
of RC_Models
internet
charge.)
charges
sell
plastic
shipped.)
a shipping
credit
a commercial
suppressed
is
products
aspects
number
components
One of the things
design;
database
challenge
basis for
design.
of
the
1/144 to 1/32.
on back
order
and the
you
operations
vary from
placed
if
products its
database
design
the
as the
implemented,
database
description
database
as the
website to select the
until the
customer
maintains
Copyright
it is
printed
(The invoice
Editorial
use the
a customer
invoice
of the
the
the
illustrate
will define
be used
of operations.
details. In fact,
design
that
also
affect
into
made
problems
can be successfully
be on the
compounded
sells its
are available in scales that
currently
database
are
Company
those
Customers
has
rules can
directly
Although
should
These
description
that
management
descriptions
RC_Models
problems
be incorporated
businesses
with
of business
that
software.
projects.
a set
details
the focus
databases
class
These
databases
cannot
applications
for
of a proper
the
and the transaction of
be able to poorly
create
constraints
with a Web-based its interface
and contents
material from
by the
to
implemented.
want to
background
basis
of operations
be successfully
need to learn if you
as the
any
time
from if
the
subsequent
eBook rights
and/or restrictions
eChapter(s). require
it
228
PARt
II
Design
Concepts
RC_Models materials. conduct to,
Company
Because its
operations,
customer
product is removed
of the
However,
RC_Models list
to
RC_Models
Company.
periodically
sends
requires
detailed
available.
category
Those
and
amount,
has not recorded
a sale
reports
include,
product
turnover
within four
out
promotional
information but
are
and
to
not limited
revenues
by
weeks of being stocked,
it
and scrapped. on the
use in
RC_Models also
has
marketing
In addition,
Company
plastic
models
from
Models
are
ordered
customer
purchased
its
list
have
a copy
products
customer
account
order
with the
specified
to
bought
of the
customers
RC_Models
FineScale who
data are recorded
products
from
are
Bank.
products.
Modeler
have
not
when potential
applicable
business
rules
Use the following
are
yet
magazine
bought
customers
to
from
request
All orders
the
others.
Decals
are
in the
RC_
through
placed
number
and
example,
Not all manufacturers are placed
handled
For
when
of product
via the
manufacturers
RC_Models product
units
commercial
inventory
ordered
reaches
depends
on the
product.)
description of operations for RC_Models Company, write all
establish
three
manufacturers.
Revell/Monogram
automatically
(The
for each
the
others. (Note:
orders.)
on hand.
specified
Giventhat brief andincomplete (Hint:
and
automatically
Orders
quantity
from
Academy,
WaterMark
have received
CC
quantity
directly
Tamiya,
amounts
minimum
order
its
Tauro,
database
and the
minimum
orders
Aeromaster,
Company
websites,
a
are
and
Company
information.
ordered
the
purchases
RC_Models
reports
product
Company
subscription
bank
at
If a product
customers
RC_Models
5
by
from inventory
Many
customer
numerous
purchases
and customer.
product
tracks
management
entities,
business
rules
relationships,
optionalities
as examples,
and
multiplicities.
writing the remaining
business
rules
in the same format.) A customer
may generate
Each invoice
is generated
Some
b
customers
Drawthe fully labelled a of this
19
have
problem.
Use the following
many invoices.
by only one customer.
not (yet)
generated
an invoice.
and implementable
Include
all entities,
description
ERD based on the business rules you wrotein Part
relationships,
of the operations
optionalities
of the
and
RC_Charter2
multiplicities.
Company to complete this
exercise. The
RC_Charter2
certificate available
place
companies
only to
after
one
one
and
during
only
operations;
one
charges
function crew
expenses.
models
Copyright review
2020 has
Cengage deemed
Learning. that
any
All suppressed
yields
model
revenue
for the
upon
used,
use
the
mileage
mile. Round-trip
traced
does
May not
in Figure
to
not materially
be
be 130
copied, affect
scanned, the
overall
are
duplicated, learning
+ 180
in experience.
whole
or in Cengage
do not
a flight.
the
part.
Due Learning
or
flights
by
procedure.
charter
companys
use
and
or some
trip is reserved
other
by
charter
RC_Charter2
only.)
is generated
flight
customer multiplying
actual
take
many different
charter
special
on the
are
date
This revenue
The
computed
+ 390
or charter)
cargo
charter
operations
time,
charter
use the
services
miles are based
P5.1 illustrates
reserve each
Company.
waiting
is,
passengers,
of course,
charter
of
taxi
The aircraft
at a customer-designated
customers
RC_Charter2
FAA.
Canada. that
purposes,
on the
charges
+ 200
or
to fly
maintenance
flown,
per
States and
can,
completion
distance
Part 135 (air
by the
transporting
for billing
will focus
distance
Reserved. content
fuel,
FAR
operations
of RC_Charter2s
The
calculated
Rights
A customer
design
pays
United
of an aircraft
However,
charge
route
miles is
use
the
are enforced
destinations,
purchase
database
a customer
The sample
Editorial
they
trip
the
Some
under
that
unscheduled
and cargo.
customer.
of aircraft
by the
reserves
any time frame.
This
Each charter the
within the
so-called
customer-designated
instead, (Note:
of aircraft
operations
provide
more
a fleet
(FARs)
of passengers
(trips)
services.
operates
Air Regulations
a customer
or
combination flights
Federal
for air taxi (charter)
Charter
time
Company
of the
the
charges
by are
a
requirements
and
round-trip
miles
navigational
path flown.
Note that the number
of round-trip
5 900.
to
FIGURE P5.1 Round-trip mile determination
[Figure: a round-trip route linking Home Base, Pax Pickup, Intermediate Stop and Destination, with legs labelled 130, 200, 180 and 390 miles.]
Depending on whether a customer has RC_Charter2 credit authorisation, he or she may:
Pay the entire charter bill upon the completion of the charter flight.
Pay a part of the charter bill and charge the remainder to the account. The charge amount may not exceed the available credit.
Charge the entire charter bill to the account. The charge amount may not exceed the available credit.
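The three payment options share one rule: whatever is not paid on completion is charged to the account, and that charge may not exceed the available credit. A minimal sketch of that check follows. The function name `settle_charter_bill` and its parameters are hypothetical and are not part of the RC_Charter2 description.

```python
from decimal import Decimal


def settle_charter_bill(bill, paid_now, available_credit):
    """Return the amount charged to the account after an immediate payment.

    Hypothetical helper: raises ValueError when the payment is out of range
    or when the remainder would exceed the customer's available credit.
    """
    bill, paid_now, available_credit = map(Decimal, (bill, paid_now, available_credit))
    if paid_now < 0 or paid_now > bill:
        raise ValueError("payment must be between zero and the bill total")
    charged = bill - paid_now  # remainder that goes on the account
    if charged > available_credit:
        raise ValueError("charge amount exceeds available credit")
    return charged
```

Paying the full bill yields a charge of zero, so a customer without credit authorisation can still use the first option.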
Customers may be charge
may pay all or part
made
at any time
includes
the
customers
the
The aircraft, crew
and
the
used
must
a crew
flies
for
to
The hourly
handle
crew
previous
a specific
other by
basis.
crew
FAR
charter charter
by
those
crew-member
trips.
trip.
required
135,
All suppressed
Such
The
FAR
135.
customers
charge is
payments
charter
However,
are
based
mileage if
charged
for
on each
crew
flights
piston
aircraft
of a flight
In
short,
trip
aircraft
attendants
engineer, can
charter
of
the
of the
a
weight
of the larger Some
aircraft
one
of an
require
takeoff
crew.
cargo-carrying more than
use
aircraft
a gross
while some
as part
and larger consist
requires
having
a pilot and a copilot,
flight
a crew
charter
engine-powered
that is,
aircraft require may require
assignment
aircraft waiting
does
May not
not materially
be
waiting
charge.
charges
Reserved. content
Larger
Each
person
of the require
and
not
all
waited
by
are pilots.
hourly
Rights
assignment.
The smaller
pilot.
passengers
the
as hotel/motel
Learning. that
and
not required
aircraft.
of a loadmaster.
members
The charter
such
tied
and jet-powered
require
models
each a single
to transport
assignment
the
be able to
of only
aircraft
crew
balance
pilot(s)
crew
on an hourly
of 5 500 kg or more
older
existing necessarily
qualifications.
database
consisting
aircraft
not
of the
additional
members
members
of the
are
expense
request
crew
and
and
charges
Crew ground
or
duplicated, learning
are computed
expenses
by
are limited
multiplying
to
meals
the
and
hours
overnight
expenses
transportation.
The trips,
RC_Charter2
expenses
data that
database
and
each
pilot in
destination(s),
aircraft
data pertinent
to the
that
detail revenue
other
crew
contract
licensing
5
are
of crew
is
water,
The instrument
a monthly
records.
for
crew)
Such
each
data,
charter
are then
for
customers,
is,
based
date(s)
fuel
and
other
monthly reports
and pilots.
the
on the
and time(s),
usage
generate
aircraft
that
of all charter
are
trip
flown,
used to
employees;
summary
records trip:
distance
data
Company
members.
All pilots
company
does
and
not
use
designed
is
that
require
either
appropriate
govern
the
a commercial
ratings.
and landings
multi-engine
Ratings
ability to
Conditions
or visual
conduct
flight
and
Rules (IFR).
conditions
all flight
can take
off
operations
rating is required
(IMC),
Flight
only, the
aircraft
seaplane.
The instrument
Instrument
on land
When a multi-engine
MES, or
Meteorological
weather
of requirements
must have earned
for takeoffs landplane.
on a demonstrated
FAR-specified
good
pilots
Both licences
instrumentation.
Instrument
under
set
example:
rating
based
a strict
For example,
For
aircraft
to cockpit
under
under
licence.
appropriate
rating is
under
conducted
MEL, or multi-engine
the
with sole reference
conducted
charter
requirements.
rating
governed
Such
pilot (ATP)
a multi-engine
are
generate
to record other
cost information
are
appropriate
an aircraft
to charter
(and
RC_Charter2
To operate
on
flight.
transport
competency
and land
pilot
operations
or an airline
specific
the
crew.
135
and training
licence are
charter
from
is required
number,
and operating
and
Part
derived
command
members
pilots
FAR
must be designed
revenues
all such
In
are
operations
contrast,
based
to operate
operations
on the
FAR
Visual
Flight
Rules (VFR). The type aircraft
rating that
is required
are
purely
aircraft is said to aircraft
pilot licences under
The the
If the
reverts period,
Aturboprop
a Class II
it
automatically
unless it
both a current
of
more than
engines
that is,
meets the
are not time-limited,
to
a turbo
exercising
the
5 500
drive
5 500 kg
medical certificate
may be Class I or Class II.
must be renewed
Class I
to
weight
uses jet
kg
or for
propellers,
that
propeller-powered weight limitation.)
privilege
of the licence
and a current
and
Part 135 checkride.
are important:
certificate
Class II, and it
yearly.
a takeoff
an aircraft
rating
and ratings
distinctions
medical
with
(If
a type
Part 135 requires
The following
all aircraft
be turboprop-powered.
does not require
Although ratings
for
jet-powered.
medical
is
certificate.
every six
not renewed If the
reverts
to
The
months.
during
Class II
the
medical
a Class III
Class I
The Class II six-month
is
medical,
medical is
medical
period,
not renewed which is
more stringent
not
must be renewed
it
automatically
within valid
than
the
for
specified
commercial
flight
operations. A Part
135
every
six
checkride months.
is The
a practical checkride
flight
examination
includes
that
all flight
must
manoeuvres
be successfully
completed
and
specified
procedures
in
Part 135. Non-pilot
crew
requirements. In
addition,
crew
operations over
19)
members
that are
also
not
be
the
(more to
a complete
affect
scanned, the
overall
or
duplicated, learning
in experience.
and
flight
a 5 500
pass
record
whole
certificates
an appropriate
than
medical certificate
copied,
proper
need
as loadmasters
aircraft
to keep
materially
have
periodically
as well as pilot
Rights
such
large
required
also
loadmasters
members
involve
Company is required member,
must
For example,
a
Cengage
part.
and
Due
to
electronic reserves
to
practical
meet
specific
as do flight who
weight
of all test types,
Learning
order
attendants
kg takeoff
written
examination
or in
in
certificate,
may
job
attendants.
be required
and
passenger
exam.
The
in
numbers
RC_Charter2
dates and results
for each
crew
dates.
In addition, must such
as
flight
Part
assigned
are
Test
135
to
members
that
Nor
exams.)
For example,
in the
table
are
to submit
members required
are to
many crew
Modelling
to
periodic
not required
take
crew
members
with
on a given
charter
earned
flight,
tests
such
certificate
formats
below.
Test
Test frequency
description
Part 135 Flight
6 months
Check
5
Class 1
6 months
3
Medical,
Class
2
12
months
Practical
12
months
12
months
4
Loadmaster
5
Flight
6
Drug test
7
Operations,
Attendant
Practical
Random written
exam
6 months
Part B Results
Employee   Test code   Test date   Test result
101        1           12-Nov-18   Pass-1
103        6           23-Dec-18   Pass-1
112        4           23-Dec-18   Pass-2
103        7           11-Jan-19   Pass-1
112        7           16-Jan-19   Pass-1
101        7           16-Jan-19   Pass-1
101        6           11-Feb-19   Pass-2
125        2           15-Feb-19   Pass-1
C Licences
Licence
or
and
code
date
employee
Certificates Licence
Certificate
or
Certificate
ATP
Airline
Comm
Commercial
Med-1
Medical
Med-2
Medical certificate,
Transport
Pilot
certificate,
LM
Loadmaster
FA
Flight
1
class 2
aircraft rating
attendant
duplicated, learning
class
rating
Multi-engine land
MEL
Description
licence
Instrument
Instr
deemed
pilot
is required. data
Medical,
has
and
If that
2
2020
tests
certifications
certificate.
Sample
231
the results
pilot-specific
and/or
licence.
Diagrams
as loadmaster
have licences
pilots
Relationship
drug testing;
the loadmaster
a commercial
Entity
to take
may have an ATP and a loadmaster
may have
code
Part
review
Data
A Tests
Part
Copyright
pilots
a pilot
1
Editorial
crew
However,
be a loadmaster attendant
are required
non-pilot
checkrides.
a flight
shown
crew
(Note
practical
areas.
Similarly,
PArT
too.
attendant
in several is
all flight
be tracked,
5
Part D Licences and Certificates Held
Employee   Licence or Certificate   Date earned
101        Comm                     12-Nov-03
101        Instr                    28-Jun-04
101        MEL                      9-Aug-04
103        Comm                     21-Dec-05
112        FA                       23-Jun-12
103        Instr                    18-Jan-06
112        LM                       27-Nov-15
5
Pilots
and
other
assignments. For
example,
pilot
The
for
The
crew
by
in
Command
a gross
The those
under
RC_Charter2 aircraft
The
can
attendants.
a crew to
a
aircraft
b
a complete
record
and flight
of all recurrency
credentials
and
of each requirement
maintained
requirements by
However,
engines
are
aircraft
must fly the
piston
operations
available.
aircraft
or turboprops
permitted
under
of a copilot
who is capable
as and
Part 135
even if FAR Part 135 permits
anticipates
the
of a pilot
ratings that
passengers cargo
A
Pilot have
as long
single-pilot
of conducting
crew
copilot.
exceed
5 500
the
requires
over
and securing
charter
of turbojet-powered and
requirements.
weighing
the loading
lease
and training
that
optionalities
Each charter
trip is requested
customers
charter
any
All suppressed
problem.
Rights
Reserved. content
does
May
many
have
the
the
5 500
kg,
of the
gross
pilot
takeoff
presence
of
aloadmaster
cargo.
assignment
kg
aircraft,
Both
and
weight.
one must
The database
or
and copilot
Those
more flight
be assigned
as
must be designed
capability.
trip
not materially
multiplicities.
not (yet)
may have
be
business
charter
copied, affect
scanned, the
overall
requested
to serve
many
entities,
or
duplicated, learning
Use the
following
five
business
trips.
a charter
in
whole
member
assigned
on
to it to
many charter serve
trips.
as crew
members.
ERD based on the business rules you wrote in Part a
relationships,
experience.
trip.
as a crew
employees
and implementable all
(Hint:
rules in the same format.)
by only one customer.
may be assigned
Include
not
and
writing the remaining
Drawthe fully labelled
Learning. that
of all crew
consisting
aircraft
of
may request
of this
Cengage
record
powered
presence
a crew
A customer
Each
deemed
are
the
larger
carry
relationships,
An employee
has
that
is available.
additional
as examples,
Some
2020
a detailed
rules
specified
Usingthis incomplete description of operations, writeall applicable businessrules to establish rules
review
record
currency
kg, single-pilot
have
number
anticipated
entities,
Copyright
a complete
must have a properly
manager
to
member to supervise
meet the
and
keep
must keep
and
Part 135 licensing,
the
If those
135 flight
requirements
work
135.
also leases
carry
maintain
company
require
Part
operations
same
company
aircraft
autopilot
are required
must meet the
to
licensing
5 500
many customers
operations
to
Part
their
is job-specific.
training.
company
aircraft
under
maintained
operations, flight
the
FAAs
For those
weight
as a properly
flight,
of the
(PIC).
takeoff
The
operations
to
that
data.
a charter all
is required
appropriate
curriculum
of all applicable
flight
is required
to the
training
FAA-approved
a review
company
Company
Part 135.
recurrency
on an
includes
subject
Company
mandated
meets
based
training
member
and of all compliance
who
is
data interpretation,
RC_Charter2
To conduct
must receive
training
RC_Charter2
each
all training
pilot
members
recurrency
weather
procedures. training
crew
Recurrency
regulations,
Editorial
by employees
or in Cengage
part.
optionalities
Due Learning
to
electronic reserves
rights, the
right
some to
and
third remove
party additional
multiplicities.
CHAPTER 6 Data Modelling Advanced Concepts
IN THIS CHAPTER, YOU WILL LEARN:
About the extended entity relationship (EER) model's main constructs
How entity clusters are used to represent multiple entities and relationships
The characteristics of good primary keys and how to select them
How to use flexible solutions for special data modelling cases
PREVIEW In the
previous
properly relationship entity
Most
(EER)
current
important
of primary
adapted
to
2020 has
you
carry
any
All suppressed
out
does
May
not materially
design,
of
on relational
why this
tables,
chapter
this
chapter
databases. it is
for
As the
essential
to learn
Primary key selection covers
critical
designs,
keys. (Flexible
designs
changing
data
aspects
should
of poor
modelling
copied, affect
scanned, the
overall
database
tasks, data
or
duplicated, learning
know
the
in
whole
Cengage
requirements.) a good
mantra:
Data
foundation
good
and no amount
chapter
or in
of
can be
database
of outstanding
design.)
modelling
experience.
the
special
are designs that
providing
designs,
some
proper identification
and information
of databases, (You
basic
also illustrates
of flexible
on bad database
be
to
entity
and adds support
and how to select them.
development
outlines
not
based among
which is
of foreign
data
(ERDs)
extended
placement.
the limitations
Reserved. content
keys
development.
that
Rights
are
the importance
be based
diagrams
about the
on ER concepts
associations
database
demands
relationship
will learn
clustering.
chance,
step in the
checklist
Learning. that
create
and
and placement
can overcome
Cengage deemed
to
highlight
a vital
entity
primary
application
To help
review
be left
cannot
modelling
Copyright
to
successful
applications coding
to
of good
meet the is
entity
you
implementations
keys
on practical
keys
modelling
Editorial
uses
cases that
primary
for
and
key identification
Focusing
design
use
model. The EER model builds
database
model
how to
model. In this chapter,
subtypes
characteristics
is too
you learnt
a data
supertypes,
relational
the
chapters,
create
concludes
with
a database
principles.
6.1 THE EXTENDED ENTITY RELATIONSHIP MODEL
As the complexity of the data structures being modelled has increased, and as application software requirements have become more stringent, there has been an increasing need to capture more information in the data model. The extended entity relationship model (EERM), sometimes referred to as the enhanced entity relationship model, is the result of adding more semantic constructs to the original entity relationship (ER) model. As you might expect, a diagram using this model is called an EER diagram (EERD). In the following sections, you will learn about the main EER model constructs (entity supertypes, entity subtypes and entity clustering) and see how they are represented in ERDs. Following on from Chapter 5, Data Modelling with Entity Relationship Diagrams, this chapter will use UML notation to produce EER diagrams.
6.1.1 Entity Supertypes and Subtypes
In the real world, most businesses employ people with a wide range of skills and special qualifications. In fact, data modellers find many ways to group employees based on employee characteristics. For instance, a retail company would group employees as salaried and hourly employees, while a university would group employees as faculty, staff and administrators. The grouping of employees to create various types of employees provides two important benefits:
It avoids unnecessary nulls in the employee attributes when some employees have characteristics that are not shared by other employees.
It enables a particular employee type to participate in relationships that are unique to that employee type.

FIGURE 6.1 Nulls created by unique attributes
EMP_NUM  EMP_LNAME  EMP_FNAME  EMP_LICENCE  EMP_RATINGS             EMP_MED_TYPE  EMP_HIRE_DATE
100      Nkosi      Cela                                                          15-Mar-98
101      Lewis      Marcos     ATP          SEL/MEL/Instr/CFII      1             25-Apr-99
102      Vandam     Jean                                                          20-Dec-03
103      Jones      Victoria                                                      28-Aug-13
104      Lange      Edith      ATP          SEL/MEL/Instr           1             20-Oct-07
105      Williams   Gabriel    COM          SEL/MEL/Instr/CFI       2             08-Nov-07
106      Naidu      Theeban    COM          SEL/MEL/Instr           2             05-Jan-14
107      Diante     Venite                                                        02-Jul-07
108      Shenge     Mhambi                                                        18-Nov-05
109      Travis     Brett      COM          SEL/MEL/SES/Instr/CFII  1             14-Apr-11
110      Genkazi    Stan                                                          01-Dec-13
[The figure also has an EMP_INITIAL column; its row values are not recoverable here.]

To illustrate those benefits, let's explore the case of an aviation business. The aviation business employs pilots, mechanics, secretaries, accountants, database managers and many other types of employees. Figure 6.1 illustrates how pilots share certain characteristics with other employees, such as a last name (EMP_LNAME) and hire date (EMP_HIRE_DATE). On the other hand, many pilot characteristics
are not shared by other employees. For example, unlike other employees, pilots must meet special requirements such as flight hour restrictions, flight checks and periodic training. Therefore, if all employee characteristics and special qualifications were stored in a single EMPLOYEE entity, you would have a lot of nulls or you would have to make a lot of needless dummy entries. In this case, special pilot characteristics such as EMP_LICENCE, EMP_RATINGS and EMP_MED_TYPE will generate nulls for employees who are not pilots. In addition, pilots participate in some relationships that are unique to their qualifications. For example, not all employees can fly aircraft; only employees who are pilots can participate in the employee flies aircraft relationship.
Based on the preceding discussion, you would correctly deduce that the PILOT entity stores only those attributes that are unique to pilots, and that the EMPLOYEE entity stores attributes that are common to all employees. Based on that hierarchy, you can conclude that PILOT is a subtype of EMPLOYEE and that EMPLOYEE is the supertype of PILOT. In modelling terms, an entity supertype is a generic entity type that is related to one or more entity subtypes, where the entity supertype contains the common characteristics and the entity subtypes contain the unique characteristics of each entity subtype. In the next section, you will learn how the entity supertypes and subtypes are related in a specialisation hierarchy.

6.1.2 Specialisation Hierarchy
Entity supertypes and subtypes are organised in a specialisation hierarchy. The specialisation hierarchy depicts the arrangement of higher-level entity supertypes (parent entities) and lower-level entity subtypes (child entities). Figure 6.2 shows the specialisation hierarchy formed by an EMPLOYEE supertype and three entity subtypes: PILOT, MECHANIC and ACCOUNTANT. The specialisation hierarchy reflects the 1:1 relationship between EMPLOYEE and each of its subtypes. For example, a PILOT subtype occurrence is related to one instance of the EMPLOYEE supertype, and a MECHANIC subtype occurrence is related to one instance of the EMPLOYEE supertype.
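The supertype/subtype split can be illustrated, loosely, with class inheritance. The sketch below is an analogy rather than a database design: the class and field names mirror the EMPLOYEE, PILOT and MECHANIC entities, the pilot values come from the sample data for employee 101, and the mechanic's title and certificate values are invented placeholders.

```python
from dataclasses import dataclass


@dataclass
class Employee:
    """Supertype: attributes common to every employee."""
    emp_num: int
    emp_lname: str
    emp_fname: str


@dataclass
class Pilot(Employee):
    """Subtype: only the attributes unique to pilots."""
    pil_licence: str
    pil_ratings: str
    pil_med_type: int


@dataclass
class Mechanic(Employee):
    """Subtype: only the attributes unique to mechanics."""
    mec_title: str
    mec_cert: str


pilot = Pilot(101, "Lewis", "Marcos", "ATP", "SEL/MEL/Instr/CFII", 1)
# Title and certificate below are illustrative placeholders, not source data.
mechanic = Mechanic(107, "Diante", "Venite", "Lead mechanic", "A&P")
accountant_as_plain_employee = Employee(102, "Vandam", "Jean")
```

A `Pilot` is an `Employee` (the IS-A relationship), while a plain `Employee` carries no pilot attributes at all, so there is nothing to leave null.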
NOTE In
UML
notation,
notation
also
subtypes
enables
generalisation
and
within the
directly
related.
relationships in turn,
Copyright Editorial
review
2020 has
Learning. that
any
of the
All suppressed
Reserved. content
does
May not
the
understand
have
is the
not materially
be
copied, affect
scanned, the
overall
or
duplicated, learning
a
to
in experience.
can
hierarchy
or in Cengage
part.
to
tree in
(supertype)
continue
are
mechanic
in
UML
UML
as class
which
each
class.
Throughout
in
Due Learning
to
electronic reserves
through
sometimes
is
child
class
this
chapter. in
and
hierarchy,
many levels
which
the
described
an employee,
can have only have
other lower-level
whole
superclasses.
are referred
within a specialisation
hierarchy
a specialisation supertype
as you
and every subtype
a specialisation
as
in all discussions.
hierarchy
that,
which
of another
will be explained
an employee,
known
an upside-down
a subclass
specialisation
a pilot is
to
can
subtypes
Rights
6.2
are
hierarchies,
and supertype
Figures
supertypes
resembles
class is
of a supertype
However, you
child
within
and
specialisation hierarchy
subtype
in
context
that is,
one
Cengage deemed
Each
It is important
only
subclasses
represent
For example,
is an employee.
it is
class.
depicted
relationships.
called
A class
symbols
The relationships
exist
to
we will use the terms
The terminology
IS-A
you
hierarchies.
has only one parent chapter
are
terms
of
an accountant
a subtype
one supertype
can
to
which
of supertype/subtype
a supertype
has
many subtypes;
subtypes.
FIGURE 6.2 A specialisation hierarchy
[Figure: UML class diagram. The supertype EMPLOYEE (EMP_NUM {PK}, EMP_LNAME, EMP_FNAME, EMP_INITIAL, EMP_HIRE_DATE, EMP_TYPE) is linked by a "has" relationship to DEPENDANT (EMP_NUM {PK} {FK1}, DPNT_NUM {PK}, DPNT_FNAME, DPNT_LNAME, DPNT_RELATION); DEPENDANT illustrates an inherited relationship. The EMPLOYEE attributes are inherited by all subtypes. The discriminator is emp_type, and the generalisation carries the constraints {OR, Optional}: OR is an example of a disjoint constraint and Optional is an example of a participation constraint. Subtypes, each with the attributes unique to it: PILOT (PIL_LICENCE, PIL_RATINGS, PIL_MED_TYPE), MECHANIC (MEC_TITLE, MEC_CERT) and ACCOUNTANT (ACT_TITLE, ACT_CPA_DATE).]
Online Content This chapter covers only specialisation hierarchies. The EER model also supports specialisation lattices, where a subtype can have multiple parents (supertypes). However, those concepts are better covered under the object-oriented model in Appendix G, Object-Orientated Databases. The appendix is available on the online platform for this book.
As you can see in Figure 6.2, specialisation hierarchies enable the data model to capture additional semantic content (meaning) into the ERD. A specialisation hierarchy provides the means to:
Support attribute inheritance.
Define a special supertype attribute known as the subtype discriminator.
Define disjoint/overlapping constraints and complete/partial constraints.
The following sections will cover such characteristics and constraints in more detail.
6.1.3 Inheritance
The property of inheritance enables an entity subtype to inherit the attributes and relationships of the supertype. As discussed earlier, a supertype contains those attributes that are common to all of its subtypes. In contrast, subtypes contain only the attributes that are unique to the subtype. For example, Figure 6.2 illustrates that pilots, mechanics and accountants all inherit the employee number, last name, first name, middle initial, hire date and so on. However, Figure 6.2 also illustrates that pilots have attributes that are unique; the same is true for mechanics and accountants. One important inheritance characteristic is that all entity subtypes inherit their primary key attribute from their supertype. Note in Figure 6.2 that the EMP_NUM attribute is the primary key for each of the subtypes, but it is not shown in the subtype.
At the implementation level, the supertype and its subtype(s) depicted in the specialisation hierarchy maintain a 1:1 relationship. For example, the specialisation hierarchy lets you replace the undesirable EMPLOYEE table structure in Figure 6.1 with two tables, one representing the supertype EMPLOYEE and the other representing the subtype PILOT. (See Figure 6.3.)

FIGURE 6.3 The EMPLOYEE-PILOT supertype-subtype relationship
Table name: EMPLOYEE
EMP_NUM  EMP_LNAME  EMP_FNAME  EMP_HIRE_DATE  EMP_TYPE
100      Nkosi      Cela       15-Mar-98
101      Lewis      Marcos     25-Apr-99      P
102      Vandam     Jean       20-Dec-03      A
103      Jones      Victoria   28-Aug-13
104      Lange      Edith      20-Oct-07      P
105      Williams   Gabriel    08-Nov-07      P
106      Naidu      Theeban    05-Jan-14      P
107      Diante     Venite     02-Jul-07      M
108      Shenge     Mhambi     18-Nov-05      M
109      Travis     Brett      14-Apr-11      P
110      Genkazi    Stan       01-Dec-13      A
[The figure also has an EMP_INITIAL column; its row values are not recoverable here.]

Table name: PILOT
EMP_NUM  PIL_LICENCE  PIL_RATINGS             PIL_MED_TYPE
101      ATP          SEL/MEL/Instr/CFII      1
104      ATP          SEL/MEL/Instr           1
105      COM          SEL/MEL/Instr/CFI       2
106      COM          SEL/MEL/Instr           2
109      COM          SEL/MEL/SES/Instr/CFII  1

Entity subtypes inherit all relationships in which the supertype entity participates. For example, Figure 6.2 shows the EMPLOYEE entity supertype participating in a 1:* relationship with a DEPENDANT entity. Through inheritance, all subtypes are also able to participate in that relationship. In specialisation hierarchies with multiple levels of supertype/subtypes, a lower-level subtype inherits all of the attributes and relationships from all of its upper-level supertypes.
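One common way to implement the 1:1 supertype/subtype relationship is to give the subtype table the same primary key as the supertype and declare it a foreign key. The sketch below assumes that approach and uses SQLite through Python's standard library; the column subset, the data types and the helper name `pilots_with_names` are choices made for this illustration, and only three employees from the sample data are loaded.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when asked
conn.executescript("""
CREATE TABLE EMPLOYEE (
    EMP_NUM   INTEGER PRIMARY KEY,
    EMP_LNAME TEXT NOT NULL,
    EMP_FNAME TEXT NOT NULL,
    EMP_TYPE  TEXT            -- subtype discriminator: P, M, A or NULL
);
CREATE TABLE PILOT (
    -- Same key as the supertype: at most one PILOT row per EMPLOYEE row.
    EMP_NUM      INTEGER PRIMARY KEY REFERENCES EMPLOYEE (EMP_NUM),
    PIL_LICENCE  TEXT,
    PIL_RATINGS  TEXT,
    PIL_MED_TYPE INTEGER
);
""")

with conn:
    conn.executemany(
        "INSERT INTO EMPLOYEE VALUES (?, ?, ?, ?)",
        [(101, "Lewis", "Marcos", "P"),
         (102, "Vandam", "Jean", "A"),
         (104, "Lange", "Edith", "P")],
    )
    conn.executemany(
        "INSERT INTO PILOT VALUES (?, ?, ?, ?)",
        [(101, "ATP", "SEL/MEL/Instr/CFII", 1),
         (104, "ATP", "SEL/MEL/Instr", 1)],
    )


def pilots_with_names(connection):
    """Join subtype rows to the inherited supertype attributes."""
    return connection.execute(
        "SELECT E.EMP_NUM, E.EMP_LNAME, P.PIL_LICENCE "
        "FROM EMPLOYEE E JOIN PILOT P ON P.EMP_NUM = E.EMP_NUM "
        "ORDER BY E.EMP_NUM"
    ).fetchall()
```

The accountant (102) has no PILOT row, so no pilot columns are left null, and the join recovers the inherited attributes whenever they are needed.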
6.1.4 Subtype Discriminator
A subtype discriminator is the attribute in the supertype entity that determines to which entity subtype each supertype occurrence is related. As seen in Figure 6.2, the subtype discriminator is the employee type (EMP_TYPE). It is common practice to show the subtype discriminator and its value for each subtype in the ER diagram, as seen in Figure 6.2. However, not all ER modelling tools follow that practice. In Figure 6.2, the discriminator was added in MS Visio by using the UML generalisation properties. It's important to note that the default comparison condition for the subtype discriminator attribute is the equality comparison. However, there may be situations in which the subtype discriminator is not necessarily based on an equality comparison. For example, based on business requirements, you may create two new pilot subtypes, PIC (pilot-in-command) qualified and copilot qualified only. A PIC-qualified pilot will be anyone with more than 1 500 PIC flight hours. In this case, the subtype discriminator would be Flight_Hours and the criteria would be > 1500 or <= 1500, respectively.
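A discriminator based on a range rather than on equality can be sketched as a small classification function. The function name `pilot_subtype` and the returned labels are hypothetical; only the 1 500-hour threshold comes from the text.

```python
PIC_HOURS_THRESHOLD = 1500  # from the business rule: more than 1 500 PIC hours


def pilot_subtype(flight_hours):
    """Map the Flight_Hours discriminator to a pilot subtype label."""
    # Strictly greater than the threshold qualifies as PIC; otherwise copilot only.
    return "PIC" if flight_hours > PIC_HOURS_THRESHOLD else "COPILOT"
```

Note that a pilot with exactly 1 500 hours falls on the copilot side of the boundary, because the rule says "more than".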
NOTE When creating a specialisation hierarchy using UML notation in MS Visio, you should use the generalisation object to connect the subtype entity to the supertype entity. The subtype discriminator is typed into the field called discriminator through the UML generalisation properties box.

Online Content For a tutorial on using MS Visio to create a specialisation hierarchy, see Appendix A, Designing Databases with Visio Professional: A Tutorial, available on the online platform for this book.
6.1.5 Disjoint and Overlapping Constraints
An entity supertype can have disjoint or overlapping entity subtypes. For example, in the aviation example, an employee can be a pilot or a mechanic or an accountant. Assume that one of the business rules dictates that an employee cannot belong to more than one subtype at a time; that is, an employee cannot be a pilot and a mechanic at the same time. Disjoint subtypes, also known as non-overlapping subtypes, are subtypes that contain a unique subset of the supertype entity set; in other words, each entity instance of the supertype can appear in only one of the subtypes. When using UML notation, a disjoint relationship is represented by an OR, and an overlapping constraint is represented by an AND. For example, in Figure 6.2, an employee (supertype) who is a pilot (subtype) can appear only in the PILOT subtype, not in any of the other subtypes. You can see that when using MS Visio to produce ERDs using UML notation, the disjoint subtype is indicated by placing the word OR in brackets.
On the other hand, if the business rule specifies that employees can have multiple classifications, the EMPLOYEE supertype may contain overlapping job classification subtypes. Overlapping or non-disjoint subtypes are subtypes that contain non-unique subsets of the supertype entity set; that is, each entity instance of the supertype may appear in more than one subtype. For example, in a university environment, a person may be an employee or a student or both. In turn, an employee may be a lecturer as well as an administrator. Because an employee also may be a student, STUDENT and EMPLOYEE are overlapping subtypes of the supertype PERSON, just as LECTURER and ADMINISTRATOR are overlapping subtypes of the supertype EMPLOYEE. Figure 6.4 illustrates how these overlapping subtypes are represented in UML notation by placing the word AND in brackets.

FIGURE 6.4 Specialisation hierarchy with overlapping subtypes
It is
common
However, does
practice
to show
not all ER
not show
add the OR
the
modelling
the
disjoint/overlapping
tools
follow
disjoint/overlapping
and AND
constraints
that
symbols
practice.
constraints.
in Figures
in the
ERD. (See
For example,
Therefore,
the
when
Figure
using
MS Visio text
6.2 and
UML
tool
Figure
notation,
6.4.)
MS Visio
was used to
manually
6.2 and 6.4.
NOTE
Alternative notations exist for representing disjoint/overlapping subtypes. For example, Toby J. Teorey popularised the use of G and Gs to indicate disjoint and overlapping subtypes.
As you learnt earlier in this section, the implementation of disjoint subtypes is based on the value of the subtype discriminator attribute in the supertype. However, implementing overlapping subtypes requires the use of one discriminator attribute for each subtype. For example, in the case of the Tiny University database design you saw in Chapter 5, Data Modelling with Entity Relationship Diagrams, a lecturer can also be an administrator. Therefore, the EMPLOYEE supertype would have the subtype discriminator attributes and values shown in Table 6.1.
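The difference between the two implementations can be sketched in code. The following is a minimal illustration only, using Python's built-in sqlite3 module; the table names, the column names (emp_type, is_lecturer, is_administrator) and the type codes 'P', 'M' and 'A' are assumptions made for the example and do not come from the Tiny University design.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Disjoint subtypes: one discriminator column names the single subtype.
CREATE TABLE employee_disjoint (
    emp_num  INTEGER PRIMARY KEY,
    emp_type TEXT CHECK (emp_type IN ('P', 'M', 'A'))  -- hypothetical codes
);
-- Overlapping subtypes: one Y/N discriminator column per subtype.
CREATE TABLE employee_overlap (
    emp_num          INTEGER PRIMARY KEY,
    is_lecturer      TEXT NOT NULL CHECK (is_lecturer IN ('Y', 'N')),
    is_administrator TEXT NOT NULL CHECK (is_administrator IN ('Y', 'N'))
);
INSERT INTO employee_disjoint VALUES (201, 'P');
INSERT INTO employee_overlap VALUES (101, 'Y', 'N'), (102, 'N', 'Y'), (103, 'Y', 'Y');
""")


def subtypes_of(emp_num):
    """Return the subtypes that an employee in employee_overlap belongs to."""
    row = conn.execute(
        "SELECT is_lecturer, is_administrator FROM employee_overlap WHERE emp_num = ?",
        (emp_num,),
    ).fetchone()
    names = ("LECTURER", "ADMINISTRATOR")
    return [name for name, flag in zip(names, row) if flag == "Y"]


print(subtypes_of(103))
```

With a single discriminator column, an employee can be tagged with only one subtype at a time, which is why the overlapping case needs one flag per subtype.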
TABLE 6.1 Discriminator attributes with overlapping subtypes
Discriminator Attributes | Comment
Lecturer | Administrator |
Y | N | The Employee is a member of the Lecturer subtype.
N | Y | The Employee is a member of the Administrator subtype.
Y | Y | The Employee is both a Lecturer and an Administrator.

6.1.6 Completeness Constraint
The completeness constraint specifies whether each entity supertype occurrence must also be a member of at least one subtype. The completeness constraint can be partial or total. Partial completeness means that not every supertype occurrence is a member of a subtype; that is, there may be some supertype occurrences that are not members of any subtype. Total completeness means that every supertype occurrence must be a member of at least one subtype.
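The completeness and disjoint/overlapping rules can be read as two independent checks on the set of subtypes each supertype occurrence belongs to. The sketch below is a hypothetical helper, not part of any modelling tool; the function name, the sample keys and the subtype names are assumptions chosen for illustration.

```python
def check_hierarchy(memberships, total, disjoint):
    """Return a list of (key, problem) pairs for a specialisation hierarchy.

    memberships maps each supertype occurrence to the set of subtypes it
    belongs to; total and disjoint switch the two constraints on or off.
    """
    violations = []
    for key, subtypes in memberships.items():
        if total and not subtypes:
            violations.append((key, "total completeness: no subtype"))
        if disjoint and len(subtypes) > 1:
            violations.append((key, "disjoint: more than one subtype"))
    return violations


# Hypothetical sample: occurrence 2 has no subtype, occurrence 3 has two.
sample = {1: {"PILOT"}, 2: set(), 3: {"PILOT", "MECHANIC"}}

print(check_hierarchy(sample, total=False, disjoint=False))  # partial, overlapping
print(check_hierarchy(sample, total=True, disjoint=True))    # total, disjoint
```

Under partial completeness with overlapping subtypes the sample is valid; switching on either constraint flags the occurrence that breaks it.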
NOTE
Alternative notations exist to represent the completeness constraint. For example, when using MS Visio Crow's Foot notation, a single horizontal line under the circle represents a partial completeness constraint; a double horizontal line under the circle represents a total completeness constraint. This representation is often referred to as the category shape.

In UML, the completeness constraint as described above is represented by the participation constraint. Partial completeness is represented by Optional participation whilst total completeness is represented by Mandatory participation. The participation constraints are shown in brackets and can be seen in Figures 6.2 and 6.4.

Given the disjoint/overlapping subtypes and completeness constraints, it is possible to have the specialisation hierarchy constraint scenarios shown in Table 6.2.

TABLE 6.2 Specialisation hierarchy constraint scenarios
Type | Disjoint Constraint {OR} | Overlapping Constraint {AND}
Partial {Optional} | Supertype has optional subtypes. Subtype discriminator can be null. Subtype sets are unique. | Supertype has optional subtypes. Subtype discriminators can be null. Subtype sets are not unique.
Total {Mandatory} | Every supertype instance is a member of a (at least one) subtype. Subtype discriminator cannot be null. Subtype sets are unique. | Every supertype instance is a member of a (at least one) subtype. Subtype discriminators cannot be null. Subtype sets are not unique.
6.1.7 Specialisation and Generalisation
You can use different approaches to develop entity supertypes and subtypes. For example, you can first identify a regular entity, and then identify all entity subtypes based on their distinguishing characteristics. You also can start by identifying multiple entity types and then later extract the common characteristics of those entities and create a higher-level supertype entity.

Specialisation is the top-down process of identifying lower-level, more specific entity subtypes from a higher-level entity supertype. Specialisation is based on grouping unique characteristics and relationships of the subtypes. In the aviation example, you used specialisation to identify multiple entity subtypes from the original employee supertype.

Generalisation is the bottom-up process of identifying a higher-level, more generic entity supertype from lower-level entity subtypes. Generalisation is based on grouping common characteristics and relationships of the subtypes. For example, you might identify multiple types of musical instruments: piano, violin and guitar. Using the generalisation approach, you could identify a string instrument entity supertype to hold the common characteristics of the multiple subtypes.
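Generalisation can be pictured as a set operation: the attributes shared by every entity move up into the supertype, and whatever is left stays in each subtype. The sketch below is only an illustration of that idea; the attribute names (INSTR_ID, NUM_KEYS and so on) are invented for the example and do not appear in the text.

```python
# Hypothetical attribute sets for three entities identified bottom-up.
instruments = {
    "PIANO": {"INSTR_ID", "INSTR_MAKER", "INSTR_PRICE", "NUM_KEYS"},
    "VIOLIN": {"INSTR_ID", "INSTR_MAKER", "INSTR_PRICE", "BOW_TYPE"},
    "GUITAR": {"INSTR_ID", "INSTR_MAKER", "INSTR_PRICE", "NUM_FRETS"},
}


def generalise(entities):
    """Split entity attributes into a common supertype part and subtype parts."""
    common = set.intersection(*entities.values())
    subtypes = {name: attrs - common for name, attrs in entities.items()}
    return common, subtypes


supertype_attrs, subtype_attrs = generalise(instruments)
print(sorted(supertype_attrs))
print({name: sorted(attrs) for name, attrs in subtype_attrs.items()})
```

Specialisation runs the other way: it starts from the supertype's attribute set and adds the attributes that are unique to each subtype.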
6.1.8 Composition and Aggregation
So far we have looked at how to model relationships between entities using IS-A relationships. Suppose we have two entities, one called DEPARTMENT and one called UNIVERSITY. The relationship between the two could be described as a part_of or has_a relationship, as the DEPARTMENT entity is a part_of the UNIVERSITY entity. This type of relationship is known as aggregation, whereby a larger entity can be composed of smaller entities. A special case of aggregation is known as composition. This is a much stronger relationship than aggregation, since when the parent entity instance is deleted, all child entity instances are automatically deleted. Consider the two entities BUILDING and ROOM, where BUILDING is the parent entity and ROOM is the child entity. A ROOM is part_of a BUILDING, and if the building were destroyed then all the rooms would also be destroyed.
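One way to see the difference in delete behaviour is to implement both relationships in a small relational schema. The sketch below uses Python's sqlite3 module and is an assumption-laden illustration: the column names are invented, and mapping aggregation to ON DELETE SET NULL and composition to ON DELETE CASCADE is one possible implementation choice, not something the UML notation itself prescribes.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when enabled
conn.executescript("""
CREATE TABLE university (univ_id INTEGER PRIMARY KEY);
-- Aggregation: the child survives; its link to the parent becomes NULL.
CREATE TABLE department (
    dept_id INTEGER PRIMARY KEY,
    univ_id INTEGER REFERENCES university (univ_id) ON DELETE SET NULL
);
CREATE TABLE building (bldg_id INTEGER PRIMARY KEY);
-- Composition: the child cannot outlive its parent.
CREATE TABLE room (
    room_id INTEGER PRIMARY KEY,
    bldg_id INTEGER NOT NULL REFERENCES building (bldg_id) ON DELETE CASCADE
);
INSERT INTO university VALUES (1);
INSERT INTO department VALUES (10, 1), (11, 1);
INSERT INTO building VALUES (1);
INSERT INTO room VALUES (100, 1), (101, 1);
DELETE FROM university WHERE univ_id = 1;
DELETE FROM building WHERE bldg_id = 1;
""")

departments_left = conn.execute("SELECT COUNT(*) FROM department").fetchone()[0]
rooms_left = conn.execute("SELECT COUNT(*) FROM room").fetchone()[0]
print(departments_left, rooms_left)
```

After both parents are deleted, the DEPARTMENT rows remain (with no parent reference) while the ROOM rows are gone.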
TABLE 6.3 Aggregations and compositions
UML Construct | UML Symbol | Description
Aggregation | Empty diamond on the parent side | This type of association represents a part_of or has_a relationship (that is, an entity is formed as a collection of other entities). An aggregation indicates that the dependent (child) entity instance has an optional association with the strong (parent) entity instance. When the parent entity instance is deleted, the child entity instances are not deleted. The aggregation association is represented with an empty diamond in the side of the parent entity.
Composition | Filled diamond on the parent side | This type of association is a special case of aggregation. A composition indicates that the dependent (child) entity instance has a mandatory association with the strong (parent) entity instance. When the parent entity instance is deleted, all child entity instances are automatically deleted. The composition association is represented with a filled diamond in the side of the parent entity. This is the equivalent of a weak entity in the ER model.
There is no ability to model such relationships using Crow's Foot notation; the concept has been developed in the more contemporary UML standard, in which UML class diagrams use aggregation and composition constructs. In particular, aggregation and composition are depicted through association lines that indicate the strength of the dependency between the two participating entities. Table 6.3 summarises the main characteristics of the aggregation and composition UML constructs.

The use of the aggregation and composition constructs can be classified as follows:
• An aggregation construct is used when an entity is composed of (or is formed by) a collection of other entities, but the entities are independent of each other. That is, the relationship can be classified as a has_a relationship type. For example, a team has many players, or a band has many musicians.
• A composition construct is used when two entities are associated in a strong identifying relationship. That is, deleting the parent deletes the children instances. For example, an invoice contains many invoice lines, or an order contains many order lines.

Examine Figure 6.5 to help you understand the use of aggregation and composition.
FIGURE 6.5 Aggregation and composition
[Diagram. Aggregation: OWNER (OWNER_ID, OWNER_FNAME, OWNER_LNAME, OWNER_INIT, OWNER_DRIVER_LIC) 1..1 is_owned_by 0..* CAR (CAR_VIN, CAR_YEAR, CAR_BRAND, CAR_MODEL, OWNER_ID). Deleting an OWNER parent instance does not delete all related CAR children instances. Composition: INVOICE (INV_NUMBER, INV_DATE, CUS_CODE) 1 contains 1..* LINE (INV_NUMBER, LINE_NUMBER, P_CODE, LINE_UNITS, LINE_PRICE). Deleting an INVOICE parent instance deletes all related LINE children instances.]

6.2 ENTITY CLUSTERING
Developing an ER diagram entails the discovery of possibly hundreds of entity types and their respective relationships. Generally, the data modeller will develop an initial ERD containing a few entities. As the design approaches completion, the ERD will contain hundreds of entities and relationships that crowd the diagram to the point of making it unreadable and inefficient as a communication tool. In those cases, you can use entity clusters to minimise the number of entities shown in the ERD.
An entity cluster is a virtual entity type used to represent multiple entities and relationships in the ERD. An entity cluster is formed by combining multiple interrelated entities into a single abstract entity object. An entity cluster is considered virtual or abstract in the sense that it is not actually an entity in the final ERD; it is a temporary entity used to represent multiple entities and relationships, with the purpose of simplifying the ERD and thus enhancing its readability.

Figure 6.6 illustrates the use of entity clusters based on the Tiny University example that was first introduced in Chapter 5, Data Modelling with Entity Relationship Diagrams. Note that the ERD contains two entity clusters:
• OFFERING, grouping the COURSE and CLASS entities and relationships.
• LOCATION, grouping the ROOM and BUILDING entities and relationships.

Note also that the ERD in Figure 6.6 does not show attributes for the entities. When using entity clusters, the key attributes of the combined entities are no longer available. Without the key attributes, primary key inheritance rules change. In turn, the change in the inheritance rules can have undesirable consequences, such as changes in relationships (from identifying to non-identifying or vice versa) and the loss of foreign key attributes from some entities. To eliminate those problems, the general rule is to avoid the display of attributes when entity clusters are used.

FIGURE 6.6 Tiny University ERD using entity clusters
[Diagram: the SCHOOL, DEPARTMENT, LECTURER, STUDENT and ENROL entities together with the OFFERING and LOCATION entity clusters. Relationship labels shown: is_dean_of, operates, employs, chairs, has, offers, is_written_in, is_found_in, teaches and is_used_for.]
6.3 ENTITY INTEGRITY: SELECTING PRIMARY KEYS
Arguably, the most important characteristic of an entity is its primary key (a single attribute or some combination of attributes) used to uniquely identify each entity instance. The primary key's function is to guarantee entity integrity. Furthermore, primary keys and foreign keys work together to implement relationships in the relational model. Therefore, the importance of properly selecting the primary key has a direct bearing on the efficiency and effectiveness of database implementation.
6.3.1 Natural Keys and Primary Keys
The concept of a unique identifier is commonly encountered in the real world. For example, you use class (or section) numbers to register for classes, invoice numbers to identify specific invoices, account numbers to identify credit cards, and so on. Those examples illustrate natural identifiers or keys. A natural key or natural identifier is a real-world, generally accepted identifier used to distinguish (that is, uniquely identify) real-world objects. As its name implies, a natural key is familiar to end users and forms part of their day-to-day business vocabulary.

Usually, a data modeller uses a natural identifier as the primary key of the entity being modelled, assuming that the entity has a natural identifier. Generally, most natural keys make acceptable primary key identifiers. However, there are occasions when the entity being modelled does not have a natural primary key or the natural key is not a good primary key. For example, assume an ASSIGNMENT entity composed of the following attributes:

ASSIGNMENT (ASSIGN_DATE, PROJ_NUM, EMP_NUM, ASSIGN_HOURS, ASSIGN_CHG_HOUR, ASSIGN_CHARGE)

Which attribute (or combination of attributes) would make a good primary key? In Chapter 7, Normalising Database Designs, you will learn that trade-offs are often associated with the selection of different combinations of attributes to serve as the primary key for a specific table. You will also learn in Chapter 7 about the use of surrogate keys, which can also be used as a primary key. But what makes a good primary key? The next section gives some basic guidelines for selecting primary keys.
6.3.2 Primary Key Guidelines
A primary key is the attribute or combination of attributes that uniquely identifies entity instances in an entity set. However, can the primary key be based on, say, 12 attributes? And just how long can a primary key be? In previous examples, why was EMP_NUM selected as a primary key of EMPLOYEE and not a combination of EMP_LNAME, EMP_FNAME, EMP_INITIAL and EMP_DOB? Can a single 256-byte text attribute be a good primary key? The answer may depend on whom you ask. There is no single answer to those questions; however, there is a body of practice that database experts have built over the years. This section will examine that body of documented practices.

First, you should understand the function of a primary key. The primary key's main function is to uniquely identify an entity instance or row within a table. In particular, given a primary key value (that is, the determinant), the relational model can determine the value of all dependent attributes that describe the entity. Note that identification and description are separate semantic constructs in the model. The function of the primary key is to guarantee entity integrity, not to describe the entity.
Second, primary keys and foreign keys are used to implement relationships among entities. However, the implementation of such relationships is done mostly behind the scenes, hidden from end users. In the real world, end users identify objects based on the characteristics they know about the objects. For example, when shopping at a grocery store, you select products by taking them from a store display shelf and reading the labels, not by looking at the stock number. It is wise for database applications to mimic the human selection process as much as possible. Therefore, database applications should let the end user choose among multiple descriptive narratives of different objects while using primary key values behind the scenes. Keeping those concepts in mind, look at Table 6.4, which summarises desirable primary key characteristics.

TABLE 6.4 Desirable primary key characteristics
PK Characteristic | Rationale
Unique values | The PK must uniquely identify each entity instance. A primary key must be able to guarantee unique values. It cannot contain nulls.
Nonintelligent | The PK should not have embedded semantic meaning. An attribute with embedded semantic meaning is probably better used as a descriptive characteristic of the entity rather than as an identifier. In other words, a student ID of 650973 would be preferred over Smith, Martha L. as a primary key identifier.
No change over time | If an attribute has semantic meaning, it may be subject to updates. This is why names do not make good primary keys. If you have Vickie Smith as a primary key, what happens when she gets married and decides to change her surname to her husband's surname? If a primary key is subject to change, the foreign key values must be updated, thus adding to the database workload. Furthermore, changing a primary key value means that you are basically changing the identity of an entity.
Preferably single-attribute | A primary key should have the minimum number of attributes possible. Single-attribute primary keys are desirable but not required. Single-attribute primary keys simplify the implementation of foreign keys. Having multiple-attribute primary keys can cause primary keys of related entities to grow through the possible addition of many attributes, thus adding to the database workload and making (application) coding more cumbersome.
Preferably numeric | Unique values can be better managed when they are numeric because the database can use internal routines to implement a counter-style attribute that automatically increments values with the addition of each new row. In fact, most database systems include the ability to use special constructs, such as Autonumber in Microsoft Access, to support self-incrementing primary key attributes.
Security-compliant | The selected primary key must not be composed of any attribute(s) that might be considered a security risk or violation. For example, using an ID number as a PK in an EMPLOYEE table is not a good idea.
6.3.3 When to Use Composite Primary Keys
In the previous section, you learnt about the desirable characteristics of primary keys. For example, you learnt that the primary key should use the minimum number of attributes possible. However, that does not mean that composite primary keys are not permitted in a model. In fact, composite primary keys are particularly useful in two cases:
• As identifiers of composite entities, where each primary key combination is allowed only once in the *:* relationship.
• As identifiers of weak entities, where the weak entity has a strong identifying relationship with the parent entity.

To illustrate the first case, let us consider two examples. For the first example, assume that you have a STUDENT entity set and a CLASS entity set. In addition, assume that those two sets are related in a *:* relationship via an ENROL entity set in which each student/class combination may appear only once in the composite entity. Figure 6.7 shows the ERD to represent such a relationship using UML notation. As shown in Figure 6.7, the composite primary key automatically provides the benefit of ensuring that there cannot be duplicate values; that is, it ensures that the same student cannot enrol more than once in the same class.
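The duplicate-prevention benefit can be demonstrated with a small schema. The sketch below is a minimal illustration using Python's sqlite3 module; the table and column names follow the STUDENT, CLASS and ENROL entities, but the column types and the enrol helper function are assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when enabled
conn.executescript("""
CREATE TABLE student (stu_num INTEGER PRIMARY KEY, stu_lname TEXT);
CREATE TABLE class (class_code INTEGER PRIMARY KEY, crs_code TEXT);
CREATE TABLE enrol (
    class_code  INTEGER NOT NULL REFERENCES class (class_code),
    stu_num     INTEGER NOT NULL REFERENCES student (stu_num),
    enrol_grade TEXT,
    PRIMARY KEY (class_code, stu_num)  -- composite key: one row per pair
);
INSERT INTO student VALUES (321452, 'Ndlovu');
INSERT INTO class VALUES (10014, 'ACCT-211');
INSERT INTO enrol VALUES (10014, 321452, 'C');
""")


def enrol(class_code, stu_num):
    """Try to enrol a student; return False if the database rejects the row."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO enrol (class_code, stu_num) VALUES (?, ?)",
                (class_code, stu_num),
            )
        return True
    except sqlite3.IntegrityError:
        return False


duplicate_accepted = enrol(10014, 321452)
print(duplicate_accepted)
```

The second attempt to enrol student 321452 in class 10014 is rejected by the composite primary key itself; no application-level check is needed.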
FIGURE 6.7 The *:* relationship between STUDENT and CLASS
[Diagram: STUDENT (STU_NUM {PK}, STU_LNAME, STU_INIT) 1..1 is_written_in 0..* ENROL (CLASS_CODE, STU_NUM, ENROL_GRADE) 0..* is_found_in 1..1 CLASS (CLASS_CODE {PK}, CRS_CODE, CLASS_SECTION). In ENROL, CLASS_CODE and STU_NUM are foreign keys that together form the composite {PK}.]

Table name: STUDENT
STU_NUM | STU_LNAME | STU_FNAME | STU_INIT
321452 | Ndlovu | Amehlo | C
324257 | Smithson | Anne | K
324258 | Le Roux | Dan |
324269 | Oblonski | Walter | H
324273 | Smith | John | D
324274 | Katinga | Raphael | P
324291 | Ismail | Hemalika | T
324299 | Smith | John | B

Table name: ENROL
CLASS_CODE | STU_NUM | ENROL_GRADE
10014 | 321452 | C
10014 | 324257 | B
10018 | 321452 | A
10018 | 324257 | B
10021 | 321452 | C
10021 | 324257 | C

Table name: CLASS
CLASS_CODE | CRS_CODE | CLASS_SECTION
10012 | ACCT-211 | 1
10013 | ACCT-211 | 2
10014 | ACCT-211 | 3
10015 | ACCT-212 | 1
10016 | ACCT-212 | 2
10017 | CIS-220 | 1
10018 | CIS-220 | 2
10019 | CIS-220 | 3
10020 | CIS-420 | 1
10021 | QM-261 | 1
10022 | QM-261 | 2
10023 | QM-362 | 1
10024 | QM-362 | 2
10025 | MATH-243 | 1

The second example, shown in Figure 6.8, further illustrates the use of composite primary keys. The entities BOOKING and TOUR are related in a *:* relationship via the TOUR_BOOKING entity, in which each booking/tour combination can appear only once in the TOUR_BOOKING entity.

In the second case, a weak entity in a strong identifying relationship with a parent entity is normally used to represent one of two situations:

1. A real-world object that is existent-dependent on another real-world object. Those types of objects are distinguishable in the real world. A dependant and an employee are two separate people who exist independently of each other. However, such objects can exist in the model only when they relate to each other in a strong identifying relationship. For example, the relationship between EMPLOYEE and DEPENDANT is one of existence dependency in which the primary key of the dependent entity is a composite key that contains the key of the parent entity.
FIGURE 6.8 The *:* relationship between BOOKING and TOUR
[Diagram: BOOKING (BOOKING_NO {PK}) 1..1 may_contain 0..* TOUR_BOOKING 0..* has 1..1 TOUR (TOUR_ID {PK}). TOUR_BOOKING carries TOUR_ID and BOOKING_NO as foreign keys that together form its composite {PK}. Other attribute labels in the diagram include EMP_ID, CUST_NO, BOOKING_STATUS_CODE, EVENT_ID, HOTEL_ID, FLIGHT_NO, BOOKING_TOTAL_COST, BOOKING_DATE, TOUR_NAME, TOUR_DESCRIPTION, TOUR_DATE, TOUR_PRICE_ADULT, TOUR_PRICE_CHILD, TOUR_PRICE_CON and TOTAL_TOUR_COST.]
2. A real-world object that is represented in the data model as two separate entities in a strong identifying relationship. For example, the real-world invoice object is represented by two entities in a data model: INVOICE and LINE. Clearly, the LINE entity does not exist in the real world as an independent object, but rather as part of an INVOICE.

In both situations, having a strong identifying relationship ensures that the dependent entity can exist only when it is related to the parent entity. In summary, the selection of a composite primary key for composite and weak entity types provides benefits that enhance the integrity and consistency of the model.

6.3.4 When to Use Surrogate Primary Keys
There are some instances when a primary key doesn't exist in the real world or when the existing natural key may not be a suitable primary key. For example, consider the case of a park recreation facility that houses rooms for small parties. The manager of the facility keeps track of all events, using a folder with the format shown in Table 6.5.

TABLE 6.5 Data used to keep track of events
Date | Time_Start | Time_End | Room | Event_Name | Party_Of
17/06/19 | 11:00AM | 2:00PM | Allure | Ndlovu Wedding | 60
17/06/19 | 11:00AM | 2:00PM | Bonanza | Adams Office | 12
17/06/19 | 3:00PM | 5:30PM | Allure | Naidoo Family | 15
17/06/19 | 3:30PM | 5:30PM | Bonanza | Adams Office | 12
18/06/19 | 1:00PM | 3:00PM | Bonanza | Scouts | 33
18/06/19 | 11:00AM | 2:00PM | Allure | March of Dimes | 25
18/06/19 | 11:00AM | 12:30PM | Bonanza | Naidoo Family | 12

Given the data shown in Table 6.5, you would model the EVENT entity as:

EVENT (DATE, TIME_START, TIME_END, ROOM, EVENT_NAME, PARTY_OF)
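Before choosing a key for EVENT, it can help to test which attribute combinations are actually unique in the sample rows. The sketch below loads the Table 6.5 data into plain Python tuples; the helper name is_unique is an assumption made for the example. Uniqueness in a small sample does not prove that a combination is a valid candidate key, but a duplicate is enough to rule one out.

```python
COLUMNS = ("DATE", "TIME_START", "TIME_END", "ROOM", "EVENT_NAME", "PARTY_OF")
EVENTS = [
    ("17/06/19", "11:00AM", "2:00PM", "Allure", "Ndlovu Wedding", 60),
    ("17/06/19", "11:00AM", "2:00PM", "Bonanza", "Adams Office", 12),
    ("17/06/19", "3:00PM", "5:30PM", "Allure", "Naidoo Family", 15),
    ("17/06/19", "3:30PM", "5:30PM", "Bonanza", "Adams Office", 12),
    ("18/06/19", "1:00PM", "3:00PM", "Bonanza", "Scouts", 33),
    ("18/06/19", "11:00AM", "2:00PM", "Allure", "March of Dimes", 25),
    ("18/06/19", "11:00AM", "12:30PM", "Bonanza", "Naidoo Family", 12),
]


def is_unique(attrs):
    """Return True if no two sample rows share the same values for attrs."""
    positions = [COLUMNS.index(name) for name in attrs]
    projected = [tuple(row[i] for i in positions) for row in EVENTS]
    return len(set(projected)) == len(projected)


for candidate in [("EVENT_NAME",), ("DATE", "ROOM"), ("DATE", "TIME_START", "ROOM")]:
    print(candidate, is_unique(candidate))
```

No single attribute and no pair such as (DATE, ROOM) is unique in this sample, so any natural key for EVENT has to combine at least three attributes.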
What primary key would you suggest? In this case, there is no simple natural key that could be used as a primary key in the model. Based on the primary key concepts you learnt about in previous chapters, you might suggest one of these options:

(DATE, TIME_START, ROOM) or (DATE, TIME_END, ROOM)

Assume that you select the composite primary key (DATE, TIME_START, ROOM) for the EVENT entity. Next, you determine that one EVENT may use many RESOURCEs (such as tables, projectors, PCs and stands) and that the same RESOURCE may be used for many EVENTs. The RESOURCE entity would be represented by the following attributes:

RESOURCE (RSC_ID, RSC_DESCRIPTION, RSC_TYPE, RSC_QTY, RSC_PRICE)

Given the business rules, the *:* relationship between RESOURCE and EVENT would be represented via the EVNTRSC composite entity, with a composite primary key, as follows:

EVNTRSC (DATE, TIME_START, ROOM, RSC_ID, QTY_USED)

You now have a lengthy four-attribute composite primary key. What would happen if the EVNTRSC entity's primary key were inherited by another existence-dependent entity? At this point, you can see that the composite primary key could make the implementation of the database and program coding unnecessarily complex.

As a data modeller, you may have noticed that the EVENT entity's selected primary key may not work well, given the primary key guidelines in Table 6.4. In this case, the EVENT entity's selected primary key contains embedded semantic information and is formed by a combination of date, time and text data columns. In addition, the selected primary key would cause lengthy primary keys for existence-dependent entities. The solution to the problem is to use a numeric single-attribute surrogate primary key.

Surrogate primary keys are accepted practice in today's complex data environments. They are especially helpful when there is no natural key, when the selected candidate key has embedded semantic contents, or when the selected candidate key is too long or cumbersome. However, there is a trade-off: if you use a surrogate key, you must ensure that the candidate key of the entity in question performs properly through the use of unique index and not null constraints.

6.4 DESIGN CASES: LEARNING FLEXIBLE DATABASE DESIGN
Data modelling and design require skills that are acquired through experience. In turn, experience is acquired through practice: regular and frequent repetition, applying the concepts learnt to specific and different design problems. This section will present four special design cases that highlight the importance of flexible designs, proper identification of primary keys and placement of foreign keys.
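As a warm-up for the design cases, the surrogate key trade-off from Section 6.3.4 can be sketched for the EVENT entity. This is an illustration only, written with Python's sqlite3 module: the surrogate column name event_id, the index name and the "Double booking" row are assumptions, and AUTOINCREMENT is SQLite's counter-style construct rather than anything specific to the text.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE event (
    event_id   INTEGER PRIMARY KEY AUTOINCREMENT,  -- surrogate key (assumed name)
    event_date TEXT NOT NULL,
    time_start TEXT NOT NULL,
    time_end   TEXT NOT NULL,
    room       TEXT NOT NULL,
    event_name TEXT,
    party_of   INTEGER
);
-- The natural candidate key still has to be protected explicitly.
CREATE UNIQUE INDEX event_candidate_key ON event (event_date, time_start, room);
""")


def add_event(date, start, end, room, name, party_of):
    """Insert an event and return the surrogate key generated for it."""
    with conn:  # commits on success, rolls back on error
        cur = conn.execute(
            "INSERT INTO event (event_date, time_start, time_end, room,"
            " event_name, party_of) VALUES (?, ?, ?, ?, ?, ?)",
            (date, start, end, room, name, party_of),
        )
    return cur.lastrowid


first_id = add_event("17/06/19", "11:00AM", "2:00PM", "Allure", "Ndlovu Wedding", 60)
try:
    # Hypothetical second booking for the same date, start time and room.
    add_event("17/06/19", "11:00AM", "1:00PM", "Allure", "Double booking", 5)
    double_booking_rejected = False
except sqlite3.IntegrityError:
    double_booking_rejected = True

print(first_id, double_booking_rejected)
```

Without the unique index and the NOT NULL constraints, the surrogate key alone would happily accept two events in the same room at the same time; a dependent entity such as EVNTRSC could then reference event_id instead of inheriting three columns.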
NOTE
In describing the different modelling concepts throughout this book, the focus has been and continues to be on relational models. Also, given the focus on the practical nature of database design, all design issues are addressed with the implementation goal in mind. Therefore, there is no sharp line of demarcation between design and implementation.
At the pure conceptual stage of the design, foreign keys are not part of an ER diagram. The ERD displays only entities and relationships. Entities are identified by identifiers that may become primary keys. During design, the modeller attempts to understand and define the entities and relationships. Foreign keys are the mechanism through which the relationship designed in an ERD is implemented in a relational model. If you use MS Visio Professional as your modelling tool, you will discover that this methodology is reflected in the Visio modelling practice.
6.4.1 Design Case #1: Implementing 1:1 Relationships
Foreign keys work with primary keys to properly implement relationships in the relational model. The basic rule is very simple: put the primary key of the one side (the parent entity) on the many side (the dependent entity) as a foreign key. However, where do you place the foreign key when you are working with a 1:1 relationship? For example, assume the case of a 1:1 relationship between EMPLOYEE and DEPARTMENT based on the business rule one EMPLOYEE is the manager of one DEPARTMENT, and one DEPARTMENT is managed by one EMPLOYEE. In that case, there are two options for selecting and placing the foreign key:
• Place a foreign key in both entities. That option is derived from the basic rule you learnt in Chapter 5, Data Modelling with Entity Relationship Diagrams. Place EMP_NUM as a foreign key in DEPARTMENT and DEPT_ID as a foreign key in EMPLOYEE. However, that solution is not recommended as it would create duplicated work and it could conflict with other existing relationships. (Remember that DEPARTMENT and EMPLOYEE also participate in a 1:* relationship: one department employs many employees.)
• Place a foreign key in one of the entities. In that case, the primary key of one of the two entities appears as a foreign key on the other entity. That is the preferred solution, but there is a remaining question: which primary key should be used as a foreign key? The answer to that question is found in Table 6.6. Table 6.6 shows the rationale for selecting the foreign key in a 1:1 relationship based on the relationship properties in the ERD.

TABLE 6.6 Selection of foreign key in a 1:1 relationship
Case | ER Relationship Constraints | Action
I | One side is mandatory and the other side is optional. | Place the PK of the entity on the mandatory side in the entity on the optional side as a FK and make the FK mandatory.
II | Both sides are optional. | Select the FK that causes the fewest number of nulls or place the FK in the entity in which the (relationship) role is played.
III | Both sides are mandatory. | See Case II or consider revising your model to ensure that the two entities do not belong together in a single entity.
Figure 6.9 illustrates the EMPLOYEE manages DEPARTMENT relationship. Note that, in this case, EMPLOYEE is mandatory to DEPARTMENT. Therefore, EMP_NUM is placed as the foreign key in DEPARTMENT. Alternatively, you might argue that the manager role is played by the EMPLOYEE in the DEPARTMENT.

FIGURE 6.9 A 1:1 relationship between DEPARTMENT and EMPLOYEE
A one-to-one (1:1) relationship: an EMPLOYEE manages zero or one DEPARTMENT; each DEPARTMENT is managed by one EMPLOYEE.
EMPLOYEE (EMP_NUM {PK}, EMP_LNAME, EMP_FNAME) 1..1 manages 0..1 DEPARTMENT (DEPT_ID {PK}, DEPT_NAME, EMP_NUM {FK1})
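The Case I rule from Table 6.6 can be sketched in code. The following is a minimal illustration, not the book's implementation: it uses Python's standard sqlite3 module, the table and column names follow Figure 6.9, and the helper names build_schema and try_add_department are hypothetical. The assumption is that a mandatory FK maps to NOT NULL and the 1:1 limit maps to UNIQUE.

```python
import sqlite3


def build_schema():
    """Create EMPLOYEE and DEPARTMENT with the 1:1 'manages' relationship."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only on request
    conn.executescript(
        """
        CREATE TABLE EMPLOYEE (
            EMP_NUM   INTEGER PRIMARY KEY,
            EMP_LNAME TEXT NOT NULL,
            EMP_FNAME TEXT NOT NULL
        );
        CREATE TABLE DEPARTMENT (
            DEPT_ID   INTEGER PRIMARY KEY,
            DEPT_NAME TEXT NOT NULL,
            -- NOT NULL: every department must have a manager (mandatory FK)
            -- UNIQUE:   an employee manages at most one department (1:1)
            EMP_NUM   INTEGER NOT NULL UNIQUE REFERENCES EMPLOYEE (EMP_NUM)
        );
        """
    )
    return conn


def try_add_department(conn, dept_id, dept_name, emp_num):
    """Insert a department; return False if a constraint rejects the row."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO DEPARTMENT (DEPT_ID, DEPT_NAME, EMP_NUM) VALUES (?, ?, ?)",
                (dept_id, dept_name, emp_num),
            )
        return True
    except sqlite3.IntegrityError:
        return False
```

With this sketch, a second department that names an employee who already manages one is rejected, as is a department with no manager or with a manager number that does not exist in EMPLOYEE.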
As a designer, you need to recognise that 1:1 relationships exist in the real world and, therefore, should be supported in the data model. In fact, a 1:1 relationship is used to ensure that two entity sets are not placed in the same table. In other words, EMPLOYEE and DEPARTMENT are clearly separate and unique entity types that do not belong together in a single entity. If you group them together in one entity, what would be the name of that entity?

6.4.2 Design Case #2: Maintaining History of Time-Variant Data
Company managers generally realise that good decision making is based on the information generated through the data stored in databases. Such data reflect current as well as past events. Company managers use the data stored in databases to answer questions such as, 'How do the current company profits compare to those of previous years?' and 'What are XYZ product's sales trends?' In other words, the data stored in databases reflect not only current data, but also historic data.

Normally, data changes are managed by replacing the existing attribute value with the new value, without regard to the previous value. However, there are situations when the history of values for a given attribute must be preserved. From a data modelling point of view, time-variant data refer to data whose values change over time and for which you must keep a history of the data changes. You could argue that all data in a database are subject to change over time and are, therefore, time variant. However, some attribute values, such as your date of birth or your ID number, are not time variant. On the other hand, attributes such as your student grades or your bank account balance are subject to change over time. Sometimes the data changes are externally originated and event driven, such as a product price change. On other occasions, changes are based on well-defined schedules, such as the daily stock quote open and close values.

In any case, keeping the history of time-variant data is equivalent to having a multivalued attribute in your entity. To model time-variant data, you must create a new entity in a 1:* relationship with the original entity. This new entity will contain the new value, the date of the change and whatever other attribute is pertinent to the event being modelled. For example, if you want to keep track of the current manager as well as the history of all department managers over time, you could create the model shown in Figure 6.10.

FIGURE 6.10 Maintaining manager history
As you examine Figure 6.10, note that the MGR_HIST entity has a 1:* relationship with EMPLOYEE and a 1:* relationship with DEPARTMENT to reflect the fact that, over time, an employee could be the manager of many different departments and a department could have many different employee managers. Because you are recording time-variant data, you must store the DATE_ASSIGN attribute in the MGR_HIST entity to provide the date on which the employee (EMP_NUM) became the department manager. The primary key of MGR_HIST permits the same employee to be the manager of the same department, but on different dates. If that scenario is not the case in your environment (if, for example, an employee is the manager of a department only once), you could make DATE_ASSIGN a non-prime attribute in the MGR_HIST entity.

Note in Figure 6.10 that the manages relationship is optional in theory and redundant in practice. At any time, you could find out who the manager of a department is by retrieving the most recent DATE_ASSIGN date from MGR_HIST for a given department. On the other hand, the ERD in Figure 6.10 differentiates between current data and historic data. The current manager relationship is implemented by the manages relationship between EMPLOYEE and DEPARTMENT. Additionally, the historic data are managed through EMP_MGR_HIST and DEPT_MGR_HIST. The trade-off with that model is that each time a new manager is assigned to a department, there will be two data modifications: one update in the DEPARTMENT entity and one insert in the MGR_HIST entity.

The flexibility of the model proposed in Figure 6.10 becomes more apparent when you add the 1:* 'one department employs many employees' relationship. In that case, the PK of the 1 side (DEPT_ID) appears in the many side (EMPLOYEE) as a foreign key. Now suppose you would like to keep track of the job history for each of the company's employees. In that case, you would keep track of the department, the job code, the date assigned and the salary. To accomplish that task, you would modify the model by adding a JOB_HIST entity. Figure 6.11 shows the use of the new JOB_HIST entity to maintain the employee's history.

FIGURE 6.11 Maintaining job history
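The two data modifications that accompany each manager change can be sketched as follows. This is a minimal illustration under stated assumptions, not the book's implementation: it uses Python's standard sqlite3 module, the composite primary key (EMP_NUM, DEPT_ID, DATE_ASSIGN) is inferred from the description of MGR_HIST, dates are stored as ISO 8601 text so that they sort chronologically, and the function names are hypothetical.

```python
import sqlite3


def build_mgr_hist_schema():
    """Create EMPLOYEE, DEPARTMENT (current manager) and MGR_HIST (history)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript(
        """
        CREATE TABLE EMPLOYEE (
            EMP_NUM   INTEGER PRIMARY KEY,
            EMP_LNAME TEXT NOT NULL
        );
        CREATE TABLE DEPARTMENT (
            DEPT_ID   INTEGER PRIMARY KEY,
            DEPT_NAME TEXT NOT NULL,
            EMP_NUM   INTEGER UNIQUE REFERENCES EMPLOYEE (EMP_NUM)  -- current manager
        );
        CREATE TABLE MGR_HIST (
            EMP_NUM     INTEGER NOT NULL REFERENCES EMPLOYEE (EMP_NUM),
            DEPT_ID     INTEGER NOT NULL REFERENCES DEPARTMENT (DEPT_ID),
            DATE_ASSIGN TEXT    NOT NULL,  -- assumed ISO 8601, e.g. '2020-06-01'
            PRIMARY KEY (EMP_NUM, DEPT_ID, DATE_ASSIGN)
        );
        """
    )
    return conn


def assign_manager(conn, dept_id, emp_num, date_assign):
    """Apply both modifications in one transaction: one update, one insert."""
    with conn:
        conn.execute(
            "UPDATE DEPARTMENT SET EMP_NUM = ? WHERE DEPT_ID = ?",
            (emp_num, dept_id),
        )
        conn.execute(
            "INSERT INTO MGR_HIST (EMP_NUM, DEPT_ID, DATE_ASSIGN) VALUES (?, ?, ?)",
            (emp_num, dept_id, date_assign),
        )


def current_manager_from_history(conn, dept_id):
    """Derive the current manager from the most recent DATE_ASSIGN."""
    row = conn.execute(
        "SELECT EMP_NUM FROM MGR_HIST WHERE DEPT_ID = ? "
        "ORDER BY DATE_ASSIGN DESC LIMIT 1",
        (dept_id,),
    ).fetchone()
    return row[0] if row else None
```

Wrapping the update and the insert in a single transaction keeps the current-data relationship and the history from drifting apart if one of the two statements fails.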
Again, it is worth emphasising that the manages and employs relationships are theoretically optional and redundant in practice. You can always find out where each employee works by looking at the job history and selecting only the most current data row for each employee. However, as you will discover in Chapter 8, Beginning Structured Query Language (SQL), and in Chapter 9, Procedural Language SQL and Advanced SQL, finding where each employee works is not a trivial task. Therefore, the model represented in Figure 6.11 includes the admittedly redundant but unquestionably useful manages and employs relationships to separate current data from historic data.
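To show why deriving current data from history takes more than a simple lookup, here is a minimal sketch that selects the most current JOB_HIST row for each employee. It is an illustration only: it uses Python's standard sqlite3 module, the column names (EMP_NUM, DATE_ASSIGN, DEPT_ID, JOB_CODE, SALARY) and the composite key are assumptions based on the attributes listed in the text, and dates are assumed to be ISO 8601 text.

```python
import sqlite3


def build_job_hist(rows):
    """Create a JOB_HIST table and load (emp, date, dept, job, salary) rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        """
        CREATE TABLE JOB_HIST (
            EMP_NUM     INTEGER NOT NULL,
            DATE_ASSIGN TEXT    NOT NULL,  -- assumed ISO 8601 text
            DEPT_ID     INTEGER NOT NULL,
            JOB_CODE    TEXT    NOT NULL,
            SALARY      REAL    NOT NULL,
            PRIMARY KEY (EMP_NUM, DATE_ASSIGN)
        )
        """
    )
    conn.executemany("INSERT INTO JOB_HIST VALUES (?, ?, ?, ?, ?)", rows)
    conn.commit()
    return conn


def current_department_by_employee(conn):
    """Return {EMP_NUM: DEPT_ID} using only each employee's latest history row."""
    query = """
        SELECT h.EMP_NUM, h.DEPT_ID
        FROM JOB_HIST AS h
        WHERE h.DATE_ASSIGN = (
            SELECT MAX(h2.DATE_ASSIGN)   -- correlated subquery per employee
            FROM JOB_HIST AS h2
            WHERE h2.EMP_NUM = h.EMP_NUM
        )
    """
    return dict(conn.execute(query).fetchall())
```

The correlated subquery is the extra work that a direct employs foreign key in EMPLOYEE avoids, which is the practical argument for keeping the redundant current-data relationship.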
6.4.3 Design Case #3: Fan Traps
Creating a data model requires proper identification of the data relationships among entities. However, due to miscommunication or incomplete understanding of the business rules or processes, it is not uncommon to misidentify relationships among entities. Under those circumstances, the ERD may contain a design trap. A design trap occurs when a relationship is improperly or incompletely identified and, therefore, is represented in a way that is not consistent with the real world. The most common design trap is known as a fan trap.

A fan trap occurs when you have one entity in two 1:* relationships to other entities, thus producing an association among the other entities that is not expressed in the model. For example, assume that an association football league has many divisions. Each division has many players, and each division has many teams. Given those incomplete business rules, you might create an ERD that looks like the one shown in Figure 6.12.

FIGURE 6.12 Incorrect ERD with fan trap problem
Fan trap due to misidentification of relationships.
DIVISION (DIV_ID {PK}, DIV_NAME) 1..1 to 0..* TEAM (TEAM_ID {PK}, TEAM_NAME, DIV_ID {FK1})
DIVISION (DIV_ID {PK}, DIV_NAME) 1..1 to 0..* PLAYER (PLAYER_ID {PK}, PLAYER_NAME, DIV_ID {FK1})

As you can see in Figure 6.12, DIVISION is in a 1:* relationship with TEAM and in a 1:* relationship with PLAYER. Although that representation is semantically correct, the relationships are not properly identified.
As you can see in Figure 6.12, DIVISION is in a 1:* relationship with TEAM and in a 1:* relationship with PLAYER. Although that representation is semantically correct, the relationships are not properly identified. For example, there is no way to identify which players belong to which team. Figure 6.12 also shows a sample instance relationship representation for the ERD. Note that the DIVISION instances' relationship lines fan out to the TEAM and PLAYER entity instances, thus the 'fan trap' label.

Figure 6.13 shows the correct ERD after the fan trap has been eliminated. Note that, in this case, DIVISION is in a 1:* relationship with TEAM. In turn, TEAM is in a 1:* relationship with PLAYER. Figure 6.13 also shows the instance relationship representation after eliminating the fan trap.

FIGURE 6.13 Corrected ERD after removal of the fan trap
Fan trap eliminated by proper identification of relationships.
DIVISION (DIV_ID {PK}, DIV_NAME) 1..1 to 0..* TEAM (TEAM_ID {PK}, TEAM_NAME, DIV_ID {FK1}) 1..1 to 0..* PLAYER (PLAYER_ID {PK}, PLAYER_NAME, TEAM_ID {FK1})
(The instance diagram of sample divisions, teams and players is not reproduced.)
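The difference between the two designs can be sketched with a small in-memory database. This is an illustration only, using Python's standard sqlite3 module: PLAYER_FAN is a hypothetical table name standing in for the fan-trap version of PLAYER, and the sample rows ('Division A', 'Team X', 'Team Y', 'Player P') are invented, not taken from the figures.

```python
import sqlite3


def build_league():
    """Load one division with two teams, modelled both ways."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE DIVISION (DIV_ID INTEGER PRIMARY KEY, DIV_NAME TEXT);
        CREATE TABLE TEAM (
            TEAM_ID INTEGER PRIMARY KEY, TEAM_NAME TEXT,
            DIV_ID INTEGER REFERENCES DIVISION (DIV_ID)
        );
        -- Fan-trap design: the player references the division only.
        CREATE TABLE PLAYER_FAN (
            PLAYER_ID INTEGER PRIMARY KEY, PLAYER_NAME TEXT,
            DIV_ID INTEGER REFERENCES DIVISION (DIV_ID)
        );
        -- Corrected design: the player references the team.
        CREATE TABLE PLAYER (
            PLAYER_ID INTEGER PRIMARY KEY, PLAYER_NAME TEXT,
            TEAM_ID INTEGER REFERENCES TEAM (TEAM_ID)
        );
        INSERT INTO DIVISION VALUES (1, 'Division A');
        INSERT INTO TEAM VALUES (1, 'Team X', 1), (2, 'Team Y', 1);
        INSERT INTO PLAYER_FAN VALUES (1, 'Player P', 1);
        INSERT INTO PLAYER VALUES (1, 'Player P', 2);
        """
    )
    return conn


def candidate_teams_fan_trap(conn, player_id):
    """Fan-trap design: every team in the player's division is a candidate."""
    rows = conn.execute(
        "SELECT t.TEAM_NAME FROM PLAYER_FAN AS p "
        "JOIN TEAM AS t ON t.DIV_ID = p.DIV_ID "
        "WHERE p.PLAYER_ID = ? ORDER BY t.TEAM_NAME",
        (player_id,),
    ).fetchall()
    return [name for (name,) in rows]


def team_and_division(conn, player_id):
    """Corrected design: follow PLAYER to TEAM to DIVISION."""
    return conn.execute(
        "SELECT t.TEAM_NAME, d.DIV_NAME FROM PLAYER AS p "
        "JOIN TEAM AS t ON t.TEAM_ID = p.TEAM_ID "
        "JOIN DIVISION AS d ON d.DIV_ID = t.DIV_ID "
        "WHERE p.PLAYER_ID = ?",
        (player_id,),
    ).fetchone()
```

In the fan-trap design the best available answer is a list of every team in the division; in the corrected design the same question returns exactly one team, and the division is still reachable through it.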
Given the design in Figure 6.13, note how easy it is to see which players play for which team. However, to find out which players play in which division, you first need to see which teams belong to each division; then you need to find out which players play on each team. In other words, there is a transitive relationship between DIVISION and PLAYER via the TEAM entity.

6.4.4 Design Case #4: Redundant Relationships
Although redundancy is often seen as a good thing to have in computer environments (multiple backups in multiple places comes to mind), redundancy is seldom a good thing in the database environment. (As you learnt in Chapter 3, Relational Model Characteristics, redundancies can cause data anomalies in a database.) Redundant relationships occur when there are multiple relationship paths between related entities. The main concern with redundant relationships is that they remain consistent across the model. However, it is important to note that some designs use redundant relationships as a way to simplify the design.

An example of redundant relationships was first introduced in Figure 6.10 during the discussion on maintaining a history of time-variant data. However, the use of the redundant manages and employs relationships was justified by the fact that such relationships were dealing with current data rather than historic data. Another, more specific example of a redundant relationship is represented in Figure 6.14.

In Figure 6.14, note the transitive 1:* relationship between DIVISION and PLAYER through the TEAM entity set. Therefore, the relationship that connects DIVISION and PLAYER is, for all practical purposes, redundant. (So too is the additional attribute DIV_ID in PLAYER.) In that case, the relationship could be safely deleted without losing any information-generation capabilities in the model.

FIGURE 6.14 A redundant relationship
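The consistency concern with a redundant relationship can be made concrete with a short check. This is a minimal sketch using Python's standard sqlite3 module: PLAYER carries both TEAM_ID and the redundant DIV_ID, as the text describes for Figure 6.14; the sample rows and the function names are hypothetical.

```python
import sqlite3


def build_redundant_model(players):
    """PLAYER holds TEAM_ID plus a redundant DIV_ID copy of the team's division."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE DIVISION (DIV_ID INTEGER PRIMARY KEY, DIV_NAME TEXT);
        CREATE TABLE TEAM (
            TEAM_ID INTEGER PRIMARY KEY, TEAM_NAME TEXT,
            DIV_ID INTEGER REFERENCES DIVISION (DIV_ID)
        );
        CREATE TABLE PLAYER (
            PLAYER_ID INTEGER PRIMARY KEY, PLAYER_NAME TEXT,
            TEAM_ID INTEGER REFERENCES TEAM (TEAM_ID),
            DIV_ID  INTEGER REFERENCES DIVISION (DIV_ID)  -- redundant path
        );
        INSERT INTO DIVISION VALUES (1, 'Division A'), (2, 'Division B');
        INSERT INTO TEAM VALUES (1, 'Team X', 1), (2, 'Team Y', 2);
        """
    )
    conn.executemany("INSERT INTO PLAYER VALUES (?, ?, ?, ?)", players)
    conn.commit()
    return conn


def inconsistent_players(conn):
    """List players whose stored DIV_ID disagrees with their team's DIV_ID."""
    rows = conn.execute(
        "SELECT p.PLAYER_ID FROM PLAYER AS p "
        "JOIN TEAM AS t ON t.TEAM_ID = p.TEAM_ID "
        "WHERE p.DIV_ID <> t.DIV_ID ORDER BY p.PLAYER_ID"
    ).fetchall()
    return [player_id for (player_id,) in rows]
```

Nothing in the schema itself stops the two paths from disagreeing, so a design that keeps the redundant relationship also takes on the job of running a check like this one; deleting PLAYER.DIV_ID removes the problem at its source.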
6.5 DATA MODELLING CHECKLIST
Data modelling translates a specific real-world environment into a data model that represents the real-world data, users, processes and interactions. In this chapter, you have learnt how the EERM enables the designer to add more semantic content to the model. You have also learnt about the trade-offs and intricacies in the selection of primary keys and about the modelling of time-variant data. The modelling techniques you have learnt thus far give you the tools needed to produce successful database designs. However, just as any good pilot uses a checklist to ensure that all is in order for a successful flight, the data modelling checklist shown in Table 6.7 will help ensure that you perform data modelling tasks successfully. (The data modelling checklist shown in Table 6.7 is based on the concepts and tools you have learnt, beginning in Chapter 3: the relational model, the entity relationship model, normalisation and the extended entity relationship model.) Therefore, it is assumed that you are familiar with the majority of terms and labels used in the checklist, such as synonyms, aliases and relationships.

TABLE 6.7 Data modelling checklist

BUSINESS RULES
• Properly document and verify all business rules with the end users.
• Ensure that all business rules are written precisely, clearly and simply. The business rules must help identify entities, attributes, relationships and constraints.
• Identify the source of all business rules and ensure that each business rule is accompanied by the reason for its existence and by the date and person(s) responsible for the business rule's verification and approval.

DATA MODELLING
Naming Conventions: All names should be limited in length (database-dependent size).
Entity names:
• Should be nouns that are familiar to business and should be short and meaningful
• Should include abbreviations, synonyms and aliases for each entity
• Should be unique within the model
• For composite entities, may include a combination of abbreviated names of the entities linked through the composite entity
Attribute names:
• Should be unique within the entity
• Should use the entity abbreviation or prefix
• Should be descriptive of the characteristic
• Should use suffixes such as _ID, _NUM or _CODE for the PK attribute
• Should not be a reserved word
• Should not contain spaces or special characters such as @, ! or &
Relationship names:
• Should be active or passive verbs that clearly indicate the nature of the relationship
Entities:
• All entities should represent a single subject
• All entities should be in 3NF or higher (covered in Chapter 7, Normalising Database Designs)
• The granularity of the entity instance is clearly defined
• The PK is clearly defined and supports the selected data granularity
Attributes:
• Should be simple and single-valued (atomic data)
• Should include default values, constraints, synonyms and aliases
• Derived attributes should be clearly identified and include source(s)
• Should not be redundant, unless they are required for transaction accuracy or for maintaining a history or are used as a foreign key
Relationships:
• Should clearly identify relationship participants
• Should clearly define participation and cardinality rules
ER Diagram:
• Should be validated against expected processes: inserts, updates and deletions
• Should evaluate where, when and how to maintain a history
• Should not contain redundant relationships except as required (see attributes)
• Should minimise data redundancy to ensure single-place updates
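Some of the attribute-naming rules in the checklist are mechanical enough to check automatically. The following is a minimal sketch in Python, not a complete validator: the reserved-word set is a tiny illustrative subset, the 30-character limit is an assumed stand-in for the database-dependent size, and the function name check_attribute_name is hypothetical.

```python
import re

# Assumptions for illustration only: real limits and reserved words
# depend on the target DBMS.
MAX_NAME_LENGTH = 30
RESERVED_WORDS = {"SELECT", "TABLE", "ORDER", "GROUP"}
PK_SUFFIXES = ("_ID", "_NUM", "_CODE")


def check_attribute_name(name, is_pk=False):
    """Return a list of checklist violations for one attribute name."""
    problems = []
    if len(name) > MAX_NAME_LENGTH:
        problems.append(f"longer than {MAX_NAME_LENGTH} characters")
    if name.upper() in RESERVED_WORDS:
        problems.append("is a reserved word")
    # Letters, digits and underscores only; this rejects spaces, @, ! and &.
    if not re.fullmatch(r"[A-Za-z][A-Za-z0-9_]*", name):
        problems.append("contains spaces or special characters")
    if is_pk and not name.upper().endswith(PK_SUFFIXES):
        problems.append("primary key lacks an _ID, _NUM or _CODE suffix")
    return problems
```

For example, EMP_NUM passes as a primary key name, while 'EMP NAME' is flagged for its space. Rules such as 'descriptive of the characteristic' still need a human reviewer.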
SUMMARY
The extended entity relationship (EER) model adds semantics to the ER model via entity supertypes, subtypes and clusters. An entity supertype is a generic entity type that is related to one or more entity subtypes.

A specialisation hierarchy depicts the arrangement and relationships between entity supertypes and entity subtypes. Inheritance allows an entity subtype to inherit the attributes and relationships of the supertype. Subtypes can be disjoint or overlapping. A subtype discriminator is used to determine to which entity subtype the supertype occurrence is related. The subtypes can exhibit partial or total completeness. There are basically two approaches to developing a specialisation hierarchy of entity supertypes and subtypes: specialisation and generalisation.

Aggregation and composition are used to represent has_a or part_of relationships between entities.

An entity cluster is a virtual entity type used to represent multiple entities and relationships in the ERD. An entity cluster is formed by combining multiple interrelated entities and relationships into a single abstract entity object.

Natural keys are identifiers that exist in the real world. Natural keys do not necessarily make good primary keys. Primary keys must have these characteristics: they must have unique values, they should be non-intelligent, they must not change over time, and they are preferably numeric and composed of a single attribute.

Composite keys are useful to represent *:* relationships and weak (strong-identifying) entities.

Surrogate primary keys are useful when there is no natural key that makes a suitable primary key, when the primary key is a composite primary key with multiple different data types, or when the primary key is too long to be usable.

In a 1:1 relationship, place the PK of the mandatory entity as a foreign key in the optional entity, or place it in the entity that causes the least number of nulls, or place it where the role is played.

Time-variant data refers to data whose values change over time and whose requirements mandate that you keep a history of data changes. To maintain the history of time-variant data, you must create an entity containing the new value, the date of change and any other time-relevant data. This entity maintains a 1:* relationship with the entity for which the history is to be maintained.

A fan trap occurs when you have one entity in two 1:* relationships to other entities and there is an association among the other entities that is not expressed in the model. Redundant relationships occur when there are multiple relationship paths between related entities. The main concern with redundant relationships is that they remain consistent across the model.

The data modelling checklist provides a way for the designer to check that the ERD meets a set of minimum requirements.
KEY TERMS
aggregation
completeness constraint
composition
design trap
disjoint subtypes (non-overlapping subtypes)
EER diagram (EERD)
entity cluster
entity subtype
entity supertype
extended entity relationship model (EERM)
fan trap
inheritance
natural key (natural identifier)
overlapping (non-disjoint) subtypes
partial completeness
specialisation hierarchy
subtype discriminator
time-variant data
total completeness
Online Content
Answers to selected Review Questions and Problems for this chapter are available on the online platform accompanying this book.
REVIEW QUESTIONS
1 What is an entity supertype and why is it used?
2 What kinds of data would you store in an entity subtype?
3 What is a specialisation hierarchy?
4 What is a subtype discriminator? Give an example of its use.
5 What is an overlapping subtype? Give an example.
6 What is the difference between partial completeness and total completeness?
7 What is an entity cluster, and which advantages are derived from its use?
8 Which primary key characteristics are considered desirable? Explain why each characteristic is considered desirable.
9 Under which circumstances would composite primary keys be appropriate?
10 What is a surrogate primary key, and when would you use one?
11 When implementing a 1:1 relationship, where should you place the foreign key if one side is mandatory and one side is optional? Should the foreign key be mandatory or optional?
12 What are time-variant data, and how would you deal with such data from a database design point of view?
13 What is the most common design trap, and how does it occur?
PROBLEMS
1 AVANTIVE Corporation is a company specialising in the commercialisation of automotive parts. AVANTIVE has two types of customers: retail and wholesale. All customers have a customer ID, a name, an address, a phone number, a default shipping address, a date of last purchase and a date of last payment. Retail customers have the credit card type, credit card number, expiration date and email address. Wholesale customers have a contact name, contact phone number, contact email address, purchase order number and date, discount percentage, billing address, tax status (if exempt) and tax identification number. A retail customer cannot be a wholesale customer and vice versa. Given that information, create the ERD containing all primary keys, foreign keys and main attributes.
2 AVANTIVE Corporation has five departments: administration, marketing, sales, shipping and purchasing. Each department employs many employees. Each employee has an ID, a name, a home address, a home phone number, a salary and a tax ID. Some employees are classified as sales representatives, some as technical support and some as administrators. Sales representatives receive a commission based on sales. Technical support employees are required to be certified in their areas of expertise. For example, some are certified as drivetrain specialists; others, as electrical systems specialists. All administrators have a title and a bonus. Given that information, create the ERD containing all primary keys, foreign keys and main attributes.
3 AVANTIVE Corporation operates under the following business rules:
• AVANTIVE keeps a list of car models with information about the manufacturer, model and year.
• AVANTIVE keeps several parts in stock. A part will have a part ID, description, unit price and quantity on hand.
• A part can be used for many car models, and a car model has many parts.
• A retail customer normally pays by credit card and is charged the list price for each purchased item.
• A wholesale customer normally pays via purchase order with terms of net 30 days and is charged a discounted price for each item purchased. (The discount varies from customer to customer.)
• A customer (retail or wholesale) can place many orders. Each order will have an order number, a date, a shipping address, a billing address and a list of part codes, quantities, unit prices and extended line totals. Each order also has a sales representative ID (an employee) to identify the person who made the sale, an order subtotal, an order tax total, a shipping cost, a shipping date, an order total cost, an order total paid and an order status (open, closed or cancel).
Using that information, create the complete ERD containing all primary keys, foreign keys and main attributes.
4 In Chapter 5, Data Modelling with Entity Relationship Diagrams, you saw the creation of the Tiny University database design. That design reflected such business rules as 'a lecturer may advise many students' and 'a lecturer may chair one department'. Modify the design shown in Figure 5.39 to include these business rules:
• An employee could be staff or a lecturer or an administrator.
• A lecturer may also be an administrator.
• Staff employees have a work level classification, such as Level I and Level II.
• Only lecturers can chair a department. A department is chaired by only one lecturer.
• Only lecturers can serve as the dean of a school. Each of the university's schools is served by one dean.
• A lecturer can teach many classes.
• Administrators have a position title.
Given that information, create the complete ERD using UML class diagram notation, containing all primary keys, foreign keys and main attributes.
5 Tiny University wantsto keep track ofthe history of all administrative appointments (date of appointment and to
date
know
2018
of termination). how
or who the
complete
6
and to
review
2020 has
dean of the
ERD containing
technology
infrastructure
Copyright
Time
worked
variant
in the
School
all primary
data School
are at
Cengage deemed
Learning. that
support.
technology
take
any
support
All
Some
of Education keys, foreign
Rights
training
Reserved. content
does
May not
not
be
retain
copied, affect
their
scanned, the
overall
or
duplicated, learning
The
was in 2010. keys
provide IT
and
in experience.
whole
technology are
Cengage
part.
Due Learning
to
electronic reserves
chancellor 2000
may want
and
1 January,
Given that information,
create the
(IT) personnel. Some IT personnel
Some IT
expertise.
or in
University 1 January,
main attributes.
technology
personnel
technical
Tiny
between
programmes.
personnel support.
to
materially
academic
IT
infrastructure
periodic
suppressed
for
work.)
of Business
Some Tiny University staff employees areinformation provide
Editorial
(Hint:
many deans
personnel
provide
technology
support
for
academic
programmes
not lecturers.
IT
personnel
are required
Tiny
rights, the
right
University
some to
third remove
party additional
tracks
content
may content
all IT
be
suppressed at
any
time
personnel
from if
the
subsequent
eBook rights
and/or restrictions
eChapter(s). require
it
260
PART II
Design
Concepts
training
by date, type
complete
7
ERD
The FlyRight
and results
containing
Aircraft
maintenance
for
all
(completed
primary
vs not completed).
keys,
foreign
Maintenance (FRAM)
FRCs
aircraft.
Produce
keys
and
Given that information,
division of the FlyRight
a data
model
create the
main attributes.
segment
Company (FRC) performs
that
reflects
the following
all
business
rules:
All
mechanics
Some
mechanics
specialised
(AV) in
in
maintenance.
in their
course
type,
(AF)
not
a real-world
you
will show
of expertise.
FRC
(Y/N)
and
are
mechanics.
maintenance.
Some
mechanics
are
components
mechanics tracks
take all
mechanics
specialised
are
in
avionics
of an aircraft that
periodic
courses
refresher
taken
are used
courses
by each
to
stay
mechanic
date,
performance.
employment
terminated
your
(EN) Some
electronic All
and
requirement. in
engine
are the
of the
date
in
navigation.)
certification
a history
Not all employees
maintenance.
(Avionics
areas
promoted,
Given those
specialised
and
current
date
8
are
airframe
communication
FRC keeps
6
are FRC employees.
of all mechanics. so
Instead,
on. (Note:
it
has
The history includes
The and
been
used
so on
the
component
here to limit
the
date
is,
hired,
of course,
number
of attributes
design.)
requirements,
create
the
ERD
segment
using
UML
notation.
You have been asked to create a database design for the BoingX Aircraft Company (BAC), which has two products: TRX-5A and TRX-5B HUD (heads-up display) units. The database must enable managers to track blueprints, parts and software for each HUD, using the following business rules:

For simplicity's sake, you may assume that the TRX-5A unit is based on two engineering blueprints and that the TRX-5B unit is based on three engineering blueprints. You are free to make up your own blueprint names.

All parts used in the TRX-5A and TRX-5B are classified as hardware. For simplicity's sake, you may assume that the TRX-5A unit uses three parts and that the TRX-5B unit uses four parts. You are free to make up your own part names.

NOTE
Some parts are supplied by vendors, while others are supplied by the BoingX Aircraft Company. Parts suppliers must be able to meet the technical specification requirements (TRSs) set by the BoingX Aircraft Company. Any parts supplier that meets the BoingX Aircraft Company's TRSs may be contracted to supply parts. Therefore, any part may be supplied by multiple suppliers and a supplier can supply many different parts.

BAC wants to keep track of all part price changes and the dates of those changes.

BAC wants to keep track of all TRX-5A and TRX-5B software. For simplicity's sake, you may assume that the TRX-5A unit uses two named software components and that the TRX-5B unit also uses two named software components. You are free to make up your own software names.

BAC wants to keep track of all changes made in blueprints and software. Those changes must reflect the date and time of the change, the description of the change, the person who authorised the change, the person who actually made the change, and the reason for the change.

BAC wants to keep track of all HUD test data by test type, test date and test outcome.

Given those requirements, create the ERD segment using UML notation.
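One possible reading of the BoingX supplier and price-change rules is sketched below as a small relational schema, written in Python with the standard sqlite3 module. It is an illustration under stated assumptions, not the expected answer to the exercise: all table names, column names and sample rows are hypothetical, and only the part, supplier and price-history portion of the problem is covered.

```python
import sqlite3

# Hypothetical table and column names; the exercise leaves naming to the designer.
SCHEMA = """
CREATE TABLE part (
    part_id   INTEGER PRIMARY KEY,
    part_name TEXT NOT NULL
);
CREATE TABLE supplier (
    supplier_id   INTEGER PRIMARY KEY,
    supplier_name TEXT NOT NULL
);
-- Bridge table: a part may have many suppliers, a supplier many parts.
CREATE TABLE part_supplier (
    part_id     INTEGER NOT NULL REFERENCES part(part_id),
    supplier_id INTEGER NOT NULL REFERENCES supplier(supplier_id),
    PRIMARY KEY (part_id, supplier_id)
);
-- One row per price change, so earlier prices are never overwritten.
CREATE TABLE part_price_change (
    part_id     INTEGER NOT NULL REFERENCES part(part_id),
    change_date TEXT NOT NULL,   -- ISO date string, e.g. '2019-03-01'
    new_price   REAL NOT NULL,
    PRIMARY KEY (part_id, change_date)
);
"""


def build_demo():
    """Create an in-memory database with a few made-up rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.execute("INSERT INTO part VALUES (1, 'Combiner glass')")
    conn.executemany(
        "INSERT INTO supplier VALUES (?, ?)",
        [(1, "BoingX Aircraft Company"), (2, "Vendor A")],
    )
    conn.executemany("INSERT INTO part_supplier VALUES (?, ?)", [(1, 1), (1, 2)])
    conn.executemany(
        "INSERT INTO part_price_change VALUES (?, ?, ?)",
        [(1, "2019-01-15", 120.0), (1, "2019-03-01", 135.5)],
    )
    return conn


def current_price(conn, part_id):
    """Return the most recent recorded price for a part, or None."""
    row = conn.execute(
        "SELECT new_price FROM part_price_change "
        "WHERE part_id = ? ORDER BY change_date DESC LIMIT 1",
        (part_id,),
    ).fetchone()
    return row[0] if row else None
```

Keeping price changes in their own table, rather than as a single price attribute on the part, is what lets the design answer "what did this part cost on a given date?" as well as "what does it cost now?".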
Given the following business scenario, create a Crow's Foot ERD using a specialisation hierarchy if appropriate. Granite Sales Company keeps information on employees and the departments in which they work. For each department, the department name, internal mail box number and office phone extension are kept. A department can have many assigned employees, and each employee is assigned to only one department. Employees can be salaried, hourly, or work on contract. All employees are assigned to a department. For each employee, the employee number, name and address are kept. For contract employees, the beginning date and end date of their contracts are stored along with the billing rate for their hours. For hourly employees, the hourly wages and target weekly work hours are stored; for example, the company may target 40 hours/week for some, 32 for others, and 20 for others. Some salaried employees are salespeople who can earn a commission in addition to their base salary. For all salaried employees, the yearly salary amount is recorded in the system. For salespeople, their commission percentage on sales and commission percentage on profit are stored in the system. For example, John is a salesperson with a base salary of R500 000 per year plus a 2 per cent commission on the sales price for all sales he makes, plus another 5 per cent of the profit on each of those sales.
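The salesperson example in the Granite Sales scenario (a base salary plus one commission percentage on the sales price and another on the profit) can be worked through numerically. The sketch below is only an illustration: the function name and the two sample sales are hypothetical, and the percentages are read as 2 per cent of sales price and 5 per cent of profit.

```python
from decimal import Decimal


def salesperson_pay(base_salary, sales_pct, profit_pct, sales):
    """Yearly pay: base salary plus commission on each sale.

    sales is an iterable of (sale_price, profit) pairs; both commission
    percentages are stored per salesperson, as the scenario requires.
    """
    total = Decimal(base_salary)
    for sale_price, profit in sales:
        total += Decimal(sale_price) * sales_pct   # commission on sales price
        total += Decimal(profit) * profit_pct      # commission on profit
    return total


# John: R500 000 base, 2 per cent on sales price, 5 per cent on profit.
# The two sales below are made-up figures for illustration only.
john_pay = salesperson_pay(
    base_salary=500_000,
    sales_pct=Decimal("0.02"),
    profit_pct=Decimal("0.05"),
    sales=[(100_000, 20_000), (50_000, 8_000)],
)
```

Because the two percentages apply only to salespeople, they belong on a salesperson subtype rather than on every salaried employee.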
CASE STUDIES
1
Sedgefield
Bike Rentals is a small family-owned
Tourists
regularly
visit the
coastline.
The
main business
maintaining suitable
those
for
bikes
a complete bike
has
manufacturer, CHILD,
three
to
ERD to
and type
TEENAGER,
Sedgefield
Bike
that
depending
(e.g.
bike is
date
condition.
is
on
which
with
a class
day or full
it
All
South Africa. areas
bikes to
on bikes
described
beautiful
hire to tourists,
that
are
no longer
below:
by a unique
number.
is recorded
along
has reached
the
dealers
they
For each with the
end
bike,
the
size (e.g.
model,
INFANT,
of its lifespan
The price it is
sell to
on
A dealer
poor condition
it is
make
along
with the
Rights
into
it is
date
the
not
be
(typically
sold for and the
a regular
may or
then it is
to
affect
hire If
scanned, the
overall
to
Bike
basis
may not
and
maintain
purchase
not offered
to
a bike
a dealer
or
duplicated, learning
in experience.
need
by either
whole
Cengage
bike
For each
Due Learning
good
class
size,
working
A description
has
example,
been
a typical
maintenance
telephoning to
can rent
part.
for
a
order.
of the fault
repaired,
the
problem
is that
on a regular
and
action new
basis.
maintenance.
agrees
or in
rates
CHILD_YOUNG,
rates.
Log.
will undergo
A customer
standard
ensure that it is still in
For
a customer
determine
LARGE_ADULT}.
When the
a bike
to
of bike are {INFANT,
Maintenance
may never bikes
used
and full-day
is recorded.
a bike
shop.
copied,
checked
lifespan,
that
is
sizes and
also recorded.
are recorded.
materially
that
half-day
in the
are
a request
details
code
STANDARD_ADULT
Over its
possible
size
day). The class
was noticed
and the
walking
suppressed
needs
information.
bike is in
it is recorded
are required.
and contact
review
a bike
of
contact
If the
assigned,
noticed,
Customers
Copyright
or road)
After
a number
dealers
TEENAGER,
was taken
However,
Editorial
business
mountain
has
the
associated
code
simply
the
etc.).
After a bike has been returned,
tyres
selling
the
in the bike record.
of hire (half
If a fault is
that
acquiring
and
on usage) it is sold on to a dealer.
Rentals
on its
CHILD_OLD,
the
order
Sedgefield, around
scrapped.
period
unique
and cycle
Rentals is in
working
and is identified
ADULT,
contains
and just Each
business located in
countryside Bike
a good
support
a bike record
date are also recorded
the
the
of Sedgefield ensure
years but dependent
a list
admire
hiring.
Create Every
area to
to
electronic reserves
Sedgefield
hire
a bike,
Bike
his or her
one or more bikes
Rentals
name,
or by
address
at a given time.
For each time In
bike that is rented,
the
bike
addition,
code.
the
Each
number
was taken amount
rental
or
When a bike is for
ordering
each
being
week, for
one
costs
order
A order
but
and
a grand
over
time
can be
one
Explorer
part
but
mountain
the rental
actual
time
determined
customer.
it
date, the
was returned.
through
the
can
be rented
A bike
order
total. can
one
place
many
a part
contain
Bike
Rentals
orders
are
may place
can
class
size out
any
be responsible on Fridays
of orders.
order,
any associated
be associated
with
placing
orders.
can be ordered
description
a request
will
placed
any number of the
employee
and parts
number,
often
Typically
date, a subtotal
Only
many parts
bike large
of Sedgefield
An employee
number,
employees
can
one
are required.
Mondays.
by a unique
and the
which is
with only
that
order
made for
This contains
back
an employee
parts
on
due
be rented.
of an order
part is identified at least
bike
is created.
was
paid is recorded,
maintained,
delivery
consists
it
associated
may never
additional
An order delivery
is
record
time
of rent
record
of times
a rental
out, the
for
and
on part
several
multiple cost.
of the
occasions.
Each
An order
must
parts,
e.g. three
same
be for
saddles.
6 A particular in
stock
part
with
Sedgefield
manufacturer,
to
2
of the
is
placed Bike
uses
Unsolicited
the
title
ensure
that
its
the journal,
name,
is
some
a part is part in
basis.
address,
not
stock.
For
telephone
may supply
for
is
affiliation
manuscript
to to
record
publication.
publication.
support
it
the
each
number
parts
via
orders
When a
the
and
a
including
(the
school
order
authors.
which
is received,
below:
the
each
authors
A single
authors
name,
are kept in
author a
of
Every
manuscripts
when
the
status
or company).
Additionally, in
of the journal
described
manuscript
who have submitted
have several
needs
cent
about it in the system,
was received
the journal.
10 per
A new issue
manuscript
authors
systems research
Only about
business
also recorded,
Only authors
to
for
basic information
date
author(s) and
does
within
ensure
their
email
May
not materially
be
affect
the
scope
of the
scope
of the journal,
reviewer
may have
manuscript
are listed
(for areas
in
then the other
has
in the
right
is
An area
some to
is the an
third remove
area
party additional
content
manuscript
not
author
selects
to
appropriate is
notified
three
or
via
more
or universities
the
of interest.
and
rights,
editor
reviewer,
IS2003
the
content
companies
areas
of interest,
of the
and the
has specified.
example,
topic
If the
For each and
the
rejected
work for
validity.
many
review
journal. to
affiliation
the
experience.
briefly
changed
Reviewers
a description have
scanned, the
is
scientific
that
can
copied,
will
address,
expertise
and includes
not
editor
manuscript.
name, of
the
status
within the
A reviewer
Reserved. content
manuscripts
ERD to
the
manuscripts
fall
the to
areas
Rights
the
manuscripts
number,
modelling).
All
on a regular
manufacturer
are accepted
and records
for a
contents
content
by an IS code
suppressed
a
by authors.
manuscript,
convenience,
the
manuscripts
predefined
select
submitted
about
to review
reviewer
and
a complete
it is important
or his earliest
any
the
the
basis.
to the journal
different
At her
Learning.
use
If
have
credits.
email. If the
that
manufacturers.
see if they
they
is recorded:
to
address
It is typical
authors,
reviewers
Cengage
of
to
Research Knowledge is a prestigious information
are
email
many
manuscript
read
number
manufacturers
a regular
must have an author.
system.
for
any
be checked
manufacturer
Create
Information
multiple
of
process
of the
address,
submitted
deemed
on
it a number
the
manuscript
has
Rentals
quarter.
assigns
received.
2020
one
manuscripts
including
review
with
submitted
each
mailing
alist
can
information
a peer-review
manuscripts
published
editor
from
others
keep
The Journal of E-commerce
is
Copyright
Rentals
the following
Sedgefield
It
be obtained
address.
order
journal.
Editorial
always
manufacturer,
Bike
and email Each
can
one
system Areas
and
records
a
of interest
of interest
are
is identified
code for database of interest
can
be
associated It is
with
unusual,
many reviewers.
but
The editor
possible,
will change
reviewers
received
will typically
as
well
of this the
a
of the journal,
although
the
will be
published
journal.
issue
Each
including been
fonts,
which
issue.
stored
in
of the
each
publication
the
order
issue
is
which
may
appear
page
manuscript
has
status
that
system.
in
and the
been
for
each
For
is
manuscripts
the
of the
content,
manuscript
has
will then
of
decide
manuscripts
within
must
an issue,
published,
in that
6
publication
manuscript for
each
one issue
editor
order
scheduled
manuscript
only
Once the
each
Once an issue of
to accepted
formats
The
the
is recorded.
which
in
which
manuscripts,
known
appears
number
to scheduled.
and the
it is
on
summer),
many
and so on. the
date
all
evaluations,
status
spring,
process
in
will record
their
its
on a
be scheduled.
will contain before
editor
with the
must
feedback
to the field,
of acceptance
winter,
justification
will
The
change
it
a typesetting
provide
manuscript
provided
date
manuscript
beginning
the
and
the
system
is recorded
the
along
have
publication,
in the
spacing,
is changed
recorded,
263
no reviewers.
new reviewers
and contribution
received,
An issue
manuscript
Once
Concepts
A reviewer
and
or reject).
period (autumn,
of pages
and the
system.
for
through
rating
reviewers
An accepted
size, line
has
reviewer.
convenience
includes
manuscript
are recorded.
accepted
The
manuscript
the
the
number
earliest
review
the
accepted
goes
Advanced
and record
each
each year, although
will be published,
issue.
journal
review
methodology
of the
publish
been
that
Modelling
one area of interest.
the
sent to
(accept
each
all
has
font its
issue
each
for
Once
manuscript
typeset,
at their
publication
may be created in
was
to review
clarity,
system
and number
at least which
to under
date it
each reviewer
for
manuscript
manuscript
volume
manuscript
manuscript
whether to
If the
for
Data
yet.
from
in the
will decide
year,
the
was received.
or rejected.
issue
manuscripts
must specify
of interest
and the
manuscripts
as a recommendation
editor
of the
appropriateness,
information
Once
status
The feedback
feedback
for
any
scale for
an area
manuscript
will read
editor.
ten-point
All reviewers
have
several
not have received
to the
the
the
receive
The reviewers
to
6
the
the
issue
be
is
status
print
date
changed
to published.
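The manuscript status values named in this case (received, rejected, under review, accepted, scheduled, published) form a small workflow. The sketch below is one possible reading of the allowed status changes; the transition table and the function name are assumptions made for illustration, not part of the case text.

```python
# Assumed reading of the case: which status may follow which.
ALLOWED_TRANSITIONS = {
    "received": {"rejected", "under review"},   # editor's first brief review
    "under review": {"accepted", "rejected"},   # decision after reviewer feedback
    "accepted": {"scheduled"},                  # placed in an issue
    "scheduled": {"published"},                 # issue has been printed
    "rejected": set(),                          # terminal
    "published": set(),                         # terminal
}


def change_status(current, new):
    """Return the new status if the change is allowed, else raise ValueError."""
    if new not in ALLOWED_TRANSITIONS.get(current, set()):
        raise ValueError(f"cannot change status from {current!r} to {new!r}")
    return new


# Walk one manuscript through the full path to publication.
status = "received"
for step in ("under review", "accepted", "scheduled", "published"):
    status = change_status(status, step)
```

In an ERD, the status itself is just an attribute of the manuscript; rules like these would be enforced by the application or by constraints, not by the diagram.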
3
Global Computer offices to
located
maximise
according so that
Solutions (GCS) is an information
throughout its
resources,
to region.
The
a first
2020 has
Southern
Each
employee
Each
skill
for
has
II,
many
a skill ID,
consulting
companys
highly
projects,
company
success
skilled
is
based
employees
GCS has contacted
customers,
systems
C11
I,
GCSs
GCS
have
an
and
a date
does
May not
and skills,
analyst
C11
II,
employees,
not materially
to
you to
with many on its
work
ability
on projects
design
a database
project
schedules,
projects,
copied, affect
and information
requirements.
A basic
many
I,
overall
or
I,
duplicated,
an employee
last
name,
a
middle
of
II,
in
Valid
II,
database
or in Cengage
I,
Java
engineer
shows
whole
Europe (EE),
have the
pay.
Java
network
experience.
Eastern
Western Europe
(SA).
analyst
Python
table
learning
ID,
employees
and rate
engineer
the
Africa
I, systems
scanned,
employee
of hire.
South
The following
be
operations
Europe (NE),
and
Python
manager.
project
all of
description
DBA, network
any
assign
of their
Northern
(SE)
SQL Server
Learning.
The
follows:
a region
Europe
has
entry C II,
working
and
that
technology
Africa. to
manage its
are as follows:
(WE),
Cengage deemed
ability
must support
name,
Valid regions
review
its
South
can keep track
main entities
The employees
Copyright
is,
To better
database of the
initial,
Editorial
and
and invoices.
GCS
description
C I,
that
GCS managers
assignments
data
Europe
part.
skills
are
II,
as follows:
ASP I,
I,
data
database
ASP II,
of the
right
some to
third remove
skills
party additional
content
entry
I,
designer
Oracle
web administrator,
rights, the
skill.
designer
II,
an example
Learning
same
DBA,
technical
II,
MS
writer
inventory.
Employee
Skill
6
Seaton
Data
Entry I
Data
Entry II
Williams
Systems
Analyst
I
Craig
Systems
Analyst
II
Chandler
Josh;
Brett;
Williams
Josh;
Seaton
Amy
Sewell
Beth;
Joseph;
Burklow
Designer
I
Yarbrough
Peter;
Smith
DB
Designer
II
Yarbrough
Peter;
Pascoe
Kattan Chris; Epahnor
C11 I
Smith Jose;
C11 II
Nokwi Londe;
Python I
Zebras
Steve; Ellis Maria
Python
Zebras
Steve;
Duarte
Miriam;
Ismail
Summers
Victor,
Nkosi Cela
Nokwi Londe;
Pieterse Bush
Hemalika;
Oracle DBA
Smith Jose;
Peter;
Smith
Engineer
I
Ismail
Hemalika;
Smith
Mary
Network
Engineer
II
Ismail
Hemalika;
Smith
Mary
Ismail
Hemalika;
Smith
Mary;
Manager
GCS has number GCS to
many customers.
works
design,
by
projects.
develop
a project date (an
start
estimated
cost
employee
assigned
actual
employee
is, in
effect,
has
date
as
Kenyon
Tiffany;
Connor
has a customer
based
Sean
ID, customer
on a contract
a computerised
(that
is,
the
date
a project
an
actual
manager project
start
of the
is
the
name,
phone
who is the
manager
to
which date
date,
the
Each
which the
the
end
date,
and
GCS
specific
belongs,
contract
an estimate),
an actual
has
project
projects
(also
customer
project
a brief
was
signed),
a project
a
budget
an actual
cost
(total
and
one
of
pay)
project.
updated
hours
on
end
between
solution.
project ID, the customer
multiplying
and
the
each
each
Friday
employee
that
by adding worked
duplicated, learning
must
In the
performed
that
times
of people
complete
project
to
description,
number
or
project plan.
will be
a brief task
and the
be
of the
development
tasks
needed,
Reserved. content
is
estimate),
has a task ID,
of skill
suppressed
as the
project),
a design
determine
Each task
2020
Roger;
weeks
the
cost
skills
rate
to
cost.
The
type
Jaco
Larry
Each customer
A project
of the
by
actual
must
of
cost
(computed
review
such
project
the
Mudd
and implement
description,
The
Brad;
Bender
Pieterse
and region.
characteristics
Copyright
Surgena;
Paine
Maria
Jose
Network
Project
Ellis
Pascoe Jonathan
Yarbrough
Kilby
Anna;
Pieterse Jaco
Miriam; Pieterse Jaco
Writer
Erin
Emily
Duarte
Technical
Steve
Jaco
ASP II
Web Administrator
Zebras
Bible Hanah
Miriam; Bush Emily
DBA
Emily;
Cope Leslie
Duarte
Server
Robbins
Victor;
ASP I
SQL
Bush
Jonathan
CII
Java II
Erin;
Mary
Kattan
I
Epahnor
Buhle
Shane;
CI
II
Chris;
Khoza
Robbins
DB
Java
Editorial
Amy;
take
schedule
(or
project
from
the
the tasks (with
the
starting
required
rights, the
a project
schedule, plan),
the
and ending
additional
content
manager
beginning
skills)
party
which
may
be
to
suppressed at
any
time
end.
date, the
required
content
to
from if
complete
the
the task. General tasks are initial interview, database and system design, implementation, coding, testing, and final evaluation and sign-off. For example, GCS would have the project schedule shown in the next table.

Project ID: 1            Description: Sales Management System
Company: See Rocks       Contract Date: 12/2/2019
Region: WE               Start Date: 1/3/2019
Budget: R375 000         End Date: 1/7/2019

Start Date  End Date   Task Description                         Skill(s) Required    Quantity Required
01/03/19    06/03/19   Initial Interview                        Project Manager      1
                                                                Systems Analyst II   1
                                                                DB Designer I        1
11/03/19    15/03/19   Database Design                          DB Designer I        1
11/03/19    12/04/19   System Design                            Systems Analyst II   1
                                                                Systems Analyst I    2
18/03/19    22/03/19   Database Implementation                  Oracle DBA           1
25/03/19    20/05/19   System Coding and Testing                C I                  2
                                                                C II                 1
                                                                Oracle DBA           1
25/03/19    07/06/19   System Documentation                     Technical Writer     1
10/06/19    14/06/19   Final Evaluation                         Project Manager      1
                                                                Systems Analyst II   1
                                                                DB Designer I        1
                                                                Cobol II             1
17/06/19    21/06/19   On-Site System Online and Data Loading   Project Manager      1
                                                                Systems Analyst II   1
                                                                DB Designer I        1
                                                                C II                 1
01/07/19    01/07/19   Sign-Off                                 Project Manager      1

Assignments: GCS pools all of its employees by region, and from this pool, employees are assigned to a specific task scheduled by the project manager. For example, for the first project's schedule, you know that for the period 01/03/19 to 06/03/19, a systems analyst II, a database designer I and a project manager are needed. (The project manager is assigned when the project is created and remains for the duration of the project.) Using that information, GCS searches the employees who are located in the same region as the customer, matching the skills required and assigning them to the project task.

Each project schedule task can have many employees assigned to it, and a given employee can work on multiple project tasks. However, an employee can work on only one project task at a time. For example, if an employee is already assigned to work on a project task from 20/02/19 to 03/03/19, (s)he cannot work on another task until his/her current assignment is closed (ends). The date on which an assignment is closed does not necessarily match the ending date of the project schedule task, because a task can be completed ahead of (or behind) schedule.
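The GCS rule that an employee can work on only one project task at a time, and cannot start another until the current assignment is closed, can be expressed as a simple date check. The sketch below is a minimal illustration: the function name and the tuple layout are hypothetical, and it assumes that an assignment still blocks new work on the day it is closed.

```python
from datetime import date


def can_start(existing_assignments, new_start):
    """Return True if the employee is free to begin a new assignment.

    existing_assignments is a list of (begin_date, closed_date) pairs for
    one employee; closed_date is None while the assignment is still open.
    Assumption: the closing day itself still counts as occupied.
    """
    for begin, closed in existing_assignments:
        already_begun = begin <= new_start
        still_open = closed is None or closed >= new_start
        if already_begun and still_open:
            return False
    return True


# The case's example: an assignment running from 20/02/19 to 03/03/19.
current = [(date(2019, 2, 20), date(2019, 3, 3))]
blocked = can_start(current, date(2019, 3, 1))   # falls inside the assignment
free = can_start(current, date(2019, 3, 4))      # day after it was closed
```

The check uses the date the assignment was actually closed, not the scheduled end date of the task, because the case notes that the two need not match.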
266
PART II
Project
Design
Concepts
Description:
ID:1
Company:
Sales
Management
Contract
See Rocks
System
As of: 29/03/19
Date: 12/2/2019 Actual
Scheduled Project
Task
Start
Initial Interview
Date
1/3/19
Employee
End Date
Skill
6/3/19
Project
11/03/19
15/03/19
11/03/19
12/04/19
Mgr.
Start
101
Connor
Analyst
II
102
Cele
DB Designer
I
103
Pillay
DB Designer
I
104
Pillay
Sys.
Database
assignments Date
End Date
01/03/19
06/03/13
01/03/19
06/03/13
M.
01/03/19
06/03/13
M.
11/03/19
14/03/13
S.
S.
Design System
Design
Database
Sys. Analyst II
105
Cele S.
11/03/19
Sys. Analyst I
106
Hemalika I.
11/03/19
Sys.
107
Zebras
11/03/19
108
Smith
Analyst
I
DBA
S.
15/03/19
J.
18/03/19
22/03/19
Oracle
25/03/19
20/05/19
Cobol I
109
Summers
Cobol I
110
Ellis
Cobol II
111
Epahnor
DBA
112
Smith
Writer
113
Kilby
19/03/13
Implementation
6
System
Coding
& Testing
Oracle System
25/03/19
07/06/19
10/06/19
14/06/19
Tech.
A.
21/03/19
M.
21/03/19 21/03/19
V.
21/03/19
J.
25/03/19
S.
Documentation Final
Evaluation
Project
Mgr.
Sys. Analyst II DB Designer
I
Cobol II On-Site
17/06/19
System and
21/06/19
Project
Online
Data
DB
Loading
(Note:
01/07/19
The
assignment
assignments whatever
shown number
01/07/19
number previously
matches
Given with
all
your
shown
as
fills
following
end
each
assignment
of the bill to
you
project
name; date
are
can
for
of this
see that
schedule.
example, design.
assignment
101,
The
102.)
Assume
assignment
week (Friday)
of the
month
ID, the total
hours
that
number
which the
work log
that
the
can
be
entry is charged.
of the
current
is
project
shown
an employee
assignment,
schedule
on page
a record
of each
month.
month if it
up to the
Obviously,
work log
of the
you
task,
date
projects
run
267.
of the
actual
hours
work log is a weekly form that the employee
of the
week (or
associates
track
be any dates as some
containing
end
day
keep
employee,
form
The
or at the
or the last
worked
ID,
a work log
on a given assignment.
assignment to
could
assignment
kept in
the
Therefore,
ends (which
A sample
works
of each
Friday
end
each
The form
doesnt
contains
fall
of the
month),
work log
the
on a Friday), and the
the number
entry can be related
entries for the first sample
to
project is shown in
table.
Reserved. content
employee
as of the
information:
only one bill. A sample list
deemed
of the
existing
schedule).
an employee
out at the
the following
Mgr.
starts and date assignment
of or behind
date (of
has
I
II
information, the
worked by an employee
2020
Designer
design.
using
the
a prefix ones
preceding task,
at least
The hours
review
II
Project
only
database
of the
assignment ahead
is
are the
a project
require
Copyright
Analyst
Cobol
Sign-Off
Editorial
Mgr.
Sys.
Employee
Week
Name
Hours
Advanced
Worked
Bill
4
xxx
01/03/19
1-101
4
xxx
01/03/19
1-103
4
xxx
08/03/19
1-102
24
xxx
08/03/19
1-101
24
xxx
08/03/19
1-103
24
xxx
15/03/19
1-105
40
xxx
Hemalika I.
15/03/19
1-106
40
xxx
Pillay
15/03/19
1-108
6
xxx
15/03/19
1-104
32
xxx
15/03/18
1-107
35
xxx
22/02/19
1-105
40
Hemalika I.
22/02/19
1-106
40
Ellis
22/02/19
1-110
12
22/02/19
1-111
12
Pillay J.
22/02/19
1-108
12
Pillay
22/02/19
1-112
12
22/02/19
1-109
12
22/02/19
1-107
35
22/02/19
1-105
40
29/03/19
1-106
40
29/03/19
1-110
35
Connor
S.
M.
Cele
S.
Connor
S.
Smith
M.
Cele
S.
J.
Pillay
M.
Zebras Cele
S. S.
M.
Mbaso
V.
J.
Summers Zebras
A. S.
Cele S. Hemalika Ellis
I.
M.
29/03/19
1-111
35
Kilby
S.
29/03/19
1-113
40
Smith
J.
29/03/19
1-112
35
29/03/19
1-109
35
29/03/19
1-107
35
Mbaso V.
Summers Zebras (Note:
A.
S. xxx represents
Finally,
the
the
every
on the
15
period.
entries that
and
each
first
you
work-log
can
work log
Create
all of the
Create
the
Populate
has
Cengage deemed
Learning. that
is to
sent to the
not
be
that
only
bill.
the
hours
using the
GCS sent
between
bill in this
will fulfil the
operations
bill
and
table
worked
bill number,
many work log one
01/02/19
one
skill,
additional create
maintain
or
totalling
a bill can refer to one
Number
database.)
customer,
only
267
6
a bill, it updates,
worked is
duplicated, learning
in experience.
entity
whole
customer, required
all of the
as needed (as indicated
materially
to
hours there
are and
to
your
bill. In summary,
are employee,
tables
in
Concepts
on 15/03/19
15/03/19.
and that
bill
covers
above form.
bill. (There
indexes
the tables
Rights
that
a database
entities and
bill number
be related the
assume
required
required
can
totalling
create
and
of that
shown in the
minimum required
assignment,
2020
safely
entries
the
When GCS generates
entry
(Xerox),
matches
written
are part
work log
project
Your assignment The
one that
a bill is
Therefore,
the
Use the
days
entries the
bill ID.
project for that
work-log
for
review
Modelling
1-102
Smith
Copyright
Number
Data
01/03/19
Cele S.
Editorial
Assignment
Ending
6
in the
region, entities
required
integrity
electronic reserves
project,
that
are
project
problem. schedule,
not listed.)
using
surrogate
primary
keys.
data and forms).
rights, the
in this
relationships.
when
sample
to
described
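The GCS case describes a project's actual cost as being updated each Friday by adding that week's cost, computed by multiplying the hours worked by the rate of pay. The sketch below shows that arithmetic for the work-log entries of the week ending 01/03/19 (assignments 1-101, 1-102 and 1-103, four hours each). The hourly rates are invented for illustration, and keying the rate by assignment number is a simplification; in the case, the rate of pay belongs to the skill.

```python
from decimal import Decimal

# Hypothetical hourly rates of pay, keyed by assignment number for brevity.
RATE_OF_PAY = {
    "1-101": Decimal("400"),
    "1-102": Decimal("300"),
    "1-103": Decimal("250"),
}


def week_cost(work_log_entries, rates):
    """Sum hours worked times rate of pay over one week's work-log entries."""
    return sum(
        (Decimal(hours) * rates[assignment] for assignment, hours in work_log_entries),
        Decimal("0"),
    )


def update_actual_cost(actual_cost, work_log_entries, rates):
    """Friday update: add the week's cost to the project's running actual cost."""
    return actual_cost + week_cost(work_log_entries, rates)


# Work-log entries for the week ending 01/03/19: (assignment number, hours worked).
week_ending_1_march = [("1-101", 4), ("1-102", 4), ("1-103", 4)]
actual_cost = update_actual_cost(Decimal("0"), week_ending_1_march, RATE_OF_PAY)
```

Storing the actual cost on the project is a derived attribute; the same figure can always be recomputed from the work-log entries.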
4 Martial
Arts R Us (MARU)
students. teach
The each
of each
database
class,
and
student are
with their
name,
instructor
date
must
An instructor
any
taught
of students
help.
many
For
5:00
Each
least
A given rank
school each
head
rank
Cengage deemed
a rank
Learning. that
any
is
stored
along
In addition
start
working
(compensated
to
as an
or volunteer).
but each class
volunteer
has one and
instructors,
may not be
class
2 is
week, and location.
Room
an intermediate-level
level
meetings
other than
session.
class
be recorded.
the
assigned
one instructor
Room
martial
arts.
1. During
students.
rank
rank.
is
belt
instructor
assigned
to teach
meeting
up
class. or the
of that
class,
instructor.
colour,
and rank
requirements, All ranks
to that
(head
as an assistant
name,
may show
many assistant
assigned
a particular
served
have numerous
with only one particular
and
roles
Mr Jones
The rank
normally
by any
instructor
who is
and the instructors
Ms Khumalo
the
yet.
For example,
class in
is no
Therefore,
meeting is
be attended
may have a head instructor the
A third
be tracked.
and each may not
class.
class.
week, so there
class
must
For
class.
each
particular
meeting
meetings,
class
and
a beginner-level
during
any
class
1 is
1 is
an advanced-level
meetings
any
Most ranks
day of the
Room
class
at least
in the
by
rank
that
can
should
also
but each
except
All
Rights
white belt have at
does
computing.
new
May
attains rank
is
of
system.
kept
white
in
belt.
All ranks
Sales (GUTS) is
a variety
not
each students the
system.
The
have
to think
progress
date
New that
at least
of a student
through students
a student
one
joining is
student
as
the ranks. the
awarded
who
has
at some time.
not materially
Employees
of
personal
computing
company
Reserved. content
the
While it is customary
to track
a student
be kept in the
rank
use The
The
suppressed
many students.
given
Global Unified Technology
address.
has
number
they
of classes,
will attend
of the
instructor
automatically
that
and laptops.
2020
to
may be held
model for employee
review
have date
associated
every are
achieved
Copyright
progress
requirements:
are instructors.
status
time,
appropriate
class
a single rank, it is necessary
Therefore,
Editorial
the
to
school.
especially
Room
meeting
p.m., intermediate
is
to track
The
of
assigned
one requirement.
having
They
of the
attended
the
are stored.
requirement
who is
ERD for these
date that
instructor
of a class, instructors
need
holds
requirements
the
p.m. in
p.m. in
Some
will always
was the
student
the
with hundreds
offered,
school.
at 5:00 p.m. in
6:00
student
have
meeting,
instructor)
Mr Jones
at 5:00
a given class
but it
Monday,
at
many different
meeting
class
assistant
joined
at a specific
Mondays
Mondays
students.
Therefore,
each
the
any number
at each individual
may not
instructors,
Foot
not all students
with their
level
particular
will attend
At any given
they
all instructors,
to teach
any class
any
students
join
Some instructors,
on
on
attendance
by
date
are
Also, it is important
Crows
they
but clearly
a specific
that
to
5
for
may attend
A student
class.
that
class.
expectation
attended
when
for
instructor.
on Tuesdays
Students
each
classes
a complete
and the
along
one class taught class
taught
New
birth,
may be assigned
to
Another
attend
number
be recorded
A class is offered
class
of
MARU is a martial arts school
of all the
Create
information,
only one assigned
6
students
a student
student
example,
track
are also students,
normal
assigned
keep
advance.
given
All instructors the
which
as they
Students
needs a database.
must
copied, affect
to
scanned, the
a bring
your own device (BYOD)
can use traditional
desktop
computers
mobile
computing
model introduces
wants
be
moving towards
overall
ensure
or
duplicated, learning
that
in experience.
whole
any
or in Cengage
part.
some devices
devices security
such risks
connecting
that to
party additional
content
in their
as tablets, GUTS is
their
may content
attempting
servers
be
are
suppressed at
any
time
offices.
smartphones
from if
the
subsequent
to
properly
registered support
and approved the
Every
business
employee
number,
is
company.
every
has
currently
At such
have
number
hired employees
it is
happens,
possible
only the
can
part of the
a static
someone For
devices,
that
Once
a device
devices for
a period
for
a single
Each
server
Some
servers
then
the
can
host
system
to
it is
must
Rights
virtual
to
connect each
for and IP
May
but
newly
brand and system.
However,
system.
When
6
mobile to
IT
a static IP address,
should
department
serial
and
also
be kept dispatch
which
department
has
device is
can
number,
OS. The IT
devices.
be a permanent
A desktop
This location
not
also
is
encryption
also enabled
or not each
mobile
for
device
and which but
be
to
copied, affect
the
overall
or
duplicated,
any
but
the
one
server.
more
has
if it is Not
all
all
system
a number
Therefore,
it is
of
possible
servers.
departments
physical
facilities
servers
Further,
enabled servers.
might be in the GUTS
not for IT
are
or
device
individually.
Within the
are
physical
server
virtual
it is
server. In
that
in experience.
whole
is
other
host
that
servers.
If
it is running
server
servers
devices
learning
to
be recorded.
physical each
be created
scanned,
so the
server
where
some
Not all physical many
capabilities to
can
are
be located.
necessary
to track
on each server.
can host a virtual
have
materially
address.
used
appropriate
servers
rooms
should
servers
new servers
not
several
server
track
server.
does
or
system.
on whether
at first,
for
servers,
servers
Reserved. content
office
the
connections
approved
is in
virtual
for
be approved
brand,
server
will normally
for
in
an
another.
in the
are intended
enabled
and the
be approved
be approved
should
many
virtual
possible
system,
may be approved
system is being are
lock
are recorded
device,
to
is tracked
devices
storing information
in the
a name,
each
operating
A server
has
support
before
to
has
Only physical
2020
has
of climate-controlled
which
another
review
device
device
a device
Which room
Copyright
the of the
registered
device
a number
Editorial
capture version
For
needs to be recorded.
number).
also to
the
will be in the
device is assigned
and the
within
employees.
assigned
one
one employee device
compromised,
a screen
is
an employee
is kept in the
office
an
enabled.
the
of time
and
at least
and
using,
meet the requirements
servers,
and
becomes
device
a company
desktop
and the
For each device, the
to
the
in
any
middle initial)
Each have
company
hardware
name
problem.
mobile
reside
each
the
should
is
device,
and
from
owns
As such,
it is important
each
ERD to
department
be created
without
initially.
transfer
that
computer
device
capabilities
mobile
could
by the
if the
it is
269
mail box
which
can
system.
are registered
provided
(building
(OS)
The system
has these
a
location
system
verifying
data.
for the
so that,
mobile
that
systems
network.
to remediate
operating
in the
who currently
desktop
Concepts
title.
Most employees
a device
are typically
company
system
department
in the system, the date of that registration
MAC address
kept in in the
registered
name,
in
temporarily
each employees
devices
employee
be either
devices
and the
a new
and name (first, last,
to keep
that
Advanced
has 5 employees,
will only track
may exist
number
Only
Modelling
Create a complete
code,
currently
system
Very rarely,
devices
a department
department
when it is registered.
a device is registered
Desktop
many
has
This
department
be recorded.
While unlikely,
Devices
the
department.
might not have any devices registered
need to
if that
employed.
an employee
can
identification
model
The smallest
It is also necessary
An employee
that
40 employees.
times,
employee,
system.
Technology
Data
below:
a department
number.
department
employee
described
works for
and phone
largest
the
by the Information
needs
6
on.
on only
words,
one virtual
server, server
physical
server
server.
cannot
host
server.
approved
to
a virtual physical
one
access
the
server,
do not yet have any approved
or in
is
A single
hosted
a virtual
are
a server
but it is
devices.
may content
be
When a
device is It is
approved
also
happens,
the
approval,
it
removal A server
date
provide service
Cengage
in this
Learning. that
user
tracked
any
All suppressed
system,
must
get
permissions
approved
date of that for
was removed at a later
but
new is
the
Rights
approval
a server
should date if
to lose
should
its
be recorded.
approval.
be recorded.
whatever
service
to
might
If
If that
a device
circumstance
employee
access
loses
that
May
not materially
be
affect
scanned, the
can
its
lead
the
any
to the
overall
or
that
can
at first.
are not
Most
employees
might not have employees
The
date
The first
a username
employee
on only one services
use it.
approved
system.
must create the
date
a server.
they
users
by the
and
The
runs
Client-side
with
multiple
approved
name.
Each service initially.
before
support
employee
managers,
and
but new employees
is tracked
and password
eventually
copied,
a service
homework
number
be associated
of services,
have
a service
a service,
is
chat,
be recorded.
must
service
not
use
username
as email, identification
should
service
Each
to
such
a unique
might not offer any services
every
service.
to access
not
has
permission
services
same
does
that
so
approved
Reserved. content
services,
to use a wide array
on any
will be the
deemed
many
on a server
offering
is approved
has
was
approval
new servers
which
to a server, the
that
approval
although
employee
2020
that
GUTS began
users,
review
the
can
permission
Copyright
that
may regain
Each
Employees
Editorial
a device
server
have
6
for
is resolved.
others.
that
for connection
possible
as
on
time
which
the
an employee
and password.
will use for
every service
This for
approved.
CHAPTER 7
Normalising Database Designs

IN THIS CHAPTER, YOU WILL LEARN:
What normalisation is and what role it plays in the database design process
About the normal forms 1NF, 2NF, 3NF, BCNF and 4NF
How normal forms can be transformed from lower normal forms to higher normal forms
That normalisation and ER modelling are used concurrently to produce a good database design
That some situations require denormalisation to generate information efficiently
PREVIEW
Good database design must be matched to good table structures. In this chapter, you will learn to evaluate and design good table structures to control data redundancies, thereby avoiding data anomalies. The process that yields such desirable results is known as normalisation. In order to recognise and appreciate the characteristics of a good table structure, it is useful to examine a poor one. Therefore, the chapter begins by examining the characteristics of a poor table structure and the problems it creates. You will then learn how to correct a poor table structure. This methodology will yield important dividends: you will know how to design a good table structure and how to repair an existing poor one.
You will discover not only that data anomalies can be eliminated through normalisation, but also that a properly normalised set of table structures is actually less complicated to use than an unnormalised set. In addition, you will learn that the normalised set of table structures more faithfully reflects an organisation's real operations.
7.1 DATABASE TABLES AND NORMALISATION
Having good relational database software is not enough to avoid the data redundancy discussed in Chapter 1, The Database Approach. If the database tables are treated as though they are files in a file system, the relational database management system (RDBMS) never has a chance to demonstrate its superior data-handling capabilities.
The table is a basic building block in the database design process. Consequently, the table's structure is of great interest. Ideally, the database design process explored in Chapter 5, Data Modelling with Entity Relationship Diagrams, yields good table structures. Yet it is possible to create poor table structures even in a good database design. So, how do you recognise a poor table structure, and how do you produce a good table? The answer to both questions is based on normalisation. Normalisation is a process for evaluating and correcting table structures to minimise data redundancies, thereby reducing the likelihood of data anomalies. The normalisation process involves assigning attributes to tables based on the concept of determination you learned about in Chapter 3, Relational Model Characteristics.
Normalisation works through a series of stages called normal forms. The first three stages are described as first normal form (1NF), second normal form (2NF) and third normal form (3NF). From a structural point of view, 2NF is better than 1NF and 3NF is better than 2NF. For most business database design purposes, 3NF is as high as you need to go in the normalisation process. (Actually, you will also discover in Section 7.3 that properly designed 3NF structures meet the requirements of fourth normal form (4NF).)
Although normalisation is a very important database design ingredient, you should not assume that the highest level of normalisation is always the most desirable. Generally, the higher the normal form, the more relational join operations are required to produce a specified output and the more resources are required by the database system to respond to end-user queries. A successful design must also consider end-user demand for fast performance. Therefore, you will occasionally be expected to denormalise some portions of a database design in order to meet performance requirements. (Denormalisation produces a lower normal form; that is, a 3NF will be converted to a 2NF through denormalisation.) However, the price you pay for increased performance through denormalisation is greater data redundancy.
7.2 THE NEED FOR NORMALISATION
The normalisation process can be illustrated with a business application, the simplified database activities of a construction company that manages several building projects. Each project has its own project number, name, employees assigned to it and so on. Each employee has an employee number, name and job classification, such as engineer or computer technician.
The company charges its clients by billing for the hours spent on each contract. The hourly billing rate is dependent on the employee's position. (For example, one hour of computer technician time is billed at a different rate from one hour of engineer time.) Periodically, a report is generated that contains the information displayed in Table 7.1.
TABLE 7.1 A sample report layout

Proj.  Project       Employee  Employee                Job Class              Chg/    Hours   Total
Num.   Name          Number    Name                                           Hour    Billed  Charge
15     Evergreen     103       Mzwandile E. Baloyi     Elec. Engineer         67.55   23.8     1 607.69
                     101       John G. News            Database Designer      82.95   19.4     1 609.23
                     105       Alice K. Johnson*       Database Designer      82.95   35.7     2 961.32
                     106       William Smithfield      Programmer             26.66   12.6       335.92
                     102       Kavyara H. Moonsamy     Systems Analyst        76.43   23.8     1 819.03
                                                       Subtotal                                8 333.19
18     Amber Wave    114       Annelise Jones          Applications Designer  38.00   25.6       972.80
                     118       James J. Frommer        General Support        14.50   45.3       656.85
                     104       Noxolo K. Maseki*       Systems Analyst        76.43   32.4     2 476.33
                     112       Darlene M. Smithson     DSS Analyst            36.30   45.0     1 633.50
                                                       Subtotal                                5 739.48
22     Rolling Tide  105       Alice K. Johnson        Database Designer      82.95   65.7     5 449.82
                     104       Noxolo K. Maseki        Systems Analyst        76.43   48.4     3 699.21
                     113       Delbert K. Joenbrood*   Applications Designer  38.00   23.6       896.80
                     111       Geoff B. Wabash         Clerical Support       21.23   22.0       467.06
                     106       William Smithfield      Programmer             28.24   12.8       361.47
                                                       Subtotal                               10 874.36
25     Starflight    107       Maria D. Alonzo         Programmer             28.24   25.6       722.94
                     115       Travis B. Bawangi       Systems Analyst        76.43   45.8     3 500.49
                     101       John G. News*           Database Designer      82.95   56.3     4 670.09
                     114       Annelise Jones          Applications Designer  38.00   33.1     1 257.80
                     108       Krishshanth B. Khan     Systems Analyst        76.43   23.6     1 803.75
                     118       James J. Frommer        General Support        14.50   30.5       442.25
                     112       Darlene M. Smithson     DSS Analyst            36.30   41.4     1 502.82
                                                       Subtotal                               13 900.14
                                                       Total                                  38 942.09

Note: * indicates project leader.
The subtotals and total charge in Table 7.1 are derived attributes and, at this point, not stored in the table.
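The derived attributes in Table 7.1 can be recomputed from the stored ones. The following is a minimal Python sketch that does so for the Evergreen rows only. The names EVERGREEN and total_charge are illustrative and not part of the book's database, and rounding half up to two decimals is an assumption that happens to reproduce the printed figures.

```python
from decimal import Decimal, ROUND_HALF_UP

# (EMP_NUM, CHG_HOUR, HOURS) for the Evergreen project rows of Table 7.1.
EVERGREEN = [
    (103, "67.55", "23.8"),
    (101, "82.95", "19.4"),
    (105, "82.95", "35.7"),
    (106, "26.66", "12.6"),
    (102, "76.43", "23.8"),
]

CENT = Decimal("0.01")


def total_charge(chg_hour: str, hours: str) -> Decimal:
    """Derived attribute: charge per hour times hours billed, rounded to cents."""
    # Rounding half up is an assumption; the book does not state a rounding rule.
    return (Decimal(chg_hour) * Decimal(hours)).quantize(CENT, rounding=ROUND_HALF_UP)


# Compute the derived total charge per employee, then the project subtotal.
charges = {emp: total_charge(rate, hours) for emp, rate, hours in EVERGREEN}
subtotal = sum(charges.values(), Decimal("0"))

for emp, charge in charges.items():
    print(emp, charge)
print("Subtotal", subtotal)
```

Because the charges can always be recomputed this way, storing them would add redundancy without adding information.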
The easiest short-term way to generate the required report might seem to be a table whose contents correspond to the reporting requirements. (See Figure 7.1.)
Online Content The databases used to illustrate the material in this chapter are available on the online platform accompanying this book.
FIGURE 7.1 Tabular representation of the report format
Database name: Ch07_ConstructCo
Table name: RPT_FORMAT

PROJ_NUM  PROJ_NAME     EMP_NUM  EMP_NAME               JOB_CLASS              CHG_HOUR  HOURS
15        Evergreen     103      Mzwandile E. Baloyi    Elect. Engineer        67.55     23.80
                        101      John G. News           Database Designer      82.95     19.40
                        105      Alice K. Johnson *     Database Designer      82.95     35.70
                        106      William Smithfield     Programmer             26.66     12.60
                        102      Kavyara H. Moonsamy    Systems Analyst        76.43     23.80
18        Amber Wave    114      Annelise Jones         Applications Designer  38.00     24.60
                        118      James J. Frommer       General Support        14.50     45.30
                        104      Noxolo K. Maseki *     Systems Analyst        76.43     32.40
                        112      Darlene M. Smithson    DSS Analyst            36.30     44.00
22        Rolling Tide  105      Alice K. Johnson       Database Designer      82.95     64.70
                        104      Noxolo K. Maseki *     Systems Analyst        76.43     48.40
                        113      Delbert K. Joenbrood   Applications Designer  38.10     23.60
                        111      Geoff B. Wabash        Clerical Support       21.23     22.00
                        106      William Smithfield     Programmer             28.24     12.80
25        Starflight    107      Maria D. Alonzo        Programmer             28.24     24.60
                        115      Travis B. Bawangi      Systems Analyst        76.43     45.80
                        101      John G. News *         Database Designer      82.95     56.30
                        114      Annelise Jones         Applications Designer  38.00     33.10
                        108      Krishshanth B. Khan    Systems Analyst        76.43     23.60
                        118      James J. Frommer       General Support        14.50     30.50
                        112      Darlene M. Smithson    DSS Analyst            36.30     41.40

As you examine the data in Figure 7.1, note that it reflects the assignment of employees to projects. Apparently, an employee can be assigned to more than one project. For example, Darlene M. Smithson (EMP_NUM = 112) has been assigned to two projects: Amber Wave and Starflight. Given the structure of the data set, each project includes only a single occurrence of any one employee. Therefore, knowing the PROJ_NUM and EMP_NUM value will let you find the job classification and its hourly charge. In addition, you will know the total number of hours each employee worked on each project. (The total charge, a derived attribute whose value can be computed by multiplying the hours billed and the charge per hour, has not been included in Figure 7.1. No structural harm is done if this derived attribute is included.)
Unfortunately, the structure of the data set in Figure 7.1 does not conform to the requirements discussed in Chapter 3, Relational Model Characteristics, nor does it handle data very well. Consider the following deficiencies:
1 The project number (PROJ_NUM) is apparently intended to be a primary key (PK) or at least a part of a PK, but it contains nulls. (Given the preceding discussion, you know that PROJ_NUM + EMP_NUM will define each row.)
2 The table entries invite data inconsistencies. For example, the JOB_CLASS value Elect. Engineer might be entered as Elect.Eng. in some cases, El. Eng. in others, and EE in still others.
3 The table displays data redundancies. Those data redundancies yield the following anomalies:
a Update anomalies. Modifying the JOB_CLASS for employee number 105 requires (potentially) many alterations, one for each EMP_NUM = 105.
b Insertion anomalies. Just to complete a row definition, a new employee must be assigned to a project. If the employee is not yet assigned, a phantom project must be created to complete the employee data entry.
c Deletion anomalies. Suppose that only one employee is associated with a given project. If that employee leaves the company and the employee data are deleted, the project information will also be deleted. To prevent the loss of the project information, a fictitious employee must be created just to save the project information.
In spite of those structural deficiencies, the table structure appears to work; the report is generated with ease. Unfortunately, the report may yield different results depending on what data anomaly has occurred. For example, if you want to print a report to show the total hours worked value by the job classification Database Designer, that report will not include data for DB Design and Database Design data entries. Such reporting anomalies could drive managers up the proverbial wall, and they cannot be fixed through careful programming. The only solution is to ensure that the job classification codes are looked up from another table, that is, stored as a foreign key in this table.
Even if very careful data entry auditing can eliminate most of the reporting problems (at a high cost), it is easy to demonstrate that even a simple data entry becomes inefficient. Given the existence of update anomalies, suppose Darlene M. Smithson is assigned to work on the Evergreen project. The data entry clerk must update the PROJECT file with the entry:
15 Evergreen 112 Darlene M. Smithson DSS Analyst 36.30 0.0
to match the attributes PROJ_NUM, PROJ_NAME, EMP_NUM, EMP_NAME, JOB_CLASS, CHG_HOUR and HOURS. (When Ms Smithson has just been assigned to the project, she has not yet worked, so the total number of hours worked is 0.0.)
Each time another employee is assigned to a project, some data entries (such as PROJ_NAME, EMP_NAME and CHG_HOUR) are unnecessarily repeated. Imagine the data entry chore when 200 or 300 table entries must be made! Note that the entry of the employee number should be sufficient to identify Darlene M. Smithson, her job description and her hourly charge. Because there is only one person identified by the number 112, that person's characteristics (name, job classification and so on) should not have to be typed in each time the main file is updated. Unfortunately, the structure displayed in Figure 7.1 does not make allowances for that possibility.
The data redundancy evident in Figure 7.1 leads to wasted disk space. What's more, data redundancy produces data anomalies. For example, suppose the data entry clerk had entered the data entry as:
15 Evergeen 112 Darla Smithson DCS Analyst 36.30 0.0
At first glance, the data entry appears to be correct. But is Evergeen the same project as Evergreen? And is DCS Analyst supposed to be DSS Analyst? Is Darla Smithson the same person as Darlene M. Smithson? Such confusion is a data integrity problem that was caused because the data entry failed to conform to the rule that all copies of redundant data must be identical.
The possibility of introducing data integrity problems caused by data redundancy must be considered when a database is designed. The relational database environment is especially well suited to helping the designer overcome those problems.
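The Darla Smithson entry can be turned into a small experiment. The sketch below, which assumes an in-memory SQLite database and a lowercase rpt_format table invented for illustration, stores two correct rows for employee 112 alongside the mistyped row and then asks which employee numbers carry more than one name or job class.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """CREATE TABLE rpt_format (
           proj_num INTEGER, proj_name TEXT, emp_num INTEGER, emp_name TEXT,
           job_class TEXT, chg_hour REAL, hours REAL)"""
)
rows = [
    (18, "Amber Wave", 112, "Darlene M. Smithson", "DSS Analyst", 36.30, 44.0),
    (25, "Starflight", 112, "Darlene M. Smithson", "DSS Analyst", 36.30, 41.4),
    # The mistyped entry from the text: same employee number, different spellings.
    (15, "Evergeen", 112, "Darla Smithson", "DCS Analyst", 36.30, 0.0),
]
conn.executemany("INSERT INTO rpt_format VALUES (?, ?, ?, ?, ?, ?, ?)", rows)

# One employee number should map to exactly one name and one job class.
conflicts = conn.execute(
    """SELECT emp_num,
              COUNT(DISTINCT emp_name)  AS names,
              COUNT(DISTINCT job_class) AS classes
       FROM rpt_format
       GROUP BY emp_num
       HAVING COUNT(DISTINCT emp_name) > 1 OR COUNT(DISTINCT job_class) > 1"""
).fetchall()
print(conflicts)
```

Nothing in the flat table structure rejects the third row. The inconsistency only surfaces when someone audits the data, which is the cost the text describes.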
NOTE
Remember that the naming convention makes it easy to see what each attribute stands for and what its likely origin is. For example, PROJ_NAME uses the prefix PROJ to indicate that the attribute is associated with the PROJECT table, while the NAME component is self-documenting, too. However, keep in mind that name length is also an issue, especially in the prefix designation. For that reason, the prefix CHG was used rather than CHARGE. (Given the database's context, it is not likely that that prefix will be misunderstood.)

7.3 THE NORMALISATION PROCESS
In this section, you learn how to use normalisation to produce a set of normalised tables to store the data that will be used to generate the required information. The objective is to create tables that have the following characteristics:
Each table represents a single subject. For example, a course table will contain only data that directly pertains to courses. Similarly, a student table will contain only student data.
No data item will be unnecessarily stored in more than one table. The reason for this requirement is to ensure that the data are updated in only one place.
All attributes in a table are dependent on the primary key, the entire primary key and nothing but the primary key.
To accomplish the objective, the normalisation process takes you through the steps that lead to successively higher normal forms. The most common normal forms and their basic characteristics are listed in Table 7.2. You will learn the details of these normal forms in the indicated sections.

TABLE 7.2 Normal forms

Normal Form                     Characteristic                                               Section
First normal form (1NF)         Table format; no repeating groups and PK identified          7.3.1
Second normal form (2NF)        1NF and no partial dependencies                              7.3.2
Third normal form (3NF)         2NF and no transitive dependencies                           7.3.3
Boyce-Codd normal form (BCNF)   Every determinant is a candidate key (special case of 3NF)   7.6.1
Fourth normal form (4NF)        3NF and no independent multivalued dependencies              7.6.2
Even higher-level normal forms exist. However, normal forms such as the fifth normal form (5NF) and domain-key normal form (DKNF) are not likely to be encountered in a business environment and are mainly of theoretical interest. Some very specialised applications, such as statistical research, may require normalisation beyond 4NF, but those applications fall outside the scope of most business operations. Since this book focuses on practical applications of database techniques, the higher-level normal forms are not covered.
7.3.1 Conversion To First Normal Form
Because the relational model views data as part of a table or collection of tables in which all key values must be identified, the data depicted in Figure 7.1 might not be stored as shown. Note that Figure 7.1 contains what is known as repeating groups. A repeating group derives its name from the fact that a group of multiple entries of the same type can exist for any single key attribute occurrence. In Figure 7.1, note that each project number (PROJ_NUM) can reference a group of related data entries. For example, the Evergreen project (PROJ_NUM = 15) is associated with five data entries, one for each person working on the project. Those entries are related because they have identical PROJ_NUM values. Each time another person works on the Evergreen project, a new record is entered and the number of entries in the repeating group grows by one.
A relational table must not contain repeating groups. The existence of repeating groups provides evidence that the RPT_FORMAT table in Figure 7.1 fails to meet even the lowest normal form requirements, thus reflecting data redundancies.
Normalising the table structure will reduce the data redundancies. If repeating groups do exist, they must be eliminated by making sure that each row defines a single entity. In addition, the dependencies must be identified to diagnose the normal form. Identification of the normal form will let you know where you are in the normalisation process. The normalisation process starts with a simple three-step procedure.
Step 1: Eliminate the Repeating Groups
Start by presenting the data in a tabular format, where each cell has a single value and there are no repeating groups. To eliminate the repeating groups, eliminate the nulls by making sure that each repeating group attribute contains an appropriate data value. This change converts the table in Figure 7.1 to 1NF in Figure 7.2.

FIGURE 7.2 A table in first normal form
Database name: Ch07_ConstructCo
Table name: DATA_ORG_1NF

PROJ_NUM  PROJ_NAME     EMP_NUM  EMP_NAME               JOB_CLASS              CHG_HOUR  HOURS
15        Evergreen     103      Mzwandile E. Baloyi    Elect. Engineer        67.55     23.80
15        Evergreen     101      John G. News           Database Designer      82.95     19.40
15        Evergreen     105      Alice K. Johnson *     Database Designer      82.95     35.70
15        Evergreen     106      William Smithfield     Programmer             26.66     12.60
15        Evergreen     102      Kavyara H. Moonsamy    Systems Analyst        76.43     23.80
18        Amber Wave    114      Annelise Jones         Applications Designer  38.00     24.60
18        Amber Wave    118      James J. Frommer       General Support        14.50     45.30
18        Amber Wave    104      Noxolo K. Maseki *     Systems Analyst        76.43     32.40
18        Amber Wave    112      Darlene M. Smithson    DSS Analyst            36.30     44.00
22        Rolling Tide  105      Alice K. Johnson       Database Designer      82.95     64.70
22        Rolling Tide  104      Noxolo K. Maseki *     Systems Analyst        76.43     48.40
22        Rolling Tide  113      Delbert K. Joenbrood   Applications Designer  38.00     23.60
22        Rolling Tide  111      Geoff B. Wabash        Clerical Support       21.23     22.00
22        Rolling Tide  106      William Smithfield     Programmer             28.24     12.80
25        Starflight    107      Maria D. Alonzo        Programmer             28.24     24.60
25        Starflight    115      Travis B. Bawangi      Systems Analyst        76.43     45.80
25        Starflight    101      John G. News *         Database Designer      82.95     56.30
25        Starflight    114      Annelise Jones         Applications Designer  38.00     33.10
25        Starflight    108      Krishshanth B. Khan    Systems Analyst        76.43     23.60
25        Starflight    118      James J. Frommer       General Support        14.50     30.50
25        Starflight    112      Darlene M. Smithson    DSS Analyst            36.30     41.40

Step 2: Identify the Primary Key
The layout in Figure 7.2 represents more than a mere cosmetic change. Even a casual observer will note that PROJ_NUM is not an adequate primary key because the project number does not uniquely identify all of the remaining entity (row) attributes. For example, the PROJ_NUM value 15 can identify any one of five employees. To maintain a proper primary key that will uniquely identify any attribute value, the new key must be composed of a combination of PROJ_NUM and EMP_NUM. This is called a composite primary key. For example, using the data shown in Figure 7.2, if you know that PROJ_NUM = 15 and EMP_NUM = 103, the entries for the attributes PROJ_NAME, EMP_NAME, JOB_CLASS, CHG_HOUR and HOURS can only be Evergreen, Mzwandile E. Baloyi, Elect. Engineer, 67.55 and 23.8, respectively.

Step 3: Identify All Dependencies
The identification of the PK in Step 2 means that you have already identified the following dependency:
PROJ_NUM, EMP_NUM → PROJ_NAME, EMP_NAME, JOB_CLASS, CHG_HOUR, HOURS
That is, the PROJ_NAME, EMP_NAME, JOB_CLASS, CHG_HOUR and HOURS values are all dependent on, and hence determined by, the combination of PROJ_NUM and EMP_NUM. There are additional dependencies. For example, the project number identifies (determines) the project name. In other words, the project name is dependent on the project number. You can write that dependency as:
PROJ_NUM → PROJ_NAME
Also, if you know an employee number, you also know that employee's name, that employee's job classification and that employee's charge per hour. Therefore, you can identify the dependency shown next:
EMP_NUM → EMP_NAME, JOB_CLASS, CHG_HOUR
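Step 1 and Step 2 above can be sketched in a few lines of Python. The sketch assumes that nulls are represented by None and that rows arrive in report order, so the last project seen belongs to every following row. The function name fill_repeating_groups and the shortened four-column rows are illustrative only.

```python
# Rows shaped like Figure 7.1: PROJ_NUM and PROJ_NAME appear only on the
# first row of each repeating group; None stands for a null.
rpt_format = [
    (15, "Evergreen", 103, "Mzwandile E. Baloyi"),
    (None, None, 101, "John G. News"),
    (None, None, 105, "Alice K. Johnson"),
    (18, "Amber Wave", 114, "Annelise Jones"),
    (None, None, 118, "James J. Frommer"),
]


def fill_repeating_groups(rows):
    """Copy the last seen project number and name into rows where they are null."""
    filled = []
    proj_num = proj_name = None
    for num, name, emp_num, emp_name in rows:
        if num is not None:
            proj_num, proj_name = num, name
        filled.append((proj_num, proj_name, emp_num, emp_name))
    return filled


data_org_1nf = fill_repeating_groups(rpt_format)

# Step 2: the combination (PROJ_NUM, EMP_NUM) should identify each row uniquely.
keys = [(proj_num, emp_num) for proj_num, _, emp_num, _ in data_org_1nf]
print(len(keys) == len(set(keys)))
```

Once the nulls are gone, every row carries a complete (PROJ_NUM, EMP_NUM) pair, which is what makes the composite primary key usable.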
However, given the previous dependency components, you can see that knowing the job classification means knowing the charge per hour for that job classification. In other words, you can identify one last dependency:
JOB_CLASS → CHG_HOUR
The dependencies you have just examined can also be depicted with the help of the diagram shown in Figure 7.3. Because such a diagram depicts all dependencies found within a given table structure, it is known as a dependency diagram. Dependency diagrams are very helpful in getting a bird's-eye view of all of the relationships among a table's attributes, and their use makes it less likely that you will overlook an important dependency.
The diagram below shows two types of dependencies: partial and transitive. Partial dependencies are where a non-key attribute can be determined by part of the primary key. Transitive dependencies are when one non-key attribute determines another non-key attribute.

FIGURE 7.3 First normal form (1NF) dependency diagram
Attributes shown: PROJ_NUM, PROJ_NAME, EMP_NUM, EMP_NAME, JOB_CLASS, CHG_HOUR, HOURS
(Arrows below the attributes mark the partial dependencies and the transitive dependency.)
1NF (PROJ_NUM, EMP_NUM, PROJ_NAME, EMP_NAME, JOB_CLASS, CHG_HOUR, HOURS)
PARTIAL DEPENDENCIES: (PROJ_NUM → PROJ_NAME) (EMP_NUM → EMP_NAME, JOB_CLASS, CHG_HOUR)
TRANSITIVE DEPENDENCY: (JOB_CLASS → CHG_HOUR)
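A dependency such as PROJ_NUM → PROJ_NAME can be tested against sample rows. The sketch below is a hedged illustration: the helper name holds and the six-row SAMPLE, taken from Figure 7.2, are assumptions, and a check of this kind can only refute a dependency in the data at hand, not prove that the business rule exists.

```python
COLUMNS = ("PROJ_NUM", "PROJ_NAME", "EMP_NUM", "EMP_NAME",
           "JOB_CLASS", "CHG_HOUR", "HOURS")

# A few rows copied from Figure 7.2.
SAMPLE = [
    (15, "Evergreen", 103, "Mzwandile E. Baloyi", "Elect. Engineer", 67.55, 23.8),
    (15, "Evergreen", 101, "John G. News", "Database Designer", 82.95, 19.4),
    (15, "Evergreen", 105, "Alice K. Johnson", "Database Designer", 82.95, 35.7),
    (22, "Rolling Tide", 105, "Alice K. Johnson", "Database Designer", 82.95, 64.7),
    (18, "Amber Wave", 112, "Darlene M. Smithson", "DSS Analyst", 36.30, 44.0),
    (25, "Starflight", 112, "Darlene M. Smithson", "DSS Analyst", 36.30, 41.4),
]


def holds(rows, determinant, dependent):
    """Return True if equal determinant values always pair with equal dependent values."""
    seen = {}
    for row in rows:
        record = dict(zip(COLUMNS, row))
        lhs = tuple(record[c] for c in determinant)
        rhs = tuple(record[c] for c in dependent)
        # setdefault stores the first rhs seen for lhs and returns the stored one.
        if seen.setdefault(lhs, rhs) != rhs:
            return False
    return True


print(holds(SAMPLE, ["PROJ_NUM"], ["PROJ_NAME"]))
print(holds(SAMPLE, ["PROJ_NUM"], ["EMP_NAME"]))
```

The first call checks a dependency listed in Figure 7.3. The second checks one that the text rules out, because project 15 is associated with several employees.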
As you examine Figure 7.3, note the following dependency diagram features:
1 The primary key attributes are bold, underlined and shaded in a different colour.
2 The arrows above the attributes indicate all desirable dependencies, that is, dependencies that are based on the primary key. In this case, note that the entity's attributes are dependent on the combination of PROJ_NUM and EMP_NUM.
3 The arrows below the dependency diagram indicate less desirable dependencies. Two types of such dependencies exist:
a Partial dependencies. You need to know only the PROJ_NUM to determine the PROJ_NAME; that is, the PROJ_NAME is dependent on only part of the primary key. And you need to know only the EMP_NUM to find the EMP_NAME, the JOB_CLASS and the CHG_HOUR. A dependency based on only a part of a composite primary key is called a partial dependency.
b Transitive dependencies. As you examine Figure 7.3, note that CHG_HOUR is dependent on JOB_CLASS. Because neither CHG_HOUR nor JOB_CLASS is a prime attribute, that is, neither attribute is at least part of a key, the condition is known as a transitive dependency. (In other words, a transitive dependency is a dependency of one non-prime attribute on another non-prime attribute.) The problem with transitive dependencies is that they still yield data anomalies.
NOTE
Partial and transitive dependencies are important concepts when performing normalisation. To recap:
Partial dependency refers to attributes that are only dependent on part of the composite primary key.
Transitive dependency is when an attribute is dependent on any other attribute except the primary key.
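The distinction between full, partial and transitive dependencies can be checked mechanically. The following is a minimal Python sketch, not part of the original text: the function name classify_dependency is hypothetical, the dependency list is read off the description of Figure 7.3, and the sketch assumes that the composite primary key (PROJ_NUM, EMP_NUM) is the table's only candidate key.

```python
def classify_dependency(determinant, dependent, primary_key):
    """Classify one functional dependency against a (composite) primary key."""
    det, pk = set(determinant), set(primary_key)
    if det == pk:
        return "full"        # desirable: depends on the whole primary key
    if det < pk:
        return "partial"     # depends on only part of the primary key
    if not (det & pk) and dependent not in pk:
        return "transitive"  # a non-prime attribute determines a non-prime attribute
    return "other"


PRIMARY_KEY = ("PROJ_NUM", "EMP_NUM")

# Dependencies of the 1NF table, as described for Figure 7.3.
DEPENDENCIES = [
    (("PROJ_NUM", "EMP_NUM"), "HOURS"),
    (("PROJ_NUM",), "PROJ_NAME"),
    (("EMP_NUM",), "EMP_NAME"),
    (("EMP_NUM",), "JOB_CLASS"),
    (("EMP_NUM",), "CHG_HOUR"),
    (("JOB_CLASS",), "CHG_HOUR"),
]

classified = {
    (det, dep): classify_dependency(det, dep, PRIMARY_KEY)
    for det, dep in DEPENDENCIES
}

for (det, dep), kind in classified.items():
    print(" + ".join(det), "->", dep, ":", kind)
```

Only the dependency on the full key is reported as "full"; the PROJ_NUM and EMP_NUM dependencies are partial, and JOB_CLASS -> CHG_HOUR is transitive.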
Note that Figure 7.3 includes the relational schema for the table in 1NF and a textual notation for each identified dependency.
All relational tables satisfy the 1NF requirements. The problem with the 1NF table structure shown in Figure 7.3 is that it contains partial dependencies, that is, dependencies based on only a part of the primary key.
While partial dependencies are sometimes used for performance reasons, they should be used with caution. (If the information requirements seem to dictate the use of partial dependencies, it is time to evaluate the need for a data warehouse design, discussed in Chapter 15, Databases for Business Intelligence.) Such caution is warranted because a table that contains partial dependencies is still subject to data redundancies and, therefore, to various anomalies. The data redundancies occur because every row entry requires duplication of data. For example, if the user makes 20 table entries for EMP_NUM = 105 during the course of a day, the EMP_NAME, JOB_CLASS and CHG_HOUR entries must be entered each time, even though the attribute values are identical for EMP_NUM = 105. Such duplication of effort is very inefficient. What's more, the duplication of effort helps create data anomalies; nothing prevents the user from typing slightly different versions of the employee name, the position or the hourly pay. For instance, the employee name for EMP_NUM = 102 might be entered as Kavyara Moonsamy or K. Moonsamy. The project name might also be entered correctly as Evergreen or misspelled as Evergeen. Such data anomalies violate the relational database's integrity and consistency rules.
NOTE
The term first normal form (1NF) describes the tabular format in which:
All of the key attributes are defined.
There are no repeating groups in the table. In other words, each row/column intersection contains one and only one value, not a set of values.
All attributes are dependent on the primary key.
7.3.2 Conversion To Second Normal Form
Fortunately, the relational database design can be improved easily by converting the database into a format known as the second normal form (2NF). The 1NF-to-2NF conversion is simple. Starting with the 1NF format displayed in Figure 7.3, you do the following:
Step 1: Write Each Key Component on a Separate Line
Write each of the composite primary key's components on a separate line; then write the original (composite) key on the last line. For example:
PROJ_NUM
EMP_NUM
PROJ_NUM EMP_NUM
Each component will become the key in a new table. In other words, the original table is now divided into three tables (PROJECT, EMPLOYEE and ASSIGNMENT).
Step 2: Assign Corresponding Dependent Attributes
Use Figure 7.3 to determine those attributes that are dependent on other attributes. The dependencies for the original key components are found by examining the arrows below the dependency diagram shown in Figure 7.3. In other words, the three new tables (PROJECT, EMPLOYEE and ASSIGNMENT) are described by the following relational schemas:
PROJECT (PROJ_NUM, PROJ_NAME)
EMPLOYEE (EMP_NUM, EMP_NAME, JOB_CLASS, CHG_HOUR)
ASSIGNMENT (PROJ_NUM, EMP_NUM, ASSIGN_HOURS)
As the number of hours spent on each project by each employee is dependent on both PROJ_NUM and EMP_NUM in the ASSIGNMENT table, you place those hours in the ASSIGNMENT table as ASSIGN_HOURS.
NOTE
The ASSIGNMENT table contains a composite primary key composed of the attributes PROJ_NUM and EMP_NUM. Any attribute that is at least part of a key is known as a prime attribute or a key attribute. Therefore, both PROJ_NUM and EMP_NUM are prime (or key) attributes. Conversely, a non-prime attribute, or a non-key attribute, is not even part of a key.
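The three 2NF schemas can be tried out directly. The sketch below is only an illustration, using Python's built-in sqlite3 module and an in-memory database; the column types, the sample hours and the renamed project ('Evergreen II') are assumptions made for the example and do not come from the text.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE PROJECT (
    PROJ_NUM  INTEGER PRIMARY KEY,
    PROJ_NAME TEXT
);
CREATE TABLE EMPLOYEE (
    EMP_NUM   INTEGER PRIMARY KEY,
    EMP_NAME  TEXT,
    JOB_CLASS TEXT,
    CHG_HOUR  REAL
);
CREATE TABLE ASSIGNMENT (
    PROJ_NUM     INTEGER REFERENCES PROJECT (PROJ_NUM),
    EMP_NUM      INTEGER REFERENCES EMPLOYEE (EMP_NUM),
    ASSIGN_HOURS REAL,
    PRIMARY KEY (PROJ_NUM, EMP_NUM)   -- the original composite key
);
""")

conn.execute("INSERT INTO PROJECT VALUES (15, 'Evergreen')")
conn.executemany(
    "INSERT INTO EMPLOYEE VALUES (?, ?, ?, ?)",
    [(101, "John G. News", "Database Designer", 82.95),
     (105, "Alice K. Johnson", "Database Designer", 82.95)],
)
conn.executemany(
    "INSERT INTO ASSIGNMENT VALUES (?, ?, ?)",
    [(15, 101, 3.6), (15, 105, 2.0)],   # illustrative hours
)

# The project name is stored in one row only, so a change touches one row
# instead of every assignment entry for that project.
conn.execute("UPDATE PROJECT SET PROJ_NAME = 'Evergreen II' WHERE PROJ_NUM = 15")

report = conn.execute("""
    SELECT P.PROJ_NAME, E.EMP_NAME, A.ASSIGN_HOURS
    FROM ASSIGNMENT A
    JOIN PROJECT  P ON P.PROJ_NUM = A.PROJ_NUM
    JOIN EMPLOYEE E ON E.EMP_NUM  = A.EMP_NUM
    ORDER BY E.EMP_NUM
""").fetchall()
print(report)
```

Note that EMPLOYEE still repeats CHG_HOUR for every employee in the same job class, which is the transitive dependency that remains in 2NF.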
The results of Steps 1 and 2 are displayed in Figure 7.4. At this point, most of the anomalies discussed earlier have been eliminated. For example, if you now want to add/change/delete a PROJECT record, you need to go only to the PROJECT table and add/change/delete only one row.
A partial dependency can exist only when a table's primary key is composed of several attributes, so a table whose primary key consists of only a single attribute is automatically in 2NF when it is in 1NF.
Figure 7.4 still shows a transitive dependency, which can generate anomalies. For example, if the charge per hour changes for a job classification held by many employees, that change must be made for each of those employees. If you forget to update some of the employee records that are affected by the charge per hour change, different employees with the same job description will generate different hourly charges.
FIGURE 7.4 Second normal form (2NF) conversion results
Table name: PROJECT
PROJECT (PROJ_NUM, PROJ_NAME)
Table name: EMPLOYEE
EMPLOYEE (EMP_NUM, EMP_NAME, JOB_CLASS, CHG_HOUR)
TRANSITIVE DEPENDENCY (JOB_CLASS → CHG_HOUR)
Table name: ASSIGNMENT
ASSIGNMENT (PROJ_NUM, EMP_NUM, ASSIGN_HOURS)
NOTE
A table is in second normal form (2NF) when:
It is in 1NF.
and
It includes no partial dependencies; that is, no attribute is dependent on only a portion of the primary key.
(It is still possible for a table in 2NF to exhibit transitive dependency; that is, one or more attributes may be functionally dependent on non-key attributes.)
7.3.3 Conversion To Third Normal Form
The data anomalies created by the database organisation shown in Figure 7.4 are easily eliminated by completing the following three steps:
Step 1: Identify Each New Determinant
For every transitive dependency, write its determinant as a PK for a new table. (A determinant is any attribute whose value determines other values within a row.) If you have three different transitive dependencies, you will have three different determinants. Figure 7.4 shows a table that contains only one transitive dependency. Therefore, write the determinant for this transitive dependency as:
JOB_CLASS
Step 2: Identify the Dependent Attributes
Identify the attributes that are dependent on each determinant identified in Step 1 and identify the dependency. In this case, you write:
JOB_CLASS → CHG_HOUR
Name the table to reflect its contents and function. In this case, JOB seems appropriate.
Step 3: Remove the Dependent Attributes from Transitive Dependencies
Eliminate all dependent attributes in the transitive relationship(s) from each of the tables that have such a transitive relationship. In this example, eliminate CHG_HOUR from the EMPLOYEE table shown in Figure 7.4 to leave the EMPLOYEE table dependency definition as:
EMP_NUM → EMP_NAME, JOB_CLASS
Note that the JOB_CLASS remains in the EMPLOYEE table to serve as the FK.
Draw a new dependency diagram to show all of the tables you have defined in Steps 1-3. Check the new tables as well as the tables you modified in Step 3 to make sure that each table has a determinant and that no table contains inappropriate dependencies.
When you have completed Steps 1-3, you will see the results in Figure 7.5. (The usual procedure is to complete Steps 1-3 by simply drawing the revisions as you make them.) In other words, after the 3NF conversion has been completed, your database contains four tables:
PROJECT (PROJ_NUM, PROJ_NAME)
EMPLOYEE (EMP_NUM, EMP_NAME, JOB_CLASS)
JOB (JOB_CLASS, CHG_HOUR)
ASSIGNMENT (PROJ_NUM, EMP_NUM, ASSIGN_HOURS)
Note that this conversion has eliminated the original EMPLOYEE table's transitive dependency; the tables are now said to be in third normal form (3NF).
NOTE
A table is in third normal form (3NF) when:
It is in 2NF.
and
It contains no transitive dependencies.
FIGURE 7.5 Third normal form (3NF) conversion results
Table name: PROJECT
PROJECT (PROJ_NUM, PROJ_NAME)
Table name: EMPLOYEE
EMPLOYEE (EMP_NUM, EMP_NAME, JOB_CLASS)
Table name: JOB
JOB (JOB_CLASS, CHG_HOUR)
Table name: ASSIGNMENT
ASSIGNMENT (PROJ_NUM, EMP_NUM, ASSIGN_HOURS)
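The effect of removing the transitive dependency can be seen in a small experiment. This is a hedged sketch using Python's sqlite3 module; the column types are assumptions, and the new rate of 80.00 is a made-up value used only to show that a single update in JOB reaches every employee in that job class.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE JOB (
    JOB_CLASS TEXT PRIMARY KEY,   -- the determinant from Step 1
    CHG_HOUR  REAL                -- the dependent attribute from Step 2
);
CREATE TABLE EMPLOYEE (
    EMP_NUM   INTEGER PRIMARY KEY,
    EMP_NAME  TEXT,
    JOB_CLASS TEXT REFERENCES JOB (JOB_CLASS)   -- stays behind as the FK
);
""")

conn.execute("INSERT INTO JOB VALUES ('Systems Analyst', 76.43)")
conn.executemany(
    "INSERT INTO EMPLOYEE VALUES (?, ?, ?)",
    [(102, "Kavyara Moonsamy", "Systems Analyst"),
     (104, "Noxolo Maseki", "Systems Analyst")],
)

# One row changes; no employee record can be forgotten.
conn.execute("UPDATE JOB SET CHG_HOUR = 80.00 WHERE JOB_CLASS = 'Systems Analyst'")

rates = conn.execute("""
    SELECT E.EMP_NUM, J.CHG_HOUR
    FROM EMPLOYEE E JOIN JOB J ON J.JOB_CLASS = E.JOB_CLASS
    ORDER BY E.EMP_NUM
""").fetchall()
print(rates)
```

In the 2NF design the same change would have required one update per affected EMPLOYEE row, which is where the inconsistent hourly charges described earlier come from.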
7.4 IMPROVING THE DESIGN
The table structures are cleaned up to eliminate the troublesome partial and transitive dependencies. You can now focus on improving the design's ability to provide information and on enhancing its operational characteristics. In the next few paragraphs, you will learn about the different types of issues you need to address to produce a good normalised set of tables. Please note that, due to space issues, each section presents just one example; the designer must apply the principle to all remaining tables in the design. Remember that normalisation cannot, by itself, be relied on to make good designs. Instead, normalisation is valuable because its use helps eliminate data redundancies. At a minimum, all designs should be in third normal form, unless intentionally left in lower normal forms for performance reasons, as discussed later.
7.4.1 Evaluate PK Assignments
As the number of employees grows, a JOB_CLASS value must be entered each time a new employee is entered into the EMPLOYEE table. Unfortunately, it is too easy to make data-entry errors that lead to referential integrity violations. For example, the entry of DB Designer, rather than Database Designer, into the JOB_CLASS attribute in the EMPLOYEE table will trigger such a violation. Therefore, it would be better to add a JOB_CODE attribute to create a unique identifier. The addition of a JOB_CODE attribute produces the dependency:
JOB_CODE → JOB_CLASS, CHG_HOUR
If you assume that the JOB_CODE is a proper primary key, this new attribute does produce the dependency:
JOB_CLASS → CHG_HOUR
A transitive dependency exists because a non-key attribute (the JOB_CLASS) determines the value of another non-key attribute (the CHG_HOUR). However, that transitive dependency is an easy price to pay, because the presence of JOB_CODE greatly decreases the likelihood of referential integrity violations. Note that the new JOB table now has two candidate keys (JOB_CODE and JOB_CLASS). In this case, JOB_CODE is the chosen primary key as well as a surrogate key. A surrogate key is an artificial PK introduced by the designer with the purpose of simplifying the assignment of primary keys to tables. Surrogate keys are usually numeric, they are often automatically generated by the DBMS, they are free of semantic content (they have no special meaning), and they are usually hidden from the end users. You learnt about PK characteristics and assignment in Chapter 6, Data Modelling Advanced Concepts.
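How a short numeric JOB_CODE interacts with referential integrity can be sketched as follows. The example assumes SQLite accessed through Python's sqlite3 module (foreign key enforcement in SQLite must be switched on with PRAGMA foreign_keys); the invalid code 999 is a made-up value standing in for a data-entry error.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves FK checks off by default

conn.execute("""
    CREATE TABLE JOB (
        JOB_CODE  INTEGER PRIMARY KEY,   -- surrogate key chosen as the PK
        JOB_CLASS TEXT,                  -- the other candidate key
        CHG_HOUR  REAL
    )
""")
conn.execute("""
    CREATE TABLE EMPLOYEE (
        EMP_NUM  INTEGER PRIMARY KEY,
        EMP_NAME TEXT,
        JOB_CODE INTEGER NOT NULL REFERENCES JOB (JOB_CODE)
    )
""")

conn.execute("INSERT INTO JOB VALUES (502, 'Database Designer', 82.95)")
conn.execute("INSERT INTO EMPLOYEE VALUES (105, 'Alice K. Johnson', 502)")

# A mistyped code has no matching JOB row, so the DBMS refuses the entry.
try:
    conn.execute("INSERT INTO EMPLOYEE VALUES (101, 'John G. News', 999)")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

employee_count = conn.execute("SELECT COUNT(*) FROM EMPLOYEE").fetchone()[0]
print(rejected, employee_count)
```

The design choice illustrated here is that the end user enters (or picks) a short code rather than retyping a descriptive string such as Database Designer, so variant spellings never reach the EMPLOYEE table.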
7.4.2 Evaluate Naming Conventions
It is best to adhere to the naming conventions outlined in Chapter 2, Data Models. Therefore, CHG_HOUR will be changed to JOB_CHG_HOUR to indicate its association with the JOB table. In addition, the attribute name JOB_CLASS does not quite describe entries such as Systems Analyst, Database Designer and so on; the label JOB_DESCRIPTION fits the entries better. Also, you may have noticed that HOURS was changed to ASSIGN_HOURS in the conversion from 1NF to 2NF. That change lets you associate the hours worked with the ASSIGNMENT table.
7.4.3 Refine Attribute Atomicity
It generally is good practice to pay attention to the atomicity requirement. (An atomic attribute is one that cannot be usefully further subdivided. Such an attribute is said to display atomicity.) Clearly, the use of the EMP_NAME in the EMPLOYEE table is not atomic because EMP_NAME can be decomposed into a last name, a first name and an initial. By improving the degree of atomicity, you also gain querying flexibility. For example, if you use EMP_LNAME, EMP_FNAME and EMP_INITIAL, you can easily generate phone lists by sorting last names, first names and initials. Such a task would be very difficult if the name components were within a single attribute. In general, designers prefer to use simple, single-valued attributes as indicated by the business rules and processing requirements.
7.4.4 Identify New Attributes
If the EMPLOYEE table were used in a real-world environment, several other attributes would have to be added. For example, year-to-date gross salary payments and UIF (Unemployment Insurance Fund) payments would be desirable. An employee hire date attribute (EMP_HIREDATE) could be used to track an employee's length of employment and serve as a basis for awarding bonuses to long-term employees and for other morale-enhancing measures. The same principle must be applied to all other tables in your design.
7.4.5 Identify New Relationships
The system's ability to supply detailed information about each project's manager is ensured by using EMP_NUM as a foreign key in PROJECT. That action ensures that you can access the details of each PROJECT's manager data without producing unnecessary and undesirable data duplication. The designer must take care to place the right attributes in the right tables by using normalisation principles.
7.4.6 Refine Primary Keys as Required for Data Granularity
Granularity refers to the level of detail represented by the values stored in a table's row. Data stored at their lowest level of granularity are said to be atomic data. In Figure 7.5, the ASSIGNMENT table in 3NF uses the ASSIGN_HOURS attribute to represent the hours worked by a given employee on a given project. However, are those values recorded at their lowest level of granularity? In other words, do the ASSIGN_HOURS represent the hourly total, daily total, weekly total, monthly total or yearly total? Clearly, ASSIGN_HOURS requires more careful definition. In this case, the relevant question would be as follows: For what time frame (hour, day, week, month and so on) do you want to record the ASSIGN_HOURS data?
For example, assume that the combination of EMP_NUM and PROJ_NUM is an acceptable (composite) primary key in the ASSIGNMENT table. That primary key is useful in representing only the total number of hours an employee worked on a project since its start. Using a surrogate primary key such as ASSIGN_NUM provides lower granularity and yields greater flexibility. For example, assume that the EMP_NUM and PROJ_NUM combination is used as the primary key and then an employee makes two hours worked entries in the ASSIGNMENT table. That action violates the entity integrity requirement. Even if you add the ASSIGN_DATE as part of a composite PK, an entity integrity violation is still generated if any employee makes two or more entries for the same project on the same day. (The employee may have worked on the project a few hours in the morning and then worked on it again later in the day.) The same data entry yields no problems when ASSIGN_NUM is used as the primary key.
NOTE
In an ideal (database design) world, the level of desired granularity is determined at the conceptual design or the requirements gathering phase. However, as you have already seen in this chapter, many database designs involve the refinement of existing data requirements, thus triggering design modifications. In a real-world environment, changing granularity requirements may dictate changes in primary key selection. And those changes may ultimately require the use of surrogate keys.
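The entity integrity problem described above is easy to reproduce. The following sketch uses Python's sqlite3 module; the two table names (ASSIGNMENT_COMPOSITE and ASSIGNMENT_SURROGATE) are hypothetical labels for the two candidate designs, and the two same-day entries are illustrative values.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE ASSIGNMENT_COMPOSITE (
    PROJ_NUM     INTEGER,
    EMP_NUM      INTEGER,
    ASSIGN_DATE  TEXT,
    ASSIGN_HOURS REAL,
    PRIMARY KEY (PROJ_NUM, EMP_NUM, ASSIGN_DATE)
);
CREATE TABLE ASSIGNMENT_SURROGATE (
    ASSIGN_NUM   INTEGER PRIMARY KEY,   -- surrogate key, generated by the DBMS
    PROJ_NUM     INTEGER,
    EMP_NUM      INTEGER,
    ASSIGN_DATE  TEXT,
    ASSIGN_HOURS REAL
);
""")

# Employee 103 works on project 15 twice on the same day
# (a morning session and a later one; the hours are illustrative).
entries = [(15, 103, "2018-03-05", 1.9), (15, 103, "2018-03-05", 2.6)]

composite_rejected = False
for entry in entries:
    try:
        conn.execute("INSERT INTO ASSIGNMENT_COMPOSITE VALUES (?, ?, ?, ?)", entry)
    except sqlite3.IntegrityError:
        composite_rejected = True   # duplicate (PROJ_NUM, EMP_NUM, ASSIGN_DATE)

conn.executemany(
    "INSERT INTO ASSIGNMENT_SURROGATE "
    "(PROJ_NUM, EMP_NUM, ASSIGN_DATE, ASSIGN_HOURS) VALUES (?, ?, ?, ?)",
    entries,
)
surrogate_rows = conn.execute(
    "SELECT ASSIGN_NUM, ASSIGN_HOURS FROM ASSIGNMENT_SURROGATE ORDER BY ASSIGN_NUM"
).fetchall()
print(composite_rejected, surrogate_rows)
```

The composite-key table rejects the second entry, while the surrogate-key table stores both rows at the finer level of granularity.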
7.4.7 Maintain Historical Accuracy
Writing the job charge per hour into the ASSIGNMENT table is crucial to maintaining the historical accuracy of the data in the ASSIGNMENT table. It would be appropriate to name this attribute ASSIGN_CHG_HOUR. Although this attribute would appear to have the same value as JOB_CHG_HOUR, that is true only if the JOB_CHG_HOUR value remains forever the same. However, it is reasonable to assume that the job charge per hour will change over time. Suppose that the charges to each project were calculated (and billed) by multiplying the hours worked on the project, found in the ASSIGNMENT table, by the charge per hour found in the JOB table. Those charges would always show the current charge per hour stored in the JOB table, rather than the charge per hour that was in effect at the time of the assignment. (See Chapter 3, Section 3.6, for a more detailed discussion on how historical data is maintained within the database.)
7.4.8 Evaluate Using Derived Attributes
Finally, you can use a derived attribute in the ASSIGNMENT table to store the actual charge made to a project. That derived attribute, to be named ASSIGN_CHARGE, is the result of multiplying the ASSIGN_HOURS by the ASSIGN_CHG_HOUR. From a strictly database point of view, such derived attribute values can be calculated when they are needed to write reports or invoices. However, storing the derived attribute in the table makes it easy to write the application software to produce the desired results. Also, if many transactions must be reported and/or summarised, the availability of the derived attribute will save reporting time. (If the calculation is done at the time of data entry, it will be completed when the end user presses the Enter key, thus speeding up the process.)
The enhancements described in the preceding sections are illustrated in the tables shown in Figure 7.6. Figure 7.6 is a vast improvement over the original database design. If the application software is designed properly, the most active table (ASSIGNMENT) requires the entry of only the PROJ_NUM, EMP_NUM and ASSIGN_HOURS values.
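The ideas in Sections 7.4.7 and 7.4.8 can be combined in a short sketch. It assumes Python's sqlite3 module; the helper record_assignment is a hypothetical stand-in for the application software, the hourly rate is looked up through the employee's JOB_CODE, and the later rate of 80.00 is a made-up value used to show that earlier assignments keep the rate that was in effect when they were recorded.

```python
import sqlite3
from datetime import date

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE JOB (
    JOB_CODE INTEGER PRIMARY KEY, JOB_DESCRIPTION TEXT, JOB_CHG_HOUR REAL
);
CREATE TABLE EMPLOYEE (
    EMP_NUM INTEGER PRIMARY KEY, EMP_LNAME TEXT,
    JOB_CODE INTEGER REFERENCES JOB (JOB_CODE)
);
CREATE TABLE ASSIGNMENT (
    ASSIGN_NUM      INTEGER PRIMARY KEY,   -- counter supplied by the DBMS
    ASSIGN_DATE     TEXT,                  -- system date at entry time
    PROJ_NUM        INTEGER,
    EMP_NUM         INTEGER REFERENCES EMPLOYEE (EMP_NUM),
    ASSIGN_HOURS    REAL,
    ASSIGN_CHG_HOUR REAL,                  -- copy of the rate in effect
    ASSIGN_CHARGE   REAL                   -- derived: hours * rate
);
INSERT INTO JOB VALUES (501, 'Systems Analyst', 76.43);
INSERT INTO EMPLOYEE VALUES (104, 'Maseki', 501);
""")


def record_assignment(conn, proj_num, emp_num, hours):
    """Store an assignment; only project, employee and hours are entered."""
    rate = conn.execute(
        "SELECT J.JOB_CHG_HOUR FROM EMPLOYEE E "
        "JOIN JOB J ON J.JOB_CODE = E.JOB_CODE WHERE E.EMP_NUM = ?",
        (emp_num,),
    ).fetchone()[0]
    cursor = conn.execute(
        "INSERT INTO ASSIGNMENT (ASSIGN_DATE, PROJ_NUM, EMP_NUM, "
        "ASSIGN_HOURS, ASSIGN_CHG_HOUR, ASSIGN_CHARGE) VALUES (?, ?, ?, ?, ?, ?)",
        (date.today().isoformat(), proj_num, emp_num, hours, rate,
         round(hours * rate, 2)),
    )
    return cursor.lastrowid


record_assignment(conn, 22, 104, 2.6)
conn.execute("UPDATE JOB SET JOB_CHG_HOUR = 80.00 WHERE JOB_CODE = 501")
record_assignment(conn, 22, 104, 2.6)

history = conn.execute(
    "SELECT ASSIGN_NUM, ASSIGN_CHG_HOUR, ASSIGN_CHARGE "
    "FROM ASSIGNMENT ORDER BY ASSIGN_NUM"
).fetchall()
print(history)
```

The first row keeps 76.43 and a charge of 198.72 even after the JOB table changes, whereas a report that multiplied ASSIGN_HOURS by the current JOB_CHG_HOUR would silently re-price it.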
FIGURE 7.6 The completed database
Database name: Ch07_ConstructCo

Table name: PROJECT
PROJ_NUM  PROJ_NAME     EMP_NUM
15        Evergreen     105
18        Amber Wave    104
22        Rolling Tide  113
25        Starflight    101

Table name: JOB
JOB_CODE  JOB_DESCRIPTION        JOB_CHG_HOUR
500       Programmer             28.24
501       Systems Analyst        76.43
502       Database Designer      82.95
503       Electrical Engineer    66.76
504       Mechanical Engineer    53.64
505       Civil Engineer         44.07
506       Clerical Support       21.23
507       DSS Analyst            36.30
508       Applications Designer  38.00
509       Bio Technician         27.29
510       General Support        14.50

Table name: ASSIGNMENT
ASSIGN_NUM  ASSIGN_DATE  PROJ_NUM  EMP_NUM  ASSIGN_HOURS  ASSIGN_CHG_HOUR  ASSIGN_CHARGE
1001        04-Mar-18    15        103      2.60          67.55            175.63
1002        04-Mar-18    18        118      1.40          14.50            20.30
1003        05-Mar-18    15        101      3.60          82.95            298.62
1004        05-Mar-18    22        113      2.50          38.00            95.00
1005        05-Mar-18    15        103      1.90          67.55            128.35
1006        05-Mar-18    25        115      4.20          76.43            321.01
1007        05-Mar-18    22        105      5.20          82.95            431.34
1008        05-Mar-18    25        101      1.70          82.95            141.02
1009        05-Mar-18    15        105      2.00          82.95            165.90
1010        06-Mar-18    15        102      3.80          76.43            290.43
1011        06-Mar-18    22        104      2.60          76.43            198.72
1012        06-Mar-18    15        101      2.30          82.95            190.79
1013        06-Mar-19    25        114      1.80          38.00            68.40
1014        06-Mar-19    22        111      4.00          21.23            84.92
1015        06-Mar-19    25        114      3.40          38.00            129.20
1016        06-Mar-19    18        112      1.20          36.30            43.56
1017        06-Mar-19    18        118      2.00          14.50            29.00
1018        06-Mar-19    18        104      2.60          76.43            198.72
1019        06-Mar-19    15        103      3.00          67.55            202.65
1020        07-Mar-19    22        105      2.70          82.95            223.97
1021        08-Mar-19    25        108      4.20          76.43            321.01
1022        07-Mar-19    25        114      5.80          38.00            220.40
1023        07-Mar-19    22        106      2.40          28.24            67.78

Table name: EMPLOYEE
EMP_NUM  EMP_LNAME   EMP_FNAME    EMP_INITIAL  EMP_HIREDATE  JOB_CODE
101      News        John         G            08-Nov-10     502
102      Moonsamy    Kavyara      H            12-Jul-99     501
103      Mzwandile   Baloyi       E            01-Dec-07     503
104      Maseki      Noxolo       K            15-Nov-98     501
105      Johnson     Alice        K            01-Feb-04     502
106      Smithfield  William                   22-Jun-15     500
107      Alonzo      Maria        D            10-Oct-04     500
108      Khan        Krishshanth  B            22-Aug-99     501
109      Smith       Larry        W            18-Jul-09     501
110      Olenko      Gerald       A            11-Dec-06     505
111      Wabash      Geoff        B            04-Apr-99     506
112      Smithson    Darlene      M            23-Oct-05     507
113      Joenbrood   Delbert      K            15-Nov-04     508
114      Jones       Annelise                  20-Aug-01     508
115      Bawangi     Travis       B            25-Jan-00     501
116      Pratt       Gerald       L            05-Mar-05     510
117      Williamson  Angie        H            19-Jun-04     509
118      Frommer     James        J            04-Jan-16     510
The values for the attributes ASSIGN_NUM and ASSIGN_DATE can be generated by the application. For example, the ASSIGN_NUM can be created by using a counter, and the ASSIGN_DATE can be the system date read by the application and automatically entered into the ASSIGNMENT table. In addition, the application software can automatically insert the correct ASSIGN_CHG_HOUR value by writing the appropriate JOB table's JOB_CHG_HOUR value into the ASSIGNMENT table. (The JOB and ASSIGNMENT tables are related through the JOB_CODE.) If the JOB table's JOB_CHG_HOUR value changes, the next insertion of that value into the ASSIGNMENT table will reflect the change automatically. The table structure thus minimises the need for human intervention. In fact, if the system requires the employees to enter their own work hours, they can scan their EMP_NUM into the ASSIGNMENT table by using a magnetic card reader that enters their identity. Thus, the ASSIGNMENT table's structure can set the stage for maintaining some desired level of security.
7.5 SURROGATE KEY CONSIDERATIONS
Although this design meets the vital entity and referential integrity requirements, the designer must still address some concerns. For example, a composite primary key may become too cumbersome to use as the number of attributes grows. (It becomes difficult to create a suitable foreign key when the related table uses a composite primary key. In addition, a composite primary key makes it more difficult to write search routines.) Or a primary key attribute may simply have too much descriptive content to be usable, which is why the JOB_CODE attribute was added to the JOB table to serve as that table's primary key. When, for whatever reason, the primary key is considered to be unsuitable, designers use surrogate keys.
At the implementation level, a surrogate key is a system-defined attribute generally created and managed via the DBMS. Usually, a system-defined surrogate key is numeric, and its value is automatically incremented for each new row. For example, Microsoft Access uses an AutoNumber data type, MS SQL Server uses an identity column, and Oracle uses a sequence object.
Recall from Section 7.4 that the JOB_CODE attribute was designated to be the JOB table's primary key. However, remember that the JOB_CODE does not prevent duplicate entries from being made, as shown in the JOB table in Table 7.3.
TABLE 7.3 Duplicate entries in the job table
JOB_CODE  JOB_DESCRIPTION  JOB_CHG_HOUR
511       Programmer       26.66
512       Programmer       26.66
Clearly, the data entries in Table 7.3 are inappropriate because they duplicate existing records, yet there has been no violation of either entity integrity or referential integrity. This multiple duplicate records problem was created when the JOB_CODE attribute was added as the PK. (When the JOB_DESCRIPTION was initially designated to be the PK, the DBMS would ensure unique values for all job description entries when it was asked to enforce entity integrity. However, that option created the problems that caused use of the JOB_CODE attribute in the first place!) In any case, if JOB_CODE is to be the surrogate PK, you must still ensure the existence of unique values in the JOB_DESCRIPTION through the use of a unique index.
Note that all of the remaining tables (PROJECT, ASSIGNMENT and EMPLOYEE) are subject to the same limitations. For example, if you use the EMP_NUM attribute in the EMPLOYEE table as the PK, you can make multiple entries for the same employee. To avoid that problem, you might create a unique index for EMP_LNAME, EMP_FNAME and EMP_INITIAL. But how would you then deal with two employees named Joe B. Smith? In that case, you might use another (preferably externally defined) attribute, such as ID number, to serve as the basis for a unique index.
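The duplicate-row problem of Table 7.3 and the unique index remedy can be sketched as follows, again assuming SQLite through Python's sqlite3 module. The index name JOB_DESCRIPTION_UQ and the code 513 used for the rejected row are made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE JOB (
        JOB_CODE        INTEGER PRIMARY KEY,   -- surrogate PK
        JOB_DESCRIPTION TEXT NOT NULL,
        JOB_CHG_HOUR    REAL
    )
""")

# The surrogate key alone accepts both rows of Table 7.3.
conn.execute("INSERT INTO JOB VALUES (511, 'Programmer', 26.66)")
conn.execute("INSERT INTO JOB VALUES (512, 'Programmer', 26.66)")
duplicates_before = conn.execute(
    "SELECT COUNT(*) FROM JOB WHERE JOB_DESCRIPTION = 'Programmer'"
).fetchone()[0]

# Remove the duplicate, then protect the description with a unique index.
conn.execute("DELETE FROM JOB WHERE JOB_CODE = 512")
conn.execute("CREATE UNIQUE INDEX JOB_DESCRIPTION_UQ ON JOB (JOB_DESCRIPTION)")

try:
    conn.execute("INSERT INTO JOB VALUES (513, 'Programmer', 26.66)")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

print(duplicates_before, duplicate_rejected)
```

Entity integrity is satisfied in both cases, since every JOB_CODE is unique; it is the unique index that stops the second Programmer row.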
It is worth repeating that database design often involves trade-offs and the exercise of professional judgement. In a real-world environment, you must strike a balance between design integrity and flexibility. For example, you might design the ASSIGNMENT table to use a unique index on PROJ_NUM, EMP_NUM and ASSIGN_DATE if you want to limit an employee to only one ASSIGN_HOURS entry per date. That limitation would ensure that employees couldn't enter the same hours multiple times for any given date. Unfortunately, that limitation is likely to be undesirable from a managerial point of view. After all, if an employee works several different times on a project during any given day, it must be possible to make multiple entries for that same employee and the same project during that day. In such a case, the best solution might be to add a new externally defined attribute, such as a stub, voucher or ticket number, to ensure uniqueness. In any case, frequent data audits would be appropriate.
7.6 HIGHER-LEVEL NORMAL FORMS
Tables in 3NF will perform suitably in business transactional databases. However, there are occasions when higher normal forms are useful. In this section, you learn about a special case of 3NF, known as Boyce-Codd normal form (BCNF), and about 4NF.
7.6.1 The Boyce-Codd Normal Form (BCNF)
A table is in Boyce-Codd normal form (BCNF) when every determinant in the table is a candidate key. (Recall from Chapter 3 that a candidate key has the same characteristics as a primary key, but for some reason, it was not chosen to be the primary key.) Clearly, when a table contains only one candidate key, the 3NF and the BCNF are equivalent. Putting that proposition another way, BCNF can be violated only when the table contains more than one candidate key.
NOTE
A table is in BCNF when every determinant in the table is a candidate key.
Most designers consider the BCNF as a special case of the 3NF. In fact, if the techniques shown here are used, most tables conform to the BCNF requirements once the 3NF is reached. So, how can a table be in 3NF and not be in BCNF? To answer that question, you must keep in mind that a transitive dependency exists when one non-prime attribute is dependent on another non-prime attribute.
In other words, a table is in 3NF when it is in 2NF and there are no transitive dependencies. However, what about a case in which a non-key attribute is the determinant of a key attribute? That condition does not violate 3NF, yet it fails to meet the BCNF requirements because BCNF requires that every determinant in the table be a candidate key.
The situation just described (a 3NF table that fails to meet BCNF requirements) is shown in Figure 7.7.
FIGURE 7.7
Normalising
Database
Designs
291
Atable that is in 3NF but not in BCNF
A
As you examine
7
Figure
7.7, note these
BC
functional
D
dependencies:
A 1 B ? C, D C ?B The table
structure
shown in
Figure 7.7 has no partial dependencies,
nor does it contain transitive
dependencies. (The condition C ? Bindicates that a non-key attribute determines part ofthe primary key
and that
requirements.
dependency
is
not transitive!)
Yet the condition
FIGURE 7.8
Thus, the table
C ? B causes the table to fail to
but
not
BC
D
1NF
A
CB
D
has
Cengage deemed
Learning. that
any
All suppressed
meets the
3NF
7
Rights
Reserved. content
does
May not
not materially
be
dependency
CD
3NF
2020
7.7
meetthe BCNF requirements.
A
A
review
Figure
BCNF
Partial
Copyright
in
Decomposition to BCNF
3NF,
Editorial
structure
copied, affect
and
scanned, the
overall
or
CB
BCNF
duplicated, learning
3NF
in experience.
whole
or in Cengage
part.
Due Learning
to
electronic reserves
rights, the
right
some to
third remove
and
party additional
BCNF
content
may content
be
suppressed at
any
time
from if
the
subsequent
eBook rights
and/or restrictions
eChapter(s). require
it
To convert the table structure in Figure 7.7 into table structures that are in 3NF and in BCNF, first change the primary key to A + C. That is an appropriate action because the dependency C → B means that C is, in effect, a superset of B. At this point, the table is in 1NF because it contains a partial dependency, C → B. Next, follow the standard decomposition procedures to produce the results shown in Figure 7.8.

To see how this procedure can be applied to an actual problem, examine the sample data in Table 7.4.

TABLE 7.4 Sample data for a BCNF conversion

STU_ID   STAFF_ID   CLASS_CODE   ENROL_GRADE
125      25         21334        A
125      20         32456        C
135      20         28458        B
144      25         27563        C
144      20         32456        B

Table 7.4 reflects the following conditions:

Each CLASS_CODE identifies a class uniquely. This condition illustrates the case in which a course might generate many classes. For example, a course labelled INFS 420 might be taught in two classes (sections), each identified by a unique code to facilitate registration. Thus, the CLASS_CODE 32456 might identify INFS 420, class section 1, while the CLASS_CODE 32457 might identify INFS 420, class section 2. Or the CLASS_CODE 28458 might identify QM 362, class section 5.

A student can take many classes. Note, for example, that student 125 has taken both 21334 and 32456, earning the grades A and C, respectively.

A staff member can teach many classes, but each class is taught by only one staff member. Note that staff member 20 teaches the classes identified as 32456 and 28458.

The structure shown in Table 7.4 is reflected in Panel A of Figure 7.9:

STU_ID + STAFF_ID → CLASS_CODE, ENROL_GRADE
CLASS_CODE → STAFF_ID

Panel A of Figure 7.9 shows a structure that is clearly in 3NF, but the table represented by this structure has a major problem, because it is trying to describe two things: staff assignments to classes and student enrolment information. Such a dual-purpose table structure will cause anomalies. For example, if a different staff member is assigned to teach class 32456, two rows will require updates, thus producing an update anomaly. And if student 135 drops class 28458, information about who taught that class is lost, thus producing a deletion anomaly.

The solution to the problem is to decompose the table structure, following the procedure outlined earlier. Note that the decomposition shown in Panel B of Figure 7.9 yields two table structures that conform to both 3NF and BCNF requirements.

Remember that a table is in BCNF when every determinant in that table is a candidate key. Therefore, when a table contains only one candidate key, 3NF and BCNF are equivalent.
FIGURE 7.9 Another BCNF decomposition
Panel A: 3NF, but not BCNF
(STU_ID, STAFF_ID, CLASS_CODE, ENROL_GRADE)
Panel B: 3NF and BCNF
(STU_ID, CLASS_CODE, ENROL_GRADE)
(CLASS_CODE, STAFF_ID)
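The BCNF rule (every determinant must be a candidate key) can be checked mechanically for small structures. The following Python sketch is only an illustration and is not part of the chapter's method: the helper names closure and bcnf_violations are hypothetical, and the brute-force search over attribute subsets is practical only for a handful of attributes. It uses the two functional dependencies listed for Panel A of Figure 7.9.

```python
from itertools import combinations


def closure(attrs, fds):
    """Return every attribute functionally determined by attrs under fds."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def bcnf_violations(relation, fds):
    """Return (determinant, dependents) pairs that break BCNF in relation.

    A set X violates BCNF when it determines some other attribute of the
    relation without determining all of them, that is, when X is a
    determinant but not a superkey.
    """
    violations = []
    attrs = sorted(relation)
    for size in range(1, len(attrs)):
        for combo in combinations(attrs, size):
            x = set(combo)
            determined = closure(x, fds) & relation
            if determined != x and determined != relation:
                violations.append((x, determined - x))
    return violations


# Functional dependencies taken from the text for Table 7.4.
FDS = [
    ({"STU_ID", "STAFF_ID"}, {"CLASS_CODE", "ENROL_GRADE"}),
    ({"CLASS_CODE"}, {"STAFF_ID"}),
]

panel_a = {"STU_ID", "STAFF_ID", "CLASS_CODE", "ENROL_GRADE"}
panel_b = [
    {"STU_ID", "CLASS_CODE", "ENROL_GRADE"},
    {"CLASS_CODE", "STAFF_ID"},
]

# Panel A reports CLASS_CODE as a determinant that is not a superkey.
print(bcnf_violations(panel_a, FDS))

# Each Panel B structure reports an empty list of violations.
for table in panel_b:
    print(sorted(table), bcnf_violations(table, FDS))
```

Under these assumptions, the closure of STU_ID + CLASS_CODE covers every attribute of Panel A, so Panel A has two candidate keys, which is exactly the situation in which 3NF and BCNF can differ.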
7.6.2 Fourth Normal Form (4NF)

You may encounter poorly designed databases in which a number of multivalued attributes exist. For example, consider the possibility that an employee can have multiple assignments and can also be involved in multiple service organisations. Suppose employee 10123 does volunteer work for the Red Cross and United Way. In addition, the same employee might be assigned to work on three projects: 1, 3 and 4. Figure 7.10 illustrates how that set of facts can be recorded in very different ways.

If you examine the tables in Figure 7.10, there seems to be a problem. The attributes ORG_CODE and ASSIGN_NUM may each have many different values. That is, the tables contain two sets of independent multivalued dependencies. (One employee can have many service entries and many assignment entries.) The presence of multiple sets of independent multivalued dependencies means that, if versions 1 and 2 (tables VOLUNTEER_V1 and VOLUNTEER_V2) are implemented, the tables are likely to contain quite a few null values; in fact, the tables do not even have a viable candidate key. (The EMP_NUM values are not unique, so they cannot be PKs. No combination of the attributes in table versions 1 and 2 can be used to create a PK because some of them contain nulls.) Such a condition is not desirable, especially when there are thousands of employees, many of whom may have multiple job assignments and many service activities. Version 3 (VOLUNTEER_V3) at least has a PK, but it is composed of all of the attributes in the table. In fact, version 3 meets 3NF requirements, yet it contains many redundancies that are clearly undesirable.

The solution is to eliminate the problems caused by independent multivalued dependencies. You do this by creating the ASSIGNMENT and SERVICE_V1 tables depicted in Figure 7.11. As you examine Figure 7.11, note that neither the ASSIGNMENT nor the SERVICE_V1 table contains independent multivalued dependencies. Those tables are said to be in 4NF.

FIGURE 7.10 Tables with multivalued dependencies
Database name: Ch07_Service

Table name: VOLUNTEER_V1

EMP_NUM   ORG_CODE   ASSIGN_NUM
10123     RC         1
10123     UW         3
10123     (null)     4

Table name: VOLUNTEER_V2

EMP_NUM   ORG_CODE   ASSIGN_NUM
10123     RC         (null)
10123     UW         (null)
10123     (null)     1
10123     (null)     3
10123     (null)     4

Table name: VOLUNTEER_V3

EMP_NUM   ORG_CODE   ASSIGN_NUM
10123     RC         1
10123     RC         3
10123     UW         4

NOTE A table is in fourth normal form (4NF) when it is in 3NF and has no multiple sets of multivalued dependencies.

If you follow the proper design procedures illustrated in this book, you shouldn't encounter the previously described problem. Specifically, the discussion of 4NF is largely academic if you make sure that your tables conform to the following two rules:

1 All attributes must be dependent on the primary key, but they must be independent of each other.
2 No row may contain two or more multivalued facts about an entity.
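As a rough illustration of why the two independent multivalued facts belong in separate tables, the Python sketch below projects the VOLUNTEER_V3 rows from Figure 7.10 into one structure per fact. The variable names are hypothetical, and emp_assignment is a simplified (EMP_NUM, ASSIGN_NUM) projection rather than the full ASSIGNMENT table used in the chapter.

```python
# Rows of VOLUNTEER_V3 as (EMP_NUM, ORG_CODE, ASSIGN_NUM), from Figure 7.10.
volunteer_v3 = [
    (10123, "RC", 1),
    (10123, "RC", 3),
    (10123, "UW", 4),
]

# Project each independent multivalued fact into its own structure.
service = sorted({(emp, org) for emp, org, _ in volunteer_v3})
emp_assignment = sorted({(emp, num) for emp, _, num in volunteer_v3})

# Joining the two projections on EMP_NUM pairs every organisation with
# every assignment. The pairings stored in VOLUNTEER_V3 (for example,
# RC with assignment 1) are therefore arbitrary and carry no meaning.
rejoined = {
    (emp, org, num)
    for emp, org in service
    for emp2, num in emp_assignment
    if emp == emp2
}

print(service)          # one row per (employee, organisation) fact
print(emp_assignment)   # one row per (employee, assignment) fact
print(len(rejoined))    # number of organisation/assignment combinations
```

Each projection records a single kind of fact per row, so it satisfies the second rule, and no nulls are needed when an employee has more assignments than service organisations.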
FIGURE 7.11 A set of tables in 4NF
Database name: Ch07_Service
[Relational diagram not reproduced.]

Table name: EMPLOYEE

EMP_NUM   EMP_LNAME
10121     Rogers
10122     O'Leery
10123     Panera
10124     Johnson

Table name: PROJECT

PROJ_CODE   PROJ_NAME    PROJ_BUDGET
1           BeThere      808 363.55
2           BlueMoon     15 956 900.32
3           GreenThumb   2 555 220.24
4           GoFast       4 482 460.00
5           GoSlow       791 975.00

Table name: ORGANISATION

ORG_CODE   ORG_NAME
RC         Red Cross
UW         United Way
WF         Wildlife Fund

Table name: ASSIGNMENT

ASSIGN_NUM   EMP_NUM   PROJ_CODE
1            10123     1
2            10121     2
3            10123     3
4            10123     4
5            10121     1
6            10124     2
7            10124     3
8            10124     5

Table name: SERVICE_V1

EMP_NUM   ORG_CODE
10123     RC
10123     UW
10123     WF

7.7 NORMALISATION AND DATABASE DESIGN
The tables shown in Figure 7.6 illustrate how normalisation procedures can be used to produce good tables from poor ones. You will likely have ample opportunity to put this skill into practice when you begin to work with real-world databases. Normalisation should be part of the design process. Therefore, make sure that proposed entities meet the required normal form before the table structures are created. Keep in mind that, if you follow the design procedures discussed in Chapter 3, Relational Model Characteristics, and Chapter 5, Data Modelling with Entity Relationship Diagrams, the likelihood of data anomalies will be small. (But even the best database designers are known to make occasional mistakes that come to light during normalisation checks.) However, many of the real-world databases you encounter will have been improperly designed or burdened with anomalies if they were improperly modified during the course of time. And that means you may be asked to redesign and modify existing databases that are, in effect, anomaly traps. Therefore, you should be aware of good design principles and procedures as well as normalisation procedures.

First, an ERD is created through an iterative process. You begin by identifying relevant entities, their attributes and their relationships. Then you use the results to identify additional entities and attributes. The ERD provides the big picture, or macro view, of an organisation's data requirements and operations.

Second, normalisation focuses on the characteristics of specific entities; that is, normalisation represents a micro-view of the entities within the ERD. And as you learnt in the previous sections of this chapter, the normalisation process may yield additional entities and attributes to be incorporated into the ERD. Therefore, it is difficult to separate the normalisation process from the ER modelling process; the two techniques are used in an iterative and incremental process.
To illustrate the proper role of normalisation in the design process, let's re-examine the operations of the contracting company whose tables were normalised in the preceding sections. Those operations can be summarised by using the following business rules:

The company manages many projects.
Each project requires the services of many employees.
An employee may be assigned to several different projects.
Some employees are not assigned to a project and perform duties not specifically related to a project. Some employees are part of a labour pool, to be shared by all project teams. For example, the company's executive secretary would not be assigned to any one particular project.
Each employee has a single primary job classification. That job classification determines the hourly billing rate.
Many employees can have the same job classification. For example, the company employs more than one electrical engineer.

Given that simple description of the company's operations, two entities and their attributes are initially defined:

PROJECT (PROJ_NUM, PROJ_NAME)
EMPLOYEE (EMP_NUM, EMP_LNAME, EMP_FNAME, EMP_INITIAL, JOB_DESCRIPTION, JOB_CHG_HOUR)

Those two entities constitute the initial ERD shown in Figure 7.12.

FIGURE 7.12 Initial contracting company ERD
After creating the initial ERD shown in Figure 7.12, the normal forms are defined:

PROJECT is in 3NF and needs no modification at this point.

EMPLOYEE requires additional scrutiny. The JOB_DESCRIPTION attribute defines job classifications such as systems analyst, database designer and programmer. In turn, those classifications determine the billing rate, JOB_CHG_HOUR. Therefore, EMPLOYEE contains a transitive dependency.

The removal of EMPLOYEE's transitive dependency yields three entities:

PROJECT (PROJ_NUM, PROJ_NAME)
EMPLOYEE (EMP_NUM, EMP_LNAME, EMP_FNAME, EMP_INITIAL, JOB_CODE)
JOB (JOB_CODE, JOB_DESCRIPTION, JOB_CHG_HOUR)
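One way to picture the three entities that result from removing the transitive dependency is as table definitions. The sketch below uses Python's built-in sqlite3 module with an in-memory database. The column types, the NOT NULL constraints and the new billing rate of 85.00 are assumptions made for illustration, and the sample rows are borrowed from the contracting company data used in this chapter.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE JOB (
    JOB_CODE        INTEGER PRIMARY KEY,
    JOB_DESCRIPTION TEXT NOT NULL,
    JOB_CHG_HOUR    REAL NOT NULL
);
CREATE TABLE EMPLOYEE (
    EMP_NUM     INTEGER PRIMARY KEY,
    EMP_LNAME   TEXT NOT NULL,
    EMP_FNAME   TEXT NOT NULL,
    EMP_INITIAL TEXT,
    JOB_CODE    INTEGER NOT NULL REFERENCES JOB (JOB_CODE)
);
CREATE TABLE PROJECT (
    PROJ_NUM  INTEGER PRIMARY KEY,
    PROJ_NAME TEXT NOT NULL
);
""")

conn.executemany(
    "INSERT INTO JOB VALUES (?, ?, ?)",
    [(501, "Systems Analyst", 76.43), (502, "Database Designer", 82.95)],
)
conn.executemany(
    "INSERT INTO EMPLOYEE VALUES (?, ?, ?, ?, ?)",
    [(101, "News", "John", "G", 502), (105, "Johnson", "Alice", "K", 502)],
)

# The billing rate is stored once per job classification, so a rate
# change is a single-row update (85.00 is a made-up value).
conn.execute("UPDATE JOB SET JOB_CHG_HOUR = 85.00 WHERE JOB_CODE = 502")

rates = conn.execute("""
    SELECT E.EMP_NUM, J.JOB_CHG_HOUR
    FROM EMPLOYEE E JOIN JOB J ON E.JOB_CODE = J.JOB_CODE
    ORDER BY E.EMP_NUM
""").fetchall()
print(rates)
```

Because JOB_CHG_HOUR now depends only on JOB_CODE, both database designers pick up the new rate through the join, and the update anomaly that the transitive dependency would have caused does not arise.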
Because the normalisation process yields an additional entity (JOB), the initial ERD is modified as shown in Figure 7.13.

FIGURE 7.13 Modified contracting company ERD

To represent the *:* relationship between EMPLOYEE and PROJECT, you might think that two 1:* relationships could be used: an employee can be assigned to many projects, and each project can have many employees assigned to it (see Figure 7.14). Unfortunately, that representation yields a design that cannot be correctly implemented.

Because the *:* relationship between EMPLOYEE and PROJECT cannot be implemented, the ERD in Figure 7.14 must be modified to include the ASSIGNMENT entity to track the assignment of employees to projects, thus yielding the ERD shown in Figure 7.15. The ASSIGNMENT entity in Figure 7.15 uses the primary keys from the entities PROJECT and EMPLOYEE to serve as its foreign keys. However, note that in this implementation, the ASSIGNMENT entity's surrogate primary key is ASSIGN_NUM, to avoid the use of a composite primary key. Therefore, the enters relationship between EMPLOYEE and ASSIGNMENT and the requires relationship between PROJECT and ASSIGNMENT are in fact weak or non-identifying relationships.

NOTE In Chapter 5, Data Modelling with Entity Relationship Diagrams, it was discussed that UML notation does not make a distinction between weak and strong relationships.
FIGURE 7.14 Modified contracting company ERD

FIGURE 7.15 Final contracting company ERD
As you examine Figure 7.15, note that the ASSIGN_HOURS attribute is assigned to the composite entity named ASSIGNMENT. Because you will likely need detailed information about each project's manager, the creation of a manages relationship is useful. The manages relationship is implemented through the foreign key in PROJECT. Finally, some additional attributes may be created to improve the system's ability to generate additional information. For example, you may want to include the date on which the employee was hired (EMP_HIREDATE) to keep track of worker employment length. Based on this last modification, the model should include four entities and their attributes:

PROJECT (PROJ_NUM, PROJ_NAME, EMP_NUM)
EMPLOYEE (EMP_NUM, EMP_LNAME, EMP_FNAME, EMP_INITIAL, EMP_HIREDATE, JOB_CODE)
JOB (JOB_CODE, JOB_DESCRIPTION, JOB_CHG_HOUR)
ASSIGNMENT (ASSIGN_NUM, ASSIGN_DATE, PROJ_NUM, EMP_NUM, ASSIGN_HOURS, ASSIGN_CHG_HOUR, ASSIGN_CHARGE)

The design process is now on the right track. The ERD represents the operations accurately, and the entities now reflect their conformance to 3NF. The combination of normalisation and ER modelling yields a useful ERD, whose entities may now be translated into appropriate table structures. As you examine Figure 7.15, note that PROJECT is optional to EMPLOYEE in the manages relationship. This optionality exists because not all employees manage projects. The final database contents are shown in Figure 7.16.

FIGURE 7.16 The implemented database
Database name: Ch07_ConstructCo

Table name: EMPLOYEE

EMP_NUM   EMP_LNAME    EMP_FNAME     EMP_INITIAL   EMP_HIREDATE   JOB_CODE
101       News         John          G             08-Nov-10      502
102       Moonsamy     Kavyara       H             12-Jul-99      501
103       Baloyi       Mzwandile     E             01-Dec-07      503
104       Maseki       Noxolo        K             15-Nov-98      501
105       Johnson      Alice         K             01-Feb-04      502
106       Smithfield   William                     22-Jun-15      500
107       Alonzo       Maria         D             10-Oct-04      500
108       Khan         Krishshanth   B             22-Aug-99      501
109       Smith        Larry         W             18-Jul-09      501
110       Olenko       Gerald        A             11-Dec-06      505
111       Wabash       Geoff         B             04-Apr-99      506
112       Smithson     Darlene       M             23-Oct-05      507
113       Joenbrood    Delbert       K             15-Nov-04      508
114       Jones        Annelise                    20-Aug-01      508
115       Bawangi      Travis        B             25-Jan-00      501
116       Pratt        Gerald        L             05-Mar-05      510
117       Williamson   Angie         H             19-Jun-2004    509
118       Frommer      James         J             04-Jan-2016    510

Table name: JOB

JOB_CODE   JOB_DESCRIPTION         JOB_CHG_HOUR
500        Programmer              28.24
501        Systems Analyst         76.43
502        Database Designer       82.95
503        Electrical Engineer     66.76
504        Mechanical Engineer     53.64
505        Civil Engineer          44.07
506        Clerical Support        21.23
507        DSS Analyst             36.30
508        Applications Designer   38.00
509        Bio Technician          27.29
510        General Support         14.50

Table name: PROJECT

PROJ_NUM   PROJ_NAME      EMP_NUM
15         Evergreen      105
18         Amber Wave     104
22         Rolling Tide   113
25         Starflight     101

Table name: ASSIGNMENT

ASSIGN_NUM   ASSIGN_DATE   PROJ_NUM   EMP_NUM   ASSIGN_HOURS   ASSIGN_CHG_HOUR   ASSIGN_CHARGE
1001         04-Mar-19     15         103       2.60           67.55             175.63
1002         04-Mar-19     18         118       1.40           14.50             20.30
1003         05-Mar-19     15         101       3.60           82.95             298.62
1004         05-Mar-19     22         113       2.50           38.00             95.00
1005         05-Mar-19     15         103       1.90           67.55             128.35
1006         05-Mar-19     25         115       4.20           76.43             321.01
1007         05-Mar-19     22         105       5.20           82.95             431.34
1008         05-Mar-19     25         101       1.70           82.95             141.02
1009         05-Mar-19     15         105       2.00           82.95             165.90
1010         06-Mar-19     15         102       3.80           76.43             290.43
1011         06-Mar-19     22         104       2.60           76.43             198.72
1012         06-Mar-19     15         101       2.30           82.95             190.79
1013         06-Mar-19     25         114       1.80           38.00             68.40
1014         06-Mar-19     22         111       4.00           21.23             84.92
1015         06-Mar-19     25         114       3.40           38.00             129.20
1016         06-Mar-19     18         112       1.20           36.30             43.56
1017         06-Mar-19     18         118       2.00           14.50             29.00
1018         06-Mar-19     18         104       2.60           76.43             198.72
1019         06-Mar-19     15         103       3.00           67.55             202.65
1020         07-Mar-19     22         105       2.70           82.95             223.97
1021         08-Mar-19     25         108       4.20           76.43             321.01
1022         07-Mar-19     25         114       5.80           38.00             220.40
1023         07-Mar-19     22         106       2.40           28.24             67.78
7.8 DENORMALISATION

Although the creation of normalised relations is an important database design goal, it is only one of many such goals. Good database design also considers processing requirements. As tables are decomposed to conform to normalisation requirements, the number of database tables expands. Joining the larger number of tables takes additional input/output (I/O) operations and processing logic, thereby reducing system speed. Consequently, occasional circumstances may allow some degree of denormalisation so processing speed can be increased.

Keep in mind that the advantage of higher processing speed must be carefully weighed against the disadvantage of data anomalies. On the other hand, some anomalies are of only theoretical interest. For example, should people in a real-world database environment worry that a POST_CODE determines CITY in a CUSTOMER table whose primary key is the customer number? Is it really practical to produce a separate table for:

POST_CODE (POST_CODE, CITY)

to eliminate a transitive dependency from the CUSTOMER table? (Perhaps your answer to that question changes if you are in the business of producing mailing lists.)

The advice is simple: use common sense during the normalisation process. Normalisation purity is often difficult to sustain in the modern database environment. The conflicts between design efficiency, information requirements and processing speed are often resolved through compromises that may include denormalisation. You will also learn (in Chapter 15, Databases for Business Intelligence) that lower normalisation forms occur (and are even required) in specialised databases known as data warehouses. Such specialised databases reflect the ever-growing demand for greater scope and depth in the data on which decision-support systems increasingly rely. You will discover that the data warehouse routinely uses 2NF structures in its complex, multilevel, multisource data environment. In short, although normalisation is very important, especially in the so-called production database environment, 2NF is no longer disregarded as it once was.
Although 2NF tables cannot always be avoided, the problem of working with tables that contain partial and/or transitive dependencies in a production database environment should not be minimised. Aside from the possibility of troublesome data anomalies being created, unnormalised tables in a production database tend to suffer from these defects:

Data updates are less efficient because programs that read and update tables must deal with larger tables.

Indexing is more cumbersome. It is simply not practical to build all of the indexes required for the many attributes that may be located in a single unnormalised table.

Unnormalised tables yield no simple strategies for creating virtual tables known as views. (You will learn how to create and use views in Chapter 8, Beginning Structured Query Language.)

Remember that good design cannot be created in the application programs that use a database. Also keep in mind that unnormalised database tables often lead to various data redundancy disasters in production databases such as the ones examined thus far. In other words, use denormalisation cautiously and make sure that you can explain why the unnormalised tables are a better choice in certain circumstances than their normalised counterparts.
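A view can offer the convenience of a single wide table while the stored data stays normalised. The sketch below applies that idea to the POST_CODE example, using Python's built-in sqlite3 module. The CUSTOMER columns CUS_NUM and CUS_NAME, the view name CUSTOMER_ADDRESS and all sample rows are hypothetical and chosen only for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE POST_CODE (
    POST_CODE TEXT PRIMARY KEY,
    CITY      TEXT NOT NULL
);
CREATE TABLE CUSTOMER (
    CUS_NUM   INTEGER PRIMARY KEY,
    CUS_NAME  TEXT NOT NULL,
    POST_CODE TEXT NOT NULL REFERENCES POST_CODE (POST_CODE)
);
-- The view looks like a denormalised table, but it stores no data.
CREATE VIEW CUSTOMER_ADDRESS AS
    SELECT C.CUS_NUM, C.CUS_NAME, C.POST_CODE, P.CITY
    FROM CUSTOMER C JOIN POST_CODE P ON C.POST_CODE = P.POST_CODE;
""")

# Made-up sample rows.
conn.executemany(
    "INSERT INTO POST_CODE VALUES (?, ?)",
    [("4001", "Durban"), ("8001", "Cape Town")],
)
conn.executemany(
    "INSERT INTO CUSTOMER VALUES (?, ?, ?)",
    [(1, "Ndlovu", "4001"), (2, "Smith", "4001"), (3, "Jacobs", "8001")],
)

rows = conn.execute(
    "SELECT CUS_NAME, CITY FROM CUSTOMER_ADDRESS ORDER BY CUS_NUM"
).fetchall()
print(rows)
```

Each city name is stored once in POST_CODE, yet queries against the view read as if CITY were a column of CUSTOMER. Whether the extra join is acceptable is the processing-speed trade-off discussed in this section.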
SUMMARY

Normalisation is a technique used to design tables in which data redundancies are minimised. The first three normal forms (1NF, 2NF and 3NF) are most commonly encountered. From a structural point of view, higher normal forms are better than lower normal forms because higher normal forms yield relatively fewer data redundancies in the database. Almost all business designs use 3NF as the ideal normal form. A special, more restricted 3NF is known as Boyce-Codd normal form, or BCNF.

A table is in 1NF when all key attributes are defined and all remaining attributes are dependent on the primary key. However, a table in 1NF can still contain both partial and transitive dependencies. (A partial dependency is one in which an attribute is functionally dependent on only a part of a multi-attribute primary key. A transitive dependency is one in which one non-key attribute is functionally dependent on another non-key attribute.) A table with a single-attribute primary key cannot exhibit partial dependencies.

A table is in 2NF when it is in 1NF and contains no partial dependencies. Therefore, a 1NF table is automatically in 2NF when its primary key is based on only a single attribute, i.e. it is not a composite primary key. A table in 2NF may still contain transitive dependencies.

A table is in 3NF when it is in 2NF and contains no transitive dependencies. Given that definition of 3NF, the Boyce-Codd normal form (BCNF) is merely a special 3NF case in which all determinant keys are candidate keys. When a table has only a single candidate key, a 3NF table is automatically in BCNF.

A table that is not in 3NF may be split into new tables until all of the tables meet the 3NF requirements. The normalisation process is illustrated in Figures 7.17 to 7.19.

Normalisation is an important part, but only a part, of the design process. As entities and attributes are defined during the ER modelling process, subject each entity (set) to normalisation checks and form new entity (sets) as required. Incorporate the normalised entities into the ERD and continue the iterative ER process until all entities and their attributes are defined and all equivalent tables are in 3NF.

A table in 3NF may contain multivalued dependencies that produce either numerous null values or redundant data. Therefore, it may be necessary to convert a 3NF table to the fourth normal form (4NF) by splitting the table to remove the multivalued dependencies. Thus, a table is in 4NF when it is in 3NF and contains no multivalued dependencies.

The larger the number of tables, the greater the amount of additional I/O operations and processing logic that are required to join them. Therefore, tables are sometimes denormalised to yield less I/O in order to increase processing speed. Unfortunately, with larger tables, you pay for the increased processing speed by making the data updates less efficient, by making indexing more cumbersome, and by introducing data redundancies that are likely to yield data anomalies. In the design of production databases, use denormalisation sparingly and cautiously.

FIGURE 7.17 The initial 1NF structure
[Dependency diagram: attributes A, B, C, D, E and F, with the composite primary key A + B, one partial dependency and one transitive dependency.]
Step 1: Write each PK component on a separate line; then write the original (composite) PK on the last line.
A
B
A B

FIGURE 7.18 Identifying possible PK attributes
Step 2: Place all dependent attributes with the PK attributes identified in Step 1.
A: No attributes are dependent on A. Therefore, A does not become a PK for a new table structure.
B, C: This table is in 3NF because it is in 2NF (no partial dependencies) and it contains no transitive dependencies.
A, B, D, E, F: This table is in 2NF because it contains a transitive dependency.

FIGURE 7.19 Table structures based on the selected PKs
Step 3: Remove all transitive dependencies identified in Step 2 and retain all 3NF structures.
B, C
D, F
A, B, D, E: Attribute D is retained in this table structure to serve as the FK to the second table.
All tables are in 3NF because they are in 2NF (no partial dependencies) and they do not contain transitive dependencies.
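The three steps illustrated in Figures 7.17 to 7.19 can be sketched in a few lines of Python. This is a simplified illustration rather than a general normalisation algorithm: the function names classify and decompose_to_3nf are hypothetical, the dependencies B → C and D → F are read off the table structures in Figures 7.18 and 7.19, and the sketch assumes that every determinant is either the whole primary key, part of it, or made up of non-key attributes only.

```python
def classify(determinant, pk):
    """Label a dependency by how its determinant relates to the primary key."""
    if determinant == pk:
        return "full"
    if determinant < pk:
        return "partial"      # determinant is only part of the composite PK
    if not determinant & pk:
        return "transitive"   # determinant consists of non-key attributes
    return "other"


def decompose_to_3nf(attributes, pk, fds):
    """Split off one table per partial or transitive dependency.

    Returns a list of (primary key, attributes) pairs. The determinant of
    each removed dependency stays behind in the original table, where it
    can serve as a foreign key.
    """
    tables = []
    remaining = set(attributes)
    for determinant, dependents in fds:
        if classify(determinant, pk) in ("partial", "transitive"):
            tables.append((sorted(determinant), sorted(determinant | dependents)))
            remaining -= dependents
    tables.append((sorted(pk), sorted(remaining)))
    return tables


PK = {"A", "B"}
FDS = [
    ({"A", "B"}, {"C", "D", "E", "F"}),  # every attribute depends on the PK
    ({"B"}, {"C"}),                      # partial dependency
    ({"D"}, {"F"}),                      # transitive dependency
]

for key, attrs in decompose_to_3nf("ABCDEF", PK, FDS):
    print("PK", key, "attributes", attrs)
```

Under these assumptions the sketch yields the three structures of Figure 7.19, with D kept in the table whose key is A + B.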
KEY TERMS
atomic attribute
atomicity
Boyce-Codd normal form (BCNF)
denormalisation
dependency diagram
determinant
first normal form (1NF)
fourth normal form (4NF)
granularity
key attribute
non-key attribute
non-prime attribute
normalisation
partial dependency
prime attribute
repeating group
second normal form (2NF)
surrogate key
third normal form (3NF)
transitive dependency

Online Content Answers to selected Review Questions and Problems for this chapter are available on the online platform accompanying this book.
FURTHER READING
Ambler, S. Agile Database Techniques. John Wiley & Sons Inc, 2003.
Codd, E.F. Further Normalization of the Data Base Relational Model. In Data Base Systems. Prentice Hall, 1972.
Fagin, R. Multi-valued dependencies and a new normal form for relational databases. ACM Transactions on Database Systems, 2(3), 1977.
Fagin, R. Normal forms and relational database operators. In Proceedings of ACM SIGMOD International Conference on Management of Data, pp. 153-160, 1979.
Maier, D. The Theory of Relational Databases. Computer Science Press, NY, 1983.
REVIEW QUESTIONS
1 What is normalisation?
2 When is a table in 1NF?
3 When is a table in 2NF?
4 When is a table in 3NF?
5 When is a table in BCNF?
6 Given the dependency diagram shown in Figure Q7.1, answer Items 6a-6c.
a Identify and discuss each of the indicated dependencies.
b Create a database whose tables are at least in 2NF, showing the dependency diagrams for each table.
c Create a database whose tables are at least in 3NF, showing the dependency diagrams for each table.
7 What is a partial dependency? With which normal form is it associated?
8 Which three data anomalies are likely to be the result of data redundancy? How can such anomalies be eliminated?
FIGURE Q7.1 Dependency diagram for Question 6
Attributes: C1, C2, C3, C4, C5
9 Define and discuss the concept of transitive dependency.
10 What is a surrogate key, and when should you use one?
11 Why is a table whose primary key consists of a single attribute automatically in 2NF when it is in 1NF?
12 How would you describe a condition in which one attribute is dependent on another attribute when neither attribute is part of the primary key?
13 Suppose that someone tells you that an attribute that is part of a composite primary key is also a candidate key. How would you respond to that statement?
14 A table is in ________ normal form when it is in ___________ and there are no transitive dependencies.
15 The dependency diagram in Figure Q7.2 indicates that authors are paid royalties for each book they write for a publisher. The amount of the royalty can vary by author, by book, and by edition of the book.
FIGURE Q7.2 Book royalty dependency diagram
Attributes: ISBN, BOOK_TITLE, AUTHOR_NUM, LAST_NAME, PUBLISHER, ROYALTY, EDITION
SOURCE: Course Technology/Cengage Learning
a Based on the dependency diagram, create a database whose tables are at least in 2NF, showing the dependency diagram for each table.
b Create a database whose tables are at least in 3NF, showing the dependency diagram for each table.
16 The dependency diagram in Figure Q7.3 indicates that a patient can receive many prescriptions for one or more medicines over time. Based on the dependency diagram, create a database whose tables are at least in 2NF, showing the dependency diagram for each table.
FIGURE Q7.3 Prescription dependency diagram
Attributes: MED_NAME, PATIENT_ID, DATE, REFILLS_ALLOWED, PATIENT_NAME, DOSAGE, SHELF_LIFE
SOURCE: Course Technology/Cengage Learning
PROBLEMS
1 Using the INVOICE table structure shown below, write the relational schema, draw its dependency diagram, and identify all dependencies (including all partial and transitive dependencies). You can assume that the table does not contain repeating groups and that an invoice number references more than one product. (Hint: This table uses a composite primary key.)
Attribute Name | Sample Value | Sample Value | Sample Value | Sample Value | Sample Value
INV_NUM | 211347 | 211347 | 211347 | 211348 | 211349
PROD_NUM | AA-E3422QW | QD-300932X | RU-995748G | AA-E3422QW | GH-778345P
SALE_DATE | 15-Jan-2019 | 15-Jan-2019 | 15-Jan-2019 | 15-Jan-2019 | 16-Jan-2019
PROD_LABEL | Rotary sander | 0.25-in. drill bit | Band saw | Rotary sander | Power drill
VEND_CODE | 211 | 211 | 309 | 211 | 157
VEND_NAME | NeverFail, Inc. | NeverFail, Inc. | BeGood, Inc. | NeverFail, Inc. | ToughGo, Inc.
QUANT_SOLD | 1 | 8 | 1 | 2 | 1
PROD_PRICE | 34.46 | 2.73 | 31.59 | 34.46 | 69.32
2 Using the answer to Problem 1, remove all partial dependencies, write the relational schema, and draw the new dependency diagrams. Identify the normal forms for each table structure you created.
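One way to explore the dependencies asked for in Problem 1 is to test candidate functional dependencies against the sample rows. The following Python sketch does that; the `holds` helper is hypothetical (it is not part of the text), and the rows assume one particular column alignment of the INVOICE sample values. A dependency that holds in five sample rows is only evidence that it may hold as a business rule, not proof.

```python
# Sample INVOICE rows; the column alignment is an assumption made for this sketch.
COLUMNS = ("INV_NUM", "PROD_NUM", "SALE_DATE", "PROD_LABEL",
           "VEND_CODE", "VEND_NAME", "QUANT_SOLD", "PROD_PRICE")
VALUES = [
    ("211347", "AA-E3422QW", "15-Jan-2019", "Rotary sander", "211", "NeverFail, Inc.", 1, 34.46),
    ("211347", "QD-300932X", "15-Jan-2019", "0.25-in. drill bit", "211", "NeverFail, Inc.", 8, 2.73),
    ("211347", "RU-995748G", "15-Jan-2019", "Band saw", "309", "BeGood, Inc.", 1, 31.59),
    ("211348", "AA-E3422QW", "15-Jan-2019", "Rotary sander", "211", "NeverFail, Inc.", 2, 34.46),
    ("211349", "GH-778345P", "16-Jan-2019", "Power drill", "157", "ToughGo, Inc.", 1, 69.32),
]
ROWS = [dict(zip(COLUMNS, values)) for values in VALUES]


def holds(determinant, dependent, rows=ROWS):
    """Return True if each determinant value maps to exactly one dependent value."""
    seen = {}
    for row in rows:
        key = tuple(row[name] for name in determinant)
        value = tuple(row[name] for name in dependent)
        # setdefault stores the first value seen for this key and returns it afterwards.
        if seen.setdefault(key, value) != value:
            return False
    return True


# The composite key determines the quantity sold.
print(holds(("INV_NUM", "PROD_NUM"), ("QUANT_SOLD",)))
# Candidate partial dependencies: part of the key determines a non-key attribute.
print(holds(("INV_NUM",), ("SALE_DATE",)))
print(holds(("PROD_NUM",), ("PROD_LABEL", "PROD_PRICE", "VEND_CODE")))
# Candidate transitive dependency between two non-key attributes.
print(holds(("VEND_CODE",), ("VEND_NAME",)))
```

The same helper can be pointed at any other attribute pair to rule out dependencies that the sample data contradicts.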
NOTE
You can assume that any given product is supplied by a single vendor, but a vendor can supply many products. Therefore, it is proper to conclude that the following dependency exists:
PROD_NUM → PROD_DESCRIPTION, PROD_PRICE, VEND_CODE, VEND_NAME
(Hint: Your actions should produce three dependency diagrams.)
3 Using the answer to Problem 2, remove all transitive dependencies, write the relational schema and draw the new dependency diagrams. Identify the normal forms for each table structure you created.
4 Using the results of Problem 3, draw the ERD using UML class notation.
5 Using the STUDENT table structure shown here, write the relational schema and draw its dependency diagram. Identify all dependencies (including all transitive dependencies).
Attribute Name | Sample Value | Sample Value | Sample Value | Sample Value | Sample Value
STU_NUM | 211343 | 200128 | 199876 | 199876 | 223456
STU_LNAME | Stephanos | Smith | Jones | Ortiz | McKulski
STU_MAJOR | Accounting | Accounting | Marketing | Marketing | Statistics
DEPT_CODE | ACCT | ACCT | MKTG | MKTG | MATH
DEPT_NAME | Accounting | Accounting | Marketing | Marketing | Mathematics
DEPT_PHONE | 4356 | 4356 | 4378 | 4378 | 3420
COLLEGE_NAME | Business Admin | Business Admin | Business Admin | Business Admin | Arts & Sciences
ADVISOR_LNAME | Grastrand | Grastrand | Gentry | Tillery | Chen
ADVISOR_OFFICE | T201 | T201 | T228 | T356 | J331
ADVISOR_BLDG | Torre Building | Torre Building | Torre Building | Torre Building | Jones Building
ADVISOR_PHONE | 2115 | 2115 | 2123 | 2159 | 3209
STU_GPA | 3.87 | 2.78 | 2.31 | 3.45 | 3.58
STU_HOURS | 75 | 45 | 117 | 113 | 87
STU_CLASS | UG1 | UG2 | UG3 | UG3 | UG1
6 Using the answer to Problem 5, write the relational schema and draw the dependency diagram to meet the 3NF requirements to the greatest practical extent possible. If you believe that practical considerations dictate using a 2NF structure, explain why your decision to retain 2NF is appropriate. If necessary, add or modify attributes to create appropriate determinants and to adhere to the naming conventions.
NOTE
Although the completed student hours (STU_HOURS) do determine the student classification (STU_CLASS), this dependency is not as obvious as you might initially assume it to be. For example, a student is considered a junior if that student has completed between 61 and 90 credit hours. Therefore, a student who is classified as a junior may have completed 66, 72 or 87 hours, or any other number of hours within the specified range of 61-90 hours. In short, any hour value within a specified range will define the classification.
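The range-based dependency described in this note can be sketched in a few lines of Python. Only the 61-90 junior band comes from the note; the other band boundaries, the labels and the `classify` function are assumptions made for illustration.

```python
from bisect import bisect_left

# Inclusive upper bound of each credit-hour band. Only the 61-90 "junior" band
# is stated in the note; the other boundaries and all labels are assumed.
UPPER_BOUNDS = [30, 60, 90]
LABELS = ["first-year", "sophomore", "junior", "senior"]


def classify(hours: int) -> str:
    """Map completed credit hours to a classification by range lookup."""
    if hours < 0:
        raise ValueError("hours cannot be negative")
    # bisect_left finds the first band whose upper bound is >= hours.
    return LABELS[bisect_left(UPPER_BOUNDS, hours)]


# Many different hour values determine the same classification.
for hours in (66, 72, 87):
    print(hours, classify(hours))
```

Because a whole range of STU_HOURS values maps to one STU_CLASS value, the classification can be derived from a small table of ranges rather than stored against every individual hour value.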
7 Using the results of Problem 6, draw the ERD using UML notation.
NOTE
This ERD constitutes a small segment of a university's full-blown design. For example, this segment might be combined with the Tiny University presentation in Chapter 5, Data Modelling with Entity Relationship Diagrams.
8 To keep track of office furniture, computers, printers and so on, the FOUNDIT company uses the following table structure:
Attribute Name | Sample Value | Sample Value | Sample Value
ITEM_ID | 231134-678 | 342245-225 | 254668-449
ITEM_LABEL | HP DeskJet 3755 | HP Toner | DT Scanner
ROOM_NUMBER | 325 | 325 | 123
BLDG_CODE | NTC | NTC | CSF
BLDG_NAME | Nottooclear | Nottooclear | Canseefar
BLDG_MANAGER | I. B. Rightonit | I. B. Rightonit | May B. Next
Given that information, write the relational schema and draw the dependency diagram. Make sure that you label the transitive and/or partial dependencies.
9 Using the answer to Problem 8, write the relational schema and create a set of dependency diagrams that meet 3NF requirements. Rename attributes to meet the naming conventions and create new entities and attributes as necessary.
10 Using the results of Problem 9, draw the ERD using UML notation.
NOTE
Problems 11-13 may be combined to serve as a case or a mini-project.
11 The table structure shown below contains many unsatisfactory components and characteristics. (For example, there are several multivalued attributes, naming conventions are violated and some attributes are not atomic.)
Attribute Name | Sample Value | Sample Value | Sample Value | Sample Value
EMP_NUM | 1003 | 1018 | 1019 | 1023
EMP_LNAME | Willaker | Smith | McGuire | McGuire
EMP_EDUCATION | BBA, MBA | BBA | (blank) | BS, MS, Ph.D.
JOB_CLASS | SLS | SLS | JNT | DBA
EMP_DEPENDANTS | Gerald (spouse), Mary (daughter), John (son) | (blank) | JoAnne (spouse) | George (spouse), Jill (daughter)
DEPT_CODE | MKTG | MKTG | SVC | INFS
DEPT_NAME | Marketing | Marketing | General Service | Info. Systems
DEPT_MANAGER | Jill H. Martin | Jill H. Martin | Hank B. Jones | David G. Dlamini
EMP_TITLE | Sales Agent | Sales Agent | Janitor | DB Admin
EMP_DOB | 23-Dec-1978 | 28-Mar-1989 | 18-May-1992 | 20-Jul-1969
EMP_HIRE_DATE | 14-Oct-2007 | 15-Jan-2016 | 21-Apr-2013 | 15-Jul-2009
EMP_TRAINING | L1, L2 | L1 | L1 | L1, L3, L8, L15
EMP_BASE_SALARY | 30 221.45 | 24 095.00 | 15 602.50 | 101 041.00
EMP_COMMISSION_RATE | 0.015 | 0.010 | (blank) | (blank)
Given that structure, write the relational schema and draw its dependency diagram. Label all transitive and/or partial dependencies.
12 Using the answer to Problem 11, draw the dependency diagrams that are in 3NF. (Hint: You might have to create a few new attributes. Also make sure that the new dependency diagrams contain attributes that meet proper design criteria; that is, make sure that there are no multivalued attributes, that the naming conventions are met, and so on.)
13 Using the results of Problem 12, draw the UML ERD.
NOTE
Problems 14-16 may be combined to serve as a case or a mini-project.
14 Suppose you are given the following business rules to form the basis for a database design. The database must enable the manager of a company dinner club to mail invitations to the club's members, to plan the meals, to keep track of who attends the dinners and so on:
Each dinner serves many members, and each member may attend many dinners.
A member receives many invitations, and each invitation is mailed to many members.
A dinner is based on a single entrée, but an entrée may be used as the basis for many dinners. For example, a dinner may be composed of a fish entrée, rice and corn. Or the dinner may be composed of a fish entrée, a baked potato and string beans.
A member may attend many dinners, and each dinner may be attended by many members.
Because the manager is not a database expert, the first attempt at creating the database uses the structure shown in the following table.
Attribute Name | Sample Value | Sample Value | Sample Value
MEMBER_NUM | 214 | 235 | 214
MEMBER_NAME | Alice B. VanderVoort | Gerald M. Gallega | Alice B. VanderVoort
MEMBER_ADDRESS | 325 Meadow Park | 123 Rose Court | 325 Meadow Park
MEMBER_CITY | Murkywater | Highlight | Murkywater
MEMBER_POSTCODE | 12345 | 12349 | 12345
INVITE_NUM | 8 | 9 | 10
INVITE_DATE | 23-Feb-2020 | 12-Mar-2020 | 23-Feb-2020
ACCEPT_DATE | 27-Feb-2020 | 15-Mar-2020 | 27-Feb-2020
DINNER_DATE | 15-Mar-2020 | 17-Mar-2020 | 15-Mar-2020
DINNER_ATTENDED | Yes | Yes | No
DINNER_CODE | DI5 | DI5 | DI2
DINNER_DESCRIPTION | Glowing Sea Delight | Glowing Sea Delight | Ranch Superb
ENTREE_CODE | EN3 | EN3 | EN5
ENTREE_DESCRIPTION | Stuffed crab | Stuffed crab | Marinated steak
DESSERT_CODE | DE8 | DE5 | DE2
DESSERT_DESCRIPTION | Chocolate mousse with raspberry sauce | Cherries Jubilee | Apple pie with honey crust
Given that structure, write the relational schema and draw its dependency diagram. Label all transitive and/or partial dependencies. (Hint: This structure uses a composite primary key.)
15 Break up the dependency diagram you drew in Problem 14 to produce dependency diagrams that are in 3NF and write the relational schema. (Hint: You might have to create a few new attributes. Also make sure that the new dependency diagrams contain attributes that meet proper design criteria; that is, make sure that there are no multivalued attributes, that the naming conventions are met and so on.)
16 Using the results of Problem 15, draw the ERD.
NOTE
Problems 17-19 may be combined to serve as a case or a mini-project.
17 The manager of a consulting firm has asked you to evaluate a database that contains the table structure shown in the following table.
Attribute Name | Sample Value | Sample Value | Sample Value
CLIENT_NUM | 298 | 289 | 289
CLIENT_NAME | Marianne R. Brown | James D. Smith | James D. Smith
CLIENT_REGION | Gauteng | Western Cape | Western Cape
CONTRACT_DATE | 10-Feb-2018 | 15-Feb-2018 | 12-Mar-2018
CONTRACT_NUMBER | 5841 | 5842 | 5843
CONTRACT_AMOUNT | 2 358 150.00 | 529 537.00 | 987 500.00
CONSULT_CLASS_1 | Database Administration | Internet Services | Database Design
CONSULT_CLASS_2 | Web Applications | (blank) | Database Administration
CONSULT_CLASS_3 | (blank) | (blank) | Network Installation
CONSULT_CLASS_4 | (blank) | (blank) | (blank)
CONSULTANT_NUM_1 | 29 | 34 | 25
CONSULTANT_NAME_1 | Rachel G. Carson | Gerald K. Ricardo | Angela M. Jamison
CONSULTANT_REGION_1 | Gauteng | Western Cape | Western Cape
CONSULTANT_NUM_2 | 56 | 38 | 34
CONSULTANT_NAME_2 | Karl M. Spenser | Anne T. Dimarco | Gerald K. Ricardo
CONSULTANT_REGION_2 | Gauteng | Western Cape | Western Cape
CONSULTANT_NUM_3 | 22 | 45 | 18
CONSULTANT_NAME_3 | Julian H. Donatello | Geraldo J. Rivera | Donald Chen
CONSULTANT_REGION_3 | Gauteng | Western Cape | Eastern Cape
CONSULTANT_NUM_4 | (blank) | (blank) | (blank)
CONSULTANT_NAME_4 | (blank) | (blank) | (blank)
CONSULTANT_REGION_4 | (blank) | (blank) | (blank)
This table was created to enable the manager to match clients with consultants. The objective is to match a client within a given region with a consultant in that region, and to make sure that the client's need for specific consulting services is properly matched to the consultant's expertise. For example, if the client needs help with database design and is located in the Western Cape, the objective is to make a match with a consultant who is located in the Western Cape and whose expertise is in database design. (Although the consulting company manager tries to match consultant and client locations to minimise travel expense, it is not always possible to do so.) The following basic business rules are maintained:
Each client is located in one region.
A region can contain many clients.
Each consultant can work on many contracts.
Each contract may require the services of many consultants.
A client can sign more than one contract, but each contract is signed by one client.
Each contract may cover multiple consulting classifications. (For example, a contract may list consulting services in database design and networking.)
Each consultant is located in one region.
A region can contain many consultants.
Each consultant has one or more areas of expertise (class). For example, a consultant may be classified as an expert in both database design and networking.
Each area of expertise (class) can have many consultants in it. For example, the consulting company may employ many consultants who are networking experts.
Given that brief description of the requirements and the business rules, write the relational schema and draw the dependency diagram for the preceding (and very poor) table structure. Label all transitive and/or partial dependencies.
18 Break up the dependency diagram you drew in Problem 17 to produce dependency diagrams that are in 3NF and write the relational schema. (Hint: You may have to create a few new attributes. Also make sure that the new dependency diagrams contain attributes that meet proper design criteria; that is, make sure that there are no multivalued attributes, that the naming conventions are met and so on.)
19 Using the results of Problem 18, draw the ERD using UML notation.
20 Given the sample records in the CHARTER table that follows, write the relational schema and draw the dependency diagram for the table structure. Make sure that you label all dependencies. CHAR_PAX indicates the number of passengers carried. The CHAR_MILES entry is based on round-trip miles, including pickup points. (Hint: Look at the data values to determine the nature of the relationships. For example, note that employee Melton has flown two charter trips as pilot and one trip as copilot.)
Attribute Name | Sample Value | Sample Value | Sample Value | Sample Value
CHAR_TRIP | 10232 | 10233 | 10234 | 10235
CHAR_DATE | 15-Jan-2019 | 15-Jan-2019 | 16-Jan-2019 | 17-Jan-2019
CHAR_CITY | STL | MIA | TYS | ATL
CHAR_MILES | 580 | 1 290 | 524 | 768
CUST_NUM | 784 | 231 | 544 | 784
CUST_LNAME | Brown | Hanson | Bryana | Brown
CHAR_PAX | 5 | 12 | 2 | 5
CHAR_CARGO | 235 kg | 18 940 kg | 348 kg | 155 kg
PILOT | Melton | Chen | Henderson | Melton
COPILOT | (blank) | Henderson | Melton | (blank)
FLT_ENGINEER | (blank) | O'Shaski | (blank) | (blank)
LOAD_MASTER | (blank) | Benkasi | (blank) | (blank)
AC_NUMBER | 1234Q | 3456Y | 1234Q | 2256W
MODEL_CODE | PA31-350 | CV-580 | PA31-350 | PA31-350
MODEL_SEATS | 10 | 38 | 10 | 10
MODEL_CHG_MILE | 2.13 | 18.45 | 2.20 | 2.20
21 Decompose the dependency diagram in Problem 20 to create table structures that are in 3NF and write the relational schema. Make sure that you label all dependencies.
22 Draw the ERD to reflect the properly decomposed dependency diagrams you created in Problem 21. Make sure that the ERD produces a database that can track all of the data shown in Problem 20.
NOTE
Use the dependency diagram shown in Figure P7.1 to work on Problems 23-24.
FIGURE P7.1 Initial dependency diagram for Problems 23-24
Attributes: A, B, C, D, E, F, G
23 Break up the dependency diagram to create two new dependency diagrams, one in 3NF and one in 2NF.
24 Modify the dependency diagrams you created in Problem 23 to produce a set of dependency diagrams that are in 3NF. To keep the entire collection of attributes together, copy the 3NF dependency diagram from Problem 23; then show the new dependency diagrams that are also in 3NF. (Hint: One of your dependency diagrams will be in 3NF but not in BCNF.)
25 Modify the dependency diagrams in Problem 24 to produce a collection of dependency diagrams that are in 3NF and BCNF. To ensure that all attributes are accounted for, copy the 3NF dependency diagrams from Problem 24; then show the new 3NF and BCNF dependency diagrams.
26 Suppose you have been given the table structure and data shown here, which was imported from an Excel spreadsheet. The data reflect that a lecturer can have multiple advisees, can serve on multiple committees and can edit more than one journal.
Attribute Name | Sample Value | Sample Value | Sample Value | Sample Value
EMP_NUM | 123 | 104 | 118 | (blank)
LECT_RANK | Professor | Asst. Lecturer | Lecturer | Lecturer
EMP_NAME | Ghee | Rankin | Ortega | Smith
DEPT_CODE | CIS | CHEM | CIS | ENG
DEPT_NAME | Computer Info. Systems | Chemistry | Computer Info. Systems | English
PROF_OFFICE | KDD-567 | BLF-119 | KDD-562 | PRT-345
ADVISEE | 1215, 2312, 3233, 2218, 2098 | 3102, 2782, 3311, 2008, 2876, 2222, 3745, 1783, 2378 | 2134, 2789, 3456, 2002, 2046, 2018, 2764 | 2873, 2765, 2238, 2901, 2308
COMMITTEE_CODE | PROMO, TRAF, APPL, DEV | DEV | SPR, TRAF | PROMO, SPR, DEV
JOURNAL_CODE | JMIS, QED, JMGT | (blank) | JCIS, JMGT | (blank)
Given the information in this table:
a Draw the dependency diagram.
b Identify the multivalued dependencies.
c Create the dependency diagrams to yield a set of table structures in 3NF.
d Eliminate the multivalued dependencies by converting the affected table structures to 4NF.
e Draw the ERD to reflect the dependency diagrams you drew in Part c. (Note: You may have to create additional attributes to define the proper PKs and FKs. Make sure that all of your attributes conform to the naming conventions.)
27 Using the descriptions of the attributes given in Figure P7.2, convert the ERD into a dependency diagram that is in at least 3NF.
FIGURE P7.2 Appointment ERD for Problem 27
28 Using the descriptions of the attributes given in Figure P7.3, convert the ERD into a dependency diagram that is in at least 3NF.
FIGURE P7.3 Presentation ERD for Problem 28
PART III
DATABASE PROGRAMMING
8 Beginning Structured Query Language
9 Procedural Language SQL and Advanced SQL
BUSINESSVIGNETTE OPENSOURCE DATABASES acquiring
MySQL in
significantly
With Oracle
growing?
An interesting
being
the
ranked
moving,
and
licensing.
worlds
continue
Oracle
most
to
have
procedures,
which have led
For example,
in
2011,
system
PostgreSQL2
(CNAF).3
Frances
69
billion
to
to reduce
took
for
over
11
18
The
BBC news
website
The idea
was that
the
currently
began to
of the
news
per
comprises
minute There
and
over
is
an increasing
using
source
databases
as
Oracle
1
The
2
Open
24 tables
open
using the about
per
8
source real-time
the
rows,
made
features
existing
of data.3
up
world
DBMS
Today,
2006 in which
most
could
the
system
and
in
system,
picked
was tight,
million
was
essential the
database
budget
operating
from
4 terabytes
being
Linux
more than
DBMS
have all the
throughout
As the
Familiales to
the
other
be fed
to
solution
about
would give
by
users. the
was to
and a MySQL
processes
order to
site,
develop
database.
30 000
The
data inserts
hour.4
number
source
zones
users.
DBMSs.
management
day.
open
website
licensing
open source
source
Migration
containing every
a dynamic
on the
all time
to
complex
amounting
an open
was found
queries
produce
to
been
software
dAllocations
benefits
required.
use a MySQL
million
and
switch
databases SQL
stories
from 35
4 000 requests
develop
to
of
database
Nationale
allocating
PostgreSQL
was to
news
stories
attracts
decision
source
have
costs
and
still
MySQL still
organisations
expensive
open
Caisse
with
systems by
due to the
and switch to
to the
system
general,
for their
performance
a billion
BBC News Live Stats system
database
The
and
The aim
sense
criticised
deals
168 individual almost
monitor reader interest. a real
Security
On evaluation,
runs
In
management be answered
databases
switched
system
of reliability
can
platform.1 source
been
database
perhaps
to look for alternatives
people.
and involved
system
audiences
which
Social
million
levels
months
PostgreSQL
often
Government
Security
costs.
necessary
both
source
that
open
customers
its
open
database
towards
French
Social
licensing
and the
the
are
question
popular
move,
and IBM
2009,
of
success
databases.
stories
However,
still have a way to go to
from
organisations
in terms
match the
that
of business
capabilities
of the
have
chosen
intelligence, commercial
to
these
open
vendors
such
and IBM.
Most
Popular
Source
Databases
Database
2019.
Engine
for
Available:
Frances
www.explore-group.com/blog/the-most-popular-databases-2018/bp46/
Social
Security.
Available:
www.linuxbsdos.com/2010/11/25
/open-source-database-new-engine-of-frances-social-security/ 3 4
PostgreSQL. BBC
Available:
News
www.postgresql.org/
Website
uses
MySQL
to
Monitor
Reader
Interest.
Available:
www-it.mysql.com
/whymysql/case-studies/
CHAPTER 8 Beginning Structured Query Language
IN THIS CHAPTER, YOU WILL LEARN:
The basic commands and functions of SQL
How to use SQL for data administration (to create tables, indexes and views)
How to use SQL for data manipulation (to add, modify, delete and retrieve data)
How to use SQL to query a database to extract useful information
PREVIEW
In this chapter, you learn the basics of Structured Query Language (SQL). SQL, pronounced S-Q-L or sequel, is composed of commands that enable users to create database and table structures, perform various types of data manipulation and data administration, and query the database to extract useful information. All relational DBMS software supports SQL, and many software vendors have developed extensions to the basic SQL command set.
SQL's vocabulary is simple, so the language is relatively easy to learn. Its simplicity is enhanced by the fact that much of its work takes place behind the scenes. For example, a single command creates the complex table structures required to store and manipulate data successfully. Furthermore, SQL is a non-procedural language; that is, the user specifies what must be done, but not how it is to be done. To issue SQL commands, end users and programmers do not need to know the physical data storage format or the complex activities that take place when a SQL command is executed.
Although quite useful and powerful, SQL is not meant to stand alone in the applications arena. Data entry with SQL is possible but awkward, as are data corrections and additions. SQL itself does not create menus, special report forms, overlays, pop-ups or any of the other utilities and screen devices that end users usually expect. Instead, those features are available as vendor-supplied enhancements. SQL focuses on data definition (creating tables, indexes and views) and data manipulation (adding, modifying, deleting and retrieving data), the basic functions presented in this chapter. In spite of its limitations, SQL is a powerful tool for extracting information and managing data.
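To make the idea of a non-procedural language concrete, the following minimal sketch uses Python's built-in `sqlite3` module with an in-memory database. The SALE table and its values are hypothetical and are not taken from the text. The SQL query states what result is wanted, while the loop below it spells out how to compute the same result step by step.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table and data, used only for this comparison.
conn.execute("CREATE TABLE SALE (PROD_NUM TEXT, QUANT_SOLD INTEGER)")
conn.executemany("INSERT INTO SALE VALUES (?, ?)",
                 [("A", 1), ("B", 8), ("A", 2)])

# Non-procedural: say WHAT is wanted; the DBMS decides how to produce it.
declarative = dict(conn.execute(
    "SELECT PROD_NUM, SUM(QUANT_SOLD) FROM SALE GROUP BY PROD_NUM"))

# Procedural equivalent: spell out HOW to accumulate the totals row by row.
procedural = {}
for prod_num, quant_sold in conn.execute("SELECT PROD_NUM, QUANT_SOLD FROM SALE"):
    procedural[prod_num] = procedural.get(prod_num, 0) + quant_sold

print(declarative == procedural)
```

Both approaches give the same totals, but only the second one requires the programmer to manage the iteration and the running sums.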
8.1 INTRODUCTION TO SQL
Ideally, a database language allows you to create database and table structures, to perform basic data management chores (add, delete and modify), and to perform complex queries designed to transform the raw data into useful information. Moreover, a database language must perform such basic functions with minimal user effort, and its command structure and syntax must be easy to learn. Finally, it must be portable; that is, it must conform to some basic standard so that an individual does not have to relearn the basics when moving from one RDBMS to another. SQL meets those ideal database language requirements well.
SQL functions fit into several broad categories:
It is a data definition language (DDL): SQL includes commands to create database objects such as tables, indexes and views, as well as commands to define access rights to those database objects. The data definition commands you will learn in this chapter are listed in Table 8.1.
It is a data manipulation language (DML): It includes commands to insert, update, delete and retrieve data within the database tables. The data manipulation commands you will learn in this chapter are listed in Table 8.2.
It is a transaction control language (TCL): The DML commands in SQL are executed within the context of a transaction, which is a logical unit of work composed of one or more SQL commands. SQL provides commands to control the processing of these statements in an indivisible unit of work. This will be discussed further in Chapter 9.
It is a data control language (DCL): Data control commands are used to control access to data objects, such as giving a specific user permission to view the PRODUCT table. Common TCL and DCL commands are shown in Table 8.3.
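The categories above can be tried out with a short sketch in Python's built-in `sqlite3` module. This is an illustration under stated assumptions, not the book's own example: the PRODUCT columns and values are hypothetical, and SQLite has no user accounts, so DCL commands such as GRANT cannot be shown here.

```python
import sqlite3

# isolation_level=None lets the script issue BEGIN/COMMIT/ROLLBACK itself.
conn = sqlite3.connect(":memory:", isolation_level=None)
cur = conn.cursor()

# DDL: define a table (column names are hypothetical).
cur.execute("""
    CREATE TABLE PRODUCT (
        P_CODE     TEXT PRIMARY KEY,
        P_DESCRIPT TEXT NOT NULL,
        P_PRICE    REAL
    )
""")

# DML inside a transaction that is made permanent with COMMIT (TCL).
cur.execute("BEGIN")
cur.execute("INSERT INTO PRODUCT VALUES ('P1', 'Claw hammer', 9.95)")
cur.execute("INSERT INTO PRODUCT VALUES ('P2', 'Rat-tail file', 4.99)")
cur.execute("COMMIT")

# DML inside a transaction that is undone as one unit with ROLLBACK (TCL).
cur.execute("BEGIN")
cur.execute("UPDATE PRODUCT SET P_PRICE = P_PRICE * 2")
cur.execute("DELETE FROM PRODUCT WHERE P_CODE = 'P2'")
cur.execute("ROLLBACK")

# DML: retrieve the data; the rolled-back changes have left no trace.
rows = cur.execute(
    "SELECT P_CODE, P_PRICE FROM PRODUCT ORDER BY P_CODE").fetchall()
print(rows)
```

The UPDATE and DELETE in the second transaction are treated as a single unit of work, so rolling back restores both the prices and the deleted row.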
TABLE 8.1 SQL data definition commands

Command or Option              Description
CREATE SCHEMA AUTHORIZATION    Creates a database schema
CREATE TABLE                   Creates a new table in the user's database schema
NOT NULL                       Ensures that a column will not have null values
UNIQUE                         Ensures that a column will not have duplicate values
PRIMARY KEY                    Defines a primary key for a table
FOREIGN KEY                    Defines a foreign key for a table
DEFAULT                        Defines a default value for a column (when no value is given)
CHECK                          Constraint used to validate data in an attribute
CREATE INDEX                   Creates an index for a table
CREATE VIEW                    Creates a dynamic subset of rows/columns from one or more tables
ALTER TABLE                    Modifies a table's definition (adds, modifies or deletes attributes or constraints)
CREATE TABLE AS                Creates a new table based on a query in the user's database schema
DROP TABLE                     Permanently deletes a table (and thus its data)
DROP INDEX                     Permanently deletes an index
DROP VIEW                      Permanently deletes a view
TABLE 8.2 SQL data manipulation commands

Command or Option              Description
INSERT                         Inserts row(s) into a table
SELECT                         Selects attributes from rows in one or more tables or views
WHERE                          Restricts the selection of rows based on a conditional expression
GROUP BY                       Groups the selected rows based on one or more attributes
HAVING                         Restricts the selection of grouped rows based on a condition
ORDER BY                       Orders the selected rows based on one or more attributes
UPDATE                         Modifies an attribute's values in one or more table's rows
DELETE                         Deletes one or more rows from a table
COMMIT                         Permanently saves data changes
ROLLBACK                       Restores data to their original values
Comparison operators           Used in conditional expressions
  =, <, >, <=, >=, <>
Logical operators              Used in conditional expressions
  AND/OR/NOT
Special operators              Used in conditional expressions
  BETWEEN                      Checks whether an attribute value is within a range
  IS NULL                      Checks whether an attribute value is null
  LIKE                         Checks whether an attribute value matches a given string pattern
  IN                           Checks whether an attribute value matches any value within a value list
  EXISTS                       Checks whether a subquery returns any rows
  DISTINCT                     Limits values to unique values
Aggregate functions            Used with SELECT to return mathematical summaries on columns
  COUNT                        Returns the number of rows with non-null values for a given column
  MIN                          Returns the minimum attribute value found in a given column
  MAX                          Returns the maximum attribute value found in a given column
  SUM                          Returns the sum of all values for a given column
  AVG                          Returns the average of all values for a given column

SQL is relatively easy to learn. Its basic command set has a vocabulary of fewer than 100 words. Better yet, SQL is a non-procedural language: you merely command what is to be done; you don't have to worry about how it is to be done. The American National Standards Institute (ANSI) prescribes a standard SQL. The most recent version is SQL:2011, which was formally adopted in December 2011. The ANSI SQL standards are also accepted by the International Organization for Standardization (ISO), a consortium composed of national standards bodies of more than 150 countries. Although adherence to the ANSI/ISO SQL standard is usually required in commercial and government contract database specifications, many RDBMS vendors add their own special enhancements. Consequently, it is seldom possible to move a SQL-based application from one RDBMS to another without making some changes.

However, even though there are several different SQL dialects, the differences among them are minor. Whether you use Oracle, Microsoft SQL Server, IBM's DB2, Microsoft Access or any other well-established RDBMS, a software manual should be sufficient to get you up to speed if you know the material presented in this chapter.
NOTE
Throughout this book, examples will be given using both the Oracle 11g Release 2 (11.2) and MySQL Community Edition RDBMSs, so that you know how a RDBMS complies with the ISO SQL:2011 standard. It is, however, important to note that both of these RDBMSs comply with the standards to different degrees. The SQL standards are very complex and comprise several parts. For example, Oracle 11g Release 2 (11.2) is compliant with Core SQL:2011, which is defined within Part 2, SQL/Foundation, and Part 11, SQL/Schemata. Users are interested in the portability of databases between different SQL dialects, so both Oracle and MySQL aim to work towards compliance with the ISO SQL:2011 standard while at the same time adding additional functionality through SQL extensions. There are therefore small differences between the SQL dialects of each RDBMS.

Whereas Oracle is a commercial DBMS, MySQL started life as an open source RDBMS in 1995 and was sponsored by a Swedish company called MySQL AB. One of the aims of the development of MySQL over the years has been to enhance the usability of the DBMS for non-SQL developers. Oracle acquired MySQL in 2009 as part of its takeover of Sun. Currently, Oracle offers a number of editions of MySQL: the Community Edition, which is free and supported by the open source community, and a number of commercial versions, which are available for a price. An organisation that wishes to use MySQL can assess the benefits of each version against its business requirements, depending on the features each version offers. Whether a version of MySQL will remain open source in the future is up for debate.

TABLE 8.3 Other SQL commands

Command or Option              Description
Transaction control language
  COMMIT                       Permanently saves data changes
  ROLLBACK                     Restores data to its original values
Data control language
  GRANT                        Gives a user permission to take a system action or access a data object
  REVOKE                       Removes a previously granted permission from a user

At the heart of SQL is the query. In Chapter 1, The Database Approach, you learned that a query is a spur-of-the-moment question. Actually, in the SQL environment, the word query covers both questions and actions. Most SQL queries are used to answer questions such as these: Which products currently held in inventory are priced over 100, and what is the quantity on hand of each of those products? How many employees have been hired since 1 January 2019 by each of the company's departments? However, many SQL queries are used to perform actions such as adding or deleting table rows or changing attribute values within tables. Still other SQL queries create new tables or indexes. In short, for a DBMS, a query is simply a SQL statement that must be executed. However, before you can use SQL to query a database, you must define the database environment for SQL with its data definition commands.
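To make the distinction between questions and actions concrete, the following sketch pairs each kind of query with a possible SQL statement. It is only an illustration: the PRODUCT columns are those used later in this chapter, while the EMPLOYEE table, its DEPT_CODE and EMP_HIRE_DATE columns and the new price are hypothetical. The `DATE '...'` literal is ANSI syntax and may need adapting for some RDBMSs.

```sql
-- A question: which products are priced over 100, and how many are on hand?
SELECT P_CODE, P_DESCRIPT, P_QOH, P_PRICE
FROM   PRODUCT
WHERE  P_PRICE > 100;

-- A question: how many employees has each department hired since 1 January 2019?
-- (EMPLOYEE, DEPT_CODE and EMP_HIRE_DATE are hypothetical names.)
SELECT   DEPT_CODE, COUNT(*) AS NUM_HIRED
FROM     EMPLOYEE
WHERE    EMP_HIRE_DATE >= DATE '2019-01-01'
GROUP BY DEPT_CODE;

-- An action: change an attribute value in an existing row
UPDATE PRODUCT
SET    P_PRICE = 15.99
WHERE  P_CODE = '13-Q2/P2';

-- An action: delete a table row
DELETE FROM PRODUCT
WHERE  P_CODE = '2238/QPD';
```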
8.2 DATA DEFINITION COMMANDS
Before examining the SQL syntax for creating and defining tables and other elements, let's first examine the simple database model and the database tables that will form the basis of the many SQL examples you'll explore in this chapter.

8.2.1 The Database Model
A simple database composed of the following tables is used to illustrate the SQL commands in this chapter: CUSTOMER, INVOICE, LINE, PRODUCT and VENDOR. This database model is shown in Figure 8.1.

FIGURE 8.1 The database model

The database model in Figure 8.1 reflects the following business rules:

A customer may generate many invoices. Each invoice is generated by one customer.

An invoice contains one or more invoice lines. Each invoice line is associated with one invoice.

Each invoice line references one product. A product may be found in many invoice lines. (You can sell more than one hammer to more than one customer.)

A vendor may supply many products. Some vendors do not yet supply products. (For example, a vendor list may include potential vendors.)

If a product is vendor supplied, that product is supplied by only a single vendor.

Some products are not supplied by a vendor. (For example, some products may be produced in-house or may have been bought on the open market.)
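Business rules of this kind usually end up as foreign keys: the table on the "many" side carries a column that references the table on the "one" side. The sketch below shows the idea for the customer and invoice rule only. The structures are deliberately cut down, and the column names CUS_CODE and INV_NUMBER are assumptions made for the example rather than the definitions used in the Ch08_SaleCo database.

```sql
-- Hypothetical, cut-down structures; the column names are assumptions.
CREATE TABLE CUSTOMER (
    CUS_CODE INTEGER NOT NULL,
    PRIMARY KEY (CUS_CODE)
);

-- "Each invoice is generated by one customer": the many side (INVOICE)
-- carries a mandatory foreign key that references the one side (CUSTOMER).
CREATE TABLE INVOICE (
    INV_NUMBER INTEGER NOT NULL,
    CUS_CODE   INTEGER NOT NULL,
    PRIMARY KEY (INV_NUMBER),
    FOREIGN KEY (CUS_CODE) REFERENCES CUSTOMER (CUS_CODE)
);
```

An optional rule, such as "some products are not supplied by a vendor", would instead leave the foreign key column without NOT NULL so that it can hold nulls.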
Online Content
The database model in Figure 8.1 is implemented in the Microsoft Access 'Ch08_SaleCo' database located on the online platform for this book. (This database contains a few additional tables that are not reflected in Figure 8.1. These tables are used for discussion purposes only.) If you use Microsoft Access, you can use the database available on the online platform accompanying this book. However, it is strongly suggested that you create your own database structures so you can practise the SQL commands illustrated in this chapter. If you are using the Oracle or MySQL DBMS, SQL script files for creating the tables and loading the data in the database are also available online. How you connect to the Oracle database depends on how the Oracle software was installed on your server and on the access paths and methods defined and managed by your database administrator. Follow the instructions provided by your instructor, college or university.

As you can see in Figure 8.1, the database model contains many tables. However, to illustrate the initial set of data definition commands, the focus of attention will be the PRODUCT and VENDOR tables. You will have the opportunity to use the remaining tables later in this chapter and in the problem section. So that you have a point of reference for understanding the effect of the SQL queries, the contents of the PRODUCT and VENDOR tables are listed in Figure 8.2. Note the following about these tables. (The features correspond to the business rules reflected in the ERD shown in Figure 8.1.)
FIGURE 8.2 The VENDOR and PRODUCT tables

Database name: Ch8_SaleCo

Table name: VENDOR

V_CODE  V_NAME           V_CONTACT  V_POSTAL_CODE  V_PHONE   V_Country  V_ORDER
21225   Bryson, Inc.     Smithson   0181           223-3234  UK         Y
21226   SuperLoo, Inc.   Flushing   0113           215-8995  SA         N
21231   D&E Supply       Singh      0181           228-3245  UK         Y
21344   Jabavu Bros      Khumalo    0181           889-2546  SA         N
22567   Dome Supply      Smith      7253           678-1419  FR         N
23119   Randsets Ltd.    Anderson   7253           678-3998  FR         Y
24004   Brackman Bros.   Browning   0181           228-1410  UK         N
24288   ORDVA, Inc.      Dandala    0181           898-1234  SA         Y
25443   B&K, Inc.        Smith      0113           227-0093  SA         N
25501   Damal Supplies   Gounden    0181           890-3529  SA         N
25595   Rubicon Systems  Du Toit    0113           456-0092  SA         Y

Table name: PRODUCT

P_CODE    P_DESCRIPT                        P_INDATE   P_QOH  P_MIN  P_PRICE  P_DISCOUNT  V_CODE
11QER/31  Power painter, 15 psi., 3-nozzle  03-Nov-18      8      5   109.99        0.00  25595
13-Q2/P2  18 cm pwr. saw blade              13-Dec-18     32     15    14.99        0.05  21344
14-Q1/L3  22 cm pwr. saw blade              13-Nov-18     18     12    17.49        0.00  21344
1546-QQ2  Hrd. cloth, 0.6 cm, 2x50          15-Jan-19     15      8    39.95        0.00  23119
1558-QW1  Hrd. cloth, 1.25 cm, 3x50         15-Jan-19     23      5    43.99        0.00  23119
2232/QTY  B&D jigsaw, 30 cm blade           30-Dec-18      8      5   109.92        0.05  24288
2232/QWE  B&D jigsaw, 20 cm blade           24-Dec-18      6      5    99.87        0.05  24288
2238/QPD  B&D cordless drill, 1.25 cm       20-Jan-19     12      5    38.95        0.05  25595
23109-HB  Claw hammer                       20-Jan-19     23     10     9.95        0.10  21225
23114-AA  Sledge hammer, 7 kg               02-Jan-19      8      5    14.40        0.05
54778-2T  Rat-tail file, 0.5 cm fine        15-Dec-18     43     20     4.99        0.00  21344
89-WRE-Q  Hicut chain saw, 40 cm            07-Feb-19     11      5   256.99        0.05  24288
PVC23DRT  PVC pipe, 9 cm, 2.5 m             20-Feb-19    188     75     5.87        0.00
SM-18277  3 cm metal screw, 25              01-Mar-19    172     75     6.99        0.00  21225
SW-23116  6 cm wd. screw, 50                24-Feb-19    237    100     8.45        0.00  21231
WR3/TT3   Steel matting, … mesh             17-Jan-19     18      5   119.95        0.10  25595
The VENDOR table contains vendors who are not referenced in the PRODUCT table. Database designers note that possibility by saying that PRODUCT is optional to VENDOR; a vendor may exist without a reference to a product. You examined such optional relationships in detail in Chapter 5, Data Modelling with Entity Relationship Diagrams.

Existing V_CODE values in the PRODUCT table must (and do) have a match in the VENDOR table to ensure referential integrity.

A few products are supplied factory direct, a few are made in-house, and a few may have been bought in a special warehouse sale. In other words, a product is not necessarily supplied by a vendor. Therefore, VENDOR is optional to PRODUCT.

A few of the conditions just described were made for the sake of illustrating specific SQL features. For example, null V_CODE values were used in the PRODUCT table to illustrate (later) how you can track such nulls using SQL.
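As a preview of how the two optional relationships can be inspected, the following sketch uses the PRODUCT and VENDOR tables of Figure 8.2. The table aliases P and V are illustrative, and the statements assume the tables have already been created and loaded.

```sql
-- Products that are not supplied by a vendor (null V_CODE)
SELECT P_CODE, P_DESCRIPT
FROM   PRODUCT
WHERE  V_CODE IS NULL;

-- Vendors that are not referenced by any product
SELECT V_CODE, V_NAME
FROM   VENDOR V
WHERE  NOT EXISTS (SELECT 1
                   FROM   PRODUCT P
                   WHERE  P.V_CODE = V.V_CODE);
```

Note that the first query uses IS NULL rather than `= NULL`; a comparison with null never evaluates to true. With the data in Figure 8.2, the first query should list the two products that have no vendor code, 23114-AA and PVC23DRT.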
8.2.2 Creating the Database
Before you can use a new RDBMS, you must complete two tasks: first, create the database structure; second, create the tables that will hold the end-user data. To complete the first task, the RDBMS creates the physical files that will hold the database. When you create a new database, the RDBMS automatically creates the data dictionary tables to store the metadata and creates a default database administrator. Creating the physical files that will hold the database means interacting with the operating system and the file systems supported by the operating system. Therefore, creating the database structure is the one feature that tends to differ substantially from one RDBMS to another. The good news is that it is relatively easy to create a database structure, regardless of which RDBMS you use.

If you use Microsoft Access, creating the database is simple: start Access, select File/New/Blank Database, specify the folder in which you want to store the database, and then name the database. However, if you work in the database environment typically used by larger organisations, you will probably use an enterprise RDBMS such as Oracle, SQL Server or DB2. Given their security requirements and greater complexity, those database products require a more elaborate database creation process.

You will be relieved to discover that, with the exception of the database creation process, most RDBMS vendors use SQL that deviates little from the ANSI standard SQL. For example, most RDBMSs require that any SQL command be ended with a semicolon. However, some SQL implementations do not use a semicolon. Important syntax differences among implementations will be highlighted in Note boxes.

If you are using an enterprise RDBMS, before you can start creating tables you must be authenticated by the RDBMS. Authentication is the process through which the DBMS verifies that only registered users may access the database. To be authenticated, you must log on to the RDBMS using a user ID and a password created by the database administrator. In an enterprise RDBMS, every user ID is associated with a database schema.
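How the first task looks in practice depends on the product. As one hedged illustration, MySQL lets a suitably authorised user create and select a database directly in SQL. The database name SALECO is an assumption made for the example, and Oracle, SQL Server and DB2 each have their own, more elaborate procedures.

```sql
-- MySQL: create an empty database and make it the current one
CREATE DATABASE SALECO;
USE SALECO;

-- List the end-user tables in the new database (none have been created yet)
SHOW TABLES;
```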
8.2.3 The Database Schema
In the SQL environment, a schema is a group of database objects, such as tables and indexes, that are related to each other. Usually, the schema belongs to a single user or application. A single database can hold multiple schemas belonging to different users or applications. Think of a schema as a logical grouping of database objects, such as tables, indexes and views. Schemas are useful in that they group tables by owner (or function) and enforce a first level of security by allowing each user to see only the tables that belong to him or her.

ANSI SQL standards define a command to create a database schema:

CREATE SCHEMA AUTHORIZATION {creator};

Therefore, if the creator is JONES, use the command:

CREATE SCHEMA AUTHORIZATION JONES;

Most enterprise RDBMSs support that command. However, the command is seldom used directly, that is, from the command line. (When a user is created, the DBMS automatically assigns a schema to that user.) When the DBMS is used, the CREATE SCHEMA AUTHORIZATION command must be issued by the user who owns the schema. That is, if you log on as JONES, you can use only CREATE SCHEMA AUTHORIZATION JONES.

For most RDBMSs, the CREATE SCHEMA AUTHORIZATION command is optional. That is why this chapter focuses on the ANSI SQL commands required to create and manipulate tables.
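The following sketch shows how a schema groups objects under an owner. It assumes a user named JONES already exists and is logged on; the table PRICE_LIST and its columns are hypothetical. The SQL standard allows schema elements such as CREATE TABLE to be nested inside CREATE SCHEMA, and some RDBMSs expect at least one such element, so check your product's manual before relying on the exact form.

```sql
-- Create the schema for JONES together with one (hypothetical) table.
-- The nested CREATE TABLE is part of the CREATE SCHEMA statement,
-- so there is a single terminating semicolon.
CREATE SCHEMA AUTHORIZATION JONES
    CREATE TABLE PRICE_LIST (
        P_CODE  VARCHAR(10)  NOT NULL,
        P_PRICE DECIMAL(8,2) NOT NULL,
        PRIMARY KEY (P_CODE)
    );

-- Another user with the right privileges refers to the table
-- by qualifying its name with the schema name.
SELECT P_CODE, P_PRICE
FROM   JONES.PRICE_LIST;
```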
8.2.4 Data Types
After the database schema has been created, you are ready to define the PRODUCT and VENDOR table structures within the database. The table-creating SQL commands used in the example are based on the data dictionary shown in Table 8.4.

TABLE 8.4 Data dictionary for the Ch8_SaleCo database

Table    Attribute   Contents             Data Type    Format          Range         Required  PK or  FK Referenced
Name     Name                                                                                  FK     Table
PRODUCT  P_CODE      Product code         CHAR(10)     XXXXXXXXXX      NA            Y         PK
         P_DESCRIPT  Product description  VARCHAR(35)  Xxxxxxxxxxxx    NA            Y
         P_INDATE    Stocking date        DATE         DD-MON-YYYY     NA            Y
         P_QOH       Units available      SMALLINT     ####            0-9999        Y
         P_MIN       Minimum units        SMALLINT     ####            0-9999        Y
         P_PRICE     Product price        NUMBER(8,2)  ####.##         0.00-9999.00  Y
         P_DISCOUNT  Discount rate        NUMBER(4,2)  0.##            0.00-0.20     Y
         V_CODE      Vendor code          INTEGER      ###             100-999       N         FK     VENDOR
VENDOR   V_CODE      Vendor code          INTEGER      #####           1000-9999     Y         PK
         V_NAME      Vendor name          CHAR(35)     Xxxxxxxxxxxxxx  NA            Y
         V_CONTACT   Contact person       CHAR(25)     Xxxxxxxxxxxxxx  NA            Y
         V_AREACODE  Area code            CHAR(5)      99999           NA            Y
         V_PHONE     Phone number         CHAR(12)     999-999-9999    NA            Y
         V_COUNTRY   Country              CHAR(2)      XX              NA            Y
         V_ORDER     Previous order       CHAR(1)      X               Y or N        Y

FK = Foreign key
PK = Primary key
CHAR = Fixed character length data, 1 to 255 characters
VARCHAR = Variable character length data, 1 to 2 000 characters. May be labelled VARCHAR2 in Oracle.
NUMBER = Numeric data. NUMBER(9,2) is used to specify numbers with two decimal places and up to nine digits long, including the decimal places. Some RDBMSs permit the use of a MONEY or a CURRENCY data type.
INTEGER = Integer values only. Represented by NUMBER in Oracle.
SMALLINT = Small integer values only. Represented by NUMBER in Oracle.
DATE formats may vary. Commonly accepted formats are DD-MON-YYYY, DD-MON-YY, MM/DD/YYYY and MM/DD/YY.
* Not all of the ranges shown here will be illustrated in this chapter. However, you can use these constraints to practise writing your own.

As you examine the data dictionary in Table 8.4, note particularly the data types selected. Keep in mind that data type selection is usually dictated by the nature of the data and by the intended use. For example:

P_PRICE clearly requires some kind of numeric data type; defining it as a character field is not acceptable.

Just as clearly, a vendor name is an obvious candidate for a character data type. For example, VARCHAR2(35) fits well because vendor names are variable-length character strings, and in this case, such strings may be up to 35 characters long.

Country abbreviations are always two characters, so CHAR(2) is a logical choice.

Selecting P_INDATE to be a (Julian) DATE field rather than a character field is desirable because Julian dates allow you to make simple date comparisons and to perform date arithmetic. For instance, if you have two dates, 15 April, 2019 and 1 March, 2018, you can determine how many days are between them by using 15-APR-2019 - 01-MAR-2018.
If you use DATE fields, you can also determine what the date will be 60 days from 15 February 2018 by using 15-FEB-2018 + 60. Or you can use the RDBMS's system date (SYSDATE in Oracle, Date() in Microsoft Access) to answer questions such as, 'What will be the date 60 days from today?' For example, you might use SYSDATE + 60 (in Oracle) or Date() + 60 (in Access). Date arithmetic capability is particularly useful in billing. Perhaps you want your system to start charging interest on a customer balance 60 days after the invoice is generated. Such simple date arithmetic would be impossible if you used a character data type.

On the other hand, data type selection sometimes requires professional judgement. For example, you must make a decision about the V_CODE's data type as follows:

If you want the computer to generate new vendor codes by adding 1 to the largest recorded vendor code, you must classify V_CODE as a numeric attribute. (You cannot perform mathematical procedures on character data.) The designation INTEGER will ensure that only the counting numbers (integers) can be used. Most SQL implementations also permit the use of SMALLINT for integer values up to six digits.

If you do not want to perform mathematical procedures based on V_CODE, you should classify it as a character attribute, even though it is composed entirely of numbers. Character data are quicker to process in queries. Therefore, when there is no need to perform mathematical procedures on the attribute, store it as a character attribute.

The first option is used to demonstrate the SQL procedures in this chapter.

When you define the attribute's data type, you must pay close attention to the expected use of the attributes for sorting and data retrieval purposes. For example, in a real estate application, an attribute that represents the number of bathrooms in a home (H_BATH_NUM) could be assigned the CHAR(3) data type because it is highly unlikely the application will do any addition, multiplication or division with the number of bathrooms. Based on the CHAR(3) data type definition, valid H_BATH_NUM values would be 2, 1, 2.5, 10. However, this data type decision creates potential problems. For example, if an application sorts the homes by number of bathrooms, a query would see the value 10 as less than 2, which is clearly incorrect. So you need to give some thought to the expected use of the data in order to define the attribute data type properly.

The data dictionary in Table 8.4 contains only a few of the data types supported by SQL. For teaching purposes, the selection of data types is limited to ensure that almost any RDBMS can be used to implement the examples. If your RDBMS is fully compliant with ANSI SQL, it will support many more data types than the ones shown in Table 8.5. And many RDBMSs support data types beyond the ones specified in ANSI SQL.

TABLE 8.5 Some common SQL data types

Data Type  Format          Comments
Numeric    NUMBER(L,D)     The declaration NUMBER(7,2) indicates numbers that will be stored with two decimal places and may be up to seven digits long, including the sign and the decimal place (for example, 12.32 or -134.99).
           INTEGER         May be abbreviated as INT. Integers are (whole) counting numbers, so they cannot be used if you want to store numbers that require decimal places.
           SMALLINT        Like INTEGER but limited to integer values up to six digits. If your integer values are relatively small, use SMALLINT instead of INT.
           DECIMAL(L,D)    Like the NUMBER specification, but the storage length is a minimum specification. That is, greater lengths are acceptable, but smaller ones are not. DECIMAL(9,2), DECIMAL(9) and DECIMAL are all acceptable.
           FLOAT(L,D)      Float is similar to DECIMAL and is used more often in MySQL.
Character  CHAR(L)         Fixed-length character data for up to 255 characters. If you store strings that are not as long as the CHAR parameter value, the remaining spaces are left unused. Therefore, if you specify CHAR(25), strings such as Smith and Katzenjammer are each stored as 25 characters. However, a US area code is always three digits long, so CHAR(3) would be appropriate if you wanted to store such codes.
           VARCHAR(L) or   Variable-length character data. The designation VARCHAR2(25) will let you store characters up to 25 characters long. However, VARCHAR will not leave unused spaces. Oracle automatically converts VARCHAR to VARCHAR2.
           VARCHAR2(L)
Date       DATE            Stores dates in the Julian date format.

In addition to the data types shown in Table 8.5, SQL supports several other data types, including TIME, TIMESTAMP, REAL, DOUBLE, FLOAT and intervals such as INTERVAL DAY TO HOUR. Many RDBMSs also have expanded the list to include other types of data, such as LOGICAL, CURRENCY, AutoNumber (Access) and sequence (Oracle). However, because this chapter is designed to introduce the SQL basics, the discussion is limited to the data types summarised in Table 8.5.
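The date arithmetic and the sorting problem discussed in this section can be sketched as follows. This is an Oracle-flavoured illustration rather than portable SQL: DUAL is Oracle's one-row dummy table, subtracting two DATE values in Oracle yields a number of days, and the HOME table with its H_BATH_NUM column is hypothetical. Other RDBMSs use different date functions, and the character sort order shown assumes a plain binary collation.

```sql
-- Days between 1 March 2018 and 15 April 2019
SELECT DATE '2019-04-15' - DATE '2018-03-01' AS DAYS_BETWEEN   -- 410
FROM   DUAL;

-- The date 60 days after 15 February 2018
SELECT DATE '2018-02-15' + 60 AS DUE_DATE                      -- 16 April 2018
FROM   DUAL;

-- The date 60 days from today
SELECT SYSDATE + 60 AS SIXTY_DAYS_FROM_TODAY
FROM   DUAL;

-- Sorting pitfall: H_BATH_NUM declared as CHAR(3) sorts as text,
-- so the values come back as 1, 10, 2, 2.5
SELECT H_BATH_NUM
FROM   HOME
ORDER  BY H_BATH_NUM;

-- Converting to a numeric type in the ORDER BY restores 1, 2, 2.5, 10
SELECT H_BATH_NUM
FROM   HOME
ORDER  BY CAST(H_BATH_NUM AS DECIMAL(3,1));
```

Declaring H_BATH_NUM with a numeric data type in the first place avoids the need for the conversion.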
8.2.5 Creating Table Structures
Now you are ready to implement the PRODUCT and VENDOR table structures with the help of SQL, using the CREATE TABLE syntax shown next.

CREATE TABLE tablename (
    column1    data type    [constraint] [,
    column2    data type    [constraint] ] [,
    PRIMARY KEY (column1 [, column2]) ] [,
    FOREIGN KEY (column1 [, column2]) REFERENCES tablename] [,
    CONSTRAINT constraint ] );

Online Content
For Oracle users, all the SQL commands you will see in this chapter are available on the online platform for this book. You can copy and paste the SQL commands into Oracle SQL Developer or Oracle APEX.

To make the SQL code more readable, most SQL programmers use one line per column (attribute) definition. In addition, spaces are used to line up the attribute characteristics and constraints. Finally, both table and attribute names are fully capitalised. Those conventions are used in the following examples that create VENDOR and PRODUCT tables and throughout the book.
NOTE ABOUT SQL SYNTAX
Syntax notation for SQL commands used in this book:

CAPITALS        Required SQL command keywords
Italics         An end-user-provided parameter (generally required)
{a | b | ..}    A mandatory parameter; use one option from the list separated by |
[......]        An optional parameter; anything inside square brackets is optional
tablename       The name of a table
column          The name of an attribute in a table
data type       A valid data type definition
constraint      A valid constraint definition
condition       A valid conditional expression (evaluates to true or false)
columnlist      One or more column names or expressions separated by commas
tablelist       One or more table names separated by commas
conditionlist   One or more conditional expressions separated by logical operators
expression      A simple value (that is, 76 or Married) or a formula (that is, P_PRICE - 10)

CREATE TABLE VENDOR (
    V_CODE        INTEGER        NOT NULL    UNIQUE,
    V_NAME        VARCHAR(35)    NOT NULL,
    V_CONTACT     VARCHAR(15)    NOT NULL,
    V_AREACODE    CHAR(5)        NOT NULL,
    V_PHONE       CHAR(12)       NOT NULL,
    V_Country     CHAR(2)        NOT NULL,
    V_ORDER       CHAR(1)        NOT NULL,
    PRIMARY KEY (V_CODE));
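A short sketch of how the constraints in the VENDOR definition behave once the table exists may help. The first row uses values from Figure 8.2; the other two rows (including the names "Duplicate Ltd." and "Nobody") are made up purely to trigger errors, and the exact error messages depend on the RDBMS.

```sql
-- Accepted: every NOT NULL column has a value and V_CODE is new
INSERT INTO VENDOR (V_CODE, V_NAME, V_CONTACT, V_AREACODE, V_PHONE, V_Country, V_ORDER)
VALUES (21225, 'Bryson, Inc.', 'Smithson', '0181', '223-3234', 'UK', 'Y');

-- Rejected: V_CODE 21225 already exists (PRIMARY KEY / UNIQUE violation)
INSERT INTO VENDOR (V_CODE, V_NAME, V_CONTACT, V_AREACODE, V_PHONE, V_Country, V_ORDER)
VALUES (21225, 'Duplicate Ltd.', 'Nobody', '0181', '000-0000', 'UK', 'N');

-- Rejected: V_NAME is declared NOT NULL
INSERT INTO VENDOR (V_CODE, V_NAME, V_CONTACT, V_AREACODE, V_PHONE, V_Country, V_ORDER)
VALUES (21999, NULL, 'Nobody', '0181', '000-0000', 'UK', 'N');
```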
NOTE
Because the PRODUCT table contains a foreign key that references the VENDOR table, create the VENDOR table first. (In fact, the * side of a relationship always references the 1 side. Therefore, in a 1:* relationship, you must always create the table for the 1 side first.)

If your RDBMS does not support the VARCHAR2 and FCHAR format, use CHAR.

Oracle accepts the VARCHAR data type and automatically converts it to VARCHAR2.

If your RDBMS does not support SINT or SMALLINT, use INTEGER or INT. If INTEGER is not supported, use NUMBER.

If you use Microsoft Access, you can use the NUMBER data type, but you cannot use the number delimiters at the SQL level. For example, using NUMBER(8,2) to indicate numbers with up to eight characters and two decimal places is fine in Oracle, but you cannot use it in Access; instead, use NUMBER without delimiters.
If your RDBMS does not support primary and foreign key designations or the UNIQUE specification, delete them from the SQL code shown here.
If you use the PRIMARY KEY designation in Oracle, you do not need the NOT NULL and UNIQUE specifications.
The ON UPDATE CASCADE clause is part of the ANSI standard, but it may not be supported by your RDBMS. In that case, delete the ON UPDATE CASCADE clause.

CREATE TABLE PRODUCT (
P_CODE        VARCHAR(10)    NOT NULL    UNIQUE,
P_DESCRIPT    VARCHAR(35)    NOT NULL,
P_INDATE      DATE           NOT NULL,
P_QOH         SMALLINT       NOT NULL,
P_MIN         SMALLINT       NOT NULL,
P_PRICE       NUMBER(8,2)    NOT NULL,
P_DISCOUNT    NUMBER(4,2)    NOT NULL,
V_CODE        INTEGER,
PRIMARY KEY (P_CODE),
FOREIGN KEY (V_CODE) REFERENCES VENDOR ON UPDATE CASCADE);

As you examine the preceding SQL table-creating command sequences, note the following features:
- The NOT NULL specifications for the attributes ensure that a data entry will be made. When it is crucial to have the data available, the NOT NULL specification will not allow the end user to leave the attribute empty (with no data entry at all). Because this specification is made at the table level and stored in the data dictionary, application programs can use this information to create the data dictionary validation automatically.
- The UNIQUE specification creates a unique index in the respective attribute. Use it to avoid duplicated values in a column.
- The primary key attributes contain both a NOT NULL and a UNIQUE specification. Those specifications enforce the entity integrity requirements. If the NOT NULL and UNIQUE specifications are not supported, use PRIMARY KEY without the specifications. (For example, if you designate the PK in Microsoft Access, the NOT NULL and UNIQUE specifications are automatically assumed and are not spelled out.)
- The entire table definition is enclosed in parentheses. A comma is used to separate each table element (attributes, primary key and foreign key) definition.

NOTE
If you are working with a composite primary key, all of the primary key's attributes are contained within the parentheses and are separated with commas. For example, the LINE table in Figure 8.1 has a primary key that consists of the two attributes INV_NUMBER and LINE_NUMBER. Therefore, you would define the primary key by typing:
PRIMARY KEY (INV_NUMBER, LINE_NUMBER),
The order of the primary key components is important because the indexing starts with the first-mentioned attribute, then proceeds with the next attribute, and so on. In this example, the line numbers would be ordered within each of the invoice numbers:

INV_NUMBER    LINE_NUMBER
1001          1
1001          2
1002          1
1003          1
1003          2

- The ON UPDATE CASCADE specification ensures that if you make a change in any VENDOR's V_CODE, that change is automatically applied to all foreign key references throughout the system (cascade) to ensure that referential integrity is maintained. (Although the ON UPDATE CASCADE clause is part of the ANSI standard, some RDBMSs, such as Oracle, do not support it. If your RDBMS does not support the clause, delete it from the code shown here.)
- An RDBMS automatically enforces referential integrity for foreign keys. That is, you cannot have an invalid entry in the foreign key column; at the same time, you cannot delete a vendor row as long as a product row references that vendor.
- The command sequence ends with a semicolon. (Remember, your RDBMS may require that you omit the semicolon.)

NOTE ABOUT COLUMN NAMES
Do not use mathematical symbols such as +, - and /. For example, PER-NUM may generate an error message, but PER_NUM is acceptable. Also, do not use reserved words. Reserved words are words used by SQL to perform specific functions. For example, in some RDBMSs, the column name INITIAL generates the message invalid column name.
NOTE TO ORACLE USERS
If you are using Oracle's SQL to create tables, when you press the Enter key after typing each line, a line number is automatically generated as long as you do not type a semicolon before pressing the Enter key. Line numbers are also automatically generated when using Oracle's SQL Developer. For example, in Oracle, the execution of the CREATE TABLE command looks like this:

CREATE TABLE PRODUCT (
 2  P_CODE        VARCHAR2(10)
 3  CONSTRAINT    PRODUCT_P_CODE_PK    PRIMARY KEY,
 4  P_DESCRIPT    VARCHAR2(35)    NOT NULL,
 5  P_INDATE      DATE            NOT NULL,
 6  P_QOH         NUMBER          NOT NULL,
 7  P_MIN         NUMBER          NOT NULL,
 8  P_PRICE       NUMBER(8,2)     NOT NULL,
 9  P_DISCOUNT    NUMBER(5,2)     NOT NULL,
10  V_CODE        NUMBER,
11  CONSTRAINT    PRODUCT_V_CODE_FK
12  FOREIGN KEY (V_CODE) REFERENCES VENDOR);

As you examine the preceding SQL command sequence, note the following:
- The attribute definition for P_CODE starts in line 2 and ends with a comma at the end of line 3.
- The CONSTRAINT clause (line 3) allows you to define and name a constraint in Oracle. You can name the constraint to meet your own naming conventions. In this case, the constraint was named PRODUCT_P_CODE_PK.
- Examples of constraints are NOT NULL, UNIQUE, PRIMARY KEY, FOREIGN KEY and CHECK. For additional details about constraints, see below.
- To define a PRIMARY KEY constraint, you could also use the following syntax: P_CODE VARCHAR2(10) PRIMARY KEY,. In this case, Oracle automatically names the constraint.
- Lines 11 and 12 define a FOREIGN KEY constraint named PRODUCT_V_CODE_FK for the attribute V_CODE. The CONSTRAINT clause is generally used at the end of the CREATE TABLE command sequence.
- If you do not name the constraints yourself, Oracle automatically assigns a name. Unfortunately, the Oracle-assigned name makes sense only to Oracle, so you will have a difficult time deciphering it later. You should assign a name that makes sense to human beings!
8.2.6 SQL Constraints
In Chapter 3, Relational Model Characteristics, you learnt that adherence to entity integrity and referential integrity rules is crucial in a relational database environment. Fortunately, most SQL implementations support both integrity rules. Entity integrity is enforced automatically when the primary key is specified in the CREATE TABLE command sequence. For example, you can create the VENDOR table structure and set the stage for the enforcement of entity integrity rules by using:
PRIMARY KEY (V_CODE)
As you look at the PRODUCT table's CREATE TABLE sequence, note that referential integrity has been enforced by specifying in the PRODUCT table:
FOREIGN KEY (V_CODE) REFERENCES VENDOR ON UPDATE CASCADE
That foreign key constraint definition ensures that:
- You cannot delete a vendor from the VENDOR table if at least one product row references that vendor. This is the default behaviour for the treatment of foreign keys.
- On the other hand, if a change is made in an existing VENDOR table's V_CODE, that change must be reflected automatically in any PRODUCT table V_CODE reference (ON UPDATE CASCADE).
That restriction makes it impossible for a V_CODE value to exist in the PRODUCT table pointing to a non-existent VENDOR table V_CODE value. In other words, the ON UPDATE CASCADE specification ensures the preservation of referential integrity. (Oracle does not support ON UPDATE CASCADE.)
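The two guarantees of the foreign key definition can be tried out in a small sandbox. The sketch below is only an illustration under stated assumptions: it uses SQLite through Python's standard sqlite3 module as a stand-in RDBMS (SQLite is not one of the products discussed in this chapter), the tables are cut down to the columns needed, and the product code 'XX-000' and vendor code 99999 are made-up values.

```python
import sqlite3

# Assumption: SQLite stands in for the RDBMS; it is not one of the book's products.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only on request

conn.execute("""
    CREATE TABLE VENDOR (
        V_CODE INTEGER PRIMARY KEY,
        V_NAME VARCHAR(35) NOT NULL)""")
conn.execute("""
    CREATE TABLE PRODUCT (
        P_CODE VARCHAR(10) PRIMARY KEY,
        V_CODE INTEGER,
        FOREIGN KEY (V_CODE) REFERENCES VENDOR (V_CODE) ON UPDATE CASCADE)""")

conn.execute("INSERT INTO VENDOR VALUES (21225, 'Bryson, Inc.')")
conn.execute("INSERT INTO PRODUCT VALUES ('23109-HB', 21225)")

# A product row cannot point to a vendor that does not exist (hypothetical values).
try:
    conn.execute("INSERT INTO PRODUCT VALUES ('XX-000', 99999)")
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True

# A vendor that is still referenced cannot be deleted (the default behaviour).
try:
    conn.execute("DELETE FROM VENDOR WHERE V_CODE = 21225")
    delete_rejected = False
except sqlite3.IntegrityError:
    delete_rejected = True

# A change to the vendor's V_CODE is cascaded to the referencing PRODUCT row.
conn.execute("UPDATE VENDOR SET V_CODE = 30000 WHERE V_CODE = 21225")
product_v_code = conn.execute(
    "SELECT V_CODE FROM PRODUCT WHERE P_CODE = '23109-HB'").fetchone()[0]

print(orphan_rejected, delete_rejected, product_v_code)
```

In an RDBMS that does not support ON UPDATE CASCADE, such as Oracle, the clause would have to be removed and the cascading update handled by other means.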
Online Content
For a more detailed discussion of the options for the ON DELETE and ON UPDATE clauses, see Appendix D, Converting an ER Model into a Database Structure, Section D.2, General Rules Governing Relationships Among Tables. Appendix D is available on the online platform for this book.

NOTE ABOUT REFERENTIAL CONSTRAINT ACTIONS
The support for the referential constraint actions varies from product to product. For example:
- MySQL, SQL Server and Oracle support ON DELETE CASCADE.
- MySQL and SQL Server support ON UPDATE CASCADE.
- Oracle does not support ON UPDATE CASCADE.
- Oracle and MySQL support SET NULL.
Refer to your product manuals for additional information on referential constraints.
While Microsoft Access does not support ON DELETE CASCADE or ON UPDATE CASCADE at the SQL command-line level, it does support them through the relationship window interface. In fact, whenever you try to establish a relationship between two tables in Access, the relationship window interface automatically pops up.
In general, ANSI SQL permits the use of ON DELETE and ON UPDATE clauses to cover any of the following actions: CASCADE, SET NULL or SET DEFAULT.
Besides the PRIMARY KEY and FOREIGN KEY constraints, the ANSI SQL standard also defines the following constraints:
- The NOT NULL constraint is used to ensure that a column does not accept nulls.
- The UNIQUE constraint is used to ensure that all values in a column are unique.
- The DEFAULT constraint is used to assign a value to an attribute when a new row is added to a table. The end user may, of course, enter a value other than the default value.
- The CHECK constraint is used to validate data when an attribute value is entered. The CHECK constraint does precisely what its name suggests: it checks to see that a specified condition exists. Examples of such constraints include the following:
  - The minimum order value must be at least ten.
  - The date must be after 15 April 2019.
  If the CHECK constraint is met for the specified attribute (that is, the condition is true), the data are accepted for that attribute. If the condition is found to be false, an error message is generated and the data are not accepted.
Note that the CREATE TABLE command lets you define constraints in two different places:
- When you create the column definition (known as a column constraint).
- When you use the CONSTRAINT keyword (known as a table constraint).
A column constraint applies to just one column; a table constraint may apply to many columns. Those constraints are supported at varying levels of compliance by enterprise RDBMSs.
In this chapter, Oracle is used to illustrate SQL constraints. For example, note that the following SQL command sequence uses the DEFAULT and CHECK constraints to define the table named CUSTOMER.

CREATE TABLE CUSTOMER (
CUS_CODE        NUMBER         PRIMARY KEY,
CUS_LNAME       VARCHAR(15)    NOT NULL,
CUS_FNAME       VARCHAR(15)    NOT NULL,
CUS_INITIAL     CHAR(1),
CUS_AREACODE    CHAR(5)        DEFAULT '0181' NOT NULL
                CHECK(CUS_AREACODE IN ('0181','0161','7253')),
CUS_PHONE       CHAR(12)       NOT NULL,
CUS_BALANCE     NUMBER(9,2)    DEFAULT 0.00,
CONSTRAINT CUS_UI1 UNIQUE (CUS_LNAME, CUS_FNAME));

In that case, the CUS_AREACODE attribute is assigned a default value of 0181. Therefore, if a new CUSTOMER table row is added and the end user makes no entry for the area code, the 0181 value is recorded. Also note that the CHECK condition restricts the values for the customer's area code to 0181, 0161 and 7253; any other values are rejected.
It is important to note that the DEFAULT value applies only when new rows are added to a table and then only when no value is entered for the customer's area code. (The default value is not used when the table is modified.) In contrast, the CHECK condition is validated whether a customer row is added or modified. However, while the CHECK condition may include any valid expression, it applies only to the attributes in the table being checked. If you want to check for conditions that include attributes in other tables, you must use triggers. (See Chapter 9, Procedural Language SQL and Advanced SQL.) Finally, the last line of the CREATE TABLE command sequence creates a unique index constraint (named CUS_UI1) on the customer's last name and first name. The index prevents the entry of two customers with the same last name and first name. (This index merely illustrates the process. Clearly, it should be possible to have more than one person named John Smith in the CUSTOMER table.)
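The behaviour of the DEFAULT, CHECK and UNIQUE constraints on the CUSTOMER table can be observed directly. The following sketch makes several assumptions: SQLite (through Python's sqlite3 module) stands in for Oracle, NUMBER is replaced by INTEGER and NUMERIC, the table is trimmed to the relevant columns, and the customer codes and names are invented sample values. The helper function `rejected` is also hypothetical; it simply reports whether a statement was refused.

```python
import sqlite3

# Assumption: SQLite stands in for Oracle, so NUMBER becomes INTEGER/NUMERIC.
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE CUSTOMER (
        CUS_CODE     INTEGER PRIMARY KEY,
        CUS_LNAME    VARCHAR(15) NOT NULL,
        CUS_FNAME    VARCHAR(15) NOT NULL,
        CUS_AREACODE CHAR(5) DEFAULT '0181' NOT NULL
                     CHECK (CUS_AREACODE IN ('0181', '0161', '7253')),
        CUS_BALANCE  NUMERIC(9,2) DEFAULT 0.00,
        CONSTRAINT CUS_UI1 UNIQUE (CUS_LNAME, CUS_FNAME))""")


def rejected(sql):
    """Hypothetical helper: True if the RDBMS refuses the statement."""
    try:
        conn.execute(sql)
        return False
    except sqlite3.IntegrityError:
        return True


# No area code is supplied, so the DEFAULT value is recorded (sample data).
conn.execute("INSERT INTO CUSTOMER (CUS_CODE, CUS_LNAME, CUS_FNAME) "
             "VALUES (10010, 'Ramas', 'Alfred')")
default_code = conn.execute(
    "SELECT CUS_AREACODE FROM CUSTOMER WHERE CUS_CODE = 10010").fetchone()[0]

# The CHECK condition is validated when a row is added ...
check_on_insert = rejected(
    "INSERT INTO CUSTOMER (CUS_CODE, CUS_LNAME, CUS_FNAME, CUS_AREACODE) "
    "VALUES (10011, 'Dunne', 'Leona', '9999')")
# ... and also when an existing row is modified.
check_on_update = rejected(
    "UPDATE CUSTOMER SET CUS_AREACODE = '9999' WHERE CUS_CODE = 10010")
# CUS_UI1 refuses a second customer with the same last and first name.
duplicate_name = rejected(
    "INSERT INTO CUSTOMER (CUS_CODE, CUS_LNAME, CUS_FNAME) "
    "VALUES (10012, 'Ramas', 'Alfred')")

print(default_code, check_on_insert, check_on_update, duplicate_name)
```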
NOTE TO MICROSOFT ACCESS USERS
Microsoft Access will not accept the DEFAULT or CHECK constraints. However, Access will accept the CONSTRAINT CUS_UI1 UNIQUE (CUS_LNAME, CUS_FNAME) line and create the unique index.

In the following SQL command to create the INVOICE table, the DEFAULT constraint assigns a default date to a new invoice, and the CHECK constraint validates that the invoice date is greater than 1 January 2019.

CREATE TABLE INVOICE (
INV_NUMBER    NUMBER    PRIMARY KEY,
CUS_CODE      NUMBER    NOT NULL REFERENCES CUSTOMER(CUS_CODE),
INV_DATE      DATE      DEFAULT SYSDATE NOT NULL,
CONSTRAINT INV_CK1 CHECK (INV_DATE > TO_DATE('01-JAN-2019','DD-MON-YYYY')));

In this case, notice the following:
- The CUS_CODE attribute definition contains REFERENCES CUSTOMER (CUS_CODE) to indicate that the CUS_CODE is a foreign key. This is another way to define a foreign key.
- The DEFAULT constraint uses the SYSDATE special function. This function always returns today's date.
- The invoice date (INV_DATE) attribute is automatically given today's date (returned by SYSDATE) when a new row is added and no value is given for the attribute.
- A CHECK constraint is used to validate that the invoice date is greater than '1 January 2019'. When comparing a date to a manually entered date in a CHECK clause, Oracle requires the use of the TO_DATE function. The TO_DATE function takes two parameters, the literal date and the date format used.

The final SQL command sequence creates the LINE table. The LINE table has a composite primary key (INV_NUMBER, LINE_NUMBER) and uses a UNIQUE constraint in INV_NUMBER and P_CODE to ensure that the same product is not ordered twice in the same invoice.

CREATE TABLE LINE (
INV_NUMBER     NUMBER         NOT NULL,
LINE_NUMBER    NUMBER(2,0)    NOT NULL,
P_CODE         VARCHAR(10)    NOT NULL,
LINE_UNITS     NUMBER(9,2)    DEFAULT 0.00    NOT NULL,
LINE_PRICE     NUMBER(9,2)    DEFAULT 0.00    NOT NULL,
PRIMARY KEY (INV_NUMBER,LINE_NUMBER),
FOREIGN KEY (INV_NUMBER) REFERENCES INVOICE ON DELETE CASCADE,
FOREIGN KEY (P_CODE) REFERENCES PRODUCT(P_CODE),
CONSTRAINT LINE_UI1 UNIQUE(INV_NUMBER, P_CODE));

In the creation of the LINE table, note that a UNIQUE constraint is added to prevent the duplication of an invoice line. A UNIQUE constraint is enforced through the creation of a unique index. Also note that the ON DELETE CASCADE foreign key action enforces referential integrity. The use of ON DELETE CASCADE is recommended for weak entities to ensure that the deletion of a row in the strong entity automatically triggers the deletion of the corresponding rows in the dependent weak entity. In that case, the deletion of an INVOICE row automatically deletes all of the LINE rows related to the invoice. In the following section, you will learn more about indexes and how to use SQL commands to create them.
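The interplay between the LINE_UI1 constraint and the ON DELETE CASCADE action can be sketched as follows. This is an illustrative example only, assuming SQLite (via Python's sqlite3 module) in place of Oracle; the INVOICE and LINE tables are reduced to their key columns, the foreign key to PRODUCT is left out, and the invoice and product values are sample data.

```python
import sqlite3

# Assumption: SQLite stands in for Oracle; tables are reduced to key columns.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # needed for SQLite to apply FK actions

conn.execute("CREATE TABLE INVOICE (INV_NUMBER INTEGER PRIMARY KEY)")
conn.execute("""
    CREATE TABLE LINE (
        INV_NUMBER  INTEGER     NOT NULL,
        LINE_NUMBER INTEGER     NOT NULL,
        P_CODE      VARCHAR(10) NOT NULL,
        PRIMARY KEY (INV_NUMBER, LINE_NUMBER),
        FOREIGN KEY (INV_NUMBER) REFERENCES INVOICE ON DELETE CASCADE,
        CONSTRAINT LINE_UI1 UNIQUE (INV_NUMBER, P_CODE))""")

conn.executemany("INSERT INTO INVOICE VALUES (?)", [(1001,), (1002,)])
conn.executemany("INSERT INTO LINE VALUES (?, ?, ?)", [
    (1001, 1, '13-Q2/P2'),
    (1001, 2, '23109-HB'),
    (1002, 1, '13-Q2/P2'),   # same product, different invoice: allowed
])

# Ordering the same product twice in the same invoice violates LINE_UI1.
try:
    conn.execute("INSERT INTO LINE VALUES (1001, 3, '13-Q2/P2')")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

# Deleting the strong entity row removes its dependent LINE rows as well.
conn.execute("DELETE FROM INVOICE WHERE INV_NUMBER = 1001")
remaining = conn.execute(
    "SELECT INV_NUMBER, LINE_NUMBER FROM LINE "
    "ORDER BY INV_NUMBER, LINE_NUMBER").fetchall()

print(duplicate_rejected, remaining)
```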
NOTE
The current release of the community edition of MySQL is currently 8.0. The InnoDB storage engine was integrated as the default storage engine for tables created with MySQL. This meant that the MySQL Data Manipulation Language (DML) operations followed the standard ACID (Atomicity, Consistency, Isolation and Durability) properties of database transactions, in line with other major RDBMS such as Oracle. (ACID properties will be covered in more detail in Chapter 12, Managing Transactions and Concurrency.) MySQL maintains data integrity through the InnoDB engine by supporting FOREIGN KEY constraints. This ensures that consistency is maintained across all tables when data is inserted, updated or deleted.
To summarise, MySQL version 8.0 (and beyond) supports the following SQL constraints:
- UNIQUE
- PRIMARY KEY
- FOREIGN KEY
It is important to note that, in MySQL releases before 8.0.16, the CHECK constraint can be added when a table is created, but it actually has no effect when data is entered into the corresponding table; in those releases CHECK is not supported. (MySQL 8.0.16 and later enforce CHECK constraints.) Consider the CUSTOMER table that has been created in this section using Oracle SQL. Below is the MySQL command sequence to actually create the CUSTOMER table. (MySQL has no NUMBER data type, so DECIMAL is used for CUS_BALANCE.)

CREATE TABLE CUSTOMER (
CUS_CODE        INTEGER        PRIMARY KEY,
CUS_LNAME       VARCHAR(15)    NOT NULL,
CUS_FNAME       VARCHAR(15)    NOT NULL,
CUS_INITIAL     CHAR(1),
CUS_AREACODE    CHAR(5)        NOT NULL DEFAULT '0181',
CUS_PHONE       CHAR(12)       NOT NULL,
CUS_BALANCE     DECIMAL(9,2)   NOT NULL DEFAULT 0.00,
UNIQUE (CUS_LNAME, CUS_FNAME));
8.2.7 SQL Indexes
You learnt in Chapter 3, Relational Model Characteristics, that indexes can be used to improve the efficiency of searches and to avoid duplicate column values. In the previous section, you saw how to declare unique indexes on selected attributes when the table is created. In fact, when you declare a primary key, the DBMS automatically creates a unique index. Even with this feature, you often need additional indexes. The ability to create indexes quickly and efficiently is important. Using the CREATE INDEX command, SQL indexes can be created on the basis of any selected attribute. The syntax is:
CREATE [UNIQUE] INDEX indexname ON tablename(column1 [, column2])
For example, based on the attribute P_INDATE stored in the PRODUCT table, the following command creates an index named P_INDATEX:
CREATE INDEX P_INDATEX ON PRODUCT(P_INDATE);
SQL does not let you overwrite an existing index without warning you first, thus preserving the index structure within the data dictionary. Using the UNIQUE index qualifier, you can even create an index that prevents you from using a value that has been used before. Such a feature is especially useful when the index attribute is a primary key (PK) whose values must not be duplicated:
CREATE UNIQUE INDEX P_CODEX ON PRODUCT(P_CODE);
If you now try to enter a duplicate P_CODE value, SQL produces the error message duplicate value in index. Many RDBMSs, including Access, automatically create a unique index on the PK attribute(s) when you declare the PK.
A common practice is to create an index on any field that is used as a search key, in comparison operations in a conditional expression, or when you want to list rows in a specific order. For example, if you want to create a report of all products by vendor, it would be useful to create an index on the V_CODE attribute in the PRODUCT table. Remember that a vendor can supply many products. Therefore, you should not create a unique index in this case. Better yet, to make the search as efficient as possible, using a composite index is recommended.
Unique composite indexes are often used to prevent data duplication. For example, consider the case illustrated in Table 8.6, in which required employee test scores are stored. (An employee can take a test only once on a given date.) Given the structure of Table 8.6, the PK is EMP_NUM + TEST_NUM. The third test entry for employee 111 meets entity integrity requirements (the combination 111,3 is unique), yet the WEA test entry is clearly duplicated.

TABLE 8.6 A duplicated test record

EMP_NUM    TEST_NUM    TEST_CODE    TEST_DATE      TEST_SCORE
110        1           WEA          15-May-2018    93
110        2           WEA          12-May-2018    87
111        1           HAZ          14-Dec-2018    91
111        2           WEA          18-Feb-2019    95
111        3           WEA          18-Feb-2019    95
112        1           CHEM         17-Aug-2018    91

Such duplication could have been avoided through the use of a unique composite index, using the attributes EMP_NUM, TEST_CODE and TEST_DATE:
CREATE UNIQUE INDEX EMP_TESTDEX ON TEST(EMP_NUM, TEST_CODE, TEST_DATE);
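To see the unique composite index at work, the rows of Table 8.6 can be loaded into a scratch table. The sketch below is an illustration under assumptions: SQLite (through Python's sqlite3 module) stands in for the RDBMS, the TEST table definition is inferred from the column headings of Table 8.6, and the dates are written in ISO format for convenience.

```python
import sqlite3

# Assumption: SQLite stands in for the RDBMS; the TEST table layout is
# inferred from the column headings of Table 8.6.
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE TEST (
        EMP_NUM    INTEGER    NOT NULL,
        TEST_NUM   INTEGER    NOT NULL,
        TEST_CODE  VARCHAR(5) NOT NULL,
        TEST_DATE  DATE       NOT NULL,
        TEST_SCORE INTEGER,
        PRIMARY KEY (EMP_NUM, TEST_NUM))""")
conn.execute(
    "CREATE UNIQUE INDEX EMP_TESTDEX ON TEST(EMP_NUM, TEST_CODE, TEST_DATE)")

# The five legitimate rows of Table 8.6.
conn.executemany("INSERT INTO TEST VALUES (?, ?, ?, ?, ?)", [
    (110, 1, 'WEA',  '2018-05-15', 93),
    (110, 2, 'WEA',  '2018-05-12', 87),
    (111, 1, 'HAZ',  '2018-12-14', 91),
    (111, 2, 'WEA',  '2019-02-18', 95),
    (112, 1, 'CHEM', '2018-08-17', 91),
])

# The third entry for employee 111 has a unique PK (111, 3) but repeats
# the same test on the same date, so the composite index refuses it.
try:
    conn.execute("INSERT INTO TEST VALUES (111, 3, 'WEA', '2019-02-18', 95)")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

row_count = conn.execute("SELECT COUNT(*) FROM TEST").fetchone()[0]
print(duplicate_rejected, row_count)
```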
By default, all indexes produce results that are listed in ascending order, but you can create an index that yields output in descending order. For example, if you routinely print a report that lists all products ordered by price from highest to lowest, you could create an index named PROD_PRICEX by typing:
CREATE INDEX PROD_PRICEX ON PRODUCT(P_PRICE DESC);
To delete an index, use the DROP INDEX command:
DROP INDEX indexname
For example, if you want to eliminate the PROD_PRICEX index, type:
DROP INDEX PROD_PRICEX;
After creating the tables and some indexes, you are ready to start entering data. The following sections use two tables (VENDOR and PRODUCT) to demonstrate most of the data manipulation commands.

8.3 DATA MANIPULATION COMMANDS
In this section, you will learn how to use the basic SQL data manipulation commands INSERT, SELECT, COMMIT, UPDATE, ROLLBACK and DELETE.
8.3.1 Adding Table Rows
SQL requires the use of the INSERT command to enter data into a table. The INSERT command's basic syntax looks like this:
INSERT INTO tablename VALUES (value1, value2, ... , valuen)
Because the PRODUCT table uses its V_CODE to reference the VENDOR table's V_CODE, an integrity violation occurs if those VENDOR table V_CODE values don't yet exist. Therefore, you need to enter the VENDOR rows before the PRODUCT rows. Given the VENDOR table structure defined earlier and the sample VENDOR data shown in Figure 8.2, you would enter the first two data rows as follows:
INSERT INTO VENDOR VALUES (21225,'Bryson, Inc.','Smithson','0181','223-3234','UK','Y');
INSERT INTO VENDOR VALUES (21226,'Superloo, Inc.','Flushing','0113','215-8995','SA','N');
and so on, until all of the VENDOR table records have been entered.
(To see the contents of the VENDOR table, type SELECT * FROM VENDOR;)
Enter the PRODUCT table rows in the same fashion, using the PRODUCT data shown in Figure 8.2. For example, the first two data rows would be entered as follows, pressing Enter at the end of each line:
INSERT INTO PRODUCT VALUES ('11QER/31','Power painter, 15 psi., 3-nozzle','03-Nov-18',8,5,109.99,0.00,25595);
INSERT INTO PRODUCT VALUES ('13-Q2/P2','7.25-in. pwr. saw blade','13-Dec-18',32,15,14.99, 0.05, 21344);
(To see the contents of the PRODUCT table, type: SELECT * FROM PRODUCT;)
NOTE
Date entry is a function of the date format expected by the DBMS. For example, 25 March 2019 might be shown as 25-Mar-2019 in Microsoft Access and Oracle, or in other presentation formats depending on your RDBMS. In MySQL, the default date format would be 2019-03-25. Microsoft Access requires the use of # delimiters when performing any computations or comparisons based on date attributes, as in P_INDATE >= #25-Mar-19#.

As you examine the preceding data entry lines, observe that:
- The row contents are entered between parentheses. Note that the first character after VALUES is a parenthesis and that the last character in the command sequence is also a parenthesis.
- Character (string) and date values must be entered between apostrophes (').
- Numerical entries are not enclosed in apostrophes.
- Attribute entries are separated by commas.
- A value is required for each column in the table.
This version of the INSERT command adds one table row at a time.

Inserting Rows with Null Attributes
Thus far, you have entered rows in which all of the attribute values are specified. But what do you do if a product does not have a vendor or if you don't yet know the vendor code? In those cases, you want to leave the vendor code null. To enter a null, use the following syntax:
INSERT INTO PRODUCT VALUES ('BRT-345','Titanium drill bit','18-Oct-18', 75, 10, 4.50, 0.06, NULL);
Incidentally, note that the NULL entry is accepted only because the V_CODE attribute is optional; the NOT NULL declaration was not used in the CREATE TABLE statement for this attribute.

Inserting Rows with Optional Attributes
There may be occasions when more than one attribute is optional. Rather than declaring each attribute as NULL in the INSERT command, you can indicate just the attributes that have required values. You do that by listing the attribute names inside parentheses after the table name. For the purpose of this example, assume that the only required attributes for the PRODUCT table are P_CODE and P_DESCRIPT:
INSERT INTO PRODUCT(P_CODE, P_DESCRIPT) VALUES ('BRT-345','Titanium drill bit');
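The two ways of leaving attributes empty can be compared side by side. The following is a minimal sketch, assuming SQLite (via Python's sqlite3 module) as a stand-in RDBMS and a PRODUCT table in which, as in the example above, only P_CODE and P_DESCRIPT are required. The second product code, 'BRT-346', is a made-up value so that both rows can coexist under the primary key.

```python
import sqlite3

# Assumption: SQLite stands in for the RDBMS, and only P_CODE and
# P_DESCRIPT are declared NOT NULL, as the example in the text supposes.
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE PRODUCT (
        P_CODE     VARCHAR(10) PRIMARY KEY,
        P_DESCRIPT VARCHAR(35) NOT NULL,
        P_INDATE   DATE,
        P_QOH      SMALLINT,
        P_MIN      SMALLINT,
        P_PRICE    NUMERIC(8,2),
        P_DISCOUNT NUMERIC(4,2),
        V_CODE     INTEGER)""")

# Every column is listed positionally; the vendor code is an explicit NULL.
conn.execute("INSERT INTO PRODUCT VALUES "
             "('BRT-345','Titanium drill bit','18-Oct-18', 75, 10, 4.50, 0.06, NULL)")

# Only the required attributes are named; the others are left null.
# 'BRT-346' is a hypothetical code chosen to avoid a duplicate primary key.
conn.execute("INSERT INTO PRODUCT(P_CODE, P_DESCRIPT) "
             "VALUES ('BRT-346','Titanium drill bit')")

rows = conn.execute(
    "SELECT P_CODE, V_CODE FROM PRODUCT ORDER BY P_CODE").fetchall()
optional_qoh = conn.execute(
    "SELECT P_QOH FROM PRODUCT WHERE P_CODE = 'BRT-346'").fetchone()[0]

print(rows, optional_qoh)
```

In both rows V_CODE is stored as a null; Python's sqlite3 module reports a null as None.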
8.3.2 Saving Table Changes
Any changes made to the table contents are not physically saved on disk until you close the database, close the program you are using, or use the COMMIT command. If you are using the database and a power outage or some other interruption occurs before you issue the COMMIT command, your changes are lost and only the original table contents are retained. The syntax for the COMMIT command is:
COMMIT [WORK]
The COMMIT command will permanently save any changes (such as rows added, attributes modified and rows deleted) made to any table in the database. Therefore, if you intend to make your changes to the PRODUCT table permanent, it is a good idea to save those changes by using:
COMMIT;

NOTE TO MICROSOFT ACCESS USERS
Microsoft Access doesn't support the COMMIT command. Access automatically saves changes after the execution of each SQL command.

NOTE TO MYSQL USERS
MySQL version 5.6 and onwards supports the use of the COMMIT command. When started, the storage engine defaults to the autocommit mode. As soon as any DML statement is executed that updates a table, MySQL automatically commits the transaction, making it permanent.
However, the COMMIT command's purpose is not just to save changes. In fact, the ultimate purpose of the COMMIT and ROLLBACK commands (see Section 8.3.5) is to ensure database update integrity in transaction management. (You will see how such issues are addressed in Chapter 12, Managing Transactions and Concurrency.)
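The difference between committed and uncommitted changes can be illustrated with a short experiment. The sketch below assumes SQLite (through Python's sqlite3 module) as a stand-in RDBMS; the connection is opened with isolation_level=None so that the BEGIN, COMMIT and ROLLBACK statements are issued explicitly rather than by the driver. The product code 'XX-000' is a made-up value, and the PRODUCT table is reduced to two columns.

```python
import sqlite3

# Assumption: SQLite stands in for the RDBMS. isolation_level=None stops the
# Python driver from opening transactions itself, so BEGIN/COMMIT/ROLLBACK
# are issued explicitly below.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute("CREATE TABLE PRODUCT (P_CODE VARCHAR(10) PRIMARY KEY, P_QOH SMALLINT)")

# First transaction: the new row is made permanent by COMMIT.
conn.execute("BEGIN")
conn.execute("INSERT INTO PRODUCT VALUES ('23109-HB', 23)")
conn.execute("COMMIT")

# Second transaction: these changes are undone by ROLLBACK, which restores
# the table contents as they were after the last COMMIT.
conn.execute("BEGIN")
conn.execute("UPDATE PRODUCT SET P_QOH = 0")
conn.execute("INSERT INTO PRODUCT VALUES ('XX-000', 1)")  # hypothetical row
conn.execute("ROLLBACK")

rows = conn.execute("SELECT P_CODE, P_QOH FROM PRODUCT").fetchall()
print(rows)
```

Only the committed row survives, with its original quantity on hand; the update and the second insert leave no trace.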
8.3.3 Listing Table Rows
Use the SELECT command to list the contents of a table. The syntax of the SELECT command is as follows:
SELECT columnlist FROM tablename
The columnlist represents one or more attributes, separated by commas. You could use the * (asterisk) as a wildcard character to list all attributes. (A wildcard character is a symbol that can be used as a general substitute for other characters or commands.) For example, to list all attributes and all rows of the PRODUCT table, use:
SELECT * FROM PRODUCT;

NOTE
The SELECT command is based on the relational operator SELECT, which was introduced in Chapter 4, Relational Algebra and Calculus. For example, the statement
SELECT * FROM PRODUCT;
can be written in relational algebra as
σ (PRODUCT)

Figure 8.3 shows the output generated by that command. (Figure 8.3 shows all of the rows in the PRODUCT table that serve as the basis for subsequent discussions. If you entered only the PRODUCT table's first two records, as shown in the preceding section, the output of the preceding SELECT command would show only the rows you entered. Don't worry about the difference between your SELECT output and the output shown in Figure 8.3. When you complete the work in this section, you will have created and populated your VENDOR and PRODUCT tables with the correct rows for use in future sections.)

NOTE
Your listing may not be in the order shown in Figure 8.3. The listings shown in the figure are the result of system-controlled primary-key-based index operations. You will learn later how to control the output so that it conforms to the order you have specified.
FIGURE 8.3 The contents of the PRODUCT table

P_CODE      P_DESCRIPT                                     P_INDATE     P_QOH   P_MIN   P_PRICE   P_DISCOUNT   V_CODE
11QER/31    Power painter, 15 psi., 3-nozzle               03-Nov-18    8       5       109.99    0.00         25595
13-Q2/P2    7.25 cm pwr. saw blade                         13-Dec-18    32      15      14.99     0.05         21344
14-Q1/L3    9.00 cm pwr. saw blade                         13-Nov-18    18      12      17.49     0.00         21344
1546-QQ2    Hrd. cloth, 1/4 cm, 2 × 50                     15-Jan-19    15      8       39.95     0.00         23119
1558-QW1    Hrd. cloth, 1/2 cm, 3 × 50                     15-Jan-19    23      5       43.99     0.00         23119
2232/QTY    B&D jigsaw, 12 cm blade                        30-Dec-18    8       5       109.92    0.05         24288
2232/QWE    B&D jigsaw, 8 cm blade                         24-Dec-18    6       5       99.87     0.05         24288
2238/QPD    B&D cordless drill, 1/2 cm                     20-Jan-19    12      5       38.95     0.05         25595
23109-HB    Claw hammer                                    20-Jan-19    23      10      9.95      0.10         21225
23114-AA    Sledge hammer, 12 kg                           02-Jan-19    8       5       14.40     0.05
54778-2T    Rat-tail file, 1/8 cm fine                     15-Dec-18    43      20      4.99      0.00         21344
89-WRE-Q    Hicut chain saw, 16 cm                         07-Feb-19    11      5       256.99    0.05         24288
PVC23DRT    PVC pipe, 3.5 cm, 8 m                          20-Feb-19    188     75      5.87      0.00
SM-18277    1.25 cm metal screw, 25                        01-Mar-19    172     75      6.99      0.00         21225
SW-23116    2.5 cm wd. screw, 50                           24-Feb-19    237     100     8.45      0.00         21231
WR3/TT3     Steel matting, 4 m × 8 m × 1/6 m, .5 m mesh    17-Jan-19    18      5       119.95    0.10         25595
NOTE TO ORACLE USERS Some SQL implementations (such as Oracle) cut the attribute labels to fit the width of the column. However, Oracle lets you set the width of the display column to show the complete attribute name. You can also change the display format, regardless of how the data are stored in the table. For example, if you want to display commas in the P_PRICE output, you can declare:
COLUMN P_PRICE FORMAT 99,999.99
to change the output 12347.67 to 12,347.67. In the same manner, to display only the first 12 characters of the P_DESCRIPT attribute, use:
COLUMN P_DESCRIPT FORMAT A12 TRUNCATE

Although SQL commands can be grouped together on a single line, complex command sequences are best shown on separate lines, with space between the SQL command and the command's components. Using that formatting convention makes it much easier to see the components of the SQL statements, making it easy to trace the SQL logic and, if necessary, to make corrections. The number of spaces used in the indentation is up to you. For example, note the following format for a more complex statement:
SELECT   P_CODE, P_DESCRIPT, P_INDATE, P_QOH, P_MIN, P_PRICE, P_DISCOUNT, V_CODE
FROM     PRODUCT;
When you run a SELECT command on a table, the RDBMS returns a set of one or more rows that have the same characteristics as a relational table. In addition, the SELECT command lists all rows from the table you specified in the FROM clause. This is a very important characteristic of SQL commands. By default, most SQL data manipulation commands operate over an entire table (or relation). That is why SQL commands are said to be set-oriented commands. A SQL set-oriented command works over a set of rows. The set may include one or more columns and zero or more rows from one or more tables.
8.3.4 Updating Table Rows
Use the UPDATE command to modify data in a table. The syntax for this command is:

UPDATE   tablename
SET      columnname = expression [, columnname = expression]
[WHERE   conditionlist ];

For example, if you want to change P_INDATE from 13 December 2018 to 18 January 2019 in the second row of the PRODUCT table (see Figure 8.3), use the primary key (13-Q2/P2) to locate the correct (second) row. Therefore, type:

UPDATE   PRODUCT
SET      P_INDATE = '18-JAN-2019'
WHERE    P_CODE = '13-Q2/P2';

If more than one attribute is to be updated in the row, separate the corrections with commas:

UPDATE   PRODUCT
SET      P_INDATE = '18-JAN-2019', P_PRICE = 17.99, P_MIN = 10
WHERE    P_CODE = '13-Q2/P2';

What would have happened if the previous UPDATE command had not included the WHERE condition? Answer: The P_INDATE, P_PRICE and P_MIN values would have been changed in all rows of the PRODUCT table. Remember, the UPDATE command is a set-oriented operator. Therefore, if you don't specify a WHERE condition, the UPDATE command applies the changes to all rows in the specified table.

Confirm the correction(s) by using this command to check the PRODUCT table's listing:

SELECT * FROM PRODUCT;
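The effect of the WHERE clause on UPDATE can be tried safely on a scratch copy. The following is a minimal sketch, not part of the SaleCo database: PRODUCT_UPD_DEMO is a hypothetical table name, only four of the PRODUCT attributes are reproduced, and the dates are written as ISO strings (an assumption that suits MySQL, PostgreSQL and SQLite; Oracle typically expects the '18-JAN-2019' style used in this chapter).

```sql
-- PRODUCT_UPD_DEMO is a hypothetical scratch table, not part of SaleCo.
CREATE TABLE PRODUCT_UPD_DEMO (
    P_CODE   VARCHAR(10) PRIMARY KEY,
    P_INDATE DATE,
    P_MIN    INTEGER,
    P_PRICE  DECIMAL(8,2)
);

INSERT INTO PRODUCT_UPD_DEMO VALUES ('11QER/31', '2018-11-03', 5, 109.99);
INSERT INTO PRODUCT_UPD_DEMO VALUES ('13-Q2/P2', '2018-12-13', 15, 14.99);
INSERT INTO PRODUCT_UPD_DEMO VALUES ('14-Q1/L3', '2018-11-13', 12, 17.49);

-- Preview: this WHERE clause should match exactly one row.
SELECT P_CODE, P_INDATE, P_PRICE, P_MIN
FROM   PRODUCT_UPD_DEMO
WHERE  P_CODE = '13-Q2/P2';

-- Reuse the same WHERE clause so that only that row is changed.
UPDATE PRODUCT_UPD_DEMO
SET    P_INDATE = '2019-01-18', P_PRICE = 17.99, P_MIN = 10
WHERE  P_CODE = '13-Q2/P2';

-- The other two rows keep their original values.
SELECT * FROM PRODUCT_UPD_DEMO;
```

Running the SELECT with the intended WHERE clause first is a cheap way to see how many rows an UPDATE will touch before any data are changed.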
8.3.5 Restoring Table Contents
If you have not yet used the COMMIT command to store the changes permanently in the database, you can restore the database to its previous condition with the ROLLBACK command. ROLLBACK undoes any changes and brings the data back to the values that existed before the changes were made. To restore the data to their prechange condition, type:

ROLLBACK;

and press Enter. Use the SELECT statement again to see that the ROLLBACK did, in fact, restore the data to their original values.

COMMIT and ROLLBACK only work with data manipulation commands that are used to add, modify or delete table rows. For example, assume that you perform these actions:
1. CREATE a table called SALES.
2. INSERT ten rows in the SALES table.
3. UPDATE two rows in the SALES table.
4. Execute the ROLLBACK command.

Will the SALES table be removed by the ROLLBACK command? No, the ROLLBACK command will undo only the results of the INSERT and UPDATE commands. All data definition commands (CREATE TABLE, CREATE INDEX, etc.) are automatically committed to the data dictionary and cannot be rolled back. The COMMIT and ROLLBACK commands are examined in greater detail in Chapter 9.

NOTE TO MICROSOFT ACCESS USERS Microsoft Access doesn't support the COMMIT and ROLLBACK commands. The lack of commands such as COMMIT and ROLLBACK illustrates one of the key differences between Microsoft Access and enterprise databases such as MySQL and Oracle. Enterprise databases are designed to support large multi-user environments and need robust data integrity controls.

Some RDBMSs, such as Oracle, automatically COMMIT data changes when issuing data definition commands. For example, if you had used the CREATE INDEX command after updating the two rows in the previous example, all previous changes would have been committed automatically; doing a ROLLBACK afterwards wouldn't have undone anything. Check your RDBMS manual to understand these subtle differences.
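The SALES scenario above can be sketched as a short script. This is an illustration under stated assumptions rather than a definitive recipe: SALES_DEMO and its two columns are hypothetical, only two rows are inserted instead of ten, and the explicit BEGIN is an assumption that suits MySQL (with a transactional engine), PostgreSQL and SQLite. Oracle starts a transaction implicitly, so the BEGIN line would be omitted there, and Microsoft Access does not support ROLLBACK at all.

```sql
-- SALES_DEMO is a hypothetical table used only for this illustration.
-- The CREATE TABLE is issued before the transaction starts.
CREATE TABLE SALES_DEMO (
    SALE_ID INTEGER PRIMARY KEY,
    AMOUNT  DECIMAL(8,2)
);

BEGIN;  -- assumption: omit on Oracle, where a transaction starts implicitly

INSERT INTO SALES_DEMO VALUES (1, 100.00);
INSERT INTO SALES_DEMO VALUES (2, 250.00);
UPDATE SALES_DEMO SET AMOUNT = 120.00 WHERE SALE_ID = 1;

-- Inside the transaction, this session sees both uncommitted rows.
SELECT COUNT(*) AS ROWS_BEFORE_ROLLBACK FROM SALES_DEMO;

ROLLBACK;

-- The table itself still exists, but the uncommitted rows are gone.
SELECT COUNT(*) AS ROWS_AFTER_ROLLBACK FROM SALES_DEMO;
```

The second count query succeeds and reports no rows, which is the point of the example: the data manipulation was undone, while the table definition created beforehand was not.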
8.3.6 Deleting Table Rows
It is easy to delete a table row using the DELETE statement; the syntax is:

DELETE FROM   tablename
[WHERE        conditionlist ];

For example, if you want to delete from the PRODUCT table the product that you added earlier whose code (P_CODE) is 'BRT-345', use:

DELETE FROM   PRODUCT
WHERE         P_CODE = 'BRT-345';

In that example, the primary key value lets SQL find the exact record to be deleted. However, deletions are not limited to a primary key match; any attribute may be used. For example, if you examine your PRODUCT table, you will see that there are several products for which the P_MIN attribute is equal to 5. Use the following command to delete all rows from the PRODUCT table for which the P_MIN is equal to 5:

DELETE FROM   PRODUCT
WHERE         P_MIN = 5;

Check the PRODUCT table's contents again to verify that all products with P_MIN equal to 5 have been deleted.
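Because DELETE removes every row that satisfies its condition, it can help to count the matching rows first. The sketch below assumes a hypothetical scratch table, PRODUCT_DEL_DEMO, holding four rows taken from Figure 8.3 with only three of their attributes.

```sql
-- PRODUCT_DEL_DEMO is a hypothetical scratch table, not part of SaleCo.
CREATE TABLE PRODUCT_DEL_DEMO (
    P_CODE     VARCHAR(10) PRIMARY KEY,
    P_DESCRIPT VARCHAR(35),
    P_MIN      INTEGER
);

INSERT INTO PRODUCT_DEL_DEMO VALUES ('11QER/31', 'Power painter, 15 psi., 3-nozzle', 5);
INSERT INTO PRODUCT_DEL_DEMO VALUES ('13-Q2/P2', '7.25 cm pwr. saw blade', 15);
INSERT INTO PRODUCT_DEL_DEMO VALUES ('2232/QTY', 'B&D jigsaw, 12 cm blade', 5);
INSERT INTO PRODUCT_DEL_DEMO VALUES ('23109-HB', 'Claw hammer', 10);

-- How many rows would the DELETE remove? Two rows have P_MIN = 5.
SELECT COUNT(*) AS ROWS_TO_DELETE
FROM   PRODUCT_DEL_DEMO
WHERE  P_MIN = 5;

-- A non-key condition removes every matching row in one statement.
DELETE FROM PRODUCT_DEL_DEMO
WHERE  P_MIN = 5;

-- Only 13-Q2/P2 and 23109-HB remain.
SELECT P_CODE, P_MIN FROM PRODUCT_DEL_DEMO;
```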
Finally, remember that the WHERE condition is optional. Therefore, if you do not specify a WHERE condition, all rows from the specified table are deleted! And keep in mind that DELETE is a set-oriented command.

8.3.7 Inserting Table Rows with a Select Subquery
You learnt in Section 8.3.1 how to use the INSERT statement to add rows to a table. In that section, you added rows one at a time. In this section, you learn how to add multiple rows to a table, using another table as the source of the data. The syntax for the INSERT statement is:

INSERT INTO tablename SELECT columnlist FROM tablename;

In that case, the INSERT statement uses a SELECT subquery. A subquery, also known as a nested query or an inner query, is a query that is embedded (or nested) inside another query. The inner query is always executed first by the RDBMS. Given the previous SQL statement, the INSERT portion represents the outer query and the SELECT portion represents the subquery. You can nest queries (place queries inside queries) many levels deep; in every case, the output of the inner (lower-level) query is used as the input for the outer (higher-level) query. In Chapter 9, Procedural Language SQL and Advanced SQL, you will learn more about the different types of subqueries.

The values returned by the SELECT subquery should match the attributes and data types of the table in the INSERT statement. If the table into which you are inserting rows has one date attribute, one number attribute and one character attribute, the SELECT subquery should return one or more rows in which the first column has date values, the second column has number values and the third column has character values.

Populating the VENDOR and PRODUCT Tables
The following steps guide you through the process of populating the VENDOR and PRODUCT tables with the data to be used in the rest of the chapter. To accomplish that task, two tables named V and P are used as the data source. Tables V and P have the same table structure (attributes) as the VENDOR and PRODUCT tables.
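The requirement that the subquery's columns line up with the target table's attributes can be made explicit by naming the columns on both sides. The following sketch is an assumption-laden illustration: V_DEMO and VENDOR_DEMO are hypothetical stand-ins for V and VENDOR, they carry only two attributes, and the vendor names are placeholders rather than SaleCo data.

```sql
-- V_DEMO plays the role of the source table, VENDOR_DEMO the target.
-- Both names and the vendor names below are hypothetical.
CREATE TABLE V_DEMO (
    V_CODE INTEGER PRIMARY KEY,
    V_NAME VARCHAR(35)
);

CREATE TABLE VENDOR_DEMO (
    V_CODE INTEGER PRIMARY KEY,
    V_NAME VARCHAR(35)
);

INSERT INTO V_DEMO VALUES (21225, 'Placeholder vendor A');
INSERT INTO V_DEMO VALUES (21231, 'Placeholder vendor B');
INSERT INTO V_DEMO VALUES (21344, 'Placeholder vendor C');

-- Outer query: INSERT. Inner query (subquery): SELECT.
-- Listing the columns on both sides keeps number values going into
-- the number attribute and character values into the character one.
INSERT INTO VENDOR_DEMO (V_CODE, V_NAME)
SELECT V_CODE, V_NAME
FROM   V_DEMO;

-- All three source rows are now in the target table.
SELECT COUNT(*) AS COPIED_ROWS FROM VENDOR_DEMO;
```

Using SELECT * in the subquery, as the steps in this section do, relies on both tables having identical structures in the same column order; the explicit column lists are a design choice that removes that dependency.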
Online Content The following sections assume that the database has been restored to its original condition. Therefore, you must do the following:
• If you are using Oracle or MySQL, run the sqlintrodbinit.sql script file located in either the MySQL or Oracle folder hosted on the online platform, to create all tables and load the data in the database. To connect to the database, follow the instructions specific to your school setup provided by your instructor.
• If you are using Microsoft Access, copy the original 'Ch08_SaleCo.mdb' file available to download from the online platform for this book.

Use the following steps to populate your VENDOR and PRODUCT tables. (If you haven't already created the PRODUCT and VENDOR tables to practise the SQL commands in the previous sections, do so before completing these steps.)

Delete all rows from the PRODUCT and VENDOR tables.
• DELETE FROM PRODUCT;
• DELETE FROM VENDOR;

Add the rows to VENDOR by copying all rows from V.
• If you are using Microsoft Access, type: INSERT INTO VENDOR SELECT * FROM V;
• If you are using Oracle or MySQL, type: INSERT INTO VENDOR SELECT * FROM TEACHER.V;

Add the rows to PRODUCT by copying all rows from P.
• If you are using Microsoft Access, type: INSERT INTO PRODUCT SELECT * FROM P;
• If you are using Oracle or MySQL, type: INSERT INTO PRODUCT SELECT * FROM TEACHER.P;
• Oracle users must permanently save the changes: COMMIT;

If you followed those steps correctly, you now have the VENDOR and PRODUCT tables populated with the data that are used in the remaining sections of the chapter.
Online Content If you are using Oracle or MySQL, you can run the sqlintrodbinit.sql script file located in either the MySQL or Oracle folder hosted on the online platform to create all tables and load the data in the database. This script file populates the remaining tables (CUSTOMER, INVOICE, LINE, EMP and EMPLOYEE). To connect to the database, follow the instructions specific to your college or university setup provided by your instructor.

8.4 SELECT QUERIES
In this section, you will learn how to fine-tune the SELECT command by adding restrictions to the search criteria. SELECT, coupled with appropriate search conditions, is an incredibly powerful tool that enables you to transform data into information. For example, in the following sections, you learn how to create queries that can be used to answer questions such as these: Which products were supplied by a particular vendor? Which products are priced below 10? How many products supplied by a given vendor were sold between 5 January 2019 and 20 March 2019?
8.4.1 Selecting Rows with Conditional Restrictions
You can select partial table contents by placing restrictions on the rows to be included in the output. To do this, add conditional restrictions to the SELECT statement, using the WHERE clause. The following syntax enables you to specify which rows to select:

SELECT   columnlist
FROM     tablelist
[WHERE   conditionlist ];

The SELECT statement retrieves all rows that match the specified condition(s), also known as the conditional criteria, that you specified in the WHERE clause. The conditionlist in the WHERE clause of the SELECT statement is represented by one or more conditional expressions separated by logical operators. The WHERE clause is optional. If no rows match the specified criteria in the WHERE clause, you may see a blank screen or a message that tells you that no rows were retrieved. For example, the query:

SELECT   P_DESCRIPT, P_INDATE, P_PRICE, V_CODE
FROM     PRODUCT
WHERE    V_CODE = 21344;

returns the description, date, and price of products with a vendor code of 21344, as shown in Figure 8.4.

FIGURE 8.4 Selected PRODUCT table attributes for vendor code 21344

P_DESCRIPT                    P_INDATE   P_PRICE  V_CODE
7.25 cm pwr. saw blade        13-Dec-18    14.99   21344
9.00 cm pwr. saw blade        13-Nov-18    17.49   21344
Rat-tail file, 1/8-in. fine   15-Dec-18     4.99   21344
NOTE The query:

SELECT   P_DESCRIPT, P_INDATE, P_PRICE, V_CODE
FROM     PRODUCT
WHERE    V_CODE = 21344;

comprises both the relational algebra SELECT and PROJECT operators and can be written in relational algebra as:

π p_descript, p_indate, p_price, v_code (σ v_code = 21344 (PRODUCT))

For more information on the SELECT and PROJECT operators, see Sections 4.1.1 and 4.1.2 in Chapter 4, Relational Algebra and Calculus.

Microsoft Access users can use the Access QBE (query by example) query generator. Although the Access QBE generates its own native version of SQL, you can also choose to type standard SQL in the Access SQL window, as shown at the bottom of Figure 8.5. Figure 8.5 shows the Access QBE screen, the SQL window's QBE-generated SQL, and the listing of the modified SQL.

NOTE TO MICROSOFT ACCESS USERS The Microsoft Access QBE interface automatically designates the data source by using the table name as a prefix. You will discover later that the table name prefix is used to avoid ambiguity when the same column name appears in multiple tables. For example, both the VENDOR and the PRODUCT tables contain the V_CODE attribute. Therefore, if both tables are used (as they would be in a join), the source of the V_CODE attribute must be specified.
FIGURE 8.5 The Microsoft Access QBE and its SQL
[Screenshot: query view options, Microsoft Access-generated SQL and user-entered SQL]
SOURCE: Course Technology/Cengage Learning

Numerous conditional restrictions can be placed on the selected table contents. For example, use the comparison operators shown in Table 8.7 to restrict output.

TABLE 8.7 Comparison operators

Symbol      Meaning
=           Equal to
<           Less than
<=          Less than or equal to
>           Greater than
>=          Greater than or equal to
<> or !=    Not equal to
The following example uses the not equal to operator:

SELECT   P_DESCRIPT, P_INDATE, P_PRICE, V_CODE
FROM     PRODUCT
WHERE    V_CODE <> 21344;

The output, shown in Figure 8.6, lists all of the rows for which the vendor code is not 21344.

FIGURE 8.6 Selected PRODUCT table attributes for vendor codes other than 21344

P_DESCRIPT                               P_INDATE   P_PRICE  V_CODE
Power painter, 15 psi., 3-nozzle         03-Nov-18   109.99   25595
Hrd. cloth, 1/4 cm, 2 × 50               15-Jan-19    39.95   23119
Hrd. cloth, 1/2 cm, 3 × 50               15-Jan-19    43.99   23119
B&D jigsaw, 12 cm blade                  30-Dec-18   109.92   24288
B&D jigsaw, 3 cm blade                   24-Dec-18    99.87   24288
B&D cordless drill, 1/2 cm               20-Jan-19    38.95   25595
Claw hammer                              20-Jan-19     9.95   21225
Hicut chain saw, 16 cm                   07-Feb-19   256.99   24288
1.25 cm metal screw, 25                  01-Mar-19     6.99   21225
2.5 cm wd. screw, 50                     24-Feb-19     8.45   21231
Steel matting, 4 × 8 × 1/6, .5 cm mesh   17-Jan-19   119.95   25595

As you examine Figure 8.6, note that rows with nulls in the V_CODE column (see Figure 8.3) are not included in the SELECT command output.

The command sequence:

SELECT   P_DESCRIPT, P_QOH, P_MIN, P_PRICE
FROM     PRODUCT
WHERE    P_PRICE <= 10;

yields the output shown in Figure 8.7.

FIGURE 8.7 Selected PRODUCT table attributes with a P_PRICE restriction

P_DESCRIPT                   P_QOH  P_MIN  P_PRICE
Claw hammer                     23     10     9.95
Rat-tail file, 1/8 cm fine      43     20     4.99
PVC pipe, 3.5 cm, 8 m          188     75     5.87
1.25 cm metal screw, 25        172     75     6.99
2.5 cm wd. screw, 50           237    100     8.45
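The remark about nulls in the V_CODE column can be checked with three counts. The sketch below assumes a hypothetical scratch table, PRODUCT_NULL_DEMO, holding four rows from Figure 8.3, two of which have no vendor code; the column aliases are likewise only illustrative.

```sql
-- PRODUCT_NULL_DEMO is a hypothetical scratch table, not part of SaleCo.
CREATE TABLE PRODUCT_NULL_DEMO (
    P_CODE VARCHAR(10) PRIMARY KEY,
    V_CODE INTEGER
);

INSERT INTO PRODUCT_NULL_DEMO VALUES ('13-Q2/P2', 21344);
INSERT INTO PRODUCT_NULL_DEMO VALUES ('23109-HB', 21225);
INSERT INTO PRODUCT_NULL_DEMO VALUES ('23114-AA', NULL);
INSERT INTO PRODUCT_NULL_DEMO VALUES ('PVC23DRT', NULL);

-- Four rows in total.
SELECT COUNT(*) AS ALL_ROWS FROM PRODUCT_NULL_DEMO;

-- One row has the vendor code 21344.
SELECT COUNT(*) AS EQUAL_ROWS
FROM   PRODUCT_NULL_DEMO
WHERE  V_CODE = 21344;

-- Only one row is reported as "not 21344": a comparison with a null
-- is neither true nor false, so the two null rows match neither query.
SELECT COUNT(*) AS NOT_EQUAL_ROWS
FROM   PRODUCT_NULL_DEMO
WHERE  V_CODE <> 21344;
```

The equal and not-equal counts add up to two, not four, which is why a not-equal listing such as Figure 8.6 is shorter than one might expect.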
Using Comparison Operators on Character Attributes
Because computers identify all characters by their (numeric) American Standard Code for Information Interchange (ASCII) codes, comparison operators may even be used to place restrictions on character-based attributes. Therefore, the command:

SELECT   P_CODE, P_DESCRIPT, P_QOH, P_MIN, P_PRICE
FROM     PRODUCT
WHERE    P_CODE < '1558-QW1';

would be correct and would yield a list of all rows in which the P_CODE is alphabetically less than 1558-QW1. (Because the ASCII code value for the letter B is greater than the value of the letter A, it follows that A is less than B.) Therefore, the output is generated as shown in Figure 8.8.

FIGURE 8.8 Selected PRODUCT table attributes; the ASCII code effect

P_CODE    P_DESCRIPT                         P_QOH  P_MIN  P_PRICE
11QER/31  Power painter, 15 psi., 3-nozzle       8      5   109.99
13-Q2/P2  7.25 cm pwr. saw blade                32     15    14.99
14-Q1/L3  9.00 cm pwr. saw blade                18     12    17.49
1546-QQ2  Hrd. cloth, 1/4 cm, 2 × 50            15      8    39.95
String (character) comparisons are made from left to right. This left-to-right comparison is especially useful when attributes such as names are to be compared. For example, the string 'Ardmore' would be judged greater than the string 'Aarenson' but less than the string 'Brown'; use such results to generate alphabetical listings like those found in a phone directory. If the characters 0-9 are stored as strings, the same left-to-right string comparisons can lead to apparent anomalies. For example, the ASCII code for the character '5' is, as expected, greater than the ASCII code for the character '4'. Yet the same '5' will also be judged greater than the string '44' because the first character in the string '44' is less than the character '5'. For that reason, you may get some unexpected results from comparisons when dates or other numbers are stored in character format. For example, the left-to-right ASCII character comparison would force the conclusion that the date 01/01/2019 occurred before 12/31/2018. Since the leftmost character '0' in 01/01/2019 is less than the leftmost character '1' in 12/31/2018, 01/01/2019 is less than 12/31/2018. Naturally, if date strings are stored in a yyyy/mm/dd format, the comparisons will yield appropriate results, but this is a non-standard date presentation. That's why all current RDBMSs support date data types, and that's why you should use them. In addition, using date data types gives you the benefit of date arithmetic.

Using Comparison Operators on Dates
Date procedures are often more software specific than other SQL procedures. For example, the query to list all of the rows in which the inventory stock dates occur on or after 20 January 2019 will look like this:

SELECT   P_DESCRIPT, P_QOH, P_MIN, P_PRICE, P_INDATE
FROM     PRODUCT
WHERE    P_INDATE >= '20-Jan-2019';

(Remember that Microsoft Access users must use the # delimiters for dates. For example, you would use #20-Jan-19# in the above WHERE clause.) The date-restricted output is shown in Figure 8.9.

FIGURE 8.9 Selected PRODUCT table attributes: date restriction

P_DESCRIPT                    P_QOH  P_MIN  P_PRICE  P_INDATE
B&D cordless drill, 1.25 cm      12      5    38.95  20-Jan-19
Claw hammer                      23     10     9.95  20-Jan-19
Hicut chain saw, 40 cm           11      5   256.99  07-Feb-19
PVC pipe, 9 cm, 2.5 m           188     75     5.87  20-Feb-19
3 cm metal screw, 25            172     75     6.99  01-Mar-19
6 cm wd. screw, 50              237    100     8.45  24-Feb-19
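The left-to-right comparison anomalies described above can be observed directly with literal values. This is a small sketch under one assumption: the SELECT has no FROM clause, which MySQL, PostgreSQL and SQLite accept, whereas Oracle has traditionally required FROM DUAL. The column aliases are hypothetical.

```sql
SELECT
    -- String comparison: '5' is compared with the first '4' of '44'.
    CASE WHEN '5' > '44' THEN 'true' ELSE 'false' END AS STR_5_GT_44,
    -- Numeric comparison of the same digits gives the opposite answer.
    CASE WHEN 5 > 44 THEN 'true' ELSE 'false' END AS NUM_5_GT_44,
    -- Dates held as mm/dd/yyyy strings sort by month first, not by year.
    CASE WHEN '01/01/2019' < '12/31/2018' THEN 'true' ELSE 'false' END AS STR_DATE_EARLIER;
```

The first and third columns come back as 'true' and the second as 'false', which is the behaviour that makes proper number and date data types preferable to character storage.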
Using Computed Columns and Column Aliases
Suppose you want to determine the total value of each of the products currently held in inventory. Logically, that determination requires the multiplication of each product's quantity on hand by its current price. You can accomplish this task with the following command:

SELECT   P_DESCRIPT, P_QOH, P_PRICE, P_QOH * P_PRICE
FROM     PRODUCT;

Entering that SQL command in Access generates the output shown in Figure 8.10.

FIGURE 8.10 SELECT statement with a computed column in Access

SQL accepts any valid expressions (or formulas) in the computed columns. Such formulas can contain any valid mathematical operators and functions that are applied to attributes in any of the tables specified in the FROM clause of the SELECT statement. Note also that Access automatically adds an Expr label to all computed columns. (The first computed column would be labelled Expr1; the second, Expr2; and so on.) Oracle uses the actual formula text as the label for the computed column.

To make the output more readable, the SQL standard permits the use of aliases for any column in a SELECT statement. An alias is an alternative name given to a column or table in any SQL statement. For example, you can rewrite the previous SQL statement as:
SELECT   P_DESCRIPT, P_QOH, P_PRICE, P_QOH * P_PRICE AS TOTVALUE
FROM     PRODUCT;

The output of that command is shown in Figure 8.11.

FIGURE 8.11 SELECT statement with a computed column and an alias

P_DESCRIPT                                  P_QOH  P_PRICE  TOTVALUE
Power painter, 15 psi., 3-nozzle                8   109.99    879.92
7.25 cm pwr. saw blade                         32    14.99    479.68
9.00 cm pwr. saw blade                         18    17.49    314.82
Hrd. cloth, 1/4 cm, 2 × 50                     15    39.95    599.25
Hrd. cloth, 1/2 cm, 3 × 50                     23    43.99   1011.77
B&D jigsaw, 12 cm blade                         8   109.92    879.36
B&D jigsaw, 8 cm blade                          6    99.87    599.22
B&D cordless drill, 1/2 cm                     12    38.95    467.40
Claw hammer                                    23     9.95    228.85
Sledge hammer, 12 kg                            8    14.40    115.20
Rat-tail file, 1/8 cm fine                     43     4.99    214.57
Hicut chain saw, 16 cm                         11   256.99   2826.89
PVC pipe, 3.5 cm, 8 m                         188     5.87   1103.56
1.25 cm metal screw, 25                       172     6.99   1202.28
2.5 cm wd. screw, 50                          237     8.45   2002.65
Steel matting, 4 × 8 × 1/6 cm, .5 cm mesh      18   119.95   2159.10

You could also use a computed column, an alias and date arithmetic in a single query.
For example, assume that you want to get a list of out-of-warranty products that have been stored more than 90 days. In that case, the P_INDATE is at least 90 days less than the current (system) date. The Microsoft Access version of this query is shown as:

SELECT   P_CODE, P_INDATE, DATE() - 90 AS CUTDATE
FROM     PRODUCT
WHERE    P_INDATE <= DATE() - 90;

The Oracle version of the same query is shown below:

SELECT   P_CODE, P_INDATE, SYSDATE - 90 AS CUTDATE
FROM     PRODUCT
WHERE    P_INDATE <= SYSDATE - 90;

Note that DATE() and SYSDATE are special functions that return today's date in Microsoft Access and Oracle, respectively. You could use the DATE() and SYSDATE functions anywhere a date literal is expected, such as in the value list of an INSERT statement, in an UPDATE statement when changing the value of a date attribute or in a SELECT statement as shown here. Of course, the previous query output changes based on today's date.

Suppose a manager wants a list of all products, the dates they were received and the warranty expiration date (90 days from when the product was received). To generate that list, type:

SELECT   P_CODE, P_INDATE, P_INDATE + 90 AS EXPDATE
FROM     PRODUCT;

Note that you can use all arithmetic operators with date attributes as well as with numeric attributes.

8.4.2 Arithmetic Operators: The Rule of Precedence
As you saw in the previous example, you can use arithmetic operators with table attributes in a column list or in a conditional expression. In fact, SQL commands are often used in conjunction with the arithmetic operators shown in Table 8.8.

TABLE 8.8 The arithmetic operators

Operator   Description
+          Add
-          Subtract
*          Multiply
/          Divide
^          Raise to the power of (some applications use ** instead of ^)

Do not confuse the multiplication symbol (*) with the wildcard symbol used by some SQL implementations, such as Microsoft Access; the latter is used only in string comparisons, while the former is used in conjunction with mathematical procedures.

As you perform mathematical operations on attributes, remember the rules of precedence. As the name suggests, the rules of precedence are the rules that establish the order in which computations are completed. For example, note the order of the following computational sequence:
1 Perform operations within parentheses
2 Perform power operations
3 Perform multiplications and divisions
4 Perform additions and subtractions

The application of the rules of precedence will tell you that 8 + 2 * 5 = 8 + 10 = 18, but (8 + 2) * 5 = 10 * 5 = 50. Similarly, 4 + 5^2 * 3 = 4 + 25 * 3 = 79 but (4 + 5)^2 * 3 = 81 * 3 = 243, while the operation expressed by (4 + 5^2) * 3 yields the answer (4 + 25) * 3 = 29 * 3 = 87.
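The worked figures for the rules of precedence can be reproduced with a literal-only query. This is a sketch under two assumptions: the SELECT has no FROM clause (fine in MySQL, PostgreSQL and SQLite; Oracle has traditionally required FROM DUAL), and the power 5^2 is written as 5 * 5 because the ^ symbol is not portable (in MySQL, for instance, it is a bitwise XOR rather than a power operator). The column aliases are hypothetical.

```sql
SELECT
    8 + 2 * 5        AS MULT_FIRST,     -- 8 + 10 = 18
    (8 + 2) * 5      AS PARENS_FIRST,   -- 10 * 5 = 50
    4 + 5 * 5 * 3    AS POWER_AS_MULT,  -- 4 + 75 = 79
    (4 + 5 * 5) * 3  AS PARENS_AROUND;  -- 29 * 3 = 87
```

Adding parentheses even where they are not strictly required is a reasonable habit, since it documents the intended order for anyone who later reads the query.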
8.4.3 Logical Operators: And, Or, and Not
In the real world, a search of data normally involves multiple conditions. For example, when you are buying a new house, you look for a certain area, three bedrooms, two and a half bathrooms, two stories and so on. In the same way, SQL allows you to have multiple conditions in a query through the use of logical operators. The logical operators are AND, OR and NOT. For example, if you want a list of the table contents for either the V_CODE = 21344 OR the V_CODE = 24288, you can use the following command sequence:
SELECT P_DESCRIPT, P_INDATE, P_PRICE, V_CODE
FROM PRODUCT
WHERE V_CODE = 21344
OR V_CODE = 24288;
That command generates the six rows shown in Figure 8.12 that match the logical restriction.
FIGURE 8.12 Select PRODUCT table attributes: logical OR
P_DESCRIPT                   P_INDATE    P_PRICE  V_CODE
18 cm pwr. saw blade         13-Dec-18     14.99   21344
22 cm pwr. saw blade         13-Nov-18     17.49   21344
B&D jigsaw, 30 cm blade      30-Dec-18    109.92   24288
B&D jigsaw, 20 cm blade      24-Dec-18     99.87   24288
Rat-tail file, 0.3 cm fine   15-Dec-18      4.99   21344
Hicut chain saw, 40 cm       07-Feb-19    256.99   24288
The logical AND has the same SQL syntax requirement. The following command generates a list of all rows for which P_PRICE is less than 50 AND for which P_INDATE is a date occurring after 15 January 2019:
SELECT P_DESCRIPT, P_INDATE, P_PRICE, V_CODE
FROM PRODUCT
WHERE P_PRICE < 50
AND P_INDATE > '15-Jan-2019';
This command produces the output shown in Figure 8.13.
FIGURE 8.13 Select PRODUCT table attributes: logical AND
P_DESCRIPT                   P_INDATE    P_PRICE  V_CODE
B&D cordless drill, 1.25 cm  20-Jan-19     38.95   25595
Claw hammer                  20-Jan-19      9.95   21225
PVC pipe, 9 cm, 2.5 m        20-Feb-19      5.87
3 cm metal screw, 25         01-Mar-19      6.99   21225
6 cm wd. screw, 50           24-Feb-19      8.45   21231
You can combine the logical OR with the logical AND to place further restrictions on the output. For example, suppose you want a table listing for the following conditions:
The P_INDATE is after 15 January 2019, and the P_PRICE is less than 50.
Or the V_CODE is 24288.
To produce the required listing, use:
SELECT P_DESCRIPT, P_INDATE, P_PRICE, V_CODE
FROM PRODUCT
WHERE (P_PRICE < 50 AND P_INDATE > '15-Jan-2019')
OR V_CODE = 24288;
Note the use of parentheses to combine logical restrictions. Where you place the parentheses depends on how you want the logical restrictions to be executed. Conditions listed within parentheses are always executed first. The preceding query yields the output shown in Figure 8.14.
FIGURE 8.14 Select PRODUCT table attributes: logical AND and OR
P_DESCRIPT                   P_INDATE    P_PRICE  V_CODE
B&D jigsaw, 30 cm blade      30-Dec-18    109.92   24288
B&D jigsaw, 20 cm blade      24-Dec-18     99.87   24288
B&D cordless drill, 1.25 cm  20-Jan-19     38.95   25595
Claw hammer                  20-Jan-19      9.95   21225
Hicut chain saw, 40 cm       07-Feb-19    256.99   24288
PVC pipe, 9 cm, 2.5 m        20-Feb-19      5.87
3 cm metal screw, 25         01-Mar-19      6.99   21225
6 cm wd. screw, 50           24-Feb-19      8.45   21231
Note that the three rows with the V_CODE = 24288 are included regardless of the P_INDATE and P_PRICE entries for those rows.
The use of the logical operators OR and AND can become quite complex when numerous restrictions are placed on the query. In fact, a specialty field in mathematics known as Boolean algebra is dedicated to the use of logical operators.
The logical operator NOT is used to negate the result of a conditional expression. That is, in SQL, all conditional expressions evaluate to true or false. If an expression is true, the row is selected; if an expression is false, the row is not selected. The NOT logical operator is typically used to find the rows that do not match a certain condition. For example, if you want to see a listing of all rows for which the vendor code is not 21344, use the command sequence:
SELECT *
FROM PRODUCT
WHERE NOT (V_CODE = 21344);
Note that the condition is enclosed in parentheses; that practice is optional, but it is highly recommended for clarity. The logical NOT can be combined with AND and OR.
NOTE
If your SQL version does not support the logical NOT, you can generate the required output by using the condition:
WHERE V_CODE <> 21344
If your version of SQL does not support <>, use:
WHERE V_CODE != 21344
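The effect of parentheses on AND and OR can be checked with a small runnable sketch. It assumes SQLite through Python's standard sqlite3 module rather than the Oracle or Access systems used in this chapter, a cut-down PRODUCT table with four illustrative rows, and ISO date strings (SQLite stores dates as text, so '2019-01-15' stands in for '15-Jan-2019').

```python
import sqlite3

# In-memory SQLite database; the rows are illustrative, not the book's full table.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE PRODUCT (P_DESCRIPT TEXT, P_INDATE TEXT, P_PRICE REAL, V_CODE INTEGER)"
)
conn.executemany(
    "INSERT INTO PRODUCT VALUES (?, ?, ?, ?)",
    [
        ("B&D jigsaw, 30 cm blade", "2018-12-30", 109.92, 24288),
        ("Claw hammer", "2019-01-20", 9.95, 21225),
        ("Rat-tail file", "2018-12-15", 4.99, 21344),
        ("Hicut chain saw", "2019-02-07", 256.99, 24288),
    ],
)


def descriptions(where):
    """Return the sorted P_DESCRIPT values of the rows matching a WHERE clause."""
    sql = "SELECT P_DESCRIPT FROM PRODUCT WHERE " + where
    return sorted(row[0] for row in conn.execute(sql))


# The parenthesised AND is evaluated first, then OR adds every 24288 row.
grouped = descriptions("(P_PRICE < 50 AND P_INDATE > '2019-01-15') OR V_CODE = 24288")

# Moving the parentheses changes which rows qualify.
regrouped = descriptions("P_PRICE < 50 AND (P_INDATE > '2019-01-15' OR V_CODE = 24288)")

# NOT negates the enclosed condition.
not_21344 = descriptions("NOT (V_CODE = 21344)")

print(grouped)
print(regrouped)
print(not_21344)
```

With these sample rows, the first form keeps the inexpensive recent item plus both vendor 24288 rows, while the second form keeps only rows priced below 50.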
8.4.4 Special Operators
ANSI-standard SQL allows the use of special operators in conjunction with the WHERE clause. These special operators include:
BETWEEN  Used to check whether an attribute value is within a range.
IS NULL  Used to check whether an attribute value is null.
LIKE  Used to check whether an attribute value matches a given string pattern.
IN  Used to check whether an attribute value matches any value within a value list.
EXISTS  Used to check whether a subquery returns any rows.
The BETWEEN Special Operator
If you use software that implements a standard SQL, the operator BETWEEN may be used to check whether an attribute value is within a range of values. For example, if you want to see a listing for all products whose prices are between 50 and 100, use the following command sequence:
SELECT *
FROM PRODUCT
WHERE P_PRICE BETWEEN 50.00 AND 100.00;
NOTE TO ORACLE AND MYSQL USERS
Always specify the lower range value first when using the BETWEEN special operator. If you list the higher range value first, Oracle returns an empty result set.
If your DBMS does not support BETWEEN, you can use the following condition. Because BETWEEN includes both end values of the range, the equivalent comparison uses >= and <=:
SELECT *
FROM PRODUCT
WHERE P_PRICE >= 50.00 AND P_PRICE <= 100.00;
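A minimal sketch of how BETWEEN treats its end values, assuming SQLite through Python's sqlite3 module and a two-column PRODUCT table with made-up codes A to E. It compares BETWEEN with the inclusive and strict comparison forms and shows what happens when the bounds are listed in the wrong order.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE PRODUCT (P_CODE TEXT, P_PRICE REAL)")
conn.executemany(
    "INSERT INTO PRODUCT VALUES (?, ?)",
    [("A", 49.99), ("B", 50.00), ("C", 75.00), ("D", 100.00), ("E", 100.01)],
)


def codes(where):
    """Return the P_CODE values matching a WHERE clause, in code order."""
    sql = "SELECT P_CODE FROM PRODUCT WHERE " + where + " ORDER BY P_CODE"
    return [row[0] for row in conn.execute(sql)]


between = codes("P_PRICE BETWEEN 50.00 AND 100.00")        # both end values included
inclusive = codes("P_PRICE >= 50.00 AND P_PRICE <= 100.00")  # same rows as BETWEEN
strict = codes("P_PRICE > 50.00 AND P_PRICE < 100.00")       # end values excluded
reversed_bounds = codes("P_PRICE BETWEEN 100.00 AND 50.00")  # higher value first

print(between, inclusive, strict, reversed_bounds)
```

The reversed form returns no rows because no price can be at least 100.00 and at most 50.00 at the same time.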
The IS NULL Special Operator
Standard SQL allows the use of IS NULL to check for a null attribute value. For example, suppose you want to list all products that do not have a vendor assigned (V_CODE is null). To find such a null entry, use the command sequence:
SELECT P_CODE, P_DESCRIPT, V_CODE
FROM PRODUCT
WHERE V_CODE IS NULL;
Similarly, if you want to check a null date entry, the command sequence is:
SELECT P_CODE, P_DESCRIPT, P_INDATE
FROM PRODUCT
WHERE P_INDATE IS NULL;
Note that SQL uses a special operator to test for nulls. Why couldn't you just enter a condition such as V_CODE = NULL? Technically, NULL is not a value (such as the number 0 (zero) or the blank space), but a special property of an attribute that represents precisely the absence of any value.
The LIKE Special Operator
The LIKE special operator is used in conjunction with wildcards to find patterns within string attributes. Standard SQL allows you to use the per cent sign (%) and underscore (_) wildcard characters to make matches when the entire string is not known:
% means any and all following or preceding characters are eligible. For example:
J% includes Johnson, Jones, Jernigan, July and J-231Q
Jo% includes Johnson and Jones
_ means any one character may be substituted for the underscore. For example:
_23-456-6789 includes 123-456-6789, 223-456-6789 and 323-456-6789
_23-_56-678_ includes 123-156-6781, 123-256-6782 and 823-956-6788
_o_es includes Jones, Cones, Cokes, totes and roles
NOTE
Some RDBMSs, such as Microsoft Access, use the wildcard characters * and ? instead of % and _.
For example, the following query would find all VENDOR rows for contacts whose last names begin with Smith.
SELECT V_NAME, V_CONTACT, V_AREACODE, V_PHONE
FROM VENDOR
WHERE V_CONTACT LIKE 'Smith%';
If you check the original VENDOR data in Figure 8.2 again, you'll see that this SQL query yields three records: two Smiths and one Smithson.
Keep in mind that most SQL implementations yield case-sensitive searches. For example, Oracle will not yield a return that includes Jones if you use the wildcard search delimiter jo% in a search for last names. The reason is that Jones begins with a capital J and your wildcard search starts with a lowercase j. On the other hand, Microsoft Access searches are not case sensitive.
For example, suppose you typed the following query in Oracle:
SELECT V_NAME, V_CONTACT, V_AREACODE, V_PHONE
FROM VENDOR
WHERE V_CONTACT LIKE 'SMITH%';
No rows are returned because character-based queries may be case sensitive. That is, an uppercase character has a different ASCII code from a lowercase character, thus causing SMITH, Smith and smith to be evaluated as different (unequal) entries. Because the table contains no vendor whose last name begins with (uppercase) SMITH, the (uppercase) SMITH% used in the query cannot be matched. Matches can be made only when the query entry is written exactly like the table entry.
Some RDBMSs, such as Microsoft Access, automatically make the necessary conversions to eliminate case sensitivity. Others, such as Oracle, provide a special UPPER function to convert both table and query character entries to uppercase. (The conversion is done in the computer's memory only; the conversion has no effect on how the value is actually stored in the table.) So if you want to avoid a no-match result based on case sensitivity, and if your RDBMS allows the use of the UPPER function, you can generate the same results by using the query:
SELECT V_NAME, V_CONTACT, V_AREACODE, V_PHONE
FROM VENDOR
WHERE UPPER(V_CONTACT) LIKE 'SMITH%';
The preceding query produces a list including all rows that contain a last name that begins with Smith, regardless of uppercase or lowercase letter combinations such as Smith, smith and SMITH.
The logical operators may be used in conjunction with the special operators. For instance, the query:
SELECT V_NAME, V_CONTACT, V_AREACODE, V_PHONE
FROM VENDOR
WHERE V_CONTACT NOT LIKE 'Smith%';
will yield an output of all vendors whose names do not start with Smith.
Suppose you do not know whether a person's name is spelled Johnson or Johnsen. The wildcard character _ lets you find a match for either spelling. The proper search would be instituted by the query:
SELECT *
FROM VENDOR
WHERE V_CONTACT LIKE 'Johns_n';
Thus, the wildcards allow you to make matches when only approximate spellings are known. Wildcard characters may be used in combinations. For example, the wildcard search based on the string _l% can yield the strings Al, Alton, Elgin, Blakeston, blank, bloated and eligible.
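The null test and the wildcard patterns can be tried out with the following sketch. It assumes SQLite through Python's sqlite3 module and two small illustrative tables. One caveat: SQLite's LIKE ignores case for ASCII letters by default, so it behaves like Access rather than Oracle in that respect; the UPPER() form works the same way in either kind of system.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE PRODUCT (P_CODE TEXT, V_CODE INTEGER);
INSERT INTO PRODUCT VALUES ('23109-HB', 21225);
INSERT INTO PRODUCT VALUES ('23114-AA', NULL);
CREATE TABLE VENDOR (V_CONTACT TEXT);
INSERT INTO VENDOR VALUES ('Smith');
INSERT INTO VENDOR VALUES ('Smithson');
INSERT INTO VENDOR VALUES ('Johnson');
INSERT INTO VENDOR VALUES ('Johnsen');
INSERT INTO VENDOR VALUES ('Jones');
""")


def column(sql):
    """Return the first column of every row produced by a query."""
    return [row[0] for row in conn.execute(sql)]


# A comparison with NULL is never true, so "= NULL" selects nothing.
equals_null = column("SELECT P_CODE FROM PRODUCT WHERE V_CODE = NULL")
is_null = column("SELECT P_CODE FROM PRODUCT WHERE V_CODE IS NULL")


def contacts(where):
    return column("SELECT V_CONTACT FROM VENDOR WHERE " + where + " ORDER BY V_CONTACT")


smiths = contacts("V_CONTACT LIKE 'Smith%'")            # % matches any trailing characters
johns_n = contacts("V_CONTACT LIKE 'Johns_n'")          # _ matches exactly one character
upper_smiths = contacts("UPPER(V_CONTACT) LIKE 'SMITH%'")
not_smiths = contacts("V_CONTACT NOT LIKE 'Smith%'")

print(equals_null, is_null)
print(smiths, johns_n, upper_smiths, not_smiths)
```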
The IN Special Operator
Many queries that would require the use of the logical OR can be more easily handled with the help of the special operator IN. For example, the query:
SELECT *
FROM PRODUCT
WHERE V_CODE = 21344
OR V_CODE = 24288;
can be handled more efficiently with:
SELECT *
FROM PRODUCT
WHERE V_CODE IN (21344, 24288);
Note that the IN operator uses a value list. All of the values in the list must be of the same data type. Each of the values in the value list is compared to the attribute, in this case, V_CODE. If the V_CODE value matches any of the values in the list, the row is selected. In this example, the rows selected will be only those in which the V_CODE is either 21344 or 24288.
If the attribute used is of a character data type, the list values must be enclosed in single quotation marks. For instance, if the V_CODE had been defined as CHAR(5) during the table-creation process, the preceding query would have read:
SELECT *
FROM PRODUCT
WHERE V_CODE IN ('21344', '24288');
The IN operator is especially valuable when it is used in conjunction with subqueries. For example, suppose you want to list the V_CODE and V_NAME of only those vendors who provide products. In that case, you could use a subquery within the IN operator to generate the value list automatically. The query is:
SELECT V_CODE, V_NAME
FROM VENDOR
WHERE V_CODE IN (SELECT V_CODE FROM PRODUCT);
The preceding query is executed in two steps:
The inner query or subquery generates a list of V_CODE values from the PRODUCT table. Those V_CODE values represent the vendors who supply products.
The IN operator compares the values generated by the subquery to the V_CODE values in the VENDOR table and selects only the rows with matching values, that is, the vendors who provide products.
The IN special operator will receive additional attention in Chapter 9, Procedural Language SQL and Advanced SQL, where you will learn more about subqueries.
The EXISTS Special Operator
EXISTS can be used whenever there is a requirement to execute a command based on the result of another query. That is, if a subquery returns any rows, run the main query; otherwise, don't. For example, the following query will list all vendors, but only if there are products to order:
SELECT *
FROM VENDOR
WHERE EXISTS (SELECT * FROM PRODUCT WHERE P_QOH <= P_MIN);
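The IN value list, the IN subquery and the all-or-nothing behaviour of an uncorrelated EXISTS can be seen side by side in the sketch below. It assumes SQLite through Python's sqlite3 module; the vendor names and the three product rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE VENDOR (V_CODE INTEGER PRIMARY KEY, V_NAME TEXT);
CREATE TABLE PRODUCT (P_CODE TEXT PRIMARY KEY, V_CODE INTEGER, P_QOH INTEGER, P_MIN INTEGER);
INSERT INTO VENDOR VALUES (21344, 'Vendor A');
INSERT INTO VENDOR VALUES (24288, 'Vendor B');
INSERT INTO VENDOR VALUES (25501, 'Vendor C');
INSERT INTO PRODUCT VALUES ('P1', 21344, 20, 5);
INSERT INTO PRODUCT VALUES ('P2', 24288, 30, 10);
INSERT INTO PRODUCT VALUES ('P3', NULL, 8, 5);
""")


def vendor_codes(where):
    """Return the V_CODE values of the VENDOR rows matching a WHERE clause."""
    sql = "SELECT V_CODE FROM VENDOR WHERE " + where + " ORDER BY V_CODE"
    return [row[0] for row in conn.execute(sql)]


# IN with a literal value list.
in_list = [
    row[0]
    for row in conn.execute(
        "SELECT P_CODE FROM PRODUCT WHERE V_CODE IN (21344, 24288) ORDER BY P_CODE"
    )
]

# IN with a subquery: only vendors that appear in PRODUCT.
in_subquery = vendor_codes("V_CODE IN (SELECT V_CODE FROM PRODUCT)")

# No product has P_QOH <= P_MIN, so the subquery is empty and no vendor is listed.
exists_to_order = vendor_codes("EXISTS (SELECT * FROM PRODUCT WHERE P_QOH <= P_MIN)")

# P3 has 8 on hand and a minimum of 5 (8 < 10), so every vendor is listed.
exists_below_double = vendor_codes("EXISTS (SELECT * FROM PRODUCT WHERE P_QOH < P_MIN * 2)")

print(in_list, in_subquery, exists_to_order, exists_below_double)
```

Because the EXISTS subquery here does not refer to the outer VENDOR row, it acts as a switch for the whole query: either all vendors are listed or none.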
The EXISTS special operator is used in the following example to list all vendors, but only if there are products with the quantity on hand less than double the minimum quantity:
SELECT *
FROM VENDOR
WHERE EXISTS (SELECT * FROM PRODUCT WHERE P_QOH < P_MIN * 2);
The EXISTS special operator will receive additional attention in the next chapter, where you will learn more about subqueries.
8.5 ADVANCED DATA DEFINITION COMMANDS
In this section, you will learn how to change (alter) table structures by changing attribute characteristics and by adding columns. Then you will learn how to do advanced data updates to the new columns. Finally, you will learn how to copy tables or parts of tables and how to delete tables.
All changes in the table structure are made by using the ALTER TABLE command, followed by a keyword that produces the specific change you want to make. Three options are available: ADD, MODIFY and DROP. ADD enables you to add a column, and MODIFY enables you to change column characteristics. DROP allows you to delete a column from a table. Most RDBMSs do not allow you to delete a column (unless the column does not contain any values) because such an action may delete crucial data that are used by other tables.
The basic syntax to add or modify columns is:
ALTER TABLE tablename
{ADD | MODIFY} (columnname datatype [ {ADD | MODIFY} columnname datatype]);
You can also use the ALTER TABLE command to add table constraints. In these cases, the syntax is:
ALTER TABLE tablename
ADD constraint [ ADD constraint ] ;
where constraint refers to a constraint definition similar to those you learned in Section 8.2.6.
You could also use the ALTER TABLE command to remove a column or table constraint. The syntax is:
ALTER TABLE tablename
DROP {PRIMARY KEY | COLUMN columnname | CONSTRAINT constraintname };
Notice that, when removing a constraint, you need to specify the name given to the constraint. That is one reason why you should always name your constraints in your CREATE TABLE or ALTER TABLE statement.
8.5.1 Changing a Column's Data Type
Using the ALTER syntax, the (integer) V_CODE in the PRODUCT table can be changed to a character V_CODE by using:
ALTER TABLE PRODUCT
MODIFY (V_CODE CHAR(5));
Some RDBMSs, such as Oracle, do not let you change data types unless the column to be changed is empty. For example, if you want to change the V_CODE field from the current number definition to a character definition, the above command will yield an error message, because the V_CODE column already contains data. The error message is easily explained. Remember that the V_CODE in PRODUCT references the V_CODE in VENDOR. If you change the V_CODE data type, the data types don't match, and there is a referential integrity violation, thus triggering the error message. If the V_CODE column does not contain data, the preceding command sequence produces the expected table structure alteration (if the foreign key reference was not specified during the creation of the PRODUCT table).
8.5.2 Changing a Column's Data Characteristics
If the column to be changed already contains data, you can make changes in the column's characteristics if those changes do not alter the data type. For example, if you want to increase the width of the P_PRICE column to nine digits, use the command:
ALTER TABLE PRODUCT
MODIFY (P_PRICE DECIMAL(9,2));
If you now list the table contents, you see that the column width of P_PRICE has increased by one digit.
NOTE
Some DBMSs impose limitations on when it's possible to change attribute characteristics. For example, Oracle lets you increase (but not decrease) the size of a column. The reason for this restriction is that an attribute modification affects the integrity of the data in the database. In fact, some attribute changes can be done only when there are no data in any rows for the affected attribute.
8.5.3 Adding a Column
You can alter an existing table by adding one or more columns. In the following example, you add the column named P_SALECODE to the PRODUCT table. (This column will be used later to determine whether goods that have been in inventory for a certain length of time should be placed on special sale.)
Suppose you expect the P_SALECODE entries to be 1, 2 or 3. Because there will be no arithmetic performed with the P_SALECODE, the P_SALECODE is classified as a single-character attribute. Note the inclusion of all required information in the following ALTER command:
ALTER TABLE PRODUCT
ADD (P_SALECODE CHAR(1));
Online Content If you are using the Microsoft Access databases provided on the online platform accompanying this book, you can track each of the updates in the following sections. For example, look at the copies of the PRODUCT table in the 'Ch08_SaleCo' database, one named PRODUCT_2 and one named PRODUCT_3. Each of the two copies includes the new P_SALECODE column. If you want to see the cumulative effect of all UPDATE commands, you can continue using the PRODUCT table with the P_SALECODE modification and all of the changes you will make in the following sections. (You may even want to use both options, first to examine the individual effects of the update queries and then to examine the cumulative effects.)
When adding a column, be careful not to include the NOT NULL clause for the new column. Doing so causes an error message; if you add a new column to a table that already has rows, the existing rows will default to a value of null for the new column. Therefore, it is not possible to add the NOT NULL clause for this new column. (You can, of course, add the NOT NULL clause to the table structure after all of the data for the new column have been entered and the column no longer contains nulls.)
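The behaviour described above can be reproduced in a small sketch, assuming SQLite through Python's sqlite3 module. SQLite spells the command ADD COLUMN without parentheses, unlike the Oracle-style ADD ( ... ) form in the text, and the P_FLAG column is a hypothetical name used only to trigger the error.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE PRODUCT (P_CODE TEXT PRIMARY KEY, P_DESCRIPT TEXT)")
conn.execute("INSERT INTO PRODUCT VALUES ('1546-QQ2', 'Hrd. cloth')")

# Add the new column; the row that already exists receives a null in it.
conn.execute("ALTER TABLE PRODUCT ADD COLUMN P_SALECODE CHAR(1)")
existing_value = conn.execute("SELECT P_SALECODE FROM PRODUCT").fetchone()[0]

# SQLite rejects a NOT NULL column that is added without a default value.
try:
    conn.execute("ALTER TABLE PRODUCT ADD COLUMN P_FLAG CHAR(1) NOT NULL")
    not_null_rejected = False
except sqlite3.OperationalError:
    not_null_rejected = True

column_names = [row[1] for row in conn.execute("PRAGMA table_info(PRODUCT)")]
print(existing_value, not_null_rejected, column_names)
```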
8.5.4 Dropping a Column
Occasionally, you may want to modify a table by deleting a column. Suppose you want to delete the V_ORDER attribute from the VENDOR table. To accomplish that, you would use the following command:
ALTER TABLE VENDOR
DROP COLUMN V_ORDER;
Again, some RDBMSs impose restrictions on attribute deletion. For example, you may not drop attributes that are involved in foreign key relationships, nor may you delete an attribute of a table that contains only that one attribute.
8.5.5 Advanced Data Updates
To make data entries in an existing row's columns, SQL employs the UPDATE command. The UPDATE command updates only data in existing rows. For example, to enter the P_SALECODE value 2 in the fourth row, use the UPDATE command together with the primary key P_CODE 1546-QQ2. To enter the value, use the command sequence:
UPDATE PRODUCT
SET P_SALECODE = '2'
WHERE P_CODE = '1546-QQ2';
Enter subsequent data the same way, defining each entry location by its primary key (P_CODE) and its column location (P_SALECODE). For example, if you want to enter the P_SALECODE value 1 for the P_CODE values 2232/QWE and 2232/QTY, you use:
UPDATE PRODUCT
SET P_SALECODE = '1'
WHERE P_CODE IN ('2232/QWE', '2232/QTY');
If your RDBMS does not support IN, use the following command:
UPDATE PRODUCT
SET P_SALECODE = '1'
WHERE P_CODE = '2232/QWE' OR P_CODE = '2232/QTY';
To check the results of your efforts, use:
SELECT P_CODE, P_DESCRIPT, P_INDATE, P_PRICE, P_SALECODE
FROM PRODUCT;
Although the UPDATE sequences just shown allow you to enter values into specified table cells, the process is very cumbersome. Fortunately, if a relationship can be established between the entries and the existing columns, the relationship can be used to assign values to their appropriate slots. For example, suppose you want to place sales codes based on the P_INDATE into the table, using the following schedule:
P_INDATE                                       P_SALECODE
before 25 December 2018                        2
between 16 January 2019 and 10 February 2019   1
Using the PRODUCT table, the following two command sequences make the appropriate assignments:
UPDATE PRODUCT
SET P_SALECODE = '2'
WHERE P_INDATE < '25-Dec-2018';
UPDATE PRODUCT
SET P_SALECODE = '1'
WHERE P_INDATE >= '16-Jan-2019' AND P_INDATE <= '10-Feb-2019';
To check the results of those two command sequences, use:
SELECT P_CODE, P_DESCRIPT, P_INDATE, P_PRICE, P_SALECODE
FROM PRODUCT;
If you have made all of the updates shown in this section using Oracle, your PRODUCT table should look like Figure 8.15. Make sure that you issue a COMMIT statement to save these changes.
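The two schedule-based updates can be sketched as follows, assuming SQLite through Python's sqlite3 module, a three-column PRODUCT table with five illustrative rows, and ISO date strings in place of the Oracle-style date literals. The cursor's rowcount attribute reports how many rows each UPDATE changed.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE PRODUCT (P_CODE TEXT PRIMARY KEY, P_INDATE TEXT, P_SALECODE CHAR(1))"
)
conn.executemany(
    "INSERT INTO PRODUCT (P_CODE, P_INDATE) VALUES (?, ?)",
    [
        ("13-Q2/P2", "2018-12-13"),
        ("2232/QWE", "2018-12-24"),
        ("2232/QTY", "2018-12-30"),
        ("23109-HB", "2019-01-20"),
        ("SM-18277", "2019-03-01"),
    ],
)

# Sale code 2 for stock received before 25 December 2018.
cur = conn.execute("UPDATE PRODUCT SET P_SALECODE = '2' WHERE P_INDATE < '2018-12-25'")
rows_coded_2 = cur.rowcount

# Sale code 1 for stock received from 16 January to 10 February 2019, inclusive.
cur = conn.execute(
    "UPDATE PRODUCT SET P_SALECODE = '1' "
    "WHERE P_INDATE >= '2019-01-16' AND P_INDATE <= '2019-02-10'"
)
rows_coded_1 = cur.rowcount
conn.commit()  # make the updates permanent

# Rows outside both date ranges keep a null sale code.
salecodes = dict(conn.execute("SELECT P_CODE, P_SALECODE FROM PRODUCT"))
print(rows_coded_2, rows_coded_1, salecodes)
```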
Online Content The screenshots provided in Chapter 8, Beginning Structured Query Language, and Chapter 9, Procedural Language SQL and Advanced SQL, were taken from Oracle SQL Developer.5 Oracle SQL Developer is a graphical tool for database development provided by Oracle. It is free to use, can be used with any Oracle Database version 10g and later, and runs on Windows, Linux and Mac OSX. Throughout Chapters 8 and 9, it will be used as an editor to explore the use of DML and DDL commands. All Oracle scripts provided with this book will also work on Oracle Application Express (APEX), which is a cloud-based platform that runs within the Oracle database. A guide for how to use Oracle APEX.6 can be found on the online platform accompanying this book in Appendix N. Your college or university may be part of the Oracle Academy programme. If so, you may be using Oracle Academy software, which can be used to learn SQL and PL/SQL. Learn more about Oracle Academy here: https://academy.oracle.com/en/oa-web-overview.html
5 Getting Started with Oracle SQL Developer. Available: www.oracle.com/database/technologies/appdev/sql-developer.html
6 Developing Applications with Oracle APEX. Available: www.oracle.com/database/technologies/appdev/apex.html
The arithmetic operators are particularly useful in data updates. For example, if the quantity on hand in your PRODUCT table has dropped below the minimum desirable value, you'll order more of the product. Suppose, for example, you have ordered 20 units of product 2232/QWE. When the 20 units arrive, you'll want to add them to inventory, using:
UPDATE PRODUCT
SET P_QOH = P_QOH + 20
WHERE P_CODE = '2232/QWE';
FIGURE 8.15 The cumulative effect of multiple updates in the PRODUCT table (Oracle-APEX)
If you want to add 10 per cent to the price for all products that have current prices below 50, you can use:
UPDATE PRODUCT
SET P_PRICE = P_PRICE * 1.10
WHERE P_PRICE < 50.00;
NOTE
If you fail to roll back the changes of the preceding UPDATE queries, the output of the subsequent queries will not match the results shown in the figures. Therefore, if you are using Oracle, use the ROLLBACK command to restore the database to its previous state.
Online Content If you are using Access, copy the original 'Ch08_SaleCo.mdb' file from the online platform for this book.
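The arithmetic updates and the effect of ROLLBACK can be sketched as follows, assuming SQLite through Python's sqlite3 module (whose default mode opens a transaction before each UPDATE) and two illustrative PRODUCT rows with prices chosen for easy arithmetic.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE PRODUCT (P_CODE TEXT PRIMARY KEY, P_QOH INTEGER, P_PRICE REAL)")
conn.executemany(
    "INSERT INTO PRODUCT VALUES (?, ?, ?)",
    [("2232/QWE", 6, 99.87), ("23109-HB", 23, 10.00)],
)
conn.commit()  # the committed rows are the state that ROLLBACK returns to

QUERY = "SELECT P_CODE, P_QOH, ROUND(P_PRICE, 2) FROM PRODUCT ORDER BY P_CODE"

# Add the 20 delivered units, then raise every price below 50 by 10 per cent.
conn.execute("UPDATE PRODUCT SET P_QOH = P_QOH + 20 WHERE P_CODE = '2232/QWE'")
conn.execute("UPDATE PRODUCT SET P_PRICE = P_PRICE * 1.10 WHERE P_PRICE < 50.00")
after_update = conn.execute(QUERY).fetchall()

# Undo both uncommitted updates.
conn.rollback()
after_rollback = conn.execute(QUERY).fetchall()

print(after_update)
print(after_rollback)
```

Had conn.commit() been called after the updates, the rollback would have had nothing left to undo.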
8.5.6 Copying Parts of Tables
As you will discover in later chapters on database design, sometimes it is necessary to break up a table structure into several component parts (or smaller tables). Fortunately, SQL allows you to copy the contents of selected table columns so that the data need not be re-entered manually into the newly created table(s). For example, if you want to copy P_CODE, P_DESCRIPT, P_PRICE and V_CODE from the PRODUCT table to a new table named PART, you create the PART table structure first, as follows:
CREATE TABLE PART(
PART_CODE CHAR(8) NOT NULL UNIQUE,
PART_DESCRIPT CHAR(35),
PART_PRICE DECIMAL(8,2),
V_CODE INTEGER,
PRIMARY KEY (PART_CODE));
Note that the PART column names need not be identical to those of the original table and that the new table need not have the same number of columns as the original table. In this case, the first column in the PART table is PART_CODE, rather than the original P_CODE found in the PRODUCT table. And the PART table contains only four columns rather than the seven columns found in the PRODUCT table. However, column characteristics must match; you cannot copy a character-based attribute into a numeric structure and vice versa.
Next, you need to add the rows to the new PART table, using the PRODUCT table rows. To do that, you use the INSERT command you learnt in Section 8.3.7. The syntax is:
INSERT INTO target_tablename[(target_columnlist)]
SELECT source_columnlist
FROM source_tablename;
Note that the target column list is required if the source column list doesn't match all of the attribute names and characteristics of the target table (including the order of the columns). Otherwise, you do not need to specify the target column list. In this example, you must specify the target column list in the INSERT command below because the column names of the target table are different:
INSERT INTO PART (PART_CODE, PART_DESCRIPT, PART_PRICE, V_CODE)
SELECT P_CODE, P_DESCRIPT, P_PRICE, V_CODE FROM PRODUCT;
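The copy can be run end to end with the sketch below, assuming SQLite through Python's sqlite3 module and a cut-down PRODUCT table with two illustrative rows. The PART definition and the INSERT ... SELECT statement follow the text; only the source table is simplified.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE PRODUCT (
    P_CODE     CHAR(8) PRIMARY KEY,
    P_DESCRIPT CHAR(35),
    P_QOH      INTEGER,
    P_PRICE    DECIMAL(8,2),
    V_CODE     INTEGER);
INSERT INTO PRODUCT VALUES ('23109-HB', 'Claw hammer', 23, 9.95, 21225);
INSERT INTO PRODUCT VALUES ('54778-2T', 'Rat-tail file', 43, 4.99, 21344);

CREATE TABLE PART (
    PART_CODE     CHAR(8) NOT NULL UNIQUE,
    PART_DESCRIPT CHAR(35),
    PART_PRICE    DECIMAL(8,2),
    V_CODE        INTEGER,
    PRIMARY KEY (PART_CODE));

-- The target column list is needed because the column names differ.
INSERT INTO PART (PART_CODE, PART_DESCRIPT, PART_PRICE, V_CODE)
SELECT P_CODE, P_DESCRIPT, P_PRICE, V_CODE FROM PRODUCT;
""")

part_rows = conn.execute("SELECT * FROM PART ORDER BY PART_CODE").fetchall()
print(part_rows)
```

Only the four selected columns arrive in PART; the P_QOH values stay behind in PRODUCT.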
The contents of the PART table can now be examined by using the query:
SELECT * FROM PART;
to generate the new PART table's contents, shown in Figure 8.16.
FIGURE 8.16 PART table attributes copied from the PRODUCT table
PART_CODE  PART_DESCRIPT                           PART_PRICE  V_CODE
11QER/31   Power painter, 15 psi., 3-nozzle            109.99   25595
13-Q2/P2   7.25 cm pwr. saw blade                       14.99   21344
14-Q1/L3   9.00 cm pwr. saw blade                       17.49   21344
1546-QQ2   Hrd. cloth, 1/4 cm, 2 x 50                   39.95   23119
1558-QW1   Hrd. cloth, 1/2 cm, 3 x 50                   43.99   23119
2232/QTY   B&D jigsaw, 12 cm blade                     109.92   24288
2232/QWE   B&D jigsaw, 8 cm blade                       99.87   24288
2238/QPD   B&D cordless drill, 1/2 cm                   38.95   25595
23109-HB   Claw hammer                                   9.95   21225
23114-AA   Sledge hammer, 12 kg                         14.40
54778-2T   Rat-tail file, 1/8 cm fine                    4.99   21344
89-WRE-Q   Hicut chain saw, 16 cm                      256.99   24288
PVC23DRT   PVC pipe, 3.5 cm, 8 m                         5.87
SM-18277   1.25 cm metal screw, 25                       6.99   21225
SW-23116   2.5 cm wd. screw, 50                          8.45   21231
WR3/TT3    Steel matting, 4 x 8 x 1/6 m, .5 m mesh     119.95   25595
SQL also provides another way to rapidly create a new table based on selected columns and rows of an existing table. In this case, the new table copies the attribute names, data characteristics and rows of the original table. The Oracle version of the command is:
CREATE TABLE PART AS
SELECT P_CODE AS PART_CODE, P_DESCRIPT AS PART_DESCRIPT,
P_PRICE AS PART_PRICE, V_CODE
FROM PRODUCT;
If the PART table already exists, Oracle will not let you overwrite the existing table. To run this command, you must first delete the existing PART table. (See Section 8.5.8.)
The Microsoft Access version of this command is:
SELECT P_CODE AS PART_CODE, P_DESCRIPT AS PART_DESCRIPT,
P_PRICE AS PART_PRICE, V_CODE INTO PART
FROM PRODUCT;
If the PART table already exists, Microsoft Access will ask if you want to delete the existing table and continue with the creation of the new PART table.
The SQL command just shown creates a new PART table with PART_CODE, PART_DESCRIPT, PART_PRICE and V_CODE columns. In addition, all of the data rows (for the selected columns) are copied automatically. But note that no entity integrity (primary key) or referential integrity (foreign key) rules are automatically applied to the new table. In the next section, you will learn how to define the PK to enforce entity integrity and the FK to enforce referential integrity, respectively.
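That the copied table carries no primary key can be confirmed with a short sketch, assuming SQLite through Python's sqlite3 module, which accepts the same CREATE TABLE ... AS SELECT form as the Oracle command. The one-row PRODUCT table is illustrative, and pk_columns is a hypothetical helper built on SQLite's PRAGMA table_info.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE PRODUCT (
    P_CODE     CHAR(8) PRIMARY KEY,
    P_DESCRIPT CHAR(35),
    P_PRICE    DECIMAL(8,2),
    V_CODE     INTEGER);
INSERT INTO PRODUCT VALUES ('23109-HB', 'Claw hammer', 9.95, 21225);

CREATE TABLE PART AS
SELECT P_CODE AS PART_CODE, P_DESCRIPT AS PART_DESCRIPT,
       P_PRICE AS PART_PRICE, V_CODE
FROM PRODUCT;
""")


def pk_columns(table):
    """List the primary key columns of a table.

    PRAGMA table_info rows are (cid, name, type, notnull, dflt_value, pk);
    pk is greater than zero for columns that belong to the primary key.
    """
    return [row[1] for row in conn.execute(f"PRAGMA table_info({table})") if row[5] > 0]


product_pk = pk_columns("PRODUCT")
part_pk = pk_columns("PART")
part_columns = [row[1] for row in conn.execute("PRAGMA table_info(PART)")]
part_row_count = conn.execute("SELECT COUNT(*) FROM PART").fetchone()[0]

print(product_pk, part_pk, part_columns, part_row_count)
```

The column names and data rows are copied, but the key definition is not.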
8.5.7 Adding Primary and Foreign Key Designations
When you create a new table based on another table, the new table does not include integrity rules from the old table. (In particular, there is no primary key.) To define the primary key for the new PART table, use the following command:
ALTER TABLE PART
ADD PRIMARY KEY (PART_CODE);
Aside from the fact that the integrity rules are not automatically transferred to a new table that derives its data from one or more other tables, several other scenarios could leave you without entity and referential integrity. For example, you might have forgotten to define the primary and foreign keys when you created the original tables. Or, if you imported tables from a different database, you might have discovered that the importing procedure did not transfer the integrity rules. In any case, you can re-establish the integrity rules by using the ALTER command. For example, if the PART table's foreign key has not yet been designated, it can be designated by:
ALTER TABLE PART
ADD FOREIGN KEY (V_CODE) REFERENCES VENDOR;
Alternatively, if neither the PART table's primary key nor its foreign key has been designated, you can incorporate both changes at once, using:
ALTER TABLE PART
ADD PRIMARY KEY (PART_CODE)
ADD FOREIGN KEY (V_CODE) REFERENCES VENDOR;
Even composite primary keys and multiple foreign keys can be designated in a single SQL command. For example, if you want to enforce the integrity rules for the LINE table shown in Figure 8.1, you can use:
ALTER TABLE LINE
ADD PRIMARY KEY (INV_NUMBER, LINE_NUMBER)
ADD FOREIGN KEY (INV_NUMBER) REFERENCES INVOICE
ADD FOREIGN KEY (PROD_CODE) REFERENCES PRODUCT;
8.5.8 Deleting a Table From the Database
Use the DROP TABLE command to delete a table from the database. For example, you can delete the PART table you just created with:
DROP TABLE PART;
You can drop a table only if that table is not participating as the "one" side of any relationship. If you try to drop a table otherwise, the RDBMS generates an error message indicating that a foreign key integrity violation has occurred.
8.6 ADVANCED SELECT QUERIES
One of the most important advantages of SQL is its ability to produce complex free-form queries. The logical operators that were introduced earlier to update table contents work just as well in the query environment. In addition, SQL provides useful functions that count, find minimum and maximum values, calculate averages, and so on. Better yet, SQL allows the user to limit queries to only those entries that have no duplicates or entries whose duplicates can be grouped.
8.6.1 Ordering a Listing
The ORDER BY clause is especially useful when the listing order is important to you. The syntax is:
SELECT columnlist
FROM tablelist
[WHERE conditionlist ]
[ORDER BY columnlist [ASC | DESC] ] ;
Although you have the option of declaring the order type (ascending or descending), the default order is ascending. For example, if you want the contents of the PRODUCT table listed by P_PRICE in ascending order, use:
SELECT P_CODE, P_DESCRIPT, P_INDATE, P_PRICE
FROM PRODUCT
ORDER BY P_PRICE;
The output is shown in Figure 8.17. Note that ORDER BY yields an ascending price listing. Comparing the listing in Figure 8.17 to the actual table contents shown earlier in Figure 8.2, you will see that, in Figure 8.17, the lowest-priced product is listed first, followed by the next lowest-priced product, and so on. However, although ORDER BY produces a sorted output, the actual table contents are unaffected by the ORDER BY command.
To produce the list in descending order, you would enter:
SELECT P_CODE, P_DESCRIPT, P_INDATE, P_PRICE
FROM PRODUCT
ORDER BY P_PRICE DESC;
Ordered listings are used frequently. For example, suppose you want to create a phone directory. It would be helpful if you could produce an ordered sequence (last name, first name, initial) in three stages:
1 ORDER BY last name.
2 Within the last names, ORDER BY first name.
3 Within the order created in Step 2, ORDER BY middle initial.
FIGURE 8.17 Selected PRODUCT table attributes: ordered by (ascending) P_PRICE
P_CODE    P_DESCRIPT                                P_INDATE   P_PRICE
54778-2T  Rat-tail file, 0.5 cm fine                15-Dec-18     4.99
PVC23DRT  PVC pipe, 9 cm, 2.5 m                     20-Feb-19     5.87
SM-18277  3 cm metal screw, 25                      01-Mar-19     6.99
SW-23116  6 cm wd. screw, 50                        24-Feb-19     8.45
23109-HB  Claw hammer                               20-Jan-19     9.95
23114-AA  Sledge hammer, 7 kg                       02-Jan-19    14.40
13-Q2/P2  7.25 cm pwr. saw blade                    13-Dec-18    14.99
14-Q1/L3  9.00 cm pwr. saw blade                    13-Nov-18    17.49
2238/QPD  B&D cordless drill, 1/2 cm                20-Jan-19    38.95
1546-QQ2  Hrd. cloth, 1/4 cm, 2 × 50                15-Jan-19    39.95
1558-QW1  Hrd. cloth, 1/2 cm, 3 × 50                15-Jan-19    43.99
2232/QWE  B&D jigsaw, 8 cm blade                    24-Dec-18    99.87
2232/QTY  B&D jigsaw, 12 cm blade                   30-Dec-18   109.92
11QER/31  Power painter, 15 psi., 3-nozzle          03-Nov-18   109.99
WR3/TT3   Steel matting, 4 × 8 × 1/6 m, .5 m mesh   17-Jan-19   119.95
89-WRE-Q  Hicut chain saw, 16 cm                    07-Feb-19   256.99
Such a multilevel ordered sequence is known as a cascading order sequence, and it can be created easily by listing several attributes, separated by commas, after the ORDER BY clause. The cascading order sequence is the basis for any telephone directory. To illustrate a cascading order sequence, use the following SQL command on the EMPLOYEE table:
SELECT EMP_LNAME, EMP_FNAME, EMP_INITIAL, EMP_AREACODE, EMP_PHONE
FROM EMPLOYEE
ORDER BY EMP_LNAME, EMP_FNAME, EMP_INITIAL;
That command yields the results shown in Figure 8.18. The ORDER BY clause is useful in many applications, especially because the DESC qualifier can be invoked. For example, listing the most recent items first is a standard procedure. Typically, invoice due dates are listed in descending order. Or, if you want to examine budgets, it's probably useful to start by looking at the largest budget line items.
You can use the ORDER BY clause in conjunction with other SQL commands, too. For example, note the use of restrictions on date and price in the following command sequence:
SELECT P_DESCRIPT, V_CODE, P_INDATE, P_PRICE
FROM PRODUCT
WHERE P_INDATE < '21-Jan-2019'
AND P_PRICE <= 50.00
ORDER BY V_CODE, P_PRICE DESC;
FIGURE 8.18 Selected EMPLOYEE table attributes: ordered by EMP_LNAME, EMP_FNAME, EMP_INITIAL
EMP_LNAME   EMP_FNAME  EMP_INITIAL  EMP_AREACODE  EMP_PHONE
Brandon     Marie      G            7325          882-0845
Diante      Jorge      D            0181          890-4567
Genkazi     Leighla    W            7235          569-0093
Johnson     Edward     E            0181          898-4387
Jones       Anne       M            0181          898-3456
Cela        Nkosi      D            0181          324-5456
Lange       John       P            7325          504-4430
Lewis       Rhonda     G            0181          324-4472
Saranda     Hermine    R            0181          324-5505
Smith       George     A            0181          890-2984
Smith       George     K            7235          504-3339
Smith       Jeanine    K            0181          324-7883
Gounden     Melanie    P            0181          324-9006
Vandam      Rhett                   7325          675-8993
Washington  Rupert     E            0181          890-4925
Wiesenbach  Paul       R            0181          897-4358
Williams    Robert     D            0181          890-3220
The output is shown in Figure 8.19. Note that within each V_CODE, the P_PRICE values are in descending order.
FIGURE 8.19 A query based on multiple restrictions
P_DESCRIPT                  V_CODE  P_INDATE   P_PRICE
Sledge hammer, 12 kg                02-Jan-19    14.40
Claw hammer                 21225   20-Jan-19     9.95
9.00 cm pwr. saw blade      21344   13-Nov-18    17.49
7.25 cm pwr. saw blade      21344   13-Dec-18    14.99
Rat-tail file, 1/8 cm fine  21344   15-Dec-18     4.99
Hrd. cloth, 1/2 cm, 3 × 50  23119   15-Jan-19    43.99
Hrd. cloth, 1/4 cm, 2 × 50  23119   15-Jan-19    39.95
B&D cordless drill, 1/2 cm  25595   20-Jan-19    38.95
NOTE
If the ordering column has nulls, they are listed either first or last (depending on the RDBMS).
The ORDER BY clause must always be listed last in the SELECT command sequence.
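As a rough check of the restricted, ordered listing, the following sketch runs the same query with Python's sqlite3 module. Several details are assumptions made for illustration: the dates are rewritten as ISO text ('2019-01-21') so that string comparison orders them correctly in SQLite, the descriptions are abbreviated, and the V_CODE values for rows that do not appear in Figure 8.19 are assumed. SQLite places nulls first in an ascending sort, so the product without a vendor code leads the list.

```python
import sqlite3

# (P_DESCRIPT, V_CODE, P_INDATE as ISO text, P_PRICE)
PRODUCTS = [
    ("Power painter",          25595, "2018-11-03", 109.99),
    ("7.25 cm pwr. saw blade", 21344, "2018-12-13", 14.99),
    ("9.00 cm pwr. saw blade", 21344, "2018-11-13", 17.49),
    ("Hrd. cloth, 1/4 cm",     23119, "2019-01-15", 39.95),
    ("Hrd. cloth, 1/2 cm",     23119, "2019-01-15", 43.99),
    ("B&D jigsaw, 12 cm",      24288, "2018-12-30", 109.92),
    ("B&D jigsaw, 8 cm",       24288, "2018-12-24", 99.87),
    ("B&D cordless drill",     25595, "2019-01-20", 38.95),
    ("Claw hammer",            21225, "2019-01-20", 9.95),
    ("Sledge hammer",          None,  "2019-01-02", 14.40),
    ("Rat-tail file",          21344, "2018-12-15", 4.99),
    ("Hicut chain saw",        24288, "2019-02-07", 256.99),
    ("PVC pipe",               None,  "2019-02-20", 5.87),
    ("3 cm metal screw",       21225, "2019-03-01", 6.99),
    ("6 cm wd. screw",         21231, "2019-02-24", 8.45),
    ("Steel matting",          25595, "2019-01-17", 119.95),
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE PRODUCT "
    "(P_DESCRIPT TEXT, V_CODE INTEGER, P_INDATE TEXT, P_PRICE REAL)")
conn.executemany("INSERT INTO PRODUCT VALUES (?, ?, ?, ?)", PRODUCTS)

# Restrict on date and price, then sort by vendor and, within each
# vendor, by price from highest to lowest.
listing = conn.execute("""
    SELECT P_DESCRIPT, V_CODE, P_INDATE, P_PRICE
    FROM PRODUCT
    WHERE P_INDATE < '2019-01-21'
      AND P_PRICE <= 50.00
    ORDER BY V_CODE, P_PRICE DESC""").fetchall()

for row in listing:
    print(row)
```

The eight rows printed correspond to Figure 8.19: the null vendor code first, then each V_CODE with its prices in descending order.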
8.6.2 Listing Unique Values
How many different vendors are currently represented in the PRODUCT table? A simple listing (SELECT) is not very useful if the table contains several thousand rows and you have to sift through the vendor codes manually. Fortunately, SQL's DISTINCT clause is designed to produce a list of only those values that are different from one another. For example, the command:
SELECT DISTINCT V_CODE
FROM PRODUCT;
yields only the different (distinct) vendor codes (V_CODE) that are encountered in the PRODUCT table, as shown in Figure 8.20. Note that the first output row shows the null. (By default, Access places the null V_CODE at the top of the list, while Oracle places it at the bottom. The placement of nulls does not affect the list contents. In Oracle, you could use ORDER BY V_CODE NULLS FIRST to place nulls at the top of the list.)
FIGURE 8.20 A listing of distinct (different) V_CODE values in the PRODUCT table
V_CODE
(null)
21225
21231
21344
23119
24288
25595
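A minimal sketch of the DISTINCT listing, again using Python's sqlite3 module as a stand-in engine. Only the V_CODE column is loaded. The sixteen values are an assumed assignment of vendor codes to products, chosen to be consistent with Figures 8.19 and 8.20 (two products have no vendor code).

```python
import sqlite3

# One V_CODE per product row; None stands for a null vendor code.
V_CODES = [25595, 21344, 21344, 23119, 23119, 24288, 24288, 25595,
           21225, None, 21344, 24288, None, 21225, 21231, 25595]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE PRODUCT (V_CODE INTEGER)")
conn.executemany("INSERT INTO PRODUCT VALUES (?)", [(v,) for v in V_CODES])

# ORDER BY is added only to make the output order predictable.
distinct_codes = [row[0] for row in conn.execute(
    "SELECT DISTINCT V_CODE FROM PRODUCT ORDER BY V_CODE")]

print(distinct_codes)
```

Sixteen rows collapse to seven distinct values, and the null is treated as one of them.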
8.6.3 Aggregate Functions
SQL can perform various mathematical summaries for you, such as counting the number of rows that contain a specified condition, finding the minimum or maximum values for some specified attribute, summing the values in a specified column, and averaging the values in a specified column. Those aggregate functions are shown in Table 8.9.
TABLE 8.9 Some basic SQL aggregate functions
Function  Output
COUNT     The number of rows containing non-null values
MIN       The minimum attribute value encountered in a given column
MAX       The maximum attribute value encountered in a given column
SUM       The sum of all values for a given column
AVG       The arithmetic mean (average) for a specified column
To illustrate another standard SQL command format, most of the remaining input and output sequences are presented using the Oracle RDBMS.
COUNT
Use the COUNT function to tally the number of non-null values of an attribute. COUNT can be used in conjunction with the DISTINCT clause. For example, suppose you want to find out how many different vendors are in the PRODUCT table. The answer, generated by the first SQL code set shown in Figure 8.21, is 6. The answer indicates that six different VENDOR codes are found in the PRODUCT table. (Note that the nulls are not counted as V_CODE values.)
FIGURE 8.21 COUNT function output example
The aggregate functions can be combined with the SQL commands explored earlier. For example, the second SQL command set in Figure 8.21 supplies the answer to the question, "How many vendors referenced in the PRODUCT table have supplied products with prices that are less than or equal to 10?" The answer is 3, indicating that three vendors referenced in the PRODUCT table have supplied products that meet the price specification.
The COUNT aggregate function uses one parameter within parentheses, generally a column name such as COUNT(V_CODE) or COUNT(P_CODE). The parameter may also be an expression such as COUNT(DISTINCT V_CODE) or COUNT(P_PRICE+10). Using that syntax, COUNT always returns the number of non-null values in the given column. (Whether the column values are computed or show stored table row values is immaterial.) In contrast, the syntax COUNT(*) returns the number of total rows returned by the query, including the rows that contain nulls. In the example in Figure 8.21, SELECT COUNT(P_CODE) FROM PRODUCT and SELECT COUNT(*) FROM PRODUCT will yield the same answer because there are no null values in the P_CODE primary key column.
Note that the third SQL command set in Figure 8.21 uses the COUNT(*) command to answer the question, "How many rows in the PRODUCT table have a P_PRICE value less than or equal to 10?" The answer, 5, indicates that five products have a listed price that meets the price specification. The COUNT(*) aggregate function is used to count rows in a query result set. In contrast, the COUNT(column) aggregate function counts the number of non-null values in a given column. For example, in Figure 8.20, the COUNT(*) function would return a value of 7 to indicate seven rows returned by the query. The COUNT(V_CODE) function would return a value of 6 to indicate the six non-null vendor code values.
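The different COUNT forms can be compared side by side. This is a sketch under stated assumptions: Python's sqlite3 module stands in for Oracle, and the V_CODE assigned to each product is assumed (it is chosen to agree with the answers 6, 3 and 5 quoted in the text). The prices are those listed in Figure 8.17.

```python
import sqlite3

# (P_CODE, P_PRICE, V_CODE); None stands for a null vendor code.
ROWS = [
    ("11QER/31", 109.99, 25595), ("13-Q2/P2", 14.99, 21344),
    ("14-Q1/L3", 17.49, 21344),  ("1546-QQ2", 39.95, 23119),
    ("1558-QW1", 43.99, 23119),  ("2232/QTY", 109.92, 24288),
    ("2232/QWE", 99.87, 24288),  ("2238/QPD", 38.95, 25595),
    ("23109-HB", 9.95, 21225),   ("23114-AA", 14.40, None),
    ("54778-2T", 4.99, 21344),   ("89-WRE-Q", 256.99, 24288),
    ("PVC23DRT", 5.87, None),    ("SM-18277", 6.99, 21225),
    ("SW-23116", 8.45, 21231),   ("WR3/TT3", 119.95, 25595),
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE PRODUCT (P_CODE TEXT PRIMARY KEY, P_PRICE REAL, V_CODE INTEGER)")
conn.executemany("INSERT INTO PRODUCT VALUES (?, ?, ?)", ROWS)


def scalar(sql):
    """Run a query that returns one row with one column."""
    return conn.execute(sql).fetchone()[0]


total_rows = scalar("SELECT COUNT(*) FROM PRODUCT")             # all rows
non_null_vendors = scalar("SELECT COUNT(V_CODE) FROM PRODUCT")  # nulls skipped
distinct_vendors = scalar("SELECT COUNT(DISTINCT V_CODE) FROM PRODUCT")
cheap_vendors = scalar(
    "SELECT COUNT(DISTINCT V_CODE) FROM PRODUCT WHERE P_PRICE <= 10.00")
cheap_rows = scalar("SELECT COUNT(*) FROM PRODUCT WHERE P_PRICE <= 10.00")

# Subquery form, for engines without COUNT(DISTINCT ...).
distinct_via_subquery = scalar(
    "SELECT COUNT(*) FROM "
    "(SELECT DISTINCT V_CODE FROM PRODUCT WHERE V_CODE IS NOT NULL)")

print(total_rows, non_null_vendors, distinct_vendors,
      cheap_vendors, cheap_rows, distinct_via_subquery)
```

COUNT(*) counts all sixteen rows, COUNT(V_CODE) skips the two nulls, and the subquery form gives the same distinct count as COUNT(DISTINCT V_CODE).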
NOTE TO MICROSOFT ACCESS USERS
Microsoft Access does not support the use of COUNT with the DISTINCT clause. If you want to use such queries in Microsoft Access, you must create subqueries with DISTINCT and NOT NULL clauses. For example, the equivalent Microsoft Access queries for the first two queries shown in Figure 8.21 are:
SELECT COUNT(*)
FROM (SELECT DISTINCT V_CODE FROM PRODUCT
WHERE V_CODE IS NOT NULL)
and
SELECT COUNT(*)
FROM (SELECT DISTINCT(V_CODE)
FROM (SELECT V_CODE, P_PRICE FROM PRODUCT
WHERE V_CODE IS NOT NULL AND P_PRICE < 10))
Those two queries can be found on the online platform in the 'Ch8_SaleCo' (Access) database. Microsoft Access does add a trailer at the end of the query after you have executed it, but you can delete that trailer the next time you use the query.
MAX and MIN
The MAX and MIN functions help you find answers to problems such as the:
Highest (maximum) price in the PRODUCT table.
Lowest (minimum) price in the PRODUCT table.
The highest price, 256.99, is supplied by the first SQL command set in Figure 8.22. The second command set shown in Figure 8.22 yields the minimum price of 4.99.
FIGURE 8.22 MIN and MAX function output examples
The third SQL command set in Figure 8.22 demonstrates that the numeric functions can be used in conjunction with more complex queries. However, you must remember that the numeric functions yield only one value based on all of the values found in the table: a single maximum value, a single minimum value, a single count or a single average value. It is easy to overlook this warning. For example, examine the question, "Which product has the highest price?" Although that query seems simple enough, the SQL command sequence:
SELECT P_CODE, P_DESCRIPT, P_PRICE
FROM PRODUCT
WHERE P_PRICE = MAX(P_PRICE);
does not yield the expected results. This is because the use of MAX(P_PRICE) to the right side of a comparison operator is incorrect, thus producing an error message. The aggregate function MAX(columnname) can be used only in the column list of a SELECT statement. Also, in a comparison that uses an equality symbol, you can use only a single value to the right of the equals sign.
To answer the question, therefore, you must compute the maximum price first, then compare it to each price returned by the query. To do that, you need a nested query. In this case, the nested query is composed of two parts:
The inner query, which is executed first.
The outer query, which is executed last. (Remember that the outer query is always the first SQL command you encounter, in this case, SELECT.)
Using the following command sequence as an example, note that the inner query first finds the maximum price value, which is stored in memory. Since the outer query now has a value to which to compare each P_PRICE value, the query executes properly:
SELECT P_CODE, P_DESCRIPT, P_PRICE
FROM PRODUCT
WHERE P_PRICE = (SELECT MAX(P_PRICE) FROM PRODUCT);
The execution of that nested query yields the correct answer shown below the third (nested) SQL command set in Figure 8.22.
The MAX and MIN aggregate functions can also be used with date columns. For example, to find the product that has the oldest date, you would use MIN(P_INDATE). In the same manner, to find the most recent product, you would use MAX(P_INDATE).
NOTE
You can use expressions anywhere a column name is expected. Suppose you want to know which product has the highest inventory value. To find the answer, you can write the following query:
SELECT *
FROM PRODUCT
WHERE P_QOH * P_PRICE = (SELECT MAX(P_QOH*P_PRICE) FROM PRODUCT);
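The contrast between the rejected form and the nested form can be reproduced with a short script. The sketch below assumes Python's sqlite3 module as the engine (the error text therefore differs from Oracle's) and stores P_INDATE as ISO text so that MIN and MAX compare dates correctly. The data come from Figure 8.17, with descriptions abbreviated.

```python
import sqlite3

# (P_CODE, P_DESCRIPT, P_INDATE as ISO text, P_PRICE)
ROWS = [
    ("54778-2T", "Rat-tail file", "2018-12-15", 4.99),
    ("PVC23DRT", "PVC pipe", "2019-02-20", 5.87),
    ("SM-18277", "3 cm metal screw", "2019-03-01", 6.99),
    ("SW-23116", "6 cm wd. screw", "2019-02-24", 8.45),
    ("23109-HB", "Claw hammer", "2019-01-20", 9.95),
    ("23114-AA", "Sledge hammer", "2019-01-02", 14.40),
    ("13-Q2/P2", "7.25 cm pwr. saw blade", "2018-12-13", 14.99),
    ("14-Q1/L3", "9.00 cm pwr. saw blade", "2018-11-13", 17.49),
    ("2238/QPD", "B&D cordless drill", "2019-01-20", 38.95),
    ("1546-QQ2", "Hrd. cloth, 1/4 cm", "2019-01-15", 39.95),
    ("1558-QW1", "Hrd. cloth, 1/2 cm", "2019-01-15", 43.99),
    ("2232/QWE", "B&D jigsaw, 8 cm", "2018-12-24", 99.87),
    ("2232/QTY", "B&D jigsaw, 12 cm", "2018-12-30", 109.92),
    ("11QER/31", "Power painter", "2018-11-03", 109.99),
    ("WR3/TT3", "Steel matting", "2019-01-17", 119.95),
    ("89-WRE-Q", "Hicut chain saw, 16 cm", "2019-02-07", 256.99),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE PRODUCT "
             "(P_CODE TEXT, P_DESCRIPT TEXT, P_INDATE TEXT, P_PRICE REAL)")
conn.executemany("INSERT INTO PRODUCT VALUES (?, ?, ?, ?)", ROWS)

# An aggregate placed directly in the WHERE clause is rejected.
try:
    conn.execute("SELECT P_CODE FROM PRODUCT WHERE P_PRICE = MAX(P_PRICE)")
    aggregate_in_where_rejected = False
except sqlite3.Error as exc:
    aggregate_in_where_rejected = True
    print("rejected:", exc)

# The nested form: the inner query yields one value for the comparison.
highest = conn.execute("""
    SELECT P_CODE, P_DESCRIPT, P_PRICE
    FROM PRODUCT
    WHERE P_PRICE = (SELECT MAX(P_PRICE) FROM PRODUCT)""").fetchall()

# MIN and MAX also work on the (ISO text) date column.
oldest = conn.execute("""
    SELECT P_CODE FROM PRODUCT
    WHERE P_INDATE = (SELECT MIN(P_INDATE) FROM PRODUCT)""").fetchall()
newest = conn.execute("""
    SELECT P_CODE FROM PRODUCT
    WHERE P_INDATE = (SELECT MAX(P_INDATE) FROM PRODUCT)""").fetchall()

print(highest, oldest, newest)
```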
SUM
The SUM function computes the total sum for any specified attribute, using whichever condition(s) you have imposed. For example, if you want to compute the total amount owed by your customers, you could use the following command:
SELECT SUM(CUS_BALANCE) AS TOTBALANCE
FROM CUSTOMER;
You could also compute the sum total of an expression. For example, if you want to find the total value of all items carried in inventory, you could use:
SELECT SUM(P_QOH * P_PRICE) AS TOTVALUE
FROM PRODUCT;
because the total value is the sum of the product of the quantity on hand and the price for all items. (See Figure 8.23.)
FIGURE 8.23 The total value of all items in the PRODUCT table
AVG
The AVG function format is similar to that of MIN and MAX and is subject to the same operating restrictions. The first SQL command set shown in Figure 8.24 shows how a simple average P_PRICE value can be generated to yield the computed average price of 56.42125. The second SQL command set in Figure 8.24 produces five output lines that describe products whose prices exceed the average product price. Note that the second query uses nested SQL commands and the ORDER BY clause examined earlier.
FIGURE 8.24 AVG function output examples
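The average quoted above can be recomputed from the sixteen prices listed in Figure 8.17. The following sketch uses Python's sqlite3 module. The nested query is written from the description in the text, since the exact wording of the second command set in Figure 8.24 is not reproduced here.

```python
import sqlite3

# (P_CODE, P_PRICE) pairs taken from Figure 8.17.
PRICES = [
    ("54778-2T", 4.99), ("PVC23DRT", 5.87), ("SM-18277", 6.99),
    ("SW-23116", 8.45), ("23109-HB", 9.95), ("23114-AA", 14.40),
    ("13-Q2/P2", 14.99), ("14-Q1/L3", 17.49), ("2238/QPD", 38.95),
    ("1546-QQ2", 39.95), ("1558-QW1", 43.99), ("2232/QWE", 99.87),
    ("2232/QTY", 109.92), ("11QER/31", 109.99), ("WR3/TT3", 119.95),
    ("89-WRE-Q", 256.99),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE PRODUCT (P_CODE TEXT PRIMARY KEY, P_PRICE REAL)")
conn.executemany("INSERT INTO PRODUCT VALUES (?, ?)", PRICES)

# Simple average over the whole column.
average_price = conn.execute(
    "SELECT AVG(P_PRICE) FROM PRODUCT").fetchone()[0]

# Nested query: the inner SELECT supplies the single average value
# that the outer WHERE clause compares each price against.
above_average = conn.execute("""
    SELECT P_CODE, P_PRICE
    FROM PRODUCT
    WHERE P_PRICE > (SELECT AVG(P_PRICE) FROM PRODUCT)
    ORDER BY P_PRICE DESC""").fetchall()

print(round(average_price, 5))
for row in above_average:
    print(row)
```

Five products are priced above the average, in line with the five output lines mentioned in the text.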
8.6.4 Grouping Data
Frequency distributions can be created quickly and easily using the GROUP BY clause within the SELECT statement. The syntax is:
SELECT columnlist
FROM tablelist
[WHERE conditionlist ]
[GROUP BY columnlist ]
[HAVING conditionlist ]
[ORDER BY columnlist [ASC | DESC] ] ;
The GROUP BY clause is generally used when you have attribute columns combined with aggregate functions in the SELECT statement. The GROUP BY clause is valid only when used in conjunction with one of the SQL aggregate functions, such as COUNT, MIN, MAX, AVG and SUM. For example, as shown in the first command set in Figure 8.25, if you try to group the output by using:
SELECT V_CODE, P_CODE, P_DESCRIPT, P_PRICE
FROM PRODUCT
GROUP BY V_CODE;
you generate a "not a GROUP BY expression" error. However, if you write the preceding SQL command sequence in conjunction with some aggregate function, the GROUP BY clause works properly. The second SQL command sequence in Figure 8.25 properly answers the question, "How many products are supplied by each vendor?" because it uses a COUNT aggregate function.
Note that the third output line in Figure 8.25 shows a null for the V_CODE, indicating that two products were not supplied by a vendor. Perhaps those products were produced in-house, or they may have been bought via a non-vendor channel, or the person making the data entry may have merely forgotten to enter a vendor code. (Remember that nulls can mean many things.)
FIGURE 8.25 Incorrect and correct use of the GROUP BY clause
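A grouped count per vendor might look like the sketch below. It runs on Python's sqlite3 module, and the vendor code given to each product is an assumption chosen to match the distinct codes in Figure 8.20. One caveat about the engine: SQLite is more permissive than Oracle and does not raise the "not a GROUP BY expression" error for non-aggregated columns, so only the correct form is shown.

```python
import sqlite3

# (P_CODE, V_CODE); None stands for a null vendor code.
ROWS = [
    ("11QER/31", 25595), ("13-Q2/P2", 21344), ("14-Q1/L3", 21344),
    ("1546-QQ2", 23119), ("1558-QW1", 23119), ("2232/QTY", 24288),
    ("2232/QWE", 24288), ("2238/QPD", 25595), ("23109-HB", 21225),
    ("23114-AA", None),  ("54778-2T", 21344), ("89-WRE-Q", 24288),
    ("PVC23DRT", None),  ("SM-18277", 21225), ("SW-23116", 21231),
    ("WR3/TT3", 25595),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE PRODUCT (P_CODE TEXT PRIMARY KEY, V_CODE INTEGER)")
conn.executemany("INSERT INTO PRODUCT VALUES (?, ?)", ROWS)

# One output row per distinct V_CODE; the nulls form a group of their own.
per_vendor = conn.execute("""
    SELECT V_CODE, COUNT(P_CODE)
    FROM PRODUCT
    GROUP BY V_CODE
    ORDER BY V_CODE""").fetchall()

counts = dict(per_vendor)
print(per_vendor)
```

The group whose V_CODE is null contains two products, which agrees with the two vendor-less products described in the text.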
NOTE
When using the GROUP BY clause with a SELECT statement:
The SELECT's columnlist must include a combination of column names and aggregate functions.
The GROUP BY clause's columnlist must include all non-aggregate function columns specified in the SELECT's columnlist. If required, you could also group by any aggregate function columns that appear in the SELECT's columnlist.
The GROUP BY clause columnlist can include any columns from the tables in the FROM clause of the SELECT statement, even if they do not appear in the SELECT's columnlist.
The GROUP BY Feature's HAVING Clause
A particularly useful extension of the GROUP BY feature is the HAVING clause. Basically, HAVING operates like the WHERE clause in the SELECT statement. However, the WHERE clause applies to columns and expressions for individual rows, while the HAVING clause is applied to the output of a GROUP BY operation. For example, suppose you want to generate a listing of the number of products in the inventory supplied by each vendor. But this time you want to limit the listing to products whose prices average below 10. The first part of that requirement is satisfied with the help of the GROUP BY clause, as illustrated in the first SQL command set in Figure 8.26. Note that the HAVING clause is used in conjunction with the GROUP BY clause in the second SQL command set in Figure 8.26 to generate the desired result.
FIGURE 8.26 An application of the HAVING clause
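The difference between filtering rows and filtering groups can be sketched as follows. The example uses Python's sqlite3 module and the prices from Figure 8.17. The vendor code attached to each product is an assumption, and the query text is written from the description above rather than copied from Figure 8.26.

```python
import sqlite3

# (P_CODE, P_PRICE, V_CODE); None stands for a null vendor code.
ROWS = [
    ("11QER/31", 109.99, 25595), ("13-Q2/P2", 14.99, 21344),
    ("14-Q1/L3", 17.49, 21344),  ("1546-QQ2", 39.95, 23119),
    ("1558-QW1", 43.99, 23119),  ("2232/QTY", 109.92, 24288),
    ("2232/QWE", 99.87, 24288),  ("2238/QPD", 38.95, 25595),
    ("23109-HB", 9.95, 21225),   ("23114-AA", 14.40, None),
    ("54778-2T", 4.99, 21344),   ("89-WRE-Q", 256.99, 24288),
    ("PVC23DRT", 5.87, None),    ("SM-18277", 6.99, 21225),
    ("SW-23116", 8.45, 21231),   ("WR3/TT3", 119.95, 25595),
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE PRODUCT (P_CODE TEXT PRIMARY KEY, P_PRICE REAL, V_CODE INTEGER)")
conn.executemany("INSERT INTO PRODUCT VALUES (?, ?, ?)", ROWS)

# HAVING filters whole groups after GROUP BY has formed them.
low_average = conn.execute("""
    SELECT V_CODE, COUNT(P_CODE), AVG(P_PRICE)
    FROM PRODUCT
    GROUP BY V_CODE
    HAVING AVG(P_PRICE) < 10
    ORDER BY V_CODE""").fetchall()

# The same condition in a WHERE clause is rejected, because WHERE is
# evaluated row by row, before any group (and any average) exists.
try:
    conn.execute("""
        SELECT V_CODE, COUNT(P_CODE), AVG(P_PRICE)
        FROM PRODUCT
        WHERE AVG(P_PRICE) < 10
        GROUP BY V_CODE""")
    where_with_aggregate_rejected = False
except sqlite3.Error:
    where_with_aggregate_rejected = True

for v_code, n_products, avg_price in low_average:
    print(v_code, n_products, round(avg_price, 2))
```

Only the vendor groups whose average price is below 10 survive the HAVING filter.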
Using the WHERE clause instead of the HAVING clause in the second SQL command set in Figure 8.26 produces an error message.
You can also combine multiple clauses and aggregate functions. For example, the following SQL statement will:
Aggregate the total cost of products grouped by V_CODE.
Select only the rows having totals that exceed 500.
List the results in descending order by the total cost.
SELECT V_CODE, SUM(P_QOH * P_PRICE) AS TOTCOST
FROM PRODUCT
GROUP BY V_CODE
HAVING (SUM(P_QOH * P_PRICE) > 500)
ORDER BY SUM(P_QOH * P_PRICE) DESC;
Note the syntax used in the HAVING and ORDER BY clauses; in both cases, you must specify the column expression (formula) used in the SELECT statement's column list, rather than the column alias (TOTCOST). Some RDBMSs allow you to substitute the column expression with the column alias, while others do not.
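To see the three steps (aggregate, filter the groups, sort) working together, the statement can be run against a small table. The sketch below uses Python's sqlite3 module. The quantities and prices are hypothetical values invented for this illustration only, not the contents of the book's PRODUCT table.

```python
import sqlite3

# Hypothetical (V_CODE, P_QOH, P_PRICE) rows, for illustration only.
ROWS = [
    (21225, 10, 10.0),    # cost 100
    (21225, 5, 20.0),     # cost 100  -> vendor total 200
    (21344, 30, 20.0),    # cost 600  -> vendor total 600
    (24288, 10, 100.0),   # cost 1000
    (24288, 2, 50.0),     # cost 100  -> vendor total 1100
    (None, 100, 7.0),     # cost 700  -> null group total 700
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE PRODUCT (V_CODE INTEGER, P_QOH INTEGER, P_PRICE REAL)")
conn.executemany("INSERT INTO PRODUCT VALUES (?, ?, ?)", ROWS)

# The column expression is repeated in HAVING and ORDER BY, which is
# the form the text describes as accepted by every RDBMS.
totals = conn.execute("""
    SELECT V_CODE, SUM(P_QOH * P_PRICE) AS TOTCOST
    FROM PRODUCT
    GROUP BY V_CODE
    HAVING (SUM(P_QOH * P_PRICE) > 500)
    ORDER BY SUM(P_QOH * P_PRICE) DESC""").fetchall()

print(totals)
```

Vendor 21225 is dropped because its total of 200 does not exceed 500; the remaining groups appear from the largest total to the smallest.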
8.7 VIRTUAL TABLES: CREATING A VIEW
As you learnt earlier, the output of a relational operator (such as SELECT) is another relation (or table). Suppose that, at the end of every day, you would like to get a list of all products to reorder, that is, products with a quantity on hand that is less than or equal to the minimum quantity. Instead of typing the same query at the end of every day, wouldn't it be better to permanently save that query in the database? That's the function of a relational view. A view is a virtual table based on a SELECT query. The query can contain columns, computed columns, aliases and aggregate functions from one or more tables. The tables on which the view is based are called base tables. You can create a view by using the CREATE VIEW command:
CREATE VIEW viewname AS SELECT query
The CREATE VIEW statement is a data definition command that stores the subquery specification, the SELECT statement used to generate the virtual table, in the data dictionary.
The first SQL command set in Figure 8.27 shows the syntax used to create a view named PRICEGT50. This view contains only the designated three attributes (P_DESCRIPT, P_QOH and P_PRICE) and only rows in which the price is over 50. The second SQL command sequence in Figure 8.27 shows the rows that make up the view.
NOTE TO MICROSOFT ACCESS USERS
The CREATE VIEW command is not directly supported in Microsoft Access. To create a view in Microsoft Access, you just need to create a SQL query and then save it. While this is not as versatile as an actual view, which can be treated like a table, it achieves the same result.
FIGURE 8.27 Creating a virtual table with the CREATE VIEW command
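A view such as PRICEGT50 can be tried out with a few rows. This is a sketch under stated assumptions: it runs on Python's sqlite3 module, the P_QOH quantities are hypothetical, and the view definition is written from the description above rather than copied from Figure 8.27. It also shows that a view stores no rows of its own, because its SELECT is evaluated again each time the view is queried.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE PRODUCT "
             "(P_CODE TEXT PRIMARY KEY, P_DESCRIPT TEXT, "
             "P_QOH INTEGER, P_PRICE REAL)")
# Prices follow Figure 8.17; the P_QOH values are made up.
conn.executemany("INSERT INTO PRODUCT VALUES (?, ?, ?, ?)", [
    ("23109-HB", "Claw hammer", 20, 9.95),
    ("2232/QWE", "B&D jigsaw, 8 cm blade", 5, 99.87),
    ("89-WRE-Q", "Hicut chain saw, 16 cm", 10, 256.99),
])

# Store the query specification under the name PRICEGT50.
conn.execute("""
    CREATE VIEW PRICEGT50 AS
    SELECT P_DESCRIPT, P_QOH, P_PRICE
    FROM PRODUCT
    WHERE P_PRICE > 50.00""")


def view_descriptions():
    """Query the view exactly as if it were a table."""
    return [row[0] for row in conn.execute(
        "SELECT P_DESCRIPT FROM PRICEGT50 ORDER BY P_PRICE")]


before = view_descriptions()

# Change the base table: one qualifying row added, one removed.
conn.execute("INSERT INTO PRODUCT VALUES ('WR3/TT3', 'Steel matting', 15, 119.95)")
conn.execute("DELETE FROM PRODUCT WHERE P_CODE = '89-WRE-Q'")

after = view_descriptions()

print(before)
print(after)
```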
A relational view has several special characteristics:
You can use the name of a view anywhere a table name is expected in a SQL statement.
Views are dynamically updated. That is, the view is re-created on demand each time it is invoked. Therefore, if new products are added (or deleted) to meet the criterion P_PRICE > 50.00, those new products automatically appear (or disappear) in the PRICEGT50 view the next time it is invoked.
Views provide a level of security in the database because the view can restrict users to specified columns and specified rows in a table. For example, if you have a company with hundreds of employees in several departments, you could give the secretary of each department a view of only certain attributes and only for the employees that belong to the secretary's department.
Views may also be used as the basis for reports. For example, if you need a report that shows a summary of total product cost and quantity-on-hand statistics grouped by vendor, you could create a PROD_STATS view as:
Copyright Editorial
review
2020 has
Cengage deemed
Learning. that
any
All suppressed
Rights
Reserved. content
does
May not
not materially
be
copied, affect
scanned, the
overall
or
duplicated, learning
in experience.
whole
or in Cengage
part.
Due Learning
to
electronic reserves
rights, the
right
some to
third remove
party additional
content
may content
be
suppressed at
any
time
from if
the
subsequent
eBook rights
and/or restrictions
eChapter(s). require
it
CREATE VIEW PROD_STATS AS
SELECT V_CODE, SUM(P_QOH*P_PRICE) AS TOTCOST, MAX(P_QOH) AS MAXQTY, MIN(P_QOH) AS MINQTY, AVG(P_QOH) AS AVGQTY
FROM PRODUCT
GROUP BY V_CODE;

In Chapter 9, you will learn more about views and, in particular, about updating data in base tables through views.

8.8 JOINING DATABASE TABLES

The ability to combine (join) tables on common attributes is perhaps the most important distinction between a relational database and other databases. A join is performed when data are retrieved from more than one table at a time. (If necessary, review the join definitions and examples in Chapter 4, Relational Algebra and Calculus.)

To join tables, you simply enumerate the tables in the FROM clause of the SELECT statement. The DBMS will create the Cartesian product of every table in the FROM clause. (Review Chapter 3 to revisit these terms, if necessary.) However, to get the correct result, that is, a natural join, you must select only the rows in which the common attribute values match. That is done with the WHERE clause. Use the WHERE clause to indicate the common attributes that are used to link the tables (sometimes referred to as the join condition).

The join condition is generally composed of an equality comparison between the foreign key and the primary key of related tables. For example, suppose you want to join the two tables VENDOR and PRODUCT. Because V_CODE is the foreign key in the PRODUCT table and the primary key in the VENDOR table, the link is established on V_CODE. (See Table 8.10.)

TABLE 8.10 Creating links through foreign keys

Table     Attributes to be shown   Linking attribute
PRODUCT   P_DESCRIPT, P_PRICE      V_CODE
VENDOR    V_NAME, V_PHONE          V_CODE

When the same attribute name appears in more than one of the joined tables, the source table of the attributes listed in the SELECT command sequence must be defined. To join the PRODUCT and VENDOR tables, you would use the following, which produces the output shown in Figure 8.28:

SELECT P_DESCRIPT, P_PRICE, V_NAME, V_CONTACT, V_AREACODE, V_PHONE
FROM PRODUCT, VENDOR
WHERE PRODUCT.V_CODE = VENDOR.V_CODE;

FIGURE 8.28 The results of a join

P_DESCRIPT                                P_PRICE  V_NAME           V_CONTACT  V_AREACODE  V_PHONE
Claw hammer                                  9.95  Bryson, Inc.     Smithson   0181        223-3234
1.25 cm metal screw, 25                      6.99  Bryson, Inc.     Smithson   0181        223-3234
2.5 cm wd. screw, 50                         8.45  D&E Supply       Singh      0181        228-3245
7.25 cm pwr. saw blade                      14.99  Jabavu Bros.     Khumalo    0181        889-2546
9.00 cm pwr. saw blade                      17.49  Jabavu Bros.     Khumalo    0181        889-2546
Rat-tail file, 1/8 cm fine                   4.99  Jabavu Bros.     Khumalo    0181        889-2546
Hrd. cloth, 1/4 cm, 2 x 50                  39.95  Randsets Ltd.    Anderson   7253        678-3998
Hrd. cloth, 1/2 cm, 3 x 50                  43.99  Randsets Ltd.    Anderson   7253        678-3998
B&D jigsaw, 12 cm blade                    109.92  ORDVA, Inc.      Hakford    0181        898-1234
B&D jigsaw, 8 cm blade                      99.87  ORDVA, Inc.      Hakford    0181        898-1234
Hicut chain saw, 16 cm                     256.99  ORDVA, Inc.      Hakford    0181        898-1234
Power painter, 15 psi., 3-nozzle           109.99  Rubicon Systems  Du Toit    0113        456-0092
B&D cordless drill, 1/2 cm                  38.95  Rubicon Systems  Du Toit    0113        456-0092
Steel matting, 4 x 8 x 1/6 m, .5 m mesh    119.95  Rubicon Systems  Du Toit    0113        456-0092

Your output might be presented in a different order because the SQL command produces a listing in which the order of the rows is not relevant. In fact, you are likely to get a different order of the same listing the next time you execute the command. However, you can generate a more predictable list by using an ORDER BY clause:

SELECT P_DESCRIPT, P_PRICE, V_NAME, V_CONTACT, V_AREACODE, V_PHONE
FROM PRODUCT, VENDOR
WHERE PRODUCT.V_CODE = VENDOR.V_CODE
ORDER BY P_PRICE;

In that case, your listing will always be arranged from the lowest price to the highest price.

NOTE
Table names were used as prefixes in the preceding SQL command sequence. For example, PRODUCT.V_CODE was used rather than V_CODE. Most current-generation RDBMSs do not require table names to be used as prefixes unless the same attribute name occurs in several of the tables being joined. In this case, V_CODE is used as a foreign key in PRODUCT and as a primary key in VENDOR; therefore, you must use the table names as prefixes in the WHERE clause. In other words, you can write the previous query as:

SELECT P_DESCRIPT, P_PRICE, V_NAME, V_CONTACT, V_AREACODE, V_PHONE
FROM PRODUCT, VENDOR
WHERE PRODUCT.V_CODE = VENDOR.V_CODE;

Naturally, if an attribute name occurs in several places, its origin (table) must be specified. If you fail to provide such a specification, SQL generates an error message to indicate that you have been ambiguous about the attribute's origin.
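The prefix rule described in the note can be tried on a small scale. The following is a minimal sketch, assuming cut-down PRODUCT and VENDOR tables with only a handful of columns; the column sizes, the choice of three sample rows and the pairing of product codes with descriptions are assumptions made for illustration, not the full tables used in the chapter.

```sql
-- Hypothetical miniature versions of the chapter's VENDOR and PRODUCT tables.
CREATE TABLE VENDOR (
    V_CODE INTEGER PRIMARY KEY,
    V_NAME VARCHAR(35) NOT NULL
);

CREATE TABLE PRODUCT (
    P_CODE     VARCHAR(10) PRIMARY KEY,
    P_DESCRIPT VARCHAR(35) NOT NULL,
    P_PRICE    DECIMAL(8,2) NOT NULL,
    V_CODE     INTEGER,
    FOREIGN KEY (V_CODE) REFERENCES VENDOR (V_CODE)
);

-- Illustrative rows only.
INSERT INTO VENDOR VALUES (21225, 'Bryson, Inc.');
INSERT INTO VENDOR VALUES (21344, 'Jabavu Bros.');
INSERT INTO PRODUCT VALUES ('23109-HB', 'Claw hammer', 9.95, 21225);
INSERT INTO PRODUCT VALUES ('SM-18277', '1.25 cm metal screw, 25', 6.99, 21225);
INSERT INTO PRODUCT VALUES ('13-Q2/P2', '7.25 cm pwr. saw blade', 14.99, 21344);

-- V_CODE exists in both tables, so it carries a table prefix wherever it appears.
-- P_DESCRIPT, P_PRICE and V_NAME each belong to one table only and need no prefix.
SELECT   P_DESCRIPT, P_PRICE, VENDOR.V_CODE, V_NAME
FROM     PRODUCT, VENDOR
WHERE    PRODUCT.V_CODE = VENDOR.V_CODE
ORDER BY P_PRICE;

-- Clean up so the sketch can be rerun.
DROP TABLE PRODUCT;
DROP TABLE VENDOR;
```

With these rows the listing runs from the 6.99 screw through the 9.95 hammer to the 14.99 saw blade. Removing the VENDOR. prefix from V_CODE in the SELECT list should cause the DBMS to reject the statement as ambiguous, which is the error the note refers to.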
The preceding SQL command sequence joins a row in the PRODUCT table with a row in the VENDOR table in which the V_CODE values of these rows are the same, as indicated in the WHERE clause's condition. Because any vendor can deliver any number of ordered products, the PRODUCT table may contain multiple V_CODE entries for each V_CODE entry in the VENDOR table. In other words, each V_CODE in VENDOR can be matched with many V_CODE rows in PRODUCT.

If you do not specify the WHERE clause, the result will be the Cartesian product of PRODUCT and VENDOR. Because the PRODUCT table contains 16 rows and the VENDOR table contains 11 rows, the Cartesian product would produce a listing of (16 x 11) = 176 rows. (Each row in PRODUCT would be joined to each row in the VENDOR table.)

All of the SQL commands can be used on the joined tables. For example, the following command sequence is quite acceptable in SQL and produces the output shown in Figure 8.29:

SELECT P_DESCRIPT, P_PRICE, V_NAME, V_CONTACT, V_AREACODE, V_PHONE
FROM PRODUCT, VENDOR
WHERE PRODUCT.V_CODE = VENDOR.V_CODE
AND P_INDATE > '15-Jan-2019';

FIGURE 8.29 An ordered and limited listing after a join

P_DESCRIPT                                P_PRICE  V_NAME           V_CONTACT  V_AREACODE  V_PHONE
1.25 cm metal screw, 25                      6.99  Bryson, Inc.     Smithson   0181        223-3234
2.5 cm wd. screw, 50                         8.45  D&E Supply       Singh      0181        228-3245
Claw hammer                                  9.95  Bryson, Inc.     Smithson   0181        223-3234
B&D cordless drill, 1/2 cm                  38.95  Rubicon Systems  Du Toit    0113        456-0092
Steel matting, 4 x 8 x 1/6 m, .5 m mesh    119.95  Rubicon Systems  Du Toit    0113        456-0092
Hicut chain saw, 16 cm                     256.99  ORDVA, Inc.      Hakford    0181        898-1234
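The difference between a Cartesian product and a join with a join condition is easy to see by counting rows. The sketch below assumes two deliberately tiny tables (three products and two vendors, with made-up product codes) rather than the chapter's 16-row and 11-row tables, so the arithmetic can be checked by hand.

```sql
-- Hypothetical cut-down tables: 3 products, 2 vendors.
CREATE TABLE VENDOR (
    V_CODE INTEGER PRIMARY KEY,
    V_NAME VARCHAR(35) NOT NULL
);

CREATE TABLE PRODUCT (
    P_CODE     VARCHAR(10) PRIMARY KEY,
    P_DESCRIPT VARCHAR(35) NOT NULL,
    V_CODE     INTEGER
);

INSERT INTO VENDOR VALUES (21225, 'Bryson, Inc.');
INSERT INTO VENDOR VALUES (21344, 'Jabavu Bros.');
INSERT INTO PRODUCT VALUES ('A-001', 'Claw hammer', 21225);
INSERT INTO PRODUCT VALUES ('A-002', 'Metal screw', 21225);
INSERT INTO PRODUCT VALUES ('A-003', 'Saw blade', 21344);

-- No WHERE clause: every product row is paired with every vendor row.
-- 3 products x 2 vendors = 6 rows.
SELECT COUNT(*) AS CARTESIAN_ROWS
FROM   PRODUCT, VENDOR;

-- With the join condition only the matching pairs survive: 3 rows,
-- one per product, because each product has exactly one vendor here.
SELECT COUNT(*) AS JOINED_ROWS
FROM   PRODUCT, VENDOR
WHERE  PRODUCT.V_CODE = VENDOR.V_CODE;

-- Clean up so the sketch can be rerun.
DROP TABLE PRODUCT;
DROP TABLE VENDOR;
```

The same reasoning scales to the chapter's tables: without the WHERE clause the row count is the product of the two table sizes, and the join condition reduces it to the rows whose V_CODE values match.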
NOTE
In Chapter 4, Relational Algebra and Calculus, you learnt that a JOIN is used to combine two relations in a specified way. In SQL, the natural join is used to join tables together. The SQL statement:

SELECT P_DESCRIPT, P_PRICE, V_NAME, V_CONTACT, V_AREACODE, V_PHONE
FROM PRODUCT, VENDOR
WHERE PRODUCT.V_CODE = VENDOR.V_CODE
AND P_INDATE > '15-Jan-2019';

can be written in relational algebra as:

$\Pi_{\text{P\_DESCRIPT, P\_PRICE, V\_NAME, V\_CONTACT, V\_AREACODE, V\_PHONE}}\left(\sigma_{\text{P\_INDATE} > \text{'15-Jan-2019'}}(\text{PRODUCT}) \bowtie \text{VENDOR}\right)$

For more information on JOIN operators, see Section 4.2 in Chapter 4, Relational Algebra and Calculus.
When joining three or more tables, you need to specify a join condition for each pair of tables. The number of join conditions will always be N-1, where N represents the number of tables listed in the FROM clause. For example, if you have three tables, you must have two join conditions; if you have five tables, you must have four join conditions; and so on.

Remember, the join condition will match the foreign key of a table to the primary key of the related table. For example, using Figure 8.1, if you want to list the customer last name, invoice number, invoice date and product descriptions for all invoices for customer 10014, you must type the following. (INV_NUMBER appears in both INVOICE and LINE, so it is prefixed with a table name.)

SELECT CUS_LNAME, INVOICE.INV_NUMBER, INV_DATE, P_DESCRIPT
FROM CUSTOMER, INVOICE, LINE, PRODUCT
WHERE CUSTOMER.CUS_CODE = INVOICE.CUS_CODE
AND INVOICE.INV_NUMBER = LINE.INV_NUMBER
AND LINE.P_CODE = PRODUCT.P_CODE
AND CUSTOMER.CUS_CODE = 10014
ORDER BY INVOICE.INV_NUMBER;

Finally, be careful not to create circular join conditions. For example, if Table A is related to Table B, Table B is related to Table C and Table C is also related to Table A, create only two join conditions: join A with B and B with C. Do not join C with A!
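The N-1 rule can be checked against a four-table join on a very small data set. This is a minimal sketch only: the table layouts are reduced to the columns needed for the join, the LINE_NUMBER column is an assumed name, and all rows (including the customer names and product codes) are invented for illustration.

```sql
-- Hypothetical, reduced versions of CUSTOMER, INVOICE, LINE and PRODUCT.
CREATE TABLE CUSTOMER (
    CUS_CODE  INTEGER PRIMARY KEY,
    CUS_LNAME VARCHAR(15) NOT NULL
);
CREATE TABLE INVOICE (
    INV_NUMBER INTEGER PRIMARY KEY,
    CUS_CODE   INTEGER NOT NULL,
    FOREIGN KEY (CUS_CODE) REFERENCES CUSTOMER (CUS_CODE)
);
CREATE TABLE PRODUCT (
    P_CODE     VARCHAR(10) PRIMARY KEY,
    P_DESCRIPT VARCHAR(35) NOT NULL
);
CREATE TABLE LINE (
    INV_NUMBER  INTEGER NOT NULL,
    LINE_NUMBER INTEGER NOT NULL,
    P_CODE      VARCHAR(10) NOT NULL,
    PRIMARY KEY (INV_NUMBER, LINE_NUMBER),
    FOREIGN KEY (INV_NUMBER) REFERENCES INVOICE (INV_NUMBER),
    FOREIGN KEY (P_CODE) REFERENCES PRODUCT (P_CODE)
);

-- Invented sample rows.
INSERT INTO CUSTOMER VALUES (10011, 'Dlamini');
INSERT INTO CUSTOMER VALUES (10014, 'Naidoo');
INSERT INTO INVOICE VALUES (1001, 10014);
INSERT INTO INVOICE VALUES (1002, 10011);
INSERT INTO INVOICE VALUES (1006, 10014);
INSERT INTO PRODUCT VALUES ('P-01', 'Claw hammer');
INSERT INTO PRODUCT VALUES ('P-02', 'Rat-tail file');
INSERT INTO LINE VALUES (1001, 1, 'P-01');
INSERT INTO LINE VALUES (1001, 2, 'P-02');
INSERT INTO LINE VALUES (1002, 1, 'P-02');
INSERT INTO LINE VALUES (1006, 1, 'P-01');

-- Four tables, therefore three join conditions (N-1),
-- followed by one ordinary restriction on the customer code.
SELECT   CUS_LNAME, INVOICE.INV_NUMBER, P_DESCRIPT
FROM     CUSTOMER, INVOICE, LINE, PRODUCT
WHERE    CUSTOMER.CUS_CODE  = INVOICE.CUS_CODE
AND      INVOICE.INV_NUMBER = LINE.INV_NUMBER
AND      LINE.P_CODE        = PRODUCT.P_CODE
AND      CUSTOMER.CUS_CODE  = 10014
ORDER BY INVOICE.INV_NUMBER;

-- Clean up in reverse order of dependency.
DROP TABLE LINE;
DROP TABLE PRODUCT;
DROP TABLE INVOICE;
DROP TABLE CUSTOMER;
```

With these rows the query lists three invoice lines for customer 10014: two on invoice 1001 and one on invoice 1006. Leaving out any one of the three join conditions would multiply the result by unrelated rows, which is a quick way to spot a missing condition.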
8.8.1 Joining Tables with an Alias

An alias may be used to identify the source table from which the data are taken. The aliases P and V are used to label the PRODUCT and VENDOR tables in the next command sequence. Any legal table name may be used as an alias. (Also notice that there are no table name prefixes because the attribute listing contains no duplicate names in the SELECT statement.)

SELECT P_DESCRIPT, P_PRICE, V_NAME, V_CONTACT, V_AREACODE, V_PHONE
FROM PRODUCT P, VENDOR V
WHERE P.V_CODE = V.V_CODE
ORDER BY P_PRICE;
8.8.2 Self-Joins

An alias is especially useful when a table must be joined to itself in a recursive query. In order to join a table to itself, a self-join is used. For example, suppose you are working with the EMP table shown in Figure 8.30.

FIGURE 8.30 The contents of the EMP table

EMP_NUM  EMP_TITLE  EMP_LNAME   EMP_FNAME  EMP_INITIAL  EMP_DOB    EMP_HIRE_DATE  EMP_AREACODE  EMP_PHONE  EMP_MGR
100      Mr         Cela        Nkosi      D            15-Jun-52  15-Mar-95      0181          324-5456
101      Ms         Lewis       Rhonda     G            19-Mar-75  25-Apr-96      0181          324-4472   100
102      Mr         Vandam      Rhett                   14-Nov-68  20-Dec-00      7253          675-8993   100
103      Ms         Jones       Anne       M            16-Oct-84  28-Aug-04      0181          898-3456   100
104      Mr         Lange       John       P            08-Nov-81  20-Oct-04      7253          504-4430   105
105      Mr         Williams    Robert     D            14-Mar-85  08-Nov-08      0181          890-3220
106      Mrs        Smith       Jeanine    K            12-Feb-78  05-Jan-99      0181          324-7883   105
107      Mr         Diante      Jorge      D            21-Aug-84  02-Jul-04      0181          890-4567   105
108      Mr         Wiesenbach  Paul       R            14-Feb-76  18-Nov-02      0181          897-4358
109      Mr         Smith       George     K            18-Jun-71  14-Apr-99      7253          504-3339   108
110      Mrs        Genkazi     Leighla    W            19-May-80  01-Dec-00      7253          569-0093   108
111      Mr         Washington  Rupert     E            03-Jan-76  21-Jun-03      0181          890-4925   105
112      Mr         Johnson     Edward     E            14-May-71  01-Dec-93      0181          898-4387   100
113      Ms         Gounden     Melanie    P            15-Sep-80  11-May-09      0181          324-9006   105
114      Ms         Brandon     Marie      G            02-Nov-66  15-Nov-89      7253          882-0845   108
115      Mrs        Saranda     Hermine    R            25-Jul-82  23-Apr-03      0181          324-5505   105
116      Mr         Smith       George     A            08-Nov-75  10-Dec-98      0181          890-2984   108

Using the data in the EMP table, you can generate a list of all employees with their managers' names by joining the EMP table to itself. In that case, you would also use aliases to differentiate the tables. The SQL command sequence would look like this:

SELECT E.EMP_MGR, M.EMP_LNAME, E.EMP_NUM, E.EMP_LNAME
FROM EMP E, EMP M
WHERE E.EMP_MGR = M.EMP_NUM
ORDER BY E.EMP_MGR;

The output of the above command sequence is shown in Figure 8.31.

FIGURE 8.31 Using an alias to join a table to itself

EMP_MGR  M.EMP_LNAME  EMP_NUM  E.EMP_LNAME
100      Cela         112      Johnson
100      Cela         103      Jones
100      Cela         102      Vandam
100      Cela         101      Lewis
105      Williams     115      Saranda
105      Williams     113      Gounden
105      Williams     111      Washington
105      Williams     107      Diante
105      Williams     106      Smith
105      Williams     104      Lange
108      Wiesenbach   116      Smith
108      Wiesenbach   114      Brandon
108      Wiesenbach   110      Genkazi
108      Wiesenbach   109      Smith
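A self-join can also feed an aggregate, for example to count how many employees report to each manager. The sketch below assumes a three-column version of the EMP table and only six of its rows; the reduced structure and the NUM_REPORTS column name are assumptions made for illustration.

```sql
-- Hypothetical reduced EMP table: number, last name and manager only.
CREATE TABLE EMP (
    EMP_NUM   INTEGER PRIMARY KEY,
    EMP_LNAME VARCHAR(15) NOT NULL,
    EMP_MGR   INTEGER,
    FOREIGN KEY (EMP_MGR) REFERENCES EMP (EMP_NUM)
);

-- Managers are inserted first so the self-referencing foreign key is satisfied.
INSERT INTO EMP VALUES (100, 'Cela', NULL);
INSERT INTO EMP VALUES (105, 'Williams', NULL);
INSERT INTO EMP VALUES (101, 'Lewis', 100);
INSERT INTO EMP VALUES (102, 'Vandam', 100);
INSERT INTO EMP VALUES (104, 'Lange', 105);
INSERT INTO EMP VALUES (106, 'Smith', 105);

-- E plays the role of "employee", M the role of "manager".
-- Both aliases refer to the same physical table.
SELECT   M.EMP_NUM, M.EMP_LNAME, COUNT(*) AS NUM_REPORTS
FROM     EMP E, EMP M
WHERE    E.EMP_MGR = M.EMP_NUM
GROUP BY M.EMP_NUM, M.EMP_LNAME
ORDER BY M.EMP_NUM;

-- Clean up so the sketch can be rerun.
DROP TABLE EMP;
```

With these rows the query returns two lines, one for Cela and one for Williams, each with two direct reports. Employees whose EMP_MGR is null never satisfy the join condition on the E side, which is why the managers do not appear as employees in listings of this kind.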
NOTE
In Microsoft Access, add AS to the previous SQL command sequence, making it read:

SELECT E.EMP_MGR, M.EMP_LNAME, E.EMP_NUM, E.EMP_LNAME
FROM EMP AS E, EMP AS M
WHERE E.EMP_MGR = M.EMP_NUM
ORDER BY E.EMP_MGR;

8.8.3 Outer Joins

Figure 8.28 showed the results of joining the PRODUCT and VENDOR tables. If you examine the output, note that 14 product rows are listed. If you compare those results with the PRODUCT table in Figure 8.2, you will notice that two products are missing. Why? The reason is that there are two products with nulls in the V_CODE attribute. Because there is no matching null value in the VENDOR table's V_CODE attribute, the products do not show up in the final output based on the join. Also, if you examine the VENDOR table in Figure 8.2, you will note that several vendors have no matching V_CODE in the PRODUCT table. To include those rows in the final join output, you must use an outer join.

There are two types of outer joins: left and right. (See Chapter 4.) Given the contents of the PRODUCT and VENDOR tables, the following left outer join will show all VENDOR rows and all matching PRODUCT rows:

SELECT P_CODE, VENDOR.V_CODE, V_NAME
FROM VENDOR LEFT JOIN PRODUCT
ON VENDOR.V_CODE = PRODUCT.V_CODE;

Figure 8.32 shows the output generated by the left outer join command in Microsoft Access. Both Oracle and MySQL yield the same result, but show the output in a different order.

FIGURE 8.32 The left outer join results

P_CODE    V_CODE  V_NAME
23109-HB  21225   Bryson, Inc.
SM-18277  21225   Bryson, Inc.
(null)    21226   SuperLoo, Inc.
SW-23116  21231   D&E Supply
13-Q2/P2  21344   Jabavu Bros.
14-Q1/L3  21344   Jabavu Bros.
54778-2T  21344   Jabavu Bros.
(null)    22567   Dome Supply
1546-QQ2  23119   Randsets Ltd.
1558-QW1  23119   Randsets Ltd.
(null)    24004   Brackman Bros.
2232/QTY  24288   ORDVA, Inc.
2232/QWE  24288   ORDVA, Inc.
89-WRE-Q  24288   ORDVA, Inc.
(null)    25443   B&K, Inc.
(null)    25501   Damal Supplies
11QER/31  25595   Rubicon Systems
2238/QPD  25595   Rubicon Systems
WR3/TT3   25595   Rubicon Systems

The right outer join will join both tables and show all product rows with all matching vendor rows. The SQL command for the right outer join is:

SELECT PRODUCT.P_CODE, VENDOR.V_CODE, V_NAME
FROM VENDOR RIGHT JOIN PRODUCT
ON VENDOR.V_CODE = PRODUCT.V_CODE;

Figure 8.33 shows the output generated by the right outer join command sequence in Microsoft Access. Again, both Oracle and MySQL yield the same result, but show the output in a different order.

FIGURE 8.33 The right outer join results

P_CODE    V_CODE  V_NAME
23109-HB  21225   Bryson, Inc.
SM-18277  21225   Bryson, Inc.
SW-23116  21231   D&E Supply
13-Q2/P2  21344   Jabavu Bros.
14-Q1/L3  21344   Jabavu Bros.
54778-2T  21344   Jabavu Bros.
1546-QQ2  23119   Randsets Ltd.
1558-QW1  23119   Randsets Ltd.
2232/QTY  24288   ORDVA, Inc.
2232/QWE  24288   ORDVA, Inc.
89-WRE-Q  24288   ORDVA, Inc.
11QER/31  25595   Rubicon Systems
2238/QPD  25595   Rubicon Systems
WR3/TT3   25595   Rubicon Systems
23114-AA  (null)  (null)
PVC23DRT  (null)  (null)
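Two practical points about outer joins can be shown with a pair of two-row tables: a left outer join combined with an IS NULL test isolates the unmatched rows, and a right outer join can always be rewritten as a left outer join by swapping the table order. The sketch below is an illustration under assumed, cut-down table structures; it uses only LEFT JOIN so that it also runs on systems without RIGHT JOIN support.

```sql
-- Hypothetical cut-down tables: one vendor without products,
-- one product without a vendor.
CREATE TABLE VENDOR (
    V_CODE INTEGER PRIMARY KEY,
    V_NAME VARCHAR(35) NOT NULL
);

CREATE TABLE PRODUCT (
    P_CODE VARCHAR(10) PRIMARY KEY,
    V_CODE INTEGER
);

INSERT INTO VENDOR VALUES (21225, 'Bryson, Inc.');
INSERT INTO VENDOR VALUES (21226, 'SuperLoo, Inc.');
INSERT INTO PRODUCT VALUES ('23109-HB', 21225);
INSERT INTO PRODUCT VALUES ('23114-AA', NULL);

-- Left outer join: every vendor, with product data where it exists.
SELECT   P_CODE, VENDOR.V_CODE, V_NAME
FROM     VENDOR LEFT JOIN PRODUCT
         ON VENDOR.V_CODE = PRODUCT.V_CODE
ORDER BY VENDOR.V_CODE;

-- Keep only the unmatched vendors: those whose product side is null.
SELECT   VENDOR.V_CODE, V_NAME
FROM     VENDOR LEFT JOIN PRODUCT
         ON VENDOR.V_CODE = PRODUCT.V_CODE
WHERE    PRODUCT.P_CODE IS NULL;

-- Equivalent of "VENDOR RIGHT JOIN PRODUCT": swap the tables and use LEFT JOIN.
-- Every product is listed, with vendor data where it exists.
SELECT   PRODUCT.P_CODE, VENDOR.V_CODE, V_NAME
FROM     PRODUCT LEFT JOIN VENDOR
         ON VENDOR.V_CODE = PRODUCT.V_CODE
ORDER BY PRODUCT.P_CODE;

-- Clean up so the sketch can be rerun.
DROP TABLE PRODUCT;
DROP TABLE VENDOR;
```

With these rows the first query returns both vendors (SuperLoo with a null P_CODE), the second returns only SuperLoo, and the third returns both products (23114-AA with null vendor columns).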
In Chapter 9, Procedural Language SQL and Advanced SQL, you will learn more about joins and how to use the latest ANSI SQL standard syntax.
Online Content
For a complete walk-through example of converting an ER model into a database structure and using SQL commands to create tables, see Appendix D, Converting an ER Model into a Database Structure, on the online platform for this book.
SUMMARY

The SQL commands can be divided into two overall categories: data definition language (DDL) commands and data manipulation language (DML) commands.

The ANSI standard data types are supported by all RDBMS vendors in different ways. The basic data types are NUMBER, INTEGER, CHAR, VARCHAR and DATE.

The basic data definition commands allow you to create tables, indexes and views. Many SQL constraints can be used with columns. The commands are CREATE TABLE, CREATE INDEX, CREATE VIEW, ALTER TABLE, DROP TABLE, DROP VIEW and DROP INDEX.

DML commands allow you to add, modify, and delete rows from tables. The basic DML commands are SELECT, INSERT, UPDATE, DELETE, COMMIT and ROLLBACK.

The INSERT command is used to add new rows to tables. The UPDATE command is used to modify data values in existing rows of a table. The DELETE command is used to delete rows from tables. The COMMIT and ROLLBACK commands are used to permanently save or roll back changes made to the rows. Once you COMMIT the changes, you cannot undo them with a ROLLBACK command.

The SELECT statement is the main data retrieval command in SQL. A SELECT statement has the following syntax:

SELECT columnlist
FROM tablelist
[WHERE conditionlist ]
[GROUP BY columnlist ]
[HAVING conditionlist ]
[ORDER BY columnlist [ASC | DESC] ] ;

The column list represents one or more column names separated by commas. The column list may also include computed columns, aliases and aggregate functions. A computed column is represented by an expression or formula (for example, P_PRICE * P_QOH). The FROM clause contains a list of table names or view names.

The WHERE clause can be used with the SELECT, UPDATE and DELETE statements to restrict the rows affected by the DML command. The condition list represents one or more conditional expressions separated by logical operators (AND/OR/NOT). The conditional expression can contain any comparison operators (=, >, <, >=, <=, <>) as well as special operators (BETWEEN, IS NULL, LIKE, IN and EXISTS).
Aggregate functions (COUNT, MIN, MAX, SUM, AVG) are special functions that perform arithmetic computations over a set of rows. The aggregate functions are usually used in conjunction with the GROUP BY clause to group the output of aggregate computations by one or more attributes. The HAVING clause is used to restrict the output of the GROUP BY clause by selecting only the aggregate rows that match a given condition.

The ORDER BY clause is used to sort the output of a SELECT statement. The ORDER BY clause can sort by one or more columns and use either ascending or descending order.

You can join the output of multiple tables with the SELECT statement. The join operation is performed every time you specify two or more tables in the FROM clause and use a join condition in the WHERE clause to match the foreign key of one table to the primary key of the related table. If you do not specify a join condition, the DBMS automatically performs a Cartesian product of the tables you specify in the FROM clause.

The natural join uses the join condition to match only rows with equal values in the specified columns. You could also do a right outer join and left outer join to select the rows that have no matching values in the other related table.
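The clause order in the SELECT syntax summarised above can be seen in a single statement that uses every optional clause. This is a minimal sketch, assuming a four-column PRODUCT table and invented rows and product codes; the NUM_PRODUCTS alias is likewise only illustrative.

```sql
-- Hypothetical reduced PRODUCT table with invented rows.
CREATE TABLE PRODUCT (
    P_CODE  VARCHAR(10) PRIMARY KEY,
    V_CODE  INTEGER,
    P_QOH   INTEGER NOT NULL,
    P_PRICE DECIMAL(8,2) NOT NULL
);

INSERT INTO PRODUCT VALUES ('A-001', 21225, 10, 5.00);   -- stock value  50.00
INSERT INTO PRODUCT VALUES ('A-002', 21225, 20, 4.00);   -- stock value  80.00
INSERT INTO PRODUCT VALUES ('B-001', 21344, 5, 10.00);   -- stock value  50.00
INSERT INTO PRODUCT VALUES ('C-001', NULL, 3, 100.00);   -- no vendor

-- WHERE filters individual rows before grouping,
-- GROUP BY forms one group per vendor,
-- HAVING filters the groups using an aggregate,
-- ORDER BY sorts whatever groups remain.
SELECT   V_CODE,
         COUNT(*)             AS NUM_PRODUCTS,
         SUM(P_QOH * P_PRICE) AS TOTCOST
FROM     PRODUCT
WHERE    V_CODE IS NOT NULL
GROUP BY V_CODE
HAVING   SUM(P_QOH * P_PRICE) > 100
ORDER BY TOTCOST DESC;

-- Clean up so the sketch can be rerun.
DROP TABLE PRODUCT;
```

With these rows only vendor 21225 survives: the product without a vendor is removed by WHERE, and vendor 21344 (total 50) is removed by HAVING, leaving one group of two products with a total cost of 130.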
KEY TERMS

alias
ALTER TABLE
AND
authentication
AVG
base tables
BETWEEN
Boolean algebra
cascading order sequence
COMMIT
COUNT
CREATE INDEX
CREATE TABLE
CREATE VIEW
DELETE
DISTINCT
DROP INDEX
DROP TABLE
EXISTS
GROUP BY
HAVING
IN
INSERT
IS NULL
LIKE
MAX
MIN
NOT
OR
ORDER BY
recursive query
reserved words
ROLLBACK
rules of precedence
schema
SELECT
subquery
SUM
UPDATE
view
wildcard character
Online Content
Answers to selected Review Questions and Problems for this chapter are available on the online platform for this book.
REVIEW QUESTIONS

Online Content
The Review Questions in this chapter are based on the 'Ch08_Review' database located on the online platform for this book. This database is stored in Microsoft Access format. If you use another DBMS such as Oracle, SQL Server or MySQL, use its import utilities to move the Access database contents.

The Ch08_Review database stores data for a consulting company that tracks all charges to projects. The charges are based on the hours each employee works on each project. The structure and contents of the Ch08_Review database are shown in Figure Q8.1.
FIGURE Q8.1 The Ch8_Review database

Table name: EMPLOYEE

EMP_NUM  EMP_LNAME   EMP_FNAME    EMP_INITIAL  EMP_HIREDATE  JOB_CODE  EMP_YEARS
101      News        John         G            08-Nov-10     502       4
102      Moonsamy    Kavyara      H            12-Jul-99     501       15
103      Baloyi      Mzwandile    E            01-Dec-06     503       8
104      Maseki      Noxolo       K            15-Nov-97     501       17
105      Johnson     Alice        K            01-Feb-03     502       12
106      Smithfield  William                   22-Jun-14     500       0
107      Alonzo      Maria        D            10-Oct-03     500       11
108      Khan        Krishshanth  B            22-Aug-01     501       13
109      Smith       Larry        W            18-Jul-07     501       7
110      Olenko      Gerald       A            11-Dec-05     505       9
111      Wabash      Geoff        B            04-Apr-01     506       14
112      Smithson    Darlene      M            23-Oct-04     507       10
113      Joenbrood   Delbert      K            15-Nov-06     508       8
114      Jones       Annelise                  20-Aug-03     508       11
115      Bawangi     Travis       B            25-Jan-02     501       13
116      Pratt       Gerald       L            05-Mar-07     510       8
117      Williamson  Angie        H            19-Jun-06     509       8
118      Frommer     James        J            04-Jan-15     510       0

Table name: ASSIGNMENT

ASSIGN_NUM  ASSIGN_DATE  PROJ_NUM  EMP_NUM  ASSIGN_JOB  ASSIGN_CHG_HR  ASSIGN_HOURS  ASSIGN_CHARGE
1001        22-Mar-19    18        103      503          84.50         3.50          295.75
1002        22-Mar-19    22        117      509          34.55         4.20          145.11
1003        22-Mar-19    18        117      509          34.55         2.00           69.10
1004        22-Mar-19    18        103      503          84.50         5.90          498.55
1005        22-Mar-19    25        108      501          96.75         2.20          212.85
1006        22-Mar-19    22        104      501          96.75         4.20          406.35
1007        22-Mar-19    25        113      508          50.75         3.80          192.85
1008        22-Mar-19    18        103      503          84.50         0.90           76.05
1009        23-Mar-19    15        115      501          96.75         5.60          541.80
1010        23-Mar-19    15        117      509          34.55         2.40           82.92
1011        23-Mar-19    25        105      502         105.00         4.30          451.50
1012        23-Mar-19    18        108      501          96.75         3.40          328.95
1013        23-Mar-19    25        115      501          96.75         2.00          193.50
1014        23-Mar-19    22        104      501          96.75         2.80          270.90
1015        23-Mar-19    15        103      503          84.50         6.10          515.45
1016        23-Mar-19    22        105      502         105.00         4.70          493.50
1017        23-Mar-19    18        117      509          34.55         3.80          131.29
1018        23-Mar-19    25        117      509          34.55         2.20           76.01
1019        24-Mar-19    25        104      501         110.50         4.90          541.45
1020        24-Mar-19    15        101      502         125.00         3.10          387.50
1021        24-Mar-19    22        108      501         110.50         2.70          298.35
1022        24-Mar-19    22        115      501         110.50         4.90          541.45
1023        24-Mar-19    22        105      502         125.00         3.50          437.50
1024        24-Mar-19    15        103      503          84.50         3.30          278.85
1025        24-Mar-19    18        117      509          34.55         4.20          145.11

Table name: JOB

JOB_CODE  JOB_DESCRIPTION        JOB_CHG_HOUR  JOB_LAST_UPDATE
500       Programmer              35.75        20-Nov-18
501       Systems Analyst         96.75        20-Nov-18
502       Database Designer      125.00        24-Mar-19
503       Electrical Engineer     84.50        20-Nov-19
504       Mechanical Engineer     67.90        20-Nov-19
505       Civil Engineer          55.78        20-Nov-19
506       Clerical Support        26.87        20-Nov-19
507       DSS Analyst             45.95        20-Nov-19
508       Applications Designer   48.10        24-Mar-19
509       Bio Technician          34.55        20-Nov-18
510       General Support         18.36        20-Nov-18

Table name: PROJECT

PROJ_NUM  PROJ_NAME     PROJ_VALUE  PROJ_BALANCE  EMP_NUM
15        Evergreen     1453500.00  1002350.00    103
18        Amber Wave    3500500.00  2110346.00    108
22        Rolling Tide   805000.00   500345.20    102
25        Starflight    2650500.00  2309880.00    107
CHAPTER
As you examine attribute
Figure
Q8.1, note that the
(ASSIGN_CHG_HR)
are likely table.
to And,
stored.
change
the
those
ASSIGNMENT
maintain
over time.
naturally,
Because
to In
historical
fact,
employee
job
of the
to
data.
change
assignment
are required
Structured
Query
stores the JOB_CHG_HOUR
accuracy
a JOB_CHG_HOUR
primary
attributes
table
8 Beginning
maintain the
values
The JOB_CHG_HOUR
is reflected
may change,
in the
so the
historical
397
as an values
ASSIGNMENT
ASSIGN_JOB
accuracy
Language
of the
is
also
data, they
are
not redundant. Given
the
commands
1
structure to
answer
Writethe subset
that
Attribute
contents
questions
(Field)
Ch8_Review
will create the table
EMPLOYEE
the
of the
database
shown
in
Figure
Q8.1,
use
SQL
125.
SQL code that
of the
(Note
and
table.
JOB_CODE
The basic
is the
FK to
structure for a table
EMP_1
table
structure
is
named EMP_1. This table is a summarised
in the
table
below.
JOB).
Data
Name
Declaration
EMP_NUM
CHAR(3)
EMP_LNAME
VARCHAR(15)
EMP_FNAME
VARCHAR(15)
EMP_INITIAL
CHAR(1)
EMP_HIREDATE
DATE
JOB_CODE
CHAR(3)
8 2
Having created the table structure in the
table
FIGURE
shown
Q8.2
in
EMP_FNAME
EMP_INITIAL
EMP_HIREDATE
101
News
John
G
08-Nov-10
502
102
Moonsamy
Kavyara
H
12-Jul-99
501
Mzwandile
E
01-Dec-06
500
Noxolo
K
15-Nov-07
501
Alice
K
01-Feb-03
502
22-Jun-14
500
D
10-Oct-03
500
B
22-Aug-01
501
18-Jul-07
501
Baloyi
review
2020 has
104
Maseki
105
Johnson
106
Smithfield
William
107
Alonzo
Maria
108
Khan
Krishshanth
109
Smith
Larry
W
Assuming the data shown in the EMP_1 table have been entered, all attributes for ajob code of 502.
4
Copyright
of the EMP_1 table
EMP_LNAME
103
Editorial
Q8.2.
The contents
EMP_NUM
3
Figure
Question 1, writethe SQL code to enter the first two rows for
Writethe SQL code that
Cengage deemed
Learning. that
any
All suppressed
Rights
Reserved. content
does
May not
not materially
will save the changes
be
copied, affect
scanned, the
overall
or
duplicated, learning
in experience.
whole
JOB_CODE
write the SQL code that
willlist
madeto the EMP_1 table.
or in Cengage
part.
Due Learning
to
electronic reserves
rights, the
right
some to
third remove
party additional
content
may content
be
suppressed at
any
time
from if
the
subsequent
eBook rights
and/or restrictions
eChapter(s). require
it
5 Write the SQL code to change the job code to 501 for the person whose employee number (EMP_NUM) is 107. After you have completed the task, examine the results, then reset the job code to its original value.

6 Write the SQL code to delete the row for the person named William Smithfield, who was hired on 22 June 2014, and whose job code classification is 500. (Hint: Use logical operators to include all of the information given in this problem.)

7 Write the SQL code that will restore the data to its original status; that is, the table should contain the data that existed before you made the changes in Questions 5 and 6.

8 Write the SQL code to create a copy of EMP_1, naming the copy EMP_2. Then write the SQL code that will add the attributes EMP_PCT and PROJ_NUM to its structure. The EMP_PCT is the bonus percentage to be paid to each employee. The new attribute characteristics are:

EMP_PCT   NUMBER(4,2)
PROJ_NUM  CHAR(3)

9 Write the SQL code to change the EMP_PCT value to 3.85 for the person whose employee number (EMP_NUM) is 103. Next, write the SQL command sequences to change the EMP_PCT values as shown in Figure Q8.3.

FIGURE Q8.3 The contents of the EMP_2 table

EMP_NUM  EMP_LNAME   EMP_FNAME    EMP_INITIAL  EMP_HIREDATE  JOB_CODE  EMP_PCT  PROJ_NUM
101      News        John         G            08-Nov-10     502        5.00
102      Moonsamy    Kavyara      H            12-Jul-99     501        8.00
103      Baloyi      Mzwandile    E            01-Dec-06     500        3.85
104      Maseki      Noxolo       K            15-Nov-97     501       10.00
105      Johnson     Alice        K            01-Feb-03     502        5.00
106      Smithfield  William                   22-Jun-14     500        6.20
107      Alonzo      Maria        D            10-Oct-03     500        5.15
108      Khan        Krishshanth  B            22-Aug-01     501       10.00
109      Smith       Larry        W            18-Jul-07     501        2.00

10 Using a single command sequence, write the SQL code that will change the project number (PROJ_NUM) to 18 for all employees whose job classification (JOB_CODE) is 500.

11 Using a single command sequence, write the SQL code that will change the project number (PROJ_NUM) to 25 for all employees whose job classification (JOB_CODE) is 502 or higher. When you finish questions 10 and 11, the EMP_2 table will contain the data shown in Figure Q8.4. (You may assume that the table has been saved again at this point.)
FIGURE Q8.4 The contents of the EMP_2 table after the modification

EMP_NUM  EMP_LNAME   EMP_FNAME    EMP_INITIAL  EMP_HIREDATE  JOB_CODE  EMP_PCT  PROJ_NUM
101      News        John         G            08-Nov-10     502       5.00     25
102      Moonsamy    Kavyara      H            12-Jul-99     501       8.00
103      Baloyi      Mzwandile    E            01-Dec-06     500       3.85     18
104      Maseki      Noxolo       K            15-Nov-97     501       10.00
105      Johnson     Alice        K            01-Feb-03     502       5.00     25
106      Smithfield  William                   22-Jun-14     500       6.20     18
107      Alonzo      Maria        D            10-Oct-03     500       5.15     18
108      Khan        Krishshanth  B            22-Aug-01     501       10.00
109      Smith       Larry        W            18-Jul-07     501       2.00

12 Write the SQL code that will change the PROJ_NUM to 14 for those employees who were hired before 1 January 2004, and whose job code is at least 501. (You may assume that the table will be restored to its condition preceding this question.)
13 Write the two SQL command sequences required to:
a Create a temporary table named TEMP_1 whose structure is composed of the EMP_2 attributes EMP_NUM and EMP_PCT.
b Copy the matching EMP_2 values into the TEMP_1 table.
14 Write the SQL command that will delete the newly created TEMP_1 table from the database.
15 Write the SQL code required to list all employees whose last names start with Smith. In other words, the rows for both Smith and Smithfield should be included in the listing. Assume case sensitivity.
16 Using the EMPLOYEE, JOB, and PROJECT tables in the Ch08_Review database (see Figure Q8.1), write the SQL code that will produce the results shown in Figure Q8.5.

FIGURE Q8.5 The query results for Question 16

PROJ_NAME     PROJ_VALUE  PROJ_BALANCE  EMP_LNAME  EMP_FNAME    EMP_INITIAL  JOB_CODE  JOB_DESCRIPTION  JOB_CHG_HOUR
Rolling Tide  805000.00   500345.20     Moonsamy   Kavyara      H            501       Systems Analyst  96.75
Evergreen     1453500.00  1002350.00    Baloyi     Mzwandile    E            500       Programmer       35.75
Starflight    2650500.00  2309880.00    Alonzo     Maria        D            500       Programmer       35.75
Amber Wave    3500500.00  2110346.00    Khan       Krishshanth  B            501       Systems Analyst  96.75
17 Write the SQL code that produces a virtual table named REP_1, containing the same information that was shown in Question 16.
18 Write the SQL code to find the average bonus percentage in the EMP_2 table you created in Question 8.
19 Write the SQL code that produces a listing for the data in the EMP_2 table in ascending order by the bonus percentage.
20 Write the SQL code that will list only the different project numbers found in the EMP_2 table.
21 Write the SQL code to calculate the ASSIGN_CHARGE values in the ASSIGNMENT table in the Ch08_Review database. (See Figure Q8.1.) Note that ASSIGN_CHARGE is a derived attribute that is calculated by multiplying ASSIGN_CHG_HR by ASSIGN_HOURS.
22 Using the data in the ASSIGNMENT table, write the SQL code that will yield the total number of hours worked for each employee and the total charges stemming from those hours worked. The results of running that query are shown in Figure Q8.6.

FIGURE Q8.6 Total hours and charges by employee

EMP_NUM  EMP_LNAME   SumOfASSIGN_HOURS  SumOfASSIGN_CHARGE
101      News        3.1                387.50
103      Baloyi      19.7               1664.65
104      Maseki      11.9               1218.70
105      Johnson     12.5               1382.50
108      Khan        8.3                840.15
113      Joenbrood   3.8                192.85
115      Bawangi     12.5               1276.75
117      Williamson  18.8               649.54

23 Write a query to produce the total number of hours and charges for each of the projects represented in the ASSIGNMENT table. The output is shown in Figure Q8.7.

FIGURE Q8.7 Total hours and charges by project

PROJ_NUM  SumOfASSIGN_HOURS  SumOfASSIGN_CHARGE
15        20.5               1806.52
18        23.7               1544.80
22        27                 2593.16
25        19.4               1668.16

24 Write the SQL code to generate the total hours worked and the total charges made by all employees. The results are shown in Figure Q8.8. (Hint: This is a nested query. If you use Microsoft Access, you can generate the result by using the query output shown in Figure Q8.6 as the basis for the query that will produce the output shown in Figure Q8.8).
FIGURE Q8.8 Total hours and charges, all employees

SumOfSumOfASSIGN_HOURS  SumOfSumOfASSIGN_CHARGE
90.6                    7612.64

25 Write the SQL code to generate the total hours worked and the total charges made to all projects. The results should be the same as those shown in Figure Q8.8. (Hint: This is a nested query. If you use Microsoft Access, you can generate the result by using the query output as the basis for this query.)
26 Explain why it would be preferable to use a DATE data type to store date data instead of a character data type.
27 Explain why the following command would create an error and which changes could be made to fix the error:
SELECT V_CODE, SUM(P_QOH) FROM PRODUCT;
28 Explain the difference between an ORDER BY clause and a GROUP BY clause.
29 Explain why the following two commands produce different results:
SELECT DISTINCT COUNT (V_CODE) FROM PRODUCT;
SELECT COUNT (DISTINCT V_CODE) FROM PRODUCT;
30 What is the difference between the COUNT aggregate function and the SUM aggregate function?
31 In a SELECT query, what is the difference between a WHERE clause and a HAVING clause?
32 Rewrite the following WHERE clause without the use of the IN operator:
WHERE V_COUNTRY IN ('UK', 'SA', 'USA')
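Two of the review questions above (the IN operator and the placement of DISTINCT inside or outside COUNT) are easy to try out on a throwaway database. The sketch below is only an illustration, not a model answer: it assumes Python's built-in sqlite3 module as the RDBMS, and the four VENDOR rows and four PRODUCT rows are made-up toy data chosen so that one vendor supplies two products and one product has no vendor.

```python
import sqlite3

# Toy data (an assumption for illustration, not the textbook database).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE VENDOR (V_CODE INTEGER PRIMARY KEY, V_COUNTRY TEXT);
INSERT INTO VENDOR VALUES (21225, 'UK'), (21226, 'SA'), (22567, 'FR'), (23119, 'FR');
CREATE TABLE PRODUCT (P_CODE TEXT PRIMARY KEY, V_CODE INTEGER);
INSERT INTO PRODUCT VALUES ('A', 21225), ('B', 21225), ('C', 22567), ('D', NULL);
""")

# A membership test with IN ...
with_in = conn.execute(
    "SELECT V_CODE FROM VENDOR "
    "WHERE V_COUNTRY IN ('UK', 'SA', 'USA') ORDER BY V_CODE"
).fetchall()

# ... is shorthand for a chain of equality tests joined by OR.
with_or = conn.execute(
    "SELECT V_CODE FROM VENDOR "
    "WHERE V_COUNTRY = 'UK' OR V_COUNTRY = 'SA' OR V_COUNTRY = 'USA' "
    "ORDER BY V_CODE"
).fetchall()

# DISTINCT applied to the single row that COUNT returns: counts every non-null V_CODE.
count_then_distinct = conn.execute(
    "SELECT DISTINCT COUNT(V_CODE) FROM PRODUCT"
).fetchone()[0]

# DISTINCT applied inside COUNT: counts each vendor code only once.
distinct_then_count = conn.execute(
    "SELECT COUNT(DISTINCT V_CODE) FROM PRODUCT"
).fetchone()[0]

print(with_in, with_or)
print(count_then_distinct, distinct_then_count)
```

With this toy data both WHERE clauses select the same vendors, while the two COUNT forms differ because vendor 21225 appears twice in PRODUCT.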
PROBLEMS

Online Content
Problems 1-15 are based on the 'Ch08_AviaCo' database located on the online platform for this book. This database is stored in Microsoft Access format. If you use another DBMS such as Oracle, SQL Server or MySQL, use its import utilities to move the Access database contents.

Before you attempt to write any SQL queries, familiarise yourself with the Ch08_AviaCo database structure and contents shown in Figure P8.1. Although the relational schema does not show optionalities, keep in mind that all pilots are employees, but not all employees are flight crew members. (Although, in this database, the crew member assignments all involve pilots and copilots, the design is sufficiently flexible to accommodate crew member assignments, such as loadmasters and flight attendants, to people who are not pilots. That's why the relationship between CHARTER and EMPLOYEE is implemented through CREW.) Note also that this design implementation does not include multivalued attributes. For example, multiple ratings such as Instrument and Certified Flight Instructor ratings are stored in the (composite) EARNEDRATINGS table. Nor does the CHARTER table include multiple crew assignments, which are properly stored in the CREW table.
FIGURE P8.1
The Ch08_AviaCo database
Table name: CREW

CHAR_TRIP  EMP_NUM  CREW_JOB
10001      104      Pilot
10002      101      Pilot
10003      105      Pilot
10003      109      Copilot
10004      106      Pilot
10005      101      Pilot
10006      109      Pilot
10007      104      Pilot
10007      105      Copilot
10008      106      Pilot
10009      105      Pilot
10010      108      Pilot
10011      101      Pilot
10011      104      Copilot
10012      101      Pilot
10013      105      Pilot
10014      106      Pilot
10015      101      Copilot
10015      104      Pilot
10016      105      Copilot
10016      109      Pilot
10017      101      Pilot
10018      104      Copilot
10018      105      Pilot

Table name: RATING

RTG_CODE  RTG_NAME
CFI       Certified Flight Instructor
CFII      Certified Flight Instructor, Instrument
INSTR     Instrument
MEL       Multiengine Land
SEL       Single Engine, Land
SES       Single Engine, Sea
Table name: EMPLOYEE

EMP_NUM  EMP_TITLE  EMP_LNAME   EMP_FNAME  EMP_INITIAL  EMP_DOB    EMP_HIRE_DATE
100      Mr         Nkosi       Cela       D            15-Jun-52  15-Mar-98
101      Ms         Lewis       Rhonda     G            19-Mar-75  25-Apr-96
102      Mr         Vandam      Rhett                   14-Nov-68  18-May-03
103      Ms         Jones       Anne       M            11-May-84  26-Jul-09
104      Mr         Lange       John       P            12-Jul-81  20-Aug-00
105      Mr         Williams    Robert     D            14-Mar-85  19-Jun-13
106      Mrs        Duzak       Jeanine    K            12-Feb-78  13-Mar-99
107      Mr         Diante      Jorge      D            01-May-85  02-Jul-07
108      Mr         Wiesenbach  Paul       R            14-Feb-76  03-Jun-03
109      Ms         Travis      Elizabeth  K            18-Jun-71  14-Feb-16
110      Mrs        Genkazi     Leighla    W            19-May-80  29-Jun-00

Table name: PILOT

EMP_NUM  PIL_LICENSE  PIL_RATINGS             PIL_MED_TYPE  PIL_MED_DATE  PIL_PT135_DATE
101      ATP          SEL/MEL/Instr/CFII      1             12-Apr-2018   15-Jun-2018
104      ATP          SEL/MEL/Instr           1             10-Jun-2018   23-Mar-2019
105      COM          SEL/MEL/Instr/CFI       2             25-Feb-2019   12-Feb-2019
106      COM          SEL/MEL/Instr           2             02-Apr-2019   24-Dec-2019
109      COM          SEL/MEL/SES/Instr/CFII  1             14-Apr-2019   21-Apr-2019
Table name: EARNEDRATING

EMP_NUM  RTG_CODE  EARNRTG_DATE
101      CFI       18-Feb-08
101      CFII      15-Dec-15
101      INSTR     08-Nov-03
101      MEL       23-Jun-04
101      SEL       21-Apr-03
104      INSTR     15-Jul-06
104      MEL       29-Jan-07
104      SEL       12-Mar-05
105      CFI       18-Nov-07
105      INSTR     17-Apr-05
105      MEL       12-Aug-05
105      SEL       23-Sep-04
106      INSTR     20-Dec-05
106      MEL       02-Apr-06
106      SEL       10-Mar-04
109      CFI       05-Nov-08
109      CFII      21-Jun-13
109      INSTR     23-Jul-06
109      MEL       15-Mar-07
109      SEL       05-Feb-06
109      SES       12-May-06

Table name: CUSTOMER

CUS_CODE  CUS_LNAME  CUS_FNAME  CUS_INITIAL  CUS_AREACODE  CUS_PHONE  CUS_BALANCE
10010     Ramas      Alfred     A            0181          844-2573   0.00
10011     Dunne      Leona      K            0161          894-1238   0.00
10012     Smith      Kathy      W            0181          894-2285   896.54
10013     Pieterse   Jaco       F            0181          894-2180   1285.19
10014     Orlando    Myron                   0181          222-1672   673.21
10015     O'Brian    Amy        B            0161          442-3381   1014.56
10016     Brown      James      G            0181          297-1228   0.00
10017     Williams   George                  0181          290-2556   0.00
10018     Farriss    Anne       G            0161          382-7185   0.00
10019     Smith      Olette     K            0181          297-3809   453.98

Table name: CHARTER

CHAR_TRIP  CHAR_DATE  AC_NUMBER  CHAR_DESTINATION  CHAR_DISTANCE  CHAR_HOURS_FLOWN  CHAR_HOURS_WAIT  CHAR_FUEL_GALLONS  CHAR_OIL_QTS  CUS_CODE
10001      05-Feb-19  2289L      ATL               936.00         5.1               2.2              354.1              1             10011
10002      05-Feb-19  2778V      BNA               320.00         1.6               0                72.6               0             10016
10003      05-Feb-19  4278Y      GNV               1574.00        7.8               0                339.8              2             10014
10004      06-Feb-19  1484P      STL               472.00         2.9               4.9              97.2               1             10019
10005      06-Feb-19  2289L      ATL               1023.00        5.7               3.5              397.7              2             10011
10006      06-Feb-19  4278Y      STL               472.00         2.6               5.2              117.1              0             10017
10007      06-Feb-19  2778V      GNV               1574.00        7.9               0                348.4              2             10012
10008      07-Feb-19  1484P      TYS               644.00         4.1               0                140.6              1             10014
10009      07-Feb-19  2289L      GNV               1574.00        6.6               23.4             459.9              0             10017
10010      07-Feb-19  4278Y      ATL               998.00         6.2               3.2              279.7              0             10016
10011      07-Feb-19  1484P      BNA               352.00         1.9               5.3              66.4               1             10012
10012      08-Feb-19  2778V      MOB               884.00         4.8               4.2              215.1              0             10010
10013      08-Feb-19  4278Y      TYS               644.00         3.9               4.5              174.3              1             10011
10014      09-Feb-19  4278Y      ATL               936.00         6.1               2.1              302.6              0             10017
10015      09-Feb-19  2289L      GNV               1645.00        6.7               0                459.5              2             10016
10016      09-Feb-19  2778V      MQY               312.00         1.5               0                67.2               0             10011
10017      10-Feb-19  1484P      STL               508.00         3.1               0                105.5              0             10014
10018      10-Feb-19  4278Y      TYS               644.00         3.8               4.5              167.4              0             10017

Table name: AIRCRAFT

AC_NUMBER  MOD_CODE  AC_TTAF  AC_TTEL  AC_TTER
1484P      PA23-250  1833.10  1833.10  101.80
2289L      C-90A     4243.80  768.90   1123.40
2778V      PA31-350  7992.90  1513.10  789.50
4278Y      PA31-350  2147.30  243.20   622.10

Table name: MODEL

MOD_CODE  MOD_MANUFACTURER  MOD_NAME          MOD_SEATS  MOD_CHG_MILE
C-90A     Beechcraft        KingAir           8          2.67
PA23-250  Piper             Aztec             6          1.93
PA31-350  Piper             Navajo Chieftain  10         2.35

1 Write the SQL code that will list the values for the first four attributes in the CHARTER table.
2 Using the contents of the CHARTER table, write the SQL query that will produce the output shown in Figure P8.2. Note that the output is limited to selected attributes for aircraft number 2778V.

FIGURE P8.2 Problem 2 query results

CHAR_DATE  AC_NUMBER  CHAR_DESTINATION  CHAR_DISTANCE  CHAR_HOURS_FLOWN
05-Feb-19  2778V      BNA               320.00         1.60
06-Feb-19  2778V      GNV               1574.00        7.90
08-Feb-19  2778V      MOB               884.00         4.80
09-Feb-19  2778V      MQY               312.00         1.50

3 Create a virtual table (named AC2778V) containing the output presented in Problem 2.
4 Produce the output shown in Figure P8.3 for aircraft 2778V. Note that this output includes data from the CHARTER and CUSTOMER tables. (Hint: Use a JOIN in this query.)
FIGURE P8.3 Problem 4 query results

CHAR_DATE  AC_NUMBER  CHAR_DESTINATION  CUS_LNAME  CUS_AREACODE  CUS_PHONE
08-Feb-19  2778V      MOB               Ramas      0181          844-2573
09-Feb-19  2778V      MQY               Dunne      0161          894-1238
06-Feb-19  2778V      GNV               Smith      0181          894-2285
05-Feb-19  2778V      BNA               Brown      0181          297-1228

5 Produce the output shown in Figure P8.4. The output, derived from the CHARTER and MODEL tables, is limited to 6 February 2019. (Hint: The join passes through another table. Note that the connection between CHARTER and MODEL requires the existence of AIRCRAFT because the CHARTER table does not contain a foreign key to MODEL. However, CHARTER does contain AC_NUMBER, a foreign key to AIRCRAFT, which contains a foreign key to MODEL.)

FIGURE P8.4 Problem 5 query results

CHAR_DATE  CHAR_DESTINATION  AC_NUMBER  MOD_NAME          MOD_CHG_MILE
06-Feb-19  STL               1484P      Aztec             1.93
06-Feb-19  ATL               2289L      KingAir           2.67
06-Feb-19  STL               4278Y      Navajo Chieftain  2.35
06-Feb-19  GNV               2778V      Navajo Chieftain  2.35

6
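The hint in Problem 5 describes a join that has to pass through an intermediate table: CHARTER reaches MODEL only by way of AIRCRAFT. One possible shape for such a query is sketched below. It is an illustration rather than the book's solution, and it rests on several assumptions: Python's sqlite3 module stands in for the RDBMS, only the columns needed for the join are created, only a handful of CHARTER rows are loaded, and dates are stored as ISO-format text (2019-02-06) instead of a native DATE type.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE MODEL (MOD_CODE TEXT PRIMARY KEY, MOD_NAME TEXT, MOD_CHG_MILE REAL);
CREATE TABLE AIRCRAFT (AC_NUMBER TEXT PRIMARY KEY,
                       MOD_CODE TEXT REFERENCES MODEL (MOD_CODE));
CREATE TABLE CHARTER (CHAR_TRIP INTEGER PRIMARY KEY, CHAR_DATE TEXT,
                      AC_NUMBER TEXT REFERENCES AIRCRAFT (AC_NUMBER),
                      CHAR_DESTINATION TEXT);
INSERT INTO MODEL VALUES ('C-90A', 'KingAir', 2.67), ('PA23-250', 'Aztec', 1.93),
                         ('PA31-350', 'Navajo Chieftain', 2.35);
INSERT INTO AIRCRAFT VALUES ('1484P', 'PA23-250'), ('2289L', 'C-90A'),
                            ('2778V', 'PA31-350'), ('4278Y', 'PA31-350');
INSERT INTO CHARTER VALUES
    (10003, '2019-02-05', '4278Y', 'GNV'),
    (10004, '2019-02-06', '1484P', 'STL'),
    (10005, '2019-02-06', '2289L', 'ATL'),
    (10006, '2019-02-06', '4278Y', 'STL'),
    (10007, '2019-02-06', '2778V', 'GNV'),
    (10008, '2019-02-07', '1484P', 'TYS');
""")

# CHARTER has no foreign key to MODEL, so the join is chained:
# CHARTER -> AIRCRAFT (on AC_NUMBER) -> MODEL (on MOD_CODE).
rows = conn.execute("""
    SELECT C.CHAR_DATE, C.CHAR_DESTINATION, C.AC_NUMBER, M.MOD_NAME, M.MOD_CHG_MILE
    FROM CHARTER C
    JOIN AIRCRAFT A ON C.AC_NUMBER = A.AC_NUMBER
    JOIN MODEL M ON A.MOD_CODE = M.MOD_CODE
    WHERE C.CHAR_DATE = '2019-02-06'
    ORDER BY C.CHAR_TRIP
""").fetchall()

for row in rows:
    print(row)
```

AIRCRAFT contributes no output columns here; it appears in the FROM clause only to connect the other two tables.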
Modify the query in Problem 5 to include data from the CUSTOMER table. This time the output is limited to charter records generated since 9 February 2019. (The query results are shown in Figure P8.5.)

FIGURE P8.5 Problem 6 query results

CHAR_DATE  CHAR_DESTINATION  AC_NUMBER  MOD_NAME          MOD_CHG_MILE  CUS_LNAME
09-Feb-19  ATL               4278Y      Navajo Chieftain  2.35          Williams
09-Feb-19  MQY               2778V      Navajo Chieftain  2.35          Dunne
09-Feb-19  GNV               2289L      KingAir           2.67          Brown
10-Feb-19  TYS               4278Y      Navajo Chieftain  2.35          Williams
10-Feb-19  STL               1484P      Aztec             1.93          Orlando
7 Modify the query in Problem 6 to produce the output shown in Figure P8.6. The date limitation in Problem 6 applies to this problem, too. Note that this query includes data from the CREW and EMPLOYEE tables. (Note: You may wonder why the date restriction seems to generate more records than it did in Problem 6. Actually, the number of (CHARTER) records is the same, but several records are listed twice to reflect a crew of two: a pilot and a copilot. For example, the record for the 09-Feb-2019 flight to GNV, using aircraft 2289L, required a crew consisting of a pilot (Lange) and a copilot (Lewis).)
FIGURE P8.6 Problem 7 query results

CHAR_DATE  CHAR_DESTINATION  AC_NUMBER  MOD_CHG_MILE  CHAR_DISTANCE  EMP_NUM  CREW_JOB  EMP_LNAME
09-Feb-19  GNV               2289L      2.67          1645.00        104      Pilot     Lange
09-Feb-19  GNV               2289L      2.67          1645.00        101      Copilot   Lewis
09-Feb-19  MQY               2778V      2.35          312.00         109      Pilot     Travis
09-Feb-19  MQY               2778V      2.35          312.00         105      Copilot   Williams
09-Feb-19  ATL               4278Y      2.35          936.00         106      Pilot     Duzak
10-Feb-19  STL               1484P      1.93          508.00         101      Pilot     Lewis
10-Feb-19  TYS               4278Y      2.35          644.00         105      Pilot     Williams
10-Feb-19  TYS               4278Y      2.35          644.00         104      Copilot   Lange
8 Modify the query in Problem 5 to include the computed (derived) attribute fuel per hour. (Hint: It is possible to use SQL to produce computed attributes that are not stored in any table. For example, the following SQL query is perfectly acceptable:
SELECT CHAR_DISTANCE, CHAR_FUEL_GALLONS/CHAR_DISTANCE FROM CHARTER;
(The above query produces the gallons per mile flown value.) Use a similar technique on joined tables to produce the gallons per hour output shown in Figure P8.7. (Note that 254.3 litres/1.5 hours produces 169.54 litres per hour.)
Query output such as the gallons per hour result shown in Figure P8.7 provides managers with very important information. In this case, why is the fuel burn for the Navajo Chieftain 4278Y flown on 9-Feb-19 so much higher than the fuel burn for that aircraft on 8-Feb-18? Such a query result may lead to additional queries to find out who flew the aircraft or which special circumstances might have existed. Is the fuel burn difference due to poor fuel management by the pilot, does it reflect an engine fuel metering problem, or was there an error in the fuel recording? The ability to generate useful query output is an important management asset.)

FIGURE P8.7 Problem 8 query results

CHAR_DATE  AC_NUMBER  MOD_NAME          CHAR_HOURS_FLOWN  CHAR_FUEL_GALLONS  Expr1
09-Feb-18  2778V      Navajo Chieftain  1.5               67.2               44.8
09-Feb-18  2289L      KingAir           6.7               459.5              68.5820895522388
09-Feb-18  4278Y      Navajo Chieftain  6.1               302.6              49.6065573770492
10-Feb-18  4278Y      Navajo Chieftain  3.8               167.4              44.0526315789474
10-Feb-18  1484P      Aztec             3.1               105.5              34.0322580645161
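The computed-attribute idea in Problem 8 can be tried on a small scale before any joins are added. The following sketch makes several assumptions: Python's sqlite3 module is the RDBMS, a cut-down CHARTER table is created with only four columns, and the column alias GAL_PER_HOUR is a hypothetical name (the figure shows Access's default Expr1 instead). The three rows use the hours and fuel values listed for trips 10014, 10016 and 10018.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE CHARTER (CHAR_TRIP INTEGER PRIMARY KEY, AC_NUMBER TEXT,
                      CHAR_HOURS_FLOWN REAL, CHAR_FUEL_GALLONS REAL);
INSERT INTO CHARTER VALUES (10014, '4278Y', 6.1, 302.6),
                           (10016, '2778V', 1.5, 67.2),
                           (10018, '4278Y', 3.8, 167.4);
""")

# GAL_PER_HOUR is derived in the SELECT list; it is not stored in any table.
burn = conn.execute("""
    SELECT AC_NUMBER, CHAR_HOURS_FLOWN, CHAR_FUEL_GALLONS,
           CHAR_FUEL_GALLONS / CHAR_HOURS_FLOWN AS GAL_PER_HOUR
    FROM CHARTER
    ORDER BY CHAR_TRIP
""").fetchall()

for ac_number, hours, gallons, rate in burn:
    print(f"{ac_number}: {gallons} gal / {hours} h = {rate:.2f} gal/h")
```

Naming the expression with AS is one way to control the output heading that the note on output formats refers to; without an alias, each RDBMS picks its own default label.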
NOTE
The output format is determined by the RDBMS you use. In this example, the Access software defaulted to an output heading labelled Expr1 to indicate the expression resulting from the division:
[CHARTER]![CHAR_FUEL_GALLONS]/[CHARTER]![CHAR_HOURS]
created by its expression builder. Oracle defaults to the full division label. You should learn to control the output format with the help of your RDBMS's utility software.

9 Create a query to produce the output shown in Figure P8.8. Note that, in this case, the computed attribute requires data found in two different tables. (Hint: The MODEL table contains the charge per mile, and the CHARTER table contains the total miles flown.) Note also that the output is limited to charter records generated since 9 February 2019. In addition, the output is ordered by date and, within the date, by the customer's last name.

FIGURE P8.8 Problem 9 query results

CHAR_DATE  CUS_LNAME  CHAR_DISTANCE  MOD_CHG_MILE  Mileage Charge
09-Feb-19  Brown      1645.00        2.67          4392.15
09-Feb-19  Dunne      312.00         2.35          733.20
09-Feb-19  Williams   936.00         2.35          2199.60
10-Feb-19  Orlando    508.00         1.93          980.44
10-Feb-19  Williams   644.00         2.35          1513.40

10 Use the techniques that produced the output in Problem 9 to produce the charges shown in Figure P8.9. The total charge to the customer is computed by:
Miles flown * charge per mile
Hours waited * 50 per hour
The miles flown (CHAR_DISTANCE) value is found in the CHARTER table, the charge per mile (MOD_CHG_MILE) is found in the MODEL table, and the hours waited (CHAR_HOURS_WAIT) are found in the CHARTER table.

FIGURE P8.9 Problem 10 query results

CHAR_DATE  CUS_LNAME  Mileage Charge  Waiting Charge  Total Charge
09-Feb-19  Brown      4392.15         0.00            4392.15
09-Feb-19  Dunne      733.20          0.00            733.20
09-Feb-19  Williams   2199.60         85.00           2304.60
08-Feb-19  Orlando    980.44          0.00            980.44
08-Feb-19  Williams   1513.40         225.00          1738.40
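The charge formula stated in Problem 10 (miles flown times charge per mile, plus hours waited times 50) can be written as a pair of derived columns and their sum. The sketch below is one possible arrangement under stated assumptions: Python's sqlite3 module is the RDBMS, the tables carry only the columns the formula needs, the constant WAIT_RATE and the column aliases are hypothetical names, and just two charter trips (10015 and 10018) are loaded.

```python
import sqlite3

WAIT_RATE = 50  # charge per hour waited, as stated in Problem 10

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE MODEL (MOD_CODE TEXT PRIMARY KEY, MOD_CHG_MILE REAL);
CREATE TABLE AIRCRAFT (AC_NUMBER TEXT PRIMARY KEY, MOD_CODE TEXT);
CREATE TABLE CHARTER (CHAR_TRIP INTEGER PRIMARY KEY, AC_NUMBER TEXT,
                      CHAR_DISTANCE REAL, CHAR_HOURS_WAIT REAL);
INSERT INTO MODEL VALUES ('C-90A', 2.67), ('PA31-350', 2.35);
INSERT INTO AIRCRAFT VALUES ('2289L', 'C-90A'), ('4278Y', 'PA31-350');
INSERT INTO CHARTER VALUES (10015, '2289L', 1645.0, 0.0),
                           (10018, '4278Y', 644.0, 4.5);
""")

# Each charge is a computed attribute; the total is the sum of the two parts.
charges = conn.execute("""
    SELECT C.CHAR_TRIP,
           ROUND(C.CHAR_DISTANCE * M.MOD_CHG_MILE, 2) AS MILEAGE_CHARGE,
           ROUND(C.CHAR_HOURS_WAIT * ?, 2) AS WAITING_CHARGE,
           ROUND(C.CHAR_DISTANCE * M.MOD_CHG_MILE
                 + C.CHAR_HOURS_WAIT * ?, 2) AS TOTAL_CHARGE
    FROM CHARTER C
    JOIN AIRCRAFT A ON C.AC_NUMBER = A.AC_NUMBER
    JOIN MODEL M ON A.MOD_CODE = M.MOD_CODE
    ORDER BY C.CHAR_TRIP
""", (WAIT_RATE, WAIT_RATE)).fetchall()

for trip, mileage, waiting, total in charges:
    print(trip, mileage, waiting, total)
```

For trip 10018 this gives 644.00 * 2.35 = 1513.40 for mileage and 4.5 * 50 = 225.00 for waiting, which add up to the 1738.40 total listed for that charter.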
11 Create the required SQL query that will produce a list of customers who have an unpaid balance. The output is shown in Figure P8.10. Note that the balances are listed in descending order.

FIGURE P8.10 Problem 11 query results

CUS_LNAME  CUS_FNAME  CUS_INITIAL  CUS_BALANCE
Pieterse   Jaco       F            1285.19
O'Brian    Amy        B            1014.56
Smith      Kathy      W            896.54
Orlando    Myron                   673.21
Smith      Olette     K            453.98

12 Find the average customer balance, the minimum balance, the maximum balance, and the total of the unpaid balances. The resulting values are shown in Figure P8.11.

FIGURE P8.11 Problem 12 query results

Average Balance  Minimum Balance  Maximum Balance  Total Unpaid Bills
432.35           0.00             1285.19          4323.48

13 Using the CHARTER table as the source, group the aircraft data. Then use the SQL functions to produce the output shown in Figure P8.12. (Utility software was used to modify the headers, so your headers may look different.)

FIGURE P8.12 Problem 13 query results

AC_NUMBER  Number of Trips  Total Distance  Average Distance  Total Hours  Average Hours
1484P      4                1976.00         494.00            12.00        3.00
2289L      4                5178.00         1294.50           24.10        6.03
2778V      4                3090.00         772.50            15.80        3.95
4278Y      6                5268.00         878.00            30.40        5.07

14 Write the SQL code to generate the output shown in Figure P8.13. Note that the listing includes all CHARTER flights that did not include a copilot crew assignment. (Hint: The crew assignments are listed in the CREW table. Also note that the pilot's last name requires access to the EMPLOYEE table, while the MOD_CODE requires access to the MODEL table.)
FIGURE P8.13 Problem 14 query results

CHAR_TRIP  CHAR_DATE  AC_NUMBER  MOD_NAME          CHAR_HOURS_FLOWN  EMP_LNAME   CREW_JOB
10001      05-Feb-19  2289L      KingAir           5.1               Lange       Pilot
10002      05-Feb-19  2778V      Navajo Chieftain  1.6               Lewis       Pilot
10004      06-Feb-19  1484P      Aztec             2.9               Duzak       Pilot
10005      06-Feb-19  2289L      KingAir           5.7               Lewis       Pilot
10006      06-Feb-19  4278Y      Navajo Chieftain  2.6               Travis      Pilot
10008      07-Feb-19  1484P      Aztec             4.1               Duzak       Pilot
10009      07-Feb-19  2289L      KingAir           6.6               Williams    Pilot
10010      07-Feb-19  4278Y      Navajo Chieftain  6.2               Wiesenbach  Pilot
10012      08-Feb-19  2778V      Navajo Chieftain  4.8               Lewis       Pilot
10013      08-Feb-19  4278Y      Navajo Chieftain  3.9               Williams    Pilot
10014      09-Feb-19  4278Y      Navajo Chieftain  6.1               Duzak       Pilot
10017      10-Feb-19  1484P      Aztec             3.1               Lewis       Pilot

15 Write a query that lists the ages of the employees and the date on which the query was run. The required output is shown in Figure P8.14. (As you can tell, the query was run on 4 February 2019, so the ages of the employees are current as of that date.)

FIGURE P8.14 Problem 15 query results

EMP_NUM  EMP_LNAME   EMP_FNAME  EMP_HIRE_DATE  EMP_DOB      Age  Query Date
100      Nkosi       Cela       15-Mar-1997    15-Jun-1952  67   04-Feb-19
101      Lewis       Rhonda     25-Apr-1998    19-Mar-1975  44   04-Feb-19
102      Vandam      Rhett      20-Dec-2002    14-Nov-1968  51   04-Feb-19
103      Jones       Anne       28-Aug-2015    16-Oct-1984  35   04-Feb-19
104      Lange       John       20-Oct-2006    08-Nov-1981  38   04-Feb-19
105      Williams    Robert     08-Jan-2016    14-Mar-1985  34   04-Feb-19
106      Duzak       Jeanine    05-Jan-2001    12-Feb-1978  41   04-Feb-19
107      Diante      Jorge      02-Jul-2006    21-Aug-1984  35   04-Feb-19
108      Wiesenbach  Paul       18-Nov-2004    14-Feb-1976  43   04-Feb-19
109      Travis      Elizabeth  14-Apr-2001    18-Jun-1971  48   04-Feb-19
110      Genkazi     Leighla    01-Dec-2002    19-May-1980  39   04-Feb-19
Online Content
Problems 16-33 are based on the 'Ch8_SaleCo' database located on the online platform for this book. This database is stored in Microsoft Access format. If you use another DBMS such as Oracle, SQL Server or MySQL, use its import utilities to move the Access database contents.

The structure and contents of the Ch8_SaleCo database are shown in Figure P8.15. Use this database to answer the following problems. Save each query as QXX, where XX is the problem number.
FIGURE P8.15 The Ch8_SaleCo database

Table name: CUSTOMER

CUS_CODE  CUS_LNAME  CUS_FNAME  CUS_INITIAL  CUS_AREACODE  CUS_PHONE  CUS_BALANCE
10010     Ramas      Alfred     A            0181          844-2573   0.00
10011     Dunne      Leona      K            0161          894-1238   0.00
10012     Smith      Kathy      W            0181          894-2285   345.86
10013     Pieterse   Jaco       F            0181          894-2180   536.75
10014     Orlando    Myron                   0181          222-1672   0.00
10015     O'Brian    Amy        B            0161          442-3381   0.00
10016     Brown      James      G            0181          297-1228   221.19
10017     Williams   George                  0181          290-2556   768.93
10018     Farriss    Anne       G            0161          382-7185   216.55
10019     Smith      Olette     K            0181          297-3809   0.00
Table name: VENDOR

V_CODE  V_NAME           V_CONTACT  V_AREACODE  V_PHONE   V_COUNTRY  V_ORDER
21225   Bryson, Inc.     Smithson   0181        223-3234  UK         Y
21226   SuperLoo, Inc.   Flushing   0113        215-8995  SA         N
21231   D&E Supply       Singh      0181        228-3245  UK         Y
21344   Jabavu Bros.     Khumalo    0181        889-2546  UK         N
22567   Dome Supply      Smith      7253        678-1419  FR         N
23119   Randsets Ltd.    Anderson   7253        678-3998  FR         Y
24004   Brackman Bros.   Browning   0181        228-1410  UK         N
24288   ORDVA, Inc.      Hakford    0181        898-1234  UK         Y
25443   B&K, Inc.        Smith      0113        227-0093  SA         N
25501   Damal Supplies   Smythe     0181        890-3529  UK         N
25595   Rubicon Systems  Du Toit    0113        456-0092  SA         Y
Table name: PRODUCT

P_CODE    P_DESCRIPT                               P_INDATE   P_QOH  P_MIN  P_PRICE  P_DISCOUNT  V_CODE
11QER/31  Power painter, 15 psi., 3-nozzle         03-Nov-18  8      5      109.99   0.00        25595
13-Q2/P2  7.25 cm pwr. saw blade                   13-Dec-18  32     15     14.99    0.05        21344
14-Q1/L3  9.00 cm pwr. saw blade                   13-Nov-18  18     12     17.49    0.00        21344
1546-QQ2  Hrd. cloth, 1/4 cm, 2x50                 15-Jan-19  15     8      39.95    0.00        23119
1558-QW1  Hrd. cloth, 1/2 cm, 3x50                 15-Jan-19  23     5      43.99    0.00        23119
2232/QTY  B&D jigsaw, 12 cm blade                  30-Dec-18  8      5      109.92   0.05        24288
2232/QWE  B&D jigsaw, 8 cm blade                   24-Dec-18  6      5      99.87    0.05        24288
2238/QPD  B&D cordless drill, 1/2 cm               20-Jan-19  12     5      38.95    0.05        25595
23109-HB  Claw hammer                              20-Jan-19  23     10     9.95     0.10        21225
23114-AA  Sledge hammer, 12 kg                     02-Jan-19  8      5      14.40    0.05
54778-2T  Rat-tail file, 1/8 cm fine               15-Dec-18  43     20     4.99     0.00        21344
89-WRE-Q  Hicut chain saw, 16 cm                   07-Feb-19  11     5      256.99   0.05        24288
PVC23DRT  PVC pipe, 3.5 cm, 8 m                    20-Feb-19  188    75     5.87     0.00
SM-18277  1.25 cm metal screw, 25                  01-Mar-19  172    75     6.99     0.00        21225
SW-23116  2.5 cm wd. screw, 50                     24-Feb-19  237    100    8.45     0.00        21231
WR3/TT3   Steel matting, 4 x 8 x 1/6 m, .5 m mesh  17-Jan-19  18     5      119.95   0.10        25595
Table name: INVOICE

INV_NUMBER  CUS_CODE  INV_DATE
1001        10014     16-Mar-19
1002        10011     16-Mar-19
1003        10012     16-Mar-19
1004        10011     17-Mar-19
1005        10018     17-Mar-19
1006        10014     17-Mar-19
1007        10015     17-Mar-19
1008        10011     17-Mar-19

Table name: LINE

INV_NUMBER  LINE_NUMBER  P_CODE    LINE_UNITS  LINE_PRICE
1001        1            13-Q2/P2  1           14.99
1001        2            23109-HB  1           9.95
1002        1            54778-2T  2           4.99
1003        1            2238/QPD  1           38.95
1003        2            1546-QQ2  1           39.95
1003        3            13-Q2/P2  5           14.99
1004        1            54778-2T  3           4.99
1004        2            23109-HB  2           9.95
1005        1            PVC23DRT  12          5.87
1006        1            SM-18277  3           6.99
1006        2            2232/QTY  1           109.92
1006        3            23109-HB  1           9.95
1006        4            89-WRE-Q  1           256.99
1007        1            13-Q2/P2  2           14.99
1007        2            54778-2T  1           4.99
1008        1            PVC23DRT  5           5.87
1008        2            WR3/TT3   3           119.95
1008        3            23109-HB  1           9.95
16
Write a query to count the number of invoices.
17
Write a query to count the number of customers with a customer balance over 500.
18
Generate a listing of all purchases made by the customers, using the output shown in Figure P8.16 as your guide. (Hint: Use the ORDER BY clause to order the resulting rows as shown in Figure P8.16.)
FIGURE P8.16 Problem 18 query results
CUS_CODE  INV_NUMBER  INV_DATE   P_DESCRIPT                               LINE_UNITS  LINE_PRICE
10011     1002        16-Mar-19  Rat-tail file, 1/8 cm fine               2           4.99
10011     1004        17-Mar-19  Claw hammer                              2           9.95
10011     1004        17-Mar-19  Rat-tail file, 1/8 cm fine               3           4.99
10011     1008        17-Mar-19  Claw hammer                              1           9.95
10011     1008        17-Mar-19  PVC pipe, 3.5 cm, 8 m                    5           5.87
10011     1008        17-Mar-19  Steel matting, 4 × 8 × 1/6 m, .5 m mesh  3           119.95
10012     1003        16-Mar-19  7.25 cm pwr. saw blade                   5           14.99
10012     1003        16-Mar-19  B&D cordless drill, 1/2 cm               1           38.95
10012     1003        16-Mar-19  Hrd. cloth, 1/4 cm, 2 × 50               1           39.95
10014     1001        16-Mar-19  7.25 cm pwr. saw blade                   1           14.99
10014     1001        16-Mar-19  Claw hammer                              1           9.95
10014     1006        17-Mar-19  1.25 cm metal screw, 25                  3           6.99
10014     1006        17-Mar-19  B&D jigsaw, 12 cm blade                  1           109.92
10014     1006        17-Mar-19  Claw hammer                              1           9.95
10014     1006        17-Mar-19  Hicut chain saw, 16 cm                   1           256.99
10015     1007        17-Mar-19  7.25 cm pwr. saw blade                   2           14.99
10015     1007        17-Mar-19  Rat-tail file, 1/8 cm fine               1           4.99
10018     1005        17-Mar-19  PVC pipe, 3.5 cm, 8 m                    12          5.87
19
Using the output shown in Figure P8.17 as your guide, generate the listing of customer purchases, including the subtotals for each of the invoice line numbers. (Hint: Modify the query format used to produce the listing of customer purchases in Problem 18, delete the INV_DATE column, and add the derived (computed) attribute LINE_UNITS * LINE_PRICE to calculate the subtotals.)
FIGURE P8.17 Problem 19 query results
CUS_CODE  INV_NUMBER  P_DESCRIPT                               Units Bought  Unit Price  Subtotal
10011     1002        Rat-tail file, 1/8 cm fine               2             4.99        9.98
10011     1004        Claw hammer                              2             9.95        19.90
10011     1004        Rat-tail file, 1/8 cm fine               3             4.99        14.97
10011     1008        Claw hammer                              1             9.95        9.95
10011     1008        PVC pipe, 3.5 cm, 8 m                    5             5.87        29.35
10011     1008        Steel matting, 4 × 8 × 1/6 m, .5 m mesh  3             119.95      359.85
10012     1003        7.25 cm pwr. saw blade                   5             14.99       74.95
10012     1003        B&D cordless drill, 1/2 cm               1             38.95       38.95
10012     1003        Hrd. cloth, 1/4 cm, 2 × 50               1             39.95       39.95
10014     1001        7.25 cm pwr. saw blade                   1             14.99       14.99
10014     1001        Claw hammer                              1             9.95        9.95
10014     1006        1.25 cm metal screw, 25                  3             6.99        20.97
10014     1006        B&D jigsaw, 12 cm blade                  1             109.92      109.92
10014     1006        Claw hammer                              1             9.95        9.95
10014     1006        Hicut chain saw, 16 cm                   1             256.99      256.99
10015     1007        7.25 cm pwr. saw blade                   2             14.99       29.98
10015     1007        Rat-tail file, 1/8 cm fine               1             4.99        4.99
10018     1005        PVC pipe, 3.5 cm, 8 m                    12            5.87        70.44
20
Modify the query used in Problem 19 to produce the summary shown in Figure P8.18.
FIGURE P8.18 Customer purchase summary
CUS_CODE  CUS_BALANCE  Total Purchases
10011     0.00         444.00
10012     345.86       153.85
10014     0.00         422.77
10015     0.00         34.97
10018     216.55       70.44
21
Modify the query in Problem 20 to include the number of individual product purchases made by each customer. (In other words, if the customer's invoice is based on three products, one per LINE_NUMBER, you would count three product purchases. If you examine the original invoice data, you will note that customer 10011 generated three invoices, which contained a total of six lines, each representing a product purchase.) Your output values must match those shown in Figure P8.19.
FIGURE P8.19 Customer total purchase amounts and number of purchases
CUS_CODE  CUS_BALANCE  Total Purchases  Number of Purchases
10011     0.00         444.00           6
10012     345.86       153.85           3
10014     0.00         422.77           6
10015     0.00         34.97            2
10018     216.55       70.44            1
22
Use a query to compute the average purchase amount per product made by each customer. (Hint: Use the results of Problem 21 as the basis for this query.) Your output values must match those shown in Figure P8.20. Note that the average purchase amount is equal to the total purchases divided by the number of purchases.
FIGURE P8.20 Average purchase amount by customer
CUS_CODE  CUS_BALANCE  Total Purchases  Number of Purchases  Average Purchase Amount
10011     0.00         444.00           6                    74.00
10012     345.86       153.85           3                    51.28
10014     0.00         422.77           6                    70.46
10015     0.00         34.97            2                    17.48
10018     216.55       70.44            1                    70.44
23
Create a query to produce the total purchase per invoice, generating the results shown in Figure P8.21. The invoice total is the sum of the product purchases in the LINE that corresponds to the INVOICE.
FIGURE P8.21 Invoice totals
INV_NUMBER  Invoice Total
1001        24.94
1002        9.98
1003        153.85
1004        34.87
1005        70.44
1006        397.83
1007        34.97
1008        399.15
24
Use a query to show the invoices and invoice totals as shown in Figure P8.22. (Hint: Group by the CUS_CODE.)
FIGURE P8.22 Invoice totals by customer
CUS_CODE  INV_NUMBER  Invoice Total
10011     1002        9.98
10011     1004        34.87
10011     1008        399.15
10012     1003        153.85
10014     1001        24.94
10014     1006        397.83
10015     1007        34.97
10018     1005        70.44
25
Write a query to produce the number of invoices and the total purchase amounts by customer, using the output shown in Figure P8.23 as your guide. (Compare this summary to the results shown in Problem 24.)
FIGURE P8.23 Number of invoices and total purchase amounts by customer
CUS_CODE  Number of Invoices  Total Customer Purchases
10011     3                   444.00
10012     1                   153.85
10014     2                   422.77
10015     1                   34.97
10018     1                   70.44
26
Using the query results in Problem 25 as your basis, write a query to generate the total number of invoices, the invoice total for all of the invoices, the smallest invoice amount, the largest invoice amount and the average of all of the invoices. (Hint: Check the figure output in Problem 25.) Your output must match Figure P8.24.
FIGURE P8.24 Number of invoices; invoice totals; minimum, maximum and average sales
Total # of Invoices  Total Sales  Minimum Sale  Largest Sale  Average Sale
8                    1126.03      34.97         444.00        225.21
27
List the balance characteristics of the customers who have made purchases during the current invoice cycle; that is, for the customers who appear in the INVOICE table. The results of this query are shown in Figure P8.25.
FIGURE P8.25 Balances of customers who made purchases
CUS_CODE  CUS_BALANCE
10011     0.00
10012     345.86
10014     0.00
10015     0.00
10018     216.55
28
Using the results of the query created in Problem 27, provide a summary of customer balance characteristics as shown in Figure P8.26.
FIGURE P8.26 Balance summary for customers who made purchases
Minimum Balance  Maximum Balance  Average Balance
0.00             345.86           112.48
29
Create a query to find the customer balance characteristics for all customers, including the total of the outstanding balances. The results of this query are shown in Figure P8.27.
FIGURE P8.27 Balance summary for all customers
Total Balance  Minimum Balance  Maximum Balance  Average Balance
2089.28        0.00             768.93           208.93
30
Find the listing of customers who did not make purchases during the invoicing period. Your output must match the output shown in Figure P8.28.
FIGURE P8.28 Balances of customers who did not make purchases
CUS_CODE  CUS_BALANCE
10010     0.00
10013     536.75
10016     221.19
10017     768.93
10019     0.00
31
Find the customer balance summary for all customers who have not made purchases during the current invoicing period. The results are shown in Figure P8.29.
FIGURE P8.29 Balance summary for customers who did not make purchases
Total Balance  Minimum Balance  Maximum Balance  Average Balance
1526.87        0.00             768.93           305.37
32
Create a query to produce the summary of the value of products currently in inventory. Note that the value of each product is produced by the multiplication of the units currently in inventory and the unit price. Use the ORDER BY clause to match the order shown in Figure P8.30.
FIGURE P8.30 Value of products currently in inventory
P_DESCRIPT                               P_QOH  P_PRICE  Subtotal
Power painter, 15 psi., 3-nozzle         8      109.99   879.92
7.25 cm pwr. saw blade                   32     14.99    479.68
9.00 cm pwr. saw blade                   18     17.49    314.82
Hrd. cloth, 1/4 cm, 2 × 50               15     39.95    599.25
Hrd. cloth, 1/2 cm, 3 × 50               23     43.99    1011.77
B&D jigsaw, 12 cm blade                  8      109.92   879.36
B&D jigsaw, 8 cm blade                   6      99.87    599.22
B&D cordless drill, 1/2 cm               12     38.95    467.40
Claw hammer                              23     9.95     228.85
Sledge hammer, 12 kg                     8      14.40    115.20
Rat-tail file, 1/8 cm fine               43     4.99     214.57
Hicut chain saw, 16 cm                   11     256.99   2826.89
PVC pipe, 3.5 cm, 8 m                    188    5.87     1103.56
1.25 cm metal screw, 25                  172    6.99     1202.28
2.5 cm wd. screw, 50                     237    8.45     2002.65
Steel matting, 4 × 8 × 1/6 m, .5 m mesh  18     119.95   2159.10
33
Using the results of the query created in Problem 32, find the total value of the product inventory. The results are shown in Figure P8.31.
FIGURE P8.31 Total value of all products in inventory
Total value of inventory
15084.52
Online Content Problems 34-42 are based on the 'Ch8_ThemePark' database located on the online platform for this book. This database is stored in Microsoft Access format. If you use another DBMS such as Oracle, SQL Server or MySQL, use its import utilities to move the Access database contents.
The structure and contents of the Ch8_ThemePark database are shown in Figure P8.32. Use this database to answer the following problems. Save each query as QXX, where XX is the problem number.
FIGURE P8.32 The Ch8_ThemePark database
Table name: THEMEPARK
PARK_CODE  PARK_NAME      PARK_CITY     PARK_COUNTRY
FR1001     FairyLand      PARIS         FR
NL1202     Efling         NOORD         NL
SP4533     AdventurePort  BARCELONA     SP
SW2323     Labyrinthe     LAUSANNE      SW
UK2622     MiniLand       WINDSOR       UK
UK3452     PleasureLand   STOKE         UK
ZA1342     GoldTown       JOHANNESBURG  ZA
Table name: TICKET
TICKET_NO  TICKET_PRICE  TICKET_TYPE  PARK_CODE
4668       24.99         Adult        SP4533
13001      14.99         Child        FR1001
13002      34.99         Adult        FR1001
13003      34.99         Adult        FR1001
18721      14.99         Child        FR1001
18722      14.99         Child        FR1001
18723      20.99         Senior       FR1001
18724      34.99         Adult        FR1001
32450      24.99         Adult        SP4533
45767      24.99         Adult        SP4533
67832      18.56         Child        ZA1342
67833      28.67         Adult        ZA1342
67855      18.56         Child        ZA1342
88567      22.50         Child        UK3452
88568      42.10         Adult        UK3452
89720      22.50         Child        UK3452
89723      22.50         Child        UK3452
89725      22.50         Child        UK3452
89728      42.10         Adult        UK3452

Table name: ATTRACTION
ATTRACT_NO  ATTRACT_NAME     ATTRACT_AGE  ATTRACT_CAPACITY  PARK_CODE
10034       ThunderCoaster   11           34                FR1001
10056       SpinningTeacups  4            62                FR1001
10067       FlightToStars    11           24                FR1001
10078       Ant-Trap         23           30                FR1001
10098       Carnival         3            120               FR1001
20056       3D-Lego_Show     3            200               UK2622
30011       BlackHole2       12           34                UK3452
30012       Pirates          10           42                UK3452
30044       UnderSeaWord     4            80                UK3452
98764       GoldRush         5            80                ZA1342

Table name: HOURS
EMP_NUM  ATTRACT_NO  HOURS_PER_ATTRACT  HOUR_RATE  DATE_WORKED
100      10034       6                  6.5        18/05/2019
100      10034       6                  6.5        20/05/2019
101      10034       6                  6.5        18/05/2019
102      30012       3                  5.99       23/05/2019
102      30044       6                  5.99       22/05/2019
102      30044       3                  5.99       23/05/2019
104      30011       6                  7.2        21/05/2019
104      30012       6                  7.2        22/05/2019
105      10078       3                  8.5        18/05/2019
105      10098       3                  8.5        18/05/2019
105      10098       6                  8.5        19/05/2019

Table name: EMPLOYEE
EMP_NUM  EMP_TITLE  EMP_LNAME   EMP_FNAME  EMP_DOB    EMP_HIRE_DATE  EMP_AREACODE  EMP_PHONE
100      Ms         Calderdale  Emma       15-Jun-82  15-Mar-02      0181          324-9134
101      Ms         Ricardo     Marshel    19-Mar-88  25-Apr-06      0181          324-4472
102      Mr         Arshad      Arif       14-Nov-79  20-Dec-00      7253          675-8993
103      Ms         Roberts     Anne       16-Oct-84  16-Aug-04      0181          898-3456
104      Mr         Denver      Enrica     08-Nov-90  20-Oct-11      7253          504-4434
105      Ms         Namowa      Mirrelle   14-Mar-00  08-Nov-16      0181          890-3243
106      Mrs        Smith       Gemma      12-Feb-78  05-Jan-99      0181          324-7845
34
Write the SQL code which lists all the attractions in each theme park.
35
Write the SQL code to display the attraction name and the capacity for all attractions in the theme park FairyLand. The results are shown in Figure P8.33.
FIGURE P8.33 Attractions and their capacities in theme park FairyLand
ATTRACT_NAME     ATTRACT_CAPACITY
ThunderCoaster   34
SpinningTeacups  62
FlightToStars    24
Ant-Trap         30
Carnival         120
36
Using the output in Figure P8.34 as your guide, display the total number of hours worked by each employee on each attraction.
FIGURE P8.34 Number of hours worked on each attraction by employees
EMP_FNAME  EMP_LNAME   ATTRACT_NAME    SumOfHOURS_PER_ATTRACT
Arif       Arshad      Pirates         3
Arif       Arshad      UnderSeaWord    9
Emma       Calderdale  ThunderCoaster  12
Enrica     Denver      BlackHole2      6
Enrica     Denver      Pirates         6
Marshel    Ricardo     ThunderCoaster  6
Mirrelle   Namowa      Ant-Trap        3
Mirrelle   Namowa      Carnival        9
37
Write a query which shows the total price of all adult tickets sold at all theme parks. Label the total price column as Total Adult Ticket Sales and round up the total price to two decimal places. The results of this query are shown in Figure P8.35.
FIGURE P8.35 Total adult ticket sales in each theme park
PARK_NAME      Total Adult Ticket Sales
AdventurePort  74.97
FairyLand      104.97
GoldTown       28.67
PleasureLand   84.20
38
Write a query to show the last names, area codes and phone numbers of all employees who worked on 18 May 2019. Your query should output the rows shown in Figure P8.36.
FIGURE P8.36 Employees who worked on 18 May 2019
EMP_LNAME   EMP_AREACODE  EMP_PHONE  ATTRACT_NAME
Calderdale  0181          324-9134   ThunderCoaster
Ricardo     0181          324-4472   ThunderCoaster
Namowa      0181          890-3243   Carnival
Namowa      0181          890-3243   Ant-Trap
39
Using Figure P8.37 as a guide, show the number of tickets sold at each theme park.
FIGURE P8.37 Total tickets sold at each theme park
PARK_NAME      TOTAL_TICKETS_SOLD
AdventurePort  3
FairyLand      7
GoldTown       3
PleasureLand   6
40
Write a query to show the details of all employees who have not worked on any attractions as shown in Figure P8.38.
FIGURE P8.38 Employees who have not worked on any attractions
EMP_LNAME  EMP_FNAME  EMP_DOB    EMP_HIRE_DATE  EMP_AREACODE  EMP_PHONE
Roberts    Anne       16-Oct-84  16-Aug-04      0181          898-3456
Smith      Gemma      12-Feb-78  05-Jan-99      0181          324-7845
41
Write a query that will list the length of service in years of each employee. Sample output is shown in Figure P8.39 when this query was run on 5 February 2019. Remember, your output will be different.
FIGURE P8.39 The length of service of each employee
EMP_NUM  EMP_LNAME   EMP_FNAME  EMP_HIRE_DATE  EMP_DOB    Length_of_Service
100      Calderdale  Emma       15-Mar-02      15-Jun-82  14
101      Ricardo     Marshel    25-Apr-06      19-Mar-88  10
102      Arshad      Arif       20-Dec-00      14-Nov-79  16
103      Roberts     Anne       16-Aug-04      16-Oct-84  12
104      Denver      Enrica     20-Oct-11      08-Nov-90  5
105      Namowa      Mirrelle   08-Nov-16      14-Mar-00  0
106      Smith       Gemma      05-Jan-99      12-Feb-78  18
42
Write the SQL code that will produce a VIEW named EMP_PARIS, containing all the information of employees who work in PARIS.
Chapter 9 Procedural Language SQL and Advanced SQL
In this chapter, you will learn:
About the relational set operators UNION, UNION ALL, INTERSECT and MINUS
How to use the advanced SQL JOIN operator syntax
About the different types of subqueries and correlated queries
How to use SQL functions to manipulate dates, strings and other data
How to create and use updatable views
How to use Procedural Language SQL (PL/SQL) to create triggers, stored procedures and PL/SQL functions
How to create embedded SQL
Preview
In Chapter 8, Beginning Structured Query Language, you learnt the basic SQL data definition and data manipulation commands used to create and manipulate relational data. In this chapter, you build on what you learnt in Chapter 8 and learn how to use more advanced SQL features.
In this chapter, you will learn about the SQL relational set operators (UNION, INTERSECT and MINUS) and how those operators are used to merge the results of multiple queries. Joins are at the heart of SQL. Therefore, you need to learn how to use the SQL JOIN statement to extract information from multiple tables. In the previous chapter, you learnt how cascading queries inside other queries can be useful in certain circumstances. In this chapter, you will also learn about the different styles of sub-queries that can be implemented in a SELECT statement. Finally, you will learn more of SQL's many functions to extract information from data, including manipulation of dates and strings as well as computations based on stored or even derived data.
In the real world, business procedures require the execution of clearly defined actions when a specific event occurs, such as the addition of a new invoice or a student's enrolment in a class. Such procedures can be applied within the DBMS through the use of triggers and stored procedures. In addition, SQL facilitates the application of business procedures when it is embedded in a programming language such as Visual Basic .NET, C# or Java.
Online Content Most of the examples used in this chapter are based on Oracle. If you want to see the examples in action, you need to load the required database tables. The SQL script files for creating the tables and loading the data in the database are located on the online platform for this book. How you connect to the Oracle database depends on how the Oracle software is installed on your server and on the access paths and methods defined and managed by the database administrator. Follow the instructions provided by your instructor or your college or university's technology department.
9.1 Relational Set Operators
In Chapter 3, Relational Model Characteristics, you learnt about the eight general relational operators. In this section, you will learn how to use SQL commands (UNION, INTERSECT and MINUS) to implement the union, intersection and difference relational operators.
In previous chapters, you learnt that SQL data manipulation commands are set-orientated; that is, they operate over entire sets of rows and columns (tables) at once. Using sets, you can combine two or more sets to create new sets (or relations). That's precisely what the UNION, INTERSECT and MINUS statements do. In relational database terms, you can use the words sets, relations and tables interchangeably because they all provide a conceptual view of the data set as it is presented to the relational database user.
Note
The SQL-2011 standard defines the operations that all DBMSs must perform on data, but it leaves the implementation details to the DBMS vendors. Therefore, some advanced SQL features may not work on all DBMS implementations. Also, some DBMS vendors may implement additional features not found in the SQL standard.
UNION, INTERSECT and MINUS are the names of the SQL statements implemented in Oracle. The SQL standard uses the keyword EXCEPT to refer to the difference (MINUS) relational operator. Other RDBMS vendors may use a different command name or might not implement a given command at all. For example, MySQL 8.0 supports the UNION operator, but releases before 8.0.31 do not support INTERSECT. To learn more about the ANSI/ISO SQL standards, check the ANSI website (www.ansi.org) to find out how to obtain the latest standard documents in electronic form.
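The difference in keywords can be sketched with two hypothetical one-column tables, T_A and T_B. These names are illustrative only and are not part of the book's databases. The first query uses the standard keyword and is assumed to run on a DBMS that implements EXCEPT; the second uses the Oracle spelling. Run whichever form your DBMS accepts.

```sql
-- Hypothetical tables, created only for this illustration.
CREATE TABLE T_A (N INTEGER);
CREATE TABLE T_B (N INTEGER);
INSERT INTO T_A VALUES (1);
INSERT INTO T_A VALUES (2);
INSERT INTO T_A VALUES (3);
INSERT INTO T_B VALUES (2);

-- SQL standard keyword: rows of T_A that do not appear in T_B.
SELECT N FROM T_A
EXCEPT
SELECT N FROM T_B;

-- Oracle keyword for the same difference operation.
SELECT N FROM T_A
MINUS
SELECT N FROM T_B;
```

In both forms the value 2 is removed because it appears in T_B, leaving the values 1 and 3.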
UNION, INTERSECT and MINUS work properly only if relations are union-compatible. In SQL terms, union-compatible means that the names of the relation attributes should be the same and their data types must be identical. (The DBMS itself matches attributes by position, so each query must also return the same number of attributes.) In practice, some RDBMS vendors require the data types to be compatible but not necessarily exactly the same. For example, compatible data types are VARCHAR(35) and CHAR(15). In that case, both attributes store character (string) values; the only difference is the string size. Another example of compatible data types is NUMBER and SMALLINT. Both data types are used to store numeric values.
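Where a DBMS insists on matching data types, an explicit CAST is one way to line the columns up. The sketch below assumes two hypothetical tables, STAFF_A and STAFF_B, whose name columns are declared as VARCHAR(35) and CHAR(15) respectively; neither table belongs to the book's databases, and the type names accepted inside CAST vary between DBMS products.

```sql
-- Hypothetical tables with compatible but not identical column types.
CREATE TABLE STAFF_A (S_NAME VARCHAR(35));
CREATE TABLE STAFF_B (S_NAME CHAR(15));
INSERT INTO STAFF_A VALUES ('Ramas');
INSERT INTO STAFF_B VALUES ('Dunne');

-- Convert the CHAR(15) column so that both queries return VARCHAR(35).
SELECT S_NAME FROM STAFF_A
UNION
SELECT CAST(S_NAME AS VARCHAR(35)) FROM STAFF_B;
```

Many products would accept this UNION without the CAST; writing the conversion out simply makes the intended result type explicit.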
Note
Some DBMS products may require union-compatible tables to have identical data types.
9.1.1 UNION
Suppose that SaleCo has bought another company. SaleCo's management wants to make sure that the acquired company's customer list is properly merged with SaleCo's customer list. Because it is quite possible that some customers have purchased goods from both companies, the two lists may contain common customers. SaleCo's management wants to make sure that customer records are not duplicated when the two customer lists are merged. The UNION query is a perfect tool for generating a combined listing of customers, one that excludes duplicate records.
Online Content The 'Ch09_SaleCo' database used to illustrate the UNION commands is located on the online platform for this book.
The UNION statement combines rows from two or more queries without including duplicate rows. The syntax of the UNION statement is:
query UNION query
In other words, the UNION statement combines the output of two SELECT queries. (Remember that the SELECT statements must be union-compatible. That is, they must return the same number of attributes, ideally with the same names, and similar data types.)
To demonstrate the use of the UNION statement in SQL, let's use the CUSTOMER and CUSTOMER_2 tables in the Ch09_SaleCo database. To show the combined CUSTOMER and CUSTOMER_2 records without the duplicates, the UNION query is written as follows:
SELECT CUS_LNAME, CUS_FNAME, CUS_INITIAL, CUS_AREACODE, CUS_PHONE
FROM CUSTOMER
UNION
SELECT CUS_LNAME, CUS_FNAME, CUS_INITIAL, CUS_AREACODE, CUS_PHONE
FROM CUSTOMER_2;
Figure 9.1 shows the contents of the CUSTOMER and CUSTOMER_2 tables and the result of the UNION query.
FIGURE 9.1 UNION query results
Database name: Ch09_SaleCo
Table name: CUSTOMER
CUS_CODE  CUS_LNAME   CUS_FNAME  CUS_INITIAL  CUS_AREACODE  CUS_PHONE  CUS_BALANCE
10010     Ramas       Alfred     A            0181          844-2573   0.00
10011     Dunne       Leona      K            0161          894-1238   0.00
10012     Moloi       Marlene    W            0181          894-2285   345.86
10013     Pieterse    Jaco       F            0181          894-2180   536.75
10014     Orlando     Myron                   0181          222-1672   0.00
10015     O'Brian     Amy        B            0161          442-3381   0.00
10016     Brown       James      G            0181          297-1228   221.19
10017     Williams    George                  0181          290-2556   768.93
10018     Padayachee  Vinaya     G            0161          382-7185   216.55
10019     Moloi       Mlilo      K            0161          297-3809   0.00

Table name: CUSTOMER_2
CUS_CODE  CUS_LNAME  CUS_FNAME  CUS_INITIAL  CUS_AREACODE  CUS_PHONE
345       Terrell    Justine    H            0181          322-9870
347       Pieterse   Jaco       F            0181          894-2180
351       Hernandez  Carlos     J            8192          123-7654
352       McDowell   George                  8192          123-7768
365       Tirpin     Khaleed    G            8192          123-9876
368       Lewis      Marie      J            8192          332-1789
369       Dunne      Leona      K            0161          894-1238

Query: CUSTOMER UNION CUSTOMER_2
CUS_LNAME   CUS_FNAME  CUS_INITIAL  CUS_AREACODE  CUS_PHONE
Brown       James      G            0181          297-1228
Dunne       Leona      K            0161          894-1238
Padayachee  Vinaya     G            0161          382-7185
Hernandez   Carlos     J            8192          123-7654
Lewis       Marie      J            8192          332-1789
McDowell    George                  8192          123-7768
O'Brian     Amy        B            0161          442-3381
Pieterse    Jaco       F            0181          894-2180
Orlando     Myron                   0181          222-1672
Ramas       Alfred     A            0181          844-2573
Moloi       Marlene    W            0181          894-2285
Moloi       Mlilo      K            0161          297-3809
Terrell     Justine    H            0181          322-9870
Tirpin      Khaleed    G            8192          123-9876
Williams    George                  0181          290-2556
As you examine Figure 9.1, note the following:
The CUSTOMER table contains ten rows, while the CUSTOMER_2 table contains seven rows.
Customers Dunne and Pieterse are included in the CUSTOMER table as well as in the CUSTOMER_2 table.
The UNION query yields 15 records because the duplicate records of customers Dunne and Pieterse are not included. In short, the UNION query yields a unique set of records.
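One way to check the row count without reading through the figure is to wrap the UNION query in a derived table and count its rows. This is a sketch that assumes the Ch09_SaleCo tables are loaded; the alias COMBINED and the column label UNIQUE_CUSTOMERS are names chosen for this illustration.

```sql
-- Count the rows that survive duplicate elimination in the UNION.
SELECT COUNT(*) AS UNIQUE_CUSTOMERS
FROM (SELECT CUS_LNAME, CUS_FNAME, CUS_INITIAL, CUS_AREACODE, CUS_PHONE
      FROM CUSTOMER
      UNION
      SELECT CUS_LNAME, CUS_FNAME, CUS_INITIAL, CUS_AREACODE, CUS_PHONE
      FROM CUSTOMER_2) COMBINED;
```

With the data shown in Figure 9.1, the count should agree with the 15 records described above. The alias is written without the AS keyword because Oracle does not accept AS before a table alias.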
Note
You were first introduced to the UNION operator in Chapter 4, Relational Algebra and Calculus, when you learnt how to combine all tuples from two relations. We could therefore write the SQL query:
SELECT CUS_LNAME, CUS_FNAME, CUS_INITIAL, CUS_AREACODE, CUS_PHONE
FROM CUSTOMER
UNION
SELECT CUS_LNAME, CUS_FNAME, CUS_INITIAL, CUS_AREACODE, CUS_PHONE
FROM CUSTOMER_2;
as the following relational algebra statement:
$$\Pi_{\text{CUS\_LNAME, CUS\_FNAME, CUS\_INITIAL, CUS\_AREACODE, CUS\_PHONE}}(\text{CUSTOMER}) \cup \Pi_{\text{CUS\_LNAME, CUS\_FNAME, CUS\_INITIAL, CUS\_AREACODE, CUS\_PHONE}}(\text{CUSTOMER\_2})$$
Note
The SQL standard calls for the elimination of duplicate rows when the UNION SQL statement is used. However, some DBMS vendors may not adhere to that standard. Check your DBMS manual to see if the UNION statement is supported and, if so, how it is supported. For example, the latest version of MySQL 8.0 and Oracle 18c both support the UNION SQL statement.
The UNION statement can be used to unite more than just two queries. For example, assume that you have four union-compatible queries named T1, T2, T3 and T4. With the UNION statement, you can combine the output of all four queries into a single result set. The SQL statement will be similar to this:
SELECT column-list FROM T1
UNION
SELECT column-list FROM T2
UNION
SELECT column-list FROM T3
UNION
SELECT column-list FROM T4;
9.1.2 UNION ALL
If SaleCo's management wants to know how many customers are on both the CUSTOMER and CUSTOMER_2 lists, a UNION ALL query can be used to produce a relation that retains the duplicate rows. Therefore, the following query will keep all rows from both queries (including the duplicate rows) and return 17 rows.
SELECT CUS_LNAME, CUS_FNAME, CUS_INITIAL, CUS_AREACODE, CUS_PHONE
FROM CUSTOMER
UNION ALL
SELECT CUS_LNAME, CUS_FNAME, CUS_INITIAL, CUS_AREACODE, CUS_PHONE
FROM CUSTOMER_2;

Running the preceding UNION ALL query produces the result shown in Figure 9.2.

FIGURE 9.2 UNION ALL query results
Database name: Ch09_SaleCo

| CUS_LNAME | CUS_FNAME | CUS_INITIAL | CUS_AREACODE | CUS_PHONE |
| Ramas | Alfred | A | 0181 | 844-2573 |
| Dunne | Leona | K | 0161 | 894-1238 |
| Moloi | Marlene | W | 0181 | 894-2285 |
| Pieterse | Jaco | F | 0181 | 894-2180 |
| Orlando | Myron | | 0181 | 222-1672 |
| O'Brian | Amy | B | 0161 | 442-3381 |
| Brown | James | G | 0181 | 297-1228 |
| Williams | George | | 0181 | 290-2556 |
| Padayachee | Vinaya | G | 0161 | 382-7185 |
| Moloi | Mlilo | K | 0161 | 297-3809 |
| Terrell | Justine | H | 0181 | 322-9870 |
| Pieterse | Jaco | F | 0181 | 894-2180 |
| Hernandez | Carlos | J | 8192 | 123-7654 |
| McDowell | George | | 8192 | 123-7768 |
| Tirpin | Khaleed | G | 8192 | 123-9876 |
| Lewis | Marie | J | 8192 | 332-1789 |
| Dunne | Leona | K | 0161 | 894-1238 |

Like the UNION statement, the UNION ALL statement can be used to unite more than just two queries.
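The difference between UNION and UNION ALL can be checked on a small scale. The following is a minimal sketch, assuming Python's built-in sqlite3 module and a made-up three-row subset of each customer table (not the full Ch09_SaleCo data); the variable names are illustrative only.

```python
import sqlite3

# Rows are (CUS_LNAME, CUS_FNAME, CUS_INITIAL, CUS_AREACODE, CUS_PHONE).
# Assumption: a small illustrative subset, with Dunne and Pieterse in both tables.
CUSTOMER = [
    ("Ramas", "Alfred", "A", "0181", "844-2573"),
    ("Dunne", "Leona", "K", "0161", "894-1238"),
    ("Pieterse", "Jaco", "F", "0181", "894-2180"),
]
CUSTOMER_2 = [
    ("Terrell", "Justine", "H", "0181", "322-9870"),
    ("Pieterse", "Jaco", "F", "0181", "894-2180"),
    ("Dunne", "Leona", "K", "0161", "894-1238"),
]

conn = sqlite3.connect(":memory:")
for name, rows in (("CUSTOMER", CUSTOMER), ("CUSTOMER_2", CUSTOMER_2)):
    conn.execute(
        f"CREATE TABLE {name} (CUS_LNAME TEXT, CUS_FNAME TEXT, "
        "CUS_INITIAL TEXT, CUS_AREACODE TEXT, CUS_PHONE TEXT)"
    )
    conn.executemany(f"INSERT INTO {name} VALUES (?, ?, ?, ?, ?)", rows)

COLS = "CUS_LNAME, CUS_FNAME, CUS_INITIAL, CUS_AREACODE, CUS_PHONE"

# UNION removes the two duplicate rows; UNION ALL keeps them.
union_rows = conn.execute(
    f"SELECT {COLS} FROM CUSTOMER UNION SELECT {COLS} FROM CUSTOMER_2"
).fetchall()
union_all_rows = conn.execute(
    f"SELECT {COLS} FROM CUSTOMER UNION ALL SELECT {COLS} FROM CUSTOMER_2"
).fetchall()

print(len(union_rows), len(union_all_rows))  # 4 6
```

With six input rows and two duplicates, UNION returns four rows and UNION ALL returns six, which mirrors the 15 versus 17 row counts reported for the full tables.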
9.1.3 INTERSECT
If SaleCo's management wants to know which customer records are duplicated in the CUSTOMER and CUSTOMER_2 tables, the INTERSECT statement can be used to combine rows from two queries, returning only the rows that appear in both sets. The syntax for the INTERSECT statement is:

query INTERSECT query

To generate the list of duplicate customer records, you can use:

SELECT CUS_LNAME, CUS_FNAME, CUS_INITIAL, CUS_AREACODE, CUS_PHONE
FROM CUSTOMER
INTERSECT
SELECT CUS_LNAME, CUS_FNAME, CUS_INITIAL, CUS_AREACODE, CUS_PHONE
FROM CUSTOMER_2;
Note: The SQL query you have just seen can be written using the relational algebra INTERSECT operator as follows:

Π CUS_LNAME, CUS_FNAME, CUS_INITIAL, CUS_AREACODE, CUS_PHONE (CUSTOMER) ∩ Π CUS_LNAME, CUS_FNAME, CUS_INITIAL, CUS_AREACODE, CUS_PHONE (CUSTOMER_2)

The INTERSECT statement can be used to generate additional useful customer information. For example, the following query returns the customer codes for all customers who are located in area code 0181 and who have made purchases. (If a customer has made a purchase, there must be an invoice record for that customer.)

SELECT CUS_CODE
FROM CUSTOMER
WHERE CUS_AREACODE = '0181'
INTERSECT
SELECT DISTINCT CUS_CODE
FROM INVOICE;

Figure 9.3 shows both sets of SQL statements and their output.
Note: Microsoft Access does not support the INTERSECT query, nor does it support other complex queries you'll explore in this chapter. At least, in some cases, you might be able to use an alternative query format or procedure to give you the desired results. For example, although Access does not support SQL triggers and stored procedures, you can use Visual Basic code to perform similar actions. However, the objective here is to show you how to use some important standard SQL features.

FIGURE 9.3 INTERSECT query results
[Figure: the two INTERSECT queries and their output]
9.1.4 MINUS
The MINUS statement in SQL combines rows from two queries and returns only the rows that appear in the first set but not in the second. The syntax for the MINUS statement is:

query MINUS query

For example, if the SaleCo managers want to know what customers in the CUSTOMER table are not found in the CUSTOMER_2 table, they can use:

SELECT CUS_LNAME, CUS_FNAME, CUS_INITIAL, CUS_AREACODE, CUS_PHONE
FROM CUSTOMER
MINUS
SELECT CUS_LNAME, CUS_FNAME, CUS_INITIAL, CUS_AREACODE, CUS_PHONE
FROM CUSTOMER_2;

If the managers want to know which customers in the CUSTOMER_2 table are not found in the CUSTOMER table, they merely switch the table designations:

SELECT CUS_LNAME, CUS_FNAME, CUS_INITIAL, CUS_AREACODE, CUS_PHONE
FROM CUSTOMER_2
MINUS
SELECT CUS_LNAME, CUS_FNAME, CUS_INITIAL, CUS_AREACODE, CUS_PHONE
FROM CUSTOMER;
You can extract much useful information by combining MINUS with various clauses such as WHERE. For example, the following query returns the customer codes for all customers located in area code 0181 minus the ones who have made purchases, leaving the customers in area code 0181 who have not made purchases.

SELECT CUS_CODE
FROM CUSTOMER
WHERE CUS_AREACODE = '0181'
MINUS
SELECT DISTINCT CUS_CODE
FROM INVOICE;

Figure 9.4 shows the preceding three SQL statements and their output.

FIGURE 9.4 MINUS query results
[Figure: the three MINUS queries and their output]
Note: Some DBMS products do not support the INTERSECT or MINUS statements, while others may implement the difference relational operator in SQL as EXCEPT. Consult your DBMS manual to see if the statements illustrated here are supported by your DBMS. For example, at the time of writing, the current version of MySQL does not support the INTERSECT and MINUS statements.

9.1.5 Syntax alternatives
If your DBMS doesn't support the INTERSECT or MINUS statements, you can use IN and NOT IN subqueries to obtain similar results. For example, the following query produces the same results as the INTERSECT query shown in Section 9.1.3.

SELECT CUS_CODE
FROM CUSTOMER
WHERE CUS_AREACODE = '0181'
AND CUS_CODE IN (SELECT DISTINCT CUS_CODE FROM INVOICE);

Figure 9.5 shows the use of the INTERSECT alternative.

FIGURE 9.5 INTERSECT alternative
Database name: Ch09_SaleCo

Table name: CUSTOMER

| CUS_CODE | CUS_LNAME | CUS_FNAME | CUS_INITIAL | CUS_AREACODE | CUS_PHONE | CUS_BALANCE |
| 10010 | Ramas | Alfred | A | 0181 | 844-2573 | 0.00 |
| 10011 | Dunne | Leona | K | 0161 | 894-1238 | 0.00 |
| 10012 | Moloi | Marlene | W | 0181 | 894-2285 | 345.86 |
| 10013 | Pieterse | Jaco | F | 0181 | 894-2180 | 536.75 |
| 10014 | Orlando | Myron | | 0181 | 222-1672 | 0.00 |
| 10015 | O'Brian | Amy | B | 0161 | 442-3381 | 0.00 |
| 10016 | Brown | James | G | 0181 | 297-1228 | 221.19 |
| 10017 | Williams | George | | 0181 | 290-2556 | 768.93 |
| 10018 | Padayachee | Vinaya | G | 0161 | 382-7185 | 216.55 |
| 10019 | Moloi | Mlilo | K | 0181 | 297-3809 | 0.00 |

Table name: INVOICE

| INV_NUMBER | CUS_CODE | INV_DATE |
| 1001 | 10014 | 16-Jan-19 |
| 1002 | 10011 | 16-Jan-19 |
| 1003 | 10012 | 16-Jan-19 |
| 1004 | 10011 | 17-Jan-19 |
| 1005 | 10018 | 17-Jan-19 |
| 1006 | 10014 | 17-Jan-19 |
| 1007 | 10015 | 17-Jan-19 |
| 1008 | 10011 | 17-Jan-19 |

Query result:

| CUS_CODE |
| 10012 |
| 10014 |
Note: Microsoft Access generates an input request for the CUS_AREACODE if you use apostrophes around the area code. (If you supply the 0181 area code, the query will execute properly.) To eliminate that problem, use standard double quotation marks, writing the WHERE clause in the second line of the preceding SQL statement as:

WHERE CUS_AREACODE = "0181" AND

Microsoft Access will also accept single quotation marks.

Using the same alternative to the MINUS statement, you can generate the output for the third MINUS query shown in Section 9.1.4 by using:

SELECT CUS_CODE
FROM CUSTOMER
WHERE CUS_AREACODE = '0181'
AND CUS_CODE NOT IN (SELECT DISTINCT CUS_CODE FROM INVOICE);

The results of that query are shown in Figure 9.6. Note that the query output includes only the customers in area code 0181 who have not made any purchases and, therefore, have not generated invoices.

FIGURE 9.6 MINUS alternative
Database name: Ch09_SaleCo
Table name: CUSTOMER and Table name: INVOICE (same contents as the tables shown in Figure 9.5)

Query result:

| CUS_CODE |
| 10010 |
| 10013 |
| 10016 |
| 10017 |
| 10019 |
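The equivalence between the set operators and their IN / NOT IN alternatives can be checked directly. The sketch below assumes Python's built-in sqlite3 module and loads only the CUS_CODE and CUS_AREACODE values and the invoice customer codes listed in Figure 9.5. Note that SQLite spells the difference operator EXCEPT rather than MINUS; the helper name `codes` is illustrative only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE CUSTOMER (CUS_CODE INTEGER PRIMARY KEY, CUS_AREACODE TEXT)")
conn.execute("CREATE TABLE INVOICE (INV_NUMBER INTEGER PRIMARY KEY, CUS_CODE INTEGER)")
conn.executemany("INSERT INTO CUSTOMER VALUES (?, ?)", [
    (10010, "0181"), (10011, "0161"), (10012, "0181"), (10013, "0181"),
    (10014, "0181"), (10015, "0161"), (10016, "0181"), (10017, "0181"),
    (10018, "0161"), (10019, "0181"),
])
conn.executemany("INSERT INTO INVOICE VALUES (?, ?)", [
    (1001, 10014), (1002, 10011), (1003, 10012), (1004, 10011),
    (1005, 10018), (1006, 10014), (1007, 10015), (1008, 10011),
])


def codes(sql):
    """Run a one-column query and return its values as a sorted list."""
    return sorted(row[0] for row in conn.execute(sql))


AREA = "SELECT CUS_CODE FROM CUSTOMER WHERE CUS_AREACODE = '0181'"
BUYERS = "SELECT DISTINCT CUS_CODE FROM INVOICE"

intersect_codes = codes(f"{AREA} INTERSECT {BUYERS}")
in_codes = codes(f"{AREA} AND CUS_CODE IN ({BUYERS})")
except_codes = codes(f"{AREA} EXCEPT {BUYERS}")  # EXCEPT stands in for MINUS here
not_in_codes = codes(f"{AREA} AND CUS_CODE NOT IN ({BUYERS})")

print(intersect_codes, in_codes)
print(except_codes, not_in_codes)
```

One caution when relying on the NOT IN form: if the subquery can return a NULL (for example, an invoice with no customer code), NOT IN yields no rows at all, whereas MINUS/EXCEPT is unaffected.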
9.2 SQL JOIN OPERATORS
The relational join operation merges rows from two tables and returns the rows with one of the following conditions:
- Have common values in common columns (natural join).
- Meet a given join condition (equality or inequality).
- Have common values in common columns or have no matching values (outer join).

In Chapter 8, Beginning Structured Query Language, you learnt how to use the SELECT statement in conjunction with the WHERE clause to join two or more tables. For example, you can join the PRODUCT and VENDOR tables through their common V_CODE by writing:

SELECT P_CODE, P_DESCRIPT, P_PRICE, V_NAME
FROM PRODUCT, VENDOR
WHERE PRODUCT.V_CODE = VENDOR.V_CODE;
The preceding SQL join syntax is sometimes referred to as an old-style join. Note that the FROM clause contains the tables being joined and that the WHERE clause contains the condition(s) used to join the tables.

As you examine the preceding query, note the following points:
- The FROM clause indicates which tables are to be joined. If three or more tables are included, the join operation takes place two tables at a time, starting from left to right. For example, if you are joining tables T1, T2 and T3, first table T1 is joined to T2; the results of that join are then joined to table T3.
- The join condition in the WHERE clause tells the SELECT statement which rows will be returned. In this case, the SELECT statement returns all rows for which the V_CODE values in the PRODUCT and VENDOR tables are equal.
- The number of join conditions is always equal to the number of tables being joined minus one. For example, if you join three tables (T1, T2 and T3), you will have two join conditions (j1 and j2). All join conditions are connected through an AND logical operator. The first join condition (j1) defines the join criteria for T1 and T2. The second join condition (j2) defines the join criteria for the output of the first join and T3.
- Generally, the join condition will be an equality comparison of the primary key in one table and the related foreign key in the second table.

Join operations can be classified as inner joins and outer joins. The inner join is the traditional join in which only rows that meet given criteria are selected. The join criteria can be an equality condition (natural join or equijoin) or an inequality condition (theta join). An outer join returns not only the matching rows, but also the rows with unmatched attribute values for one table or both tables to be joined. The SQL standard also introduces a special type of join that returns the same result as the Cartesian product of two sets or tables.

In this section, you will learn different ways to express join operations that meet the ANSI SQL standard (see Table 9.1). It is useful to remember that not all DBMS vendors provide the same level of SQL support and that some do not support the join styles shown in this section. Oracle 11g is used to demonstrate the use of the following queries. Refer to your DBMS manual if you are using a different DBMS.

TABLE 9.1 SQL join expression styles

| Join Classification | Join Type | SQL Syntax Example | Description |
| CROSS | CROSS JOIN | SELECT * FROM T1, T2 | Returns the Cartesian product of T1 and T2 (old style). |
| | | SELECT * FROM T1 CROSS JOIN T2 | Returns the Cartesian product of T1 and T2. |
| INNER | Old-Style JOIN | SELECT * FROM T1, T2 WHERE T1.C1=T2.C1 | Returns only the rows that meet the join condition in the WHERE clause (old style). Only rows with matching values are selected. |
| | NATURAL JOIN | SELECT * FROM T1 NATURAL JOIN T2 | Returns only the rows with matching values in the matching columns. The matching columns must have the same names and similar data types. |
| | JOIN USING | SELECT * FROM T1 JOIN T2 USING (C1) | Returns only the rows with matching values in the columns indicated in the USING clause. |
| | JOIN ON | SELECT * FROM T1 JOIN T2 ON T1.C1=T2.C1 | Returns only the rows that meet the join condition indicated in the ON clause. |
| OUTER | LEFT JOIN | SELECT * FROM T1 LEFT OUTER JOIN T2 ON T1.C1=T2.C1 | Returns rows with matching values and includes all rows from the left table (T1) with unmatched values. |
| | RIGHT JOIN | SELECT * FROM T1 RIGHT OUTER JOIN T2 ON T1.C1=T2.C1 | Returns rows with matching values and includes all rows from the right table (T2) with unmatched values. |
| | FULL JOIN | SELECT * FROM T1 FULL OUTER JOIN T2 ON T1.C1=T2.C1 | Returns rows with matching values and includes all rows from both tables (T1 and T2) with unmatched values. |
9.2.1 Cross Join
A cross join performs a relational product (also known as the Cartesian product) of two tables. The cross join syntax is:

SELECT column-list FROM table1 CROSS JOIN table2

For example,

SELECT * FROM INVOICE CROSS JOIN LINE;

performs a cross join of the INVOICE and LINE tables. That CROSS JOIN query generates 144 rows. (There were eight invoice rows and 18 line rows, thus yielding 8 × 18 = 144 rows.)

You can also perform a cross join that yields only specified attributes. For example, you can specify:

SELECT INVOICE.INV_NUMBER, CUS_CODE, INV_DATE, P_CODE
FROM INVOICE CROSS JOIN LINE;

The results generated through that SQL statement can also be generated by using the following syntax:

SELECT INVOICE.INV_NUMBER, CUS_CODE, INV_DATE, P_CODE
FROM INVOICE, LINE;
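The row count of a cross join is simply the product of the two table sizes. A minimal sketch of that arithmetic, assuming Python's built-in sqlite3 module and placeholder rows (eight invoices and 18 lines with made-up numbers, not the book's data):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE INVOICE (INV_NUMBER INTEGER)")
conn.execute("CREATE TABLE LINE (INV_NUMBER INTEGER, LINE_NUMBER INTEGER)")

# Assumption: placeholder rows; only the row counts (8 and 18) matter here.
conn.executemany("INSERT INTO INVOICE VALUES (?)", [(1001 + i,) for i in range(8)])
conn.executemany(
    "INSERT INTO LINE VALUES (?, ?)", [(1001 + i % 8, i + 1) for i in range(18)]
)

# Explicit CROSS JOIN and the old-style comma list produce the same product.
cross_count = conn.execute("SELECT COUNT(*) FROM INVOICE CROSS JOIN LINE").fetchone()[0]
comma_count = conn.execute("SELECT COUNT(*) FROM INVOICE, LINE").fetchone()[0]

print(cross_count, comma_count)  # 144 144
```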
9.2.2 Natural Join
Recall from Chapter 3, Relational Model Characteristics, that a natural join returns all rows with matching values in the matching columns and eliminates duplicate columns. That style of query is used when the tables share one or more common attributes with common names. The natural join syntax is:

SELECT column-list FROM table1 NATURAL JOIN table2

The natural join will perform the following tasks:
- Determine the common attribute(s) by looking for attributes with identical names and compatible data types.
- Select only the rows with common values in the common attribute(s).
- If there are no common attributes, return the relational product of the two tables.

The following example performs a natural join of the CUSTOMER and INVOICE tables and returns only selected attributes:

SELECT CUS_CODE, CUS_LNAME, INV_NUMBER, INV_DATE
FROM CUSTOMER NATURAL JOIN INVOICE;

The SQL code and its results are shown at the top of Figure 9.7.

FIGURE 9.7 NATURAL JOIN query results
[Figure: the two natural join queries and their output]
You are not limited to two tables. For example, you can perform a natural join of the INVOICE, LINE and PRODUCT tables and project only selected attributes by writing:

SELECT INV_NUMBER, P_CODE, P_DESCRIPT, LINE_UNITS, LINE_PRICE
FROM INVOICE NATURAL JOIN LINE NATURAL JOIN PRODUCT;

The SQL code and its results are shown at the bottom of Figure 9.7.
One important difference between the natural join and the old-style join syntax is that the natural join does not require the use of a table qualifier for the common attributes. In the first natural join example, you projected CUS_CODE, yet the projection did not require any table qualifier, even though the CUS_CODE attribute appeared in both the CUSTOMER and INVOICE tables. The same can be said of the INV_NUMBER attribute in the second natural join example.

9.2.3 JOIN USING Clause
A second way to express a join is through the USING keyword. This query returns only the rows with matching values in the column indicated in the USING clause, and that column must exist in both tables. The syntax is:

SELECT column-list FROM table1 JOIN table2 USING (common-column)

To see the JOIN USING query in action, let's perform a join of the INVOICE and LINE tables (followed by a join with PRODUCT) by writing:

SELECT INV_NUMBER, P_CODE, P_DESCRIPT, LINE_UNITS, LINE_PRICE
FROM INVOICE JOIN LINE USING (INV_NUMBER)
JOIN PRODUCT USING (P_CODE);

The SQL statement produces the results shown in Figure 9.8.

FIGURE 9.8 JOIN USING results
[Figure: the JOIN USING query and its output]
As was the case with the NATURAL JOIN command, the JOIN USING operand does not require table qualifiers. As a matter of fact, Oracle will return an error if you specify the table name in the USING clause.
9.2.4 JOIN ON Clause
The previous two join styles used common attribute names in the joining tables. Another way to express a join when the tables have no common attribute names is to use the JOIN ON operand. That query will return only the rows that meet the indicated join condition. The join condition will typically include an equality comparison expression of two columns. (The columns may or may not share the same name but, obviously, must have comparable data types.) The syntax is:

SELECT column-list FROM table1 JOIN table2 ON join-condition

The following example performs a join of the INVOICE and LINE tables, using the ON clause. The result is shown in Figure 9.9.

SELECT INVOICE.INV_NUMBER, PRODUCT.P_CODE, P_DESCRIPT, LINE_UNITS, LINE_PRICE
FROM INVOICE JOIN LINE ON INVOICE.INV_NUMBER = LINE.INV_NUMBER
JOIN PRODUCT ON LINE.P_CODE = PRODUCT.P_CODE;

FIGURE 9.9 JOIN ON results
[Figure: the JOIN ON query and its output]
Note that, unlike the NATURAL JOIN and the JOIN USING operands, the JOIN ON clause requires a table qualifier for the common attributes. If you do not specify the table qualifier, you will get a "column ambiguously defined" error message.
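The four inner-join styles discussed so far (old-style, NATURAL JOIN, JOIN USING and JOIN ON) should return the same rows when the tables share only the intended common columns. The following sketch checks that on a tiny made-up data set, assuming Python's built-in sqlite3 module; the rows and the simplified column lists are hypothetical and not taken from Ch09_SaleCo.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE INVOICE (INV_NUMBER INTEGER, CUS_CODE INTEGER);
    CREATE TABLE LINE (INV_NUMBER INTEGER, P_CODE TEXT, LINE_UNITS INTEGER);
    CREATE TABLE PRODUCT (P_CODE TEXT, P_DESCRIPT TEXT);
    INSERT INTO INVOICE VALUES (1001, 10014), (1002, 10011);
    INSERT INTO LINE VALUES (1001, 'P1', 1), (1001, 'P2', 2), (1002, 'P1', 3);
    INSERT INTO PRODUCT VALUES ('P1', 'Hammer'), ('P2', 'Saw');
""")


def run(sql):
    """Return the query result as a sorted list so the styles can be compared."""
    return sorted(conn.execute(sql).fetchall())


# NATURAL JOIN and JOIN USING need no table qualifiers on the common columns.
natural_rows = run(
    "SELECT INV_NUMBER, P_CODE, P_DESCRIPT, LINE_UNITS "
    "FROM INVOICE NATURAL JOIN LINE NATURAL JOIN PRODUCT"
)
using_rows = run(
    "SELECT INV_NUMBER, P_CODE, P_DESCRIPT, LINE_UNITS "
    "FROM INVOICE JOIN LINE USING (INV_NUMBER) JOIN PRODUCT USING (P_CODE)"
)
# JOIN ON and the old-style join must qualify the common columns.
on_rows = run(
    "SELECT INVOICE.INV_NUMBER, PRODUCT.P_CODE, P_DESCRIPT, LINE_UNITS "
    "FROM INVOICE JOIN LINE ON INVOICE.INV_NUMBER = LINE.INV_NUMBER "
    "JOIN PRODUCT ON LINE.P_CODE = PRODUCT.P_CODE"
)
# Three tables, therefore two join conditions connected with AND.
old_style_rows = run(
    "SELECT INVOICE.INV_NUMBER, PRODUCT.P_CODE, P_DESCRIPT, LINE_UNITS "
    "FROM INVOICE, LINE, PRODUCT "
    "WHERE INVOICE.INV_NUMBER = LINE.INV_NUMBER AND LINE.P_CODE = PRODUCT.P_CODE"
)

print(natural_rows == using_rows == on_rows == old_style_rows)
```

The natural join is the most fragile of the four: adding an unrelated column with the same name to two of the tables would silently change its result, while the USING and ON forms state the join columns explicitly.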
Keep in mind that the JOIN ON syntax lets you perform a join even when the tables do not share a common attribute name. For example, to generate a list of all employees with the managers' names, you can use the following (recursive) query:

SELECT E.EMP_MGR, M.EMP_LNAME, E.EMP_NUM, E.EMP_LNAME
FROM EMP E JOIN EMP M ON E.EMP_MGR = M.EMP_NUM
ORDER BY E.EMP_MGR;
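A recursive (self) join treats one table as two by giving it two aliases. The sketch below runs the query above against a made-up four-row EMP table, assuming Python's built-in sqlite3 module; the employee names and numbers are hypothetical, and EMP_NUM is added to the ORDER BY only to make the output order deterministic.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE EMP (EMP_NUM INTEGER, EMP_LNAME TEXT, EMP_MGR INTEGER)")
# Assumption: illustrative rows; employee 100 has no manager (NULL).
conn.executemany("INSERT INTO EMP VALUES (?, ?, ?)", [
    (100, "Kolmycz", None),
    (101, "Lewis", 100),
    (102, "Vandam", 100),
    (103, "Jones", 101),
])

# E plays the role of "employee", M the role of "manager".
rows = conn.execute("""
    SELECT E.EMP_MGR, M.EMP_LNAME, E.EMP_NUM, E.EMP_LNAME
    FROM EMP E JOIN EMP M ON E.EMP_MGR = M.EMP_NUM
    ORDER BY E.EMP_MGR, E.EMP_NUM
""").fetchall()

for mgr_num, mgr_name, emp_num, emp_name in rows:
    print(mgr_num, mgr_name, emp_num, emp_name)
```

Because this is an inner join, the employee without a manager does not appear in the output; an outer join would be needed to list that row as well.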
9.2.5 Outer Joins
An outer join returns not only the rows matching the join condition (that is, rows with matching values in the common columns), but also the rows with unmatched values. The ANSI standard defines three types of outer joins: left, right and full. The left and right designations reflect the order in which the tables are processed by the DBMS. Remember that join operations take place two tables at a time. The first table named in the FROM clause will be the left side, and the second table named will be the right side. If three or more tables are being joined, the result of joining the first two tables becomes the left side; the third table becomes the right side.

The left outer join returns not only the rows matching the join condition (that is, rows with matching values in the common column), but also the rows in the left side table with unmatched values in the right side table. The syntax is:

SELECT column-list
FROM table1 LEFT [OUTER] JOIN table2 ON join-condition

For example, the following query lists the product code, vendor code and vendor name for all products and includes those vendors with no matching products:

SELECT P_CODE, VENDOR.V_CODE, V_NAME
FROM VENDOR LEFT JOIN PRODUCT ON VENDOR.V_CODE = PRODUCT.V_CODE;

The preceding SQL code and its result are shown in Figure 9.10.

FIGURE 9.10 LEFT JOIN results
[Figure: the LEFT JOIN query and its output]

The right outer join returns not only the rows matching the join condition (that is, rows with matching values in the common column), but also the rows in the right side table with unmatched values in the left side table. The syntax is:

SELECT column-list
FROM table1 RIGHT [OUTER] JOIN table2 ON join-condition

For example, the following query lists the product code, vendor code and vendor name for all products and also includes those products that do not have a matching vendor code:

SELECT P_CODE, VENDOR.V_CODE, V_NAME
FROM VENDOR RIGHT JOIN PRODUCT ON VENDOR.V_CODE = PRODUCT.V_CODE;

The SQL code and its output are shown in Figure 9.11.

FIGURE 9.11 RIGHT JOIN results
[Figure: the RIGHT JOIN query and its output]

The full outer join returns not only the rows matching the join condition (that is, rows with matching values in the common column), but also all of the rows with unmatched values in either side table. The syntax is:

SELECT column-list
FROM table1 FULL [OUTER] JOIN table2 ON join-condition

For example, the following query lists the product code, vendor code and vendor name for all products and includes all product rows (products without matching vendors) as well as all vendor rows (vendors without matching products):

SELECT P_CODE, VENDOR.V_CODE, V_NAME
FROM VENDOR FULL JOIN PRODUCT ON VENDOR.V_CODE = PRODUCT.V_CODE;

The SQL code and its result are shown in Figure 9.12.

FIGURE 9.12 FULL JOIN results
[Figure: the FULL JOIN query and its output]
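The three outer joins differ only in which side's unmatched rows are kept. The sketch below illustrates this on made-up VENDOR and PRODUCT rows, assuming Python's built-in sqlite3 module. Because older SQLite builds lack RIGHT and FULL JOIN, the right join is emulated here by swapping the tables in a LEFT JOIN, and the full join by a UNION of both; that emulation is a substitute technique for this sketch, not the syntax shown above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE VENDOR (V_CODE INTEGER, V_NAME TEXT);
    CREATE TABLE PRODUCT (P_CODE TEXT, V_CODE INTEGER);
    -- Assumption: illustrative rows; two vendors supply nothing,
    -- and product P3 has no vendor code.
    INSERT INTO VENDOR VALUES (21225, 'Bryson'), (21226, 'SuperLoo'), (22567, 'Dome');
    INSERT INTO PRODUCT VALUES ('P1', 21225), ('P2', 21225), ('P3', NULL);
""")

COLS = "SELECT P_CODE, VENDOR.V_CODE, V_NAME"
ON = "ON VENDOR.V_CODE = PRODUCT.V_CODE"

# Left outer join: every vendor, with or without products.
left_rows = conn.execute(f"{COLS} FROM VENDOR LEFT JOIN PRODUCT {ON}").fetchall()

# Right outer join (emulated): every product, with or without a vendor.
right_rows = conn.execute(f"{COLS} FROM PRODUCT LEFT JOIN VENDOR {ON}").fetchall()

# Full outer join (emulated): matched rows plus unmatched rows from both sides.
full_rows = conn.execute(
    f"{COLS} FROM VENDOR LEFT JOIN PRODUCT {ON} "
    f"UNION {COLS} FROM PRODUCT LEFT JOIN VENDOR {ON}"
).fetchall()

print(len(left_rows), len(right_rows), len(full_rows))  # 4 3 5
```

The UNION-based emulation removes duplicate rows, so it matches a true FULL JOIN only when the joined result contains no legitimate duplicates, as is the case for this sample.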
9.3 SUBQUERIES AND CORRELATED QUERIES
The use of joins in a relational database allows you to get information from two or more tables. For example, the following query would allow you to get the customers' data with their respective invoices by joining the CUSTOMER and INVOICE tables.

SELECT INV_NUMBER, INVOICE.CUS_CODE, CUS_LNAME, CUS_FNAME
FROM CUSTOMER, INVOICE
WHERE CUSTOMER.CUS_CODE = INVOICE.CUS_CODE;
In the previous query, the data from both tables (CUSTOMER and INVOICE) are processed at once, matching rows with shared CUS_CODE values.

However, it is often necessary to process data based on other processed data. Suppose, for example, you want to generate a list of vendors who do not provide products. (Recall that not all vendors in the VENDOR table have provided products; some of them are only potential vendors.) In Chapter 8, Beginning Structured Query Language, you learnt that you could generate such a list by writing the following query:

SELECT V_CODE, V_NAME
FROM VENDOR
WHERE V_CODE NOT IN (SELECT V_CODE FROM PRODUCT);

Similarly, to generate a list of all products with a price greater than or equal to the average product price, you can write the following query:

SELECT P_CODE, P_PRICE
FROM PRODUCT
WHERE P_PRICE >= (SELECT AVG(P_PRICE) FROM PRODUCT);

In both of those cases, you needed to get information that was not previously known:
- Which vendors provide products?
- What is the average price of all products?

In both cases, you used a subquery to generate the required information that could then be used as input for the originating query.

Although you learnt how to use subqueries in Chapter 8, let's review the basic characteristics of a subquery:
- A subquery is a query (SELECT statement) inside a query.
- A subquery is normally expressed inside parentheses.
- The first query in the SQL statement is known as the outer query.
- The query inside the SQL statement is known as the inner query.
- The inner query is executed first.
- The output of an inner query is used as the input for the outer query.
- The entire SQL statement is sometimes referred to as a nested query.

In this section, you will learn more about the practical use of subqueries. You already know that a subquery is based on the use of the SELECT statement to return one or more values to another query. But subqueries have a wide range of uses. For example, you can use a subquery within a SQL data manipulation language (DML) statement (such as INSERT, UPDATE or DELETE) where a value or a list of values (such as multiple vendor codes or a table) is expected. Table 9.2 uses simple examples to summarise the use of SELECT subqueries in DML statements.
tabLe SeLeCT
Programming
9.2
SeLeCt subquery
Subquery
INSERT
INTO
SELECT
examples
examples
explanation
PRODUCT
* FROM
Inserts
P;
all rows
from
Table
must have the
same
attributes.
Table UPDATE
PRODUCT
SET P_PRICE FROM
5 (SELECT
AVG(P_PRICE)
WHERE
V_CODE
FROM
VENDOR
V_CODE
V_CODE
FROM
VENDOR
WHERE
PRODUCT
table.
Both tables
returns
all rows
from
product
that
0181.
price to the
are provided
The first
subquery
average
by vendors
subquery
returns
product
returns
the list
price,
but
only for
who have an area code the
average
of vendors
price; the
with an area
code
equal
0181.
5 '0181')
DELETE FROM PRODUCT WHERE
to
second to
WHERE V_AREACODE
the
the products equal
IN (SELECT
the
The subquery
P.
Updates
PRODUCT)
Pinto
Deletes the PRODUCT table rows that are provided
IN (SELECT
V_CODE
area
code
codes
V_AREACODE
equal
to
0181.
with an area code
The subquery equal to
returns
by vendors
the list
with
of vendors
0181.
5 '0181')
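The three DML patterns in Table 9.2 can be tried out with a small sketch. It uses Python's built-in sqlite3 module purely as a convenient way to run the SQL; the cut-down table layouts and the sample rows are assumptions made for illustration and are not the book's database.

```python
import sqlite3

# In-memory database with simplified, made-up versions of the tables.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE VENDOR  (V_CODE INTEGER PRIMARY KEY, V_AREACODE TEXT);
CREATE TABLE PRODUCT (P_CODE TEXT PRIMARY KEY, P_PRICE REAL, V_CODE INTEGER);
CREATE TABLE P       (P_CODE TEXT PRIMARY KEY, P_PRICE REAL, V_CODE INTEGER);
INSERT INTO VENDOR VALUES (1, '0181'), (2, '0161');
INSERT INTO P VALUES ('A', 10.0, 1), ('B', 20.0, 1), ('C', 60.0, 2);
""")

# INSERT with a subquery: copy every row of P into PRODUCT.
conn.execute("INSERT INTO PRODUCT SELECT * FROM P")

# UPDATE with two subqueries: a single value (the average price)
# and a list of values (vendor codes in area code 0181).
conn.execute("""
UPDATE PRODUCT
SET P_PRICE = (SELECT AVG(P_PRICE) FROM PRODUCT)
WHERE V_CODE IN (SELECT V_CODE FROM VENDOR WHERE V_AREACODE = '0181')
""")
after_update = conn.execute(
    "SELECT P_CODE, P_PRICE FROM PRODUCT ORDER BY P_CODE").fetchall()

# DELETE with a list subquery: remove products of the same vendors.
conn.execute("""
DELETE FROM PRODUCT
WHERE V_CODE IN (SELECT V_CODE FROM VENDOR WHERE V_AREACODE = '0181')
""")
after_delete = conn.execute(
    "SELECT P_CODE, P_PRICE FROM PRODUCT ORDER BY P_CODE").fetchall()

print(after_update)
print(after_delete)
```

With this sample data, the average price is 30, so products A and B are repriced to 30 and then deleted, leaving only product C.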
Using the examples shown in Table 9.2, note that the subquery is always at the right side of a comparison or assigning expression. Also, a subquery can return one value or multiple values. To be precise, the subquery can return:
One single value (one column and one row). This subquery is used anywhere a single value is expected, as in the right side of a comparison expression (such as in the UPDATE example above when you assign the average price to the product's price). Obviously, when you assign a value to an attribute, that value is a single value, not a list of values. Therefore, the subquery must return only one value (one column, one row). If the query returns multiple values, the DBMS will generate an error.
A list of values (one column and multiple rows). This type of subquery is used anywhere a list of values is expected, such as when using the IN clause (that is, when comparing the vendor code to a list of vendors). Again, in this case, there is only one column of data with multiple value instances. This type of subquery is used frequently in combination with the IN operator in a WHERE conditional expression.
A virtual table (multicolumn, multirow set of values). This type of subquery can be used anywhere a table is expected, such as when using the FROM clause. You will see this type of query later in this chapter.
It is important to note that a subquery can return no values at all; it is a NULL. In such cases, the output of the outer query may result in an error or a null empty set, depending on where the subquery is used (in a comparison, an expression or a table set).
In the following sections, you will learn how to write subqueries within the SELECT statement to retrieve data from the database.
9.3.1 WHERE Subqueries
The most common type of subquery uses an inner SELECT subquery on the right side of a WHERE comparison expression. For example, to find all products with a price greater than or equal to the average product price, you write the following query:
SELECT P_CODE, P_PRICE FROM PRODUCT
WHERE P_PRICE >= (SELECT AVG(P_PRICE) FROM PRODUCT);
The output of the preceding query is shown in Figure 9.13. Note that this type of query, when used in a >, <, =, >= or <= conditional expression, requires a subquery that returns only one single value (one column, one row). The value generated by the subquery must be of a comparable data type; if the attribute to the left of the comparison symbol is a character type, the subquery must return a character string. Also, if the query returns more than a single value, the DBMS will generate an error.
Figure 9.13 WHERE subquery examples
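As a rough illustration of a single-value WHERE subquery, the sketch below runs the average-price query through Python's built-in sqlite3 module. The three sample products are invented for the example; only the query itself comes from the text.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE PRODUCT (P_CODE TEXT PRIMARY KEY, P_PRICE REAL);
INSERT INTO PRODUCT VALUES ('A', 10.0), ('B', 30.0), ('C', 50.0);
""")

# The inner query returns one value (the average, 30.0 here);
# the outer query keeps rows priced at or above that value.
rows = conn.execute("""
SELECT P_CODE, P_PRICE
FROM PRODUCT
WHERE P_PRICE >= (SELECT AVG(P_PRICE) FROM PRODUCT)
ORDER BY P_CODE
""").fetchall()

print(rows)
```

Product B sits exactly on the average, so the >= comparison keeps it; a strict > comparison would return only product C.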
Subqueries can also be used in combination with joins. For example, the following query lists all of the customers who ordered the product claw hammer:
SELECT DISTINCT CUS_CODE, CUS_LNAME, CUS_FNAME
FROM CUSTOMER JOIN INVOICE USING (CUS_CODE)
JOIN LINE USING (INV_NUMBER)
JOIN PRODUCT USING (P_CODE)
WHERE P_CODE = (SELECT P_CODE FROM PRODUCT
WHERE P_DESCRIPT = 'Claw hammer');
The result of that query is also shown in Figure 9.13.
In the preceding example, the inner query finds the P_CODE for the product claw hammer. The P_CODE is then used to restrict the selected rows to only those where the P_CODE in the LINE table matches the P_CODE for Claw hammer. Note that the previous query could have been written this way:
SELECT DISTINCT CUS_CODE, CUS_LNAME, CUS_FNAME
FROM CUSTOMER JOIN INVOICE USING (CUS_CODE)
JOIN LINE USING (INV_NUMBER)
JOIN PRODUCT USING (P_CODE)
WHERE P_DESCRIPT = 'Claw hammer';
But what happens if the original query encounters the claw hammer string in more than one product description? You get an error message. To compare one value to a list of values, you must use an IN operand, as shown in the next section.
9.3.2 IN Subqueries
What would you do if you wanted to find all customers who purchased a hammer or any kind of saw or saw blade? Note that the product table has two different types of hammers: claw hammer and sledge hammer. Also note that there are multiple occurrences of products that contain saw in their product descriptions. There are saw blades, jigsaws and so on. In such cases, you need to compare the P_CODE not to one product code (single value), but to a list of product code values. When you want to compare a single attribute to a list of values, you use the IN operator. When the P_CODE values are not known beforehand but they can be derived using a query, you must use an IN subquery. The following example lists all customers who have purchased hammers or saws or saw blades.
SELECT DISTINCT CUS_CODE, CUS_LNAME, CUS_FNAME
FROM CUSTOMER JOIN INVOICE USING (CUS_CODE)
JOIN LINE USING (INV_NUMBER)
JOIN PRODUCT USING (P_CODE)
WHERE P_CODE IN (SELECT P_CODE FROM PRODUCT
WHERE P_DESCRIPT LIKE '%hammer%'
OR P_DESCRIPT LIKE '%saw%');
The result of that query is shown in Figure 9.14.
Figure 9.14 IN subquery examples
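The IN subquery can be exercised with a minimal sketch, again using Python's sqlite3 module only as a way to run the SQL. The table layouts are trimmed down and every row is made up for the example; the query follows the one in the text, with an ORDER BY added so the output is predictable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE CUSTOMER (CUS_CODE INTEGER PRIMARY KEY, CUS_LNAME TEXT, CUS_FNAME TEXT);
CREATE TABLE INVOICE  (INV_NUMBER INTEGER PRIMARY KEY, CUS_CODE INTEGER);
CREATE TABLE LINE     (INV_NUMBER INTEGER, LINE_NUMBER INTEGER, P_CODE TEXT);
CREATE TABLE PRODUCT  (P_CODE TEXT PRIMARY KEY, P_DESCRIPT TEXT);
INSERT INTO CUSTOMER VALUES (10010, 'Ramas', 'Alfred'),
                            (10011, 'Dunne', 'Leona'),
                            (10012, 'Smith', 'Kathy');
INSERT INTO PRODUCT VALUES ('P1', 'Claw hammer'), ('P2', 'Sledge hammer'),
                           ('P3', 'Power saw blade'), ('P4', 'PVC pipe');
INSERT INTO INVOICE VALUES (1001, 10010), (1002, 10011), (1003, 10012);
INSERT INTO LINE VALUES (1001, 1, 'P1'), (1002, 1, 'P4'),
                        (1003, 1, 'P3'), (1003, 2, 'P2');
""")

# The subquery returns a list of product codes (P1, P2, P3);
# IN compares each purchased P_CODE against that list.
rows = conn.execute("""
SELECT DISTINCT CUS_CODE, CUS_LNAME, CUS_FNAME
FROM CUSTOMER JOIN INVOICE USING (CUS_CODE)
     JOIN LINE USING (INV_NUMBER)
     JOIN PRODUCT USING (P_CODE)
WHERE P_CODE IN (SELECT P_CODE FROM PRODUCT
                 WHERE P_DESCRIPT LIKE '%hammer%'
                    OR P_DESCRIPT LIKE '%saw%')
ORDER BY CUS_CODE
""").fetchall()

print(rows)
```

Customer 10012 bought both a saw blade and a hammer but appears once because of DISTINCT; customer 10011 bought only pipe and is excluded.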
The result of that query is shown in Figure 9.16.
Figure 9.16 Multirow subquery operator example
As you examine the query and its output in Figure 9.16, it's important to note the following points:
The query is a typical example of a nested query.
The query has one outer SELECT statement with a SELECT subquery (call it sqA) containing a second SELECT subquery (call it sqB).
The last SELECT subquery (sqB) is executed first and returns a list of all vendors from South Africa.
The first SELECT subquery (sqA) uses the output of the SELECT subquery (sqB). The sqA subquery returns the list of product costs for all products provided by vendors from South Africa.
The use of the ALL operator allows you to compare a single value (P_QOH * P_PRICE) with a list of values returned by the first subquery (sqA), using a comparison operator other than equals.
For a row to appear in the result set, it has to meet the criterion P_QOH * P_PRICE > ALL of the individual values returned by the subquery sqA. The values returned by sqA are a list of product costs. In fact, greater than ALL is equivalent to greater than the highest product cost of the list. In the same way, a condition of less than ALL is equivalent to less than the lowest product cost of the list.
Another powerful operator is the ANY multirow operator (near cousin of the ALL multirow operator). The ANY operator allows you to compare a single value to a list of values and select only the rows for which the inventory cost is greater than any value of the list or less than any value of the list. You could use the equal to ANY operator, which would be the equivalent of the IN operator.
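The equivalences stated above (greater than ALL behaves like greater than the highest value; greater than ANY behaves like greater than the lowest value) can be checked with a small sketch. SQLite, used here through Python's sqlite3 module, does not implement the ALL and ANY operators, so the sketch writes the MAX and MIN forms directly. The V_COUNTRY column and all rows are assumptions made for illustration; the equivalence holds when the subquery returns at least one non-null value.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE VENDOR  (V_CODE INTEGER PRIMARY KEY, V_COUNTRY TEXT);  -- V_COUNTRY is hypothetical
CREATE TABLE PRODUCT (P_CODE TEXT PRIMARY KEY, P_QOH INTEGER,
                      P_PRICE INTEGER, V_CODE INTEGER);
INSERT INTO VENDOR VALUES (1, 'South Africa'), (2, 'Other');
-- Inventory costs (P_QOH * P_PRICE): A = 50, B = 80, C = 60, D = 120.
INSERT INTO PRODUCT VALUES ('A', 10, 5, 1), ('B', 4, 20, 1),
                           ('C', 2, 30, 2), ('D', 3, 40, 2);
""")

# "> ALL (list)" rewritten as "> (highest value of the list)".
greater_than_all = conn.execute("""
SELECT P_CODE, P_QOH * P_PRICE
FROM PRODUCT
WHERE P_QOH * P_PRICE >
      (SELECT MAX(P_QOH * P_PRICE) FROM PRODUCT
       WHERE V_CODE IN (SELECT V_CODE FROM VENDOR
                        WHERE V_COUNTRY = 'South Africa'))
ORDER BY P_CODE
""").fetchall()

# "> ANY (list)" rewritten as "> (lowest value of the list)".
greater_than_any = conn.execute("""
SELECT P_CODE, P_QOH * P_PRICE
FROM PRODUCT
WHERE P_QOH * P_PRICE >
      (SELECT MIN(P_QOH * P_PRICE) FROM PRODUCT
       WHERE V_CODE IN (SELECT V_CODE FROM VENDOR
                        WHERE V_COUNTRY = 'South Africa'))
ORDER BY P_CODE
""").fetchall()

print(greater_than_all)
print(greater_than_any)
```

The South African vendor's product costs are 50 and 80, so only product D (120) exceeds all of them, while B, C and D each exceed at least one of them.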
9.3.5 FROM Subqueries
So far, you have seen how the SELECT statement uses subqueries within WHERE, HAVING and IN statements and how the ANY and ALL operators are used for multirow subqueries. In all of those cases, the subquery was part of a conditional expression and it always appeared at the right side of the expression. In this section, you will learn how to use subqueries in the FROM clause.
As you already know, the FROM clause specifies the table(s) from which the data are drawn. Because the output of a SELECT statement is another table (or more precisely a virtual table), you could use a SELECT subquery in the FROM clause. For example, assume that you want to know all customers who have purchased products 13-Q2/P2 and 23109-HB. All product purchases are stored in the LINE table. It is easy to find out who purchased any given product by searching the P_CODE attribute in the LINE table. But in this case, you want to know all customers who purchased both products, not just one. You could write the following query:
SELECT DISTINCT CUSTOMER.CUS_CODE, CUSTOMER.CUS_LNAME
FROM CUSTOMER,
(SELECT INVOICE.CUS_CODE FROM INVOICE NATURAL JOIN LINE
WHERE P_CODE = '13-Q2/P2') CP1,
(SELECT INVOICE.CUS_CODE FROM INVOICE NATURAL JOIN LINE
WHERE P_CODE = '23109-HB') CP2
WHERE CUSTOMER.CUS_CODE = CP1.CUS_CODE AND
CP1.CUS_CODE = CP2.CUS_CODE;
The result of that query is shown in Figure 9.17.
Figure 9.17 FROM subquery example
As you examine Figure 9.17, note that the first subquery returns all customers who purchased product 13-Q2/P2, while the second subquery returns all customers who purchased product 23109-HB. So, in this FROM subquery, you are joining the CUSTOMER table with two virtual tables. The join condition selects only the rows with matching CUS_CODE values in each table (base or virtual).
In the previous chapter, you learnt that a view is also a virtual table; therefore, you can use a view name anywhere a table is expected. So, you could create two views: one listing all customers who purchased product 13-Q2/P2 and another listing all customers who purchased product 23109-HB. Doing so, you would write the query as:
CREATE VIEW CP1 AS
SELECT INVOICE.CUS_CODE FROM INVOICE NATURAL JOIN LINE
WHERE P_CODE = '13-Q2/P2';
CREATE VIEW CP2 AS
SELECT INVOICE.CUS_CODE FROM INVOICE NATURAL JOIN LINE
WHERE P_CODE = '23109-HB';
SELECT DISTINCT CUS_CODE, CUS_LNAME
FROM CUSTOMER NATURAL JOIN CP1 NATURAL JOIN CP2;
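A minimal sketch of the FROM subquery approach follows, run through Python's sqlite3 module with invented rows. It also runs a single-table version that tests P_CODE against both values at once, to show that such a condition can never be satisfied by one row.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE CUSTOMER (CUS_CODE INTEGER PRIMARY KEY, CUS_LNAME TEXT);
CREATE TABLE INVOICE  (INV_NUMBER INTEGER PRIMARY KEY, CUS_CODE INTEGER);
CREATE TABLE LINE     (INV_NUMBER INTEGER, LINE_NUMBER INTEGER, P_CODE TEXT);
INSERT INTO CUSTOMER VALUES (10010, 'Ramas'), (10011, 'Dunne'), (10012, 'Smith');
INSERT INTO INVOICE VALUES (1001, 10010), (1002, 10010),
                           (1003, 10011), (1004, 10012);
INSERT INTO LINE VALUES (1001, 1, '13-Q2/P2'), (1002, 1, '23109-HB'),
                        (1003, 1, '13-Q2/P2'), (1004, 1, '23109-HB');
""")

# Each FROM subquery is a virtual table of customer codes; the
# WHERE clause keeps customers that appear in both of them.
both = conn.execute("""
SELECT DISTINCT CUSTOMER.CUS_CODE, CUSTOMER.CUS_LNAME
FROM CUSTOMER,
     (SELECT INVOICE.CUS_CODE FROM INVOICE NATURAL JOIN LINE
      WHERE P_CODE = '13-Q2/P2') CP1,
     (SELECT INVOICE.CUS_CODE FROM INVOICE NATURAL JOIN LINE
      WHERE P_CODE = '23109-HB') CP2
WHERE CUSTOMER.CUS_CODE = CP1.CUS_CODE
  AND CP1.CUS_CODE = CP2.CUS_CODE
""").fetchall()

# A single row's P_CODE cannot equal two different values at once.
naive = conn.execute("""
SELECT CUS_CODE, CUS_LNAME
FROM CUSTOMER NATURAL JOIN INVOICE NATURAL JOIN LINE
WHERE P_CODE = '13-Q2/P2' AND P_CODE = '23109-HB'
""").fetchall()

print(both)
print(naive)
```

Only customer 10010 bought both products (on two separate invoices), and the single-table version returns no rows at all.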
You may be tempted to speculate that the above query could also be written using the following syntax:
SELECT CUS_CODE, CUS_LNAME
FROM CUSTOMER NATURAL JOIN INVOICE NATURAL JOIN LINE
WHERE P_CODE = '13-Q2/P2' AND P_CODE = '23109-HB';
However, if you examine that query carefully, you will note that a P_CODE cannot be equal to two different values at the same time. Therefore, the query will not return any rows.
9.3.6 Attribute List Subqueries
The SELECT statement uses the attribute list to indicate which columns to project in the resulting set. Those columns can be attributes of base tables or computed attributes or the result of an aggregate function. The attribute list can also include a subquery expression, also known as an inline subquery. A subquery in the attribute list must return one single value; otherwise, an error code is raised. For example, a simple query can be used to list the difference between each product's price and the average product price:
SELECT P_CODE, P_PRICE,
(SELECT AVG(P_PRICE) FROM PRODUCT) AS AVGPRICE,
P_PRICE - (SELECT AVG(P_PRICE) FROM PRODUCT) AS DIFFERENCE
FROM PRODUCT;
Figure 9.18 shows the result of that query.
Figure 9.18 Inline subquery examples
In Figure 9.18, note that the inline query output returns one single value (the average product's price) and that the value is the same in every row. Note also that the query used the full expression instead of the column aliases when computing the difference. In fact, if you try to use the alias in the difference expression, you will get an error message. The column alias cannot be used in computations in the attribute list when the alias is defined in the same attribute list. That DBMS requirement is due to the way the DBMS parses and executes queries.
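The inline subquery from this section can be run as a short sketch with Python's sqlite3 module. The three product rows are invented; the query is the one discussed in the text, with an ORDER BY added so the output order is fixed.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE PRODUCT (P_CODE TEXT PRIMARY KEY, P_PRICE REAL);
INSERT INTO PRODUCT VALUES ('A', 10.0), ('B', 30.0), ('C', 50.0);
""")

# The inline subquery yields the same single value (30.0) in every
# row; the full expression is repeated instead of using AVGPRICE.
rows = conn.execute("""
SELECT P_CODE, P_PRICE,
       (SELECT AVG(P_PRICE) FROM PRODUCT) AS AVGPRICE,
       P_PRICE - (SELECT AVG(P_PRICE) FROM PRODUCT) AS DIFFERENCE
FROM PRODUCT
ORDER BY P_CODE
""").fetchall()

for row in rows:
    print(row)
```

Every row carries the same AVGPRICE, which is what makes the per-row DIFFERENCE column meaningful.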
Another example will help you understand the use of attribute list subqueries and column aliases. For example, suppose you want to know the product code, the total sales by product and the contribution by employee of each product's sales. To get the sales by product, you need to use only the LINE table. To compute the contribution by employee, you need to know the number of employees (from the EMPLOYEE table). As you study the tables' structures, you can see that the LINE and EMPLOYEE tables do not share a common attribute. In fact, you don't need a common attribute. You need to know only the total number of employees, not the total employees related to each product. So to answer the query, you would write the following code:
SELECT P_CODE, SUM(LINE_UNITS * LINE_PRICE) AS SALES,
(SELECT COUNT(*) FROM EMPLOYEE) AS ECOUNT,
ROUND(SUM(LINE_UNITS * LINE_PRICE)/(SELECT COUNT(*) FROM EMPLOYEE),2) AS CONTRIB
FROM LINE
GROUP BY P_CODE;
The result of that query is shown in Figure 9.19. Notice that the CONTRIB column has been rounded up to two decimal places using the SQL ROUND function.
Figure 9.19 Another example of an inline subquery
The use of that type of subquery is limited to certain instances where you need to include data from other tables that are not directly related to a main table or tables in the query. The value will remain the same for each row, like a constant in a programming language (although you will learn another use of inline subqueries later in Section 9.3.7, Correlated Subqueries). Note that you cannot use an alias in the attribute list to write the expression that computes the contribution per employee.
Another way to write the same query by using column aliases requires the use of a subquery in the FROM clause, as follows:
SELECT P_CODE, SALES, ECOUNT, SALES/ECOUNT AS CONTRIB
FROM (SELECT P_CODE, SUM(LINE_UNITS * LINE_PRICE) AS SALES,
(SELECT COUNT(*) FROM EMPLOYEE) AS ECOUNT
FROM LINE
GROUP BY P_CODE);
In that case, you are actually using two subqueries. The subquery in the FROM clause executes first and returns a virtual table with three columns: P_CODE, SALES and ECOUNT. The FROM subquery contains an inline subquery that returns the number of employees as ECOUNT. Because the outer query receives the output of the inner query, you can now refer to the columns in the outer subquery, using the column aliases.
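The alias-based version of the contribution query can be sketched as follows, using Python's sqlite3 module and made-up rows. Two small liberties are taken and should be read as assumptions: the derived table is given the alias T, since several DBMSs require one, and ROUND is applied in the outer query.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE EMPLOYEE (EMP_NUM INTEGER PRIMARY KEY);
CREATE TABLE LINE (INV_NUMBER INTEGER, P_CODE TEXT,
                   LINE_UNITS INTEGER, LINE_PRICE REAL);
INSERT INTO EMPLOYEE VALUES (1), (2), (3), (4);
INSERT INTO LINE VALUES (1001, 'P1', 2, 10.0),
                        (1002, 'P1', 1, 10.0),
                        (1003, 'P2', 5, 4.0);
""")

# The FROM subquery builds a virtual table (P_CODE, SALES, ECOUNT);
# the outer query can then compute with the aliases SALES and ECOUNT.
rows = conn.execute("""
SELECT P_CODE, SALES, ECOUNT, ROUND(SALES / ECOUNT, 2) AS CONTRIB
FROM (SELECT P_CODE,
             SUM(LINE_UNITS * LINE_PRICE) AS SALES,
             (SELECT COUNT(*) FROM EMPLOYEE) AS ECOUNT
      FROM LINE
      GROUP BY P_CODE) AS T   -- alias T is an addition for portability
ORDER BY P_CODE
""").fetchall()

print(rows)
```

With four employees, sales of 30 and 20 give contributions of 7.5 and 5.0 per employee.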
9.3.7 Correlated Subqueries
Until now, all subqueries you have learnt about execute independently. That is, each subquery in a command sequence executes in a serial fashion, one after another. The inner subquery executes first; its output is used by the outer query, which then executes until the last outer query executes (the first SQL statement in the code).
In contrast, a correlated subquery is a subquery that executes once for each row in the outer query. That process is similar to the typical nested loop in a programming language. For example:
FOR X = 1 TO 2
FOR Y = 1 TO 3
PRINT 'X = 'X, 'Y = 'Y
END
END
will yield the output
X = 1 Y = 1
X = 1 Y = 2
X = 1 Y = 3
X = 2 Y = 1
X = 2 Y = 2
X = 2 Y = 3
Note that the outer loop X = 1 TO 2 begins the process by setting X = 1; then the inner loop Y = 1 TO 3 is completed for each X outer loop value. The relational DBMS uses the same sequence to produce correlated subquery results:
1 It initiates the outer query.
2 For each row of the outer query result set, it executes the inner query by passing the outer row to the inner query.
That process is the opposite of the subqueries you have seen so far. The query is called a correlated subquery because the inner query is related to the outer query: the inner query references a column of the outer subquery.
To see the correlated subquery in action, suppose you want to know all product sales in which the units sold value is greater than the average units sold value for that product (as opposed to the average for all products). In that case, complete the following procedure:
1 Compute the average-units-sold value for a product.
2 Compare the average computed in Step 1 to the units sold in each sale row; then select only the rows in which the number of units sold is greater.
The following correlated query completes the preceding two-step process:
SELECT INV_NUMBER, P_CODE, LINE_UNITS
FROM LINE LS
WHERE LS.LINE_UNITS > (SELECT AVG(LINE_UNITS)
FROM LINE LA
WHERE LA.P_CODE = LS.P_CODE);
The first example in Figure 9.20 shows the result of that query.
Figure 9.20 Correlated subquery examples
As you examine the top query and its result in Figure 9.20, note that the LINE table is used more than once; so, you need to use table aliases. In that case, the inner query computes the average units sold of the product that matches the P_CODE of the outer query P_CODE. That is, the inner query runs once using the first product code found in the (outer) LINE table and returns the average sale for that product. When the number of units sold in that (outer) LINE row is greater than the average computed, the row is added to the output. Then the inner query runs again, this time using the second product code found in the (outer) LINE table. The process repeats until the inner query has run for all rows in the (outer) LINE table. In that case, the inner query is repeated as many times as there are rows in the outer query.
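The row-by-row behaviour described above can be observed with a small sketch run through Python's sqlite3 module. The four LINE rows are invented, and the extra AVG_UNITS column is a correlated inline subquery added here so the comparison can be checked by eye; LX is a hypothetical alias.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE LINE (INV_NUMBER INTEGER, P_CODE TEXT, LINE_UNITS INTEGER);
-- Average units sold: P1 = 2.0, P2 = 2.0
INSERT INTO LINE VALUES (1001, 'P1', 1), (1002, 'P1', 3),
                        (1003, 'P2', 2), (1004, 'P2', 2);
""")

# For each outer row (alias LS) the inner query recomputes the
# average for that row's product, via LA.P_CODE = LS.P_CODE.
rows = conn.execute("""
SELECT INV_NUMBER, P_CODE, LINE_UNITS,
       (SELECT AVG(LINE_UNITS) FROM LINE LX
        WHERE LX.P_CODE = LS.P_CODE) AS AVG_UNITS
FROM LINE LS
WHERE LS.LINE_UNITS > (SELECT AVG(LINE_UNITS) FROM LINE LA
                       WHERE LA.P_CODE = LS.P_CODE)
ORDER BY INV_NUMBER
""").fetchall()

print(rows)
```

Only invoice 1002 sold more units of its product than that product's own average; the P2 rows equal their average and are not selected.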
To verify the results and to provide an example of how you can combine subqueries, you can add a correlated inline subquery to the previous query. That correlated inline subquery shows the average units sold column for each product. (See the second query and its results in Figure 9.20.) As you can see, the new query contains a correlated inline subquery that computes the average units sold for each product. You not only get an answer, but also can verify that the answer is correct.
You can also use correlated subqueries with the EXISTS special operator. For example, suppose you want to know all customers who have placed an order lately. In that case, you could use a correlated subquery like the first one shown in Figure 9.21:
SELECT CUS_CODE, CUS_LNAME, CUS_FNAME
FROM CUSTOMER
WHERE EXISTS (SELECT CUS_CODE FROM INVOICE
WHERE INVOICE.CUS_CODE = CUSTOMER.CUS_CODE);
Figure 9.21 EXISTS correlated subquery examples
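The customer query with EXISTS can be tried with the following sketch, which uses Python's sqlite3 module and invented rows. The ORDER BY is an addition to make the output order predictable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE CUSTOMER (CUS_CODE INTEGER PRIMARY KEY, CUS_LNAME TEXT, CUS_FNAME TEXT);
CREATE TABLE INVOICE  (INV_NUMBER INTEGER PRIMARY KEY, CUS_CODE INTEGER);
INSERT INTO CUSTOMER VALUES (10010, 'Ramas', 'Alfred'),
                            (10011, 'Dunne', 'Leona'),
                            (10012, 'Smith', 'Kathy');
INSERT INTO INVOICE VALUES (1001, 10010), (1002, 10012), (1003, 10012);
""")

# EXISTS is true for a customer as soon as the correlated subquery
# finds at least one invoice row carrying that customer's code.
rows = conn.execute("""
SELECT CUS_CODE, CUS_LNAME, CUS_FNAME
FROM CUSTOMER
WHERE EXISTS (SELECT CUS_CODE FROM INVOICE
              WHERE INVOICE.CUS_CODE = CUSTOMER.CUS_CODE)
ORDER BY CUS_CODE
""").fetchall()

print(rows)
```

Customer 10012 has two invoices but is listed once, because EXISTS only tests whether any matching row is present.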
The second example of an EXISTS correlated subquery in Figure 9.21 will help you understand how to use correlated queries. For example, suppose you want to know which vendors you must contact to start ordering products that are approaching the minimum quantity-on-hand value. In particular, you
want to know the vendor code and name for products having a quantity on hand that is less than double the minimum quantity. The query that answers that question is as follows:
SELECT V_CODE, V_NAME
FROM VENDOR
WHERE EXISTS (SELECT * FROM PRODUCT
WHERE P_QOH < P_MIN * 2
AND VENDOR.V_CODE = PRODUCT.V_CODE);
As you examine the second query in Figure 9.21, note that:
1 The inner correlated subquery runs using the first vendor.
2 If any products match the condition (quantity on hand is less than double the minimum quantity), the vendor code and name are listed in the output.
3 The correlated subquery runs using the second vendor, and the process repeats itself until all vendors are used.
9.4 SQL Functions
The data in databases are the basis of critical business information. Generating information from data often requires many data manipulations. Sometimes, such data manipulation involves the decomposition of data elements. For example, an employee's date of birth can be subdivided into a day, a month and a year. A product manufacturing code (for example, SE-05-2-09-1234-1-3/12/04-19:26:48) can be designed to record the manufacturing region, plant, shift, production line, employee number, date and time. For years, conventional programming languages have had special functions that enabled programmers to perform data transformations like those data decompositions. If you know a modern programming language, it's very likely that the SQL functions in this section will look familiar.
SQL functions are useful tools. You'll need to use functions when you want to list all employees ordered by year of birth or when your marketing department wants you to generate a list of all customers ordered by postal code and the first four digits of their telephone numbers. In both of those cases, you'll need to use data elements that are not present as such in the database; instead, you'll need a SQL function that can be derived from an existing attribute. Functions always use a numerical, date or string value. The value may be part of the command itself (a constant or literal) or it may be an attribute located in a table. Therefore, a function may appear anywhere in a SQL statement where a value or an attribute can be used.
There are many types of SQL functions, such as arithmetic, trigonometric, string, date and time functions. This section will not explain all of those types of functions in detail, but it will give you a brief overview of the most useful ones.
Note
Although the main DBMS vendors support the SQL functions covered here, the syntax or degree of support may differ. In fact, DBMS vendors invariably add their own functions to products to lure new customers. The functions covered in this section represent just a small portion of functions supported by your DBMS. Read your DBMS SQL reference manual for a complete list of available functions.
9.4.1 Date and Time Functions
All SQL-standard DBMSs support date and time functions. All date functions take one parameter (of a date or character data type) and return a value (character, numeric or date type). Unfortunately, date/time data types are implemented differently by different DBMS vendors. The problem occurs because the ANSI SQL standard defines date data types, but does not say how those data types are to be stored; instead, it lets the vendor deal with that issue.
Because date/time functions differ from vendor to vendor, this section will cover basic date/time functions for Microsoft Access/SQL Server and for Oracle. Table 9.3 shows a list of selected Microsoft Access/SQL Server date/time functions.
Table 9.3 Selected Microsoft Access/SQL Server date/time functions
Function: YEAR
Returns a four-digit year
Syntax: YEAR(date_value)
Example: Lists all employees born in 1966:
SELECT EMP_LNAME, EMP_FNAME, EMP_DOB, YEAR(EMP_DOB) AS YEAR
FROM EMPLOYEE
WHERE YEAR(EMP_DOB) = 1966;
Function: MONTH
Returns a two-digit month code
Syntax: MONTH(date_value)
Example: Lists all employees born in November:
SELECT EMP_LNAME, EMP_FNAME, EMP_DOB, MONTH(EMP_DOB) AS MONTH
FROM EMPLOYEE
WHERE MONTH(EMP_DOB) = 11;
Function: DAY
Returns the number of the day
Syntax: DAY(date_value)
Example: Lists all employees born on the 14th day of the month:
SELECT EMP_LNAME, EMP_FNAME, EMP_DOB, DAY(EMP_DOB) AS DAY
FROM EMPLOYEE
WHERE DAY(EMP_DOB) = 14;
Function: DATE() (Microsoft Access), GETDATE() (SQL Server)
Returns today's date
Example: Lists how many days are left until Christmas:
SELECT #25-Dec-2019# - DATE();
Note two features:
There is no FROM clause, which is acceptable in Microsoft Access.
The Christmas date is enclosed in number signs (#) because you are doing date arithmetic.
In Microsoft SQL Server:
Use GETDATE() to get the current system date. To compute the difference between dates, use the DATEDIFF function (see below).
Function: DATEADD (SQL Server)
Adds a number of selected time periods to a date
Syntax: DATEADD(datepart, number, date)
Example: Adds a number of dateparts to a given date. Dateparts can be minutes, hours, days, weeks, months, quarters or years. For example:
SELECT DATEADD(day,90, P_INDATE) AS DueDate
FROM PRODUCT;
The preceding example adds 90 days to P_INDATE.
In Microsoft Access, use the following:
SELECT P_INDATE+90 AS DueDate
FROM PRODUCT;
Function: DATEDIFF (SQL Server)
Subtracts two dates
Syntax: DATEDIFF(datepart, startdate, enddate)
Example: Returns the difference between two dates expressed in a selected datepart. For example:
SELECT DATEDIFF(day, P_INDATE, GETDATE()) AS DaysAgo
FROM PRODUCT;
In Microsoft Access, use the following:
SELECT DATE() - P_INDATE AS DaysAgo
FROM PRODUCT;
Table 9.4 shows the equivalent date/time functions used in Oracle. Note that Oracle uses the same function (TO_CHAR) to extract the different parts of a date. Also, another function (TO_DATE) is used to convert character strings to a valid Oracle date format that can be used in date arithmetic. Finally, Table 9.5 shows selected date/time functions for MySQL version 5.6. It is worth noting that, due to Oracle's acquisition of MySQL, it is likely that there will be an overlap of a number of MySQL and Oracle functions in the future. It is therefore advisable that you refer to the appropriate DBMS SQL reference manual for a current list of up-to-date functions.
Table 9.4 Selected Oracle date/time functions
Function: TO_CHAR
Returns a character string or a formatted string from a date value
Syntax: TO_CHAR(date_value, fmt)
fmt = format used; can be:
MONTH: name of month
MON: three-letter month name
MM: two-digit month number
D: number for day of week
DD: number of day of month
DAY: name of day of week
YYYY: four-digit year value
YY: two-digit year value
Example: Lists all employees born in 1992:
SELECT EMP_LNAME, EMP_FNAME, EMP_DOB, TO_CHAR(EMP_DOB,'YYYY') AS YEAR
FROM EMPLOYEE
WHERE TO_CHAR(EMP_DOB,'YYYY') = '1992';
Example: Lists all employees born in November:
SELECT EMP_LNAME, EMP_FNAME, EMP_DOB, TO_CHAR(EMP_DOB,'MM') AS MONTH
FROM EMPLOYEE
WHERE TO_CHAR(EMP_DOB,'MM') = '11';
Example: Lists all employees born on the 14th day of the month:
SELECT EMP_LNAME, EMP_FNAME, EMP_DOB, TO_CHAR(EMP_DOB,'DD') AS DAY
FROM EMPLOYEE
WHERE TO_CHAR(EMP_DOB,'DD') = '14';
Function: TO_DATE
Returns a date value using a character string and a date format mask; also used to translate a date between formats.
Syntax: TO_DATE(char_value, fmt)
fmt = format used; can be:
    MONTH: name of month
    MON: three-letter month name
    MM: two-digit month number
    D: number for day of week
    DD: number of day of month
    DAY: name of day of week
    YYYY: four-digit year value
    YY: two-digit year value
Example(s):
Lists the approximate age of the employees on the company's tenth anniversary date (11/25/2018):
    SELECT EMP_LNAME, EMP_FNAME, EMP_DOB,
           '11/25/2018' AS ANIV_DATE,
           (TO_DATE('11/25/2018','MM/DD/YYYY') - EMP_DOB)/365 AS YEARS
    FROM EMPLOYEE
    ORDER BY YEARS;
Note the following:
    '11/25/2018' is a text string, not a date.
    The TO_DATE function translates the text string to a valid Oracle date used in date arithmetic.
How many days between Thanksgiving and Christmas 2018?
    SELECT TO_DATE('2018/12/25','YYYY/MM/DD') -
           TO_DATE('NOVEMBER 23, 2018','MONTH DD, YYYY')
    FROM DUAL;
Note the following:
    The TO_DATE function translates the text string to a valid Oracle date used in date arithmetic.
    DUAL is Oracle's pseudo table, used only for cases where a table is not really needed.

Function: SYSDATE
Returns today's date.
Example(s):
Lists how many days are left until Christmas:
    SELECT TO_DATE('25-Dec-2018','DD-MON-YYYY') - SYSDATE
    FROM DUAL;
Notice two things:
    DUAL is Oracle's pseudo table, used only for cases where a table is not really needed.
    The Christmas date is enclosed in a TO_DATE function to translate the date to a valid date format.

Function: ADD_MONTHS
Adds a number of months to a date; useful for adding months or years to a date.
Syntax: ADD_MONTHS(date_value, n)
n = number of months
Example(s):
Lists all products with their expiration date (two years from the purchase date):
    SELECT P_CODE, P_INDATE, ADD_MONTHS(P_INDATE,24)
    FROM PRODUCT
    ORDER BY ADD_MONTHS(P_INDATE,24);

Function: LAST_DAY
Returns the date of the last day of the month given in a date.
Syntax: LAST_DAY(date_value)
Example(s):
Lists all employees who were hired within the last seven days of a month:
    SELECT EMP_LNAME, EMP_FNAME, EMP_HIRE_DATE
    FROM EMPLOYEE
    WHERE EMP_HIRE_DATE >= LAST_DAY(EMP_HIRE_DATE)-7;
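As a brief illustration of how the Oracle date functions listed in this table can be combined, the following sketch reports each employee's age in completed years and two dates derived from the hire date. It is a minimal example under stated assumptions: it uses the EMPLOYEE table from the examples, assumes EMP_DOB and EMP_HIRE_DATE are DATE columns, and relies on MONTHS_BETWEEN, a standard Oracle function that the table itself does not cover. The column aliases are illustrative only.

```sql
-- Sketch only: assumes the EMPLOYEE table used in the examples above.
SELECT EMP_LNAME,
       EMP_FNAME,
       TO_CHAR(EMP_DOB, 'DD-MON-YYYY')              AS BIRTH_DATE,
       -- MONTHS_BETWEEN returns fractional months; dividing by 12 and
       -- truncating gives completed years without the /365 approximation.
       TRUNC(MONTHS_BETWEEN(SYSDATE, EMP_DOB) / 12) AS AGE_YEARS,
       -- Last calendar day of the month in which the employee was hired.
       LAST_DAY(EMP_HIRE_DATE)                      AS HIRE_MONTH_END,
       -- Twelve months after the hire date.
       ADD_MONTHS(EMP_HIRE_DATE, 12)                AS FIRST_ANNIVERSARY
FROM   EMPLOYEE
ORDER  BY AGE_YEARS DESC;
```

Dividing a day count by 365 drifts slightly because of leap years, which is why the table describes that result as approximate; the month-based calculation avoids the drift.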
Table 9.5 Selected MySQL date/time functions

Function: DATE_FORMAT
Returns a character string or a formatted string from a date value.
Syntax: DATE_FORMAT(date_value, fmt)
fmt = format used; can be:
    %M: name of month
    %m: two-digit month number
    %b: abbreviated month name
    %d: number of day of month
    %W: weekday name
    %a: abbreviated weekday name
    %Y: four-digit year
    %y: two-digit year
Examples:
Displays the product code and date the product was last received into stock for all products:
    SELECT P_CODE, DATE_FORMAT(P_INDATE, '%m/%d/%y')
    FROM PRODUCT;
    SELECT P_CODE, DATE_FORMAT(P_INDATE, '%M %d, %Y')
    FROM PRODUCT;

Function: YEAR
Returns a four-digit year.
Syntax: YEAR(date_value)
Examples:
Lists all employees born in 1982:
    SELECT EMP_LNAME, EMP_FNAME, EMP_DOB, YEAR(EMP_DOB) AS YEAR
    FROM EMPLOYEE
    WHERE YEAR(EMP_DOB) = 1982;

Function: MONTH
Returns a two-digit month code.
Syntax: MONTH(date_value)
Examples:
Lists all employees born in November:
    SELECT EMP_LNAME, EMP_FNAME, EMP_DOB, MONTH(EMP_DOB) AS MONTH
    FROM EMPLOYEE
    WHERE MONTH(EMP_DOB) = 11;

Function: DAY
Returns the number of the day.
Syntax: DAY(date_value)
Examples:
Lists all employees born on the 14th day of the month:
    SELECT EMP_LNAME, EMP_FNAME, EMP_DOB, DAY(EMP_DOB) AS DAY
    FROM EMPLOYEE
    WHERE DAY(EMP_DOB) = 14;

Function: ADDDATE
Adds a number of days to a date.
Syntax: ADDDATE(date_value, n)
n = number of days
Examples:
Lists all products with the date they will have been on the shelf for 30 days:
    SELECT P_CODE, P_INDATE, ADDDATE(P_INDATE, 30)
    FROM PRODUCT
    ORDER BY ADDDATE(P_INDATE, 30);

Function: DATE_ADD
Adds a number of days, months, or years to a date. This is similar to ADDDATE except it is more robust. It allows the user to specify the date unit to add.
Syntax: DATE_ADD(date, INTERVAL n unit)
n = number to add
unit = date unit, can be:
    DAY: add n days
    WEEK: add n weeks
    MONTH: add n months
    YEAR: add n years
Examples:
Lists all products with their expiration date (two years from the purchase date):
    SELECT P_CODE, P_INDATE, DATE_ADD(P_INDATE, INTERVAL 2 YEAR)
    FROM PRODUCT
    ORDER BY DATE_ADD(P_INDATE, INTERVAL 2 YEAR);
Function: LAST_DAY
Returns the date of the last day of the month given in a date.
Syntax: LAST_DAY(date_value)
Examples:
Lists all employees who were hired within the last seven days of a month:
    SELECT EMP_LNAME, EMP_FNAME, EMP_HIRE_DATE
    FROM EMPLOYEE
    WHERE EMP_HIRE_DATE >= DATE_ADD(LAST_DAY(EMP_HIRE_DATE), INTERVAL -7 DAY);

9.4.2 Numeric Functions

Numeric functions can be grouped in many different ways, such as algebraic, trigonometric and logarithmic. In this section, you will learn several very useful functions. Do not confuse the SQL aggregate functions you saw in the previous chapter with the numeric functions in this section. The first group operates over a set of values (multiple rows, hence, the name aggregate functions), while the numeric functions covered here operate over a single row. Numeric functions take one numeric parameter and return one value. Table 9.6 shows a selected group of numeric functions available in an Oracle DBMS.

Table 9.6 Selected Oracle numeric functions

Function: ABS
Returns the absolute value of a number.
Syntax: ABS(numeric_value)
Example(s):
Lists absolute values:
    SELECT 1.95, -1.93, ABS(1.95), ABS(-1.93)
    FROM DUAL;

Function: ROUND
Rounds a value to a specified precision (number of digits).
Syntax: ROUND(numeric_value, p)
p = precision
Example(s):
Lists the product prices rounded to one and zero decimal places:
    SELECT P_CODE, P_PRICE,
           ROUND(P_PRICE,1) AS PRICE1,
           ROUND(P_PRICE,0) AS PRICE0
    FROM PRODUCT;

Function: TRUNC
Truncates a value to a specified precision (number of digits).
Syntax: TRUNC(numeric_value, p)
p = precision
Example(s):
Lists the product price rounded to one and zero decimal places and truncated:
    SELECT P_CODE, P_PRICE,
           ROUND(P_PRICE,1) AS PRICE1,
           ROUND(P_PRICE,0) AS PRICE0,
           TRUNC(P_PRICE,0) AS PRICEX
    FROM PRODUCT;
Function: CEIL/FLOOR
Returns the smallest integer greater than or equal to a number, or returns the largest integer equal to or less than a number, respectively.
Syntax:
    CEIL(numeric_value)
    FLOOR(numeric_value)
Example(s):
Lists the product price, the smallest integer greater than or equal to the product price, and the largest integer equal to or less than the product price:
    SELECT P_PRICE, CEIL(P_PRICE), FLOOR(P_PRICE)
    FROM PRODUCT;
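The difference between rounding, truncating, CEIL and FLOOR is easiest to see on literal values, especially negative ones. The following sketch is not taken from the text; it simply evaluates the Oracle numeric functions described above against the DUAL pseudo table, with the result of each expression noted in a comment. The column aliases are illustrative.

```sql
-- Sketch only: literal values evaluated against Oracle's DUAL pseudo table.
SELECT ROUND(15.678, 1) AS ROUND_1,    -- 15.7  (rounded to one decimal place)
       TRUNC(15.678, 1) AS TRUNC_1,    -- 15.6  (extra digits simply dropped)
       CEIL(15.678)     AS CEIL_POS,   -- 16    (smallest integer >= value)
       FLOOR(15.678)    AS FLOOR_POS,  -- 15    (largest integer <= value)
       ROUND(-15.5)     AS ROUND_NEG,  -- -16   (NUMBER values round half away from zero)
       TRUNC(-15.5)     AS TRUNC_NEG,  -- -15   (truncation moves toward zero)
       CEIL(-15.5)      AS CEIL_NEG,   -- -15
       FLOOR(-15.5)     AS FLOOR_NEG,  -- -16
       ABS(-15.5)       AS ABS_NEG     -- 15.5
FROM   DUAL;
```

For positive values TRUNC and FLOOR agree, and for negative values TRUNC and CEIL agree, which is a common source of off-by-one errors when prices or quantities can be negative.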
9.4.3 String Functions

String manipulations are among the most-used functions in programming. If you have ever created a report using any programming language, you know the importance of properly concatenating strings of characters, printing names in uppercase, or knowing the length of a given attribute. Table 9.7 shows a subset of useful string manipulation functions.

Table 9.7 Selected string functions

Function: Concatenation
    || (Oracle)
    + (Microsoft Access and SQL Server)
Concatenates data from two different character columns and returns a single column.
Syntax:
    strg_value || strg_value
    strg_value + strg_value
Example(s):
Lists all employee names (concatenated).
In Oracle, use the following:
    SELECT EMP_LNAME || ', ' || EMP_FNAME AS NAME
    FROM EMPLOYEE;
In Microsoft Access and SQL Server, use the following:
    SELECT EMP_LNAME + ', ' + EMP_FNAME AS NAME
    FROM EMPLOYEE;

Function: UPPER/LOWER
Returns a string in all capital or all lowercase letters.
Syntax:
    UPPER(strg_value)
    LOWER(strg_value)
Example(s):
Lists all employee names in all capital letters (concatenated).
In Oracle, use the following:
    SELECT UPPER(EMP_LNAME) || ', ' || UPPER(EMP_FNAME) AS NAME
    FROM EMPLOYEE;
In SQL Server, use the following:
    SELECT UPPER(EMP_LNAME) + ', ' + UPPER(EMP_FNAME) AS NAME
    FROM EMPLOYEE;
Lists all employee names in all lowercase letters (concatenated).
In Oracle, use the following:
    SELECT LOWER(EMP_LNAME) || ', ' || LOWER(EMP_FNAME) AS NAME
    FROM EMPLOYEE;
In SQL Server, use the following:
    SELECT LOWER(EMP_LNAME) + ', ' + LOWER(EMP_FNAME) AS NAME
    FROM EMPLOYEE;
Not supported by Microsoft Access.
Function: SUBSTRING
Returns a substring or part of a given string parameter.
Syntax:
    SUBSTR(strg_value, p, l) (Oracle)
    SUBSTRING(strg_value,p,l) (SQL Server)
p = start position
l = length of characters
Example(s):
Lists the first three characters of all employee phone numbers.
In Oracle, use the following:
    SELECT EMP_PHONE, SUBSTR(EMP_PHONE,1,3) AS PREFIX
    FROM EMPLOYEE;
In SQL Server, use the following:
    SELECT EMP_PHONE, SUBSTRING(EMP_PHONE,1,3) AS PREFIX
    FROM EMPLOYEE;
Not supported by Microsoft Access.

Function: LENGTH
Returns the number of characters in a string value.
Syntax:
    LENGTH(strg_value) (Oracle)
    LEN(strg_value) (SQL Server)
Example(s):
Lists all employee last names and the length of their names in descending order by last name length.
In Oracle, use the following:
    SELECT EMP_LNAME, LENGTH(EMP_LNAME) AS NAMESIZE
    FROM EMPLOYEE;
In Microsoft Access and SQL Server, use the following:
    SELECT EMP_LNAME, LEN(EMP_LNAME) AS NAMESIZE
    FROM EMPLOYEE;
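The string functions above are usually nested rather than used one at a time. As a hedged illustration (not an example from the text), the Oracle query below builds a lowercase login-style name from the first initial and the last name, and sorts by last name length in descending order, which the LENGTH example describes but does not show. It assumes the EMPLOYEE table used throughout; the aliases LOGIN_NAME and NAME_CHARS are hypothetical.

```sql
-- Sketch only: assumes the EMPLOYEE table used in the examples above.
SELECT EMP_LNAME,
       EMP_FNAME,
       -- First initial plus last name, forced to lowercase.
       LOWER(SUBSTR(EMP_FNAME, 1, 1) || EMP_LNAME) AS LOGIN_NAME,
       -- Combined number of characters in both name parts
       -- (NULL if either name is NULL).
       LENGTH(EMP_FNAME) + LENGTH(EMP_LNAME)       AS NAME_CHARS
FROM   EMPLOYEE
-- Longest last names first.
ORDER  BY LENGTH(EMP_LNAME) DESC;
```

In SQL Server the same idea would use +, SUBSTRING and LEN in place of ||, SUBSTR and LENGTH, as the table indicates.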
9.4.4 Conversion Functions

Conversion functions allow you to take a value of a given data type and convert it to the equivalent value in another data type. In Section 9.4.1, you learnt about two basic Oracle conversion functions: TO_CHAR and TO_DATE. Note that the TO_CHAR function takes a date value and returns a character string representing a day, a month or a year. In the same way, the TO_DATE function takes a character string representing a date and returns an actual date in Oracle format. In this section, you will see how to use the TO_CHAR function to convert numbers to a formatted character string and how to use the TO_NUMBER function to convert text strings to numeric values. A summary of the selected functions is shown in Table 9.8.

Table 9.8 Selected conversion functions

Function: Numeric to Character
    TO_CHAR (Oracle)
    CAST, CONVERT (SQL Server)
Returns a character string from a numeric value.
Syntax:
    Oracle: TO_CHAR(numeric_value, fmt)
Example(s):
Lists all product prices, quantity on hand, per cent discount, and total inventory cost, using formatted values.
In Oracle, use the following:
    SELECT P_CODE,
           TO_CHAR(P_PRICE,'999.99') AS PRICE,
           TO_CHAR(P_QOH,'9,999.99') AS QUANTITY,
           TO_CHAR(P_DISCOUNT,'0.99') AS DISC,
           TO_CHAR(P_PRICE*P_QOH,'99,999.99') AS TOTAL_COST
    FROM PRODUCT;
    SQL Server: CAST (numeric AS varchar(length))
                CONVERT(varchar(length), numeric)
In SQL Server, use the following:
    SELECT P_CODE,
           CAST(P_PRICE AS VARCHAR(8)) AS PRICE,
           CONVERT(VARCHAR(4),P_QOH) AS QUANTITY,
           CAST(P_DISCOUNT AS VARCHAR(4)) AS DISC,
           CAST(P_PRICE*P_QOH AS VARCHAR(10)) AS TOTAL_COST
    FROM PRODUCT;
Not supported in Microsoft Access.

Function: Date to Character
    TO_CHAR (Oracle)
    CAST, CONVERT (SQL Server)
Returns a character string or a formatted character string from a date value.
Syntax:
    Oracle: TO_CHAR(date_value, fmt)
    SQL Server: CAST (date AS varchar(length))
                CONVERT(varchar(length), date)
Example(s):
Lists all employee dates of birth, using different date formats.
In Oracle, use the following:
    SELECT EMP_LNAME, EMP_DOB,
           TO_CHAR(EMP_DOB, 'DAY, MONTH DD, YYYY') AS DATEOFBIRTH
    FROM EMPLOYEE;
    SELECT EMP_LNAME, EMP_DOB,
           TO_CHAR(EMP_DOB, 'YYYY/MM/DD') AS DATEOFBIRTH
    FROM EMPLOYEE;
In SQL Server, use the following:
    SELECT EMP_LNAME, EMP_DOB,
           CONVERT(varchar(11), EMP_DOB) AS "DATE OF BIRTH"
    FROM EMPLOYEE;
    SELECT EMP_LNAME, EMP_DOB,
           CAST(EMP_DOB AS varchar(11)) AS "DATE OF BIRTH"
    FROM EMPLOYEE;
Not supported in Microsoft Access.

Function: String to Number
    TO_NUMBER (Oracle)
    CAST (SQL Server)
Returns a formatted number from a character string, using a given format.
Syntax:
    Oracle: TO_NUMBER(char_value, fmt)
fmt = format used; can be:
    9 = displays a digit
    0 = displays a leading zero
    , = displays the comma
    . = displays the decimal point
    $ = displays the dollar sign
    B = leading blank
    S = leading sign
    MI = trailing minus sign
Example(s):
Converts text strings to numeric values when importing data to a table from another source in text format; for example, the query shown below uses the TO_NUMBER function to convert formatted text to Oracle default numeric values using the format masks given.
In Oracle, use the following:
    SELECT TO_NUMBER('-123.99', 'S999.99'),
           TO_NUMBER('99.78-','B999.99MI')
    FROM DUAL;
In SQL Server, use the following:
    SELECT CAST('-123.99' AS NUMERIC(8,2)),
           CAST(-99.78 AS NUMERIC(8,2))
The SQL Server CAST function does not support the trailing sign on the character string.
Not supported in Microsoft Access.

Function: DECODE (Oracle), CASE (SQL Server)
Compares an attribute or expression with a series of values and returns an associated value, or a default value if no match is found.
Example(s):
The following example returns the sales tax rate for specified countries:
    Compares V_COUNTRY to 'SA'; if the values match, it returns .08.
    Compares V_COUNTRY to 'FR'; if the values match, it returns .05.
    Compares V_COUNTRY to 'UK'; if the values match, it returns .085.
    If there is no match, it returns 0.00 (the default value).
Syntax:
    Oracle: DECODE(e, x, y, d)
        e = attribute or expression
        x = value with which to compare e
        y = value to return if e = x
        d = default value to return if e is not equal to x
    SQL Server: CASE WHEN condition THEN value1 ELSE value2 END
In Oracle, use the following:
    SELECT V_CODE, V_COUNTRY,
           DECODE(V_COUNTRY,'SA',.08,'FR',.05,'UK',.085, 0.00) AS TAX
    FROM VENDOR;
In SQL Server, use the following:
    SELECT V_CODE, V_COUNTRY,
           CASE WHEN V_COUNTRY = 'SA' THEN .08
                WHEN V_COUNTRY = 'FR' THEN .05
                WHEN V_COUNTRY = 'UK' THEN .085
                ELSE 0.00
           END AS TAX
    FROM VENDOR;
Not supported in Microsoft Access.

9.5 Oracle Sequences

If you use Microsoft Access, you might be familiar with the Autonumber data type, which you can use to define a column in your table that will be automatically populated with unique numeric values. In fact, if you create a table in Microsoft Access and forget to define a primary key, Microsoft Access offers to create a primary key column; if you accept, you will notice that Microsoft Access creates a column named ID with an Autonumber data type. After you define a column as an Autonumber type, every time you insert a row in the table, Microsoft Access will automatically add a value to that column, starting with 1 and increasing the value by 1 in every new row you add. Also, you cannot include that column in your INSERT statements; Microsoft Access will not let you edit that value at all. Similarly, in Oracle, you can use a sequence to assign values to a column on a table. But an Oracle sequence is very different from the Microsoft Access Autonumber data type and deserves close scrutiny:
• Oracle sequences are an independent object in the database. (Sequences are not a data type.)
• Oracle sequences have a name and can be used anywhere that a value is expected.
• Oracle sequences are not tied to a table or a column.
• Oracle sequences generate a numeric value that can be assigned to any column in any table.
• The table attribute to which you assign a value based on a sequence can be edited and modified.
• An Oracle sequence can be created and deleted any time.

The basic syntax to create a sequence in Oracle is:
    CREATE SEQUENCE name [START WITH n] [INCREMENT BY n] [CACHE | NOCACHE]
where:
• name is the name of the sequence.
• n is an integer value that can be positive or negative.
• START WITH specifies the initial sequence value. (The default value is 1.)
• INCREMENT BY determines the value by which the sequence is incremented. (The default increment value is 1. The sequence increment can be positive or negative to enable you to create ascending or descending sequences.)
• The CACHE or NOCACHE clause indicates whether Oracle will pre-allocate sequence numbers in memory. (Oracle pre-allocates 20 values by default.)

For example, you could create a sequence to automatically assign values to the customer code each time a new customer is added, and create another sequence to automatically assign values to the invoice number each time a new invoice is added. The SQL code to accomplish those tasks is:
    CREATE SEQUENCE CUS_CODE_SEQ START WITH 20010 NOCACHE;
    CREATE SEQUENCE INV_NUMBER_SEQ START WITH 4010 NOCACHE;
To check all of the sequences you have created, use the following SQL command, illustrated in Figure 9.22:
    SELECT * FROM USER_SEQUENCES;

Figure 9.22 Oracle sequence

To use sequences during data entry, you must use two special pseudo columns: NEXTVAL and CURRVAL. NEXTVAL retrieves the next available value from a sequence. CURRVAL retrieves the current value of a sequence. For example, you can use the following code to enter a new customer:
    INSERT INTO CUSTOMER
    VALUES (CUS_CODE_SEQ.NEXTVAL, 'Connery', 'Sean', NULL, '0181', '898-2008', 0.00);
The preceding SQL statement adds a new customer to the CUSTOMER table and assigns the value 20010 to the CUS_CODE attribute. Let's examine some important sequence characteristics:
• CUS_CODE_SEQ.NEXTVAL retrieves the next available value from the sequence.
• Each time you use NEXTVAL, the sequence is incremented.
• Once a sequence value is used (through NEXTVAL), it cannot be used again. If, for some reason, your SQL statement rolls back, the sequence value does not roll back. If you issue another SQL statement (with another NEXTVAL), the next available sequence value is returned to the user; it will look as though the sequence skips a number.
• You can issue an INSERT statement without using the sequence.
• CURRVAL retrieves the current value of a sequence, that is, the last sequence number used, which was generated using a NEXTVAL. You cannot use CURRVAL unless a NEXTVAL was issued previously in the same session.

The main use for CURRVAL is to enter rows in dependent tables. For example, the INVOICE and LINE tables in the Ch09_SaleCo database are related in a one-to-many relationship through the INV_NUMBER attribute. You can use the INV_NUMBER_SEQ sequence to generate invoice numbers automatically. Then, using CURRVAL, you can get the latest INV_NUMBER used and assign it to the related INV_NUMBER foreign key attribute in the LINE table. For example:
    INSERT INTO INVOICE VALUES (INV_NUMBER_SEQ.NEXTVAL, 20010, SYSDATE);
    INSERT INTO LINE VALUES (INV_NUMBER_SEQ.CURRVAL, 1, '13-Q2/P2', 1, 14.99);
    INSERT INTO LINE VALUES (INV_NUMBER_SEQ.CURRVAL, 2, '23109-HB', 1, 9.95);
    COMMIT;
The results are shown in Figure 9.23.

Figure 9.23 Oracle sequence examples

In the example shown in Figure 9.23, INV_NUMBER_SEQ.NEXTVAL retrieves the next available sequence number (4011) and assigns it to the INV_NUMBER column in the INVOICE table. Also note the use of the SYSDATE attribute to automatically insert the current date in the INV_DATE attribute.
Next, the following two INSERT statements add the products being sold to the LINE table. In this case, INV_NUMBER_SEQ.CURRVAL refers to the last-used INV_NUMBER_SEQ sequence number (4011). In this way, the relationship between INVOICE and LINE is established automatically. The COMMIT statement at the end of the command sequence makes the changes permanent. Of course, you can also issue a ROLLBACK, in which case the rows you inserted in the INVOICE and LINE tables are rolled back (but remember that the sequence number is not). Once you use a sequence number (with NEXTVAL), there is no way to reuse it! This no-reuse characteristic is designed to guarantee that the sequence always generates unique values.

Remember these points when you think about sequences:
• The use of sequences is optional. You can enter the values manually.
• A sequence is not associated with a table. As in the examples in Figure 9.23, two distinct sequences were created (one for customer code values and one for invoice number values), but you could have created just one sequence and used it to generate unique values for both tables.

Finally, you can drop a sequence from a database with a DROP SEQUENCE command. For example, to drop the sequences created earlier, you type:
    DROP SEQUENCE CUS_CODE_SEQ;
    DROP SEQUENCE INV_NUMBER_SEQ;
Dropping a sequence does not delete the values you assigned to table attributes (CUS_CODE and INV_NUMBER); it deletes only the sequence object from the database. The values you assigned to the table columns (CUS_CODE and INV_NUMBER) remain in the database.

Because the CUSTOMER and INVOICE tables are used in subsequent examples, you should keep the original data set. Therefore, you can delete the customer, invoice and line rows you just added by using the following commands:
    DELETE FROM INVOICE WHERE INV_NUMBER = 4011;
    DELETE FROM CUSTOMER WHERE CUS_CODE = 20010;
    COMMIT;
Those commands delete the recently added invoice and all of the invoice line rows associated with the invoice (the LINE table's INV_NUMBER foreign key was defined with the ON DELETE CASCADE option) and the recently added customer. The COMMIT statement saves all changes to permanent storage.

Note
At this point, you'll need to re-create the CUS_CODE_SEQ and INV_NUMBER_SEQ sequences, as they will be used again later in the chapter. Enter:
    CREATE SEQUENCE CUS_CODE_SEQ START WITH 20010 NOCACHE;
    CREATE SEQUENCE INV_NUMBER_SEQ START WITH 4011 NOCACHE;
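The no-reuse behaviour of sequences after a ROLLBACK can be checked with a small experiment. The following Oracle sketch is not part of the chapter's scripts: the names SEQ_DEMO and SEQ_DEMO_LOG are hypothetical, chosen so that the experiment does not touch CUS_CODE_SEQ, INV_NUMBER_SEQ or the Ch09_SaleCo tables. No specific values are asserted, because the exact numbers issued can depend on the Oracle version and settings.

```sql
-- Sketch only: SEQ_DEMO and SEQ_DEMO_LOG are hypothetical demonstration objects.
CREATE SEQUENCE SEQ_DEMO START WITH 100 INCREMENT BY 1 NOCACHE;

CREATE TABLE SEQ_DEMO_LOG (
  LOG_ID   NUMBER PRIMARY KEY,
  LOG_NOTE VARCHAR2(30)
);

-- This row consumes a sequence value and is then undone.
INSERT INTO SEQ_DEMO_LOG VALUES (SEQ_DEMO.NEXTVAL, 'rolled back');
ROLLBACK;

-- The rollback removed the row but did not hand the value back to the
-- sequence, so this row receives a later value.
INSERT INTO SEQ_DEMO_LOG VALUES (SEQ_DEMO.NEXTVAL, 'committed');
COMMIT;

-- One surviving row; its LOG_ID equals the value just issued in this session.
SELECT LOG_ID, LOG_NOTE FROM SEQ_DEMO_LOG;
SELECT SEQ_DEMO.CURRVAL FROM DUAL;

-- Remove the demonstration objects.
DROP TABLE SEQ_DEMO_LOG;
DROP SEQUENCE SEQ_DEMO;
```

The gap between the first value issued and the LOG_ID of the surviving row is the "skipped" number described above; it is harmless as long as the column is used only as a unique identifier.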
9.6 Updatable Views

In Chapter 8, Beginning Structured Query Language, you learnt how to create a view and why and how views are used. As mentioned in Chapter 8, Microsoft Access does not support views. While views can be simulated using a SQL query, as is seen here, a view is far more versatile. You will now look at how to make views serve common data management tasks executed by database administrators.

One of the most common operations in production database environments is using batch update routines to update a master table attribute (field) with transaction data. As the name implies, a batch update routine pools multiple transactions into a single batch to update a master table field in a single operation. For example, a batch update routine is commonly used to update a product's quantity on hand based on summary sales transactions. Such routines are typically run as overnight batch jobs to update the quantity on hand of products in inventory. The sales transactions performed, for example, by travelling salespeople in remote areas were entered during periods when the system was offline.

To demonstrate a batch update routine, let's begin by defining the master product table (PRODMASTER) and the product monthly sales totals table (PRODSALES) shown in Figure 9.24. As you examine the tables, note the 1:1 relationship between the two tables.

Figure 9.24 The PRODMASTER and PRODSALES tables

Table name: PRODMASTER
PROD_ID    PROD_DESC    PROD_QOH
A123       SCREWS       60
BX34       NUTS         37
C583       BOLTS        50

Table name: PRODSALES
PROD_ID    PS_QTY
A123       7
BX34       3

Online Content
For Microsoft Access users, the PRODMASTER and PRODSALES tables are located in the 'Ch09_UV' database, which is located on the online platform for this book. For Oracle users, all SQL commands you see in this section are located in script files (uv01.sql through uv04.sql) in the student companion. After you locate the script files, you can copy and paste the command sequences into your SQL*Plus program.

Using the tables in Figure 9.24, let's update the PRODMASTER table by subtracting the PRODSALES table's product monthly sales quantity (PS_QTY) from the PRODMASTER table's PROD_QOH. To produce the required update, the update query is written like this:
    UPDATE PRODMASTER, PRODSALES
    SET PRODMASTER.PROD_QOH = PROD_QOH - PS_QTY
    WHERE PRODMASTER.PROD_ID = PRODSALES.PROD_ID;
Note that the update statement reflects the following sequence of events:
• Join the PRODMASTER and PRODSALES tables.
• Update the PROD_QOH attribute (using the PS_QTY value in the PRODSALES table) in each row of the PRODMASTER table with the matching PROD_ID in the PRODSALES table.

To be used in a batch update, the PRODSALES data must be stored in a base table rather than in a view. That query works in Microsoft Access, but Oracle returns the error message shown in Figure 9.25.

Figure 9.25 The Oracle UPDATE error message

Oracle produced the error message because Oracle expects to find a single table name in the UPDATE statement. In fact, you cannot join tables in the UPDATE statement in Oracle. To solve that problem, you have to create an updatable view. As its name suggests, an updatable view is a view that can be used to update attributes in the base table(s) that is (are) used in the view. You must realise that not all views are updatable. Actually, several restrictions govern updatable views, and some of them are vendor-specific.

Note
Keep in mind that the examples in this section are generated in Oracle. To see which restrictions are placed on updatable views by the DBMS you are using, check the appropriate DBMS manual. MySQL version 8.0 supports both updatable and insertable views. For more information on the syntax, consult the MySQL 8.0 reference manual.
The most common updatable view restrictions are as follows:
• GROUP BY expressions or aggregate functions cannot be used.
• You cannot use set operators such as UNION, INTERSECT and MINUS.
• Most restrictions are based on the use of JOINs or group operators in views.

To meet the Oracle limitations, an updatable view named PSVUPD has been created, as shown in Figure 9.26.

Figure 9.26 Creating an updatable view in Oracle

One easy way to determine whether a view can be used to update a base table is to examine the view's output. If the primary key columns of the base table you want to update still have unique values in the view, the base table is updatable. For example, if the PROD_ID column of the view returns the A123 or BX34 values more than once, the PRODMASTER table cannot be updated through the view.

After creating the updatable view shown in Figure 9.26, you can use the UPDATE command to update the view, thereby updating the PRODMASTER table. Figure 9.27 shows how the UPDATE command is used and shows the final contents of the PRODMASTER table after the UPDATE is executed.
Figure 9.27 PRODMASTER table update, using an updatable view

Although the batch update procedure just illustrated meets the goal of updating a master table with data from a transaction table, the preferred, real-world solution to the update problem is to use procedural SQL. You'll learn about procedural SQL in the next section.
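Because the contents of Figures 9.26 and 9.27 are not reproduced in the text, the statements below are only a sketch of what creating and using the PSVUPD view might look like in Oracle; they are not a transcription of the figures or of the uv script files. The sketch assumes that PROD_ID is the primary key of both PRODMASTER and PRODSALES, so that each PRODMASTER row appears at most once in the join, which is the uniqueness condition described above.

```sql
-- Sketch only: assumes PROD_ID is the primary key of both base tables.
CREATE VIEW PSVUPD AS
  SELECT PRODMASTER.PROD_ID, PROD_QOH, PS_QTY
  FROM   PRODMASTER, PRODSALES
  WHERE  PRODMASTER.PROD_ID = PRODSALES.PROD_ID;

-- Inspect the view output: each PROD_ID should appear only once.
SELECT * FROM PSVUPD;

-- Updating the view updates PROD_QOH in the PRODMASTER base table.
UPDATE PSVUPD
SET    PROD_QOH = PROD_QOH - PS_QTY;

-- With the data in Figure 9.24: A123 becomes 60 - 7 = 53,
-- BX34 becomes 37 - 3 = 34, and C583 (no sales row) stays at 50.
SELECT * FROM PRODMASTER;
```

C583 is untouched because it has no matching row in PRODSALES and therefore never appears in the view.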
9
9.7
proCeDuraL
SQL
Thus far, you have learnt to use SQL to read, write and delete data in the database. For example, you learnt to update values in a record, to add records and to delete records. Unfortunately, SQL does not support the conditional execution of procedures that are typically supported by a programming language using the general format:
IF <condition>
    THEN <perform procedure>
    ELSE <perform alternate procedure>
END IF
SQL also fails to support the looping operations in programming languages that permit the execution of repetitive actions typically encountered in a programming environment. The typical format is:
DO WHILE
    <perform procedure>
END DO
Traditionally, if you wanted to perform a conditional (IF-THEN-ELSE) or looping (DO-WHILE) type of operation (that is, a procedural type of programming), you would use a programming language such as C, Visual Basic .NET, C# or Java. That's why many older (so-called legacy) business applications are based on enormous numbers of COBOL program lines. Although that approach is still common, it usually involves the duplication of application code in many programs. Therefore, when procedural changes are required, program modifications must be made in many different programs. An environment characterised by such redundancies often creates data management problems.
A better approach is to isolate critical code and then have all application programs call the shared code. The advantage of that modular approach is that the application code is isolated in a single program, thus yielding better maintenance and logic control. In any case, the rise of distributed databases (see Chapter 14, Distributed Databases) and object-orientated databases (see Appendix G, Object-Orientated Databases, on the online platform for this book) required that more application code be stored and executed within the database. To meet that requirement, most RDBMS vendors created numerous programming language extensions. Those extensions include:
• Flow control procedural programming structures (IF-THEN-ELSE, DO-WHILE) for logic representation.
• Variable declaration and designation within the procedures.
• Error management.
To remedy the lack of procedural functionality in SQL and to provide some standardisation within the many vendor offerings, the SQL-99 standard defined the use of persistent stored modules. A persistent stored module (PSM) is a block of code (containing standard SQL statements and procedural extensions) that is stored and executed at the DBMS server. The PSM represents business logic that can be encapsulated, stored and shared among multiple database users. A PSM lets an administrator assign specific access rights to a stored module to ensure that only authorised users can use it. Support for persistent stored modules is left to each vendor to implement. In fact, for many years, some RDBMSs (such as Oracle, SQL Server and DB2) supported stored procedure modules within the database before the official standard was promulgated.
Oracle implements PSMs through its procedural SQL language. Procedural SQL (PL/SQL) is a language that makes it possible to use and store procedural code and SQL statements within the database and to merge SQL and traditional programming constructs, such as variables, conditional processing (IF-THEN-ELSE), basic loops (FOR and WHILE loops) and error trapping. The procedural code is executed as a unit by the DBMS when it is invoked (directly or indirectly) by the end user. End users can use PL/SQL to create:
• Anonymous PL/SQL blocks.
• Triggers (covered in Section 9.7.1).
• Stored procedures (covered in Section 9.7.2 and Section 9.7.3).
• PL/SQL functions (covered in Section 9.7.4).
Do not confuse PL/SQL functions with SQL's built-in aggregate functions such as MIN and MAX. SQL built-in functions can be used only within SQL statements, while PL/SQL functions are mainly invoked within PL/SQL programs such as triggers and stored procedures. Functions can also be called within SQL statements, provided they conform to very specific rules that are dependent on your DBMS environment.
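As a rough illustration of the constructs listed for PL/SQL (variables, conditional processing, a basic loop and error trapping), the following anonymous block is a minimal sketch that touches no tables. The variable names W_TOTAL and W_I are made up for this example, and the catch-all WHEN OTHERS handler is used only to keep the sketch short.

```sql
SET SERVEROUTPUT ON

DECLARE
  W_TOTAL NUMBER(4) := 0;            -- running total (illustrative variable)
BEGIN
  FOR W_I IN 1..5 LOOP               -- basic FOR loop over the values 1 to 5
    IF MOD(W_I, 2) = 0 THEN          -- conditional processing
      W_TOTAL := W_TOTAL + W_I;      -- add even numbers only
    END IF;
  END LOOP;
  DBMS_OUTPUT.PUT_LINE('Sum of even numbers from 1 to 5: ' || W_TOTAL);
EXCEPTION
  WHEN OTHERS THEN                   -- error trapping: report any run-time error
    DBMS_OUTPUT.PUT_LINE('Unexpected error: ' || SQLERRM);
END;
/
```

The block runs as a unit at the server; the SET SERVEROUTPUT ON line is a SQL*Plus setting that lets the client display the message.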
Note
PL/SQL, triggers and stored procedures are illustrated within the context of an Oracle DBMS. All examples in the following sections assume the use of Oracle RDBMS.
Using Oracle SQL*Plus, you can write a PL/SQL code block by enclosing the commands inside BEGIN and END clauses. For example, the following PL/SQL block inserts a new row in the VENDOR table. (See Figure 9.28.)
Figure 9.28 Anonymous PL/SQL block examples
BEGIN
    INSERT INTO VENDOR
    VALUES (25678,'Microwork Corp.','Adam Gates','5910','546-8484','NL','N');
END;
/
The PL/SQL block shown in Figure 9.28 is known as an anonymous PL/SQL block because it has not been given a specific name. (Incidentally, note that the block's last line uses a forward slash (/) to indicate the end of the command-line entry.) That type of PL/SQL block executes as soon as you press the Enter key after typing the forward slash. Following the PL/SQL block's execution, you see the message PL/SQL procedure successfully completed.
But suppose that you want a more specific message displayed on the SQL*Plus screen after a procedure is completed, for example, New Vendor Added. To produce a more specific message, you must do two things:
1. At the SQL> prompt, type SET SERVEROUTPUT ON. This SQL*Plus command enables the client console (SQL*Plus) to receive messages from the server side (Oracle DBMS). Remember, just like standard SQL, the PL/SQL code (anonymous blocks, triggers, and procedures) is executed at the server side, not at the client side. (To stop receiving messages from the server, you would enter SET SERVEROUTPUT OFF.)
2. To send messages from the PL/SQL block to the SQL*Plus console, use the DBMS_OUTPUT.PUT_LINE procedure.
The following anonymous PL/SQL block inserts a row in the VENDOR table and displays the message New Vendor Added! (See Figure 9.28.)
BEGIN
    INSERT INTO VENDOR
    VALUES (25772,'Clue Store','Issac Hayes','5910','323-2009','NL','N');
    DBMS_OUTPUT.PUT_LINE('New Vendor Added!');
END;
/
In Oracle, you can use the SQL*Plus command SHOW ERRORS to help you diagnose errors found in PL/SQL blocks. The SHOW ERRORS command yields additional debugging information whenever you generate an error after creating or executing a PL/SQL block.
The following example of an anonymous PL/SQL block demonstrates several of the constructs supported by the procedural language. Remember that the exact syntax of the language is vendor-dependent; in fact, many vendors enhance their products with proprietary features.
DECLARE
    W_P1 NUMBER(3) := 0;
    W_P2 NUMBER(3) := 10;
    W_NUM NUMBER(2) := 0;
BEGIN
    WHILE W_P2 < 300 LOOP
        SELECT COUNT(P_CODE) INTO W_NUM FROM PRODUCT
        WHERE P_PRICE BETWEEN W_P1 AND W_P2;
        DBMS_OUTPUT.PUT_LINE('There are ' || W_NUM || ' Products with price between ' || W_P1 || ' and ' || W_P2);
        W_P1 := W_P2 + 1;
        W_P2 := W_P2 + 50;
    END LOOP;
END;
/
The block's code and execution are shown in Figure 9.29.
Figure 9.29 Anonymous PL/SQL block with variables and loops
The PL/SQL block shown in Figure 9.29 has the following characteristics:
• The PL/SQL block starts with the DECLARE section in which you declare the variable names, the data types and, if desired, an initial value. Supported data types are shown in Table 9.9.
• A WHILE loop is used. Note the syntax:
    WHILE condition LOOP
        PL/SQL statements;
    END LOOP
• The SELECT statement uses the INTO keyword to assign the output of the query to a PL/SQL variable. You can use the INTO keyword only inside a PL/SQL block of code. If the SELECT statement returns more than one value, you will get an error.
• Note the use of the string concatenation symbol || to display the output.
• Each statement inside the PL/SQL code must end with a semicolon ;.
Table 9.9 PL/SQL basic data types
Data Type    Description
CHAR         Character values of a fixed length; for example: W_PCODE CHAR(5)
VARCHAR2     Variable length character values; for example: W_FNAME VARCHAR2(15)
NUMBER       Numeric values; for example: W_PRICE NUMBER(6,2)
DATE         Date values; for example: W_EMP_DOB DATE
%TYPE        Inherits the data type from a variable that you declared previously or from an attribute of a database table; for example: W_PRICE PRODUCT.P_PRICE%TYPE assigns W_PRICE the same data type as the P_PRICE column in the PRODUCT table
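The points about %TYPE and about SELECT ... INTO returning more than one value can be illustrated with a short sketch. It assumes the PRODUCT table and P_PRICE column used elsewhere in this section; the exception handlers are an addition for illustration and rely on Oracle's predefined TOO_MANY_ROWS and NO_DATA_FOUND exceptions.

```sql
DECLARE
  W_PRICE PRODUCT.P_PRICE%TYPE;      -- inherits the data type of PRODUCT.P_PRICE
BEGIN
  -- No WHERE clause: if PRODUCT holds more than one row, the query returns
  -- several values and cannot be assigned to a single variable.
  SELECT P_PRICE INTO W_PRICE FROM PRODUCT;
  DBMS_OUTPUT.PUT_LINE('Price: ' || W_PRICE);
EXCEPTION
  WHEN TOO_MANY_ROWS THEN
    DBMS_OUTPUT.PUT_LINE('The query returned more than one row');
  WHEN NO_DATA_FOUND THEN
    DBMS_OUTPUT.PUT_LINE('The query returned no rows');
END;
/
```

Restricting the query to a single row, for example with a WHERE clause on the primary key, avoids the error in the first place.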
Note
PL/SQL blocks can contain only standard SQL data manipulation language (DML) commands such as SELECT, INSERT, UPDATE and DELETE. The use of data definition language (DDL) commands is not directly supported in a PL/SQL block.
The most useful feature of PL/SQL blocks is that they let you create code that can be named, stored and executed, either implicitly or explicitly, by the DBMS. That capability is especially desirable when you need to use triggers and stored procedures. We explore database triggers and stored procedures in the next sections.
9.7.1 Triggers
Automating business procedures and automatically maintaining data integrity and consistency are critical in a modern business environment. One of the most critical business procedures is proper inventory management. For example, you want to make sure that current product sales can be supported with sufficient product availability. Therefore, it is necessary to ensure that a product order be written to a vendor when that product's inventory drops below its minimum allowable quantity on hand. Better yet, how about ensuring that the task is completed automatically?
Microsoft Access does not support triggers. While the functionality of triggers can be supported at the application level, this does not create the same level of data integrity that is provided by triggers. When triggers are implemented at the application layer, it is possible for edits to be made directly to the database without the appropriate updates being propagated. Using triggers to embed logic into the database ensures that the correct updates are always propagated.
To accomplish automatic product ordering, you first need to make sure that the product's quantity on hand reflects an up-to-date and consistent value. After the appropriate product availability requirements have been set, two key issues must be addressed:
1. Business logic requires an update of the product quantity on hand each time there is a sale of that product.
2. If the product's quantity on hand falls below its minimum allowable inventory level, the product must be reordered.
To accomplish those two tasks, you could write multiple SQL statements: one to update the product quantity on hand and another to update the product reorder flag. Next, you would have to run each statement in the correct order each time there was a new sale. Such a multistage process would be inefficient because a series of SQL statements must be written and executed each time a product is sold. Even worse, that SQL environment requires that somebody must remember to perform the SQL tasks.
A trigger is procedural SQL code that is automatically invoked by the RDBMS upon the occurrence of a given data manipulation event. It is useful to remember that:
• A trigger is invoked before or after a data row is inserted, updated or deleted.
• A trigger is associated with a database table.
• Each database table may have one or more triggers.
• A trigger is executed as part of the transaction that triggered it.
Triggers tend to follow an Event-Condition-Action (ECA) model. An event can be any operation that changes the state of the database, such as a SQL insert or update command. The condition (or rule) determines whether the action is to be executed after the event has occurred. The action is what is undertaken, whether by the trigger or by an external program being called.
Triggers are critical to proper database operation and management. For example:
• Triggers can be used to enforce constraints that cannot be enforced at the DBMS design and implementation levels.
• Triggers add functionality by automating critical actions and providing appropriate warnings and suggestions for remedial action. In fact, one of the most common uses for triggers is to facilitate the enforcement of referential integrity.
• Triggers can be used to update table values, insert records in tables and call other stored procedures.
Triggers play a critical role in making the database truly useful; they also add processing power to the RDBMS and to the database system as a whole. Oracle recommends triggers for:
• Auditing purposes (creating audit logs).
• Automatic generation of derived column values.
• Enforcement of business or security constraints.
• Creation of replica tables for backup purposes.
To see how a trigger is created and used, let's examine a simple inventory management problem. For example, if a product's quantity on hand is updated when the product is sold, the system should automatically check whether the quantity on hand falls below its minimum allowable quantity. To demonstrate that process, let's use the PRODUCT table in Figure 9.30. Note the use of the minimum order quantity (P_MIN_ORDER) and the product reorder flag (P_REORDER) columns. The P_MIN_ORDER field indicates the minimum quantity for restocking an order. The P_REORDER column is a numeric field that indicates whether the product needs to be reordered (1 = Yes, 0 = No). The initial P_REORDER values are set to 0 (No) to serve as the basis for the initial trigger development.
Figure 9.30 The PRODUCT table
Online Content
Oracle users can run the PRODLIST.SQL script file to format the output of the PRODUCT table shown in Figure 9.30. The script file is located on the online platform for this book. (The PRODUCT table is also shown in the 'Ch09_SaleCo' database that is stored in Microsoft Access format.)
Given the PRODUCT table listing shown in Figure 9.30, let's create a trigger to evaluate the product's quantity on hand, P_QOH. If the quantity on hand is below the minimum quantity shown in P_MIN, the trigger sets the P_REORDER column to 1. (Remember that the number 1 in the P_REORDER column represents Yes.) The syntax to create a trigger in Oracle is:
CREATE OR REPLACE TRIGGER trigger_name
[BEFORE / AFTER] [DELETE / INSERT / UPDATE OF column_name] ON table_name
[FOR EACH ROW]
[DECLARE]
[variable_name data type[:=initial_value] ]
BEGIN
    PL/SQL instructions;
    ..........
END;
As you can see, a trigger definition contains the following parts:
• The triggering timing: BEFORE or AFTER. This timing indicates when the trigger's PL/SQL code executes; in this case, before or after the triggering statement is complete.
• The triggering event: The statement that causes the trigger to execute (INSERT, UPDATE, or DELETE).
• The triggering level: There are two types of triggers, statement-level triggers and row-level triggers.
  - A statement-level trigger is assumed if you omit the FOR EACH ROW keywords. This type of trigger is executed once, before or after the triggering statement is completed. This is the default case.
  - A row-level trigger requires use of the FOR EACH ROW keywords. This type of trigger is executed once for each row affected by the triggering statement. (In other words, if you update ten rows, the trigger executes ten times.)
• The triggering action: The PL/SQL code enclosed between the BEGIN and END keywords. Each statement inside the PL/SQL code must end with a semicolon ;.
In the PRODUCT table's case, you will create a statement-level trigger that is implicitly executed AFTER an UPDATE of the P_QOH and P_MIN attributes for an existing row or AFTER an INSERT of a new row in the PRODUCT table. The trigger action executes an UPDATE statement that compares the P_QOH with the P_MIN column. If the value of P_QOH is equal to or less than P_MIN, the trigger updates the P_REORDER to 1. The trigger code is shown in Figure 9.31.
Figure 9.31 The TRG_PRODUCT_REORDER trigger
To test this trigger version, let's change the minimum quantity for product '23114-AA' to 8. After that update, the trigger makes sure that the reorder flag is properly set for all of the products in the PRODUCT table. (See Figure 9.32.)
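Because the figures are not reproduced here, the following is a sketch of what a statement-level trigger matching the description above might look like, followed by the test update. It is an approximation built from the prose (AFTER INSERT or UPDATE of P_QOH and P_MIN, flag set to 1 where P_QOH is equal to or less than P_MIN), not a copy of Figure 9.31; P_CODE is assumed to be the PRODUCT key column, as in the earlier COUNT(P_CODE) example.

```sql
-- Statement-level trigger: no FOR EACH ROW, so it runs once per triggering statement.
CREATE OR REPLACE TRIGGER TRG_PRODUCT_REORDER
AFTER INSERT OR UPDATE OF P_QOH, P_MIN ON PRODUCT
BEGIN
  -- Flag every product whose quantity on hand is at or below its minimum.
  UPDATE PRODUCT
  SET P_REORDER = 1
  WHERE P_QOH <= P_MIN;
END;
/

-- Test: change the minimum quantity of one product, then inspect its flag.
UPDATE PRODUCT SET P_MIN = 8 WHERE P_CODE = '23114-AA';
SELECT P_CODE, P_QOH, P_MIN, P_REORDER
FROM PRODUCT
WHERE P_CODE = '23114-AA';
```

Note that the UPDATE inside the trigger touches only P_REORDER, which is not one of the triggering columns, so the trigger does not fire itself again.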
Figure 9.32 Successful trigger execution after the P_MIN value is updated
This trigger seems to work well, but what happens if you reduce the minimum quantity of product 2232/QWE? Figure 9.33 shows that when you update the minimum quantity, the quantity on hand of product 2232/QWE falls below the new minimum, but the reorder flag is still 0. Why?
Figure 9.33 The P_REORDER value mismatch after update of the P_MIN attribute
The answer is that the trigger does not consider all possible cases. Let's examine the TRG_PRODUCT_REORDER trigger code (Figure 9.31) in more detail:
• The trigger fires after the triggering statement is completed. Therefore, the DBMS always executes two statements (INSERT plus UPDATE or UPDATE plus UPDATE). That is, after you do an update of P_MIN or P_QOH or you insert a new row in the PRODUCT table, the trigger executes another UPDATE statement automatically.
• The triggering action performs an UPDATE that updates all of the rows in the PRODUCT table, even if the triggering statement updates just one row! This can affect the performance of the database. Imagine what happens if you have a PRODUCT table with 519,128 rows and you insert just one product. The trigger will update all 519,129 rows (519,128 original rows plus the one you inserted), including the rows that do not need an update!
• The trigger sets the P_REORDER value only to 1; it does not reset the value to 0, even if such an action is clearly required when the inventory level is back to a value greater than the minimum value.
Now let's modify the trigger to handle all update scenarios, as shown in Figure 9.34.
Figure 9.34 The second version of the TRG_PRODUCT_REORDER trigger
The trigger in Figure 9.34 sports several new features:
• The trigger is executed before the actual triggering statement is completed. In Figure 9.34, the triggering timing is defined in line 2, BEFORE INSERT OR UPDATE. This clearly indicates that the trigger is executed before the INSERT or UPDATE completes, unlike the previous trigger examples.
• The trigger is a row-level trigger instead of a statement-level trigger. The FOR EACH ROW keywords make the trigger a row-level trigger. Therefore, this trigger executes once for each row affected by the triggering statement.
• The trigger action uses the :NEW attribute reference to change the value of the P_REORDER attribute.
The use of the :NEW attribute references deserves a more detailed explanation. To understand its use, you must first consider a basic computing tenet: all changes are done first in primary memory, then to permanent memory. In other words, the computer cannot change anything directly in permanent storage (disk). It must first read the data from permanent storage to primary memory; then it makes the change in primary memory; finally, it writes the changed data back to permanent memory (disk).
The DBMS does exactly the same thing, in addition to something more. To ensure data integrity, the DBMS makes two copies of every row being changed by a DML (INSERT, UPDATE or DELETE) statement. (You will learn more about this in Chapter 12, Managing Transactions and Concurrency.) The first copy contains the original (old) values of the attributes before the changes. The second copy contains the changed (new) values of the attributes that are permanently saved to the database (after any changes made by an INSERT, UPDATE or DELETE). You can use :OLD to refer to the original values; you can use :NEW to refer to the changed values (the values that are stored in the table). You can use :NEW and :OLD attribute references only within the PL/SQL code of a database trigger action. For example:
• IF :NEW.P_QOH <= :NEW.P_MIN compares the quantity on hand with the minimum quantity of a product. Remember that this is a row-level trigger. Therefore, this comparison is done for each row that is updated by the triggering statement.
• Although the trigger is a BEFORE trigger, this does not mean that the triggering statement hasn't executed yet. On the contrary, the triggering statement has already taken place; otherwise, the trigger would not have fired and the :NEW values would not exist. Remember, BEFORE means before the changes are permanently saved to disk, but after the changes are made in memory.
• The trigger uses the :NEW reference to assign a value to the P_REORDER column before the UPDATE or INSERT results are permanently stored in the table. The assignment is always done to the :NEW value (never to the :OLD value), and the assignment always uses the := assignment operator. The :OLD values are read-only values; you cannot change them. Note that :NEW.P_REORDER := 1; assigns the value 1 to the P_REORDER column and :NEW.P_REORDER := 0; assigns the value 0 to the P_REORDER column.
• This new trigger version does not use any DML statement!
Figure 9.35 Execution of the second version of the trigger
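As a sketch of the row-level BEFORE trigger discussed above, the following code is reconstructed from the prose: the timing clause, the FOR EACH ROW keywords, the :NEW.P_QOH <= :NEW.P_MIN comparison and the two :NEW.P_REORDER assignments. It should be read as an approximation of Figure 9.34 rather than a verbatim copy.

```sql
CREATE OR REPLACE TRIGGER TRG_PRODUCT_REORDER
BEFORE INSERT OR UPDATE OF P_QOH, P_MIN ON PRODUCT
FOR EACH ROW                          -- row-level: runs once per affected row
BEGIN
  -- Compare the new values of the row being written and set the flag
  -- in the same row image, before it is stored in the table.
  IF :NEW.P_QOH <= :NEW.P_MIN THEN
    :NEW.P_REORDER := 1;              -- reorder needed
  ELSE
    :NEW.P_REORDER := 0;              -- stock is above the minimum
  END IF;
END;
/
```

Because the flag is set by assignment to :NEW rather than by a separate UPDATE, the trigger issues no DML of its own and touches only the rows changed by the triggering statement.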
Before testing the new trigger, note that product '11QER/31' currently has a quantity on hand that is above the minimum quantity, yet the reorder flag is set to 1. Given that condition, the reorder flag must be 0. After you create the new trigger, you can execute an UPDATE statement to fire it, as shown in Figure 9.35.
As you examine Figure 9.35, note the following important features:
• The trigger is automatically invoked for each affected row; in this case, all rows of the PRODUCT table. If your triggering statement would have affected only three rows, not all PRODUCT rows would have the correct P_REORDER value set. That's the reason the triggering statement was set up as shown in Figure 9.35.
• The trigger will run only if you insert a new product row or update P_QOH or P_MIN. If you update any other attribute, the trigger won't run.
The use of triggers facilitates the automation of multiple data management tasks. Although triggers are independent objects, they are associated with database tables. When you delete a table, all trigger objects are deleted with it. However, if you need to delete a trigger without deleting the table, you could use the following command:
DROP TRIGGER trigger_name
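One way to fire a row-level trigger for every row of the PRODUCT table, as the discussion above requires, is an UPDATE that assigns a triggering column to itself. This is a hedged sketch: the exact statement used in Figure 9.35 is not reproduced in the text, and P_CODE is assumed to be the PRODUCT key column.

```sql
-- Touch P_QOH in every row without changing its value; a row-level trigger
-- defined on UPDATE OF P_QOH then runs once for each PRODUCT row.
UPDATE PRODUCT SET P_QOH = P_QOH;

-- Inspect the result: P_REORDER should now agree with P_QOH and P_MIN in every row.
SELECT P_CODE, P_QOH, P_MIN, P_REORDER FROM PRODUCT;

-- To remove the trigger while keeping the table:
-- DROP TRIGGER TRG_PRODUCT_REORDER;
```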
9.7.2 Stored Procedures
A stored procedure is a named collection of procedural and SQL statements. Just like database triggers, stored procedures are stored in the database. One of the major advantages of stored procedures is that they can be used to encapsulate and represent business transactions. For example, you can create a stored procedure to represent a product sale, a credit update or the addition of a new customer. By doing that, you can encapsulate SQL statements within a single stored procedure and execute them as a single transaction. As with triggers, Microsoft Access does not support stored procedures; this code would need to be implemented at the application layer and as such would not have the same integrity and robustness as stored procedures.
There are two clear advantages to the use of stored procedures:
• Stored procedures substantially reduce network traffic and increase performance. Because the stored procedure is stored at the server, there is no transmission of individual SQL statements over the network. The use of stored procedures improves system performance because all transactions are executed locally on the RDBMS, so each SQL statement does not have to travel over the network.
• Stored procedures help reduce code duplication by means of code isolation and code sharing (creating unique PL/SQL modules that are called by application programs), thereby minimising the chance of errors and the cost of application development and maintenance.
To create a stored procedure, you use the following syntax:
CREATE OR REPLACE PROCEDURE procedure_name [(argument [IN/OUT] data-type, ... )]
[IS/AS]
[variable_name data type[:=initial_value] ]
BEGIN
    PL/SQL or SQL statements;
    ...
END;
Note the following important points about stored procedures and their syntax:
• Argument specifies the parameters that are passed to the stored procedure. A stored procedure could have zero or more arguments or parameters.
• IN/OUT indicates whether the parameter is for input or output or both.
• Data-type is one of the procedural SQL data types used in the RDBMS. The data types normally match those used in the RDBMS table-creation statement.
• Variables can be declared between the keywords IS and BEGIN. You must specify the variable name, its data type and (optionally) an initial value.
To illustrate stored procedures, assume that you want to create a procedure (PRC_PROD_DISCOUNT) to assign an additional 5 per cent discount for all products when the quantity on hand is more than or equal to twice the minimum quantity. Figure 9.36 shows how the stored procedure is created.
Figure 9.36 Creating the PRC_PROD_DISCOUNT stored procedure
Online Content
The source code for all of the stored procedures shown in this section can be found on the online platform for this book.
As you examine Figure 9.36, note that the PRC_PROD_DISCOUNT stored procedure uses the DBMS_OUTPUT.PUT_LINE procedure to display a message when the procedure executes. (This action assumes you previously ran SET SERVEROUTPUT ON.)
To execute the stored procedure, you must use the following code:
BEGIN
    PRC_PROD_DISCOUNT;
END;
/
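For readers without access to Figure 9.36, a procedure matching the description might look like the sketch below. Two details are assumptions rather than facts from the text: the discount column is taken to be named P_DISCOUNT, and the discount is assumed to be stored as a fraction, so that 5 per cent is added as 0.05. The message text is also illustrative.

```sql
CREATE OR REPLACE PROCEDURE PRC_PROD_DISCOUNT
AS
BEGIN
  -- Add 5 per cent to the discount of every product whose quantity on hand
  -- is at least twice its minimum quantity (P_DISCOUNT is an assumed column name).
  UPDATE PRODUCT
  SET P_DISCOUNT = P_DISCOUNT + 0.05
  WHERE P_QOH >= P_MIN * 2;
  DBMS_OUTPUT.PUT_LINE('* * Update finished * *');
END;
/
```

The procedure takes no arguments, which is why the anonymous block that calls it simply names it between BEGIN and END.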
Note that if you are using the SQL*Plus command line, you can also execute stored procedures using the following syntax:
EXEC procedure_name[(parameter_list)];
For example, to see the results of running the PRC_PROD_DISCOUNT stored procedure, you can use the EXEC PRC_PROD_DISCOUNT command shown in Figure 9.37.
Figure 9.37 Results of the PRC_PROD_DISCOUNT stored procedure
Using
Figure
a quantity (Compare
9.37
on hand the
first
One of the previous
increase
Copyright review
2020 has
you
more than
PRODUCT
can
increase
Learning. that
any
All suppressed
an input
procedure.
Rights
Reserved. content
does
listing
May
not
be
copied, affect
9.38
scanned, the
overall
the
to the
second
case,
shows
the
duplicated, learning
in experience.
whole
for
all products
was increased
table
code
part.
fine,
can
pass
for that
Due Learning
to
but
electronic reserves
with
by 5 per
cent.
listing.)
you can pass values to them.
you
Cengage
attribute
quantity
PRODUCT
worked
or in
discount
minimum
is that
In that
or
product
to twice
procedure
variable?
materially
how the
of procedures
Figure
not
see
or equal
table
main advantages
to the
Cengage deemed
of
guide,
PRC_PRODUCT_DISCOUNT
percentage
Editorial
as your
what if
you
an argument
For example,
wanted
to
to represent
the
make the the
rate
of
procedure.
FIGURE 9.38 Second version of the PRC_PROD_DISCOUNT stored procedure

Figure 9.39 shows the execution of the second version of the PRC_PROD_DISCOUNT stored procedure. Note that if the procedure requires arguments, those arguments must be enclosed in parentheses and they must be separated by commas. Also notice that, when we try to apply a product discount of 1.5, the error message from within the stored procedure is shown and the product discount is not applied.

FIGURE 9.39 Results of the second version of the PRC_PROD_DISCOUNT stored procedure
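The book's procedure is written in PL/SQL and its code appears only in the figures. As a rough, runnable illustration of the same idea (a routine that takes the rate of increase as an argument and refuses an invalid value), here is a minimal sketch using Python's standard sqlite3 module as a stand-in for Oracle. The column names P_DISCOUNT and P_MIN, the accepted range (greater than 0 and less than 1), the sample rows and the helper make_demo_db are assumptions made for this example, not details taken from the figures.

```python
import sqlite3


def make_demo_db():
    """Build a small in-memory PRODUCT table (illustrative data only)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE PRODUCT ("
        "P_CODE TEXT PRIMARY KEY, P_QOH INTEGER, P_MIN INTEGER, P_DISCOUNT REAL)"
    )
    conn.executemany(
        "INSERT INTO PRODUCT VALUES (?, ?, ?, ?)",
        [("A1", 10, 5, 0.00), ("B2", 8, 5, 0.05), ("C3", 30, 10, 0.10)],
    )
    conn.commit()
    return conn


def apply_discount(conn, rate):
    """Add `rate` to the discount of products whose quantity on hand is
    at least twice the minimum quantity. Returns the number of rows updated.

    Assumed validation rule: the rate must be greater than 0 and less than 1.
    An invalid rate prints a message and leaves the table untouched.
    """
    if not 0 < rate < 1:
        print("Error: value must be greater than 0 and less than 1")
        return 0
    cur = conn.execute(
        "UPDATE PRODUCT SET P_DISCOUNT = P_DISCOUNT + ? "
        "WHERE P_QOH >= P_MIN * 2",
        (rate,),  # the argument is passed as a bound value, not pasted into the SQL text
    )
    conn.commit()
    return cur.rowcount
```

Calling apply_discount(conn, 1.5) on the demo table prints the message and changes nothing, while apply_discount(conn, 0.05) updates only the rows that meet the quantity condition.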
Stored procedures are also useful for encapsulating shared code to represent business transactions. For example, you can create a simple stored procedure to add a new customer. By using a stored procedure, all programs can call the stored procedure by name each time a new customer is added. Naturally, if new customer attributes are added later, you would need to modify the stored procedure. However, the programs that use the stored procedure would not need to know the name of the newly added attribute and would need to add only a new parameter to the procedure call. (Take a look at the PRC_CUS_ADD stored procedure shown in Figure 9.40.)
FIGURE 9.40 The PRC_CUS_ADD stored procedure

As you examine Figure 9.40, note these features:
The PRC_CUS_ADD procedure uses several parameters, one for each required attribute in the CUSTOMER table.
The stored procedure uses the CUS_CODE_SEQ sequence to generate a new customer code.
The required parameters (those specified in the table definition) must be included and can be null only when the table specifications permit nulls for that parameter. For example, note that the second customer addition was unsuccessful because the CUS_AREACODE is a required attribute and cannot be null.
The procedure displays a message in the SQL*Plus console to let the user know that the customer was added.

9.7.3 PL/SQL Processing with Cursors

Until now, all of the SQL statements you have used inside a PL/SQL block (trigger or stored procedure) have returned a single value. If the SQL statement returns more than one value, you will generate an error. If you want to use an SQL statement that returns more than one value inside your PL/SQL code, you need to use a cursor. A cursor is a special construct used in procedural SQL to hold the data rows returned by an SQL query. You can think of a cursor as a reserved area of memory in which the output of the query is stored, like an array holding columns and rows. Cursors are held in a reserved memory area in the DBMS server, not in the client computer.

There are two types of cursors: implicit and explicit. An implicit cursor is automatically created in procedural SQL when the SQL statement returns only one value. Up to this point, all of the examples created an implicit cursor. An explicit cursor is created to hold the output of an SQL statement that may return two or more rows (but could return 0 or only one row). To create an explicit cursor, use the following syntax inside a PL/SQL DECLARE section:

    CURSOR cursor_name IS select-query;

Once you have declared a cursor, you can use specific PL/SQL cursor processing commands (OPEN, FETCH, and CLOSE) anywhere between the BEGIN and END keywords of the PL/SQL block. Table 9.10 summarises the main use of each of those commands.
TABLE 9.10 Cursor processing commands

Cursor command and explanation

OPEN
    Opening the cursor executes the SQL command and populates the cursor with data, opening the cursor for processing. The cursor declaration command only reserves a named memory area for the cursor; it doesn't populate the cursor with the data. Before you can use a cursor, you need to open it.

FETCH
    Once the cursor is opened, you can use the FETCH command to retrieve data from the cursor and copy it to the PL/SQL variables for processing. The syntax is:
        FETCH cursor_name INTO variable1 [, variable2, ...]
    The PL/SQL variables used to hold the data must be declared in the DECLARE section and must have data types compatible with the columns retrieved by the SQL command. If the cursor's SQL statement returns five columns, there must be five PL/SQL variables to receive the data from the cursor.
    This type of processing resembles the one-record-at-a-time processing used in previous database models. The first time you fetch a row from the cursor, the first row of data from the cursor is copied to the PL/SQL variables; the second time you fetch a row from the cursor, the second row of data is placed in the PL/SQL variables; and so on.

CLOSE
    The CLOSE command closes the cursor for processing.
Cursor-style processing involves retrieving data from the cursor one row at a time. Once you open a cursor, it becomes an active data set. That data set contains a current row pointer. Therefore, after opening a cursor, the current row is the first row of the cursor.

When you fetch a row from the cursor, the data from the current row in the cursor is copied to the PL/SQL variables. After the fetch, the current row pointer moves to the next row in the set and continues until it reaches the end of the cursor.

How do you know what number of rows are in the cursor? Or how do you know when you have reached the end of the cursor data set? You know because cursors have special attributes that convey important information. Table 9.11 summarises the cursor attributes.
TABLE 9.11 Cursor attributes

Attribute and description

%ROWCOUNT
    Returns the number of rows fetched so far. If the cursor is not OPEN, it returns an error. If no FETCH has been done but the cursor is OPEN, it returns 0.

%FOUND
    Returns TRUE if the last FETCH returned a row. Returns FALSE if the last FETCH did not return any row. If the cursor is not OPEN, it returns an error. If no FETCH has been done, it contains NULL.

%NOTFOUND
    Returns TRUE if the last FETCH did not return any row. Returns FALSE if the last FETCH returned a row. If the cursor is not OPEN, it returns an error. If no FETCH has been done, it contains NULL.

%ISOPEN
    Returns TRUE if the cursor is open (ready for processing) or FALSE if the cursor is closed. Remember, before you can use a cursor, you must open it.
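The open, fetch, test and close cycle described above can be tried out without an Oracle server. The following is a minimal sketch that uses Python's standard sqlite3 module as a stand-in, so it is an analogy rather than PL/SQL: fetchone() returning None plays the role of %NOTFOUND, and a simple counter plays the role of %ROWCOUNT. The sample rows, the ORDER BY clause and the helper make_demo_db are assumptions made for this example.

```python
import sqlite3


def make_demo_db():
    """Build a small in-memory PRODUCT table (illustrative data only)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE PRODUCT (P_CODE TEXT PRIMARY KEY, P_DESCRIPT TEXT, P_QOH INTEGER)"
    )
    conn.executemany(
        "INSERT INTO PRODUCT VALUES (?, ?, ?)",
        [("P1", "Hammer", 10), ("P2", "Saw", 30), ("P3", "Drill", 50), ("P4", "Wrench", 40)],
    )
    conn.commit()
    return conn


def list_above_average(conn):
    """Return the products whose quantity on hand exceeds the average,
    fetched one row at a time, together with the number of rows fetched."""
    cur = conn.cursor()
    # Rough analogue of OPEN: executing the query makes the result set available.
    cur.execute(
        "SELECT P_CODE, P_DESCRIPT FROM PRODUCT "
        "WHERE P_QOH > (SELECT AVG(P_QOH) FROM PRODUCT) "
        "ORDER BY P_CODE"
    )
    fetched = []
    rows_fetched = 0                  # stands in for %ROWCOUNT
    while True:
        row = cur.fetchone()          # analogue of FETCH ... INTO
        if row is None:               # analogue of EXIT WHEN ...%NOTFOUND
            break
        w_p_code, w_p_descript = row  # one variable per selected column
        fetched.append((w_p_code, w_p_descript))
        rows_fetched += 1
    cur.close()                       # analogue of CLOSE
    return fetched, rows_fetched
```

With the sample data the average quantity on hand is 32.5, so only the two products above that level are fetched before the loop ends.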
To illustrate the use of cursors, let's use a simple stored procedure example that lists all products that have a quantity on hand greater than the average quantity on hand for all products. The code is shown in Figure 9.41.

As you examine the stored procedure code shown in Figure 9.41, note the following important characteristics:
The %TYPE data type is used in the variable definition section. As indicated in Table 9.9, the %TYPE data type is used to indicate that the given variable inherits the data type from a variable previously declared or from an attribute of a database table. In this case, you are using the %TYPE to indicate that the W_P_CODE and W_P_DESCRIPT will have the same data type as the respective columns in the PRODUCT table. This way, you ensure that the PL/SQL variable will have a compatible data type.
The cursor PROD_CURSOR is declared as:
    CURSOR PROD_CURSOR
To open the cursor PROD_CURSOR and populate it, the following command is executed:
    OPEN PROD_CURSOR;
FIGURE 9.41 A simple PRC_CURSOR_EXAMPLE

The LOOP statement is used to loop through the data in the cursor, fetching one row at a time.
The FETCH command is used to retrieve a row from the cursor and place it in the respective PL/SQL variables.
The EXIT command is used to evaluate when there are no more rows in the cursor (using the %NOTFOUND cursor attribute) and to exit the loop.
The %ROWCOUNT cursor attribute is used to obtain the total number of rows processed.
The CLOSE PROD_CURSOR command is used to close the cursor.

The use of cursors, combined with standard SQL, makes relational databases very desirable because they enable programmers to work in the best of both worlds: set-oriented processing and record-oriented processing. Any experienced programmer knows to use the tool that best fits the job. Sometimes you may be better off manipulating data in a set-oriented environment; at other times, it may be better to use a record-oriented environment. Procedural SQL lets you have your proverbial cake and eat it, too. Procedural SQL provides functionality that enhances the capabilities of the DBMS while maintaining a high degree of manageability.

9.7.4 PL/SQL Stored Functions

Using programmable or procedural SQL, you can also create your own stored functions. Stored procedures and functions are very similar. A stored function is basically a named group of procedural and SQL statements that returns a value (indicated by a RETURN statement in its program code). To create a function, you use the following syntax:
    CREATE FUNCTION function_name (argument IN data-type, ...) RETURN data-type [IS]
    BEGIN
      PL/SQL statements;
      ...
      RETURN (value or expression);
    END;

Stored functions can be invoked only from within stored procedures or triggers and cannot be invoked from SQL statements (unless the function follows some very specific compliance rules). Remember not to confuse built-in SQL functions (such as MIN, MAX and AVG) with stored functions.

9.8 Embedded SQL

There is little doubt that SQL's popularity as a data manipulation language is due in part to its ease of use and its powerful data-retrieval capabilities. But in the real world, database systems are related to other systems and programs, and you still need a conventional programming language such as Visual Basic, .NET, C# or COBOL to integrate database systems with other programs and systems. If you are developing Web applications, you are most likely familiar with Visual Studio .NET, Java, ASP and .NET. Yet, almost regardless of the programming tools you use, if your Web application or Windows-based GUI system requires access to a database such as Microsoft Access, SQL Server, Oracle or DB2, you will likely need to use SQL to manipulate the data in the database.

Embedded SQL is a term used to refer to SQL statements that are contained within an application programming language such as VB.Net, C# and Java. The program being developed may be a standard binary executable in Windows or Linux, or it may be a Web application designed to run over the internet. No matter which language you use, if it contains embedded SQL statements, it is called the host language. Embedded SQL is still the most common approach to maintaining procedural capabilities in DBMS-based applications. However, mixing SQL with procedural languages requires that you understand some key differences between SQL and procedural languages:

Run-time mismatch: Remember that SQL is a non-procedural, interpreted language; that is, each instruction is parsed, its syntax is checked and it is executed one instruction at a time.¹ All of the processing takes place at the server side. Meanwhile, the host language is generally a binary-executable program (also known as a compiled program). The host program typically runs at the client side in its own memory space (which is different from the DBMS environment).

Processing mismatch: Conventional programming languages (COBOL, Ada, FORTRAN, Pascal, C++ and PL/I) process one data element at a time. Although you can use arrays to hold data, you still process the array elements one row at a time. This is especially true for file manipulation, where the host language typically manipulates data one record at a time. However, newer programming environments (such as Visual Studio .NET) have adopted several object-oriented extensions that help the programmer manipulate data sets in a cohesive manner.

1 The authors are particularly grateful for the thoughtful comments provided by Emil T. Cipolla, who teaches at Mount Saint Mary College and whose IBM experience is the basis for his considerable and practical expertise.
Data type mismatch: SQL provides several data types, but some of those data types may not match data types used in different host languages (for example, the date and varchar2 data types).

To bridge the differences, the Embedded SQL Standard² defines a framework to integrate SQL within several programming languages. The Embedded SQL framework defines the following:

A standard syntax to identify embedded SQL code within the host language (EXEC SQL/END-EXEC).
A standard syntax to identify host variables. Host variables are variables in the host language that receive data from the database (through the embedded SQL code) and process the data in the host language. All host variables are preceded by a colon (:).
A communication area used to exchange status and error information between SQL and the host language. This communication area contains two variables: SQLCODE and SQLSTATE.

Another way to interface host languages and SQL is through the use of a call level interface (CLI),³ in which the programmer writes to an application programming interface (API). A common CLI in Windows is provided by the Open Database Connectivity (ODBC) interface.

Online Content: The source code for all of the stored procedures shown in this section is available on the online platform for this book.
Before continuing, let's explore the process required to create and run an executable program with embedded SQL statements. If you have ever programmed in COBOL or C++, you will be familiar with the multiple steps required to generate the final executable program. Although the specific details vary among language and DBMS vendors, the following general steps are standard:

1. The programmer writes embedded SQL code within the host language instructions. The code follows the standard syntax required for the host language and embedded SQL.
2. A preprocessor is used to transform the embedded SQL into specialised procedure calls that are DBMS- and language-specific. The preprocessor is provided by the DBMS vendor and is specific to the host language.
3. The program is compiled using the host language compiler. The compiler creates an object code module for the program containing the DBMS procedure calls.
4. The object code is linked to the respective library modules and generates the executable program. This process binds the DBMS procedure calls to the DBMS run-time libraries. Additionally, the binding process typically creates an access plan module that contains instructions to run the embedded code at run time.
5. The executable is run, and the embedded SQL statement retrieves data from the database.

2 https://crate.io/docs/sql-99/en/latest/chapters/39.html
3 www.oracle.com/database/technologies/appdev/oci.html
Note that you can embed individual SQL statements or even an entire PL/SQL block. Up to this point in the book, you have used a DBMS-provided application (SQL*Plus) to write SQL statements and PL/SQL blocks in an interpretive mode to address one-time or ad hoc data requests. However, it is extremely difficult and awkward to use ad hoc queries to process transactions inside a host language. Programmers typically embed SQL statements within a host language that is compiled once and executed as often as needed. To embed SQL into a host language, follow this syntax:

    EXEC SQL
      SQL statement;
    END-EXEC.
The preceding syntax works for SELECT, INSERT, UPDATE and DELETE statements. For example, the following embedded SQL code will delete employee 109, George Smith, from the EMPLOYEE table:

    EXEC SQL
      DELETE FROM EMPLOYEE WHERE EMP_NUM = 109;
    END-EXEC.

Remember, the preceding embedded SQL statement is compiled to generate an executable statement. Therefore, the statement is fixed permanently and cannot change (unless, of course, the programmer changes it). Each time the program runs, it deletes the same row. In short, the preceding code is good only for the first run; all subsequent runs will likely generate an error. Clearly, this code would be more useful if you could specify a variable to indicate the employee number to be deleted.

In embedded SQL, all host variables are preceded by a colon (:). The host variables may be used to send data from the host language to the embedded SQL, or they may be used to receive the data from the embedded SQL. To use a host variable, you must first declare it in the host language. Common practice is to use similar host variable names as the SQL source attributes. For example, if you are using COBOL, you would define the host variables in the Working Storage section. Then you would refer to them in the embedded SQL section by preceding them with a colon (:). For example, to delete an employee whose employee number is represented by the host variable W_EMP_NUM, you would write the following code:

    EXEC SQL
      DELETE FROM EMPLOYEE WHERE EMP_NUM = :W_EMP_NUM;
    END-EXEC.
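The same idea, a fixed SQL statement whose value is supplied by a variable of the host program, can be tried in a modern host language. The sketch below is an analogy rather than COBOL embedded SQL: it assumes Python's standard sqlite3 module, whose named placeholders happen to use the same leading colon. The function name delete_employee, the helper make_demo_db and the sample rows are assumptions made for this example.

```python
import sqlite3


def make_demo_db():
    """Build a small in-memory EMPLOYEE table (illustrative data only)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE EMPLOYEE (EMP_NUM INTEGER PRIMARY KEY, EMP_LNAME TEXT)")
    conn.executemany(
        "INSERT INTO EMPLOYEE VALUES (?, ?)",
        [(109, "Smith"), (110, "Jones")],
    )
    conn.commit()
    return conn


def delete_employee(conn, w_emp_num):
    """Delete the employee whose number is held in the host variable.

    The SQL text never changes; only the value bound to :w_emp_num does.
    Returns the number of rows deleted (0 when no such employee exists).
    """
    cur = conn.execute(
        "DELETE FROM EMPLOYEE WHERE EMP_NUM = :w_emp_num",
        {"w_emp_num": w_emp_num},
    )
    conn.commit()
    return cur.rowcount
```

Because the employee number is bound at run time, the same compiled program can delete a different row on each run. In this sketch a second call for the same number simply deletes nothing and returns 0.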
At run time, the host variable value is used to execute the embedded SQL statement. What happens if the employee you are trying to delete doesn't exist in the database? How do you know that the statement has been completed without errors? As mentioned previously, the embedded SQL standard defines a SQL communication area to hold status and error information. In COBOL, such an area is known as the SQLCA area and is defined in the Data Division as follows:

    EXEC SQL
      INCLUDE SQLCA
    END-EXEC.

The SQLCA area contains two variables for status and error reporting. Table 9.12 shows some of the main values returned by the variables and their meaning.
TABLE 9.12 SQL status and error reporting variables

Variable name, value and explanation

SQLCODE
    Old-style error reporting supported for backward compatibility only; returns an integer value (positive or negative).
    0       Successful completion of command.
    100     No data; the SQL statement did not return any rows or did not select, update, or delete any rows.
    -999    Any negative value indicates that an error occurred.

SQLSTATE
    Added by SQL-92 standard to provide predefined error codes; defined as a character string (5 characters long).
    00000   Successful completion of command.
            Multiple values in the format XXYYY where: XX represents the class code and YYY represents the subclass code.

The following embedded SQL code illustrates the use of the SQLCODE within a COBOL program.

    EXEC SQL
      SELECT EMP_FNAME, EMP_LNAME INTO :W_EMP_FNAME, :W_EMP_LNAME
      FROM EMPLOYEE
      WHERE EMP_NUM = :W_EMP_NUM;
    END-EXEC.
    IF SQLCODE = 0 THEN
      PERFORM DATA_ROUTINE
    ELSE
      PERFORM ERROR_ROUTINE
    END-IF.
In that example, the SQLCODE host variable is checked to determine whether the query completed successfully. If that is the case, the DATA_ROUTINE is performed; otherwise, the ERROR_ROUTINE is performed.
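To make the status-checking pattern concrete, here is a minimal sketch in Python using the standard sqlite3 module. SQLite has no SQLCA, so the function below only imitates the SQLCODE convention from Table 9.12: it returns 0 for success, 100 for no data and a negative number for an error. The value -1 is a hypothetical stand-in for any negative code, and the column name EMP_FNAME, the function name select_employee, the helper make_demo_db and the sample row are assumptions made for this example.

```python
import sqlite3

SQL_OK = 0         # successful completion, as in Table 9.12
SQL_NO_DATA = 100  # the statement returned no rows
SQL_ERROR = -1     # hypothetical stand-in for "any negative value"


def make_demo_db():
    """Build a small in-memory EMPLOYEE table (illustrative data only)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE EMPLOYEE (EMP_NUM INTEGER PRIMARY KEY, EMP_FNAME TEXT, EMP_LNAME TEXT)"
    )
    conn.execute("INSERT INTO EMPLOYEE VALUES (109, 'George', 'Smith')")
    conn.commit()
    return conn


def select_employee(conn, w_emp_num):
    """Return (status, row), where status imitates SQLCODE."""
    try:
        row = conn.execute(
            "SELECT EMP_FNAME, EMP_LNAME FROM EMPLOYEE WHERE EMP_NUM = :w_emp_num",
            {"w_emp_num": w_emp_num},
        ).fetchone()
    except sqlite3.Error:
        return SQL_ERROR, None    # the host program would run its error routine
    if row is None:
        return SQL_NO_DATA, None  # nothing to process
    return SQL_OK, row            # the host program would run its data routine
```

A caller can then branch on the returned status in the same way that the COBOL program branches on SQLCODE.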
Just as with PL/SQL, embedded SQL requires the use of cursors to hold data from a query that returns more than one value. If COBOL is used, the cursor can be declared either in the Working Storage Section or in the Procedure Division. The cursor must be declared and processed, as you learnt earlier in this chapter. To declare a cursor, you use the syntax shown in the following example:

    EXEC SQL
      DECLARE PROD_CURSOR FOR
        SELECT P_CODE, P_DESCRIPT
        FROM PRODUCT
        WHERE P_QOH > (SELECT AVG(P_QOH) FROM PRODUCT);
    END-EXEC.
Next, you open the cursor to make the cursor ready for processing:

    EXEC SQL
      OPEN PROD_CURSOR;
    END-EXEC.

To process the data rows in the cursor, you use the FETCH command to retrieve one row of data at a time and place the values in the host variables. The SQLCODE must be checked to ensure that the FETCH command completed successfully. This section of code typically constitutes part of a routine in the COBOL program. Such a routine is executed with the PERFORM command. For example:

    EXEC SQL
      FETCH PROD_CURSOR INTO :W_P_CODE, :W_P_DESCRIPT;
    END-EXEC.
    IF SQLCODE = 0 THEN
      PERFORM DATA_ROUTINE
    ELSE
      PERFORM ERROR_ROUTINE
    END-IF.

When all rows have been processed, you close the cursor as follows:

    EXEC SQL
      CLOSE PROD_CURSOR;
    END-EXEC.
Thus far, you have seen examples of embedded SQL in which the programmer used predefined SQL statements and parameters. Therefore, the end users of the programs are limited to the actions that were specified in the application programs. That style of embedded SQL is known as static SQL, meaning that the SQL statements will not change while the application is running. For example, the SQL statement may read like this:

    SELECT P_CODE, P_DESCRIPT, P_QOH, P_PRICE
    FROM PRODUCT
    WHERE P_PRICE > 100;

Note that the attributes, tables and conditions are known in the preceding SQL statement. Unfortunately, end users seldom work in a static environment. They are more likely to require the flexibility of defining their data access requirements on the fly. Therefore, the end user requires that SQL be as dynamic as the data access requirements.

Dynamic SQL is a term used to describe an environment in which the SQL statement is not known in advance; instead, the SQL statement is generated at run time. At run time in a dynamic SQL environment, a program can generate the SQL statements that are required to respond to ad hoc queries. In such an environment, neither the programmer nor the end user is likely to know precisely what kind of queries are to be generated or how those queries are to be structured. For example, a dynamic SQL equivalent of the preceding example could be:
    SELECT :W_ATTRIBUTE_LIST
    FROM :W_TABLE
    WHERE :W_CONDITION;

Note that the attribute list and the condition are not known until the end user specifies them. W_TABLE, W_ATTRIBUTE_LIST and W_CONDITION are text variables that contain the end-user input values used in the query generation. Because the program uses the end-user input to build the text variables, the end user can run the same program multiple times to generate different outputs. For example, in one instance, the end user might want to know which products have a price less than 100; in another case, the end user might want to know how many units of a given product are available for sale at any given moment.

Although dynamic SQL is clearly flexible, such flexibility carries a price. Dynamic SQL tends to be much slower than static SQL. In addition, dynamic SQL requires more computer resources (overhead), and you are more likely to find different levels of support and incompatibilities among DBMS vendors.
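The contrast between static and dynamic SQL can be sketched in a few lines. The example below uses Python's standard sqlite3 module rather than COBOL, so it is an illustration of the idea, not of any vendor's embedded SQL preprocessor. One design choice is an assumption of this sketch and is not described in the text: instead of accepting the whole condition as free text, the function takes the column, the operator and the value separately, checks the table and column names against a fixed list, and binds the value as a parameter. That keeps run-time query generation from executing arbitrary end-user text. The names run_dynamic_query, make_demo_db and the sample rows are likewise hypothetical.

```python
import sqlite3

# Static SQL: the statement text is fixed when the program is written.
STATIC_SQL = (
    "SELECT P_CODE, P_DESCRIPT, P_QOH, P_PRICE FROM PRODUCT WHERE P_PRICE > 100"
)

# Names the dynamic query is allowed to mention (assumed for this sketch).
ALLOWED_TABLES = {"PRODUCT"}
ALLOWED_COLUMNS = {"P_CODE", "P_DESCRIPT", "P_QOH", "P_PRICE"}
ALLOWED_OPERATORS = {"<", "<=", "=", ">=", ">", "<>"}


def make_demo_db():
    """Build a small in-memory PRODUCT table (illustrative data only)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE PRODUCT ("
        "P_CODE TEXT PRIMARY KEY, P_DESCRIPT TEXT, P_QOH INTEGER, P_PRICE REAL)"
    )
    conn.executemany(
        "INSERT INTO PRODUCT VALUES (?, ?, ?, ?)",
        [("P1", "Hammer", 10, 9.5), ("P2", "Saw", 30, 120.0), ("P3", "Drill", 5, 250.0)],
    )
    conn.commit()
    return conn


def run_dynamic_query(conn, w_attribute_list, w_table, w_column, w_operator, w_value):
    """Build the SELECT text at run time from end-user choices and run it.

    Returns the generated SQL text and the rows it produced.
    """
    if w_table not in ALLOWED_TABLES:
        raise ValueError("unknown table: " + w_table)
    for name in list(w_attribute_list) + [w_column]:
        if name not in ALLOWED_COLUMNS:
            raise ValueError("unknown column: " + name)
    if w_operator not in ALLOWED_OPERATORS:
        raise ValueError("unsupported operator: " + w_operator)
    sql = "SELECT {} FROM {} WHERE {} {} ?".format(
        ", ".join(w_attribute_list), w_table, w_column, w_operator
    )
    return sql, conn.execute(sql, (w_value,)).fetchall()
```

Running the function twice with different attribute lists or conditions produces different statements from the same program, which is the flexibility the text describes. The extra validation and string building on every call is also a small instance of the overhead it mentions.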
Note: Appendix O, Building a Simple Object-Relational Database using Oracle Objects, expands on the procedural language and advanced SQL that was introduced in this chapter and highlights some of the object features that have been incorporated into Oracle's data model. This appendix briefly introduces Oracle objects and uses a simple example to illustrate how an object-relational database may be developed and implemented using Oracle objects.
Summary

SQL provides relational set operators to combine the output of two queries to generate a new relation. The UNION and UNION ALL set operators combine the output of two (or more) queries and produce a new relation with all unique (UNION) or duplicate (UNION ALL) rows from both queries. The INTERSECT relational set operator selects only the common rows. The MINUS set operator selects only the rows that are different. UNION, INTERSECT and MINUS require union-compatible relations.

Operations that join tables can be classified as inner joins and outer joins. An inner join is the traditional join in which only rows that meet a given criterion are selected. An outer join returns the matching rows as well as the rows with unmatched attribute values for one table or both tables to be joined.

A natural join returns all rows with matching values in the matching columns and eliminates duplicate columns. This style of query is used when the tables share a common attribute with a common name. One important difference between the syntax for a natural join and for the old-style join is that the natural join does not require the use of a table qualifier for the common attributes.

Joins may use keywords such as USING and ON. If the USING clause is used, the query will return only the rows with matching values in the column indicated in the USING clause; that column must exist in both tables. If the ON clause is used, the query will return only the rows that meet the specified join condition.

Subqueries and correlated queries are used when it is necessary to process data based on other processed data. That is, the query uses results that were previously unknown and that are generated by another query. Subqueries may be used with the FROM, WHERE, IN and HAVING clauses in a SELECT statement. A subquery may return a single row or multiple rows.

Most subqueries are executed in a serial fashion. That is, the outer query initiates the data request; then the inner subquery is executed. In contrast, a correlated subquery is a subquery that is executed once for each row in the outer query. That process is similar to the typical nested loop in a programming language. A correlated subquery is so named because the inner query is related to the outer query: the inner query references a column of the outer query.

SQL functions are used to extract or transform data. The most frequently used functions are date and time functions. The results of the function output can be used to store values in a database table, to serve as the basis for the computation of derived variables or to serve as a basis for data comparisons. Function formats can be vendor-specific. Aside from time and date functions, there are numeric and string functions, as well as conversion functions that convert one data format to another.

Oracle sequences can be used to generate values to be assigned to a record. For example, a sequence can be used to number invoices automatically. Microsoft Access uses an Autonumber data type to generate numeric sequences.

Procedural SQL (PL/SQL) can be used to create triggers, stored procedures and PL/SQL functions. A trigger is procedural SQL code that is automatically invoked by the DBMS upon the occurrence of a data manipulation event (UPDATE, INSERT or DELETE). Triggers are critical to proper database operation and management. They help automate various transaction and data management processes, and they can be used to enforce constraints that are not enforced at the DBMS design and implementation levels.

A stored procedure is a named collection of SQL statements. Just like database triggers, stored procedures are stored in the database. One of the major advantages of stored procedures is that they can be used to encapsulate and represent complete business transactions. Use of stored procedures substantially reduces network traffic and increases system performance. Stored procedures help reduce code duplication by creating unique PL/SQL modules that are called by the application programs, thereby minimising the chance of errors and the cost of application development and maintenance.

When SQL statements are designed to return more than one value inside the PL/SQL code, a cursor is needed. You can think of a cursor as a reserved area of memory in which the output of the query is stored, like an array holding columns and rows. Cursors are held in a reserved memory area in the DBMS server, rather than in the client computer. There are two types of cursors: implicit and explicit.

Embedded SQL refers to the use of SQL statements within an application programming language such as Visual Basic, .NET,
of a specified
enforced
they
to
SQL (PL/SQL)
functions.
and
may be used to
may be used
the
procedural
use
of
SQL
statements
C#, Python
host language. capabilities
in
within
an application
or Java.
The language
Embedded
SQL is
DBMS-based
in
programming
which the
still the
most
language
SQL statements
common
are
approach
to
applications.
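The difference between an ordinary subquery and a correlated subquery can be sketched with a small experiment. The example below is only an illustration: it uses Python's built-in sqlite3 module as the host language and SQLite as a stand-in for Oracle, and the PRODUCT rows and the aliases P_OUTER and P_INNER are made up for the purpose.

```python
import sqlite3

# In-memory SQLite database; the sample rows are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE PRODUCT (
        P_CODE  TEXT PRIMARY KEY,
        P_PRICE REAL NOT NULL,
        V_CODE  INTEGER
    );
    INSERT INTO PRODUCT VALUES ('A1', 10.0, 1);
    INSERT INTO PRODUCT VALUES ('A2', 30.0, 1);
    INSERT INTO PRODUCT VALUES ('B1',  5.0, 2);
    INSERT INTO PRODUCT VALUES ('B2',  7.0, 2);
""")

# Ordinary subquery: the inner query does not refer to the outer row,
# so its result (the overall average price) is the same for every row.
above_overall_avg = [row[0] for row in conn.execute("""
    SELECT P_CODE
    FROM PRODUCT
    WHERE P_PRICE > (SELECT AVG(P_PRICE) FROM PRODUCT)
    ORDER BY P_CODE
""")]

# Correlated subquery: the inner query references P_OUTER.V_CODE, a column
# of the outer query, so it is logically re-evaluated for each outer row.
above_vendor_avg = [row[0] for row in conn.execute("""
    SELECT P_CODE
    FROM PRODUCT P_OUTER
    WHERE P_PRICE > (SELECT AVG(P_PRICE)
                     FROM PRODUCT P_INNER
                     WHERE P_INNER.V_CODE = P_OUTER.V_CODE)
    ORDER BY P_CODE
""")]

print(above_overall_avg)
print(above_vendor_avg)
```

With these rows the first query compares every price against one overall average, while the second compares each price against the average for that product's own vendor, which is why the two result lists differ.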
Key Terms

anonymous PL/SQL block
batch update routine
correlated subquery
cross join
cursor
dynamic SQL
embedded SQL
Event-Condition-Action (ECA) model
explicit cursor
host language
implicit cursor
inner join
outer join
persistent stored module (PSM)
procedural SQL (PL/SQL)
row-level trigger
statement-level trigger
static SQL
stored function
stored procedure
trigger
updatable view
Online Content: Answers to selected Review Questions and Problems for this chapter are available on the online platform for this book.

Further Reading

Malepati, T., Shah, B. and Vanier, E. Advanced MySQL 8. O'Reilly, 2019.
MySQL 8.0 Reference Manual [online]. Available: https://dev.mysql.com/doc/refman/8.0/en/ (2019).
Oracle Database 18c PL/SQL [online]. Available: www.oracle.com/technetwork/database/features/plsql/index.html, 2019.
Review Questions

1 The relational set operators UNION, INTERSECT and MINUS work properly only when the relations are union-compatible. What does union-compatible mean, and how would you check for this condition?
2 What is the difference between UNION and UNION ALL? Write the syntax for each.
3 Suppose you have two tables: EMPLOYEE and EMPLOYEE_1. The EMPLOYEE table contains the records for three employees: Alice Cordoza, John Cretchakov and Anne McDonald. The EMPLOYEE_1 table contains the records for employees John Cretchakov and Mary Chen. Given that information, what is the query output for the UNION query? (List the query output.)
4 Given the employee information in Question 3, what is the query output for the UNION ALL query? (List the query output.)
5 Given the employee information in Question 3, what is the query output for the INTERSECT query? (List the query output.)
6 Given the employee information in Question 3, what is the query output for the MINUS query? (List the query output.)
7 What is a CROSS JOIN? Give an example of its syntax.
8 Which three join types are included in the OUTER JOIN classification?
9 Using tables named T1 and T2, write a query example for each of the three join types you described in Question 8. Assume that T1 and T2 share a common column named C1.
10 What is a subquery, and what are its basic characteristics?
11 What is a correlated subquery? Give an example.
12 Which Microsoft Access/SQL Server function should you use to calculate the number of days between the current date and 25 January 2019?
13 Which Oracle function should you use to calculate the number of days between the current date and 25 January 2019?
14 Suppose a PRODUCT table contains two attributes, PROD_CODE and VEND_CODE. Those two attributes have values of ABC, 125, DEF, 124, GHI, 124, and JKL, 123, respectively. The VENDOR table contains a single attribute, VEND_CODE, with values 123, 124, 125 and 126, respectively. (The VEND_CODE attribute in the PRODUCT table is a foreign key to the VEND_CODE in the VENDOR table.) Given that information, what would be the query output for:
   a A UNION query based on the two tables?
   b A UNION ALL query based on the two tables?
   c An INTERSECT query based on the two tables?
   d A MINUS query based on the two tables?
15 Which Oracle string function should you use to list the first three characters of a company's EMP_LNAME values? Give an example using a table named EMPLOYEE.
16 What is an Oracle sequence? Write its syntax.
17 What is a trigger, and what is its purpose? Give an example.
18 What is a stored procedure, and why is it particularly useful? Give an example.
19 Give an example of a stored function. How would the function be called?
20 What are the four occasions on which Oracle recommends you use a trigger?
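The set operators in Questions 2 to 6 can be explored with a short experiment. The following is a minimal sketch, assuming SQLite (through Python's sqlite3 module) rather than Oracle: SQLite spells the MINUS operator as EXCEPT, and the column names EMP_FNAME and EMP_LNAME, as well as the helper run_set_query, are assumptions made for the illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Both tables have the same number of columns with compatible types,
    -- which is what makes them union-compatible.
    CREATE TABLE EMPLOYEE   (EMP_FNAME TEXT, EMP_LNAME TEXT);
    CREATE TABLE EMPLOYEE_1 (EMP_FNAME TEXT, EMP_LNAME TEXT);
    INSERT INTO EMPLOYEE VALUES ('Alice', 'Cordoza');
    INSERT INTO EMPLOYEE VALUES ('John', 'Cretchakov');
    INSERT INTO EMPLOYEE VALUES ('Anne', 'McDonald');
    INSERT INTO EMPLOYEE_1 VALUES ('John', 'Cretchakov');
    INSERT INTO EMPLOYEE_1 VALUES ('Mary', 'Chen');
""")


def run_set_query(operator):
    """Combine EMPLOYEE and EMPLOYEE_1 with the given set operator."""
    sql = (
        "SELECT EMP_FNAME, EMP_LNAME FROM EMPLOYEE "
        f"{operator} "
        "SELECT EMP_FNAME, EMP_LNAME FROM EMPLOYEE_1"
    )
    # Sort in Python so the comparison does not depend on the DBMS row order.
    return sorted(conn.execute(sql).fetchall())


union_rows = run_set_query("UNION")          # duplicates removed
union_all_rows = run_set_query("UNION ALL")  # duplicates kept
intersect_rows = run_set_query("INTERSECT")  # rows present in both tables
minus_rows = run_set_query("EXCEPT")         # Oracle would use MINUS here

for label, rows in [("UNION", union_rows), ("UNION ALL", union_all_rows),
                    ("INTERSECT", intersect_rows), ("MINUS", minus_rows)]:
    print(label, rows)
```

Swapping the order of the two SELECT statements changes only the MINUS result, because that operator is not commutative.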
Problems

Online Content: The 'Ch09_SimpleCo' database is located on the online platform for this book, as are the script files to duplicate this data set in Oracle.

Use the database tables in Figure P9.1 as the basis for Problems 1-18.

Figure P9.1 Ch09_SimpleCo database tables

Database name: Ch09_SimpleCo

Table name: CUSTOMER
CUST_NUM  CUST_LNAME  CUST_FNAME  CUST_BALANCE
1000      Smith       Jeanne      1050.11
1001      Ortega      Juan        840.92

Table name: CUSTOMER_2
CUST_NUM  CUST_LNAME  CUST_FNAME
2000      McPherson   Anne
2001      Ortega      Juan
2002      Kowalski    Jan
2003      Chen        George

Table name: INVOICE
INV_NUM  CUST_NUM  INV_DATE   INV_AMOUNT
8000     1000      23-Mar-19  235.89
8001     1001      23-Mar-19  312.82
8002     1001      30-Mar-19  528.10
8003     1000      12-Apr-19  194.78
8004     1000      23-Apr-19  619.44

1 Create the tables. (Use Figure P9.1 to see which table names and attributes to use.)
2 Insert the data into the tables you created in Problem 1.
3 Write the query that will generate a combined list of customers (from the tables CUSTOMER and CUSTOMER_2) that do not include the duplicate customer records. (Note that only the customer named Juan Ortega shows up in both customer tables.)
4 Write the query that will generate a combined list of customers to include the duplicate customer records.
5 Write the query that will show only the duplicate customer records.
6 Write the query that will generate only the records that are unique to the CUSTOMER_2 table.
7 Write the query to show the invoice number, the customer number, the customer name, the invoice date and the invoice amount for all customers with a customer balance of 1 000 or more.
8 Write the query that will show the invoice number, the invoice amount, the average invoice amount and the difference between the average invoice amount and the actual invoice amount.
9 Write the query that will write Oracle sequences to produce automatic customer number and invoice number values. Start the customer numbers at 1000 and the invoice numbers at 5000.
10 Modify the CUSTOMER table to include two new attributes: CUST_DOB and CUST_AGE. Customer 1000 was born on 15 March 1969, and customer 1001 was born on 22 December 1978.
11 Assuming you completed Problem 10, write the query that lists the names and ages of your customers.
12 Assuming the CUSTOMER table contains a CUST_AGE attribute, write the query to update the values in that attribute. (Hint: Use the results of the previous query.)
13 Write the query that will list the average age of your customers. (Assume that the CUSTOMER table has been modified to include the CUST_DOB and the derived CUST_AGE attribute.)
14 Write the trigger to update the CUST_BALANCE in the CUSTOMER table when a new invoice record is entered. (Assume that the sale is a credit sale.) Test the trigger, using the following new INVOICE record: 8005, 1001, '27-APR-19', 225.40. Name the trigger trg_updatecustbalance.
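The behaviour Problem 14 asks for can be prototyped before writing the Oracle version. The sketch below is not PL/SQL: it assumes SQLite, driven from Python's sqlite3 module, whose trigger syntax refers to the inserted row as NEW (Oracle uses :NEW inside a PL/SQL body). The column types are assumptions; the rows come from Figure P9.1 and the problem statement.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE CUSTOMER (
        CUST_NUM     INTEGER PRIMARY KEY,
        CUST_LNAME   TEXT,
        CUST_FNAME   TEXT,
        CUST_BALANCE REAL
    );
    CREATE TABLE INVOICE (
        INV_NUM    INTEGER PRIMARY KEY,
        CUST_NUM   INTEGER REFERENCES CUSTOMER (CUST_NUM),
        INV_DATE   TEXT,
        INV_AMOUNT REAL
    );
    INSERT INTO CUSTOMER VALUES (1000, 'Smith', 'Jeanne', 1050.11);
    INSERT INTO CUSTOMER VALUES (1001, 'Ortega', 'Juan', 840.92);

    -- Row-level trigger: fires once for every inserted INVOICE row and adds
    -- the invoice amount to that customer's balance (a credit sale).
    CREATE TRIGGER trg_updatecustbalance
    AFTER INSERT ON INVOICE
    FOR EACH ROW
    BEGIN
        UPDATE CUSTOMER
        SET CUST_BALANCE = CUST_BALANCE + NEW.INV_AMOUNT
        WHERE CUST_NUM = NEW.CUST_NUM;
    END;
""")

# Test record given in the problem statement.
conn.execute("INSERT INTO INVOICE VALUES (8005, 1001, '27-APR-19', 225.40)")

new_balance = conn.execute(
    "SELECT CUST_BALANCE FROM CUSTOMER WHERE CUST_NUM = 1001"
).fetchone()[0]
other_balance = conn.execute(
    "SELECT CUST_BALANCE FROM CUSTOMER WHERE CUST_NUM = 1000"
).fetchone()[0]

print(round(new_balance, 2), round(other_balance, 2))
```

Only customer 1001 should change, because the WHERE clause in the trigger body restricts the update to the customer named on the new invoice.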
15 Write a stored procedure to add a new customer to the CUSTOMER table. Use the following values in the new record: 1002, 'Rauthor', 'Peter', 0.00. Name the procedure prc_cust_add. Run a query to see if the record has been added.
16 Write a procedure to add a new invoice record to the INVOICE table. Use the following values in the new record: 8006, 1000, '30-APR-19', 301.72. Name the procedure prc_invoice_add. Run a query to see if the record has been added.
17 Write a trigger to update the customer balance when an invoice is deleted. Name the trigger trg_updatecustbalance2.
18 Write a procedure to delete an invoice, giving the invoice number as a parameter. Name the procedure prc_inv_delete. Test the procedure by deleting invoices 8005 and 8006.

Use the database tables in Figure P9.2 as the basis for Problems 19-26.

Figure P9.2 Ch09_Publishing database tables

Table name: BOOK
ISBN       TITLE                       NUMBER_PAGES  PRICE  TYPE
72121333   Cell                        496           6.99   Fiction
8990765    Rough Guide to Prague       245           10.45  Reference
912122048  Oracle 18c Reference Guide  976           34.99  Reference
912934511  Oracle Backup & Recovery    399           54.50  Reference
935642189  Introduction to SQL         4990          19.99  Reference

Table name: AUTHOR
AUTHOR_ID  FIRST_NAME  LAST_NAME
1          Stephen     King
2          Michael     Abbey
3          Michael     Robinson
4          Kenny       Smith
5          Steph       Haisley
6          Mandla      Langa
7          Rushford    Majoy
8          Farmyi      Madagore

Table name: AUTHOR_BOOK
ISBN       AUTHOR_ID
72121333   1
8990765    6
8990765    7
912122048  2
912122048  3
912934511  4
912934511  5
935642189  8

Table name: STOCK
ISBN       STATUS    STATUS_DATE  QUANTITY
72121333   IN STOCK               54
8990765    IN STOCK               9
912122048  ON ORDER  12/05/2019   20
912934511  FUTURE    30/03/2019   32
935642189  ON ORDER  15/04/2019   50

19 Create the tables. (Use Figure P9.2 to see which table names and attributes to use.)
20 Insert the data into the tables you created in Problem 19.
21 Modify the BOOK table to include a new attribute that records the DATE_PUBLISHED. Write the SQL code required to update the DATE_PUBLISHED for the following books.

ISBN       DATE_PUBLISHED
72121333   12-MAR-19
912122048  23-NOV-19
912934511  12-MAY-19
935642189  11-JUNE-19

22 Write the query that will display the ISBN and title of all books that have been published for more than two years.
23 Write a query that creates a list of unique author-book ids, using the first five characters of the author's last name and the first eight characters of the book title. Label the column AUTHOR_BOOK_ID.
24 Write an anonymous PL/SQL block that displays the maximum author_id currently held in the database and displays it to the screen.
25 Write an anonymous PL/SQL block to display the status date entered in the STOCK table for the book titled Oracle 18c Reference Guide.
26 Write an anonymous PL/SQL block that contains a simple cursor to display only the first three titles from the BOOK table. (Hint: use the cursor function %ROWCOUNT.)
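The cursor logic behind Problem 26 can be rehearsed in a host language before it is written in PL/SQL. The sketch below is an analogy, not an Oracle solution: it assumes Python's sqlite3 module, where a DB-API cursor plays the role of an explicit cursor and a plain counter (rows_fetched, a name chosen here) stands in for the %ROWCOUNT attribute. The BOOK rows are taken from Figure P9.2 and the ISBN is stored as text for simplicity.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE BOOK (ISBN TEXT PRIMARY KEY, TITLE TEXT);
    INSERT INTO BOOK VALUES ('72121333',  'Cell');
    INSERT INTO BOOK VALUES ('8990765',   'Rough Guide to Prague');
    INSERT INTO BOOK VALUES ('912122048', 'Oracle 18c Reference Guide');
    INSERT INTO BOOK VALUES ('912934511', 'Oracle Backup & Recovery');
    INSERT INTO BOOK VALUES ('935642189', 'Introduction to SQL');
""")

# Open the cursor: the query is executed and its result set is held
# by the database layer until rows are fetched.
cursor = conn.cursor()
cursor.execute("SELECT TITLE FROM BOOK ORDER BY ISBN")

first_titles = []
rows_fetched = 0  # stands in for the cursor's %ROWCOUNT

# Fetch one row at a time and stop after three rows or at end of data.
while rows_fetched < 3:
    row = cursor.fetchone()
    if row is None:  # comparable to testing %NOTFOUND
        break
    rows_fetched += 1
    first_titles.append(row[0])

# Close the cursor to release the resources it holds.
cursor.close()

for title in first_titles:
    print(title)
```

The open, fetch, test and close steps map onto the OPEN, FETCH, EXIT WHEN and CLOSE statements used with an explicit PL/SQL cursor.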
Note: The following problem sets can serve as the basis for a class project or case.

Use the Ch09_SaleCo2 database to work Problems 27-31 (Figure P9.3).

Figure P9.3 Ch09_SaleCo database tables

Table name: CUSTOMER
CUS_CODE  CUS_LNAME   CUS_FNAME  CUS_INITIAL  CUS_AREACODE  CUS_PHONE  CUS_BALANCE
10010     Ramas       Alfred     A            0181          844-2573   0.00
10011     Dunne       Leona      K            0161          894-1238   0.00
10012     Moloi       Marlene    W            0181          894-2285   345.86
10013     Pieterse    Jaco       F            0181          894-2180   536.75
10014     Orlando     Myron                   0181          222-1672   0.00
10015     O'Brian     Amy        B            0161          442-3381   0.00
10016     Brown       James      G            0181          297-1228   221.19
10017     Williams    George                  0181          290-2556   768.93
10018     Padayachee  Vinaya     G            0161          382-7185   216.55
10019     Moloi       Mlilo      K            0181          297-3809   0.00

Table name: PRODUCT
P_CODE    P_DESCRIPT                                P_INDATE   P_QOH  P_MIN  P_PRICE  P_DISCOUNT  V_CODE
11QER/31  Power painter, 15 psi., 3-nozzle          03-Nov-18  8      5      109.99   0.00        25595
13-Q2/P2  7.25 cm pwr. saw blade                    13-Dec-18  32     15     14.99    0.05        21344
14-Q1/L3  9.00 cm pwr. saw blade                    13-Nov-18  18     12     17.49    0.00        21344
1546-QQ2  Hrd. cloth, 1/4 cm, 2 × 50                15-Jan-19  15     8      39.95    0.00        23119
1558-QW1  Hrd. cloth, 1/2 cm, 3 × 50                15-Jan-19  23     5      43.99    0.00        23119
2232/QTY  B&D jigsaw, 12 cm blade                   30-Dec-18  8      5      109.92   0.05        24288
2232/QWE  B&D jigsaw, 8 cm blade                    24-Dec-18  6      5      99.87    0.05        24288
2238/QPD  B&D cordless drill, 1/2 cm                20-Jan-19  12     5      38.95    0.05        25595
23109-HB  Claw hammer                               20-Jan-19  23     10     9.95     0.10        21225
23114-AA  Sledge hammer, 12 kg                      02-Jan-19  8      5      14.40    0.05
54778-2T  Rat-tail file, 1/8 cm fine                15-Dec-18  43     20     4.99     0.00        21344
89-WRE-Q  Hicut chain saw, 16 cm                    07-Feb-19  11     5      256.99   0.05        24288
PVC23DRT  PVC pipe, 3.5 cm, 8 m                     20-Feb-19  188    75     5.87     0.00
SM-18277  1.25 cm metal screw, 25                   01-Mar-19  172    75     6.99     0.00        21225
SW-23116  2.5 cm wd. screw, 50                      24-Feb-19  237    100    8.45     0.00        21231
WR3/TT3   Steel matting, 4 × 8 × 1/6 m, .5 m mesh   17-Jan-19  18     5      119.95   0.10        25595

Table name: VENDOR
V_CODE  V_NAME           V_CONTACT  V_AREACODE  V_PHONE   V_COUNTRY  V_ORDER
21225   Bryson, Inc.     Smithson   0181        223-3234  UK         Y
21226   SuperLoo, Inc.   Flushing   0113        215-8995  SA         N
21231   D&E Supply       Singh      0181        228-3245  UK         Y
21344   Jabavu Bros.     Ortega     0181        889-2546  SA         N
22567   Dome Supply      Smith      7253        678-1419  FR         N
23119   Randsets Ltd.    Anderson   7253        678-3998  FR         Y
24004   Brackman Bros.   Browning   0181        228-1410  UK         N
24288   ORDVA, Inc.      Hakford    0181        898-1234  UK         Y
25443   B&K, Inc.        Smith      0113        227-0093  SA         N
25501   Damal Supplies   Smythe     0181        890-3529  UK         N
25595   Rubicon Systems  Orton      0113        456-0092  SA         Y

Table name: INVOICE
INV_NUMBER  CUS_CODE  INV_DATE   INV_SUBTOTAL  INV_TAX  INV_TOTAL
1001        10014     16-Jan-19  24.90         1.99     26.89
1002        10011     16-Jan-19  9.98          0.80     10.78
1003        10012     16-Jan-19  153.85        12.31    166.16
1004        10011     17-Jan-19  34.97         2.80     37.77
1005        10018     17-Jan-19  70.44         5.64     76.08
1006        10014     17-Jan-19  397.83        31.83    429.66
1007        10015     17-Jan-19  34.97         2.80     37.77
1008        10011     17-Jan-19  399.15        31.93    431.08

Table name: LINE
INV_NUMBER  LINE_NUMBER  P_CODE    LINE_UNITS  LINE_PRICE  LINE_TOTAL
1001        1            13-Q2/P2  1           14.99       14.99
1001        2            23109-HB  1           9.95        9.95
1002        1            54778-2T  2           4.99        9.98
1003        1            2238/QPD  1           38.95       38.95
1003        2            1546-QQ2  1           39.95       39.95
1003        3            13-Q2/P2  5           14.99       74.95
1004        1            54778-2T  3           4.99        14.97
1004        2            23109-HB  2           9.95        19.90
1005        1            PVC23DRT  12          5.87        70.44
1006        1            SM-18277  3           6.99        20.97
1006        2            2232/QTY  1           109.92      109.92
1006        3            23109-HB  1           9.95        9.95
1006        4            89-WRE-Q  1           256.99      256.99
1007        1            13-Q2/P2  2           14.99       29.98
1007        2            54778-2T  1           4.99        4.99
1008        1            PVC23DRT  5           5.87        29.35
1008        2            WR3/TT3   3           119.95      359.85
1008        3            23109-HB  1           9.95        9.95
Online Content: The 'Ch09_SaleCo2' database used in Problems 27-31 is located on the online platform for this book, as are the script files to duplicate this data set in Oracle.

27 Create a trigger named trg_line_total to write the LINE_TOTAL value in the LINE table every time you add a new LINE row. (The LINE_TOTAL value is the product of the LINE_UNITS and the LINE_PRICE values.)
28 Create a trigger named trg_line_prod that will automatically update the product quantity on hand for each product sold after a new LINE row is added.
29 Create a stored procedure named prc_inv_amounts to update the INV_SUBTOTAL, INV_TAX and INV_TOTAL. The procedure takes the invoice number as a parameter. The INV_SUBTOTAL is the sum of the LINE_TOTAL amounts for the invoice, the INV_TAX is the product of the INV_SUBTOTAL and the tax rate (8%), and the INV_TOTAL is the sum of the INV_SUBTOTAL and the INV_TAX.
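The arithmetic described in Problem 29 can be checked against the figures for invoice 1008 in Figure P9.3. The sketch below is an assumption-laden stand-in for a stored procedure: SQLite has no stored procedures, so the logic lives in a Python function that merely borrows the name prc_inv_amounts, and rounding to two decimal places is a choice made here rather than something the problem specifies.

```python
import sqlite3

TAX_RATE = 0.08  # tax rate stated in Problem 29

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE INVOICE (
        INV_NUMBER   INTEGER PRIMARY KEY,
        INV_SUBTOTAL REAL,
        INV_TAX      REAL,
        INV_TOTAL    REAL
    );
    CREATE TABLE LINE (
        INV_NUMBER  INTEGER REFERENCES INVOICE (INV_NUMBER),
        LINE_NUMBER INTEGER,
        LINE_TOTAL  REAL,
        PRIMARY KEY (INV_NUMBER, LINE_NUMBER)
    );
    INSERT INTO INVOICE (INV_NUMBER) VALUES (1008);
    INSERT INTO LINE VALUES (1008, 1, 29.35);
    INSERT INTO LINE VALUES (1008, 2, 359.85);
    INSERT INTO LINE VALUES (1008, 3, 9.95);
""")


def prc_inv_amounts(connection, inv_number):
    """Recompute subtotal, tax and total for one invoice and store them."""
    (line_sum,) = connection.execute(
        "SELECT COALESCE(SUM(LINE_TOTAL), 0) FROM LINE WHERE INV_NUMBER = ?",
        (inv_number,),
    ).fetchone()
    subtotal = round(line_sum, 2)
    tax = round(subtotal * TAX_RATE, 2)
    total = round(subtotal + tax, 2)
    connection.execute(
        "UPDATE INVOICE SET INV_SUBTOTAL = ?, INV_TAX = ?, INV_TOTAL = ? "
        "WHERE INV_NUMBER = ?",
        (subtotal, tax, total, inv_number),
    )
    connection.commit()
    return subtotal, tax, total


subtotal, tax, total = prc_inv_amounts(conn, 1008)
stored = conn.execute(
    "SELECT INV_SUBTOTAL, INV_TAX, INV_TOTAL FROM INVOICE WHERE INV_NUMBER = 1008"
).fetchone()
print(subtotal, tax, total)
```

In Oracle the same three steps would sit inside a CREATE OR REPLACE PROCEDURE body, with the invoice number passed as an IN parameter.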
30 Create a procedure named prc_cus_balance_update that will take the invoice number as a parameter and update the customer balance. (Hint: You can use the DECLARE section to define a TOTINV numeric variable that holds the computed invoice total.)

Use the Ch09_AviaCo database to work Problems 31-42 (Figure P9.4).

Figure P9.4 Ch09_AviaCo database tables

Table name: CHARTER
CHAR_TRIP  CHAR_DATE  AC_NUMBER  CHAR_DESTINATION  CHAR_DISTANCE  CHAR_HOURS_FLOWN  CHAR_HOURS_WAIT  CHAR_FUEL_GALLONS  CHAR_OIL_QTS  CUS_CODE
10001      05-Feb-19  2289L      ATL               936.00         5.1               2.2              354.1              1             10011
10002      05-Feb-19  2778V      BNA               320.00         1.6               0.0              72.6               0             10016
10003      05-Feb-19  4278Y      GNV               1574.00        7.8               0.0              339.8              2             10014
10004      06-Feb-19  1484P      STL               472.00         2.9               4.9              97.2               1             10019
10005      06-Feb-19  2289L      ATL               1023.00        5.7               3.5              397.7              2             10011
10006      06-Feb-19  4278Y      STL               472.00         2.6               5.2              117.1              0             10017
10007      06-Feb-19  2778V      GNV               1574.00        7.9               0.0              348.4              2             10012
10008      07-Feb-19  1484P      TYS               644.00         4.1               0.0              140.6              1             10014
10009      07-Feb-19  2289L      GNV               1574.00        6.6               23.4             459.9              0             10017
10010      07-Feb-19  4278Y      ATL               998.00         6.2               3.2              279.7              0             10016
10011      07-Feb-19  1484P      BNA               352.00         1.9               5.3              66.4               1             10012
10012      08-Feb-19  2778V      MOB               884.00         4.8               4.2              215.1              0             10010
10013      08-Feb-19  4278Y      TYS               644.00         3.9               4.5              174.3              1             10011
10014      09-Feb-19  4278Y      ATL               936.00         6.1               2.1              302.6              0             10017
10015      09-Feb-19  2289L      GNV               1645.00        6.7               0.0              459.5              2             10016
10016      09-Feb-19  2778V      MQY               312.00         1.5               0.0              67.2               0             10011
10017      10-Feb-19  1484P      STL               508.00         3.1               0.0              105.5              0             10014
10018      10-Feb-19  4278Y      TYS               644.00         3.8               4.5              167.4              0             10017

Table name: CUSTOMER
CUS_CODE  CUS_LNAME   CUS_FNAME  CUS_INITIAL  CUS_AREACODE  CUS_PHONE  CUS_BALANCE
10010     Ramas       Alfred     A            0181          844-2573   0.00
10011     Dunne       Leona      K            0161          894-1238   0.00
10012     Moloi       Marlene    W            0181          894-2285   896.54
10013     Pieterse    Jaco       F            0181          894-2180   1285.19
10014     Orlando     Myron                   0181          222-1672   673.21
10015     O'Brian     Amy        B            0161          442-3381   1014.56
10016     Brown       James      G            0181          297-1228   0.00
10017     Williams    George                  0181          290-2556   0.00
10018     Padayachee  Vinaya     G            0161          382-7185   0.00
10019     Moloi       Mlilo      K            0181          297-3809   453.98

Table name: EMPLOYEE
EMP_NUM  EMP_TITLE  EMP_LNAME   EMP_FNAME  EMP_INITIAL  EMP_DOB      EMP_HIRE_DATE
100      Mr         Kolmycz     George     D            15-Jun-1952  15-Mar-1997
101      Ms         Lewis       Rhonda     G            19-Mar-1975  25-Apr-1998
102      Mr         Vandam      Rhett                   14-Nov-1968  20-Dec-2002
103      Ms         Jones       Anne       M            16-Oct-1984  28-Aug-2015
104      Mr         Lange       John       P            08-Nov-1981  20-Oct-2006
105      Mr         Williams    Robert     D            14-Mar-1985  08-Jan-2016
106      Mrs        Duzak       Jeanine    K            12-Feb-1978  05-Jan-2001
107      Mr         Diante      Jorge      D            21-Aug-1984  02-Jul-2006
108      Mr         Wiesenbach  Paul       R            14-Feb-1976  18-Nov-2004
109      Ms         Travis      Elizabeth  K            18-Jun-1971  14-Apr-2001
110      Mrs        Genkazi     Leighla    W            19-May-1980  01-Dec-2002

Table name: CREW
CHAR_TRIP  EMP_NUM  CREW_JOB
10001      104      Pilot
10002      101      Pilot
10003      105      Pilot
10003      109      Copilot
10004      106      Pilot
10005      101      Pilot
10006      109      Pilot
10007      104      Pilot
10007      105      Copilot
10008      106      Pilot
10009      105      Pilot
10010      108      Pilot
10011      101      Pilot
10011      104      Copilot
10012      101      Pilot
10013      105      Pilot
10014      106      Pilot
10015      101      Copilot
10015      104      Pilot
10016      105      Copilot
10016      109      Pilot
10017      101      Pilot
10018      104      Copilot
10018      105      Pilot

Table name: AIRCRAFT
AC_NUMBER  MOD_CODE  AC_TTAF  AC_TTEL  AC_TTER
1484P      PA23-250  1833.10  1833.10  101.80
2289L      C-90A     4243.80  768.90   1123.40
2778V      PA31-350  7992.90  1513.10  789.50
4278Y      PA31-350  2147.30  622.10   243.20

Table name: PILOT
EMP_NUM  PIL_LICENSE  PIL_MED_TYPE  PIL_MED_DATE  PIL_PT135_DATE
101      ATP          1             12-Apr-2018   15-Jun-2018
104      ATP          1             10-Jun-2018   23-Mar-2019
105      COM          2             25-Feb-2018   12-Feb-2018
106      COM          2             02-Apr-2018   24-Dec-2019
109      COM          1             14-Apr-2018   21-Apr-2018

Table name: RATING
RTG_CODE  RTG_NAME
CFI       Certified Flight Instructor
CFII      Certified Flight Instructor, Instrument
INSTR     Instrument
MEL       Multiengine Land
SEL       Single Engine, Land
SES       Single Engine, Sea

Table name: MODEL
MOD_CODE  MOD_MANUFACTURER  MOD_NAME          MOD_SEATS  MOD_CHG_MILE
C-90A     Beechcraft        KingAir           8          2.67
PA23-250  Piper             Aztec             6          1.93
PA31-350  Piper             Navajo Chieftain  10         2.35

Table name: EARNED_RATING
EMP_NUM  RTG_CODE  EARNRTG_DATE
101      CFI       18-Feb-08
101      CFII      15-Dec-15
101      INSTR     08-Nov-03
101      MEL       23-Jun-04
101      SEL       21-Apr-03
104      INSTR     15-Jul-06
104      MEL       29-Jan-07
104      SEL       12-Mar-05
105      CFI       18-Nov-07
105      INSTR     17-Apr-05
105      MEL       12-Aug-05
105      SEL       23-Sep-04
106      INSTR     20-Dec-05
106      MEL       02-Apr-06
106      SEL       10-Mar-04
109      CFI       05-Nov-08
109      CFII      21-Jun-13
109      INSTR     23-Jul-06
109      MEL       15-Mar-07
109      SEL       05-Feb-06
109      SES       12-May-06

Online Content: The 'Ch09_AviaCo' database used for Problems 31-42 is located on the online platform for this book, as are the script files to duplicate this data set in Oracle.

31 Modify the MODEL table to add the attribute and insert the values shown in the following table.

Attribute Name  Attribute Description                   Attribute Type  Attribute Values
MOD_WAIT_CHG    Waiting charge per hour for each model  Numeric         100 for C-90A, 50 for PA23-250, 75 for PA31-350

32 Write the queries to update the MOD_WAIT_CHG attribute values based on Problem 31.
33 Modify the CHARTER table to add the attributes shown in the following table.
Attribute Name   Attribute Description                                                                                  Attribute Type
CHAR_WAIT_CHG    Waiting charge for each model (copied from the MODEL table)                                            Numeric
CHAR_FLT_CHG_HR  Flight charge per mile for each model (copied from the MODEL table using the MOD_CHG_MILE attribute)   Numeric
CHAR_FLT_CHG     Flight charge (calculated by CHAR_HOURS_FLOWN × CHAR_FLT_CHG_HR)                                       Numeric
CHAR_TAX_CHG     CHAR_FLT_CHG × tax rate (8%)                                                                           Numeric
CHAR_TOT_CHG     CHAR_FLT_CHG + CHAR_TAX_CHG                                                                            Numeric
CHAR_PYMT        Amount paid by customer                                                                                Numeric
CHAR_BALANCE     Balance remaining after payment                                                                        Numeric

34 Write the sequence of commands required to update the CHAR_WAIT_CHG attribute values in the CHARTER table. (Hint: Use either an updatable view or a stored procedure.)
35 Write the sequence of commands required to update the CHAR_FLT_CHG_HR attribute values in the CHARTER table. (Hint: Use either an updatable view or a stored procedure.)
36 Write the command required to update the CHAR_FLT_CHG attribute values in the CHARTER table.
37 Write the command required to update the CHAR_TAX_CHG attribute values in the CHARTER table.
38 Write the command required to update the CHAR_TOT_CHG attribute values in the CHARTER table.
39 Modify the PILOT table to add the attribute shown in the following table.

Attribute Name  Attribute Description                                                                                                                                                Attribute Type
PIL_PIC_HRS     Pilot in command (PIC) hours; updated by adding the CHARTER table's CHAR_HOURS_FLOWN to the PIL_PIC_HRS when the CREW table shows the CREW_JOB to be pilot           Numeric

40 Create a trigger named trg_char_hours that automatically updates the AIRCRAFT table when a new CHARTER row is added. Use the CHARTER table's CHAR_HOURS_FLOWN to update the AIRCRAFT table's AC_TTAF, AC_TTEL and AC_TTER values.
41 Create a trigger named trg_pic_hours that automatically updates the PILOT table when a new CREW row is added and the CREW table uses a pilot CREW_JOB entry. Use the CHARTER table's CHAR_HOURS_FLOWN to update the PILOT table's PIL_PIC_HRS only when the CREW table uses a pilot CREW_JOB entry.
42 Create a trigger named trg_cust_balance that automatically updates the CUSTOMER table's CUST_BALANCE when a new CHARTER row is added. Use the CHARTER table's CHAR_TOT_CHG as the update source. (Assume that all charter charges are charged to the customer balance.)
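The chain of derived charges in Problems 35 to 38 can be traced with a small experiment. This is a minimal sketch under stated assumptions: it uses SQLite through Python's sqlite3 module instead of Oracle, keeps only the columns needed for the calculation, loads just two charter trips from Figure P9.4, and rounds each charge to two decimal places, which the problems do not require.

```python
import sqlite3

TAX_RATE = 0.08  # tax rate given in the attribute table

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE MODEL (MOD_CODE TEXT PRIMARY KEY, MOD_CHG_MILE REAL);
    CREATE TABLE AIRCRAFT (
        AC_NUMBER TEXT PRIMARY KEY,
        MOD_CODE  TEXT REFERENCES MODEL (MOD_CODE)
    );
    CREATE TABLE CHARTER (
        CHAR_TRIP        INTEGER PRIMARY KEY,
        AC_NUMBER        TEXT REFERENCES AIRCRAFT (AC_NUMBER),
        CHAR_HOURS_FLOWN REAL,
        CHAR_FLT_CHG_HR  REAL,
        CHAR_FLT_CHG     REAL,
        CHAR_TAX_CHG     REAL,
        CHAR_TOT_CHG     REAL
    );
    INSERT INTO MODEL VALUES ('C-90A', 2.67);
    INSERT INTO MODEL VALUES ('PA31-350', 2.35);
    INSERT INTO AIRCRAFT VALUES ('2289L', 'C-90A');
    INSERT INTO AIRCRAFT VALUES ('2778V', 'PA31-350');
    INSERT INTO CHARTER (CHAR_TRIP, AC_NUMBER, CHAR_HOURS_FLOWN)
        VALUES (10001, '2289L', 5.1);
    INSERT INTO CHARTER (CHAR_TRIP, AC_NUMBER, CHAR_HOURS_FLOWN)
        VALUES (10002, '2778V', 1.6);
""")

# Copy the model's charge into each charter row. The subquery is correlated:
# it looks up the model of the aircraft used by the current CHARTER row.
conn.execute("""
    UPDATE CHARTER
    SET CHAR_FLT_CHG_HR = (
        SELECT M.MOD_CHG_MILE
        FROM AIRCRAFT A JOIN MODEL M ON M.MOD_CODE = A.MOD_CODE
        WHERE A.AC_NUMBER = CHARTER.AC_NUMBER
    )
""")

# Each later attribute is derived from the ones computed before it.
conn.execute(
    "UPDATE CHARTER SET CHAR_FLT_CHG = ROUND(CHAR_HOURS_FLOWN * CHAR_FLT_CHG_HR, 2)"
)
conn.execute(
    "UPDATE CHARTER SET CHAR_TAX_CHG = ROUND(CHAR_FLT_CHG * ?, 2)", (TAX_RATE,)
)
conn.execute(
    "UPDATE CHARTER SET CHAR_TOT_CHG = ROUND(CHAR_FLT_CHG + CHAR_TAX_CHG, 2)"
)

charges = {
    trip: (flt, tax, tot)
    for trip, flt, tax, tot in conn.execute(
        "SELECT CHAR_TRIP, CHAR_FLT_CHG, CHAR_TAX_CHG, CHAR_TOT_CHG FROM CHARTER"
    )
}
print(charges)
```

The order of the UPDATE statements matters: running the tax or total update before the flight charge has been filled in would leave those columns null.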
Case

EliteVideo is a start-up company providing a concierge DVD kiosk service in upscale neighbourhoods. EliteVideo can own several copies (VIDEO) of each movie (MOVIE). For example, a kiosk may have 10 copies of the movie Cry, the Beloved Country. In the database, Cry, the Beloved Country would be one MOVIE, and each copy would be a VIDEO. A rental transaction (RENTAL) involves one or more videos being rented to a member (MEMBERSHIP). A video can be rented many times over its lifetime; therefore, there is an M:N relationship between RENTAL and VIDEO. DETAILRENTAL is the bridge table to resolve this relationship. The complete ERD is provided in Figure P9.5.

Figure P9.5 The Ch09_MovieCo ERD

43 Write the SQL code to create the table structures for the entities shown in Figure P9.5. The structures should contain the attributes specified in the ERD. Use data types that are appropriate for the data that will need to be stored in each attribute. Enforce primary key and foreign key constraints as indicated by the ERD.
44 The following tables provide a very small portion of the data that will be kept in the database. The data needs to be inserted into the database for testing purposes. Write the INSERT commands necessary to place the following data in the tables that were created in Problem 43. (If required by your DBMS, be certain to save the rows permanently.)
Table P9.1 Membership table
MEM_NUM  MEM_FNAME  MEM_LNAME   MEM_STREET             MEM_CITY      MEM_PROV       MEM_POSTALCODE  MEM_BALANCE
102      Tami       Dawson      26 Takli Circle        East London   Eastern Cape   5200            110
103      Koert      Wessels     45 Cornell Court       Durban        KZN            4001            60
104      Jamal      Melendez    78 East 145th Avenue   Pretoria      Gauteng        0001            0
105      Palesa     Mamorobela  60 Musket Ball Circle  Cape Town     Western Cape   7100            150
106      Nasima     Carrim      446 Maxwell Place      Durban        KZN            4001            0
107      Rose       Ledimo      78 Danner Avenue       Polokwane     Limpopo        0700            50
108      Mattie     Smith       430 Evergreen Street   Bloemfontein  Free State     9300            0
109      Clint      Taylor      171 Elm Street         Cape Town     Western Cape   7100            100
110      Thabang    Moroe       24 Southwind Circle    Johannesburg  Gauteng        2001            0
111      Stacy      Mann        89 East Cook Avenue    Upington      Northern Cape  8801            80
112      Louis      Du Toit     26 Melvin Avenue       Mbombela      Mpumalanga     1200            30
113      Sulaiyman  Philander   430 Vasili Drive       Polokwane     Limpopo        0700            0

Table P9.2 Rental table
RENT_NUM  RENT_DATE  MEM_NUM
1001      01-MAR-20  103
1002      01-MAR-20  105
1003      02-MAR-20  102
1004      02-MAR-20  110
1005      02-MAR-20  111
1006      02-MAR-20  107
1007      02-MAR-20  104
1008      03-MAR-20  105
1009      03-MAR-20  111
Table P9.3 Detailrental table
RENT_NUM  VID_NUM  DETAIL_FEE  DETAIL_DUEDATE  DETAIL_RETURNDATE  DETAIL_DAILYLATEFEE
1001      34342    20          04-MAR-20       02-MAR-20          10
1001      61353    20          04-MAR-20       03-MAR-20          10
1002      59237    35          04-MAR-20       04-MAR-20          30
1003      54325    35          04-MAR-20       09-MAR-20          30
1003      61369    20          06-MAR-20       09-MAR-20          10
1003      61388    0           06-MAR-20       09-MAR-20          10
1004      44392    35          05-MAR-20       07-MAR-20          30
1004      34367    35          05-MAR-20       07-MAR-20          30
1004      34341    20          07-MAR-20       07-MAR-20          10
1005      34342    20          07-MAR-20       05-MAR-20          10
1005      44397    35          05-MAR-20       05-MAR-20          30
1006      34366    35          05-MAR-20       04-MAR-20          30
1006      61367    20          07-MAR-20                          10
1007      34368    35          05-MAR-20                          30
1008      34369    35          05-MAR-20       05-MAR-20          30
1009      54324    35          05-MAR-20                          30
1001      34366    35          04-MAR-20       02-MAR-20          30

Table P9.4 Video table
VID_NUM  VID_INDATE  MOVIE_NUM
54321    18-JUN-19   1234
54324    18-JUN-19   1234
54325    18-JUN-19   1234
34341    22-JAN-18   1235
34342    22-JAN-18   1235
34366    02-MAR-20   1236
34367    02-MAR-20   1236
34368    02-MAR-20   1236
34369    02-MAR-20   1236
44392    21-OCT-19   1237
44397    21-OCT-19   1237
59237    14-FEB-20   1237
61388    25-JAN-18   1239
61353    28-JAN-17   1245
61354    28-JAN-17   1245
61367    30-JUL-19   1246
61369    30-JUL-19   1246
Table P9.5 Movie table
MOVIE_NUM  MOVIE_TITLE                 MOVIE_YEAR  MOVIE_COST  MOVIE_GENRE  PRICE_CODE
1234       The Cesar Family Christmas  2016        39.95       FAMILY       2
1235       Smokey Mountain Wildlife    2013        59.95       ACTION       1
1236       Richard Goodhope            2017        59.95       DRAMA        2
1237       Beatnik Fever               2016        29.95       COMEDY       2
1238       Constant Companion          2017        89.95       DRAMA
1239       Where Hope Dies             2007        25.49       DRAMA        3
1245       Time to Burn                2014        45.49       ACTION       1
1246       What He Doesn't Know        2015        58.29       COMEDY       1
Table P9.6 Price table
PRICE_CODE  PRICE_DESCRIPTION  PRICE_RENTFEE  PRICE_DAILYLATEFEE
1           Standard           20             10
2           New Release        35             30
3           Discount           15             10
4           Weekly Special     10             05

For Questions 45–59, use the tables that were created in Problem 43 and the data that was loaded into those tables in Problem 44.
45 Write the SQL command to change the movie year for movie number 1245 to 2014.
46 Write the SQL command to change the price code for all action movies to price code 3.
47 Write a single SQL command to increase all price rental fee values in the PRICE table by ZAR7.00.
48 Alter the DETAILRENTAL table to include a derived attribute named DETAIL_DAYSLATE to store integers of up to three digits. The attribute should accept null values.
49 Update the DETAILRENTAL table to set the values in DETAIL_RETURNDATE to include a time component. Make each entry match the values shown in the following table.

Table P9.7 Updates for the Detailrental table
RENT_NUM  VID_NUM  DETAIL_RETURNDATE
1001      34342    02-MAR-20 10:00am
1001      61353    03-MAR-20 11:30am
1002      59237    04-MAR-20 03:30pm
1003      54325    09-MAR-20 04:00pm
1003      61369    09-MAR-20 04:00pm
1003      61388    09-MAR-20 04:00pm
1004      44392    07-MAR-20 09:00am
1004      34367    07-MAR-20 09:00am
1004      34341    07-MAR-20 09:00am
1005      34342    05-MAR-20 12:30pm
1005      44397    05-MAR-20 12:30pm
1006      34366    04-MAR-20 10:15pm
1006      61367
1007      34368
1008      34369    05-MAR-20 09:30pm
1009      54324
1001      34366    02-MAR-20 10:00am
50 Alter the VIDEO table to include an attribute named VID_STATUS to store character data up to four characters long. The attribute should have a constraint to enforce the domain ("IN", "OUT" and "LOST") and have a default value of "IN".
51 Update the VID_STATUS attribute of the VIDEO table using a subquery to set the VID_STATUS to "OUT" for all videos that have a null value in the DETAIL_RETURNDATE attribute of the DETAILRENTAL table.
52 Alter the PRICE table to include an attribute named PRICE_RENTDAYS to store integers of up to two digits. The attribute should not accept null values, and it should have a default value of 3.
53 Update the PRICE table to place the values shown in the following table in the PRICE_RENTDAYS attribute.

Table P9.8 Updates for the Price table
PRICE_CODE  PRICE_RENTDAYS
1           50
2           30
3           50
4           70
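The pattern asked for in Problems 50 and 51, a new column with a default and a domain constraint followed by an update driven by a subquery, might look as follows. This is only a sketch of one possible approach: it runs on SQLite through Python's sqlite3 module, so the syntax and data types may differ from those of your DBMS, and the tables are cut down to the columns the example needs. The three sample rows come from the test data above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE video (vid_num INTEGER PRIMARY KEY);
CREATE TABLE detailrental (
    rent_num          INTEGER,
    vid_num           INTEGER,
    detail_returndate TEXT
);
INSERT INTO video VALUES (34342), (61367), (34368);
INSERT INTO detailrental VALUES
    (1001, 34342, '2020-03-02'),
    (1006, 61367, NULL),
    (1007, 34368, NULL);
""")

# Add the status column with a default and a domain (CHECK) constraint.
conn.execute("""
ALTER TABLE video
ADD COLUMN vid_status TEXT DEFAULT 'IN'
    CHECK (vid_status IN ('IN', 'OUT', 'LOST'))
""")

# Mark as OUT every video that has a rental row with no return date.
conn.execute("""
UPDATE video
SET vid_status = 'OUT'
WHERE vid_num IN (SELECT vid_num
                  FROM detailrental
                  WHERE detail_returndate IS NULL)
""")

status = dict(conn.execute("SELECT vid_num, vid_status FROM video"))
print(status)
```

Existing rows pick up the default value when the column is added, so only the videos that are still out need to be touched by the update.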
54 Create a trigger named trg_late_return that will write the correct value to DETAIL_DAYSLATE in the DETAILRENTAL table whenever a video is returned. The trigger should execute as a BEFORE trigger when the DETAIL_RETURNDATE or DETAIL_DUEDATE attributes are updated. The trigger should satisfy the following conditions:
a. If the return date is null, then the days late should also be null.
b. If the return date is not null, then the days late should determine if the video is returned late.
c. If the return date is noon of the day after the due date or earlier, then the video is not considered late, and the days late should have a value of zero (0).
d. If the return date is past noon of the day after the due date, then the video is considered late, so the number of days late must be calculated and stored.
55 Create a trigger named trg_mem_balance that will maintain the correct value in the membership balance in the MEMBERSHIP table when videos are returned late. The trigger should execute as an AFTER trigger when the due date or return date attributes are updated in the DETAILRENTAL table. The trigger should satisfy the following conditions:
a. Calculate the value of the late fee prior to the update that triggered this execution of the trigger. The value of the late fee is the days late multiplied by the daily late fee. If the previous value of the late fee was null, then treat it as zero (0).
b. Calculate the value of the late fee after the update that triggered this execution of the trigger. If the value of the late fee is now null, then treat it as zero (0).
c. Subtract the prior value of the late fee from the current value of the late fee to determine the change in late fee for this video rental.
d. If the amount calculated in Part c is not zero (0), then update the membership balance by the amount calculated for the membership associated with this rental.
56 Create a sequence named rent_num_seq to start with 1100 and increment by 1. Do not cache any values.
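Before writing the triggers, it can help to check the arithmetic behind Problems 54 and 55 outside the DBMS. The sketch below is plain Python, not trigger code, and the function names are hypothetical. It also assumes that lateness is counted in whole calendar days past the due date, which the problem statement does not spell out.

```python
from datetime import date, datetime, time, timedelta
from typing import Optional


def days_late(due: date, returned: Optional[datetime]) -> Optional[int]:
    """Days late under the noon-of-the-next-day grace rule."""
    if returned is None:
        return None                      # not yet returned: days late stays null
    grace_end = datetime.combine(due + timedelta(days=1), time(12, 0))
    if returned <= grace_end:
        return 0                         # returned by noon of the following day
    # Assumption: lateness is counted in whole calendar days past the due date.
    return (returned.date() - due).days


def late_fee(days: Optional[int], daily_fee: Optional[float]) -> float:
    """Late fee, with null inputs treated as zero."""
    return (days or 0) * (daily_fee or 0)


def balance_change(old_days, new_days, daily_fee) -> float:
    """Amount to add to the membership balance after an update."""
    return late_fee(new_days, daily_fee) - late_fee(old_days, daily_fee)


# Rental 1003, video 54325: due 04-MAR-20, returned 09-MAR-20 at 4:00pm.
late = days_late(date(2020, 3, 4), datetime(2020, 3, 9, 16, 0))
print(late, balance_change(None, late, 30))
```

Under these assumptions the example rental is five days late, and with a daily late fee of 30 the membership balance would rise by 150.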
57 Create a stored procedure named prc_new_rental to insert new rows in the RENTAL table. The procedure should satisfy the following conditions:
a. The membership number will be provided as a parameter.
b. Use a Count() function to verify that the membership number exists in the MEMBERSHIP table. If it does not exist, then a message should be displayed that the membership does not exist and no data should be written to the database.
c. If the membership does exist, then retrieve the membership balance and display a message that the balance amount is the previous balance. (For example, if the membership has a balance of R5.00, then display "Previous balance: R5.00".)
d. Insert a new row in the rental table using the rent_num_seq sequence created above to generate the value for RENT_NUM, the current system date for the RENT_DATE value, and the membership number provided as the value for MEM_NUM.
58 Create a stored procedure named prc_new_detail to insert new rows in the DETAILRENTAL table. The procedure should satisfy the following requirements:
a. The video number will be provided as a parameter.
b. Verify that the video number exists in the VIDEO table. If it does not exist, then display a message that the video does not exist, and do not write any data to the database.
c. If the video number does exist, then verify that the VID_STATUS for the video is "IN". If the status is not "IN", then display a message that the return of the video must be entered before it can be rented again, and do not write any data to the database.
d. If the status is "IN", then retrieve the values of the video's PRICE_RENTFEE, PRICE_DAILYLATEFEE and PRICE_RENTDAYS from the PRICE table.
e. Calculate the due date for the video rental by adding the number of days in PRICE_RENTDAYS to 11:59:59PM (hours:minutes:seconds) of the current system date.
f. Insert a new row in the DETAILRENTAL table using the previous value returned by RENT_NUM_SEQ as the RENT_NUM, the video number provided in the parameter as the VID_NUM, the PRICE_RENTFEE as the value for DETAIL_FEE, the due date calculated above for the DETAIL_DUEDATE, PRICE_DAILYLATEFEE as the value for DETAIL_DAILYLATEFEE, and null for the DETAIL_RETURNDATE.
59 Create a stored procedure named prc_return_video to enter data about the return of videos that have been rented. The procedure should satisfy the following requirements:
a. The video number will be provided as a parameter.
b. Verify that the video number exists in the VIDEO table. If it does not exist, display a message that the video number provided was not found and do not write any data to the database.
c. If the video number does exist, then use a Count() function to ensure that the video has only one record in DETAILRENTAL for which it does not have a return date. If more than one row in DETAILRENTAL indicates that the video is rented but not returned, display an error message that the video has multiple outstanding rentals and do not write any data to the database.
d. If the video does not have any outstanding rentals, then update the video status to "IN" for the video in the VIDEO table, and display a message that the video had no outstanding rentals but is now available for rental. If the video has only one outstanding rental, then update the return date to the current system date, and update the video status to "IN" for that video in the VIDEO table. Then display a message that the video was successfully returned.
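The control flow requested for prc_new_rental in Problem 57 can be prototyped as an ordinary function before it is written as a stored procedure. The sketch below is written in Python against SQLite and is not a stored procedure: SQLite offers neither stored procedures nor CREATE SEQUENCE, so a Python counter stands in for rent_num_seq, and the function returns its message instead of displaying it. The column data types are assumptions.

```python
import sqlite3
from datetime import date
from itertools import count

# SQLite has no CREATE SEQUENCE, so a Python counter stands in for
# rent_num_seq (start 1100, increment 1). This is an assumption of the sketch.
rent_num_seq = count(1100)


def prc_new_rental(conn: sqlite3.Connection, mem_num: int) -> str:
    """Insert a RENTAL row for an existing member and return the message."""
    (found,) = conn.execute(
        "SELECT COUNT(*) FROM membership WHERE mem_num = ?", (mem_num,)
    ).fetchone()
    if found == 0:
        return f"Membership {mem_num} does not exist"   # nothing is written
    (balance,) = conn.execute(
        "SELECT mem_balance FROM membership WHERE mem_num = ?", (mem_num,)
    ).fetchone()
    conn.execute(
        "INSERT INTO rental (rent_num, rent_date, mem_num) VALUES (?, ?, ?)",
        (next(rent_num_seq), date.today().isoformat(), mem_num),
    )
    conn.commit()
    return f"Previous balance: R{balance:.2f}"


conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE membership (mem_num INTEGER PRIMARY KEY, mem_balance REAL);
CREATE TABLE rental (rent_num INTEGER PRIMARY KEY, rent_date TEXT, mem_num INTEGER);
INSERT INTO membership VALUES (102, 110), (104, 0);
""")
print(prc_new_rental(conn, 102))
print(prc_new_rental(conn, 999))
```

The existence check comes first so that an unknown membership number never consumes a sequence value or writes a row; the same ordering carries over to a stored-procedure version.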
Part IV
Database Design
10 Database Development Process
11 Conceptual, Logical, and Physical Database Design
Business Vignette
EM-DAT: The International Disaster Database for Disaster Preparedness
In 1998, the first emergency events database known as EM-DAT was set up by the WHO Collaborating Centre for Research on the Epidemiology of Disasters (CRED) with support from the Belgian Government. The purpose of the database was to aid in decision making for disaster preparedness, as well as provide an objective base for vulnerability assessment and priority setting.1 During the last few years EM-DAT has become the main global reference database.
EM-DAT stores information on over 22 000 disasters that have occurred across the world from 1900 to the present day. Data is collected from many sources such as the United Nations, governments and the International Red Cross. The data is of various quality so is constantly checked for inconsistencies, data redundancy and incompleteness. Each natural disaster is recorded using a unique disaster number identifier, the disaster type, subtypes, associated disasters, start and end dates, and location. Disasters are classified into 15 types of natural disasters (and more than 30 subtypes) and technological disasters which cover 15 disaster types. This now means that if a natural disaster affects a number of countries all the data that are collected from each country can be recorded under one unique reference number. For example, the 2004 tsunami in South East Asia affected 13 different countries but is recorded as one single event. From the database, disaster-related economic damage estimates can be obtained and also details of international aid contributions for specific disasters. Each year EM-DAT aids CRED in conducting a review of disaster events throughout the year, e.g. in 2018, there were 281 climate-related and geophysical events recorded in the EM-DAT with 10 733 deaths, and over 60 million people affected across the world.2 Data analytics is used to produce summary tables of people in different geographical locations who have been affected by specific disaster types.
1 Information about EM-DAT is available: www.emdat.be/database
2 2018 Review Of Disaster Events, Centre for Research on the Epidemiology of Disasters, 2019.
The 2018
review
concludes
better
disaster
a need
to
The
ensure
disasters
which
effective
effective
data is
of good
3
Copyright review
2020 has
Cengage deemed
and
any
disaster
planning.
and
education
for risk
of
global
risk.
is to
Risk
management.
The
governments
Wahlstrm,
Reduction
You
to
mitigate the
Margareta
Disaster
natural
measures.3
a tool for
The intention
in reducing
disaster
effects reduction
for
quality,
Reduction
(UNISDR),
20152030
The four
disaster
action
risk
Back
actions
is
is recorded
which
states
cannot
to
and Priority
manage
it
and from
has
the
a people-centred 1.
manage
to
disaster
risk . . . Priority
3.
preparedness
and reconstruction.4
been
trustworthy
approach
disaster
rehabilitation never
Sendai
Understanding
disaster
4. Enhancing
in recovery,
Hence,
accurately
adopts
is implementing
are: Priority
governance
Better
data.
currently
priorities
for resilience,
to Build
these
Strategy
for
Framework
Learning. that
disaster
Secretary-General
this to
that there is
more important
sources
to
enable
to its
ensure
true
worth
effectively.
International Sendai
of
attributing
be a focus. and
and provides
successful
Reduction
risk reduction
completing
be utilised
4
Risk reduction.
response
to
should
occurrences
years,
acknowledges
be identified
of the
Risk
2. Strengthening
disaster
Central
to
Editorial
risk
in
measures
to
collection
development
to
required
critical
Disaster
Disaster
Priority
for
for
for
Investing
data
previous
the report
measure.
disaster
risk . . .
is
than in
However,
of the
aid in the
Representative
Office
Framework reducing
to
the funding
to information
UN
consistent
populations
Special
what you cannot The
used
was lower
standards.
statistics
life through
Nations
Access
and
provides
vulnerable
determine
death toll
and living
complete
are then
allows
of human
United
2018 the
website
use in order to
that
that
EM-DAT
information
loss
that in
management
for Disaster Reduction. Available: www.unisdr.org/disaster-statistics/introduction.htm
Disaster Risk Reduction. Available: www.unisdr.org/we/coordinate/sendai-framework
Chapter 10 Database Development Process
In this chapter, you will learn:
That successful database design must reflect the information system of which the database is a part
That successful information systems are developed within a framework known as the Systems Development Life Cycle (SDLC)
That, within the information system, the most successful databases are subject to frequent evaluation and revision within a framework known as the Database Life Cycle (DBLC)
How to conduct evaluation and revision within the SDLC and DBLC frameworks
About database design strategies: top-down vs bottom-up design and centralised vs decentralised design
Common threats to the security of the data and which security measures could be put in place
The importance of database administration in an organisation
The technical and managerial roles of the database administrator (DBA)
Preview
Databases are a part of a larger picture called an information system. Database designs that fail to recognise that the database is part of this larger whole are not likely to be successful. That is, database designers must recognise that the database is a critical means to an end rather than an end in itself. Managers want the database to serve their management needs, but too many databases seem to require that managers alter their routines to fit the database requirements.
Information systems don't just happen; they are the product of a carefully staged development process. Systems analysis is used to determine the need for an information system and to establish its limits. Within systems analysis, the actual information system is created through a process known as systems development.
The creation and evolution of information systems follows an iterative pattern called the Systems Development Life Cycle, a continuous process of creation, maintenance, enhancement and replacement of the information system. A similar cycle applies to databases. The database is created, maintained and enhanced, and eventually replaced.
This chapter also explores two very important issues: data security and database administration. Data is a corporate resource and is critical to the organisation, and a data breach could have serious implications. You will learn briefly about which threats can affect the security of the data and about measures that can be adopted to protect the data. Database administration is also important within an organisation. Therefore, data must be fully understood and accepted as a corporate resource before a sound data administration strategy can be implemented. In this chapter, you will learn about important data management issues by looking at the managerial and technical roles of the database administrator (DBA).
10.1 The Information System
A database is a carefully designed and constructed repository of facts. The fact repository is a part of a larger whole, known as an information system. An information system provides for data collection, storage and retrieval. It also facilitates the transformation of data into information and the management of both data and information. Thus, a complete information system is composed of people, hardware, software, the database(s), application programs and procedures. Systems analysis is the process that establishes the need for and the scope of an information system. The process of creating an information system is known as systems development.

Note
This chapter is not meant to cover all aspects of systems analysis and development; these are usually covered in a separate course or book. However, this chapter should help you develop a better understanding of database design, implementation and management issues that are affected by the information system in which the database is a critical component.

Within the framework of systems development, applications transform data into the information that forms the basis of decision making. Applications usually produce formal reports, tabulations and graphic displays designed to produce insight. Figure 10.1 illustrates that every application is composed of two parts: the data and the code (program instructions) by which the data are transformed into information. Data and code work together to represent real-world business functions and activities. At any given moment, physically stored data represent a snapshot of the business. But the picture is not complete without an understanding of the business activities that are represented by the code.

Figure 10.1 Generating information for decision making
(Diagram labels: Data, Application code, Information, Decisions)

The performance of an information system depends on a triad of factors:
Database design and implementation.
Application design and implementation.
Administrative procedures.
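The split between data and code described for Figure 10.1 can be made concrete with a very small program. The sketch below is illustrative only: the rows are a handful of rentals in the style of the EliteVideo test data from Chapter 9, and the function name is hypothetical.

```python
from collections import Counter

# Data: stored facts (a few RENTAL rows as (rent_num, rent_date, mem_num)).
rentals = [
    (1001, "01-MAR-20", 103),
    (1002, "01-MAR-20", 105),
    (1003, "02-MAR-20", 102),
    (1004, "02-MAR-20", 110),
    (1005, "02-MAR-20", 111),
]


# Code: instructions that transform the data into information.
def rentals_per_day(rows):
    return Counter(rent_date for _, rent_date, _ in rows)


# Information: a summary a manager can act on.
summary = rentals_per_day(rentals)
busiest_day, busiest_count = summary.most_common(1)[0]
print(f"Busiest day: {busiest_day} ({busiest_count} rentals)")
```

The list of rows on its own is only a snapshot of the business; it is the counting logic that turns the snapshot into something that can support a decision, such as when to restock a kiosk.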
Although this book emphasises the database design and implementation segment of the triad (arguably the most important), failure to address the other two segments will likely yield a poorly functioning information system. Creating a sound information system is hard work: systems analysis and development require much planning to ensure that all of the activities will interface with one another, that they complement one another, and that they are completed on time.
In a broad sense, the term database development describes the process of database design and implementation. The primary objective in database design is to create complete, normalised, nonredundant (to the extent possible) and fully integrated conceptual, logical and physical database models. The implementation phase includes creating the database storage structure, loading data into the database and providing for data management.
To make the procedures discussed in this chapter broadly applicable, the chapter focuses on the elements that are common to all information systems. Most of the processes and procedures described in this chapter do not depend on the size, type or complexity of the database being implemented. However, the procedures that would be used to design a small database, such as one for a local shoe shop, do not precisely scale up to the procedures that would be needed to design a database for a large corporation or even a segment of such a corporation. To use an analogy, building a small house requires a blueprint, just as building the Moses Mabhida stadium does, but the stadium requires more complex and far-ranging planning, analysis and design than the house.
The next sections trace the overall systems development life cycle and the related database life cycle. Once you are familiar with those processes and procedures, you will learn about different general approaches to database design, such as top-down vs bottom-up and centralised vs decentralised design.
Note
The Systems Development Life Cycle (SDLC) is a general framework through which you can track and come to understand the activities required to develop and maintain information systems. Within that framework, there are different ways to complete various tasks specified in the SDLC. For example, this text's focus is on ER modelling and on relational database design and implementation, and that focus is maintained in this chapter. However, there are alternative methodologies:
Unified Modelling Language (UML) provides modelling tools to support the tasks associated with the development of information systems. UML is covered in Appendix G, Unified Modelling Language (UML), which is part of the online resources.
Rapid Application Development (RAD)5 is an iterative software development methodology that uses prototypes, CASE tools and flexible management to develop application systems. RAD started as an alternative to traditional structured development, which suffered from long deliverable times and unfulfilled requirements.
Agile Software Development6 is a framework for developing software applications that divides the work into smaller subprojects to obtain valuable deliverables in shorter times and with better cohesion. This method emphasises close communication among all users and continuous evaluation with the purpose of increasing customer satisfaction.
Although the development methodologies may change, the basic framework within which they are used does not change.
5 See Rapid Application Development, James Martin, Prentice-Hall, Macmillan College Division, 1991.
6 For more information about Agile Software Development, go to www.agilealliance.org.
10.2 The Systems Development Life Cycle (SDLC)
The Systems Development Life Cycle (SDLC) traces the history (life cycle) of an information system. Perhaps more important to the system designer, the SDLC provides the big picture within which the database design and application development can be mapped out and evaluated.

Figure 10.2 The Systems Development Life Cycle (SDLC)
Phase                    Action(s)                                                               Section
Planning                 Initial assessment; feasibility study                                   10.2.1
Analysis                 User requirements; existing system evaluation; logical system design    10.2.2
Detailed systems design  Detailed system specification                                           10.2.3
Implementation           Coding, testing and debugging; installation, fine-tuning                10.2.4
Maintenance              Evaluation; maintenance; enhancement                                    10.2.5
As illustrated in Figure 10.2, the traditional SDLC is divided into five phases: planning, analysis, detailed systems design, implementation and maintenance. The SDLC is an iterative rather than a sequential process. For example, the details of the feasibility study might help refine the initial assessment, and the details discovered during the user requirements portion of the SDLC might help refine the feasibility study. Because the Database Life Cycle (DBLC) fits into and resembles the SDLC, a brief description of the SDLC is in order.
10.2.1 Planning
The SDLC planning phase yields a general overview of the company and its objectives. An initial assessment of the information-flow-and-extent requirements must be made during this discovery portion of the SDLC. Such an assessment should answer some important questions:
Should the existing system be continued? If the information generator does its job well, there is no point in modifying or replacing it. To quote an old saying, "If it's not broken, don't fix it."
Should the existing system be modified? If the initial assessment indicates deficiencies in the extent and flow of the information, minor (or even major) modifications may be in order. When considering modifications, the participants in the initial assessment must keep in mind the distinction between wants and needs.
Should the existing system be replaced? The initial assessment might indicate that the current system's flaws are beyond fixing. Given the effort required to create a new system, a careful distinction between wants and needs is perhaps even more important in this case than it is in modifying the system.
Participants in the SDLC's initial assessment must begin to study and evaluate alternative solutions. If it is decided that a new system is necessary, the next question is whether it is feasible. The feasibility study must address the following:
The technical aspects of hardware and software requirements. The decisions might not (yet) be vendor-specific, but they must address the nature of the hardware requirements (desktop, mid-range, mainframe, supercomputer or mobile device) and the software requirements (single- or multi-user operating systems, database type and software, programming languages to be used by the applications, and so on).
The system cost. The admittedly mundane question, "Can we afford it?" is crucial (and the answer might force a careful review of the initial assessment). It bears repeating that a million-rand solution to a thousand-rand problem is not defensible. A decision may need to be made between building a system in-house or buying (with customisation) a third-party vendor system that meets the business needs. The aim is to find the most cost-effective solution.
The operational cost. Does the company have the human, technical and financial resources to keep the system operational? The impact of the new system on the company's culture should be assessed, as people's resistance to change should not be underestimated.
10.2.2 Analysis
Problems defined during the planning phase are examined in greater detail during the analysis phase. A macroanalysis must be made of both individual needs and organisational needs, addressing questions such as:
What are the requirements of the current system's end users?
Do those requirements fit into the overall information requirements?
The analysis phase of the SDLC is, in effect, a thorough audit of user requirements.
The existing hardware and software systems are also studied during the analysis phase. The result of analysis should be a better understanding of the system's functional areas, actual and potential problems, and opportunities.
End users and the system designer(s) must work together to identify processes and to uncover potential problem areas. Such cooperation is vital to defining the appropriate performance objectives by which the new system can be judged.
Along with a study of user requirements and the existing systems, the analysis phase also includes the creation of a logical systems design. The logical design must specify the appropriate conceptual data model, inputs, processes and expected output requirements.
When creating a logical design, the designer might use tools such as data flow diagrams (DFDs), hierarchical input process output (HIPO) diagrams and entity relationship (ER) diagrams. The database design's data-modelling activities take place at this point to discover and describe all entities and their attributes and the relationships among the entities within the database.
Defining the logical system also yields functional descriptions of the system's components (modules) for each process within the database environment. All data transformations (processes) are described and documented, using such systems analysis tools as DFDs. The conceptual data model is validated against those processes.
10.2.3 Detailed Systems Design
In the detailed systems design phase, the designer completes the design of the system's processes. The design includes all necessary technical specifications for the screens, menus, reports and other devices that might be used to help make the system a more efficient information generator. The steps are laid out for conversion from the old to the new system. Training principles and methodologies are also planned and must be submitted for management's approval.
NOTE
Because attention has been focused on the details of the systems design process, this book has not until this point explicitly recognised the fact that management's approval is needed at all stages of the process. Such approval is needed because a GO decision requires funding. There are many GO/NO GO decision points along the way to a completed systems design!
10.2.4 Implementation
During the implementation phase, the hardware, DBMS software and application programs are installed, and the database design is implemented. During the initial stages of the implementation phase, the system enters into a cycle of coding, testing and debugging until it is ready to be delivered. The actual database is created, and the system is customised by the creation of tables and views, user authorisations, and so on.
The database contents may be loaded interactively or in batch mode, using a variety of methods and devices:
Customised user programs.
Database interface programs.
Conversion programs that import the data from a different file structure, using batch programs, a database utility, or both.
The system is subjected to exhaustive testing until it is ready for use. Traditionally, the implementation and testing of a new system took 50 to 60 per cent of the total development time. However, the advent of sophisticated application generators and debugging tools has substantially decreased coding and testing time. After testing is concluded, the final documentation is reviewed and printed, and end users are trained. The system is in full operation at the end of this phase but will be continuously evaluated and fine-tuned.
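The idea of a conversion program that imports data from a different file structure in batch mode can be sketched as follows. This is a minimal illustration only, assuming Python's built-in sqlite3 and csv modules; the legacy file layout, the customer table and the function name load_customers are hypothetical and do not come from the text.

```python
import csv
import io
import sqlite3

# Hypothetical export from an old file-based system (illustrative data only).
LEGACY_FILE = io.StringIO(
    "cust_no;cust_name;balance\n"
    "1001;Ngcobo;250.00\n"
    "1002;Smith;0.00\n"
    "1003;Dlamini;75.50\n"
)


def load_customers(conn, legacy_file):
    """Convert semicolon-delimited legacy rows and load them as one batch."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS customer ("
        " cust_no INTEGER PRIMARY KEY,"
        " cust_name TEXT NOT NULL,"
        " balance REAL NOT NULL)"
    )
    reader = csv.DictReader(legacy_file, delimiter=";")
    # Convert each legacy text field to the type the new table expects.
    rows = [
        (int(r["cust_no"]), r["cust_name"].strip(), float(r["balance"]))
        for r in reader
    ]
    with conn:  # one transaction: the whole batch commits or rolls back
        conn.executemany("INSERT INTO customer VALUES (?, ?, ?)", rows)
    return len(rows)


conn = sqlite3.connect(":memory:")
loaded = load_customers(conn, LEGACY_FILE)
print(loaded, "rows loaded")
```

Loading the batch inside a single transaction means a bad record leaves the new table untouched, which makes a failed conversion run easy to repeat during the coding, testing and debugging cycle.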
10.2.5 Maintenance
Almost as soon as the system is operational, end users begin to request changes in it. Those changes generate system maintenance activities, which can be grouped into three types:7
Corrective maintenance in response to systems errors.
Adaptive maintenance due to changes in the business environment.
Perfective maintenance to enhance the system.
Because every request for structural change requires retracing the SDLC steps, the system is, in a sense, always at some stage of the SDLC.
Each system has a predetermined operational life span. The actual operational life span of a system depends on its perceived utility. There are several reasons for reducing the operational life of certain systems. Rapid technological change is one reason, especially for systems based on processing speed and expandability. Another common reason is the cost of maintaining a system. If the system's maintenance cost is high, its value becomes suspect.
Computer-aided systems engineering (CASE) technology, such as System Architect or Visio Professional, helps make it possible to produce better systems within a reasonable amount of time and at a reasonable cost. In addition, the more structured, better-documented and especially standardised implementation of CASE-produced applications tends to prolong the operational life of systems by making them easier and cheaper to update and maintain.
10.3 The Database Life Cycle (DBLC)
Within the larger information system, the database, too, is subject to a life cycle. The Database Life Cycle (DBLC) contains six phases (Figure 10.3): database initial study, database design, implementation and loading, testing and evaluation, operation, and maintenance and evolution.
FIGURE 10.3 The Database Life Cycle (DBLC)
Phase: Database initial study. Action(s): Analyse the company situation; define problems and constraints; define objectives; define scope and boundaries. Section: 10.3.1
Phase: Database design. Action(s): Create the conceptual design; DBMS software selection; create the logical design; create the physical design. Section: 10.3.2
Phase: Implementation and loading. Action(s): Install the DBMS; create the database(s); load or convert the data. Section: 10.3.3
Phase: Testing and evaluation. Action(s): Test the database; fine-tune the database; evaluate the database and its application programs. Section: 10.3.4
Phase: Operation. Action(s): Produce the required information flow. Section: 10.3.5
Phase: Maintenance and evolution. Action(s): Introduce changes; make enhancements. Section: 10.3.6
10.3.1 The Database Initial Study
If a designer has been called in, chances are the current system has failed to perform functions deemed vital by the company (you don't call the plumber unless the pipes leak). So, in addition to examining the current system's operation within the company, the designer must determine how and why the current system fails. That means spending a lot of time talking with (but mostly listening to) end users. Although database design is a technical business, it is also people-orientated. Database designers must be excellent communicators, and they must have finely tuned interpersonal skills.
Depending on the complexity and scope of the database environment, the database designer might be a lone operator or part of a systems development team composed of a project leader, one or more senior systems analysts and one or more junior systems analysts. The word designer is used generically here to cover a wide range of design team compositions.
The overall purpose of the database initial study is to:
Analyse the company situation.
Define problems and constraints.
Define objectives.
Define scope and boundaries.
Figure 10.4 depicts the interactive and iterative processes required to complete the first phase of the DBLC successfully. As you examine Figure 10.4, note that the database initial study phase leads to the development of the database system objectives. Using Figure 10.4 as a discussion template, let's examine each of its components in greater detail.
7 See E. Reed Doke and Neil E. Swanson, 'Software maintenance revisited: a product life cycle perspective', Information Executive, 4(1), Winter 1991, pp. 8-11. Although the date on this reference may cause you to consider it outdated, it remains relevant today: the software environment changes with dizzying frequency, especially with respect to its interface, but most of the underlying principles of software design, implementation and management have enjoyed remarkable longevity.
FIGURE 10.4 A summary of activities in the database initial study: analysis of the company situation (company objectives, company operations, company structure); definition of problems and constraints; database system specifications (objectives, scope, boundaries).
Analyse the Company Situation
The company situation describes the general conditions in which a company operates, its organisational structure and its mission. To analyse the company situation, the database designer must discover what the company's operational components are, how they function and how they interact.
These issues must be resolved:
What is the organisation's general operating environment, and what is its mission within that environment? The design must satisfy the operational demands created by the organisation's mission. For example, a mail-order business is likely to have operational requirements involving its database that are quite different from those of a manufacturing business.
What is the organisation's structure? Knowing who controls what and who reports to whom is quite useful when you are trying to define required information flows, specific report and query formats, and so on.
Define Problems and Constraints
The designer has both formal and informal sources of information. If the company has existed for any length of time, it already has some kind of system in place (either manual or computer-based). How does the existing system function? What input does the system require? Which documents does the system generate? How is the system output used? By whom? Studying the paper trail can be very informative. Aside from the official version of the system's operation, there is also the more informal, real version; the designer must be clever enough to see how these differ.
The problem definition process might initially appear to be unstructured. Company end users are often unable to describe precisely the larger scope of company operations or to identify the real problems encountered during company operations. Often the managerial view of a company's operation and its problems is different from that of the end users, who perform the actual routine work.
Finding precise answers is important, especially concerning the operational relationships among business units. If a proposed system will solve the marketing department's problems but exacerbate those of the production department, not much progress will have been made. Using an analogy, suppose your home water bill is too high. You have determined the problem: the taps leak. The solution? You step outside and turn off the water supply to the house. Is that an adequate solution? Or would the replacement of tap washers do a better job of solving the problem? You might find the leaky tap scenario simplistic, yet almost any experienced database designer can find similar instances of so-called database problem solving (admittedly more complicated and less obvious).
Even the most complete and accurate problem definition does not always lead to the perfect solution. The real world usually intrudes to limit the design of even the most elegant database by imposing constraints. Such constraints include time, budget, personnel and more. If you must have a solution within a month and within a R20 000 budget, a solution that takes two years to develop at a cost of R800 000 is not a solution. The designer must learn to distinguish between what's perfect and what's possible.
NOTE
When trying to develop solutions, the database designer must look for the source of the problems. There are many cases of database systems that failed to satisfy the end users because they were designed to treat the symptoms of the problems rather than their source.
Define Objectives
A proposed database system must be designed to help solve at least the major problems identified during the problem discovery process. As the list of problems unfolds, several common sources are likely to be discovered.
FIGURE 10.5 Two views of data: business manager and designer
Company: Engineering, Purchasing, Manufacturing
Manager's view (shared information): What are the problems? What are the solutions? What information is needed to implement the solutions? What data is required to generate the desired information?
Designer's view (company database): How must the data be structured? How will the data be accessed? How is the data transformed into information?
As you begin to examine the procedures required to complete the design phase in the DBLC, remember these points:
The process of database design is loosely related to the analysis and design of a larger system. The data component is only one element of a larger information system.
The systems analysts or systems programmers are in charge of designing the other system components. Their activities create the procedures that will help transform the data within the database into useful information.
The database design does not constitute a sequential process. Rather, it is an iterative process that provides continuous feedback designed to trace previous steps.
The database design process is depicted in Figure 10.6.
FIGURE 10.6 Database design process
Stage: Conceptual design (Section 9-4; DBMS and hardware independent)
Data analysis and requirements: determine end-user views, outputs and transaction requirements.
Entity relationship modelling and normalisation: define entities, attributes, domains and relationships; draw ER diagrams; normalise entity attributes.
Data model verification: identify ER modules and validate insert, update and delete rules; validate reports, queries, views, integrity, access and security.
Distributed database design*: define the fragmentation and allocation strategy.
Stage: DBMS selection (Section 9-5; DBMS dependent)
Select the DBMS: determine DBMS and data model to use.
Stage: Logical design (Section 9-6; DBMS dependent)
Map conceptual model to logical model components: define tables, columns, relationships and constraints.
Validate logical model using normalisation: normalised set of tables.
Validate logical model integrity constraints: ensure entity and referential integrity; define column constraints.
Validate logical model against user requirements: ensure the model supports user requirements.
Stage: Physical design (Section 9-7; hardware dependent)
Define data storage organisation: define tables, indexes and views' physical organisation.
Define integrity and security measures: define users, security groups, roles and access controls.
Determine performance measures+: define database and query execution parameters.
* See Chapter 14, Distributed Databases.
+ See Chapter 13, Managing Database and SQL Performance.
In this section you will learn about each of the components in Figure 10.6. Knowing those details will help you successfully design and implement databases in a real-world setting. This chapter is only intended to provide an overview of these areas. In Chapter 11, Conceptual, Logical, and Physical Database Design, you will learn about each component in greater detail.
i. Conceptual Design
In the conceptual design stage, data modelling is used to create an abstract database structure that represents real-world objects in the most realistic way possible. The conceptual model must embody a clear understanding of the business and its functional areas. At this level of abstraction, the type of hardware and/or database model to be used might not yet have been identified. Therefore, the design must be software and hardware independent so the system can be set up within any hardware and software platform chosen later.
Keep in mind the following minimal data rule: All that is needed is there, and all that is there is needed.
In other words, make sure that all data needed are in the model and that all data in the model are needed. All data elements required by the database transactions must be defined in the model, and all data elements defined in the model must be used by at least one database transaction.
However, as you apply the minimal data rule, avoid an excessive short-term bias. Focus not only on the immediate data needs of the business, but also on the future data needs. Thus, the database design must leave room for future modifications and additions, ensuring that the business's investment in information resources will endure.
As you examine Figure 10.6, note that conceptual design requires the following four steps:
Data analysis and requirements
Entity relationship modelling and normalisation
Data model verification
Distributed database design
Each of these steps will be explained in detail in Chapter 11, Conceptual, Logical, and Physical Database Design.
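The minimal data rule stated earlier in this subsection (all that is needed is there, and all that is there is needed) lends itself to a simple mechanical check. The sketch below is only an illustration under stated assumptions: the data element names, the transaction names and the helper check_minimal_data_rule are hypothetical and are not taken from the text.

```python
# Hypothetical data model and transactions (names are illustrative only).
model_elements = {"CUST_NO", "CUST_NAME", "INV_NO", "INV_DATE", "CUST_FAX"}

transactions = {
    "create_invoice": {"CUST_NO", "INV_NO", "INV_DATE"},
    "customer_report": {"CUST_NO", "CUST_NAME", "CUST_EMAIL"},
}


def check_minimal_data_rule(model, txns):
    """Return (elements needed but not modelled, elements modelled but unused)."""
    used = set().union(*txns.values()) if txns else set()
    missing = used - model   # violates "all that is needed is there"
    unused = model - used    # violates "all that is there is needed"
    return missing, unused


missing, unused = check_minimal_data_rule(model_elements, transactions)
print("Needed but not in the model:", sorted(missing))
print("In the model but not needed:", sorted(unused))
```

An element reported as unused is a prompt for review rather than automatic removal, since the design should still leave room for future data needs.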
ii. DBMS Software Selection
The selection of DBMS software is critical to the information system's smooth operation. Consequently, the advantages and disadvantages of the proposed DBMS software should be carefully studied. To avoid false expectations, the end user must be made aware of the limitations of both the DBMS and the database.
Although the factors affecting the purchasing decision vary from company to company, some of the most common are:
Cost. Purchase, maintenance, operational, licence, installation, training and conversion costs.
DBMS features and tools. Some database software includes a variety of tools that facilitate the application development task. For example, the availability of query by example (QBE), screen painters, report generators, application generators, data dictionaries, and so on, helps to create a more pleasant work environment for both the end user and the application programmer. Database administrator facilities, query facilities, ease of use, performance, security, concurrency control, transaction processing and third-party support also influence DBMS software selection.
Underlying model. Hierarchical, network, relational, object/relational, or object-orientated.
Portability. Across platforms, systems and languages.
DBMS hardware requirements. Processor(s), RAM, disk space, and so on.
iii. Logical Design
The second stage in the database design cycle is known as logical design. The aim of the logical design stage is to map the conceptual model into a logical model that can then be implemented on a relational DBMS. The logical design stage consists of the following phases:
1 Creating the logical data model.
2 Validating the logical data model using normalisation.
3 Assigning and validating integrity constraints.
4 Merging logical models constructed for different parts of the database.
5 Reviewing the logical data model with the user.
You will learn in detail about logical design in Chapter 11, Conceptual, Logical, and Physical Database Design.
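Phase 3 above, assigning and validating integrity constraints, can be illustrated with a short sketch. It assumes Python's built-in sqlite3 module as a stand-in relational DBMS; the department and professor tables, their columns and the sample rows are hypothetical. Note that SQLite enforces foreign keys only after the PRAGMA shown is issued.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite checks foreign keys only when enabled
conn.executescript("""
CREATE TABLE department (
    dept_code TEXT PRIMARY KEY,            -- entity integrity
    dept_name TEXT NOT NULL
);
CREATE TABLE professor (
    prof_num  INTEGER PRIMARY KEY,         -- entity integrity
    prof_name TEXT NOT NULL,               -- column constraint
    dept_code TEXT NOT NULL
        REFERENCES department(dept_code)   -- referential integrity
);
INSERT INTO department VALUES ('CIS', 'Computer Information Systems');
INSERT INTO professor VALUES (1, 'A. Example', 'CIS');
""")


def try_insert(sql):
    """Run one INSERT and report whether the constraints accepted it."""
    try:
        with conn:
            conn.execute(sql)
        return "accepted"
    except sqlite3.IntegrityError as err:
        return f"rejected: {err}"


# A duplicate primary key should violate entity integrity.
dup = try_insert("INSERT INTO professor VALUES (1, 'Duplicate Key', 'CIS')")
# An unknown department code should violate referential integrity.
orphan = try_insert("INSERT INTO professor VALUES (2, 'No Department', 'XYZ')")
print(dup)
print(orphan)
```

Validating the constraints with deliberately bad rows, as here, is one inexpensive way to confirm that the logical model rejects what it is supposed to reject before the design is reviewed with the user.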
The right to use the database is also specified during the logical design phase. Who will be allowed to use the tables, and which portion(s) of the table(s) will be available to which users? Within a relational framework, the answers to those questions require the definition of appropriate access rights and views.
The logical design translates the software-independent conceptual model into a software-dependent model by defining the appropriate domain definitions, the required tables and the necessary access restrictions. The stage is now set to define the physical requirements that allow the system to function within the selected hardware environment.
iv. Physical Design
Physical design is the process of selecting the data storage and data access characteristics of the database. The storage characteristics are a function of the types of devices supported by the hardware, the type of data access methods supported by the system, and the DBMS. Physical design affects not only the location of the data in the storage device(s), but also the performance of the system. Physical design can be broken down into a number of stages:
1 Analyse data volume and database usage.
2 Translate each relation identified in the logical data model into tables.
3 Determine a suitable file organisation.
4 Define indexes.
5 Define user views.
6 Estimate data storage requirements.
7 Determine database security for users.
You will learn about each of these stages in more detail in Chapter 11, Conceptual, Logical, and Physical Database Design.
Physical design is a very technical job, more typical of the client/server and mainframe world than of the desktop world. Yet even in the more complex mid-range and mainframe environments, modern database software has assumed much of the burden of the physical portion of the design and its implementation.
Online Content
Physical design is particularly important in the older hierarchical and network models described in Appendices I and J, The Hierarchical Database Model and The Network Database Model, respectively, available on the online platform for this book. Relational databases are more insulated from physical details than the older hierarchical and network models.
In spite of the fact that relational models tend to hide the complexities of the computer's physical characteristics, the performance of relational databases is affected by physical-level characteristics. For example, performance can be affected by the characteristics of the storage media, such as seek time, sector and block (page) size, buffer pool size and number of disk platters and read/write heads. In addition, factors such as the creation of an index can have a considerable effect on the relational database's performance, that is, data access speed and efficiency.
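The effect of an index on data access can be observed by asking the DBMS how it plans to run a query. The following is a minimal sketch, assuming Python's built-in sqlite3 module and SQLite's EXPLAIN QUERY PLAN statement; the professor table, its generated rows and the index name idx_prof_name are hypothetical. The exact wording of the plan text varies between SQLite versions, so the sketch only prints it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE professor ("
    " prof_num INTEGER PRIMARY KEY, prof_name TEXT, dept_code TEXT)"
)
conn.executemany(
    "INSERT INTO professor VALUES (?, ?, ?)",
    [(i, f"Name{i}", "CIS") for i in range(1, 1001)],
)

QUERY = "SELECT prof_num, dept_code FROM professor WHERE prof_name = 'Name500'"


def plan(sql):
    """Return the access path SQLite reports for a statement."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return " | ".join(row[3] for row in rows)  # the last column holds the plan text


before = plan(QUERY)  # without an index, the whole table must be read
conn.execute("CREATE INDEX idx_prof_name ON professor (prof_name)")
after = plan(QUERY)   # with the index, matching rows can be located directly

print("Before index:", before)
print("After index: ", after)
```

On a table of this size the difference in elapsed time is negligible; the point of the sketch is only that the chosen access path changes once the index exists.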
Even the type of data request must be analysed carefully to determine the optimum access method for meeting the application requirements, establishing the data volume to be stored and estimating the performance. Some DBMSs automatically reserve the space required to store the database definition and the
Copyright Editorial
review
2020 has
Cengage deemed
users
Learning. that
any
All suppressed
data in
Rights
Reserved. content
does
permanent
May not
not materially
be
copied, affect
storage
scanned, the
overall
or
duplicated, learning
devices.
in experience.
whole
This ensures that the
or in Cengage
part.
Due Learning
to
electronic reserves
rights, the
right
some to
third remove
data are stored in sequentially
party additional
content
may content
be
suppressed at
any
time
from if
the
subsequent
eBook rights
and/or restrictions
eChapter(s). require
it
CHaPteR
adjacent
locations,
performance
thereby
tuning
Physical
is
design
performance
is
surprising
that
reducing
covered
data access time
in
becomes
detail
more
affected
by the
designers
favour
in
Chapter
complex
when
and increasing
13, data
communication
are
database
software
distributed
hides
and
at different Given
as
Development
performance.
Database
throughput.
that
Database
system
Managing
medias
10
SQL
(Database
because
complexities,
many of the
541
Performance.)
locations
such
Process
the
it is
physical-level
not
activities
as possible. The
preceding
sections
have
separated
the
discussions
In fact,
logical
and
physical
design
can
be carried
basis.
Logical
and
physical
design
can
also
with hierarchical
and network
understanding hardware
of the
models.
software
and
The output attributes, phase,
install This
step
of the
database
domains,
views,
you
the
many
Such
actually
is required the
technique
design
phase is
indexes,
only
database
when
to
on a table-by-table
parallel
when
require
take
the
full
the
activities.
(or
file-by-file)
Logical and physical design can be carried out in parallel. Such parallel activities require the designer to have a thorough understanding of the software and hardware in order to take full advantage of both software and hardware characteristics.

10.3.3 Implementation and Loading
The output of the database design phase is a series of instructions detailing the creation of tables, attributes, domains, views, indexes, security constraints, and storage and performance guidelines. In this phase, you actually implement all these design specifications.

Install the DBMS
This step is required only when a new dedicated instance of the DBMS is necessary for the system. In many cases, the organisation will have made a particular DBMS the standard to leverage investments in the technology and the skills that employees have already developed. The DBMS may be installed on a new server or on existing servers. One current trend is called virtualisation. Virtualisation is a technique that creates logical representations of computing resources that are independent of the underlying physical computing resources. The technique is used in many areas of computing, such as the creation of virtual servers, virtual storage and virtual private networks. In a database environment, database virtualisation refers to the installation of a new instance of the DBMS on a virtual server running on shared hardware. This is normally a task that involves system and network administrators to create appropriate user groups and services in the server configuration and network routing. Another common trend is the use of cloud database services such as Microsoft SQL Database Service or Amazon Relational Database Services (RDS). This new generation of services allows users to create databases that can be easily managed, tested and scaled up as needed.

Create the Database(s)
In most modern relational DBMSs, a new database implementation requires the creation of special storage-related constructs to house the end-user tables. The constructs usually include the storage group (or file groups), the table spaces and the tables. Figure 10.7 shows that a storage group can contain more than one table space and that a table space can contain more than one table.
For example, the implementation of the logical design in IBM's DB2 would require the following:
1 The system administrator (SYSADM) would create the database storage group. This step is mandatory for such mainframes as DB2. Other DBMS software may create equivalent storage groups automatically when a database is created. (See Step 2.) Consult your DBMS documentation to see whether you need to create a storage group and, if so, what the command syntax must be.
2 The SYSADM creates the database within the storage group.
3 The SYSADM assigns the rights to use the database to a database administrator (DBA).
4 The DBA creates the table space(s) within the database.
5 The DBA creates the table(s) within the table space(s).
6 The DBA assigns access rights to the table spaces and to the tables within specified table spaces. Access rights may be limited to views rather than to whole tables. The creation of views is not required for database access in the relational environment, but views are desirable from a security standpoint. For example, using the following command, access rights to a table named PROFESSOR may be granted to the user Miriam Ledimo, whose identification code is MLEDIMO:
GRANT SELECT ON PROFESSOR TO USER MLEDIMO;

FIGURE 10.7 Physical organisation of a Db2 database environment
[Figure: a storage group contains a database; the database is divided into table spaces, and each table space holds one or more tables.]
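The idea of limiting access to a view rather than to a whole table can be tried out on a small scale. The sketch below uses Python's built-in sqlite3 module purely for illustration. SQLite has no GRANT statement, so the privilege step itself is not shown; in a DBMS such as Db2 the DBA would grant SELECT on the view instead of on the base table. The column names and the view name PROFESSOR_PUBLIC are hypothetical.

```python
import sqlite3

# In-memory database used only for this illustration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE PROFESSOR (
        PROF_NUM    INTEGER PRIMARY KEY,
        PROF_NAME   TEXT NOT NULL,
        PROF_SALARY REAL NOT NULL      -- sensitive column (hypothetical)
    );
    INSERT INTO PROFESSOR VALUES (1, 'A. Example', 50000.0);

    -- The view exposes only the non-sensitive columns.
    CREATE VIEW PROFESSOR_PUBLIC AS
        SELECT PROF_NUM, PROF_NAME FROM PROFESSOR;
""")

# A user restricted to the view sees no salary data.
cursor = conn.execute("SELECT * FROM PROFESSOR_PUBLIC")
view_columns = [column[0] for column in cursor.description]
rows = cursor.fetchall()
print(view_columns, rows)
```

Because the salary column is not part of the view definition, a user whose rights are limited to the view cannot retrieve it, whatever query is written against the view.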
Load or Convert the Data
After the database has been created, the data must be loaded into the database tables. Typically, the data will have to be migrated from the prior version of the system. Often, data to be included in the system must be aggregated from multiple sources. In a best-case scenario, all of the data will be in a relational database so that it can be readily transferred to the new database. However, in some cases data may have to be imported from other relational databases, non-relational databases, flat files, legacy systems, or even manual paper-and-pencil systems. If the data format does not support direct importing into the new database, conversion programs may have to be created to reformat the data for importing. In a worst-case scenario, much of the data may have to be manually entered into the database. Once the data has been loaded, the DBA works with the application developers to test and evaluate the database.
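A conversion program of the kind mentioned above can be quite small. The following is a minimal sketch, assuming a hypothetical legacy flat file that uses semicolons as separators and day/month/year dates; the EMPLOYEE table, its columns and the sample rows are invented for illustration. The program reformats each record and loads it in a single transaction.

```python
import csv
import io
import sqlite3
from datetime import datetime

# Hypothetical legacy export: semicolon-separated, dates as DD/MM/YYYY.
LEGACY_EXPORT = """emp_id;name;hired
17;Ada Example;03/09/2015
42;Ben Sample;21/11/2018
"""

def convert_row(row):
    """Reformat one legacy record to match the new table's column formats."""
    hired = datetime.strptime(row["hired"], "%d/%m/%Y").date().isoformat()
    return int(row["emp_id"]), row["name"].strip(), hired

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE EMPLOYEE ("
    "EMP_ID INTEGER PRIMARY KEY, EMP_NAME TEXT NOT NULL, EMP_HIRED TEXT NOT NULL)"
)

reader = csv.DictReader(io.StringIO(LEGACY_EXPORT), delimiter=";")
with conn:  # commit if every row loads, roll back otherwise
    conn.executemany(
        "INSERT INTO EMPLOYEE VALUES (?, ?, ?)",
        (convert_row(row) for row in reader),
    )

loaded = conn.execute("SELECT * FROM EMPLOYEE ORDER BY EMP_ID").fetchall()
print(loaded)
```

Loading inside one transaction means a malformed record aborts the whole load rather than leaving the table half-filled, which makes a failed conversion easier to repeat.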
Loading existing data into a cloud-based database service can sometimes be expensive. The reason for this is that most cloud services are priced based not only on the volume of data to be stored but also on the amount of data that travels over the network. In such cases, loading a 1 TB database could be a very expensive proposition. Therefore, system administrators must be very careful in reading and negotiating the terms of the cloud service contracts to ensure that there will be no hidden costs.
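The two pricing components described above, stored volume and network transfer, can be combined in a rough estimate before a load is attempted. The sketch below is only an illustration: the function name and both rates are hypothetical placeholders, not the prices of any real provider, and real contracts may add further charges.

```python
def estimate_load_cost(data_gb, storage_rate_per_gb_month, transfer_rate_per_gb, months=1):
    """Rough cost of loading and keeping data_gb in a cloud database service.

    Both rates are assumptions supplied by the caller (for example, taken
    from the service contract); nothing here reflects a real price list.
    """
    storage_cost = data_gb * storage_rate_per_gb_month * months
    transfer_cost = data_gb * transfer_rate_per_gb  # one-off charge for the initial load
    return {
        "storage": storage_cost,
        "transfer": transfer_cost,
        "total": storage_cost + transfer_cost,
    }

# A 1 TB (1024 GB) database with made-up rates.
estimate = estimate_load_cost(
    data_gb=1024,
    storage_rate_per_gb_month=0.10,
    transfer_rate_per_gb=0.05,
)
print(estimate)
```

Separating the transfer charge from the storage charge makes it easier to see which term of the contract dominates the cost of the initial load.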
10.3.4 Database security
Data
stored
does
not take
database
have
in
the
or
when
a serious
system misuse. reading
the
This
section
to
ask
of threat.
to the
Threats
system
The loss
being
and/or its
access
operational,
The loss
Threats
examples
electronic
business.
see.
Activities
in
and
polices
a hacker
cause
goals.
from?
essential
What to the
that
data
security
from
loss,
any
misuse
users
of the
referred
For example,
causes
the
data from
to
as the
kind
or harm
a person
privacy
gaining
account.
database
accessing
such
and
breach
system
to
stop
it.
of data).
This could
as a password
10
be
or a bank
data can be lost security
then
Learning. any
All suppressed
it
is
but different
does
accidental such
May not
are likely
goals
to
be perpetrated
working
for
with
or her
so that
he or she
both inside
and
him
occur
case
from
alarge
outside
the
where an employee
steals
a person
specific not
of severity.
are:
a salesperson
he or she
loss
as user
not materially
be
of data.
copied, affect
policies
data
by humans, company can
often
his
or her
organisation
and
access
he or she is
to the
by
who resigns
start
has legitimate
that
connected
the
overall
or
duplicated, learning
to
in experience.
often
However,
caused
each
to the
unauthorised
organisation
by humans
it is important
and procedures
If employees
for them
scanned,
This is
authorisation.
security
by poor staff training.
Reserved. content
be
can
will be impossible
Rights
as these
and are of various levels
security
actually
data.
causes
procedures
on the
Consider the
system
stealing
to an organisation
such
and fraud
ensure that it has excellent
that
to
further
goals relate
the
money from the
private information
database
differently.
This internal
error that
Cengage
modification.
some
effects
would
customer
database
Human
deemed
to
and their
Both theft
be treated
breaking
has
it is
may
or
security
database
security
protect
potential
unauthorised
data (also
access
An example
your
organisations
2020
the
so
undertake the
the
data
of computer destruction
should
design,
doing
and removing
authorised
and externally
of data.
takes
have to
review
protect
database
and in have
For example,
of the
gaining
of threats
means.
and then
Copyright
which stops
to
the
balance.
Theft and fraud
own
data through
data.
establish
It
a student
can cause:
a bank account
can occur internally
Some
Editorial
of
of confidentiality
account
Threats
you
most common
to
loss,
to
Within
that
or accidental so
The
to
any kind
important
we trying
goals
for
users.
access
or damage
of security,
it is
of data.
unauthorised
have
misuse
intentional
ideas
by
major concern
prevent?
security
students
Any
a
what are
of circumstances
of the
by a person
to
access
when
data!
is
system
as,
availability
data.
to
payroll
basic
from
results
against
the
meet the
protected
Security
the
we trying
set
of availability
caused
to
any
of the integrity
The loss
only
such
are
to
to
data
developing
and the
are
unauthorised
access
the
questions
developed
be
the likely
organisation.
protecting
problems
are
have
When
must
predict
will highlight
confidentiality
measures
to
of
area.
related
to
on the
aim
this
It is important
integrity,
database
employees
impact
with
in
security
company
much imagination
in
place to
do not know the
for
not following an organisation
begin
with. In addition,
procedures
surrounding
data
be followed.
Electronic ?
infections.
Viruses.
A virus
spreading
they
cannot
Email
is
be caught
viruses.
Worms
also
?
to
that
is
usually
infections:
capable
attached
of copying to
such
attaches
itself
to
email
itself
a program
as opening
messages
all people in the receivers
pieces
of software
that
or hole in
security.
networks
systems
quickly through
are
human intervention,
of virus
small
between
of electronic
of software
As viruses
without
This kind
telecommunication
travel
piece
categories
and
or application,
an email
attachment
program.
mailing itself
are
general
malicious
a network.
an infected
automatically ?
a
across
or running ?
There are four
without
email
replicate
and replicates
address
themselves
They
are
any human intervention
using
different
itself
by
book. any form
from
viruses
and can replicate
of
in
that
they
themselves
very
networks.
Trojan
horses.
action.
It remains
The
Trojan
horse
dormant
is
a computer
until run
program
and then
that
begins to
claims
to
do damage
perform
such
one task
as erase
or
a hard
disk. The introduction the loss
of a virus to a computer
of availability
of the
The occurrence
of natural
and not deliberate addition,
data
disasters
actions
could
network
system resulting
but
such
can result in
both the loss
due to
of the
consequences
to the
business.
as storms,
fires
These
are unpredictable
or floods.
would still result in the loss
be corrupted
of integrity
in serious
power
surges
of integrity
and
and availability
hardware
would
data and
of data. In
become
physically
damaged. Unauthorised access
is
and
making
access
could
result
and its
hardware
privileges
to
user
goes
set
then up
weak
and then The above
it
does
should DBMS the
by no
need for
a number part
of the
computer
used
system
in
other
copied,
scanned,
parts
unauthorised
a computer Obtaining
data to
system, unauthorised
gain information
Unauthorised
malice
against
the
property,
reputation
modification
of data,
by the
database
One example
that
modification
organisation. and
This
safety
physically
allow
of a damaging
within
would
to
steal
(DBA)
DBA granting the
not
excessive
organisation.
be that or obtain
The
the
DBA
has
only
login
information
users.
and
to
example
attackers
administrator
is the
of his or her job Another
exhaustive
an organisation
gaining
by this threat.
database
of data security
within.
of
and
requirements
means
contained
unauthorised
be caused
which
for
entering
also the
of training.
of genuine
used
organisation.
acts
but
privileges.
schemes,
is
measures
security
Cengage deemed
the
the
often
of illegally
deleted.
access
could
lack
these
the identity
of threats
highlight
security
data
Editorial
list
only
exceeds
abuses
authentication
assume
contain is
who
on to
data
deliberate system
This
through
act
the
of data are also covered
knowledge
a user
with
phrase
browsing
Unauthorised
administration.
enough
and
or even
computer
employees.
and the theft
database
having
any
files
or against
changed
concerned
only
The as the
a person
benefit
being
data.
defined
to the
may involve
is
not
of
usually
persons
data
sabotage
would include business
is
changes
to that
in the
Employee
modification
unauthorised
be used
Poor
and Hacking
to a database
could
10
access hacking.
a summary
is
provided
have a comprehensive
measures to
protect
both the
infrastructure
within
of the
You
system.
in
Figure
10.8.
However
data security
plan.
The plan
data
hardware.
and the
an organisation will now look
and
will often
at some
of the
The
rely
on
common
measures.
FIGURE 10.8 A non-exhaustive summary of security threats
[Figure: External threats: natural disasters (storms, fire, flood, power surges); electronic infections such as viruses, worms and Trojan horses; hacking (gaining unauthorised access); theft of data or unauthorised modification of data; fraud. Internal threats: poor database administration (poor security policies and procedures set by the DBA, granting excessive privileges, weak authentication schemes); threats by employees (employee not trained in procedures, no employee monitoring, theft of data or unauthorised modification of data, sabotage).]
10 Data
Security
Physical type
Measures
security
of database
For
example,
existence terms
building
due to
security
controls, of the
identity. use
User
of
data
Password security
2020 has
as
Cengage deemed
Learning. that
any
is
a way
allows enforced
can
Rights
makes
access
to
can
tools
as they
characteristics
biometrics
include
contain
of the
basement by
have
of a
push-button been
a digital
to recognise
fingerprints,
In
placement
in the
systems
The
impractical.
the
be controlled
biometric
security.
security
disaster,
on the
be practical.
physical
do not locate
rooms
Recently,
for
physical
a natural
Depending
may not always
candidate
For example
behavioural
physical
tables,
does
May not
not materially
can
be
time
through (CREATE,
views,
copied, affect
the
overall
or
user
duplicated, learning
and
of access at the the
verifying
through rights
operating use
UPDATE,
queries
scanned,
the
be achieved
assignment at logon
operations
Reserved. content
the
be established
databases,
All
often due to
systems.
of identifying
This
usually
suppressed
networks
authentication
cases,
security
alikely
areas.
seen
imprint
to
of an
or authenticate
a persons
retina
a
or iris
geometry.
security
rights
not
hardware
Physical
convenient
some
access to specific
physical
is
considered.
biometric
most common
may restrict
such
and
or applications.
is
Access rights
review
hand
or
and, in
The
and
of floods.
cards
secure
authentication
restricted
Copyright
most
of data
be carefully
possibility
physical
persons and the
could
swipe
database
microcomputer
the loss
physical
establishing
research
multiserver
the
personnel
however,
student
against
in a building
individuals
Editorial
implementation,
of large
one
only authorized
a university
of guarding
hardware
be
allows
that
the
use
to
specific
system
of database DELETE
the
allowed
of passwords
and
authorised
to
access
access
rights:
users.
Password
level.
software. and
user is
so
on)
The
assignment
of access
on predetermined
objects
and reports.
User authentication role.
This
will be
Audit trails is
an
that
to link
some
to
If
can
out
a persons
describes
which
used,
the
method
is
key, the In the
message.
The
the
10
Some
second
whom
DBMS
without
key,
the
decrypts
the
known
original
for
stores
writing
who
method
or the
guesses
the
be
used
might
account
have
number
the
as the
numbers
stored
zero to (0 to
nine to
99).
one
(DES).
data.
key.
Here, the
only
standard
the
be required
encryption
Where
encryption
value
This example
algorithm.
key.
of
example,
The real
five.
encryption
Data
numbers for
will be 32456.
decipher
(the
who
number
five is known
data
would
the
one-digit
encryption
order to
may then
users
of the
called
as the
digit is Both the
With the
one-key
guess the
Therefore,
key),
the longer
data.
wish to
send
an encrypted
key to transform private
the
key, is
used
message.
The
message
data in the by the
only
have
decryption
person
a public
message into
who
key.
The
an encrypted
algorithm
to
may hold the
convert
private
the key is
was destined.
encryption (TDE),
routines.
For example,
which allows
of complex
column.
known
work in
can identify
data
encrypt
value
number
represent
we have so far discussed.
to
subtraction
data is
trails
system.
wishes
encrypted
guesses
data in the
lots
the
key in
the
message
it in the
is
100
as the
to the
Data Encryption
need
and
back
managerial
measures
data itself
audit
unauthorised
by a secret
by the
real
the
to
Audit
security
audit
The
measures that
where the
up to ten
public
repair
code
then
one-key
all users
to
useless
the
to the
decipher
products include
as Transparent
encrypts
it is to
DBAs
Although the audit trail
use.
our other
occurred.
a bank
value
know the
up to
uses this
message
one for
as the
need to
method,
alter
algorithm,
method,
difficult
algorithm
encrypted
unauthorised
has
or security
algorithm,
would require
a two-key
data
32451
number
to
would
two-key
encryption
be to is
by the
is referred
more
after it
Supposing
encrypted
a specific
method an intruder
the
the
added
and receiver
with
discourage
may be used
security layers
encryption
of adding
five,
and
number
from
a very simple
whereas
which is part of the
we would rather
access
would
account
value
sender
can
Although
user
stage
decrypted
This logic
existence
by an algorithm.
The first
can be then
management,
chapter.
be used to render
database
carried
customers.
five.
a particular
of the
is
mere
defence.
or unauthorised
encryption
encryption
this
does not gain access to the system, if all else fails, the
a violation
violated
on in
its
database
of a violation
Data
of authorisation
later
device,
of the
an attacker
existence
a function
are usually provided by the DBMS to check for access violations.
after-the-fact
the last line
its
is
discussed
columns
code.
Similarly,
when
Oracle
DBMS
in a database
When
users
insert
users
select
the
has a feature
table to
data, column,
known
be easily encrypted
the
database
transparently
the
database
automatically
it.
nOte The
most
protocol
common that
Netscape,
example
was
SAIC,
Terisa
over the internet. customers
amount
of
of data
can
User-defined employees
review
2020 has
Cengage deemed
be sent
any
All
public
a secure securely.
policies
suppressed
Verisign)
Secure
Electronic
of companies
interested
authenticity
in
Transactions
(VISA,
ensuring
of electronic
key
encryption
connection The
use
and
(SET).
MasterCard,
data
privacy
transactions
such
Rights
Reserved. content
does
the
as training
May not
not materially
be
copied, affect
the
overall
SET is
an open
GTE, IBM,
Microsoft,
in all electronic
and
be seen
and
provides
when
be put in
security
in experience.
measures.
in
security
Sockets
commerce
a guarantee
a person
that
from
to
can
employees
may content
be
suppressed at
any
an
ensure
procedures
monitoring
any
web address.
organisation and
which
goods
before the
policies
technology
over
purchases
by the
and
(SSL)
server,
of http
Such
aspects
Learning
Layer
an external
place
duplicated, learning
Secure
should data
or
in
a user
instead
employees
scanned,
used
by the use of https
procedures
how implement
is
between
of SSL can
This is normally indicated
controls
Learning. that
and
the
and
create
know
personal
Copyright
private
store.
at work is consortium
are protected.
SSLs
internet-based
Editorial
Systems
transactions
internet.
by a large
SET ensures
A combination on the
of encryption
designed
time
to
from if
the
subsequent
that cover ensure
that they is
also
are actually following
a responsibility
Backup
and
recovery
ultimately
recovered.
Data backup
lies
software
known
or potential
unique
signature
themselves.
should
with the
The establishment
will be discussed
DBA to
and recovery
a known
is
used
viruses.
in
be in
place
in
ensure
the
data
in the
context
basis,
Firewalls
is
only
are systems
a set of rules network.
more
detail
the
event
within
of disaster
applications.
are
If
Packet
to
in this
of a
the
Development
of polices
later
Process
547
and procedures
chapter.
disaster
database
occurring.
can
management
methods
which
The
always
be fully
will be discussed
later
each
that
are accepted
messages
message
or software
are allowed
from
devices
record
antivirus
to
for
the
software
any external
be used
applications
the
rules, in
any
viruss
will check,
source
to
media
see
scan
any
which
act as gatekeepers
not
out
contains
in
allowed
of the
device.
can
be
through.
checked
designated
you to establish
or out of the
organisations
accessed
by
Firewalls
organisations
data is
be sent to the
by allowing
database
it is
and
access
be allowed
organisations
that
to
should
an
flowing
or packet
also
media
vendors
The
unauthorised
messages
as breaking
software
network can
and
date.
devices
when
drives
database.
software
up to
used
control
hard
antivirus
software
the
of hardware
is flagged
filtering
Packets
discovered,
They are used to prevent
commonly
to
system
an organisations
kept
determine
most
a message
more of three
if regularly
search
their
On request,
comprising
or filters
They
a virus is
entering
enter.
network.
to
it into
messages to
useful
to an organisations
time
incorporate
all
virus is trying
measure
by organisations
Each
and then
on a real-time
This
and
Database
chapter.
Antivirus
if
procedures
DBA,
strategies
responsibility
in this
the
of the
10
use
Web
one
or
network:
against
system
a set
of filters.
and all others
are
discarded. Proxy server
the
organisation a proxy
proxy server
and
server,
external other
requested
so that
increases
response
users
than
all the
client
gateway
will run
so that
proxy the
measures.
the
internet.
It if
also
can
other
users
proxy server
between
the internal
There
are further
cache
the
network
advantages
Web pages
request
the
same
that
page.
can also be used to limit
of an
of using have This
the
been also
websites
that
organisation.
blocks
The
all communication as the
is reduced
addition,
the
machines
machine.
as the internet,
traffic
time. In
gateway
such
security
network
may view outside
Circuit-level
manages
networks
1
all incoming software server
messages to any host
to
allow
performs
internal
client
them
to
but itself.
establish
a connection
all communications
machines
never
Within the
with
actually
have
any
organisation,
with the
external
contact
circuit-level
network
, such
with the outside
world.
Diskless workstations allow end users to access the database without being able to download the information from their workstations.
NOTE
James Martin provides an excellent enumeration and description of the desirable attributes of a database security strategy that remains relevant today.8 Martin's security strategy is based on the seven essentials of database security and may be summarised as one in which:
Data are: Protected, Reconstructable, Auditable, Tamperproof
Users are: Identifiable, Authorised, Monitored

8 Martin, J., Managing the Database Environment. Englewood Cliffs, NJ: Prentice-Hall, 1977.
10.3.5 Testing and evaluation
In the design phase, decisions were made to ensure integrity, security, performance and recoverability of the database. During implementation and loading, these plans were put into place. In testing and evaluation, the DBA tests and fine-tunes the database to ensure that it performs as expected. This phase occurs in conjunction with application programming. Programmers use database tools to prototype the applications during coding of the programs. Tools such as report generators, screen painters and menu generators are especially useful to application programmers.

Test the Database
During this step, the DBA tests the database to ensure that it maintains the integrity and security of the data. Data integrity is enforced by the DBMS through the proper use of primary and foreign key rules.
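A test of primary and foreign key rules can be as simple as attempting an invalid insert and confirming that the DBMS rejects it. The following is a minimal sketch using Python's sqlite3 module; the DEPARTMENT and EMPLOYEE tables and the helper name is_rejected are hypothetical, and SQLite is used only because it needs no installation. Note that SQLite enforces foreign keys only after PRAGMA foreign_keys = ON has been issued for the connection.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves FK enforcement off by default
conn.executescript("""
    CREATE TABLE DEPARTMENT (DEPT_CODE TEXT PRIMARY KEY);
    CREATE TABLE EMPLOYEE (
        EMP_NUM   INTEGER PRIMARY KEY,
        DEPT_CODE TEXT NOT NULL REFERENCES DEPARTMENT (DEPT_CODE)
    );
    INSERT INTO DEPARTMENT VALUES ('ACCT');
    INSERT INTO EMPLOYEE VALUES (1, 'ACCT');
""")

def is_rejected(emp_num, dept_code):
    """Return True if the DBMS refuses the row on integrity grounds."""
    try:
        with conn:  # rolls back automatically if the insert fails
            conn.execute("INSERT INTO EMPLOYEE VALUES (?, ?)", (emp_num, dept_code))
    except sqlite3.IntegrityError:
        return True
    return False

duplicate_pk_rejected = is_rejected(1, "ACCT")   # EMP_NUM 1 already exists
orphan_fk_rejected = is_rejected(2, "NOPE")      # no such department
valid_row_rejected = is_rejected(3, "ACCT")      # should be accepted
print(duplicate_pk_rejected, orphan_fk_rejected, valid_row_rejected)
```

Including a valid row alongside the two invalid ones guards against a test that passes only because every insert is failing.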
Many DBMSs also support the creation of domain constraints and database triggers. Testing will ensure that these constraints are properly designed and implemented. Data integrity is also the result of properly implemented data management policies, which are part of a comprehensive data administration framework.

Evaluate the Database and its Application Programs
As the database and application programs are created and tested, the system must also be evaluated using a more holistic approach. Testing and evaluation of the individual components should culminate in a variety of broader system tests to ensure that all of the components interact properly to meet the needs of the users. At this stage, integration issues and deployment plans are refined, user training is conducted, and system documentation is finalised. Once the system receives final approval, it must be a sustainable resource for the organisation. To ensure that the data contained in the database are protected against loss, backup and recovery plans are tested.
Timely data availability is crucial for almost every database. Unfortunately, the database can lose data through unintended deletions, power outages and other causes. Data backup and recovery procedures create a safety valve, ensuring the availability of consistent data. Typically, database vendors encourage the use of fault-tolerant components such as uninterruptible power supply (UPS) units, RAID storage devices, clustered servers and data replication technologies to ensure the continuous operation of the database in case of a hardware failure. Even with these components, backup and restore functions constitute a very important part of daily database operations. Some DBMSs provide functions that allow the database administrator to schedule automatic database backups to permanent storage devices such as disks, DVDs, tapes and online storage. Database backups can be performed at different levels:
A full backup, or dump, of the entire database. In this case, all database objects are backed up in their entirety.
A differential backup of the database, in which only the objects that have been updated or modified since the last full backup are backed up.
A transaction log backup, which backs up only the transaction log operations that are not reflected in a previous backup copy of the database. In this case, no other database objects are backed up. (For a complete explanation of the transaction log, see Chapter 12, Managing Transactions and Concurrency.)
The database backup is stored in a secure place, usually in a different building from the database itself, and is protected against dangers such as fire, theft, flood and other potential calamities. The main purpose of the backup is to guarantee database restoration following a hardware or software failure.
Failures that plague databases and systems are generally induced by software, hardware, programming exemptions, transactions, or external factors. Table 10.1 summarises the most common sources of database failure. Depending on the type and extent of the failure, the recovery process ranges from a minor short-term inconvenience to a major long-term rebuild. Regardless of the extent of the required recovery process, recovery is not possible without a usable backup.

TABLE 10.1 Common sources of database failure
Source | Description | Example
Software | Software-induced failures may be traceable to the operating system, the DBMS software, application programs, or viruses and other malware. | In April 2017, a new vulnerability was found in the Oracle e-Business Suite, which allows an unauthenticated attacker to create, modify, or delete critical data.9
Hardware | Hardware-induced failures may include memory chip errors, disk crashes, bad disk sectors and disk-full errors. | A bad memory module or a multiple hard-disk failure in a database system can bring it to an abrupt stop.
Programming exemptions | Application programs or end users may roll back transactions when certain conditions are defined. Programming exemptions can also be caused by malicious or improperly tested code that can be exploited by hackers. | In February 2016, a group of unidentified hackers fraudulently instructed the New York Federal Reserve Bank to transfer $81 million from the central bank of Bangladesh to accounts in the Philippines. The hackers used fraudulent messages injected by malware disguised as a PDF reader.10
Transactions | The system detects deadlocks and aborts one of the transactions. (See Chapter 12) | Deadlock occurs when executing multiple simultaneous transactions.
External factors | Backups are especially important when a system suffers complete destruction from fire, earthquake, flood, or other natural disaster. | In August 2015, lightning struck a local utility provider's grid near Google's data centres in Belgium. Although power backup kicked in automatically, the interruption was long enough to cause permanent data loss in affected systems.

Database recovery generally follows a predictable scenario. First, the type and extent of the required recovery are determined. If the entire database needs to be recovered to a consistent state, the recovery uses the most recent backup copy of the database in a known consistent state. The backup copy is then rolled forward to restore all subsequent transactions by using the transaction log information. If the database needs to be recovered but the committed portion of the database is still usable, the recovery process uses the transaction log to undo all of the transactions that were not committed (see Chapter 12, Managing Transactions and Concurrency).
At the end of this phase, the database completes an iterative process of testing, evaluation and modification that continues until the system is certified as ready to enter the operational phase.
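The roll-forward step of the recovery scenario can be illustrated with a toy model. The sketch below is a deliberate simplification and not how any particular DBMS stores its log: the database is a dictionary, the transaction log is a list of hypothetical entries, and the function roll_forward is invented for this example. It restores the backup copy and reapplies only the writes of committed transactions, so the work of uncommitted transactions is left out.

```python
def roll_forward(backup, log):
    """Rebuild the database from a backup copy plus the transaction log.

    backup: dict holding the last consistent backup (left unmodified).
    log:    ordered list of entries written after the backup was taken.
    """
    # Only transactions with a commit record are reapplied.
    committed = {entry["txn"] for entry in log if entry["op"] == "commit"}
    database = dict(backup)  # start from the backup copy
    for entry in log:
        if entry["op"] == "write" and entry["txn"] in committed:
            database[entry["key"]] = entry["value"]
    return database

backup = {"A": 100, "B": 200}
log = [
    {"txn": 1, "op": "write", "key": "A", "value": 150},
    {"txn": 1, "op": "commit"},
    {"txn": 2, "op": "write", "key": "B", "value": 999},  # never committed
]

recovered = roll_forward(backup, log)
print(recovered)
```

Transaction 2 wrote a value but never committed, so its change does not appear in the recovered state; this mirrors the rule that recovery must leave the database reflecting committed work only.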
10.3.6 Operation
Once the database has passed the evaluation stage, it is considered to be operational. At this point, the database, its management, its users and its application programs constitute a complete information system.
The beginning of the operational phase invariably starts the process of system evolution. As soon as all of the targeted end users have entered the operations phase, problems that could not have been foreseen during the testing phase begin to surface. Some of the problems are serious enough to warrant emergency patchwork, while others are merely minor issues. For example, if the database design is implemented to interface with the Web, the sheer volume of transactions may cause even a well-designed system to bog down. In that case, the designers have to identify the source(s) of the bottleneck(s) and produce alternative solutions. These solutions may include using load-balancing software to distribute the transactions among multiple computers, increasing the available cache for the DBMS, and so on. In any case, the demand for change is the designer's constant concern, which leads to the next phase: maintenance and evolution.

10.3.7 Maintenance and evolution
The database administrator must be prepared to perform routine maintenance activities within the database. Some of the required periodic maintenance activities include:
Preventive maintenance (backup)
Corrective maintenance (recovery)
Adaptive maintenance (enhancing performance, adding entities and attributes and so on)
Assignment of access permissions and their maintenance for new and old users
Generation of database access statistics to improve the efficiency and usefulness of system audits and to monitor system performance
Periodic security audits based on the system-generated statistics
Periodic (monthly, quarterly, or yearly) system-usage summaries for internal billing or budgeting purposes
The likelihood of new information requirements and the demand for additional reports and new query formats require application changes and possible minor changes in the database components and contents. Those changes can be easily implemented only when the database design is flexible and when all documentation is updated and online. Eventually, even the best-designed database environment will no longer be capable of incorporating such evolutionary changes; then the whole DBLC process begins anew.
You should not be surprised to discover that many of the activities described in the Database Life Cycle (DBLC) remind you of those in the Systems Development Life Cycle (SDLC). After all, the SDLC represents the framework within which the DBLC activities take place. A summary of the parallel activities that take place within the SDLC and the DBLC is shown in Figure 10.9.
FIGURE 10.9 Parallel activities in the DBLC and the SDLC
[Figure: the SDLC phases (analysis; detailed design of screens, reports and procedures; coding, prototyping and debugging; testing and evaluation; application maintenance) shown alongside the DBLC phases (database initial study; database design: conceptual, logical and physical; implementation and loading: creation, loading and fine-tuning; testing and evaluation; operation; database maintenance and evolution).]
10.3.8 Determine Performance Measures
Physical design becomes more complex when data is distributed at different locations because the performance is affected by the communication media's throughput. Given such complexities, it is not surprising that designers favour database software that hides as many of the physical-level activities as possible.

Despite the fact that relational models tend to hide the complexities of the computer's physical characteristics, the performance of relational databases is affected by physical storage properties. For example, performance can be affected by characteristics of the storage media, such as seek time, sector and block (page) size, buffer pool size, and the number of disk platters and read/write heads. In addition, factors such as the creation of an index can have a considerable effect on the relational database's performance, that is, data access speed and efficiency.

In summary, physical design performance measurement deals with fine-tuning the DBMS and queries to ensure that they will meet end-user performance requirements.
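The effect of an index on data access can be observed directly. The following is a minimal sketch, assuming SQLite (through Python's standard sqlite3 module) as a stand-in DBMS; the table customer, its columns and the index name idx_customer_lname are hypothetical and are not taken from the text. It asks the DBMS how it intends to run the same query before and after the index is created.

```python
import sqlite3


def query_plan(conn, sql):
    """Return the plan descriptions SQLite reports for a query."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    # The last column of each row holds the human-readable plan detail.
    return [row[-1] for row in rows]


# Hypothetical table used only for illustration.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE customer ("
    "cus_id INTEGER PRIMARY KEY, cus_lname TEXT, cus_balance REAL)"
)
conn.executemany(
    "INSERT INTO customer (cus_lname, cus_balance) VALUES (?, ?)",
    [(f"NAME{i}", float(i)) for i in range(1000)],
)

sql = "SELECT cus_balance FROM customer WHERE cus_lname = 'NAME500'"

plan_before = query_plan(conn, sql)  # no index on cus_lname yet
conn.execute("CREATE INDEX idx_customer_lname ON customer (cus_lname)")
plan_after = query_plan(conn, sql)   # the optimiser can now use the index

print("before:", plan_before)
print("after: ", plan_after)
```

Without the index the plan reports a scan of the whole table; with it, the plan reports a search that uses the index. The exact wording of the plan text varies between SQLite versions, and other DBMSs expose this information through their own tools.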
10.4 Database Design Strategies
There are two classical approaches to database design:
- Top-down design starts by identifying the data sets, then defines the data elements for each of those sets. This process involves the identification of different entity types and the definition of each entity's attributes.
- Bottom-up design first identifies the data elements (items), then groups them together in data sets. In other words, it first defines attributes, then groups them to form entities.

The two approaches are illustrated in Figure 10.10. The selection of a primary emphasis on top-down or bottom-up procedures often depends on the scope of the problem or on personal preferences. Although the two methodologies are complementary rather than mutually exclusive, a primary emphasis on a bottom-up approach may be more productive for small databases with few entities, attributes, relations and transactions. For situations in which the number, variety and complexity of entities, relations and transactions is overwhelming, a primarily top-down approach may be more easily managed. Most companies have standards for systems development and database design already in place.
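The difference between the two sequences can be sketched in a few lines. This is only an illustration under stated assumptions: the entity names CUSTOMER and INVOICE, their attributes, and the helper functions top_down and bottom_up are hypothetical and do not come from the text.

```python
from collections import defaultdict


def top_down(data_sets, elements_of):
    """Start from the data sets (entities), then define each set's elements."""
    return {entity: list(elements_of[entity]) for entity in data_sets}


def bottom_up(data_elements):
    """Start from individual data elements, then group them into data sets."""
    entities = defaultdict(list)
    for attribute, describes in data_elements:
        entities[describes].append(attribute)
    return dict(entities)


# Top-down: the designer names the entities first.
model_a = top_down(
    ["CUSTOMER", "INVOICE"],
    {
        "CUSTOMER": ["CUS_CODE", "CUS_LNAME"],
        "INVOICE": ["INV_NUMBER", "INV_DATE"],
    },
)

# Bottom-up: the designer collects attributes first and notes what each describes.
model_b = bottom_up(
    [
        ("CUS_CODE", "CUSTOMER"),
        ("INV_NUMBER", "INVOICE"),
        ("CUS_LNAME", "CUSTOMER"),
        ("INV_DATE", "INVOICE"),
    ]
)

print(model_a == model_b)
```

Both routes can arrive at the same set of entities and attributes; what differs is the order in which the designer makes the decisions.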
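FIGURE 10.10 Top-down vs bottom-up design sequencing
[Figure: a conceptual model at the top, entities in the middle and attributes at the bottom; top-down design proceeds from the conceptual model down to the attributes, bottom-up design from the attributes up to the model.]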
NOTE
Even when a primarily top-down approach is selected, the normalisation process that revises existing table structures is (inevitably) a bottom-up technique. ER models constitute a top-down process even when the selection of attributes and entities can be described as bottom-up. Because both the ER model and normalisation techniques form the basis for most designs, the top-down vs bottom-up debate may be based on a distinction rather than a difference.
10.5 Centralised vs Decentralised Design
The two general approaches (bottom-up and top-down) to database design can be influenced by factors such as the scope and size of the system, the company's management style, and the company's structure (centralised or decentralised). Depending on such factors, the database design may be based on two very different design philosophies: centralised and decentralised.

Centralised design is productive when the data component is composed of a relatively small number of objects and procedures. The design can be carried out and represented in a fairly simple database. Centralised design is typical of relatively simple and/or small databases and can be successfully done by a single person (database administrator) or by a small, informal design team. The company operations and the scope of the problem are sufficiently limited to allow even a single designer to define the problem(s), create the conceptual design, verify the conceptual design with the user views, define system processes and data constraints to ensure the efficacy of the design, and ensure that the design will comply with all the requirements. Although centralised design is typical for small companies, do not make the mistake of assuming that centralised design is limited to small companies. Even large companies can operate within a relatively simple database environment. Figure 10.11 summarises the centralised design option. Note that a single conceptual design is completed and then validated in the centralised design approach.

Decentralised design might be used when the data component of the system has a considerable number of entities and complex relations on which very complex operations are performed. Decentralised design is also likely to be employed when the problem itself is spread across several operational sites and each element is a subset of the entire data set. (See Figure 10.12.)

FIGURE 10.11 Centralised design
[Figure: a single conceptual model goes through conceptual model verification against user views, system processes and data constraints, and is documented in the data dictionary.]
In large and complex projects, the database design typically cannot be done by only one person. Instead, a carefully selected team of database designers is employed to tackle a complex database project. Within the decentralised design framework, the database design task is divided into several modules. Once the design criteria have been established, the lead designer assigns design subsets or modules to design groups within the team.

As each design group focuses on modelling a subset of the system, the definition of boundaries and the interrelation among data subsets must be very precise. Each design group creates a conceptual data model corresponding to the subset being modelled. Each conceptual model is then verified individually against the user views, processes and constraints for each of the modules. After the verification process has been completed, all modules are integrated into one conceptual model. Because the data dictionary describes the characteristics of all objects within the conceptual data model, it plays a vital role in the integration process. Naturally, after the subsets have been aggregated into a larger conceptual model, the lead designer must verify that the combined conceptual model is still able to support all of the required transactions.

FIGURE 10.12 Decentralised design
[Figure: the data component is split by submodule criteria into Engineering, Purchasing and Manufacturing submodules; each produces a conceptual model that is verified against its own views, processes and constraints; the verified models are then aggregated into one conceptual model, supported by the data dictionary.]

Keep in mind that the aggregation process (Figure 10.13) requires the designer to create a single model in which various aggregation problems must be addressed:
- Synonyms and homonyms. Different departments might know the same object by different names (synonyms), or they might use the same name to address different objects (homonyms). The object can be an entity, an attribute or a relationship. An example of a synonym is where one department refers to 'the client' while another refers to 'the customer'. An example of a homonym is if the IT department uses the term 'the client' to refer to a computer, as in a client/server setup.
- Entity and entity subtypes. An entity subtype might be viewed as a separate entity by one or more departments. The designer must integrate such subtypes into a higher-level entity.
- Conflicting object definitions. Attributes can be recorded as different types (character, numeric), or different domains can be defined for the same attribute. Constraint definitions can also vary. The designer must remove such conflicts from the model.
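The aggregation checks described above can be sketched as a simple scan over the departmental data dictionaries. This is a minimal illustration, not a prescribed method: the function find_aggregation_problems, the department names and the sample entries are hypothetical, and it assumes the designer has already recorded, for each label, which real-world object it describes.

```python
from collections import defaultdict


def find_aggregation_problems(entries):
    """Report synonyms, homonyms and conflicting type definitions.

    Each entry is (department, label, object_described, data_type);
    data_type may be None when no type has been recorded.
    """
    labels_by_object = defaultdict(set)
    objects_by_label = defaultdict(set)
    types_by_object = defaultdict(set)
    for _department, label, described, data_type in entries:
        labels_by_object[described].add(label)
        objects_by_label[label].add(described)
        if data_type is not None:
            types_by_object[described].add(data_type)

    # Synonyms: one object known by several labels.
    synonyms = {o: sorted(l) for o, l in labels_by_object.items() if len(l) > 1}
    # Homonyms: one label used for several objects.
    homonyms = {l: sorted(o) for l, o in objects_by_label.items() if len(o) > 1}
    # Conflicting definitions: one object recorded with several data types.
    conflicts = {o: sorted(t) for o, t in types_by_object.items() if len(t) > 1}
    return synonyms, homonyms, conflicts


# Hypothetical departmental data dictionary entries.
entries = [
    ("Marketing", "CLIENT", "buyer", None),
    ("Accounting", "CUSTOMER", "buyer", None),
    ("IT", "CLIENT", "networked computer", None),
    ("Payroll", "PROF_PHONE", "professor phone", "character"),
    ("Systems", "PROF_PHONE", "professor phone", "numeric"),
]

synonyms, homonyms, conflicts = find_aggregation_problems(entries)
print(synonyms)
print(homonyms)
print(conflicts)
```

A report like this only flags candidates. Deciding which label or definition survives in the integrated model remains the designer's job.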
FIGURE 10.13 Summary of aggregation problems
[Figure, in four panels:
Synonyms: two departments use different names for the same entity (Department A uses label X and Department B uses label Y for entity X).
Homonyms: two different entities are addressed by the same label (Department B uses the label X to describe both entity X and entity Y).
Entity and entity subclass: the entities X1 and X2 are subsets of entity X. Example: EMPLOYEE (common attributes: Name, Address, Phone) with subtypes SECRETARY in Department A (distinguishing attributes: Typing speed, Classification) and PILOT in Department B (distinguishing attributes: Hours flown, Licence).
Conflicting object definitions: attributes for the entity PROFESSOR as defined by the Payroll Dept. and the Systems Dept. Conflicting definitions: primary key PROF_ID vs PROF_NUM; phone attribute 898-2853 vs 2853.]

10.6 Database Administration
Data is an important and valuable resource within an organisation and requires a successful database administration strategy to be implemented. Data management is a complex job and has led to the development of the database administration function. The person responsible for the control of the centralised and shared database is the database administrator (DBA).

The size and role of the DBA function vary from company to company, as does its placement within a company's organisational structure. On the organisation chart, the DBA function might be defined as either a staff or line position. Placing the DBA function in a staff position often creates a consulting environment in which the DBA is able to devise the data administration strategy but does not have the authority to enforce it or to resolve possible conflicts. The DBA function in a line position has both the responsibility and the authority to plan, define, implement and enforce the policies, standards and procedures used in the data administration activity. The two possible DBA function placements are illustrated in Figure 10.14.
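FIGURE 10.14 The placement of the DBA function
[Figure, two organisation charts:
Line authority position: Information systems (IS) heads three line units: Application development, Database operations and Database administration.
Staff consulting position: Database administration is attached to Information systems (IS) as a staff unit, while Application development and Database operations remain line units.]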
There is no standard for how the DBA function fits in an organisation's structure. In part, that is because the DBA function itself is probably the most dynamic of any organisation's functions. In fact, the fast-paced changes in DBMS technology dictate changing organisational styles. For example:
- The development of distributed databases can force an organisation to decentralise the data administration function further. The distributed database requires the system DBA to define and delegate the responsibilities of each local DBA, thus imposing new and more complex coordinating activities on the system DBA.
- The growing use of internet-ready and object-oriented databases and the growing number of data warehousing applications are likely to add to the DBA's data modelling and design activities, thus expanding and diversifying the DBA's job.
- The increasing sophistication and power of desktop-based DBMS packages provide an easy platform for the development of user-friendly, cost-effective and efficient solutions to specific departmental information needs. But such an environment also invites data duplication, not to mention the problems created by people who lack the technical qualifications to produce good database designs. In short, the new desktop environment requires the DBA to develop a new set of technical and managerial skills.

Although no current standard exists, it is common practice to define the DBA function by dividing the DBA operations according to the DBLC phases. If that approach is used, the DBA function requires personnel to cover the following activities:
- Database planning, including the definition of standards, procedures, and enforcement
- Database requirements gathering and conceptual design
- Database logical and transaction design
- Database physical design and implementation
- Database testing and debugging
- Database operations and maintenance, including installation, conversion, and migration
- Database training and support

Figure 10.15 represents an appropriate DBA functional organisation according to that model.

FIGURE 10.15 A DBA functional organisation
[Figure: organisation chart headed by the DBA, with units for Planning, Design (Conceptual, Logical, Physical), Implementation, Operations, Training and Testing.]
Keep in mind that a company might have several different and incompatible DBMSs installed to support different operations. For example, it is not uncommon to find corporations with a hierarchical DBMS to support the daily transactions at the operational level and a relational database to support middle and top management's ad hoc information needs. There may also be a variety of desktop DBMSs installed in the different departments. In such an environment, the company might have one DBA assigned for each DBMS. The general coordinator of all DBAs is sometimes known as the systems administrator (SYSADM); that position is illustrated in Figure 10.16.

There is a growing trend towards specialisation in the data management function. For example, the organisation charts used by some of the larger corporations make a distinction between a DBA and the data administrator (DA). The DA, also known as the information resource manager (IRM), usually reports directly to top management and is given a higher degree of responsibility and authority than the DBA, although the two roles overlap to some extent.

The DA is responsible for controlling the overall corporate data resources, both computerised and manual. Thus, the DA's job description covers a larger area of operations than that of the DBA because the DA is in charge of controlling not only the computerised data, but also the data outside the scope of the DBMS. The placement of the DBA within the expanded organisational structure may vary from company to company. Depending on the structure's components, the DBA might report to the DA, the IRM, the IS manager or directly to the company's CEO. For simplicity and to avoid confusion, the label DBA is used here as a general title that encompasses all appropriate data administration functions.

FIGURE 10.16 Multiple database administrators in an organisation
[Figure: organisation chart in which a systems administrator (DBMS manager) coordinates several DBAs, each assigned to one DBMS; the chart labels include a desktop DBA and relational DBAs for DB2, Oracle, POSTGRES and SQL Server.]
You will now learn briefly about two distinct roles that a DBA must perform. These are known as the managerial role and the technical role. The DBA's managerial role is focused on personnel management and on interactions with the end-user community. The DBA's technical role involves the use of the DBMS (database design, development and implementation) as well as the production, development and use of application programs. A list of the desired skills is given in Table 10.2.

TABLE 10.2 Desired DBA skills

Managerial
- Broad business understanding
- Coordination skills
- Analytical skills
- Conflict resolution skills
- Communications skills (oral and written)
- Negotiation skills

Technical
- Broad data-processing background
- Systems development life cycle knowledge
- Structured methodologies: data flow diagrams, structure charts, programming languages
- Database life cycle knowledge
- Database modelling and design skills: conceptual, logical, physical
- Operational skills: database implementation, data dictionary management, security, and so on

Online Content
The database administration function is covered in much greater depth in Appendix K, Database Administration, which is available on the online platform for this book.
10.6.1 The Managerial Role of the DBA
As a manager, the DBA must concentrate on the control and planning dimensions of database administration. Therefore, the DBA is responsible for:
- Coordinating, monitoring and allocating database administration resources: people and data.
- Defining goals and formulating strategic plans for the database administration function.

More specifically, the DBA's responsibilities are shown in Table 10.3.

TABLE 10.3 DBA activities and services

DBA activity
- Planning
- Organising
- Testing
- Monitoring
- Delivering

DBA service
- End-user support
- Policies, procedures and standards
- Data security, privacy and integrity
- Data backup and recovery
- Data distribution and use

Table 10.3 illustrates that the DBA is generally responsible for planning, organising, testing, monitoring and delivering quite a few services. Those services might be performed by the DBA or, more likely, by the DBA's personnel. Let's examine the services in greater detail.

End-User Support
The DBA interacts with the end user by providing data and information support services to the organisation's departments. Because end users usually have dissimilar computer backgrounds, end-user support services include:
- Gathering user requirements. The DBA must work within the end-user community to help gather the data required to identify and describe the end users' problems. The DBA's communication skills are very important at this stage because the DBA works closely with people who tend to have different computer backgrounds and communication styles. The gathering of user requirements requires the DBA to develop a precise understanding of the users' views and needs, and to identify present and future information needs.
- Building end-user confidence. Finding adequate solutions to end users' problems increases end-user trust and confidence in the DBA function.
- Resolving conflicts and problems. Finding solutions to end users' problems in one department might trigger conflicts with other departments. End users are typically concerned with their own specific data needs rather than with those of others, and they are not likely to consider how their data affect other departments within the organisation. When data/information conflicts arise, the DBA function has the authority and responsibility to resolve them.
- Finding solutions to information needs. The ability and authority to resolve data conflicts enable the DBA to develop solutions that will properly fit within the existing data management framework. The DBA's primary objective is to provide solutions to the end users' information needs. Given the growing importance of the internet, those solutions are likely to require the development and management of Web browsers to interface with the databases. In fact, the explosive growth of e-commerce requires the use of dynamic interfaces to facilitate interactive product queries and product sales.
- Ensuring quality and integrity of applications and data. Once the right solution has been found, it must be properly implemented and used. Therefore, the DBA must work with both application programmers and end users to teach them the database standards and procedures required for data access and manipulation. The DBA must also make sure that the database transactions do not adversely affect the database's data quality. Certifying the quality of the application programs that access the database is a crucial DBA function. Special attention must be given to the DBMS internet interfaces because those interfaces do not provide the transaction management features that are typically found in the DBMS-managed database environment. For example, if a transaction generated via the internet interface requires an internal database trigger to be fired, the DBA must ensure that the trigger fires properly.
- Managing the training and support of DBMS users. One of the most time-consuming DBA activities is teaching end users how to use the database properly. The DBA must ensure that all users accessing the database have a basic understanding of the functions and use of the DBMS software. The DBA coordinates and monitors all activities concerning end-user education.

Policies, Procedures and Standards
A prime component of a successful data administration strategy is the continuous enforcement of the policies, procedures and standards for correct data creation, usage, distribution and deletion within the database. The DBA must define, document and communicate the policies, procedures and standards before they can be enforced:
- Policies are general statements of direction or action that communicate and support DBA goals.
- Standards are more detailed and specific than policies and describe the minimum requirements of a given DBA activity. In effect, standards are rules that are used to evaluate the quality of the activity. For example, standards define the structure of application programs and the naming conventions programmers must use.
- Procedures are written instructions that describe a series of steps to be followed during the performance of a given activity. Procedures must be developed within existing working conditions, and they support and enhance that environment.

To illustrate the distinctions among policies, standards and procedures, look at the following examples:
Policies
- All users must have passwords.
- Passwords must be changed every six months.

Standards
- A password must have a minimum of five characters.
- A password may have a maximum of 12 characters.
- ID numbers, names and birth dates cannot be used as passwords.
Procedures
- To create a password, (1) the end user sends to the DBA a written request for the creation of an account; (2) the DBA approves the request and forwards it to the computer operator; (3) the computer operator creates the account, assigns a temporary password and sends the account information to the end user; (4) a copy of the account information is sent to the DBA; and (5) the user changes the temporary password to a permanent one.

Standards and procedures defined by the DBA are used by all end users who want to benefit from the database. Standards and procedures must complement each other and must constitute an extension of data administration policies. Procedures must facilitate the work of end users and the DBA.
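The password policies and standards given as examples lend themselves to an automated check. The following is a minimal sketch under stated assumptions: the function names password_problems and change_is_due and the sample user data are hypothetical, and "six months" is approximated as 183 days, which the text does not specify.

```python
from datetime import date, timedelta

MIN_LENGTH = 5                 # standard: minimum of five characters
MAX_LENGTH = 12                # standard: maximum of 12 characters
MAX_AGE = timedelta(days=183)  # policy: every six months (183 days is an assumption)


def password_problems(password, id_number, names, birth_dates):
    """Return the list of standards that a proposed password violates."""
    problems = []
    if len(password) < MIN_LENGTH:
        problems.append("too short")
    if len(password) > MAX_LENGTH:
        problems.append("too long")
    # ID numbers, names and birth dates cannot be used as passwords.
    forbidden = {item.lower() for item in (id_number, *names, *birth_dates)}
    if password.lower() in forbidden:
        problems.append("personal data")
    return problems


def change_is_due(last_changed, today):
    """Apply the policy that passwords must be changed every six months."""
    return today - last_changed > MAX_AGE


# Hypothetical user used only for illustration.
user = ("E1001", ["Ann", "Smith"], ["1990-01-31"])
print(password_problems("abc", *user))
print(password_problems("Smith", *user))
print(password_problems("tr33house", *user))
print(change_is_due(date(2020, 1, 1), date(2020, 8, 1)))
```

The policy states what must be true and the standards make it measurable, which is why only the standards translate directly into checks. The procedure (who requests, approves and creates the account) describes a workflow between people rather than a rule to test.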
The DBA must define, communicate and enforce procedures that cover areas such as:
- End-user database requirements gathering. Which documentation is required? What forms must be used?
- Database design and modelling. Which database design methodology to use (normalisation or object-oriented methodology)? Which tools to use (CASE tools, data dictionaries, or ER diagrams)?
- Documentation and naming conventions. Which documentation must be used in the definition of all data elements, sets and programs that access the database?
- Design, coding and testing of database application programs. The DBA must define the standards for application program coding, documentation and testing. The DBA standards and procedures are given to the application programmers, and the DBA must enforce those standards.
- Database software selection. The selection of the DBMS package and any other software related to the database must be properly managed. For example, the DBA might require that software be properly interfaced with existing software, that it has the features needed by the organisation, and that it provides a positive return on investment. In today's internet environment, the DBA must also work closely with Web administrators to find proper Web-to-database connectivity solutions.
- Database security and integrity. The DBA must define the policies governing security and integrity. Database security is especially crucial. Security standards must be clearly defined and strictly enforced. Security procedures must be designed to handle a multitude of security scenarios to ensure that security problems are minimised. Although no system can ever be completely secure, security procedures must be designed to meet critical standards. The growing use of internet interfaces to databases opens the door to new security threats that are far more complex and difficult to manage than those encountered with more traditional internally generated and controlled interfaces. Therefore, the DBA must work closely with internet security specialists to ensure that the databases are properly protected from attacks launched inadvertently or attacks launched deliberately by unauthorised users.
- Database backup and recovery. Database backup and recovery procedures must include the information necessary to guarantee proper execution and management of the backups.
- Database maintenance and operation. The DBMS's daily operations must be clearly documented. Operators must keep job logs, and they must write operator instructions and notes. Such notes are helpful in pinpointing the causes and solutions of problems. Operational procedures must also include precise information concerning backup and recovery procedures.
- End-user training. A full-featured training program must be established within the organisation, and procedures governing the training must be clearly specified. The objective is to indicate clearly who does
available
Procedures the
DBMS
and
can
The security,
to
the
Define
to ?
the
each
?
user
to
through
SQL
rights
of
company,
and
The
who
manage
information
sites,
thus
making
data configuration
provided
and
database
to limit
achieved
is
to,
In
by the
addition,
other
has
DBMS DBAs
security
a function
user
access
user to log
at two level,
levels:
the
on to the
user ID or employ
DBA
to
must
mechanisms
of authorisation
and guarantee access
database
management,
to the
database
at the
operating
can request
computer
the same
view
and likely
system.
system
the
creation
At the
DBMS
user ID to authorise
to
level of a level,
the
the
end user
Physical
a user
or a group
define
end users
operating
access
and
DBMS
according
to
the access
describes
to read-only,
less
common
privileges
needs
of individual
rights
the
of authorised
in relational
users to
access
or access type
The use
probable.
or the authorised
privileges
dates.
and to remind
access
privileges
privilege
system
expiration
periodically
unauthorised
managing
Access
security
can prevent
and facilities. include
must define
to
users.
specific
users
access.
access
may include
databases
video,
user.
The
of one The
unauthorised common
entrances,
are
SQL
assigned
directly
security
practices
must
and
provide
the
and the
CREATE
found
workstations, biometric
and control the
more tables
command
physical
recognition
protect
DBMS or
users from
password-protected
voice
data views to
composed of users.
Some
secured
closed-circuit
are
both
commands.
an authorised
that
and
privileges.
at
with predetermined
user groups
assigns
REVOKE
DBA
be done
making
users into
DBA
badges,
The
thus
may be limited
and
access.
can
An access
rights
GRANT
of views
content
designed
system
of controlling
DELETE
accessible
Rights
multiple
section.
services
are not limited
DBA to screen
databases.
personnel
to
DBAs through
monitoring.
can be assigned
Classifying
installations
to
across
previous
proxy
but
This, too,
the
DBMS installation
databases
in the
to
multiple-site
mechanisms
data in the
usage is
periodically,
access
? View definition. are
concern productivity
defines procedures to protect
This is
user.
the
definition
of the
end
database
electronic
review
each
privileges.
physical
accessing
and integrity
include,
passwords
specified
WRITE
of great
security
operating
the
DBA's job
access
READ,
that
of the
DBMS.
the
access
Control
the introduction
reorganisation
greater
The
defined
a different
passwords
in large
Copyright
allows
create
are
way to
of data
DBMS
At the
dates enables
For example,
?
date and to ensure that
standards.
distribution
build
database.
change
facilitates
Editorial
to the
passwords
Assign
of the
procedures:
of expiration
to
the
and integrity.
This function
The database
their
in the
firewalls,
and
following
Define user groups.
?
Naturally,
violations,
the
to
procedures
DBMS level.
access the
Assign
environment.
work
security
privacy
control
user ID that
levels.
up to
database
pointed
management
Those
DBA can either
10
and extent
attacks.
and
management.
at the
logon
of the type
to keep them
and
policies
experts
possible
access
at least
and
the
data in the
use the
administration
Authorisation
User access
?
DBA
security
DBMS
includes
in
procedures
also resulted
security
and integrity.
definition,
annually
or integrity
has
data control,
the
data from
management. security
of the
of the
has
that
database
Protecting
changes
Technology
maintain
with internet
to safeguard
at least
of security
and integrity
Technology
the up
to
revision
installations.
made it imperative enforce
must be aware
Privacy and integrity
more difficult
team
quickly discovery
require
privacy
DBMS
management.
it
the
changes
Data Security,
Each end user
must be revised
adapt
software,
similar
current
when and how.
methodology.
and standards
organisation
new
what,
training
scope
tools
that
assignment
VIEW is
technology.
of the allow
data
the
of access
used
in relational
views.
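The use of views to limit what a user can see can be sketched as follows. This is a minimal illustration rather than a prescribed method: the `employee` table, its columns and the `sales_directory` view are hypothetical names, and SQLite (used here through Python's standard `sqlite3` module so that the example runs anywhere) does not implement GRANT or REVOKE. In a multi-user DBMS the DBA would typically grant SELECT on the view and withhold access to the underlying table.

```python
import sqlite3

# Hypothetical schema: the employee table holds a sensitive salary column.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE employee ("
    "emp_id INTEGER PRIMARY KEY, name TEXT, dept TEXT, salary REAL)"
)
conn.executemany(
    "INSERT INTO employee VALUES (?, ?, ?, ?)",
    [
        (1, "Ana", "Sales", 52000.0),
        (2, "Ben", "IT", 61000.0),
        (3, "Cy", "Sales", 48000.0),
    ],
)

# The view exposes only non-sensitive columns, and only rows for one department.
conn.execute(
    "CREATE VIEW sales_directory AS "
    "SELECT emp_id, name FROM employee WHERE dept = 'Sales'"
)

# A user restricted to the view sees neither salaries nor other departments.
cursor = conn.execute("SELECT * FROM sales_directory ORDER BY emp_id")
view_columns = [col[0] for col in cursor.description]
view_rows = cursor.fetchall()
print(view_columns, view_rows)
```

The view acts as a filter: the salary column and the IT row never reach the user, even though both remain in the base table.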
?
DBMS
access
DBMSs
query
and
?
only
DBMS
control.
Database
and reporting
by authorised
usage
access
tools.
can be controlled
The
DBA
must
the
database Security
The DBA
Preserved:
description pinpoint
accesses
or just
can yield
Action
must also audit the
unnoticed
must
computer
viruses
The integrity For
example,
Whatever
the
procedures
reason,
are
the
can
ensure as
trails
all
of
to
external
but
not
problems,
destroy
database.
and the
beyond
or destruction
database access
by
data.
the
a fire makes
recovery
the
database
or alter
an explosion,
data
by unauthorised
disrupt
include
factors
by
corruption
does
breaches
designed
or corrupted: problems,
security
security are
exist
face
potentially
all database
loss
of the
is
DBAs
control.
or an earthquake.
backup
and recovery
security,
has lost
integrity,
backup
security
and recovery
includes
disaster
testing
measures and
tasks
all of the
or a database
of
database
must include
of the
In large
security
database
loss
part
might
database
the
integrity.
of
of the
1
mean that
is
insurance
is
that
of database
loss
entire
backup
ensure
physically
you can buy.
so critical that
many DBA
officer (DSO). The DSO's sole
shops,
the
DSOs
activities
are
often
management.
management
data
and integrity.
the
database
data
also
a physical A total
or that
cheapest
must
or loss
when
integrity.
lost
are the
Therefore,
DBA
data loss
be caused
entirely
procedures
losses.
The
of physical can
database
but its integrity
ruinous
installations.
in case
A partial
and recovery
database
and
recovery
in
part
of database
a physical
Periodic
when
to
backup
disaster
Disaster
recovery
audit
record
are produced
snooping
have created a position staffed by the database
classified
recovery
such
or destroyed
companies
critical
or total.
or
continues
job is to
organising
violations
of similar
because
of database
available,
partial
management
following
actions
can be fully recovered
be
any case,
The
but
Corrupting
be lost
Such to
DBA.
are
occurred
database
departments
most security
be damaged
possibility
not readily
has
lost. In
might
to any
database
database
repetition
avoid the repetition
whose
Several
accesses.
purposes,
by hackers
procedures
Data loss
properly
and recovery
recovery
data in the
used
database.
be tailored
security
and
the
which automatically
by all users.
can
of similar
information
might
trails
the
avoid
to
performed audit
preserved
state.
database
Process
use of the
are
data in the
is either
a consistent
a database
on the
tools
of an audit log,
whose integrity
to
of
crucial
When data
to
The
database
a database
for
the
Data Backup and
failed
Action is required
be recovered
those
use of the
operations
violations.
As a matter of fact,
access
Corrupted:
database
access
is required
may not be necessary. and
of the
DBA to
breaches
that
Development
personnel.
monitoring.
a brief
enable
Database
by placing limits
make sure
DBMS packages contain features that allow the creation records
10
automatic.
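An audit log of the kind described here, one that automatically records a brief description of database operations, can be approximated with a trigger. The sketch below is an assumption-laden illustration: the `account` and `audit_log` tables and the trigger name are hypothetical, and because SQLite has no user accounts the log records the operation and the changed values but not who performed it. A full-featured DBMS would normally add the user ID through its own audit facilities.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE account (acct_id INTEGER PRIMARY KEY, balance REAL);

    -- Hypothetical audit table: one row per recorded operation.
    CREATE TABLE audit_log (
        log_id      INTEGER PRIMARY KEY AUTOINCREMENT,
        logged_at   TEXT DEFAULT CURRENT_TIMESTAMP,
        operation   TEXT,
        acct_id     INTEGER,
        old_balance REAL,
        new_balance REAL
    );

    -- The trigger fires automatically after every UPDATE on account.
    CREATE TRIGGER account_update_audit AFTER UPDATE ON account
    BEGIN
        INSERT INTO audit_log (operation, acct_id, old_balance, new_balance)
        VALUES ('UPDATE', OLD.acct_id, OLD.balance, NEW.balance);
    END;
    """
)

conn.execute("INSERT INTO account VALUES (1, 100.0)")
conn.execute("UPDATE account SET balance = 75.0 WHERE acct_id = 1")

audit_rows = conn.execute(
    "SELECT operation, acct_id, old_balance, new_balance FROM audit_log"
).fetchall()
print(audit_rows)
```

Because the trigger runs inside the DBMS, the application cannot skip the logging step, which is what makes the trail useful for pinpointing violations after the fact.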
integrity
failure.
contingency
designed to secure
Disaster
plans
and
management
recovery
data availability
includes
procedures.
all planning,
The
backup
and
at least:
applications
data in the
DBA activities
backups.
Some
database.
The
Products
such
DBA
DBMSs
include
should
as IBM's
tools
use those
to
tools
DB2 allow the
ensure to
backup
render
creation
the
and backup
of different
and
backup
types: full, incremental and concurrent. A full backup, also known as a database dump, produces a complete copy of the entire database. An incremental backup produces a backup of all data since the last backup date; a concurrent backup takes place while the user is working on the database. Proper
backup
identification.
date information,
the
database.
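The difference between a full and an incremental backup can be illustrated with a short sketch. The function name `select_for_backup`, the `modified` field and the sample dates are all hypothetical, and real DBMS backup utilities work at the page or log level rather than on Python dictionaries. A concurrent backup is not modelled here because it concerns when the backup runs, not which data it copies.

```python
from datetime import datetime


def select_for_backup(rows, backup_type, last_backup=None):
    """Return the rows that a backup of the given type would copy."""
    if backup_type == "full":
        # A full backup (database dump) copies everything.
        return list(rows)
    if backup_type == "incremental":
        # An incremental backup copies only data changed since the last backup date.
        if last_backup is None:
            raise ValueError("an incremental backup needs the last backup date")
        return [row for row in rows if row["modified"] > last_backup]
    raise ValueError(f"unknown backup type: {backup_type}")


# Hypothetical rows, each stamped with the time of its last modification.
rows = [
    {"id": 1, "modified": datetime(2020, 1, 5)},
    {"id": 2, "modified": datetime(2020, 1, 12)},
    {"id": 3, "modified": datetime(2020, 1, 20)},
]
last_backup = datetime(2020, 1, 10)

full_ids = [r["id"] for r in select_for_backup(rows, "full")]
incremental_ids = [
    r["id"] for r in select_for_backup(rows, "incremental", last_backup)
]
print(full_ids, incremental_ids)
```

The incremental run copies fewer rows, which is why it is faster than a full backup, but recovery then depends on having the last full backup as well.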
thus
Backups
enabling
the
While cloud-based
must DBA to
ensure
backups
be clearly
identified
that
the
are fast replacing
through
correct
backups
tape
detailed are
backups,
descriptions used
and
to recover
many organisations
still use tapes. must
As tapes
be done
and location. tapes
for
Storage (NAS) approach
and fast
restoration.
Convenient
and
outside
safe
the
and
access and
data in the financial
additional
to
but it is less
points
are
and
contingency
plans
and recovery it is
are
useful
are the
the
purpose
prepared
of
and
controls.
The
backups
to
the
of closed
might include computer
protection
sites
use to
also includes
installation.
Multilevel
tokens
be stored?
provide
the
air
provision
of
can
passwords
properly
and
identify
authorised
is
to
and
Current
to
when
than
the
or security
be thoroughly are
not likely
to
not to
officer
of a database
disaster
must
is
establish
they
reach
created
tested
and
must
failure.
by
secure
an
The insurance
massive
data loss.
evaluated,
be disparaged,
cover
priorities
and they
and they
all components
concerning
based
to
have
for the
require
must
top-level
the
of an information
system.
nature
and
extent
of the
data
accomplish
task
that
information
ensure that
being
right
time.
The
DBA is
responsible
at the right time
programming the
and in the right
very time-consuming,
to
especially
environment,
data in the
databases
database.
corporate
where Although
users,
makes it easy for authorised
is to facilitate ends.
and
scanned, the
the
overall
or
use
They
dependent
standards
not
opened
Web front
without
appropriate
access
at the
people,
their
The
when the
data
users
the use
for
format.
depend
internet has
on
and its
also
created
DBA.
philosophy
new internet
users
applications
programs
extensions
right
can become
on a typical
the
data distribution
and the
the
to the right
and use tasks
deliver
extranet
way to
any
DBA event
enforcement.
data are distributed
a new set of challenges
Learning.
The in the
drills
fire
program
distribution
capacity
that
defeats
each
and Use only
that the data
Cengage
place
Where
and
sites inside
process.
programmers
deemed
and
appropriate
Data Distribution ensuring
So-called
support
A backup
recovery
has
data,
of emergency.
database.
expensive making.
Therefore,
required
same
and temperature
of the
Physical
protection
worth
frequently.
managements
One
storage
must include
must be properly
Protection
challenge/response
for the
tools
software.
of a database
software
provide
intranet
same
well as preparation
control to the software
policy
delivery
Attached
use a layered
of the
in
(1)
use
disk-based
storage.
locations
questions:
use in case
coverage
DBAs
solutions
backups
well as humidity
protection.
Insurance
Data recovery
2020
as
insurance
Data
as
and
media for intermediate
storage the
not typically
on Network
archival
multiple The
currency
of resources.
be practised
review
and
and fire
and
be
for
backups
to two
hardware
DBMS for
hardware
disk
do
of tapes
of tape
be stored?
access,
and
backup
fast
track
optical
based
The storage locations
vaults,
respond to
power
computer
privileges
Two
Copyright
to
of both
may be expensive,
Editorial
quakeproof
backups
backup
Personal
different
a DBA
up to
location.
place.)
hire
and labelling
keep
include
storage
to tape
must
must
Enterprise
backed
There
a different
(Keeping
with restricted
a backup
10
storage. in
a policy
are
protection
conditioning,
users
data is transferred
backup
must establish
installations
the
to
solutions
online
(SAN).
are first
backups in the first
For how long
Physical
Networks
data
be stored
fire-safe
include
storage
DBA
enough
backup
the
organisation.
multiple
DBA
Later,
and the
are large
emerging
solutions
Area
which
must
may include
(2)
in
and copy
having
backup
it is vital that the
operators, that
Other
and Storage
backup
backup
organisations
Such
storage,
computer
backup.
devices.
physical
by the
However, enterprise
backup
require
diligently
duplicated,
the
generation
DBA to
on applications
procedures
learning
of a new
enable
end users to access the of
educate
more sophisticated end
programmers.
database.
users
to
Naturally,
query
produce
the
DBA
the must
are adhered to.
This distribution as
database
enabling to
end
more
philosophy
technology users
efficient
to
administration to
ensure
nature
function. the
Data
The
DBAs
technical
role
data
modelling,
the
DBA's technical
of the
DBMS
of the
application
Many
might
For
example,
and support. covered
The technical
Designing
Training
and
the
The following
DBAs
management the
DBA
That
plan
a computer
needs.
community,
of desired
the
details
the
programming
issues.
maintenance
implementation
For
example,
and upgrading and
maintenance
extension
of the
and integrity,
DBAs
backup
managerial
activities.
and recovery,
as a capsule
and
training
whose technical
core is
in the
following
and related
areas
of operation:
utilities
10
criteria
and supporting
on the
step
not
be
hardware
evaluating
and
organisations that
the
for
is
needs search
the
rather
is for
needs,
the
managers,
administration
and
than
must
is involved function
make
database
Therefore,
and
specific
problems
hardware.
software rather
and
than
and not a technological
plan is to
in the
can
to
tool
utilities
on
solutions
the
organisation.
DBMS,
acquisition
DBA
selecting
use in the
selecting
a DBMS is a management
of those
sure
that
process.
be clearly
for
toy.
determine
company
the
end-user
entire
Once
established
the
needs
and the
are
DBMS
can be defined. to the
materially
areas.
responsibilities
of the evaluation
data
That
May
operational
technical
a plan for
mid-level
capability
not
of those
DBMS and Utilities
Put simply,
of the
does
the
must recognise
top-and
Reserved. content
level
understand
and applications
applications
picture
features.
Rights
organisational
applications
primarily
objectives
DBMS
data
and applications
software
a clear
DBMS
DBMS
and
execute
DBA
and selection match
under
database.
are rooted
most important
most important
including the
features
Editorial
based The
To establish
identified,
To
and
and
operation,
might be conceptualised
the
and
or DBMS software.
The first
lead
produce
of the
configuration,
DBMS-related
development,
a logical
security
databases
and
utility
be
are
job
and installing
first
features.
can
DBA's job
completely
functions,
other
installation,
design,
with the
DBAs
will explore
develop
must
hardware
the selection,
database
utilities
system,
must
also
efficiency
at the
not
data
The
the
checks
of can
Clearly,
could inadvertently
function.
do
user.
users
DBMS,
Selecting
of the
data subsets
compromise
more common
end
use
their
shell.
utilities
sections
evaluating, One
and
databases
supporting
Maintaining
and
Process
data elements.
of DBMS
and installing
DBMS,
the
democracy
who
understanding
dual role
of the
and evaluating the
use of the
for
data
without
users
methodologies,
interact
and implementing
Operating
again
end
a broad
with
DBAs
selecting
Thus,
activities
managerial aspects
Evaluating,
might flourish
to
Development
will become
acquisition
data administration
complicated
as well as the
that
deals
Thus, the
the
Yet this
design
technical
DBA
and the
it
more flexible
micromanage
make improper
include
programs
by a clear
Testing
and
activities
the
in
that
Database
Role of the DBA
requires
DBAs
users
elements.
and utility software,
of the
end users
duplication
of data
is
process.
sufficiently
of data
10.6.2 the technical
languages,
those
and it is likely
self-sufficient
decision
become
uniqueness
and sources
an environment
Letting
between might
today,
Such
in the
side effects.
circumstances
common on.
relatively
of data
sever the connection those
become
use
some troublesome
is
marches
10
organisations
DBMS
needs,
checklist
should
the
DBA
at least
would
address
be
wise to
these
develop
a checklist
issues:
DBMS
model. Are the
object/relational multidimensional DBMS
Security
Can the
audit
trail
Backup
and recovery.
offer?
DBMS
DBA
errors
How
many
capacity
are
Which
performance
tools
some
provided?
disk needed?
application
monitoring,
Does the
and
entity
to spot
integrity
errors
multiple
screen
DBMS
provide
rules,
access
and security
backup
rights,
violations?
and recovery
backups?
users?
coding
second
DBMS
is
does
offer
What levels
needed the
in the
DBMS
some
type
provide?
tools?
Does the
DBMS
of isolation
support?
of
DBA
Does the
(table,
application
page,
programs?
Are additional
management
DBMS
interface?
provide
alerts
to
the
occur? Can the
DBMS
or interoperability
to
automated
or network-based
DBA interface
distribution.
and standards. run
on
and from
without
DBMS
Hardware.
other
work
with
level is DBMS
other
DBMS
achieved?
packages?
types
in
Does the
the
same
DBMS support
Does the
DBMS support
systems
and platforms?
a
dictionary.
Vendor
training does
Available
with
and
third-party
Rights
data
Which
computers?
national
Can the
and industry
Can
DBMS
standards
Which
tool?
vendor
Is the
If
so,
Does the
what information
DBMS
offer in-house
DBMS
May
not materially
additional
in the
are required
not
dictionary?
dictionary
management
What is the
does
require?
a data
Does the
are involved
Reserved. content
desktop
support
training?
documentation
is any
Does
CASE tools?
What type
easy to read
kept in it?
and
and level
of
helpful?
What is
policy?
personnel
All
any
tools.
costs?
suppressed
operating
and
all platforms?
DBMS
have
provide?
access
What costs
additional
on different
computers
on
the
DBMS
support.
vendor
upgrade
data dictionary,
recurring
does
Does the
the
vendors
modification
hardware
interface
support
DBMS run
mid-range
follow?
Which
DBMS
Can the
mainframes,
run
does the
any
data
the
Which coexistence
applications
Learning.
or
architecture?
DBMS
that
storage
are supported?
referential
cloud,
per
violations
WRITE operations
Portability
Cengage
other
dictionary,
query
manual
Does the
does
or security and
client/server
deemed
tools.
of information
Cost.
4GLs
data
support
much
administration
What type
the
or
a relational
logs?
How
Database
the
should
size is required?
what
use of audit trails
disk,
transaction
Does the
DBMS
environment?
has
up the
needed?
Data
or
and
support
the
optical
many transactions
the
3GLs
DBMS provide
tape,
How
READ and
object-orientated,
is required,
modified?
processors
10
by a relational,
database
units
design,
DBMS
transaction
when
and
Are end-user
Does the
control. the
Interoperability
2020
be
support
back
Performance.
review
schema
available?
DBMS support
size
DBMS
Concurrency
Copyright
Which
Does the
Does the
automatically
Editorial
are
disk
many tape
support.
and integrity.
does
served
application
access?
and so on?
Does the
maximum How
(database
painters)
Web front-end
row)
What
be supported?
tools
menu
better
warehouse
be used?
development
development and
needs
a data
capacity.
must
Application
If
DBMS
storage
packages
company's
DBMS?
be
acquisition
and
expected
tools
are
and control,
of the
what level payback
offered
by third-party
and storage software
of expertise
vendors
allocation and
hardware?
is required
(query
management How
of them?
tools,
tools)? many
What are the
period?
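One way to turn a checklist like this into a comparison between candidate products is a weighted score. The sketch below is an illustration under stated assumptions, not a method taken from the text: the criteria names, the weights, the 1 to 5 rating scale and the function `score_dbms` are all hypothetical, and a real evaluation would draw its criteria and weights from the organisation's own needs.

```python
def score_dbms(weights, ratings):
    """Return the weighted average rating of one DBMS candidate.

    weights: importance of each checklist criterion to the organisation.
    ratings: how well the candidate meets each criterion (here, 1 to 5).
    """
    missing = set(weights) - set(ratings)
    if missing:
        raise ValueError(f"candidate has no rating for: {sorted(missing)}")
    total_weight = sum(weights.values())
    weighted_sum = sum(weights[c] * ratings[c] for c in weights)
    return weighted_sum / total_weight


# Hypothetical weights: security matters most, cost least.
weights = {"security": 3, "backup_and_recovery": 2, "cost": 1}

# Hypothetical ratings for one candidate product.
candidate_ratings = {"security": 4, "backup_and_recovery": 5, "cost": 2}

candidate_score = score_dbms(weights, candidate_ratings)
print(candidate_score)
```

Scoring every candidate with the same weights keeps the selection tied to the organisation's stated needs rather than to whichever product has the longest feature list.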
Pros and cons alternatives computer
devices,
the
The
both
computer
DBA
process
be familiar
cannot
mind that
Designing
and
department.
also
that
the
DBA
both
other to
be
data communication associated
with
costs.
For
example,
preparation
and
the
DBA
must
maintenance
of the
the
also
The transactions
require
are
and The
network
such
of your
that
the
details
systems
be
services
to
include
and
enforce
within
design
standards
is in place, the the framework.
of the
design
is
database
both
dedicated
and
to
may
decision
and
database
organisational
support
modelling
ensure
determined
the
reviewing
quality
the
to
by the
production
systems.
activities.
and
covered
be assigned
10
and the
modelling
areas
at
DBMS-and
and hardware-independent,
on externally
programmers
the
Such
data-processing
framework
conceptual
personnel
design
community.
the
performed
during
to the
design
based
determine
are
support
people
data
end-user within
and procedures
or executive
the
The
DBA
That coordination priorities.
and integrity
database
of
database
application
design
to
does
events.
overload
the
compliant with
DBMS.
with integrity
broad
applications
database
and
the
the
conversion,
and
during
generation,
plan is a set of instructions
standards.
the
and
skills.
of the physical
and
database.
including
migration
storage
at application
the
physical
design,
database
compilation
generated
part.
programming
implementation
oversight
data loading,
also include
and
design
requires
assistance
and creation, tasks
Rights
DBA is to
according
resources
do not
plan. An access
Learning.
Therefore,
to the
activities
and
and
mirror real-world
of the
determination
that
design
several
systems,
support
must provide
implementation
Cengage
data
components.
log files,
sections
group
DBMS-dependent
modelling
personnel
The implementation
access
and
be grouped
coordinate
The transactions
DBA
of such
the
installed;
are:
Efficient:
the
support
and transaction
configuration
services
of a
(Remember
that
might
of available
The transactions
Therefore,
levels.
with application
Correct:
activities
to
being
DBMS-dependent.
and
standards
assistance
requires
Such
are
design
activities
modelling
managerial
works
procedures
of backup
development
appropriate
database
and transactions.
Compliant:
and
application
hardware-dependent.)
to
designated components
and Applications
design is
and
startup
details
the logical
design jobs
hardware of the
the installation
primary
physical
reassignment
DBA
an
necessary
people
and
details.
DBMS-and
ensure that transactions
deemed
from
evaluations.
in the
and
modelling
database
example,
financial
may require
has
Available
use is likely
devices,
existing
support
DBMSs
Process
details.
data
usually
These
For
schedules
The
those
of the
and
function
activities.
Consult
Once the
the
logical
design is
systems,
2020
process.
and so on. The costs
preparation
as the location
storage
with
one
provides
conceptual,
application.
review
Development
organisations
it requires the
storage
in the
understanding
such
Databases
to be used.
must ensure
The
for
provides
Therefore,
DBA then
physical
Copyright
solution;
system,
of all software
configuration
book.
coordinated
hardware-independent,
Editorial
with the
example,
auxiliary
sites
configuration
and
in this
often
and procedures
DBAs
of the For
involved
a thorough
physical
and implementing are
space
the
during the selection
compatible
be included
expenditures
details
installation
guide
DBA function
services
These
consider
have
include
administration
design
part
processor must
with the installation,
be addressed
design
also
be
processor,
the installation
information
Keep in
the
must
must
procedures
configuration
DBA
programs.
a transaction
and recurring
strategy;
installation
The
is just
utility
components
must supervise
administration
DBMS
system,
must
DBMS
CPU, a front-end
software
one-time
a and
must be evaluated
software
Database
room installations.
The
The
available
and
selection
include
that
software
operating
solutions
because
Remember
by the
hardware
alternative
restricted
application
constrained
must
often
system.
hardware,
the
of several
are
10
storage
services.
of the
completion
The
applications
time that
predetermines validate
how the
the
access
Before
an
maintenance.
The
draw
Therefore,
the
when
data
and
DBA
applications. services
place
before
Although
testing
services,
service
they
programmers
and
are
or
and
and
testing
To be able to
access test
the
and implement
procedures
the
include
responsibility
users to access
for the
create
and
database. operational
utilising
database
database
training,
control
from
and
which the
the
fine-tuning
by
managing
reconfiguring
shared
DBMS
of the
corporate
data
may require
with equal
services
extension
section.
of the
Clearly,
program
can
services
are
are often too
for
the
DBMS.
repository.
assignment
of
efficiency.
for
related
database
development and
database
design
studied
must
already
and implementation
for the separation being
end-user
and implementation
company.
problem
and
standards
use in the
to
The reason
close to the
of the
procedures
be approved closely
all
design,
testing
maintained independently.
designers
the
users
evaluation
are the logical
evaluation
modified,
original
and/or the
Applications
and
preceding
are
application
added
new
application
and
usually
develop,
well as assigning
may require
Databases
any
must
to
operational
all applications
the
provide
in the
as
at run time.
rights
data.
These services
described
DBA
must authorise
assists
evaluating
must also
database
required
Such
plans,
structures
to
the
system.
database
DBMS
resources
Testing
DBA
the
have the
online,
new
required
of a new
that
additional
be in
the
addition
must
and recovery
Finally, the
applications
The
by the
and backup
will access
user
comes
required
Remember
the
application
procedures
security,
application
plan,
is that to
applications
detect
errors
and
omissions. Testing the
usually
applications,
application
and its
The testing
and
and
Technical
purpose
evaluation
creation
of
aspects
integrity,
use
Evaluation
of
and
Observance
SQL
written
Following
the
has
operations
for
can
of all
database
contains
rules
test
of the
data for
database
and
all aspects
database.
that
of the
evaluation
system
process
Backup
from
the
simple
covers:
and recovery,
security
and
be evaluated the
documentation
and
procedures
are
documenting
and
coding
data rules
applications,
the
to
database
and the
procedures,
the
system
is
end users.
and Applications into
four
main areas:
support monitoring
auditing
any
That
and integrity
The
must
ensure
made available
be divided
Security
Learning.
cover
and the
to
with existing
and can be
and recovery
that
definition
performance
naming,
testing
Backup
applications
of all data validation
thorough
Performance
2020
data
and retirement.
documentation
conflicts
operational
System
review
the
database.
application
use
application
Operating the DBMS, Utilities
Copyright
testbed
follow
of standards
The enforcement
Editorial
check
of a database data to its
and
easy to
Data duplication
DBMS
of the
is to
of both the
of the
accurate
declared
with the loading
programs.
collection
10
starts
and
and tuning
monitoring
System
support
and its
applications.
verifying
the
activities
status
activities
for
Performance to
To carry
monitoring
Establish the
DBMSs
the
often
information.
the
performance
to
and
and
demonstrate
Since
sources.
programs
and time.
and
System-related
and
resource
These activities
satisfactory
DBA
performance
are
levels.
must:
objectives
are
being
met
objectives
allow
the
are not
DBA to
use
of indexes,
can
have
they
vendors,
facilities.
Most
bottlenecks.
query
tools,
by third-party
processor
met)
The
are
or they
of the
usage
available
from
may be included
performance-monitoring
most common
query-optimisation
database
bottlenecks
algorithms
in
and
DBMS
management
routines
database
Available
not
Such
much
most
a plan
give
the
user
an index
Managing
Database
is
trying
DBMS
guidelines
and
mode
choice
DBA should and
time
Typically,
command-line
13,
usually
is
and
DBMS
especially
within
to
educate
programmers examples
within
the
that
SQL Performance,
that
10
application
a query,
create indexes
of databases
of application
The amount
of primary
types
tuning.
The
DBMS
can improve
for
examples
pool
used
of
allowing access
determining by the
efficient
concurrency.
the
DBMS
the
desired level
and requested
operation
(See
few to
of the
Chapter
system,
12,
Managing
subject)
primary
and
allocation
secondary
of storage can
memory,
resources
be used
to
is
must also
determined
be
when
determine:
concurrently
or users supported
memory (buffer
thereby
concurrent
for
to the
parameters
may be opened
programs
of locks
on that
of both
package,
parameters
influence
configuration
that
The number
that
DBMS to improving
is important
more information
performance Storage
by the issue
in terms
the
orientated
DBA specify
affected
factors
for
into
are
concurrency
resources,
DBMS
integrated
let the
also
with the
configured.
number
spend
performance
both in the
routines
Concurrency,
during
useful
the
are
the
be familiar
storage
is
performance,
plan.
use of SQL statements.
Therefore,
packages
Because
and
usage
to
user.
Chapter
Concurrency
must
on system
tuning.)
applications.
Transactions
and
DBA is likely
contain
do
the
Query-optimisation
of concurrency.
considered
proper
systems
(See
Several
DBA
on the
for
effect
creation
of SQL statements,
relational
options.
index
the
manuals
selection
a negative
environment.
performance,
use
performance
database.
defined
database
Query-optimisation
The
attention
performance-monitoring
system
selection
end users
performance.
database
DBMS
the
that
are provided
to the
a carefully
proper
makes the index
the
power
DBMS
applications.
(if performance
tools
not include utilities
administration the
programs.
the
to
solutions
or transaction
satisfactory
programmers
by the
checking
special
maintain
performance
solutions
on selected
index
a relational
produce
tuning
the
alternative
are related
adhere in
system
whether
does
utilities
improper
installations
manuals
of the
tape
Process
resources.
Because
important
Development
operations
changing
emergency
DBAs
tasks
Database
goals
DBMS
DBA to focus tuning
of storage
To
DBMS
to
as running
applications
and tuning
performance
sources.
and
such
much of the and
day-to-day
logs
of database
performance-monitoring
system
allow
packages
versions
require
to the
out job
tasks
utilities
evaluate
selected
If the
operating
tools
DBMS,
and find
include
many different in
to
problem
Implement
and tuning
the
filling
disk
occasional
performance
DBMS
the
hardware,
upgraded
related
from
performance-monitoring
DBMS
Monitor
Isolate
that
directly
range
periodic,
new and/or
ensure
out the
all tasks
activities
of computer
include
configurations
designed
cover
These
10
concurrently
size) assigned
to
each database
and each
database
process
The size and location The log
files
increase
can
of the log files (remember
be located
DBMS
Since
data
primary
issues
manuals
loss
files
and on the relative
DBMS
incremental.
and
to
DBA provides
DBAs
and
The technical common
been
must
the
database.
movement
and to
are faster
is
must
become
familiar
and
recovery
activities
a schedule
dependent
components
for
on the
the
with
task. are
of
backing
up
application
type
database,
the
database
backups,
be they
up periodically. automated
than
full
an incremental
backup
full
or
requires
purposes.
requires
plan, implement,
database
backups,
be useful for recovery failure
DBA
performance-monitoring
establish
frequency
schedule
application
test
and
of the transaction
enforce
log to the
a bulletproof
backup
and
The
assigning objects,
DBA
assignment
and end
users,
users.
and
security
using
creating
trails
an audit
and,
rights
and the
aspects
of security
SQL commands
audit
generate
violations
of access
The technical
access rights,
must periodically
or attempted
DBMS
to
trail
report
if so, from
to
discover
grant
security
to
determine
which locations
and,
required
is
one
for
DBMS, activities
use of the
users
and
can
development
with
ensure
and
activities.
In
addition,
utilities for the applications DBMS tools
as
well as the
be
programmers developed
of a technical
is to
also included
facilitate
database
such
in the support.
used to find
solutions
that
the
DBMS
vendors.
company
concerning
to
Establishing
has new
a good
products
give organisations
good
external and
relationships
support
source.
personnel
retraining.
an edge in determining
the future
DBA are an extension
includes
maintenance
usually
done
as
to
space
Reserved. does
May not
not materially
of the
new
copied, affect
is DBMS
of the
physical
reorganising
free
physical
activities.
disk-page
also
activities.
or secondary
the
fine-tuning
contiguous might
operational
Maintenance
environment.
the
locations space
storage
location
devices.
of
data in
The reorganisation
to the
allocated
the of
a
DBMS to increase
to
deleted
data,
thus
data.
scanned, the
of the
DBMS
management
process
for
be
of the
activities
part
allocate
The reorganisation disk
Applications
preservation
maintenance
might be designed
content
end
information
of the
to the
common
Rights
DBMS and its
the
procedure
are also likely
Utilities
database
All
for
by interaction way to
relations
This is
suppressed
technical
programming.
support
up-to-date
database.
more
DBAs
development.
DBMS
performance.
covers
database
the
in the
in the use of the
training
for
provided is
source
most
is included
training
programmer
might include
are dedicated
of the
tools
problems.
support
the
Periodic
and its
troubleshooting
maintenance
activities
any
appropriate
database
A technical
of database
Learning.
and
technical
suppliers
Maintaining
that
creating
on-demand
Good vendor-company
Cengage
the
Users
use the
standards
are the
providing
assumes
users
actual
procedure
of the
direction
deemed
the
backup
system
by programmers
violations.
technical
software
Vendors
has
must
or secures technical
activities.
Part
to
Application
Unscheduled,
2020
DBA
DBA
be backed
that
to
monitoring
and Supporting people
procedures
review
The
All critical
backups
privileges
rights
have
programmers.
Copyright
head
by whom.
Training Training
Editorial
disks
in the
organisation,
Backup
must
utilities
monitoring involve
there
possible,
One
data.
logs
involved
the
operation.
of the
The
or attempted
whether
The
to
intervals.
full backup
and
access
violations
with
are used to recover
the
Therefore,
details
after a media or systems
copy.
auditing
and revoke
to
DBMS
incremental
use of access
auditing
the
reduce
DBMS-specific.
devastating
include
of a periodic
Security
10
to
procedure.
proper
if
the
packages
database
recovery
be
transaction
Database recovery correct
that these files
volume
technical
at appropriate
Although
existence
are
the
importance
and the
Most
to
during
and log
applications
to learn
is likely
concern
database
the
a separate
performance)
Performance-monitoring the
in
Maintenance require
the
might
create
DBMS
warehousing
databases.
The
upgraded
from
Database
conversion
analysis,
efforts
or for
different
services
desktop
charting,
computer
to
to
of the
DBA include software.
migration Such
modelling,
and
software-specific)
so
level
on.
to
perform
Migration
or at the
formats
access.
for
when the
by an entirely
host
data in
system
new
DBMS
of activities
conversion
is
DBMS.
(mainframe-based)
spreadsheet
services
(storage-media-or
data
or between
services
common
a variety
and
physical
host
a client/
data support,
conversion
the
on a different
for internet
dissimilar
are
data from
user
as spatial
DBMS is replaced
downloading that
and
conditions
Or it
in
571
might
tool.
running
interfaces
data in
Process
The upgrade
running
such
Development
front-end
applications
programming
or when the existing
allow
software.
DBMS
DBMS
exchange
Database
an internet
a host
features
for Java
DBMS
to
or
distributed
include
need
utility
software
access
in
databases
also include
statistical
(DBMS-or
allow
common
with the
one version to another
users
the logical
to
DBMS and
DBMS
and support
are faced
maintenance
formats
an end
are
Also, new-generation
companies
the
of the
gateway
services
and star query support,
often
incompatible
to
upgrading
version
DBMS
gateway
environment.
Quite
also include of a new
an additional
computer.
server
activities
installation
10
can
be done
at
operating-system-specific)
level.
10.6.3 Developing a Data Administration Strategy
For a company to succeed, its activities must be committed to its main objectives or mission. Therefore, regardless of a company's size, a critical step for any organisation is to ensure that its information system supports its strategic plans for each of its business areas.
The database administration strategy must not conflict with the information systems plans. After all, the information systems plans are derived from a detailed analysis of the company's goals, its condition or situation, and its business needs. Several methodologies are available to ensure the compatibility of data administration and information systems plans and to guide the strategic plan development. The most commonly used methodology is known as information engineering.
Information engineering (IE) allows for the translation of a company's strategic goals into the data and applications that will help the company achieve those goals. IE focuses on the description of the corporate data instead of the processes. The IE rationale is simple: business data types tend to remain fairly stable and do not change much during their existence. In contrast, processes change often and thus require the frequent modification of existing systems. By placing the emphasis on data, IE helps decrease the impact on systems when processes change.
The output of the IE process is an information systems architecture (ISA) that serves as the basis for planning, development and control of future information systems. Figure 10.17 shows the forces that affect ISA development.
FIGURE 10.17 Forces affecting the development of the ISA
[Figure labels: company managers; goals; company mission; critical success factors; strategic plan; information engineering; information systems architecture]
Implementing IE methodologies in an organisation is a costly process that involves planning, a commitment of resources, management liability, well-defined objectives, identification of critical factors, and control. An ISA provides a framework that includes the use of computerised, automated and integrated tools such as a DBMS and CASE tools.
The success of the overall information systems strategy and, therefore, of the data administration strategy depends on several critical success factors. Understanding the critical success factors helps the DBA develop a successful corporate data administration strategy. Critical success factors include managerial, technological and corporate culture issues, such as:
Management commitment. Top-level management commitment is necessary to enforce the use of standards, procedures, planning and controls. The example must be set at the top.
Thorough company situation analysis. The current situation of the corporate data administration must be analysed to understand the company's position and to have a clear vision of what must be done. For example, how are database analysis, design, documentation, implementation, standards, codification and other issues handled? Needs and problems should be identified first, then prioritised.
End-user involvement. End-user involvement is another aspect critical to the success of the data administration strategy. What is the degree of organisational change involved? Successful organisational change requires that people be able to adapt to the change. Users should be given an open communication channel to upper-level management to ensure the success of the implementation. Good communication channels are key to the overall process.
Defined standards. Analysts and programmers must be familiar with appropriate methodologies, procedures and standards. If analysts and programmers lack familiarity, they may need to be trained in the use of the procedures and standards.
Training. The vendor must train the DBA personnel in the use of the DBMS and other tools. End users must be trained to use the tools, standards and procedures to obtain and demonstrate the maximum benefit, thereby increasing end-user confidence. Key personnel should be trained first, so they can train others later.
A small pilot project. A small project is recommended to ensure that the DBMS will work in the company, that the output is what was expected, and that the personnel have been trained properly.
This list of factors is not and cannot be comprehensive. Nevertheless, it does provide the initial framework for the development of a successful strategy. However, no matter how comprehensive the list of success factors is, it must be based on the notion that the development and implementation of a successful data administration strategy are tightly integrated with the overall information systems planning activity of the organisation.
SUMMARY
An information system is designed to facilitate the transformation of data into information and to manage both data and information. Thus, the database is a very important part of the information system. Systems analysis is the process that establishes the need for, and the extent of, an information system. Systems development is the process of creating an information system.
The Systems Development Life Cycle (SDLC) traces the history (life cycle) of an application within the information system. The SDLC can be divided into five phases: planning, analysis, detailed systems design, implementation and maintenance. The SDLC is an iterative rather than a sequential process.
The Database Life Cycle (DBLC) describes the history of the database within the information system. The DBLC is composed of six phases: database initial study, database design, implementation and loading, testing and evaluation, operation, and maintenance and evolution. Like the SDLC, the DBLC is iterative rather than sequential.
The conceptual portion of the design may be subject to several variations, based on two basic design philosophies: bottom-up vs top-down and centralised vs decentralised.
Threats to database security include the loss of integrity, confidentiality and availability of data. An organisation should select relevant security measures and develop a comprehensive plan to protect its data.
The database administrator (DBA) is responsible for managing the corporate database. The internal organisation of the database administration function varies from company to company. Although no standard exists, it is common practice to divide DBA operations according to the Database Life Cycle phases. Some companies have created a position with a broader data management mandate to manage computerised and other data within the organisation. This broader data management activity is handled by the data administrator (DA).
The DA and the DBA functions tend to overlap. Generally speaking, the DA is more managerially orientated than the more technically orientated DBA. Compared to the DBA function, the DA function is DBMS-independent, with a broader and longer-term focus. However, when the organisation chart does not include a DA position, the DBA executes all the DA's functions. Because the DBA has both technical and managerial responsibilities, the DBA must have a diverse mix of skills.
The managerial services of the DBA function include at least:
• Supporting the end-user community
• Defining and enforcing policies, procedures and standards for the database function
• Ensuring data security, privacy and integrity
• Providing data backup and recovery services
• Monitoring the distribution and use of data in the database
The technical role requires the DBA to be involved in at least these activities:
• Evaluating, selecting and installing the DBMS and utilities
• Designing and implementing databases and applications
• Testing and evaluating databases and applications
• Operating the DBMS, utilities and applications
• Training and supporting users
• Maintaining the DBMS, utilities and applications
The development of the data administration strategy is closely related to the company's mission and objectives. Therefore, the development of an organisation's strategic plan corresponds to that of data administration, requiring a detailed analysis of company goals, situation and business needs. To guide the development of this overall plan, an integrating methodology is required. The most commonly used integrating methodology is known as information engineering (IE).
KEY TERMS
access plan
audit log
audit trail
authorisation management
bottom-up design
boundaries
centralised design
computer-aided systems engineering (CASE)
conceptual design
concurrent backup
data administrator (DA)
data encryption
database administrator (DBA)
database development
Database Life Cycle (DBLC)
database security officer (DSO)
decentralised design
disaster management
full backup
incremental backup
information engineering (IE)
information resource manager (IRM)
information system
information systems architecture (ISA)
logical design
minimal data rule
physical design
physical security
policies
procedures
scope
standards
systems administrator (SYSADM)
systems analysis
systems development
Systems Development Life Cycle (SDLC)
top-down design
user authentication
virtualisation
FURTHER READING
Bertino, E. and Sandhu, R. 'Database Security - Concepts, Approaches, and Challenges', IEEE Transactions on Dependable and Secure Computing, 2(1), 2005.
Du, W. Computer Security: A Hands-on Approach. CreateSpace Independent Publishing Platform, 2017.

Online Content: Answers to selected Review Questions and Problems for this chapter are contained on the online platform accompanying this book.

REVIEW QUESTIONS
1 What is an information system? What is its purpose?
2 How do systems analysis and systems development fit into a discussion about information systems?
3 What does the acronym SDLC mean, and what does an SDLC portray?
4 What does the acronym DBLC mean, and what does a DBLC portray?
5 Discuss the distinction between centralised and decentralised conceptual database design.
6 What is the minimal data rule in conceptual design? Why is it important?
7 Discuss the distinction between top-down and bottom-up approaches in database design.
8 What is the data dictionary's function in database design?
9 Which factors are important in a DBMS software selection?
10 Describe the DBA's responsibilities.
11 Describe and characterise the skills desired for a DBA.
12 What are the DBA's managerial roles? Describe the managerial activities and services provided by the DBA.
13 Which DBA activities are used to support the end-user community?
14 Explain the DBA's managerial role in the definition and enforcement of policies, procedures and standards.
15 Protecting data security, privacy and integrity are important database functions in authorisation management. Which activities are required in the DBA's managerial role of enforcing those functions?
16 Discuss the importance and characteristics of database backup and recovery procedures. Then describe the actions that must be detailed in backup and recovery plans.
17 Assume that your company assigned you the responsibility of selecting the corporate DBMS. Develop a checklist for the technical and other aspects involved in the selection process.
18 Describe the activities that are typically associated with the design and implementation services of the DBA technical function. Which technical skills are desirable in the DBA's personnel?
19 Briefly explain the concepts of information engineering (IE) and information systems architecture (ISA). How do those concepts affect the data administration strategy?
20 Identify and explain some of the critical success factors in the development and implementation of a successful data administration strategy.
21 What are the main categories of threats faced by an organisation in trying to protect its data?
22 What is data encryption and why is it important to data security?
PROBLEMS
1 Thabo's Car Service & Repair Centres are owned by Nationwide Car Dealers; Thabo's services and repairs only cars from Nationwide Car Dealers. Three of Thabo's Car Service & Repair Centres provide service and repair for the entire province. Each of the three centres is independently managed and operated by a shop manager, a receptionist and at least eight mechanics. Each centre maintains a fully stocked parts inventory. Each centre also maintains a manual file system in which each car's maintenance history is kept: repairs made, parts used, costs, service dates, owner, and so on. Files are also kept to track inventory, purchasing, billing, employees' hours and payroll.
You have been contacted by the manager of one of the centres to design and implement a computerised system. Given the preceding information, do the following:
a Indicate the most appropriate sequence of activities by labelling each of the following steps in the correct order. (For example, if you think that 'Load the database' is the appropriate first step, label it '1'.)
______________ Normalise the conceptual model.
______________ Obtain a general description of company operations.
______________ Load the database.
______________ Create a description of each system process.
______________ Test the system.
______________ Draw a data flow diagram and system flowcharts.
______________ Create a conceptual model, using ER diagrams.
______________ Create the application programs.
______________ Interview the mechanics.
______________ Create the file (table) structures.
______________ Interview the shop manager.
b Describe the different modules that you believe the system should include.
c How will a data dictionary help you develop the system? Give examples.
d Which general (system) recommendations might you make to the shop manager? For example, if the system will be integrated, which modules will be integrated? Which benefits would be derived from such an integrated system? Include several general recommendations.
e What is the best approach to conceptual database design? Why?
f Name and describe at least four reports the system should have. Explain their use. Who will use those reports?
2 Suppose you have been asked to create an information system for a manufacturing plant that produces nuts and bolts of many shapes, sizes and functions. Which questions would you ask, and how would the answers to those questions affect the database design?
3 What do you envision the SDLC to be?
4 What do you envision the DBLC to be?
5 Suppose you perform the same functions noted in Problem 2 for a larger warehousing operation. How are the two sets of procedures similar? How and why are they different?
6 Using the same procedures and concepts employed in Problem 1, how would you create an information system for the Tiny University example in Chapter 5?
7 You have been assigned to design the database for a new soccer club. Indicate the most appropriate sequence of activities by labelling each of the following steps in the correct order. (For example, if you think that 'Load the database' is the appropriate first step, label it '1'.)
______________ Create the application programs.
______________ Create a description of each system process.
______________ Test the system.
______________ Load the database.
______________ Normalise the conceptual model.
______________ Interview the soccer club president.
______________ Create a conceptual model using ER diagrams.
______________ Interview the soccer club director of coaching.
______________ Create the file (table) structures.
______________ Obtain a general description of the soccer club operations.
______________ Draw a data flow diagram and system flowchart.
CHAPTER 11 Conceptual, Logical, and Physical Database Design
IN THIS CHAPTER, YOU WILL LEARN:
About the three stages of database design: conceptual, logical, and physical
How to design a conceptual model to represent the business and its key functional areas
How the conceptual model can be transformed into a logically equivalent set of relations
How to translate the logical data model into a set of specific DBMS table specifications
About different types of file organisation
How indexes can be applied to improve data access and retrieval
How to estimate data storage requirements
PREVIEW
In Chapter 10, you learnt about the Database Life Cycle. The most critical phase of this cycle is database design. It is essential that the user requirements that have been captured in the initial study are used to build a data model that accurately reflects the actual needs and characteristics of the business. Such is the importance of database design; it is broken down into three distinct stages:
Conceptual database design, where we create the conceptual representation of the database by producing a data model that identifies the relevant entities and relationships within our system.
Logical database design, where we design relations based on each entity and define integrity rules to ensure there are no redundant relationships within our database.
Physical database design, where the physical database is implemented in the target DBMS. In this stage, we have to consider how each relation is stored and how the data is accessed.
Figure 11.1 shows the procedural flow of these stages and the steps within each that need to be taken.
FIGURE 11.1 The three stages of database design
Conceptual design: data analysis and requirements; entity relationship modelling and normalisation; data model verification; distributed database design
Logical design: creating the logical data model; validating the logical data model using normalisation; assigning and validating integrity constraints; merging logical models constructed for different parts of the database; reviewing the logical data model with the user
Physical design: translate each relation identified in the logical data model into tables; determine a suitable file organisation; define indexes; define user views; estimate data storage requirements; determine database security for users
These three stages of database design are not totally intuitive or obvious. There is no single quick or automated method for tackling each stage. A well-designed database takes a considerable amount of time and effort to envisage, build and refine. It cannot be stressed enough that, if you take the time to design your databases properly, they will provide a solid foundation on which to build a complete system.
One of E.F. Codd's requirements when designing a relational database management system was that the design should maintain logical and physical data independence. The separation of these two stages is very important. Logical design is concerned with what the database looks like to the user. Physical design is concerned with how the logical design maps to the physical storage of the database in secondary storage. Codd's rules on relational database design stated that:
if the logical structure of the database should change, then the way the user views the database should not change in any way (logical data independence)
if the physical methods (hardware, storage, etc.) of storing and retrieving data change, then the user interface should not be affected (physical data independence).
In this chapter, you will learn about the steps required to complete conceptual, logical, and physical database design using a number of examples.
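Physical data independence can be illustrated with a small experiment. The following is a minimal sketch, not a definitive test of Codd's rule: it uses Python's built-in sqlite3 module, and the table name customer, its columns and the index name idx_customer_city are all hypothetical names chosen for illustration. It runs the same query before and after a purely physical change (adding an index) and compares the answers.

```python
import sqlite3


def run_demo():
    """Run one query before and after adding an index (a physical change)."""
    conn = sqlite3.connect(":memory:")
    # Hypothetical table used only for this illustration.
    conn.execute(
        "CREATE TABLE customer ("
        "cus_code INTEGER PRIMARY KEY, cus_lname TEXT, cus_city TEXT)"
    )
    conn.executemany(
        "INSERT INTO customer VALUES (?, ?, ?)",
        [(1, "Ndlovu", "Durban"), (2, "Smith", "Leeds"), (3, "Khumalo", "Durban")],
    )
    query = "SELECT cus_lname FROM customer WHERE cus_city = ? ORDER BY cus_lname"

    # The user's view of the data before the physical change.
    before = conn.execute(query, ("Durban",)).fetchall()

    # Physical change only: a new access path, no change to the table's columns.
    conn.execute("CREATE INDEX idx_customer_city ON customer (cus_city)")

    # The same query, unchanged, after the physical change.
    after = conn.execute(query, ("Durban",)).fetchall()
    conn.close()
    return before, after


before, after = run_demo()
print("Same result before and after indexing:", before == after)
```

The query text never mentions the index, which is the point: the access path may change, but what the user asks for and receives should not.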
11.1 CONCEPTUAL DESIGN
In the conceptual represents
design stage, data modelling is used to create an abstract
real-world
objects in the
most realistic
way possible.
database structure that
The conceptual
model
must embody
a clear understanding of the business and its functional areas. At this level of abstraction, the type of hardware and/or database modelto be used might not yet have been identified. Therefore, the design must be software and hardware independent so the system can be set up within any hardware and software platform chosen later. Keep in
mind the following
minimal
All that is needed is there,
data rule:
and all that is there is
needed.
In other words, make sure that all data needed are in the model and that all data in the model are needed. All data elements required by the database transactions must be defined in the model, and all data elements defined in the model must be used by at least one database transaction. However, as you apply the minimal data rule, avoid an excessive short-term bias. Focus not only on the immediate data needs of the business, but also on the future data needs. Thus, the database design must leave room for future modifications and additions, ensuring that the business's investment in information resources will endure. As you re-examine Figure 11.1, note that conceptual design requires four steps, each of which will be examined in the next sections:
Data analysis and requirements
Entity relationship modelling and normalisation
Data model verification
Distributed database design
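The minimal data rule lends itself to a mechanical check. The following is a minimal sketch, assuming the designer keeps a list of model data elements and a list of the elements each transaction uses; the element and transaction names are hypothetical and chosen only for illustration.

```python
# Hypothetical inventory: data elements defined in the model, and the
# elements that each database transaction reads or writes.
model_elements = {"CUS_ID", "CUS_NAME", "INV_NUMBER", "INV_DATE", "CUS_FAX"}
transactions = {
    "create_invoice": {"CUS_ID", "INV_NUMBER", "INV_DATE"},
    "customer_report": {"CUS_ID", "CUS_NAME", "CUS_PHONE"},
}


def minimal_data_rule(model, txns):
    """Return (missing, unused) element sets for the minimal data rule."""
    used = set().union(*txns.values())
    missing = used - model   # needed by a transaction but not in the model
    unused = model - used    # in the model but used by no transaction
    return missing, unused


missing, unused = minimal_data_rule(model_elements, transactions)
print("missing:", sorted(missing))
print("unused:", sorted(unused))
```

An empty result for both sets means the rule holds for the current list of transactions. An element flagged as unused is not necessarily wrong: it may anticipate a future data need, which is exactly the short-term bias the text warns against.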
11.1.1 Data Analysis and Requirements
The first step in conceptual design is to discover the characteristics of the data elements. An effective database is an information factory that produces key ingredients for successful decision making. Appropriate data element characteristics are those that can be transformed into appropriate information. Therefore, the designer's efforts are focused on:
Information needs. What kind of information is needed, that is, what output (reports and queries) must be generated by the system, what information does the current system generate, and to what extent is that information adequate?
Information users. Who will use the information? How is the information to be used? What are the different end-user data views?
Information sources. Where is the information to be found? How is the information to be extracted once it is found?
Information constitution. What data elements are needed to produce the information? What are the data attributes? What relationships exist among the data? What is the data volume? How frequently are the data used? What data transformations are to be used to generate the required information?
The designer obtains the answers to those questions from different sources to compile the necessary information. Note these sources:
Developing and gathering end-user data views. The database designer and the end user(s) interact to jointly develop a precise description of end-user data views. In turn, the end-user data views will be used to help identify the database's main data elements.
Directly observing the current system: existing and desired output. The end user usually has an existing system in place (manual or computer-based). The designer reviews the existing system to identify the data and their characteristics. The designer examines the input forms and files (tables) to discover the data type and volume. If the end user already has an automated system in place, the designer carefully examines the current and desired reports to describe the data required to support the reports.
Interfacing with the systems design group. The database design process is part of the Systems Development Life Cycle (SDLC). In some cases, the systems analyst in charge of designing the new system will also develop the conceptual database model. (This is usually true in a desktop database environment.) In other cases, the database design is considered part of the database administrator's job. The presence of a database administrator (DBA) usually implies the existence of a formal data-processing department. The DBA designs the database according to the specifications created by the systems analyst.

To develop an accurate data model, the designer must have a thorough understanding of the company's data types and their extent and uses. But data do not by themselves yield the required understanding of the total business. From a database point of view, the collection of data becomes meaningful only when business rules are defined. Remember from Chapter 2, Data Models, that a business rule is a brief and precise narrative description of a policy, procedure or principle within a specific organisation's environment. Business rules, derived from a detailed description of an organisation's operations, help to create and enforce actions within that organisation's environment. When business rules are written properly, they define entities, attributes, relationships, multiplicities and constraints.

To be effective, business rules must be easy to understand and they must be widely disseminated to ensure that every person in the organisation shares a common interpretation of the rules. Using simple language, business rules describe the main and distinguishing characteristics of the data as viewed by the company. Examples of business rules are as follows:
A customer may make many payments on account.
Each payment on account is credited to only one customer.
A customer may generate many invoices.
Each invoice is generated by only one customer.

Given their critical role in database design, business rules must not be established casually. Poorly defined or inaccurate business rules lead to database designs and implementations that fail to meet the needs of the organisation's end users.

Ideally, business rules are derived from a formal description of operations. As its name implies, the description of operations is a document that provides a precise, detailed, up-to-date and thoroughly reviewed description of the activities that define an organisation's operating environment. (To the database designer, the operating environment is both the data sources and the data users.) Naturally, an organisation's operating environment is dependent on the organisation's mission. For example, the operating environment of a university would be quite different from that of a steel manufacturer, an airline or a nursing home. Yet no matter how different the organisations may be, the data analysis and requirements component of database design is enhanced when the data environment and data use are described accurately and precisely within a description of operations.
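The customer, payment and invoice business rules quoted on this page translate directly into one-to-many relationships. The sketch below, using Python's built-in sqlite3 module, is one possible illustration; the table and column names (CUS_ID, INV_NUMBER, PAY_ID and so on) are assumptions made for the example, not names taken from the text.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only on request
conn.executescript("""
CREATE TABLE CUSTOMER (
    CUS_ID   INTEGER PRIMARY KEY,
    CUS_NAME TEXT NOT NULL
);
-- "Each invoice is generated by only one customer": one mandatory CUS_ID per row.
CREATE TABLE INVOICE (
    INV_NUMBER INTEGER PRIMARY KEY,
    CUS_ID     INTEGER NOT NULL REFERENCES CUSTOMER (CUS_ID)
);
-- "Each payment on account is credited to only one customer."
CREATE TABLE PAYMENT (
    PAY_ID     INTEGER PRIMARY KEY,
    CUS_ID     INTEGER NOT NULL REFERENCES CUSTOMER (CUS_ID),
    PAY_AMOUNT REAL NOT NULL
);
""")

# "A customer may generate many invoices" and "may make many payments".
conn.execute("INSERT INTO CUSTOMER VALUES (1, 'A. Customer')")
conn.executemany("INSERT INTO INVOICE VALUES (?, ?)", [(1001, 1), (1002, 1)])
conn.executemany("INSERT INTO PAYMENT VALUES (?, ?, ?)", [(1, 1, 20.0), (2, 1, 35.5)])

# An invoice for a customer that does not exist violates the rule and is rejected.
try:
    conn.execute("INSERT INTO INVOICE VALUES (1003, 99)")
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True

invoice_count = conn.execute(
    "SELECT COUNT(*) FROM INVOICE WHERE CUS_ID = 1"
).fetchone()[0]
print(orphan_rejected, invoice_count)
```

The "many" side of each rule carries the foreign key, and the NOT NULL constraint expresses that every invoice and payment must belong to exactly one customer.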
In a business environment, the main sources of information for the description of operations and, therefore, of business rules are company managers, policy makers, department managers and written documentation such as company procedures, standards and operations manuals. A faster and more direct source of business rules is direct interviews with end users. Unfortunately, because perceptions differ, the end user can be a less reliable source when it comes to specifying business rules. For example, a maintenance department mechanic may believe that any mechanic can initiate a maintenance procedure, when actually only mechanics with inspection authorisation should perform such a task. Such a distinction may seem trivial, but it has major legal consequences. Although end users are crucial contributors to the development of business rules, it pays to verify end-user perceptions. Often, interviews with several people who perform the same job yield very different perceptions of what their job components are. While such a discovery may point to management problems, that general diagnosis does not help the database designer. Given the discovery of such problems, the database designer's job is to reconcile the differences and verify the results of the reconciliation to ensure that the business rules are appropriate and accurate.

Knowing what the business rules are enables the designer to fully understand how the business works and what role data plays within company operations. Consequently, the designer must identify the company's business rules and analyse their impact on the nature, role and scope of data.

Business rules yield several important benefits in the design of new systems:
They help standardise the company's view of data.
They constitute a communications tool between users and designers.
They allow the designer to understand the nature, role and scope of the data.
They allow the designer to understand business processes.
They allow the designer to develop appropriate relationship participation rules and foreign key constraints. (See Chapter 5, Data Modelling with Entity Relationship Diagrams.)
The last point is especially noteworthy: whether a given relationship is mandatory or optional is usually a function of the applicable business rule.

Example: Data Analysis and Requirements for a DVD Rental Store
To illustrate the conceptual design process, let us now consider an example based on a DVD rental store. Within this store, movie titles are classified according to their type: comedy, family, action, documentary and new release. Each type contains many possible titles, and most titles within a type are available in multiple copies. For example, note the summary presented in Table 11.1.

TABLE 11.1 The DVD rental type and title relationship
Types: Family, Comedy, Action
Title                   Copy
Chronicles of Narnia    1
Chronicles of Narnia    2
Toy Story               1
Toy Story               2
Toy Story               3
Simpsons                1
Simpsons                2
Simpsons                3
Lord of the Rings       1
Lord of the Rings       2
You have been asked to produce a database for this DVD rental store and have been provided with the following set of business rules from the store manager:
The movie type classification is standard; not all types are necessarily in stock.
The movie list is updated as necessary; however, a movie on that list might not be ordered if the DVD shop owner decides that it is not desirable for some reason.
The DVD shop does not necessarily order movies from the entire vendor list; some vendors on the vendor list are merely potential vendors from whom movies may be ordered in the future.
Movies classified as new releases are reclassified to an appropriate type after they have been in stock for more than 30 days. The video shop manager wants to have an end-of-period (week, month, year) report for the number of rentals by type.
If a customer requests a title, the shop assistant must be able to find it quickly. When a customer selects one or more titles, an invoice is written. Each invoice may thus contain charges for one or more titles. All customers pay in cash.
When the customer checks out a title, a record is kept of the checkout date and time and the expected return date and time. Upon the return of rented titles, the shop assistant must be able to check quickly whether the return is late and to assess the appropriate late return fee.
The DVD store owner wants to be able to generate periodic revenue reports by title and by type. The owner also wants to be able to generate periodic inventory reports and to keep track of titles on order.
The DVD store owner, who employs two (salaried) full-time and three (hourly) part-time employees, wants to keep track of all employee work time and payroll data. Part-time employees must arrange entries in a work schedule, while all employees sign in and out on a work log.
NOTE
When capturing the requirements, remember that the description of operations establishes not only the operational aspects of the business; it also establishes some of the system objectives. As you start to think about designing the database, keep in mind that the description of operations provided leaves many possibilities for the design of entities and relationships, and that the specific information and transaction requirements drive the design.

To help ensure that you consider some of these possibilities, we have listed two for the employee classification. If there are few distinguishing characteristics between full-time and part-time employees, the situation may be handled by defining an attribute EMP_CLASS (whose values might be F or P) in the EMPLOYEE table. If full-time employees earn a base salary and part-time employees earn only an hourly wage, that problem can be handled by having two attributes, EMP_HOURPAY and EMP_BASE_PAY, in EMPLOYEE. Using this approach, EMP_HOURPAY is set to zero for the full-time employees, while EMP_BASE_PAY is set to zero for the part-time employees. Also, the application software selects either F or P, depending on the employee classification, to use the correct variables for pay computations.

On the other hand, if full-time and part-time employees are handled quite differently in terms of work scheduling, benefits, and so on, it would be better to use a supertype/subtype relationship between EMPLOYEE and the FULL_TIME and PART_TIME entities. The more unique attributes exist for salaried and part-time employees, the more sense a supertype/subtype relationship makes.
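The single-table option described in the note (an EMP_CLASS attribute plus EMP_BASE_PAY and EMP_HOURPAY, with the application software choosing the right variable) can be sketched as follows. This is only an illustration under stated assumptions: the function name period_pay, the pay period and the sample figures are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Employee:
    emp_id: int
    emp_class: str       # 'F' = full-time, 'P' = part-time
    emp_base_pay: float  # set to zero for part-time employees
    emp_hourpay: float   # set to zero for full-time employees


def period_pay(emp: Employee, hours_worked: float = 0.0) -> float:
    """Select the correct pay variable from the employee classification."""
    if emp.emp_class == "F":
        return emp.emp_base_pay
    if emp.emp_class == "P":
        return emp.emp_hourpay * hours_worked
    raise ValueError(f"unknown EMP_CLASS: {emp.emp_class!r}")


full_timer = Employee(1, "F", 2000.0, 0.0)
part_timer = Employee(2, "P", 0.0, 12.5)
print(period_pay(full_timer))      # base salary for the period
print(period_pay(part_timer, 20))  # hourly wage times hours worked
```

The branching on EMP_CLASS lives in application code here, which is the trade-off the note describes: the table stays simple, but every program that computes pay must apply the same rule. When the two groups diverge in many attributes, a supertype/subtype design moves that distinction into the data model.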
Once the basic requirements have been captured, the first ER model can be created. The ER model for the DVD rental store is presented in the next section.

11.1.2 Entity Relationship Modelling and Normalisation
Before creating the ER model, the designer must communicate and enforce appropriate standards to be used in the documentation of the design. The standards include the use of diagrams and symbols, documentation writing style, layout and any other conventions to be followed during documentation. Designers often overlook this very important requirement, especially when they are working as members of a design team. Failure to standardise documentation often means a failure to communicate later. And communications failures often lead to poor design work. In contrast, well-defined and enforced standards make design work easier and promise (but do not guarantee) a smooth integration of all system components.

Because the business rules usually define the nature of the relationship(s) among the entities, the designer must incorporate them into the conceptual model. The process of defining business rules and developing the conceptual model using ER diagrams can be described using the steps shown in Table 11.2.¹
TABLE 11.2 Developing the conceptual model using ER diagrams
Step  Activity
1     Identify, analyse and refine the business rules.
2     Identify the main entities, using the results of Step 1.
3     Define the relationships among the entities, using the results of Steps 1 and 2.
4     Define the attributes, primary keys and foreign keys for each of the entities.
5     Normalise the entities. (Remember that entities are implemented as tables in an RDBMS.)
6     Complete the initial ER diagram.
7     Have the main end users verify the model in Step 6 against the data, information and processing requirements.
8     Modify the ER diagram, using the results of Step 7.

¹ See Alice Sandifer and Barbara von Halle, Linking Rules to Models, Database Programming and Design, 4(3), March 1991, pp. 13-16. Although the source seems dated, it remains the current process standard. The technology has changed substantially, but the process has not.

Some of the steps listed in Table 11.2 take place concurrently. And some, such as the normalisation process, can generate a demand for additional entities and/or attributes, thereby causing the designer to revise the ER model. For example, while identifying two main entities, the designer might also identify the composite bridge entity that represents the many-to-many relationship between those two main entities. To review, suppose you are creating a conceptual model for the JollyGood Movie Rental Company, whose end users want to track customers' movie rentals. The simple ER diagram presented in Figure 11.2 shows a composite entity that helps track customers and their DVD rentals. Business rules define the optional nature of the relationships between the entities DVD and CUSTOMER depicted in Figure 11.2. For example, customers are not required to check out a DVD. A DVD need not be checked out in order to exist on the shelf. A customer may rent many DVDs, and a DVD may be rented by many customers. In particular, note the composite RENTAL entity that connects the two main entities.
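The composite RENTAL entity that links CUSTOMER and DVD can be illustrated with a small sqlite3 sketch. The attribute names (CUS_ID, DVD_ID, RENT_DATE) and the sample rows are assumptions made for the example; Figure 11.2 itself is not reproduced here, so this should be read as one plausible implementation of a many-to-many relationship with optional participation, not as the book's schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE CUSTOMER (CUS_ID INTEGER PRIMARY KEY, CUS_NAME TEXT NOT NULL);
CREATE TABLE DVD      (DVD_ID TEXT PRIMARY KEY, DVD_TITLE TEXT NOT NULL);
-- Composite (bridge) entity: its key is built from the keys of the entities it links.
CREATE TABLE RENTAL (
    CUS_ID    INTEGER NOT NULL REFERENCES CUSTOMER (CUS_ID),
    DVD_ID    TEXT    NOT NULL REFERENCES DVD (DVD_ID),
    RENT_DATE TEXT    NOT NULL,
    PRIMARY KEY (CUS_ID, DVD_ID, RENT_DATE)
);
""")
conn.executemany("INSERT INTO CUSTOMER VALUES (?, ?)",
                 [(1, "Ann"), (2, "Ben"), (3, "Cho")])
conn.executemany("INSERT INTO DVD VALUES (?, ?)",
                 [("D1", "Star Wars"), ("D2", "Beauty and the Beast"), ("D3", "Toy Story")])
conn.executemany("INSERT INTO RENTAL VALUES (?, ?, ?)",
                 [(1, "D1", "2020-01-05"), (1, "D2", "2020-01-05"), (2, "D1", "2020-01-12")])

# Optional participation: a customer with no rentals still appears, with a count of 0.
rentals_per_customer = conn.execute("""
    SELECT C.CUS_NAME, COUNT(R.DVD_ID)
    FROM CUSTOMER C LEFT JOIN RENTAL R ON R.CUS_ID = C.CUS_ID
    GROUP BY C.CUS_ID
    ORDER BY C.CUS_ID
""").fetchall()
print(rentals_per_customer)
```

Customer 3 and DVD D3 exist without any RENTAL row, which mirrors the rules that a customer need not check out a DVD and a DVD need not be checked out to exist on the shelf.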
FIGURE 11.2 A composite entity
As you will likely discover, the initial ER model may be subjected to several revisions before it meets the system's requirements. Such a revision process is quite natural. Remember that the ER model is a communications tool as well as a design blueprint. Therefore, the initial ER model should give rise to questions such as, "Is this really what you meant?" when you continue to meet with the proposed system users. For example, the ERD shown in Figure 11.2 is far from complete. Clearly, many more attributes must be defined and the dependencies must be checked before the design can be implemented. In addition, the design cannot yet support the typical DVD rental transactions environment. For example, each DVD is likely to have many copies available for rental purposes. However, if the DVD entity shown in Figure 11.2 is used to store the titles as well as the copies, the design triggers the data redundancies shown in Table 11.3.

TABLE 11.3 Data redundancies in the DVD table
DVD_ID         DVD_TITLE               DVD_COPY   DVD_CHG   DVD_DAYS
SF-12345FT-1   Star Wars               1          13.50     1
SF-12345FT-2   Star Wars               2          13.50     1
SF-12345FT-3   Star Wars               3          13.50     1
WE-5432GR-1    Beauty and the Beast    1          12.30     2
WE-5432GR-2    Beauty and the Beast    2          12.30     2

The initial ERD shown in Figure 11.2 must be modified to reflect the answer to the question, "Is more than one copy available for each title?" Also, payment transactions must be supported.

From the preceding discussion, you might get the impression that ER modelling activities (entity/attribute definition, normalisation and verification) take place in a precise sequence. In fact, once you have completed the initial ER model, chances are that you will move back and forth among the activities until you are satisfied that the ER model accurately represents a database design that is capable of meeting the required system demands. The activities often take place in parallel. (The ER modelling process is iterative.) Figure 11.3 summarises the ER modelling process interactions. Figure 11.4 summarises the array of design tools and information sources that the designer can use to produce the conceptual model.
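The redundancy shown in Table 11.3 (title, charge and rental days repeated for every copy) disappears once titles and copies are stored separately. The sketch below splits the five rows of Table 11.3 into a TITLE table and a DVD_COPY table. The table names, the TITLE_CODE and COPY_NUM attributes, and the assumption that each DVD_ID ends in "-<copy number>" are illustrative choices, not part of the text.

```python
import sqlite3

# Rows as listed in Table 11.3: (DVD_ID, DVD_TITLE, DVD_COPY, DVD_CHG, DVD_DAYS)
rows = [
    ("SF-12345FT-1", "Star Wars", 1, 13.50, 1),
    ("SF-12345FT-2", "Star Wars", 2, 13.50, 1),
    ("SF-12345FT-3", "Star Wars", 3, 13.50, 1),
    ("WE-5432GR-1", "Beauty and the Beast", 1, 12.30, 2),
    ("WE-5432GR-2", "Beauty and the Beast", 2, 12.30, 2),
]

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE TITLE (
    TITLE_CODE TEXT PRIMARY KEY,
    DVD_TITLE  TEXT NOT NULL,
    DVD_CHG    REAL NOT NULL,
    DVD_DAYS   INTEGER NOT NULL
);
CREATE TABLE DVD_COPY (
    TITLE_CODE TEXT    NOT NULL REFERENCES TITLE (TITLE_CODE),
    COPY_NUM   INTEGER NOT NULL,
    PRIMARY KEY (TITLE_CODE, COPY_NUM)
);
""")

for dvd_id, title, copy_num, chg, days in rows:
    # Assumption: the part of DVD_ID before the last hyphen identifies the title.
    title_code = dvd_id.rsplit("-", 1)[0]
    conn.execute("INSERT OR IGNORE INTO TITLE VALUES (?, ?, ?, ?)",
                 (title_code, title, chg, days))
    conn.execute("INSERT INTO DVD_COPY VALUES (?, ?)", (title_code, copy_num))

title_count = conn.execute("SELECT COUNT(*) FROM TITLE").fetchone()[0]
copy_count = conn.execute("SELECT COUNT(*) FROM DVD_COPY").fetchone()[0]
print(title_count, copy_count)
```

Each title's charge and rental period is now stored once, so a price change touches a single row, while the question "Is more than one copy available for each title?" is answered by counting DVD_COPY rows per TITLE_CODE.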
FIGURE 11.3 ER modelling is an iterative process based on many activities
[Figure labels: Database initial study; Data analysis; User views and business rules; DBLC processes and database transactions; Initial ER model; Attributes; Normalisation; Verification; Final ER model]

FIGURE 11.4 Conceptual design tools and information sources
[Figure labels: Information sources (business rules and data constraints; data flow diagrams, DFD*; process functional descriptions, FD*, user views); Design tools (ER diagram; normalisation; data dictionary); Conceptual model (ERD; definition and validation)]
* Output generated by the systems analysis and design activities
All objects (entities, attributes, relations, views and so on) are defined in a data dictionary, which is used in tandem with the normalisation process to help eliminate data anomalies and redundancy problems. During this ER modelling process, the designer must:
Define entities, attributes, primary keys and foreign keys. (The foreign keys serve as the basis for the relationships among the entities.)
Make decisions about adding new primary key attributes to satisfy end-user and/or processing requirements.
Make decisions about the treatment of multivalued attributes.
Make decisions about adding derived attributes to satisfy processing requirements.
Make decisions about the placement of foreign keys in 1:1 relationships. (If necessary, review the supertype/subtype relationships in Chapter 6, Advanced Data Modelling Concepts.)
Avoid unnecessary ternary relationships.
Draw the corresponding ER diagram.
Normalise the entities.
Include all data element definitions in the data dictionary.
Make decisions about standard naming conventions.

The naming conventions requirement is important, yet it is frequently ignored by designers at their peril. Real database design is generally accomplished by teams. Therefore, it is important to ensure that the team members work in an environment in which naming standards are defined and enforced. Proper documentation is crucial to the successful completion of the design. Therefore, it is very useful to establish procedures that are, in effect, self-documenting.

Although some useful entity and attribute naming conventions were established in Chapter 5, Data Modelling with Entity Relationship Diagrams, they will be revisited in greater detail here. This book uses naming conventions that are likely to be acceptable across a reasonably broad range of DBMSs and will meet self-documentation requirements to the greatest extent possible. As the older DBMSs fade from the scene, the naming conventions will be more broadly applicable. You should try to adhere to the following conventions:
Use descriptive entity and attribute names wherever possible. For example, in a travel agency database, the CUSTOMER entity contains personal information about any customer who makes a booking, and the HOTEL entity may be related to the hotel that the customer had booked.
Composite entities usually are assigned a name that describes the relationship they represent. For example, in the travel agency database, a TOUR may have many BOOKINGs made for it and a BOOKING may consist of many TOURs. Therefore, the composite (bridge) entity that links TOUR and BOOKING will be named TOUR_BOOKING. Occasionally, the designer finds that it is necessary to show what entities are being linked by the composite entity. In such cases, the composite entity name may borrow segments of those entity names. For example, in Tiny University, STU_CLASS may be the composite entity that links STUDENT and CLASS. However, that naming convention might make the next one discussed more cumbersome, so it should be used sparingly. A better choice would be the composite entity name ENROL, to indicate that a STUDENT enrols in a CLASS.
An attribute name should be descriptive, and it should contain a prefix that helps identify the table in which it is found. For the purposes of this design, the maximum prefix length will be five characters. For example, the VENDOR table might contain attributes such as VEND_ID and VEND_PHONE. Similarly, the ITEM table might contain attribute names such as ITEM_ID and ITEM_DESCRIPTION. The advantage of that naming convention is that it immediately identifies a table's foreign key(s). For example, if the EMPLOYEE table contains attributes such as EMP_ID, EMP_LNAME and DEPT_CODE, it is immediately obvious that DEPT_CODE is the foreign key that probably links EMPLOYEE to DEPARTMENT. Naturally, the existence of relationships and table names that start with the same characters might dictate that you bend this naming convention occasionally, as you can see in the next point.
If one table is named ORDER and its weak counterpart is named ORDER_ITEM, the prefix ORD will be used to indicate an attribute originating in the ORDER table. The ITEM prefix will identify an attribute originating in the ITEM table. Clearly, you cannot use ORD as a prefix to the attributes originating in the ORDER_ITEM table, so you should use a combination of characters, such as OI, as the prefix to the ORDER_ITEM attribute names. In spite of that limitation, it is generally possible to assign prefixes that identify an attribute's origin. (Keep in mind that some RDBMSs use a reserved word list. For example, ORDER may be interpreted as a reserved word in a SELECT statement. In that case, you should not use ORDER as a table name.)
As you can tell, it is not always possible to adhere strictly to the naming conventions. Sometimes the requirement to limit name lengths makes the attribute or entity names less descriptive. Also, with a large number of entities and attributes in a complex design, you might have to be somewhat inventive about using proper attribute name prefixes. But then those prefixes are less helpful in identifying the precise source of the attribute. Nevertheless, the consistent use of prefixes will reduce sourcing doubts significantly. For example, while the prefix CO does not obviously relate to the CHECK_OUT table, it is just as obvious that it does not originate in WITHDRAW, ITEM or USER.

Example: Entity Relationship Modelling for the DVD Rental Store
Let us revisit our DVD rental store, for which we have gathered the basic requirements. Now examine the following additional requirements:
DVDs are classified according to their TYPE (Comedy, Family, Documentary, Action and New Release), so a new entity called TYPE must be created to prevent data redundancy.
The shop assistant must be able to find customers' requests quickly. This requirement is met by creating an easy way to query the DVD data (by name, type, etc.) while entering the RENTAL data.
The shop assistant must be able to check quickly whether or not the return is late and to assess the appropriate late return fee. This requirement is met by including attributes such as the expected return date, the actual return date and late fees in the RENTAL entity. Note that there is no need to create a new entity nor a new relationship. Keep in mind that some requirements are easily met by adding attributes in the tables and by combining those attributes through an application program that enforces the business rule. Remember that not all business rules can be represented in the database conceptual diagram.
The store owner wants to be able to keep track of all employee work time and payroll data. This requirement will require the creation of an EMPLOYEE entity. Here we need two additional entities: WORK_SCHEDULE and WORK_LOG, which will show the employees' work schedule and the actual times worked, respectively. These entities also help us generate the payroll report.
The description also specifies some of the expected reports that the database eventually produces:
End-of-period report of the number of rentals by type. This report will use the RENTAL and DVD entities to generate all the rental data for some specified period of time.
Revenue report by title and by type. This report will use the RENTAL, DVD and TYPE entities to generate all rental data.
Periodic inventory reports. This report will use the DVD and TYPE entities.
Titles on order. This report will use the ORDER, DVD and TYPE entities.
Employee work times and payroll data. This report will use the EMPLOYEE, WORK_SCHEDULE and WORK_LOG entities.

After analysing all the requirements gathered so far, a first draft of the ER model is shown in Figure 11.5.

FIGURE 11.5 ERD of the DVD rental store
Although there is a temptation to create FULL_TIME and PART_TIME entities, which are then related to WORK_LOG and WORK_SCHEDULE, respectively, such a decision reflects a substitution of an entity for an attribute. It is far better to simply create an attribute, perhaps named EMP_TYPE, in the EMPLOYEE entity. The EMP_TYPE attribute values can then be P = part-time or F = full-time. The applications software can then be used to force entries into the WORK_LOG and WORK_SCHEDULE entities, depending on the EMP_TYPE attribute value. In addition, the WORK_LOG entity requires two additional attributes, named LOG_DATE_IN and LOG_DATE_OUT, to record the dates and times when all employees logged in or out. With only the entity's LOG_DATE attribute, there is no way to determine the hours worked by each employee. In addition, if you want to track the time worked by part-time employees, it is necessary to check the work log against the work schedules produced by the system.

At this point, the ERD has not yet been verified against the transaction requirements. For example, the model is incapable of tracking which specific video has been rented by a customer. If five customers rent copies of the same video, you don't know who has which copy. Similarly, there is no way to determine who has rented a video, and when. Clearly, the proposed data model cannot yet be used. You will learn how the data model is verified as you work through the process of verification in the next section.
11.1.3 Data Model Verification
The ER model must be verified against the proposed system processes in order to corroborate that the intended processes can be supported by the database model. Verification requires that the model be run through a series of tests against the following:
End-user data views and their required transactions: SELECT, INSERT, UPDATE and DELETE data manipulation commands (see Chapter 8, Beginning Structured Query Language).
Access paths and security.
Business-imposed data requirements and constraints.
Revision of the original database design starts with a careful re-evaluation of the entities, followed by a detailed examination of the attributes that describe those entities. This process serves several important purposes:
The emergence of the attribute details may lead to a revision of the entities themselves. Perhaps some of the components first believed to be entities will, instead, turn out to be attributes within other entities. Or what was originally considered to be an attribute may turn out to contain a sufficient number of subcomponents to warrant the introduction of one or more new entities.
The focus on attribute details can provide clues about the nature of relationships as they are defined by the primary and foreign keys. Improperly defined relationships lead to implementation problems first and to application development problems later.
To satisfy processing and/or end-user requirements, it might be useful to create a new primary key to replace an existing primary key. For example, in the invoicing example that you learnt about in Chapter 3, Relational Model Characteristics (illustrated in Figure 3.18), a primary key composed of INV_NUMBER and LINE_NUMBER replaced the original primary key composed of INV_NUMBER and PROD_CODE. That change ensured that the items in the invoice would always appear in the same order as they were entered.
To simplify queries and to increase processing speed, you may create a single-attribute primary key to replace an existing multiple-attribute primary key.
Unless the entity details (the attributes and their characteristics) are precisely defined, it is difficult to evaluate the extent of the design's normalisation. Knowledge of the normalisation levels helps guard against undesirable redundancies.
A careful review of the rough database design blueprint is likely to lead to revisions. Those revisions will help ensure that the design is capable of meeting end-user requirements.
Because real-world database design is generally done by teams, you should strive to organise the design's major components into modules. (A module is an information system component that handles a specific function, such as inventory, orders, payroll and so on. At the design level, a module is an ER segment that is an integrated part of the overall ER model.) Creating and using modules accomplishes several important ends:
The modules (and even the segments within them) can be delegated to design groups within teams, greatly speeding up the development work.
The modules simplify the design work. The large number of entities within a complex design can be daunting. Each module contains a more manageable number of entities.
The modules can be prototyped quickly. Implementation and applications programming trouble spots can be identified more readily. (Quick prototyping is also a great confidence builder.)
Even if the entire system can't be brought online quickly, the implementation of one or more modules will demonstrate that progress is being made and that at least part of the system is ready to begin serving the end users.
As useful as modules are, they represent ER model fragments. Fragmentation creates a potential problem: the fragments may not include all of the ER model's components and may, therefore, not be able to support all of the required processes. To avoid that problem, the modules must be verified against the complete ER model. That verification process is detailed in Table 11.4.
TABLE 11.4 The ER model verification process
Step   Activity
1      Identify the ER model's central entity.
2      Identify each module and its components.
3      Identify each module's transaction requirements:
       Internal: Updates/Inserts/Deletes/Queries/Reports
       External: Module interfaces
4      Verify all processes against the ER model.
5      Make all necessary changes suggested in Step 4.
6      Repeat Steps 2-5 for all modules.
Keep in mind that the verification process requires the continuous verification of business transactions as well as system and user requirements. The verification sequence must be repeated for each of the system's modules. Figure 11.6 illustrates the iterative nature of the process.
FIGURE 11.6 Iterative ER model verification process
(Figure elements: Identify processes; Define transaction steps; Verify results; Change process; Change ER model.)
The verification process starts with selecting the central (most important) entity. The central entity is defined in terms of its participation in most of the model's relationships, and it is the focus for most of the system's operations. In other words, to identify the central entity, the designer selects the entity involved in the greatest number of relationships. In the ER diagram, it is the entity that has more lines connected to it than any other.
The next step is to identify the module or subsystem to which the central entity belongs and to define that module's boundaries and scope. The entity belongs to the module that uses it most frequently. Once each module is identified, the central entity is placed within the module's framework to let you focus your attention on the module's details.
Within the central entity/module framework, you must:
Ensure the module's cohesivity. The term cohesivity describes the strength of the relationships found among the module's entities. A module must display high cohesivity; that is, the entities must be strongly related, and the module must be complete and self-sufficient.
Analyse each module's relationships with other modules to address module coupling. Module coupling describes the extent to which modules are independent of one another. Modules must display low coupling, indicating that they are independent of other modules. Low coupling decreases unnecessary intermodule dependencies, thereby allowing the creation of a truly modular system and eliminating unnecessary relationships among entities.
Note: One of the challenges that database designers have is to achieve modules with high cohesion and low coupling. Often, the quest to create highly cohesive modules results in high coupling. Decreasing coupling has the reverse effect, resulting in low cohesion. Finding the right balance is a key part of the database designer's job.
Processes may be classified according to their:
Frequency (daily, weekly, monthly, yearly, or exceptions).
Operational type (INSERT or ADD, UPDATE or CHANGE, DELETE, queries and reports, batches, maintenance and backups).
All identified processes must be verified against the ER model. If necessary, appropriate changes are implemented. The process verification is repeated for all of the model's modules. You can expect that additional entities and attributes will be incorporated into the conceptual model during its validation.
At this point, a conceptual model has been defined as hardware- and software-independent. Such independence ensures the system's portability across platforms. Portability may extend the database's life by making it possible to migrate to another DBMS and/or another hardware platform.
Example: Applying Data Model Verification to the DVD Rental Store
Applying the verification process described in Table 11.4 to the conceptual model for the DVD rental store from the previous section produces the verified data model as shown in Figure 11.7.
FIGURE 11.7 Verified ER model for the DVD rental store
As you examine
the
components
All relationships LINE. (The factor
in
are read from
natural tendency
an
can
and
out
COPY
clearing
FK in
the
vary
from
are likely the
ORD_
to
keep
the
PKs
COPY
than in
movie, each
COPY not a composite
a requirement
that
combination
ORD_LINE.
the
FK in
vendors
that
the
COPY
and
vendor.
entity? entity
by the
PK of the
of DVD_ID
is
RENT_LINE.
entity
is the
COPY_CODE).
Therefore,
However, if the
individual
when
PK is referenced
goes to a particular
of the
will be the
are
entitys
PK. (Note
than the
track
may reside
be found the
The
of a given
order
the
VEND_
goes to a clearing
supplied
the
movies
to the
ORD_LINE.
Database Design in to
different
in
different must
design
physical locations.
another.
designer
For
example,
physical also
complications
1, The Database
11.2
ORDER contains
are 12 copies
So, why is
single-attribute
each order
one location
system,
database.
Chapter
Therefore,
points:
or from left to right is not the governing
movie. If there
entities.
case,
rather
VEND_CODE
of a database
process across
entity.
bottom
a single-attribute
that
want to
11.1.4 Distributed
may also
why
ORDER, rather
still
house,
Portions
of
have
to assume
you
of each
In this
COPY_CODE,
It is reasonable
and
top to
are composite
entity.
must
attribute
CODE is the
copies
example
by another
Therefore,
house
parent to the related
the following
separately.
RENT_LINE
an excellent
referenced
single
model in Figure 11.7, note particularly
to read from
individual
be rented
ORD_LINE Here is
the
ERD
ERD!)
We can now track copy
of the
locations.
develop
If the
the
introduced
by
Processes
that
process
and
a retail
data
database
and
the
is to
are
storage
be distributed
allocation
processes
database
warehouse
process
distribution
distributed
access a
strategies
examined
in
for
detail
in
Approach.
LOGICAL DATABASE DESIGN
11 When the
business
conceptual
design
rules that, in turn,
and constraints.
(Remember
enforced
application
within
at the five
definition
days
thus
attributes
that
requiring
the
For
The
to
meet the requisite
were used concurrently.
the
each
of the
entities
models
ERD.
normalisation such
entities
requirements.
item
must
are required
focus
to
was
In short,
design
use of design
the
before
they
can
and relationships,
on the
verified
be returned
includes
meet information
entities
models
the
cardinalities
model
must be normalised
data
level
and are, therefore,
conceptual
additional
the
the
modelled
checked-out
and that
Because
conceptual
connectivities,
be
the
may yield
design,
concurrent
a
addition,
process
initial
cannot
constraint In
conceptual
at the
optionalities,
elements
ERD.)
an implementable
In fact,
ERD reflects
the
normalisation of the
produce
design
example, in
the
relationships,
of the
describe
modification
design
completed,
some
mind that the
implemented.
conceptual
certain to
that
be reflected
Keep in
properly
is
define the entities,
level.
cannot
of the
requirements. be
phase
verification
in this
and normalisation
and normalisation
of the
chapter
were
processes
reflects
real-world
practice.
The second stage in the database design cycle is known as logical database design. The aim of the logical design stage is to map the conceptual model into a logical model that can then be implemented on a relational DBMS. The logical design stage consists of the following phases:
1 Creating the logical data model
2 Validating the logical data model using normalisation
3 Assigning and validating integrity constraints
4 Merging logical models constructed for different parts of the database
5 Reviewing the logical data model with the user
Next, you will learn about each of these phases in detail, which will help you to build a successful logical data model.
11.2.1 Creating the Logical Data Model
The first stage of logical database design is to translate the conceptual design into a set of relational constructs. This involves converting the ER model from the conceptual design phase into a set of relations, attributes and relationships using a set of rules. So far, this sounds quite straightforward; however, the relations must be created whilst at the same time meeting the required integrity constraints. To create a set of relations, let's now look at the steps required in detail.
Step 1: Create Relations for Strong Entities
This rule transforms all regular entities (e.g. entities with no dependents) in the ER diagram into relations. A relation must be created for each strong entity, containing all of the attributes of the entity. Usually, the name of the relation is specified first, followed by its associated attributes enclosed in brackets. Next, the primary key attribute(s) is (are) identified.
Figure 11.8 shows the entity DVD from the DVD rental store ER diagram and the corresponding DVD relation. Each attribute in the entity becomes an attribute in the relation. Notice that the primary key is indicated by underlining the attribute DVD_ID.
FIGURE 11.8 Transforming the strong entity DVD into the DVD relation
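As a rough illustration of Step 1, the DVD relation could be declared as a single table whose primary key is DVD_ID. This is only a sketch: the attribute names are taken from the DVD relation's domain table later in this chapter, while the SQL types, the use of Python's sqlite3 module and the sample row are assumptions made for the example.

```python
import sqlite3

# In-memory database; column names follow the DVD relation,
# but the SQL types are assumptions chosen for illustration.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE DVD (
        DVD_ID           TEXT PRIMARY KEY,  -- the underlined attribute
        DVD_NAME         TEXT,
        DVD_COPIES       INTEGER,
        DVD_CHARGE       REAL,
        DVD_LATE_CHG_DAY REAL,
        CATEGORY         TEXT
    )
    """
)

# Hypothetical sample row, used only to show the relation in use.
conn.execute(
    "INSERT INTO DVD VALUES (?, ?, ?, ?, ?, ?)",
    ("D000000001", "Example Movie", 3, 10.0, 0.5, "Family"),
)

# PRAGMA table_info returns one row per column; index 1 is the
# column name and index 5 is non-zero for primary key columns.
pk_columns = [row[1] for row in conn.execute("PRAGMA table_info(DVD)") if row[5]]
print(pk_columns)
```

Each attribute of the strong entity maps to exactly one column, and no foreign keys are needed because a strong entity does not depend on any other entity.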
Step 2: Create Relations for Weak Entities
In Chapter 5, Data Modelling with Entity Relationship Diagrams, we defined a weak entity as being existence-dependent; that is, it cannot exist without the entity with which it has a relationship. In addition, a weak entity has a primary key that is partially or totally derived from the parent entity in the relationship. Step 2 states that, for each weak entity, a new relation must be created that includes all attributes of the entity. However, the primary key of the weak entity relation cannot be established until all the foreign key relationships with the owner entities have been identified. To do this, the primary key attribute of each owner entity is included in the new relation as a foreign key attribute. The primary key of the new relation then becomes a composite key through combining the primary key of the owner entity and the partial identifier of the weak entity.
An example of transforming the weak entity RENT_LINE from the DVD Rental Store appears in Figure 11.9. The attributes RENT_NUM and RENTLINE_NUM are underlined to indicate the composite primary key on the RENT_LINE relation. You will also notice that foreign keys on the new relations are indicated by a * after the attribute.
FIGURE 11.9 Example of mapping the weak entity RENT_LINE
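The weak-entity rule can be sketched in code as follows. The example assumes that the owner of RENT_LINE is a RENTAL relation keyed by RENT_NUM; the RENT_DATE attribute, the SQL types and the sample values are hypothetical and only serve the illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when enabled
conn.executescript(
    """
    -- Owner (parent) relation; RENT_DATE is a hypothetical attribute.
    CREATE TABLE RENTAL (
        RENT_NUM  INTEGER PRIMARY KEY,
        RENT_DATE TEXT
    );
    -- Weak entity: owner key (RENT_NUM) plus partial identifier (RENTLINE_NUM).
    CREATE TABLE RENT_LINE (
        RENT_NUM     INTEGER NOT NULL REFERENCES RENTAL (RENT_NUM),
        RENTLINE_NUM INTEGER NOT NULL,
        PRIMARY KEY (RENT_NUM, RENTLINE_NUM)
    );
    """
)
conn.execute("INSERT INTO RENTAL VALUES (1001, '2020-01-15')")
conn.execute("INSERT INTO RENT_LINE VALUES (1001, 1)")
conn.execute("INSERT INTO RENT_LINE VALUES (1001, 2)")

# A rental line cannot exist without its owning rental.
try:
    conn.execute("INSERT INTO RENT_LINE VALUES (9999, 1)")
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True
print(orphan_rejected)
```

The composite primary key combines the owner's key with the partial identifier, and the foreign key on RENT_NUM is what makes the line existence-dependent on its rental.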
Validation of foreign keys occurs later on in the logical database design phase when we assign and validate integrity constraints.
Step 3: Map Multivalued Attributes
As part of the ER verification process, attributes that contain multiple values are identified and the database designer will have either:
created several new attributes, one for each of the original multivalued attribute's components, or
created a new entity composed of the original multivalued attribute's components.
Therefore, at logical database design, such attributes should not exist in the ERD. However, if they do, then for each multivalued attribute that is found within an entity, create a new relation. The new relation should have a foreign key, which is the primary key from the original entity. The primary key of the new relation is a composite key comprising the primary key of the original entity and one or more attributes of the multivalued attribute itself.
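A minimal sketch of this rule, using the CAR and CAR_COLOUR names from this step's example, might look like the following. The text does not name CAR's primary key, so CAR_REG and CAR_MODEL are hypothetical, as are the SQL types and the sample data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript(
    """
    -- CAR_REG and CAR_MODEL are hypothetical; the text does not name CAR's key.
    CREATE TABLE CAR (
        CAR_REG   TEXT PRIMARY KEY,
        CAR_MODEL TEXT
    );
    -- One row per (car, section) replaces the multivalued attribute.
    CREATE TABLE CAR_COLOUR (
        CAR_REG     TEXT NOT NULL REFERENCES CAR (CAR_REG),
        COL_SECTION TEXT NOT NULL,
        COL_COLOURS TEXT NOT NULL,
        PRIMARY KEY (CAR_REG, COL_SECTION)
    );
    """
)
conn.execute("INSERT INTO CAR VALUES ('ABC123', 'Hatchback')")
conn.executemany(
    "INSERT INTO CAR_COLOUR VALUES (?, ?, ?)",
    [("ABC123", "Body", "Red"), ("ABC123", "Roof", "Black"), ("ABC123", "Trim", "Silver")],
)

# All colour values for one car are now ordinary rows in the new relation.
colours = conn.execute(
    "SELECT COL_SECTION, COL_COLOURS FROM CAR_COLOUR"
    " WHERE CAR_REG = ? ORDER BY COL_SECTION",
    ("ABC123",),
).fetchall()
print(colours)
```

Here the composite key is made up of the original entity's key plus COL_SECTION; choosing which attribute of the multivalued group joins the key is a design decision that depends on the data.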
Supposing we created an entity called CAR, where CAR_COLOUR was a multivalued attribute comprising attributes containing information about the different colours (COL_COLOURS) that are used on different sections of the car (COL_SECTION). Figure 11.10 shows how a new entity called CAR_COLOUR is used to represent this multivalued attribute and the relations that are created.
FIGURE 11.10 Example of mapping multivalued attributes
Step 4: Map Binary Relations
One-to-many (1:*) Relationships
For each 1:* relationship, create the relations for each of the two entities that are participating in the relationship. To create the foreign key on the many side, include the primary key attribute from the one side. The one side is referred to as the parent table and the many side is referred to as the dependent table. For example, Figure 11.11 shows a portion of the ER diagram which represents the pays one-to-many relationship between CUSTOMER and RENTAL and the corresponding relations that are created.
FIGURE 11.11 Example of mapping a 1:* relationship
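The 1:* rule can be sketched with the CUSTOMER and RENTAL relations. The attribute names CUS_NUM and CUS_NAME, the SQL types and the sample rows are assumptions for illustration only; the point is simply that the parent's primary key reappears on the many side as a foreign key.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript(
    """
    -- Parent (one side); attribute names are hypothetical.
    CREATE TABLE CUSTOMER (
        CUS_NUM  INTEGER PRIMARY KEY,
        CUS_NAME TEXT
    );
    -- Dependent (many side) carries the parent's key as a foreign key.
    CREATE TABLE RENTAL (
        RENT_NUM INTEGER PRIMARY KEY,
        CUS_NUM  INTEGER NOT NULL REFERENCES CUSTOMER (CUS_NUM)
    );
    """
)
conn.execute("INSERT INTO CUSTOMER VALUES (1, 'A. Customer')")
conn.executemany("INSERT INTO RENTAL VALUES (?, ?)", [(1001, 1), (1002, 1)])

# One customer row is referenced by many rental rows.
rentals_per_customer = conn.execute(
    "SELECT CUS_NUM, COUNT(*) FROM RENTAL GROUP BY CUS_NUM"
).fetchall()
print(rentals_per_customer)
```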
One-to-one (1:1) Relationships
1:1 relationships are a special kind of relationship and each one has to be treated individually depending on the participation constraints between the two entities:
If both entities are in a mandatory participation in a relationship and they do not participate in other relationships, it is most likely that the two entities should be part of the same entity.
If there is
mandatory
optional becomes
the
foreign If
key
both
role
becomes
dependent of the
entities
of the
are in
entity
to
of the
both sides key
LECTURER
would
SCHOOL
FIGURE 11.12
hard
entity that
has the
the
to
dependent
parent
has the
mandatory
entity
determine
entity.
To
be determined,
relationship
should
which make
perhaps
can be obtained,
are
both
again
mandatory,
in
either
Figure
then
create
deans
of this the
contain
the
would
take
a decision,
through
it is
the
more
the identification
up to the
database
key is
Each
new relation
mandatory
would
with two
exist
have
and optional
to
must
have
LECTURER
is
LECTURER
between at each
two
copies
participations
copies
of the
in this
between
a lecturer a
to
also shown in Figure
placed
can
constraints
the relationship
school
so the
is
relationship participation
or
primary
relationship.
SCHOOL
1:1 relationship
foreign
the
consider
of a school,
the
at the
one relation
the
11.12. from
(e.g.
look
of both
to represent
relationship
are the
participant,
is recursive
would
comprises
you could
shown
all lecturers
entities you
map a 1:1 relationship,
the
one. The mapping optional
then the that
to
it is
information
two
then
new relation
Therefore,
not
optional is the
we
and
school.
the
entity),
then
a further
how
be the
no further
If they
are optional,
To illustrate
However,
would
then
would have to
key. If the relationship
or create
entity
corresponding
participation,
which
between same
relationship.
primary
and the
decision.
1:1 relationship
of the
The relation
However, if
of the
entity
entity.
and
make the
occurrences side
parent
about the relationship
designer
of the
entity.
an optional
more attributes.
If the
on one side of the relationship, the
dependent
parent
information of
participation
participation
entities
who is the
mandatory
SCHOOL
11.12.
the
dean
participation.
relationship
Notice that
is
an
as SCHOOL
relation.
Example of mapping a 1:1 relationship
NOTE
Chapter 6, Data Modelling Advanced Concepts, contains more about design issues of implementing 1:1 relationships.
Many-to-many (*:*) Relationships
For each *:* relationship, create the relations for each of the two entities that are participating in the relationship. Then create a third relation to represent the actual relationship. The third relation will contain the foreign keys of the two original entities that participate in the *:* relationship.
Figure 11.13 shows a *:* relationship between the two entities TOUR and ATTRACTION. As you examine Figure 11.13, you can see that the relationship between the entities is represented by the composite entity called ATTRACT_TOUR, which contains foreign keys to both ATTRACTION and TOUR. Figure 11.13 also shows the corresponding relations.
FIGURE 11.13
Example of mapping a *:* relationship
(ER diagram showing ATTRACTION, ATTRACT_TOUR and TOUR, linked by the relationships may_contain and may_be_visited with multiplicities 1..1 and 0..*. In ATTRACT_TOUR, TOUR_ID is marked {PK}{FK1} and ATTRACTION_NO is marked {PK}{FK2}; in ATTRACTION, ATTRACTION_NO is {PK} and CITY_ID is {FK}; in TOUR, TOUR_ID is {PK}.)
TOUR Relation
TOUR{TOUR_ID, TOUR_NAME, TOUR_DESCRIPTION, TOUR_PRICE_ADULT, TOUR_PRICE_CHILD, TOUR_PRICE_CON}
ATTRACTION Relation
ATTRACTION{ATTRACTION_NO, CITY_ID*, ATTRACT_TYPE, ATTRACT_NAME, ATTRACT_WEBSITE, ATTRACT_PHONE, ATTRACT_OPENING_TIME, ATTRACT_CLOSING_TIME, ATTRACT_ADDRESS, ATTRACT_TRAVEL_INSTRUCT, ATTRACT_COST_ADULT, ATTRACT_COST_CHILD, ATTRACT_COST_CON}
ATTRACT_TOUR Relation
ATTRACT_TOUR{ATTRACTION_NO*, TOUR_ID*}
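The three relations above can be sketched as tables, with ATTRACT_TOUR acting as the bridge between TOUR and ATTRACTION. Only a subset of the listed attributes is included to keep the example short; the SQL types, the sample rows and the helper function attractions_on are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript(
    """
    CREATE TABLE TOUR (
        TOUR_ID   INTEGER PRIMARY KEY,
        TOUR_NAME TEXT
    );
    CREATE TABLE ATTRACTION (
        ATTRACTION_NO INTEGER PRIMARY KEY,
        ATTRACT_NAME  TEXT
    );
    -- Bridge relation: both foreign keys together form the primary key.
    CREATE TABLE ATTRACT_TOUR (
        ATTRACTION_NO INTEGER NOT NULL REFERENCES ATTRACTION (ATTRACTION_NO),
        TOUR_ID       INTEGER NOT NULL REFERENCES TOUR (TOUR_ID),
        PRIMARY KEY (ATTRACTION_NO, TOUR_ID)
    );
    """
)
conn.executemany("INSERT INTO TOUR VALUES (?, ?)", [(1, "City tour"), (2, "Coast tour")])
conn.executemany(
    "INSERT INTO ATTRACTION VALUES (?, ?)",
    [(10, "Museum"), (20, "Harbour"), (30, "Lighthouse")],
)
conn.executemany(
    "INSERT INTO ATTRACT_TOUR VALUES (?, ?)",
    [(10, 1), (20, 1), (20, 2), (30, 2)],
)


def attractions_on(tour_id):
    """Return the names of the attractions visited on one tour."""
    rows = conn.execute(
        "SELECT a.ATTRACT_NAME FROM ATTRACTION a"
        " JOIN ATTRACT_TOUR t ON t.ATTRACTION_NO = a.ATTRACTION_NO"
        " WHERE t.TOUR_ID = ? ORDER BY a.ATTRACTION_NO",
        (tour_id,),
    ).fetchall()
    return [name for (name,) in rows]


print(attractions_on(1), attractions_on(2))
```

An attraction such as the hypothetical Harbour can appear on several tours, and a tour can include several attractions, yet each table still holds only 1:* links to the bridge relation.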
NOTE
Remember that, at the conceptual level, *:* relationships may have been identified and then resolved into 1:* relationships and composite entities during the ER model verification phase. Therefore, *:* relationships should not exist in the ERD. However, if not, then all *:* relationships should be mapped to relations during the logical database design process.
Step 5: Map Ternary Relations
Ternary relationships and relationships of higher degrees may also exist in the ERD. In the case of a ternary relationship amongst three entities, a fourth relation must be created to represent the relationship amongst the entities. The primary keys of each of the three entities become foreign keys in the fourth relation and may also form the composite primary key, along with any additional attributes.
Step 6: Map Supertype and Subtype Relationships
There is no single set of rules available for mapping supertype and subtype relationships into a set of relations. Database designers have come up with several different techniques, which depend on a number of factors. One consideration is whether individual subtypes participate in further relationships with other entities that are not part of the inheritance hierarchy. Another is the type of disjoint and
overlapping options
constraints
that
exist
Option
1:
Merge the
participation In this and
in the
case,
subtypes
determines
into the
the
supertype
and the
which
for the
rows
subtypes
whether
2:
subtypes
in the
Create
example,
the
relationship
subtype.
Two of the
most common
for
an attribute
corresponding
called
of attributes
each
Both subtype
only
and the
is
used
provide
frequency
if the
is created
case,
in
in
STUDENT
to
determine
P
or F
the
supertype
to
11.14
illustrates
participate
option
the
1 and
in
in
between
Other
a
the
mandatory
merge the supertype
mapped relation
discriminate
and
subtypes.
Figure
subtypes
a guide. of
supertype
example,
no overlapping
shown two
Notice that
which
options
overlap.
in the
PART_TIME_STUDENT
value
this
we could follow
PERSON.
PERSON_TYPE,
table.
in
Therefore,
In are
relationship
called
when
or (P)ART_TIME.
subtype.
These
For
subtypes
the
can
placed
STUDENT_TYPE
and there
STUDENT.
subtypes
is
subtype.
be assigned
and
relationship
PERSON.
that
was (F)ULL_TIME
supertype
and
one relation
database
FIGURE 11.14
in the
could
that
type (with
attribute
supertype/subtype
with the supertype
contains
each
to
This is suitable
and the
attribute
belong
additional
STUDENT
EMPLOYEE
into
table
one relation.
mandatory
was student
STUDENT_TYPE of
participate
PERSON
subtypes
number
merged.
is
an additional
database
then the
one relation
subtypes
create
STUDENT
instance
optionally
overlapping
and the
were
a particular
Option
to
supertype
and create
relationship
discriminator
and FULL_TIME_STUDENT), if the
supertype
supertype/subtype
use the
discriminator
For
between
are:
factors
Figure
11.14
subtypes
to
in
consider
the
are the
overlapping.
Example of mapping a supertype/subtype relationship
{AND, Mandatory}
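One of the mapping options for this step merges the subtypes into the supertype relation and adds a discriminator attribute. A minimal sketch of that idea, using the STUDENT example with a STUDENT_TYPE attribute holding P (part-time) or F (full-time), is given here; STUDENT_ID, STUDENT_NAME, the SQL types and the sample rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE STUDENT (
        STUDENT_ID   INTEGER PRIMARY KEY,  -- hypothetical key
        STUDENT_NAME TEXT,                 -- hypothetical attribute
        -- Discriminator: 'P' = part-time subtype, 'F' = full-time subtype.
        STUDENT_TYPE TEXT NOT NULL CHECK (STUDENT_TYPE IN ('P', 'F'))
    )
    """
)
conn.executemany(
    "INSERT INTO STUDENT VALUES (?, ?, ?)",
    [(1, "Student A", "F"), (2, "Student B", "P"), (3, "Student C", "F")],
)

# The discriminator selects the rows belonging to one subtype.
full_time = conn.execute(
    "SELECT STUDENT_ID FROM STUDENT WHERE STUDENT_TYPE = 'F' ORDER BY STUDENT_ID"
).fetchall()
print(full_time)
```

A single-valued discriminator like this suits subtypes that do not overlap; overlapping subtypes would need a different encoding, for example one flag per subtype.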
11.2.2 Validating the Logical Data Model Using Normalisation
As you will have seen in Chapter 7, Normalising Database Designs, the normalisation process helps to establish appropriate attributes, their characteristics, and their domains. Nevertheless, because the conceptual modelling process does not preclude the definition of attributes, you can reasonably argue that normalisation often straddles the line between conceptual and logical modelling. In creating the logical data model, we have so far created relations from the ERD and these relations should already be in third normal form. If they are not, then it is likely that we have made some mistake during the model verification process and it will be necessary to revisit the ERD model.
If normalisation has not been undertaken prior to logical database design, it can be used as an additional tool in which to verify the logical data model. After all, a normalised relational database schema will avoid certain anomalies when inserting, updating or deleting data and will therefore help to keep the data in the database consistent.
11.2.3 Validate Integrity Constraints
The next stage in logical database design is to check and validate the integrity constraints within the database. In Chapter 3, Relational Model Characteristics, you were introduced to the three main types of integrity constraints that we can impose on objects within the database: domain constraints, entity integrity and referential integrity. Most of these constraints will have been identified and assigned to each relation when creating the logical data model, but it is essential that these are validated as a separate process.
First, let's look at domain constraints. All the values of a specific attribute must be within a range of allowable values. For example, the domain constraints for the DVD relation are shown in Table 11.5.
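Domain constraints such as those listed for the DVD relation in Table 11.5 can be sketched as CHECK constraints. The example below is an illustration under assumptions rather than a definitive schema: the limits reflect one reading of the table (1 to 50 copies, a charge between 6 and 25, four category values), the late-charge attribute is left out, and the helper try_insert and the sample rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE DVD (
        DVD_ID     TEXT PRIMARY KEY CHECK (length(DVD_ID) <= 10),
        DVD_NAME   TEXT CHECK (length(DVD_NAME) <= 50),
        DVD_COPIES INTEGER CHECK (DVD_COPIES BETWEEN 1 AND 50),
        DVD_CHARGE REAL CHECK (DVD_CHARGE BETWEEN 6 AND 25),
        CATEGORY   TEXT CHECK (CATEGORY IN ('Family', 'Action', 'Doc', 'Comedy'))
    )
    """
)


def try_insert(row):
    """Return True if the row satisfies every domain constraint."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute("INSERT INTO DVD VALUES (?, ?, ?, ?, ?)", row)
        return True
    except sqlite3.IntegrityError:
        return False


results = [
    try_insert(("D1", "Valid movie", 3, 10.0, "Family")),
    try_insert(("D2", "Too many copies", 51, 10.0, "Action")),
    try_insert(("D3", "Charge too low", 3, 2.0, "Comedy")),
    try_insert(("D4", "Unknown category", 3, 10.0, "Horror")),
]
print(results)
```

Only the first row falls inside every domain, so it is the only one stored; the other three are rejected by the DBMS rather than by application code.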
TABLE 11.5
for the
Attribute
Description
DVD_ID
Set of all possible
DVD_COPIES
DVD relation Domain
Number
values for
movie codes
of copies held of each
Alphanumeric
movie
Integer
character
size 10
2 digits
Minimum value of 1 and
DVD_NAME
The name
DVD_CHARGE
The cost to rent
of the
movie a DVD
maximum
value
Character
size
The amount
to
be paid for
each
day the
DVDis late CATEGORY
Set of all possible
50
DVD_CHARGE
.5
6
DVD_CHARGE
.5
0.25
DVD_CHARGE
,5
25
CHARGE DVD_LATE_CHG_DAY
of 50
movie categories
,5
and
'Comedy',
Entity integrity
is
no null
singular
and
into
parent
the
keys
review
key
values
maintained.
must
What to
contain
is
foreign
key
If
has
are
11
Cengage deemed
Learning. that
any
All suppressed
Rights
Reserved. content
does
movie
May not
not materially
be
have
checking
place.
For example,
not
copied, affect
scanned, the
overall
or
values.
are
deleted.
in experience.
whole
row
If
then the
there
duplicated, learning
key.
This
in logical
'Doc'
constraint database
the relationships in the
in the
guarantees design,
every
prior to this copy
both
a number
of
ways in
DVDrelation This allows
or in Cengage
part.
Due Learning
to
electronic reserves
so that stage
key cannot
NULL values
the
is
user
rights, the
right
some to
are allowed which they
performing
third remove
party additional
content
child
or not
an entry
in the
may
can
be
be dealt
any
with:
COPY relation delete
suppressed at
COPY
as values in the
the
content
and
whether
be NULL in the
until all values in the
the
the relations
store, for every row
of a DVD requires
DVD_ID foreign then
between
DVD rental
DVD relation
been identified
was optional,
been
stage
'Action',
checked.
NULL
allowed
a primary
At this
involves
Don't allow any movies to be deleted from the with that
has
key.
be an existing
mandatory,
that if the relationship NULLs
relation
may have
It is possible
2020
are
keys are in there
be allowed
attributes.
each primary
constraints
and the relationship
associated
Copyright
the
relation.
1
Editorial
relation
is
that
into
foreign
COPY
would
DVD relation
integrity
correct
relationship
foreign
primary
referential
ensure that the
entered
by ensuring
are inserted
composite
Validating to
validated
values
and
Character size 6. Must be one of 'Family',
that
DVD_
16
time
operation
to check if they wish to proceed with the deletion and, more importantly, ensures that referential integrity is maintained.
2 When a movie is deleted, then delete all associated rows in the COPY relation. This is known as a cascading delete constraint.
3 Set the foreign key value to NULL in the COPY relation. This means we may have copies of a movie in the store that are no longer listed as being available for rental. Follow this strategy with care.
It is possible that different choices will apply to different relations in the same database, so careful consideration is needed.
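The three deletion strategies can be tried out in a few lines. The following is a minimal sketch, assuming SQLite (through Python's standard sqlite3 module) as a stand-in DBMS; the cut-down DVD and COPY tables, the sample values 'M1000' and 'C1' and the function name delete_dvd are illustrative assumptions, not part of the DVD rental store design.

```python
import sqlite3


def delete_dvd(on_delete: str) -> list:
    """Delete DVD 'M1000' under the given ON DELETE rule; return the COPY rows left."""
    con = sqlite3.connect(":memory:")
    con.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled
    con.execute("CREATE TABLE DVD (DVD_ID TEXT PRIMARY KEY)")
    con.execute(
        "CREATE TABLE COPY ("
        " COPY_CODE TEXT PRIMARY KEY,"
        f" DVD_ID TEXT REFERENCES DVD(DVD_ID) ON DELETE {on_delete})"
    )
    con.execute("INSERT INTO DVD VALUES ('M1000')")
    con.execute("INSERT INTO COPY VALUES ('C1', 'M1000')")
    try:
        con.execute("DELETE FROM DVD WHERE DVD_ID = 'M1000'")
    except sqlite3.IntegrityError:
        pass  # strategy 1: the delete is refused while copies still reference the DVD
    rows = con.execute("SELECT COPY_CODE, DVD_ID FROM COPY").fetchall()
    con.close()
    return rows


restrict_rows = delete_dvd("RESTRICT")   # strategy 1: parent and child both kept
cascade_rows = delete_dvd("CASCADE")     # strategy 2: child rows deleted with the parent
set_null_rows = delete_dvd("SET NULL")   # strategy 3: child kept, foreign key set to NULL
```

Comparing the three results shows why the third strategy needs care: the copy survives, but it no longer points at any DVD.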
11.2.4 Merge Relations In the database different
user views
directly
from
relations. small
design
next
databases,
Identify
For
the
example,
in the
database.
can
see
ideal
Although
this
view
data
combine
be easily
merged
be
model
them.
and
to
should
new logical
store,
redundancy
remove
merged
which
one
are
amongst
any redundancy.
not
at a time.
For
For
each
duplicated.
Usually,
these
relations
model,
ensuring
that
will have the
identified.
logical
managers
different
views
the following
data
and
sales
staff
of the database two
relations
all integrity
will require
different
information
may have been designed
for the
from
managers
which have been created:
EMP_FNAME,
Learning.
EMP_LNAME, relations
they
EMP_SALARY,
All suppressed
to
AREACODE,
Rights
does
May not
simple,
from
However,
was called
Reserved. content
not materially
EMP_ID
same
meaning.
these
be
copied, affect
SALES_STAFF
describe
the
called
PHONE)
there
different called
attributes
scanned, the
overall
or
duplicated, learning
PHONE)
have
the
characteristic
same
AREACODE, are a number
views.
When
EMPLOYEE.
PHONE,
of
These
of recognised we
merged
problems
the
We assumed
those
that
of employees,
are
in experience.
EMP_NO?
are
known
synonyms
whole
or in Cengage
part.
that
Due Learning
to
key
value,
This
makes
EMP_SALARY)
electronic reserves
We would then as synonyms represent
occur
SALES_
MANAGER
and
based
upon the fact that
if the
primary
have two
may content
be
with
database
number.
suppressed at
key in the
attributes
up to the
employees
content
can
and
both the
and it is the
that
MANAGER
what would have been our assumption instead
primary
of an employee.
EMPLOYEE.
to the same characteristics,
were the same.
that
AREACODE,
EMP_FNAME,
a new relation
but the
and
that
one relation
be quite
of relations
referred
understand
into
EMP_LNAME,
we formed
relation
attributes
merging
appears
relations
EMP_FNAME,
MANAGER
have
for
names
to
any
two
merging
keys
different
that
the
process the
primary
Cengage deemed
new
user
will so far have created relations
duplication
of relations
should:
can
EMP_LNAME,
addition,
relations,
designer
has
two
Consider
(EMP_NO,
SALES_STAFF
2020
DVD rental
candidates
SALES_STAFF
review
in the
sets
designer
and
relations
(EMP_NO,
that In
surrounding
Copyright
each
in the
are similar
so such
(EMP_NO,
EMPLOYEE
Editorial
from
process
some
to represent
SALES_STAFF_VIEW
EMP_NUM.
the
that
merge these
database
all relations
Therefore,
SALES_STAFF
the
of relations
merged the
design to
will have been generated
MANAGER_VIEW
In the
STAFF
to
of ERDs
maintained.
sales staff.
MANAGER
them
therefore
relationships are
You
is
sets
key,
all the
In the
11
lead
stage
constraints
and the
database
will inevitably
relations
primary
Check
The logical
which
include
those
same
system.
that is
Automatically
a number
ERD,
the
set of relations
that
of the
each
The
process, it is likely
any
A second meanings. and
problem These
can occur if attributes
kinds
SALES_STAFF
database
both
designer
stores
of attributes had
attributes
discovered
that
area code and phone
employees merged
home
EMPLOYEE
when
merging
already
been
if a duplicate
relations,
merged
sure that the
completing
complete
database moving
attributes
when
phone
database If
the
original
of the
logical
the
users
database
by verifying
logical
model
a single,
in
transactions
it is
need to a few
be solved
steps
11.3
database
by the
DBMS
specifications to
addition,
we
The
was
happen to
if the
a particular
to an individual
be included
in the
STORE_PHONE,
EMP_
check
that
supertype/subtype
supertype/subtypes
created,
then
the
that
decision
merging relation.
have
needs
to
This is just to
be
make
that
the
stages
the logical
in
the
actively model.
all the
different
should
exist
constraints model
which
should
with the
represents
be again
the
validated
user.
Model with the User
database
ensure
model
all integrity
reviewing
have
beginning
design
database.
reference
the
before
different
The next
in
stage
views.
This
database design
the
conceptual
involves
data requirements
user
physical
participated
reviewing
have
stage
design
is
design
been
the
stage,
completed
modelled
very important
of the
and
as any
even if it requires
all the
problems
going
back
process.
PHYSICAL DATABASE DESIGN
Physical used
to
within the
and revisiting
to
contains
logical
correct,
process:
should
conceptual
users
are supported
needs
exists in the
validated,
that
the
database
the
with the
have to
603
was correct.
stage
of the
also
relations
model
11.2.5 Review the Complete Logical So far,
would
Design
MANAGER
would
referred
STORE_AREACODE,
designer
one
relationship
To ensure
final
meanings
have
referred
they
Database
relations
What
relation
relation
Physical
relations
original
PHONE.
MANAGER
Both
EMP_FNAME,
the
stage,
system.
to the
and
the
and
separate
The two
SALES_STAFF
number?
correctly.
decision this
in
whilst in the
supertype/subtype
original
After
AREACODE
name in
homonyms.
Logical,
EMP_SALARY)
are represented
revisited
called
EMP_LNAME,
EMP_PHONE,
relationships
before
and
same
as
Conceptual,
i.e:
(EMP_NO,
AREACODE, Finally,
relation,
with the
known
these
number,
area code
EMPLOYEE
are
11
for attribute
must
be able
designer
ultimate
goal
to
and
definition
response been
data
time.
In
or access
data.
this,
decisions
that
storage
the logical
In
doing
key)
any relationships
complex
ensure
of specific must translate
by a primary
to represent
have
we
accessing
(represented
be to
of query
needs
the
do this,
with some
must
in terms
information
requires order to storing
a data
database
efficient
In
1
to
we
must
be located
that
occur
regarding
storage
order
can
between
effective
carry
out
to
physical
that
every
physical
is
This
and design,
stored.
security, the
In
presents
physically
integrity
database
logical
database.
relations.
database
ensure
will be
a set of specific
ensure
in the
how the
is
methods that
model into
and
following
collected:
1 A set of normalised relations devised from the ER model and the normalisation process. This would have been derived from the conceptual and logical design stages and would be the logical data model.
2 An estimate of the volume of data which will be stored in each database table and the usage statistics.
3 An estimate of the physical storage requirements for each field (attribute) within the database.
4 The physical storage characteristics of the DBMS that is being used.
Physical database design can be broken down into a number of stages:
1 Analyse data volume and database usage.
2 Translate each relation identified in the logical data model into a table.
3 Determine a suitable file organisation.
4 Define indexes.
5 Define user views.
6 Estimate data storage requirements.
7 Determine database security for users.
Next you will learn about each of these phases in more detail.
11.3.1 Analyse Data Volume and Database Usage Analysing The
user queries
process
is
introduced it is
often
to in
usage,
the
data
gather to
volume
predict
that
and the
has
cent
been
Wiederhold
in
of this
his 1983
chapter.
data volume
involved
either
that
book
may arise.
types are
in
design, that
will take
to
cent
of
system,
the
in
it
would
further
were
on each
involves
a given
file
and
essential occur
in
to order
be impossible
that
rule
to
account
suggested
reading
requested
used today in analysing
place
that
80/20
the
queries
you
database,
It is therefore
transactions
upon
design.
Every transaction of requests
may have, so transactions
based
which is listed
20 per
the
modification.
a large
users
This is
This rule is often
or
Generally,
that
designing that
number
most important
database
Cycle (SDLC)
processing.
the
viewing
of queries that
estimated
in the
the
considered.
on database
Wiederhold
of transactions
on both
for
Life
When physically
number
based
on at least
data
80 per cent of data accesses.
11
the
stage of physical
Development
Process.
possible
different to
Systems
requested
issues
possible
of the
is usually the first
Development
and limitations
as
of accesses
database
approximately
overheads
performance
all the
80 per
know
much information
establish
end
has
as part
Database
you
database
and
as
out
10,
that
within the
data
carried
Chapter
very important
table
and the size of the
section
by users
for
by
Gio
at the
account
data usage in existing
for
database
systems. The
steps
required
Identifying
the
of the
most
DVDs
every
an impact
In
and critical
shown each
transactions
to
would need to data
usage
map for
as dashed entity
Cengage deemed
usage
Learning. that
any
lines
represent
All suppressed
Rights
such
determine
For example renting
in our
a DVD.
as a Friday
does
May not
DVD rental
store,
While customers
or Saturday
evening,
one
would
which
rent
might
have
be
copied, affect
are usually
usage arrows
scanned, overall
or
shown
DVD rental
representing
number
the
in the
database
relations
(COPY,
duplicated, learning
on a simplified
map or a transaction
of the
estimated
not
relations
a DVD, four
participate
in these
RENT_LINE,
RENTAL
be accessed.
a section
materially
which
to rent
statistics
with the the
Reserved. content
transactions.
peak times
diagram is known as a composite composite
are:
will be a customer
may be
order for a customer
CUSTOMER) and
phase
performance.
of critical
Data volume
out this
transactions
day, there on the
Analysis
carrying
most frequent
common
transactions. and
for
store. the
of records
Access
direction
stored
Learning
usage
to
electronic reserves
each
rights, the
frequencies of the
in
right
version
some to
of the
ERD.
This
map. Figure 11.15 shows the to
access.
each
relation
The numbers
are inside
relation.
FIGURE 11.15 Composite usage map for the DVD rental store
As you examine Figure 11.15, note the following: It is estimated that the store has 600 customers. The store holds 1 500 different movies and has on average three copies of each movie title (giving an average of 4 500 records in the COPY relation).
There are approximately comprise
customers
Each customer
visits the
1 200 records gives
in the
an average
Of the
500
(renting
500
The
store
the
receives
estimated
also receives DVD is
It is important
to
available
are rented
accesses
with the
store.
to an estimate
by each
of
customer,
which
relation.
access
to
400
the
of these
RENTAL
(assuming
relation
a customer
to the
COPY table
movies
each
involve could
rents
to validate
20
generate
more than
the rented
week; therefore
DVD rentals a
one
DVD).
copy.
accesses
are required
to
degrade.
of
The
complete
that
has taken
of new
where
that
and
the
also
be taken
that
this
the
asking if a copy
to the
of new reporting
which
of a
of the
over the
database, as
can
cause
next
Once
and
users,
OLAP (Online
performance
The
an increase Analytical
of the
size
in
of
analysis
statistics
years.
for
but to gain
an initial
access
several
its
an idea
business.
additional
such
are only estimates exact figures
designer
of data volume
business
tools
obtain
database
functions
an estimate
shown
to
database,
give the
of the
with the
access
in
phase
provide
will grow
statistics
It is not necessary
may occur
an overview
require
introduction
into
customers
and volume
business.
from
to
database
applications
and the
a week from
data access
bottlenecks gathered
place it is important
assumption
the
within the
statistics
system
80 enquires
for rent.
emphasise
an understanding
data
20 new
on average
most critical transactions
the
each
DVDs
day. These
registering
DVDs, leading
approximately
relation
500 accesses
on average
four
relation,
that
each
customers
at week to rent
RENT_LINE
RENT_LINE
relation
new
DVD relation.
specific
to
and
On average
in the
CUSTOMER
to the
could lead to
The store
the
It is
CUSTOMER
DVDs
on average twice relation.
to the
accesses
This in turn
store
RENTAL
accesses
to the
and returning
of 4 800 records
or returning).
further
500 accesses
renting
on the
development the
volume
Processing)
of
must
all
consideration.
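The record counts quoted for the DVD rental store follow from simple multiplication, which makes them easy to recompute when an assumption changes. The sketch below is illustrative only: the function name estimate_volumes and the parameter names are hypothetical, and the figures of two visits per customer per week and four DVDs per rental are read from the estimates discussed for Figure 11.15.

```python
def estimate_volumes(customers: int, movie_titles: int, copies_per_title: int,
                     visits_per_week: int, dvds_per_rental: int) -> dict:
    """Return rough record counts for the relations on the composite usage map."""
    rentals = customers * visits_per_week  # one RENTAL record per visit
    return {
        "CUSTOMER": customers,
        "DVD": movie_titles,
        "COPY": movie_titles * copies_per_title,
        "RENTAL": rentals,
        "RENT_LINE": rentals * dvds_per_rental,  # one RENT_LINE record per rented DVD
    }


volumes = estimate_volumes(customers=600, movie_titles=1500, copies_per_title=3,
                           visits_per_week=2, dvds_per_rental=4)
for relation, count in volumes.items():
    print(f"{relation:<10}{count:>6}")
```

These are estimates rather than exact figures; their purpose is to show which relations will grow fastest and be accessed most often.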
11.3.2 Translate Logical Relations into Tables
The output from the logical database design stage was a complete set of normalised relations which represents the complete database system. For each relation we have, we use the information documented in the data dictionary (e.g. the names of the attributes, their domains, details of any constraints, etc.) to construct each corresponding database table. How the tables are constructed is specific to the target DBMS, but the stages involved are similar. For each relation you should:
a Identify each attribute name and its domain from the data dictionary. Note any attributes which require DEFAULT values to be inserted into the attribute whenever new rows are inserted into the database.
b Determine any attributes that require a CHECK constraint in order to validate the value of the attribute.
c Identify the primary and any foreign keys for each table.
d Identify those attributes that are not allowed to contain NULL values and those which should be UNIQUE. You can exclude the primary key attribute(s) here as the PRIMARY KEY constraint automatically imposes the NOT NULL and UNIQUE constraints.
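To see how the constraints identified in steps a to d behave once a table exists, the following minimal sketch uses SQLite through Python's standard sqlite3 module. SQLite's dialect and data types differ from those of other DBMSs, and the COPY_DEMO table, its BARCODE column and the helper try_insert are hypothetical names introduced only for this illustration.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("""
    CREATE TABLE COPY_DEMO (
        COPY_CODE TEXT PRIMARY KEY,               -- step c: primary key
        BARCODE   TEXT NOT NULL UNIQUE,           -- step d: NOT NULL and UNIQUE
        COPY_NUM  INTEGER DEFAULT 1 NOT NULL,     -- step a: DEFAULT value
        CONSTRAINT ck_copydemo_copynum
            CHECK (COPY_NUM BETWEEN 1 AND 50)     -- step b: CHECK constraint
    )
""")


def try_insert(sql: str) -> bool:
    """Run one INSERT and report whether the constraints accepted it."""
    try:
        con.execute(sql)
        return True
    except sqlite3.IntegrityError:
        return False


results = {
    "default applied": try_insert(
        "INSERT INTO COPY_DEMO (COPY_CODE, BARCODE) VALUES ('C1', 'B1')"),
    "duplicate barcode": try_insert(
        "INSERT INTO COPY_DEMO (COPY_CODE, BARCODE) VALUES ('C2', 'B1')"),
    "copy number out of range": try_insert(
        "INSERT INTO COPY_DEMO VALUES ('C3', 'B3', 99)"),
    "missing barcode": try_insert(
        "INSERT INTO COPY_DEMO (COPY_CODE) VALUES ('C4')"),
}
rows = con.execute("SELECT COPY_CODE, BARCODE, COPY_NUM FROM COPY_DEMO").fetchall()
```

Only the first insert is accepted, and the stored row receives the default copy number of 1.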
Once you have identified the above, the DBMS-specific SQL can be written to create the table. The order of table creation is very important. Relations that contain no foreign key dependencies should be created first, followed by those with one foreign key, then two, etc. Let's now look at an example using two relations from the database for the DVD rental store. Table 11.6 shows a portion of the data dictionary for the DVD and COPY relations that were determined during the logical database design stage and Figure 11.16 shows the corresponding SQL code used to create the two tables where the target DBMS is Oracle.
FIGURE 11.16 Creating the DVD and COPY tables
CREATE TABLE DVD (
    DVD_ID VARCHAR2(10),
    DVD_COPIES NUMBER(3) NOT NULL,
    DVD_NAME VARCHAR2(50) NOT NULL,
    DVD_CHARGE NUMBER(4,2),
    DVD_LATE_CHG_DAY NUMBER(4,2),
    CATEGORY CHAR(6),
    CONSTRAINT pk_dvd_dvdcode PRIMARY KEY (DVD_ID),
    CONSTRAINT ck_dvd_category CHECK (CATEGORY IN ('Family', 'Action', 'Comedy', 'Doc')),
    CONSTRAINT ck_dvd_charge CHECK (DVD_CHARGE BETWEEN 6 AND 16),
    CONSTRAINT ck_dvd_latecharge CHECK (DVD_LATE_CHG_DAY BETWEEN 0.25 AND 25),
    CONSTRAINT ck_dvd_dvdcopies CHECK (DVD_COPIES BETWEEN 1 AND 50));
CREATE TABLE COPY (
    COPY_CODE VARCHAR2(10),
    DVD_ID VARCHAR2(5),
    COPY_NUM NUMBER(2) DEFAULT 1 NOT NULL,
    CONSTRAINT pk_copy_copycode PRIMARY KEY (COPY_CODE),
    CONSTRAINT fk_copy_dvdid FOREIGN KEY (DVD_ID) REFERENCES DVD (DVD_ID));
In Figure 11.16, notice that the constraints imposed on the attributes have been named. It is important to name constraints using a standard so that we can associate a particular constraint with a particular table. If we do not name them, the DBMS assigns a system-generated name that is difficult to understand. Naming them makes it easy to modify or drop constraints and quickly fix any errors as we will know which table they are related to. An example standard format for constraint naming could be:
(constraint abbreviation)_(table name)_(column_name)
where the constraint abbreviations would be PK (primary key), FK (foreign key), CK (check constraint), UK (unique constraint). Using this format you can see that the primary key constraint has been named as pk_dvd_dvdcode in Figure 11.16.
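The rule that tables without foreign key dependencies are created first amounts to a topological ordering of the tables. The sketch below is one way to derive such an order, using TopologicalSorter from Python's standard graphlib module. Only the COPY to DVD dependency is taken from Figure 11.16; the other foreign keys listed for RENTAL and RENT_LINE are assumptions made for illustration.

```python
from graphlib import TopologicalSorter

# Each table maps to the set of tables that its foreign keys reference.
# Only COPY -> DVD comes from Figure 11.16; the rest are assumed for this sketch.
fk_dependencies = {
    "DVD": set(),
    "CUSTOMER": set(),
    "COPY": {"DVD"},
    "RENTAL": {"CUSTOMER"},
    "RENT_LINE": {"RENTAL", "COPY"},
}

# static_order() yields every referenced (parent) table before the tables that depend on it.
creation_order = list(TopologicalSorter(fk_dependencies).static_order())
print(creation_order)
```

Dropping the tables safely would use the reverse of this order, so that child tables are removed before their parents.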
608
PART IV
Database
Referenced
FK
Design
MOVI Table
/
/
NULL
NULL
NULL
NOT
NOT
NOT
NULL
1
places
Constraints
CHECK
UNIQUE
NOT
CHECK
DEFAULT
OF
decimal
PK or
FK
PK
PK
FK
the
including
Required Y
Y
Y
Y
Y
Y
long,
digits
99.99
99
10.00
99
nine
to
Range 0
0
0.00
'Family',
'Action',
'Comedy',
'Doc'
0
up
and
places
Format
X9999
Xxxxxxx
99
99.99
99.99
Xxxxxx
X9999
Xxxxxxxxx
99
rental decimal
two
COPY
with
and
11
NUMBER(3) VARCHAR2(10)
Type
NUMBER(2,2) NUMBER(2,2) CHAR(6) VARCHAR2(50)
VARCHAR2(10)VARCHAR2(5)
NUMBER(2)
DVD numbers
characters
late
DVD
for
movie 000
paid is
specify code
DVD the
2
be
to
DVD
to
characters
DVD
of
1
identifier
entries
the
identifier
to
copy used
number
possible
255
the
the
DVD
DVD
data,
is
rent to copies
of
all 1
day
to
amount
unique
copy
of of
store length
Description
Unique
Name
No
Cost
Set
The
each
data,
Unique
The
categories
The
in
dictionary
NUMBER(9,2)
length
Data character
data.
character Variable key key
11.6
Numeric
5
Name
DVD_ID
CHG_DAY
DVD_COPIES DVD_NAME DVD_CHARGE DVD_LATE_
COPY_CODE
CATEGORY
DVD_ID
COPY_NUM
Attribute
Fixed
5
Primary
Foreign
5 5
TABLE Table
Name
Learning. that
any
COPY
DVD
FK
or in Cengage
part.
Due Learning
to
electronic reserves
PK
rights, the
right
CHAR
some to
VARCHAR2
third remove
NUMBER5
NOTE Both the In
SQL required
Chapter
11.3.3
8,
aspect
arranged.
from
model
one single
database record
store,
Selecting efficiently
the and
records
and
in
and
only
of the
Files that
are
Files
hashed
In the following
Heap
File
into
inserted
into
File
a sequential
more fields,
onto
stored
attribute
data
rows
to
SQL
Each
each
data
to
a record
Each a data
with the
are
are
file
DBMS.
types.
known
tables.
entity field.
as file many
Each row
in the logical For
fields
physically
may contain
many different
fields.
corresponds
records
storage
files.
from
of data
as
database
secondary
in
of a number
know
where
this
record
database
Oracle
data
example,
DVD_ID,
12c,
is
complex
techniques.
There
known
more fields,
such
known
stored
in the
DVD_TITLE,
are
the
are alot
data is
contains this
and
record
can identify
of file
are three
categories
you need to are
that of file
it. It
organisation
techniques
it is important
of
as quickly
of criteria that
However,
stored
thousands
how it
organisation
you
built
have
an
organisations:
files
organisations
which
are
based
on indexes
11
as hash files.
more detail about the characteristics
These
that
whether the type
file
as heap
as file
ensure
database and retrieve
and
administrator.
records
you learn in
to
your
As you can see, there
as
database
more fields,
If
be able to locate
of the
data loss.
ordered
possible.
must
growth
such
very important
as
you
must
or
is
quickly
organisation
or
organisation
file
each
heap,
any
come.
for
first
the
row.
sequential,
indexed,
of the
b-trees,
most commonly
bitmap
and join
used
indexes,
All
Heap files
time.
Since
of the
heap file are
The input
the
only
impractical
used
when
sequence
way to
if
only
where records
access
we want to
is
a large
often
a record
provide
are
quantity
used
in this
efficient
unordered.
to
of
Records
data
needs
automatically
type
of file is to
are to
be
generate
a
search
every
data retrieval.
Organisations organisation,
case
Rights
can it
Reserved. content
the
records
often the primary
search
suppressed
is that
as they
become
which is
Learning. that
file
and every record
not the
Cengage deemed
basic
on one
file
this
is
as
by the
by one
a table
key for
situations,
are specific
common
clusters.
the
Sequential
if this
tuning
sorted
row in the file, they
searched
attribute
some
Organisations
inserted
In
are
organisation
a DBMS
randomly
most basic file
primary
each
how the
relations
be represented
file
DBMS
techniques.
and
each
can
against
sections,
organisation
hash files
and
at the future
In
need
contain
for
choosing
may contain
consist
is required,
the
protection
Files that
The
record
consideration. often
or it
can
be retrieved
to look
understanding
file
can
To do this,
some
DVD
most suitable
one single
possible.
into
entity
is
a database
table,
a record
types
we identified
DVD_CHARGE.
data
is also important provides
the
and
design
arranging
in
that
data
Language,
File Organisation
physically All data
to
and the
Query
database
a database
DVD_COPIES
take
for
will correspond
DVD rental
as
of physical
techniques.
represents
tables
a Suitable
Techniques
organisation
creating Structured
Determine
An important
rows
for
Beginning
does
in the file be fast
can
May not
are
key. In
stored
if all the
not
be
copied, affect
scanned, the
overall
or
records
duplicated, learning
a sequence
order to locate
must be read in turn are
be very time-consuming
materially
in
in experience.
whole
or in Cengage
part.
find
Due Learning
a specific
record,
until the required
ordered to
based
based one
to
electronic reserves
on the
specific
rights, the
right
some to
third
the
record
party additional
content
value
of one
whole file
In some
key value.
Inserting
may content
be
any
However,
or
suppressed at
time
from if
or
must be
is located.
primary
record.
remove
on the
modifying
the
Inserting or modifying records usually results in rewriting the whole file, which is very impractical in a database. In addition, deletion of records leads to storage space being wasted unless the file is reorganised. An example of a sequential file is a telephone book, as shown in Figure 11.17. Each name and the phone number represents a record that is stored alphabetically.
FIGURE 11.17 Example of sequential file organisation
(A telephone-book listing with the columns LAST NAME, FIRST NAME, AREACODE and PHONE, sorted alphabetically from Brown, the first record in the file, to Williams. To locate Williams, all other records must first be read.)
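The cost of searching a sequential file can be made concrete by counting how many records are read before the target is found. The following is a minimal sketch: the list holds five of the names from Figure 11.17, and the function sequential_search is a hypothetical helper, not a DBMS routine.

```python
# A small sequential "file": records kept in order of last name.
phone_book = [
    ("Brown", "James"),
    ("Dunne", "Leona"),
    ("Padayachee", "Vinaya"),
    ("Ramas", "Alfred"),
    ("Williams", "George"),
]


def sequential_search(records, last_name):
    """Read records in turn; return (record or None, number of records read)."""
    reads = 0
    for record in records:
        reads += 1
        if record[0] == last_name:
            return record, reads
        if record[0] > last_name:
            break  # the file is sorted, so the name cannot appear further on
    return None, reads


first_hit = sequential_search(phone_book, "Brown")
last_hit = sequential_search(phone_book, "Williams")
missing = sequential_search(phone_book, "Cole")
```

Finding Brown takes one read, but finding Williams means reading every record in the file, which is the weakness that indexed organisations address.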
Due to their deficiencies, sequential files are not used for modern database storage.
Indexed File Organisations
Accessing a record directly instead of searching through the entire file involves the use of an indexed file organisation. Records in a file supporting this type of file organisation can be stored in a sorted or unsorted sequence and an index is created to locate specific records quickly. Indexes are crucial in speeding up data access. Indexes facilitate searching, sorting and using aggregate functions and even join operations. The improvement in data access speed occurs because an index is an ordered set of values that contains the index key and pointers. The pointers are the row IDs for the actual table rows.
Conceptually, a data index is similar to a book index. When you use a book index, you look up the word, similar to the index key, which is accompanied by the page number(s), similar to the pointer(s), which direct(s) you to the appropriate page(s). An index scan is more efficient than a full table scan because the index data are already ordered and the amount of data is usually an order of magnitude smaller. Therefore, when performing searches, it is almost always better for the DBMS to use an index to access a table than to scan all its rows sequentially. For example, Figure 11.18 shows the index representation of a CUSTOMER table with 14 786 rows and the index COUNTRY_NDX on the CUS_COUNTRY attribute.
FIGURE 11.18 Index representation for the CUSTOMER table
(The figure shows the index keys, such as AS, FR and UK, with the row IDs they point to, alongside the CUSTOMER table of 14 786 rows with the columns CUS_CODE, CUS_LNAME, CUS_FNAME, CUS_INITIAL, CUS_AREACODE, CUS_PHONE, CUS_COUNTRY and CUS_BALANCE.)
Suppose you submit the following query:
SELECT CUS_NAME, CUS_COUNTRY
FROM CUSTOMER
WHERE CUS_COUNTRY = 'UK';
If there is no index, the DBMS must perform a full table scan, thus reading all 14 786 customer rows. Assuming that the index COUNTRY_NDX is created and is applied, the DBMS automatically uses the index to locate the first customer with a country equal to UK and then reads all subsequent CUSTOMER rows, using the row IDs in the index as a guide. Assuming that only five rows meet the condition CUS_COUNTRY = 'UK', the DBMS would save 14 781 I/O requests for customer rows that do not meet the criteria. That's a lot of CPU cycles!
If indexes are so important, why not index every column in every table? It's not practical to do so. Indexing every column in every table will tax the DBMS too much in terms of index-maintenance processing, especially if the table has many attributes, has many rows and/or requires many inserts, updates and/or deletes.
Indexes are logically and physically independent of the data in the associated table. This means, of course, that they require their own storage space. How much space depends on the type of index and is an important factor that is initially decided within the physical database design stage.
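The saving quoted for the CUS_COUNTRY query can be reproduced with a toy model that counts table rows read. This is a sketch under stated assumptions: the table is synthetic, the positions of the five UK rows are invented, every other row is arbitrarily given the country FR, and only table-row reads are counted (reads of the index itself are ignored).

```python
from collections import defaultdict

TOTAL_ROWS = 14_786
uk_row_ids = {3, 4, 8, 10, 14_786}  # hypothetical positions of the five UK customers

# Synthetic CUSTOMER table: row ID -> CUS_COUNTRY value.
table = {row_id: ("UK" if row_id in uk_row_ids else "FR")
         for row_id in range(1, TOTAL_ROWS + 1)}

# The index maps each key value to the row IDs that hold it.
country_index = defaultdict(list)
for row_id, country in table.items():
    country_index[country].append(row_id)


def full_table_scan(country):
    """Read every row; return (matching row IDs, rows read)."""
    reads, matches = 0, []
    for row_id, value in table.items():
        reads += 1
        if value == country:
            matches.append(row_id)
    return matches, reads


def index_lookup(country):
    """Follow the index pointers; only the matching rows are read."""
    row_ids = list(country_index.get(country, []))
    return row_ids, len(row_ids)


scan_matches, scan_reads = full_table_scan("UK")
index_matches, index_reads = index_lookup("UK")
rows_saved = scan_reads - index_reads
```

Both access paths return the same five rows; the difference lies entirely in how many rows had to be read to find them.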
Types of Indexes
There are three main types of indexes that can be used:
Primary index: these indexes are placed on unique fields such as the primary key. They are used to locate a specific record pointed to by the index. A file can have at most one primary index but can have several secondary indexes.
Secondary index: these indexes can be placed on any field in the file that is unordered.
Multilevel index: these indexes are used where one index becomes too large and is split into a number of separate indexes in order to reduce the search. This then results in a further index, which keeps track of these additional indexes! Figure 11.19 shows an example of a two-level index on the DVD table from the DVD rental store.
FIGURE 11.19 Multilevel index on the DVD table
(The figure shows a Level 1 index, a Level 2 index and the data file, whose records hold DVD_CODE, DVD_NAME and DVD_CHARGE.)
Each index can be defined as being sparse or dense. When using a sparse index, index pointers are created only for some of the records, whereas with a dense index, an index pointer appears for every search key value in the file. In practice, dense indexes are faster, but sparse indexes require less storage space. In addition to these three types of index, there are a number of other types of index that are popular. You will learn about each of these indexes in the next sections.
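The difference between a dense and a sparse index can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the data file is modelled as a sorted list of DVD codes, the block size of four records is hypothetical, and the function names are invented for the example rather than taken from any DBMS.

```python
from bisect import bisect_right

# Sorted data file, modelled as a list of DVD codes (positions stand in for record addresses).
data_file = ["M1000", "M1020", "M1231", "S3425", "S4854", "S4978",
             "S6785", "S8756", "W4567", "W6756", "W6790"]

BLOCK = 4  # assumed number of records per block (hypothetical)

# Dense index: one pointer for every search key value.
dense_index = {key: pos for pos, key in enumerate(data_file)}

# Sparse index: one pointer per block, holding the first key of that block.
sparse_keys = data_file[::BLOCK]
sparse_pos = list(range(0, len(data_file), BLOCK))


def dense_lookup(key):
    """Go straight to the record position, or None if the key is absent."""
    return dense_index.get(key)


def sparse_lookup(key):
    """Find the block that could hold the key, then scan that block."""
    i = bisect_right(sparse_keys, key) - 1
    if i < 0:
        return None
    start = sparse_pos[i]
    for pos in range(start, min(start + BLOCK, len(data_file))):
        if data_file[pos] == key:
            return pos
    return None
```

The dense index stores eleven entries and answers in one step; the sparse index stores three entries but has to scan up to one block of records after finding the right block.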
B-trees
Within a DBMS, indexes are often stored in a data structure known as a tree. Trees are generally more efficient at storing indexes as they reduce the time of the search compared with other data structures such as lists. These trees are often referred to as Balanced or B-trees and are used to maintain an ordered set of indexes or data to allow efficient operations to select, delete and insert data. A B-tree consists of a hierarchy of nodes that contain a set of pointers that link the nodes of the B-tree together. Each B-tree that is created is said to be of the order n, where n fixes the maximum number of children allowed for each parent node. Under the convention used here, each node in a B-tree of order n contains at most 2n keys and 2n + 1 pointers (so a parent node has at most 2n + 1 children). This is true except for the root node, which provides the starting point of the B-tree. When a node does not have any children, it is called a leaf node. Each item (index or data) stored in a B-tree is known as a key. Each key is unique and can occur in the B-tree in only one location. The B-tree must always be balanced in that every path from the root to the leaf must be exactly the same length. The general principle is that for every node (which we will call n) in the tree:
The left subtree of n contains only values smaller than the value in n.
The right subtree of n contains only values greater than the value in n.
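The ordering and balance properties above can be illustrated with a small search routine. This is a minimal sketch, not a full B-tree: the tree is built by hand (insertion and node splitting are not implemented), and the Node class, search and leaf_depths names are hypothetical helpers introduced for the example.

```python
from bisect import bisect_left


class Node:
    """A B-tree node: sorted keys plus one more child pointer than keys (none for a leaf)."""

    def __init__(self, keys, children=None):
        self.keys = keys
        self.children = children or []

    @property
    def is_leaf(self):
        return not self.children


def search(node, key):
    """Return True if key is stored anywhere in the tree rooted at node."""
    while True:
        i = bisect_left(node.keys, key)
        if i < len(node.keys) and node.keys[i] == key:
            return True
        if node.is_leaf:
            return False
        # Keys smaller than keys[i] live in children[i]; larger ones further right.
        node = node.children[i]


def leaf_depths(node, depth=0):
    """Collect the depth of every leaf; a balanced tree yields a single value."""
    if node.is_leaf:
        return {depth}
    depths = set()
    for child in node.children:
        depths |= leaf_depths(child, depth + 1)
    return depths


# Hand-built example with n = 2: at most 4 keys and 5 pointers per node.
root = Node(
    ["M1231", "S6785"],
    [
        Node(["M1000", "M1020"]),
        Node(["S3425", "S4854", "S4978"]),
        Node(["S8756", "W4567", "W6756", "W6790"]),
    ],
)
```

Each key appears in exactly one node, and every leaf sits at the same depth, which is what keeps the number of node visits per search predictable.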
A special kind of B-tree is known as the B+-tree, where all keys reside in the leaves. This tree is most often used to represent indexes, which act as a road map so that each index can be quickly located. Often, B+-trees are referred to as B-trees and vice versa. Although they have similar properties, there are some differences. You can read more about the two kinds of trees in an article by Douglas Comer, details of which are in the further reading section of this chapter. As we are dealing with choosing a suitable file organisation in the context of physical database design, we will concentrate on the basics of B+-trees in this section.
Figure 11.20 shows the general structure of a B+-tree whose indexes are based upon country names. Each node contains at most two keys (the country names) and pointers to other nodes. To locate the data record for Germany, we must first look in the root node. Germany is not in the root node, so we select a pointer to one of the child nodes. Alphabetically, Germany is greater than France and less than UK, so we follow the middle pointer and proceed to the second level of the tree. Here, we find a match for Germany and follow the pointer to the left of Germany to access the data record.
FIGURE 11.20 B+-tree terminology
Now that we have introduced some of the basic terminology of B+-trees, let's see if we can insert a new key into the tree. Suppose we want to use the attribute DVD_IDs to act as the primary index on the DVD table. Figure 11.21 illustrates the steps required to construct a B+-tree of order two to store the DVD_IDs shown in Table 11.7.
TABLE 11.7 DVD codes
DVD_CODE
M1020
M1231
M1000
S3425
S4854
FIGURE 11.21
Creating a B+-tree
The final step annotated in the figure: insert MOVIE_CODE S4854. It is greater than S3425 and should therefore be placed to the right of S3425, but the node is full. So we have to split this node and promote a key to the parent node. However, this means that the parent node also becomes full, so we have to create a further node.
As you have seen, the B-tree is a powerful way of storing indexes and, through careful management of storage space, allows the quick retrieval of data, which leads to a faster response time for user queries. The B-tree automatically sustains the appropriate levels of index for the file being indexed and ensures that each node is at least half full. There is never a case of overcrowding at a node, as the tree reorganises itself, as you saw in Figure 11.21.
NOTE You may be wondering what happens to a B-tree when we delete a record. While it is possible to remove leaf node pointers to the data record, some B-tree implementations do not perform the actual deletion of the leaf pointer at all. The basis for this is that all files are likely to grow, and therefore the node is likely to continue to grow again.
Bitmap Indexes
So far, we have looked at indexes that are mainly applied to speed up data retrieval from relational database tables. B-tree indexes are usually used when you know that a query refers to a column which is indexed and will retrieve only a few rows. Another popular type of index, which is often used on multidimensional data held in data warehouses, is known as the bitmap index. Bitmap indexes are usually applied to attributes that are sparse in their given domain. For example, if customers were required to enter personal information on an application to join the DVD rental store, everyone would enter their name and address, but a large number would not enter their age. So, the values for age in the database would be sparse.
NOTE You will look at how bitmap indexes can be used to optimise queries that use multidimensional data in Chapter 15, Databases for Business Intelligence.
In a bitmap index, a two-dimensional array is constructed. One entry (a row of the array) is generated for every row in the table that we want to index, with each column of the array representing a distinct value within the bitmapped index. The size of the two-dimensional array is therefore the number of distinct values within the index multiplied by the number of rows in the table. An example of a bitmap index on the DVD_CHARGE field is shown in Figure 11.22. The DVD table also shown in Figure 11.22 currently has 11 rows and the DVD_CHARGE field has five different values {6.00, 6.50, 7.00, 7.50, 8.00}. This bitmap index has 11 entries with five bits per entry.
FIGURE 11.22 Bitmap index on the field MOVIE_CHARGE

The MOVIE table
MOVIE_CODE  MOVIE_COPIES  MOVIE_NAME                   MOVIE_CHARGE  MOVIE_LATE_CHG_DAY  CATEGORY
M3456       3             Ramblin Tulip                6.50          0.25                Family
R2345       2             Once Upon a Midnight Breezy  8.00          0.25                Comedy
S4567       3             Tulips and Threelips         6.00          0.25                Family
S4854       3             Action Heros                 6.00          0.25                Action
S4978       2             Invictus                     6.50          0.50                Action
S6785       3             Tales of the Unexplained     6.50          0.25                Action
S8756       2             The Stars                    6.00          0.50                Doc
W1234       5             Khumba                       8.00          0.50                Family
W4567       2             Flowers in Summer            6.00          0.25                Doc
W6756       2             Flowers in Spring            6.00          0.25                Doc
W6970       3             The Winter Garden            6.00          25.00               Doc

Bitmap index on the field MOVIE_CHARGE (one entry per table row, in the same order as above)
6.00  6.50  7.00  7.50  8.00
0     1     0     0     0
0     0     0     0     1
1     0     0     0     0
1     0     0     0     0
0     1     0     0     0
0     1     0     0     0
1     0     0     0     0
0     0     0     0     1
1     0     0     0     0
1     0     0     0     0
1     0     0     0     0
Bitmaps are more compact than B-trees and take up less storage space. However, combining multiple bitmap indexes can provide significant improvements in performance. Suppose we decide to also create a bitmap index on the CATEGORY field in the DVD table. Figure 11.23 shows the DVD table and the associated bitmap index for the CATEGORY field. This bitmap index has 11 entries with four bits per entry. Suppose we then wanted to find out the names of all movies with a movie charge of 6.00 and the category Family. The SQL to retrieve this data is:
SELECT MOVIE_NAME
FROM MOVIE
WHERE CATEGORY = 'Family' AND MOVIE_CHARGE = 6.00;
To retrieve this data, we would access the first bit from the DVD_CHARGE bitmap index (the 6.00 column) and the first bit from the CATEGORY bitmap index (the Family column) and perform an AND operation. This would match the third entry, where both bits had a value of 1, and would then allow the data to be read. Not only is this an efficient way of accessing data, but bitmaps are also easy to work with in the retrieval of data.

FIGURE 11.23 Bitmap index on the CATEGORY field (one entry per table row, in the same order as the table in Figure 11.22)
Family  Comedy  Action  Doc
1       0       0       0
0       1       0       0
1       0       0       0
0       0       1       0
0       0       1       0
0       0       1       0
0       0       0       1
1       0       0       0
0       0       0       1
0       0       0       1
0       0       0       1

Bitmap indexes are usually used when:
A column in the table has low cardinality. Although all DBMSs vary, Oracle considers columns suitable where the index has fewer than 100 distinct values.
The table is not used often for data manipulation activities. This means that there are hardly any updates to the data in the table and few rows are inserted or deleted. Updating bitmapped indexes takes a lot of time, so, for example, if you update the data in the table regularly, another type of index would be less resource intensive. As a guideline, bitmapped indexes are most suitable for large, read-only tables.
Specific SQL queries reference a number of low cardinality values in their WHERE clauses.
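The way two bitmap indexes are combined with an AND operation can be sketched as follows. This is an illustration only: the rows are the name, charge and category values from the table in Figure 11.22, the bitmaps are plain Python lists rather than a DBMS's compressed structures, and build_bitmap_index is a hypothetical helper name.

```python
# (MOVIE_NAME, MOVIE_CHARGE, CATEGORY) for the 11 rows of the table in Figure 11.22.
rows = [
    ("Ramblin Tulip", "6.50", "Family"),
    ("Once Upon a Midnight Breezy", "8.00", "Comedy"),
    ("Tulips and Threelips", "6.00", "Family"),
    ("Action Heros", "6.00", "Action"),
    ("Invictus", "6.50", "Action"),
    ("Tales of the Unexplained", "6.50", "Action"),
    ("The Stars", "6.00", "Doc"),
    ("Khumba", "8.00", "Family"),
    ("Flowers in Summer", "6.00", "Doc"),
    ("Flowers in Spring", "6.00", "Doc"),
    ("The Winter Garden", "6.00", "Doc"),
]


def build_bitmap_index(values):
    """Map each distinct value to a bitmap with one bit per table row."""
    index = {}
    for pos, value in enumerate(values):
        index.setdefault(value, [0] * len(values))[pos] = 1
    return index


charge_index = build_bitmap_index([charge for _, charge, _ in rows])
category_index = build_bitmap_index([category for _, _, category in rows])

# WHERE CATEGORY = 'Family' AND MOVIE_CHARGE = 6.00 becomes a bitwise AND.
combined = [a & b for a, b in zip(category_index["Family"], charge_index["6.00"])]

# Only rows whose combined bit is 1 need to be read from the table.
matches = [rows[pos][0] for pos, bit in enumerate(combined) if bit]
```

Only the third row has a 1 in both bitmaps, so only that row has to be fetched from the table.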
Join Index
Like the bitmap index, the join index is used mainly in data warehousing and applies to columns from two or more tables whose values come from the same domain. It is often referred to as a bitmap join index and it is a way of saving space by reducing the volume of data that must be joined. The bitmap join stores the ROWIDs of corresponding rows in a separate table. For example, Figure 11.24 shows two tables, CUSTOMER and EMPLOYEE, which both have columns containing area codes (CUST_AREACODE and EMP_AREACODE) that share the same domain. Each table also has a ROWID. The join index on the AREACODE column (also shown in Figure 11.24) shows the ROWIDs for the rows in each table which share the same AREACODE.
FIGURE 11.24 Join index on the AREACODE field

The customer table
ROWID  CUST_NUM  CUST_LNAME  CUST_FNAME  CUST_INITIAL  CUST_AREACODE  CUST_PHONE
50001  1001      Ramas       Alfred      A             0181           844-2573
50002  1002      Dunne       Leona       K             0161           894-1238
50003  1003      Moloi       Marlene     W             0191           894-2285
50004  1004      Olowski     Paul        F             0181           894-2180
50005  1005      Orlando     Myron                     0181           222-1672
50006  1006      O'Brian     Amy         B             0161           442-3381
50007  1007      Brown       James       G             0181           297-1228
50008  1008      Williams    George                    0113           290-2556
50009  1009      Padayachee  Vinaya      G             0161           382-7185
50010  1010      Moloi       Mlilo       K             0181           297-3809

The employee table
ROWID  EMP_NUM  EMP_LNAME  EMP_AREACODE  EMP_PHONE
72001  230      Smithson   0191          555-1234
72002  231      Johnson    0181          123-4536
72003  233      Wallace    0113          342-6567
72004  235      Ortozo     0161          899-3425

The join index on the common column AREACODE
ROWID (customer)  ROWID (employee)  AREACODE
50001             72002             0181
50002             72004             0161
50003             72001             0191
50004             72002             0181
50005             72002             0181
50006             72004             0161
50007             72002             0181
50008             72003             0113
50009             72004             0161
50010             72002             0181

This type of index is useful when dealing with large quantities of data that are typically found in data warehouses. Join indexes are less common in small relational databases. However, like bitmaps, they are unsuitable when there are high-volume updates. The queries that access these indexes may also not reference any fields in the WHERE clause which are not in the join index.
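The join index of Figure 11.24 can be reproduced with a short sketch. It assumes each table is reduced to (ROWID, area code) pairs, and build_join_index is a hypothetical helper; a real DBMS builds and maintains this structure internally.

```python
# (ROWID, CUST_AREACODE) pairs from the customer table in Figure 11.24.
customers = [
    (50001, "0181"), (50002, "0161"), (50003, "0191"), (50004, "0181"),
    (50005, "0181"), (50006, "0161"), (50007, "0181"), (50008, "0113"),
    (50009, "0161"), (50010, "0181"),
]

# (ROWID, EMP_AREACODE) pairs from the employee table in Figure 11.24.
employees = [(72001, "0191"), (72002, "0181"), (72003, "0113"), (72004, "0161")]


def build_join_index(left, right):
    """Precompute (left ROWID, right ROWID, code) for rows sharing an area code."""
    right_by_code = {}
    for rowid, code in right:
        right_by_code.setdefault(code, []).append(rowid)
    return [
        (left_rowid, right_rowid, code)
        for left_rowid, code in left
        for right_rowid in right_by_code.get(code, [])
    ]


join_index = build_join_index(customers, employees)
```

Because the matching ROWID pairs are stored once, a later join on the area code can read them directly instead of comparing the two tables again, which is also why the structure is costly to keep current under frequent updates.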
Hashed File Organisations
A hashed file organisation uses a hashing algorithm to map a primary key value onto a specific record address in the file. Records are stored in a random order throughout the file. Thus, hashed files are often referred to as random or direct files. A hashing algorithm will usually reduce the primary key, a real-world identifier, to a shorter artificial number called a hash. This artificial number has no direct meaning except that it will tell the DBMS where the record is located within the data storage area, counting from the start of the file. There are many different kinds of hashing algorithms, but the aim of each is to distribute records evenly throughout the file. One common type of hashing algorithm for generating a logical relative record number is known as the division/remainder method. The steps for this method are:
1 Choose a prime number that is approximately 20 per cent larger than the number of records you want to store.
2 Divide the value of the primary key by the prime number and use the remainder as the logical relative record number for storing the record.
Let's look at an example. Suppose we need to store information about 800 customers. A suitable prime number would be 997, as it is approximately 20 per cent larger than 800. A customer with a customer number of 120001 would then have a hash value of 361 (120001 divided by 997 gives 120 with a remainder of 361). The value 361 would then be the logical relative record number for the customer with the primary key value of 120001. Figure 11.25 illustrates this hashing algorithm.
FIGURE 11.25 Hashing algorithm applied to the customer number field
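The division/remainder method is simple enough to check directly. The sketch below assumes integer primary keys; is_prime and relative_record_number are hypothetical helper names, and the prime 997 and customer number 120001 are the values used in the worked example. Carrying out the division gives a quotient of 120 and a remainder of 361.

```python
def is_prime(n):
    """Trial division; adequate for the small table sizes used here."""
    if n < 2:
        return False
    divisor = 2
    while divisor * divisor <= n:
        if n % divisor == 0:
            return False
        divisor += 1
    return True


def relative_record_number(primary_key, prime):
    """Division/remainder hashing: the remainder is the logical relative record number."""
    if not is_prime(prime):
        raise ValueError("the divisor should be a prime number")
    return primary_key % prime


quotient, remainder = divmod(120001, 997)
slot = relative_record_number(120001, 997)
```

Every key therefore maps to a number between 0 and 996, which is why the prime is chosen somewhat larger than the expected number of records.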
In hashed file organisations, each relative address that is generated for a specific hashed file is held in a storage location known as a bucket. If the bucket can hold more than one record, then the individual records are held in a slot. If each bucket can hold several records, then the capacity of the bucket should be set to the number of records that the bucket can hold, including some free space for future modifications of records.
The main weakness with hashing is that there is no guarantee that a unique relative address is generated. If the hash algorithm generates the same relative address for two different primary keys, this is known as a collision. When collisions occur and the primary bucket has already been filled to capacity, the record must be stored elsewhere; this is known as overflowing. To deal with the collision problem, the primary key value of the overflowed record can be rehashed to obtain a new location in which to store the record. Alternatively, an overflow area is used to store the overflowing records, and a pointer from the primary bucket will point to the overflow area so the DBMS can keep track of where the records are stored. Dealing with collisions may seem complicated, and as the number of collisions increases, the time taken to retrieve data increases and performance will decrease. The good news is that most DBMSs will manage all of this.
This type of file organisation is generally used when records are retrieved based upon an exact match of the primary key value rather than in order, as the hash algorithm produces a random order for the stored records.
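Buckets, slots and an overflow area can be modelled in a few lines. This is a sketch under stated assumptions, not how any particular DBMS lays out its storage: the HashedFile class and its methods are hypothetical, the bucket capacity is set to one slot so that a collision is easy to trigger, and a single shared overflow list stands in for the overflow area.

```python
class HashedFile:
    """Toy hashed file: division/remainder addressing, fixed-capacity buckets, one overflow area."""

    def __init__(self, prime, bucket_capacity):
        self.prime = prime
        self.bucket_capacity = bucket_capacity
        self.buckets = {}   # relative address -> list of (key, record) slots
        self.overflow = []  # records whose primary bucket was already full

    def address(self, key):
        return key % self.prime

    def insert(self, key, record):
        """Store the record in its primary bucket, or in the overflow area if that bucket is full."""
        bucket = self.buckets.setdefault(self.address(key), [])
        if len(bucket) < self.bucket_capacity:
            bucket.append((key, record))
            return "bucket"
        self.overflow.append((key, record))
        return "overflow"

    def find(self, key):
        """Exact-match lookup: check the primary bucket first, then the overflow area."""
        for stored_key, record in self.buckets.get(self.address(key), []):
            if stored_key == key:
                return record
        for stored_key, record in self.overflow:
            if stored_key == key:
                return record
        return None


customer_file = HashedFile(prime=997, bucket_capacity=1)
first = customer_file.insert(120001, "customer 120001")
# 120998 = 120001 + 997, so it hashes to the same relative address: a collision.
second = customer_file.insert(120998, "customer 120998")
```

The second lookup path shows the cost of collisions: a record in the overflow area is only found after the primary bucket has been searched, so retrieval slows down as overflow grows.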
Clusters
User queries often require data from multiple related tables to be joined together. Where these tables are frequently joined, it would make sense to physically store them together. This would reduce the time required to access different parts of secondary storage and would speed up the query response time. A cluster is a portion of physical storage in which tables that share common columns and are frequently used together are stored together. Clustering is mainly used to increase the speed of data retrieval. The cluster key is a field, or set of fields, that the clustered tables have in common and which is used in the table join. The cluster key is determined when the cluster is created. For example, in the DVD rental store, the CUSTOMER table and the RENTAL table would obviously be accessed together frequently. Therefore, it would make sense for these two tables to be clustered. Figure 11.26 shows how the CUSTOMER and RENTAL tables would be physically stored together. CUST_NUM acts as the join between the two tables. Notice that each CUST_NUM is stored only once.
As part of physical database design you would have to select appropriate tables to be clustered together. The general rules for determining whether tables may benefit from clustering are:
Select tables that are mainly used for queries and not other data manipulation operations such as insert or update.
Select tables that are usually identified with related records from both tables being accessed together.
Clustering has an impact on queries that require a full table scan, as a full table scan of a clustered table often takes longer than a full table scan of an unclustered table. Clustering is therefore not a good idea when applications require tables to be scanned in full. However, whether to cluster or not will largely depend upon the application, and it is often necessary to undertake some experiments that compare query response times when tables are clustered together and when they are stored separately.
FIGURE 11.26 Cluster key on the CUSTOMER and RENTAL tables (Cluster Key = CUST_NUM; for each CUST_NUM value, such as 1001, 1002 and 1003, the CUSTOMER columns CUST_LNAME, CUST_FNAME, CUST_INITIAL, CUST_AREACODE and CUST_PHONE are stored once, followed by that customer's RENTAL rows with RENT_NUM and RENT_CHARGE)
11.3.4 Define Indexes
As you discovered in the previous section, indexes play an important role in improving the performance of the database system. Defining indexes is a large part of physical database design, and decisions need to be made regarding the fields to be indexed and the type of index (primary or secondary) that will be applied. Each table typically has a primary index placed on the primary key field of the table. Secondary indexes are usually created on additional fields that are used regularly in user queries, in order to increase the speed of data retrieval. In SQL, indexes are created using the CREATE INDEX statement. For example, if we wanted to create a primary index on the DVD_ID key field from the DVD table, the SQL would be:
CREATE UNIQUE INDEX DVDINDEX ON DVD(DVD_ID)
where:
UNIQUE specifies that each value in the DVD_ID field must be unique. If an attempt is made to insert a duplicate value (e.g. a duplicate DVD_ID) into the DVD table, the DBMS will return an error message. In addition, after the index creation, any further records that violate this unique constraint are not inserted.
DVDINDEX is the name of the index file that is created.
The ON clause specifies the table and the specific column on which the index is created.
Secondary indexes are created using a similar CREATE INDEX command. Suppose customers frequently ring up the DVD rental store to enquire whether a specific DVD is stocked by the store. They are unlikely to know the DVD_ID and will instead give the DVD_TITLE. A regular query to the database may be:
SELECT DVD_ID
FROM DVD
WHERE DVD_TITLE = 'Flowers in Winter';
To speed up results of this query, you could create a secondary index on the DVD_TITLE column:
CREATE INDEX DVD_TITLE_INDEX ON DVD(DVD_TITLE);
As it is possible that there could be two different DVDs with the same title (but different codes), the UNIQUE keyword should not be used. When a table has a secondary index, every time a new record is inserted into the table a corresponding record must also be inserted into the index, and as a result there can be additional performance overheads. This is also true of the primary index. During the physical database design process you should make some initial decisions about which fields to index. Indexes should also be considered as part of database tuning and optimisation, which is covered in Chapter 13, Managing Database and SQL Performance.
As a general rule, indexes are likely to be used:
When an indexed column appears by itself in a search criterion of a WHERE or HAVING clause.
When an indexed column appears by itself in a GROUP BY or ORDER BY clause.
When a MAX or MIN function is applied to an indexed column.
When the data sparsity on the indexed column is high.
Indexes are useful when you want to select a small subset of rows from a large table based on a condition; using the index is then more efficient than scanning the complete table. Data sparsity refers to the number of different values a column could possibly have. For example, a STU_GENDER column in a STUDENT table can have only two possible values, M or F; therefore, that column is said to have low sparsity. In contrast, the STU_DOB column that stores the student date of birth can have many different date values; therefore, that column is said to have high sparsity. Knowing the sparsity of a column helps you decide whether the use of an index is appropriate. For example, when you perform a search on a column with low sparsity, you are likely to read a high percentage of the table rows anyway; therefore, index processing may be unnecessary work.
The objective is to create indexes with high selectivity. Index selectivity is a measure of how likely an index will be used in query processing. Here are some general guidelines for creating and using indexes:
Create indexes for each single attribute used in a WHERE, HAVING, ORDER BY, or GROUP BY clause. If you create indexes on all single attributes used in search conditions, the DBMS will access the table using an index scan instead of a full table scan. For example, if you have an index for P_PRICE, the condition P_PRICE > 10.00 can be handled by accessing the index instead of sequentially scanning all table rows and evaluating P_PRICE for each row. Indexes are also used in join expressions, such as CUSTOMER.CUS_CODE = INVOICE.CUS_CODE.
Do not use indexes in small tables or tables with low sparsity. Remember, small tables and low-sparsity tables are not the same thing. A search condition in a table with low sparsity may return a high percentage of table rows anyway, making the index operation too costly and making the full table scan a viable option. Using the same logic, do not create indexes for tables with few rows and few attributes unless you must ensure the existence of unique values in a column.
Declare primary and foreign keys so the optimiser can use the indexes in join operations. All natural joins and old-style joins will benefit if you declare primary keys and foreign keys, because the optimiser will use the available indexes at join time. (In many DBMSs the declaration of a PK will automatically create an index for the declared column; whether an FK is indexed automatically depends on the DBMS.) Also, for the same reason, it is better to write joins using the SQL JOIN syntax (see Chapter 9, Procedural Language SQL and Advanced SQL). (Note that the DBMS optimiser and query optimisation will be covered in detail in Chapter 13, Managing Database and SQL Performance.)
Declare indexes in join columns other than PK/FK. If you do join operations on columns other than the primary and foreign keys, you may be better off declaring indexes in those columns.
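The behaviour of a unique index and a secondary index can be tried out with a small experiment. The sketch below is an assumption-laden stand-in: it uses SQLite through Python's sqlite3 module rather than Oracle, the three sample rows are made up for the demonstration, and insert_dvd is a hypothetical helper. The exact wording of SQLite's query plan varies between versions, so the sketch only looks for the index name in it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE DVD (DVD_ID TEXT, DVD_TITLE TEXT)")
conn.executemany(
    "INSERT INTO DVD VALUES (?, ?)",
    [("W6756", "Flowers in Spring"),
     ("W4567", "Flowers in Summer"),
     ("S8756", "The Stars")],
)

# Primary (unique) index on the key field, secondary index on the title.
conn.execute("CREATE UNIQUE INDEX DVDINDEX ON DVD(DVD_ID)")
conn.execute("CREATE INDEX DVD_TITLE_INDEX ON DVD(DVD_TITLE)")


def insert_dvd(dvd_id, title):
    """Return True if the row was accepted, False if an index constraint rejected it."""
    try:
        conn.execute("INSERT INTO DVD VALUES (?, ?)", (dvd_id, title))
        return True
    except sqlite3.IntegrityError:
        return False


duplicate_id_accepted = insert_dvd("W6756", "Another Title")  # violates DVDINDEX
duplicate_title_accepted = insert_dvd("X0001", "The Stars")   # allowed: index is not UNIQUE

# Ask the optimiser how it would run the title lookup.
plan = conn.execute(
    "EXPLAIN QUERY PLAN SELECT DVD_ID FROM DVD WHERE DVD_TITLE = ?",
    ("Flowers in Spring",),
).fetchall()
plan_text = " ".join(str(row[-1]) for row in plan)
uses_title_index = "DVD_TITLE_INDEX" in plan_text
```

The duplicate DVD_ID is rejected while the duplicate title is accepted, and the plan for the title lookup names the secondary index instead of reporting a full table scan.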
11.3.5 Define User Views
During the conceptual design stage, the different user views required for the database are determined. Using the relations defined in the logical data model, these views must now be defined. Views are often defined taking database security into account, as they can help to define the roles of different types of users. We discuss how to define roles in section 11.3.7. You can learn more about how to create views in SQL in Chapter 8, Beginning Structured Query Language.
11.3.6 Estimate Data Storage Requirements
Allocating physical storage characteristics depends on the DBMS and the operating systems used. Most of the information necessary for defining the physical storage characteristics can be found in the technical manuals of the software you are using.
NOTE If the DBMS does not automate the process of determining storage locations and data access paths, physical design requires well-developed technical skills and a precise knowledge of the physical-level details of the database, operating system and hardware used by the database. Fortunately, the more recent versions of relational DBMS software hide most of the complexities inherent in the physical design phase.
During the process of physical database design it is important to estimate not only the size of each table but also its long-term growth pattern. It is not necessary to be 100 per cent accurate but it should be based upon the expected growth of the business. Therefore, input into this process should be provided by the business experts within the company. They will need to answer questions such as 'How many customers are we likely to have in the next five years?' or 'Are we likely to expand the products that we currently sell?' Next, the physical requirements of each table must be estimated. One simple way of performing this for each table is to:
1 Estimate the size of each row by summing the length in bytes for each data type.
2 Estimate the number of rows, taking into consideration the expected growth.
3 Multiply the size by the estimated number of rows.
Table 11.7 shows this calculation for the DVD table from the DVD rental store database.
TABLE 11.7 Physical storage requirements: the DVD table
Attribute Name        Data Type      Storage Requirement (Bytes)
DVD_ID                VARCHAR2(10)   10
MOVIE_COPIES          NUMBER(3)      3
MOVIE_NAME            VARCHAR2(50)   50
MOVIE_CHARGE          NUMBER(2,2)    4
MOVIE_LATE_CHG_DAY    NUMBER(2,2)    4
CATEGORY              CHAR(6)        6
Row length:                          77
Number of rows:                      7 590
Total space required:                584 430

The physical size of any indexes that have been specified must also be estimated. This is more difficult than estimating table sizes, because the actual size can depend on the specific DBMS.
NOTE Oracle 18c provides a number of tools for estimating the size of database tables and indexes:
CREATE_TABLE_COST determines the size of the table given various attributes including the average row size in bytes.
CREATE_INDEX_COST determines the amount of storage space required to create an index on an existing table.
These tools, however, can usually only be accessed by the database administrator.
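The three-step estimate for the DVD table can be checked with a few lines of arithmetic. The byte counts are those listed in Table 11.7; the 3 bytes for MOVIE_COPIES is an assumption inferred from the stated row length of 77, and the variable names are chosen for the example only.

```python
# Step 1: bytes per attribute for the DVD table (Table 11.7).
column_bytes = {
    "DVD_ID": 10,              # VARCHAR2(10)
    "MOVIE_COPIES": 3,         # NUMBER(3); assumed, inferred from the row length of 77
    "MOVIE_NAME": 50,          # VARCHAR2(50)
    "MOVIE_CHARGE": 4,         # NUMBER(2,2)
    "MOVIE_LATE_CHG_DAY": 4,   # NUMBER(2,2)
    "CATEGORY": 6,             # CHAR(6)
}
row_length = sum(column_bytes.values())

# Step 2: estimated number of rows, including expected growth.
estimated_rows = 7590

# Step 3: multiply the row size by the estimated number of rows.
total_bytes = row_length * estimated_rows
```

The result is a raw data estimate only; it leaves out index storage and any per-row or per-block overhead that a specific DBMS adds.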
11.3.7 Determine Database Security for Users
In Chapter 10, Database Development Process, issues surrounding the security of databases, such as potential threats and the measures that could be taken to combat these threats, were discussed. As part of the Systems Development Life Cycle (SDLC), the security requirements of the database will have been identified. This will have included all the users of the database and their individual access requirements and restrictions. During physical database design, these requirements must be implemented within the target DBMS and database privileges for users will need to be established. For example, privileges may include selecting rows from specified tables or views, being able to modify or delete data in specified tables, etc.
Implementing basic data security in Oracle requires all users to be given an account comprising a user name and an associated password. Oracle has two levels of privilege (system and object) that allow the database administrator (DBA) to control how much power a specific user is granted. For example, we do not want all staff to be able to access the complete database or to drop tables when they have no right to do so. System privileges authorise a user account to execute SQL data definition language (DDL) commands such as CREATE TABLE. Object privileges allow a user account to execute SQL data manipulation language (DML) commands such as performing SELECT, INSERT, UPDATE and DELETE operations on specific tables.
The SQL commands GRANT and REVOKE are used to authorise or withdraw privileges on specific user accounts. For example, the following two SQL statements grant the user account with the username Craig the ability to select rows from the DVD table and the ability to create tables:

    GRANT SELECT ON DVD TO Craig;
    GRANT CREATE TABLE TO Craig;

Removing these privileges can be done using the following SQL statements:

    REVOKE SELECT ON DVD FROM Craig;
    REVOKE CREATE TABLE FROM Craig;

In any company, it is likely that there will be a very large number of users, each requiring a number of privileges depending on the type of operations they perform, so the DBA will have a very difficult time managing each account individually. To overcome this, privileges can be grouped under a role. A role is a collection of privileges referred to under a single name. The major benefit of roles is that a DBA can simply add or revoke privileges from the role at any time. These changes then automatically apply to all the users who have been assigned the role. For example, in the DVD rental store, the sales staff will all need to have the ability to perform SELECT and UPDATE operations on the CUSTOMERS table. The SQL command CREATE ROLE is used to create the role STAFF_CUSTOMER_ROLE:

    CREATE ROLE STAFF_CUSTOMER_ROLE;

Once created, privileges on selected database objects can then be granted to the new role. For example:

    GRANT SELECT ON CUSTOMERS TO STAFF_CUSTOMER_ROLE;
    GRANT UPDATE ON CUSTOMERS TO STAFF_CUSTOMER_ROLE;

The last stage then involves granting the role to individual user accounts, e.g.

    GRANT STAFF_CUSTOMER_ROLE TO Craig;
    GRANT STAFF_CUSTOMER_ROLE TO Lindiwe;

If the DBA then chooses to revoke a privilege from the role, it is automatically removed from all users assigned to the role.
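The way a role propagates privilege changes to its members can be illustrated with a small Python model. This is a toy sketch of the idea only, not how Oracle stores or checks privileges. The dictionaries and the function names (grant_to_role, revoke_from_role, grant_role, can) are hypothetical, while the role, table and user names are taken from the example above.

```python
from collections import defaultdict

# role -> set of (privilege, table) pairs held by that role
role_privileges = defaultdict(set)
# user -> set of roles assigned to that user
user_roles = defaultdict(set)


def grant_to_role(role, privilege, table):
    """Model of: GRANT <privilege> ON <table> TO <role>;"""
    role_privileges[role].add((privilege, table))


def revoke_from_role(role, privilege, table):
    """Model of: REVOKE <privilege> ON <table> FROM <role>;"""
    role_privileges[role].discard((privilege, table))


def grant_role(role, user):
    """Model of: GRANT <role> TO <user>;"""
    user_roles[user].add(role)


def can(user, privilege, table):
    """A user holds a privilege if any of the user's roles holds it."""
    return any((privilege, table) in role_privileges[r] for r in user_roles[user])


grant_to_role("STAFF_CUSTOMER_ROLE", "SELECT", "CUSTOMERS")
grant_to_role("STAFF_CUSTOMER_ROLE", "UPDATE", "CUSTOMERS")
grant_role("STAFF_CUSTOMER_ROLE", "Craig")
grant_role("STAFF_CUSTOMER_ROLE", "Lindiwe")

update_before = [can(u, "UPDATE", "CUSTOMERS") for u in ("Craig", "Lindiwe")]

# One change to the role affects every user assigned to it.
revoke_from_role("STAFF_CUSTOMER_ROLE", "UPDATE", "CUSTOMERS")
update_after = [can(u, "UPDATE", "CUSTOMERS") for u in ("Craig", "Lindiwe")]

print(update_before, update_after)
```

Because users hold privileges only through the role in this model, the single revoke is enough to withdraw UPDATE from both accounts, which is the administrative saving the text describes.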
SUMMARY

Conceptual database design is where the conceptual representation of the database is created by producing a data model that identifies the relevant entities and relationships within the system. The conceptual model must embody a clear understanding of the business and its functional areas. This stage of the Database Life Cycle can be broken down into four steps: data analysis and requirements, entity relationship modelling and normalisation, data model verification, and distributed database design.

Data model verification is the stage where the ER model must be verified against the proposed system processes in order to corroborate that the intended processes can be supported by the database model. Verification requires that the model be run through a series of tests against end-user data views and their required transactions, access paths and security, and business-imposed data requirements and constraints.

Logical database design is the second stage of the Database Design Life Cycle, where relations are designed based on each entity within the conceptual model. Logical database design comprises the following stages: creating the logical data model, validating the logical data model using normalisation, assigning and validating integrity constraints, merging logical models constructed for different parts of the database, and reviewing the logical data model with the user.

When creating the logical model, the order in which entities are translated into relations is important. Those entities with no dependents (e.g. that do not contain any foreign keys) should be translated first. To create a relation in the logical model, the name of the relation is specified along with its attributes enclosed in brackets. The primary key is identified, followed by any foreign keys.

Physical database design is where the logical model is mapped onto the physical storage structures of the DBMS. The ultimate goal must be to ensure that data storage is used effectively, to ensure integrity and security, and to improve efficiency in terms of query response time. Physical database design comprises the following seven stages: analysing data volume and database usage, translating each relation identified in the logical data model into tables, determining a suitable file organisation, defining indexes, defining user views, estimating data storage requirements, and determining database security for users.

Selecting a suitable file organisation is important for fast data retrieval and efficient use of storage space. The three most common types of file organisation are heap files, which contain randomly ordered records; indexed sequential files, which are sorted on one or more fields; and hashed files, in which a hashing algorithm is used to determine the address of each record based upon the value of the primary key.

Within a DBMS, indexes are often stored in data structures known as B-trees, which allow fast data retrieval. Two other kinds of indexes are bitmap indexes and join indexes. These are often used on multidimensional data held in data warehouses.

Indexes are crucial for speeding up data access. Indexes facilitate searching, sorting, and using aggregate functions and even join operations. The improvement in data access speed occurs because an index is an ordered set of values that contains the index key and pointers. Data sparsity refers to the number of different values a column could possibly have. Indexes are recommended in highly sparse columns used in search conditions.
Online Content
In Appendices B and C, available on the online platform for this book, you will have the chance to experience all the stages of the database design life cycle through the creation of two real-world database systems: the University Lab and Global Tickets Ltd, a travel e-commerce database.
KEY TERMS
B-tree
bitmap index
cluster key
cohesivity
composite usage map
conceptual design
description of operations
file organisation
hashed file
heap file
index selectivity
indexes
join index
minimal data rule
module
module coupling
primary index
secondary index
sequential file
transaction usage map
FURTHER READING
Comer, D. The Ubiquitous B-Tree, ACM Computing Surveys, 11(2), pp. 121-137, 1979.
Garmany, J., Walker, J. and Clark, T. Logical Database Design Principles (Foundations of Database Design). AUERBACH, 2005.
Lightstone, S., Teorey, T. and Nadeau, T. Physical Database Design: The Database Professional's Guide to Exploiting Indexes, Views, Storage, and More, 4th revised edition. Morgan Kaufmann Series in Data Management Systems, 2007.
Pavlovic, Z. and Veselica, M. Oracle Database 12c Security Cookbook. Packt Publishing, 2016.
Teorey, T. Database Modelling and Design: Logical Design, 5th edition. The Morgan Kaufmann Series in Data Management Systems, 2011.
Wiederhold, G. Database Design, 2nd edition. McGraw-Hill, 1983.

Online Content
Answers to selected Review Questions and Problems for this chapter are available on the online platform accompanying this book.
REVIEW QUESTIONS
1 What are the stages of the conceptual database design?
2 What are business rules? Why are they important to a database designer?
3 Which steps are required in the development of an ER diagram?
4 List and briefly explain the activities involved in the verification of an ER model.
5 Describe the logical database design process.
6 Describe the steps required to convert the conceptual ER model into the logical model.
7 What are the typical problems in merging relations?
8 Which integrity constraints need validating during logical database design?
9 What are the stages of physical database design?
10 Why is it important to analyse data volume and usage statistics?
11 When should indexes be used?
12 Describe the purpose of a B-tree.
13 What factors are important in selecting a bitmap index?
14 How is basic database security implemented?
15 How are entity integrity and referential integrity enforced when creating tables in SQL?
PROBLEMS
1 Write the proper sequence of activities in the design of a video rental database. (The initial ERD was shown in Figure 11.7.) The design must support all rental activities, customer payment tracking and employee work schedules, as well as track which employees checked out the videos to the customers. When you have finished writing the design activity sequence, complete the ERD to ensure that the database design can be successfully implemented. (Make sure that the design is properly normalised and that it can support the required transactions.)
2 Create the initial ER diagram for a car dealership. The dealership sells both new and used cars, and it operates a service facility. Base your design on the following business rules:
  a A salesperson can sell many cars, but each car is sold by only one salesperson.
  b A customer can buy many cars, but each car is sold to only one customer.
  c A salesperson writes a single invoice for each car sold.
  d A customer gets an invoice for each car(s) he or she buys.
  e A customer might come in only to have a car serviced; that is, one need not buy a car to be classified as a customer.
  f When a customer takes in one or more cars for repair or service, one service ticket is written for each car.
  g The car dealership maintains a service history for each car serviced. The service records are referenced by the car's serial number.
  h A car brought in for service can be worked on by many mechanics, and each mechanic can work on many cars.
  i A car that is serviced may or may not need parts. (For example, parts are not necessary to adjust a carburettor or to clean a fuel injector nozzle.)
3 Verify the conceptual model you created in Question 2. Create a data dictionary for the verified model.
4 Transform the ERD in Figure P11.1 into a relational schema showing all primary and foreign keys.

FIGURE P11.1 ERD for Problem 4
11 Should you create an index on EMP_DOB? Why or why not?

Problems 12 and 13 are based on the following query:

    SELECT P_CODE, SUM(LINE_UNITS)
    FROM LINE
    GROUP BY P_CODE
    HAVING SUM(LINE_UNITS) > (SELECT MAX(LINE_UNITS) FROM LINE);

12 What is the likely data sparsity of the LINE_UNITS column?
13 Should you create an index? If so, what would the index column(s) be and why would you create that index? If not, explain your reasoning.
14 Should you create an index on P_CODE? If so, write the SQL command to create that index. If not, explain your reasoning.

Problems 15 and 16 are based on the following query:

    SELECT P_CODE, P_QOH*P_PRICE
    FROM PRODUCT
    WHERE P_QOH*P_PRICE > (SELECT AVG(P_QOH*P_PRICE) FROM PRODUCT)

15 What is the likely data sparsity of the P_QOH and P_PRICE columns?
16 Should you create an index? If so, what would the index column(s) be, and why should you create that index?
17 Consider the composite usage map shown in Figure P11.3 for a building company called BricksRUs. The composite usage map shows that there are 1000 rows in the materials table (e.g. cement, bricks, etc.). There are two types of materials that are purchased: full price materials and wholesale materials. Full-price materials account for 35 per cent of materials, while wholesale materials account for 70 per cent of purchases. As the same material can be of both subtypes, the percentages can be greater than 100 per cent.

FIGURE P11.3 Composite usage map for BricksRUs

When contractors wish to apply for a contract for a building
CHAPTER
job, they and, are
send in
on
average,
500
accesses
materials
there After
a
and
estimates. they to the
350
of time,
The number 175 require
b
are roughly
80 estimates
material
accesses
are 40 subsequent a period
There
provide
table
to
accesses the
which
assumptions
of direct
for
accesses
can
this
Logical,
3200 down
Of the
the
into
175
60
accesses
usage
map
have
changed
estimate
for
accesses
631
BricksRUs
On average
to the
Design
there
to full-price
contractor
table,
as follows:
to 400 per hour.
Out of this,
table.
Wholesale materials now account for 80 per cent of all materials.
Full price materials nowrepresent only 25 per cent of all materials.
d
There are now an average of 60 estimates for each supplier.
Draw a new composite
Draw a B1-tree You should have no
19
jobs
Database
table.
c
18
Physical
estimates).
materials has decreased
to
and
who undertake of
be broken
estimate
accesses to
subsequent
a total
materials.
to the
Conceptual,
40 contractors (giving
wholesale
11
usage
map reflecting
this
new information.
with n 5 2 and insert the following
show the insertions
more than two
Remember
keys and no fewer than
Draw another B1-tree show the insertions
at each stage.
keys in order: A, B, C, D and Einto the tree. n 5 2 means that
nodes
are allowed
to
one key each.
with n 5 2 and insert the following
keys in order: B, D, C, A, E, F. You should
at each stage.
11
Copyright Editorial
review
2020 has
Cengage deemed
Learning. that
any
All suppressed
Rights
Reserved. content
does
May not
not materially
be
copied, affect
scanned, the
overall
or
duplicated, learning
in experience.
whole
or in Cengage
part.
Due Learning
to
electronic reserves
rights, the
right
some to
third remove
party additional
content
may content
be
suppressed at
any
time
from if
the
subsequent
eBook rights
and/or restrictions
eChapter(s). require
it
Part V
DATABASE TRANSACTIONS AND PERFORMANCE TUNING
12 Managing Transactions and Concurrency
13 Managing Database and SQL Performance
BUSINESS VIGNETTE
FROM DATA WAREHOUSE TO DATA LAKE
Since the early 1990s, a vast amount of data has been stored in data warehouses in order to provide a central repository for business intelligence within an organisation. The concept of a data warehouse originated from studies undertaken at MIT in the early 1970s.1 However, the term 'information warehouse' was first used in 1986 by Barry Devlin and Paul Murphy in an article entitled 'An Architecture for a Business and Information System' in IBM Systems Journal.2 They identified what was known commonly as the 'islands of information' problem. This is where organisations had many operational systems that were not integrated, data were duplicated and reporting from the global business perspective was rare.
Data warehousing took off in earnest in 1991 when Bill Inmon published his book entitled Building the Data Warehouse.3 While in 1996 there were more data warehouse projects initiated than in previous years, arguments began about whether data warehousing solutions were too generalised in trying to model the whole organisation. An alternative methodology to developing a data warehouse that focused on the use of data marts was championed by Ralph Kimball.4 The development of data marts focused on the data requirements of individual departments rather than the whole organisation. The data mart proved successful as it provided a quick return on investment and introduced the concepts of the dimensional modelling of data.
It is now the norm for data warehouses to store terabytes of data. The number of users accessing a typical warehouse has increased, along with the requirements for more complex queries and near real-time information.

1 Haisten, M. Data warehousing: what's next? Part 4: integrate the new islands of information, DM Direct Newsletter. Available at www.dmreview.com/article_sub.cfm?articleId=5238
2 Devlin, B. and Murphy, P. An architecture for a business and information system, IBM Systems Journal, 27(1), pp. 60-80, 1988.
3 Inmon, B. Building the Data Warehouse, 4th edition. Hungry Minds Inc, 2005.
4 Kimball, Ralph. The Data Warehouse Toolkit: Practical Techniques for Building Dimensional Data Warehouses, Kindle Edition. John Wiley & Sons, 2010.

With the rise of Big Data, traditional data
warehouses
user to store
companies Today,
of the
to
whose
right
with data larger
Data
will be required
a
competitors
our
customers
used
to
Lake
deliver
In 2018,
new
based
business
upon
that
to
their
previous and
their
company
Big
analytics
and
artificial
Columbus,
L. 10
Charts
That
Business
Data, it
would
intelligence
of tools
to
The
Data is to
Big
analytics warehouse
dimensional
such
as What
products
do
business
business
from
value. Big
and
multiple
Today,
that, if they
more
as
to
of this
a data
questions
to Which
will increase reported
deliver
such
habits?
Data is likely
data analysis
uses
answer
view
be devised
intelligence
more efficient
approaches
Smart
need to
typically
shopping
produce
context.
complex
campaign?
Big Data revenues
decision
which
are important
cross-functional
architectures
intelligence
advertising
advanced
5
business
a variety
alongside
accelerate
right
mining. It can be used to
59 per cent of executives
and to
Data Lakes
a summarised,
organisations.
billion in 2027.5 In addition,
accuracy
is a Data Lake,
form.
Furthermore,
Business
opportunities
Forbes reported
the
through
data
provides
information
sizes.
results
an expensive
might like,
to find
and in
decisions
and
introduced
time
which
where different
timely
An alternative
an unstructured
driven.
a business.
make
tools
in
Data,
and different
within
to
data
right
Data
volumes
tools
Data
our
is
Big
forecasting
as rigid in structure.
Smart
at the
cope
analyses,
are
require
information
are fundamental
be seen
business
from
Tuning
data in its raw format
organisations
be extracted
and/or
Performance
can sometimes
allows the for
and
if
we think
intelligence
processes.
$42 billion in
2018 to
$103
used artificial intelligence main
driver
achieve
of combining
greater
predictive
making.
Will Change
Your
Perspective
Of Big
Datas
Growth,
Available:
www.forbes.com/sites/
louiscolumbus/2018/05/23/10-charts-that-will-change-your-perspective-of-big-datas-growth/#1d6790d32926
CHAPTER 12
Managing Transactions and Concurrency

IN THIS CHAPTER, YOU WILL LEARN:
What a database transaction is and what its properties are
What concurrency control is and what role it plays in maintaining the database's integrity
What locking methods are and how they work
How stamping methods are used for concurrency control
How optimistic methods are used for concurrency control
How database recovery management is used to maintain database integrity
The ANSI levels of transaction isolation
PREVIEW
Database transactions reflect real-world transactions that are triggered by events such as buying a product, registering for a course, or making a deposit in a current account. Transactions are likely to contain many parts. For example, a sales transaction may require updating the customer's account, adjusting the product inventory and updating the seller's accounts receivable. All parts of a transaction must be successfully completed to prevent data integrity problems. Therefore, executing and managing transactions are important database system activities.
The main database transaction properties are atomicity, consistency, durability, isolation and serialisability. After defining these transaction properties, this chapter shows how SQL can be used to represent transactions and how transaction logs can ensure the DBMS's ability to recover transactions.
When many transactions take place at the same time, they are called concurrent transactions. Managing the execution of such transactions is called concurrency control. As you can imagine, concurrency control is especially important in a multi-user database environment (just imagine the number of transactions routinely handled by companies that conduct sales and provide services via the Web!). This chapter discusses some of the problems that can occur with concurrent transactions, such as lost updates, uncommitted data and inconsistent summaries. You will discover that such problems can be solved when a DBMS scheduler enforces concurrency control.
You will learn about the most common algorithms for concurrency control: locks, time stamping and optimistic methods. Because locks are the most widely used method, you will examine various levels and types of locks. Locks can also create deadlocks, so you will learn about strategies for managing deadlocks.
Database contents can be damaged or destroyed by critical operational errors, including transaction management failures. Therefore, you will learn how database recovery management maintains a database's contents by means of various backup procedures. Such backup procedures range from full backups to transaction log backups.
12.1 WHAT IS A TRANSACTION?
To illustrate what transactions are and how they work, let's use the Ch12_SaleCo database. The relational diagram for that database is shown in Figure 12.1.

Online Content
The 'Ch12_SaleCo' database used to illustrate the material in this chapter is available on the online platform for this book.

FIGURE 12.1 The Ch12_SaleCo database ERD
NOTE
Although SQL commands illustrate several transaction and concurrency control issues, you should be able to follow the discussions even if you have not studied Chapter 8, Beginning Structured Query Language, and Chapter 9, Procedural Language SQL and Advanced SQL. If you don't know SQL, ignore the SQL commands and focus on the discussions. If you have a working knowledge of SQL, you can use the Ch12_SaleCo database to generate your own SELECT and UPDATE examples and to augment the material presented in Chapters 8 and 9 by writing your own triggers and stored procedures.

As you examine the relational diagram in Figure 12.1, note the following features:
The design stores the customer balance (CUST_BALANCE) value in the CUSTOMER table to indicate the total amount owed by the customer. The CUST_BALANCE attribute is increased whenever the customer makes a purchase on credit, and it is decreased when the customer makes a payment. Including the current customer account balance in the CUSTOMER table makes it easy to write a query to determine the current balance for any customer or to generate important summaries such as total, average, minimum and maximum balances.
The ACCT_TRANSACTION table records all customer purchases and payments to track the details of customer account activity.
Naturally, you can change the database design of the Ch12_SaleCo database to reflect accounting practice more precisely, but the implementation provided here will enable you to track the transactions well enough to serve the purpose of the chapter's discussions.
To understand the concept of a transaction, suppose that you sell a product to a customer. Furthermore, suppose that the customer may charge the purchase to her/his account. Given that scenario, your sales transaction consists of at least the following parts:
You must write a new customer invoice.
You must reduce the quantity on hand in the product's inventory.
You must update the account transactions.
You must update the customer balance.
The preceding sales transaction must be reflected in the database. In database terms, a transaction is any action that reads from and/or writes to a database. A transaction may consist of a simple SELECT statement to generate a list of table contents; it may consist of a series of related UPDATE statements to change the values of attributes in various tables; it may consist of a series of INSERT statements to add rows to one or more tables; or it may consist of a combination of SELECT, UPDATE and INSERT statements. The sales transaction example includes a combination of INSERT and UPDATE statements.
Given the preceding discussion, you can now augment the definition of a transaction. A transaction is a logical unit of work that must be entirely completed or entirely aborted; no intermediate states are acceptable. In other words, a multi-component transaction, such as the previously mentioned sale, must not be partially completed. Updating only the inventory or only the accounts receivable is not acceptable. All of the SQL statements in the transaction must be completed successfully. If any of the SQL statements fail, the entire transaction is rolled back to the original database state that existed before the transaction started. A successful transaction changes the database from one consistent state to another. A consistent database state is one in which all data integrity constraints are satisfied.
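The all-or-nothing behaviour of a transaction can be tried out with Python's built-in sqlite3 module. The sketch below is illustrative only: it uses SQLite rather than the DBMS assumed by the book, a two-table schema cut down from Ch12_SaleCo, a CHECK constraint added here to force a failure, and a hypothetical helper named charge_sale.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Simplified, assumed schema: only the columns needed for the demonstration.
conn.execute(
    "CREATE TABLE PRODUCT (PROD_CODE TEXT PRIMARY KEY, "
    "PROD_QOH INTEGER NOT NULL CHECK (PROD_QOH >= 0))"
)
conn.execute(
    "CREATE TABLE CUSTOMER (CUST_NUMBER INTEGER PRIMARY KEY, "
    "CUST_BALANCE REAL NOT NULL)"
)
conn.execute("INSERT INTO PRODUCT VALUES ('89-WRE-Q', 0)")  # out of stock
conn.execute("INSERT INTO CUSTOMER VALUES (10016, 0.0)")
conn.commit()


def charge_sale(conn, cust_number, prod_code, amount):
    """Run both updates as one unit: commit if both succeed, else roll back."""
    try:
        with conn:  # commits on success, rolls back if an exception is raised
            conn.execute(
                "UPDATE CUSTOMER SET CUST_BALANCE = CUST_BALANCE + ? "
                "WHERE CUST_NUMBER = ?",
                (amount, cust_number),
            )
            conn.execute(
                "UPDATE PRODUCT SET PROD_QOH = PROD_QOH - 1 WHERE PROD_CODE = ?",
                (prod_code,),
            )
        return True
    except sqlite3.IntegrityError:
        return False


def current_state(conn):
    balance = conn.execute(
        "SELECT CUST_BALANCE FROM CUSTOMER WHERE CUST_NUMBER = 10016"
    ).fetchone()[0]
    qoh = conn.execute(
        "SELECT PROD_QOH FROM PRODUCT WHERE PROD_CODE = '89-WRE-Q'"
    ).fetchone()[0]
    return balance, qoh


# The second UPDATE violates the CHECK constraint, so the first is undone too.
first_attempt = charge_sale(conn, 10016, "89-WRE-Q", 277.55)
state_after_failure = current_state(conn)

# After restocking, the same transaction completes and both changes persist.
conn.execute("UPDATE PRODUCT SET PROD_QOH = 1 WHERE PROD_CODE = '89-WRE-Q'")
conn.commit()
second_attempt = charge_sale(conn, 10016, "89-WRE-Q", 277.55)
state_after_success = current_state(conn)

print(first_attempt, state_after_failure, second_attempt, state_after_success)
```

In the failed attempt the customer balance update had already been executed, yet it does not survive, because the rollback returns the database to the state that existed before the transaction started.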
CHAPTER
To ensure consistent database later,
consistency
state. that
of the
If the
violates
its
all transactions
database,
database
is
integrity
are
every transaction
not in and
controlled
a consistent
business
and
the
For that
by the
Managing
must begin
state,
rules.
executed
12
with the
transaction
reason,
DBMS
Transactions
to
database
will yield
subject
and
639
in a known
an inconsistent
to limitations
guarantee
Concurrency
database
discussed
integrity.
Most real-world database transactions are formed by two or more database requests. A database request is the equivalent of a single SQL statement in an application program or transaction. Therefore, if a transaction is composed of two UPDATE statements and one INSERT statement, the transaction uses three database requests. In turn, each database request generates several input/output (I/O) operations that read from or write to physical storage media.

12.1.1 Evaluating Transaction Results
Not all transactions update the database. Suppose you want to examine the CUSTOMER table to determine the current balance for customer number 10016. Such a transaction can be completed by using the SQL code:

SELECT CUST_NUMBER, CUST_BALANCE
FROM CUSTOMER
WHERE CUST_NUMBER = 10016;

Although the query does not make any changes in the CUSTOMER table, the SQL code represents a transaction because it accesses the database. If the database existed in a consistent state before the access, the database remains in a consistent state after the access because the transaction did not alter the database.

Remember that a transaction may consist of a single SQL statement or a collection of related SQL statements. Let's revisit the previous sales example to illustrate a more complex transaction, using the Ch12_SaleCo database. Suppose that on 18 January 2019 you register the credit sale of one unit of product 89-WRE-Q to customer 10016 in the amount of 277.55. The required transaction affects the INVOICE, LINE, PRODUCT, CUSTOMER and ACCT_TRANSACTION tables. The SQL statements that represent this transaction are as follows:

INSERT INTO INVOICE
VALUES (1009, 10016, '18-Jan-2019', 256.99, 20.56, 277.55, 'cred', 0.00, 277.55);
INSERT INTO LINE
VALUES (1009, 1, '89-WRE-Q', 1, 256.99, 256.99);
UPDATE PRODUCT
SET PROD_QOH = PROD_QOH - 1
WHERE PROD_CODE = '89-WRE-Q';
UPDATE CUSTOMER
SET CUST_BALANCE = CUST_BALANCE + 277.55
WHERE CUST_NUMBER = 10016;
INSERT INTO ACCT_TRANSACTION
VALUES (10007, '18-Jan-19', 10016, 'charge', 277.55);
COMMIT;
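The five statements above can be tried end to end with a small script. The following is only a sketch, assuming Python's built-in sqlite3 module and a trimmed-down, hypothetical schema (the column names follow Figure 12.2, but the column types and the two seed rows are assumptions made for the illustration, not the Ch12_SaleCo definition).

```python
import sqlite3

# Hypothetical, simplified schema: types and seed rows are assumptions.
SCHEMA = """
CREATE TABLE INVOICE (INV_NUMBER INTEGER PRIMARY KEY, CUST_NUMBER INTEGER,
    INV_DATE TEXT, INV_SUBTOTAL REAL, INV_TAX REAL, INV_TOTAL REAL,
    INV_PAY_TYPE TEXT, INV_PAY_AMOUNT REAL, INV_BALANCE REAL);
CREATE TABLE LINE (INV_NUMBER INTEGER, LINE_NUMBER INTEGER, PROD_CODE TEXT,
    LINE_UNITS INTEGER, LINE_PRICE REAL, LINE_AMOUNT REAL,
    PRIMARY KEY (INV_NUMBER, LINE_NUMBER));
CREATE TABLE PRODUCT (PROD_CODE TEXT PRIMARY KEY, PROD_QOH INTEGER);
CREATE TABLE CUSTOMER (CUST_NUMBER INTEGER PRIMARY KEY, CUST_BALANCE REAL);
CREATE TABLE ACCT_TRANSACTION (ACCT_TRANS_NUM INTEGER PRIMARY KEY,
    ACCT_TRANS_DATE TEXT, CUST_NUMBER INTEGER, ACCT_TRANS_TYPE TEXT,
    ACCT_TRANS_AMOUNT REAL);
INSERT INTO PRODUCT VALUES ('89-WRE-Q', 12);
INSERT INTO CUSTOMER VALUES (10016, 0.00);
"""


def register_sale(conn):
    """Run the five database requests as one transaction, then COMMIT."""
    cur = conn.cursor()
    cur.execute("INSERT INTO INVOICE VALUES (1009, 10016, '18-Jan-2019', "
                "256.99, 20.56, 277.55, 'cred', 0.00, 277.55)")
    cur.execute("INSERT INTO LINE VALUES "
                "(1009, 1, '89-WRE-Q', 1, 256.99, 256.99)")
    cur.execute("UPDATE PRODUCT SET PROD_QOH = PROD_QOH - 1 "
                "WHERE PROD_CODE = '89-WRE-Q'")
    cur.execute("UPDATE CUSTOMER SET CUST_BALANCE = CUST_BALANCE + 277.55 "
                "WHERE CUST_NUMBER = 10016")
    cur.execute("INSERT INTO ACCT_TRANSACTION VALUES "
                "(10007, '18-Jan-19', 10016, 'charge', 277.55)")
    conn.commit()


conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)
register_sale(conn)

qoh = conn.execute("SELECT PROD_QOH FROM PRODUCT "
                   "WHERE PROD_CODE = '89-WRE-Q'").fetchone()[0]
balance = conn.execute("SELECT CUST_BALANCE FROM CUSTOMER "
                       "WHERE CUST_NUMBER = 10016").fetchone()[0]
print(qoh, balance)
```

After the commit, the quantity on hand and the customer balance reflect the sale, which is the consistent state the transaction was meant to reach.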
The results of the successfully completed transaction are shown in Figure 12.2. (Note that all records involved in the transaction are highlighted.)

To further your understanding of the transaction results, note the following:
• A new row 1009 was added to the INVOICE table. In this row, derived attribute values were stored for the invoice subtotal, the tax, the invoice total and the invoice balance.
• The LINE row for invoice 1009 was added to reflect the purchase of one unit of product 89-WRE-Q with a price of 256.99. In this row, the derived attribute values for the line amount were stored.
• The product 89-WRE-Q's quantity on hand (PROD_QOH) in the PRODUCT table was reduced by one (the initial value was 12), thus leaving a quantity on hand of 11.
• The customer balance (CUST_BALANCE) for customer 10016 was updated by adding 277.55 to the existing balance (the initial value was 0.00).
• A new row was added to the ACCT_TRANSACTION table to reflect the new account transaction number 10007.
• The COMMIT statement is used to end a successful transaction. (See Section 12.1.3.)

FIGURE 12.2
Tracing the transaction in the Ch12_SaleCo database

Table name: INVOICE
INV_NUMBER  CUST_NUMBER  INV_DATE   INV_SUBTOTAL  INV_TAX  INV_TOTAL  INV_PAY_TYPE  INV_PAY_AMOUNT  INV_BALANCE
1001        10014        16-Jan-19         54.92     4.39      59.31  cc                     59.31         0.00
1002        10011        16-Jan-19          9.98     0.80      10.78  cash                   10.78         0.00
1003        10012        16-Jan-19        270.70    21.66     292.36  cc                    292.36         0.00
1004        10011        17-Jan-19         34.87     2.79      37.66  cc                     37.66         0.00
1005        10018        17-Jan-19         70.44     5.64      76.08  cc                     76.08         0.00
1006        10014        17-Jan-19        397.83    31.83     429.66  cred                  100.00       329.66
1007        10015        17-Jan-19         34.97     2.80      37.77  chk                    37.77         0.00
1008        10011        17-Jan-19       1033.08    82.65    1115.73  cred                  500.00       615.73
1009        10016        18-Jan-19        256.99    20.56     277.55  cred                    0.00       277.55

Table name: PRODUCT
PROD_CODE  PROD_DESCRIPT                                PROD_INDATE  PROD_QOH  PROD_MIN  PROD_PRICE  PROD_DISCOUNT  VEND_NUMBER
11QER/31   Power painter, 15 psi., 3-nozzle             03-Nov-18           8         5      109.99           0.00  25595
13-Q2/P2   7.25 cm pwr. saw blade                       13-Dec-18          32        15       14.99           0.05  21344
14-Q1/L3   9.00 cm pwr. saw blade                       13-Nov-18          18        12       17.49           0.00  21344
1546-QQ2   Hrd. cloth, 1/4 cm, 2 × 50                   15-Jan-19          15         8       39.95           0.00  23119
1558-QW1   Hrd. cloth, 1/2 cm, 3 × 50                   15-Jan-19          23         5       43.99           0.00  23119
2232/QTY   B&D jigsaw, 12 cm blade                      30-Dec-18           8         5      109.92           0.05  24288
2232/QWE   B&D jigsaw, 8 cm blade                       24-Dec-18           6         5       99.87           0.05  24288
2238/QPD   B&D cordless drill, 1/2 cm                   20-Jan-19          12         5       38.95           0.05  25595
23109-HB   Claw hammer                                  20-Jan-19          23        10        9.95           0.10  21225
23114-AA   Sledge hammer, 12 kg                         02-Jan-19           8         5       14.40           0.05
54778-2T   Rat-tail file, 1/8 cm fine                   15-Dec-18          43        20        4.99           0.00  21344
89-WRE-Q   Hicut chain saw, 16 cm                       07-Jan-19          11         5      256.99           0.05  24288
PVC23DRT   PVC pipe, 3.5 cm, 8 m                        06-Jan-18         188        75        5.87           0.00
SM-18277   1.25 cm metal screw, 25                      01-Mar-19         172        75        6.99           0.00  21225
SW-23116   2.5 cm wd. screw, 50                         24-Feb-19         237       100        8.45           0.00  21231
WR3/TT3    Steel matting, 4 m × 8 m × 1/6 m, .5 m mesh  17-Jan-19          18         5      119.95           0.10  25595

Table name: CUSTOMER
CUST_NUMBER  CUST_LNAME  CUST_FNAME  CUST_INITIAL  CUST_AREACODE  CUST_PHONE  CUST_BALANCE
10010        Ramas       Alfred      A             0181           844-2573            0.00
10011        Dunne       Leona       K             0161           894-1238          615.73
10012        Moloi       Marlene     W             0181           894-2285            0.00
10013        Pieterse    Jaco        F             0181           894-2180            0.00
10014        Orlando     Myron                     0181           222-1672            0.00
10015        O'Brian     Amy         B             0161           442-3381            0.00
10016        Brown       James       G             0181           297-1228          277.55
10017        Williams    George                    0181           290-2556            0.00
10018        Padayachee  Vinaya      G             0181           382-7185            0.00
10019        Moloi       Mlilo       K             0161           297-3809            0.00

Table name: LINE
INV_NUMBER  LINE_NUMBER  PROD_CODE  LINE_UNITS  LINE_PRICE  LINE_AMOUNT
1001        1            13-Q2/P2            3       14.99        44.97
1001        2            23109-HB            1        9.95         9.95
1002        1            54778-2T            2        4.99         9.98
1003        1            2238/QPD            4       38.95       155.80
1003        2            1546-QQ2            1       39.95        39.95
1003        3            13-Q2/P2            5       14.99        74.95
1004        1            54778-2T            3        4.99        14.97
1004        2            23109-HB            2        9.95        19.90
1005        1            PVC23DRT           12        5.87        70.44
1006        1            SM-18277            3        6.99        20.97
1006        2            2232/QTY            1      109.92       109.92
1006        3            23109-HB            1        9.95         9.95
1006        4            89-WRE-Q            1      256.99       256.99
1007        1            13-Q2/P2            2       14.99        29.98
1007        2            54778-2T            1        4.99         4.99
1008        1            PVC23DRT            5        5.87        29.35
1008        2            WR3/TT3             4      119.95       479.80
1008        3            23109-HB            1        9.95         9.95
1008        4            89-WRE-Q            2      256.99       513.98
1009        1            89-WRE-Q            1      256.99       256.99

Table name: ACCT_TRANSACTION
ACCT_TRANS_NUM  ACCT_TRANS_DATE  CUST_NUMBER  ACCT_TRANS_TYPE  ACCT_TRANS_AMOUNT
10003           17-Jan-19        10014        charge                      329.66
10004           17-Jan-19        10011        charge                      615.73
10006           29-Jan-19        10014        payment                     329.66
10007           18-Jan-19        10016        charge                      277.55
Now suppose that the DBMS completes the first three SQL statements. Further, suppose that during the execution of the fourth statement (the UPDATE of the CUSTOMER table's CUST_BALANCE value for customer 10016), the computer system experiences a loss of electrical power. If the computer does not have a backup power supply, the transaction cannot be completed. Therefore, the INVOICE and LINE rows were added and the PRODUCT table was updated to represent the sale of product 89-WRE-Q, but customer 10016 was not charged, nor was the required record in the ACCT_TRANSACTION table written. The database is now in an inconsistent state, and it is not usable for subsequent transactions. Assuming that the DBMS supports transaction management, the DBMS will roll back the database to a previous consistent state.
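The interrupted transaction described above can be imitated in a few lines. This is a minimal sketch, assuming Python's built-in sqlite3 module and a reduced, hypothetical schema; the `SimulatedPowerLoss` exception is an invented stand-in for the failure, and the explicit `rollback()` call stands in for the recovery a real DBMS would perform on restart.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical reduced schema and seed rows, for illustration only.
conn.executescript("""
CREATE TABLE INVOICE (INV_NUMBER INTEGER PRIMARY KEY, CUST_NUMBER INTEGER,
                      INV_TOTAL REAL);
CREATE TABLE PRODUCT (PROD_CODE TEXT PRIMARY KEY, PROD_QOH INTEGER);
CREATE TABLE CUSTOMER (CUST_NUMBER INTEGER PRIMARY KEY, CUST_BALANCE REAL);
INSERT INTO PRODUCT VALUES ('89-WRE-Q', 12);
INSERT INTO CUSTOMER VALUES (10016, 0.00);
""")


class SimulatedPowerLoss(Exception):
    """Invented stand-in for the failure that interrupts the transaction."""


def interrupted_sale(conn):
    # The first requests succeed inside the open transaction...
    conn.execute("INSERT INTO INVOICE VALUES (1009, 10016, 277.55)")
    conn.execute("UPDATE PRODUCT SET PROD_QOH = PROD_QOH - 1 "
                 "WHERE PROD_CODE = '89-WRE-Q'")
    # ...but the failure strikes before the CUSTOMER update is executed.
    raise SimulatedPowerLoss("failure before the CUSTOMER update")


try:
    interrupted_sale(conn)
    conn.commit()
except SimulatedPowerLoss:
    conn.rollback()  # return to the previous consistent state

invoice_rows = conn.execute("SELECT COUNT(*) FROM INVOICE").fetchone()[0]
qoh = conn.execute("SELECT PROD_QOH FROM PRODUCT").fetchone()[0]
balance = conn.execute("SELECT CUST_BALANCE FROM CUSTOMER").fetchone()[0]
print(invoice_rows, qoh, balance)
```

Because nothing was committed, the invoice row and the inventory change disappear together, and the database is left as it was before the sale began.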
NOTE
Microsoft Access supports transaction management through its native JET engine, via an ODBC interface to an external DBMS, or via ActiveX Data Objects (ADO) components. More sophisticated DBMSs, such as Oracle, SQL Server and DB2, do support the transaction management components discussed in this chapter.
Although the DBMS is designed to recover a database to a previous consistent state when an interruption prevents the completion of a transaction, the transaction itself is defined by the end user or programmer and must be semantically correct. The DBMS cannot guarantee that the semantic meaning of the transaction truly represents the real-world event. For example, suppose that following the sale of ten units of product 89-WRE-Q, the inventory UPDATE commands were written this way:

UPDATE PRODUCT
SET PROD_QOH = PROD_QOH + 10
WHERE PROD_CODE = '89-WRE-Q';

The sale should have decreased the PROD_QOH value for product 89-WRE-Q by ten. Instead, the UPDATE added ten to product 89-WRE-Q's PROD_QOH value.

Although the UPDATE command's syntax is correct, its use yields incorrect results. Yet the DBMS will execute the transaction anyway. The DBMS cannot evaluate whether the transaction represents the real-world event correctly; that is the end user's responsibility. End users and programmers are capable of introducing many errors in this fashion. Imagine the consequences of reducing the quantity on hand for product 1546-QQ2 instead of product 89-WRE-Q, or of crediting the CUST_BALANCE value for customer 10012 rather than customer 10016.

Clearly, improper or incomplete transactions can have a devastating effect on database integrity. Some DBMSs, especially the relational variety, provide means by which the user can define enforceable constraints based on business rules. Other integrity rules, such as those governing referential and entity integrity, are enforced automatically by the DBMS when the table structures are properly defined, thereby letting the DBMS validate some transactions. For example, if a transaction inserts a new customer number into a customer table and the number being inserted already exists, the DBMS will end the transaction with an error code to indicate a violation of the primary key integrity rule.

12.1.2 Transaction Properties
All transactions must display atomicity, consistency, isolation, durability and serialisability. These properties are sometimes referred to as the ACIDS test. Let's look briefly at each of these properties:
• Atomicity requires that all operations (SQL requests) of a transaction be completed; if not, the transaction is aborted. If a transaction T1 has four SQL requests, all four requests must be successfully completed; otherwise, the entire transaction is aborted. In other words, a transaction is treated as a single, indivisible, logical unit of work.
• Consistency indicates the permanence of the database's consistent state. A transaction takes a database from one consistent state to another. When a transaction is completed, the database reaches a consistent state. If any of the transaction parts violates an integrity constraint, the entire transaction is aborted.
• Isolation means that the data used during the execution of a transaction cannot be used by a second transaction until the first one is completed. In other words, if a transaction T1 is being executed and is using the data item X, that data item cannot be accessed by any other transaction (T2 ... Tn) until T1 ends. This property is particularly useful in multi-user database environments because several different users can access and update the database at the same time.
• Durability ensures that once transaction changes are done (committed), they cannot be undone or lost, even in the event of a system failure.
• Serialisability ensures that the concurrent execution of several transactions yields consistent results. More specifically, the concurrent execution of transactions T1, T2 and T3 yields results that appear to have been executed in serial order (one after another). This property is important in multi-user and distributed databases, where multiple transactions are likely to be executed concurrently. Naturally, if only a single transaction is executed, serialisability is not an issue.

By its nature, a single-user database system automatically ensures serialisability and isolation of the database because only one transaction is executed at a time. The atomicity and durability of transactions must be guaranteed by the single-user DBMSs. (Even a single-user DBMS must manage recovery from errors created by operating-system-induced interruptions, power interruptions and improper application execution.)
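The consistency property, and the earlier example of a transaction that tries to insert a customer number that already exists, can be sketched as follows. The sketch assumes Python's built-in sqlite3 module; the `add_customers` helper, the sample customer numbers 10020 and 10021, and the CHECK constraint (a made-up business rule that balances may not be negative) are all assumptions introduced for the illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table: the CHECK clause is an invented business rule.
conn.executescript("""
CREATE TABLE CUSTOMER (
    CUST_NUMBER  INTEGER PRIMARY KEY,
    CUST_BALANCE REAL CHECK (CUST_BALANCE >= 0)
);
INSERT INTO CUSTOMER VALUES (10016, 0.00);
""")


def add_customers(conn, rows):
    """Insert all rows as one unit; any integrity violation aborts the unit."""
    try:
        conn.executemany("INSERT INTO CUSTOMER VALUES (?, ?)", rows)
        conn.commit()
        return "committed"
    except sqlite3.IntegrityError as err:
        conn.rollback()  # undo the rows already inserted in this transaction
        return f"aborted: {err}"


# The second row reuses an existing primary key, so the whole unit is aborted,
# including the otherwise valid first row.
duplicate_key = add_customers(conn, [(10020, 0.00), (10016, 50.00)])
# A negative balance violates the CHECK constraint.
negative_balance = add_customers(conn, [(10021, -5.00)])
row_count = conn.execute("SELECT COUNT(*) FROM CUSTOMER").fetchone()[0]
print(duplicate_key, negative_balance, row_count, sep="\n")
```

In both cases the DBMS reports the violation and the table keeps only the row it held beforehand, so the database never leaves its consistent state.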
Multi-user databases, whether mainframe- or LAN-based, are typically subject to multiple concurrent transactions. Therefore, the multi-user DBMS must implement controls to ensure serialisability and isolation of transactions, in addition to atomicity and durability, to guard the database's consistency and integrity. For example, if several concurrent transactions are executed over the same data set and the second transaction updates the database before the first transaction is finished, the isolation property is violated and the database is no longer consistent. The DBMS must manage the transactions by using concurrency control techniques to avoid such undesirable situations.

12.1.3 Transaction Management with SQL
The American National Standards Institute (ANSI) has defined standards that govern SQL database transactions. Transaction support is provided by two SQL statements: COMMIT and ROLLBACK. ANSI standards require that, when a transaction sequence is initiated by a user or an application program, the sequence must continue through all succeeding SQL statements until one of the following four events occurs:
1 A COMMIT statement is reached, in which case all changes are permanently recorded within the database. The COMMIT statement automatically ends the SQL transaction.
2 A ROLLBACK statement is reached, in which case all changes are aborted and the database is rolled back to its previous consistent state.
3 The end of a program is successfully reached, in which case all changes are permanently recorded within the database. This action is equivalent to COMMIT.
4 The program is abnormally terminated, in which case the changes made in the database are aborted and the database is rolled back to its previous consistent state. This action is equivalent to ROLLBACK.

The use of COMMIT is illustrated in the following simplified sales example, which updates a product's quantity on hand (PROD_QOH) and the customer's balance when the customer buys two units of product 1558-QW1 priced at 43.99 per unit (for a total of 87.98) and charges the purchase to his or her account:

UPDATE PRODUCT
SET PROD_QOH = PROD_QOH - 2
WHERE PROD_CODE = '1558-QW1';
UPDATE CUSTOMER
SET CUST_BALANCE = CUST_BALANCE + 87.98
WHERE CUST_NUMBER = '10011';
COMMIT;

(Note that the example is simplified to make it easy to trace the transaction. In the Ch12_SaleCo database, the transaction would involve several additional table updates.)
Actually, the COMMIT statement used in that example is not necessary if the UPDATE statement is the application's last action and the application terminates normally. However, good programming practice dictates that you include the COMMIT statement at the end of a transaction declaration.

A transaction begins implicitly when the first SQL statement is encountered. Not all SQL implementations follow the ANSI standard; some (such as SQL Server) use transaction management statements such as:

BEGIN TRANSACTION;

to indicate the beginning of a new transaction. Other SQL implementations, such as VAX/SQL, allow you to assign characteristics for the transactions as parameters to the BEGIN statement. For example, the Oracle RDBMS uses the SET TRANSACTION statement to declare a new transaction start and its properties.

12.1.4 The Transaction Log
A DBMS uses a transaction log to keep track of all transactions that update the database. The information stored in this log is used by the DBMS for a recovery requirement triggered by a ROLLBACK statement, a program's abnormal termination, or a system failure such as a network discrepancy or a disk crash. Some RDBMSs use the transaction log to recover a database forward to a currently consistent state. After a server failure, for example, Oracle automatically rolls back uncommitted transactions and rolls forward transactions that were committed but not yet written to the physical database.

While the DBMS executes transactions that modify the database, it also automatically updates the transaction log. The transaction log stores:
• A record for the beginning of the transaction.
• For each transaction component (SQL statement):
  - The type of operation being performed (update, delete, insert).
  - The names of the objects affected by the transaction (the name of the table).
  - The before and after values for the fields being updated.
  - Pointers to the previous and next transaction log entries for the same transaction.
• The ending (COMMIT) of the transaction.

Although using a transaction log increases the processing overhead of a DBMS, the ability to restore a corrupted database is worth the price. (Note: Microsoft Access does not support advanced transaction management options like COMMIT, ROLLBACK, etc. As such, it is not as resilient to failure as enterprise databases such as Oracle.)

Table 12.1 illustrates a simplified transaction log that reflects a basic transaction composed of two SQL UPDATE statements. If a system failure occurs, the DBMS will examine the transaction log for all uncommitted or incomplete transactions and restore (ROLLBACK) the database to its previous state on the basis of that information. When the recovery process is complete, the DBMS writes in the log all committed transactions that were not physically written to the database before the failure occurred.
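The way before-values in a log support recovery can be sketched in a few lines of Python. This is an illustrative model only, not how any particular DBMS stores its log: the `LogRecord` class and `undo_uncommitted` function are hypothetical names, the field names are borrowed from the column headings of Table 12.1, the customer balance figures are invented, and the sketch covers only the undo of uncommitted work (not the roll-forward of committed transactions).

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class LogRecord:
    trl_id: int                      # transaction log record ID
    trx_num: int                     # transaction number
    prev_ptr: Optional[int]          # previous log record of this transaction
    next_ptr: Optional[int]          # next log record of this transaction
    operation: str                   # START, UPDATE or COMMIT
    table: Optional[str] = None
    row_id: Optional[str] = None
    attribute: Optional[str] = None
    before: Optional[float] = None   # value before the update
    after: Optional[float] = None    # value after the update


def undo_uncommitted(log, db):
    """Restore the before-values, newest first, of every transaction
    that has no COMMIT record in the log."""
    committed = {rec.trx_num for rec in log if rec.operation == "COMMIT"}
    for rec in sorted(log, key=lambda r: r.trl_id, reverse=True):
        if rec.operation == "UPDATE" and rec.trx_num not in committed:
            db[rec.table][rec.row_id][rec.attribute] = rec.before
    return db


# Transaction 101 failed after its second UPDATE: no COMMIT record exists.
log = [
    LogRecord(341, 101, None, 352, "START"),
    LogRecord(352, 101, 341, 363, "UPDATE",
              "PRODUCT", "1558-QW1", "PROD_QOH", 25, 23),
    LogRecord(363, 101, 352, None, "UPDATE",
              "CUSTOMER", "10011", "CUST_BALANCE", 100.00, 187.98),
]
# State of the (toy) database at the moment of failure.
db = {
    "PRODUCT": {"1558-QW1": {"PROD_QOH": 23}},
    "CUSTOMER": {"10011": {"CUST_BALANCE": 187.98}},
}
recovered = undo_uncommitted(log, db)
print(recovered)
```

Had the log ended with a COMMIT record for transaction 101, the function would have left both values untouched, which mirrors the rule that committed transactions are not rolled back.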
TABLE 12.1 A transaction log

TRL_ID  TRX_NUM  PREV PTR  NEXT PTR  OPERATION  TABLE     ROW ID    ATTRIBUTE     BEFORE VALUE  AFTER VALUE
341     101      Null      352       START      ****Start Transaction
352     101      341       363       UPDATE     PRODUCT   1558-QW1  PROD_QOH      25            23
363     101      352       365       UPDATE     CUSTOMER  10011     CUST_BALANCE  525.75        615.73
365     101      363       Null      COMMIT     ****End of Transaction

TRL_ID = Transaction log record ID
TRX_NUM = Transaction number
PTR = Pointer to a transaction log record ID
(Note: The transaction number is automatically assigned by the DBMS.)

If a ROLLBACK is issued before the termination of a transaction, the DBMS will restore the database only for that particular transaction, rather than for all transactions, to maintain the durability of the previous transactions. In other words, committed transactions are not rolled back.

The transaction log is itself a database, and it is managed by the DBMS like any other database. The transaction log is subject to such common database dangers as disk-full conditions and disk crashes. As the transaction log contains some of the most critical data in a DBMS, some implementations support logs on several different disks to reduce the risk of a system failure.

12.2 CONCURRENCY CONTROL
The coordination of the simultaneous execution of transactions in a multi-user database system is known as concurrency control. The objective of concurrency control is to ensure the serialisability of transactions in a multi-user database environment. Concurrency control is important because the simultaneous execution of transactions over a shared database can create several data integrity and consistency problems. The three main problems are lost updates, uncommitted data and inconsistent retrievals.
12.2.1 Lost Updates
The lost update problem occurs when two concurrent transactions, T1 and T2, are updating the same data element and one of the updates is lost (overwritten by the other transaction). To see an illustration of lost updates, let's examine a simple PRODUCT table. One of the PRODUCT table's attributes is a product's quantity on hand (PROD_QOH). Assume that you have a product whose current PROD_QOH value is 35. Also assume that two concurrent transactions, T1 and T2, occur that update the PROD_QOH value for some item in the PRODUCT table. The transactions are:

Transaction             Computation
T1: Purchase 100 units  PROD_QOH = PROD_QOH + 100
T2: Sell 30 units       PROD_QOH = PROD_QOH - 30
Table 12.2 shows the serial execution of those transactions under normal circumstances, yielding the correct answer PROD_QOH = 105.

TABLE 12.2 Normal execution of two transactions

Time  Transaction  Step                 Stored Value
1     T1           Read PROD_QOH        35
2     T1           PROD_QOH = 35 + 100
3     T1           Write PROD_QOH       135
4     T2           Read PROD_QOH        135
5     T2           PROD_QOH = 135 - 30
6     T2           Write PROD_QOH       105

However, suppose that a transaction is able to read a product's PROD_QOH value from the table before a previous transaction (using the same product) has been committed. The sequence depicted in Table 12.3 shows how the lost update problem can arise. Note that the first transaction (T1) has not yet been committed when the second transaction (T2) is executed. Therefore, T2 still operates on the value 35, and its subtraction yields 5 in memory. In the meantime, T1 writes the value 135 to disk, which is promptly overwritten by T2. In short, the addition of 100 units is lost during the process.

TABLE 12.3 Lost updates

Time  Transaction  Step                          Stored Value
1     T1           Read PROD_QOH                 35
2     T2           Read PROD_QOH                 35
3     T1           PROD_QOH = 35 + 100
4     T2           PROD_QOH = 35 - 30
5     T1           Write PROD_QOH (Lost update)  135
6     T2           Write PROD_QOH                5
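The two schedules in Tables 12.2 and 12.3 can be replayed with a few lines of Python. This is a toy model rather than a DBMS: the `run_schedule` function is a hypothetical helper in which each transaction keeps a private working copy of PROD_QOH between its read and its write.

```python
def run_schedule(schedule, stored=35):
    """Replay (transaction, action) steps against one stored PROD_QOH value."""
    deltas = {"T1": +100, "T2": -30}   # T1 purchases 100 units, T2 sells 30
    working = {}                       # each transaction's in-memory copy
    for trx, action in schedule:
        if action == "read":
            working[trx] = stored
        elif action == "compute":
            working[trx] += deltas[trx]
        elif action == "write":
            stored = working[trx]
    return stored


# Table 12.2: T1 runs to completion before T2 starts.
serial = [("T1", "read"), ("T1", "compute"), ("T1", "write"),
          ("T2", "read"), ("T2", "compute"), ("T2", "write")]

# Table 12.3: T2 reads before T1 has written, so T1's update is overwritten.
interleaved = [("T1", "read"), ("T2", "read"),
               ("T1", "compute"), ("T2", "compute"),
               ("T1", "write"), ("T2", "write")]

serial_result = run_schedule(serial)
lost_update_result = run_schedule(interleaved)
print(serial_result, lost_update_result)
```

The serial schedule ends with 105 units, while the interleaved one ends with 5: the purchase of 100 units has vanished even though both transactions ran without error.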
12.2.2 Uncommitted Data
The phenomenon of uncommitted data occurs when two transactions, T1 and T2, are executed concurrently and the first transaction (T1) is rolled back after the second transaction (T2) has already accessed the uncommitted data, thus violating the isolation property of transactions. To illustrate this possibility, let's use the same transactions described during the lost updates discussion. T1 has two atomic parts to it, one of which is the update of the inventory, the other possibly being the update of the invoice total (not shown). T1 is forced to roll back due to an error during the update of the invoice total; hence, it rolls back all the way, undoing the inventory update as well. This time the T1 transaction is rolled back to eliminate the addition of the 100 units. Because T2 subtracts 30 from the original 35 units, the correct answer should be 5.
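The scenario just described can be replayed with a toy model. In the sketch below (plain Python, not a DBMS), the hypothetical `run_schedule` function keeps a before-image of the stored value when a transaction writes, so that a later rollback step can restore it; the step names are assumptions chosen for the illustration.

```python
def run_schedule(schedule, stored=35):
    """Replay steps; a rollback restores the value saved before the write."""
    deltas = {"T1": +100, "T2": -30}   # T1 purchases 100 units, T2 sells 30
    working, before_image = {}, {}
    for trx, action in schedule:
        if action == "read":
            working[trx] = stored
        elif action == "compute":
            working[trx] += deltas[trx]
        elif action == "write":
            before_image.setdefault(trx, stored)  # remember the old value
            stored = working[trx]
        elif action == "rollback":
            stored = before_image.pop(trx, stored)
    return stored


# T1 is rolled back before T2 starts: T2 works from the original 35 units.
correct = [("T1", "read"), ("T1", "compute"), ("T1", "write"),
           ("T1", "rollback"),
           ("T2", "read"), ("T2", "compute"), ("T2", "write")]

# T2 reads T1's uncommitted 135 before T1 is rolled back.
dirty = [("T1", "read"), ("T1", "compute"), ("T1", "write"),
         ("T2", "read"), ("T2", "compute"),
         ("T1", "rollback"),
         ("T2", "write")]

correct_result = run_schedule(correct)
dirty_result = run_schedule(dirty)
print(correct_result, dirty_result)
```

The first schedule ends with the expected 5 units; the second ends with 105 because T2 based its subtraction on a value that, after the rollback, never officially existed.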
Transaction             Computation
T1: Purchase 100 units  PROD_QOH = PROD_QOH + 100 (Rolled back)
T2: Sell 30 units       PROD_QOH = PROD_QOH - 30

Table 12.4 shows how, under normal circumstances, the serial execution of those transactions yields the correct answer.

TABLE 12.4 Correct execution of two transactions

Time  Transaction  Step                 Stored Value
1     T1           Read PROD_QOH        35
2     T1           PROD_QOH = 35 + 100
3     T1           Write PROD_QOH       135
4     T1           *****ROLLBACK*****   35
5     T2           Read PROD_QOH        35
6     T2           PROD_QOH = 35 - 30
7     T2           Write PROD_QOH       5

Table 12.5 shows how the uncommitted data problem can arise when the ROLLBACK is completed after T2 has begun its execution.

TABLE 12.5 An uncommitted data problem

Time  Transaction  Step                                    Stored Value
1     T1           Read PROD_QOH                           35
2     T1           PROD_QOH = 35 + 100
3     T1           Write PROD_QOH                          135
4     T2           Read PROD_QOH (Read uncommitted data)   135
5     T2           PROD_QOH = 135 - 30
6     T1           *****ROLLBACK*****                      35
7     T2           Write PROD_QOH                          105

12.2.3 Inconsistent Retrievals
Inconsistent retrievals occur when a transaction accesses data before and after one or more other transactions finish working with such data. For example, an inconsistent retrieval would occur if transaction T1 calculated a summary (using SQL aggregate functions) over a set of data while another transaction, T2, was updating the same data. The problem is that the transaction might read some data before they are changed and other data after they are changed, thereby yielding inconsistent results.

To illustrate that problem, assume the following conditions:
1 T1 calculates the total quantity on hand of the products stored in the PRODUCT table.
2 At the same time, T2 updates the quantity on hand (PROD_QOH) for two of the PRODUCT table's products.
The two transactions are shown in Table 12.6.

TABLE 12.6 Retrieval during update

Transaction T1          Transaction T2
SELECT SUM(PROD_QOH)    UPDATE PRODUCT
FROM PRODUCT            SET PROD_QOH = PROD_QOH + 10
                        WHERE PROD_CODE = '1546-QQ2'
                        UPDATE PRODUCT
                        SET PROD_QOH = PROD_QOH - 10
                        WHERE PROD_CODE = '1558-QW1'
                        COMMIT;

While T1 calculates the total quantity on hand (PROD_QOH) for all items, T2 represents the correction of a typing error: the user added ten units to product 1558-QW1's PROD_QOH but meant to add the ten units to product 1546-QQ2's PROD_QOH. To correct the problem, the user adds ten to product 1546-QQ2's PROD_QOH and subtracts ten from product 1558-QW1's PROD_QOH. (See the two UPDATE statements in Table 12.6.) The initial and final PROD_QOH values are reflected in Table 12.7. (Only a few of the PROD_CODE values for the PRODUCT table are shown. To illustrate the point, the sum for the PROD_QOH values is given for those few products.)

TABLE 12.7 Transaction results: data entry correction

PROD_CODE  Before PROD_QOH  After PROD_QOH
11QER/31   8                8
13-Q2/P2   32               32
1546-QQ2   15               (15 + 10) = 25
1558-QW1   23               (23 - 10) = 13
2232-QTY   8                8
2232-QWE   6                6
Total      92               92

Although the final results shown in Table 12.7 are correct after the adjustment, Table 12.8 demonstrates that inconsistent retrievals are possible during the transaction execution, making the result of T1's execution incorrect. The After summation shown in Table 12.8 reflects the fact that the value of 25 for product 1546-QQ2 was read after the write statement was completed. Therefore, the After total is 40 + 25 = 65. The Before total reflects the fact that the value of 23 for product 1558-QW1 was read before the next write statement was completed to reflect the corrected update of 13. Therefore, the Before total is 65 + 23 = 88.
TABLE 12.8 Inconsistent retrievals

Time  Transaction  Action                                       Value        Total
1     T1           Read PROD_QOH for PROD_CODE = '11QER/31'     8            8
2     T1           Read PROD_QOH for PROD_CODE = '13-Q2/P2'     32           40
3     T2           Read PROD_QOH for PROD_CODE = '1546-QQ2'     15
4     T2           PROD_QOH = 15 + 10
5     T2           Write PROD_QOH for PROD_CODE = '1546-QQ2'    25
6     T1           Read PROD_QOH for PROD_CODE = '1546-QQ2'     25 (After)   65
7     T1           Read PROD_QOH for PROD_CODE = '1558-QW1'     23 (Before)  88
8     T2           Read PROD_QOH for PROD_CODE = '1558-QW1'     23
9     T2           PROD_QOH = 23 - 10
10    T2           Write PROD_QOH for PROD_CODE = '1558-QW1'    13
11    T2           ***** COMMIT *****
12    T1           Read PROD_QOH for PROD_CODE = '2232-QTY'     8            96
13    T1           Read PROD_QOH for PROD_CODE = '2232-QWE'     6            102
The computed answer of 102 is obviously wrong because you know from Table 12.7 that the correct answer is 92. Unless the DBMS exercises concurrency control, a multi-user database environment can create havoc within the information system.
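The arithmetic behind the wrong total can be reproduced with a short Python sketch. It is a toy model only: the hypothetical `sum_during_update` function walks the products in the order T1 reads them and applies T2's two writes at the points where Table 12.8 places them.

```python
# PROD_QOH values before T2's correction (the Before column of Table 12.7).
prod_qoh = {"11QER/31": 8, "13-Q2/P2": 32, "1546-QQ2": 15,
            "1558-QW1": 23, "2232-QTY": 8, "2232-QWE": 6}


def sum_during_update(qoh):
    """T1 sums PROD_QOH row by row while T2's two writes are interleaved."""
    qoh = dict(qoh)            # work on a copy of the table
    running_total = 0
    for code in list(qoh):
        if code == "1546-QQ2":
            qoh[code] += 10    # T2 writes 25 just before T1 reads this row
        running_total += qoh[code]
        if code == "1558-QW1":
            qoh[code] -= 10    # T2 writes 13 just after T1 has read 23
    return running_total, sum(qoh.values())


t1_total, true_total = sum_during_update(prod_qoh)
print(t1_total, true_total)
```

T1 reports 102 because it saw the added ten units but not the subtracted ten, whereas a sum taken entirely before or entirely after T2 would give 92.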
12.2.4 The Scheduler
You now know that severe problems can arise when two or more concurrent transactions are executed. You also know that a database transaction involves a series of database I/O operations that take the database from one consistent state to another. Finally, you know that database consistency can be ensured only before and after the execution of transactions. A database always moves through an unavoidable temporary state of inconsistency during a transaction's execution. That temporary inconsistency exists because a computer cannot execute two operations at the same time and must therefore execute them serially. During this serial process, the isolation property of transactions prevents them from accessing the data not yet released by other transactions.

In previous examples, the operations within a transaction were executed in an arbitrary order. As long as two transactions, T1 and T2, access unrelated data, there is no conflict and the order of execution is irrelevant to the final outcome. However, if the transactions operate on related (or the same) data, conflict is possible among the transaction components and the selection of one operational order over another may have some undesirable consequences. So, how is the correct order determined, and who determines that order? Fortunately, the DBMS handles that tricky assignment by using a built-in scheduler.

The scheduler is a special DBMS program that establishes the order in which the operations within concurrent transactions are executed. The scheduler interleaves the execution of database operations to ensure serialisability and isolation of transactions. To determine the appropriate order, the scheduler bases its actions on concurrency control algorithms, such as locking or time stamping methods, which are explained in the next sections.
The scheduler also makes sure that the computer's central processing unit (CPU) is used efficiently. If there were no way to schedule the execution of transactions, all transactions would be executed on a first-come, first-served basis. The problem with that approach is that processing time is wasted when the CPU waits for a READ or WRITE operation to finish, thereby losing several CPU cycles. In short, first-come, first-served scheduling tends to yield unacceptable response times within the multi-user DBMS environment. Therefore, some other scheduling method is needed to improve the efficiency of the overall system.

Additionally, the scheduler facilitates data isolation to ensure that two transactions do not update the same data element at the same time. Database operations may require READ and/or WRITE actions that produce conflicts. For example, Table 12.9 shows the possible conflict scenarios when two transactions, T1 and T2, are executed concurrently over the same data. Using the summary in Table 12.9, note that two operations are in conflict when they access the same data and at least one of them is a WRITE operation.
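
The conflict rule just stated is small enough to write as a predicate. This is a minimal sketch; the function name and the string constants are assumptions made for illustration only.

```python
# Illustrative conflict test for two operations issued by different transactions.
READ, WRITE = "READ", "WRITE"

def in_conflict(op1, op2, same_data=True):
    """Two operations conflict when they touch the same data
    and at least one of them is a WRITE."""
    return same_data and WRITE in (op1, op2)

# Print the four combinations for operations on the same data item.
for a in (READ, WRITE):
    for b in (READ, WRITE):
        print(a, b, "Conflict" if in_conflict(a, b) else "No conflict")
```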

TABLE 12.9 Read/write conflict scenarios: conflicting database operations matrix

              Transactions
              T1       T2       Result
Operations    Read     Read     No conflict
              Read     Write    Conflict
              Write    Read     Conflict
              Write    Write    Conflict

Several methods have been proposed to schedule the execution of conflicting operations in concurrent transactions. Those methods have been classified as locking, time stamping and optimistic. Locking methods are used most frequently.

12.3 CONCURRENCY CONTROL WITH LOCKING METHODS

A lock guarantees exclusive use of a data item to a current transaction. In other words, transaction T2 does not have access to a data item that is currently being used by transaction T1. A transaction acquires a lock prior to data access; the lock is released (unlocked) when the transaction is complete so that another transaction can lock the data item for its exclusive use. This series of locking actions assumes that concurrent transactions might attempt to manipulate the same data item at the same time. The use of locks based on the assumption that conflict between transactions is likely is known as pessimistic locking.

Recall from the earlier discussion that data consistency cannot be guaranteed during a transaction; the database may be in a temporary inconsistent state when several updates are executed. Therefore, locks are required to prevent another transaction from reading inconsistent data.

Most multi-user DBMSs automatically initiate and enforce locking procedures. All lock information is managed by a lock manager, which is responsible for assigning and policing the locks used by the transactions.

12.3.1 Lock Granularity

Lock granularity indicates the level of lock use. Locking can take place at the following levels: database, table, page, row or even field (attribute).

Database Level
In a database-level lock, the entire database is locked, thus preventing the use of any tables in the database by transaction T2 while transaction T1 is being executed. This level of locking is good for batch processes, but it is unsuitable for online multi-user DBMSs. You can imagine how slow the data access would be if thousands of transactions had to wait for the previous transaction to be completed before the next one could reserve the entire database. Figure 12.3 illustrates the database-level lock. Note that transactions T1 and T2 cannot access the same database concurrently even when they use different tables.

FIGURE 12.3 Database-level locking sequence
[Figure: timeline for the Payroll database. T1 (update Table A) requests and obtains a lock on the database. T2 (update Table B) requests a lock and must wait until T1 unlocks the database; T2 then locks the database, completes and unlocks it.]

Table Level
In a table-level lock, the entire table is locked, preventing access to any row by transaction T2 while transaction T1 is using the table. If a transaction requires access to several tables, each table may be locked. However, two transactions can access the same database as long as they access different tables.

Table-level locks, while less restrictive than database-level locks, cause traffic jams when many transactions are waiting to access the same table. Such a condition is especially irksome if the lock forces a delay when different transactions require access to different parts of the same table, that is, when the transactions would not interfere with each other. Consequently, table-level locks are not suitable for multi-user DBMSs. Figure 12.4 illustrates the effect of a table-level lock. As you examine Figure 12.4, note that transactions T1 and T2 cannot access the same table even when they try to use different rows; T2 must wait until T1 unlocks the table.

FIGURE 12.4 An example of a table-level lock
[Figure: timeline for the Payroll database. T1 (update row 5) requests and obtains a lock on Table A. T2 (update row 30) requests a lock on Table A and must wait until T1 unlocks the table at the end of transaction 1; T2 then locks Table A and unlocks it at the end of transaction 2.]

Page Level
In a page-level lock, the DBMS locks an entire diskpage. A diskpage, or page, is the equivalent of a diskblock, which can be described as a directly addressable section of a disk. A page has a fixed size, such as 4K, 8K or 16K. For example, if you want to write only 73 bytes to a 4K page, the entire 4K page must be read from disk, updated in memory and written back to disk. A table can span several pages, and a page can contain several rows of one or more tables. Page-level locks are currently the most frequently used multi-user DBMS locking method. An example of a page-level lock is shown in Figure 12.5. As you examine Figure 12.5, note that T1 and T2 access the same table while locking different diskpages. If T2 requires the use of a row located on a page that is locked by T1, T2 must wait until the page is unlocked by T1.

FIGURE 12.5 An example of a page-level lock
[Figure: timeline for Table A in the Payroll database, with rows 1 to 3 on page 1 and rows 4 to 6 on page 2. T1 (update row 1) locks page 1. T2 (update rows 5 and 2) locks page 2, then requests page 1 and must wait until T1 unlocks page 1 at the end of its transaction. T2 then locks page 1 and unlocks pages 1 and 2 at the end of its transaction.]

Row Level
A row-level lock is much less restrictive than the locks discussed earlier. The DBMS allows concurrent transactions to access different rows of the same table, even when the rows are located on the same page. Although the row-level locking approach improves the availability of data, its management requires high overhead. (A lock exists for each row in each table of the database.) Figure 12.6 illustrates the use of a row-level lock. As you examine Figure 12.6, note that both transactions can execute concurrently, even when the requested rows are on the same page. T2 must wait only if it requests the same row as T1.

FIGURE 12.6 An example of a row-level lock
[Figure: timeline for Table A in the Payroll database. T1 (update row 1) locks row 1 and T2 (update row 2) locks row 2; both rows are on page 1 and both requests are granted. Each transaction unlocks its row at the end of the transaction.]

Field Level
The field-level lock allows concurrent transactions to access the same row as long as they require the use of different fields (attributes) within that row. Although field-level locking clearly yields the most flexible multi-user data access, it is rarely done because it requires an extremely high level of computer overhead.

12.3.2 Lock Types

Regardless of the level of locking, the DBMS may use different lock types: binary or shared/exclusive.

Binary Locks
A binary lock has only two states: locked (1) or unlocked (0). If an object (that is, a database, table, page or row) is locked by a transaction, no other transaction can use that object. If an object is unlocked, any transaction can lock the object for its use. Every database operation requires that the affected object be locked. As a rule, a transaction must unlock the object after its termination. Therefore, every transaction requires a lock and unlock operation for each data item that is accessed. Such operations are automatically managed and scheduled by the DBMS; the user does not need to be concerned about locking or unlocking data items. (Every DBMS has a default locking mechanism. If the end user wants to override the default, the LOCK TABLE and other SQL commands are available for that purpose.)

The binary locking technique is illustrated in Table 12.10, using the lost updates problem you encountered in Table 12.3. As you examine Table 12.10, note that the lock and unlock features eliminate the lost update problem. (The lock is not released until the write statement is completed. Therefore, a PROD_QOH value cannot be used until it has been properly updated.) However, binary locks are now considered too restrictive to yield optimal concurrency conditions. For example, the DBMS will not allow two transactions to read the same database object even though neither transaction updates the database (and, therefore, no concurrency problems can occur). Remember from Table 12.9 that concurrency conflicts occur only when two transactions execute concurrently and one of them updates the database.

TABLE 12.10 An example of a binary lock

Time  Transaction  Step                 Stored Value
1     T1           Lock PRODUCT
2     T1           Read PROD_QOH        15
3     T1           PROD_QOH = 15 + 10
4     T1           Write PROD_QOH       25
5     T1           Unlock PRODUCT
6     T2           Lock PRODUCT
7     T2           Read PROD_QOH        23
8     T2           PROD_QOH = 23 - 10
9     T2           Write PROD_QOH       13
10    T2           Unlock PRODUCT
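
The lock, read, write, unlock sequence can be imitated outside a DBMS with an ordinary mutex. The following is a minimal sketch, assuming Python threads stand in for transactions and a dictionary stands in for the PRODUCT table; the starting quantity and the two adjustments are illustrative values, not the figures from the table.

```python
import threading

# Illustrative stand-ins: one mutex plays the binary lock on PRODUCT,
# and a dictionary plays the stored PROD_QOH value.
product_lock = threading.Lock()
inventory = {"PROD_QOH": 15}

def adjust(delta):
    """One 'transaction': lock, read, compute, write, unlock."""
    with product_lock:                    # Lock PRODUCT (blocks if already locked)
        value = inventory["PROD_QOH"]     # Read PROD_QOH
        value = value + delta             # compute the new quantity
        inventory["PROD_QOH"] = value     # Write PROD_QOH
    # Leaving the 'with' block unlocks PRODUCT.

# T1 adds 10 units and T2 removes 10 units, possibly at the same time.
threads = [threading.Thread(target=adjust, args=(d,)) for d in (10, -10)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(inventory["PROD_QOH"])
```

Because each read-compute-write sequence runs entirely inside the lock, neither update can overwrite the other, whichever thread happens to run first.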

Shared/Exclusive Locks
An exclusive lock exists when access is reserved specifically for the transaction that locked the object. The exclusive lock must be used when the potential for conflict exists (see Table 12.9). A shared lock exists when concurrent transactions are granted Read access on the basis of a common lock. A shared lock produces no conflict as long as all the concurrent transactions are read-only.

A shared lock is issued when a transaction wants to read data from the database and no exclusive lock is held on that data item. An exclusive lock is issued when a transaction wants to update (write) a data item and no locks are currently held on that data item by any other transaction. Using the shared/exclusive locking concept, a lock can have three states: unlocked, shared (Read) and exclusive (Write).

As you saw in Table 12.9, two transactions conflict only when at least one of them is a Write transaction. Because two Read transactions can be safely executed at once, shared locks allow several Read transactions to read the same data item concurrently. For example, if transaction T1 has a shared lock on data item X and transaction T2 wants to read data item X, T2 may also obtain a shared lock on data item X.

If transaction T2 updates data item X, an exclusive lock is required by T2 over data item X. The exclusive lock is granted if and only if no other locks are held on the data item. Therefore, if a shared or exclusive lock is already held on data item X by transaction T1, an exclusive lock cannot be granted to transaction T2, and T2 must wait to begin until T1 commits. This condition is known as the mutually exclusive rule: only one transaction at a time can own an exclusive lock on the same object.

Although shared locks render data access more efficient, a shared/exclusive lock schema increases the lock manager's overhead, for several reasons:

- The type of lock held must be known before a lock can be granted.
- Three lock operations exist: READ_LOCK (to check the type of lock), WRITE_LOCK (to issue the lock) and UNLOCK (to release the lock).
- The schema has been enhanced to allow a lock upgrade (from shared to exclusive) and a lock downgrade (from exclusive to shared).

Although locks prevent serious data inconsistencies, they can lead to two major problems:

- The resulting transaction schedule may not be serialisable.
- The schedule may create deadlocks. A database deadlock, which is equivalent to traffic gridlock in a big city, is caused when two transactions wait for each other to unlock data.

Fortunately, both problems can be managed: serialisability is guaranteed through a locking protocol known as two-phase locking, and deadlocks can be managed by using deadlock detection and prevention techniques. Those techniques are examined in the next two sections.
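
The granting rules for shared and exclusive locks can be sketched as a small lock table. This is a simplified illustration, assuming a single-threaded lock manager that answers "granted" or "wait" immediately; the class name, the method names and the "S"/"X" mode codes are hypothetical and do not correspond to any particular DBMS.

```python
class LockTable:
    """Toy shared/exclusive lock table: item -> (mode, set of owners)."""

    def __init__(self):
        self._held = {}

    def request(self, txn, item, mode):
        """Return True if the lock is granted, False if txn must wait.
        mode is "S" (shared, Read) or "X" (exclusive, Write)."""
        current = self._held.get(item)
        if current is None:                       # unlocked: grant anything
            self._held[item] = (mode, {txn})
            return True
        held_mode, owners = current
        if mode == "S" and held_mode == "S":      # readers can share
            owners.add(txn)
            return True
        if owners == {txn}:                       # sole owner may upgrade S -> X
            if mode == "X":
                self._held[item] = ("X", owners)
            return True
        return False                              # mutually exclusive rule

    def release(self, txn, item):
        """Drop txn's lock on item; the item is unlocked when no owner remains."""
        current = self._held.get(item)
        if current is None:
            return
        current[1].discard(txn)
        if not current[1]:
            del self._held[item]


locks = LockTable()
print(locks.request("T1", "X", "S"))   # T1 reads X
print(locks.request("T2", "X", "S"))   # T2 may share the Read lock
print(locks.request("T2", "X", "X"))   # T2 cannot write while T1 still holds a lock
```

A real lock manager would also queue the waiting requests and handle downgrades; both are left out here to keep the compatibility rule visible.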

12.3.3 Two-Phase Locking to Ensure Serialisability

Two-phase locking defines how transactions acquire and relinquish locks. Two-phase locking guarantees serialisability, but it does not prevent deadlocks. The two phases are:

1 A growing phase, in which a transaction acquires all required locks without unlocking any data. Once all locks have been acquired, the transaction is in its locked point.
2 A shrinking phase, in which a transaction releases all locks and cannot obtain any new lock.

The two-phase locking protocol is governed by the following rules:

- Two transactions cannot have conflicting locks.
- No unlock operation can precede a lock operation in the same transaction.
- No data are affected until all locks are obtained, that is, until the transaction is in its locked point.

Figure 12.7 depicts the two-phase locking protocol.

FIGURE 12.7 Two-phase locking protocol
[Figure: operations plotted against time from Start to End. Growing phase: two "Acquire lock" operations leading to the locked point. Locked phase: operations performed while all locks are held. Shrinking phase: two "Release lock" operations.]

In this example, the transaction acquires all of the locks it needs until it reaches its locked point. (In this example, the transaction requires two locks.) When the locked point is reached, the data are modified to conform to the transaction requirements. Finally, the transaction is completed as it releases all of the locks it acquired in the first phase.

Two-phase locking increases the transaction processing cost and may cause additional undesirable effects. One undesirable effect is the possibility of creating deadlocks.
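
The growing and shrinking phases can be enforced with a single flag per transaction. The following is a minimal sketch, assuming one transaction in isolation; it checks only the rule that no lock may be acquired after the first unlock, and it ignores conflicts with other transactions. The class and method names are hypothetical.

```python
class TwoPhaseTransaction:
    """Tracks one transaction's locks and rejects any lock request
    made after the shrinking phase has started."""

    def __init__(self, name):
        self.name = name
        self.held = set()
        self.shrinking = False      # False = growing phase

    def lock(self, item):
        if self.shrinking:
            raise RuntimeError(
                f"{self.name}: cannot lock {item!r} after releasing a lock")
        self.held.add(item)

    def unlock(self, item):
        self.shrinking = True       # the first unlock ends the growing phase
        self.held.remove(item)


t1 = TwoPhaseTransaction("T1")
t1.lock("X")
t1.lock("Y")          # locked point reached: all required locks are held
t1.unlock("X")        # shrinking phase begins
try:
    t1.lock("Z")      # violates two-phase locking
except RuntimeError as err:
    print(err)
```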

12.3.4 Deadlocks

A deadlock occurs when two transactions wait for each other to unlock data. For example, a deadlock occurs when two transactions, T1 and T2, exist in the following mode:

T1 = access data items X and Y
T2 = access data items Y and X

If T1 has not unlocked data item Y, T2 cannot begin; if T2 has not unlocked data item X, T1 cannot continue. Consequently, T1 and T2 wait indefinitely, each waiting for the other to unlock the required data item. Such a deadlock is also known as a deadly embrace. Table 12.11 demonstrates how a deadlock condition is created.

TABLE 12.11 How a deadlock condition is created

                                 Lock Status
Time  Transaction  Reply   Data X     Data Y
0                          Unlocked   Unlocked
1     T1:LOCK(X)   OK      Locked     Unlocked
2     T2:LOCK(Y)   OK      Locked     Locked
3     T1:LOCK(Y)   WAIT    Locked     Locked
4     T2:LOCK(X)   WAIT    Locked     Locked
5     T1:LOCK(Y)   WAIT    Locked     Locked
6     T2:LOCK(X)   WAIT    Locked     Locked
7     T1:LOCK(Y)   WAIT    Locked     Locked
8     T2:LOCK(X)   WAIT    Locked     Locked
9     T1:LOCK(Y)   WAIT    Locked     Locked
...   ...          ...     ...        ...

(From time 3 onwards both requests keep waiting; the original table labels these rows DEADLOCK.)

The preceding example used only two concurrent transactions to demonstrate a deadlock condition. In a real-world DBMS, many more transactions can be executed simultaneously, thereby increasing the probability of generating deadlocks. Note that deadlocks are possible only when one of the transactions wants to obtain an exclusive lock on a data item; no deadlock condition can exist among shared locks.

The three basic techniques to control deadlocks are:

- Deadlock prevention. A transaction requesting a new lock is aborted when there is the possibility that a deadlock can occur. If the transaction is aborted, all changes made by this transaction are rolled back and all locks obtained by the transaction are released. The transaction is then rescheduled for execution. Deadlock prevention works because it avoids the conditions that lead to deadlocking.
- Deadlock detection. The DBMS periodically tests the database for deadlocks. If a deadlock is found, one of the transactions (the victim) is aborted (rolled back and restarted) and the other transaction continues.
- Deadlock avoidance. The transaction must obtain all of the locks it needs before it can be executed. This technique avoids the rollback of conflicting transactions by requiring that locks be obtained in succession. However, the serial lock assignment required in deadlock avoidance increases action response times.

The choice of the best deadlock control method depends on the database environment. For example, if the probability of deadlocks is low, deadlock detection is recommended. However, if the probability of deadlocks is high, deadlock prevention is recommended. If response time is not high on the system's priority list, deadlock avoidance might be employed.
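
The text does not say how a DBMS tests for deadlocks. One common approach, used here as an assumption rather than as the method of any specific product, is to keep a wait-for graph (which transaction is waiting for which) and look for a cycle. A minimal sketch with a hypothetical function name:

```python
def find_deadlock(wait_for):
    """wait_for maps each transaction to the set of transactions it is
    waiting for. Returns a list of transactions forming a cycle, or None."""
    done = set()                      # transactions already proven cycle-free

    def visit(txn, path):
        if txn in path:               # came back to a transaction on the path
            return path[path.index(txn):]
        if txn in done:
            return None
        for other in sorted(wait_for.get(txn, ())):
            cycle = visit(other, path + [txn])
            if cycle:
                return cycle
        done.add(txn)
        return None

    for txn in sorted(wait_for):
        cycle = visit(txn, [])
        if cycle:
            return cycle
    return None


# The situation of Table 12.11: T1 waits for T2 (holder of Y),
# and T2 waits for T1 (holder of X).
print(find_deadlock({"T1": {"T2"}, "T2": {"T1"}}))
```

Once a cycle is found, the DBMS would pick one transaction in it as the victim, roll it back and let the others continue, as described under deadlock detection.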

12.4 CONCURRENCY CONTROL WITH TIME STAMPING METHODS

The time stamping approach to scheduling concurrent transactions assigns a global, unique time stamp to each transaction. The time stamp value produces an explicit order in which transactions are submitted to the DBMS. Time stamps must have two properties: uniqueness and monotonicity. Uniqueness ensures that no equal time stamp values can exist, and monotonicity⁶ ensures that time stamp values always increase.

All database operations (Read and Write) within the same transaction must have the same time stamp. The DBMS executes conflicting operations in time stamp order, thereby ensuring serialisability of the transactions. If two transactions conflict, one is stopped, rolled back, rescheduled and assigned a new time stamp value.

The disadvantage of the time stamping approach is that each value stored in the database requires two additional time stamp fields: one for the last time the field was read and one for the last update. Time stamping thus increases memory needs and the database's processing overhead. Time stamping tends to demand considerable system resources because many transactions may have to be stopped, rescheduled and restamped.
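
Uniqueness and monotonicity are easy to obtain from a guarded counter. The following is a minimal sketch, assuming a single process hands out the time stamps; a real DBMS may derive them from the system clock or a logical counter, and the class name here is hypothetical.

```python
import itertools
import threading

class TimestampSource:
    """Hands out time stamps that are unique and always increasing."""

    def __init__(self, start=1):
        self._counter = itertools.count(start)
        self._lock = threading.Lock()     # one caller at a time

    def next_stamp(self):
        with self._lock:
            return next(self._counter)


source = TimestampSource()
t1_stamp = source.next_stamp()    # every operation of T1 carries this value
t2_stamp = source.next_stamp()    # T2 started later, so its stamp is larger
print(t1_stamp, t2_stamp)
```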

12.4.1 Wait/Die and Wound/Wait Schemes

You have learnt that time stamping methods are used to manage concurrent transaction execution. In this section, you will learn about two schemes used to decide which transaction is rolled back and which continues executing: the wait/die scheme and the wound/wait scheme.⁷ An example illustrates the difference. Assume that you have two conflicting transactions, T1 and T2, each with a unique time stamp. Suppose T1 has a time stamp of 11548789 and T2 has a time stamp of 19562545. You can deduce from the time stamps that T1 is the older transaction (the lower time stamp value) and T2 is the newer transaction. Given that scenario, the four possible outcomes are shown in Table 12.12.

TABLE 12.12 Wait/die and wound/wait concurrency control schemes

Transaction requesting lock: T1 (11548789)
Transaction owning lock:     T2 (19562545)
Wait/die scheme:             T1 waits until T2 is completed and T2 releases its locks.
Wound/wait scheme:           T1 pre-empts (rolls back) T2. T2 is rescheduled using the same time stamp.

Transaction requesting lock: T2 (19562545)
Transaction owning lock:     T1 (11548789)
Wait/die scheme:             T2 dies (rolls back). T2 is rescheduled using the same time stamp.
Wound/wait scheme:           T2 waits until T1 is completed and T1 releases its locks.

Using the wait/die scheme:

- If the transaction requesting the lock is the older of the two transactions, it will wait until the other transaction is completed and the locks are released.
- If the transaction requesting the lock is the younger of the two transactions, it will die (roll back) and is rescheduled using the same time stamp.

In short, in the wait/die scheme, the older transaction waits and the younger is rolled back and rescheduled.

In the wound/wait scheme:

- If the transaction requesting the lock is the older of the two transactions, it will pre-empt (wound) the younger transaction by rolling it back. T1 pre-empts T2 when T1 rolls back T2. The younger, pre-empted transaction is rescheduled using the same time stamp.
- If the transaction requesting the lock is the younger of the two transactions, it will wait until the other transaction is completed and the locks are released.

In short, in the wound/wait scheme, the older transaction rolls back the younger transaction and reschedules it.

In both schemes, one of the transactions waits for the other transaction to finish and release the locks. However, in many cases, a transaction requests multiple locks. How long does a transaction have to wait for each lock request? Obviously, that scenario can cause some transactions to wait indefinitely, causing a deadlock. To prevent that type of deadlock, each lock request has an associated time-out value. If the lock is not granted before the time-out expires, the transaction is rolled back.

6 The term monotonicity is part of the standard concurrency control vocabulary. The authors' first introduction to this term and its proper use was in an article written by W.H. Kohler, 'A survey of techniques for synchronization and recovery in decentralized computer systems', Computer Surveys 3(2), June 1981, pp. 149-283.
7 The procedure was first described by R.E. Stearns and P.M. Lewis II in 'System-level concurrency control for distributed database systems', ACM Transactions on Database Systems, No. 2, June 1978, pp. 178-98.
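
The four outcomes of Table 12.12 reduce to one comparison of time stamps. This is a minimal sketch; the function name and the returned strings are illustrative, and the time-out handling described above is left out.

```python
def resolve(requester_ts, owner_ts, scheme):
    """Decide what happens when a transaction requests a lock that another
    transaction owns. A lower time stamp means an older transaction."""
    requester_is_older = requester_ts < owner_ts
    if scheme == "wait/die":
        # Older requester waits; younger requester dies (rolls back).
        return "requester waits" if requester_is_older else "requester dies"
    if scheme == "wound/wait":
        # Older requester wounds (rolls back) the owner; younger requester waits.
        return "owner is rolled back" if requester_is_older else "requester waits"
    raise ValueError(f"unknown scheme: {scheme}")


T1, T2 = 11548789, 19562545   # T1 is older, as in the example above
for scheme in ("wait/die", "wound/wait"):
    print(scheme, "| T1 requests:", resolve(T1, T2, scheme),
          "| T2 requests:", resolve(T2, T1, scheme))
```

In both schemes the transaction that is rolled back is always the younger one, and it keeps its original time stamp when it is rescheduled, so it eventually becomes the oldest and cannot be rolled back forever.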

12.5 CONCURRENCY CONTROL WITH OPTIMISTIC METHODS

The optimistic approach is based on the assumption that the majority of the database operations do not conflict. The optimistic approach does not require locking or time stamping techniques. Instead, a transaction is executed without restrictions until it is committed. Using an optimistic approach, each transaction moves through two or three phases. The phases are Read, validation and Write.⁸

- During the Read phase, the transaction reads the database, executes the needed computations and makes the updates to a private copy of the database values. All update operations of the transaction are recorded in a temporary update file, which is not accessed by the remaining transactions.
- During the validation phase, the transaction is validated to ensure that the changes made will not affect the integrity and consistency of the database. If the validation test is positive, the transaction goes to the Write phase. If the validation test is negative, the transaction is restarted and the changes are discarded.
- During the Write phase, the changes are permanently applied to the database.

The optimistic approach is acceptable for most read or query database systems that require few update transactions.

In a heavily used DBMS environment, the management of deadlocks (their prevention and detection) constitutes an important DBMS function. The DBMS will use one or more of the techniques discussed, as well as variations on those techniques. However, a deadlock is sometimes worse than the disease that locks are supposed to cure. Therefore, it may be necessary to employ database recovery management techniques to restore the database to a consistent state.

8 The optimistic approach to concurrency control is described in an article by H.T. Kung and J.T. Robinson, 'Optimistic methods for concurrency control', ACM Transactions on Database Systems 6(2), June 1981, pp. 213-26. Even the most current software is built on conceptual standards that were developed more than two decades ago.

12.6 ANSI LEVELS OF TRANSACTION ISOLATION

To further understand how transaction management is implemented in a database, it is important that you learn about the transaction isolation levels defined in the ANSI SQL 1992 standard. The ANSI SQL standard (1992) defines transaction management based on transaction isolation levels. Transaction isolation levels refer to the degree to which transaction data is protected or isolated from other concurrent transactions. The isolation levels are described based on what data other transactions can see (read) during execution. More precisely, the transaction isolation levels are described by the type of reads that a transaction allows or not. The types of read operations are:

- Dirty read: a transaction can read data that is not yet committed.
- Non-repeatable read: a transaction reads a given row at time t1, and then it reads the same row at time t2, yielding different results. The original row may have been updated or deleted.
- Phantom read: a transaction executes a query at time t1, and then it runs the same query at time t2, yielding additional rows that satisfy the query.

Based on the above operations, ANSI defined four levels of transaction isolation: Read Uncommitted, Read Committed, Repeatable Read and Serialisable. Table 12.13 shows the four ANSI transaction isolation levels. The table also shows an additional level of isolation provided by Oracle and MS SQL Server databases.

Read Uncommitted will read uncommitted data from other transactions. At this isolation level, the database does not place any locks on data, which increases transaction performance but at the cost of data consistency. Read Committed forces transactions to read only committed data. This is the default mode of operation for most databases (including Oracle and SQL Server). At this level, the database will use exclusive locks on data, causing other transactions to wait until the original transaction commits. The Repeatable Read isolation level ensures that queries return consistent results. This type of isolation level uses shared locks to ensure that other transactions do not update a row after the original query reads it. However, new rows are read (phantom read), as these rows did not exist when the first query ran. The Serialisable isolation level is the most restrictive level defined by the ANSI SQL standard. However, it is important to note that even with a Serialisable isolation level, deadlocks are always possible. Most databases use a deadlock detection approach to transaction management and, therefore, they will detect deadlocks during the transaction validation phase and reschedule the transaction.

The reason for the different levels of isolation is to increase transaction concurrency. The isolation levels go from the least restrictive (Read Uncommitted) to the more restrictive (Serialisable). The higher the isolation level, the more locks (shared and exclusive) are required to improve data consistency, at the expense of transaction concurrency performance. The isolation level of a transaction is defined in the transaction statement, for example using general ANSI SQL syntax:

BEGIN TRANSACTION ISOLATION LEVEL READ COMMITTED
... SQL STATEMENTS ...
COMMIT TRANSACTION;
Oracle the
level
and
consistent
Copyright Editorial
review
2020 has
Server
SQL
statement-level
Cengage deemed
MS SQL
of isolation.
Learning. that
any
All suppressed
Rights
reads
Reserved. content
use
Server
does
May not
not materially
be
the
to
copied, affect
SET
supports ensure
scanned, the
overall
or
TRANSACTION all four
Read
duplicated, learning
in experience.
ISOLATION
ANSI isolation
Committed
whole
or in Cengage
part.
and
Due Learning
to
electronic reserves
LEVEL
statement
Oracle
by
levels. Repeatable
rights, the
right
some to
third remove
to
default
Read transactions.
party additional
content
may content
be
suppressed at
any
time
MySQL
from if
define
provides
the
subsequent
eBook rights
and/or restrictions
eChapter(s). require
it
TABLE 12.13 Transaction isolation levels

Rows run from the less restrictive levels (top) to the more restrictive levels (bottom). Y = the read is allowed, N = the read is not allowed.

Isolation level        | Dirty read | Non-repeatable read | Phantom read | Comment
Read Uncommitted       | Y          | Y                   | Y            | The transaction reads uncommitted data, allows non-repeatable reads and phantom reads.
Read Committed         | N          | Y                   | Y            | Does not allow uncommitted data reads but allows non-repeatable reads and phantom reads.
Repeatable Read        | N          | N                   | Y            | Only allows phantom reads.
Serialisable           | N          | N                   | N            | Does not allow dirty reads, non-repeatable reads or phantom reads.

Oracle/SQL Server only:

Read Only/Snapshot     | N          | N                   | N            | Supported by Oracle and SQL Server. The transaction can only see the changes that were committed at the time the transaction started.
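The relationship between the isolation levels and the three types of reads can be sketched as a small lookup. This is only an illustration under stated assumptions: the names `ALLOWED` and `least_restrictive_level` are hypothetical, and the sets encode the prose description of each level (Repeatable Read still permits phantom reads), not any particular DBMS.

```python
# Read phenomena that each ANSI isolation level still permits,
# ordered from least restrictive to most restrictive.
# (Hypothetical names; the content follows the level descriptions in the text.)
ALLOWED = {
    "READ UNCOMMITTED": {"dirty", "non-repeatable", "phantom"},
    "READ COMMITTED": {"non-repeatable", "phantom"},
    "REPEATABLE READ": {"phantom"},
    "SERIALISABLE": set(),
}
PHENOMENA = {"dirty", "non-repeatable", "phantom"}


def least_restrictive_level(must_prevent):
    """Return the least restrictive level that permits none of the given reads."""
    must_prevent = set(must_prevent)
    unknown = must_prevent - PHENOMENA
    if unknown:
        raise ValueError(f"unknown read phenomena: {sorted(unknown)}")
    # Dicts keep insertion order, so levels are tried from least restrictive up.
    for level, allowed in ALLOWED.items():
        if not (allowed & must_prevent):
            return level
    raise AssertionError("unreachable: SERIALISABLE permits nothing")
```

Choosing the lowest level that rules out the unwanted reads mirrors the trade-off described in the text: every step up requires more locks and costs some concurrency.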
MySQL uses START TRANSACTION WITH CONSISTENT SNAPSHOT to provide transactions with consistent reads; that is, the transaction can only see the committed data at the time the transaction started.

As you can see from the previous discussion, transaction management is a complex subject and databases make use of various techniques to manage the concurrent execution of transactions. However, it may be necessary sometimes to employ database recovery techniques to restore the database to a consistent state.

12.7 DATABASE RECOVERY MANAGEMENT

Database recovery restores a database from a given state, usually inconsistent, to a previously consistent state. Recovery techniques are based on the atomic transaction property: all portions of the transaction must be treated as a single, logical unit of work in which all operations are applied and completed to produce a consistent database. If, for some reason, any transaction operation cannot be completed, the transaction must be aborted and any changes to the database must be rolled back (undone).
In short, transaction recovery reverses all of the changes that the transaction made to the database before the transaction was aborted.

Although this chapter has emphasised the recovery of transactions, recovery techniques also apply to the database and to the system after some type of critical error has occurred. Examples of critical events are:

• Hardware/software failures. A failure of this type could be a hard disk media failure, a bad capacitor on a motherboard, or a failing memory bank. Other causes of errors under this category include application program or operating system errors that cause data to be overwritten, deleted, or lost. Some database administrators argue that this is one of the most common sources of database problems.
• Human-caused incidents. This type of event can be categorised as unintentional or intentional.
  - An unintentional failure is caused by a careless end user. Such errors include deleting the wrong rows from a table, pressing the wrong key on the keyboard, or shutting down the main database server by accident.
  - Intentional events are of a more severe nature and normally indicate that the company data are at serious risk. Under this category are security threats caused by hackers trying to gain unauthorised access to data resources and virus attacks caused by disgruntled employees trying to compromise the database operation and damage the company.
• Natural disasters. This category includes fires, earthquakes, floods and power failures.

Whatever the cause, a critical error can render the database into an inconsistent state. The following section introduces the various techniques used to recover the database from an inconsistent state to a consistent state.

12.7.1 Transaction Recovery

In Section 12.1.4, you learnt about the transaction log structure and how it contains data for database recovery purposes. Database transaction recovery focuses on the different methods used to recover a database from an inconsistent to a consistent state by using the data in the transaction log. Before continuing, let's examine four important concepts that affect the recovery process:

• The write-ahead-log protocol. This protocol ensures that transaction logs are always written before any database data are actually updated. This protocol ensures that, in case of a failure, the database can later be recovered to a consistent state, using the data in the transaction log.
• Redundant transaction logs. Most DBMSs keep several copies of the transaction log to ensure that a physical disk failure will not impair the DBMS's ability to recover data.
• Database buffers. A buffer is a temporary storage area in primary memory used to speed up disk operations. To improve processing time, the DBMS software reads the data from the physical disk and stores a copy of it on a buffer in primary memory. When a transaction updates data, it actually updates the copy of the data in the buffer because that process is much faster than accessing the physical disk every time. Later on, all buffers that contain updated data are written to a physical disk during a single operation, thereby saving significant processing time.
• Database checkpoints. A database checkpoint is an operation in which the DBMS writes all of its updated buffers to disk. While this is happening, the DBMS does not execute any other requests. A checkpoint operation is also registered in the transaction log. As a result of this operation, the physical database and the transaction log will be in sync. This synchronisation is required because update operations update the copy of the data in the buffers and not in the physical database. Checkpoints are automatically scheduled by the DBMS several times per hour. As you will see next, checkpoints also play an important role in transaction recovery.
The database recovery process involves bringing the database to a consistent state after a failure. Transaction recovery procedures generally make use of deferred-write and write-through techniques.

When the recovery procedure uses deferred write or deferred update, the transaction operations do not immediately update the physical database. Instead, only the transaction log is updated. The database is physically updated only after the transaction reaches its commit point, using the transaction log information. If the transaction aborts before it reaches its commit point, no changes (no ROLLBACK or undo) need to be made to the database because the database was never updated. The recovery process for all started and committed transactions (before the failure) follows these steps:

1 Identify the last checkpoint in the transaction log. This is the last time transaction data were physically saved to disk.
2 For a transaction that started and committed before the last checkpoint, nothing needs to be done, because the data are already saved.
3 For a transaction that performed a COMMIT operation after the last checkpoint, the DBMS uses the transaction log records to redo the transaction and to update the database, using the 'after' values in the transaction log. The changes are made in ascending order, from oldest to newest.
4 For any transaction that had a ROLLBACK operation after the last checkpoint or that was left active (with neither a COMMIT nor a ROLLBACK) before the failure occurred, nothing needs to be done because the database was never updated.

When the recovery procedure uses write-through or immediate update, the database is immediately updated by transaction operations during the transaction's execution, even before the transaction reaches its commit point. If the transaction aborts before it reaches its commit point, a ROLLBACK or undo operation needs to be done to restore the database to a consistent state. In that case, the ROLLBACK operation will use the transaction log 'before' values. The recovery process follows these steps:

1 Identify the last checkpoint in the transaction log. This is the last time transaction data were physically saved to disk.
2 For a transaction that started and committed before the last checkpoint, nothing needs to be done because the data are already saved.
3 For a transaction that committed after the last checkpoint, the DBMS redoes the transaction, using the 'after' values of the transaction log. Changes are applied in ascending order, from oldest to newest.
4 For any transaction that had a ROLLBACK operation after the last checkpoint or that was left active (with neither a COMMIT nor a ROLLBACK) before the failure occurred, the DBMS uses the transaction log records to ROLLBACK or undo the operations, using the 'before' values in the transaction log. Changes are applied in reverse order, from newest to oldest.
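The two recovery procedures differ only in whether uncommitted work has to be undone, which a short sketch can make concrete. The code below is a minimal illustration, not a DBMS implementation: `LogRecord`, `recover` and the single-value-per-key `disk` dictionary are hypothetical names, and the log is assumed to be a list in chronological order. The undo pass is run before the redo pass so that a redone committed value is never overwritten by an undo; the step numbers in the comments refer to the lists above.

```python
from collections import namedtuple

# Hypothetical, simplified log record: key/before/after are only used by UPDATEs.
LogRecord = namedtuple("LogRecord", "trl_id trx op key before after",
                       defaults=(None, None, None))


def recover(log, disk, method):
    """Bring `disk` (a dict of key -> value) to a consistent state.

    method is 'deferred' (deferred write) or 'write-through' (immediate update).
    """
    if method not in ("deferred", "write-through"):
        raise ValueError("method must be 'deferred' or 'write-through'")

    # Step 1: position of the last checkpoint (-1 if there is none).
    last_cp = max((i for i, r in enumerate(log) if r.op == "CHECKPOINT"),
                  default=-1)
    commit_pos = {r.trx: i for i, r in enumerate(log) if r.op == "COMMIT"}
    rollback_pos = {r.trx: i for i, r in enumerate(log) if r.op == "ROLLBACK"}

    # Step 4 (write-through only): undo transactions that rolled back after
    # the checkpoint or were still active, newest change first, 'before' values.
    if method == "write-through":
        for r in reversed(log):
            if r.op != "UPDATE" or r.trx in commit_pos:
                continue
            if rollback_pos.get(r.trx, len(log)) > last_cp:
                disk[r.key] = r.before

    # Step 2 needs no code: commits before the checkpoint are already on disk.
    # Step 3: redo transactions that committed after the checkpoint,
    # oldest change first, using the 'after' values.
    for r in log:
        if r.op == "UPDATE" and commit_pos.get(r.trx, -1) > last_cp:
            disk[r.key] = r.after
    return disk
```

With deferred write the undo pass is skipped entirely, because an uncommitted transaction never reached the physical database in the first place.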
TABLE 12.14 A transaction log for transaction recovery examples

TRL ID | TRX NUM | PREV PTR | NEXT PTR | Operation  | Table            | Row ID   | Attribute    | Before Value | After Value
341    | 101     | Null     | 352      | START      | **** Start Transaction
352    | 101     | 341      | 363      | UPDATE     | PRODUCT          | 54778-2T | PROD_QOH     | 45           | 43
363    | 101     | 352      | 365      | UPDATE     | CUSTOMER         | 10011    | CUST_BALANCE | 615.73       | 675.62
365    | 101     | 363      | Null     | COMMIT     | **** End of Transaction
397    | 106     | Null     | 405      | START      | **** Start Transaction
405    | 106     | 397      | 415      | INSERT     | INVOICE          | 1009     |              |              | 1009,10016, ...
415    | 106     | 405      | 419      | INSERT     | LINE             | 1009,1   |              |              | 1009,1, 89-WRE-Q,1, ...
419    | 106     | 415      | 427      | UPDATE     | PRODUCT          | 89-WRE-Q | PROD_QOH     | 12           | 11
423    |         |          |          | CHECKPOINT |
427    | 106     | 419      | 431      | UPDATE     | CUSTOMER         | 10016    | CUST_BALANCE | 0.00         | 277.55
431    | 106     | 427      | 457      | INSERT     | ACCT_TRANSACTION | 10007    |              |              | 1007,18-JAN-2014, ...
457    | 106     | 431      | Null     | COMMIT     | **** End of Transaction
521    | 155     | Null     | 525      | START      | **** Start Transaction
525    | 155     | 521      | 528      | UPDATE     | PRODUCT          | 2232/QWE | PROD_QOH     | 6            | 26
528    | 155     | 525      | Null     | COMMIT     | **** End of Transaction

* * * * * C * R * A * S * H * * * * *
You may use the transaction log in Table 12.14 to trace a simple database recovery process. To make sure that you understand the recovery process, a simple transaction log is used that includes three transactions and one checkpoint. This transaction log includes the transaction components used earlier in the chapter, so you should already be familiar with the basic process. Given the transaction, the transaction log has the following characteristics:

• Transaction 101 consists of two UPDATE statements that reduce the quantity on hand for product 54778-2T and increase the customer balance for customer 10011 for a credit sale of two units of product 54778-2T.
• Transaction 106 is the same credit sales event you saw in Section 12.1.1. This transaction represents the credit sale of one unit of product 89-WRE-Q to customer 10016 in the amount of 277.55. This transaction consists of five SQL DML (three INSERT and two UPDATE) statements.
• Transaction 155 represents a simple inventory update. This transaction consists of one UPDATE statement that increases the quantity on hand of product 2232/QWE from 6 units to 26 units.
• A database checkpoint wrote all updated database buffers to disk. The checkpoint event writes only the changes for all previously committed transactions. In this case, the checkpoint applies all changes done by transaction 101 to the database data files.

Using Table 12.14, you can now trace the database recovery process for a DBMS, using the deferred update method as follows:

1 Identify the last checkpoint. In this case, the last checkpoint was TRL ID 423. This was the last time database buffers were physically written to disk.
2 Note that transaction 101 started and finished before the last checkpoint. Therefore, all changes were already written to disk, and no additional action needs to be taken.
3 For each transaction that committed after the last checkpoint (TRL ID 423), the DBMS uses the transaction log data to write the changes to disk, using the 'after' values. For example, for transaction 106:
  - Find COMMIT (TRL ID 457).
  - Use the previous pointer values to locate the start of the transaction (TRL ID 397).
  - Use the next pointer values to locate each DML statement and apply the changes to disk, using the 'after' values. (Start with TRL ID 405, then 415, 419, 427 and 431.) Remember that TRL ID 457 was the COMMIT statement for this transaction.
  - Repeat the process for transaction 155.
4 Any other transactions are ignored. Therefore, for transactions that ended with ROLLBACK or that were left active (those that do not end with a COMMIT or ROLLBACK), nothing is done because no changes were written to disk.
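The pointer-following trace above can be reproduced in a few lines. The sketch below is an illustration only: the log rows are transcribed from Table 12.14 (the INSERT 'after' values are truncated with '...' exactly as in the table), while `Rec`, `LOG` and `deferred_redo` are hypothetical names and the dictionary standing in for the disk is an assumption. It also assumes, as in the table, that TRL IDs increase over time.

```python
from collections import namedtuple

# One log row: pointers plus the change it records (None where not applicable).
Rec = namedtuple("Rec", "trx prev next op table row attr before after",
                 defaults=(None,) * 5)

LOG = {
    341: Rec(101, None, 352, "START"),
    352: Rec(101, 341, 363, "UPDATE", "PRODUCT", "54778-2T", "PROD_QOH", 45, 43),
    363: Rec(101, 352, 365, "UPDATE", "CUSTOMER", "10011", "CUST_BALANCE", 615.73, 675.62),
    365: Rec(101, 363, None, "COMMIT"),
    397: Rec(106, None, 405, "START"),
    405: Rec(106, 397, 415, "INSERT", "INVOICE", "1009", None, None, "1009,10016,..."),
    415: Rec(106, 405, 419, "INSERT", "LINE", "1009,1", None, None, "1009,1,89-WRE-Q,1,..."),
    419: Rec(106, 415, 427, "UPDATE", "PRODUCT", "89-WRE-Q", "PROD_QOH", 12, 11),
    423: Rec(None, None, None, "CHECKPOINT"),
    427: Rec(106, 419, 431, "UPDATE", "CUSTOMER", "10016", "CUST_BALANCE", 0.00, 277.55),
    431: Rec(106, 427, 457, "INSERT", "ACCT_TRANSACTION", "10007", None, None, "1007,18-JAN-2014,..."),
    457: Rec(106, 431, None, "COMMIT"),
    521: Rec(155, None, 525, "START"),
    525: Rec(155, 521, 528, "UPDATE", "PRODUCT", "2232/QWE", "PROD_QOH", 6, 26),
    528: Rec(155, 525, None, "COMMIT"),
}


def deferred_redo(log):
    """Return (TRL IDs redone in order, values written) for deferred update."""
    # Step 1: the last checkpoint (0 if the log has none).
    checkpoint = max((i for i, r in log.items() if r.op == "CHECKPOINT"), default=0)
    applied, disk = [], {}
    # Step 3: every COMMIT logged after the checkpoint, oldest first.
    commits = sorted(i for i, r in log.items() if r.op == "COMMIT" and i > checkpoint)
    for commit_id in commits:
        cur = commit_id
        while log[cur].prev is not None:      # previous pointers back to START
            cur = log[cur].prev
        cur = log[cur].next                   # first DML statement
        while log[cur].op != "COMMIT":        # next pointers forward to COMMIT
            rec = log[cur]
            disk[(rec.table, rec.row, rec.attr)] = rec.after
            applied.append(cur)
            cur = rec.next
    return applied, disk
```

Transaction 101 never appears in the result because its COMMIT (TRL ID 365) precedes the checkpoint, which is step 2 of the trace.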
SUMMARY

• A transaction is a sequence of database operations that access the database. A transaction represents real-world events. A transaction must be a logical unit of work; that is, no portion of the transaction can exist by itself. Either all parts are executed, or the transaction is aborted. A transaction takes a database from one consistent state to another. A consistent database state is one in which all data integrity constraints are satisfied.
• Transactions have five main properties: atomicity (all parts of the transaction are executed; otherwise, the transaction is aborted), consistency (maintaining the permanence of the database's consistent state), isolation (data used by one transaction cannot be accessed by another transaction until the first transaction is completed), durability (the changes made by a transaction cannot be rolled back once the transaction is committed) and serialisability (the result of the concurrent execution of transactions is the same as that of the transactions being executed in serial order).
• SQL provides support for transactions through the use of two statements: COMMIT (saves changes to disk) and ROLLBACK (restores the previous database state).
• SQL transactions are formed by several SQL statements or database requests. Each database request originates several I/O database operations.
• The transaction log keeps track of all transactions that modify the database. The information stored in the transaction log is used for recovery (ROLLBACK) purposes.
• Concurrency control coordinates the simultaneous execution of transactions. The concurrent execution of transactions can result in three main problems: lost updates, uncommitted data and inconsistent retrievals.
• The scheduler is responsible for establishing the order in which the concurrent transaction operations are executed. The transaction execution order is critical and ensures database integrity in multi-user database systems. Locking, time stamping and optimistic methods are used by the scheduler to ensure the serialisability of transactions.
• A lock guarantees unique access to a data item by a transaction. The lock prevents one transaction from using the data item while another transaction is using it. There are several levels of locking: database, table, page, row and field.
• Two types of locks can be used in database systems: binary locks and shared/exclusive locks. A binary lock can have only two states: 1 (locked) or 0 (unlocked). A shared lock is used when a transaction wants to read data from a database and no other transaction is updating the same data. Several shared or Read locks can exist for a particular item. An exclusive lock is issued when a transaction wants to update (write to) the database and no other locks (shared or exclusive) are held on the data.
• Serialisability of schedules is guaranteed through the use of two-phase locking. The two-phase locking schema has a growing phase, in which the transaction acquires all of the locks that it needs without unlocking any data, and a shrinking phase, in which the transaction releases all of the locks without acquiring new locks.
• When two or more transactions wait indefinitely for each other to release a lock, they are in a deadlock, or a deadly embrace. There are three deadlock control techniques: prevention, detection and avoidance.
• Concurrency control with time stamping methods assigns a unique time stamp to each transaction and schedules the execution of conflicting transactions in time stamp order. Two schemes are used to decide which transaction is rolled back and which continues executing: the wait/die scheme and the wound/wait scheme.
• Concurrency control with optimistic methods assumes that the majority of database transactions do not conflict and that transactions are executed concurrently, using private, temporary copies of the data. At commit time, the private copies are updated to the database.
• The ANSI standard defines four transaction isolation levels: Read Uncommitted, Read Committed, Repeatable Read and Serialisable.
• Database recovery restores the database from a given state to a previous consistent state.
• Database backups are permanent copies of the database; they are stored in a safe place and are used in case of a critical error in the master database.
KEY TERMS

atomicity
atomic transaction property
binary lock
buffer
checkpoint
concurrency control
consistency
consistent database state
database-level lock
database recovery
database request
deadlock
deadly embrace
deferred update
deferred write
differential backup
durability
exclusive lock
field-level lock
full backup
immediate update
inconsistent retrievals
isolation
lock
lock granularity
lock manager
lost updates
monotonicity
mutually exclusive rule
optimistic approach
page
page-level lock
Read Committed
Read Uncommitted
redundant transaction logs
repeatable read
row-level lock
scheduler
serialisability
serialisable
shared lock
table-level lock
time stamping
transaction
transaction log
transaction log backup
two-phase locking
uncommitted data
uniqueness
wait/die
wound/wait
write-ahead-log protocol
write-through
FURTHER READING

Assaf, W., West, R., Aelterman, S. and Curnutt, M. SQL Server 2017 Administration Inside Out. Microsoft Press, 2017.
Brumm, B. Beginning Oracle SQL for Oracle Database 18c: From Novice to Professional, 1st edition. Apress, 2019.
Seppo, S. and Soisalon-Soininen, S. Transaction Processing: Management of the Logical Database and its Underlying Physical Structure, Data-Centric Systems and Applications. Springer, 2016.
Online Content: Answers to selected Review Questions and Problems for this chapter are available on the online platform for this book.

REVIEW QUESTIONS

1 Explain the following statement: A transaction is a logical unit of work.
2 What is a consistent database state, and how is it achieved?
3 The DBMS does not guarantee that the semantic meaning of the transaction truly represents the real-world event. What are the possible consequences of that limitation? Give an example.
4 List and discuss the four transaction properties.
5 What is a transaction log, and what is its function?
6 What is a scheduler, what does it do and why is its activity important to concurrency control?
7 What is a lock, and how, in general, does it work?
8 What is concurrency control, and what is its objective?
9 What is an exclusive lock, and under which circumstances is it granted?
10 What is a deadlock, and how can it be avoided? Discuss several deadlock avoidance strategies.
11 Which three levels of backup may be used in database recovery management? Briefly describe what each of those three backup levels does.
12 What are the four ANSI transaction isolation levels? Which type of reads does each level allow?
13 What does serialisability of transactions mean?
PROBLEMS

1 Suppose you are a manufacturer of product ABC, which is composed of parts A, B and C. Each time a new product ABC is created, it must be added to the product inventory, using the PROD_QOH in a table named PRODUCT. And each time the product is created, the parts inventory, using PART_QOH in a table named PART, must be reduced by one each of parts A, B and C. The sample database contents are shown in the following tables.

Table name: PRODUCT
PROD_CODE | PROD_QOH
ABC       | 1205

Table name: PART
PART_CODE | PART_QOH
A         | 567
B         | 98
C         | 549

Given that information, answer questions a-e.
a How many database requests can you identify for an inventory update for both PRODUCT and PART?
b Using SQL, write each database request you identified in Step a.
c Write the complete transaction(s).
d Write the transaction log, using Table 12.1 on p. 646 as your template.
e Using the transaction log you created in Step d, trace its use in database recovery.

2 Describe the three most common concurrent transaction execution problems. Explain how concurrency control can be used to avoid those problems.
3 Which DBMS component is responsible for concurrency control? How is this feature used to resolve conflicts?
4 Using a simple example, explain the use of binary and shared/exclusive locks in a DBMS.
5 Suppose your database system has failed. Describe the database recovery process and the use of deferred-write and write-through techniques.
Online Content: The 'Ch12_ABC_Markets' database is available on the online platform for this book.

6 ABC Markets sell products to customers. The entity relationship diagram shown in Figure P12.6 represents the main entities for ABC's database. Note the following important characteristics:

• A customer may make many purchases, each one represented by an invoice.
  - The CUS_BALANCE is updated with each credit purchase or payment and represents the amount the customer owes.
  - The CUS_BALANCE is increased (+) with every credit purchase and decreased (-) with every customer payment.
  - The date of last purchase is updated with each new purchase made by the customer.
  - The date of last payment is updated with each new payment made by the customer.

FIGURE P12.6 The ABC Markets Relational Diagram
• An invoice represents a product purchase by a customer.
  - An INVOICE can have many invoice LINEs, one for each product purchased.
  - The INV_TOTAL represents the total cost of the invoice, including taxes.
  - The INV_TERMS can be 30, 60 or 90 (representing the number of days of credit) or CASH, CHEQUE or CC.
  - The invoice status can be OPEN, PAID or CANCEL.
• A product's quantity on hand (P_QTYOH) is updated (decreased) with each product sale.
• A customer may make many payments. The payment type (PMT_TYPE) can be one of the following:
  - CASH for cash payments.
  - CHEQUE for cheque payments.
  - CC for credit card payments.
• The payment details (PMT_DETAILS) are used to record data about cheque or credit card payments:
  - The bank, account number and cheque number for cheque payments.
  - The issuer, credit card number and expiration date for credit card payments.

Note: Not all entities and attributes are represented in this example. Use only the attributes indicated.

Using this database, write the SQL code to represent each of the following transactions. Use BEGIN TRANSACTION and COMMIT to group the SQL statements in logical transactions.

a On 11 May 2019 customer 10010 makes a credit purchase (30 days) of one unit of product 11QER/31 with a unit price of 110.00; the tax rate is 8 per cent. The invoice number is 10983, and this invoice has only one product line.
b On 3 June 2019 customer 10010 makes a payment of 100 in cash. The payment ID is 3428.
c Create a simple transaction log (using the format shown in Table 12.14) to represent the actions of the two previous transactions.

7 Create a simple transaction log (using the format shown in Table 12.14) to represent the actions of the transactions in Problems 6a and 6b.
8 Assuming that pessimistic locking is being used but the two-phase locking protocol is not, create a chronological list of locking, unlocking and data manipulation activities that would occur during the complete processing of the transaction described in Problem 6a.
9 Assuming that pessimistic locking is being used with the two-phase locking protocol, create a chronological list of locking, unlocking and data manipulation activities that would occur during the complete processing of the transaction described in Problem 6a.
10 Assuming that pessimistic locking is being used but the two-phase locking protocol is not, create a chronological list of locking, unlocking and data manipulation activities that would occur during the complete processing of the transaction described in Problem 6b.
11 Assuming that pessimistic locking is being used with the two-phase locking protocol, create a chronological list of locking, unlocking and data manipulation activities that would occur during the complete processing of the transaction described in Problem 6b.
CHAPTER 13 Managing Database and SQL Performance

IN THIS CHAPTER, YOU WILL LEARN:
Basic database performance-tuning concepts
How a DBMS processes SQL queries
About the importance of indexes in query processing
About the types of decisions the query optimiser has to make
Some common practices used to write efficient SQL code
How to formulate queries and tune the DBMS for optimal performance

PREVIEW
Database performance optimisation is a critical topic, yet it usually receives minimal coverage in the database curriculum. Most databases used in classrooms have only a few records per table. As a result, the focus is often on making SQL queries perform an intended task, without considering the efficiency of the query process. In fact, even the most efficient query environment gives no visible performance improvements over the least efficient query environment when only 20 or 30 table rows (records) are queried. Unfortunately, the lack of attention to query efficiency can give unacceptably slow results when, in the real world, queries are executed over tens of millions of records. In this chapter, you learn what it takes to create a more efficient query environment.

NOTE
As this book focuses on databases, this chapter covers only those factors directly affecting database performance. Also, because performance-tuning techniques can be DBMS-specific, the material in this chapter may not be applicable under all circumstances, nor will it necessarily pertain to all DBMS types. This chapter is designed to build a foundation for the general understanding of database performance-tuning issues and to help you choose appropriate performance-tuning strategies. (For the most current information about tuning your database, consult the vendor's documentation.)
13.1 DATABASE PERFORMANCE-TUNING CONCEPTS
One of the main functions of a database system is to provide answers to end users when required. End users and the DBMS interact through the use of queries to generate information, using the following sequence:
1 The end-user (client-end) application generates a query.
2 The query is sent to the DBMS (server end).
3 The DBMS (server end) executes the query.
4 The DBMS sends the resulting data set to the end-user (client-end) application.
End users expect their queries to return results as quickly as possible. How do you know whether the performance of a database is good? Good database performance is hard to evaluate. How do you know if a 1.06-second query response time is good enough? It is easier to identify bad database performance than good database performance: all it takes is end-user complaints about slow query results. Unfortunately, the same query may perform well one day and not so well two months later. Regardless of end-user perceptions, the goal of database performance is to execute queries as fast as possible. Therefore, database performance must be closely monitored and regularly tuned. Database performance tuning refers to a set of activities and procedures designed to reduce the response time of a database system, that is, to try to ensure that an end-user query is processed by the DBMS in the minimum amount of time.
The time required by a query to return a result set depends on many factors. These factors tend to be wide-ranging and vary from environment to environment and from vendor to vendor. The performance of a typical DBMS is constrained by three main factors: CPU processing power, available primary memory (RAM) and input/output (hard disk and network) throughput. Table 13.1 lists some system components and summarises general guidelines for achieving better query performance.

TABLE 13.1 General guidelines for better system performance
System Resources | Client | Server
Hardware: CPU | The fastest possible | The fastest possible, i.e. quad-core or higher. Multiple processors. Cluster of networked computers. Virtualised server technology.
Hardware: RAM | The maximum possible | The maximum possible to avoid OS memory to disk swapping.
Hardware: Storage | Fast SATA/EIDE hard disk with sufficient free hard disk space. Solid state drives (SSDs) for faster speed. | Multiple high-speed, high-capacity disks (SCSI/SATA/Firewire/Fibre Channel) in RAID configuration. Solid state drives (SSDs) for faster speed. Separate disks and data spaces.
Hardware: Network | High-speed connection | High-speed connection.
Software: Operating system | Fine-tuned for best client application performance | 64-bit OS for larger address spaces. Fine-tuned for best server application performance possible.
Software: Network | Fine-tuned for best throughput | Fine-tuned for best throughput.
Software: Application | Optimise SQL in client application | Optimise DBMS for best performance.
Naturally, the system performs best when its hardware and software resources are optimised. However, in the real world, unlimited resources are not the norm; internal and external constraints always exist. Therefore, the existing system components should be optimised to obtain the best throughput possible with existing (and often limited) resources, which is why database performance tuning is important.
Fine-tuning the performance of a system requires a holistic approach. That is, all factors must be checked to ensure that each one operates at its optimum level and has sufficient resources to minimise the occurrence of bottlenecks. As database design is such an important factor in determining the database system's performance efficiency, it is worth repeating this book's ethos:
Good database performance starts with good database design. No amount of fine-tuning will make a poorly designed database perform as well as a well-designed database.
This is true when an existing database is redesigned and the end user expects an unrealistic performance gain from older databases.

13.1.1 Performance Tuning: Client and Server
In general, database performance-tuning activities can be divided into those taking place either on the client side or on the server side:
On the client side, the objective is to generate a SQL query that returns the correct answer in the least amount of time, using the minimum amount of resources at the server end. The activities required to achieve that goal are commonly referred to as SQL performance tuning.
On the server side, the DBMS environment must be properly configured to respond to clients' requests in the fastest way possible, while making optimum use of existing resources. The activities required to achieve that goal are commonly referred to as DBMS performance tuning.

Online Content
If you want to learn more about clients and servers, check Appendix F, Client/Server Systems, located on the online platform for this book.

Keep in mind that DBMS implementations are typically more complex than just a two-tier client/server configuration. However, even in multi-tier (client front-end, application middleware and database server back-end) client/server environments, performance-tuning activities are frequently subdivided into subtasks to ensure the fastest possible response time between any two component points.
This chapter covers SQL performance-tuning practices on the client side and DBMS performance-tuning practices on the server side. However, before you can start learning about the tuning processes, you must first learn more about the DBMS architectural components and processes, and how those processes interact to respond to end-user requests.

13.1.2 DBMS Architecture
The architecture of a DBMS is represented by the processes and structures (in memory and in permanent storage) used to manage a database. Such processes collaborate with one another to perform specific functions.
Figure 13.1 illustrates the basic DBMS architecture.

FIGURE 13.1 Basic DBMS architecture
[Figure: on the client computer, the user's client process sends a SQL query to the DBMS server computer. On the server, the DBMS processes (listener, user, scheduler, lock manager and optimiser) run in primary memory (RAM) together with the SQL cache and the data cache. I/O operations move data between the data cache and the database data files, grouped in table spaces and stored in secondary memory (hard disk). The result set is sent back to the client.]
As you examine Figure 13.1, note the following components and functions:

All data in a database are stored in data files. A typical enterprise database is normally composed of several data files. A data file can contain rows from one single table, or it can contain rows from many different tables. A database administrator (DBA) determines the initial size of the data files that make up the database; however, as required, the data files can automatically expand in predefined increments known as extends. For example, if more space is required, the DBA can define that each new extend will be in 10 KB or 10 MB increments.

Data files are generally grouped in file groups creating table spaces. A table space or file group is a logical grouping of several data files that store data with similar characteristics. For example, you may have a system table space where the data dictionary table data are stored; a user data table space to store the user-created tables; an index table space to hold all indexes; and a temporary table space to do temporary sorts, grouping, and so on. Each time you create a new database, the DBMS automatically creates a minimum set of table spaces.

The data cache or buffer cache is a shared, reserved memory area that stores the most recently accessed data blocks in RAM. The data cache is where data read from the database data files are stored after the data have been read or before the data are written to the database data files. The data cache also caches system catalogue data and the contents of the indexes.

To work with the data, the DBMS must retrieve the data from permanent storage (data files in which the data are stored) and place it in RAM (data cache).
The SQL cache or procedure cache is a shared, reserved memory area that stores the most recently executed SQL statements or PL/SQL procedures, including triggers and functions. (If you want to know more about PL/SQL procedures, triggers and SQL functions, study Chapter 9, Procedural Language SQL and Advanced SQL.) The SQL cache doesn't store the end-user written SQL. Rather, the SQL cache stores a processed version of the SQL that is ready for execution by the DBMS.

To move data from permanent storage (data files) to RAM (data cache), the DBMS issues I/O requests and waits for the replies. An input/output (I/O) request is a low-level (read or write) data access operation to/from computer devices (such as memory, hard disks, video and printer). The purpose of the I/O operation is to move data to and from different computer components or devices. Note that an I/O disk read operation retrieves an entire physical disk block, generally containing multiple rows, from permanent storage to the data cache, even if you use only one attribute from only one row. The physical disk block size depends on the operating system and could be 4K, 8K, 16K, 32K, 64K or even larger. Furthermore, depending on the situation, a DBMS might issue a single-block read request or a multi-block read request.

Working with data in the data cache is many times faster than working with data in the data files, because the DBMS doesn't have to wait for the hard disk to retrieve the data. (That's because no hard disk I/O operations are needed to work within the data cache.)

The majority of performance-tuning activities focus on minimising the number of I/O operations, because using I/O operations is many times slower than reading data from the data cache.

Also illustrated in Figure 13.1 are some typical DBMS processes. Although the number of processes and their names vary from vendor to vendor, the functionality is similar. The following processes are represented in Figure 13.1:

Listener. The listener process listens for clients' requests and hands the processing of the SQL requests to other DBMS processes. Once a request is received, the listener passes the request to the appropriate user process.

User. The DBMS creates a user process to manage each client session. Therefore, when you log on to the DBMS, you are assigned a user process. This process will handle all requests you submit to the server. There are many user processes, at least one per each logged-in client.

Scheduler. The scheduler process schedules the concurrent execution of SQL requests. (See Chapter 12, Managing Transactions and Concurrency.)

Lock manager. This process manages all locks placed on database objects. (See Chapter 12, Managing Transactions and Concurrency.)

Optimiser. The optimiser process analyses SQL queries and finds the most efficient way to access the data. You will learn more about this process later in the chapter.
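The role of the data cache described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not how any particular DBMS implements its buffer cache: the class name `DataCache`, the least-recently-used eviction policy, the four-rows-per-block layout and the `fake_disk_read` helper are all hypothetical. The sketch only shows why a block that is already in RAM saves a physical I/O request.

```python
from collections import OrderedDict


class DataCache:
    """Toy buffer cache holding the most recently accessed disk blocks."""

    def __init__(self, capacity, read_block_from_disk):
        self.capacity = capacity                 # maximum number of cached blocks
        self.read_block = read_block_from_disk   # callable that simulates an I/O request
        self.blocks = OrderedDict()              # block_id -> block contents
        self.hits = 0
        self.physical_reads = 0

    def get(self, block_id):
        if block_id in self.blocks:
            # Cache hit: the block is already in RAM, so no I/O request is issued.
            self.blocks.move_to_end(block_id)
            self.hits += 1
            return self.blocks[block_id]
        # Cache miss: read the entire block, even if only one row is wanted.
        self.physical_reads += 1
        block = self.read_block(block_id)
        self.blocks[block_id] = block
        if len(self.blocks) > self.capacity:
            self.blocks.popitem(last=False)      # evict the least recently used block
        return block


def fake_disk_read(block_id):
    # Hypothetical stand-in for a data file: each block holds four rows.
    return [f"row {block_id * 4 + i}" for i in range(4)]


cache = DataCache(capacity=2, read_block_from_disk=fake_disk_read)
for block_id in [0, 1, 0, 2, 0, 1]:
    cache.get(block_id)

print(cache.hits, cache.physical_reads)
```

With a two-block cache, the repeated requests for block 0 are served from memory, while every first-time or evicted block costs a simulated disk read. Raising `capacity` in this sketch reduces `physical_reads`, which mirrors the guideline of giving the server as much RAM as possible.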
13.1.3 Database Query Optimisation Modes
Most of the algorithms proposed for query optimisation are based on two principles:
The selection of the optimum execution order
The selection of sites to be accessed to minimise communication costs
Within those two principles, a query optimisation algorithm can be evaluated on the basis of its operation mode or the timing of its optimisation.

Operation modes can be classified as manual or automatic. Automatic query optimisation means that the DBMS finds the most cost-effective access path without user intervention. Manual query optimisation requires that the optimisation be selected and scheduled by the end user or programmer. Automatic query optimisation is clearly more desirable from the end user's point of view, but the cost of such convenience is the increased overhead that it imposes on the DBMS.

Query optimisation algorithms can also be classified according to when the optimisation is done. Within this timing classification, query optimisation algorithms can be classified as static or dynamic.

Static query optimisation takes place at compilation time. In other words, the best optimisation strategy is selected when the query is compiled by the DBMS. This approach is common when SQL statements are embedded in procedural programming languages such as C# or Visual Basic .NET. When the program is submitted to the DBMS for compilation, it creates the plan necessary to access the database. When the program is executed, the DBMS uses that plan to access the database.

Dynamic query optimisation takes place at execution time. Database access strategy is defined when the program is executed. Therefore, access strategy is dynamically determined by the DBMS at run time, using the most up-to-date information about the database. Although dynamic query optimisation is efficient, its cost is measured by run-time processing overhead. The best strategy is determined every time the query is executed, which could happen several times in the same program.

Finally, query optimisation techniques can be classified according to the type of information that is used to optimise the query. For example, queries may be based on statistically based or rule-based algorithms.

A statistically based query optimisation algorithm uses statistical information about the database. The statistics provide information about database characteristics such as size, number of records, average access time, number of requests serviced and number of users with access rights. These statistics are then used by the DBMS to determine the best access strategy. Within statistically based optimisers, some DBMSs allow setting a goal to specify that the optimiser should attempt to minimise the time to retrieve the first row or the last row. Minimising the time to retrieve the first row is often used in transaction systems and interactive client environments. In these cases, the goal is to present the first several rows to the user as quickly as possible. Then, while the DBMS waits for the user to scroll through the data, it can fetch the other rows for the query. Setting the optimiser goal to minimise retrieval of the last row is typically done in embedded SQL and inside stored procedures. In these cases, control will not pass back to the calling application until all of the data have been retrieved; therefore, it is important to retrieve all of the data to the last row as quickly as possible so control can be returned.

The statistical information is managed by the DBMS and is generated in one of two different modes: dynamic or manual. In the dynamic statistical generation mode, the DBMS automatically evaluates and updates the statistics after each access. In the manual statistical generation mode, the statistics must be updated periodically through a user-selected utility, which is specific to a particular DBMS.

A rule-based query optimisation algorithm is based on a set of user-defined rules to determine the best query access strategy. The rules are entered by the end user or database administrator, and they are typically general in nature.

Because database statistics play a crucial role in query optimisation, this topic is explored in more detail in the next section.
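To make the idea of a statistically based optimiser more concrete, the following Python sketch compares the estimated cost of two access paths and picks the cheaper one. It is a deliberately simplified illustration: the cost formulas, the assumed index depth of 3 and the names `TableStats`, `full_scan_cost`, `index_scan_cost` and `choose_access_path` are assumptions made for this example, not the cost model of any real DBMS.

```python
from dataclasses import dataclass


@dataclass
class TableStats:
    rows: int              # number of rows in the table
    rows_per_block: int    # rows stored in one disk block
    distinct_values: int   # distinct values in the indexed column


def full_scan_cost(stats):
    # A full table scan reads every block of the table once (ceiling division).
    return -(-stats.rows // stats.rows_per_block)


def index_scan_cost(stats, index_depth=3):
    # Assumed model: walk the index, then one block read per matching row.
    matching_rows = stats.rows / stats.distinct_values
    return index_depth + matching_rows


def choose_access_path(stats):
    # Pick the access path with the lowest estimated number of block reads.
    costs = {
        "full table scan": full_scan_cost(stats),
        "index scan": index_scan_cost(stats),
    }
    return min(costs, key=costs.get), costs


# A selective column: 1,000,000 rows spread over 50,000 distinct values.
selective = TableStats(rows=1_000_000, rows_per_block=100, distinct_values=50_000)
# A non-selective column: only two distinct values.
non_selective = TableStats(rows=1_000_000, rows_per_block=100, distinct_values=2)

print(choose_access_path(selective)[0])
print(choose_access_path(non_selective)[0])
```

Under these assumed formulas, the selective column favours the index, whereas the column with only two distinct values makes the full table scan cheaper. The decision depends entirely on the statistics supplied, which is why out-of-date statistics can lead an optimiser to a poor access strategy.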
13.1.4 Database Statistics
Another DBMS process that plays an important role in query optimisation is gathering database statistics. Database statistics refers to a number of measurements about database objects such as tables and indexes, and available resources such as number of processors used, processor speed and temporary space available. These statistics give a snapshot of database characteristics.
As you will learn later in this chapter, the DBMS uses the statistics to make critical decisions about improving query processing efficiency. Database statistics can be gathered on request by the DBA or automatically by the DBMS. For example, many DBMS vendors support the ANALYZE command in SQL to gather statistics. In addition, many vendors have their own routines to gather statistics. For example, IBM's DB2 uses the RUNSTATS procedure, while Microsoft's SQL Server uses the UPDATE STATISTICS procedure and provides the Auto-Update and Auto-Create Statistics options in its initialisation parameters. A sample of measurements that the DBMS may gather about database objects is shown in Table 13.2.

TABLE 13.2 Sample database statistics measurements
Database Object | Sample Measurements
Tables | Number of rows, number of disk blocks used, row length, number of columns in each row, number of distinct values in each column, maximum value in each column, minimum value in each column, and columns that have indexes
Indexes | Number and name of columns in the index key, number of key values in the index, number of distinct key values in the index key, and histogram of key values in an index
Environment Resources | Logical and physical disk block size, location and size of data files, and number of extends per data file

If the object statistics exist, the DBMS uses them in query processing. Although some newer DBMSs (such as Oracle, SQL Server and DB2) automatically gather statistics, others require the DBA to gather statistics on request. To generate the database object statistics, you could use the following syntax:
ANALYZE <TABLE/INDEX> object_name COMPUTE STATISTICS;
For example, to generate statistics for the VENDOR table, you would use the following command:
ANALYZE TABLE VENDOR COMPUTE STATISTICS;
When you generate statistics for a table, all related indexes are also analysed. However, you could generate statistics for a single index with the following command:
ANALYZE INDEX VEND_NDX COMPUTE STATISTICS;
(In this example, VEND_NDX is the name of the index.)
Database statistics are stored in the system catalogue in specially designated tables. It is common to periodically regenerate the statistics for database objects, especially those database objects that are subject to frequent change. For example, if you are the owner of a video store and you have a video rental DBMS, your system will likely use a RENTAL table to store the daily video rentals. That RENTAL table and its associated indexes would be subject to constant inserts and updates as you record your daily rentals and returns. Therefore, the RENTAL table statistics you generated last week do not depict an accurate picture of the table as it exists today. The more current the statistics, the better the chances are for the DBMS to find the fastest way to execute a given query.
Now that you know the basic architecture of DBMS processes and memory structures, you are ready to learn how the DBMS processes a SQL query request.
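As a hands-on complement to the ANALYZE commands in Section 13.1.4, the Python sketch below gathers statistics in SQLite through the standard `sqlite3` module. Note the assumptions: SQLite's statement is plain `ANALYZE` (it does not use the `COMPUTE STATISTICS` clause shown above), its statistics land in a catalogue table named `sqlite_stat1`, and the column names `V_CODE`, `V_NAME` and `V_STATE` and the sample rows are invented for illustration. Only the VENDOR table name and the VEND_NDX index name come from the text.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE VENDOR (V_CODE INTEGER PRIMARY KEY, V_NAME TEXT, V_STATE TEXT)"
)
conn.execute("CREATE INDEX VEND_NDX ON VENDOR (V_STATE)")

# Hypothetical sample rows: six vendors spread over three states.
conn.executemany(
    "INSERT INTO VENDOR (V_CODE, V_NAME, V_STATE) VALUES (?, ?, ?)",
    [
        (1, "Vendor A", "FL"),
        (2, "Vendor B", "FL"),
        (3, "Vendor C", "TN"),
        (4, "Vendor D", "TN"),
        (5, "Vendor E", "GA"),
        (6, "Vendor F", "GA"),
    ],
)

# Gather statistics on request; SQLite stores them in its own catalogue table.
conn.execute("ANALYZE")

stats = conn.execute(
    "SELECT tbl, idx, stat FROM sqlite_stat1 WHERE idx = 'VEND_NDX'"
).fetchall()
for tbl, idx, stat in stats:
    print(tbl, idx, stat)
```

The `stat` column is a short list of integers whose first value is the approximate number of rows in the index. If many more rows are inserted afterwards, the stored figures stay unchanged until `ANALYZE` is run again, which illustrates why statistics for frequently changing tables need to be regenerated periodically.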
13.2 QUERY PROCESSING
What happens at the DBMS server end when the client's SQL statement is received? In simple terms, the DBMS processes queries in three phases:
1 Parsing. The DBMS parses the SQL query and chooses the most efficient access/execution plan.
2 Execution. The DBMS executes the SQL query, using the chosen execution plan.
3 Fetching. The DBMS fetches the data and sends the result set back to the client.
The processing of SQL DDL statements (such as CREATE TABLE) is different from the processing required by DML statements. The difference is that a DDL statement actually updates the data dictionary tables or system catalogue, while a DML statement (SELECT, INSERT, UPDATE and DELETE) mostly manipulates end-user data. Figure 13.2 shows the general steps required for query processing. Each of the steps is discussed in the following sections.

FIGURE 13.2 Query processing
[Figure: a Select ... From ... Where ... query passes through three phases. Parsing phase: syntax check, naming check, access rights check, decompose and analyse, generate access plan, store access plan in SQL cache. Execution phase: execute I/O operations, add locks for transaction management, retrieve data blocks from data files, place data blocks in data cache. Fetching phase: generate result set. The phases work with the SQL cache, the data cache and the data files.]
13.2.1 SQL Parsing Phase
The optimisation process includes breaking down (parsing) the query into smaller units and transforming the original SQL query into a slightly different version of the original SQL code, but one that is fully equivalent and more efficient. Fully equivalent means that the optimised query results are always the same as the original query. More efficient means that the optimised query will almost always execute faster than the original query. (Note that it almost always executes faster because, as explained earlier, many factors affect the performance of a database. Those factors include the network, the client computer's resources, and even other queries running concurrently in the same database.) To determine the most efficient way to execute the query, the DBMS may use the database statistics you learnt about earlier.
The SQL parsing activities are performed by the query optimiser. The query optimiser analyses the SQL query and finds the most efficient way to access the data. This process is the most time-consuming phase in query processing. Parsing a SQL query requires several steps. The SQL query is:
Validated for syntax compliance
Validated against the data dictionary to ensure that tables and column names are correct
Validated against the data dictionary to ensure that the user has proper access rights
Analysed and decomposed into more atomic components
Optimised through transformation into a fully equivalent but more efficient SQL query
Prepared for execution by determining the most efficient execution or access plan.
Once the SQL statement is transformed, the DBMS creates what is commonly known as an access or execution plan. An access/execution plan contains the series of steps a DBMS uses to execute the query and return the result set in the most efficient way. First, the DBMS checks to see if an access plan for the query already exists in the SQL cache. If it does, the DBMS reuses the access plan to save time. If it doesn't, the optimiser evaluates different plans and makes decisions about which indexes to use and how best to perform join operations. The chosen access plan for the query is then placed in the SQL cache and made available for use and future reuse.
Access plans are DBMS-specific and translate the client's SQL query into the series of complex I/O operations required to read the data from the physical data files and generate the result set. Although the access plans are DBMS-specific, some commonly found I/O operations are illustrated in Table 13.3.
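The check-then-reuse behaviour of the SQL cache described in this section can be sketched as a small lookup table. This is only an illustration under assumptions: the `SqlCache` class, the `toy_optimiser` function and the choice to key the cache on the exact SQL text are hypothetical and do not describe the internals of any particular DBMS.

```python
class SqlCache:
    """Toy SQL cache: maps SQL text to a previously prepared access plan."""

    def __init__(self, optimiser):
        self.optimiser = optimiser   # callable: SQL text -> access plan
        self.plans = {}              # SQL text -> cached access plan
        self.parse_count = 0         # how often the costly parsing step ran

    def get_plan(self, sql):
        if sql in self.plans:
            # An access plan already exists: reuse it to save time.
            return self.plans[sql]
        # No plan yet: parse and optimise, then store the plan for future reuse.
        self.parse_count += 1
        plan = self.optimiser(sql)
        self.plans[sql] = plan
        return plan


def toy_optimiser(sql):
    # Hypothetical stand-in for the time-consuming parse and optimise step.
    return ("ACCESS PLAN FOR", sql)


cache = SqlCache(toy_optimiser)
query = "SELECT * FROM VENDOR WHERE V_STATE = 'FL'"

plan1 = cache.get_plan(query)   # first request: parsed and optimised
plan2 = cache.get_plan(query)   # identical text: plan reused from the cache
cache.get_plan("SELECT * FROM VENDOR WHERE V_STATE = 'TN'")  # new text: parsed again

print(cache.parse_count)
```

In this sketch, two statements that differ only in a literal value are treated as different statements and are parsed separately. That design choice keeps the example short; whether and how a real DBMS matches similar statements is product-specific.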
TABLE 13.3 Sample DBMS access plan I/O operations
Operation | Description
Table Scan (Full) | Reads the entire table sequentially, from the first row to the last row, one row at a time (slowest).
Table Access (Row ID) | Reads a table row directly, using the row ID value (fastest).
Index Scan (Range) | Reads the index first to obtain the row IDs and then accesses the table rows directly (faster than a full table scan).
Index Access (Unique) | Used when a table has a unique index in a column.
Nested Loop | Reads and compares a set of values to another set of values, using a nested loop style (slow).
Merge | Merges two data sets (slow).
Sort | Sorts a data set (slow).

Table 13.3 shows just a few database access I/O operations. (This illustration is based on an Oracle RDBMS.) However, Table 13.3 does illustrate the type of I/O operations that most DBMSs perform when accessing and manipulating data sets.
As you examine Table 13.3, note that a table access using a row ID is the fastest method. A row ID is a unique identification for every row saved in permanent storage; it is similar to a data row address and can be used to access the row directly. Conceptually, a row ID is like the slip you get when you park your car in an airport car park. The parking slip contains the section number and space number. Using that information, you can go directly to your car without having to go through every section and space.
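The operations in Table 13.3 are Oracle-style, but the same distinction between scanning a table and using an index can be observed in other systems. The Python sketch below uses SQLite's `EXPLAIN QUERY PLAN` statement as a stand-in. The column names, the sample data and the helper `plan_for` are assumptions made for this example, and the exact wording of the plan text varies between SQLite versions, so the code prints whatever the installed version reports.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE VENDOR (V_CODE INTEGER PRIMARY KEY, V_NAME TEXT, V_STATE TEXT)"
)
# Hypothetical data: 100 vendors alternating between two states.
conn.executemany(
    "INSERT INTO VENDOR VALUES (?, ?, ?)",
    [(i, f"Vendor {i}", "FL" if i % 2 else "TN") for i in range(1, 101)],
)


def plan_for(sql):
    # Each EXPLAIN QUERY PLAN row ends with a human-readable description.
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return [row[3] for row in rows]


query = "SELECT V_NAME FROM VENDOR WHERE V_STATE = 'FL'"

plan_without_index = plan_for(query)        # no index yet: the table is scanned
conn.execute("CREATE INDEX VEND_NDX ON VENDOR (V_STATE)")
plan_with_index = plan_for(query)           # the index is now available
plan_by_row_id = plan_for("SELECT V_NAME FROM VENDOR WHERE V_CODE = 42")

print(plan_without_index)
print(plan_with_index)
print(plan_by_row_id)
```

Before the index exists, the plan reports a scan of the whole table. After `VEND_NDX` is created, the same query is answered through the index, and the lookup on `V_CODE` goes straight to the row through the integer primary key, which in SQLite plays the role of the row ID.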
13.2.2 SQL Execution Phase
In this phase, all I/O operations indicated in the access plan are executed. When the execution plan is run, the proper locks are acquired, if needed, for the data to be accessed, and the data are retrieved from the data files and placed in the DBMS's data cache. All transaction management commands are processed during the parsing and execution phases of query processing.

13.2.3 SQL Fetching Phase
After the parsing and execution phases are completed, all rows that match the specified condition(s) are retrieved, sorted, grouped and/or (if required) aggregated. During the fetching phase, the rows of the resulting query result set are returned to the client. During this phase, the DBMS may use temporary table space to store temporary data.
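From the client's point of view, the fetching phase shows up as rows arriving after the statement has been submitted. The following Python sketch, using the standard `sqlite3` module, retrieves a result set in small batches. The RENTAL column names `RENT_NUM` and `RENT_FEE`, the sample rows and the batch size of four are assumptions for illustration. SQLite is an embedded engine, so the boundaries between executing and fetching are less sharply separated than in a client/server DBMS.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE RENTAL (RENT_NUM INTEGER PRIMARY KEY, RENT_FEE REAL)")
# Hypothetical data: ten rentals with the same fee.
conn.executemany(
    "INSERT INTO RENTAL VALUES (?, ?)",
    [(n, 2.5) for n in range(1, 11)],
)

# Submitting the statement covers parsing and the start of execution.
cursor = conn.execute("SELECT RENT_NUM, RENT_FEE FROM RENTAL ORDER BY RENT_NUM")

# Fetching: pull the result set back to the client a few rows at a time.
batch_sizes = []
while True:
    batch = cursor.fetchmany(4)
    if not batch:
        break
    batch_sizes.append(len(batch))

print(batch_sizes)
```

Fetching in batches lets an interactive client display the first rows while the remaining rows are still to be retrieved, which relates to the first-row optimiser goal discussed in Section 13.1.3.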
13.2.4 Query Processing Bottlenecks
The main objective of query processing is to execute a given query in the fastest way possible with the fewest resources. As you have seen, the execution of a query requires the DBMS to break down the query into a series of interdependent I/O operations to be executed in a collaborative manner. The more complex a query is, the more complex the operations are, which means that bottlenecks are more likely. A query processing bottleneck is a delay introduced in the processing of an I/O operation that causes the overall system to slow down. In the same way, the more components a system has, the more interfacing among the components is required, increasing the likelihood of bottlenecks. Within a DBMS, five components typically cause bottlenecks:

CPU. The CPU processing power of the DBMS should match the system's expected workload. A high CPU utilisation might indicate that the processor speed is too slow for the amount of work performed. However, heavy CPU utilisation can be caused by other factors, such as a defective component, not enough RAM (the CPU spends too much time swapping memory blocks), a badly written device driver or a rogue process. A CPU bottleneck will affect not only the DBMS but all processes running in the system.

RAM. The DBMS allocates memory for specific usage, such as data cache and SQL cache. RAM must be shared among all running processes, including the operating system and DBMS. If there is not enough RAM available, moving data among components that are competing for scarce RAM can create a bottleneck.

Hard disk. The hard disk space is used for more than just storing end-user data. Current operating systems also use the hard disk for virtual memory, which refers to copying areas of RAM to the hard disk as needed to make room in RAM for more urgent tasks. Therefore, more available hard disk storage space and the ability to have faster data transfer rates reduce the likelihood of bottlenecks.

Network. In a database environment, the database server and the clients are connected via a network. All networks have a limited amount of bandwidth that is shared among all clients. When many network nodes access the network at the same time, bottlenecks are likely.

Application code. Two of the most common sources of bottlenecks are inferior application code and poorly designed databases. Inferior code can be improved with code optimisation techniques, as long as the underlying database design is sound. However, no amount of coding will make a poorly designed database perform better.

Learning how to avoid these bottlenecks and optimise database performance is the main focus of this chapter.
13.3 INDEXES AND QUERY OPTIMISATION
In Chapter 11, Conceptual, Logical, and Physical Database Design, you learnt that indexes are crucial in the physical database design stage because they speed up data access. Indexes facilitate searching, sorting, using aggregate functions and even join operations. The improvement in data access speed occurs because an index is an ordered set of values that contains the index key and pointers. In addition, you also learnt that indexes are recommended where high-sparsity columns are used in search conditions. The careful selection of indexes will play a huge part in query optimisation. In Section 13.5, you will learn how indexes impact SQL performance tuning.
NOTE
You can learn how to select indexes in Chapter 11, Conceptual, Logical, and Physical Database Design (Section 11.3.3).
13.4 OPTIMISER CHOICES
Query optimisation is the central activity during the parsing phase in query processing. In this phase, the DBMS must choose what indexes to use, how to perform join operations, which table to use first, and so on. Each DBMS has its own algorithms for determining the most efficient way to access the data. The query optimiser can operate in one of two modes:

- A rule-based optimiser uses a set of preset rules and points to determine the best approach to execute a query. The rules assign a fixed cost to each SQL operation; the costs are then added to yield the cost of the execution plan. For example, a full table scan will have a set cost of ten, while a table access by row ID will have a set cost of three.
- A cost-based optimiser uses sophisticated algorithms based on the statistics about the objects being accessed to determine the best approach to execute a query. In this case, the optimiser process adds up the processing cost, the I/O costs and the resource costs (RAM and temporary space) to come up with the total cost of a given execution plan.

The optimiser objective is to find alternative ways to execute a query, to evaluate the cost of each alternative and then to choose the one with the lowest cost. To understand the function of the query optimiser, let's use a simple example. Assume that you want to list all products provided by a vendor based in South Africa (SA). To acquire that information, you could write the following query:

    SELECT P_CODE, P_DESCRIPT, P_PRICE, V_NAME, V_STATE
    FROM PRODUCT, VENDOR
    WHERE PRODUCT.V_CODE = VENDOR.V_CODE
    AND VENDOR.V_COUNTRY = 'SA';

Furthermore, let's assume that the database statistics indicate that:

1. The PRODUCT table has 7 000 rows.
2. The VENDOR table has 300 rows.
3. Ten vendors come from South Africa.
4. One thousand products come from vendors in South Africa.

It's important to point out that only items 1 and 2 are available to the optimiser. Items 3 and 4 are assumed to illustrate the choices that the optimiser must make. Armed with the information in items 1 and 2, the optimiser would try to find the most efficient way to access the data. The primary factor in determining the most efficient access plan is the I/O cost. (Remember, the DBMS will always try to minimise I/O operations.) Table 13.4 shows two sample access plans for the previous query and their respective I/O costs.
TABLE 13.4 Comparing access plans and I/O costs

Plan | Step | Operation | I/O Operations | I/O Cost | Resulting Set Rows | Total I/O Cost
A | A1 | Cartesian product: A1 = (PRODUCT X VENDOR) | 7 000 + 300 | 7 300 | 2 100 000 | 7 300
A | A2 | Select rows in A1 with matching vendor codes: A2 = σ PRODUCT.v_code = VENDOR.v_code (A1) | 2 100 000 | 2 100 000 | 7 000 | 2 107 300
A | A3 | Select rows in A2 with V_COUNTRY = 'SA': A3 = σ V_COUNTRY = 'SA' (A2) | 7 000 | 7 000 | 1 000 | 2 114 300
B | B1 | Select rows in VENDOR with V_COUNTRY = 'SA': B1 = σ V_COUNTRY = 'SA' (VENDOR) | 300 | 300 | 10 | 300
B | B2 | Cartesian product: B2 = (PRODUCT X B1) | 7 000 + 10 | 7 010 | 70 000 | 7 310
B | B3 | Select rows in B2 with matching vendor codes: B3 = σ PRODUCT.v_code = B1.v_code (B2) | 70 000 | 70 000 | 1 000 | 77 310
To make the example easier to understand, the I/O Operations and I/O Cost columns in Table 13.4 estimate only the number of I/O disk reads the DBMS must perform. For simplicity's sake, it is assumed that there are no indexes and that each row read has an I/O cost of 1. For example, in Step A1, the DBMS must perform a Cartesian product of PRODUCT and VENDOR. To do that, the DBMS must read all rows from PRODUCT (7 000) and all rows from VENDOR (300), giving a total of 7 300 I/O operations. The same computation is done in all steps. In Table 13.4, you can see how plan A has a total I/O cost that is almost 30 times higher than plan B. In this case, the optimiser will choose plan B to execute the SQL.
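The arithmetic behind the two plans can be checked with a short script. This is only a sketch of the simplified cost model used in the example (one I/O per row read, no indexes); the variable and function names are illustrative and do not come from any DBMS.

```python
# Row counts taken from the example's assumed statistics.
PRODUCT_ROWS = 7_000
VENDOR_ROWS = 300
SA_VENDORS = 10  # item 3: vendors based in South Africa


def plan_total(steps):
    """Sum the per-step I/O costs, assuming each row read costs 1."""
    return sum(cost for _name, cost in steps)


# Plan A: Cartesian product first, then the two selections.
plan_a = [
    ("A1 Cartesian product (PRODUCT, VENDOR)", PRODUCT_ROWS + VENDOR_ROWS),
    ("A2 select matching vendor codes in A1", PRODUCT_ROWS * VENDOR_ROWS),
    ("A3 select V_COUNTRY = 'SA' in A2", PRODUCT_ROWS),
]

# Plan B: restrict VENDOR first, then combine with the much smaller set.
plan_b = [
    ("B1 select V_COUNTRY = 'SA' in VENDOR", VENDOR_ROWS),
    ("B2 Cartesian product (PRODUCT, B1)", PRODUCT_ROWS + SA_VENDORS),
    ("B3 select matching vendor codes in B2", PRODUCT_ROWS * SA_VENDORS),
]

total_a = plan_total(plan_a)
total_b = plan_total(plan_b)
ratio = total_a / total_b

print(f"Plan A: {total_a}  Plan B: {total_b}  ratio: {ratio:.1f}")
```

The whole difference comes from applying the V_COUNTRY restriction before the Cartesian product, which shrinks the intermediate set from 2 100 000 rows to 70 000.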
NOTE
Not all DBMSs optimise SQL queries the same way. As a matter of fact, Oracle parses queries differently from the way described in several sections in this chapter. Always read the documentation to examine the optimisation requirements for your DBMS implementation.

Given the right conditions, some queries could be answered entirely with only an index. For example, assume the PRODUCT table and the index PQOH_NDX in the P_QOH attribute. Then a query such as SELECT MIN(P_QOH) FROM PRODUCT could be resolved by reading only the first entry in the PQOH_NDX index, without the need to access any of the data blocks for the PRODUCT table. (Remember that the index defaults to ascending order.)

You learnt in Chapter 11, Conceptual, Logical, and Physical Database Design, that columns with low sparsity are not good candidates for index creation. However, there are cases where an index in a low-sparsity column would be helpful. For example, assume that the EMPLOYEE table has 122 483 rows. If you want to find out how many female employees are in the company, you could write a query such as:

    SELECT COUNT(EMP_GENDER)
    FROM EMPLOYEE
    WHERE EMP_GENDER = 'F';

If you do not have an index for the EMP_GENDER column, the query would have to perform a full table scan to read all EMPLOYEE rows, and each full row includes attributes you do not need. However, if you have an index on EMP_GENDER, the query could be answered by reading only the index data, without the need to access the employee data at all.
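The index-only behaviour described above can be observed in a small experiment. The sketch below uses SQLite through Python's standard sqlite3 module purely as a convenient stand-in; it is not the DBMS discussed in the text, and the EMP_NUM and EMP_LNAME columns, the EMP_GENDER_NDX index name and the 1 000 generated rows are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE EMPLOYEE ("
    "EMP_NUM INTEGER PRIMARY KEY, EMP_LNAME TEXT, EMP_GENDER TEXT)"
)
# Hypothetical sample data: odd-numbered rows are 'F', even-numbered are 'M'.
conn.executemany(
    "INSERT INTO EMPLOYEE (EMP_LNAME, EMP_GENDER) VALUES (?, ?)",
    [(f"Name{i}", "F" if i % 2 else "M") for i in range(1000)],
)

QUERY = "SELECT COUNT(EMP_GENDER) FROM EMPLOYEE WHERE EMP_GENDER = 'F'"


def plan(sql):
    """Return SQLite's description of how it intends to run the query."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return " | ".join(row[-1] for row in rows)  # last column is the detail text


plan_without_index = plan(QUERY)  # no index yet: the table must be scanned
conn.execute("CREATE INDEX EMP_GENDER_NDX ON EMPLOYEE (EMP_GENDER)")
plan_with_index = plan(QUERY)     # the index alone can answer the query

female_count = conn.execute(QUERY).fetchone()[0]
print(plan_without_index)
print(plan_with_index)
print(female_count)
```

SQLite labels the second plan a covering index, meaning every column the query needs is present in the index, so the table rows are never read. Other DBMSs report the same idea under different names.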
13.4.1 Using Hints to Affect Optimiser Choices
Although the optimiser generally performs well under most circumstances, in some instances the optimiser may not choose the best execution plan. Remember, the optimiser makes decisions based on the existing statistics. If the statistics are old, the optimiser may not do a good job in selecting the best execution plan. Even with current statistics, the optimiser choice may not be the most efficient one. There are some occasions when the end user would like to change the optimiser mode for the current SQL statement. In order to do that, you need to use hints. Optimiser hints are special instructions for the optimiser that are embedded inside the SQL command text. Table 13.5 summarises a few of the most common optimiser hints used in SQL.

TABLE 13.5 Optimiser hints

Hint | Usage
ALL_ROWS | Instructs the optimiser to minimise the overall execution time, that is, to minimise the time it takes to return all rows in the query result set. This hint is generally used for batch mode processes. For example: SELECT /*+ ALL_ROWS */ * FROM PRODUCT WHERE P_QOH < 10;
FIRST_ROWS | Instructs the optimiser to minimise the time it takes to process the first set of rows, that is, to minimise the time it takes to return only the first set of rows in the query result set. This hint is generally used for interactive mode processes. For example: SELECT /*+ FIRST_ROWS */ * FROM PRODUCT WHERE P_QOH < 10;
INDEX(name) | Forces the optimiser to use the P_QOH_NDX index to process this query. For example: SELECT /*+ INDEX(P_QOH_NDX) */ * FROM PRODUCT WHERE P_QOH < 10;
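The hint syntax in Table 13.5 cannot be run outside a DBMS that supports it, but the effect of forcing an index choice can be sketched with SQLite, which offers a comparable mechanism through its INDEXED BY and NOT INDEXED clauses. This is an analogy only: the clauses below are SQLite syntax, not the /*+ ... */ hints from the table, and the sample data is invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE PRODUCT (P_CODE TEXT PRIMARY KEY, P_DESCRIPT TEXT, P_QOH INTEGER)"
)
# Hypothetical rows with quantities on hand cycling through 0..49.
conn.executemany(
    "INSERT INTO PRODUCT VALUES (?, ?, ?)",
    [(f"P{i:04d}", f"Item {i}", i % 50) for i in range(500)],
)
conn.execute("CREATE INDEX P_QOH_NDX ON PRODUCT (P_QOH)")


def plan(sql):
    """Return SQLite's query plan text for the given statement."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return " | ".join(row[-1] for row in rows)


# Require the named index (roughly the role of the INDEX(name) hint).
forced_index_plan = plan(
    "SELECT * FROM PRODUCT INDEXED BY P_QOH_NDX WHERE P_QOH < 10"
)
# Forbid index use, so the table itself is scanned.
forced_scan_plan = plan("SELECT * FROM PRODUCT NOT INDEXED WHERE P_QOH < 10")

print(forced_index_plan)
print(forced_scan_plan)
```

Both statements return the same rows; only the access path differs. That is also the point of a hint: it changes how the DBMS retrieves the data, not which data it retrieves.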
Now that you are familiar with the way the DBMS processes SQL queries, let's turn our attention to some general SQL coding recommendations to facilitate the work of the query optimiser.

13.5 SQL PERFORMANCE TUNING
SQL performance tuning is evaluated from the client perspective. Therefore, the goal is to illustrate some common practices used to write efficient SQL code. A few words of caution are appropriate:

1. Most current-generation relational DBMSs perform automatic query optimisation at the server end.
2. Most SQL performance optimisation techniques are DBMS-specific and, therefore, are rarely portable, even across different versions of the same DBMS. Part of the reason for this behaviour is the constant advancement in database technologies.

Does this mean that you should not worry about how a SQL query is written because the DBMS will always optimise it? No, because there is considerable room for improvement. (The DBMS uses general optimisation techniques, rather than focusing on specific techniques dictated by the special circumstances of the query execution.) A poorly written SQL query can, and usually will, bring the database system to its knees from a performance point of view. The majority of current database performance problems are related to poorly written SQL code. Therefore, although a DBMS provides general optimising services, a carefully written query almost always outperforms a poorly written one.

Although SQL data manipulation statements include many different commands (such as INSERT, UPDATE, DELETE and SELECT), most recommendations in this section are related to the use of the SELECT statement and, in particular, to the use of indexes and how to write conditional expressions.
13.5.1 Index Selectivity
Indexes are the most important technique used in SQL performance optimisation. As you learnt from Chapter 11, Conceptual, Logical, and Physical Database Design, index selectivity is a measure of how likely an index is to be used in query processing. To recap, the general guidelines for creating and using indexes are:

- Create indexes for each single attribute used in a WHERE, HAVING, ORDER BY or GROUP BY clause.
- Create an index when a MIN or MAX function is applied to a column.
- Create an index when the data sparsity on the column is high.
- Declare all primary and foreign keys so the optimiser can use the indexes in join operations.
- Declare indexes in join columns other than PK/FK.

However, you cannot always use an index to improve performance. For example, using the data shown in Table 13.6 in the next section, the creation of an index for P_MIN will not help the search condition P_QOH > P_MIN * 1.10. The reason is that in some DBMSs indexes are ignored when you use functions in the table attributes.

How many indexes should you create? It bears repeating that you should not create an index for every column in a table. Too many indexes will slow down INSERT, UPDATE and DELETE operations, especially if the table contains many thousands of rows. Furthermore, some query optimisers will choose only one index to be the driving index for a query, even if your query uses conditions in many different indexed columns. Which index does the optimiser use? If you use the cost-based optimiser, the answer will change with time as new rows are added or deleted from the tables. In any case, you should create indexes in all search columns and then let the optimiser choose. A proper procedure will be the constant evaluation of the index usage: monitor, test, evaluate and improve if performance is not adequate.
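The P_MIN case can be tried out in a small experiment. The sketch below again uses SQLite via Python's sqlite3 module as an assumed stand-in for "some DBMSs"; the table contents and the P_MIN_NDX index name are invented for illustration, and other products may plan these queries differently.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE PRODUCT ("
    "P_CODE TEXT PRIMARY KEY, P_QOH INTEGER, P_MIN INTEGER, P_PRICE REAL)"
)
# Hypothetical rows: quantities 0..39, minimums 0..9.
conn.executemany(
    "INSERT INTO PRODUCT VALUES (?, ?, ?, ?)",
    [(f"P{i:04d}", i % 40, i % 10, 5.0 + i % 20) for i in range(500)],
)
conn.execute("CREATE INDEX P_MIN_NDX ON PRODUCT (P_MIN)")


def plan(sql):
    """Return SQLite's query plan text for the given statement."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return " | ".join(row[-1] for row in rows)


# P_MIN appears inside an expression compared with another column of the
# same row, so the index on P_MIN gives the planner nothing to look up.
expression_plan = plan(
    "SELECT P_CODE, P_QOH FROM PRODUCT WHERE P_QOH > P_MIN * 1.10"
)
# Comparing the indexed column directly with a literal lets the index be used.
literal_plan = plan("SELECT P_CODE, P_QOH FROM PRODUCT WHERE P_MIN = 5")

print(expression_plan)
print(literal_plan)
```

Comparing the two plans is a small instance of the monitor-and-test procedure recommended above: create the index, inspect the plan, and keep the index only if the optimiser actually uses it.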
13.5.2 Conditional Expressions
A conditional expression is normally expressed within the WHERE or HAVING clauses of a SQL statement. A conditional expression (also known as conditional criteria) restricts the output of a query to only the rows matching the conditional criteria. Generally, the conditional criteria have the form shown in Table 13.6.

TABLE 13.6 Conditional criteria

Operand1 | Conditional Operator | Operand2
P_PRICE | > | 10.00
V_COUNTRY | = | 'SA'
V_CONTACT | LIKE | 'Moloi%'
P_QOH | > | P_MIN * 1.10

As you examine Table 13.6, note that an operand can be:

- A simple column name such as P_PRICE or V_COUNTRY.
- A literal or a constant such as the value 10.00 or the text 'SA'.
- An expression such as P_MIN * 1.10.

Most of the query optimisation techniques mentioned next are designed to make the optimiser's work easier. Let's examine some common practices used to write efficient conditional expressions in SQL code:

- Use simple columns or literals as operands in a conditional expression; avoid the use of conditional expressions with functions whenever possible. Comparing the contents of a single column to a literal is faster than comparing to expressions. For example, P_PRICE > 10.00 is faster than P_QOH > P_MIN * 1.10 because the DBMS must evaluate the P_MIN * 1.10 expression first. The use of functions in expressions will also add to the total query execution time. For example, if your condition is UPPER(V_NAME) = 'JIM', try to use V_NAME = 'Jim' if all names in the V_NAME column are stored with proper capitalisation.
- Numeric field comparisons are faster than character, date and NULL comparisons. In search conditions, comparing a numeric attribute to a numeric literal is faster than comparing a character attribute to a character literal. In general, the CPU handles numeric comparisons (integer, decimal) faster than character and date comparisons. As indexes do not store references to null values, NULL conditions involve additional processing and, therefore, tend to be the slowest of all conditional operands.
- Equality comparisons are faster than inequality comparisons. As a general rule, equality comparisons are processed faster than inequality comparisons. For example, P_PRICE = 10.00 is processed faster because the DBMS can do a direct search using the index in the column. If there are no exact matches, the condition is evaluated as false. However, if you use an inequality symbol (>, >=, <, <=), the DBMS must perform additional processing to complete the request. The reason is that there will almost always be more 'greater than' or 'less than' values than exactly 'equal' values in the index. With the exception of NULL, the slowest of all comparison operators is LIKE with wildcard symbols, as in V_CONTACT LIKE '%glo%'. Also, using the 'not equal' symbol (<>) yields slower searches, especially when the sparsity of the data is high, that is, when there are many more different values than there are equal values.
- Whenever possible, transform conditional expressions to use literals. For example, if your condition is P_PRICE - 10 = 7, change it to read P_PRICE = 17. Also, if you have a composite condition such as:

      P_QOH < P_MIN AND P_MIN = P_REORDER AND P_QOH = 10

  change it to read:

      P_QOH = 10 AND P_MIN = P_REORDER AND P_MIN > 10

- When using multiple conditional expressions, write the equality conditions first. (Note that this was done in the previous example.) Remember, equality conditions are faster to process than inequality conditions. Although most RDBMSs will automatically do this for you, paying attention to this detail lightens the load for the query optimiser. (The optimiser won't have to do what you have already done.)
- If you use multiple AND conditions, write the condition most likely to be false first. If you use this technique, the DBMS will stop evaluating the rest of the conditions as soon as it finds a conditional expression that is evaluated as false. Remember, for multiple AND conditions to be found true, all conditions must be evaluated as true. If one of the conditions evaluates to false, everything else is evaluated as false. Therefore, if you use this technique, the DBMS won't waste time unnecessarily evaluating additional conditions. Naturally, the use of this technique implies an implicit knowledge of the sparsity of the data set. For example, look at the following condition list:

      P_PRICE > 10 AND V_COUNTRY = 'SA'

  If you know that only a few vendors are located in South Africa, you could rewrite this condition as:

      V_COUNTRY = 'SA' AND P_PRICE > 10

- When using multiple OR conditions, put the condition most likely to be true first. By doing this, the DBMS will stop evaluating the remaining conditions as soon as it finds a conditional expression that is evaluated to be true. Remember, for multiple OR conditions to evaluate to true, only one of the conditions must be evaluated to true.

NOTE
Oracle evaluates queries in an opposite way from what is described here. That is, Oracle evaluates conditions from last to first.

- Whenever possible, try to avoid the use of the NOT logical operator. It is best to transform a SQL expression containing a NOT logical operator into an equivalent expression. For example, NOT (P_PRICE > 10.00) can be written as P_PRICE <= 10.00. Also, NOT (EMP_GENDER = 'M') can be written as EMP_GENDER = 'F' (assuming the column holds only the values 'M' and 'F').
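The advice on ordering AND conditions can be illustrated with a small model of short-circuit evaluation. This is a sketch of the principle only, not of how any particular DBMS evaluates predicates (as the text notes, many optimisers reorder conditions themselves); the Row class, the predicate functions and the sample data are assumptions.

```python
from dataclasses import dataclass


@dataclass
class Row:
    p_price: float
    v_country: str


def price_gt_10(row):
    return row.p_price > 10


def country_is_sa(row):
    return row.v_country == "SA"


def count_evaluations(rows, predicates):
    """Apply AND-ed predicates with short-circuiting.

    Returns (matching rows, total predicate evaluations).
    """
    matches = 0
    evaluations = 0
    for row in rows:
        for predicate in predicates:
            evaluations += 1
            if not predicate(row):
                break  # one false condition settles the AND; skip the rest
        else:
            matches += 1  # loop finished without a break: all were true
    return matches, evaluations


# 1 000 rows: every price exceeds 10, but only 1 row in 100 has an SA vendor.
rows = [
    Row(p_price=20.0, v_country="SA" if i % 100 == 0 else "UK")
    for i in range(1000)
]

price_first = count_evaluations(rows, [price_gt_10, country_is_sa])
country_first = count_evaluations(rows, [country_is_sa, price_gt_10])

print(price_first)
print(country_first)
```

Both orderings select the same rows, but placing the condition that is usually false first roughly halves the number of evaluations for this data. The saving depends entirely on the sparsity of the data, which is why the technique presumes some knowledge of the data set.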
13.6 QUERY FORMULATION
Queries are usually written to answer questions. For example, if an end user gives you a sample output and tells you to match that output format, you must write the required SQL. To get the job done, you must carefully evaluate which columns, tables and computations are required to generate the desired output. To do that, you must have a good understanding of the database environment and of the database that will be the focus of your SQL code.

This section will focus on SELECT queries because they are the queries you will find in most applications. To formulate a query, you would normally follow the steps outlined below:

1. Identify which columns and computations are required. The first step is to clearly determine which data values you want to return. Do you want to return just the names and addresses, or do you also want to include some computations? Remember that all columns in the SELECT statement should return single values.
   - Do you need simple expressions? That is, do you need to multiply the price times the quantity on hand to generate the total inventory cost? You may need some single attribute functions such as DATE(), SYSDATE() or ROUND().
   - Do you need aggregate functions? If you need to compute the total sales by product, you should use a GROUP BY clause. In some cases, you may need to use a subquery.
   - Determine the granularity of the raw data required for your output. The granularity is the level of detail of the data; data at the lowest level of detail is known as atomic data. Sometimes, you may need to summarise data that are not readily available on any table. In such cases, you may consider breaking the query into multiple subqueries and storing those subqueries as views. Then you could create a top-level query that joins those views and generates the final output. You will learn more about granularity in Chapter 15, Databases for Business Intelligence.
2. Identify the source tables. Once you know which columns are required, you can determine the source tables used in the query. Some attributes appear in more than one table. In those cases, try to use the fewest tables in your query to minimise the number of join operations.
3. Determine how to join the tables. Once you know which tables you need in your query statement, you must properly identify how to join the tables. In most cases, you will use some type of natural join, but in some instances, you may need to use an outer join.
4. Determine which selection criteria are needed. Most queries involve some type of selection criteria. In this case, you must determine which operands and operators are needed in your criteria. Ensure that the data type and granularity of the data in the comparison criteria are correct.
   - Simple comparison. In most cases, you will be comparing single values. For example: P_PRICE > 10
   - Single value to multiple values. If you are comparing a single value to multiple values, you may need to use an IN comparison operator. For example: V_COUNTRY IN ('FR', 'UK', 'SA');
   - Nested comparisons. In other cases, you may need to have some nested selection criteria involving subqueries. For example: P_PRICE >= (SELECT AVG(P_PRICE) FROM PRODUCT);
   - Grouped data selection. On other occasions, the selection criteria may apply not to the raw data but to the aggregate data. In those cases, you need to use the HAVING clause.
5. Determine in which order to display the output. Finally, the required output may be ordered by one or more columns. In those cases, you need to use the ORDER BY clause.

13.7 DBMS PERFORMANCE TUNING
DBMS performance tuning includes global tasks such as managing the DBMS processes in primary memory (allocating memory for caching purposes) and managing the structures in physical storage (allocating space for the data files). Fine-tuning the performance of the DBMS also includes applying several practices examined in the previous section. For example, the DBA must work with developers to ensure that the queries perform as expected, creating the indexes to speed up query response time and generating the database statistics required by cost-based optimisers.

DBMS performance tuning at the server end focuses on setting the parameters used for:

- Data cache. The data cache must be set large enough to permit as many data requests as possible to be serviced from the cache. Each DBMS has settings that control the size of the data cache. This cache is shared among all database users. The majority of primary memory resources will be allocated to the data cache.
- SQL cache. The SQL cache stores the most recently executed SQL statements (after the SQL statements have been parsed by the optimiser). Generally, if you have an application with multiple users accessing a database, the same query will likely be submitted by many different users. In these cases, the DBMS will parse the query only once and execute it many times, using the same access plan. In that way, the second and subsequent SQL requests for the same query are served from the SQL cache, skipping the parsing phase.
- Sort cache. The sort cache is used as a temporary storage area for ORDER BY or GROUP BY operations, as well as for index-creation functions.
- Optimiser mode. Most DBMSs operate in one of two optimisation modes: cost-based or rule-based. Others automatically determine the optimisation mode based on whether database statistics are available. For example, the DBA is responsible for generating the database statistics that are used by the cost-based optimiser. If the statistics are not available, the DBMS uses a rule-based optimiser.

From the performance point of view, it would be optimal to have the entire database stored in primary memory to minimise costly disk access. This is why several database vendors offer in-memory database options for their main products. In-memory database systems are optimised to store large portions (if not all) of the database in primary (RAM) storage rather than secondary (disk) storage. These systems are becoming popular because of increased performance demands of modern database applications (such as Business Analytics and Big Data), diminishing costs and technology advances of components (such as flash memory and solid state drives). Even though these types of databases are carving a niche in selected markets, most database implementations still rely on data stored on disk drives. Although in-memory databases eliminate disk access bottlenecks, they are still subject to query optimisation rules, especially when faced with poorly designed databases or poorly written SQL statements. That is why managing the physical storage details of the data files plays an important role in DBMS performance tuning. Note the following general recommendations for physical storage of databases:

- Use I/O accelerators. This type of device uses flash solid state drives (SSDs) to store the database. An SSD does not have any moving parts and, therefore, performs I/O operations at a higher speed than traditional rotating disk drives. I/O accelerators deliver high transaction performance rates and reduce contention caused by typical storage drives.
- Use RAID (Redundant Array of Independent Disks) to provide balance between performance and fault tolerance. RAID systems use multiple disks to create virtual disks (storage volumes) formed by several individual disks. RAID systems provide performance improvement and fault tolerance. Table 13.7 shows the most common RAID configurations.

TABLE 13.7 Common RAID configurations

RAID Level | Description
0 | The data blocks are spread over separate drives. Also known as a striped array. Provides increased performance but no fault tolerance. Fault tolerance means that, in case of failure, data could be reconstructed and retrieved. Requires a minimum of two drives.
1 | The same data blocks are written (duplicated) to separate drives. Also referred to as mirroring or duplexing. Provides increased read performance and fault tolerance via data redundancy. Requires a minimum of two drives.
3 | The data are striped across separate drives, and parity data are computed and stored in a dedicated drive. Parity data are specially generated data that permit the reconstruction of corrupted or missing data. Provides good read performance and fault tolerance via parity data. Requires a minimum of three drives.
5 | The data and the parity data are striped across separate drives. Provides good read performance and fault tolerance via parity data. Requires a minimum of three drives.
1+0 | The data blocks are spread over separate drives and mirrored. This arrangement provides both speed and fault tolerance. This is the recommended RAID configuration for most database installations (if cost is not an issue).
- Minimise disk contention. Use multiple, independent storage volumes with independent spindles (a spindle is a rotating disk) to minimise hard disk cycles. Remember, a database is composed of many table spaces, each with a particular function. In turn, each table space is composed of several data files (in which the data are actually stored). A database should have at least the following table spaces:
   - System table space. Used to store the data dictionary tables. It is the most frequently accessed table space and should be stored in its own volume.
   - User data table space. Used to store end-user data. You should create as many user data table spaces and data files as are required. You can create and assign a different user data table space for each application and/or for each group of users.
   - Index table space. Used to store indexes. You can create and assign a different index table space for each application and/or for each group of users. The index table space data files should be stored on a storage volume that is separate from user data files or system data files.
   - Temporary table space. Used as a temporary storage area for merge, sort or set aggregate operations. You can create and assign a different temporary table space for each application and/or for each group of users.
   - Rollback segment table space. Used for transaction-recovery purposes.
- Put high-usage tables in their own table spaces. By doing this, the database minimises conflict with other tables.
- Take advantage of the various table storage organisations available in the database. For example, in Oracle, consider the use of index-organised tables (IOT); in SQL Server, consider clustered index tables. An index-organised table (or clustered index table) is a table that stores the end-user data and the index data in consecutive locations on permanent storage. This type of storage organisation provides a performance advantage to tables that are commonly accessed through a given index order, because the index contains the index key as well as the data rows. Therefore, the DBMS tends to perform fewer I/O operations.
- Assign separate files in separate storage volumes for the indexes, system and high-usage tables. This ensures that index operations will not conflict with end-user data or data dictionary table access operations.
Partition
tables
based
on usage.
Some
RDBMSs
on attributes. (See Chapter 14, Distributed be processed by multiple data processors. the most.
Copyright Editorial
review
2020 has
Cengage deemed
Learning. that
any
All suppressed
Rights
Reserved. content
does
May not
not materially
be
copied, affect
scanned, the
overall
or
duplicated, learning
support
horizontal
partitioning
of tables
based
Databases.) By doing so, a single SQL request could Put the table partitions closest to where they are used
in experience.
whole
or in Cengage
part.
Due Learning
to
electronic reserves
rights, the
right
some to
third remove
party additional
content
may content
be
suppressed at
any
time
from if
the
subsequent
eBook rights
and/or restrictions
eChapter(s). require
it
692
PART
V Database
Transactions
and
Use denormalised tables where appropriate. Another performance-improving technique involves taking a table from a higher normal form to a lower normal form, typically from third to second normal form. This technique causes data duplication, but it minimises join operations. (Denormalisation was discussed in Chapter 7, Normalising Database Designs.)

Store computed and aggregate attributes in tables. In short, use derived attributes in your tables. For example, you might add the invoice subtotal, the amount of tax and the total in the INVOICE table. Using derived attributes minimises computations in queries and join operations.
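The derived-attribute recommendation can be sketched as follows. This is a minimal illustration, not the book's schema: it uses Python's built-in sqlite3 module rather than Oracle, and the column names INV_SUBTOTAL, INV_TAX and INV_TOTAL, as well as the 20 per cent tax rate, are assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE INVOICE (
    INV_NUMBER   INTEGER PRIMARY KEY,
    INV_SUBTOTAL REAL,   -- derived: sum of the invoice's line amounts
    INV_TAX      REAL,   -- derived: subtotal * tax rate
    INV_TOTAL    REAL    -- derived: subtotal + tax
);
CREATE TABLE LINE (
    INV_NUMBER  INTEGER,
    LINE_NUMBER INTEGER,
    LINE_UNITS  INTEGER,
    LINE_PRICE  REAL,
    PRIMARY KEY (INV_NUMBER, LINE_NUMBER)
);
INSERT INTO INVOICE (INV_NUMBER) VALUES (1001);
INSERT INTO LINE VALUES (1001, 1, 2, 10.00), (1001, 2, 1, 5.00);
""")

TAX_RATE = 0.20  # hypothetical rate, for illustration only

# Without derived attributes, every report recomputes the subtotal from LINE.
computed = conn.execute(
    "SELECT SUM(LINE_UNITS * LINE_PRICE) FROM LINE WHERE INV_NUMBER = ?",
    (1001,),
).fetchone()[0]

# With derived attributes, the values are stored once in INVOICE ...
conn.execute(
    "UPDATE INVOICE SET INV_SUBTOTAL = ?, INV_TAX = ?, INV_TOTAL = ? "
    "WHERE INV_NUMBER = ?",
    (computed, round(computed * TAX_RATE, 2), round(computed * (1 + TAX_RATE), 2), 1001),
)

# ... and later queries read them without touching the LINE table.
stored = conn.execute(
    "SELECT INV_SUBTOTAL, INV_TAX, INV_TOTAL FROM INVOICE WHERE INV_NUMBER = ?",
    (1001,),
).fetchone()
print(stored)
```

The trade-off is that the stored values must be refreshed whenever an invoice line changes, so the technique suits data that are read far more often than they are updated.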
13.8 QUERY OPTIMISATION EXAMPLE
Now that you have learnt the basis of query optimisation, you are ready to test your new knowledge. Let's use a simple example to illustrate how the query optimiser works and how you can help it do its work. The example is based on the QOVENDOR and QOPRODUCT tables. Those tables are similar to the ones that you used in previous chapters. However, the QO prefix is used for the table name to ensure you do not overwrite previous tables.
Online Content: The databases and scripts used in this chapter can be found on the online platform for this book.
To perform this query optimisation illustration, you will be using the Oracle SQL*Plus interface. Some preliminary work must be done before you can start testing query optimisation. The following steps will guide you through this preliminary work:

1 Log in to Oracle SQL*Plus using the username and password provided by your instructor.

2 Create a fresh set of tables, using the QRYOPTDATA.SQL script file located on the online platform for this book. This step is necessary so that Oracle has a new set of tables and the new tables contain no statistics. At the SQL> prompt, type:
  @path\QRYOPTDATA.SQL
  where path is the location of the file in your computer.

3 Create the PLAN_TABLE. The PLAN_TABLE is a special table used by Oracle to store the access plan information for a given query. End users can then query the PLAN_TABLE to see how Oracle will execute the query. To create the PLAN_TABLE, run the UTLXPLAN.SQL script file located in the RDBMS\ADMIN folder of your Oracle RDBMS installation. The UTLXPLAN.SQL script file is also found on the online platform for this book. At the SQL> prompt, type:
  @path\UTLXPLAN.SQL

You use the EXPLAIN PLAN command to store the execution plan of a SQL query in the PLAN_TABLE. Then, you would use the SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY) command to display the access plan for a given SQL statement.
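Readers without access to an Oracle installation can still practise the store-then-display workflow described above. The sketch below is an assumption-laden stand-in, not the book's procedure: it uses Python's built-in sqlite3 module, whose EXPLAIN QUERY PLAN statement plays a role loosely comparable to EXPLAIN PLAN followed by DBMS_XPLAN.DISPLAY. The helper name access_plan, the sample rows and the V_CODE column are hypothetical, and SQLite reports plan steps but no Oracle-style cost figure.

```python
import sqlite3

def access_plan(conn, sql):
    """Return the plan steps SQLite reports for a statement."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return [row[-1] for row in rows]  # the last column holds the readable detail

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE QOVENDOR ("
    " V_CODE INTEGER PRIMARY KEY, V_NAME TEXT, V_AREACODE TEXT)"
)
conn.executemany(
    "INSERT INTO QOVENDOR VALUES (?, ?, ?)",
    [(1, "Bryson", "0181"), (2, "Gomez", "0161"), (3, "Ortega", "0181")],
)

# No index exists on V_NAME yet, so the plan is a scan of the whole table.
plan = access_plan(conn, "SELECT * FROM QOVENDOR WHERE V_NAME = 'Gomez'")
for step in plan:
    print(step)
```

The exact wording of each plan step varies between SQLite versions, so treat the printed text as indicative rather than as a fixed format.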
NOTE
Oracle, MySQL and SQL Server all default to cost-based optimisation. In Oracle, if table statistics are not available, the DBMS will fall back to a rule-based optimiser. The examples in this section were generated using Oracle 11g through the SQL*Plus interface. The examples will give different outputs depending on the version of Oracle you are using.

To see the access plan used by the DBMS to execute your query, use the EXPLAIN PLAN and SELECT statements, as shown in Figure 13.3: EXPLAIN PLAN stores the plan, and SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY) then displays the access plan for a given SQL statement. Note that the first SQL statement generates the statistics for the QOVENDOR table. Also, the initial access plan in Figure 13.3 uses a full table scan on the QOVENDOR table, and the cost of the plan is 3.

FIGURE 13.3 Initial EXPLAIN PLAN (Oracle 11g)
Let's now create an index on V_AREACODE (note that V_AREACODE is used in the ORDER BY clause) and see how that affects the access plan generated by the cost-based optimiser. The results are shown in Figure 13.4.

FIGURE 13.4 EXPLAIN PLAN after index on V_AREACODE (Oracle 11g)
As you examine Figure 13.4, note that the new access plan cuts the cost of executing the query by half! Also note that this new plan scans the QOV_NDX1 index and accesses the QOVENDOR rows, using the index row ID. (Remember that access by row ID is one of the fastest access methods.) In this case, the creation of the QOV_NDX1 index had a positive impact on overall query optimisation results.
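The effect of an index on the ORDER BY column can be reproduced in miniature. The following sketch again assumes SQLite (through Python's sqlite3 module) in place of Oracle, so the plan text differs from Figure 13.4 and no cost figure is shown; the query, the generated rows and the helper access_plan are hypothetical, while the index name QOV_NDX1 follows the text.

```python
import sqlite3

def access_plan(conn, sql):
    """Return the plan steps SQLite reports for a statement."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return [row[-1] for row in rows]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE QOVENDOR ("
    " V_CODE INTEGER PRIMARY KEY, V_NAME TEXT, V_AREACODE TEXT)"
)
area_codes = ["0181", "0161", "0113", "0121"]
conn.executemany(
    "INSERT INTO QOVENDOR VALUES (?, ?, ?)",
    [(i, f"Vendor {i:03d}", area_codes[i % 4]) for i in range(1, 101)],
)

query = "SELECT V_CODE, V_NAME, V_AREACODE FROM QOVENDOR ORDER BY V_AREACODE"

# Before the index: the rows are scanned and then sorted in a temporary structure.
before = access_plan(conn, query)
rows_before = conn.execute(query).fetchall()

# After the index: the rows can be read in V_AREACODE order, so no sort is needed.
conn.execute("CREATE INDEX QOV_NDX1 ON QOVENDOR (V_AREACODE)")
after = access_plan(conn, query)
rows_after = conn.execute(query).fetchall()

print("before:", before)
print("after: ", after)
```

Comparing the two printed plans shows the same pattern the text describes: the sort step disappears once the optimiser can walk the index and fetch each row by its row ID.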
At other times, indexes do not necessarily help in query optimisation. This is the case when you have indexes on small tables or when the query accesses a high percentage of table rows anyway. Let's see what happens when you create an index on V_NAME. The new access plan is shown in Figure 13.5. (Note that V_NAME is used on the WHERE clause as a conditional expression operand.)

FIGURE 13.5 EXPLAIN PLAN after index on V_NAME (Oracle 11g)
As you can see in Figure 13.5, creation of the second index did not help the query optimisation. However, there are occasions when an index could be used by the optimiser, but it is not selected because of the way in which the query is written. For example, Figure 13.6 shows the access plan for a different query using the V_NAME column.
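One common way in which query wording prevents index use is wrapping the indexed column in a function. The sketch below illustrates that general point only; the two queries are assumptions and are not the statements behind Figures 13.5 and 13.6. It uses SQLite via Python's sqlite3 module, with a hypothetical helper access_plan and generated sample rows.

```python
import sqlite3

def access_plan(conn, sql):
    """Return the plan steps SQLite reports for a statement."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return [row[-1] for row in rows]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE QOVENDOR ("
    " V_CODE INTEGER PRIMARY KEY, V_NAME TEXT, V_AREACODE TEXT)"
)
conn.executemany(
    "INSERT INTO QOVENDOR VALUES (?, ?, ?)",
    [(i, f"Vendor {i:03d}", "0181") for i in range(1, 101)],
)
conn.execute("CREATE INDEX QOV_NDX2 ON QOVENDOR (V_NAME)")

# The function call hides the column from the index, forcing a full scan.
wrapped = "SELECT * FROM QOVENDOR WHERE UPPER(V_NAME) = 'VENDOR 007'"
# Comparing the bare column lets the optimiser search the index directly.
plain = "SELECT * FROM QOVENDOR WHERE V_NAME = 'Vendor 007'"

plan_wrapped = access_plan(conn, wrapped)
plan_plain = access_plan(conn, plain)
print("function on column:", plan_wrapped)
print("bare column:       ", plan_plain)

# Both formulations return the same row; only the access path differs.
same_rows = conn.execute(wrapped).fetchall() == conn.execute(plain).fetchall()
```

Both statements answer the same question here, which is what makes the rewrite safe; where the data mix upper and lower case, the two conditions are not equivalent and the rewrite would change the result.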
FIGURE 13.6 Access plan using index on V_NAME (Oracle 11g)
In Figure 13.6, note that the access plan for this new query uses the QOV_NDX2 index on the V_NAME column. Let's now use the table QOPRODUCT to demonstrate how an index can help when aggregate function queries are being run. For example, Figure 13.7 shows the access plan for a SELECT statement using the MAX(P_PRICE) aggregate function. Note that this plan uses a full table scan with a total cost of 3.
FIGURE 13.7 First EXPLAIN PLAN: aggregate function on a non-indexed column
A cost of 3 is very low already, but could you improve it? Yes, you could improve the previous query performance by creating an index on P_PRICE. Figure 13.8 shows how the plan cost is reduced by two-thirds after the index is created and the QOPRODUCT table is analysed. Also note that the second version of the access plan uses only the index QOP_NDX2 to answer the query; the QOPRODUCT table is never accessed.
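The index-only answer to an aggregate query can also be tried on a small scale. As before, this is a hedged sketch that assumes SQLite (via Python's sqlite3 module) rather than Oracle, so it shows the change in access path but not Oracle's cost figures. The sample rows, the P_DESCRIPT column and the helper access_plan are hypothetical; the index name QOP_NDX2 follows the text, and SQLite's ANALYZE stands in for analysing the table.

```python
import sqlite3

def access_plan(conn, sql):
    """Return the plan steps SQLite reports for a statement."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return [row[-1] for row in rows]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE QOPRODUCT ("
    " P_CODE INTEGER PRIMARY KEY, P_DESCRIPT TEXT, P_PRICE REAL)"
)
conn.executemany(
    "INSERT INTO QOPRODUCT VALUES (?, ?, ?)",
    [(i, f"Product {i}", 5.0 + i) for i in range(1, 51)],
)

query = "SELECT MAX(P_PRICE) FROM QOPRODUCT"

# First plan: with no index on P_PRICE, every row of the table is read.
before = access_plan(conn, query)
max_before = conn.execute(query).fetchone()[0]

# Second plan: the index alone holds the prices in order, so the table is skipped.
conn.execute("CREATE INDEX QOP_NDX2 ON QOPRODUCT (P_PRICE)")
conn.execute("ANALYZE")  # refresh the optimiser statistics
after = access_plan(conn, query)
max_after = conn.execute(query).fetchone()[0]

print("before:", before)
print("after: ", after)
```

The answer is identical in both runs; what changes is that the second plan names only the index, mirroring the observation that the QOPRODUCT table itself is never accessed.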
FIGURE 13.8 Second EXPLAIN PLAN: aggregate function on an indexed column
Although the few examples in this section show how important proper index selection is for query optimisation, you also saw examples in which index creation does not improve query performance. As a DBA, you should be aware that the main goal is to optimise overall database performance, not just for a single query, but for all requests and query types. Most database systems provide advanced graphical tools for performance monitoring and testing.
SUMMARY

Database performance tuning refers to a set of activities and procedures designed to ensure that an end-user query is processed by the DBMS in the minimum amount of time.

SQL performance tuning refers to the activities on the client side designed to generate SQL code that returns the correct answer in the least amount of time, using the fewest resources at the server end.

DBMS performance tuning refers to activities on the server side orientated to ensure that the DBMS is properly configured to respond to clients' requests in the fastest way possible while making optimum use of existing resources.

Database statistics refers to a number of measurements gathered by the DBMS that describe a snapshot of the database objects' characteristics. The DBMS gathers statistics about objects such as tables, indexes, and available resources such as number of processors used, processor speed and temporary space available. The DBMS uses the statistics to make critical decisions about improving the query processing efficiency.

The DBMS processes queries in three phases:
• Parsing. The DBMS parses the SQL query and chooses the most efficient access/execution plan.
• Execution. The DBMS executes the SQL query, using the chosen execution plan.
• Fetching. The DBMS fetches the data and sends the result set back to the client.

Indexes are crucial to the process that speeds up data access and should be carefully selected during physical database design in order to facilitate the searching, sorting and use of aggregate functions and join operations.

During query optimisation, the DBMS must choose which indexes to use, how to perform join operations, which table to use first, and so on. Each DBMS has its own algorithms for determining the most efficient way to access the data. The two most common approaches are rule-based optimisation and cost-based optimisation.

A rule-based optimiser uses a set of preset rules and points to determine the best approach to execute a query. The rules assign a fixed cost to each SQL operation; the costs are then added to yield the cost of the execution plan.

A cost-based optimiser uses sophisticated algorithms based on the statistics about the objects being accessed to determine the best approach to execute a query. In this case, the optimiser process adds up the processing cost, the I/O costs and the resource costs (RAM and temporary space) to come up with the total cost of a given execution plan.

Hints are used to change the optimiser mode for the current SQL statement. Hints are special instructions for the optimiser that are embedded inside the SQL command text.

SQL performance tuning deals with writing queries that make good use of the statistics. In particular, queries should make good use of indexes. Indexes are very useful when you want to select a small subset of rows from a large table based on a condition. When an index exists for the column used in the selection, the DBMS may choose to use it. The objective is to create indexes with high selectivity. Index selectivity is a measure of how likely an index will be used in query processing.
Query formulation deals with how to translate business questions into specific SQL code to generate the required results. To do this, you must carefully evaluate which columns, tables and computations are required to generate the desired output.

DBMS performance tuning includes tasks such as managing the DBMS processes in primary memory (allocating memory for caching purposes) and the structures in physical storage (allocating space for the data files).
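As a closing illustration of the point about index selectivity, the sketch below estimates how many distinct values a column holds relative to the number of rows. Using this ratio as a rough proxy for how useful an index on the column might be is an assumption made for the example, not a formula given in this chapter; the helper name distinct_ratio and the sample data are hypothetical, and SQLite (via Python's sqlite3 module) is used in place of Oracle.

```python
import sqlite3

def distinct_ratio(conn, table, column):
    """Distinct values divided by row count.

    The identifiers are interpolated directly into the SQL text, so this
    helper is only suitable for trusted, hard-coded table and column names.
    """
    sql = f"SELECT COUNT(DISTINCT {column}) * 1.0 / COUNT(*) FROM {table}"
    return conn.execute(sql).fetchone()[0]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE QOVENDOR ("
    " V_CODE INTEGER PRIMARY KEY, V_NAME TEXT, V_AREACODE TEXT)"
)
area_codes = ["0181", "0161", "0113", "0121"]
conn.executemany(
    "INSERT INTO QOVENDOR VALUES (?, ?, ?)",
    [(i, f"Vendor {i:03d}", area_codes[i % 4]) for i in range(1, 101)],
)

# Every V_CODE is unique, so a lookup by V_CODE narrows the search to one row.
ratio_code = distinct_ratio(conn, "QOVENDOR", "V_CODE")
# Only four area codes exist, so each value matches about a quarter of the rows.
ratio_area = distinct_ratio(conn, "QOVENDOR", "V_AREACODE")
print(ratio_code, ratio_area)
```

A ratio close to 1 suggests that a condition on the column isolates few rows, whereas a ratio close to 0 suggests that each value matches a large share of the table, in which case the optimiser may well prefer a full table scan.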
KEY TERMS
access/execution plan
automatic query optimisation
cluster-indexed table
cost-based optimiser
data cache or buffer cache
data files
database performance tuning
DBMS performance tuning
dynamic query optimisation
dynamic statistical generation mode
extents
index-organised table
index selectivity
in-memory database
input/output (I/O) request
I/O accelerators
manual statistical generation mode
optimiser hints
query optimiser
query processing bottleneck
RAID
rule-based optimiser
rule-based query optimisation algorithm
SQL cache or procedure cache
SQL performance tuning
static query optimisation
statistically based query optimisation algorithm
table space or file group

Online Content: Answers to selected Review Questions and Problems for this chapter are contained on the online platform accompanying this book.
FURTHER READING
Fritchey, G. SQL Server 2017 Query Performance Tuning: Troubleshoot and Optimize Query Performance, 5th edition. Apress, 2018.
Niemiec, R. Oracle Database 12c Release 2 Performance Tuning Tips & Techniques. Oracle Press, 2017.
REVIEW QUESTIONS
1 What is SQL performance tuning?
2 What is database performance tuning?
3 What is the focus of most performance-tuning activities, and why does that focus exist?
4 What are database statistics, and why are they important?
5 How are database statistics obtained?
6 Which database statistics measurements are typical of tables, indexes and resources?
7 How is the processing of SQL DDL statements (such as CREATE TABLE) different from the processing required by DML statements?
8 In simple terms, the DBMS processes queries in three phases. What are those phases, and what is accomplished in each phase?
9 If indexes are so important, why not index every column in every table?
10 What is the difference between a rule-based optimiser and a cost-based optimiser?
11 What are optimiser hints, and how are they used?
12 Most of the query optimisation techniques are designed to make the optimiser's work easier. Which factors should you keep in mind if you intend to write conditional expressions in SQL code?
13 Which recommendations would you make for managing the data files in a DBMS with many tables and indexes?
14 What does RAID stand for, and what are some commonly used RAID levels?
PROBLEMS
Find the solutions to Problems 1 to 3 based on the following query:
  SELECT   EMP_LNAME, EMP_FNAME, EMP_AREACODE, EMP_GENDER
  FROM     EMPLOYEE
  WHERE    EMP_GENDER = 'F' AND EMP_AREACODE = '0181'
  ORDER BY EMP_LNAME, EMP_FNAME;
1 What is the likely data sparsity of the EMP_GENDER column?
2 Which indexes should you create? Write the required SQL commands.
3 Using Table 13.4 as an example, create two alternative access plans. Use the following assumptions:
  a There are 8 000 employees.
  b There are 4 150 female employees.
  c There are 370 employees in area code 0181.
  d There are 190 female employees in area code 0181.

Problems 4 to 6 are based on the following query:
  SELECT   EMP_LNAME, EMP_FNAME, EMP_DOB, YEAR(EMP_DOB) AS YEAR
  FROM     EMPLOYEE
  WHERE    YEAR(EMP_DOB) = 1976;
4 What is the likely data sparsity of the EMP_DOB column?
5 Should you create an index on EMP_DOB? Why or why not?
6 What type of database I/O operations will likely be used by the query? (See Table 13.3.)
Problems 7 to 10 are based on the ER model shown in Figure P13.1.

FIGURE P13.1 The Ch11-SaleCo ER model

Given the following query:
  SELECT   P_CODE, P_PRICE
  FROM     PRODUCT
  WHERE    P_PRICE >= (SELECT AVG(P_PRICE) FROM PRODUCT);
7 Assuming that there are no table statistics, what type of optimisation will the DBMS use?
8 What type of database I/O operations will likely be used by the query? (See Figure P13.1.)
9 What is the likely data sparsity of the P_PRICE column?
10 Should you create an index? Why or why not?
Problems 11 to 14 are based on the following query:
  SELECT   P_CODE, SUM(LINE_UNITS)
  FROM     LINE
  GROUP BY P_CODE
  HAVING   SUM(LINE_UNITS) > (SELECT MAX(LINE_UNITS) FROM LINE);
11 What is the likely data sparsity of the LINE_UNITS column?
12 Should you create an index? If so, what would the index column(s) be, and why would you create the index? If not, explain your reasoning.
13 Discuss whether or not you should create an index on P_CODE. Justify your answer.
14 Write the command to create statistics for this table.

Problems 15 to 19 are based on the following query:
  SELECT   V_CODE, V_NAME, V_CONTACT, V_COUNTRY
  FROM     VENDOR
  WHERE    V_COUNTRY = 'UK'
  ORDER BY V_NAME;
15 Which indexes should you create and why? Write the SQL command to create the indexes.
16 Assume that 10 000 vendors are distributed as shown in the table below. What percentage of rows will be returned by the query?

  Country   Number of Vendors     Country   Number of Vendors
  AB          15                  HG         100
  AN          55                  IC         358
  AU          47                  IT          25
  BE        3244                  LV         645
  BL         345                  LC          16
  BH         995                  LT         821
  BU          75                  LX          62
  CR          68                  MC         425
  CY          89                  MO          12
  CR          12                  MN          65
  DK          19                  NL          74
  ES          45                  NW         113
  FI          29                  PL         589
  FR         208                  SA          36
  GM         745                  UK         375
  GR          35                  VC         258

17 What type of I/O database operations would most likely be used to execute that query?
18 Using Table 13.4 as an example, create two alternative access plans.
19 Assume that you have 10 000 different products stored in the PRODUCT table and that you are writing a Web-based interface to list all products with a quantity on hand (P_QOH) that is less than or equal to the minimum quantity, P_MIN. Which optimiser hint would you use to ensure that your query returns the result set to the Web interface in the least time possible? Write the SQL code.

Problems 20 to 21 are based on the following query:
  SELECT   P_CODE, P_DESCRIPT, P_PRICE, P.V_CODE, V_COUNTRY
  FROM     PRODUCT P, VENDOR V
  WHERE    P.V_CODE = V.V_CODE
  AND      V_COUNTRY = 'UK'
  AND      V_AREACODE = '0181'
  ORDER BY P_PRICE;
20 Which indexes would you recommend?
21 Write the command(s) used to generate the statistics for the PRODUCT and VENDOR tables.

Problems 22 and 23 are based on the following query:
  SELECT   P_CODE, P_DESCRIPT, P_QOH, P_PRICE, V_CODE
  FROM     PRODUCT
  WHERE    V_CODE = '21344'
  ORDER BY P_CODE;
22 Which index would you recommend, and which command would you use?
23 How should you rewrite the query to ensure that it uses the index you created in your solution to Problem 22?
Problems 24 and 25 are based on the following query:
  SELECT   P_CODE, P_DESCRIPT, P_QOH, P_PRICE, V_CODE
  FROM     PRODUCT
  WHERE    P_QOH < P_MIN
  AND      P_MIN = P_REORDER
  AND      P_REORDER = 50
  ORDER BY P_QOH;
24 Use the recommendations given in Section 13.5.2 to rewrite the query to produce the required results more efficiently.
25 Which indexes would you recommend?

Problems 26 to 29 are based on the following query:
  SELECT   CUS_CODE, MAX(LINE_UNITS*LINE_PRICE)
  FROM     CUSTOMER NATURAL JOIN INVOICE NATURAL JOIN LINE
  WHERE    CUS_AREACODE = '0181'
  GROUP BY CUS_CODE;
26 Assuming that you generate 15 000 invoices per month, what recommendation would you give the designer about the use of derived attributes?
27 Assuming that you follow the recommendations you gave in Problem 26, how would you rewrite the query?
28 Which indexes would you recommend for the query you wrote in Problem 27, and what SQL commands would you use?
29 How would you rewrite the query to ensure that the index you created in Problem 28 is used?
PART VI
DATABASE MANAGEMENT
14 Distributed Databases
15 Databases for Business Intelligence
16 Big Data and NoSQL
17 Database Connectivity and Web Technologies
BUSINESS VIGNETTE
THE FACEBOOK-CAMBRIDGE ANALYTICA DATA SCANDAL AND THE GDPR

In 2018, Facebook faced international investigations into illegally collecting users' personal data. The data was collected by Cambridge Analytica, which was a political consultation company that supported President Trump's 2016 election campaign. It was suggested that Cambridge Analytica had collected data from up to 87 million users across the globe and then used this data: firstly, to profile the candidate people were likely to vote for in the US election, and secondly, to target advertisements at users to try to influence who they would vote for. The data was collected through a Facebook app called thisisyourdigitallife, where users consented to take part in a personality study. However, the app also extracted personal data from linked Facebook friends without their consent. All the data obtained was then used, without users' knowledge, to develop a software program to influence the US elections, which was sold to Trump campaigners. The major concern, even today, is that Facebook does not know which data the app shared with Cambridge Analytica. In 2019, lawsuits against Facebook continue, with US judges requesting that all Facebook's data privacy records be made available after the company's lawyers argued that users have no expectation of privacy.

What this scandal demonstrated was the power of Big Data analytics and how collecting personal data to profile individuals for the purpose of automated profiling could be used to mislead people and generate fake news. It raises a debate about giving a company consent to collect your personal data and what exactly this data will be used for. In the field of data mining, where hidden patterns are discovered in data that can be used to make inferences about a person's behaviour and perform predictive analytics, new knowledge can be discovered about a person that he or she does not even know about. So, this raises the questions: Who owns this knowledge, and was consent ever obtained to use it for a purpose unknown at the time of collection?

Better protection for users of data is now in place, thanks to the General Data Protection Regulation (GDPR),1 which became a legal requirement from 25 May 2018 for all organisations in Europe that collect and process data. One of the major changes detailed in Article 22 of the GDPR includes the rights of an individual not to be subject to automated decision making, which includes profiling, unless explicit consent is given. Article 4(4) of the GDPR defines which forms of
data processing could be considered profiling. This includes any form of automated processing of personal data utilising personal data to evaluate certain personal aspects relating to a natural person, for example, analysing or predicting aspects concerning that natural person's performance at work, economic situation, health, personal preferences, interests, reliability, behaviour, location or movements.1 Recital 71 especially provides a definition of what is meant by the term profiling in relation to the data subject's performance at work, economic situation, health, personal preferences or interests, reliability or behaviour, location or movements, where it produces legal effects concerning him or her or similarly significantly affects him or her. The penalty for breaches of the regulation is a fine of up to 4 per cent of an organisation's annual global turnover or €20 million (whichever is greater).1

Given that the GDPR applies to all organisations that process the personal data of European Union citizens, the challenge for companies will be to ensure that they know exactly what personal data they store and where it is stored, and to ensure that they have consent to use it for the purpose for which it was collected. Could the GDPR have stopped the Facebook and Cambridge Analytica scandal? It is unlikely, but if the news had broken two months later, then the companies might now be facing a lengthy legal process and fines for violation of GDPR rules.
1 The GDPR Portal (2019), [online]. Available: https://eugdpr.org/
Chapter 14 Distributed Databases

In this chapter, you will learn:
What a distributed database management system (DDBMS) is and what its components are
How database implementation is affected by different levels of data and process distribution
How transactions are managed in a distributed database environment
How database design is affected by the distributed database environment
Preview

A single database can be divided into several fragments. The fragments can be stored on different computers within a network. Processing, too, can be dispersed among several different network sites, or nodes. The multi-site database forms the core of the distributed database system.

The growth of distributed database systems has been fostered by the increased globalisation of business operations, the growth of Big Data and technological changes that have made distributed network-based services practical, more reliable and cost effective. The distributed database management system (DDBMS) treats a distributed database as a single logical database; therefore, the basic design concepts you learnt in earlier chapters apply. However, the distribution of data among different sites in a computer network clearly adds to a system's complexity. For example, the design of a distributed database must consider the location of the data and the partitioning and replication of the data into database fragments.

In today's Web-centric environment, any distributed data system must be highly scalable; in other words, it must grow dynamically as demand increases. To accommodate such growth, trade-offs must be made to achieve some desirable properties.
14.1 The Evolution of Distributed Database Management Systems
A distributed database management system (DDBMS) governs the storage and processing of logically related data over interconnected computer systems in which both data and processing functions are distributed among several sites. To understand how and why the DDBMS is different from the DBMS, it is useful to examine briefly the changes in the database environment that set the stage for the development of the DDBMS.

During the 1970s, corporations implemented centralised database management systems to meet their structured information needs. Structured information is usually presented as regularly issued formal reports in a standard format. Such information, generated by 3GL programming languages, is created by specialists in response to precisely channelled requests. Thus, structured information needs are well served by centralised systems.

Basically, the use of a centralised database required that corporate data be stored in a single central site, usually a mainframe or midrange computer. Data access was provided through dumb terminals. The centralised approach, illustrated in Figure 14.1, worked well to fill the structured information needs of corporations, but it fell short when quickly moving events required faster response times and equally quick access to information. The slow progression from information request to approval, to specialist, to user, simply did not serve decision makers well in a dynamic environment. What was needed was quick, unstructured access to databases, using ad hoc queries to generate on-the-spot information.
Figure 14.1 Centralised database management system. [Diagram: the end user's application issues a data request to the DBMS; the DBMS reads the data from the local database and returns a reply.]
Database management systems based on the relational model could provide the environment in which unstructured information needs would be met by employing ad hoc queries. End users would be given the ability to access data when needed. Unfortunately, the early relational model implementations did not yet deliver acceptable throughput when compared to the well-established hierarchical or network database models.
The past three decades gave birth to a series of crucial social and technological changes that have affected database development and design. Among those changes were:

Business operations became global; with this change, competition expanded from the shop on the next corner to the Web store in cyberspace.
Customer demands and market needs favoured an on-demand transaction style, mostly based on Web-based services.
Rapid social and technological changes fuelled by low-cost, smart mobile devices increased the demand for complex and fast networks to interconnect them. As a consequence, corporations have increasingly adopted advanced network technologies as the platform for their computerised solutions.
Data realms are converging in the digital world more frequently. As a result, applications must manage multiple types of data, such as voice, video, music and images. Such data tend to be geographically distributed and remotely accessed from diverse locations via location-aware mobile devices.

These factors created a dynamic business environment in which companies had to respond quickly to competitive and technological pressures. As large business units restructured to form leaner-and-meaner, quickly reacting, dispersed operations, two database requirements became obvious:

Rapid ad hoc data access became crucial in the quick-response decision-making environment.
The decentralisation of management structures based on the decentralisation of business units made decentralised multiple-access and multiple-location databases a necessity.

During recent years, the factors just described became even more firmly entrenched. However, the way those factors were addressed was strongly influenced by:

The growing acceptance of the internet, particularly the World Wide Web (WWW), as the platform for data access and distribution. The WWW is, in effect, the repository for distributed data.
The mobile wireless revolution. The widespread use of mobile and wireless digital devices includes smartphones such as Apple's iPhone, Google's Pixel and Samsung's Galaxy, and tablets such as Apple's iPad. These devices have created high demand for data access. They access data from geographically dispersed locations and require varied data exchanges in multiple formats, such as data, voice, video, music and pictures. Although distributed data access does not necessarily imply distributed databases, performance and failure tolerance requirements often lead to the use of data replication techniques similar to those in distributed databases.
The accelerated growth of companies using applications as a service. This new type of service provides remote applications to companies that want to outsource their application development, maintenance and operations. The company data are generally stored on central servers and are not necessarily distributed. Just as with mobile data access, this type of service may not require fully distributed data functionality; however, other factors such as performance and failure tolerance often require the use of data replication techniques similar to those in distributed databases.
The increased focus on mobile business intelligence. More and more companies are embracing mobile technologies within their business plans. As companies use social networks to get closer to customers, the need for on-the-spot decision making increases. Although a data warehouse is not usually a distributed database, it does rely on techniques such as data replication and distributed queries to facilitate data extraction and integration.
Emphasis on Big Data analytics. The era of mobile communications gave us data from many different sources and of many types. Today's customers have significant influence on the spending habits of communities, and organisations are investing in ways to harvest such data to discover new ways to reach customers effectively and efficiently.

Online Content: To learn more about the internet's impact on data access and distribution, see Appendix H, Databases in e-Commerce, available on the online platform for this book.
At this point in time, the long-term impact of the internet and the mobile revolution on distributed database design and management is unclear. Perhaps the success of the internet and mobile technologies will foster the use of distributed databases as bandwidth becomes a more troublesome bottleneck. Perhaps the resolution of bandwidth problems will simply confirm the centralised database standard. In any case, distributed databases exist today and many distributed database operating concepts and components are likely to find a place in future database development.

The distributed database is especially desirable because centralised database management is subject to problems such as:

Performance degradation due to a growing number of remote locations over greater distances
High costs associated with maintaining and operating large central (mainframe) database systems
Reliability problems created by dependence on a central site (single point of failure syndrome) and the need for data replication
Scalability problems associated with the physical limits imposed by a single location, such as physical space, temperature conditioning and power consumption
Organisational rigidity imposed by the database, which means it might not support the flexibility and agility required by modern global organisations.

The dynamic business environment and the centralised database's shortcomings spawned a demand for applications based on accessing data from different sources at multiple locations. Such a multiple-source/multiple-location database environment is managed by a DDBMS.
14.2 DDBMS Advantages and Disadvantages
Distributed database management systems deliver several advantages over traditional systems. At the same time, they are subject to some problems. Table 14.1 summarises the advantages and disadvantages associated with a DDBMS.
Table 14.1 Distributed DBMS advantages and disadvantages

Advantages

Data are located near the greatest demand site. The data in a distributed database system are dispersed to match business requirements.
Faster data access. End users often work with only a locally stored subset of the company's data.
Faster data processing. A distributed database system spreads out the system's workload by processing data at several sites.
Growth facilitation. New sites can be added to the network without affecting the operations of other sites.
Improved communications. Because local sites are smaller and located closer to customers, local sites foster better communication among departments and between customers and company staff.
Reduced operating costs. It is more cost-effective to add workstations to a network than to update a mainframe system. Development work is done more cheaply and more quickly on low-cost PCs than on mainframes.
User-friendly interface. PCs and workstations are usually equipped with an easy-to-use graphical user interface (GUI). The GUI simplifies use and training for end users.
Less danger of a single-point failure. When one of the computers fails, the workload is picked up by other workstations. Data are also distributed at multiple sites.
Processor independence. The end user is able to access any available copy of the data, and an end user's request is processed by any processor at the data location.

Disadvantages

Complexity of management and control. Applications must recognise data location, and they must be able to stitch together data from different sites. Database administrators must have the ability to coordinate database activities to prevent database degradation due to data anomalies. Transaction management, concurrency control, security, backup, recovery, query optimisation, access path selection, and so on, must all be addressed and resolved.
Security. The probability of security lapses increases when data are located at multiple sites. The responsibility of data management will be shared by different people at several sites.
Lack of standards. There are no standard communication protocols at the database level. For example, different database vendors employ different and often incompatible techniques to manage the distribution of data and processing in a DDBMS environment.
Increased storage requirements. Multiple copies of data are required at different sites, thus requiring additional disk storage space.
Increased training cost. Training costs are generally higher in a distributed model than they would be in a centralised model, sometimes even to the extent of offsetting operational and hardware savings.
Higher costs. Distributed databases require duplicated infrastructure to operate, such as physical location, environment, personnel, software and licensing.
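The "less danger of a single-point failure" entry in Table 14.1 can be illustrated with a small sketch. It assumes the same data are replicated at three sites and that a reader simply tries the next replica when one site is down. The `Site` class, the `SiteDown` exception and the `read_with_failover` function are hypothetical names chosen for illustration; they are not part of any DDBMS product.

```python
class SiteDown(Exception):
    """Raised when a (hypothetical) site cannot be reached."""


class Site:
    def __init__(self, name, rows, up=True):
        self.name = name
        self.rows = dict(rows)  # local copy of the replicated data
        self.up = up

    def read(self, key):
        if not self.up:
            raise SiteDown(self.name)
        return self.rows[key]


def read_with_failover(sites, key):
    """Try each replica in turn; return (site name, value) from the first that answers."""
    for site in sites:
        try:
            return site.name, site.read(key)
        except SiteDown:
            continue  # this replica is unavailable, try the next one
    raise SiteDown("no replica available")


# Assumed sample data: one customer balance replicated at three sites.
replica = {"C100": 250.0}
sites = [
    Site("London", replica, up=False),  # simulate a failed site
    Site("Cape Town", replica),
    Site("Harare", replica),
]
served_by, balance = read_with_failover(sites, "C100")
print(served_by, balance)
```

The sketch also hints at the matching disadvantage in the table: every replica holds its own copy of the data, so storage requirements grow with the number of sites.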
Distributed databases are used successfully but have a long way to go before they can yield the full flexibility and power of which they are theoretically capable. The inherently complex distributed data environment increases the urgency for standard protocols governing transaction management, concurrency control, security, backup, recovery, query optimisation, access path selection, and so on. Such issues must be addressed and resolved before DDBMS technology is widely embraced.

The remainder of this chapter will explore the basic components and concepts of the distributed database. Because the distributed database is usually based on the relational database model, relational terminology is used to explain the basic distributed concepts and components.
14.3 Distributed Processing and Distributed Databases

In distributed processing, a database's logical processing is shared among two or more physically independent sites that are connected through a network. For example, the data input/output (I/O), data selection and data validation might be performed on one computer, and a report based on that data might be created on another computer.

A basic distributed processing environment is illustrated in Figure 14.2. It shows that a distributed processing system shares the database processing chores among three sites connected through a communications network. Although the database resides at only one site (London), each site can access the data and update the database. The database is located on Computer A, a network computer known as the database server.
Figure 14.2 Distributed processing environment. [Diagram: Computer A at Site 1 (London, user Joe) runs the DBMS and holds the Employee database; Computer B at Site 2 (Cape Town, user Donna) updates payroll data; Computer C at Site 3 (Harare, user Victor) generates the payroll report. The sites are linked by a communications network. Database records are processed in different locations.]
A distributed database, on the other hand, stores a logically related database over two or more physically independent sites. The sites are connected via a computer network. In contrast, the distributed processing system uses only a single-site database but shares the processing chores among several sites. In a distributed database system, a database is composed of several parts known as database fragments. The database fragments are located at different sites and can be replicated among various sites. An example of a distributed database environment is shown in Figure 14.3.
Figure 14.3 Distributed database environment. [Diagram: Computer A at Site 1 (London, user Alan) holds fragment E1; Computer B at Site 2 (Cape Town, user Betty) holds fragment E2; Computer C at Site 3 (Harare, user Victor) holds fragment E3. Each computer runs a DBMS and all three are linked by a communications network.]
The database in Figure 14.3 is divided into three database fragments (E1, E2 and E3) located at different sites. The computers are connected through a network system. In a fully distributed database, the users Alan, Betty and Victor do not need to know the name or location of each database fragment in order to access the database. Also, the users may be located at sites other than London, Cape Town or Harare, and still be able to access the database as a single logical unit.

As you examine and contrast Figures 14.2 and 14.3, you should keep the following points in mind:

Distributed processing does not require a distributed database, but a distributed database requires distributed processing.
Distributed processing may be based on a single database located on a single computer. For the management of distributed data to occur, copies or parts of the database processing functions must be distributed to all data storage sites.
Both distributed processing and distributed databases require a network of interconnected components.
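The idea that users can treat fragments E1, E2 and E3 as one logical database can be sketched in a few lines. This is a minimal illustration only, assuming horizontal fragments of a single employee table held in memory. The `FRAGMENTS` dictionary, the column names and the sample rows are hypothetical and are not taken from Figure 14.3.

```python
# Hypothetical horizontal fragments of one EMPLOYEE table, keyed by site.
FRAGMENTS = {
    "London": [{"emp_id": 1, "name": "Alan"}],      # stands in for E1
    "Cape Town": [{"emp_id": 2, "name": "Betty"}],  # stands in for E2
    "Harare": [{"emp_id": 3, "name": "Victor"}],    # stands in for E3
}


def find_employee(emp_id):
    """Search every fragment; the caller never names a site or a fragment."""
    for rows in FRAGMENTS.values():
        for row in rows:
            if row["emp_id"] == emp_id:
                return row
    return None


def all_employees():
    """Present the fragments as a single logical table, ordered by key."""
    merged = [row for rows in FRAGMENTS.values() for row in rows]
    return sorted(merged, key=lambda row: row["emp_id"])


print(find_employee(2))
print([row["name"] for row in all_employees()])
```

A real DDBMS would consult a catalogue to contact only the relevant sites rather than scanning every fragment; the scan is used here only to keep the sketch short.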
14.4 Characteristics of Distributed Database Management Systems

A distributed database management system (DDBMS) governs the storage and processing of logically related data over interconnected computer systems in which both data and processing functions are distributed among several sites. A DBMS must have at least the following functions to be classified as distributed:

Application interface to interact with the end user or application programs and with other DBMSs within the distributed database
Validation to analyse data requests
Transformation to determine which data request components are distributed and which are local
Query optimisation to find the best access strategy (which database fragments must be accessed by the query, and how must data updates, if any, be synchronised?)
Mapping to determine the data location of local and remote fragments
I/O interface to read or write data from or to permanent local storage
Formatting to prepare the data for presentation to the end user or to an application program
Security to provide data privacy at both local and remote databases
Backup and recovery to ensure the availability and recoverability of the database in case of a failure
DB administration features for the database administrator
Concurrency control to manage simultaneous data access and to ensure data consistency across database fragments in the DDBMS
Transaction management to ensure that the data move from one consistent state to another. This activity includes the synchronisation of local and remote transactions as well as transactions across multiple distributed segments.

A fully distributed database management system must perform all of the functions of a centralised DBMS, as follows:

1. Receive an application's (or an end user's) request
2. Validate, analyse and decompose the request. The request may include mathematical and/or logical operations such as the following: Select all customers with balance greater than 1 000. The request may require data from only a single table, or it may require access to several tables.
3. Map the request's logical-to-physical data components
4. Decompose the request into several disk I/O operations
5. Search for, locate, read and validate the data
6. Ensure database consistency, security and integrity
7. Validate the data for the conditions, if any, specified by the request
8. Present the selected data in the required format

In addition, a distributed DBMS must handle all necessary functions imposed by the distribution of data and processing, and it must perform those additional functions transparently to the end user. The DDBMS's transparent data access features are illustrated in Figure 14.4.
Figure 14.4 A fully distributed database management system. [Diagram: user Mary at Site 1 and user Tom at Site 2 are connected through a communications network; distributed processing spans both sites, and database fragments A1 and A2 form a single logical database.]
SOURCE: Course Technology/Cengage Learning

The single logical database in Figure 14.4 consists of two database fragments, A1 and A2, located at sites 1 and 2, respectively. Mary can query the database as if it were a local database; so can Tom. Both users see only one logical database and do not need to know the names of the fragments. In fact, the end users do not even need to know that the database is divided into separate fragments, nor do they need to know where the fragments are located.

To better understand the different types of distributed database scenarios, let's first define the distributed database system's components.

14.5 DDBMS Components

The DDBMS must include at least the following components:

Computer workstations (sites or nodes) that form the network system. The distributed database system must be independent of the computer system hardware.
Network hardware and software components that reside in each workstation. The network components allow all sites to interact and exchange data. As the components (computers, operating systems, network hardware and so on) are likely to be supplied by different vendors, it is best to ensure that distributed database functions can be run on multiple platforms.
Communications media that carry the data from one workstation to another. The DDBMS must be communications-media-independent; that is, it must be able to support several types of communications media.
The transaction processor (TP), which is the software component found in each computer that requests data. The transaction processor receives and processes the application's data requests (remote and local). The TP is also known as the application processor (AP) or the transaction manager (TM).
The data processor (DP), which is the software component residing on each computer that stores and retrieves data located at the site. The DP is also known as the data manager (DM). A data processor may even be a centralised DBMS.

Figure 14.5 illustrates the placement of and interaction among components. The communication among TPs and DPs shown in Figure 14.5 is made possible through a specific set of rules, or protocols, used by the DDBMS.
Figure 14.5 Distributed database system management components. [Diagram: users Melusi, Peter, Mary, Aneesha and Chantal work at sites that run a TP, a DP or both, including two dedicated data processors, all linked by a communications network.]
Note: Each TP can access data on any DP, and each DP handles all requests for local data from any TP.
The protocols determine how the distributed database system will:

Interface with the network to transport data and commands between DPs and TPs
Synchronise all data received from DPs (TP side) and route retrieved data to the appropriate TPs (DP side)
Ensure common database functions in a distributed system. Such functions include security, concurrency control, backup and recovery.
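The division of labour between transaction processors and data processors can be sketched as two small classes. This is an illustrative model only, assuming each DP holds a list of rows in memory and the TP forwards the same selection to every DP and merges the replies. The class names, method names and sample rows are hypothetical, and the sketch ignores the security, concurrency control, backup and recovery functions that real protocols must cover.

```python
class DataProcessor:
    """DP: stores and retrieves the data located at one site."""

    def __init__(self, site, rows):
        self.site = site
        self.rows = rows

    def select(self, predicate):
        # The selection is evaluated where the data live.
        return [row for row in self.rows if predicate(row)]


class TransactionProcessor:
    """TP: receives a request, forwards it to every DP and merges the replies."""

    def __init__(self, data_processors):
        self.data_processors = data_processors

    def select(self, predicate):
        result = []
        for dp in self.data_processors:  # local and remote DPs are treated alike
            result.extend(dp.select(predicate))
        return result


# Assumed sample data spread over two sites.
dp1 = DataProcessor("Site 1", [{"cus": "A", "balance": 1500}, {"cus": "B", "balance": 200}])
dp2 = DataProcessor("Site 2", [{"cus": "C", "balance": 3000}])
tp = TransactionProcessor([dp1, dp2])

big = tp.select(lambda row: row["balance"] > 1000)
print([row["cus"] for row in big])
```

The caller talks only to the TP and never names a site, which mirrors the note under Figure 14.5 that each TP can access data on any DP.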
DPs and TPs can be added to the system without affecting the operation of the other components. A TP and a DP can reside on the same computer, allowing the end user to access local as well as remote data transparently. In theory, a DP can be an independent centralised DBMS with proper interfaces to support remote access from other independent DBMSs in the network.

14.6 Levels of Data and Process Distribution

Current database systems can be classified on the basis of how process distribution and data distribution are supported. For example, a DBMS may store data in a single site (centralised DB) or in multiple sites (distributed DB) and may support data processing at a single site or at multiple sites. Table 14.2 uses a simple matrix to classify database systems according to data and process distribution. These types of processes are discussed in the sections that follow.

Table 14.2 Database systems: levels of data and process distribution

 | Single-site data | Multiple-site data
Single-site process | Host DBMS | Not applicable (requires multiple processes)
Multiple-site process | File server; client/server DBMS (LAN DBMS) | Fully distributed; client/server DDBMS

Figure 14.6 Single-site processing, single-site data (centralised). [Diagram: dumb terminals T1, T2 and T3 connect through a front-end processor to the DBMS and its database; a remote dumb terminal communicates through a DSL or fibre line.]
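The matrix in Table 14.2 amounts to a two-way lookup on the number of processing sites and the number of data sites. The function below is a minimal sketch of that lookup; the name `classify` and the wording of the returned labels are assumptions made for illustration.

```python
def classify(process_sites: int, data_sites: int) -> str:
    """Return the Table 14.2 cell for the given numbers of processing and data sites."""
    if process_sites < 1 or data_sites < 1:
        raise ValueError("site counts must be at least 1")
    if process_sites == 1 and data_sites == 1:
        return "host DBMS"
    if process_sites == 1:
        # Data at several sites cannot be managed by a single process.
        return "not applicable (requires multiple processes)"
    if data_sites == 1:
        return "file server or client/server DBMS (LAN DBMS)"
    return "fully distributed or client/server DDBMS"


print(classify(1, 1))
print(classify(3, 1))
print(classify(3, 3))
```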
The end user must make a direct reference to the file server in order to access remote data.
All record- and file-locking activity is done at the end-user location.
All data selection, search and update functions take place at the workstation, thus requiring that entire files travel through the network for processing at the workstation. Such a requirement increases network traffic, slows response time and increases communication costs.

The inefficiency of the last condition can be illustrated easily. For example, suppose the file server computer stores a CUSTOMER table containing 10 000 data rows, 50 of which have balances greater than 1 000. If site A issues the SQL query:

SELECT * FROM CUSTOMER WHERE CUS_BALANCE > 1000;

all 10 000 CUSTOMER rows must travel through the network to be evaluated at site A.

A variation of the multiple-site processing, single-site data approach is known as client/server architecture. Client/server architecture is similar to that of the network file server except that all database processing is done at the server site, thus reducing network traffic. Although both the network file server and the client/server systems perform multiple-site processing, the latter's processing is distributed. Note that the network file server approach requires the database to be located at a single site. In contrast, the client/server architecture is capable of supporting data at multiple sites.

Online Content: Appendix F, Client/Server Systems, is located on the online platform for this book.
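The difference in network traffic between the two approaches can be made concrete with a small experiment. The sketch below uses Python's built-in sqlite3 module with an in-memory database as a stand-in for the server, and counts how many rows would have to be shipped in each case. The `CUS_CODE` column and the generated balances are assumptions made for illustration; only the table name, the `CUS_BALANCE` column and the row counts come from the example above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE CUSTOMER (CUS_CODE INTEGER PRIMARY KEY, CUS_BALANCE REAL)")

# 10 000 rows, 50 of which have a balance greater than 1 000.
rows = [(i, 2000.0 if i <= 50 else 100.0) for i in range(1, 10001)]
conn.executemany("INSERT INTO CUSTOMER VALUES (?, ?)", rows)

# File-server style: every row crosses the network and the workstation filters.
shipped_file_server = conn.execute("SELECT * FROM CUSTOMER").fetchall()
matches_at_client = [r for r in shipped_file_server if r[1] > 1000]

# Client/server style: the server evaluates the WHERE clause, so only matches travel.
shipped_client_server = conn.execute(
    "SELECT * FROM CUSTOMER WHERE CUS_BALANCE > 1000"
).fetchall()

print(len(shipped_file_server), len(shipped_client_server))  # prints: 10000 50
```

Both approaches return the same 50 customers; what changes is how many rows travel to reach that answer.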
14.6.3 Multiple-Site Processing, Multiple-Site Data (MPMD)
The multiple-site processing, multiple-site data (MPMD) scenario describes afully distributed DBMS with support for multiple data processors and transaction processors at multiple sites. Depending on the level
of support
for different types
of centralised
DBMSs,
DDBMSs
are classified
as either
homogeneous
or heterogeneous. Homogeneous DDBMSs integrate multiple instances of the same database over a network. Thus, the same DBMS will be running on different mainframes, minicomputers and microcomputers. In contrast, heterogeneous DDBMSs integrate different types of centralised DBMSs over a network. A fully
heterogeneous
models (relational, No
DDBMS
currently
heterogeneous systems are
DDBMS
hierarchical provides
environment.
and networks,
subject
to
certain
Remote
access is
will support
different
or network) running full
support
Some
the
scenario
DDBMS implementations
and allow remote restrictions.
for
DBMSs
that
may even support
different
data
or for
fully
over a network.
data access to
depicted
in
support
several
another
DBMS.
Figure
14.8
platforms,
However,
such
the
14
operating DDBMSs
still
For example:
on a read-only
basis
and does not support
Restrictions are placed on the number of remote tables that
write privileges.
may be accessed in a single
transaction.
• Restrictions are placed on the number of distinct databases that may be accessed.
• Restrictions are placed on the database model that may be accessed. Thus, access may be provided to relational databases but not to network or hierarchical databases.

Figure 14.8 Heterogeneous distributed database scenario
[Figure: for each platform (IBM 3090, DEC/VAX, IBM AS/400, RISC computer, Intel Xeon CPU) the figure lists the DBMS (DB2, VAX rdb, SQL/400, Informix, Oracle), the operating system and the network communications protocol in use.]
The preceding list of restrictions is by no means exhaustive. DDBMS technology continues to change rapidly, and new features are added frequently. Managing data at multiple sites leads to a number of issues that must be addressed and understood. The next section will examine several key features of distributed database management systems.

14.7 Distributed Database Transparency Features

A distributed database system requires functional characteristics that can be grouped and described as transparency features. DDBMS transparency features have the common property of allowing the end user to feel like the database's only user. In other words, the user believes that he or she is working with a centralised DBMS; all complexities of a distributed database are hidden, or transparent, to the user.

The DDBMS transparency features are:

• Distribution transparency, which allows a distributed database to be treated as a single logical database. If a DDBMS exhibits distribution transparency, the user does not need to know:
  • That the data are partitioned, meaning that the table's rows and columns are split vertically or horizontally and stored among multiple sites.
  • That the data are geographically dispersed among multiple sites.
  • That the data are replicated among multiple sites.
• Transaction transparency, which allows a transaction to update data at several network sites. Transaction transparency ensures that the transaction will be either entirely completed or aborted, thus maintaining database integrity.
• Failure transparency, which ensures that the system will continue to operate in the event of a node or network failure. Functions that were lost because of the failure will be picked up by another network node. This is a critical feature, particularly in organisations that depend on a Web presence as the backbone for maintaining trust in their business.
• Performance transparency, which allows the system to perform as if it were a centralised DBMS. The system will not suffer any performance degradation due to its use on a network or due to the network's platform differences. Performance transparency also ensures that the system will find the most cost-effective path to access remote data. The system should be able to scale out in a transparent manner, or increase performance capacity by adding more transaction or data processing nodes, without affecting the overall performance of the system.
• Heterogeneity transparency, which allows the integration of several different local DBMSs (relational, network and hierarchical) under a common, or global, schema. The DDBMS is responsible for translating the data requests from the global schema to the local DBMS schema.

Distribution, transaction and performance transparency features will be examined in greater detail in the next few sections.

14.8 Distribution Transparency

Distribution transparency allows a physically dispersed database to be managed as though it were a centralised database. The level of transparency supported by the DDBMS varies from system to system. Three levels of distribution transparency are recognised:

• Fragmentation transparency is the highest level of transparency. The end user or programmer does not need to know that a database is partitioned. Therefore, neither fragment names nor fragment locations are specified prior to data access.
• Location transparency exists when the end user or programmer must specify the database fragment names but does not need to specify where those fragments are located.
• Local mapping transparency exists when the end user or programmer must specify both the fragment names and their locations.

Transparency features are summarised in Table 14.3.
Table 14.3 A summary of transparency features

If the SQL statement requires:
Fragment name?   Location name?   Then the DBMS supports        Level of distribution transparency
Yes              Yes              Local mapping                 Low
Yes              No               Location transparency         Medium
No               No               Fragmentation transparency    High

As you examine Table 14.3, you might ask why there is no reference to a situation in which the fragment name is No and the location name is Yes. The reason for not including that scenario is simple: you cannot have a location name that fails to reference an existing fragment. (If you don't need to specify a fragment name, its location is clearly irrelevant.)

To illustrate the use of various transparency levels, suppose you have an EMPLOYEE table containing the attributes EMP_NAME, EMP_DOB, EMP_ADDRESS, EMP_DEPARTMENT and EMP_SALARY. The EMPLOYEE data are distributed over three different locations: London, Cape Town and Harare. The table is divided by location; that is, London employee data are stored in fragment E1, Cape Town employee data are stored in fragment E2 and Harare employee data are stored in fragment E3. (See Figure 14.9.)

Figure 14.9 Fragment locations
[Figure: a distributed DBMS holds the EMPLOYEE table in three fragments.]

Fragment   Location
E1         London
E2         Cape Town
E3         Harare

Now suppose the end user wants to list all employees with a date of birth prior to 1 January 1970. To focus on the transparency issues, also suppose the EMPLOYEE table is fragmented and each fragment is unique. The unique fragment condition indicates that each row is unique, regardless of the fragment in which it is located. Finally, assume that no portion of the database is replicated at any other site on the network.

Depending on the level of distribution transparency support, you may examine three query cases.
Case 1: The Database Supports Fragmentation Transparency

The query conforms to a non-distributed database query format; that is, it does not specify fragment names or locations. The query reads:

    SELECT *
    FROM   EMPLOYEE
    WHERE  EMP_DOB < '01-JAN-1970';
Case 2: The Database Supports Location Transparency

Fragment names must be specified in the query, but fragment location is not specified. The query reads:

    SELECT *
    FROM   E1
    WHERE  EMP_DOB < '01-JAN-1970'
    UNION
    SELECT *
    FROM   E2
    WHERE  EMP_DOB < '01-JAN-1970'
    UNION
    SELECT *
    FROM   E3
    WHERE  EMP_DOB < '01-JAN-1970';

Case 3: The Database Supports Local Mapping Transparency

Both the fragment name and location must be specified in the query. Using pseudo-SQL:

    SELECT *
    FROM   E1 NODE LONDON
    WHERE  EMP_DOB < '01-JAN-1970'
    UNION
    SELECT *
    FROM   E2 NODE CAPE TOWN
    WHERE  EMP_DOB < '01-JAN-1970'
    UNION
    SELECT *
    FROM   E3 NODE HARARE
    WHERE  EMP_DOB < '01-JAN-1970';
Note: NODE indicates the location of the database fragment. NODE is used for illustration purposes and is not part of the standard SQL syntax.

As you examine the preceding query formats, you can see how distribution transparency affects the way end users and programmers interact with the database.
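The difference between naming every fragment and naming only the global table can be tried out locally. The following is a minimal sketch, assuming that three tables in one in-memory SQLite database stand in for the fragments E1, E2 and E3, and that a view named EMPLOYEE stands in for the global schema. The employee rows are invented for illustration, and dates are stored in ISO format so that string comparison orders them correctly. A real DDBMS would resolve fragment locations through its data catalogue rather than through a view.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# Three fragments standing in for the London, Cape Town and Harare sites.
# Here they share one database; a DDBMS would place each on its own node.
for fragment in ("E1", "E2", "E3"):
    cur.execute(f"CREATE TABLE {fragment} (EMP_NAME TEXT, EMP_DOB TEXT)")

# Hypothetical sample rows (not taken from the chapter).
cur.execute("INSERT INTO E1 VALUES ('Adams', '1965-03-12')")
cur.execute("INSERT INTO E2 VALUES ('Botha', '1982-07-01')")
cur.execute("INSERT INTO E3 VALUES ('Chuma', '1969-11-30')")

# Location transparency style: the query must name every fragment.
by_fragment = cur.execute(
    "SELECT * FROM E1 WHERE EMP_DOB < '1970-01-01' "
    "UNION SELECT * FROM E2 WHERE EMP_DOB < '1970-01-01' "
    "UNION SELECT * FROM E3 WHERE EMP_DOB < '1970-01-01'"
).fetchall()

# Fragmentation transparency style: a view hides the fragments, so the
# query names only EMPLOYEE.
cur.execute(
    "CREATE VIEW EMPLOYEE AS "
    "SELECT * FROM E1 UNION ALL SELECT * FROM E2 UNION ALL SELECT * FROM E3"
)
by_global_name = cur.execute(
    "SELECT * FROM EMPLOYEE WHERE EMP_DOB < '1970-01-01'"
).fetchall()

# Both formats should return the same employees.
print(sorted(by_global_name) == sorted(by_fragment))
```

Both queries return the same rows; what changes is how much the person writing the query has to know about the fragments.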
Distribution transparency is supported by a distributed data dictionary (DDD), or a distributed data catalogue (DDC). The DDC contains the description of the entire database as seen by the database administrator. The database description, known as the distributed global schema, is the common database schema used by local TPs to translate user requests into subqueries (remote requests) that are processed by different DPs. The DDC is itself distributed, and it is replicated at the network nodes. Therefore, the DDC must maintain consistency through updating at all sites.

Keep in mind that some of the current DDBMS implementations impose limitations on the level of transparency support. For instance, you might be able to distribute a database, but not a table, across multiple sites. Such a condition indicates that the DDBMS supports location transparency but not fragmentation transparency.

14.9 Transaction Transparency

Transaction transparency is a DDBMS property that ensures that database transactions maintain the distributed database's integrity and consistency. Remember that a DDBMS database transaction can update data stored in many different computers connected in a network. Transaction transparency ensures that the transaction will be completed only when all database sites involved in the transaction complete their part of the transaction.

Distributed database systems require complex mechanisms to manage transactions and to ensure the database's consistency and integrity. To understand how the transactions are managed, you should know the basic concepts governing remote requests, remote transactions, distributed transactions and distributed requests.

14.9.1 Distributed Requests and Distributed Transactions2

Whether or not a transaction is distributed, it is formed by one or more database requests. The basic difference between a non-distributed transaction and a distributed transaction is that the latter can update or request data from several different remote sites on a network. To better illustrate the distributed transaction concepts, let's begin by establishing the difference between remote and distributed transactions, using the BEGIN WORK and COMMIT WORK transaction format. Assume the existence of location transparency to avoid having to specify the data location.
Figure 14.10 A remote request
[Figure: the TP at site A sends one request over the network to the DP at site B, which holds the CUSTOMER table.]

    SELECT *
    FROM   CUSTOMER
    WHERE  CUS_COUNTRY = 'ZA';

Comment: The request is directed to the CUSTOMER table at site B.

2 The details of distributed requests and transactions were originally described in David McGoveran and Colin White, Clarifying Client/Server, 3(14), November 1990, pp. 78-89.
A remote request, illustrated in Figure 14.10, lets a single SQL statement access the data that are to be processed by a single remote database processor. In other words, the SQL statement (or request) can reference data at only one remote site.

Similarly, a remote transaction, composed of several requests, accesses data at a single remote site. A remote transaction is illustrated in Figure 14.11.
Figure 14.11 A remote transaction
[Figure: the TP at site A sends the transaction over the network to the DP at site B, which holds the INVOICE and PRODUCT tables.]

    BEGIN WORK;
    UPDATE PRODUCT
    SET    PROD_QTY = PROD_QTY - 1
    WHERE  PROD_NUM = '231785';
    INSERT INTO INVOICE (CUS_NUM, INV_DATE, INV_TOTAL)
    VALUES ('100', '15-FEB-2015', 120.00);
    COMMIT WORK;

As you examine Figure 14.11, note the following remote transaction features:

• The transaction updates the PRODUCT and INVOICE tables (located at site B).
• The remote transaction is sent to and executed at the remote site B.
• The transaction can reference only one remote DP.
• Each SQL statement (or request) can reference only one (the same) remote DP at a time, and the entire transaction can reference and be executed at only one remote DP.
A distributed transaction allows a transaction to reference several different local or remote DP sites. Although each single request can reference only one local or remote DP site, the transaction as a whole can reference multiple DP sites because each request can reference a different site. The distributed transaction process is illustrated in Figure 14.12. Note the following features in Figure 14.12:

• The transaction references two remote sites (B and C).
• The first two requests (UPDATE PRODUCT and INSERT INTO INVOICE) are processed by the DP at the remote site C, and the last request (UPDATE CUSTOMER) is processed by the DP at the remote site B.
• Each request can access only one remote site at a time.

The third characteristic may create problems. For example, suppose the table PRODUCT is divided into two fragments, PROD1 and PROD2, located at sites B and C, respectively. Given that scenario, the preceding distributed transaction cannot be executed because the request:
    SELECT *
    FROM   PRODUCT
    WHERE  PROD_NUM = '231785';

cannot access data from more than one remote site. Therefore, the DBMS must be able to support a distributed request.

Figure 14.12 A distributed transaction
[Figure: the TP at site A communicates over the network with the DP at site B, which holds the CUSTOMER table, and the DP at site C, which holds the INVOICE and PRODUCT tables.]

    BEGIN WORK;
    UPDATE PRODUCT
    SET    PROD_QTY = PROD_QTY - 1
    WHERE  PROD_NUM = '231785';
    INSERT INTO INVOICE (CUS_NUM, INV_DATE, INV_TOTAL)
    VALUES ('100', '15-FEB-2019', 120.00);
    UPDATE CUSTOMER
    SET    CUS_BALANCE = CUS_BALANCE + 120
    WHERE  CUS_NUM = '100';
    COMMIT WORK;
A distributed request lets a single SQL statement reference data located at several different local or remote DP sites. Because each request (SQL statement) can access data from more than one local or remote DP site, a transaction can access several sites. The ability to execute a distributed request provides fully distributed database processing capabilities because of the ability to:

• partition a database table into several fragments
• reference one or more of those fragments with only one request. In other words, there is fragmentation transparency.

The location and partition of the data should be transparent to the end user. Figure 14.13 illustrates a distributed request. As you examine Figure 14.13, note that the transaction uses a single SELECT statement to reference two tables, CUSTOMER and INVOICE. The two tables are located at two different sites, B and C.

The distributed request feature also allows a single request to reference a physically partitioned table. For example, suppose a CUSTOMER table is divided into two fragments, C1 and C2, located at sites B and C, respectively. Further suppose the end user wants to obtain a list of all customers whose balances exceed 250. The request is illustrated in Figure 14.14. Full fragmentation transparency support is provided only by a DDBMS that supports distributed requests.

Understanding the different types of database requests in distributed database systems helps you address the transaction transparency issue more effectively. Transaction transparency ensures that
distributed transactions are treated as centralised transactions, ensuring serialisability of transactions. (Review Chapter 12, Managing Transactions and Concurrency, if necessary.) That is, the execution of concurrent transactions, whether or not they are distributed, will take the database from one consistent state to another.
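The four request types differ only in how many statements a transaction contains and how many DP sites each statement touches. The sketch below makes that rule explicit. It assumes a hypothetical representation in which a transaction is a list of requests and each request is the set of site names that one SQL statement references; the function name classify is invented for illustration.

```python
def classify(transaction):
    """Classify a transaction given as a list of requests.

    Each request is the set of DP sites that one SQL statement references.
    """
    if not transaction:
        raise ValueError("a transaction needs at least one request")
    # One statement touching several sites is a distributed request.
    if any(len(sites) > 1 for sites in transaction):
        return "distributed request"
    # Single-site statements that together span several sites.
    if len(set().union(*transaction)) > 1:
        return "distributed transaction"
    # Several statements, all sent to the same site.
    if len(transaction) > 1:
        return "remote transaction"
    return "remote request"


# One SELECT against site B.
print(classify([{"B"}]))
# UPDATE and INSERT, both at site B.
print(classify([{"B"}, {"B"}]))
# Two statements at site C and one at site B.
print(classify([{"C"}, {"C"}, {"B"}]))
# One SELECT that references tables at sites B and C.
print(classify([{"B", "C"}]))
```

The four calls correspond, in order, to a remote request, a remote transaction, a distributed transaction and a distributed request.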
Figure 14.13 A distributed request
[Figure: the TP at site A communicates over the network with the DP at site B, which holds the CUSTOMER table, and the DP at site C, which holds the INVOICE and PRODUCT tables.]

    BEGIN WORK;
    SELECT CUSTOMER.CUS_NUM, INV_TOTAL
    FROM   CUSTOMER, INVOICE
    WHERE  CUSTOMER.CUS_NUM = '100'
    AND    INVOICE.CUS_NUM = CUSTOMER.CUS_NUM;
    COMMIT WORK;

Figure 14.14 Another distributed request
[Figure: the TP at site A communicates over the network with the DP at site B, which holds fragment C1, and the DP at site C, which holds fragment C2.]

    SELECT *
    FROM   CUSTOMER
    WHERE  CUS_BALANCE > 250;
14.9.2 Distributed Concurrency Control

Concurrency control becomes especially important in the distributed database environment because multi-site, multiple-process operations are more likely to create data inconsistencies and deadlocked transactions than single-site systems are. For example, the TP component of a DDBMS must ensure that all parts of the transaction are completed at all sites before a final COMMIT is issued to record the transaction.

Suppose each transaction operation was committed by each local DP, but one of the DPs could not commit the transaction's results. Such a scenario would yield the problems illustrated in Figure 14.15: the transaction(s) would yield an inconsistent database, with its inevitable integrity problems, because committed data cannot be uncommitted! The solution for the problem illustrated in Figure 14.15 is a two-phase commit protocol, which you will explore next.

Figure 14.15 The effect of a premature COMMIT

DP site A: LOCK(X), WRITE(X), COMMIT
DP site B: LOCK(Y), WRITE(Y), COMMIT
DP site C: LOCK(Z), ..., ROLLBACK

Data are committed at sites A and B. Site C rolls back, but sites A and B can't be rolled back.
14.9.3 Two-Phase Commit Protocol

Centralised databases require only one DP. All database operations take place at only one site, and the consequences of database operations are immediately known to the DBMS. In contrast, distributed databases make it possible for a transaction to access data at several sites. A final COMMIT must not be issued until all sites have committed their parts of the transaction. The two-phase commit protocol guarantees that, if a portion of a transaction operation cannot be committed, all changes made at the other sites participating in the transaction will be undone to maintain a consistent database state.
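The all-or-nothing guarantee can be illustrated with a small simulation. This is a minimal sketch of the usual prepare-then-commit structure, assuming hypothetical Participant objects that stand in for the DPs and a two_phase_commit function that plays the coordinator. It does not model logging, recovery, timeouts or coordinator failure, all of which a real implementation must handle.

```python
class Participant:
    """Hypothetical stand-in for a DP taking part in a transaction."""

    def __init__(self, name, can_commit=True):
        self.name = name
        self.can_commit = can_commit
        self.state = "ACTIVE"

    def prepare(self):
        # Phase 1: vote on whether this site is able to commit its part.
        self.state = "PREPARED" if self.can_commit else "ABORTED"
        return self.can_commit

    def commit(self):
        self.state = "COMMITTED"

    def rollback(self):
        self.state = "ABORTED"


def two_phase_commit(participants):
    # Phase 1: collect a vote from every participating site.
    votes = [p.prepare() for p in participants]
    # Phase 2: commit only if every site voted yes; otherwise undo everywhere.
    if all(votes):
        for p in participants:
            p.commit()
        return "COMMITTED"
    for p in participants:
        p.rollback()
    return "ABORTED"


# Sites A and B are ready, but site C cannot commit its part.
sites = [Participant("A"), Participant("B"), Participant("C", can_commit=False)]
outcome = two_phase_commit(sites)
print(outcome, [p.state for p in sites])
```

Because no site commits before every vote is in, the situation in which some sites have committed and another has rolled back cannot arise in this simplified model.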
14.10 Performance and Failure Transparency

One of the most important functions of a database is its ability to make data available. Web-based distributed data systems demand high availability, which means not only that data are accessible but that requests are processed in a timely manner. For example, the average Google search has a sub-second response time. When was the last time you entered a Google query and waited more than a couple of seconds for the results?

Performance transparency allows a DDBMS to perform as if it were a centralised database. In other words, no performance degradation should be incurred due to data distribution. Failure transparency ensures that the system will continue to operate in the case of a node or network failure. Although these are two separate issues, they are interrelated in that a failing node or congested network path could cause performance problems. Therefore, both issues are addressed in this section.
The objective of a query optimisation routine is to minimise the total cost associated with the execution of a request. The costs associated with a request are a function of the:

• access time (I/O) cost involved in accessing the physical data stored on disk
• communication cost associated with the transmission of data among nodes in distributed database systems
• CPU time cost associated with the processing overhead of managing distributed transactions.

Although costs are often classified as either communication or processing costs, it is difficult to separate the two. Not all query optimisation algorithms use the same parameters, and not all algorithms assign the same weight to each parameter. For example, some algorithms minimise total time; others minimise the communication time; and still others do not factor in the CPU time, considering it insignificant relative to other cost sources.
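How the choice of weights changes the outcome can be shown with a toy cost model. The sketch below assumes a simple weighted sum of the three cost components; the function names, the plan names and all of the numbers are hypothetical and are not drawn from any particular optimiser.

```python
def request_cost(io_cost, comm_cost, cpu_cost, weights=(1.0, 1.0, 1.0)):
    """Weighted sum of I/O, communication and CPU cost (illustrative model)."""
    w_io, w_comm, w_cpu = weights
    return w_io * io_cost + w_comm * comm_cost + w_cpu * cpu_cost


def cheapest(plans, weights):
    """Return the name of the plan with the lowest weighted cost."""
    return min(plans, key=lambda name: request_cost(**plans[name], weights=weights))


# Two hypothetical execution plans for the same request.
plans = {
    "plan_a": {"io_cost": 10, "comm_cost": 50, "cpu_cost": 5},
    "plan_b": {"io_cost": 60, "comm_cost": 20, "cpu_cost": 5},
}

# Minimising total cost and minimising communication cost alone
# can favour different plans.
print(cheapest(plans, weights=(1, 1, 1)))
print(cheapest(plans, weights=(0, 1, 0)))
```

With equal weights plan_a costs 65 against 85 for plan_b, whereas counting only communication cost gives 50 against 20, so the two weightings pick different plans.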
Note: Chapter 13, Managing Database and SQL Performance, provides additional details about query optimisation.

Resolving data requests in a distributed data environment must take the following points into consideration:

• Data distribution. In a DDBMS, query translation is more complicated because the DDBMS must decide which fragment to access. (Distribution transparency was explained earlier in this chapter.) In this case, a TP executing a query must choose what fragments to access, create multiple data requests to the chosen remote DPs, combine the DP responses and present the data to the application.
• Data replication. In addition, the data may also be replicated at several different sites. The data replication makes the access problem even more complex because the database must ensure that all copies of the data are consistent. Therefore, an important characteristic of query optimisation in distributed database systems is that it must provide replica transparency. Replica transparency refers to the DDBMS's ability to hide multiple copies of data from the user. This ability is particularly important with data update operations. If a read-only request is being processed, it can be satisfied by accessing any available remote DP. However, processing a write request also involves synchronising all existing fragments to maintain data consistency. The two-phase commit protocol you learnt about in Section 14.9.3 ensures that the transaction will complete successfully. However, if data fragments are replicated at other sites, the DDBMS must also ensure the consistency of all the fragments; that is, all fragments should be mutually consistent. To accomplish this, a DP captures all changes and pushes them to each remote replica. This introduces delays in the system and basically means that not all data changes are immediately seen by all replicas.
• Network and node availability. The response time associated with remote sites cannot be easily predetermined because some nodes finish their part of the query in less time than others and network path performance varies because of bandwidth and traffic loads. Hence, to achieve performance transparency, the DDBMS should consider issues such as network latency, the delay imposed by the amount of time required for a data packet to make a round trip from point A to point B; or network partitioning, the delay imposed when nodes become suddenly unavailable due to a network failure.

Carefully planning how to partition a database and where to locate the database fragments can help ensure the performance and consistency of a distributed database. The following section discusses issues for distributed database design.

14.11 Distributed Database Design

Whether the database is centralised or distributed, the design principles and concepts described in Chapter 3, Relational Model Characteristics; Chapter 5, Data Modelling with Entity Relationship Diagrams; and Chapter 7, Normalising Database Designs, are still applicable. However, the design of a distributed database introduces three new issues:

• How to partition the database into fragments.
• Which fragments to replicate.
• Where to locate those fragments and replicas.

Data fragmentation and data replication deal with the first two issues, and data allocation deals with the third issue.
14.11.1 Data Fragmentation

Data fragmentation allows you to break a single object into two or more segments or fragments. The object might be a user's database, a system database or a table. Each fragment can be stored at any site over a computer network. Information about data fragmentation is stored in the distributed data catalogue (DDC), from where it is accessed by the TP to process user requests.

Data fragmentation strategies, as discussed here, are based at the table level and consist of dividing a table into logical fragments. You will explore three types of data fragmentation strategies: horizontal, vertical and mixed. (Keep in mind that a fragmented table can always be recreated from its fragmented parts by a combination of unions and joins.)

• Horizontal fragmentation refers to the division of a relation into subsets (fragments) of tuples (rows). Each fragment is stored at a different node, and each fragment has unique rows. However, the unique rows all have the same attributes (columns). In short, each fragment represents the equivalent of a SELECT statement, with the WHERE clause on a single attribute.
• Vertical fragmentation refers to the division of a relation into attribute (column) subsets. Each subset (fragment) is stored at a different node, and each fragment has unique columns, with the exception of the key column, which is common to all fragments. This is the equivalent of the PROJECT statement in SQL.
• Mixed fragmentation refers to a combination of horizontal and vertical strategies. In other words, a table may be divided into several horizontal subsets (rows), each one having a subset of the attributes (columns).

To illustrate the fragmentation strategies, let's use the CUSTOMER table for the XYZ Company, depicted in Figure 14.16. The table contains the attributes CUS_NUM, CUS_NAME, CUS_ADDRESS, CUS_COUNTRY, CUS_LIMIT, CUS_BAL, CUS_RATING and CUS_DUE.

Figure 14.16 A sample customer table

Table name: CUSTOMER

CUS_NUM  CUS_NAME       CUS_ADDRESS       CUS_COUNTRY  CUS_LIMIT  CUS_BAL  CUS_RATING  CUS_DUE
10       Sinex, Inc.    12 Main St.       UK           3500.00    2700.00  3           1245.00
11       Martin Corp.   321 Sunset Blvd.  SA           6000.00    1200.00  1           0.00
12       Mynux Corp.    910 Eagle St.     UK           4000.00    3500.00  3           3400.00
13       BTBC, Inc.     Rue du Monde      SA           6000.00    5890.00  3           1090.00
14       Victory, Inc.  123 Maple St.     SA           1200.00    550.00   1           0.00
15       NBCC Corp.     909 High Ave.     NL           2000.00    350.00   2           50.00

Online content: The databases used to illustrate the material in this chapter are found on the online platform for this book.

Horizontal Fragmentation
There are various ways to partition a table horizontally:

Round-robin partitioning. Rows are assigned to a given fragment in a round-robin fashion (F1, F2, F3, ..., Fn) to ensure an even distribution of rows among all fragments. However, this is not a good strategy if you require location awareness: the ability to determine which DP node will process a query based on the geospatial location of the requester. For example, you would want all queries from UK customers to be resolved from a fragment that stores only UK customers, and this fragment to be located in a node close to the UK.

Range partitioning based on a partition key. A partition key is one or more attributes in a table that determine the fragment in which a row will be stored. For example, if you want to provide location awareness, a good partition key would be the customer country field. This is the most common and useful data partitioning strategy.

Suppose XYZ Company's corporate management requires information about its customers in all three countries, but company locations in each country (UK, SA and NL) require data regarding local customers only. Based on such requirements, you decide to distribute the data by country. Therefore, you define the horizontal fragments to conform to the structure shown in Table 14.4.

Table 14.4 Horizontal fragmentation of the CUSTOMER table by country

Fragment Name  Location         Condition           Node Name  Customer Numbers  Number of Rows
CUST_H1        United Kingdom   CUS_COUNTRY = 'UK'  NAS        10, 12            2
CUST_H2        The Netherlands  CUS_COUNTRY = 'NL'  ATL        15                1
CUST_H3        South Africa     CUS_COUNTRY = 'SA'  TAM        11, 13, 14        3
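The two partitioning strategies described above can be sketched in a few lines of Python. This is a minimal illustration and not part of the book's material: the function names round_robin_partition and key_partition are hypothetical, and the rows are a three-column subset of the chapter's CUSTOMER data.

```python
from itertools import cycle

# (CUS_NUM, CUS_NAME, CUS_COUNTRY): a subset of the chapter's CUSTOMER rows.
CUSTOMERS = [
    (10, "Sinex, Inc.", "UK"),
    (11, "Martin Corp.", "SA"),
    (12, "Mynux Corp.", "UK"),
    (13, "BTBC, Inc.", "SA"),
    (14, "Victory, Inc.", "SA"),
    (15, "NBCC Corp.", "NL"),
]


def round_robin_partition(rows, fragment_names):
    """Assign rows to fragments in turn (F1, F2, ..., Fn, F1, ...)."""
    fragments = {name: [] for name in fragment_names}
    for row, name in zip(rows, cycle(fragment_names)):
        fragments[name].append(row)
    return fragments


def key_partition(rows, key_to_fragment, key_index=2):
    """Assign each row to the fragment selected by its partition key value."""
    fragments = {name: [] for name in key_to_fragment.values()}
    for row in rows:
        fragments[key_to_fragment[row[key_index]]].append(row)
    return fragments


# Hypothetical usage: three evenly filled fragments versus the Table 14.4 layout.
rr = round_robin_partition(CUSTOMERS, ["F1", "F2", "F3"])
by_country = key_partition(
    CUSTOMERS, {"UK": "CUST_H1", "NL": "CUST_H2", "SA": "CUST_H3"}
)
```

Round-robin gives every fragment two rows but mixes countries (F1 receives customers 10 and 13, one from the UK and one from SA), so it offers no location awareness. Partitioning on CUS_COUNTRY yields fragments of uneven size, yet each one can be placed at the node closest to its customers.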
The partition key will be the CUS_COUNTRY field. Each horizontal fragment may have a different number of rows, but each fragment must have the same attributes. The resulting fragments yield the three tables depicted in Figure 14.17.

Figure 14.17 Table fragments in three locations

Table name: CUST_H1    Location: United Kingdom    Node: NAS
CUS_NUM  CUS_NAME       CUS_ADDRESS       CUS_COUNTRY  CUS_LIMIT  CUS_BAL  CUS_RATING  CUS_DUE
10       Sinex, Inc.    12 Main St.       UK           3500.00    2700.00  3           1245.00
12       Mynux Corp.    910 Eagle St.     UK           4000.00    3500.00  3           3400.00

Table name: CUST_H2    Location: The Netherlands    Node: ATL
CUS_NUM  CUS_NAME       CUS_ADDRESS       CUS_COUNTRY  CUS_LIMIT  CUS_BAL  CUS_RATING  CUS_DUE
15       NBCC Corp.     909 High Ave.     NL           2000.00     350.00  2             50.00

Table name: CUST_H3    Location: South Africa    Node: TAM
CUS_NUM  CUS_NAME       CUS_ADDRESS       CUS_COUNTRY  CUS_LIMIT  CUS_BAL  CUS_RATING  CUS_DUE
11       Martin Corp.   321 Sunset Blvd.  SA           6000.00    1200.00  1              0.00
13       BTBC, Inc.     Rue du Monde      SA           6000.00    5890.00  3           1090.00
14       Victory, Inc.  123 Maple St.     SA           1200.00     550.00  1              0.00

Vertical Fragmentation

You may also divide the CUSTOMER relation into vertical fragments that are composed of a collection of attributes. For example, suppose the company is divided into two departments: the service department and the collections department. Each department is located in a separate building, and each has an interest in only a few of the CUSTOMER table's attributes. In this case, the fragments are defined as shown in Table 14.5.
Table 14.5 Vertical fragmentation of the CUSTOMER table

Fragment Name  Location         Node Name  Attribute Names
CUST_V1        Service Bldg     SVC        CUS_NUM, CUS_NAME, CUS_ADDRESS, CUS_COUNTRY
CUST_V2        Collection Bldg  ARC        CUS_NUM, CUS_LIMIT, CUS_BAL, CUS_RATING, CUS_DUE
Each vertical fragment must have the same number of rows, but the inclusion of the different attributes depends on the key column. The vertical fragmentation results are displayed in Figure 14.18. Note that the key attribute (CUS_NUM) is common to both fragments CUST_V1 and CUST_V2.

Figure 14.18 Vertically fragmented table contents

Table name: CUST_V1    Location: Service Building    Node: SVC
CUS_NUM  CUS_NAME       CUS_ADDRESS       CUS_COUNTRY
10       Sinex, Inc.    12 Main St.       UK
11       Martin Corp.   321 Sunset Blvd.  SA
12       Mynux Corp.    910 Eagle St.     UK
13       BTBC, Inc.     Rue du Monde      SA
14       Victory, Inc.  123 Maple St.     SA
15       NBCC Corp.     909 High Ave.     NL

Table name: CUST_V2    Location: Collection Building    Node: ARC
CUS_NUM  CUS_LIMIT  CUS_BAL  CUS_RATING  CUS_DUE
10       3500.00    2700.00  3           1245.00
11       6000.00    1200.00  1              0.00
12       4000.00    3500.00  3           3400.00
13       6000.00    5890.00  3           1090.00
14       1200.00     550.00  1              0.00
15       2000.00     350.00  2             50.00
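To see why both vertical fragments keep CUS_NUM, the sketch below projects the CUSTOMER rows into CUST_V1 and CUST_V2 and then rebuilds the original table by joining on the key. It is an illustration only: the helper names project and join_on_key are hypothetical, and a real DDBMS would perform the equivalent work through its own query processor.

```python
COLUMNS = ["CUS_NUM", "CUS_NAME", "CUS_ADDRESS", "CUS_COUNTRY",
           "CUS_LIMIT", "CUS_BAL", "CUS_RATING", "CUS_DUE"]
ROWS = [
    (10, "Sinex, Inc.", "12 Main St.", "UK", 3500.00, 2700.00, 3, 1245.00),
    (11, "Martin Corp.", "321 Sunset Blvd.", "SA", 6000.00, 1200.00, 1, 0.00),
    (12, "Mynux Corp.", "910 Eagle St.", "UK", 4000.00, 3500.00, 3, 3400.00),
    (13, "BTBC, Inc.", "Rue du Monde", "SA", 6000.00, 5890.00, 3, 1090.00),
    (14, "Victory, Inc.", "123 Maple St.", "SA", 1200.00, 550.00, 1, 0.00),
    (15, "NBCC Corp.", "909 High Ave.", "NL", 2000.00, 350.00, 2, 50.00),
]
CUSTOMER = [dict(zip(COLUMNS, row)) for row in ROWS]


def project(rows, attributes):
    """Keep only the named attributes of every row (a vertical fragment)."""
    return [{a: row[a] for a in attributes} for row in rows]


def join_on_key(left, right, key="CUS_NUM"):
    """Rebuild full rows by matching the two fragments on the shared key."""
    right_by_key = {row[key]: row for row in right}
    return [{**row, **right_by_key[row[key]]} for row in left]


CUST_V1 = project(CUSTOMER, ["CUS_NUM", "CUS_NAME", "CUS_ADDRESS", "CUS_COUNTRY"])
CUST_V2 = project(CUSTOMER, ["CUS_NUM", "CUS_LIMIT", "CUS_BAL", "CUS_RATING", "CUS_DUE"])
rebuilt = join_on_key(CUST_V1, CUST_V2)
```

Both fragments hold six rows, and joining them on CUS_NUM reproduces the original CUSTOMER rows. Without the shared key there would be no reliable way to match a name in CUST_V1 to a balance in CUST_V2.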
Mixed Fragmentation

The XYZ Company's structure requires that the CUSTOMER data be fragmented horizontally to accommodate the different company locations; within the locations, the data must be fragmented vertically to accommodate the different departments (service and collection). In short, the CUSTOMER table requires mixed fragmentation.

Mixed fragmentation requires a two-step procedure. First, horizontal fragmentation is introduced for each site based on the location within a country (CUS_COUNTRY). The horizontal fragmentation yields the subsets of customer tuples (horizontal fragments) that are located at each site. As the departments are located in different buildings, vertical fragmentation is used within each horizontal fragment to divide the attributes, thus meeting each department's information needs at each sub site. Mixed fragmentation yields the results displayed in Table 14.6.
Table 14.6 Mixed fragmentation of the CUSTOMER table

Fragment Name  Location       Horizontal Criteria  Node Name  Resulting Rows at Site  Vertical Criteria (Attributes at Each Fragment)
CUST_M1        UK-Service     CUS_COUNTRY = 'UK'   NAS-S      10, 12                  CUS_NUM, CUS_NAME, CUS_ADDRESS, CUS_COUNTRY
CUST_M2        UK-Collection  CUS_COUNTRY = 'UK'   NAS-C      10, 12                  CUS_NUM, CUS_LIMIT, CUS_BAL, CUS_RATING, CUS_DUE
CUST_M3        NL-Service     CUS_COUNTRY = 'NL'   ATL-S      15                      CUS_NUM, CUS_NAME, CUS_ADDRESS, CUS_COUNTRY
CUST_M4        NL-Collection  CUS_COUNTRY = 'NL'   ATL-C      15                      CUS_NUM, CUS_LIMIT, CUS_BAL, CUS_RATING, CUS_DUE
CUST_M5        SA-Service     CUS_COUNTRY = 'SA'   TAM-S      11, 13, 14              CUS_NUM, CUS_NAME, CUS_ADDRESS, CUS_COUNTRY
CUST_M6        SA-Collection  CUS_COUNTRY = 'SA'   TAM-C      11, 13, 14              CUS_NUM, CUS_LIMIT, CUS_BAL, CUS_RATING, CUS_DUE
Each fragment displayed in Table 14.6 contains customer data by country and, within each country, by department location, to fit each department's data requirements. The tables corresponding to the fragments listed in Table 14.6 are shown in Figure 14.19.

Figure 14.19 Table contents after the mixed fragmentation process

Table name: CUST_M1    Location: UK-Service    Node: NAS-S
CUS_NUM  CUS_NAME       CUS_ADDRESS       CUS_COUNTRY
10       Sinex, Inc.    12 Main St.       UK
12       Mynux Corp.    910 Eagle St.     UK

Table name: CUST_M2    Location: UK-Collection    Node: NAS-C
CUS_NUM  CUS_LIMIT  CUS_BAL  CUS_RATING  CUS_DUE
10       3500.00    2700.00  3           1245.00
12       4000.00    3500.00  3           3400.00

Table name: CUST_M3    Location: NL-Service    Node: ATL-S
CUS_NUM  CUS_NAME       CUS_ADDRESS       CUS_COUNTRY
15       NBCC Corp.     909 High Ave.     NL

Table name: CUST_M4    Location: NL-Collection    Node: ATL-C
CUS_NUM  CUS_LIMIT  CUS_BAL  CUS_RATING  CUS_DUE
15       2000.00     350.00  2             50.00

Table name: CUST_M5    Location: SA-Service    Node: TAM-S
CUS_NUM  CUS_NAME       CUS_ADDRESS       CUS_COUNTRY
11       Martin Corp.   321 Sunset Blvd.  SA
13       BTBC, Inc.     Rue du Monde      SA
14       Victory, Inc.  123 Maple St.     SA

Table name: CUST_M6    Location: SA-Collection    Node: TAM-C
CUS_NUM  CUS_LIMIT  CUS_BAL  CUS_RATING  CUS_DUE
11       6000.00    1200.00  1              0.00
13       6000.00    5890.00  3           1090.00
14       1200.00     550.00  1              0.00
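The two-step mixed fragmentation procedure can be sketched as a horizontal filter followed by a vertical projection. The code below is a hedged illustration rather than the book's method: the function mixed_fragments and the dictionaries HORIZONTAL and VERTICAL are hypothetical names, and the node names are built from the prefixes and suffixes used in the chapter's mixed fragmentation table.

```python
COLUMNS = ["CUS_NUM", "CUS_NAME", "CUS_ADDRESS", "CUS_COUNTRY",
           "CUS_LIMIT", "CUS_BAL", "CUS_RATING", "CUS_DUE"]
ROWS = [
    (10, "Sinex, Inc.", "12 Main St.", "UK", 3500.00, 2700.00, 3, 1245.00),
    (11, "Martin Corp.", "321 Sunset Blvd.", "SA", 6000.00, 1200.00, 1, 0.00),
    (12, "Mynux Corp.", "910 Eagle St.", "UK", 4000.00, 3500.00, 3, 3400.00),
    (13, "BTBC, Inc.", "Rue du Monde", "SA", 6000.00, 5890.00, 3, 1090.00),
    (14, "Victory, Inc.", "123 Maple St.", "SA", 1200.00, 550.00, 1, 0.00),
    (15, "NBCC Corp.", "909 High Ave.", "NL", 2000.00, 350.00, 2, 50.00),
]
CUSTOMER = [dict(zip(COLUMNS, row)) for row in ROWS]

# Step 1 criteria: country value -> node prefix of the site that stores it.
HORIZONTAL = {"UK": "NAS", "NL": "ATL", "SA": "TAM"}
# Step 2 criteria: node suffix -> attributes needed by that department.
VERTICAL = {
    "S": ["CUS_NUM", "CUS_NAME", "CUS_ADDRESS", "CUS_COUNTRY"],          # service
    "C": ["CUS_NUM", "CUS_LIMIT", "CUS_BAL", "CUS_RATING", "CUS_DUE"],   # collection
}


def mixed_fragments(rows):
    """Fragment horizontally by country, then vertically by department."""
    fragments = {}
    for country, node in HORIZONTAL.items():
        subset = [r for r in rows if r["CUS_COUNTRY"] == country]  # step 1
        for suffix, attrs in VERTICAL.items():                     # step 2
            fragments[f"{node}-{suffix}"] = [
                {a: r[a] for a in attrs} for r in subset
            ]
    return fragments


fragments = mixed_fragments(CUSTOMER)
```

The result is six fragments keyed by node name. The horizontal filter must run first, because the collection fragments do not carry CUS_COUNTRY and could not be split by country afterwards.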
14.11.2 Data replication

Data replication refers to the storage of data copies at multiple sites served by a computer network. Fragment copies can be stored at several sites to serve specific information requirements. Since the existence of fragment copies can enhance data availability and response time, data copies can help to reduce communication and total query costs.

Suppose database A is divided into two fragments, A1 and A2. Within a replicated distributed database, the scenario depicted in Figure 14.20 is possible: fragment A1 is stored at sites S1 and S2, while fragment A2 is stored at sites S2 and S3.

Replicated data are subject to the mutual consistency rule. The mutual consistency rule requires that all copies of data fragments be identical. Therefore, to maintain data consistency among the replicas, the DDBMS must ensure that a database update is performed at all sites where replicas exist.

There are basically two styles of replication:

Push replication. After a data update, the originating DP node sends the changes to the replica nodes to ensure that data are immediately updated. This type of replication focuses on maintaining data consistency. However, it decreases data availability due to the latency involved in ensuring data consistency at all nodes.

Pull replication. After a data update, the originating DP node sends messages to the replica nodes to notify them of the update. The replica nodes decide when to apply the updates to their local fragment. In this type of replication, data updates propagate more slowly to the replicas. The focus is on maintaining data availability. However, this style of replication allows for temporary data inconsistencies.

Although replication has some benefits, it also imposes additional DDBMS processing overhead because each data copy must be maintained by the system. To illustrate the replica overhead imposed on a DDBMS, consider the processes that the DDBMS must perform to use the database:
Figure 14.20 Data replication
(Site S1: DP, fragment A1. Site S2: DP, fragments A1 and A2. Site S3: DP, fragment A2.)

If the database is fragmented, the DDBMS must decompose a query into subqueries to access the appropriate fragments.

If the database is replicated, the DDBMS must decide which copy to access. A READ operation selects the nearest copy to satisfy the transaction. A WRITE operation requires that all copies be selected and updated to satisfy the mutual consistency rule.

The TP sends a data request to each selected DP for execution.

The DP receives and executes each request and sends the data back to the TP.

The TP assembles the DP responses.

The problem becomes more complex when you consider additional factors such as network topology and communication throughputs.

Three replication scenarios exist: a database can be fully replicated, partially replicated, or unreplicated:

A fully replicated database stores multiple copies of each database fragment at multiple sites. In this case, all database fragments are replicated. A fully replicated database can be impractical due to the amount of overhead it imposes on the system.

A partially replicated database stores multiple copies of some database fragments at multiple sites. Most DDBMSs are able to handle the partially replicated database well.

An unreplicated database stores each database fragment at a single site. Therefore, there are no duplicate database fragments.

Several factors influence the decision to use data replication:

Database size. The amount of data replicated will have an impact on the storage requirements and the data transmission costs. Replicating large amounts of data requires a window of time and higher network bandwidth that could affect other applications.

Usage frequency. The frequency of data usage determines how frequently the data needs to be updated.

Costs. Costs include those for performance, software overhead, and management associated with synchronising transactions and their components versus fault-tolerance benefits that are associated with replicated data.

When the usage frequency of remotely located data is high and the database is large, data replication can reduce the cost of data requests. Data replication information is stored in the distributed data catalogue (DDC), whose contents are used by the TP to decide which copy of a database fragment to access. The data replication makes it possible to restore lost data.
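The difference between push and pull replication can be made concrete with a small in-memory model. This is a sketch under stated assumptions, not a DDBMS implementation: the Replica class, the push_update and pull_update functions, and the key "CUS_BAL:10" are all hypothetical, and network latency is ignored.

```python
class Replica:
    """One copy of a fragment held at a site (hypothetical model)."""

    def __init__(self, site):
        self.site = site
        self.data = {}
        self.pending = []  # updates announced but not yet applied (pull style)

    def apply_pending(self):
        """Apply every announced update, in arrival order."""
        for key, value in self.pending:
            self.data[key] = value
        self.pending.clear()


def push_update(origin, replicas, key, value):
    """Push: the originating node writes the change to every replica at once."""
    origin.data[key] = value
    for replica in replicas:
        replica.data[key] = value


def pull_update(origin, replicas, key, value):
    """Pull: the originating node only notifies replicas; each applies later."""
    origin.data[key] = value
    for replica in replicas:
        replica.pending.append((key, value))


s1, s2 = Replica("S1"), Replica("S2")

push_update(s1, [s2], "CUS_BAL:10", 2700.00)
consistent_after_push = s1.data == s2.data    # replicas agree immediately

pull_update(s1, [s2], "CUS_BAL:10", 2500.00)
stale_before_apply = s2.data["CUS_BAL:10"]    # S2 still holds the old value
s2.apply_pending()
consistent_after_apply = s1.data == s2.data   # replicas agree again
```

After the pull-style update, site S2 keeps serving the old balance until it chooses to apply the pending change. That window is the temporary inconsistency that pull replication accepts in exchange for availability.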
14.11.3 Data allocation

Data allocation describes the process of deciding where to locate data. Data allocation strategies are as follows:

With centralised data allocation, the entire database is stored at one site.

With partitioned data allocation, the database is divided into several disjointed parts (fragments) and stored at several sites.

With replicated data allocation, copies of one or more database fragments are stored at several sites.

Data distribution over a computer network is achieved through data partition, data replication, or a combination of both. Data allocation is closely related to the way a database is divided or fragmented. Most data allocation studies focus on one issue: which data to locate where. Data allocation algorithms take into consideration a variety of factors, including:

Performance and data availability goals

Size, number of rows and number of relations that an entity maintains with other entities

Types of transactions to be applied to the database and the attributes accessed by each of those transactions

Disconnected operation for mobile users

Most algorithms take into account data size and location, and some also include external data such as network topology, network bandwidth and throughput. No optimal or universally accepted algorithm exists yet, and very few algorithms have been implemented to date.

14.12 The CAP theorem

In a 2000 symposium on distributed computing, Dr Eric Brewer stated in his presentation that in any highly distributed data system there are three commonly desirable properties: consistency, availability and partition tolerance. However, it is impossible for a system to provide all three properties at the same time.3 The initials CAP stand for the three desirable properties. Consider these three properties in more detail:

Consistency. In a distributed database, consistency takes a bigger role. All nodes should see the same data at the same time, which means that the replicas should be immediately updated. However, this involves dealing with latency and network partitioning delays, as you learnt in Section 14.10.

Availability. Simply speaking, a request is always fulfilled by the system. No received request is ever lost. If you are buying tickets online, you do not want the system to stop in the middle of the operation. This is a paramount requirement of all Web-centric organisations.

Partition tolerance. The system continues to operate even in the event of a node failure. This is the equivalent of failure transparency in distributed databases (see Section 14.7). The system will fail only if all nodes fail.

Although the CAP theorem focuses on highly distributed Web-based systems, its implications are widespread for all distributed systems, including databases. In Chapter 12, you learnt that there are five database transaction properties: atomicity, consistency, isolation, durability and serialisability (ACIDS). The ACIDS properties ensure that all successful transactions result in a consistent database state, one in which all data operations always return the same results. For centralised and small distributed databases, latency is not an issue. As the business grows and the need for availability increases, database latency becomes a bigger problem. It is more difficult for a highly distributed database to ensure ACIDS transactions without paying a high price in network latency or data contention (delays imposed by concurrent data access).

For example, imagine that you are using Computicket to buy tickets for the Kaizer Chiefs-Orlando Pirates soccer game at the FNB Stadium in Johannesburg. You may spend a few minutes browsing the available tickets and checking the stadium website to see which seats provide the best view. At the same time, other users from all over the world may be doing exactly the same thing. By the time you click the checkout button, the tickets you selected may already have been purchased by someone else! In this case, you will start again and select other tickets until you get the ones you want. The website is designed to work this way on purpose: Computicket prefers the small probability of having a few customers restart their transactions to having thousands of customers waiting for their Web pages to refresh.

As this example shows, when dealing with highly distributed systems, some companies tend to forfeit the consistency and isolation components of the ACIDS properties to achieve higher availability. This trade-off between consistency and availability has generated a new type of distributed data system in which data are basically available, soft state, eventually consistent (BASE). BASE refers to a data consistency model in which data changes are not immediate but propagate slowly through the system until all replicas are eventually consistent.

In practice, the emergence of NoSQL and NewSQL distributed databases (see Chapter 2) now provides a spectrum of consistency that ranges from highly consistent (ACIDS) to eventually consistent (BASE), as shown in Table 14.7. NoSQL databases provide a highly distributed database with support for eventual consistency (BASE). NewSQL databases attempt to merge the best of relational and NoSQL data models. For example, Google Cloud Spanner provides a highly scalable distributed database with support for ACIDS transactions and high availability.

3 Brewer, Eric A., Towards Robust Distributed Systems, University of California at Berkeley and Inktomi Corporation, presentation at the ACM Symposium on Principles of Distributed Computing, July 2000. This theorem was later proven by Seth Gilbert and Nancy Lynch of MIT in their paper Brewer's Conjecture and the Feasibility of Consistent, Available, Partition-Tolerant Web Services, ACM SIGACT News, vol. 33, Issue 2, pp. 51-59, 2002.
Table 14.7 Distributed database spectrum

DBMS Type         Consistency  Availability  Partition Tolerance  Transaction Model  Trade-off
Centralised DBMS  High         High          N/A                  ACIDS              No distributed data processing
Relational DBMS   High         Relaxed       High                 ACIDS (2PC)        Sacrifices availability to ensure consistency and isolation
NoSQL DDBMS       Relaxed      High          High                 BASE               Sacrifices consistency to ensure availability
NewSQL DDBMS      High         High          Relaxed              ACIDS              Sacrifices partition tolerance to ensure transaction consistency and availability
14.13
Database
Maintaining network
data security in a DDBMS is far
has also to
features
described
users
and roles.
features
to
public, the
In
10,
14.14
will offer via
link
DBMS, as the
will support i.e.
new
style
their
14
of delivering
organisations
own databases
interconnected third
DDMBS
party
cloud
virtualised
provider
with the
an organisation
Consider the following
4
Oracle
that
all users
refer to the
wIthIn
their
to
party
supply
Guide,
users
could
access
by referencing
vendor-specific
over the
the
reference
pricing
manual.
of IT
a service
that
each
level
provides
an alternative
(IT) infrastructure
provider that
services
model for
Cloud computing is a
Web. It
technology
party cloud
a range
often called
to
uses a number
are standardised.
service
it
provides,
agreement.
host
The
of
Each which
can
main benefits
are:
cloud
one IT infrastructure
Administrators
to
own information
own flexible
This is
third
a link
SQL statement:
the clouD
they rely on a third
cloud infrastructure
As the
only
Database
Instead, computers
organisation.
provides
make
'travel';
security,
and resources
provide
will have its
of using
Cost-effectiveness.
organisations,
data
wish to
or software.
and
be negotiated to
do not
To
for
travel.
Databases
applications,
that
Oracle
links.4
Current trends in distributed data systems cannot fail to discuss cloud computing. for
security
authentication
For example, database
underlying
all of the
password
features. through
actual link.
non-authenticated
about
DIstrIbuteD
additional
USING
database
database Process,
authentication
customer
remote
more specific information
DDBMS
when creating the
LINK
a public,
to the
the
than in a centralised
Development
vendors
DATABASE
creates
Typically
database
keyword is used
pointer
more complex
Database
specific
a distributed
PUBLIC
statement
made secure.
Chapter
addition,
PUBLIC
customer
For
be in
access
CREATE This
securIty
provider
is likely
is required,
11g
Release
to
be hosting
which reduces
2 (11.2),
Part
the
Number
services
for
many
cost to the individual
E25494-02.
Available:
https://docs
.oracle.com/cd/E11882_01/nav/portal_4.htm
organisation. Under the negotiated service level agreement, an organisation will also only pay for what it requires.

Latest software. Most third party cloud providers will ensure that their software is always the latest version available to remain competitive.

Scalable architecture. If the data requirements within the organisation expand, it is generally easy to increase the capacity of the underlying cloud database and/or change the data model.

Mobile access. Data and software can be accessed from anywhere, which allows greater flexibility for employees in terms of where they work.
note You
It is
willlearn
clear
For
more about
that
traditional
example,
hardware
a
resources
One solution
Column
is to
for
if
time
use
are for
such
stores depending
replica
addition,
upon
A further
example
distributed to
review
Singh,
has
Learning. that
any
store
a cloud
queries)
within
the
Web Technologies.
environment. and
operating
petrabytes
uses
associated
within a cloud at a given time.5
cloud.
defines
Bigtable
of
Bigtable
Current
data
to
NoSQL
by a row
across
store
as a parse,
map is indexed
its
structured
distributed,
key,
column
and is
often
often
The
CAP
All
to
Rights
P.,
key and a
Reserved. content
does
May not
not
be
(nodes
within
support
synchronous even
Computing and
copied, affect
Impact,
scanned, the
overall
or
duplicated, learning
the
73,
in experience.
whole
do
database can
can
it is
so the
which
queries
be changed
is
a
can be quickly,
to
2012,
replication
Latest pp.
a necessity
Computer
or in Cengage
part.
Due Learning
electronic reserves
cloud
tolerance properties
Abadi
stated
many applications
that
a
require
partition.6
Architectural
Concepts,
World
2011.
Society,
to
that the
only two
Dr Daniel
and
follow
of partition
support
because
Trends
104245,
infrastructures
property
be said In
cloud
when there is no network
Databases:
IEEE
SimpleDB,
1 but
cloud)
an AP system.
Technology,
In
data can be geographically
model
Web services,
So, a cloud as
Amazons
data
system
clients.
data.
relational
scalability,
using
is sacrificed
Cloud
underlying
database
or offline
Thus, the
Traditional
unlimited
essential.
Growing
materially
offer
is
stored
as document-orientated
performance.
to
Engineering Theorems
The for
and accessed
also
solution
is
to
a distributed
servers
or delete
environment.
referred
cannot
Sandhu, Science,
requests. optimised
multiple
update, insert
database
document
are referred
which is
on
high availability.
servers
data is
query,
NoSQL
ultimately
of
exist
each
stores
CouchDB, can
within a cloud
and
Instead,
Document
Apaches
Web service
of individual
of
suppressed
is
operates
Thus, consistency
S.,
Cengage deemed
that
data in tables.
database
As data is stored
availability
T. and
Shim,
2020
data
example,
Google
the ability to
indexed seeks
system
Academy
Copyright
using
CAP theorem
distributed
Editorial
data
with failure High
low latency.
6
for
and format.
which allows
automatically
computing
deal met.
5
in
upon service requirements
manage
systems
Earth.
of a vendor-specific
CAP theorem?
of the
and
Google,
storing
size
same
are offered
data store that
query
Cloud
can
of the
with replication,
data is
its
An example
copies
all users
non-relational
is
operate (through
and
stores:
map where the
move away from
databases. where
the
Google
to
requests
based
store
distributed
sorted
trying
Connectivity
stamp.
Document
used
as
multidimensional
differently
and
to
of servers.
17, Database
This is in contrast to a DDBMS
document
large-scale
all data
dynamically
and
not thousands,
Chapter
when
over
databases
stores
in
problems
control
data is consistent.
NoSQL
applications
persistent
has
are allocated
column
stores
hundreds, data
to ensure
services
will face
typically
resources
include
computing
DDBMS
DDBMS
where hardware
solutions
cloud
45(2),
rights, the
right
some to
third remove
212,
party additional
2012.
content
may content
be
suppressed at
any
time
from if
the
subsequent
eBook rights
and/or restrictions
eChapter(s). require
it.
When a third party manages all your data, how good is the security? Big public cloud providers such as Amazon Web Services (AWS), Microsoft Azure, Google Cloud, IBM, Alibaba and Oracle are set to grow bigger. This growth makes them more of a target for malicious attacks. In 2019, Oracle predicts that the number of security-related events on a daily basis will increase dramatically and that one solution will be to use cloud-based artificial intelligence to defend against these threats.7

14.15 C.J. Date's 12 commandments for distributed databases

No discussion of distributed databases is complete unless it includes C.J. Date's 12 commandments.8 The 12 rules describe a fully distributed database and, although no current DDBMS conforms to all of them, they do constitute a useful target. The 12 rules are as follows:

1 Local site independence. Each local site can act as an independent, autonomous, centralised DBMS. Each site is responsible for security, concurrency control, backup and recovery.

2 Central site independence. No site in the network relies on a central site or any other site. All sites have the same capabilities.

3 Failure independence. The system is not affected by node failures. The system is in continuous operation, even in the case of a node failure or an expansion of the network.

4 Location transparency. The user does not need to know the location of the data in order to retrieve those data.

5 Fragmentation transparency. Data fragmentation is transparent to the user. The user sees only one logical database. The user does not need to know the name of the database fragments in order to retrieve them.

6 Replication transparency. The user sees only one logical database. The DDBMS transparently selects the database fragment to access. To the user, the DDBMS manages all fragments transparently.

7 Distributed query processing. A distributed query may be executed at several different DP sites. Query optimisation is performed transparently by the DDBMS.

8 Distributed transaction processing. A transaction may update data at several different sites. The transaction is transparently executed at several different DP sites.

9 Hardware independence. The system must run on any hardware platform.

10 Operating system independence. The system must run on any operating system software platform.

11 Network independence. The system must run on any network platform.

12 Database independence. The system must support any vendor's database product.

7 Oracle's top 10 Cloud Predictions 2019 [online], available: www.oracle.com/assets/oracle-cloud-predictions-2019-5244106.p, 2019.

8 Date, C.J., Twelve Rules for a Distributed Database, Computer World, 2(23), pp. 77-81, 8 June, 1987.
Summary

A distributed database stores logically related data in two or more physically independent sites connected via a computer network. The database is divided into fragments, which can be horizontal (a set of rows) or vertical (a set of attributes). Each fragment can be allocated to a different network node.

Distributed processing is the division of logical database processing among two or more network nodes. Distributed databases require distributed processing. A distributed database management system (DDBMS) governs the processing and storage of logically related data through interconnected computer systems.

The main components of a DDBMS are the transaction processor (TP) and the data processor (DP). The transaction processor component is the software that resides on each computer node that requests data. The data processor component is the software that resides on each computer that stores and retrieves data.

Current database systems can be classified by the extent to which they support processing and data distribution. Three major categories are used to classify distributed database systems: (1) single-site processing, single-site data (SPSD); (2) multiple-site processing, single-site data (MPSD); and (3) multiple-site processing, multiple-site data (MPMD).

A homogeneous distributed database system integrates only one particular type of DBMS over a computer network. A heterogeneous distributed database system integrates several different types of DBMSs over a computer network.

DDBMS characteristics are best described as a set of transparencies: distribution, transaction, failure, heterogeneity and performance. All transparencies share the common objective of making the distributed database behave as though it were a centralised database system; that is, the end user sees the data as part of a single logical centralised database and is unaware of the system's complexities.

A transaction is formed by one or more database requests. An undistributed transaction updates or requests data from a single site. A distributed transaction can update or request data from multiple sites.

Distributed concurrency control is required in a network of distributed databases. A two-phase COMMIT protocol is used to ensure that all parts of a transaction are completed.

A distributed DBMS evaluates every data request to find the optimum access path in a distributed database. The DDBMS must optimise the query to reduce access, communications and CPU costs associated with the query.

The design of a distributed database must consider the fragmentation and replication of data. The designer must also decide how to allocate each fragment or replica to obtain better overall response time and to ensure data availability to the end user.

A database can be replicated over several different sites on a computer network. The replication of the database fragments has the objective of improving data availability, thus decreasing access time. A database can be partially, fully, or not replicated. Data allocation strategies are designed to determine the location of the database fragments or replicas.

The CAP theorem states that a highly distributed data system has some desirable properties of consistency, availability and partition tolerance. However, a system can only provide two of these properties at a time.
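
The two fragment types named at the start of this summary can be illustrated with a small sketch. It is only a minimal illustration under stated assumptions: the CUSTOMER rows, the column names and the helper functions `horizontal_fragment`, `vertical_fragment` and `rejoin` are hypothetical and are not part of any DDBMS product.

```python
# Hypothetical CUSTOMER rows; names and values are illustrative only.
CUSTOMER = [
    {"cus_num": 10, "cus_name": "Sinex", "cus_region": "UK", "cus_balance": 120.0},
    {"cus_num": 11, "cus_name": "Martin", "cus_region": "FR", "cus_balance": 0.0},
    {"cus_num": 12, "cus_name": "Mann", "cus_region": "UK", "cus_balance": 45.5},
]


def horizontal_fragment(rows, predicate):
    """Keep whole rows that satisfy the predicate (a subset of rows)."""
    return [dict(row) for row in rows if predicate(row)]


def vertical_fragment(rows, key, attributes):
    """Keep the key plus the chosen attributes for every row (a subset of columns)."""
    return [{name: row[name] for name in (key, *attributes)} for row in rows]


def rejoin(left, right, key):
    """Rebuild full rows from two vertical fragments by matching key values."""
    right_by_key = {row[key]: row for row in right}
    return [{**row, **right_by_key[row[key]]} for row in left]


# Horizontal fragments: one per region; each could be allocated to its own node.
uk_rows = horizontal_fragment(CUSTOMER, lambda r: r["cus_region"] == "UK")
fr_rows = horizontal_fragment(CUSTOMER, lambda r: r["cus_region"] == "FR")

# Vertical fragments: both repeat the key so the table can be reassembled.
contact = vertical_fragment(CUSTOMER, "cus_num", ["cus_name", "cus_region"])
billing = vertical_fragment(CUSTOMER, "cus_num", ["cus_balance"])
```

In this sketch the horizontal fragments together contain every original row exactly once, and `rejoin(contact, billing, "cus_num")` rebuilds the original rows, which is why each vertical fragment repeats the key attribute.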
Key terms

application processor (AP)
basically available, soft state, eventually consistent (BASE)
centralised data allocation
client/server architecture
cloud computing
column stores
coordinator
data allocation
data fragmentation
data manager (DM)
data processor (DP)
data replication
database fragments
distributed data catalogue (DDC)
distributed data dictionary (DDD)
distributed database
distributed database management system (DDBMS)
distributed global schema
distributed processing
distributed request
distributed transaction
distribution transparency
document stores
DO-UNDO-REDO protocol
failure transparency
fragmentation transparency
fully heterogeneous DDBMS
fully replicated database
heterogeneity transparency
heterogeneous DDBMS
homogeneous DDBMS
horizontal fragmentation
local mapping transparency
location transparency
mixed fragmentation
multiple-site processing, multiple-site data (MPMD)
multiple-site processing, single-site data (MPSD)
mutual consistency rule
network latency
network partitioning
NewSQL
NoSQL
partially replicated database
partition key
partitioned data allocation
performance transparency
remote request
remote transaction
replica transparency
replicated data allocation
service level agreement
single-site processing, single-site data (SPSD)
subordinates
transaction manager (TM)
transaction processor (TP)
transaction transparency
two-phase commit protocol
unique fragment
unreplicated database
vertical fragmentation
write-ahead protocol

Further reading

Jain, A., The Cloud DBA-Oracle: Managing Oracle Database in the Cloud. Apress, 2017.

Online content

Answers to selected Review Questions and Problems for this chapter are contained in the online platform for this book.

Review questions

1 Describe the evolution from centralised DBMSs to distributed DBMSs.
2 List and discuss some of the factors that influenced the evolution of the DDBMS.
3 What are the advantages of the DDBMS?
4 What are the disadvantages of the DDBMS?
5 Explain the difference between a distributed database and distributed processing.
6 What is a fully distributed database management system?
7 What are the components of a DDBMS?
8 Explain the transparency features of a DDBMS.
9 Define and explain the different types of distribution transparency.
10 Describe the different types of database requests and transactions.
11 Explain the need for the two-phase commit protocol. Then describe the two phases.
12 What is the objective of query optimisation functions?
13 To which transparency feature are the query optimisation functions related?
14 What are the different types of query optimisation algorithms?
15 Describe the three data fragmentation strategies. Give some examples.
16 What is data replication, and what are the three replication strategies?
17 How does a BASE system differ from a traditional distributed database system?
18 What are the three properties of the CAP theorem?
19 What are the main benefits to an organisation of using a cloud infrastructure?
Problems

The following problem is based on the DDBMS scenario in Figure P14.1.

FIGURE P14.1 The DDBMS scenario for Problem 1

TABLES      FRAGMENTS   LOCATION
CUSTOMER    N/A         A
PRODUCT     PROD_A      A
            PROD_B      B
INVOICE     N/A         B
INV_LINE    N/A         B

[Diagram: the tables and fragments listed above shown at their sites, together with Site C.]

1 Specify the minimum type(s) of operation(s) the database must support (remote request, remote transaction, distributed transaction, or distributed request) to perform the following operations:

At Site C
a SELECT * FROM CUSTOMER;

b SELECT * FROM INVOICE WHERE INV_TOT > 1000;

c SELECT * FROM PRODUCT WHERE PROD_QOH < 10;

d BEGIN WORK;
  UPDATE CUSTOMER SET CUS_BAL = CUS_BAL + 100 WHERE CUS_NUM = '10936';
  INSERT INTO INVOICE(INV_NUM, CUS_NUM, INV_DATE, INV_TOTAL) VALUES ('986391', '10936', '15-FEB-2019', 100);
  INSERT INTO LINE(INV_NUM, PROD_NUM, LINE_PRICE) VALUES('986391', '1023', 100);
  UPDATE PRODUCT SET PROD_QOH = PROD_QOH - 1 WHERE PROD_NUM = '1023';
  COMMIT WORK;

e BEGIN WORK;
  INSERT INTO CUSTOMER(CUS_NUM, CUS_NAME, CUS_ADDRESS, CUS_BAL) VALUES ('34210', 'Victor Ephanor', '143 Main St.', 0.00);
  INSERT INTO INVOICE(INV_NUM, CUS_NUM, INV_DATE, INV_TOTAL) VALUES ('986434', '34210', '10-AUG-2018', 2.00);
  COMMIT WORK;
At Site A

f SELECT CUS_NUM, CUS_NAME, INV_TOTAL FROM CUSTOMER, INVOICE WHERE CUSTOMER.CUS_NUM = INVOICE.CUS_NUM;

g SELECT * FROM INVOICE WHERE INV_TOTAL > 1000;

h SELECT * FROM PRODUCT WHERE PROD_QOH < 10;
At Site B

i SELECT * FROM CUSTOMER;

j SELECT CUS_NAME, INV_TOTAL FROM CUSTOMER, INVOICE WHERE INV_TOTAL > 1000 AND CUSTOMER.CUS_NUM = INVOICE.CUS_NUM;

k SELECT * FROM PRODUCT WHERE PROD_QOH < 10;

2 The following data structure and constraints exist for a magazine publishing company:

a The company publishes one regional magazine in each country: France (FR), South Africa (SA), the Netherlands (NL) and the United Kingdom (UK).
b The company has 300 000 customers (subscribers) distributed throughout the four countries listed in Part a.
c On the first of each month, an annual subscription INVOICE is printed and sent to each customer whose subscription is due for renewal. The INVOICE entity contains a REGION attribute to indicate the country (FR, SA, NL, UK) in which the customer resides:

CUSTOMER (CUS_NUM, CUS_NAME, CUS_ADDRESS, CUS_CITY, CUS_REGION, CUS_POSTCODE, CUS_SUBSDATE)
INVOICE (INV_NUM, INV_REGION, CUS_NUM, INV_DATE, INV_TOTAL)

The company's management is aware of the problems associated with centralised management and has decided to decentralise management of the subscriptions into the company's four regional subsidiaries. Each subscription site will handle its own customer and invoice data. The management at company headquarters, however, will have access to customer and invoice data to generate annual reports and to issue ad hoc queries such as:

List all current customers by region.
List all new customers by region.
Report all invoices by customer and by region.

Given those requirements, how must you partition the database?

3 Given the scenario and the requirements in Problem 2, answer the following questions:

a Which recommendations will you make regarding the type and characteristics of the required database system?
b What type of data fragmentation is needed for each table?
c Which criteria must be used to partition each database?
d Design the database fragments. Show an example with node names, location, fragment names, attribute names and demonstration data.
e What type of distributed database operations must be supported at each remote site?
f What type of distributed database operations must be supported at the headquarters site?
Chapter 15 Databases for Business Intelligence

In this chapter, you will learn:

How business intelligence provides a comprehensive business decision support framework
About business intelligence architecture, its evolution and reporting styles
About the data warehouse life cycle
How to prepare the data for the data warehouse using the Extraction, Transformation and Loading Process
How to develop star and snowflake schemas
About the role and functions of data analytics and data mining for decision-making purposes
About the characteristics and capabilities of online analytical processing (OLAP)
How SQL analytic functions are used to support data analytics
About data visualisation and how it supports business intelligence
Preview

Business intelligence (BI) is the collection of best practices and software tools developed to support business decision making in this age of globalisation, emerging markets, rapid change and increasing regulation. The complexity and range of information required to support business decisions has increased, and operational database structures were unable to support all of these requirements. Therefore, a new data storage facility, called a data warehouse, was developed. The data warehouse extracts its data from operational databases as well as from external sources, providing a more comprehensive data pool.

Additionally, new ways to analyse and present decision support data were developed. Online analytical processing (OLAP) provides advanced data analysis and visualisation tools, including multidimensional data analysis. This chapter explores the main concepts and components of business intelligence and decision support systems that gather, generate and present information for business decision makers, focusing especially on the use of data warehouses, data analytics and data visualisation.
15.1 The Need for Data Analysis
Organisations tend to grow and prosper as they gain a better understanding of their environment. Typically, business managers must be able to track daily transactions to evaluate how the business is performing. By tapping into the operational database, management can develop strategies to meet organisational goals. In addition, data analysis can provide information about short-term tactical evaluations and strategies such as: Are our sales promotions working? What market percentage are we controlling? Are we attracting new customers? Tactical and strategic decisions are also shaped by constant pressure from external and internal forces, including globalisation, the cultural and legal environment, and, perhaps most importantly, technology.

Given the many and varied competitive pressures, managers are always looking for a competitive advantage through product development and maintenance, service, market positioning, sales promotion and so on. Thanks to the internet, customers are more informed about the products they want and how much they are willing to pay. Technological advances allow customers to place orders from their smartphones while they commute to work. Decision makers can no longer wait a couple of days for a report to be generated; quick decisions must be made for the business to remain competitive. Every day, advertisements offer, for example, instant price matching, and the question is, How can a company survive on lower margins and still make a profit? The key is having the right data at the right time to support the decision-making process.

Different managerial levels have different decision support needs. For example, transaction-processing systems, based on operational databases, are tailored to serve the information needs of people who deal with short-term inventory, accounts payable and purchasing. Middle-level managers, general managers, vice presidents and presidents focus on strategic and tactical decision making. Those managers require summarised information designed to help them make decisions in a complex business environment.

Companies and software vendors addressed these multilevel decision support needs by creating autonomous applications for particular groups of users, for example, those in finance, customer relationship management, etc. Applications were also developed for different industries such as education, healthcare and finance. The approach started to work well, but changes in the way in which business was conducted, e.g. globalisation, expanding markets, mergers and acquisitions, increased regulation and new technologies, called for new ways of integrating and managing decision support across levels, sectors and geographical locations. This more comprehensive and integrated decision support framework became known as business intelligence.
15.2 Business Intelligence
Business intelligence (BI)1 is a term that describes a comprehensive, cohesive, and integrated set of tools and processes used to capture, collect, integrate, store and analyse data with the purpose of generating and presenting information to support business decision making. This intelligence is based on learning and understanding the facts about the business environment. BI is a framework that allows a business to transform data into information, information into knowledge, and knowledge into wisdom. BI has the potential to positively affect a company's culture by creating continuous business performance improvement through active decision support at all levels in an organisation. This business insight empowers users to make sound decisions based on the accumulated knowledge of the business.

1 In 1989, while working at Gartner Inc., Howard Dresner popularised BI as an umbrella term to describe a set of concepts and methods to improve business decision making by using fact-based support systems (www.computerworld.com/action/article.do?command=viewArticleBasic&articleId=266298).

BI's initial adopters were high-volume industries such as financial services, insurance and healthcare companies. As BI technology evolved, its usage spread to other industries such as telecommunications, retail/merchandising, manufacturing, media, government, and even education. Table 15.1 lists some companies that have implemented BI tools and shows how the tools have benefited the companies. You will learn about these tools later in the chapter.

TABLE 15.1 Solving business problems and adding value with BI tools

Company: CiCi's Enterprises. Eighth-largest pizza chain in the US; operates 650 pizza restaurants in 30 states. Source: Cognos Corp., www.cognos.com
Problem: Information access was cumbersome and time-consuming. Needed to increase accuracy in the creation of marketing budgets. Needed an easy, reliable and efficient way to access daily data.
Benefit: Provided accurate, timely budgets in less time. Provided analysts with access to data for decision-making purposes. Received in-depth view of product performance by store to reduce waste and increase profits.

Company: Nasdaq. Largest US electronic stock market trading organisation. Source: Oracle, www.oracle.com
Problem: Inability to provide real-time, ad hoc query and standard reporting for executives, business analysts and other users. Excessive storage costs for many terabytes of data.
Benefit: Reduced storage costs by moving to a multitier storage solution. Implemented new data warehouse centre with support for ad hoc query and reporting, and near real-time data access for end users.

Company: Pfizer. Global pharmaceutical company. Source: Oracle, www.oracle.com
Problem: Needed a way to control costs and adjust to tougher market conditions, international competition and increasing government regulations. Needed better analytical capabilities and flexible decision-making framework.
Benefit: Ability to get and integrate financial data from multiple sources in a reliable way. Streamlined, standards-based financial analysis to improve forecasting process. Faster and smarter decision making for business strategy formulation.

Company: Swisscom. Switzerland's leading telecommunications provider. Source: Microsoft, www.microsoft.com
Problem: Needed a tool to help employees monitor service-level compliance. Had a time-consuming process to generate performance reports. Needed a way to integrate data from 200 different systems.
Benefit: Ability to monitor performance using dashboard technology. Quick and easy access to real-time performance data. Managers have closer and better control over costs.
Implementing BI in an organisation involves capturing not only internal and external business data, but also the metadata, or knowledge about the data. In practice, BI is a complex proposition that requires a deep understanding and alignment of the business processes, business data and information needs of users at all levels in an organisation. (See Appendix L, Data Warehouse Implementation Factors, available on the online platform for this book.)

BI is not a product by itself, but a framework of concepts, practices, tools and technologies that help a business better understand its core capabilities, provide snapshots of the company situation and identify key opportunities to create competitive advantage. In general, BI provides a framework for:

1 Collecting and storing operational data
2 Aggregating the operational data into decision support data
3 Analysing decision support data to generate information
4 Presenting such information to the end user to support business decisions
5 Making business decisions, which in turn generate more data that are collected, stored, and so on (restarting the process)
6 Monitoring results to evaluate outcomes of the business decisions, which again provides more data to be collected, stored, and so on
7 Predicting future behaviours and outcomes with a high degree of accuracy.

The seven preceding points represent a system-wide view of the flow of data, processes and outcomes within the BI framework. In practice, the first point, collecting and storing operational data, does not fall into the realm of a BI system per se; rather, it is the function of an operational system. However, the BI system will use the operational data as input material from which information will be derived. The rest of the processes and outcomes explained in the preceding points are orientated towards generating knowledge, and they are the focus of the BI system. In the following section, you will learn about the basic BI architecture.

15.2.1 Business Intelligence Architecture

BI covers a range of technologies and applications to manage the entire data life cycle from acquisition to storage, transformation, integration, presentation, analysis, monitoring and archiving. BI functionality ranges from simple data gathering and transformation to very complex data analysis and presentation. BI architecture ranges from highly integrated single-vendor systems to loosely integrated, multivendor environments. However, some common functions are expected in most BI implementations.

Like any critical business IT infrastructure, the BI architecture is composed of data, people, processes, technology and the management of such components. Figure 15.1 depicts how all these components fit together within the BI framework.
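
The first four points of the framework list in Section 15.2 (collect, aggregate, analyse, present) can be traced through a very small sketch. It is only an illustration under stated assumptions: the sales rows, the field names and the three functions are hypothetical, and a real BI system would read from operational databases and a data store rather than from an in-memory list.

```python
from collections import defaultdict

# Point 1 (normally done by the operational system): hypothetical sales rows.
operational_sales = [
    {"region": "UK", "product": "P1", "units": 3, "price": 10.0},
    {"region": "UK", "product": "P2", "units": 1, "price": 20.0},
    {"region": "FR", "product": "P1", "units": 5, "price": 10.0},
]


def aggregate_by_region(rows):
    """Point 2: roll detailed operational rows up into decision support totals."""
    totals = defaultdict(float)
    for row in rows:
        totals[row["region"]] += row["units"] * row["price"]
    return dict(totals)


def analyse(totals):
    """Point 3: derive information, here each region's share of total revenue."""
    grand_total = sum(totals.values())
    if grand_total == 0:
        return {region: 0.0 for region in totals}
    return {region: value / grand_total for region, value in totals.items()}


def present(shares):
    """Point 4: format the information for the end user."""
    return "\n".join(f"{region}: {share:.0%}" for region, share in sorted(shares.items()))


revenue_by_region = aggregate_by_region(operational_sales)
revenue_share = analyse(revenue_by_region)
report = present(revenue_share)
```

Points 5 to 7 (deciding, monitoring and predicting) feed new data back into the same flow, which is why the sketch keeps each stage as a separate function that could be rerun on fresh operational rows.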
FIGURE 15.1 Business intelligence framework

[Figure: labels include People, Processes, Management, Governance, External data, Operational data, ETL (extraction, transformation and loading), Data store (data warehouse, data mart), Query and reporting, Data analytics, Data visualisation, Monitoring and alerting.]

Source: Course Technology/Cengage Learning

The general BI framework depicted in Figure 15.1 has six basic components that encompass the functionality required in most current-generation BI systems. You will learn more about these components later in this chapter. The components are briefly described in Table 15.2.

TABLE 15.2 Basic BI architectural components

Component: ETL tools
Description: Data extraction, transformation and loading (ETL) tools collect, filter, integrate and aggregate internal and external data to be saved into a data store optimised for decision support. Internal data are generated by the company during its day-to-day operations, such as product sales history, invoicing and payments. The external data sources provide data that cannot be found within the company but are relevant to the business, such as stock prices, market indicators, marketing information (such as demographics) and competitors' data. Such data are generally located in external databases provided by industry groups or companies that market the data.

Component: Data store
Description: The data store is optimised for decision support and is generally represented by a data warehouse or a data mart. The data are stored in structures that are optimised for data analysis and query speed.

Component: Query and reporting
Description: This component performs data selection and retrieval, and is used by the data analyst to create queries that access the database and create the required reports. Depending on the implementation, the query and reporting tool accesses the operational database, or more commonly, the data store.

Component: Data visualisation
Description: This component presents data to the end user in a variety of meaningful and innovative ways. This tool helps the end user to select the most appropriate presentation format, such as summary reports, maps, pie or bar graphs, mixed graphs, or dashboards.

Component: Data monitoring and alerting
Description: This component allows real-time monitoring of business activities. The BI system will present the concise information in a single integrated view for the data analyst. This integrated view could include specific metrics about the system performance or activities, such as number of orders placed in the past four hours, number of customer complaints by product by month, and total revenue by region. Alerts can be placed on a given metric; once the value of a metric goes below or above a certain baseline, the system will perform a given action, such as emailing shop floor managers, presenting visual alerts or starting an application.

Component: Data analytics
Description: This component performs data analysis and data-mining tasks using the data in the data store. This tool advises the user on which data analysis tool to select and how to build a reliable business data model. Business models are generated by special algorithms that identify and enhance the understanding of business situations and problems. Data analysis can be either explanatory or predictive. Explanatory analysis uses the existing data in the data store to discover relationships and their types, and predictive analysis creates statistical models of the data that allow predictions of future values and events.
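
The data monitoring and alerting component in Table 15.2 acts when a metric moves below or above a baseline. The following is a minimal sketch of that rule, not the behaviour of any particular BI product: the `Alert` class, the `monitor` function, the metric names and the baseline values are all assumptions made for illustration, and the action is a plain callback standing in for an email or a visual alert.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Alert:
    """A baseline rule placed on one metric (names here are hypothetical)."""
    metric: str
    low: Optional[float] = None   # fire when the value falls below this baseline
    high: Optional[float] = None  # fire when the value rises above this baseline

    def check(self, value):
        """Return a message if the value breaches a baseline, otherwise None."""
        if self.low is not None and value < self.low:
            return f"{self.metric} = {value} is below baseline {self.low}"
        if self.high is not None and value > self.high:
            return f"{self.metric} = {value} is above baseline {self.high}"
        return None


def monitor(alerts, readings, action):
    """Check every alert against the latest readings and run the action on a breach."""
    fired = []
    for alert in alerts:
        if alert.metric in readings:
            message = alert.check(readings[alert.metric])
            if message is not None:
                action(message)  # e.g. send an email or raise a dashboard alert
                fired.append(message)
    return fired


alerts = [
    Alert("orders_last_4h", low=50),
    Alert("complaints_this_month", high=20),
]
readings = {"orders_last_4h": 42, "complaints_this_month": 7}

sent = []  # stands in for an outbox; a real system would notify someone
fired = monitor(alerts, readings, sent.append)
```

Passing the action in as a callback keeps the rule separate from the response, so the same check could email a manager, raise a visual alert or start an application.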
Each BI component shown in Table 15.2 has generated a fast-growing market for specialised tools. Thanks to technological advancements, the components can interact with other components to form a truly open architecture. As a matter of fact, you can integrate multiple tools from different vendors into a single BI framework. Table 15.3 shows a sample of common BI tools and vendors.

TABLE 15.3 Sample of business intelligence tools

Tool: Dashboards and business activity monitoring
Description: Dashboards use Web-based technologies to present key business performance indicators or information in a single integrated view, generally using graphics that are clear, concise and easy to understand.
Sample vendors: Salesforce, IBM/Cognos, BusinessObjects, Information Builders, iDashboards, Tableau

Tool: Portals
Description: Portals provide a unified, single point of entry for information distribution. Portals are a Web-based technology that use a Web browser to integrate data from multiple sources into a single Web page. Many different types of BI functionality can be accessed through a portal.
Sample vendors: Oracle Portal, Actuate, Microsoft, SAP

Tool: Data analysis and reporting tools
Description: These advanced tools are used to query multiple and diverse data sources to create integrated reports.
Sample vendors: Microsoft Reporting Services, MicroStrategy, SAS Web Report Studio

Tool: Data-mining tools
Description: These tools provide advanced statistical analysis to uncover problems and opportunities hidden within business data.
Sample vendors: SAP, Teradata, MicroStrategy, Hadoop

Tool: Data warehouses (DW)
Description: The data warehouse is the foundation of a BI infrastructure. Data are captured from the production system and placed in the DW on a near real-time basis. BI provides company-wide integration of data and the capability to respond to business issues in a timely manner.
Sample vendors: Amazon Redshift, Oracle Exadata, IBM DB2, Azure

Tool: OLAP tools
Description: Online analytical processing provides multidimensional data analysis.
Sample vendors: IBM/Cognos, MicroStrategy, ioCube, Apache Kylin

Tool: Data visualisation
Description: These tools provide advanced visual analysis and techniques to enhance understanding and create additional insight into business data and its true meaning.
Sample vendors: Dundas, Tableau, QlikView, Actuate
As depicted in Figure 15.1, BIintegrates people and processes using technology to add value to the business. Such value is derived from how end users apply such information in their daily activities, and particularly
in their
daily business
decision
making.
The focus of traditional information systems was on operational automation and reporting; in contrast, BI tools focus on the strategic and tactical use of information. To achieve this goal, BI recognises that technology alone is not enough. Therefore, BI uses an arrangement of best management practices to manage data as a corporate asset. One of the most recent developments in this
area is the
use of
master
data
management
techniques.
Master data management (MDM) is a collection of concepts, techniques and processes for the proper identification, definition and management of data elements within an organisation. MDM's main goal is to provide a comprehensive and consistent definition of all data within an organisation. MDM ensures that all company resources (people, procedures and IT systems) that work with data have uniform and consistent views of the company's data. An added benefit of this meticulous approach to data management and decision making is that it provides a framework for business governance. Governance is a method or process of government. In this case, BI provides a method for controlling and monitoring business health and for consistent decision making. Furthermore, having such governance creates accountability for business decisions.
In the present age of business flux, accountability is increasingly important. Had governance been as pivotal to business operations in previous years, crises precipitated by Enron, WorldCom, Arthur Andersen and the 2008 financial meltdown might have been avoided. Monitoring a business's health is crucial to understanding where the company is and where it is headed. To do this, BI makes extensive use of a special type of metrics known as key performance indicators.
Key performance indicators (KPIs) are quantifiable numeric or scale-based measurements that assess the company's effectiveness or success in reaching its strategic and operational goals. Many different KPIs are used by different industries. Some examples of KPIs are:
General. Year-to-year measurements of profit by line of business, same-store sales, product turnovers, product recalls, sales by promotion and sales by employee.
Finance. Earnings per share, profit margin, revenue per employee, percentage of sales to account receivable and assets to sales.
Human resources. Applicants to job openings, employee turnover and employee longevity.
Education. Graduation rates, number of incoming first-years, student retention rates, publication rates and teaching evaluation scores.
KPIs are determined after the main strategic, tactical and operational goals are defined for a business. To tie the KPI to the strategic master plan of an organisation, a KPI is compared to a desired goal within a specific time frame. For example, if you are in an academic environment, you might be interested in ways to measure student satisfaction or retention. In this case, a sample goal would be to increase the final exam grades of graduating high school seniors by autumn 2022. Another sample KPI would be to increase the returning student rate from first year to second year from 60 per cent to 75 per cent by 2022. In this case, such performance would be measured and monitored on a year-to-year basis, and plans to achieve such goals would be set in place.
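The idea of comparing a KPI to a desired goal within a specific time frame can be sketched in a few lines of Python. This is only an illustration: the Kpi class, the end-of-year deadline and the yearly measurements are assumptions made for the example, and only the 60 and 75 per cent figures come from the text.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Kpi:
    """A KPI tied to a desired goal within a specific time frame (hypothetical helper)."""
    name: str
    baseline: float   # value when the goal was set
    target: float     # desired value
    deadline: date    # date by which the target should be reached

    def progress(self, current: float) -> float:
        """Fraction of the baseline-to-target gap closed so far."""
        gap = self.target - self.baseline
        if gap == 0:
            return 1.0
        return (current - self.baseline) / gap

    def is_met(self, current: float) -> bool:
        """True once the current value reaches the target (for an increasing goal)."""
        return current >= self.target


# Goal from the text: raise the returning-student rate from 60 to 75 per cent by 2022.
# The exact deadline date is an assumption.
retention = Kpi("returning student rate (%)", baseline=60.0, target=75.0,
                deadline=date(2022, 12, 31))

# Hypothetical year-to-year measurements, invented for illustration only.
measurements = {2020: 63.0, 2021: 69.0}
for year, rate in measurements.items():
    print(year, f"{retention.progress(rate):.0%} of the way to target")
```

Monitoring on a year-to-year basis then amounts to recording one measurement per year and checking how much of the gap remains before the deadline.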
Although BI has an unquestionably important role in modern business operations, the manager must initiate the decision support process by asking the appropriate questions. The BI environment exists to support the manager; it does not replace the management function. If the manager fails to ask the appropriate questions, problems will not be identified and solved, and opportunities will be missed. In spite of the very powerful BI presence, the human component is still at the centre of business technology.
The main BI architectural components were illustrated in Figure 15.1 and further explained in Tables 15.2 and 15.3. However, the heart of the BI system is its advanced information generation and decision support capabilities. A BI system's advanced decision support functions come to life via its intuitive and informational user interface, and particularly its reporting capabilities. A modern BI system provides three distinctive reporting styles:
Advanced reporting. A BI system presents insightful information about the organisation in a variety of presentation formats. Furthermore, the reports provide interactive features that allow the end user to study the data from multiple points of view, from highly summarised to very detailed data. The reports present key actionable information used to support decision making.
Monitoring and alerting. After a decision has been made, the BI system offers ways to monitor the decision's outcome. The BI system provides the end user with ways to define metrics and other key performance indicators to evaluate different aspects of an organisation. In addition, exceptions and alerts can be set to warn managers promptly about deviations or problem areas.
Advanced data analytics. A BI system provides tools to help the end user discover relationships, patterns and trends hidden within the organisation's data. These tools are used to create two types of data analysis: explanatory and predictive. Explanatory analysis provides ways to discover relationships, trends and patterns among data, while predictive analysis provides the end user with ways to create models that predict future outcomes.
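As a rough illustration of the monitoring and alerting style, the sketch below checks a set of metric readings against exception rules and collects warnings. The AlertRule type, the metric names and the thresholds are hypothetical and were chosen only to show the mechanism; a real BI product would expose its own way of defining such rules.

```python
from typing import Callable, NamedTuple


class AlertRule(NamedTuple):
    metric: str
    is_exception: Callable[[float], bool]  # returns True when the value needs attention
    message: str


def check_alerts(metrics: dict[str, float], rules: list[AlertRule]) -> list[str]:
    """Return the messages of every rule whose metric is present and out of bounds."""
    alerts = []
    for rule in rules:
        value = metrics.get(rule.metric)
        if value is not None and rule.is_exception(value):
            alerts.append(f"{rule.metric}={value}: {rule.message}")
    return alerts


# Hypothetical thresholds and readings, chosen only to illustrate the idea.
rules = [
    AlertRule("employee_turnover_pct", lambda v: v > 15.0, "turnover above 15%"),
    AlertRule("profit_margin_pct", lambda v: v < 8.0, "margin below 8%"),
]
todays_metrics = {"employee_turnover_pct": 17.5, "profit_margin_pct": 9.2}

alerts = check_alerts(todays_metrics, rules)
for line in alerts:
    print("ALERT:", line)
```

Keeping each rule as data (a metric name plus a predicate) means managers can add or tighten a threshold without changing the checking code.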
Understanding the architectural components of a BI framework is the first step in properly implementing BI in an organisation. A good BI infrastructure promises many benefits to an organisation, as outlined in the next section.
15.2.2 Business Intelligence Benefits
As you have learnt in previous sections, a properly implemented BI architecture could provide a framework for continuous performance improvements and business decision making. Improved decision making is the main goal of BI, but BI provides other benefits:
Integrating architecture. Like any other IT project, BI has the potential of becoming the integrating umbrella for a disparate mix of IT systems within an organisation. This architecture could support all types of company-generated data from operational to executive, as well as diverse hardware such as mainframes, servers, desktops, laptops and mobile devices.
Common user interface for data reporting and analysis. BI front ends can provide up-to-the-minute consolidated information using a common interface for all company users. IT departments no longer have to provide multiple training options for diverse interfaces. End users benefit from similar or common interfaces in different devices that use multiple clever and insightful presentation formats.
Common data repository fosters single version of company data. In the past, multiple IT systems supported different aspects of an organisation's operations. Such systems collected and stored data in separate data stores. Keeping the data synchronised and up-to-date has always been difficult. BI provides a framework to integrate such data under a common environment and present a single version of the data.
Improved organisational performance. BI can provide competitive advantages in many different areas, from customer support to manufacturing processes. Such advantages can be reflected in added efficiency, reduced waste, increased sales, reduced employee and customer turnover, and most importantly, an increased bottom line for the business.
Achieving all these benefits takes a lot of human, financial and technological resources, not to mention time. BI benefits are not achieved overnight, but are the result of a focused company-wide effort that could take a long time. As a matter of fact, as you will learn in the next section, the BI field has evolved over a long period of time itself.
15.2.3 Business Intelligence Evolution
Providing useful information to end users has been a priority of IT systems since mainframe computing became an integral part of corporations. Business decision support has evolved over many decades. Following computer technology advances, business intelligence started with centralised reporting systems and evolved into today's highly integrated BI environments. Table 15.4 summarises the evolution of BI systems.
Chapter
taBle
15.4
Business intelligence
15
Data
Traditional
Operational
mainframe-based online
Source
data
processing (OLTP)
Intelligence
end
end
Process
Data
None
None
Reports
transaction
Business
extraction/
integration Type
for
759
evolution
Data
System
Databases
read
Store
Very
Temporary files
data directly from
for reporting
data
Presentation
Tool
Tool
basic
Very
Predefined
and summarise
operational
User
Query
used
purposes
User
basic
Menu-driven,
reporting
predefined
formats
reports,
Basic sorting, totalling,
text
numbers
and
only
and
averaging Managerial
infor-mation Operational
system
Basic
data
extraction
aggregation
(MIS)
and
Lightly
data in
filter and summarise operational into
Operational
departmental
data
decision system
data
as above,
in
ad hoc columnar
hoc reporting
report
definitions
SQL
store
and
First DSS
process
populates
External
Same
addition to some
to some ad
using
Data extraction
support
RDBMS
data
integration
(DSS)
as above,
in addition
intermediate
data
First-generation
aggre-gated Same
Read,
DSS
Query tool
database
data
with
Spreadsheet
some analyti-cal
generation
capabilities
store
Usually
and
Run periodically
RDBMS
reports
style
Advanced presentation
predefined
tools
with and
plot-ting graphics
capabilities First-generation
Operational
Advanced
data
Data
BI
data
extraction
and
warehouse
integration External
data
diverse
data
Same
BI Online
as
presentation
Optimised for query
classifications,
purposes
scheduling
Same
and
to
multidimensional
technology
aggregations,
conflict
Same as above, in addition
RDBMS
Access
sources, filters,
Second-generation
Same as above
tools
with drill-down capabilities
Star schema
resolution
model
as above
Data
ware-house
above
Adds
support
Same
analytical
stores
processing
in
(OLAP)
data
cubes
and
data
MDBMS
as above,
but uses cubes
for end-user-based
multidimen-sional matrixes;
analytics
with
limited
multiple
by terms
of
cube size
dimensions
Dashboards Scorecards
15
Portals Third-generation
Same
Mobile BI
Same
above but
Cloud-based Big
as
includes
Data
as above
Same
Cloud-based
above
social
Cloud-based
media, IoT and
as
Hadoop
machine-generated
and
Advanced
Mobile
analytics
iPhone,
Flexible
Pixel,
interactions
NoSQL
Galaxy
devices:
iPad,
Note
via data
databases
visualisation
data
Using Table 15.4 as a guide, you can trace business intelligence from the mainframe environment to the desktop and then to the more current, cloud-based, mobile BI environments. (Chapter 17, Database Connectivity and Web Technologies, provides a detailed discussion of cloud-based systems.)
The precursor of the modern BI environment was the first-generation decision support system. A decision support system (DSS) is an arrangement of computerised tools used to assist managerial decision making. A DSS typically has a much narrower focus and reach than a BI solution. At first, decision support systems were the realm of a few selected managers in an organisation. Over time, and with the introduction of the desktop computer, decision support systems migrated to more agile platforms, such as minicomputers, high-end servers, commodity servers, appliances and cloud-based offerings. This evolution effectively changed the reach of decision support systems; BI is no longer limited to a small group of top-level managers with training in statistical modelling. Instead, BI is now available to all users in an organisation, from the line managers to the shop floor to mobile agents in the field.
You can also use Table 15.4 to track the evolution of information dissemination styles used in business intelligence:
Starting in the late 1970s, the need for information distribution was filled by centralised reports running on mainframes, minicomputers or even central server environments. Such reports were predefined and took considerable time to process.
With the introduction of desktop computers in the 1980s, a new style of information distribution, the spreadsheet, emerged as the dominant format for decision support systems. In this environment, managers downloaded information from centralised data stores and manipulated the data in desktop spreadsheets.
As the use of spreadsheets multiplied, IT departments tried to manage the flow of data in a more formal way using enterprise reporting systems. These systems were developed in the early 1990s and basically integrated all data into an IT umbrella that started with the first-generation DSS. The systems still used spreadsheet-like features with which end users were familiar.
Once DSSs were established, the evolution of business intelligence flourished with the introduction of the data warehouse and online analytical processing systems (OLAPs) in the mid-1990s.
Rapid changes in information technology and the internet revolution led to the introduction of advanced BI systems such as Web-based dashboards in the early 2000s and mobile BI later in the decade. With mobile BI, end users access BI reports via native applications that run on a mobile smart device, such as the iPhone, Google Pixel or iPad.
Figure 15.2 depicts the evolution of BI information dissemination. You can find more about OLAP in Section 15.7 of this chapter.
Figure 15.2 Evolution of BI information dissemination formats
[Figure: timeline of dissemination formats with the labels 1970s, 1980s, 1990s, 2000s, 2010s and Present, and the formats centralised reporting, spreadsheets, enterprise reporting, OLAP, dashboards, mobile BI and Big Data analytics/Hadoop/NoSQL/data visualisation.]
SOURCE: Course Technology/Cengage Learning. Credit: Oleksiy Mark/Shutterstock.com
Although still in its infancy, mobile BI technology is poised to have a significant impact on the way BI information is disseminated and processed. If the number of students using smart phones to communicate with friends, update their Facebook status and send tweets on Twitter is any indicator, you can expect the next generation of consumers and workers to be highly mobile. Leading corporations are therefore starting to push decision making to agents in the field to facilitate customer relationships, sales and ordering, and product support. Such mobile technologies are so portable and interactive that some users call them disruptive technologies.
BI information technology has evolved from centralised reporting styles to the current, mobile BI style in just over a decade. The rate of technological change is not slowing down; to the contrary, technology advancements are accelerating the adoption of BI to new levels. The next section illustrates some BI technology trends.
15.2.4 Business Intelligence Technology Trends
Several technological advances are driving the growth of business intelligence technologies. These advances create new generations of more affordable products and services that are faster and easier to use. In turn, such products and services open new markets and work as driving forces in the increasing adoption of business intelligence technologies within organisations. Some of the more remarkable technological trends are:
Data storage improvements. New data storage technologies, such as solid state drives (SSDs) and Serial Advanced Technology Attachment (SATA) drives, offer increased performance and larger capacity that make data storage faster and more affordable. Currently, you can buy single drives with a capacity approaching 16 terabytes.
Business intelligence warehouse
and
simplified
appliances.
administration,
vendors
include
Business
These
without the
for
commitments.
Data
analytics.
Organisations knowledge
The
analytics.
organisation.
BI can
personal
analytics
self-service
15.3
now
relentless
Although
BI is
depends
on the
used
suited to
decison
that
data
and
inventory
large
provide
time
Teradata,
or cost
MicroStrategy
difference
to join
several
tactical
and
tables. strategic
any
All suppressed
Rights
May not
not materially
are
closer
Tableau.
is the
end user in the
to
and
an
walls of the
customers.
There is
need
for
decision
levels
better
support
within
operational
level.
between
be
serve
Some
a growing
trend
decision
support
data
and
for
data
operational
an
organisation,
its
Yet operational
operational
and
data.
effectiveness
data is
data and
decision
Therefore,
it is
seldom
well
support
data
affect
is
update
or
to the
overall
or
duplicated, learning
in experience.
a
capture
provide
not
or in Cengage
part.
Due Learning
such
electronic
surprising
to
From
rights, right
some to
data
represent
Thus,
INVOICE,
is
party additional
content
may
be
any
in
would
DSS point
an
have
data
give
of view,
dimensionality.
suppressed at
a simple
excellent
analysts
content
data,
INVOICE
you
and
to
performance,
invoice,
granularity
third
that
Customer
transactions,
the
remove
tend
of fields.
example,
a simple
(tables)
update
an arrangement
span,
the
for.
effective
business
data.
reserves
transactions
(for
extract
daily
to
structures
number
tables
main areas: time
whole
support
minimum
to
operational
data in three
To
Although
data
the
must be accounted
more different
For example,
operational
scanned,
with
which
to
sold, it mode.
each
by five
in
optimised is
DEPARTMENT).
the
purposes.
database
an item
many tables,
meaning
copied,
Data
different
a relational
query-friendly.
business
does
who
and
managerial
a frequent
operational
Reserved. content
making outside
between
data storage
and not
Whereas
DSS data differ from
Learning.
in
each time
data in
it is
analytics.
information
differ.
stored
be represented
STORE
database,
data
of every
desktop
users
at the
data
structures are
store
for
to the
evolution
support
on are in
might
market
near real-time
Data vs Decision support
so
DISCOUNT,
operational
mobile
gathered
Operational
systems
transaction
a new
section.
and
and
created
decision
The differences
For example,
data,
to
and tactical
tasks.
data
normalised.
operational
that
warehouse
Data
data
decision
formats
operations.
Cengage
the
of the
support
operational
be highly
deemed
BI
prepackaged
and they
new source for
MicroStrategy
technological
in the following
their
Most
has
a data
These
incurring
Microsoft,
and
analytics.
at strategic
quality
has
business
DeCIsIon support
Operational
2020
develop
and capacities, without
of these
warehouses
personnel.
data
ratios,
Examples
data
to
or extra
Oracle,
data analytics
include data
15.3.1 operational
review
offer
corporation
for
advantages.
of understanding
are examined
Copyright
Aster.
a BI project
media as the
be deployed
vendors
in this
the importance
Editorial
integration.
industries
by IBM,
phenomenon
BIis extending
personalised
One constant
Data
OLAP brought
Mobile
LINE,
any
software
pilot-test
offered
to social
gain competitive
organisation.
sales
to are
Big
are turning
to
Personal
15
allow
models for specific
services
and fast Teradata to
optimised
price-performance
SAP.
Big
daily
and
are starting
hardware,
appliances
offer improved
scalability
Greenplum
services
need for
organisations
Such
appliances
Companies
cloud-based
offer pay-as-you-go
an opportunity
learn
EMC,
as a service.
store rapidly
now offer plug-and-play
new
installation,
Netezza,
as a service.
and
These
rapid
IBM,
intelligence
services
Vendors
BI applications.
time
from if
the
subsequent
eBook rights
and/or restrictions
eChapter(s). require
it
Time span. Operational data cover a short time frame. In contrast, decision support data tend to cover a longer time frame. Managers are seldom interested in a specific sales invoice to customer X; rather, they tend to focus on sales generated during the past month, the past year or the past five years.
Granularity (level of aggregation). Decision support data must be presented at different levels of aggregation, from highly summarised to near-atomic. For example, if managers must analyse sales by region, they must be able to access data showing the sales by region, by city within the region, by store within the city within the region, and so on. In this case, managers require summarised data to compare the regions, but they also need data in a structure that enables them to drill down, or decompose, the data into more atomic components (that is, finer-grained data at lower levels of aggregation). In contrast, when you roll up the data, you are aggregating the data to a higher level.
Dimensionality. Operational data focus on representing individual transactions rather than the effects of the transactions over time. In contrast, data analysts tend to include many data dimensions and are interested in how the data relate over those dimensions. For example, an analyst might want to know how product X fared relative to product Z during the past six months by region, province, city, store and customer. In that case, both place and time are part of the picture.
Figure 15.3 shows how decision support data can be examined from multiple dimensions (such as product, region and year), using a variety of filters to produce each dimension. The ability to analyse, extract and present information in meaningful ways is one of the differences between decision support data and transaction-at-a-time operational data.
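Drill-down and roll-up can be pictured as the same aggregation run at different depths of a region, city and store hierarchy. The following minimal sketch uses invented sales facts and a hypothetical aggregate helper; it is not taken from the book's sample database.

```python
from collections import defaultdict

# Hypothetical atomic sales facts: (region, city, store, amount).
sales = [
    ("North", "Leeds", "S1", 120.0),
    ("North", "Leeds", "S2", 80.0),
    ("North", "York", "S3", 50.0),
    ("South", "Bristol", "S4", 200.0),
]


def aggregate(rows, depth):
    """Sum amounts grouped by the first `depth` levels of the region > city > store hierarchy."""
    totals = defaultdict(float)
    for *path, amount in rows:
        totals[tuple(path[:depth])] += amount
    return dict(totals)


by_region = aggregate(sales, 1)   # rolled up: coarsest view
by_city = aggregate(sales, 2)     # drill down one level
by_store = aggregate(sales, 3)    # near-atomic view

print(by_region)
print(by_city)
```

Moving from by_region to by_city is a drill-down; going the other way is a roll-up. Both views are derived from the same detailed rows.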
Figure 15.3 Transforming operational data into decision support data
[Figure: operational data in tabular form shown next to decision support data, with the labels Region, Time, Product, Agent and Sales.]
Operational data have a narrow time span, low granularity and single focus. Such data are usually presented in tabular format, in which each row represents a single transaction. This format often makes it difficult to derive useful information.
Decision support system (DSS) data focus on a broader time span, tend to have high levels of granularity, and can be examined in multiple dimensions. For example, note these possible aggregations:
Sales by product, region, agent, etc.
Sales for all years or only a few selected years.
Sales for all products or only a few selected products.
Online Content: The operational data in Figure 15.3 are found on the online platform for this book. The decision support data in Figure 15.3 show the output for the solution to Problem 2 at the end of this chapter.
From the designer's point of view, the differences between operational and decision support data are as follows:
Operational data represent transactions as they happen, in real time. Decision support data are a snapshot of the operational data at a given point in time. Therefore, decision support data are historic, representing a time slice of the operational data.
Operational and decision support data are different in terms of the transaction type and transaction volume. Whereas operational data are characterised by update transactions, DSS data are mainly characterised by query (read-only) transactions. Decision support data also require periodic updates to load new data that are summarised from the operational data. Finally, the transaction volume in operational data tends to be very high when compared with the low-to-medium levels found in decision support data.
Operational data are commonly stored in many tables, and the stored data represent the information about a given transaction only. Decision support data are generally stored in a few tables that store data derived from the operational data. The decision support data do not include the details of each operational transaction. Instead, decision support data represent transaction summaries; therefore, the decision support database stores data that are integrated, aggregated and summarised for decision support purposes.
The degree to which decision support data are summarised is very high when contrasted with operational data. Therefore, you will see a great deal of derived data in decision support databases. For example, rather than storing all 10 000 sales transactions for a given store on a given day, the decision support database might simply store the total number of units sold and the total sales euros generated during that day. Decision support data might be collected to monitor such aggregates as total sales for each store or for each product. The purpose of the summaries is simple: they are to be used to establish and evaluate sales trends, product sales comparisons, and so on, that serve decision needs. (How well are items selling? Should this product be discontinued? Has the advertising been effective as measured by increased sales?)
The data models that govern operational data and decision support data are different. The operational database's frequent and rapid data updates make data anomalies a potentially devastating problem. Therefore, the data requirements in a typical relational transaction (operational) system generally require normalised structures that yield many tables, each of which contains the minimum number of attributes. In contrast, the decision support database is not subject to such transaction updates, and the focus is on querying capability. Therefore, decision support databases tend to be non-normalised and include few tables, each of which contains a large number of attributes.
Query activity (frequency and complexity) in the operational database tends to be low to allow additional processing cycles for the more crucial update transactions. Therefore, queries against operational data typically are narrow in scope, low in complexity and speed-critical. In contrast, decision support data exist for the sole purpose of serving query requirements. Queries against decision support data typically are broad in scope, high in complexity and less speed-critical.
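The contrast between normalised operational tables and a derived summary table can be made concrete with a small sketch using Python's built-in sqlite3 module. The table and column names (invoice, invoice_line, daily_store_sales) and the sample rows are assumptions made for illustration; they are not the book's sample schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Operational side: one row per transaction detail (normalised).
    CREATE TABLE invoice (
        inv_id   INTEGER PRIMARY KEY,
        store_id TEXT NOT NULL,
        inv_date TEXT NOT NULL
    );
    CREATE TABLE invoice_line (
        inv_id  INTEGER NOT NULL REFERENCES invoice(inv_id),
        line_no INTEGER NOT NULL,
        units   INTEGER NOT NULL,
        price   REAL NOT NULL,
        PRIMARY KEY (inv_id, line_no)
    );
    -- Decision support side: one derived row per store per day.
    CREATE TABLE daily_store_sales (
        store_id    TEXT NOT NULL,
        sales_date  TEXT NOT NULL,
        total_units INTEGER NOT NULL,
        total_sales REAL NOT NULL,
        PRIMARY KEY (store_id, sales_date)
    );
""")

# Hypothetical transactions for a single day.
conn.executemany("INSERT INTO invoice VALUES (?, ?, ?)", [
    (1, "A", "2019-03-01"), (2, "A", "2019-03-01"), (3, "B", "2019-03-01"),
])
conn.executemany("INSERT INTO invoice_line VALUES (?, ?, ?, ?)", [
    (1, 1, 2, 10.0), (1, 2, 1, 5.0), (2, 1, 3, 10.0), (3, 1, 4, 2.5),
])

# Periodic load: summarise the operational rows into the decision support table.
conn.execute("""
    INSERT INTO daily_store_sales
    SELECT i.store_id, i.inv_date, SUM(l.units), SUM(l.units * l.price)
    FROM invoice AS i JOIN invoice_line AS l ON l.inv_id = i.inv_id
    GROUP BY i.store_id, i.inv_date
""")

summary = conn.execute(
    "SELECT store_id, total_units, total_sales FROM daily_store_sales ORDER BY store_id"
).fetchall()
print(summary)
```

The operational tables need a join to answer even a simple question, while the summary table answers it with a single read. The price is that the summary is only as current as the last periodic load.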
Finally, decision support data are characterised by large amounts of data. The large data volume is the result of two factors. First, data are stored in non-normalised structures that are likely to display many data redundancies and duplications. Second, the same data can be categorised in many different ways to represent different snapshots. For example, sales data might be stored in relation to product, store, customer, region and manager.
Table 15.5 summarises the differences between operational and decision support data from the database designer's point of view.
Table 15.5 Contrasting operational and decision support data characteristics
Characteristic | Operational data | Decision support data
Data currency | Current operations; real-time data | Historic data; snapshot of company data; time component (week/month/year)
Granularity | Atomic-detailed data | Summarised data
Summarisation level | Low; some aggregate yields | High; many aggregation levels
Data model | Highly normalised; mostly relational DBMS | Non-normalised; complex structures; some relational, but mostly multidimensional DBMS
Transaction type | Mostly updates | Mostly query
Transaction volumes | High update volumes | Periodic loads and summary calculations
Transaction speed | Updates are critical | Retrievals are critical
Query activity | Low to medium | High
Query scope | Narrow range | Broad range
Query complexity | Simple to medium | Very complex
Data volumes | Hundreds of gigabytes | Hundreds of terabytes to petabytes
The many differences between operational data and decision support data are good indicators of the requirements of the decision support database, described in the next section.
15.3.2 Decision Support Database Requirements
A decision support database is a specialised DBMS tailored to provide fast answers to complex queries. There are four main requirements for a decision support database: the database schema, data extraction and loading, the end-user analytical interface and database size.
Database Schema
The decision support database schema must support complex (non-normalised) data representations. As noted earlier, the decision support database must contain data that are aggregated and summarised. In addition to meeting those requirements, the queries must be able to extract multidimensional time slices. If you are using an RDBMS, the conditions suggest using non-normalised and even duplicated data. To see why this must be true, take a look at the ten-year sales history for a single store containing a single department. At this point, the data are fully normalised within the single table, as shown in Table 15.6.
Table 15.6 Ten-year sales history for a single department, millions of euros
Year | Sales
2010 | 8 227
2011 | 9 109
2012 | 10 104
2013 | 11 553
2014 | 10 018
2015 | 11 875
2016 | 12 699
2017 | 14 875
2018 | 16 301
2019 | 19 986
This structure works well when you have only one store with only one department. However, it is very unlikely that such a simple environment has much need for a decision support database. One would suppose that a decision support database becomes a factor when dealing with more than one store, each of which has more than one department. To support all of the decision support requirements, the database must contain data for all of the stores and all of their departments, and the database must be able to support multidimensional queries that track sales by stores, by departments and over time. For simplicity, suppose that there are only two stores (A and B) and two departments (1 and 2) within each store. Let's also change the time dimension to include yearly data. Table 15.7 shows the sales figures under the specified conditions. Only 2010, 2014 and 2019 are shown; ellipses (...) indicate that data values were omitted.

TABLE 15.7 Yearly sales summaries, two stores and two departments per store, millions of euros

Year    Store    Department    Sales
2010    A        1             1 985
2010    A        2             2 401
2010    B        1             1 879
2010    B        2             1 962
...     ...      ...           ...
2014    A        1             3 912
2014    A        2             4 158
2014    B        1             3 426
2014    B        2             1 203
...     ...      ...           ...
2019    A        1             7 683
2019    A        2             6 912
2019    B        1             3 768
2019    B        2             1 623
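The kind of multidimensional query described above (tracking sales by store, by department and over time) can be illustrated with a minimal sketch. It assumes the rows printed in Table 15.7 are held in memory as tuples; the names SALES, DIMENSIONS and total_sales are hypothetical and chosen only for this illustration. A real decision support database would run such queries inside the DBMS.

```python
from collections import defaultdict

# Rows copied from Table 15.7: (year, store, department, sales in millions of euros).
# Only the rows printed in the table are included; omitted years are not guessed.
SALES = [
    (2010, "A", 1, 1985), (2010, "A", 2, 2401),
    (2010, "B", 1, 1879), (2010, "B", 2, 1962),
    (2014, "A", 1, 3912), (2014, "A", 2, 4158),
    (2014, "B", 1, 3426), (2014, "B", 2, 1203),
    (2019, "A", 1, 7683), (2019, "A", 2, 6912),
    (2019, "B", 1, 3768), (2019, "B", 2, 1623),
]

# Position of each dimension inside a row (hypothetical helper mapping).
DIMENSIONS = {"year": 0, "store": 1, "department": 2}


def total_sales(rows, group_by):
    """Sum the sales column for every combination of the requested dimensions."""
    positions = [DIMENSIONS[name] for name in group_by]
    totals = defaultdict(int)
    for row in rows:
        key = tuple(row[p] for p in positions)
        totals[key] += row[3]
    return dict(totals)


by_year = total_sales(SALES, ["year"])
by_store_year = total_sales(SALES, ["store", "year"])
by_department = total_sales(SALES, ["department"])
```

Each call slices the same rows along a different combination of dimensions. For instance, by_year[(2010,)] adds the four 2010 rows and gives 8 227, the figure listed for 2010 in Table 15.6.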
If you examine Table 15.7, you can see that the number of rows and attributes already multiplies quickly and that the table exhibits multiple redundancies. Now suppose that the company has ten departments per store and 20 stores nationwide. And suppose that you want to access yearly sales summaries. Now you are dealing with 200 rows and 12 monthly sales attributes per row. (Actually, there are 15 attributes per row if you add each store's sales total for each year.)

The decision support database schema must also be optimised for query (read-only) retrievals. To optimise query speed, the DBMS must support features such as bitmap indexes and data partitioning to increase search speed. In addition, the DBMS query optimiser must be enhanced to support the non-normalised and complex structures found in decision support databases.

Data Extraction and Filtering
The decision support database is created largely by extracting data from the operational database and by importing additional data from external sources. Thus, the DBMS must support advanced data extraction and data filtering tools. To minimise the impact on the operational database, the data extraction capabilities should allow batch and scheduled data extraction. The data extraction capabilities should also support different data sources: flat files and hierarchical, network and relational databases, as well as multiple vendors. Data filtering capabilities must include the ability to check for inconsistent data or data validation rules. Finally, to filter and integrate the operational data into the decision support database, the DBMS must support advanced data integration, aggregation and classification.

Using data from multiple external sources also usually means having to solve data-formatting conflicts. For example, data such as ID numbers and dates may occur in different formats, measurements may be based on different scales, and the same data elements may have different names. In short, data must be filtered and purified to ensure that only the pertinent decision support data are stored in the database and that they are stored in a standard format.

Database Size
Decision support databases tend to be enormous; terabyte and petabyte ranges are not unusual. For example, in 2017, Wal-Mart had more than 40 petabytes of data in its data warehouses. Therefore, the DBMS must be capable of supporting very large databases (VLDBs). To support a VLDB adequately, the DBMS might be required to use advanced hardware, such as multiple disk arrays, and, even more importantly, to support multiple-processor technologies, such as a symmetric multiprocessor (SMP) or a massively parallel processor (MPP).

The complex information requirements and the ever-growing demand for sophisticated data analysis sparked the creation of a new type of data repository. This repository, called a data warehouse, contains data in formats that facilitate data extraction, data analysis and decision making.

15.4 The Data Warehouse
Bill Inmon, the acknowledged father of the data warehouse, defines the term as an integrated, subject-oriented, time-variant, non-volatile collection of data that provides support for decision making.2 To understand that definition, let's take a more detailed look at its components.

2 Inmon, B. and Kelley, C. The twelve rules of data warehouse for a client/server world, Data Management Review, 4(5), pp. 6-16, May 1994.
Integrated. The data warehouse is a centralised, consolidated database that integrates data derived from the entire organisation and from multiple sources with diverse formats. Data integration implies that all business entities, data elements, data characteristics and business metrics are described in the same way throughout the enterprise. Although this requirement sounds logical, you would be amazed to discover how many different measurements for sales performance can exist within an organisation; the same scenario holds true for any other business element. For instance, the status of an order might be indicated with text labels such as open, received, cancel and closed in one department and as 1, 2, 3 and 4 in another department. A student's status might be defined as undergraduate year 1, undergraduate year 2, undergraduate year 3 or postgraduate in the accounting department and as UG1, UG2, UG3 or PG in the computer information systems department. To avoid the potential format tangle, the data in the data warehouse must conform to a common format acceptable throughout the organisation. This integration can be time-consuming, but once accomplished, it enhances decision making and helps managers to better understand the company's operations. This understanding can be translated into recognition of strategic business opportunities.

Subject-oriented. Data warehouse data are arranged and optimised to provide answers to questions coming from diverse functional areas within a company. Data warehouse data are organised and summarised by topic, such as sales, marketing, finance, distribution and transportation. For each topic, the data warehouse contains specific subjects of interest: products, customers, departments, regions, promotions and so on. This form of data organisation is quite different from the more functional or process-oriented organisation of typical transaction systems. For example, an invoicing system designer concentrates on designing normalised data structures (relational tables) to support the business process by storing invoice components in two tables: INVOICE and INVLINE. In contrast, the data warehouse has a subject orientation. Data warehouse designers focus specifically on the data rather than on the processes that modify the data. (After all, data warehouse data are not subject to numerous real-time data updates!) Therefore, instead of storing an invoice, the data warehouse stores its sales by product and sales by customer components because decision support activities require the retrieval of sales summaries by product or customer.

Time-variant. In contrast to operational data, which focus on current transactions, warehouse data represent the flow of data through time. The data warehouse can even contain projected data generated through statistical and other models. It is also time-variant in the sense that once data are periodically uploaded to the data warehouse, all time-dependent aggregations are recomputed. For example, when data for previous weekly sales are uploaded to the data warehouse, the weekly, monthly, yearly and other time-dependent aggregates for products, customers, stores and other variables are also updated. Because data in a data warehouse constitute a snapshot of the company history as measured by its variables, the time component is crucial. The data warehouse contains a time ID that is used to generate summaries and aggregations by week, month, quarter, year and so on. Once the data enter the data warehouse, the time ID assigned to the data cannot be changed.

Non-volatile. Once data enter the data warehouse, they are never removed. Because the data in the warehouse represent the company's history, the operational data, representing the near-term history, are always added to it. Because data are never deleted and new data are continually added, the data warehouse is always growing. That is why the DBMS must be able to support multigigabyte and even multiterabyte databases and multiprocessor hardware.
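The integration requirement described above can be illustrated with a minimal sketch. It uses the order-status example from the text: one department records text labels and another records the codes 1 to 4. The pairing of each code with a label (1 with open, 2 with received, and so on) is an assumption made for illustration, as are the common format values and the function name to_common_status.

```python
# Hypothetical source encodings based on the order-status example in the text.
# Assumption: the codes 1..4 correspond to the labels in the order they are listed.
TEXT_LABELS = {
    "open": "OPEN",
    "received": "RECEIVED",
    "cancel": "CANCELLED",
    "closed": "CLOSED",
}
NUMERIC_CODES = {
    "1": "OPEN",
    "2": "RECEIVED",
    "3": "CANCELLED",
    "4": "CLOSED",
}


def to_common_status(raw):
    """Translate a departmental status value into one agreed warehouse value."""
    value = str(raw).strip().lower()
    if value in TEXT_LABELS:
        return TEXT_LABELS[value]
    if value in NUMERIC_CODES:
        return NUMERIC_CODES[value]
    # Unknown values are rejected rather than guessed, so they can be reviewed.
    raise ValueError(f"unrecognised order status: {raw!r}")


department_one = ["open", "Received", "closed"]
department_two = [1, 3, 4]
integrated = [to_common_status(v) for v in department_one + department_two]
```

Rejecting unknown values, instead of mapping them to a default, is one possible design choice: it keeps inconsistent source data out of the warehouse until someone decides how they should be represented.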
Ralph Kimball provided a different definition, saying that the data warehouse was '. . . a copy of transaction data specifically structured for query and analysis'.3 Bill Inmon's and Ralph Kimball's definitions also differ in terms of how the data warehouse should be developed.

Bill Inmon's approach revolves around the creation of a centralised, enterprise-wide data warehouse and is often referred to as the Top Down approach. The development of a large enterprise data warehouse begins by first integrating data using the relational model in third normal form, and the enterprise data warehouse is then used for the creation of departmental data marts. (You will learn more about data marts in Section 15.4.2.) In contrast, Ralph Kimball adopts a Bottom Up approach, which only focuses on the functionality of departments within an organisation. Kimball's method structures data using multidimensional models known as the star schema (Section 15.5) and creates data marts, which are then linked to ensure a consistent view of the enterprise data. Despite the differences, both approaches have been implemented successfully in large organisations.

With the advent of Big Data, Ralph Kimball recognises that the enterprise data warehouse has to transform and expand to encompass the challenges of Big Data, i.e. the need for new paradigms4 that are able to facilitate big data analytics on data that are virtually unstructured.

3 Kimball, R. The Data Warehouse Lifecycle Toolkit: Practical Techniques for Building Data Warehouse and Business Intelligence Systems, 2nd Edition. John Wiley & Sons, 2008.
4 Kimball, R. The Evolving Role of the Enterprise Data Warehouse in the Era of Big Data Analytics. Available: www.kimballgroup.com/

Table 15.8 summarises the differences between the data warehouse and operational databases.

TABLE 15.8 A comparison of data warehouse and operational database characteristics

Characteristic: Integrated
Operational database data: Similar data can have different representations or meanings. For example, ID numbers may be stored as ######-####-### or as #############, and a given condition may be labelled as T/F or 0/1 or Y/N. A sales value may be shown in thousands or in millions.
Data warehouse data: Provide a unified view of all data elements with a common definition and representation for all business units.

Characteristic: Subject-oriented
Operational database data: Data are stored with a functional, or process, orientation. For example, data may be stored for invoices, payments and credit amounts.
Data warehouse data: Data are stored with a subject orientation that facilitates multiple views of the data and facilitates decision making. For example, sales may be recorded by product, by division, by manager, or by region.

Characteristic: Time-variant
Operational database data: Data are recorded as current transactions. For example, the sales data may be the sale of a product on a given date, such as 342.78 on 12-MAY-2013.
Data warehouse data: Data are recorded with a historical perspective in mind. Therefore, a time dimension is added to facilitate data analysis and various time comparisons.

Characteristic: Non-volatile
Operational database data: Data updates are frequent and common. For example, an inventory amount changes with each sale. Therefore, the data environment is fluid.
Data warehouse data: Data cannot be changed. Data are added only periodically from historical systems. Once the data are properly stored, no changes are allowed. Therefore, the data environment is relatively static.
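The time-variant and non-volatile rows of Table 15.8 can be illustrated together with a minimal sketch: rows are only ever appended, each row carries a time ID, and time-dependent aggregates are recomputed from the stored history after every upload. The class SalesHistory, the function period_key and the product code P1 are hypothetical names used only for this illustration; the first sale reuses the 342.78 on 12-MAY-2013 example from the table, and the second sale is invented test data.

```python
from collections import defaultdict
from datetime import date


def period_key(day, period):
    """Derive the aggregation key for a time ID (a calendar date)."""
    if period == "year":
        return (day.year,)
    if period == "quarter":
        return (day.year, (day.month - 1) // 3 + 1)
    if period == "month":
        return (day.year, day.month)
    if period == "week":
        iso = day.isocalendar()          # (ISO year, ISO week, ISO weekday)
        return (iso[0], iso[1])
    raise ValueError(f"unknown period: {period}")


class SalesHistory:
    """Append-only store: rows can be added, but never changed or removed."""

    def __init__(self):
        self._rows = []                  # each row: (time_id, product, amount)

    def upload(self, rows):
        """Periodic upload of new rows; existing rows are left untouched."""
        self._rows.extend(rows)

    def aggregate(self, period):
        """Recompute totals per (period, product) from the full history."""
        totals = defaultdict(float)
        for time_id, product, amount in self._rows:
            totals[(period_key(time_id, period), product)] += amount
        return dict(totals)


history = SalesHistory()
history.upload([(date(2013, 5, 12), "P1", 342.78)])   # value taken from Table 15.8
history.upload([(date(2013, 5, 20), "P1", 100.00)])   # invented follow-up sale
monthly = history.aggregate("month")
```

Because the class offers no update or delete method, the only way the aggregates change is through a new upload, which mirrors the idea that the data environment is relatively static between loads.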
In summary, the data warehouse is usually a read-only database optimised for data analysis and query processing. Typically, data are extracted from various sources and are then transformed and integrated, in other words passed through a data filter, before being loaded into the data warehouse. As mentioned, this process is known as ETL. Figure 15.4 illustrates the ETL process to create a data warehouse from operational data.

FIGURE 15.4 Creating a data warehouse
[Figure: operational data pass through extraction, transformation (filter, transform, integrate, classify, aggregate, summarise) and loading into the data warehouse, which is integrated, subject-oriented, time-variant and non-volatile.]
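The ETL flow summarised above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not a description of any particular ETL tool: the two store extracts, their field names, their date formats and every function name are hypothetical. The sketch extracts records from two sources, filters out incomplete ones, transforms dates into a common form, integrates the sources and aggregates the result as it is loaded.

```python
from collections import defaultdict
from datetime import datetime

# Hypothetical operational extracts; field names and formats are assumptions.
STORE_A = [
    {"sold": "2019-03-01", "product": "P1", "amount": "120.50"},
    {"sold": "2019-03-02", "product": "P2", "amount": ""},        # missing amount
]
STORE_B = [
    {"sold": "05/03/2019", "product": "P1", "amount": "80.00"},   # day/month/year
]


def extract(source, date_format, store):
    """Extraction: read each record and tag it with its origin and date format."""
    for record in source:
        yield {**record, "date_format": date_format, "store": store}


def is_valid(record):
    """Filter: reject records with a missing product or amount."""
    return bool(record["product"]) and bool(record["amount"])


def transform(record):
    """Transform: convert dates and amounts into one common representation."""
    sold = datetime.strptime(record["sold"], record["date_format"]).date()
    return {
        "store": record["store"],
        "product": record["product"],
        "month": (sold.year, sold.month),
        "amount": float(record["amount"]),
    }


def load(records):
    """Load: aggregate monthly sales per product into the warehouse table."""
    warehouse = defaultdict(float)
    for record in records:
        warehouse[(record["month"], record["product"])] += record["amount"]
    return dict(warehouse)


def run_etl():
    # Integration: records from both stores flow through the same pipeline.
    extracted = list(extract(STORE_A, "%Y-%m-%d", "A"))
    extracted += list(extract(STORE_B, "%d/%m/%Y", "B"))
    filtered = [r for r in extracted if is_valid(r)]
    return load(transform(r) for r in filtered)


warehouse = run_etl()
```

In this sketch the two P1 sales from different stores, recorded with different date formats, end up in a single monthly total, while the incomplete P2 record is filtered out before loading.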
Although the centralised and integrated data warehouse can be an attractive proposition that yields many benefits, managers may be reluctant to embrace this strategy. Creating a data warehouse requires time, money and considerable managerial effort. Therefore, it is not surprising that many companies begin their foray into data warehousing by focusing on more manageable data sets that are targeted to meet the special needs of small groups within the organisation. These smaller data stores are called data marts.
15.4.1 Twelve Rules that Define a Data Warehouse
In 1994, William H. Inmon and Chuck Kelley created 12 rules defining a data warehouse, which summarise many of the points made in this chapter about data warehouses.5

1 The data warehouse and operational environments are separated.
2 The data warehouse data are integrated.
3 The data warehouse contains historical data over a long time horizon.
4 The data warehouse data are snapshot data captured at a given point in time.
5 The data warehouse data are subject-oriented.
6 No online updates are allowed.
7 The data warehouse development life cycle differs from classical systems development. The data warehouse development is data-driven; the classical approach is process-driven.
8 The data warehouse contains data with several levels of detail: current detail data, old detail data, lightly summarised data and highly summarised data.
9 The data warehouse environment is characterised by read-only transactions to very large data sets. The operational environment is characterised by numerous update transactions to a few data entities at a time.
10 The data warehouse environment has a system that traces data sources, transformations and storage.
11 The data warehouse's metadata are a critical component of this environment. The metadata identify and define all data elements. The metadata provide the source, transformation, integration, storage, usage, relationships and history of each data element.
12 The data warehouse contains a chargeback mechanism for resource usage that enforces optimal use of the data by end users.

Note how those 12 rules capture the complete data warehouse life cycle, from its introduction as an entity separate from the operational data store to its components, functionality and management processes. Most data warehouse implementations are based on the relational database model, and their market share suggests that their popularity will not fade anytime soon. Relational data warehouses use the star schema design technique to handle multidimensional data.

Online Content: Further considerations about data warehouse development can be found in Appendix L, Data Warehouse Implementation Factors, located on the online platform for this book.

5 Inmon, B. and Kelley, C. The twelve rules of data warehouse for a client/server world, Data Management Review, 4(5), pp. 6-16, May 1994.
15.4.2 Data Marts
A data mart is a small, single-subject data warehouse subset that provides decision support to a small group of people. A data mart could also be created from the data extracted from a larger data warehouse for the specific purpose of supporting faster data access to a target group or function.

Some organisations choose to implement data marts not only because of the lower cost and shorter implementation time, but also because of the current technological advances and inevitable people issues that make data marts attractive. Powerful computers can provide a customised DSS to small groups in ways that might not be possible with a centralised system. Also, a company's culture may predispose its employees to resist major changes, but they might quickly embrace relatively minor changes that lead to demonstrably improved decision support. In addition, people at different organisational levels are likely to require data with different summarisation, aggregation and presentation formats.

Data marts can serve as a test vehicle for companies exploring the potential benefits of data warehouses. By migrating gradually from data marts to data warehouses, a specific department's decision support needs can be addressed within a reasonable time frame (six months to one year),
as compared to the longer time frame usually required to implement a data warehouse (one to three years). Information technology (IT) departments also benefit from this approach because their personnel have the opportunity to learn the issues and develop the skills required to create a data warehouse.

The only difference between a data mart and a data warehouse is the size and scope of the problem being solved. Therefore, the problem definitions and data requirements are essentially the same for both.

15.4.3 Designing and Implementing a Data Warehouse
Organisation-wide information system development is subject to many constraints. Some of the constraints are based on available funding. Others are a function of management's view of the role played by an IS department and of the extent and depth of the information requirements. Add the constraints imposed by corporate culture, and you understand why no single formula can describe perfect data warehouse development. Therefore, rather than proposing a single data warehouse design and implementation methodology, this section will identify a few factors that appear to be common to data warehousing.

The Data Warehouse as an Active Decision Support Framework
Perhaps the first thing to remember is that a data warehouse is not a static database. Instead, it is a dynamic framework for decision support that is, almost by definition, always a work in progress. Because it is the foundation of business intelligence, the design and implementation of the data warehouse means that you are involved in the design and implementation of a complete database-system-development infrastructure for company-wide decision support. Although it is easy to focus on the data warehouse database as the central data repository, you must remember that the decision support infrastructure includes hardware, software, people and procedures, as well as data. Therefore, the design and implementation of the data warehouse must be examined in light of the entire infrastructure.

A Company-wide Effort that Requires User Involvement and Commitment at All Levels
Designing a data warehouse means being given an opportunity to help develop an integrated data model that captures the data that are considered to be essential to the organisation, from both end-user and business perspectives. Data warehouse data cross departmental lines and geographical boundaries. Because the data warehouse represents an attempt to model all of the organisation's data, you are likely to discover that organisational components (divisions, departments, support groups and so on) often have conflicting goals, and it certainly will be easy to find data inconsistencies and damaging redundancies. Information is power, and the control of its sources and uses is likely to trigger turf battles, end-user resistance and power struggles at all levels. Building the perfect data warehouse is not just a matter of knowing how to create a star schema; it requires managerial skills to deal with conflict resolution, mediation and arbitration. In short, the designer must:

- Involve end users in the process.
- Secure end users' commitment from the beginning.
- Create continuous end-user feedback.
- Manage end-user expectations.
- Establish procedures for conflict resolution.

Great managerial skills are not, of course, solely sufficient. The technical aspects of the data warehouse must be addressed as well. The old adage of input-process-output repeats itself here. The data warehouse designer must satisfy:
- Data integration and loading criteria.
- Data analysis capabilities with acceptable query performance.
- End-user data analysis needs.

The foremost technical concern in implementing a data warehouse is to provide end-user decision support with advanced data analysis capabilities: at the right moment, in the right format, with the right data and at the right cost.

Apply Database Design Procedures
You learnt about the database life cycle and the database design process in Chapters 10 and 11, so perhaps it is wise to review the traditional database design procedures. These design procedures must then be adapted to fit the data warehouse requirements. If you remember that the data warehouse derives its data from operational databases, you will understand why a solid foundation in operational database design is important. (It's difficult to produce good data warehouse data when the operational database data are corrupted.) Figure 15.5 depicts a simplified process for implementing the data warehouse.
FIGURE 15.5 Data warehouse design and implementation road map
[Figure: a road map in five stages: initial data gathering; design and mapping; loading and testing; building and testing; rollout and feedback.]
One of the key differences from the traditional database design process is defining the level of detail, or granularity, of the data required within the data warehouse. For example, does a sales manager need to know the number of a particular product sold on a daily basis or an hourly basis, or only the number sold in a week? The general rule of thumb is to select a finer level of grain than the one the business users actually require. Each business process must be described in order to identify its business measures. Only when this has been completed can the dimensional model be created. The following sections will explore the design issues of how to create a dimensional model using a star schema.

15.4.4 The Extraction, Transformation and Loading Process
The extraction, transformation and loading (ETL) process is often the most time-consuming phase in building a data warehouse, and it is critical to a successful data warehouse. As Figure 15.4 illustrates, data are extracted from a number of sources, transformed and then loaded into the data warehouse. The ETL process must ensure that the data in the data warehouse are accurate, high-quality, relevant, useful and accessible.

Extraction
The extraction process takes data from many sources. Not all of the data within the organisation will be included in the data warehouse; the key is to determine which data are useful. Typically, only preselected fields of data are extracted. Types of data sources include:

- Operational data. This type of data is the main source of data for the data warehouse. The data can come directly from operational systems, where they may be contained within a DBMS or in application files.
- Historical data. Archived data from old systems are often required, but the data may exist in archives and in formats that are obsolete. Unique, one-off routines may therefore be required to load such data into the data warehouse during the first time load.
- Internal data. Data such as budgets or sales forecasts, which are often held within the organisation in spreadsheets.
- External data. Data from outside the company that have been purchased or obtained (for example, from the internet) can be used for comparing the performance of the business and to perform predictive analytics (discussed in Section 15.6.2), which helps an organisation to remain competitive. Sources of external data include the stock market, newspapers and reports, marketing data and real-time data feeds. The main problem with external data is that the format will be different from that of internal data. In addition, external data may require constant monitoring to determine when they are available.

The same data may be held in many different systems, in different and even contradictory formats. For example, a bank may have two departments, Savings and Loans, each of which stores customer data in an operational DBMS. The same customer may have both a savings account and a mortgage, but the fields that hold the customer data in the two databases may be different. In order to store the customer in the data warehouse, these two instances of the same customer must be modelled as a subject with which all of the customer's data are associated.

An organisation may choose to buy tools to extract data or may write individual extraction routines in-house. The main issue is cost.
Once the In
each
system
data has been determined,
case, is in
The
it
may be necessary
a different
process
especially
format
from
common
and
Storing for than
source
Address
names
the
house
There
example,
of
ready
being
within a DBMS no unique
could
be down
same
the
to
details,
address,
and
data
warehouse. the
(or
as
source
mapping.
extracted
cleans) the
as subject-oriented
775
data,
data and
data.
both
shown in
may
entry
error
record
names
Figure
has
and
15.6,
missing.
or the
fact
been
many unique A person
that
created.
addresses
the Two
challenges
may
person or
have has
more
more moved
people
may
may be spelt incorrectly.
which exist in two
separate
For
databases
within
data anomalies
Address Rogers
6 State
Rd,
Roy
Rogers
Clare
A. Peterson
Jane Smiley
2: Customer
Marketing
6 State
Road,
4
Street,
West
1214
Marketing
A Claire
14
Smiley, Jane
NW,
M
L100
F
L121
Warehouse
Addresses
West St,
is to ensure format
L100
M22
M33
Range
L333
Gender
M33
12 to 14 Range
problem
M
M23
R
an agreed
Location
table
Name
Peterson,
Gender North
West,Manchester,
Rogers,
provides
be
Sales table
Roy
to this
in the
Intelligence
from
known
found
also scrubs
warehouse
values
a data a new
and
Name
A solution
field
This is
anomalies
presented
and a data
key
name and address
Database 1: Customer
For example,
if the
warehouse.
data
Transformation
for
example
Business
include:
often
the two tables
15.6
Database
for
data
any
for
must be mapped into the
rule,
in the
source.
anomalies
updating
the
consider
FIgure
is
which
and, instead
under
field
eliminate
Databases
inconsistencies
address,
be stored
mapped aims to
format data
and addresses
designer. one
the
attribute
a transformation
an operational
ensures it is in a standardised
Name
apply
of data transformation
when it is from
Some
each selected
to
15
for
that
names
Warehouse
the
name
and
addresses
and
Supplier_iD
Male
L100
Z123
Female
L121
Z45
Female
L333
address
are
within the
Title
Location
broken
data
down
into
warehouse
their
could
component
parts.
be:
Title                      Mr
First_name                 Roy
Middle_name                ........
Last_name                  Rodgers
Street No. or house name   6
Address_line1              State Road
Address_line2              Manchester
County                     ........
Postal code                M23 4FR
Country                    United Kingdom
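Breaking names and addresses into component parts, and rewriting abbreviations such as Rd into one agreed form, can be sketched in a few lines of Python. This is only an illustration: the abbreviation and gender maps, and the function names, are assumptions made for the example, not rules prescribed by the text.

```python
# Hypothetical mapping tables; a real warehouse team would agree its own lists.
STREET_ABBREVIATIONS = {"rd": "Road", "st": "Street"}
GENDER_CODES = {"m": "Male", "f": "Female"}


def standardise_address_line(line: str) -> str:
    """Expand known street abbreviations, e.g. '6 State Rd,' -> '6 State Road'."""
    words = []
    for word in line.replace(",", " ").split():
        key = word.rstrip(".").lower()
        words.append(STREET_ABBREVIATIONS.get(key, word))
    return " ".join(words)


def split_name(name: str) -> dict:
    """Split 'Last, First' or 'First Last' into separate name fields."""
    if "," in name:
        last, first = [part.strip() for part in name.split(",", 1)]
    else:
        first, _, last = name.strip().rpartition(" ")
    return {"First_name": first, "Last_name": last}


def standardise_gender(code: str) -> str:
    """Map the codes M and F onto one agreed encoding; leave unknown values as they are."""
    return GENDER_CODES.get(code.strip().lower(), code)
```

Values that the maps do not recognise are passed through unchanged, so they can be flagged for further analysis rather than silently altered.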
an organisation. Marketing the
address In
Customer
table
in
field,
address check
to
ensure
then
M23 4FR is
that,
when
both the
existing
Figure
15.6,
Customer Male
entered
problems
the
a person
may
stored
the
when
been
encoded
as M
and F
is to
agree
data values into
data and flag records
Different
Standards
Country the
For
if
which
Often,
one
standards
format
example,
metric,
are likely
of the
and
does
not
be flagged
at that
that
each
Road.
both records
already
address.
could also
exist
be used place
one
does,
if
analysis.
house
be flagged
of the
be in
Finally, it
This is a difficult
moved
should
component
Accuracy
should
for further
may have
at all. In
M22.
ensure
UK. Rules
name
should
suitable
standard
for
is
are the
from
on,
the
data from
differently
while in
the
on a format
a number
in
each
situation
and the
correct
for further
analysis.
Customer
for
each
It is
where fields
cannot
be standardised.
global
as
be stored
Europe
within
and
and then
exist
fields
data
stored
warehouse
these
cover
and
warehouse,
perform
In the
it is
that
Standards
or imperial,
to
data
data
mm/dd/yyyy)
data
Africa,
routines
the
the
systems.
databases. table
very important
to
the
South
source file into
file in
organisations.
opposed
two
Marketing
format.
in
of operations
of the
the correct
exist
to
UK,
agreed
data as it is transported
to
merging
date (dd/mm/yyyy
measurements
is
same
or a person again,
M23 to
Customer
entered
a postal code database
Manchester,
a person living
occur
pick up erroneous
country
of
both
and the
or not
e.g. Rd becomes
For example,
be inserted
case,
is
different
needs
format,
city
be incorrect
has
The solution
to transform
Multiple
code
Sales table
is
developer
with the
to
In this
field
it is
and Female.
currency,
sources.
record
typically
Gender
Sales table
routines
postal
warehouse
code in the
Customer
his address
Problems
Multiple encoding In
external
postal
been recorded.
encoding
both the case
Rd and the data
whether there is not already
may not have
Multiple
against a valid
and the
to check
in
each
appears in a standardised
entered,
The address
analyse.
as the
record
to
yet in
stored
data is
is necessary
appears
rows,
problems,
be checked
that
details
also
such
name and address
should
to
of three
Road is
order to resolve
part of the
Roy Rogers
a total
is
automatic
and
write
rules
also
the
type
of
measurements.
should
which
as
these
used
in
be in the
US?
conversions
of the
warehouse.
Missing values Often,
when
you
information
extract
may not
or fields
may not
being
terms
the
of the
to
the
example,
15
are
referential In
be flagged
source.
for
some
3, Relational database.
matches
the
applied
To to
extracted
any
to
be simply
if the
be completed,
value
then
missing
may simply
with
missing
contained
ignored.
made to
missing
from
ensure
each
key
extracted
that
integrity
value
different this
relationship
Characteristics,
Referential
If critical,
establish within
the
the
is
is to
error,
upon
not
the
critical,
the record
missing field
weight
mismatched
depends
the field
record
and
human
been
values
then
Sometimes,
age
due to
have
within
an alternative
states
a table
from
does
into
be
in
you learnt
to
not
will determine
referential
a foreign
systems
violations
happen,
the data
that
that
key
which it is related.
source
databases,
that
and then inserted
action
primary
combined
occur.
could
and an attempt
However,
Model
made on all data that is
to
value
deal
data
missing.
give their
in
containing
value
by going
is time-dependent,
not
extract
the
for
record
until
complete.
a relational
data is
missing
If the
to
may be
or data
How to
warehouse.
may be
may decline
values
system,
of sources.
data
values
people
integrity
Chapter
that
the
warehouse,
all cases. In addition,
a number
within the
could
original
waiting
all fields
field
BI function,
missing field
back
to
the
For example,
stage in the source
across
of the
into
collected.
at the input
selected
significance
been
be applicable
no data available from
individual
have
the
must
prior to insertion
of
data
status
the
be enforced
entry
data
referential columns
in
or an entry
checks
must be
warehouse.
constraints
warehouse key
a null
integrity
into
integrity
of foreign
must
have
Referential
of referential
a set
integrity
are
As
more likely
integrity
rules
when records
are
are first
warehouse.
Chapter
It is
very important
handling In
missing
order
the
to
support
warehouse
The final
data into is,
very time
Once
the
often it is
is
the
update
(the
time
15.5
to
star
takes
unique
it
on the update
database
the
database.
schema relational
served
advanced
the
has four
In
from
volume
will need
of
to
a data
which the
about
data).
available
and is
once
data is
moved into
usually
done in two
and is
used initially
and transformation
routines
of data
and
be updated
processing
within
the
to load
data
stages. historical
being
required,
warehouse
organisations
or refreshed
business
at regular
and the
used
(that
the first
time
less
intervals.
scheduling
complex
than
of its
the
first
and, of course, there is less
systems
should
cycle
is
are less intricate
warehouse)
effect,
the
data
and networks,
be scheduled
used to
schema
relational
and therefore
during
How business
time
data.
load.
However,
the load
non-business
requirements
structures facts,
on
map multidimensional
creates
database.
ER and
the The
near star
normalisation
window
hours.
schema
did
decision support
equivalent
not
of a was
yield
data
multidimensional
developed
a database
because
structure
that
well.
an easily implemented
relational
components:
star
techniques,
analysis
yield
the
existing
modelling
Star schemas preserving
is
777
and rules for
(data
data
Intelligence
sCheMas
a relational
existing
which
Business
fixes
metadata
exactly
in
only
extraction
The star schema is a data modelling technique into
know
process
place
organisations
routines
on the
update
integrity
warehouse
for
intensive.
The
will put pressure
to
as loading,
and the large
and transformation
it takes
load,
is live,
dependent
activities.
The extraction
use)
warehouse
updated
essential
is known
time
as referential
data
Databases
analysis.
Due to
in
and resource
data
intelligence
being
the
be a very time-consuming
first
warehouse. not
such
within
it is
data
ETL process
as the
routines
documented
intelligence,
data can
data
systems
load is
are
effective
of the
known
the
older
enable
Loading
stage,
all transformation
etc.
business
to
stage
warehouse. The first
that
values
15
model for
which
dimensions,
the
multidimensional
operational
attributes
and
database
attribute
is
data built.
analysis
The
basic
while still
star
schema
hierarchies.
15.5.1 Facts
Facts are numeric measurements (values) that represent a specific business aspect or activity. For example, sales figures are numeric measurements that represent product and/or service sales. Facts commonly used in business data analysis are units, costs, prices and revenues. Facts are normally stored in a fact table that is the centre of the star schema. The fact table contains facts that are linked through their dimensions (covered in the next section).
Facts can also be computed or derived at run time. Such computed or derived facts are sometimes called metrics to differentiate them from stored facts. The fact table is updated periodically with data from operational databases.
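The difference between stored facts and metrics derived at run time can be illustrated with a small Python sketch. The class and field names below are assumptions chosen for the example; they do not come from any particular data warehouse product.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SalesFact:
    """One fact row: the three fields are stored facts."""
    units: int
    unit_price: float
    unit_cost: float

    @property
    def revenue(self) -> float:
        # Metric: derived at run time from stored facts, not stored itself.
        return self.units * self.unit_price

    @property
    def margin(self) -> float:
        # Metric: revenue minus the total cost of the units sold.
        return self.revenue - self.units * self.unit_cost


fact = SalesFact(units=10, unit_price=2.5, unit_cost=1.5)
```

Whether a value such as revenue is stored or derived is a design decision: storing it saves work at query time, while deriving it avoids keeping a redundant column up to date.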
15.5.2 Dimensions
Dimensions are qualifying characteristics that provide additional perspectives to a given fact. Recall that dimensions are of interest because DSS data are almost always viewed in relation to other data. For instance, sales might be compared by product from region to region and from one time period to the next. The kind of problem typically addressed by a DSS might be as follows: make a comparison of the sales of unit X by region for the first quarters of 2014 through 2018. In that example, sales have product, location and time dimensions. In effect, dimensions are the magnifying glass through which you study the facts. Such dimensions are normally stored in dimension tables. Figure 15.7 depicts a star schema for sales with product, location and time dimensions.

FIGURE 15.7 Simple star schema
[Figure: a sales fact (125 000) at the centre, linked to a product dimension (Apple iPad), a location dimension (Southeast) and a time dimension (May 2018).]
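The comparison described above (sales of one product by region for the first quarters of a range of years) amounts to filtering facts on the product and time dimensions and grouping them by location. The following sketch shows the idea in Python; the rows and the function name are made up for illustration.

```python
from collections import defaultdict

# (product, region, year, quarter, units_sold): made-up rows for illustration
sales = [
    ("X", "North", 2014, 1, 120),
    ("X", "South", 2014, 1, 80),
    ("X", "North", 2015, 1, 150),
    ("X", "North", 2015, 2, 90),   # second quarter, filtered out
    ("Y", "North", 2014, 1, 300),  # different product, filtered out
]


def first_quarter_sales(rows, product, first_year, last_year):
    """Total first-quarter units of one product per (region, year)."""
    totals = defaultdict(int)
    for prod, region, year, quarter, units in rows:
        if prod == product and quarter == 1 and first_year <= year <= last_year:
            totals[(region, year)] += units
    return dict(totals)


result = first_quarter_sales(sales, "X", 2014, 2018)
```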
15.5.3 Attributes
Each dimension table contains attributes. Attributes are often used to search, filter or classify facts. Dimensions provide descriptive characteristics about the facts through their attributes. Therefore, the data warehouse designer must define common business attributes that will be used by the data analyst to narrow a search, group information or describe dimensions. Using a sales example, some possible attributes for each dimension are illustrated in Table 15.9.
TABLE 15.9 Possible attributes for sales dimensions

Dimension name   Description                                     Possible attributes
Location         Anything that provides a description of the     Region, country, city, store and so on
                 location. Example: East London, Store 101,
                 Eastern Cape and SA
Product          Anything that provides a description of the     Product type, product ID, brand, package,
                 product sold. For example, hair care product,   presentation, colour, size and so on
                 shampoo, Natural Essence brand, 150 ml bottle
                 and blue liquid
Time             Anything that provides a time frame for the     Year, quarter, month, week, day, time of
                 sales fact. For example, the year 2018, the     day, and so on
                 month of July, the date 29/07/2018, and the
                 time 4:46 p.m.
These product, location and time dimensions add a business perspective to the sales facts. The data analyst can now group the sales figures for a given product, in a given region and at a given time. The star schema, through its facts and dimensions, can provide the data in the required format when needed. And it can do so without imposing the burden of the additional and unnecessary data (such as order number, purchase order number and status) that commonly exist in operational databases.
Conceptually, the sales example's multidimensional data model is best represented by a three-dimensional cube. Of course, this does not imply that there is a limit on the number of dimensions that can be associated to a fact table. There is no mathematical limit to the number of dimensions used. However, using a three-dimensional model makes it easy to visualise the problem. In this three-dimensional example, using multidimensional data analysis jargon, the cube illustrated in Figure 15.8 represents a view of sales dimensioned by product, location and time.

FIGURE 15.8 Three-dimensional view of sales
[Figure: conceptual three-dimensional cube of sales by product, location and time. Sales facts are stored in the intersection of each product, time and location dimension.]
Note that each sales value stored in the cube in Figure 15.8 is associated with the location, product and time dimensions. However, keep in mind that this cube is only a conceptual representation of multidimensional data, and it does not show how the data are physically stored in a data warehouse.
Whatever the underlying database technology, one of the main features of multidimensional analysis is its ability to focus on specific slices of the cube. For example, the product manager may be interested in examining the sales of a product, while the store manager is interested in examining the sales made by a particular store. Using multidimensional jargon, the ability to focus on slices of the cube to perform a more detailed analysis is known as slice and dice. Figure 15.9 illustrates the slice-and-dice concept. As you look at Figure 15.9, note that each cut across the cube yields a slice. Intersecting slices produce small cubes that constitute the dice part of the slice-and-dice operation.

FIGURE 15.9 Slice-and-dice view of sales
[Figure: the sales cube with location, product and time axes, showing the sales manager's view of sales data and the product manager's view of sales data as slices of the cube.]
To slice and dice, it must be possible to identify each slice of the cube. This is done by using the values of each attribute in a given dimension. For example, to use the location dimension, you might need to define a STORE_ID attribute in order to focus on a particular store.
Given the requirement for attribute values in a slice-and-dice environment, let's re-examine Table 15.9. Note that each attribute adds an additional perspective to the sales facts, thus setting the stage for finding new ways to search, classify and possibly aggregate information. For example, the location dimension adds a geographic perspective of where the sales took place: in which country, region, city, store and so on. All of the attributes are selected with the objective of providing decision support data to the end user so that he or she can study sales by each of the dimension's attributes.
Time is an especially important dimension. The time dimension provides a framework from which sales patterns can be analysed and, possibly, predicted. Also, the time dimension plays an important role when the data analyst is interested in looking at sales aggregates by quarter, month, week, and so on. Given the importance and universality of the time dimension from a data analysis perspective, many vendors have added automatic time dimension management features to their data warehousing products.
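Slicing by attribute value can be sketched with a small in-memory cube. In this illustration the cube is a Python dictionary keyed by (product, store_id, month); the sample values, the dimension order and the function name are assumptions made for the example, and a real OLAP tool would store and index the data quite differently.

```python
# Each cell of the conceptual cube: (product, store_id, month) -> sales value
cube = {
    ("Shampoo", "S101", "2018-05"): 40,
    ("Shampoo", "S102", "2018-05"): 25,
    ("Soap", "S101", "2018-05"): 10,
    ("Soap", "S101", "2018-06"): 15,
}
DIMENSIONS = ("product", "store_id", "month")


def slice_cube(cube, **fixed):
    """Keep only the cells whose coordinates match every fixed attribute value."""
    positions = {DIMENSIONS.index(name): value for name, value in fixed.items()}
    return {
        key: value
        for key, value in cube.items()
        if all(key[pos] == wanted for pos, wanted in positions.items())
    }


# A slice: everything sold by one store (the store manager's view).
store_view = slice_cube(cube, store_id="S101")
# Intersecting two slices gives a smaller cube (the dice).
dice = slice_cube(cube, store_id="S101", product="Soap")
```

Fixing one attribute value yields a slice; fixing values on two or more dimensions at once corresponds to intersecting slices.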
15.5.4 Attribute hierarchies
Attributes within dimensions can be ordered in a well-defined attribute hierarchy. The attribute hierarchy provides a top-down data organisation that is used for two main purposes: aggregation and drill-down/roll-up data analysis. For example, Figure 15.10 shows how the location dimension attributes can be organised in a hierarchy by country, region, city and store.

FIGURE 15.10 Location attribute hierarchy
[Figure: Country, Region, City, Store, from top to bottom. The attribute hierarchy allows the end user to perform drill-down and roll-up searches.]
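A roll-up along the location hierarchy can be sketched as a grouping on a prefix of the hierarchy. In the Python example below, the sample rows and the function name are made up for illustration; only the country, region, city, store ordering comes from the text.

```python
from collections import defaultdict

HIERARCHY = ("country", "region", "city", "store")

# Made-up store-level sales rows for illustration
store_sales = [
    {"country": "SA", "region": "Eastern Cape", "city": "East London",
     "store": "Store 101", "sales": 100},
    {"country": "SA", "region": "Eastern Cape", "city": "East London",
     "store": "Store 102", "sales": 50},
    {"country": "SA", "region": "Western Cape", "city": "Cape Town",
     "store": "Store 201", "sales": 70},
]


def roll_up(rows, level):
    """Aggregate sales at the given level of the location hierarchy."""
    depth = HIERARCHY.index(level) + 1
    totals = defaultdict(int)
    for row in rows:
        key = tuple(row[name] for name in HIERARCHY[:depth])
        totals[key] += row["sales"]
    return dict(totals)


by_country = roll_up(store_sales, "country")
by_city = roll_up(store_sales, "city")
```

Drilling down is the reverse move: calling `roll_up` with a lower level (for example `"store"` after `"city"`) decomposes each total into its parts.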
The attribute hierarchy provides the capability to perform drill-down and roll-up searches in a data warehouse. For example, suppose a data analyst looks at the answers to the query, How does the 2013 month-to-date sales performance compare to the 2019 month-to-date sales performance? The data analyst spots a sharp sales decline for March 2019. The data analyst might decide to drill down inside the month of March to see how sales by region compared to those of the previous year. By doing that, the analyst can determine whether the low March sales were reflected in all regions or in only a particular region. This type of drill-down operation can even be extended until the data analyst identifies the store that is performing below the norm.
The just-described March sales scenario is possible because the attribute hierarchy allows the data warehouse and OLAP systems to have a defined path that will identify how data are to be decomposed and aggregated for drill-down and roll-up operations. It is not necessary for all attributes to be part of an attribute hierarchy; some attributes exist merely to provide narrative descriptions of the dimensions. But keep in mind that the attributes from different dimensions can be grouped to form a hierarchy. For example, after you drill down from city to store, you may want to drill down using the product dimension so the manager can identify slow products in the store. The product dimension can be based on the product group (dairy, meat, and so on) or on the product brand (Brand A, Brand B, and so on).
Figure 15.11 illustrates a scenario in which the data analyst studies sales facts, using the product, time and location dimensions. In this example, the product dimension is set to All products, meaning that the data analyst will see all products (A, B and C) on the y-axis. The time dimension (x-axis) is set to Quarter, meaning that the data are aggregated by quarters (for example, Q1, Q2, Q3 and Q4). Finally, the location dimension is initially set to Country, thus ensuring that each cell contains the total sales for each country for a given product in a given quarter.

FIGURE 15.11 Attribute hierarchies in multidimensional analysis
[Figure: a grid of products (Product A, Product B, Product C, ........, Total) against quarters (Q1, Q2, Q3, Q4, Total of quarters). Time dimension hierarchy: Year, Quarter, Month, Week. Product dimension hierarchy: All products, By product type, One product. Location dimension hierarchy: Country, Region, City, Store.]
The simple scenario illustrated in Figure 15.11 provides the data analyst with three different information paths. On the product dimension (the y-axis), the data analyst can request to see all products, products grouped by type, or just one product. On the time dimension (the x-axis), the data analyst can request time-variant data at different levels of aggregation: year, quarter, month or week. Each sales value initially shows the total sales, by country, of each product. When a GUI is used, the data analyst clicks on the country cell to drill down and see data by region within the country. Clicking again on one of the region values gives the sales for each city in the region, and so forth.
As the preceding examples illustrate, attribute hierarchies determine how the data in the data warehouse are extracted and presented. The attribute hierarchy information is stored in the DBMS's data dictionary and is used by the OLAP tool to access the data warehouse properly. Once such access is ensured, query tools must be closely integrated with the data warehouse's metadata and they must support powerful analytical capabilities.

15.5.5 Star schema representation
Facts and dimensions are normally represented by physical tables in the data warehouse database. The fact table is related to each dimension table in a many-to-one (*:1) relationship. In other words, many fact rows are related to each dimension row. Using the sales example, you can conclude that each product appears many times in the SALES fact table.
Fact and dimension tables are related by foreign keys and are subject to the familiar primary key/foreign key constraints. The primary key on the 1 side, the dimension table, is stored as part of the primary key on the many side, the fact table. Because the fact table is related to many dimension tables, the primary key of the fact table is a composite primary key. Figure 15.12 illustrates the relationships among the sales fact table and the product, location and time dimension tables. To show you how easily the star schema can be expanded, a customer dimension has been added to the mix. Adding the customer dimension merely required including the CUST_ID in the SALES fact table and adding the CUSTOMER table to the database.

FIGURE 15.12 Star schema for sales
[Figure: the SALES fact table at the centre, with each dimension table related to it 1:*.]
TIME (365 records): TIME_ID, TIME_YEAR, TIME_QUARTER, TIME_MONTH, TIME_DAY, TIME_CLOCKTIME
LOCATION (25 records): LOC_ID, LOC_DESCRIPTION, COUNTRY_ID, LOC_REGION, LOC_CITY
CUSTOMER (125 records): CUST_ID, CUST_LNAME, CUST_FNAME, CUST_INITIAL, CUST_DOB
PRODUCT (3000 records): PROD_ID, PROD_DESCRIPTION, PROD_TYPE_ID, PROD_BRAND, PROD_COLOUR, PROD_SIZE, PROD_PACKAGE, PROD_PRICE
SALES (3 000 000 records; daily sales aggregates by store, customer and product): TIME_ID, LOC_ID, CUST_ID, PROD_ID, SALES_QUANTITY, SALES_PRICE, SALES_TOTAL
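The structure of the sales star schema can be sketched with Python's built-in sqlite3 module. This is a minimal illustration rather than a production design: only a few of the attributes listed for each table are included, the data types are assumptions, and SQLite stands in for whatever DBMS hosts the warehouse.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when asked
conn.executescript("""
CREATE TABLE "TIME"   (TIME_ID INTEGER PRIMARY KEY, TIME_YEAR INTEGER,
                       TIME_QUARTER INTEGER, TIME_MONTH INTEGER, TIME_DAY INTEGER);
CREATE TABLE LOCATION (LOC_ID INTEGER PRIMARY KEY, LOC_DESCRIPTION TEXT,
                       LOC_REGION TEXT, LOC_CITY TEXT);
CREATE TABLE CUSTOMER (CUST_ID INTEGER PRIMARY KEY, CUST_LNAME TEXT, CUST_FNAME TEXT);
CREATE TABLE PRODUCT  (PROD_ID INTEGER PRIMARY KEY, PROD_DESCRIPTION TEXT, PROD_BRAND TEXT);

-- Fact table: one foreign key per dimension; together they form the composite primary key.
CREATE TABLE SALES (
    TIME_ID INTEGER NOT NULL REFERENCES "TIME" (TIME_ID),
    LOC_ID  INTEGER NOT NULL REFERENCES LOCATION (LOC_ID),
    CUST_ID INTEGER NOT NULL REFERENCES CUSTOMER (CUST_ID),
    PROD_ID INTEGER NOT NULL REFERENCES PRODUCT (PROD_ID),
    SALES_QUANTITY INTEGER,
    SALES_PRICE REAL,
    SALES_TOTAL REAL,
    PRIMARY KEY (TIME_ID, LOC_ID, CUST_ID, PROD_ID)
);

INSERT INTO "TIME"   VALUES (1, 2018, 2, 5, 14);
INSERT INTO LOCATION VALUES (1, 'Store 101', 'Eastern Cape', 'East London');
INSERT INTO CUSTOMER VALUES (1, 'Rogers', 'Roy');
INSERT INTO PRODUCT  VALUES (1, 'Shampoo 150 ml', 'Natural Essence');
INSERT INTO SALES    VALUES (1, 1, 1, 1, 2, 3.5, 7.0);
""")

sales_rows = conn.execute("SELECT COUNT(*) FROM SALES").fetchone()[0]
```

With the foreign keys enforced, a fact row that points at a dimension row that does not exist is rejected, and a second row with the same combination of the four keys violates the composite primary key.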
The composite primary key for the SALES fact table is composed of TIME_ID, LOCATION_ID, CUST_ID and PRODUCT_ID. Each record in the SALES fact table is uniquely identified by the combination of values for each of the fact table's foreign keys. By default, the fact table's primary key is always formed by combining the foreign keys pointing to the dimension tables to which they are related. In this case, each sales record represents each product sold to a specific customer, at a specific time and in a specific location. In this schema, the time dimension table represents daily periods, so the SALES fact table represents daily sales aggregates by product and by customer.
Since fact tables contain the actual values used in the decision support process, those values are repeated many times in the fact tables. Therefore, the fact tables are always the largest tables in the star schema. Since the dimension tables contain only non-repetitive information (all unique salespersons, all unique products, and so on), the dimension tables are always smaller than the fact tables.
In a typical star schema, each dimension record is related to thousands of fact records. For example, widget appears only once in the product dimension, but it has thousands of corresponding records in the SALES fact table. That characteristic of the star schema facilitates data retrieval functions because most of the time the data analyst will look at the facts through the dimension's attributes. Therefore, a DSS-optimised data warehouse DBMS first searches the smaller dimension tables before accessing the larger fact tables.
Data warehouses usually have many fact tables. Each fact table is designed to answer specific decision support questions. For example, suppose that you develop a new interest in orders while maintaining your original interest in sales. In that scenario, you should maintain an ORDERS fact table and a SALES fact table in the same data warehouse. If orders are considered to be an organisation's key interest, the ORDERS fact table should be the centre of a star schema that might have vendor, product and time dimensions. In that case, an interest in vendors yields a new vendor dimension, represented by a new VENDOR table in the database. The product dimension is represented by the same product table used in the initial sales star schema. However, given the interest in orders as well as sales, the time dimension now requires special attention. If the orders department uses the same time periods as the sales department, time can be represented by the same time table. If different time periods are used, you must create another table, perhaps named ORDER_TIME, to represent the time periods used by the orders department. In Figure 15.13, the orders star schema shares the product, vendor and time dimensions.

FIGURE 15.13 Orders star schema
[Figure: the ORDER fact table at the centre, with each dimension table related to it 1:*.]
PRODUCT (3000 records): PROD_ID, PROD_DESCRIPTION, PROD_TYPE_ID, PROD_BRAND, PROD_COLOUR, PROD_SIZE, PROD_PACKAGE, PROD_PRICE
TIME (365 records): TIME_ID, TIME_YEAR, TIME_QUARTER, TIME_MONTH, TIME_DAY, TIME_CLOCKTIME
VENDOR (50 records): VEND_ID, VEND_NAME, VEND_AREACODE, VEND_PHONE, VEND_EMAIL
ORDER (85 000 records; daily sales aggregates by product and vendor): TIME_ID, PROD_ID, VEND_ID, ORDER_QUANTITY, ORDER_PRICE, ORDER_AMOUNT
Multiple fact tables can also be created for performance and semantic reasons. The following section will explain several performance-enhancing techniques that can be used within the star schema.

15.5.6 Star schema performance-improving techniques
The creation of a database that provides fast and accurate answers to data analysis queries is the data warehouse design's prime objective. Therefore, performance-enhancement actions might target query speed through the facilitation of SQL code as well as through better semantic representation of business dimensions. Four techniques are often used to optimise data warehouse design:
Normalising dimensional tables
Maintaining multiple fact tables to represent different aggregation levels
Denormalising fact tables
Partitioning and replicating tables
Normalising Dimensional Tables
Dimensional tables are normalised to achieve semantic simplicity and facilitate end-user navigation through the dimensions. For example, if the location dimension table contains transitive dependencies among region, province and city, you can revise those relationships to the 3NF (third normal form), as shown in Figure 15.14. (If necessary, revise normalisation techniques in Chapter 7, Normalising Database Designs.) The star schema shown in Figure 15.14 is known as a snowflake schema. A snowflake schema is a type of star schema in which the dimension tables can have their own dimension tables. The snowflake schema is usually the result of normalising dimension tables.
By normalising the dimension tables, you simplify the data-filtering operations related to the dimensions. In this example, the COUNTRY, REGION, CITY and LOCATION tables contain very few records compared to the SALES fact table. Only the LOCATION table is directly related to the SALES fact table.

FIGURE 15.14 Normalised dimension tables
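To make the snowflake structure concrete, the sketch below builds a normalised location dimension in SQLite and aggregates sales by country. The column names and sample rows are assumptions made for the illustration; the point is the chain of joins needed to walk from the fact table up through LOCATION, CITY and REGION to COUNTRY.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE COUNTRY  (COUNTRY_ID INTEGER PRIMARY KEY, COUNTRY_NAME TEXT);
CREATE TABLE REGION   (REGION_ID INTEGER PRIMARY KEY, REGION_NAME TEXT,
                       COUNTRY_ID INTEGER REFERENCES COUNTRY (COUNTRY_ID));
CREATE TABLE CITY     (CITY_ID INTEGER PRIMARY KEY, CITY_NAME TEXT,
                       REGION_ID INTEGER REFERENCES REGION (REGION_ID));
CREATE TABLE LOCATION (LOC_ID INTEGER PRIMARY KEY, LOC_DESCRIPTION TEXT,
                       CITY_ID INTEGER REFERENCES CITY (CITY_ID));
CREATE TABLE SALES    (LOC_ID INTEGER REFERENCES LOCATION (LOC_ID), SALES_TOTAL REAL);

INSERT INTO COUNTRY  VALUES (1, 'SA'), (2, 'UK');
INSERT INTO REGION   VALUES (10, 'Eastern Cape', 1), (20, 'North West', 2);
INSERT INTO CITY     VALUES (100, 'East London', 10), (200, 'Manchester', 20);
INSERT INTO LOCATION VALUES (1000, 'Store 101', 100), (2000, 'Store 201', 200);
INSERT INTO SALES    VALUES (1000, 50.0), (1000, 25.0), (2000, 40.0);
""")

# Only LOCATION is joined directly to the fact table; the other
# dimension tables are reached one step at a time.
rows = conn.execute("""
SELECT C.COUNTRY_NAME, SUM(S.SALES_TOTAL)
FROM SALES S
JOIN LOCATION L ON L.LOC_ID = S.LOC_ID
JOIN CITY CI    ON CI.CITY_ID = L.CITY_ID
JOIN REGION R   ON R.REGION_ID = CI.REGION_ID
JOIN COUNTRY C  ON C.COUNTRY_ID = R.COUNTRY_ID
GROUP BY C.COUNTRY_NAME
ORDER BY C.COUNTRY_NAME
""").fetchall()
```

With a single denormalised LOCATION table, the same total would need only one join from SALES to LOCATION.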
NOTE
Although using the dimension tables shown in Figure 15.14 gains structural simplicity, there is a price to pay for that simplicity. For example, if you want to aggregate the data by country, you must use a four-table join, thus increasing the complexity of the SQL statements. The star schema in Figure 15.12 uses a LOCATION dimension table that greatly facilitates data retrieval by eliminating multiple join operations. This is yet another example of the trade-offs that designers must consider.

Maintaining Multiple Fact Tables Representing Different Aggregation Levels
You can also speed up query operations by creating and maintaining multiple fact tables related to each level of aggregation (country, region and city) in the location dimension. These aggregate tables are precomputed at the data-loading phase rather than at run time. The purpose of this technique is to save processor cycles at run time, thereby speeding up data analysis. An end-user query tool optimised for decision analysis then properly accesses the summarised fact tables instead of computing the values by accessing a lower level of detail fact table. This technique is illustrated in Figure 15.15, which adds aggregate fact tables for country, region and city to the initial sales example.

FIGURE 15.15 Multiple fact tables
15
The
data
warehouse
database. because of
use
These the
Copyright review
2020 has
Cengage deemed
any
All suppressed
is to
processing
designer
Learning. that
must identify
multiple aggregate
objective
and the
warehouse
Editorial
designer
Rights
does
May not
not materially
be
access
required
must select
Reserved. content
fact tables
minimise time
which levels
to
which
copied, affect
scanned, the
overall
are updated
duplicated, learning
during
and processing calculate
aggregation
or
of aggregation
in experience.
whole
time,
a given fact
or in Cengage
part.
tables
Due Learning
to
electronic reserves
to
pre-compute
each load according
to
mode. And
expected
level
at run
frequency
time,
the
data
create.
rights, the
store in the
cycle in batch to the
aggregation
and
right
some to
third remove
party additional
content
may content
be
suppressed at
any
time
from if
the
subsequent
eBook rights
and/or restrictions
eChapter(s). require
it
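The pre-computation described above can be sketched in a few lines. This is only an illustration, assuming an in-memory SQLite database; the table names (sales_detail, sales_by_country, sales_by_region, sales_by_city) and the sample rows are hypothetical and do not come from the chapter's figures.

```python
import sqlite3

# Hypothetical detail rows: (country, region, city, sales amount).
SAMPLE_ROWS = [
    ("South Africa", "Gauteng", "Johannesburg", 100.0),
    ("South Africa", "Gauteng", "Pretoria", 150.0),
    ("South Africa", "Western Cape", "Cape Town", 100.0),
    ("Botswana", "South-East", "Gaborone", 80.0),
]

def build_warehouse(rows):
    """Load the detail fact rows, then pre-compute one aggregate table per location level."""
    con = sqlite3.connect(":memory:")
    con.execute(
        "CREATE TABLE sales_detail (country TEXT, region TEXT, city TEXT, amount REAL)"
    )
    con.executemany("INSERT INTO sales_detail VALUES (?, ?, ?, ?)", rows)
    # Batch load cycle: every aggregate is calculated once here, not at query time.
    levels = (
        ("country", "country"),
        ("region", "country, region"),
        ("city", "country, region, city"),
    )
    for level, cols in levels:
        con.execute(
            f"CREATE TABLE sales_by_{level} AS "
            f"SELECT {cols}, SUM(amount) AS total FROM sales_detail GROUP BY {cols}"
        )
    return con

def country_total(con, country):
    """Run-time query: read one pre-computed row instead of summing the detail rows."""
    row = con.execute(
        "SELECT total FROM sales_by_country WHERE country = ?", (country,)
    ).fetchone()
    return row[0] if row else 0.0
```

In a real warehouse the aggregate tables would be refreshed during each load cycle; here they are simply built once after the detail rows are inserted.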
Denormalising Fact Tables

Denormalising fact tables improves data access performance and saves data storage space. The latter objective, however, is becoming less of an issue. Data storage costs decrease almost daily, and DBMS limitations that restrict database and table size limits, record size limits, and the maximum number of records in a single table, have far more negative effects than raw storage space costs.

Denormalisation improves performance by using a single record to store data that normally take many records. For example, to compute the total sales for all products in all regions, you might have to access the region sales aggregates and summarise all the records in this table. If you have 300 000 product sales records, you could be summarising at least 300 000 rows. Although this might not be a very taxing operation for a DBMS, a comparison of, for example, ten years' worth of previous sales begins to bog down the system. In such cases, it is useful to have special aggregate tables that are denormalised. For example, a YEAR_TOTALS table might contain the following fields: YEAR_ID, MONTH_1, MONTH_2 ... MONTH_12, and each year's total. Such tables can easily be used to serve as a basis for year-to-year comparisons at the top month level, the quarter level or the year level. Here again, design criteria, such as frequency of use and performance requirements, are evaluated against the possible overload placed on the DBMS to manage the denormalised relations.
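A denormalised YEAR_TOTALS row of the kind described above can be sketched as follows. This is a minimal illustration, not the book's implementation: the field name YEAR_TOTAL (for "each year's total") and the input format of (year, month, amount) tuples are assumptions made for the example.

```python
def build_year_totals(monthly_sales):
    """Fold many (year, month, amount) records into one denormalised row per year.

    Each row holds YEAR_ID, MONTH_1 .. MONTH_12 and a YEAR_TOTAL field
    (YEAR_TOTAL is a hypothetical name for the year's total).
    """
    table = {}
    for year, month, amount in monthly_sales:
        if year not in table:
            row = {"YEAR_ID": year}
            for m in range(1, 13):
                row[f"MONTH_{m}"] = 0.0
            row["YEAR_TOTAL"] = 0.0
            table[year] = row
        table[year][f"MONTH_{month}"] += amount
        table[year]["YEAR_TOTAL"] += amount
    return table

def year_to_year_change(table, year_a, year_b, field="YEAR_TOTAL"):
    """Compare two years by reading a single record for each year."""
    return table[year_b][field] - table[year_a][field]
```

For example, `year_to_year_change(table, 2018, 2019, "MONTH_1")` compares the January figures of two years by reading two rows, whatever the number of underlying sales records.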
Partitioning and Replicating Tables

Since table partitioning and replication were covered in detail in Chapter 14, Distributed Databases, these techniques are discussed here only as they specifically relate to the data warehouse. Table partitioning and replication are particularly important when a DSS is implemented in widely dispersed geographic areas. Partitioning splits a table into subsets of rows or columns and places the subsets close to the client computer to improve data access time. Replication makes a copy of a table and places it in a different location, also to improve access time.

No matter which performance-enhancement scheme is used, time is the most common dimension used in business data analysis. Therefore, it is very common to have one fact table for each level of aggregation defined within the time dimension. For example, in the sales example, you might have five aggregate SALES fact tables: daily, weekly, monthly, quarterly and yearly. Those fact tables must have an implicit or explicit periodicity defined. Periodicity, usually expressed as current year only, previous years or all years, provides information about the timespan of the data stored in the table.

At the end of each year, daily sales for the current year are moved to another table that contains previous years' daily sales only. This table actually contains all sales records from the beginning of operations, with the exception of the current year. The data in the current year and previous years' tables thus represent the complete sales history of the company. The previous years' sales table can be replicated at several locations to avoid remote access to the historic sales data, which can cause slow response time. The possible size of this table is enough to intimidate all but the bravest of query optimisers. Here is one case in which denormalisation would be of value!

In this section, you learnt how the star schema design technique allows you to model data optimised for business decision making. Business intelligence analytics tools use the data warehouse data as the raw materials to generate business knowledge.

15.6 Data Analytics
Data analytics is a subset of BI functionality that encompasses a wide range of mathematical, statistical and modelling techniques with the purpose of extracting knowledge from data. Data analytics is used at all levels within the BI framework, including queries and reporting, monitoring and alerting, and data visualisation. Hence, data analytics is a shared service that is crucial to what BI adds to an organisation.

Data analytics represents what business managers really want from BI: the ability to extract actionable business insight from current events and foresee future problems or opportunities. Data analytics discovers characteristics, relationships, dependencies or trends in the organisation's data, and then explains the discoveries and predicts future events based on the discoveries. In practice, data analytics is better understood as a continuous spectrum of knowledge acquisition that goes from discovery to explanation to prediction. The outcomes of data analytics then become part of the information framework on which decisions are built. Based on the previous discussion, data analytics tools can be grouped into two separate (but closely related and often overlapping) areas:

• Explanatory analytics focuses on discovering and explaining data characteristics and relationships based on existing data. Explanatory analytics uses statistical tools to formulate hypotheses, test them, and answer the how and why of such relationships; for example, how do past sales relate to previous customer promotions?
• Predictive analytics focuses on predicting future data outcomes with a high degree of accuracy. Predictive analytics uses sophisticated statistical tools to help the end user create advanced models that answer questions about future data occurrences; for example, what would next month's sales be based on a given customer promotion?

You can think of explanatory analytics as explaining the past and present, while predictive analytics forecasts the future. However, you need to understand that both sciences work together; predictive analytics uses explanatory analytics as a stepping stone to create predictive models.

Data analytics has evolved over the years from simple statistical analysis of business data to dimensional analysis with OLAP tools, and then from data mining that discovers data patterns, relationships and trends to its current status of predictive analytics. The next sections illustrate the basic characteristics of data mining and predictive analytics.
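The way predictive analytics builds on explanatory analytics can be illustrated with a very small example. The sketch below assumes a simple least-squares line as the statistical tool and uses made-up promotion and sales figures; real analytics tools use far richer models, so treat this only as an illustration of the two steps.

```python
def fit_line(xs, ys):
    """Explanatory step: fit sales = slope * promotion + intercept by least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

def predict(model, x):
    """Predictive step: reuse the explanatory model to estimate a future outcome."""
    slope, intercept = model
    return slope * x + intercept

# Hypothetical history: promotion spend and resulting sales for four past months.
PROMOTION = [1.0, 2.0, 3.0, 4.0]
SALES = [12.0, 14.0, 16.0, 18.0]

model = fit_line(PROMOTION, SALES)       # how did past sales relate to promotions?
next_month = predict(model, 5.0)         # what would sales be for a given promotion?
```

The fitted slope answers the explanatory question (how sales moved with promotions in the past), while `predict` reuses the same model to answer the predictive question.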
15.6.1 Data Mining

Data mining refers to analysing massive amounts of data to uncover hidden trends, patterns and relationships; to form computer models to simulate and explain the findings; and then to use such models to support business decision making. In other words, data mining focuses on the discovery and explanation stages of knowledge acquisition.

To put data mining in perspective, look at the pyramid in Figure 15.16, which represents how knowledge is extracted from data. Data form the pyramid base and represent what most organisations collect in their operational databases. The second level contains information that represents the purified and processed data. Information forms the basis for decision making and business understanding. Knowledge is found at the pyramid's apex and represents highly specialised information.

FIGURE 15.16 Extracting knowledge from data
[Figure: a pyramid with Data at the base, Information in the middle and Knowledge at the apex; the degree of processing runs from low at the base to high at the apex.]
Current-generation data mining tools contain many design and application variations to fit specific business requirements. Depending on the problem domain, data mining tools focus on market niches such as banking, insurance, marketing, retailing, finance and healthcare. Within a given niche, data mining tools can use certain algorithms that are implemented in different ways and applied over different data. In spite of the lack of precise standards, data mining is subject to four general phases:

1 Data preparation
2 Data analysis and classification
3 Knowledge acquisition
4 Prognosis.

In the data preparation phase, the main data sets to be used by the data mining operation are identified and cleansed of any data impurities. Because the data in the data warehouse are already integrated and filtered, the data warehouse is usually the target set for data mining operations.

The data analysis and classification phase studies the data to identify common data characteristics or patterns. During this phase, the data mining tool applies specific algorithms to find:

• Data groupings, classifications, clusters or sequences
• Data dependencies, links or relationships
• Data patterns, trends and deviations.

The knowledge acquisition phase uses the results of the data analysis and classification phase. During the knowledge acquisition phase, the data mining tool (with possible intervention by the end user) selects the appropriate modelling or knowledge acquisition algorithms. The most common algorithms used in data mining are based on neural networks, decision trees, rules induction, genetic algorithms, classification and regression trees, memory-based reasoning, and nearest neighbour and data visualisation. Hybrid algorithms also exist; for example, decision trees can be used to optimise neural networks. A data mining tool may use many of these algorithms in any combination to generate a computer model that reflects the behaviour of the target data set.

Although many data mining tools stop at the knowledge acquisition phase, others continue to the prognosis phase. In that phase, the data mining findings are used to predict future behaviour and forecast business outcomes. Examples of data mining findings can be:

• Sixty-five per cent of customers who did not use a particular credit card in the past six months are 88 per cent likely to cancel that account.
• Eighty-two per cent of customers who bought a 60-inch or larger TV are 90 per cent likely to buy an entertainment centre within the next four weeks.
• If age < 30 and income <= 25 000 and credit rating < 3 and credit amount > 25 000, then the minimum loan term is ten years.

The complete set of findings can be represented in a decision tree, a neural network, a forecasting model or a visual presentation interface that is used to project future events or results. For example, the prognosis phase might project the likely outcome of a new product rollout or a new marketing promotion. Figure 15.17 illustrates the different phases of the data mining techniques.

FIGURE 15.17 Data mining phases
[Figure: data flow from the operational database and the data warehouse through four phases. Data preparation phase: identify data set, clean data set, integrate data set. Data analysis and classification phase: classification analysis, clustering and sequence analysis, link analysis, trend and deviation analysis. Knowledge acquisition phase: select and apply algorithms (artificial neural networks, inductive logic, decision ensembles, classification and regression trees, nearest neighbour, visualisation, etc.). Prognosis phase: prediction, forecasting, modelling.]
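A rule-style data mining finding, such as the loan-term example given among the sample findings, is simple to apply once it has been discovered. The sketch below assumes the comparison operators in that rule read as <, <=, < and >; the function name and the choice to return None when the rule does not fire are illustrative assumptions.

```python
def minimum_loan_term_years(age, income, credit_rating, credit_amount):
    """Apply the sample rule: young, low-income, low-rated applicants asking for
    a large credit amount get a minimum loan term of ten years.

    Returns None when the rule does not apply (an assumption of this sketch).
    """
    if (age < 30 and income <= 25_000
            and credit_rating < 3 and credit_amount > 25_000):
        return 10
    return None
```

A prognosis-phase tool would hold many such rules, or an equivalent decision tree, and evaluate them against new records to project likely outcomes.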
Because of the nature of the data mining process, some findings might fall outside the boundaries of what business managers expect. For example, a data mining tool might find a close relationship between a customer's favourite brand of cool drink and the brand of tyres on the customer's car. Clearly, that relationship might not be held in high regard among sales managers. (In regression analysis, these relationships are commonly described by the label idiot correlation.) Fortunately, data mining usually yields more meaningful results. In fact, data mining has proven helpful in finding practical relationships among data that help define customer buying patterns, improve product development and acceptance, reduce healthcare fraud, analyse stock markets, and so on.

Data mining can be run in two modes:

• Guided. The end user guides the data mining tool step by step to explore and explain known patterns or relationships. In this mode, the end user decides which techniques to apply to the data.
• Automated. In this mode, the end user sets up the data mining tool to run automatically and uncover hidden patterns, trends and relationships. The data mining tool applies multiple techniques to find significant relationships.

As you learnt in this section, data mining methodologies focus on discovering and extracting information that describes and explains the data. For example, an explanatory model could be used to create a customer profile that describes a given customer group. However, data mining can also be used as the basis to create advanced predictive data models. For example, a predictive model could be used to predict future customer behaviour, such as a customer response to a target marketing campaign. The next section explains the use of predictive analytics in more detail. Table 15.10 contains a sample of data warehouse software vendors.

TABLE 15.10 A sample of current data warehousing vendors
Vendor | Product | Web address
Teradata | Teradata's EDW (Enterprise Data Warehouse) is one of the market leaders. The company has included new tools, innovations, and capabilities such as Hadoop-based technologies. | www.teradata.com
Oracle | Oracle is synonymous with databases and now with data warehouses. Oracle Exadata Machine is an advanced platform that includes flash storage for lower I/O overheads and Hybrid Columnar Compression for reduced I/O. | www.oracle.com
Amazon | Amazon Web Services has led the way through their cloud-based data warehousing. Amazon Redshift is their fully managed petabyte-scale solution. | aws.amazon.com
Cloudera | Cloudera Enterprise Data Hub is a Hadoop-based data storage solution. It is optimised for batch processing, advanced analytics and interactive SQL. | www.cloudera.com
MarkLogic | MarkLogic offers a NoSQL platform that offers ways to perform semantic-based queries. | www.marklogic.co

15.6.2 Predictive Analytics

Although the term predictive analytics is used by many BI vendors to indicate many different levels of functionality, the promise of predictive analytics is very attractive for businesses looking for ways to improve their bottom line. Therefore, predictive analytics is receiving a lot of marketing buzz; vendors and businesses are dedicating extensive resources to this BI area.
Predictive analytics refers to the use of advanced mathematical, statistical and modelling tools to predict future business outcomes with high degrees of accuracy.

What is the difference between data mining and predictive analytics? As you learnt earlier, data mining also has predictive capabilities. In fact, data mining and predictive analytics use similar and overlapping sets of tools, but with a slightly different focus. Data mining focuses on answering the how and what of past data, while predictive analytics focuses on creating actionable models to predict future behaviours and events. In some ways, you can think of predictive analytics as the next logical step after data mining; once you understand your data, you can use the data to predict future behaviours. In fact, most BI vendors are dropping the term data mining and replacing it with the more alluring term predictive analytics.

The origins of predictive analytics can be traced back to the banking and credit card industries. The need to profile customers and predict customer buying patterns in these industries was a critical driving force for the evolution of many modelling methodologies used in BI data analytics today. For example, based on your demographic information and purchasing history, a credit card company can use data mining models to determine what credit limit to offer, which offers you are more likely to accept, and when to send those offers. Predictive analytics received a big stimulus with the advent of social media. Companies turned to data mining and predictive analytics as a way to harvest the mountains of data stored on social media sites. Google was one of the first companies that offered targeted ads as a way to increase and personalise search experiences. Similar initiatives were used by all types of organisations to increase customer loyalty and drive up sales. Take the example of the airline and credit card industries and their frequent flyer and affinity card programs. Nowadays, many organisations use predictive analytics in an attempt to get and keep the right customers, which in turn will increase loyalty and sales.6

Predictive analytics employs mathematical and statistical algorithms, neural networks, artificial intelligence and other advanced modelling tools to create actionable predictive models based on available data. The algorithms used to build the predictive model are specific to certain types of problems and work with certain types of data. Therefore, it is important that the end user, who typically is trained in statistics and understands business, applies the proper algorithms to the problem in hand. However, thanks to constant technology advances, modern BI tools automatically apply multiple algorithms to find the optimum model. Most predictive analytics models are used in areas such as customer relationships, customer service, customer retention, fraud detection, targeted marketing and optimised pricing. Predictive analytics can add value to an organisation in many different ways; for example, it can help optimise existing processes, identify hidden problems and anticipate future problems or opportunities. However, predictive analytics is not the secret sauce to fix all business problems. Managers should carefully monitor and evaluate the value of predictive analytics models to determine their return on investment.

So far, you have learnt about data warehouses and star schemas to model and store decision support data, and about data analytics to extract knowledge from data. A BI system uses all the previously mentioned components to provide decision support to all organisational users. In the next section, you will learn about a widely used BI style known as online analytical processing.

6 Analytics Insight, Available: www.analyticsinsight.net/
15.7 Online Analytical Processing
The need for more intensive decision support prompted the introduction of a new generation of tools. Those new tools, called online analytical processing (OLAP), create an advanced data analysis environment that supports decision making, business modelling and operations research. OLAP systems share three main characteristics. They:

• Use multidimensional data analysis techniques
• Provide advanced database support
• Provide easy-to-use end-user interfaces.

Let's examine each of those characteristics.
15.7.1 Multidimensional Data Analysis Techniques

The most distinct characteristic of modern OLAP tools is their capacity for multidimensional analysis. In multidimensional analysis, data are processed and viewed as part of a multidimensional structure. This type of data analysis is particularly attractive to business decision makers because they tend to view business data as data that are related to other business data.

To better understand this view, let's examine how a business data analyst might investigate sales figures. In this case, he or she is probably interested in the sales figures as they relate to other business variables such as customers and time. In other words, customers and time are viewed as different dimensions of sales. Figure 15.18 illustrates how the operational (one-dimensional) view differs from the multidimensional view of sales.
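The move from a tabular view to a multidimensional view can be sketched by pivoting invoice rows into customer-by-date cells and aggregating along each dimension. The rows below are the DW_INVOICE values listed in Figure 15.18; the function name and the dictionary-based layout are assumptions made for this illustration.

```python
from collections import defaultdict

# (INV_NUM, INV_DATE, CUS_NAME, INV_TOTAL) rows as listed for DW_INVOICE.
INVOICES = [
    (2034, "15-May-19", "Dartonik", 1400.00),
    (2035, "15-May-19", "Summer Lake", 1200.00),
    (2036, "16-May-19", "Dartonik", 1350.00),
    (2037, "16-May-19", "Summer Lake", 3100.00),
    (2038, "16-May-19", "Trydon", 400.00),
]

def multidimensional_view(invoices):
    """Place each sale at the intersection of a customer and a date,
    and aggregate along both dimensions."""
    cells = defaultdict(float)        # (customer, date) -> sales
    by_customer = defaultdict(float)  # totals for the customer dimension
    by_date = defaultdict(float)      # totals for the time dimension
    for _inv_num, date, customer, total in invoices:
        cells[(customer, date)] += total
        by_customer[customer] += total
        by_date[date] += total
    return dict(cells), dict(by_customer), dict(by_date)

cells, by_customer, by_date = multidimensional_view(INVOICES)
```

Switching perspective from sales by customer to sales by date is then only a matter of reading `by_date` instead of `by_customer`; no further joins against the invoice rows are needed.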
As you examine Figure 15.18, note that the tabular (operational) view of sales data is not well suited to decision support because the relationship between INVOICE and LINE does not provide a business perspective of the sales data. On the other hand, the end user's view of sales data from a business perspective is more closely represented by the multidimensional view of sales than by the tabular view of separate tables. Note also that the multidimensional view allows end users to consolidate or aggregate data at different levels: total sales figures by customers and by date. Finally, the multidimensional view of data allows a business data analyst easily to switch business perspectives (dimensions) from sales by customer to sales by division, by region, and so on.

Multidimensional data analysis techniques are augmented by the following functions:

• Advanced data presentation functions: 3-D graphics, pivot tables, crosstabs, data rotation and three-dimensional cubes. Such facilities are compatible with desktop spreadsheets, statistical packages and query and report-writer packages.
• Advanced data aggregation, consolidation and classification functions that allow the data analyst to create multiple data aggregation levels, slice-and-dice data (see section 15.5.3), and drill-down and roll-up data across different dimensions and aggregation levels. For example, aggregating data across the time dimension (by week, month, quarter and year) allows the data analyst to drill down and roll up across time dimensions.
• Advanced computational functions: Business-orientated variables (market share, period comparisons, sales margins, product margins and percentage changes), financial and accounting ratios (profitability, overhead, cost allocations and returns), and statistical and forecasting functions. These functions are provided automatically and the end user does not need to redefine their components each time they are accessed.
• Advanced data modelling functions: Support for what-if scenarios, variable assessment, variable contributions to outcome, linear programming and other modelling tools.
FIGURE 15.18 Operational vs multidimensional view of sales

Database name: Ch15_Text

Table name: DW_INVOICE
INV_NUM   INV_DATE    CUS_NAME      INV_TOTAL
2034      15-May-19   Dartonik      1400.00
2035      15-May-19   Summer Lake   1200.00
2036      16-May-19   Dartonik      1350.00
2037      16-May-19   Summer Lake   3100.00
2038      16-May-19   Trydon         400.00

Table name: DW_LINE
INV_NUM   LINE_NUM   PROD_DESCRIPTION                      LINE_PRICE   LINE_QUANTITY   LINE_AMOUNT
2034      1          Optical Mouse                              45.00              20        900.00
2034      2          Wireless RF remote and laser pointer       50.00              10        500.00
2035      1          Everlast Hard Drive, 3TB                  200.00               6       1200.00
2036      1          Optical Mouse                              45.00              30       1350.00
2037      1          Optical Mouse                              45.00              10        450.00
2037      2          Router                                    205.00              10       2050.00
2037      3          Everlast Hard Drive, 3TB                  120.00               5        600.00
2038      1          NoTech Speaker Set                         50.00               8        400.00

Multidimensional View of Sales
Customer Dimension   Time Dimension
                     15-May-19    16-May-19    Totals
Dartonik              1 400.00     1 350.00    2 750.00
Summer Lake           1 800.00     3 100.00    4 900.00
Trydon                               400.00      400.00
Totals                3 200.00     4 850.00    8 050.00

Sales are located in the intersection of a customer row and time column.
Aggregations are provided for both dimensions.
• Predictive modelling allows the system to build advanced statistical models to predict future values (business outcomes) with a high percentage of accuracy.

15.7.2 Advanced Database Support

To deliver efficient decision support, OLAP tools must have advanced data access features. Such features include:

• Access to many different kinds of DBMSs, flat files and internal and external data sources
• Access to aggregated data warehouse data as well as to the detail data found in operational databases
• Advanced data navigation features such as drill-down and roll-up
• Rapid and consistent query response times
• The ability to map end-user requests, expressed in either business or model terms, to the appropriate data source and then to the proper data access language (usually SQL). The query code must be optimised to match the data source, regardless of whether the source is operational or data warehouse data
• Support for very large databases. As already explained, the data warehouse can easily and quickly grow to multiple terabytes in size.

To provide a seamless interface, OLAP tools map the data elements from the data warehouse and from the operational database to their own data dictionaries. These metadata are used to translate end-user data analysis requests into the proper (optimised) query codes, which are then directed to the appropriate data source(s).
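The metadata-driven translation described above can be sketched as a lookup from business terms to data elements, followed by query generation. Everything in this example is hypothetical: the dictionary layout, the business terms and the mapping to the DW_INVOICE columns are assumptions made for illustration, not the design of any particular OLAP product.

```python
# Hypothetical OLAP data dictionary: business term -> data source, table and column.
DATA_DICTIONARY = {
    "sales":    {"source": "warehouse", "table": "DW_INVOICE", "column": "INV_TOTAL"},
    "customer": {"source": "warehouse", "table": "DW_INVOICE", "column": "CUS_NAME"},
    "date":     {"source": "warehouse", "table": "DW_INVOICE", "column": "INV_DATE"},
}

def translate_request(measure, dimension, dictionary=DATA_DICTIONARY):
    """Translate a request such as 'sales by customer' into a data source name
    and an SQL statement, using only the metadata in the dictionary."""
    m = dictionary[measure]
    d = dictionary[dimension]
    if (m["source"], m["table"]) != (d["source"], d["table"]):
        # A fuller tool would generate the necessary joins; this sketch does not.
        raise ValueError("measure and dimension are mapped to different tables")
    sql = (
        f"SELECT {d['column']}, SUM({m['column']}) "
        f"FROM {m['table']} GROUP BY {d['column']}"
    )
    return m["source"], sql
```

For instance, `translate_request("sales", "customer")` yields the target source together with a GROUP BY query on CUS_NAME, so the end user never has to know the underlying table or column names.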
15.7.3 Easy-to-use End-user Interface

The end-user analytical interface is one of the most critical OLAP components. When properly implemented, an analytical interface permits the user to navigate the data in a way that simplifies and accelerates decision making or data analysis.

Advanced OLAP features become more useful when access to them is kept simple. OLAP tool vendors learnt this lesson early and have equipped their sophisticated data extraction and analysis tools with easy-to-use graphical interfaces. Many of the interface features are borrowed from previous generations of data analysis tools that are already familiar to end users.

Because many analysis and presentation functions are common to desktop spreadsheet packages, most OLAP vendors have closely integrated their systems with spreadsheets such as Microsoft Excel. Using the features available in graphical end-user interfaces, OLAP simply becomes another option within the spreadsheet menu bar, as shown in Figure 15.19. This seamless integration is an advantage for OLAP systems and for spreadsheet vendors because end users gain access to advanced data analysis features by using familiar programs and interfaces. Therefore, additional training and development costs are minimised.

FIGURE 15.19 Integration of OLAP with a spreadsheet program

15.7.4 OLAP Architecture
OLAP operational characteristics can be divided into three main modules:

• Graphical user interface (GUI)
• Analytical processing logic
• Data processing logic.

Figure 15.20 illustrates OLAP's architectural components. As Figure 15.20 illustrates, OLAP systems are designed to use both operational and data warehouse data. Although Figure 15.20 shows the OLAP system components located on a single computer, this single-user scenario is only one of many. In fact, one problem with the installation shown here is that each data analyst must have a powerful computer on which to store the OLAP system and perform all data processing locally. In addition, each analyst uses a separate copy of the data. Therefore, the data copies must be synchronised to ensure that analysts are working with the same data. In other words, each end user must have his or her own private copy (extract) of the data and programs, thus returning to the islands of information problems discussed in Chapter 1, The Database Approach. This approach does not provide the benefits of a single business image shared among all users.

A more common and practical architecture is one in which the OLAP GUI runs on client workstations, while the OLAP engine, or server, composed of the OLAP analytical processing logic and OLAP data-processing logic, runs on a shared computer. In that case, the OLAP server will be a front end to the data warehouse's decision support data. This front end or middle layer (because it sits between the data warehouse and the end-user GUI) accepts and processes the data-processing requests generated
Copyright Editorial
review
2020 has
Cengage deemed
Learning. that
any
All suppressed
Rights
Reserved. content
does
May not
not materially
be
copied, affect
scanned, the
overall
or
duplicated, learning
in experience.
whole
or in Cengage
part.
Due Learning
to
electronic reserves
rights, the
right
some to
third remove
party additional
content
may content
be
suppressed at
any
time
from if
the
subsequent
eBook rights
and/or restrictions
eChapter(s). require
it.
798
part
VI
FIgure
Database
Management
15.20
olap architecture
Advanced reporting
OLAP engine
provides
the
data
a front
end to
warehouse
Spreadsheet reports Excel
Operational
External
plug-in
Analytical
data
data
processing
logic
Access
OLAP
plug-in
Data-processing logic
reports
OLAP GUI
Dashboards Alternate of
direct
access
operational
and
warehouse
Multiple
data
interfaces
and application
data
plug-ins Mobile
Bl
Data
ETL
Warehouse Credit: Oleksiy
Mark/Shutterstock.com
Technology/Cengage
Learning
Extraction, Transformation
and
Loading
SOURCE:
by the
many end-user
data
OLAP
approaches increase
in
Figure
by storing the
speed
with fairly sales
small,
data, Whatever
sharply
Copyright review
2020 has
stable
the
an OLAP server
for the
characteristics
Learning. that
any
superiority
All suppressed
Rights
Reserved. content
does
subsets.
May not
not materially
be
affect
scanned, the
overall
to
with local
miniature
databases
duplicated, learning
in experience.
whole
or in Cengage
part.
is
to
mart
objective
is to
of data
trends
most likely
to
certain:
work
multidimensional
with
data
OLAP proponents
multidimensional
store
work
data.
managed?
to store the
databases
will be examined
is
customer
and
The
data
most end users usually
analyst
with
stored
multidimensional
or
a sales
and
representations
that
one thing
best
warehouse
workstations.
graphic
work
components,
data
data
end-user (the
For example,
use of relational
approach
copied,
at
the
is the assumption
is likely
multidimensional
of specialised of each
approach
OLAP
merge
visualisation
representative
the
could
warehouse
data
behind this
of the
are
system
data
and
warehouse
Some favour
basic
Cengage
data
OLAP
of the
arrangement
divided.
the
access
a customer
But how
argue
deemed
data
The logic
whereas
must be used.
15.21,
extracts
of
and characteristics).
Editorial
Figure 15.21 illustrates
marts.
As illustrated
15
workstations.
Course
are
data; others
multidimensional
data.
The
next.
Due Learning
to
electronic reserves
rights, the
right
some to
third remove
party additional
content
may content
be
suppressed at
any
time
from if
the
subsequent
eBook rights
and/or restrictions
eChapter(s). require
it
FIGURE 15.21 OLAP server with local miniature data marts
[Figure labels: OLAP GUIs in the Sales, Marketing, Manufacturing and Procurement departments, each with a local data mart (Customers, Marketing, Production, Vendors); multiple OLAP clients accessing the OLAP server; OLAP server with analytical processing logic and data processing logic; operational data; data warehouse; data extracted from the data warehouse to local data marts, which provides faster processing.]
SOURCE: Course Technology/Cengage Learning

15.7.5 Relational OLAP

Relational online analytical processing (ROLAP) provides OLAP functionality by using relational databases and familiar relational query tools to store and analyse multidimensional data. That approach builds on existing relational technologies and represents a natural extension to all of the companies that already use relational database management systems within their organisations. ROLAP adds the following extensions to traditional RDBMS technology:
Multidimensional data schema support within the RDBMS
Data access language and query performance that are optimised for multidimensional data
Support for very large databases (VLDBs).

Multidimensional Data Schema Support within the RDBMS
Relational technology uses normalised tables to store data. The reliance on normalisation as the design methodology for relational databases is seen as a stumbling block to its use in OLAP systems. Normalisation divides business entities into smaller pieces to produce the normalised tables. For example, sales data components might be stored in four or five different tables. The reason for using normalised tables is to reduce redundancies, thereby eliminating data anomalies, and to facilitate data updates.
Unfortunately, for decision support purposes, it is easier to understand data when they are seen with respect to other data. Given that view of the data environment, this book has stressed that decision support data tend to be non-normalised, duplicated and pre-aggregated. These characteristics seem to preclude the use of standard relational design techniques and RDBMSs as the foundation for multidimensional data.

Fortunately for those heavily invested in relational technology, ROLAP uses a special design technique to enable RDBMS technology to support multidimensional data representations. This special design technique is known as a star schema, which was covered in detail in Section 15.5.

The star schema is designed to optimise data query operations rather than data update operations. Naturally, changing the data design foundation means that the tools used to access such data must change. End users who are familiar with the traditional relational query tools will discover that those tools do not work efficiently with the new star schema. However, ROLAP saves the day by adding support for the star schema when familiar query tools are used. ROLAP provides advanced data analysis functions and improves query optimisation and data visualisation methods.

Data Access Language and Query Performance Optimised for Multidimensional Data
Another criticism of relational databases is that SQL is not suited for performing advanced data analysis. Most decision support data requests require the use of multiple-pass SQL queries or multiple nested SQL statements. To answer this criticism, ROLAP extends SQL so that it can differentiate between access requirements for data warehouse data (based on the star schema) and operational data (normalised tables). In that way, a ROLAP system is able to generate the SQL code required to access the star schema.

Query performance is also improved because the query optimiser is modified to identify the SQL code's intended query targets. For example, if the query target is the data warehouse, the optimiser passes the requests to the data warehouse. However, if the end user performs drill-down queries against operational data, the query optimiser identifies that operation and properly optimises the SQL requests before passing them through to the operational DBMS.

Another source of improved query performance is the use of advanced indexing techniques such as bitmapped indexes within relational databases. As you will recall from Chapter 11, Conceptual, Logical, and Physical Database Design, a bitmapped index is based on 0 and 1 bits to represent a given condition. For example, if the REGION attribute in Figure 15.3 has only four outcomes (North, South, East and West), those outcomes may be represented as shown in Table 15.11. (Only the first ten rows from Figure 15.3 are represented in Table 15.11. The 1 represents bit on, and the 0 represents bit off. For example, to represent a row with a REGION attribute = East, only the East bit would be on. Note that each row must be represented in the index table.)

TABLE 15.11 Bitmap representation of region values
[Table columns: North, South, East, West. Each of the ten rows holds a 1 in the column of its region and a 0 in the other three columns.]
As you examine Table 15.11, note that the index takes a minimum amount of space. Therefore, bitmapped indexes are more efficient at handling large amounts of data than are the indexes typically found in many relational databases. However, do keep in mind that bitmapped indexes are primarily used in situations where the number of possible values for an attribute (in other words, the attribute domain) is fairly small. For example, REGION has only four outcomes in this example. Marital status (married, single, widowed, divorced) would be another good bitmapped index candidate, as would gender (M or F).
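The bitmapped index idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the ten REGION values are made up (they are not the rows behind Table 15.11), the helper names `build_bitmap_index` and `rows_matching` are hypothetical, and a Python integer stands in for each bitmap. A real RDBMS stores and compresses bitmaps in its own way.

```python
from collections import defaultdict


def build_bitmap_index(values):
    """Map each distinct value to an integer whose bit i is 1 when row i holds that value."""
    index = defaultdict(int)
    for row_number, value in enumerate(values):
        index[value] |= 1 << row_number  # switch this row's bit on
    return dict(index)


def rows_matching(bitmap):
    """Return the row numbers whose bit is on in the given bitmap."""
    return [i for i in range(bitmap.bit_length()) if (bitmap >> i) & 1]


# Hypothetical REGION values for ten rows (not the data behind Table 15.11).
regions = ["East", "North", "South", "East", "West",
           "North", "East", "South", "West", "North"]
index = build_bitmap_index(regions)

# REGION = 'East' OR REGION = 'West' becomes a single bitwise OR of two bitmaps.
east_or_west = index["East"] | index["West"]
print(rows_matching(index["East"]))
print(rows_matching(east_or_west))
```

Because the attribute has only four possible values, the whole index is four bitmaps, and combining conditions reduces to bitwise operations. With a large attribute domain the number of bitmaps grows accordingly, which is why the technique suits small domains.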
Early examples of ROLAP tools are mainly traditional client/server products in which the end-user interface, the analytical processing and the data processing took place on different computers. Figure 15.22 shows the interaction of the client/server ROLAP components.

FIGURE 15.22 ROLAP client/server architecture
[Figure labels: four ROLAP GUI clients; ROLAP system; ROLAP server with ROLAP analytical processing logic and ROLAP data-processing logic; data warehouse data; operational data.]
[Callout: The GUI front end runs on the client computer and passes data-analysis requests to the ROLAP server. The GUI receives data replies from the ROLAP server and formats them according to the end user's presentation needs.]
[Callout: The ROLAP server interprets end-user requests and builds complex SQL queries required to access the data warehouse. If an end user requests a drill-down operation, the ROLAP server builds the required SQL code to access the operational database.]
Support for Very Large Databases
Recall that support for VLDBs is a requirement for DSS databases. Therefore, when the relational database is used in a DSS role, it must also be able to store very large amounts of data. Both the storage capability and the process of loading data into the database are crucial. Therefore, the RDBMS must have the proper tools to import, integrate and populate the data warehouse with data. Decision support data are normally loaded in bulk (batch) mode from the operational data. However, batch operations require that both the source and the destination databases be reserved (locked). The speed of the data-loading operations is important, especially when you realise that most operational systems run 24 hours a day, 7 days a week, 52 weeks a year. Therefore, the window of opportunity for maintenance and batch loading is open only briefly, typically during slack periods.

With an open client/server architecture, ROLAP provides advanced decision support capabilities that are scalable to the entire enterprise. Clearly, ROLAP is a logical choice for companies that already use relational databases for their operational data. Given the size of the relational database market, it is hardly surprising that most current RDBMS vendors have extended their products to support data warehouses.

15.7.6 Multidimensional OLAP

Multidimensional online analytical processing (MOLAP) extends OLAP functionality to multidimensional database management systems (MDBMSs). An MDBMS uses special proprietary techniques to store data in matrix-like n-dimensional arrays. MOLAP's premise is that multidimensional databases are best suited to manage, store and analyse multidimensional data. Most of the proprietary techniques used in MDBMSs are derived from engineering fields such as computer-aided design/computer-aided manufacturing (CAD/CAM) and geographic information systems (GIS). MOLAP tools store data using multidimensional arrays, row stores or column stores.

Conceptually, MDBMS end users visualise the stored data as a three-dimensional cube known as a data cube. The location of each data value in the data cube is a function of the x-, y- and z-axes in a three-dimensional space. The x-, y- and z-axes represent the dimensions of the data value. Data cubes can grow to n number of dimensions, thus becoming hypercubes. Data cubes are created by extracting data from the operational databases or from the data warehouse. One important characteristic of data cubes is that they are static; that is, they are not subject to change and must be created before they can be used. Data cubes cannot be created by ad hoc queries. Instead, you query precreated cubes with defined axes; for example, a cube for sales will have the product, location and time dimensions, and you can query only those dimensions. Therefore, the data cube creation process is critical and requires in-depth front-end design work. The front-end design work may be well justified because MOLAP databases are known to be much faster than their ROLAP counterparts, especially when dealing with small to medium data sets. To speed up data access, data cubes are normally held in memory in what is called the cube cache. (A data cube is only a window to a predefined subset of data in the database. A data cube and a database are not the same thing.) Since MOLAP also benefits from a client/server infrastructure, the cube cache can be located at the MOLAP server, at the MOLAP client, or in both locations. Figure 15.23 shows the basic MOLAP architecture.
FIGURE 15.23 MOLAP client/server architecture
[Figure labels: three MOLAP GUI clients; MOLAP system; MOLAP server with MOLAP analytical processing logic and MOLAP data-processing logic; multidimensional database (MDBMS); data cube; RDBMS; data warehouse data; operational data.]
[Callout: The MOLAP GUI allows end users to interact with the MOLAP server and request data for analysis.]
[Callout: The MOLAP engine receives data requests from end users and translates them into data cube requests that are passed to the MDBMS.]
[Callout: The data cube is created within predefined dimensions.]

As the data cube is predefined with a set number of dimensions, the addition of a new dimension requires that the entire data cube be re-created. This re-creation process is a time-consuming operation. Therefore, when data cubes are created too often, the MDBMS loses some of its speed advantage over the relational database. Although MDBMSs have performance advantages over relational databases, the MDBMS is best suited to small and medium data sets. Scalability is somewhat limited because the size of the data cube is restricted to avoid lengthy data access times caused by having less work space (memory) available for the operating system and the application programs. In addition, the MDBMS makes use of proprietary data storage techniques that, in turn, require proprietary data access methods using a multidimensional query language.
Multidimensional data analysis is also affected by how the database system handles sparsity. Sparsity is a measurement of the density of the data held in the data cube. Sparsity is computed by dividing the total number of actual values in the cube by the total number of cells in the cube. Since the data cube's dimensions are predefined, not all cells are populated. In other words, some cells are empty. Returning to the sales example, there may be many products that are not sold during a given time period in a given location. In fact, you will often find that fewer than 50 per cent of the data cube's cells are populated. In any case, multidimensional databases must handle sparsity effectively to reduce processing overhead and resource requirements.
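The sparsity ratio described above can be worked through with a small Python sketch. Everything in it is an assumption made for illustration: the cube is a plain dictionary keyed by (product, location, month), the values are invented, and the function name `sparsity` is hypothetical. It follows the chapter's definition, that is, populated cells divided by total cells.

```python
from math import prod


def sparsity(cube_cells, dimension_sizes):
    """Populated cells divided by total cells, following the chapter's definition."""
    total_cells = prod(dimension_sizes)
    populated = sum(1 for value in cube_cells.values() if value is not None)
    return populated / total_cells


# Hypothetical sales cube: 2 products x 2 locations x 3 months = 12 cells.
sales_cube = {
    ("P1", "North", "Jan"): 120.0,
    ("P1", "South", "Feb"): 80.0,
    ("P2", "North", "Mar"): 45.5,
}
ratio = sparsity(sales_cube, dimension_sizes=(2, 2, 3))
print(f"{ratio:.0%} of the cube's cells are populated")
```

Storing only the populated cells, as the dictionary does here, is one simple way to avoid spending memory on empty cells; an MDBMS would use its own proprietary storage technique.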
NOTE
You can read more about data sparsity and bitmapped indexes in Chapter 11, Conceptual, Logical, and Physical Database Design.

Relational proponents also argue that using proprietary solutions makes it difficult to integrate the MDBMS with other data sources and tools used within the enterprise. Although it takes a substantial investment of time and effort to integrate the new technology and the existing information systems architecture, MOLAP may be a good solution for those situations in which small- to medium-sized databases are the norm and application software speed is critical.
15.7.7 Relational vs Multidimensional OLAP

Table 15.12 summarises some ROLAP and MOLAP pros and cons. Keep in mind that the selection of one or the other often depends on the evaluator's point of view. For example, a proper OLAP evaluation must include price, supported hardware platforms, compatibility with the existing DBMS, programming requirements, performance and availability of administrative tools. The summary in Table 15.12 provides a useful comparison starting point.

Nevertheless, ROLAP and MOLAP vendors are working towards the integration of their respective solutions within a unified decision support framework. Many OLAP products are able to handle tabular and multidimensional data with the same ease. For example, if you are using Excel OLAP functionality, as shown in Figure 15.24, you can access relational OLAP data in a SQL Server as well as cube (multidimensional) data in the local computer. In the meantime, many relational databases have successfully extended SQL to support OLAP tools.

TABLE 15.12 Relational vs multidimensional OLAP
Characteristic | ROLAP | MOLAP
Schema | Uses star schema. Additional dimensions can be added dynamically. | Uses data cubes (multidimensional arrays, row stores, column stores). Additional dimensions require re-creation of the data cube.
Database size | Medium to large | Large
Architecture | Client/server. Standards-based. Open. | Client/server. Open or proprietary, depending on vendor.
Access | Supports ad hoc requests. Unlimited dimensions. | Limited to predefined dimensions. Proprietary access languages.
Speed | Good with small data sets; average for medium to large data sets. | Faster for large data sets with predefined dimensions.
15.8 SQL Analytic Functions

The proliferation of OLAP tools has fostered the development of SQL extensions to support multidimensional data analysis. Most SQL innovations are the result of vendor-centric product enhancements. However, many of the innovations have made their way into standard SQL. This section will introduce some of the new SQL extensions that have been created to support OLAP-type data manipulations.

The SaleCo snowflake schema shown in Figure 15.24 demonstrates the use of the SQL extensions. Note that this snowflake schema has a central DWSALESFACT fact table and three dimension tables: DWCUSTOMER, DWPRODUCT and DWTIME. The central fact table represents daily sales by product and customer. However, as you examine the schema shown in Figure 15.24 more carefully, you see that the DWCUSTOMER and DWPRODUCT dimension tables have their own dimension tables: DWREGION and DWVENDOR.

FIGURE 15.24 SaleCo snowflake schema

Keep in mind that a database is at the core of all data warehouses. Therefore, all SQL commands (such as CREATE, INSERT, UPDATE, DELETE and SELECT) will work in the data warehouse as expected. However, most queries you run in a data warehouse tend to include data groupings and aggregations over multiple columns. That's why this section introduces two extensions to the GROUP BY clause that are particularly useful: ROLLUP and CUBE. In addition, you will learn about using materialised views to store pre-aggregated rows in the database.

ONLINE CONTENT
The script files used to populate the database and run the SQL commands are available on the online platform for this book.
NOTE
This section uses the Oracle RDBMS to demonstrate the use of SQL extensions to support OLAP functionality. If you use a different DBMS, consult the documentation to verify whether the vendor supports similar functionality and what the proper syntax is for your DBMS.

15.8.1 The ROLLUP Extension

The ROLLUP extension is used with the GROUP BY clause to generate aggregates by different dimensions. As you know, the GROUP BY clause generates only one aggregate for each new value combination of attributes listed in the GROUP BY clause. The ROLLUP extension goes one step further; it enables you to get a subtotal for each column listed except for the last one, which gets a grand total instead. In other words, GROUP BY ROLLUP (column1, column2) returns one row for each (column1, column2) combination, one subtotal row for each column1 value and one grand total row. The syntax of the GROUP BY ROLLUP is as follows:

SELECT column1 [, column2, ...], aggregate_function(expression)
FROM table1 [, table2, ...]
[WHERE condition]
GROUP BY ROLLUP (column1, column2 [, ...])
[HAVING condition]
[ORDER BY column1 [, column2, ...]]

The order of the column list within the GROUP BY ROLLUP is very important. The last column in the list generates a grand total. All other columns generate subtotals. For example, Figure 15.25 shows the use of the ROLLUP extension to generate subtotals by vendor and product.

FIGURE 15.25 ROLLUP extension
Note that Figure 15.25 shows the subtotals by vendor code and a grand total for all product codes. Contrast that with the normal GROUP BY clause that generates only the subtotals for each vendor and product combination rather than the subtotals by vendor and the grand total for all products. The ROLLUP extension is particularly useful when you want to obtain multiple nested subtotals for a dimension hierarchy. For example, within a location hierarchy, you can use ROLLUP to generate subtotals by region, province, city and store.
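To make the ROLLUP result set concrete without an Oracle installation, the following Python sketch uses the standard-library `sqlite3` module. SQLite has no ROLLUP keyword, so the sketch spells out an equivalent query with UNION ALL; that substitution, the `sales` table and its columns (`v_code`, `p_code`, `amount`) and the sample rows are all assumptions made for illustration, not the SaleCo schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (v_code INTEGER, p_code TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?)",
    [(1, "A", 10.0), (1, "A", 5.0), (1, "B", 20.0), (2, "A", 7.0), (2, "C", 3.0)],
)

# Hand-written equivalent of: GROUP BY ROLLUP (v_code, p_code)
ROLLUP_EQUIVALENT = """
SELECT v_code, p_code, SUM(amount) FROM sales GROUP BY v_code, p_code  -- detail groups
UNION ALL
SELECT v_code, NULL, SUM(amount) FROM sales GROUP BY v_code            -- subtotal per vendor
UNION ALL
SELECT NULL, NULL, SUM(amount) FROM sales                              -- grand total
"""
rollup_rows = conn.execute(ROLLUP_EQUIVALENT).fetchall()
for row in rollup_rows:
    print(row)
```

Rows with NULL (None in Python) in `p_code` are the vendor subtotals, and the row with NULL in both columns is the grand total. Reversing the column order in the ROLLUP list would produce subtotals by product instead, which is why the order of the column list matters.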
15.8.2 The CUBE Extension

The CUBE extension is also used with the GROUP BY clause to generate aggregates by the listed columns, including the last one. The CUBE extension enables you to get a subtotal for each column listed in the expression, in addition to a grand total for the last column listed. In other words, GROUP BY CUBE (column1, column2) returns aggregates for every combination of the listed columns: each (column1, column2) pair, each column1 value, each column2 value and the grand total. The syntax of the GROUP BY CUBE is as follows:

SELECT column1 [, column2, ...], aggregate_function(expression)
FROM table1 [, table2, ...]
[WHERE condition]
GROUP BY CUBE (column1, column2 [, ...])
[HAVING condition]
[ORDER BY column1 [, column2, ...]]

For example, Figure 15.26 shows the use of the CUBE extension to compute the sales subtotals by month and by product, as well as a grand total.

FIGURE 15.26 CUBE extension

In Figure 15.26, note that the CUBE extension generates the subtotals for each combination of month and product, in addition to subtotals by month and by product, as well as a grand total. The CUBE extension is particularly useful when you want to compute all possible subtotals within groupings based on multiple dimensions. Cross-tabulations are especially good candidates for application of the CUBE extension.
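The set of subtotals that CUBE produces can be sketched in plain Python. The function below is a hypothetical helper, not a DBMS feature: it aggregates a measure over every subset of the listed dimensions, using None to mark a dimension that has been totalled away (the role NULL plays in the SQL result). The sample rows are invented for illustration.

```python
from collections import defaultdict
from itertools import combinations


def cube(rows, dims, measure):
    """Sum `measure` for every combination of the dimensions in `dims`."""
    totals = defaultdict(float)
    for row in rows:
        for size in range(len(dims) + 1):
            for kept in combinations(dims, size):
                # Dimensions that are not kept are replaced by None (totalled away).
                key = tuple(row[d] if d in kept else None for d in dims)
                totals[key] += row[measure]
    return dict(totals)


# Hypothetical sales rows by month and product.
sales = [
    {"month": 9, "product": "A", "amount": 10.0},
    {"month": 9, "product": "B", "amount": 20.0},
    {"month": 10, "product": "A", "amount": 5.0},
]
totals = cube(sales, dims=("month", "product"), measure="amount")
for key in sorted(totals, key=str):
    print(key, totals[key])
```

With n dimensions the function visits 2^n groupings per row, which is why CUBE output grows quickly as columns are added, whereas ROLLUP produces only n + 1 grouping levels.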
15.8.3 Materialised Views

The data warehouse normally contains fact tables that store specific measurements of interest to an organisation. Such measurements are organised by different dimensions. The vast majority of OLAP business analysis of everyday activities is based on comparisons of data that are aggregated at different levels, such as totals by vendor, by product and by store.

Since businesses normally use a predefined set of summaries for benchmarking, it is reasonable to predefine such summaries for future use by creating summary fact tables. However, creating multiple summary fact tables that use GROUP BY queries with multiple table joins could become a resource-intensive operation. In addition, data warehouses must also be able to maintain up-to-date summarised data at all times. So, what happens with the summary fact tables after new sales data have been added to the base fact tables? Under normal circumstances, the summary fact tables are re-created. This operation requires that the SQL code be run again to re-create all summary rows, even when only a few rows needed updating. Clearly, this is a time-consuming process.

To save query processing time, most database vendors have implemented additional functionality to manage aggregate summaries more efficiently. This new functionality resembles the standard SQL views for which the SQL code is predefined in the database. However, the added functionality difference is that the views also store the pre-aggregated rows, something like a summary table. For example, Microsoft SQL Server provides indexed views, while Oracle provides materialised views. This section explains the use of materialised views.

A materialised view is a dynamic table that contains not only the SQL query command to generate the rows, but also stores the actual rows. The materialised view is created the first time the query is run and the summary rows are stored in the table. The materialised view rows are automatically updated when the base tables are updated. That way, the data warehouse administrator creates the view but will not have to update the view. The use of materialised views is totally transparent to the end user. The OLAP end user can create OLAP queries, using the standard fact tables, and the DBMS query optimisation feature will automatically use the materialised views if those views provide better performance.

The basic syntax for the materialised view is shown below. Note that Oracle spells the keyword MATERIALIZED.

CREATE MATERIALIZED VIEW view_name
BUILD {IMMEDIATE | DEFERRED}
REFRESH {[FAST | COMPLETE | FORCE]} ON COMMIT
[ENABLE QUERY REWRITE]
AS select_query;
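The idea of stored summary rows that stay in step with the base table can be imitated with Python's standard-library `sqlite3` module. SQLite has no materialised views, so this sketch substitutes an ordinary summary table maintained by a trigger; it is an emulation of the behaviour, not Oracle's mechanism. The table names (`sales_fact`, `sales_month_summary`), the columns and the sample rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE sales_fact (month INTEGER, p_code TEXT, units INTEGER, price REAL);

-- Stands in for the stored rows of a materialised view.
CREATE TABLE sales_month_summary (
    month INTEGER, p_code TEXT, total_units INTEGER, total_sales REAL,
    PRIMARY KEY (month, p_code)
);

-- Touch only the summary row affected by each new fact row.
CREATE TRIGGER refresh_summary AFTER INSERT ON sales_fact
BEGIN
    INSERT OR IGNORE INTO sales_month_summary
        VALUES (NEW.month, NEW.p_code, 0, 0.0);
    UPDATE sales_month_summary
       SET total_units = total_units + NEW.units,
           total_sales = total_sales + NEW.units * NEW.price
     WHERE month = NEW.month AND p_code = NEW.p_code;
END;
""")

conn.executemany(
    "INSERT INTO sales_fact VALUES (?, ?, ?, ?)",
    [(9, "A", 2, 5.0), (9, "A", 1, 5.0), (10, "B", 4, 2.5)],
)
summary = conn.execute(
    "SELECT * FROM sales_month_summary ORDER BY month, p_code"
).fetchall()
print(summary)
```

Querying `sales_month_summary` reads precomputed totals instead of re-running the GROUP BY over the fact table. Unlike a real materialised view, nothing here lets the optimiser redirect queries written against the fact table to the summary automatically.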
BUILD
clause
that
the
indicates is in
rows
materialised
The
part of the
review
you indicate
commit
materialised
has
lets
create
Cengage deemed
rows
right
are populated
provides
are
actually
after the
routine
populated.
command
at a later time.
a special
any
the
DBMS
views
All
Rights
views,
does
May not
not materially
be
that
tables.
affect
how
that
rows.
try to
that
do
to
IMMEDIATE
is entered.
Until then, the
indi-cates
DEFERRED
materialised
an administrator
update
whenever
the
runs
materialised
a change is
COMPLETE select
indicates
query
on
to
the
update;
otherwise,
updates
to the
materialised
DML statement, ENABLE
that is,
view
view
populate
when
made in the
that
which
a FAST
the
The
it
update is
based
REWRITE
privileges
and
is rerun.
will do a COMPLETE view
will take
as part of the commit
QUERY
1
base tables,
a complete view is
new
option
allows
place
of the the
as
DML
DBMS
to
optimisation. you
need
defer to the copied,
and
when the
underlying
query
As always, Reserved.
content
will first
base
in
affected view
indicates
of the
the
materialised
suppressed
only the
clause
process
when
FAST indicates
materialised
updated
steps.
Learning. that
base tables.
COMMIT
use the
2020
clause
that
ON
that
To
Copyright
DBMS
in the
transaction
prerequisite
Editorial
The
view updates
all rows
indicates
update.
are populated
view rows
state.
are added to the
FORCE
view
views.
REFRESH
made for
materialised
view rows
materialised
an unusable
The
when the
materialised
that the
materialised
the
indicates
scanned, the
overall
or
duplicated, learning
specified
DBMS in experience.
whole
documentation or in Cengage
part.
Due Learning
to
electronic reserves
you
must
complete
for the latest rights, the
right
some to
third remove
party additional
content
specified
updates. may content
be
In the
suppressed at
any
time
from if
the
subsequent
case
eBook rights
and/or restrictions
eChapter(s). require
it.
810
part
VI
Database
of Oracle
Management
versions
materialised Figure
15.27
RDBMS.
11g
view.
In
shows
Note that,
a sysdba)
then
the if
you
FIgure
sales
you
do this
code
you
to
do not
Figure
aggregates
15.27
to
must create you
create have
must the
15.27, this
by product.
have
view logs
appropriate
a materialised
view.
view computes
(i.e.
the
you
base tables
set
materialised
privileges
SALES_MONTH_MV
on the
privileges
Administer
materialised
The
materialised the
MONTH_SALES_MV
Database
will not be able to create
As you can see in total
and 12c,
order
by the
view
in the
would
of the
Oracle
DBA.
Oracle
log into
11g
Oracle
as
monthly total
units sold and the
view is
to
materialised
configured
update
Creating a materialised view in Oracle 11g using Oracle SQL*Plus as a DBA
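As a rough illustration of a COMPLETE refresh, in which the stored view rows are rebuilt by rerunning the underlying query, the following Python sketch imitates a materialised view with an ordinary summary table in SQLite. SQLite has no materialised views, so this is an emulation and not Oracle syntax. The sales base table, its columns and the 6.99 unit price are assumptions made for the example; only the SALES_MONTH_MV name and the SM-18277 product code come from the text.

```python
import sqlite3


def create_schema(conn):
    # Hypothetical base table plus a summary table standing in for the
    # materialised view (SQLite has no CREATE MATERIALIZED VIEW).
    conn.executescript(
        """
        CREATE TABLE sales (
            sale_month TEXT,
            prod_code  TEXT,
            units      INTEGER,
            unit_price REAL
        );
        CREATE TABLE sales_month_mv (
            sale_month  TEXT,
            prod_code   TEXT,
            total_units INTEGER,
            total_sales REAL
        );
        """
    )


def refresh_complete(conn):
    # COMPLETE-style refresh: discard every stored row and rerun the
    # aggregate query against the base table, inside one transaction.
    with conn:
        conn.execute("DELETE FROM sales_month_mv")
        conn.execute(
            """
            INSERT INTO sales_month_mv
            SELECT sale_month, prod_code, SUM(units), SUM(units * unit_price)
            FROM sales
            GROUP BY sale_month, prod_code
            """
        )


def demo():
    # Assumed sample rows: two sales of the same product in one month.
    conn = sqlite3.connect(":memory:")
    create_schema(conn)
    conn.executemany(
        "INSERT INTO sales VALUES (?, ?, ?, ?)",
        [("2015-10", "SM-18277", 1, 6.99), ("2015-10", "SM-18277", 2, 6.99)],
    )
    refresh_complete(conn)
    return conn.execute(
        "SELECT sale_month, prod_code, total_units, ROUND(total_sales, 2) "
        "FROM sales_month_mv"
    ).fetchall()
```

A FAST refresh would instead apply only the changes made since the last refresh, which is why Oracle requires materialised view logs on the base tables for it; this sketch does not attempt that.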
automatically after each change in the base tables. Note that the last row of SALES_MONTH_MV indicates that, during October 2015, the sales of product SM-18277 are three units, for a total of 20.97. Although all of the examples in this section focus on SQL extensions to support OLAP reporting in an Oracle DBMS, you have seen just a small fraction of the many business intelligence features currently
Chapter
provided
by
manipulate, one for
most DBMS vendors. analyse
Oracle
FIgure
and
and
present
one for
15.28
For example, the
most vendors
data in
multiple
OLAP
products.
Microsoft
provide
formats.
Figure
15
rich
Databases
graphical
15.28
shows
Microsoft
SQL
Analysis
Data
the
allow
user
the
been
is
more
process
quickly
and
accurate
all
DBMS
than
managers
in
does
Data visualisation insight
from
simple
histograms,
time range
software
7
into
such
The
Best
Data
very
steps
many
Power
of 2019,
into
waterfall (such
Oliver
rich
Domo
Rist,
thousands,
techniques
and
many Excel)
Google
Baker
part.
to
tabular
make informed that
visualisation
provide
pie charts,
line
heat
maps,
The tools
used in
data
advanced
1
techniques
Gantt charts,
more. to
of rows
include
plots,
and never
millions
graphical)
Data
has
summarised
data to
(mostly
scatter
Microsoft
Pam
or
of the
formats
Such
and
that
is to
patterns
and this
way. Providing
relationships.
charts,
trends,
words,
meaning
charts,
as
representation
by identifying
hundreds,
are familiar.
data
a thousand
the
possible
maps, donut
BI,
picture
meaningful
visually
and
spreadsheet
Tools
with
a visual
data. The goal of data visualisation
worth
Tables
data into
charts,
provide
big is
mind in a
bubble
Microsoft
Visualization
picture
patterns and
charts,
Tableau,
datas
a
enough insight the
complex,
a simple
see the
human
trends,
to
meaning of the
visualisation.
by the
plots,
from
data
the
saying
encodes
bubble series
as
abstracting
to
the
data
overall
to
bar charts,
visualisation
of
not give them
decisions.
graphs,
Services
Server
efficiently heard
be processed
at-a-glance range
to
screens,
Services
ability to comprehend
We have
of data cannot data to
the
users
relationships.
sample
811
VIsualIsatIon
visualisation
enhances
Intelligence
sample olap applications
Oracle
Data
Business
user interfaces
two
OLAP
15.9
for
data
visualisation
Analytics.7
PC
Magazine,
July
24,
2018.
Available:
huk.
pcmag.com/cloud-services/83744/the-best-data-visualization-tools.
Common
productivity
visualisations.
Excel
visualising row
spreadsheet
and
column
powerful
The top
of the
manager
report
answered there
out
and
15.29
up.
aline
data
of the
that
sell
The rest
of the
visual
more than
product
most business
sales
15.29
by product
and
month.
products
up or down?
rest,
and that
remain
at the
sellers.
This
data.
of those
of
both.
top
table, if
The the he or
questions quickly
are trending
the
monthly
for
those We can
puts
Microsoft
What about
However,
through
eliminated
report
with totals
Looking
sales two
constant
a simple
top
has
sources.
For example,
month
are the
of the
add-in
multiple
shows by
the
for
users.
and
representation
data
capabilities
PowerPivot
of data from
are trending
powerful
PivotChart
of the
product
which
surprisingly
and
Figure
by
sales out
sales
at the
products
analysis. sales
to figure
which product
of
provide
PivotTable
introduction
within reach
plot
can often and
for the integration
shows
minutes
Excel
charting the
allows
sales
table
by looking
are three
one is trending
FIgure
shows
to figure
Microsoft basic
capabilities
a few
immediately
as
More recently,
summary
might take
she needs
that
data.
be used to visualise
data.
bottom
such included
data limitations
data visualisation
Excel could sales
tools
has long
are
deduce down
and
year.
Microsoft Excel sales data report
The above example, albeit simple, shows the power of data visualisation; it shows how end users can quickly gain insight into their data using a simple graphical representation.
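The kind of summary a PivotTable produces, monthly sales by product with totals, can be sketched in a few lines of Python. The rows below are invented for illustration, and pivot_sales and trend are hypothetical helper names, not Excel features.

```python
from collections import defaultdict


def pivot_sales(rows):
    """Sum sales amounts into a {product: {month: total}} table."""
    table = defaultdict(lambda: defaultdict(float))
    for product, month, amount in rows:
        table[product][month] += amount
    return {product: dict(months) for product, months in table.items()}


def trend(monthly_totals):
    """Compare the first and last month (YYYY-MM keys sort chronologically)."""
    months = sorted(monthly_totals)
    first, last = monthly_totals[months[0]], monthly_totals[months[-1]]
    if last > first:
        return "up"
    if last < first:
        return "down"
    return "flat"


# Invented sample rows: (product, month as YYYY-MM, sales amount).
SAMPLE_ROWS = [
    ("Widget", "2019-01", 100.0),
    ("Widget", "2019-02", 120.0),
    ("Widget", "2019-02", 30.0),
    ("Gadget", "2019-01", 80.0),
    ("Gadget", "2019-02", 60.0),
]

if __name__ == "__main__":
    for product, months in pivot_sales(SAMPLE_ROWS).items():
        print(product, months, sum(months.values()), trend(months))
```

Comparing only the first and last month is a deliberately crude trend test; a chart of the same pivot makes the direction visible without any such rule.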
15.9.1 The Need for Data Visualisation
From the previous discussion you might think that data visualisation is nothing new, and you are correct to a certain degree. After all, spreadsheets and graphics libraries have been around for a while. What has changed is the development of Big Data and business intelligence. The reality is that, in the current business climate, companies are trying to find a competitive edge by mining large amounts of data. Tools that facilitate and enhance the understanding of large amounts of data have become
This new data visualisation Comparative larger
conveys
sales volumes
at least
as shown
two
additional
insights
by the size of the
into the sales
bubbles.
Larger total
data:
sales values
produce
bubbles.
Geographic
market penetration
visualisation greatest
as shown
makes it easier for
sales
penetration.
to
get
more
detail
to
get
more
detailed
one of the
Also,
the
by clicking
information.
The
many advantages
density
a manager to identify
Furthermore,
data.
by the
on the
of the current
manager map, the
to
zoom
breed
bubbles
the region
sales
ability
of the
in
end
the
(northeastern)
could
and
against
click
user
out,
zoom
down
of data visualisation
that
on any
can
drill
map. The has the
of the
in
on
and
sales
bubbles
a given
region
up, filter,
etc. is
tools.
note Data visualisation to
present
For example, uses
plays
data
are
health
data to
Another easier
data
within
helps has to
discover
it.
However,
modelling
and
understanding
data
the
the
visualisations
of this
meaning
can
of data.
be used
deals
data.
in
to
allows
rigorous
bad
data
that
makes
tool,
points)
and
and
explore
data quickly
other
tools
chapter,
Data
for
could
is just
and
such
data
Big
data visualisation
data
organised
structuring
decisions,
that
using
which he
years.
tool
of properly
to
understand
analysis
200
as we have seen in this of bad
process
end users to
data
in
past
ways
discipline.
As a communication
However,
can lead
New
any
communication
data.
(distilled
data
also important
not replace
of
with the
bad
over the
an effective
validated
because
Its
health
amounts
hidden in the
chapter
issue
it is
large
processed,
Data visualisation
predictive
population
is that
message
vetted
does
world
particular,
even larger!
it
of
visualisation
part
and not an end in itself.
and
Good
history
in
a very important
make a bad decision
about
data
data
A large
This is
the
be properly
a context.
analysis.
of
understand
discovering
developed.
Dr Hans Rosling, (www.youtube.com/watch?v=jbkSRLYSojo)
visualise
advantage
to
visualisation such
role in
being
see the video from
public
it
an important
constantly
a tool,
gain insights
as statistics,
data
modelling.
15.9.2 the science of Data Visualisation Data visualisation brain
sciences This is
interprets,
investigate a
at Figure was
Copyright review
2020 has
any
All suppressed
Rights
quicker
Reserved. content
does
May not
all
not materially
be
start
balls
are in
people
copied, affect
scanned, the
overall
would data
or
duplicated, learning
our
Panel say
in experience.
whole
the
A?
B.
Cengage
part.
Due Learning
How
Why?
to
learn
science visual
presented
or in
to
about
electronic reserves
rights, right
are in the
grouped
some to
the
the
external
third remove
B?
human
additional
content
may content
relates
is
to
looking
Which
brain
objects.
party
world.
psychology,
exercise:
Panel
human
cognitive
of data visualisation
Because
the
how the
neurology,
communication
many
with
study
speaking,
neuroscience,
with a simple
when
sciences Broadly
senses
linguistics,
Specifically,
Lets
process
The cognitive information.
with
includes
other fields. data.
to
sciences.
processes
connect
that
soccer
Almost
makes it
Learning. that
and
many
and
brains
visual
how
quicker/easier?
Cengage deemed
our
science
process
15.31,
way that
Editorial
how
anthropology
how our brains
cognitive
organises
multidisciplinary
philosophy,
15
has its roots in the
receives,
answer
wired
in
a
What constitutes
be
suppressed at
any
time
from if
the
subsequent
eBook rights
and/or restrictions
eChapter(s). require
it
Chapter
good
data
seen and
as
visualisation?
both
an art
function.
correct
the
Form
data
of data
15.31
means
In
using
question
to
words,
data
other
the
proper
Remember
visual
that
the
answer
because
visualisation
construct, purpose
data
is
and
of
data
Databases
for
Business
Intelligence
visualisation
concerned
with
function
means
visualisation
is to
can both
815
be
form
applying
the
communicate
easily.
the power
Over the past few
a difficult
a science.
transformations.
meaning
FIgure
That is
and
15
decades,
of visual communication
plenty
of research
has been done on data visualisation.
Data visualisation has evolved to become a very robust discipline. As a discipline, data visualisation can be studied as a group of visual communication techniques used to explore and discover data insights by applying:
Pattern recognition: Visually identifying trends, distribution and relationships
Spatial awareness: Use of size and orientation to compare and relate data
Aesthetics: Use of shapes and colours to highlight and contrast data composition and relationships
In general, data visualisation uses five characteristics: shape, colour, size, position and grouping/order to convey and highlight the meaning of the data. When used correctly, data visualisation can tell the story behind the data. Here is another example that uses data visualisation to explore data and quickly provide some useful data insights. In this case, we are going to use vehicle crash data for the state of Iowa, available at https://catalog.data.gov/. The data set contains data on car accidents in the US State of Iowa from 2010 to early 2015. Figure 15.32 contains a visualisation of this data set using Tableau.
Figure 15.32 Vehicle crash analysis
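Two of the five characteristics, size and grouping/order, can be illustrated without any charting library. The sketch below is a minimal, hypothetical example: it builds a text bar chart in which bar length encodes the value (size) and categories are sorted from largest to smallest (order). The category names and counts are made up and are not taken from the Iowa data set.

```python
def text_bar_chart(counts, width=40):
    """Return chart lines; bar length is proportional to each value."""
    peak = max(counts.values(), default=0)
    lines = []
    # Order: largest category first, so the ranking is visible at a glance.
    for label, value in sorted(counts.items(), key=lambda kv: kv[1], reverse=True):
        # Size: scale the bar so the largest value fills the full width.
        length = round(value / peak * width) if peak > 0 else 0
        lines.append(f"{label:<12} {'#' * length} {value}")
    return lines


# Made-up counts, used only to exercise the function.
SAMPLE_COUNTS = {"Two-lane": 50, "Four-lane": 25, "Other": 10}

if __name__ == "__main__":
    print("\n".join(text_bar_chart(SAMPLE_COUNTS, width=20)))
```

A tool such as Tableau adds the remaining characteristics (shape, colour and position), but the same two decisions, what the size means and how the categories are ordered, still have to be made.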
Note
There are several public sources of large data sets that you could use to practise visualisations. Some of the most common sources are:
http://catalog.data.gov
http://data.worldbank.org
http://aws.amazon.com/datasets
https://data.medicare.gov
www.faa.gov/data_research/
https://data.world/
www.cdc.gov/nchs/data_access/
For some good examples of data visualisations, see the Centers for Disease Control and Prevention, Data Visualisation Gallery, at www.cdc.gov/nchs/data-visualization/
15 This
visualisation
visualisation,
includes
we can
vehicles majority slight
the
Copyright Editorial
review
2020 has
increase
in
visualisation,
Cengage deemed
Learning. that
driving
of accidents
any
All suppressed
Rights
not involve
vehicle
crashes
data
Reserved. content
does
May not
graphs
determine
on two-lane
did
the
three
quickly
not materially
roads
in the
copied, affect
scanned, the
overall
or
duplicated,
the
Finally,
past four
learning
bar,
in
whole
speed
years.
or in Cengage
heat
map)
number
we could
processed
experience.
and
a significant where
alcohol.
was previously
be
(line,
that
It is
limit
is
filters.
Looking
of car accidents
involved
90 km/h.
also
determine
also important
and transformed
part.
Due Learning
to
electronic reserves
and
rights, the
right
We can that
to
to
third remove
party additional
content
may content
see that
seems
note that,
extracted,
some
also
there
in
suppressed at
any
time
from if
the
to
order
formatted,
be
at this
single-occupant
be a to
do
formulas
the
subsequent
eBook rights
and/or restrictions
eChapter(s). require
it
Chapter
applied, adult
etc. For example,
or senior;
etc.
As you
and its
can
You
you usually
introduces
data set
if the
BAC level
see in these
domain.
data,
in this
determine
examples,
cant
start
you
presented
start
in
with
are two
?
Ordinal:
400
400
Quantitative: ordered
can think being
correct
the
type
visually.
story
and
others
the
graph.
as being
of this This
was
problem
817
occupants,
of the
after
domain.
you
data
set
get the
raw
The next section
data
and
highlight
on the
end
visualisation
data
can
be
of the
data
or the
tool.
In
or aggregated.
into two
subtypes:
Examples:
data.
Gender
Examples:
R200
This type
000,
Rate your
200
001 to
on a star
schema
it
shape, ways.
The
that
the
be counted,
Examples
of
colour,
need
way you
and
visualise
provide
data use the
data
tells
insights
Panel
a
and
A, the
of at the
visually
it
group/order
the
unknown
Figure 15.33,
to resonate
to
way to represent
position
x-axis is at the top instead a red
quantitative
you
proper
size,
can
As you can see in
using
and the
means
colour,
visualisations
bar graphs
can
data.
etc.
certain data
of data
and ratio
data type, including
an issue.
with
(under
as interval
because
uses
Some
along
income
of the data
dimensions
data in
is that the
purposely,
use
same
but not aggregated.
family
this
visualisation
to
ordered
is important
users.
draw attention
your
with each
before,
not
of accidents,
the
This
proper
The
more).
to
number
schema.
data.
of data can be subdivided
measures
refer
and operations
done
multiple
understanding
the
be the
and ordered
or
or
data
a star
can be a way to
the
Intelligence
as child, teenager,
or
Therefore,
understand
may not
what is
facts
of qualitative
have learnt
a good
Business
or undergraduate).
001
GPA,
an impact
characteristic
600
age,
to represent has
poor),
000,
numeric
of functions
characteristics
fair,
600
to
but
can be counted
data include
As you
drivers
single
understand.
This type
(graduate
Statisticians
of
ways
data.
aggregated.
facts
need
be counted
class
good,
001 to
Describes
and
quantitative You
student
(excellent,
implies
dont
understanding
you
of the can
This is data that
000,
you
to
of those
qualities
data that
or female);
teacher
the
of data:
This is
(male
as
Some
?
Nominal:
visualisation,
ways.
Describes
visualisation
what
to classify
determine
for
on this topic.
types
Qualitative:
or illegal;
Databases
the Data
data
multiple
general, there
data
some time
basic notions
15.9.3 understanding Before
was legal
analysing
must dedicate
some
we used several formulas
15
main
bottom
with the
of
title
of
presentation. However,
change
you
could
the colour
convey.
Notice
use the
of the
that
the
same
bars to same
data
to
blue, and it
data
plot the
bar
graph
with the
x-axis
would have a different impact
can tell two
different
stories
at the
bottom
on the story
depending
on the
(Panel
B),
you are trying to
visualisation.
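The distinction drawn in this section between qualitative (nominal and ordinal) and quantitative data determines which operations make sense before any chart is drawn. The sketch below encodes the usual rule of thumb: nominal data can be counted; ordinal data can be counted and ordered; quantitative data can be counted, ordered and aggregated. The Level enum, the ALLOWED table and the summarise function are hypothetical names introduced for this example.

```python
from collections import Counter
from enum import Enum
from statistics import mean


class Level(Enum):
    NOMINAL = "nominal"            # e.g. student class
    ORDINAL = "ordinal"            # e.g. teacher rating
    QUANTITATIVE = "quantitative"  # e.g. age, number of accidents


# Operations that are meaningful at each level of measurement.
ALLOWED = {
    Level.NOMINAL: {"count"},
    Level.ORDINAL: {"count", "order"},
    Level.QUANTITATIVE: {"count", "order", "aggregate"},
}


def summarise(values, level, order=None):
    """Return only the summaries that are meaningful for the level."""
    ops = ALLOWED[level]
    summary = {"counts": dict(Counter(values))}
    if "order" in ops:
        # Ordinal data needs an explicit ranking; numbers sort naturally.
        key = order.index if order is not None else None
        summary["sorted"] = sorted(values, key=key)
    if "aggregate" in ops:
        summary["total"] = sum(values)
        summary["mean"] = mean(values)
    return summary
```

For example, calling summarise on a list of teacher ratings with Level.ORDINAL and an explicit ranking returns counts and a sorted list but no mean, because averaging category labels has no meaning.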
Figure 15.33 Infographics can have an impact beyond presenting the data
Note
If you would like to learn more about the fascinating discipline of data visualisation, Show Me the Numbers: Designing Tables and Graphs to Enlighten by Stephen Few and The Visual Display of Quantitative Information by Edward R. Tufte are good places to start.
Summary
Business intelligence (BI) is a term for a comprehensive, cohesive and integrated set of applications used to capture, collect, integrate, store and analyse data with the purpose of generating and presenting information to support business decision making.
Decision support system (DSS) refers to an arrangement of computerised tools used to assist managerial decision making within a business. A decision support system is a methodology (or a series of methodologies) designed to extract information from data and to use such information as a basis for decision making.
Operational data are not best suited for decision support. From the end-user point of view, decision support data differ from operational data in three main areas: time span, granularity and dimensionality.
The data warehouse is an integrated, subject-orientated, time-variant, non-volatile collection of data that provides support for decision making. The data warehouse is usually a read-only database optimised for data analysis and query processing. A data mart is a small, single-subject data warehouse subset that provides decision support to a small group of people.
Online analytical processing (OLAP) refers to an advanced data analysis environment that supports decision making, business modelling and operations research.
Relational online analytical processing (ROLAP) provides OLAP functionality by using relational databases and familiar relational query tools to store and analyse multidimensional data. Multidimensional online analytical processing (MOLAP) provides OLAP functionality by using multidimensional database management systems (MDBMSs) to store and analyse multidimensional data.
The star schema is a data modelling technique used to map multidimensional decision support data into a relational database with the purpose of performing advanced data analysis. The basic star schema has four components: facts, dimensions, attributes and attribute hierarchies. Facts are numeric measurements or values representing a specific business aspect or activity. Dimensions are general qualifying categories that provide additional perspectives to a given fact. Conceptually, the multidimensional data model is best represented by a three-dimensional cube. Attributes can be ordered in well-defined attribute hierarchies. The attribute hierarchy provides a top-down organisation that is used for two main purposes: to permit aggregation and to provide drill-down/roll-up data analysis.
Data analytics is a subset of BI functionality that provides advanced data analysis tools to extract knowledge from business data. Data analytics can be divided into explanatory and predictive analytics. Explanatory analytics focuses on discovering and explaining data characteristics and relationships. Predictive analytics focuses on creating models to predict future outcomes or events based on the existing data.
Data mining automates the analysis of operational data with the intention of finding previously unknown data characteristics, relationships, dependencies and/or trends. The data mining process has four phases: data preparation, data analysis and classification, knowledge acquisition and prognosis.
SQL has been enhanced with analytic functions that support OLAP type processing and data generation.
Data visualisation provides visual representations of data that enhance the user's ability to comprehend the meaning of the data.
Key Terms
attribute hierarchy
business intelligence (BI)
cube cache
dashboard
data cube
data extraction
data filtering
data mart
data mining
data store
data visualisation
data warehouse
decision support system (DSS)
dimension tables
dimensions
drill-down
explanatory analytics
extraction, transformation and loading (ETL)
facts
fact table
governance
Key Performance Indicators (KPI)
master data management (MDM)
materialised view
metrics
multidimensional database management system (MDBMS)
multidimensional online analytical processing (MOLAP)
online analytical processing (OLAP)
partitioning
periodicity
portal
relational online analytical processing (ROLAP)
replication
roll-up
slice and dice
snowflake schema
sparsity
star schema
very large databases (VLDBs)
Further Reading
Finlay, S. Artificial Intelligence and Machine Learning for Business: A No-Nonsense Guide to Data Driven Technologies, 2nd edition. Relativistic, 2017.
Inmon, W. Building the Data Warehouse, 4th edition. Wiley Publishing, 2005.
Kimball, R. The Data Warehouse Toolkit, 3rd edition. Wiley Publishing, 2013.
Witten, I. and Frank, E. Data Mining: Practical Machine Learning Tools and Techniques (Morgan Kaufmann Series in Data Management Systems), 2016.

Online Content
Answers to selected Review Questions and Problems for this chapter are available on the online platform for this book.

Review Questions
1 What is business intelligence? Give some recent examples of BI usage, using the internet for assistance. What BI benefits have companies found?
2 Describe the BI framework. Illustrate the evolution of BI.
3 What are decision support systems, and what role do they play in the business environment?
4 Explain how the main components of the BI architecture interact to form a system. Describe the evolution of BI information dissemination formats.
5 What are the most relevant differences between operational and decision support data?
6 What is a data warehouse, and what are its main characteristics?
7 Give three examples of problems likely to be encountered when operational data are integrated into the data warehouse.
Use the following scenario to answer Questions 8-14. While working as a database analyst for a national sales organisation, you are asked to be part of its data warehouse project team.
8 Prepare a high-level summary of the main requirements for evaluating DBMS products for data warehousing.
9 Your data warehousing project group is arguing about prototyping a data warehouse before its implementation. The project group members are especially concerned about the need to acquire some data warehousing skills before implementing the enterprise-wide data warehouse. What would you recommend? Explain your recommendations.
10 Suppose you are selling the data warehouse idea to your users. How would you define multi-dimensional data analysis for them? How would you explain its advantages to them?
11 Before making a commitment, the data warehousing project group has invited you to provide an OLAP overview. The group's members are particularly concerned about the OLAP client/server architecture requirements and how OLAP will fit the existing environment. Your job is to explain to them the main OLAP client/server components and architectures.
12 One of your vendors recommends using an MDBMS. How would you explain this recommendation to your project leader?
13 The project group is ready to make a final decision, choosing between ROLAP and MOLAP. What should be the basis for this decision? Why?
14 The data warehouse project is in the design phase. Explain to your fellow designers how you would use a star schema in the design.
15 Trace the evolution of DSS from its origins to today's advanced analytical tools. Which major technologies influenced this evolution?
16 What is OLAP, and what are its main characteristics?
17 Explain ROLAP and give the reasons you would recommend its use in the relational database environment.
18 Explain the use of facts, dimensions and attributes in the star schema.
19 Explain multidimensional cubes and describe how the slice-and-dice technique fits into this model.
20 In the star schema context, what are attribute hierarchies and aggregation levels and what is their purpose?
21 Discuss the most common performance improvement techniques used in star schemas.
22 Explain some of the most important issues in data warehouse implementation.
23 What is data mining, and how does it differ from traditional DSS tools?
24 How does data mining work? Discuss the different phases in the data mining process.
25 Describe the characteristics of predictive analytics. What is the impact of Big Data in predictive analytics?
26 Describe data visualisation. What is the goal of data visualisation?
27 Is data visualisation only useful when used with Big Data? Explain and expand.
28 As a discipline, data visualisation can be studied as _______________ used to explore and discover data insights by applying: ______________, _________________ and _______________.
29 Describe the different types of data and how they map to star schemas and data analysis. Give some examples of the different data types.
30 Which five graphical data characteristics does data visualisation use to highlight and contrast data findings and convey a story?
Problems

Online Content
The databases used for this problem set are found on the online platform for this book. These databases are stored in Microsoft Access 2002 format. The databases, named 'Ch15_P1.mdb', 'Ch15_P3.mdb' and 'Ch15_P4.mdb', contain the data for Problems 1, 3 and 4, respectively. The data for Problem 2 are stored in Microsoft Excel format on the online platform for this book.

1 The university computer lab's director keeps track of lab usage, measured by the number of students using the lab. This particular function is important for budgeting purposes. The computer lab director assigns you the task of developing a data warehouse to keep track of the lab usage statistics. The main requirements for this database are to:
Show the total number of users by different time periods.
Show usage numbers by time period, by major and by student classification.
Compare usage for different majors and different semesters.
Use the Ch15_P1.mdb database, which includes the following tables:
USELOG contains the student lab access data.
STUDENT is a dimension table containing student data.
Given the three bulleted requirements and using the Ch15_P1.MDB data, complete Problems 1a-1g.
a Define the main facts to be analysed. (Hint: These facts become the source for the design of the fact table.)
b Define and describe the appropriate dimensions. (Hint: These dimensions become the source for the design of the dimension tables.)
c Draw the lab usage star schema, using the fact and dimension structures you defined in Problems 1a and 1b.
d Define the attributes for each of the dimensions in Problem 1b.
e Recommend the appropriate attribute hierarchies.
f Implement your data warehouse design, using the star schema you created in Problem 1c and the attributes you defined in Problem 1d.
g Create the reports that will meet the requirements listed in this problem's introduction.

2 Victoria Ephanor manages a small product distribution company. Because the business is growing fast, she recognises that it is time to manage the vast information pool to help guide the accelerating growth. Ms Ephanor, who is familiar with spreadsheet software, currently employs a small sales force of four people. She asks you to develop a data warehouse application prototype that will enable her to study sales figures by year, region, salesperson and product. (This prototype is to be used as the basis for a future data warehouse database.) Using the data supplied in the Ch15-P2.xls file, complete the following seven problems:
a Identify the appropriate fact table components.
b Identify the appropriate dimension tables.
c Draw a star schema diagram for this data warehouse.
d Identify the attributes for the dimension tables that will be required to solve this problem.
e Using a Microsoft Excel spreadsheet (or any other spreadsheet capable of producing pivot tables), generate a pivot table to show the sales by product and by region. The end user must be able to specify the display of sales for any given year. (The sample output is shown in the first pivot table in Figure P15.1.)
f Using Problem 2e as your base, add a second pivot table (see Figure P15.1) to show the sales by salesperson and by region. The end user must be able to specify sales for a given year or for all years and for a given product or for all products.
g Create a 3-D bar graph to show sales by salesperson, by product, and by region. (See the sample output in Figure P15.2.)

Figure P15.1 Using a pivot table
Figure P15.2 3-D bar graph showing the relationships among the agent, product and region
3 David Suker, the inventory manager for a marketing research company, is interested in studying the use of supplies within the different company departments. Mr Suker has heard that his friend, Ms Ephanor, has developed a small spreadsheet-based data warehouse model (see Problem 2) that she uses to analyse sales data. Mr Suker is interested in developing a small data warehouse model like Ms Ephanor's so he can analyse orders by department and by product. He will use Microsoft Access as the data warehouse DBMS and Microsoft Excel as the analysis tool.
a Develop the order star schema.
b Identify the appropriate dimension attributes.
c Identify the attribute hierarchies required to support the model.
d Develop a crosstab report (in Microsoft Access), using a 3-D bar graph to show orders by product and by department. (The sample output is shown in Figure P15.3.)
FIGURE P15.3 Crosstab report: orders by product and department
4 ROBCOR, whose sample data are contained in the database named Ch15_P4.mdb, provides on-demand aviation charters, using a mix of different aircraft and aircraft types. Because ROBCOR has grown rapidly, it hires you to be its first database manager. (The company's database, developed by an outside consulting team, already has a charter database in place to help manage all of its operations.) Your first critical assignment is to develop a decision support system to analyse the charter data. (Review Problems 24-28 in Chapter 3, Relational Model Characteristics, in which the operations have been described.) The charter operations manager wants to be able to analyse charter data such as cost, hours flown, fuel used and revenue. She would also like to be able to drill down by pilot, type of aircraft and time periods. Given those requirements, complete the following:
a Create a star schema for the charter data.
b Define the dimensions and attributes for the charter operations star schema.
c Define the necessary attribute hierarchies.
d Implement the data warehouse design, using the design components you developed in Problems 4a-4c.
e Generate the reports that will illustrate that your data warehouse meets the specified information requirements.
Using the data provided in the SaleCo snowflake schema in Figure 15.24, solve the following problems.
Online Content The script files used to populate the database are available on the online platform for this book. The script files assume an Oracle RDBMS. If you use a different DBMS, consult the documentation to verify whether the vendor supports similar functionality and what the proper syntax is for your DBMS.
5 What is the SQL command to list the total sales by customer and by product, with subtotals by customer and a grand total for all product sales? (Hint: Use the ROLLUP command.)
6 What is the SQL command to list the total sales by customer, month and product, with subtotals by customer and by month and a grand total for all product sales? (Hint: Use the ROLLUP command.)
7 What is the SQL command to list the total sales by region and customer, with subtotals by region and a grand total for all sales? (Hint: Use the ROLLUP command.)
8 What is the SQL command to list the total sales by month and product category, with subtotals by month and a grand total for all sales? (Hint: Use the ROLLUP command.)
9 What is the SQL command to list the number of product sales (number of rows) and total sales by month, with subtotals by month and a grand total for all sales? (Hint: Use the ROLLUP command.)
10 What is the SQL command to list the number of product sales (number of rows) and total sales by month and product category, with subtotals by month and product category and a grand total for all sales? (Hint: Use the ROLLUP command.)
11 What is the SQL command to list the number of product sales (number of rows) and total sales by month, product category and product, with subtotals by month and product category and a grand total for all sales? (Hint: Use the ROLLUP command.)
12 Using the answer to Problem 10 as your base, which command would you need to generate the same output but with subtotals in all columns? (Hint: Use the CUBE command.)
13 Create your own data analysis and visualisation presentation. The purpose of this project is for you to search for a publicly available data set using the internet and create your own presentation using what you have learnt in this chapter.
a Search for a data set that interests you and download it. Some examples of public data set sources are (see also Note on page 816):
www.data.gov
http://data.worldbank.org
http://aws.amazon.com/datasets
http://usgovxml.com/
https://data.medicare.gov/
www.faa.gov/data_research/
b Use any tool available to you to analyse the data. You can use tools such as Microsoft Excel Pivot Tables, Pivot Charts, or other free tools, such as Google Fusion tables, Tableau free trial, IBM Many Eyes, etc.
c Create a short presentation to explain some of your findings (what the data sources are, where the data comes from, what the data represents, etc.).
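The ROLLUP and CUBE hints in Problems 5-12 differ only in which groupings they add to an ordinary GROUP BY. The following is a minimal Python sketch of that difference, not a solution to the problems: the three sales rows, the column order (region, product) and the function names are hypothetical and are not taken from the SaleCo data. None stands in for the NULL that SQL shows in subtotal rows.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical sales rows: (region, product, amount). Not SaleCo data.
SALES = [
    ("East", "Widget", 100.0),
    ("East", "Gadget", 50.0),
    ("West", "Widget", 70.0),
]

def grouped_totals(rows, keep):
    """Sum amounts per group, keeping only the column positions in `keep`.
    Columns that are aggregated away appear as None, like NULL in SQL output."""
    totals = defaultdict(float)
    for *dims, amount in rows:
        key = tuple(d if i in keep else None for i, d in enumerate(dims))
        totals[key] += amount
    return dict(totals)

def rollup(rows, n_dims):
    """ROLLUP(c1..cn): groupings (c1..cn), (c1..cn-1), ..., down to the grand total."""
    result = {}
    for k in range(n_dims, -1, -1):
        result.update(grouped_totals(rows, set(range(k))))
    return result

def cube(rows, n_dims):
    """CUBE(c1..cn): one grouping for every subset of the grouping columns."""
    result = {}
    for k in range(n_dims + 1):
        for keep in combinations(range(n_dims), k):
            result.update(grouped_totals(rows, set(keep)))
    return result
```

With two grouping columns, rollup yields detail rows, one subtotal per region and a grand total, whereas cube additionally yields one subtotal per product. This is the sense in which CUBE gives "subtotals in all columns".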
CHAPTER 16 Big Data and NoSQL
IN THIS CHAPTER, YOU WILL LEARN:
The role of Big Data in modern business
The primary characteristics of Big Data and how these go beyond the traditional 3 Vs
How the core components of the Hadoop framework operate
To identify the major components of the Hadoop ecosystem
To summarise the four major approaches of the NoSQL data model and how they differ from the relational model
To describe the characteristics of NewSQL databases
How to work with document databases using MongoDB
How to work with graph databases using Neo4j
PREVIEW
In Chapter 2, Data Models, you were introduced to the emerging NoSQL data model and the Big Data problem that has led to NoSQL's development. In this chapter, you learn about these issues in much greater detail. You will also learn about the technologies that have developed, and continue to be developed, to address Big Data. First, you learn about the low-level technologies in the Hadoop framework. Hadoop has become a standard component in organisations' efforts to address Big Data. Next, you learn about the higher-level approaches of the NoSQL data model to developing non-relational databases, such as key-value databases, document databases, column-oriented databases and graph databases. You also learn about NewSQL databases, which try to bridge the gap between relational databases and NoSQL. Finally, you explore two specific NoSQL database products, MongoDB and Neo4j, through hands-on tutorials in Online Appendixes Q and R respectively. Just as with relational databases, the tutorials cover the ability to perform basic data management activities: storing new data, retrieving and updating existing data, and removing old data.
The relational model has been the dominant data model in database management systems for decades and, during that time, it has faced challenges such as object-oriented databases and the development of data warehouses. The relational model and the tools based on it have evolved to adapt to these challenges and remain dominant in the data management arena. In each case the challenge arose because new technological advances changed businesses' perceptions of what is possible, and created new opportunities for organisations to create value from increased data leverage. The latest of these challenges is Big Data. Big Data is an ill-defined term that describes a new wave of data storage and manipulation possibilities and requirements. Organisations' efforts to store, manipulate and analyse this new wave of data represent one of the most urgent emerging trends in the database field. The challenges of dealing with the wave of Big Data have led to the development of NoSQL databases that reject many of the underlying assumptions of the relational model. Although the term Big Data lacks a consistent definition, there is a set of characteristics generally associated with it.
16.1 BIG DATA
Big Data generally refers to a set of data that displays the characteristics of volume, velocity, variety, veracity and value (the 5 Vs) to an extent that makes the data unsuitable for management by a relational database management system. These characteristics can be defined as follows:
Volume: the quantity of data to be stored
Velocity: the speed at which data is entering the system
Variety: the variations in the structure of the data to be stored
Veracity: the trustworthiness of the data in terms of quality and accuracy, as data must be verified before a business acts upon it
Value: the worth of the data to the business. Given the cost of storage and processing of Big Data, the value gained from mining the data, such as generating new revenue, has to outweigh that cost.
Notice the lack of specific values associated with these characteristics. This lack of specificity is what leads to the ambiguity in defining Big Data. What was Big Data five years ago might not be considered Big Data now. Similarly, something considered Big Data now might not be considered Big Data five years from now. The key is that the characteristics are present to an extent that the current relational database technology struggles with managing the data.
Further adding to the problem of defining Big Data is that there is some disagreement among pundits about which of the 5 Vs must be present for a data set to be considered Big Data. Originally, Big Data was conceived, as shown in Figure 16.1, as a combination of the original 3 Vs: volume, velocity and variety. Web data, a combination of text, graphics, video and audio sources combined into complex structures, created new challenges for data management that involve all three characteristics. After the dot-com bubble burst in the 1990s, many start-up Web-based companies failed, but the companies that survived experienced significant growth as Web commerce consolidated into a smaller set of businesses. As a result, companies like Google and Amazon experienced significant growth and were among the first to feel the pressure of managing Big Data. The success of social media giant Facebook quickly followed, and these companies became pioneers in creating new technologies to address Big Data problems. Google created the Bigtable data store, Amazon created Dynamo, and Facebook created Cassandra (technologies that are discussed later in this chapter), to deal with the growing need to store and manage large sets of data that had the characteristics of the 3 Vs.
FIGURE 16.1 Original view of Big Data [diagram labels: Volume, Velocity, Variety, Big Data]
Although social media and Web data have been at the forefront of perceptions of Big Data issues, other organisations have Big Data issues too. More recently, changes in technology have increased the opportunities for businesses to automatically generate and track data so that Big Data has been redefined as involving any, but not necessarily all, of the 5 Vs. Advances in technology have led to a vast array of user-generated data and machine-generated data that can spur growth in specific areas.
For example, Disney World has introduced Magic Bands for park visitors to wear on their wrists. Each visitor's Magic Band is connected to much of the data that Disney stores about that individual. These bands use radio frequency identification (RFID) and near-field communications (NFC) to act as tickets for rides, hotel room keys, and even credit cards within the park. The bands can be tracked so that Disney systems can follow individuals as they move through the park, record with which Disney characters (who are also tracked) they interact, purchases made, wait time in lines, and more. Visitors can make reservations at a restaurant and order meals through a Disney app on their smartphones and, by tracking the Magic Bands, the restaurant staff know when the visitors arrive for their reservation, can track at which table they are seated, and deliver their meals within minutes of the guests sitting down. With the many cameras mounted throughout the park, Disney can also capture pictures and short videos of the visitors throughout their stay in the park to produce a personalised movie of their vacation experience, which can then be sold to the visitors as souvenirs. All of this involves the capture of a constant stream of data from each band, processed in real time. Considering the tens of thousands of visitors to Disney World each day, each with their own Magic Band, the volume, velocity, and variety of the data are enormous.
16.1.1 Volume
Volume, the quantity of data to be stored, is a key characteristic of Big Data. The storage capacities associated with Big Data are extremely large. Table 16.1 provides definitions for units of data storage capacity.
TABLE 16.1 Storage capacity units
Term         Capacity        Abbreviation
Bit          0 or 1 value    b
Byte         8 bits          B
Kilobyte     1 024* bytes    KB
Megabyte     1 024 KB        MB
Gigabyte     1 024 MB        GB
Terabyte     1 024 GB        TB
Petabyte     1 024 TB        PB
Exabyte      1 024 PB        EB
Zettabyte    1 024 EB        ZB
Yottabyte    1 024 ZB        YB
*Note that because bits are binary in nature and are the basis on which all other storage values are based, all values for data storage units are defined in terms of powers of 2. For example, the prefix kilo typically means 1000; however, in data storage, a kilobyte = 2^10 = 1024 bytes.
Naturally, as the quantity of data needing to be stored increases, the need for larger storage devices also increases. When this occurs, systems can either scale up or scale out. Scaling up is keeping the same number of systems, but migrating each system to a larger system: for example, changing from a server with 16 CPU cores and a 1 TB storage system to a server with 64 CPU cores and a 100 TB storage system. Scaling up involves moving to larger and faster systems. However, there are limits to how large and fast a single system can be. Further, the costs of these high-powered systems increase at a dramatic rate. On the other hand, scaling out means that, when the workload exceeds the capacity of a server, the workload is spread out across a number of servers. This is also referred to as clustering: creating a cluster of low-cost servers to share a workload. This can help to reduce the overall cost of the computing resources since it is cheaper to buy ten 100 TB storage systems than it is to buy a single 1 PB storage system. Make no mistake, organisations need storage capacities in these extreme sizes. Organisations such as eBay collect clickstream data that easily reaches dozens of petabytes. Data in these warehouses can be petabytes in size and spread over hundreds of thousands of nodes.
Recall from Chapter 3 that one of the greatest advances represented by the relational model was the development of an RDBMS, a sophisticated database system that could hide the complexity of the underlying data storage and manipulation from the user, so that the data always appears to be in tables. To carry out these functions, the DBMS acts as the brain of the database system and must maintain control over all of the data in the database. As discussed in Chapter 12, it is possible to distribute a relational database over multiple servers using replication and fragmentation. However, because the DBMS must act as a single point of control for all of the data in the database, distributing the database across multiple systems requires a high degree of communication and coordination across the systems. There are significant limits associated with the ability to distribute the DBMS due to the increased performance costs of communication and coordination as the number of nodes grows. This limits the degree to which a relational database can be scaled out as data volume grows, and it makes RDBMSs ill-suited for clusters.
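The powers-of-2 convention behind the storage units in Table 16.1 can be checked with a short sketch. This is an illustration only; the function names `to_bytes` and `humanise` are hypothetical helpers introduced here.

```python
# Unit abbreviations from Table 16.1, from byte upwards; each step is a factor of 1024.
UNITS = ["B", "KB", "MB", "GB", "TB", "PB", "EB", "ZB", "YB"]

def to_bytes(value, unit):
    """Convert a capacity in the given unit to bytes using powers of 2."""
    return value * 1024 ** UNITS.index(unit)

def humanise(n_bytes):
    """Express a byte count in the largest unit that keeps the value below 1024."""
    value = float(n_bytes)
    for unit in UNITS:
        if value < 1024 or unit == UNITS[-1]:
            return f"{value:g} {unit}"
        value /= 1024

# Scaling out: ten 100 TB systems versus one 1 PB system.
ten_small = to_bytes(10 * 100, "TB")
one_large = to_bytes(1, "PB")
```

Under this binary convention, ten 100 TB systems hold 1000 TB, slightly less than the 1024 TB of a single 1 PB system, so the comparison in the text is approximate rather than exact.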
NOTE
Although some RDBMS products, such as SQL Server and Oracle Real Application Clusters, legitimately claim to support clusters, these clusters are limited in scope and generally rely on a single, shared data storage subsystem, such as a storage area network.
16.1.2 Velocity
Velocity, another key characteristic of Big Data, refers to the rate at which new data enters the system as well as the rate at which the data must be processed. In many ways, the issues of velocity mirror those of volume. For example, consider a web retailer such as Amazon. In the past, a retail store might capture only the data about the final transaction of a customer making a purchase. Today, a retailer like Amazon captures not only the final transaction, but every click of the mouse in the searching, browsing, comparing and purchasing process. Instead of capturing one event (the final sale) in a 20-minute shopping experience, it might capture data on 30 events during that 20-minute time frame, a 30× increase in the velocity of the data. Other advances in technology, such as RFID, GPS and NFC, add new layers of data-gathering opportunities that often generate large amounts of data that must be stored in real time. For example, RFID tags can be used to track items for inventory and warehouse management. The tags do not require line-of-sight between the tag and the reader, and the reader can read hundreds of tags simultaneously while the products are still in boxes. This means that, instead of a single record for tracking a given quantity of a product being produced, each individual product is tracked, creating an increase of several orders of magnitude in the amount of data being delivered to the system at any one time.
In addition to the speed with which data is entering the system, for Big Data to be actionable, that data must be processed at a very rapid pace. The velocity of processing can be broken down into two categories:
Stream processing
Feedback loop processing
Stream processing focuses on input processing, and it requires analysis of the data stream as it enters the system. In some situations, large volumes of data can enter the system at such a rapid pace that it is not feasible to try to store all of the data. The data must be processed and filtered as it enters the system to determine which data to keep and which data to discard. For example, at the CERN Large Hadron Collider, the largest and most powerful particle accelerator in the world, experiments produce about 600 TB per second of raw data. Scientists have created algorithms to decide ahead of time which data will be kept. These algorithms are applied in a two-step process to filter the data down to only about 1 GB per second of data that will actually be stored.1
Feedback loop processing refers to the analysis of the data to produce actionable results. While stream processing could be thought of as focused on inputs, feedback loop processing can be thought of as focused on outputs. The process of capturing the data, processing it into usable information, and then acting on that information is a feedback loop. Figure 16.2 shows a feedback loop for providing recommendations for book purchases. Feedback loop processing to provide immediate results requires analysing large amounts of data within just a few seconds so that the results of the analysis can become a part of the product delivered to the user in real time. Not all feedback loops are used for inclusion of results within immediate data products. Feedback loop processing is also used to help organisations sift through terabytes and petabytes of data to inform decision makers and help them make faster strategic and tactical decisions. It is also a key component in data analytics.
FIGURE 16.2 Feedback loop processing [diagram: the user clicks on a link for a book; data is captured about the user and about the book requested; data is analysed to determine other books and products the user may like; a list of recommended items is added to the user request; the information requested by the user plus information on recommendations are returned]
1 CERN, Processing: What to record?, http://home.web.cern.ch/about/computing/processing-what-record, August 20, 2015.
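Stream processing, as described above, decides what to keep while the data is still arriving, so that most of it is never stored. The following is a minimal sketch of a two-step filter using Python generators. The synthetic `energy` readings, the 0.9 threshold and the keep-every-tenth rule are assumptions made for illustration; they are not CERN's actual algorithms.

```python
import random

def sensor_events(n, seed=0):
    """Yield n synthetic readings; stands in for a high-velocity input stream."""
    rng = random.Random(seed)
    for i in range(n):
        yield {"id": i, "energy": rng.random()}

def first_pass(stream, threshold):
    """Step 1: a cheap test applied to every event as it arrives."""
    return (e for e in stream if e["energy"] >= threshold)

def second_pass(stream, keep_every):
    """Step 2: further thinning of the survivors (here, keep every k-th one)."""
    return (e for i, e in enumerate(stream) if i % keep_every == 0)

def run(n=10_000):
    """Push n events through both steps and return only what would be stored."""
    return list(second_pass(first_pass(sensor_events(n), 0.9), 10))
```

Because each step is a generator, no event is held in memory longer than it takes to test it. Only the events returned by `run` would be written to storage.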
16.1.3 Variety
In the Big Data context, variety refers to the vast array of formats and structures in which the data may be captured. Data can be considered to be structured, unstructured, or semi-structured. Structured data is data that has been organised to fit a predefined data model. Unstructured data is data that is not organised to fit into a predefined data model. Semi-structured data combines elements of both: some parts of the data fit a predefined model while other parts do not. Relational databases rely on structured data. A data model is created by the database designer based on the business rules, as discussed in Chapter 4. As data enters the database, the data are decomposed and routed for storage in the corresponding tables and columns as defined in the data model. Although much of the transactional data that organisations use work well in a structured environment, most of the data in the world is semi-structured or unstructured. Unstructured data includes maps, satellite images, emails, texts, tweets, videos, transcripts, and a whole host of other data forms. Over the decades that the relational model has been dominant, relational databases have evolved to address some forms of unstructured data. For example, most large-scale RDBMSs support a binary large object (BLOB) data type that allows the storage of unstructured objects like audio, video and graphic data as a single, atomic value. One problem with BLOB data is that the semantic value of the data, the meaning that the object conveys, is inaccessible and uninterpretable by data processing.
Big Data requires that the data be captured in whatever format it naturally exists in, without any attempt to impose a data model or structure on the data. This is one of the key differences between processing data in a relational database and Big Data processing. Relational databases impose a structure on the data when the data is captured and stored. Big Data processing imposes a structure on the data as needed for applications as a part of retrieval and processing. One advantage of providing structure during retrieval and processing is the flexibility of being able to structure the data in different ways for different applications.
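The contrast drawn above, structure imposed at capture versus structure imposed at retrieval, can be sketched as follows. The two JSON records, their field names and the two "view" functions are hypothetical examples. The point is only that the raw data is stored untouched and each application applies its own structure when it reads.

```python
import json

# Raw records stored exactly as they arrived; note the two have different fields.
RAW_EVENTS = [
    '{"user": "u1", "action": "click", "link": "/books/42"}',
    '{"user": "u2", "text": "I love it!", "lang": "en"}',
]

def store_raw(lines):
    """Capture step: keep each record in its native format, with no data model."""
    return list(lines)

def clicks_view(raw):
    """One application's structure, imposed only at retrieval time."""
    out = []
    for line in raw:
        rec = json.loads(line)
        if rec.get("action") == "click":
            out.append((rec["user"], rec["link"]))
    return out

def text_view(raw):
    """A different application structures the same raw data differently."""
    return [(r["user"], r["text"]) for r in map(json.loads, raw) if "text" in r]
```

A relational design would instead have to decide on tables and columns before either record could be stored, and a record that did not fit the model would need a schema change or a BLOB column.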
16.1.4 Veracity
Veracity refers to the trustworthiness of the data. Can decision makers reasonably rely on the accuracy of the data and the information generated from it? Veracity is becoming more important for businesses that make decisions based on the data they collect. Due to the variety of Big Data, in terms of the different formats it takes, data quality and accuracy are less controllable. Uncertainty about the data can arise from several causes, such as having to capture only selected portions of the data due to high velocity. Also, in terms of sentiment analysis, customers' opinions and preferences can change over time, so comments at one point in time might not be suitable for action at another point in time. When utilising Big Data, it is important that the data source is validated where possible.
16.1.5 Value Given the
costs
has value.
Value, also called
viability,
information
can add value to the
meaningful In
order
to
advanced
of processing,
that
create
value,
algorithms
data. Information
that
through
after
contacts
predictive
analysing
to the
must
data types to the
an insurance calls
to
models to
hidden
such
as
and used to that
use to
a business
unless
data can be analysed
the
use
patterns
market
and
trends
amongst
data
and
and
current
other insurance
and
products;
analytics,
new
it
to provide
which
customer
insurance website
new
utilises
knowledge
within the
buying
making across the
about
persons,
of
and
drive decision
collects
objects
distribution
of no
which the
through
discover
company
cross-sell
Data, it is
organisation.
business,
etc.), insured
at risk
Big
degree to
be actionable
Big Data analytics
by looking
of different insurance
and
refers
Data
valuable
phone
company
by using
is
analytics,
(surveys,
to the
Big
on different
can be realised example,
storing
claims, usage,
customers;
patterns,
business.
all customer
can
create
increase
and even perform
For
value
turnover
analytical
pricing
contracts.
NOTE While the
value
important
to
(GDPR)
became
on 25
May
of the
Big
this
and have to
right
to
ask for online,
of the logic
used
to
ethically
all
businesses
actions
and
consent
is
given.
of how the
was automatically
algorithm
that
made the
organisations
be subject
A person
decision
rejected,
General that
is reached. they
would
collect
automated
subject
to
For example, have
a right
if to
also
Regulation
and
process
data
exchange
detailed
decision such
it is
Protection
businesses
major changes
to
who is
Data analytics, Data
most
One of the
not to
by Big
The
requirement,
with its requirements.
explicit
informed
and legally.
Union legal
of an individual
an explanation
by the
both
a European
comply
and
measurable
used for
is
the rights
unless
a bank loan
data is
requirement
Although
profiling,
has the
the
a legal
GDPR includes
includes
Data is linked
that
2018.
internationally
for
of
ensure
in
data
Article
making,
decision
making
a person ask for
was to an
22
which now apply
explanation
decision.2
2 Crockett, K., Goltz, S. and Garratt, M. GDPR Impact on Computational Intelligence Research, IEEE International Joint conference on Artificial Neural Networks (IJCNN), DOI: 10.1109/IJCNN.2018.8489614, ISSN: 2161-4407, 2018.
16.1.6 Other Characteristics Characterising
Big
Data
with the
5 Vs is fairly
standard.
However,
as the industry
matures,
other
characteristics have been put forward as being equally important. Keeping with the spirit of the 5 Vs, these additional characteristics are typically presented as additional Vs. Variability refers to the changes in the meaning of the data based on context. While variety and variability are similar terms, they mean distinctly
different
things
in
Big
Data.
Variety
is
about
differences
in
structure.
Variability
is
about
differences in meaning. Variability is especially relevant in areas such as sentiment analysis that attempt to understand the meanings of words. Sentiment analysis is a method of text analysis that attempts to determine whether a statement conveys a positive, negative, or neutral attitude about a topic. For example, consider the statements I just bought a new smartphone I love it! and The screen on my new smartphone
shattered
the first time I dropped
it
I love it!
In the first
statement,
the
presence
of
the phrase I love it might help an algorithm correctly interpret the statement as expressing a positive attitude. However, the second statement uses sarcasm to express a negative attitude, so the presence of the phrase I love it may cause the analysis to interpret the meaning of the phrase incorrectly. The final characteristic of Big Data is visualisation. Visualisation is the ability to present the data graphically
in
such
a way as to
make it
understandable.
Volumes
of data can leave
decision
makers
awash in facts but withlittle understanding of what the facts mean. Visualisation is a way of presenting the facts so that decision makers can comprehend the meaning of the information to gain insights. An argument could be madethat these additional Vs are not necessarily characteristics of Big Data; or, perhaps
more accurately,
they
are not characteristics
of only
Big Data. Visualisation
was discussed
and illustrated at length in Chapter 15 as an important tool in working with data warehouses, which are often maintained as structured data stores in RDBMS products. The important thing to remember is that these characteristics that play animportant part in working with data in the relational model are universal and also apply to Big Data. Big Data represents
a new wave in data management challenges, but it does not mean that relational database technology is going away. Structured data that depends on ACIDS (atomicity, consistency, isolation, durability, and serialisability) transactions, as discussed in Chapter 12, will always be critical to business operations. Relational databases are still the best way of storing and managing this type of data. What has changed is that now, for the first time in decades, relational databases are not necessarily the best way of storing and managing all of an organisation's data. Since the rise of the relational model, the decision for data managers when faced with new storage requirements was not whether to use a relational database, but which relational DBMS to use. Now, the decision of whether to use a relational database at all is a real question. This has led to polyglot persistence: the coexistence
of a variety of data storage and management technologies within an organisation's infrastructure.
Scaling up, as discussed, is often considered a viable option as relational databases grow. However, it has practical limits and cost considerations that make it unfeasible for many Big Data installations. Scaling out into clusters based on low-cost commodity servers is the dominant approach that organisations are currently pursuing for Big Data management. As a result, new technologies not based on the relational model have been developed.

16.2 HADOOP
Big Data requires a different approach to distributed data storage that is designed for large-scale clusters. Although other implementation technologies are possible, Hadoop has become the de facto standard for most Big Data storage and processing. Hadoop is not a database. Hadoop is a Java-based framework for distributing and processing very large data sets across clusters of computers. While the Hadoop framework includes many parts, the two most important components are the Hadoop Distributed File System (HDFS) and MapReduce. HDFS is a low-level distributed file processing system, which means that it can be used directly for data storage. MapReduce is a programming model that supports processing large data sets in a highly parallel, distributed manner. While it is possible to use HDFS and MapReduce separately, the two technologies complement each other so that they work better together as a Hadoop system. Hadoop was engineered specifically to distribute and process enormous amounts of data across vast clusters of servers.
16.2.1 Hadoop Distributed File System
The Hadoop Distributed File System (HDFS) approach to distributing data is based on several key assumptions:
High volume. The volume of data in Big Data applications is expected to be in terabytes, petabytes, or larger. Hadoop assumes that files in the HDFS will be extremely large. Data in the HDFS is organised into physical blocks, just as in other types of file storage. For example, on a typical personal computer, file storage is organised into blocks that are often 512 bytes in size, depending on the hardware and operating system involved. Relational databases often aggregate these into database blocks. By default, Oracle organises data into 8 KB physical blocks. Hadoop, on the other hand, has a default block size of 64 MB (8000 times the size of an Oracle block!), and it can be configured to even larger values. As a result, the number of blocks per file is greatly reduced, simplifying the metadata overhead of tracking the blocks in each file.
Write-once, read-many. Using a write-once, read-many model simplifies concurrency issues and improves overall data throughput. Using this model, a file is created, written to the file system, and then closed. Once the file is closed, changes cannot be made to its contents. This improves overall system performance and works well for the types of tasks performed by many Big Data applications. Although existing contents of the file cannot be changed, recent advancements in the HDFS allow for files to have new data appended to the end of the file. This is a key advancement for NoSQL databases because it allows for database logs to be updated.
Streaming access. Unlike transaction processing systems, where queries often retrieve small pieces of data from several different tables, Big Data applications typically process entire files. Instead of optimising the file system to randomly access individual data elements, Hadoop is optimised for batch processing of entire files as a continuous stream of data.
Fault tolerance. Hadoop is designed to be distributed across thousands of low-cost, commodity computers. It is assumed that, with thousands of such devices, at any point in time, some will experience hardware errors. Therefore, the HDFS is designed to replicate data across many different devices so that when one device fails, the data is still available from another device. By default, Hadoop uses a replication factor of three, meaning that each block of data is stored on three different devices. Different replication factors can be specified for each file, if desired.
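The effect of block size and replication on bookkeeping can be illustrated with a little arithmetic. The following Python sketch is only an illustration: the 10 GB file size and the function names are assumptions chosen for the example, while the 64 MB and 8 KB block sizes and the replication factor of three come from the discussion above.

```python
KB = 1024
MB = 1024 * KB
GB = 1024 * MB

def blocks_needed(file_size: int, block_size: int) -> int:
    """Number of fixed-size blocks required to hold file_size bytes (rounded up)."""
    return -(-file_size // block_size)

def block_replicas(file_size: int, block_size: int, replication: int = 3) -> int:
    """Total block copies in the cluster when every block is replicated."""
    return blocks_needed(file_size, block_size) * replication

file_size = 10 * GB                               # hypothetical file size
hdfs_blocks = blocks_needed(file_size, 64 * MB)   # blocks to track with 64 MB blocks
small_blocks = blocks_needed(file_size, 8 * KB)   # blocks to track with 8 KB blocks

print(hdfs_blocks, small_blocks, small_blocks // hdfs_blocks)
print(block_replicas(file_size, 64 * MB))
```

With binary units the ratio between the two block counts works out to 8192, which is the "8000 times" figure quoted above in round numbers. Fewer blocks per file means fewer entries to track per file.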
Hadoop uses several types of nodes. A node is just a computer that performs one or more types of tasks within the system. Within the HDFS, there are three types of nodes: the client node, the name node and one or more data nodes, as depicted in Figure 16.3.
FIGURE 16.3 Hadoop Distributed File System (HDFS)
[Figure: a client node; a name node holding the metadata (File1: Blocks 1,3,4: r3; File2: Blocks 2,5,6: r3); and Data Nodes 1 to 4, with each of Blocks 1 to 6 stored on three of the four data nodes.]

Data nodes store the actual file data within the HDFS. Recall that files in HDFS are broken into blocks and are replicated to ensure fault tolerance. As a result, each block is duplicated on more than one data node. Figure 16.3 shows the default replication factor of three, so each block appears on three data nodes.
The name node contains the metadata for the file system. There is typically only one name node within a HDFS cluster. The metadata is designed to be small, simple, and easily recoverable. Keeping the metadata small allows the name node to hold all of the metadata in memory to reduce disk accesses and improve system performance. This is important because there is only one name node, so contention for the name node is minimised. The metadata is composed primarily of the name of each file, the block numbers that comprise each file, and the desired replication factor for each file.
The client node makes requests to the file system, either to read files or to write new files, as needed to support the user application.
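To make the shape of this metadata concrete, the following Python sketch models a name node that records, for each file, its block numbers and replication factor, using the two files shown in Figure 16.3. It is a toy model, not Hadoop code: the class and method names are hypothetical, and the round-robin choice of data nodes is an assumption made purely for illustration (the text does not describe how data nodes are selected).

```python
from dataclasses import dataclass, field

@dataclass
class FileEntry:
    blocks: list[int]          # block numbers that comprise the file
    replication: int = 3       # desired replication factor

@dataclass
class NameNode:
    data_nodes: list[str]
    files: dict[str, FileEntry] = field(default_factory=dict)
    locations: dict[int, list[str]] = field(default_factory=dict)
    _next: int = 0             # round-robin cursor (illustrative assumption)

    def add_file(self, name: str, blocks: list[int], replication: int = 3) -> None:
        """Record a new file and choose data nodes for each of its blocks."""
        if replication > len(self.data_nodes):
            raise ValueError("not enough data nodes for the replication factor")
        self.files[name] = FileEntry(list(blocks), replication)
        for block in blocks:
            self.locations[block] = self._pick_nodes(replication)

    def _pick_nodes(self, count: int) -> list[str]:
        """Pick `count` distinct data nodes, rotating the starting node each time."""
        n = len(self.data_nodes)
        chosen = [self.data_nodes[(self._next + i) % n] for i in range(count)]
        self._next = (self._next + 1) % n
        return chosen

    def locate(self, name: str) -> dict[int, list[str]]:
        """Return, for each block of the file, the data nodes holding a copy."""
        return {b: self.locations[b] for b in self.files[name].blocks}

nn = NameNode(["dn1", "dn2", "dn3", "dn4"])
nn.add_file("File1", [1, 3, 4])   # File1: Blocks 1,3,4: r3
nn.add_file("File2", [2, 5, 6])   # File2: Blocks 2,5,6: r3
print(nn.locate("File1"))
```

Note that only file names, block numbers, replication factors and block locations pass through this structure; the file contents themselves never do, which is what keeps the metadata small enough to hold in memory.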
When a client node needs to create a new file, it communicates with the name node. The name node:
Adds the new file name to the metadata
Determines a new block number for the file
Determines a list of the data nodes on which the block will be stored
Passes that information back to the client node
The client node contacts the first data node specified by the name node and begins writing the file on that data node. At the same time, the client node sends the data node the list of other data nodes that will be replicating the block. As the data is received from the client node, the data node contacts the next data node in the list and begins sending the data to this node for replication. This second data node then contacts the next data node in the list, and the process continues with the data being streamed across all of the data nodes that are storing that block. Once the first block is written, the client node can get another block number and list of data nodes from the name node for the next block. When the entire file has been written, the client node informs the name node that the file is closed. It is important to note that at no time was any of the data file actually transmitted to the name node. This helps to reduce the data flow to the name node to avoid congestion that could slow system performance.
Similarly, if a client node needs to read a file, it contacts the name node to request the list of blocks associated with that file and the data nodes that hold them. Given that each block may appear on many data nodes, for each block the client attempts to retrieve the block from the data node that is closest to it on the network. Using this information, the client node reads the data directly from each of the data nodes.
Periodically, each data node communicates with the name node. The data nodes send block reports and heartbeats. A block report is sent every six hours and informs the name node of which blocks are on that data node. Heartbeats are sent every three seconds. A heartbeat is used to let the name node know that the data node is still available. If a data node experiences a fault, due to hardware failure, power outage, and so on, then the name node will not receive a heartbeat from that data node. As a result, the name node knows not to include that data node in lists to client nodes for reading or writing files. If the lack of a heartbeat from a data node causes a block to have fewer than the desired number of replicas, the name node can have a live data node initiate replicating the block on another data node.
Taken together, the components of the HDFS provide a powerful, yet highly specialised distributed file system that works well for the specialised processing requirements of Big Data applications. Next, we will consider how MapReduce provides data processing to complement data storage of HDFS.

16.2.2 MapReduce
MapReduce is the computing framework used to process large data sets across clusters. Conceptually, MapReduce is easy to understand and follows the principle of divide and conquer. MapReduce takes a complex task, breaks it down into a collection of smaller subtasks, performs the subtasks all at the same time, and then combines the result of each subtask to produce a final result for the original task. As the name implies, it is a combination of a map function and a reduce function. A map function takes a collection of data and sorts and filters the data into a set of key-value pairs. The map function is performed by a program called a mapper. A reduce function takes a collection of key-value pairs, all with the same key value, and summarises them into a single result. The reduce function is performed by a program called a reducer. Recall that Hadoop is a Java-based platform; therefore, map functions and reduce functions are written as detailed, procedure-oriented Java programs.
Figure 16.4 provides a simple, conceptual illustration of MapReduce that determines the total number of units of each product that has been sold. The original data in Figure 16.4 is stored as key-value pairs, with the invoice number as the key and the remainder of the invoice data as a value. Remember, the data in the Hadoop data storage do not constitute a relational database, so the data is not separated into tables and there is no form of normalisation that ensures that each fact is stored only once. Therefore, there is a great deal of duplication of data in the original data store. Note that, even in the very small subset of data that is shown in Figure 16.4, redundant data is kept for customer 10011, Leona Dunne. In the figure, the map function parses each invoice to find data about the products sold on that invoice. The result of the map function is a new list of key-value pairs in which the product code is the key and the line units are the value. The reduce function then takes that list of key-value pairs and combines them by summing the values associated with each key (product code) to produce the summary result.
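The map and reduce steps just described can be sketched in a few lines. The example below is written in Python for brevity and runs in a single process; it is not Hadoop code (Hadoop mappers and reducers are Java programs distributed across data nodes). The invoice numbers, product codes and unit counts are invented for illustration, and the function names are hypothetical.

```python
from collections import defaultdict

# Key = invoice number, value = the rest of the invoice (hypothetical sample data).
# Customer data repeats on every invoice because nothing is normalised.
invoices = {
    1001: {"customer": "10011 Leona Dunne", "lines": [("P1", 1), ("P2", 2)]},
    1002: {"customer": "10011 Leona Dunne", "lines": [("P1", 3)]},
    1003: {"customer": "other customer", "lines": [("P2", 1), ("P3", 5)]},
}

def mapper(invoice_number, invoice):
    """Map: parse one invoice into (product code, line units) pairs."""
    for product_code, units in invoice["lines"]:
        yield product_code, units

def group_by_key(pairs):
    """Collect all values that share the same key, ready for the reducers."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped

def reducer(product_code, unit_values):
    """Reduce: summarise all pairs with the same key into a single result."""
    return product_code, sum(unit_values)

def map_reduce(data):
    mapped = [pair for key, value in data.items() for pair in mapper(key, value)]
    return dict(reducer(key, values) for key, values in group_by_key(mapped).items())

totals = map_reduce(invoices)
print(totals)
```

In a real cluster the mappers would run in parallel on the nodes that hold the invoice blocks, and only the small key-value pairs would travel across the network to the reducers.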
FIGURE 16.4 MapReduce

As previously stated, the data sets used in Big Data applications are extremely large. Transferring entire files from multiple nodes to a central node for processing would require a tremendous amount of network bandwidth, and place an incredible processing burden on the central node. Therefore, instead of the computational program retrieving the data for processing in a central location, copies of the program are pushed to the nodes containing the data that must be processed. Each copy of the program produces results that are then aggregated across nodes and sent back to the client. This mirrors the distribution of data in the HDFS. Typically, the Hadoop framework distributes a mapper for each block on each data node that must be processed. This can lead to a very large number of mappers. For example, if 1 TB of data is to be processed and the HDFS is using 64 MB blocks, that yields over 15 000 mapper programs. The number of reducers is configurable by the user, but best practices suggest about one reducer per data node.

NOTE
Best practices suggest that the number of mappers on a given node should be kept to 100 or less. However, there are cases of applications with simple map functions running as many as 300 mappers on a given node with satisfactory performance. Clearly, much depends on the computing resources available at each node.
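The "over 15 000 mappers" figure follows directly from dividing the data volume by the block size. The short Python sketch below checks that arithmetic under both decimal and binary interpretations of the units. The cluster size of 200 data nodes and the assumption that mappers are spread evenly across nodes are hypothetical, introduced only to relate the result to the guideline in the note above.

```python
def mappers_needed(data_bytes: int, block_bytes: int) -> int:
    """One mapper per block to be processed (rounded up to whole blocks)."""
    return -(-data_bytes // block_bytes)

def mappers_per_node(total_mappers: int, data_nodes: int) -> int:
    """Mappers per node if the work is spread evenly (a simplifying assumption)."""
    return -(-total_mappers // data_nodes)

decimal_count = mappers_needed(10**12, 64 * 10**6)   # 1 TB and 64 MB as powers of ten
binary_count = mappers_needed(2**40, 64 * 2**20)     # 1 TB and 64 MB as powers of two

print(decimal_count, binary_count)
print(mappers_per_node(binary_count, 200))           # hypothetical 200-node cluster
```

Either way the count is above 15 000, and on the assumed 200-node cluster the even split stays under the suggested limit of 100 mappers per node.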
The implementation of MapReduce complements the structure of the HDFS, which is an important reason why they work so well together. Just as the HDFS structure is composed of a name node and several data nodes, MapReduce uses a job tracker (the actual name of the program is JobTracker) and several task trackers (the programs are named TaskTrackers). The job tracker acts as a central control for MapReduce processing, and it normally exists on the same server that is acting as the name node. Task tracker programs reside on the data nodes. One important feature of the MapReduce framework is that the user must write the Java code for the map and reduce functions, and must specify the input and output files to be read and written for the job that is being submitted. However, the job tracker will take care of locating the data, determining which nodes to use, dividing the job into tasks for the nodes, and managing failures of the nodes. All of this is done automatically without user intervention. When a user submits a MapReduce job for processing, the general process is as follows:
1. A client node (client application) submits a MapReduce job to the job tracker.
2. The job tracker communicates with the name node to determine which data nodes contain the blocks that should be processed for this job.
3. The job tracker determines which task trackers are available for work. Each task tracker can handle a set number of tasks. Remember, many MapReduce jobs from different users can be running on the Hadoop system simultaneously, so a data node may contain data that is being processed by multiple mappers from different jobs all at the same time. Therefore, the task tracker on that node might be busy running mappers for other jobs when this new request arrives. Because the data is replicated on multiple nodes, the job tracker may be able to select from multiple nodes for the same data.
4. The job tracker then contacts the task trackers on each of those nodes to begin mappers and reducers to complete that node's portion of the task.
5. The task tracker creates a new Java virtual machine (JVM) to run the map and reduce functions. This way, if a function fails or crashes, the entire task tracker is not halted.
6. The task tracker sends heartbeat messages to the job tracker to let the job tracker know that the task tracker is still working on the job (and about the availability of the task tracker to take more jobs).
7. The job tracker monitors the heartbeat messages to determine whether a task manager has failed. If so, the job tracker can reassign that portion of the task to another node.
8. When the entire job is finished, the job tracker changes status to indicate that the job is completed.
9. The client node periodically queries the job tracker until the job status is completed.
The Hadoop system uses batch processing. Batch processing is when a program runs from beginning to end, either completing the task or halting with an error, without any interaction with the user. Batch processing is often used when the computing job requires an extended period of time or a large portion of the systems' processing capacity. Businesses often use batch processing to run year-end financial reports in the evenings when systems may be idle, and universities might use batch processing for student fee payment processing. Batch processing is not bad, but it has limitations. As a result, a number of complementary programs have been developed to improve the integration of Hadoop within the larger IT infrastructure. The next section discusses some of these programs.
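Steps 2, 3 and 7 of the job-tracking process listed earlier in this section amount to a small scheduling problem: for each block, pick a task tracker that holds a replica and has a free task slot, and pick again if that node stops sending heartbeats. The Python sketch below illustrates the idea only. The function names, the "most free slots first" rule and the sample cluster are all assumptions; the text does not describe the selection policy that Hadoop actually uses.

```python
def pick_node(nodes, free_slots):
    """Choose the replica holder with the most free task slots (assumed policy)."""
    candidates = [n for n in nodes if free_slots.get(n, 0) > 0]
    if not candidates:
        raise RuntimeError("no task tracker with a free slot holds this block")
    return max(candidates, key=lambda n: free_slots[n])

def assign_tasks(block_locations, slots):
    """Assign one mapper per block to a node that stores a replica of it."""
    free = dict(slots)
    assignment = {}
    for block, nodes in block_locations.items():
        node = pick_node(nodes, free)
        assignment[block] = node
        free[node] -= 1
    return assignment

def reassign_failed(assignment, block_locations, slots, failed_node):
    """Move the tasks of a node with no heartbeat to other replica holders."""
    free = {n: s for n, s in slots.items() if n != failed_node}
    for node in assignment.values():
        if node != failed_node:
            free[node] -= 1          # slots already used by surviving tasks
    recovered = dict(assignment)
    for block, node in assignment.items():
        if node == failed_node:
            recovered[block] = pick_node(block_locations[block], free)
            free[recovered[block]] -= 1
    return recovered

# Hypothetical cluster: block -> data nodes holding a replica, node -> free slots.
block_locations = {1: ["dn1", "dn2", "dn3"], 2: ["dn2", "dn3", "dn4"], 3: ["dn1", "dn3", "dn4"]}
slots = {"dn1": 1, "dn2": 2, "dn3": 1, "dn4": 2}

assignment = assign_tasks(block_locations, slots)
recovered = reassign_failed(assignment, block_locations, slots, failed_node="dn4")
print(assignment, recovered)
```

Because every block is stored on three nodes, the loss of one task tracker leaves at least two other candidates for each of its tasks, which is why replication helps processing as well as storage.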
16.2.3 Hadoop Ecosystem
Hadoop is widely used by organisations tapping into the potential of analysing extremely large data sets. Unfortunately, because Hadoop is a very low-level tool requiring considerable effort to create, manage and use, it presents quite a few obstacles. As a result, a host of related applications have grown up around Hadoop to attempt to make it easier to use and more accessible to users who are not skilled at complex Java programming. Figure 16.5 shows examples of some of these types of applications. Most organisations that use Hadoop also use a set of other related products that interact and complement each other to produce an entire ecosystem of applications and tools. Like any ecosystem, the interconnected pieces are constantly evolving and their relationships are changing, so it is a rather fluid situation. The following are some of the more popular components in a Hadoop ecosystem and how they relate to each other.
FIGURE 16.5 A sample of the Hadoop ecosystem
[Figure: core Hadoop components (MapReduce and the Hadoop Distributed File System (HDFS)) surrounded by MapReduce simplification applications (Hive, Pig), data ingestion applications (Flume, Sqoop) and direct query applications (HBase, Impala).]

MapReduce Simplification Applications
Creating MapReduce jobs requires significant programming skills. As the mapper and reducer programs become more complex, the skill requirements increase and the time to produce the programs becomes significant. These skills are beyond the capabilities of most data users. Therefore, applications to simplify the process of creating MapReduce jobs have been developed. Two of the most popular are Hive and Pig.
Hive is a data warehousing system that sits on top of HDFS. It is not a relational database, but it supports its own SQL-like language, called HiveQL, that mimics SQL commands to run ad hoc queries. HiveQL commands are processed by the Hive query engine into sets of MapReduce jobs. As a result, the underlying processing tends to be batch-oriented, producing jobs that are very scalable over extremely large sets of data. However, the batch nature of the jobs makes Hive a poor choice for jobs that only require a small subset of data to be returned very quickly.
Pig is a tool for compiling a high-level scripting language, named Pig Latin, into MapReduce jobs for executing in Hadoop. In concept it is similar to Hive in that it provides a means of producing MapReduce jobs without the burden of low-level Java programming. The primary difference is that Pig Latin is a scripting language, which means it is procedural, while HiveQL, like SQL, is declarative. Declarative languages allow the user to specify what they want, not how to get it. This is very useful for query processing. Procedural languages require the user to specify how the data is to be manipulated. This is very useful for performing data transformations. As a result, Pig is often used for producing data pipeline tasks that transform data in a series of steps. This is often seen in ETL (Extraction, Transformation and Loading) processes as described in Chapter 15.
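The difference between declarative and procedural styles is easier to see side by side. The sketch below is not HiveQL or Pig Latin; it uses Python with an in-memory SQLite database as a stand-in, so the table name, column names and sample rows are all assumptions. The first version states what result is wanted in SQL; the second spells out, step by step, how to compute it.

```python
import sqlite3

# Hypothetical invoice lines: (product code, units sold).
rows = [("P1", 1), ("P2", 2), ("P1", 3), ("P3", 5), ("P2", 1)]

# Declarative style: describe the result and let the engine decide how to get it.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE invoice_line (product_code TEXT, units INTEGER)")
conn.executemany("INSERT INTO invoice_line VALUES (?, ?)", rows)
declarative = dict(
    conn.execute(
        "SELECT product_code, SUM(units) FROM invoice_line GROUP BY product_code"
    )
)
conn.close()

# Procedural style: spell out each step of the transformation.
def total_units(lines):
    totals = {}
    for product_code, units in lines:           # step 1: walk through every line
        current = totals.get(product_code, 0)   # step 2: look up the running total
        totals[product_code] = current + units  # step 3: update it
    return totals

procedural = total_units(rows)
print(declarative, procedural)
```

Both versions produce the same totals. The declarative form is shorter for a query like this one, while the procedural form makes it easy to insert extra transformation steps, which is why the procedural style suits data pipelines.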
Data Ingestion Applications
One challenge faced by organisations that are taking advantage of Hadoop's massive data storage and data processing capabilities is the issue of actually getting data from their existing systems into the Hadoop cluster. To simplify this task, applications have been developed to ingest or gather this data into Hadoop.
Flume is a component for ingesting data into Hadoop. It is designed primarily for harvesting large sets of data from server log files, like clickstream data from Web server logs. It can be configured to import the data on a regular schedule or based on specified events. In addition to simply bringing the data into Hadoop, Flume contains a simple query-processing component, so the possibility exists of performing some transformations on the data as it is being harvested. Typically, Flume would move the data into the HDFS, but it can also be configured to input the data directly into another component of the Hadoop ecosystem named HBase.
Sqoop is a more recent addition to the Hadoop ecosystem. It is a tool for converting data back and forth between a relational database and the HDFS. The name Sqoop (pronounced "scoop", as in a scoop of ice cream) is an amalgam of SQL-to-Hadoop. In concept, Sqoop is similar to Flume in that it provides a way of bringing data into the HDFS. However, while Flume works primarily with log files, Sqoop works with relational databases such as Oracle, MySQL and SQL Server. Further, while Flume operates in one direction only, Sqoop can transfer data in both directions, into and out of HDFS. When transferring data from a relational database into HDFS, the data is imported one table at a time, with the process reading the table row by row. This is done in a highly parallelised manner using MapReduce, so the contents of the table will usually be distributed into several files with the rows stored in a delimited format. Once the data has been imported into HDFS, it can be processed by MapReduce jobs or using Hive. The resulting data can then be exported from HDFS back to the relational database, most often a traditional data warehouse.
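The import just described, one table read row by row and written out as several delimited files, can be mimicked on a small scale. The following Python sketch is an illustration only and does not use Sqoop. It reads a table from an in-memory SQLite database and spreads the rows across a chosen number of comma-delimited "files" held as strings. The table, its columns, the sample rows (apart from customer 10011, Leona Dunne) and the simple round-robin split are assumptions, and the work here is done serially rather than in parallel.

```python
import csv
import io
import sqlite3

def import_table(conn, table, num_parts):
    """Read a table row by row and spread the rows over delimited text parts.

    The table name is interpolated directly, so pass only trusted names.
    """
    buffers = [io.StringIO() for _ in range(num_parts)]
    writers = [csv.writer(buf, lineterminator="\n") for buf in buffers]
    cursor = conn.execute(f"SELECT * FROM {table} ORDER BY rowid")
    for index, row in enumerate(cursor):
        writers[index % num_parts].writerow(row)   # round-robin split (assumption)
    return [buf.getvalue() for buf in buffers]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customer (cus_code INTEGER, cus_lname TEXT, cus_fname TEXT)")
conn.executemany(
    "INSERT INTO customer VALUES (?, ?, ?)",
    [(10011, "Dunne", "Leona"), (90001, "Sample", "A"), (90002, "Sample", "B")],
)
parts = import_table(conn, "customer", num_parts=2)
conn.close()
print(parts)
```

Each part is a plain delimited text file, so once rows are in this form they carry no schema of their own; any program that later processes them must know what each column means.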
Direct Query Applications
Direct query applications attempt to provide faster query access than is possible through MapReduce. These applications interact with HDFS directly, instead of going through the MapReduce processing layer.
HBase is a column-oriented NoSQL database designed to sit on top of the HDFS. One of HBase's primary characteristics is that it is highly distributed and designed to scale out easily. It does not support SQL or SQL-like languages, relying instead on lower-level languages such as Java for interaction. Column-oriented databases will be discussed in more detail in the next section. The system does not rely on MapReduce jobs, so it avoids the delays caused by batch processing, making it more suitable for fast processing involving smaller subsets of the data. HBase is very good at quickly processing sparse data sets. HBase is one of the more popular components of the Hadoop ecosystem and is used by Facebook for its messaging system.
Impala was the first SQL on Hadoop application. It was produced by Cloudera as a query engine that supports SQL queries that pull data directly from HDFS. Prior to Impala, if an organisation needed to make data from Hadoop available to analysts through an SQL interface, data would be extracted from HDFS and imported into a relational database. With Impala, analysts can write SQL queries directly against the data while they are still in HDFS. Impala makes heavy use of in-memory caching on data nodes. It is generally considered an appropriate tool for processing large amounts of data into a relatively small result set.
NOTE
Other than Impala, each of the components of the Hadoop ecosystem described in this section are all open-source, top-level projects of the Apache Software Foundation. More information on each of these projects and many others is available at www.apache.org.

16.3 NoSQL DATABASES
NoSQL is the unfortunate name given to a broad array of non-relational database technologies that have developed to address the challenges represented by Big Data. (You may recall that we first introduced NoSQL in Chapter 2, Data Models.) The name is unfortunate in that it does not describe what the NoSQL technologies are, but rather what they are not. In fact, the name also does a poor job of explaining what the technologies are not! The name was chosen as a Twitter hashtag to simplify coordinating a meeting of developers to discuss ideas about the non-relational database technologies that were being developed by organisations like Google, Amazon and Facebook to deal with the problems they were encountering as their data sets reached enormous sizes. The term NoSQL was never meant to imply that products in this category should never include support for SQL. In fact, many such products support query languages that mimic SQL in important ways. Although no one has yet produced a NoSQL system that implements standard SQL, given the large base of SQL users, the appeal of creating such a product is obvious. More recently, some industry observers have tried to interject that NoSQL could stand for "Not Only SQL". In fact, if the requirement to be considered a NoSQL product is simply that languages beyond SQL are supported, then all of the traditional RDBMS products such as Oracle, SQL Server, MySQL and Microsoft Access would qualify. Regardless, you are better off focusing on understanding the array of technologies to which the term refers than worrying about the name itself.
There are literally hundreds of products that can be considered as being under the broadly defined term NoSQL. Most of these fit roughly into one of four categories: key-value data stores, document databases, column-oriented databases, and graph databases. Table 16.2 shows some popular NoSQL databases of each type. Although not all NoSQL databases have been produced as open-source software, most have been. As a result, NoSQL databases are generally perceived as a part of the open-source movement. Accordingly, they also tend to be associated with the Linux operating system. It makes sense from a cost standpoint that, if an organisation is going to create a cluster containing tens of thousands of nodes, the organisation does not want to purchase licences for Windows or MacOS for all of those nodes. The preference is to use a platform, such as Linux, that is freely available and highly customisable. Therefore, most of the NoSQL products run only in a Linux or Unix environment. The following sections discuss each of the major NoSQL approaches.

TABLE 16.2 Example NoSQL databases
NoSQL Category               Example Databases   Developer
Key-value database           Dynamo              Amazon
                             Riak                Basho
                             Redis               Redis Labs
                             Voldemort           LinkedIn
Document databases           MongoDB             MongoDB, Inc.
                             CouchDB             Apache
                             OrientDB            OrientDB Ltd
                             RavenDB             Hibernating Rhinos
Column-oriented databases    HBase               Apache
                             Cassandra           Apache (originally Facebook)
                             Hypertable          Hypertable, Inc.
Graph databases              Neo4J               Neo4j
                             ArangoDB            ArangoDB, LLC
                             GraphBase           FactNexus

16.3.1 Key-Value Databases
Key-value (KV) databases are conceptually the simplest of the NoSQL data models. A KV database is a NoSQL database that stores data as a collection of key-value pairs. The key acts as an identifier for the value. The value can be anything such as text, an XML document, or an image. The database does not attempt to understand the contents of the value component or its meaning; the database simply stores
whatever the
value is
meaning
provided
of the
for the
data in the
be tracked
among
databases
extremely
keys
database
equivalent but they
the
key. In
of the
and
the
words,
is
to
1 key
not
does
used to remove
exist,
KV model does not allow
of the
value
such
pair for
parse the
(One
important
in the
figure,
the
key-value
pairs
FIGURE 16.6
about
tabular
format
are
stored
16.3.2
Document
Document
data,
be in
2020 has
any
of the
LName
Learning. any
name
issue
that,
to
command
to
name, first
in
pair. If the new
have
name
DBMS
bucket
does
KV DBMS
and pairs
distinguish
other
query
not
even
content
return
the
know
how
characteristics.
appear
the
to
Since
to
the
application
1
Delete is
pairs.
understand the
Get
a key. If
value.
key-value
KV not
key-value
visually
are used.
it is not possible
the
does
plus
component
a value
with the
would be up to the
help
value
place
with three
it
within
bucket
key.
key-value
In fact,
although
to
used
component,
because
a get
is
bucket
value
on the
and delete operations
is replaced
example.
must be unique
in the
and
of as the KV
in tabular
form
components.
Actual
structure.)
Ramas Phone
LName
in key-value
Dunne
body
All suppressed
Rights
format,
of the
does
Initial
Leona
FName
A Areacode
0 Initial
Balance
K Areacode
0
Myron
Balance
Areacode
0161
0
May not
Unlike
such
as
difference
tags
not
be
copied, affect
XML,
while
Despite
the
overall
or
duplicated, learning
the
in experience.
value
component
value
component.
Object
whole
do.
Tags
of the
in
Cengage
part.
Due Learning
to
electronic reserves
document,
rights, right
some to
portions
document
third remove
party additional
may
Binary
may content
be
of
any
the
a document. the
be additional
title, tags
databases
suppressed at
JSON
understand
document
content
any
document
represents
there
documents,
the
or
to
be
data in
contain
The
(JSON),
named
the
stores can
do not attempt are
text
use of tags in
or in
Notation
can almost
that
in the
which
body
and they database
where the
KV databases
databases to identify
databases, is a NoSQL
a document
JavaScript
Within the
scanned,
key-value database
a KV database
document
document.
to
stores
is that,
may have
materially
similar A document
always
and sections.
Reserved. content
FName
Orlando
pairs.
component,
chapters
Alfred Balance
894-1238
222-1672
database
encoded
value
FName
844-2573
Phone
LName
a document
and
that
last
a convenience
of KV databases.
a document
example,
Cengage deemed
KV
Value
Another important
to indicate
review
for
be thought
based
on anything
as a new
a customer
name,
are
bucket
Store
component
on data in the
Be aware
are conceptually
a subtype
documents
(BSON).
Copyright
making
Databases
databases
considered
Editorial
perform,
= Customer
Phone
author
must
cannot
Key-value database storage
10014
For
value
a table-like
0181
content
added
last
in
10011
16
it is
to find the customers
0161
can
pair.
but it
16.6:
understand
relationships
Key values
operations
only get, store of the
then
could
is just
10010
of
the
and key 10011,
Figure
Key
type
the
as a customer
customer
Bucket
tagged
based
last
An application
not
data
by specifying
existing
data to
in fact,
DBMS
of keys.
query
simple
based
the
All data
exist,
customer
component
note
the
a thing
bucket
value
not
use the
keys;
A bucket can roughly
grouping
component
queries
on
component.
key-value to
is
based
work that
buckets.
pair. Figure 16.6 shows
a key-value there
are rather value
then
for
that
to
that
no foreign
processing.
performed
does
the
know
are
the
a key-value
pair
across
possible
combination
key combination
the
basic
A bucket is alogical
All queries
retrieve
are
organised into buckets.
on KV databases
used
bucket
it is
of the applications
There
simplifies
for
be duplicated
pair.
Operations or fetch
greatly
scalable
of a table. can
other
key-value
component.
at all. This
fast
Key-value pairs are typically a bucket,
key. It is the job
value
time
from if
the
subsequent
eBook rights
are
and/or restrictions
eChapter(s). require
it
CHAPTER
considered For
schema-less,
a document
documents in
are required
a document
capabilities to the
that
as
group
to
which
the
retrieve some
DBMS
all of the
key-value
into logical
groups
and
the same is
aware
such
includes
FIGURE 16.7
Balance
tutorial
basis
Tags inside
have
own for
the
Data and
data that is
documents
have its
are the
groups
843
not
all
The tags
of the
document
NoSQL
stored.
tags,
structure.
most
called
query
the
tag
buckets,
additional
are accessible
document
While a document based
on the
Figure 16.6, but in a tagged
as summing
a hands-on
to
within
on the
Big
possible. logical
possible
tags
document
all can
they
called collections.
data from
structure
although
document
because
querying
also
of the
MongoDB
so each
pairs into
where the
functions
in the
MongoDB,
key, it is
documents
aggregate
operations
tags,
that,
have over KV databases.
group
collection
a predefined
means
important
makes sophisticated
Figure 16.7 represents Because
same
extremely
databases
KV databases
the
do not impose
schema-less
have the
are
document
documents
specifying
being
database
DBMS,
Just
that is, they
database,
16
documents,
has the
value
or averaging
format
later in this
(available
on online
in
chapter,
of tags.
to
write
databases
queries.
and
database. queries
even
You learn
Appendix
by
For example,
for a document
possible
0. Document
balances
database
contents
it is
databases
may be retrieved
that
support
some
Q,
basic
Working
with
platform).
FIGURE 16.7 Document database tagged format
Collection = Customer
Key     Document
10010   {LName: Ramas, FName: Alfred, Initial: A, Areacode: 0161, Phone: 844-2573, Balance: 0}
10011   {LName: Dunne, FName: Leona, Initial: K, Areacode: 0181, Phone: 894-1238, Balance: 0}
10014   {LName: Orlando, FName: Myron, Areacode: 0161, Phone: 222-1672, Balance: 0}
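The tagged documents of Figure 16.7 can be mimicked with ordinary Python dictionaries. The sketch below is only an illustration of querying on tags and aggregating a tagged value; it is not how any particular document DBMS is implemented. The names customer_collection, find_by_tag and average_balance are invented for this example, and the area codes are held as strings so that the leading zero survives.

```python
# Toy stand-in for the Customer collection of Figure 16.7: key -> document.
customer_collection = {
    10010: {"LName": "Ramas", "FName": "Alfred", "Initial": "A",
            "Areacode": "0161", "Phone": "844-2573", "Balance": 0},
    10011: {"LName": "Dunne", "FName": "Leona", "Initial": "K",
            "Areacode": "0181", "Phone": "894-1238", "Balance": 0},
    # This document has no Initial tag; documents need not share a structure.
    10014: {"LName": "Orlando", "FName": "Myron",
            "Areacode": "0161", "Phone": "222-1672", "Balance": 0},
}


def find_by_tag(collection, tag, value):
    """Return the keys of the documents whose tag holds the given value."""
    return [key for key, doc in collection.items() if doc.get(tag) == value]


def average_balance(collection):
    """Average the Balance tag over the documents that carry it."""
    balances = [doc["Balance"] for doc in collection.values() if "Balance" in doc]
    return sum(balances) / len(balances) if balances else None


print(find_by_tag(customer_collection, "Areacode", "0161"))
print(find_by_tag(customer_collection, "Initial", "K"))
print(average_balance(customer_collection))
```

Because the lookup uses doc.get(tag), a document that lacks the tag is simply skipped rather than raising an error, which is the practical consequence of being schema-less.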
Document databases tend to operate on an implied assumption that a document is relatively self-contained, not a fragment of the data about a given topic. Relational databases decompose complex data in the business environment into a set of related tables. For example, data about orders may be decomposed into customer, invoice, line, and product tables. A document database would expect all of the data related to an order to be in a single order document. Therefore, each order document in an Orders collection would contain data on the customer, the order itself, and the products purchased in that order, all as a single self-contained document. Document databases do not store relationships as perceived in the relational model and generally have no support for join operations.
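The contrast between a self-contained order document and its relational decomposition can be sketched as follows. All values here (the invoice number 5001, the products and the prices in pence) are hypothetical, and the function names are invented for the illustration.

```python
# Hypothetical order document: everything about the order travels together.
order_document = {
    "_id": 5001,
    "customer": {"cus_code": 10011, "lname": "Dunne", "fname": "Leona"},
    "lines": [
        {"product": "claw hammer", "units": 2, "price_pence": 995},
        {"product": "saw blade", "units": 1, "price_pence": 1440},
    ],
}

# The same facts decomposed the relational way, linked by key values.
customer_rows = {10011: {"lname": "Dunne", "fname": "Leona"}}
invoice_rows = {5001: {"cus_code": 10011}}
line_rows = [
    {"inv": 5001, "product": "claw hammer", "units": 2, "price_pence": 995},
    {"inv": 5001, "product": "saw blade", "units": 1, "price_pence": 1440},
]


def total_from_document(order):
    """One lookup: the document already holds its own lines."""
    return sum(line["units"] * line["price_pence"] for line in order["lines"])


def total_from_tables(invoice_number):
    """Join-like step: collect the line rows that point at the invoice."""
    return sum(row["units"] * row["price_pence"]
               for row in line_rows if row["inv"] == invoice_number)


print(total_from_document(order_document))
print(total_from_tables(5001))
```

Both functions return the same total, but the document version never has to match rows across separate structures.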
16.3.3 Column-Oriented Databases
The term column-oriented database can refer to two different sets of technologies that are often confused with each other. In one sense, column-oriented database or columnar database can refer to traditional, relational database technologies that use column-centric storage instead of row-centric storage. Relational databases present data in logical tables; however, the data is actually stored in data blocks containing rows of data. All of the data for a given row is stored together in sequence, with many rows in the same data block. If a table has many rows of data, the rows will be spread across
many data blocks. Figure 16.8 illustrates a relational table with 10 rows of data that is physically stored across five data blocks. Row-centric storage minimises the number of disk reads necessary to retrieve a row of data. Retrieving one row of data requires accessing just one data block, as shown in Figure 16.8.
FIGURE 16.8 Comparison of row-centric and column-centric storage
CUSTOMER relational table
Cus_Code  Cus_LName  Cus_FName  Cus_City    Cus_Country
10010     Ramas      Alfred     Manchester  UK
10011     Dunne      Leona      Durban      SA
10012     Smith      Kathy      Paris       FR
10013     Olowski    Paul       Manchester  UK
10014     Orlando    Myron      NULL        NULL
10015     O'Brian    Amy        Durban      SA
10016     Brown      James      NULL        NULL
10017     Williams   George     Utrecht     NL
10018     Farriss    Anne       Cape Town   SA
10019     Smith      Olette     Manchester  UK
Row-centric storage
Block 1   10010,Ramas,Alfred,Manchester,UK
          10011,Dunne,Leona,Durban,SA
Block 2   10012,Smith,Kathy,Paris,FR
          10013,Olowski,Paul,Manchester,UK
Block 3   10014,Orlando,Myron,NULL,NULL
          10015,O'Brian,Amy,Durban,SA
Block 4   10016,Brown,James,NULL,NULL
          10017,Williams,George,Utrecht,NL
Block 5   10018,Farriss,Anne,Cape Town,SA
          10019,Smith,Olette,Manchester,UK
Column-centric storage
Block 1   10010,10011,10012,10013,10014
          10015,10016,10017,10018,10019
Block 2   Ramas,Dunne,Smith,Olowski,Orlando
          O'Brian,Brown,Williams,Farriss,Smith
Block 3   Alfred,Leona,Kathy,Paul,Myron
          Amy,James,George,Anne,Olette
Block 4   Manchester,Durban,Paris,Manchester,NULL
          Durban,NULL,Utrecht,Cape Town,Manchester
Block 5   UK,SA,FR,UK,NULL
          SA,NULL,NL,SA,UK
Remember, in transactional systems, normalisation is used to decompose complex data into related tables to reduce redundancy and to improve the speed of rapid manipulation of small sets of data. These manipulations tend to be row-oriented, so row-oriented storage works very well. However, in queries that retrieve a small set of columns across a large set of rows, a large number of disk accesses are required. For example, a query that wants to retrieve only the city and country of every customer will have to access every data block that contains a customer row to retrieve that data. In Figure 16.8, that would mean accessing five data blocks to get the city and country of every customer.
A column-oriented or columnar database stores the data in blocks by column instead of by row. A single customer's data will be spread across several blocks, but all of the data from a single column will be in just a few blocks. In Figure 16.8, all of the city data for customers will be stored together, just as all of the country data will be stored together. In that case, retrieving the city and country for every customer might require accessing only two data blocks. This type of column-centric storage works very well for databases that are primarily used to run queries over few columns but many rows, as is done in many reporting systems and data warehouses. Though Figure 16.8 shows only a few rows and columns, it is easy to imagine that the gains would be significant if the table size grew to millions or billions of rows across hundreds of thousands of data blocks. At the same time, column-centric storage would be very inefficient for processing transactions since insert, update, and delete activities would be very disk intensive. It is worth noting that column-centric storage can be achieved within relational database technology, meaning that it still requires structured data and has the advantage of supporting SQL for queries.
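The block counts discussed for Figure 16.8 can be reproduced with a small model. This is a simplified sketch under stated assumptions: two rows per block for row-centric storage and one block per column for column-centric storage, as drawn in the figure. The function names are invented here, and None stands for NULL.

```python
# CUSTOMER rows from Figure 16.8; None stands for NULL.
CUSTOMER = [
    (10010, "Ramas", "Alfred", "Manchester", "UK"),
    (10011, "Dunne", "Leona", "Durban", "SA"),
    (10012, "Smith", "Kathy", "Paris", "FR"),
    (10013, "Olowski", "Paul", "Manchester", "UK"),
    (10014, "Orlando", "Myron", None, None),
    (10015, "O'Brian", "Amy", "Durban", "SA"),
    (10016, "Brown", "James", None, None),
    (10017, "Williams", "George", "Utrecht", "NL"),
    (10018, "Farriss", "Anne", "Cape Town", "SA"),
    (10019, "Smith", "Olette", "Manchester", "UK"),
]
COLUMNS = ("Cus_Code", "Cus_LName", "Cus_FName", "Cus_City", "Cus_Country")


def row_centric_blocks(rows, rows_per_block=2):
    """Pack whole rows into blocks, two rows per block as in the figure."""
    return [rows[i:i + rows_per_block] for i in range(0, len(rows), rows_per_block)]


def column_centric_blocks(rows):
    """Pack each column's values into its own block."""
    return {name: [row[i] for row in rows] for i, name in enumerate(COLUMNS)}


def blocks_read_row_centric(blocks, wanted):
    """Every block holds part of every column, so all blocks are read."""
    return len(blocks) if wanted else 0


def blocks_read_column_centric(blocks, wanted):
    """Only the blocks of the requested columns are read."""
    return sum(1 for name in wanted if name in blocks)


wanted = ("Cus_City", "Cus_Country")
print(blocks_read_row_centric(row_centric_blocks(CUSTOMER), wanted))
print(blocks_read_column_centric(column_centric_blocks(CUSTOMER), wanted))
```

With the figure's ten rows, the row-centric layout touches five blocks for the two requested columns, while the column-centric layout touches two.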
The other use of the term column-oriented database, also called column family database, is to describe a type of NoSQL database that takes the concept of column-centric storage beyond the confines of the relational model. As NoSQL databases, these products do not require the data to conform to predefined structures, nor do they support SQL for queries. This database model originated with Google's Bigtable product. Other column-oriented database products include HBase, Hypertable, and Cassandra. Cassandra began as a project at Facebook, but Facebook released it to the open-source community, which has continued to develop Cassandra into one of the most popular column-oriented databases.
A column family database is a NoSQL database that organises data in key-value pairs with keys mapped to a set of columns in the value component. While column family databases use many of the same terms as relational databases, the terms don't mean quite the same things. Fortunately, the column family databases are conceptually simple, and are conceptually close enough to the relational model that your understanding of the relational model can help you understand the column family model.
A column is a key-value pair that is similar to a cell of data in a relational database. The key is the name of the column, and the value component is the data that is stored in that column. Therefore, "cus_lname: Ramas" is a column; cus_lname is the name of the column, and Ramas is the data value in the column. Similarly, "cus_city: Cape Town" is another column, with cus_city as the column name and Cape Town as the data value.
NOTE Even though column family databases do not (yet) support standard SQL, Cassandra developers created a Cassandra query language (CQL). It is similar to SQL in many respects and is one of the compelling reasons for adopting Cassandra.
As more columns are added, it becomes clear that some columns form natural groups, such as cus_fname, cus_lname and cus_initial, which would logically group together to form a customer's name. Similarly, cus_street, cus_city, cus_province and cus_postcode would logically group together to form a customer's address. These groupings are used to create super columns. A super column is a group of columns that are logically related. Recall the discussion in Chapter 4 about simple and composite attributes in the entity relationship model. In many cases, super columns can be thought of as the composite attribute and the columns that compose the super column as the simple attributes. Just as not all simple attributes have to belong to a composite attribute, not all columns have to belong to a super column. Although this analogy is helpful in many contexts, it is not perfect. It is possible to group columns into a super column that logically belong together for application processing reasons but do not conform to the relational idea of a composite attribute.
Row keys are created to identify objects in the environment. All of the columns or super columns that describe these objects are grouped together to create a column family; therefore, a column family is conceptually similar to a table in the relational model. Although a column family is similar in concept to a relational table, Figure 16.9 shows that it is structurally very different. Notice in Figure 16.9 that each row key in the column family can have different columns.
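The idea of a super column as a named group of simple columns can be sketched with nested dictionaries. This is an analogy only, not the storage format of any column family product. The column names come from the text; the street, province and postcode values are made up for the illustration, as is the helper simple_columns.

```python
# Hypothetical row: two super columns, each grouping simple columns.
# The street, province and postcode values are invented.
customer_row = {
    "name": {"cus_fname": "Alfred", "cus_initial": "A", "cus_lname": "Ramas"},
    "address": {"cus_street": "12 Example Road", "cus_city": "Manchester",
                "cus_province": "Greater Manchester", "cus_postcode": "M1 1AA"},
}


def simple_columns(row):
    """Flatten the super columns back into their simple name: value columns."""
    flat = {}
    for super_column in row.values():
        flat.update(super_column)
    return flat


print(sorted(customer_row))
print(simple_columns(customer_row)["cus_city"])
```

The outer keys play the role of the composite attribute and the inner keys the role of the simple attributes, which is the analogy the text draws with the entity relationship model.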
NOTE A column family can be composed of columns or super columns, but it cannot contain both.
FIGURE 16.9 Column family database
Column Family Name: CUSTOMERS
Key        Columns
Rowkey 1   City: Manchester
           Fname: Alfred
           Lname: Ramas
           Country: UK
Rowkey 2   Balance: 345.86
           Fname: Kathy
           Lname: Smith
Rowkey 3   Company: Local Markets, Inc.
           Lname: Dunne
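The structure in Figure 16.9 maps naturally onto a dictionary of dictionaries, where each row key owns its own set of name: value columns. The sketch below is a conceptual model only; customers_cf, get_column and column_names are names invented for this example.

```python
# The CUSTOMERS column family of Figure 16.9: row key -> {column name: value}.
customers_cf = {
    "Rowkey 1": {"City": "Manchester", "Fname": "Alfred",
                 "Lname": "Ramas", "Country": "UK"},
    "Rowkey 2": {"Balance": 345.86, "Fname": "Kathy", "Lname": "Smith"},
    "Rowkey 3": {"Company": "Local Markets, Inc.", "Lname": "Dunne"},
}


def get_column(column_family, row_key, column_name, default=None):
    """Fetch one column for one row key; absent columns yield the default."""
    return column_family.get(row_key, {}).get(column_name, default)


def column_names(column_family):
    """All column names used by at least one row key."""
    return sorted({name for columns in column_family.values() for name in columns})


print(get_column(customers_cf, "Rowkey 2", "Balance"))
print(get_column(customers_cf, "Rowkey 2", "City"))
print(column_names(customers_cf))
```

Unlike a relational table, a missing column is not a NULL stored in a fixed slot; the column simply does not exist for that row key.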
16.3.4 Graph Databases
A graph database is a NoSQL database based on graph theory to store data about relationship-rich environments. Graph theory is a mathematical and computer science field that models relationships, or edges, between objects called nodes. Modelling and storing data about relationships is the focus of graph databases. Graph theory is a well-established field of study going back hundreds of years. As a result, creating a database model based on graph theory immediately provides a rich source for algorithms and applications that have helped graph databases gain in sophistication very quickly. Because much of the data explosion over the past decade has involved data that is relationship-rich, graph databases have been poised to experience significant interest in the business environment.
Interest in graph databases originated in the area of social networks. Social networks include a wide range of applications beyond the typical Facebook, Twitter and Instagram examples that immediately come to mind. Dating websites, knowledge management, logistics and routing, master data management, and identity and access management are all areas that rely heavily on tracking complex relationships among objects. Of course, relational databases support relationships too. One of the great advances of the relational model was that relationships are easy to maintain. A relationship between a customer and an agent is as easy to implement in the relational model as adding a foreign key to create a common attribute, and the customer and agent rows are related by having the same value in the common attributes. If the customer changes to a different agent, then simply changing the value in the foreign key will change the relationship between the rows to maintain the integrity of the data. The relational model does all of these things very well. However, what if we want a "like" option so customers can "like" agents on our website? This would require a structural change to the database to add a new foreign key to support this second relationship. Next, what if the company wants to allow
customers on its website to "friend" each other so a customer can see which agents their friends like, or the friends of their friends? In social networking data, there can be dozens of different relationships among individuals that need to be tracked, and often the relationships are tracked many layers deep (e.g., friends, friends of friends, and friends of friends of friends). This results in a situation where the relationships become just as important as the data itself. This is the area where graph databases shine.
The primary components of graph databases are nodes, edges and properties, as shown in Figure 16.10. A node corresponds to the idea of a relational entity instance. The node is a specific instance of something we want to keep data about. Each node (circle) in Figure 16.10 represents a single agent. Properties are like attributes; they are the data that we need to store about the node. All agent nodes might have properties like first name and last name, but not all nodes are required to have the same properties. An edge is a relationship between nodes. Edges (shown as arrows in Figure 16.10) can be in one direction, or they can be bidirectional. For example, in Figure 16.10, the friends relationships are bidirectional, but the likes relationships are not. Note that edges can also have properties. In Figure 16.10, the date on which customer Alfred Ramas liked agent Alex Alby is recorded in the graph database.
A query in a graph database is called a traversal. Instead of querying the database, the correct terminology would be traversing the graph. Graph databases excel at traversals that focus on relationships between nodes, such as shortest path and degree of connectedness.
Graph databases share some characteristics with other NoSQL databases in that graph databases do not force data to fit predefined structures, do not support SQL, and are optimised to provide velocity of processing, at least for relationship-intensive data. However, other key characteristics do not apply to graph databases. Graph databases do not scale out very well to clusters due to differences in aggregate awareness.
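A traversal such as shortest path can be sketched with a breadth-first search over a small edge list. The example below is a minimal illustration, not the query mechanism of any graph database product. The person names echo Figure 16.10, but apart from Alfred liking Alex, the specific friends and likes edges are assumptions made up for the example, as are the function names.

```python
from collections import deque

# Mostly hypothetical edges; only the node names come from Figure 16.10.
FRIENDS = [("Alfred", "Leona"), ("Leona", "Kathy"), ("Kathy", "Paul")]
LIKES = {"Alfred": ["Alex"], "Kathy": ["Leah"], "Paul": ["John"]}


def friends_of(person):
    """Friends edges are bidirectional, so look at both ends of each edge."""
    found = set()
    for a, b in FRIENDS:
        if a == person:
            found.add(b)
        elif b == person:
            found.add(a)
    return sorted(found)


def shortest_friend_path(start, goal):
    """Breadth-first traversal returning one shortest chain of friends."""
    queue = deque([[start]])
    visited = {start}
    while queue:
        path = queue.popleft()
        if path[-1] == goal:
            return path
        for friend in friends_of(path[-1]):
            if friend not in visited:
                visited.add(friend)
                queue.append(path + [friend])
    return None


def agents_liked_by_friends(person):
    """Follow a friends edge, then a directed likes edge."""
    return sorted({agent for friend in friends_of(person)
                   for agent in LIKES.get(friend, [])})


print(shortest_friend_path("Alfred", "Paul"))
print(agents_liked_by_friends("Leona"))
```

The likes edges are stored in one direction only, while friends_of checks both ends of each pair, mirroring the directed and bidirectional edges described in the text.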
FIGURE 16.10 Graph database representation
(Diagram of nodes and edges. Each node carries an ID, a Type of customer or agent, and properties such as Fname, Lname and Phone; for example, agent Alex Alby, Phone 228-1249, and customer Paul Olowski, Phone 894-2180. Each edge carries an ID and a Label of likes, friends or assists, and some edges also carry a Date property.)
16.3.5 Aggregate Awareness
Key-value, document and column family databases are aggregate aware. Aggregate aware means that the data is collected or aggregated around a central topic or entity. For example, a blog website might organise data around individual blog posts. All data related to each blog post are aggregated into a single denormalised collection that might include data about the blog post (title, content and date posted), the poster (user name and screen name), and all comments made on the post (comment content and commenter's user name and screen name). In a normalised, relational database, this same data might call for USER, BLOGPOST and COMMENT tables. Determining the best central entity for forming aggregates is one of the most important tasks in designing most NoSQL databases, and is determined by how the application will use the data.
The aggregate-aware database models achieve clustering efficiency by making each piece of data relatively independent. That allows a key-value pair to be stored on one node in the cluster without the DBMS needing to associate it with another key-value pair that may be on a different node in the cluster. The greater the number of nodes involved in a data operation, the greater the need for coordination and centralised control of resources. Separating independent pieces of data (often called shards) across nodes in the cluster is what allows NoSQL databases to scale out so effectively.
Graph databases, like relational databases, are aggregate ignorant. Aggregate-ignorant models do not organise the data into collections based on a central entity. Data about each topic is stored separately and joins are used to aggregate individual pieces of data as needed. Aggregate-ignorant databases, therefore, tend to be more flexible at allowing applications to combine data elements in a greater variety of ways. Graph databases specialise in highly related data, not independent pieces of data. As a result, graph databases tend to perform best in centralised or lightly clustered environments, similar to relational databases.
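The link between aggregates and sharding can be sketched briefly. The example assumes a simple hash-based placement rule, which is only one of several ways a cluster might assign shards to nodes; the blog post contents, the key "post-1001" and the function shard_for are all hypothetical.

```python
import zlib

# One aggregate: everything about a blog post lives in a single document.
blog_post = {
    "post_id": "post-1001",
    "title": "Why aggregates matter",
    "poster": {"user_name": "aramas", "screen_name": "Alfred"},
    "comments": [
        {"user_name": "ldunne", "screen_name": "Leona", "content": "Nice post."},
    ],
}


def shard_for(key, node_count):
    """Pick a node from the key alone, using a stable CRC32 hash."""
    return zlib.crc32(str(key).encode("utf-8")) % node_count


# The whole aggregate is placed by its key, so no other node is consulted.
node = shard_for(blog_post["post_id"], 4)
print(node)
```

Because the post, its poster and its comments are stored as one unit, reading or writing the post involves only the node chosen for its key, which is why aggregate-aware models need so little cross-node coordination.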
16.4 NewSQL DATABASES
Relational databases are the mainstay of organisational data, and NoSQL databases do not attempt to replace them for supporting line-of-business transactions. These transactions that support the day-to-day operations of business rely on ACIDS-compliant transactions and concurrency control, as discussed in Chapter 12. NoSQL databases (except graph databases that focus on specific relationship-rich domains) are concerned with the distribution of user-generated and machine-generated data over massive clusters. NewSQL databases try to bridge the gap between RDBMS and NoSQL. NewSQL databases attempt to provide ACIDS-compliant transactions over a highly distributed infrastructure. NewSQL databases are the latest technologies to appear in the data management arena to address Big Data problems. As a new category of data management products, NewSQL databases have not yet developed a track record of success and have been adopted by relatively few organisations.
NewSQL products, such as ClustrixDB and NuoDB, are designed from scratch as hybrid products that incorporate features of relational databases and NoSQL databases. Like RDBMSs, NewSQL databases support:
SQL as the primary interface
ACIDS-compliant transactions.
Similar to NoSQL, NewSQL databases also support:
Highly distributed clusters
Key-value or column-oriented data stores.
As expected, no technology can perfectly provide the advantages of both RDBMS and NoSQL, so NewSQL has disadvantages (the CAP theorem covered in Chapter 14 still applies!). Principally, the disadvantages that have been discovered centre on NewSQL's heavy use of in-memory storage. Critics point to the fact that this can jeopardise the durability component of ACIDS. Further, the ability to handle vast data sets can be impacted by the reliance on in-memory structures because there are practical limits to the amount of data that can be held in memory. Although in theory NewSQL databases should be able to scale out significantly, in practice little has been done to scale beyond a few dozen data nodes. While this is a marked improvement over traditional RDBMS distribution, it is far from the hundreds of nodes used by NoSQL databases.
A few NoSQL databases have experienced success in niche markets by providing solutions to specific business needs. These products provide a set of functionality not yet matched by traditional relational databases. The following sections provide a brief introduction to two widely used NoSQL databases, MongoDB and Neo4j. You can find more detailed hands-on examples of these databases in Appendixes Q and R, respectively, available on the online platform.
16.5 WORKING WITH DOCUMENT DATABASES USING MongoDB
This section introduces you to MongoDB, a popular document database. Among the NoSQL databases currently available, MongoDB has been one of the most successful in penetrating the database market. Therefore, learning the basics of working with MongoDB can be quite useful for database professionals.
NOTE MongoDB is a product of MongoDB, Inc. In this book, we use the Community Server v.4.0.9 edition, which is open source and available free of charge from MongoDB, Inc. New versions are released regularly. This version of MongoDB is available for Windows, MacOS and Linux from the MongoDB website.
The name, MongoDB, comes from the word humongous, as its developers intended their new product to support extremely large data sets. It is designed for:
high availability
high scalability
high performance.
Online Content An expanded set of hands-on exercises using MongoDB can be found in Appendix Q, Working with MongoDB, available on the online platform.
As a document database, MongoDB is schema-less and aggregate aware. Recall that being schema-less means that all documents are not required to conform to the same structure, and the structure of documents does not have to be declared ahead of time. Aggregate aware means that the documents encapsulate all relevant data related to a central entity within the same document. Data is stored in documents, documents of a similar type are stored in collections, and related collections are stored in a database. To the users, the documents appear as JSON files, which makes them easy to read and easy to manipulate in a variety of programming languages. Recall that JavaScript Object Notation (JSON) is a data interchange format that represents data as a logical object. Objects are enclosed in curly brackets { } that contain key-value pairs. A single JSON object can contain many key:value pairs separated by commas. A simple JSON document to store data on a book might look like this:
{_id: 101, title: "Database Principles"}
This document contains two key:value pairs:
_id is a key with 101 as the associated value.
title is a key with "Database Principles" as the associated value.
In the previous example, each key has a single associated value. When there are multiple values for a given key, an array is used. Arrays in JSON are placed inside square brackets [ ]. For example, adding the authors to the above document would be appropriate, and the document could be expanded to:
{_id: 101, title: "Database Principles", author: ["Coronel", "Morris", "Crockett", "Blewett"]}
When JSON documents are intended to be read by humans, they are often displayed with each key:value pair on a separate line to improve readability, such as:
{
_id: 101,
title: "Database Principles",
author: ["Coronel", "Morris", "Crockett", "Blewett"]
}
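The book document can be read and re-displayed with Python's standard json module. This is a small sketch rather than anything MongoDB-specific. One assumption to note: strict JSON, as accepted by json.loads, requires the keys to be written in double quotes, so the text form below quotes _id, title and author.

```python
import json

# The book document written as strict JSON (keys in double quotes).
text = ('{"_id": 101, "title": "Database Principles", '
        '"author": ["Coronel", "Morris", "Crockett", "Blewett"]}')

book = json.loads(text)              # JSON object -> Python dict
pretty = json.dumps(book, indent=2)  # one key:value pair per line

print(book["title"])
print(len(book["author"]))
print(pretty)
```

The array of authors arrives as a Python list, and dumping with indent=2 gives the one-pair-per-line layout intended for human readers.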
MongoDB databases are comprised of collections of documents. Each MongoDB server can host many databases. When connected to the MongoDB server, the first task is to specify with which database object you want to work. A list of the databases available on the server can be retrieved with the command:
show dbs
All data manipulation commands in MongoDB must be directed to a particular database. Creating a new database in MongoDB is as easy as issuing the use command.
use fact
The use command informs the server which database is to be the target of the commands that follow. If there is a database with the name specified, then that database will be used for the subsequent commands. If there is not a database with that name, then one is created automatically.
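The behaviour described for show dbs and use can be pictured with a toy in-memory model. This is deliberately not MongoDB's API or a driver call: the class ToyServer and its methods are invented here purely to illustrate a server holding databases, databases holding collections, and use creating a database on demand. A real server's behaviour differs in details.

```python
class ToyServer:
    """In-memory stand-in: database name -> collection name -> documents."""

    def __init__(self):
        self.databases = {}
        self.current = None

    def use(self, name):
        """Switch to a database, creating it if it does not exist yet."""
        self.databases.setdefault(name, {})
        self.current = name
        return name

    def show_dbs(self):
        """List the names of the databases held by this toy server."""
        return sorted(self.databases)

    def insert(self, collection, document):
        """Store a document in a collection of the current database."""
        if self.current is None:
            raise RuntimeError("no database selected; call use() first")
        self.databases[self.current].setdefault(collection, []).append(document)


server = ToyServer()
server.use("fact")
server.insert("patron", {"display": "Example Patron"})
print(server.show_dbs())
```

The insert method refuses to run until use has been called, echoing the rule that every data manipulation command must be directed to a particular database.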
Online Content The documents for the fact database are available as a collection of JSON documents that can be directly imported into MongoDB. The file is named 'Ch16_Fact.json' and is available on the online platform.
16.5.1 Importing Documents in MongoDB
Remember that a MongoDB database is a collection of documents. The collection of documents we will use to illustrate MongoDB queries is based on a sample database named fact and a patron collection. Free Access to Computer Technology (FACT) is a small library run by the Computer Information Systems department at Tiny College. The portion of the model that is being used here consists of documents with patron as the central entity. The documents have the following structure:
{_id: <system-generated ObjectID>,
 display: <the patron's full name as it will be displayed to users>,
 fname: <patron's first name in all lowercase letters>,
 lname: <patron's last name in all lowercase letters>,
 type: <either faculty or student>,
 age: <patron's age in years, only if the patron is a student>,
 checkouts: <an array of objects for the patron's checkout history>
   [id: <an assigned number for this checkout object>,
    year: <the year in which this checkout occurred>,
    month: <the month in which this checkout occurred>,
    day: <the day of the month in which this checkout occurred>,
    book: <the book number of the book for this checkout>,
    title: <the title of the book>,
    pubyear: <the year the book was published>,
    subject: <the subject of the book>]
}
Notice that each document in the patron collection contains information about the patron. Notice also that checkouts is a subdocument: an array of objects in which all the books that the patron has checked out are stored together under the patron. Finally, note that the patron's name is stored twice, once with first name and last name together with capitalisation, and again with first name in all lowercase letters and last name in all lowercase letters in separate key:value pairs. The reason for this is that all searches in MongoDB are case sensitive by default, so storing the name twice facilitates searches.
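The reasoning behind storing the name twice can be illustrated with a short sketch. This is plain Python working on dictionaries, not MongoDB itself, and the sample patrons and the helper name find_by_lname are assumptions made only for illustration. It shows that a case-sensitive equality test succeeds reliably when the search text is lowercased and compared against the lowercase lname key, while the capitalised display key is kept for presentation.

```python
# Hypothetical sample documents that follow the patron structure described above.
patrons = [
    {"display": "Robert Barry", "fname": "robert", "lname": "barry", "type": "faculty"},
    {"display": "Helena Hays", "fname": "helena", "lname": "hays", "type": "student", "age": 24},
]


def find_by_lname(collection, text):
    """Case-sensitive equality search against the lowercase lname key."""
    needle = text.strip().lower()  # normalise user input once
    return [p["display"] for p in collection if p["lname"] == needle]


# The user may type the name in any case; the stored lowercase copy still matches.
found = find_by_lname(patrons, "BARRY")

# A case-sensitive comparison against the display key would miss the same patron.
missed = [p for p in patrons if p["display"] == "robert barry"]
```

The design choice is a trade of a little redundant storage for simpler and more predictable searches.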
NOTE The database can be created using the Ch16_Fact.json file by using the following command at an operating system command prompt (note that the command is for use at a command prompt in the OS, not inside the MongoDB shell).
mongoimport --db fact --collection patron --type json --file Ch16_Fact.json
Mongoimport is an executable program that is installed with MongoDB and is used to import data into a MongoDB database. The above command specifies that the imported documents should be placed in the fact database (if one does not exist, it will be created) and in the patron collection (if one does not exist, it will be created). Mongoimport can work with different file types such as CSV files and JSON files. The type parameter specifies that the imported documents are already in JSON format. The file parameter specifies the name of the file to be imported. If your copy of the Ch16_Fact.json file is not in the current directory for your command prompt, you will need to provide an appropriate path for the file location.
16.5.2 Example of a MongoDB Query Using find( )
Once the patron collection is imported, you are ready to query the MongoDB database. In order to manipulate collections, a MongoDB database uses methods. Methods are programmed functions to manipulate objects. Examples of such methods are createCollection( ), getName( ), insert( ), update( ), find( ), and so on. The find( ) method retrieves objects from a collection that match the restrictions provided.
The find( ) method has two parameters:
find({<query>}, {<projection>})
The <query> parameter specifies the criteria to retrieve the collection objects. The <projection> parameter is optional and specifies which key:value pairs to return. The value with each key in the projection object is either 0 (do not return), or 1 (return). For example, Figure 16.11 shows the code to retrieve the _id, display name, age and type for patrons that either have the last name barry and are faculty, or have the last name hays and are under 30 years old:
db.patron.find({$or: [
    {$and: [{lname: "barry"}, {type: "faculty"}]},
    {$and: [{lname: "hays"}, {age: {$lt: 30}}]}
  ]}, {display: 1, age: 1, type: 1}).pretty( )
FIGURE 16.11 Example of MongoDB document query
Notice also that this example used the pretty( ) method. The pretty( ) method is a MongoDB method to improve readability of the documents by placing key:value pairs on separate lines.
MongoDB is a powerful document database that is being adopted by many organisations. It was originally designed to support Web-based operations and, as such, it draws heavily on JavaScript for the structure of its documents and for its query language.
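To make the roles of the query and projection parameters concrete, the following is a minimal sketch in plain Python. It is not MongoDB code and does not use any MongoDB library: the functions matches and find, and the four sample patrons, are hypothetical stand-ins that imitate only the small subset of behaviour used in Figure 16.11 ($or, $and, $lt, equality, and inclusion-style projection).

```python
def matches(doc, query):
    """Return True if doc satisfies a tiny subset of MongoDB-style query syntax."""
    for key, cond in query.items():
        if key == "$or":
            if not any(matches(doc, sub) for sub in cond):
                return False
        elif key == "$and":
            if not all(matches(doc, sub) for sub in cond):
                return False
        elif isinstance(cond, dict) and "$lt" in cond:
            # A document without the key cannot satisfy the comparison.
            if key not in doc or not doc[key] < cond["$lt"]:
                return False
        elif doc.get(key) != cond:
            return False
    return True


def find(collection, query, projection=None):
    """Imitate find(): filter by query, then keep only the projected keys."""
    results = []
    for doc in collection:
        if not matches(doc, query):
            continue
        if projection is None:
            results.append(dict(doc))
        else:
            # Inclusion-style projection only; _id is kept, as in the figure's output.
            keep = {"_id"} | {k for k, flag in projection.items() if flag == 1}
            results.append({k: v for k, v in doc.items() if k in keep})
    return results


# Hypothetical sample data, not taken from Ch16_Fact.json.
patrons = [
    {"_id": 1, "display": "Robert Barry", "lname": "barry", "type": "faculty"},
    {"_id": 2, "display": "Helena Hays", "lname": "hays", "type": "student", "age": 24},
    {"_id": 3, "display": "Jimmie Hays", "lname": "hays", "type": "student", "age": 33},
    {"_id": 4, "display": "Kelsey Barry", "lname": "barry", "type": "student", "age": 21},
]

query = {"$or": [
    {"$and": [{"lname": "barry"}, {"type": "faculty"}]},
    {"$and": [{"lname": "hays"}, {"age": {"$lt": 30}}]},
]}
projection = {"display": 1, "age": 1, "type": 1}

result = find(patrons, query, projection)
```

In this sketch the faculty patron has no age key, so the projected document for that patron simply omits it, which mirrors how a document database returns only the keys that exist.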
NOTE We have introduced you here to the basic concepts of a MongoDB collection and how to query it using the find( ) method, but there is much more to learn if you are interested in pursuing a career in document databases. Appendix Q, Working with MongoDB, contains a more thorough tutorial on how to use this powerful document database and is located on the online platform of this book.
16.6 WORKING WITH GRAPH DATABASES USING Neo4j
Even though Neo4j is not yet as widely adopted as MongoDB, it has been one of the fastest growing NoSQL databases, with thousands of adopters including LinkedIn and Walmart. Neo4j is a graph database. Like relational databases, graph databases still work with concepts similar to entities and relationships. However, in relational databases, the focus is primarily on the entities. In graph databases, the focus is on the relationships.
Online Content An expanded set of hands-on exercises using Neo4j can be found in Appendix R, Working with Neo4j, available on the online platform of this book.
Graph databases are used in environments with complex relationships among entities. Graph databases, therefore, are heavily reliant on interdependence among their data, which is why they are the least able to scale out among the NoSQL database types. Consider an example of a social network such as LinkedIn that connects people. A person can be friends with many people, and those friends can each be friends with many other people. In terms of a relational model, we could represent this as a person entity with a many-to-many unary relationship. In implementation, we would create a bridge table for the relationship and end up with a two-entity solution. Imagine a person table with 10 000 people (rows), each of whom has an average of 30 friends, so that the bridge table has 300 000 rows. A query to retrieve a person and the names of his or her friends would require two joins: one to link the person to their friends in the bridge table and another to retrieve the names from the person table. A relational database can complete this query quickly.
The problem comes when we look beyond the direct friend relationship. What if we want to know about friends of friends? This requires joining the bridge table to itself. Joining a 300 000-row table to another copy of itself means the query engine is contending with a Cartesian product of 90 billion rows. Then, if we want friends of friends of friends, yet another copy of the bridge table must be included, producing a product of 2.7 × 10^16 rows! Now imagine a query for the six degrees of separation. As you can see, these queries are not trivial to construct, and they could take hours to run in a relational database because the relational DBMS is unable to keep up with the volume. These types of highly interdependent queries about relationships among people are the forte of graph databases.
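The row counts in the friends example follow from simple multiplication, which the short sketch below reproduces. It assumes, as the discussion above does, that the size of interest is the full Cartesian product of the bridge table with copies of itself; the variable and function names are chosen here only for illustration.

```python
people = 10_000        # rows in the person table
average_friends = 30   # average number of friends per person

# One bridge row per (person, friend) pair.
bridge_rows = people * average_friends


def cartesian_rows(copies):
    """Rows in the Cartesian product of the bridge table with itself, for a given number of copies."""
    return bridge_rows ** copies


friends_of_friends = cartesian_rows(2)             # two copies of the bridge table
friends_of_friends_of_friends = cartesian_rows(3)  # three copies of the bridge table
```

Each additional level of friendship multiplies the product by another 300 000, which is why the cost grows so quickly as the query reaches further across the network.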
NOTE Neo4j is a product of Neo4j, Inc. There are multiple versions of Neo4j available. In this book, we use the Community Server edition, v.3.5.5, which is open source and available free of charge from the Neo4j, Inc. website. This version of Neo4j is available for Windows (64-bit and 32-bit), MacOS and Linux. New versions are released regularly.
Neo4j provides several interface options. It was originally designed with Java programming in mind, and optimised for interaction through a Java API. Later releases have included the options for a Neo4j command shell, similar to the MongoDB shell, a REST API for website interaction, and a graphical, browser-based interface for intuitive interactive sessions. In this section, you will use the Web browser interface.
16.6.1 Creating Nodes in Neo4j
NOTE An instance of Neo4j can have only one active database at a time. However, the active database can be changed by changing the data path in the configuration before starting the database server. If the data path is changed to point at an empty directory, Neo4j automatically creates all needed database files in that path on start-up. By keeping each database in a separate folder and changing the data path before starting the server, multiple databases can be maintained.
As you learnt earlier in the chapter, graph databases are composed of nodes and edges. Roughly speaking, nodes in a graph database correspond to entity instances in a relational database. In Neo4j, a label is the closest thing to the concept of a table from the relational model. A label is a tag that is used to associate a collection of nodes as being of the same type or belonging to the same group. Just as entity instances have values for attributes to describe the characteristics of that instance, a node has properties that describe the characteristics of that node. Unlike the relational model, graph databases are schema-less, so nodes with the same label are not required to have the same set of properties. In fact, nodes can have more than one label if they logically belong to more than one type or kind of node.
Consider an example of a club for food critics, where members share reviews of area restaurants. Each member would be represented as a node. Each restaurant would be represented as a node. To help distinguish the types of nodes, you can use labels. The nodes for members might get a Member label, while the nodes for restaurants would get a Restaurant label. This makes it more convenient to distinguish between nodes for members and nodes for restaurants, both in code and in the minds of users and programmers.
The interactive, declarative query language in Neo4j is called Cypher. Cypher is declarative, like SQL, even though the syntax is very different. However, being a declarative language instead of an imperative language, Cypher is very easy to learn, and a few simple commands can be used to perform basic database processing.
Nodes and relationships are created using a CREATE command. The following code creates a member node:
CREATE (:Member {mid: 1, fname: "Phillip", lname: "Stallings"})
NOTE Neo4j creates an internal ID field named <id> for every node and relationship; however, this field is for internal use within the database for storage algorithms. It is not intended to be, and should not be used as, a unique key.
The previous command creates a node with the Member label. That node was given the properties mid with the value 1, fname with the value Phillip and lname with the value Stallings. The mid property is being used as a member ID field to identify the members. If there is not already a label named Member, it is created at the same time the node is.
16.6.2 Retrieving Node Data with MATCH and WHERE
Let's start by issuing a simple command to retrieve our single member node:
MATCH (m)
RETURN (m)
In this case, this command retrieves all of the nodes in the graph database. The only node is for Phillip Stallings, so that is the only node to display. If many nodes existed, we could have used a command such as the following to retrieve only Phillip Stallings:
MATCH (m {fname: "Phillip", lname: "Stallings"})
RETURN m
In this case, the properties and values were embedded in the node. Alternatively, the use of a WHERE clause allows for more complex criteria, such as using comparison operators other than equality. The previous command can be rewritten using a WHERE clause as follows:
MATCH (m)
WHERE m.fname = "Phillip" AND m.lname = "Stallings"
RETURN m
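The difference between embedding property values in the node pattern and using a WHERE clause can be sketched in plain Python. This is not Cypher and does not call Neo4j: the list of nodes, the second member and the helper names match_props and match_where are hypothetical and exist only to show that the embedded form expresses equality tests, while the WHERE form accepts an arbitrary condition.

```python
# Each node carries a set of labels and a dictionary of properties.
nodes = [
    {"labels": {"Member"}, "props": {"mid": 1, "fname": "Phillip", "lname": "Stallings"}},
    {"labels": {"Member"}, "props": {"mid": 2, "fname": "Ana", "lname": "Reyes"}},  # hypothetical
]


def match_props(graph, **wanted):
    """Like MATCH (m {key: value, ...}): every listed property must be equal."""
    return [n for n in graph
            if all(n["props"].get(k) == v for k, v in wanted.items())]


def match_where(graph, predicate):
    """Like MATCH (m) WHERE ...: any condition on the properties is allowed."""
    return [n for n in graph if predicate(n["props"])]


embedded = match_props(nodes, fname="Phillip", lname="Stallings")
with_where = match_where(
    nodes, lambda p: p["fname"] == "Phillip" and p["lname"] == "Stallings")

# A comparison other than equality needs the WHERE-style form.
later_members = match_where(nodes, lambda p: p["mid"] > 1)
```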
Online Content The 'Ch16_FCC.txt' file used in the following section is available on the online platform of this book.
The following section assumes that you have preloaded the Neo4j food critics database using the Ch16_FCC.txt file, available online. The contents of the file should be copied and pasted into the editor bar in the Neo4j browser interface and executed using the play button. The code creates 78 additional members, 67 restaurants, 43 owners and 8 cuisines. Providing the database as a single, massive command that includes many statements may seem unfamiliar, but it is necessary: because the interface is designed for interactive use, it does not support script files with multiple commands. To learn more about using Neo4j, please refer to Appendix R, Working with Neo4j, available on the online platform.
16.6.3 Retrieving Relationship Data with MATCH and WHERE
Beyond retrieving nodes, it is possible to retrieve data based on the relationships between nodes. As stated earlier, focusing on relationships is the primary strength of graph databases. For example, the following command retrieves every member who has reviewed the restaurant Tofu for You and rated the restaurant a 4 on taste:
MATCH (m :Member) -[r :REVIEWED {taste: 4}]-> (res :Restaurant {name: "Tofu for You"})
RETURN m, r, res
When retrieving data based on a relationship, criteria for the direction of the relationship and any data characteristics of the relationship can be specified in the query. In this example, there are two nodes (m and res) and a relationship that joins them (r). In this case, we are matching all nodes that are members, the one node that is named Tofu for You, and all relationships that are labelled as REVIEWED and have a property named taste equal to the value 4. You could add comparisons and logical operators using the WHERE clause, as shown in the following command, with the results shown in Figure 16.12:
MATCH (m :Member) -[r :REVIEWED]-> (res :Restaurant)
WHERE (r.value > 4 OR r.taste > 4) AND res.state = "KY"
RETURN m, r, res
FIGURE 16.12 Neo4j query using MATCH/WHERE/RETURN
The command retrieves all members who have reviewed any restaurant in Johannesburg and rated the restaurant greater than 4 on value or taste. Notice that using the WHERE clause allows the use of inequalities such as greater than, and logical operators.
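The pattern (m)-[r]->(res) can be pictured as a set of directed triples, each linking a member node to a restaurant node through a relationship that carries its own properties. The sketch below expresses that idea in plain Python; it is not Cypher, and the members, restaurants, ratings and the helper match_reviewed are hypothetical values invented for illustration.

```python
# Hypothetical nodes, keyed by an illustrative identifier.
members = {
    1: {"fname": "Phillip", "lname": "Stallings"},
    2: {"fname": "Ana", "lname": "Reyes"},
}
restaurants = {
    10: {"name": "Tofu for You", "state": "KY"},
    11: {"name": "Grill House", "state": "TN"},
}

# Directed REVIEWED relationships: (member id, relationship properties, restaurant id).
reviews = [
    (1, {"taste": 4, "value": 3}, 10),
    (2, {"taste": 5, "value": 5}, 10),
    (2, {"taste": 2, "value": 5}, 11),
]


def match_reviewed(where):
    """Return (m, r, res) for every REVIEWED relationship that passes the condition."""
    rows = []
    for member_id, r, restaurant_id in reviews:
        m, res = members[member_id], restaurants[restaurant_id]
        if where(m, r, res):
            rows.append((m, r, res))
    return rows


# Equality criteria, as when properties are embedded in the pattern.
taste_four = match_reviewed(
    lambda m, r, res: r["taste"] == 4 and res["name"] == "Tofu for You")

# Inequalities and logical operators, as in a WHERE clause.
high_rated = match_reviewed(
    lambda m, r, res: (r["value"] > 4 or r["taste"] > 4) and res["state"] == "KY")
```

Because each relationship is stored with references to the nodes at both ends, the condition can test the relationship and either node without any join.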
NOTE This section is just a very brief introduction to Neo4j, but there is much more to learn if you are interested in pursuing a career in graph databases. Appendix R, Working with Neo4j, contains a more thorough tutorial on how to use this powerful graph database and is located on the online platform of this book.
In Chapter 15, you learnt about data warehouses and star schemas for modelling and storing decision support data. In this chapter, you have added to that by exploring the vast stores of data that organisations are collecting in unstructured formats and the technologies that make that data available to users. Data analytics, discussed in Chapter 15, is used to extract knowledge from all of these sources of data (NoSQL databases, Hadoop data stores, and data warehouses) to provide decision support to all organisational users. Even though relational databases are still dominant for most business transactions, and will continue to be so for the foreseeable future, the growth of Big Data must be accommodated. There is too much value in the immense amounts of unstructured data available to organisations for them to ignore it. Database professionals must be informed about these new approaches to data management to ensure that the right tool is used for each job.
SUMMARY
Big Data is characterised by data of such volume, velocity and/or variety that the relational model struggles to adapt to it. Volume refers to the quantity of data that must be stored. Velocity refers to both the speed at which data is entering storage and the speed with which it must be processed. Variety refers to the lack of uniformity in the structure of the data being stored. As a result of Big Data, organisations are having to employ a variety of data storage solutions that include technologies in addition to relational databases, a situation referred to as polyglot persistence.
Volume, velocity, variety, veracity and value are collectively referred to as the 5 Vs of Big Data. However, these are not the only characteristics of Big Data to which data administrators must be sensitive. Additional Vs that have been suggested by the data management industry include variability and visualisation. Variability is the variation in the meaning of data that can occur over time. Further, visualisation is the requirement that the data must be able to be presented in a manner that makes it comprehensible to decision makers. Most of these additional Vs are not unique to Big Data. They are also concerns for data in relational databases.
The Hadoop framework has quickly emerged as a standard for the physical storage of Big Data. The primary components of the framework include the Hadoop Distributed File System (HDFS) and MapReduce. HDFS is a coordinated technology for reliably distributing data over a very large cluster of commodity servers. MapReduce is a complementary process for distributing data processing across distributed data. One of the key concepts for MapReduce is to move the computations to the data instead of moving the data to the computations. MapReduce works by combining the functions of map, which distributes subtasks to the cluster servers that hold data to be processed, and reduce, which combines the map results into a single result set. The Hadoop framework also supports an entire ecosystem of additional tools and technologies, such as Hive, Pig and Flume, which work together to produce a complex system of Big Data processing.
NoSQL is a broad term that refers to any of several non-relational database approaches to data management. Most NoSQL databases fall into one of four categories: key-value databases, document databases, column-oriented databases, or graph databases. Due to the wide variability of products under the NoSQL umbrella, these categories are not necessarily all-encompassing, and many products can fit into multiple categories.
Key-value databases store data in key-value pairs. In a key-value pair, the value of the key must be known to the DBMS, but the data in the value component can be of any type, and the DBMS makes no attempt to understand the meaning of the data in it. These types of databases are very fast when the data is completely independent, and the application programs can be relied on to understand the meaning of the data.
Document databases also store data in key-value pairs, but the data in the value component is an encoded document. The document must be encoded using tags, such as in XML or JSON. The DBMS is aware of the tags in the documents, which makes querying on tags possible. Document databases expect documents to be self-contained and relatively independent of one another.
Column-oriented databases, also called column family databases, organise data into key-value pairs in which the value component is composed of a series of columns, which are themselves key-value pairs. Columns can be grouped into super columns, similar to a composite attribute in the relational model being composed of simple attributes. All objects of a similar type are identified as rows, given a row key, and placed within a column family. Rows within a column family are not required to have the same structure, that is, they are not required to have the same columns.
Graph databases are based on graph theory and represent data through nodes, edges and properties. A node is similar to an instance of an entity in the relational model. Edges are the relationships between nodes. Both nodes and edges can have properties, which are attributes that describe the corresponding node or edge. Graph databases excel at tracking data that is highly interrelated, such as social media data. Due to the many relationships among the nodes, it is difficult to distribute a graph database across a cluster in a highly distributed manner.
NewSQL databases attempt to integrate features of both RDBMS (providing ACID-compliant transactions) and NoSQL databases (using a highly distributed infrastructure).
MongoDB is a document database that stores documents in JSON format. The documents can be created, updated, deleted and queried using a JavaScript-like language, named MongoDB Query Language. Data retrieval is done primarily through the find( ) method.
Neo4j is a graph database that stores data as nodes and relationships, both of which can contain properties to describe them. Neo4j databases are queried using Cypher, a declarative language that shares many commonalities with SQL, but is still significantly different in many ways. Data retrieval is done primarily through the MATCH command to perform pattern matching.
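The map and reduce functions summarised above can be illustrated with a small word count, the customary teaching example. The sketch below runs in a single Python process, so it only imitates the flow of data: in Hadoop the map step would run on the servers that hold each block. The function names and the three sample blocks are assumptions made for illustration.

```python
from collections import defaultdict


def map_block(block):
    """Map step: emit a (word, 1) pair for every word in one block of data."""
    return [(word.lower(), 1) for word in block.split()]


def shuffle(mapped_blocks):
    """Group the emitted values by key so each key can be reduced independently."""
    groups = defaultdict(list)
    for pairs in mapped_blocks:
        for key, value in pairs:
            groups[key].append(value)
    return groups


def reduce_group(key, values):
    """Reduce step: combine all values for one key into a single result."""
    return key, sum(values)


def map_reduce(blocks):
    mapped = [map_block(block) for block in blocks]  # one map task per block
    groups = shuffle(mapped)
    return dict(reduce_group(key, values) for key, values in groups.items())


# Three hypothetical blocks, standing in for data held on three servers.
blocks = ["big data big value", "data velocity", "big variety"]
counts = map_reduce(blocks)
```

Only the small (word, count) pairs move between the steps, which reflects the idea of moving the computations to the data instead of moving the data to the computations.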
KEY TERMS
aggregate aware
aggregate ignorant
algorithm
batch processing
block report
BSON (Binary JSON)
bucket
collection
column-centric storage
column family
column family database
Cypher
document database
edge
feedback loop processing
find()
graph database
Hadoop Distributed File System (HDFS)
heartbeat
job tracker
JSON (JavaScript Object Notation)
key-value (KV) database
map
mapper
MapReduce
method
NewSQL
node
NoSQL
polyglot persistence
pretty()
properties
reduce
reducer
row-centric storage
scaling out
scaling up
semi-structured data
sentiment analysis
stream processing
structured data
super column
task tracker
traversal
unstructured data
value
variability
variety
velocity
veracity
visualisation
volume
REVIEW QUESTIONS
1 What is Big Data? Give a brief definition.
2 What are the traditional 3 Vs of Big Data? Briefly, define each.
3 Explain why companies like Google and Amazon were among the first to address the Big Data problem.
4 Explain the difference between scaling up and scaling out.
5 What is stream processing, and why is it sometimes necessary?
6 How is stream processing different from feedback loop processing?
7 Explain why veracity, value and visualisation can be said to apply to relational databases as well as Big Data.
8 What is polyglot persistence, and why is it considered a new approach?
9 What are the key assumptions made by the Hadoop Distributed File System approach?
10 What is the difference between a name node and a data node in HDFS?
11 Explain the basic steps in MapReduce processing.
12 Briefly explain how HDFS and MapReduce are complementary to each other.
13 What are the four basic categories of NoSQL databases?
14 How are the value components of a key-value database and a document database different?
15 Briefly explain the difference between row-centric and column-centric data storage.
16 What is the difference between a column and a super column in a column family database?
17 Explain why graph databases tend to struggle with scaling out.
18 Explain what it means for a database to be aggregate aware.
Chapter 17 Database Connectivity and Web Technologies
In this chapter, you will learn:
About the different database connectivity technologies
About the functionality and features of various database connectivity technologies: ODBC, OLE, ADO.NET and JDBC
Which services are provided by Web application servers
What Extensible Markup Language (XML) is and why it is important for Web database development
About cloud computing and how it enables the database-as-a-service model
About the Semantic Web and how it describes concepts in a way that computers can actually understand
Preview
Databases are the central repository for critical data generated by business applications, including newer channels such as the Web and mobile devices such as iPads, iPhones and Android phones. To be useful universally, the data must be available to all business users. Those users may access the data via a spreadsheet, a user-developed Visual Basic application, or a Web front end. In this chapter, you will learn about the architectures used by applications to connect to databases.
The internet has changed how organisations of all types operate. For example, buying goods and services via the internet has become commonplace. In today's environment, interconnectivity occurs not only between an application and the database, but also between applications interchanging messages and data. The Extensible Markup Language (XML) provides a standard way of exchanging unstructured and structured data between applications. You will also learn how organisations can benefit from leveraging newer Web technologies and cloud computing environments. Companies can now choose from a range of internet-based services, including the database-as-a-service model, within their IT portfolio. These cloud-based services provide a quick and cost-efficient way to offer new business services.
17.1 DATABASE CONNECTIVITY
Database connectivity refers to the mechanisms through which application programs connect and communicate with data repositories. Databases store data in persistent storage structures so that they can be retrieved at a later time for processing. As you have already learnt, the database management system (DBMS) functions as an intermediary between the data (stored in the database) and the end users' applications. Before learning about the various data connectivity options, it is important to review some important fundamentals you have learnt in this book:
DBMSs provide means to interact with the data in their databases. This could be in the form of administrative tools and data manipulation tools. DBMSs also provide a proprietary way for external application programs to connect to the database by means of an application programming interface. (See Chapter 1, The Database Approach.)
Modern DBMSs have the option to store data locally or distributed in multiple locations. Locally stored data resides in the same processing host as the DBMS. A distributed database stores data in multiple geographically distributed nodes with data management capability. (See Chapter 14, Distributed Databases.)
The database connectivity software we discuss in this chapter supports Structured Query Language (SQL) as the standard data manipulation language. However, depending on the type of database model, some database connectivity interfaces may support other proprietary data manipulation languages.
Database connectivity software works in a client/server architecture, in which processing tasks are split among multiple software layers. In this model, the multiple layers exchange control messages and data. (See Chapter 14, Distributed Databases, and Appendix F, Client/Server Systems, located on the online platform of this book, for more information on this topic.)
To better understand database connectivity software, we use client/server concepts in which an application is broken down into interconnected functional layers. In the case of database connectivity software, you could break down its basic functionality into three broad layers:
1 A data layer where the data resides. You can think of this layer as the actual data repository interface. This layer resides closest to the database itself and is normally provided by the DBMS vendor.
2 A middle layer that manages multiple connectivity and data transformation issues. This layer is in charge of dealing with data logic issues, data transformations, ways to talk to the database below it, and so on. This would also include translating multiple data manipulation languages to the native language supported by the specific data repository.
3 A top layer that interfaces with the actual external application. This mostly comes in the form of an application programming interface that publishes specific protocols for the external programs to interact with the data.
previous
discussion,
you can understand
why database
connectivity
as database middleware because it provides aninterface between database or data repository. The data repository, also known as the management application, such as Oracle, SQL Server, IBM DB2, or the data generated by the application program. Ideally, a data source anywhere
support
Copyright Editorial
review
2020 has
Cengage deemed
and hold
any type
of data.
multiple data sources
Learning. that
any
All suppressed
Rights
Reserved. content
does
May not
not materially
be
Furthermore,
the
same
database
at the same time. For example, the
copied, affect
scanned, the
overall
or
duplicated, learning
in experience.
whole
or in Cengage
part.
Due Learning
to
electronic reserves
rights, the
right
software
is also known
the application program and the data source, represents the data NoSQL that will be used to store or data repository can belocated connectivity
middleware
can
17
data source could be a relational
some to
third remove
party additional
content
may content
be
suppressed at
any
time
from if
the
subsequent
eBook rights
and/or restrictions
eChapter(s). require
it
862
part
VI
Database
database,
Management
a NoSQL
database,
multidata-source-type The need
for
the
de facto
for
enabling
database
standard
is
database
data
manipulation
applications
connectivity,
to this
Open
based
interfaces a standard
to
database,
of
well-established
cannot
be overstated.
database
data repositories.
covers
(vendor
Access
support
connectivity
section
Database
a Microsoft
on the
language,
connect
Native SQL connectivity Microsofts
a spreadsheet,
capability
there
data file.
access
Just
connectivity
Although
only the following
or a text data
as SQL
has
interface
is
many
ways to
are
This
standards. become
necessary
achieve
interfaces:
provided)
Connectivity
(ODBC),
Data
Access
Objects
(DAO)
and
Remote
Data Objects (RDO) Microsofts
Object
Linking
and
Microsofts
ActiveX
Data
Objects
Oracles The
Java
data
connectivity
importantly, form
they
the
of
other,
of
here
most
Data
and
(OLE-DB)
dominant
vendors.
Access
interfaces
enhanced
functionality,
players
In fact,
(UDA)
manage the
connectivity
providing
are
database
Universal
of data source
database
thus
Database
(JDBC)
illustrated
support
Microsofts
any type
Microsofts
of the
Connectivity
the
for
(ADO.NET)
interfaces
enjoy
backbone
used to access see,
Database
Embedding
architecture,
data through
have evolved features,
in
the
ODBC,
over time:
more
ADO.NET
of technologies
interface.
each interface
and
and,
and
a collection
a common
flexibility
market
OLE-DB,
As you builds
will
on top
support.
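The three-layer breakdown described earlier in this section can be illustrated with a minimal Python sketch. It is only an illustration under stated assumptions: the class names (DataLayer, MiddleLayer, ConnectivityAPI), the customer table and the sample names are hypothetical, and an in-memory SQLite database stands in for a vendor DBMS.

```python
import sqlite3


class DataLayer:
    """Data layer: talks to the repository in its native language (SQL here)."""

    def __init__(self):
        # Assumption: an in-memory SQLite database stands in for the vendor's DBMS.
        self.conn = sqlite3.connect(":memory:")
        self.conn.execute(
            "CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT)"
        )

    def run(self, sql, params=()):
        cursor = self.conn.execute(sql, params)
        self.conn.commit()
        return cursor.fetchall()


class MiddleLayer:
    """Middle layer: translates generic requests into the native language."""

    def __init__(self, data_layer):
        self.data_layer = data_layer

    def insert(self, table, record):
        columns = ", ".join(record)
        placeholders = ", ".join("?" for _ in record)
        sql = f"INSERT INTO {table} ({columns}) VALUES ({placeholders})"
        self.data_layer.run(sql, tuple(record.values()))

    def find_all(self, table):
        return self.data_layer.run(f"SELECT * FROM {table} ORDER BY 1")


class ConnectivityAPI:
    """Top layer: the interface published to external applications."""

    def __init__(self):
        self._middle = MiddleLayer(DataLayer())

    def save_customer(self, name):
        self._middle.insert("customer", {"name": name})

    def list_customers(self):
        return [name for _id, name in self._middle.find_all("customer")]


# The external application sees only the top layer.
api = ConnectivityAPI()
api.save_customer("Ramas")
api.save_customer("Dunne")
customers = api.list_customers()
```

The application never writes SQL itself; swapping the data layer for a different repository would, in principle, leave the top-layer calls unchanged.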
17.1.1 Native SQL Connectivity

Most DBMS vendors provide their own methods for connecting to their databases. Native SQL connectivity refers to the connection interface that is provided by the database vendor and is unique to that vendor. The best example of that type of native interface is the Oracle RDBMS. To connect a client application to an Oracle database, you must install and configure Oracle's SQL*Net interface on the client computer. Figure 17.1 shows the configuration of the Oracle SQL*Net interface on the client computer.

Native database connectivity interfaces are optimised for their DBMS, and these interfaces support access to most, if not all, of the database features. However, maintaining multiple native interfaces for different databases can become a burden for the programmer. Therefore, the need for universal database connectivity arises. Usually, the native database connectivity interface provided by the vendor is not the only way to connect to a database; most current DBMS products support other database connectivity standards as well, the most common being ODBC.
Chapter
FIGure
17.1
17.1.2 Developed
in the
of the
is
probably
(API).
to
The
the
(APi)
blocks
easy to
ODBC
better
ability
way to
Data
Access
Objects
Data
Cengage deemed
Learning. that
any
Web Technologies
863
to
is
ActiveX (rDO)
need,
API that
was a higher
Data bring
Access
about
level
a single
allows
guarantee
provide
two
access
to
However,
the while
rapid
data
adoption
in
functionality needed
access
Microsoft DAO is
are
all programs
programmers
other
API so
APIs
new programs.
significant
Therefore,
puts
an
that
and it enjoyed
a
interfaces:
Access still
widely
used
. application
to interface
directly
model is a new framework
uniform
provide Although
users to learn
did not
data.
object-oriented
developers
(UDA)
they
developed
API.
Objects (ADO).
Basic. It allowed
ODBC style
Microsoft
object-oriented
Data
because
A good
A programmer
Windows,
standard,
evolved, relational
Microsoft's first
users
programming
environment.
makes it easy for
Windows interface
applications.
blocks.
Microsoft
ODBC
any
programming
an application
building
of
access.
ODBC allows
software
operating
middleware
manipulate that
for
implementation
database
application
building
as
the
That
languages
SQL to
Visual
such
good
database
for
all of the
with
for
defines
tools
by providing
ultimately
To answer
Objects
UDA is designed
has
and
environments,
Microsoft's
Universal
non-relational
2020
and
Microsofts
interface.
SQL via a standard
consistent
are
(DAO)
was
Microsoft
Microsoft's
review
data.
DAO
remote
Copyright
execute
is a move towards
used in
Editorial
operating
applications
is
standard
(www.webopedia.com)
a program
they
(ODBC)
(CLi)
connectivity
using
protocols
As programming
to
access
database.
there
dictionary
widely adopted
applications. the
database
sources,
API will have similar interfaces.
was the first
Windows beyond
write
Connectivity
Call Level interface
of routines,
programmers,
using a common
data
Most
can
Database
supported
develop
together.
for
Connectivity
uDa
Group
online set
and Open
relational
as a
programmers designed
1990s,
most widely
access
makes it
rDo
SQL Access
Webopedia
interface
API
Dao, early
the
application
Database
oracle native connectivity
oDBC,
a superset
17
API that
will allow
interface
that they access
primarily
with ODBC data sources. have proposed.
to relational
17
and
databases.
Figure 17.2 illustrates how Windows applications can use ODBC, DAO and RDO to access local and remote relational data sources.

FIGURE 17.2 Using ODBC, DAO and RDO to access databases

As you can tell by examining Figure 17.2, client applications can use ODBC to access relational data sources. However, the DAO and RDO object interfaces provide more functionality. DAO and RDO make use of the underlying ODBC data services. ODBC, DAO and RDO are implemented as shared code that is dynamically linked to the Windows operating environment through dynamic link libraries (DLLs). DLLs are stored as files with the .dll extension. Running as a DLL, the code speeds up load and run times.

The basic ODBC architecture has three main components:

• A high-level ODBC API through which application programs access ODBC functionality
• A driver manager that is in charge of managing all database connections
• An ODBC driver that communicates directly with the DBMS.
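The division of labour among the three ODBC components (API, driver manager, driver) can be sketched in a few lines of Python. This is a toy model, not the real ODBC API: the class names, the api_execute function and the data source names "Gradora" and "LocalFiles" are assumptions made for illustration, and the drivers only return strings instead of contacting a DBMS.

```python
class Driver:
    """Hypothetical ODBC-style driver: knows how to talk to one DBMS."""

    def __init__(self, dbms_name):
        self.dbms_name = dbms_name

    def execute(self, sql):
        # A real driver would translate the call into the DBMS's native protocol.
        return f"[{self.dbms_name}] executed: {sql}"


class DriverManager:
    """Keeps track of data sources and hands each call to the right driver."""

    def __init__(self):
        self._data_sources = {}

    def register(self, dsn, driver):
        self._data_sources[dsn] = driver

    def driver_for(self, dsn):
        if dsn not in self._data_sources:
            raise LookupError(f"Unknown data source name: {dsn}")
        return self._data_sources[dsn]


def api_execute(manager, dsn, sql):
    """The 'API' the application calls; it never touches a driver directly."""
    return manager.driver_for(dsn).execute(sql)


manager = DriverManager()
manager.register("Gradora", Driver("Oracle"))
manager.register("LocalFiles", Driver("Access"))

result = api_execute(manager, "Gradora", "SELECT * FROM STUDENT")
```

The application code stays the same whichever driver is registered behind a given data source name, which is the point of routing every call through the driver manager.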
Defining a data source is the first step in using ODBC. To define a data source, you must create a data source name (DSN) for the data source. To create a DSN, you need to provide:

• An ODBC driver. You need to identify the driver to use to connect to the data source. The ODBC driver is normally provided by the database vendor, although Microsoft provides several drivers that connect to most common databases. For example, if you are using an Oracle DBMS, you will select the Oracle ODBC driver provided by Oracle or, if desired, the Microsoft-provided ODBC driver for Oracle.
• A name. This is a unique name by which the data source will be known to ODBC and, therefore, to applications. ODBC offers two types of data sources: user and system. User data sources are available only to the user. System data sources are available to all users, including operating system services.
• ODBC driver parameters. Most ODBC drivers require specific parameters to establish a connection to the database. For example, if you are using a Microsoft Access database, you must point to the location of the Microsoft Access (.mdb) file and, if necessary, provide a username and password. If you are using a DBMS server, you must provide the server name, the database name, and the username and password needed to connect to the database.

Figure 17.3 shows the ODBC screens required to create a system ODBC data source for an Oracle DBMS. Note that some ODBC drivers use the native driver provided by the DBMS vendor.

FIGURE 17.3 Configuring an Oracle ODBC data source
[Screen captures, with the following callouts: Defining an ODBC system data source name (DSN) to connect to an Oracle DBMS. The Oracle ODBC Driver uses the native SQL Oracle connectivity. If no user ID is provided, ODBC will prompt for the user ID and password at run time.]
SOURCE: Course Technology/Cengage Learning
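Once a DSN exists, an application usually refers to it through a connection string. The sketch below assembles one from the pieces discussed above, using the standard ODBC connection string keywords DSN, UID and PWD. The function name build_connection_string and the sample credentials are hypothetical; "Gradora" is used only as an example data source name.

```python
def build_connection_string(dsn, user=None, password=None):
    """Assemble a DSN-based ODBC connection string.

    Only the DSN is mandatory. If the user ID and password are omitted,
    the driver may prompt for them at run time.
    """
    parts = [f"DSN={dsn}"]
    if user is not None:
        parts.append(f"UID={user}")
    if password is not None:
        parts.append(f"PWD={password}")
    return ";".join(parts)


# Hypothetical credentials, for illustration only.
prompting = build_connection_string("Gradora")
explicit = build_connection_string("Gradora", user="student", password="secret")
```

A library such as pyodbc accepts a string of this shape in its connect call; that call is left out here because it needs an installed driver and a configured data source.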
Once the ODBC data source is defined, application programmers can write to the ODBC API by issuing specific commands and providing the required parameters. The ODBC Driver Manager will properly route the calls to the appropriate data source. The ODBC API standard defines three levels of compliance: Core, Level-1 and Level-2, which provide increasing levels of functionality. For example, Level-1 may provide support for most SQL DDL and DML statements, including subqueries and aggregate functions, but no support for procedural SQL or cursors. The database vendors can choose which level to support. However, to interact with ODBC, the database vendor must implement all of the features indicated in that ODBC API support level.

Figure 17.4 shows how you could use Microsoft Excel to retrieve data from an Oracle RDBMS, using ODBC.
FIGURE 17.4 Microsoft Excel uses ODBC to connect to the Oracle database
[Diagram: on the client computer, the client application (Excel) calls the ODBC interface (ODBC API, ODBC Driver Manager and ODBC driver), which connects to the RDBMS server and its database on the server computer. Numbered callouts 1 to 10 trace the process: selecting the ODBC data source from Excel, entering the authentication parameters, selecting the tables and columns, selecting filtering and sorting options, and returning the result set to the spreadsheet.]
As much of the functionality provided by these interfaces is oriented to accessing relational data sources, the use of the interfaces is limited when they are used with other data source types. With the advent of object-oriented programming languages, it has become more important to provide access to other non-relational data sources.
17.1.3 OLE-DB

Although ODBC, DAO and RDO were widely used, they did not provide support for non-relational data. To answer that need and to simplify data connectivity, Microsoft developed Object Linking and Embedding for Database (OLE-DB). Based on Microsoft's Component Object Model (COM), OLE-DB is database middleware that adds object-orientated functionality for access to relational and non-relational data. OLE-DB was the first part of Microsoft's strategy to provide a unified object-orientated framework for the development of next-generation applications.

OLE-DB is composed of a series of COM objects that provide low-level database connectivity for applications. Since OLE-DB is based on COM, the objects contain data and methods (also known as the interface). The OLE-DB model is better understood when you divide its functionality into two types of objects:

• Consumers are objects (applications or processes) that request and use data. The data consumers request data by invoking the methods exposed by the data provider objects (public interface) and passing the required parameters.
• Providers are objects that manage the connection with a data source and provide data to the consumers. Providers are divided into two categories: data providers and service providers.
  • Data providers provide data to other processes. Database vendors create data provider objects that expose the functionality of the underlying data source (relational, object-oriented, text, and so on).
  • Service providers provide additional functionality to consumers. The service provider is located between the data provider and the consumer. The service provider requests data from the data provider, transforms the data and then provides the transformed data to the data consumer. In other words, the service provider acts like a data consumer of the data provider and as a data provider for the data consumer (end-user application). For example, a service provider could offer cursor management services, transaction management services, query processing services and indexing services.

As a common practice, many vendors provide OLE-DB objects to augment their ODBC support, effectively creating a shared object layer on top of their existing database connectivity (ODBC or native) through which applications can interact. The OLE-DB objects expose functionality about the database; for example, there are objects that deal with relational data, hierarchical data and flat-file text data. Additionally, the objects implement specific tasks, such as establishing a connection, executing a query, invoking a stored procedure, defining a transaction, or invoking an OLAP function. By using OLE-DB objects, the database vendor can choose which functionality to implement in a modular way, instead of being forced to include all of the functionality all of the time. Table 17.1 shows a sample of the object-orientated classes used by OLE-DB and some of the methods (interfaces) exposed by the objects.
TABLE 17.1 Sample OLE-DB classes and interfaces

Object Class | Usage | Sample Interfaces
Session | Used to create an OLE-DB session between a data consumer application and a data provider. | IGetDataSource, ISessionProperties
Command | Used to process commands to manipulate a data provider's data. Generally, the command object will create RowSet objects to hold the data returned by a data provider. | ICommandPrepare, ICommandProperties
RowSet | Used to hold the result set returned by a relational style database or a database that supports SQL. Represents a collection of rows in a tabular format. | IRowsetInfo, IRowsetFind, IRowsetScroll
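The consumer, data provider and service provider roles described for OLE-DB can be sketched as plain Python objects. This is a conceptual model only: real OLE-DB components are COM objects, and the class names, the get_rows method and the sample data here are all hypothetical.

```python
class TextDataProvider:
    """Data provider: exposes the rows of a flat text source."""

    def __init__(self, lines):
        self._lines = lines

    def get_rows(self):
        return [line.split(",") for line in self._lines]


class SortingServiceProvider:
    """Service provider: sits between a data provider and the consumer.

    It acts as a consumer of the data provider and as a provider for the
    end-user application, adding a service (sorting) on the way.
    """

    def __init__(self, provider, column):
        self._provider = provider
        self._column = column

    def get_rows(self):
        rows = self._provider.get_rows()
        return sorted(rows, key=lambda row: row[self._column])


def consume(provider):
    """Consumer: requests data through the provider's public interface."""
    return [row[0] for row in provider.get_rows()]


raw = TextDataProvider(["Smith,Sales", "Adams,Finance", "Khan,IT"])
unsorted_names = consume(raw)
sorted_names = consume(SortingServiceProvider(raw, column=0))
```

Because the service provider exposes the same interface as the data provider, the consumer code does not change when the extra service is inserted.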
OLE-DB provides additional capabilities for the applications accessing the data. However, it does not provide support for scripting languages, especially the ones used for Web development, such as Active Server Pages (ASP) and ActiveX. (A script is written in a programming language that is not compiled but is interpreted and executed at run time.) To provide that support, Microsoft developed a new object framework called ActiveX Data Objects (ADO), which provides a high-level application-orientated interface to interact with OLE-DB, DAO and RDO. ADO provides a unified interface to access data from any programming language that uses the underlying OLE-DB objects. Figure 17.5 illustrates the ADO/OLE-DB architecture, showing how it interacts with ODBC and native connectivity options.

FIGURE 17.5 OLE-DB architecture
ADO introduced a simpler object model that was composed of only a few interacting objects to provide the data manipulation services required by the applications. Sample objects in ADO are shown in Table 17.2.

TABLE 17.2 Sample ADO objects

Object Class | Usage
Connection | Used to set up and establish a connection with a data source. ADO will connect to any OLE-DB data source. The data source can be of any type.
Command | Used to execute commands against a specific connection (data source).
Recordset | Contains the data generated by the execution of a command. It will also contain any new data to be written to the data source. The Recordset can be disconnected from the data source.
Fields | Contains a collection of Field descriptions for each column in the Recordset.

Although the ADO model is a tremendous improvement over the OLE-DB model, Microsoft is actively encouraging programmers to use its new data access framework, ADO.NET.
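The four ADO object roles (Connection, Command, Recordset, Fields) have close counterparts in most data access libraries. The sketch below shows the same roles using Python's built-in sqlite3 module as a stand-in; it is not ADO code, and the product table and its rows are invented for illustration.

```python
import sqlite3

# Connection: set up and establish a connection with a data source.
connection = sqlite3.connect(":memory:")
connection.execute("CREATE TABLE product (code TEXT, price REAL)")
connection.executemany(
    "INSERT INTO product VALUES (?, ?)",
    [("A1", 9.5), ("B2", 12.0)],
)

# Command: the statement executed against that specific connection.
command = "SELECT code, price FROM product ORDER BY code"
cursor = connection.execute(command)

# Fields: a description of each column in the result.
fields = [column[0] for column in cursor.description]

# Recordset: the data generated by executing the command.
recordset = cursor.fetchall()

# Once fetched, the rows can outlive the connection, loosely mirroring
# a Recordset that has been disconnected from its data source.
connection.close()
```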
17.1.4 ADO.NET

Based on ADO, ADO.NET is the data access component of Microsoft's .NET application development framework. The Microsoft .NET framework is a component-based platform for developing distributed, heterogeneous, interoperable applications aimed at manipulating any type of data over any network under any operating system and any programming language. Comprehensive coverage of the .NET framework is beyond the scope of this book. Therefore, this section will only introduce the basic data access component of the .NET architecture, ADO.NET.

It is important to understand that the .NET framework extends and enhances the functionality provided by the ADO/OLE-DB duo. ADO.NET introduced two new features critical for the development of distributed applications: DataSets and XML support.

To understand the importance of this new model, you should know that a DataSet is a disconnected memory-resident representation of the database. That is, the DataSet contains tables, columns, rows, relationships and constraints. Once the data are read from a data provider, the data are placed on a memory-resident DataSet. The DataSet is then disconnected from the data provider. The data consumer application interacts with the data in the DataSet object to make changes (inserts, updates and deletes) in the DataSet. Once the processing is done, the DataSet data are synchronised with the data source and the changes are made permanent.

The DataSet is internally stored in XML format (you will learn about XML later in this chapter), and the data in the DataSet can be made persistent as XML documents. This is critical in today's distributed environments. In short, you can think of the DataSet as an XML-based, in-memory database that represents the persistent data stored in the data source. Figure 17.6 illustrates the main components of the ADO.NET object model.

The ADO.NET framework consolidated all data access functionality under one integrated object model. In this object model, several objects interact with one another to perform specific data manipulation functions. Those objects can be grouped as data providers and consumers.

Data provider objects are provided by the database vendors. However, ADO.NET comes with two standard data providers: a data provider for OLE-DB data sources and a data provider for SQL Server. That way, ADO.NET can work with any previously supported database, including an ODBC database with an OLE-DB data provider. At the same time, ADO.NET includes a highly optimised data provider for SQL Server.
FIGURE 17.6 ADO.NET framework
[Diagram: client applications (Access, Excel, Internet) act as data consumers of a DataSet (XML), which contains the DataTableCollection, DataTable, DataColumnCollection, DataRowCollection, ConstraintCollection and DataRelationCollection objects. The data providers layer contains the DataAdapter, DataReader, Command and Connection objects, which reach the database through OLE-DB.]
Whatever the data provider is, it must support a set of specific objects in order to manipulate the data in the data source. Some of those objects are shown in Figure 17.6. A brief description of the objects follows:

• Connection. The Connection object defines the data source used, the name of the server, the database, and so on. This object enables the client application to open and close a connection to a database.
• Command. The Command object represents a database command to be executed within a specified database connection. This object contains the actual SQL code or a stored procedure call to be run by the database. When a SELECT statement is executed, the Command object returns a set of rows and columns.
• DataReader. The DataReader object is a specialised object that creates a read-only session with the database to retrieve data sequentially (forward only) in a rapid manner.
• DataAdapter. The DataAdapter object is in charge of managing a DataSet object. This is the most specialised object in the ADO.NET framework. The DataAdapter object contains the following objects that aid in managing the data in the DataSet: SelectCommand, InsertCommand, UpdateCommand and DeleteCommand. The DataAdapter object uses those objects to populate and synchronise the data in the DataSet with the permanent data source data.
• DataSet. The DataSet object is the in-memory representation of the data in the database. This object contains two main objects. The DataTableCollection object contains a collection of DataTable objects that make up the in-memory database, and the DataRelationCollection object contains a collection of objects describing the data relationships and ways to associate one row in a table to the related row in another table.
• DataTable. The DataTable object represents the data in tabular format. This object has one important property: PrimaryKey, which allows the enforcement of entity integrity. In turn, the DataTable object is composed of three main objects:
  • DataColumnCollection contains one or more column descriptions. Each column description has properties such as column name, data type, nulls allowed, maximum value and minimum value.
  • DataRowCollection contains zero rows, one row or more than one row with data as described in the DataColumnCollection.
  • ConstraintCollection contains the definition of the constraints for the table. Two types of constraints are supported: ForeignKeyConstraint and UniqueConstraint.

As you can see, a DataSet is, in fact, a simple database with tables, rows and constraints. Even more important, the DataSet doesn't require a permanent connection to the data source. The DataAdapter uses the SelectCommand object to populate the DataSet from a data source. However, once the DataSet is populated, it is completely independent of the data source, which is why it's called disconnected.

Additionally, DataTable objects in a DataSet can come from different data sources. This means that you could have an EMPLOYEE table in an Oracle database and a SALES table in a SQL Server database. You could then create a DataSet that relates both tables as though they were located in the same database. In short, the DataSet object paves the way for truly heterogeneous distributed database support within applications.

The ADO.NET framework is optimised to work in disconnected environments. In a disconnected environment, applications exchange messages in request/reply format. The most common example of a disconnected system is the internet. Modern applications rely on the internet as the network platform and on the Web browser as the graphical user interface. In the next section, you will learn about how internet databases work.
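The disconnected fill, change, synchronise cycle described for the DataSet and DataAdapter can be sketched in Python. This is not ADO.NET code: a plain dictionary plays the part of the in-memory DataSet, a temporary SQLite file plays the part of the permanent data source, and the employee table, names and helper function are assumptions made for illustration.

```python
import os
import sqlite3
import tempfile

# Assumption: a temporary SQLite file stands in for the permanent data source.
db_path = os.path.join(tempfile.mkdtemp(), "company.db")


def open_connection():
    return sqlite3.connect(db_path)


# Set up the data source.
conn = open_connection()
conn.execute("CREATE TABLE employee (emp_id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany("INSERT INTO employee VALUES (?, ?)", [(1, "Smith"), (2, "Jones")])
conn.commit()
conn.close()

# 1. Fill: read the rows into memory, then disconnect.
conn = open_connection()
dataset = dict(conn.execute("SELECT emp_id, name FROM employee"))
conn.close()

# 2. Work offline: update, insert and delete in the in-memory copy only.
dataset[2] = "Jones-Brown"
dataset[3] = "Khan"
del dataset[1]

# 3. Synchronise: reconnect and make the changes permanent.
conn = open_connection()
existing = {row[0] for row in conn.execute("SELECT emp_id FROM employee")}
for emp_id in existing - set(dataset):
    conn.execute("DELETE FROM employee WHERE emp_id = ?", (emp_id,))
for emp_id, name in dataset.items():
    if emp_id in existing:
        conn.execute("UPDATE employee SET name = ? WHERE emp_id = ?", (name, emp_id))
    else:
        conn.execute("INSERT INTO employee VALUES (?, ?)", (emp_id, name))
conn.commit()
final_rows = conn.execute(
    "SELECT emp_id, name FROM employee ORDER BY emp_id"
).fetchall()
conn.close()
```

The three statements used in step 3 correspond loosely to the DeleteCommand, UpdateCommand and InsertCommand roles, while the query in step 1 plays the SelectCommand role. No connection is held open while the data are being edited.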
17.1.5 Java Database Connectivity (JDBC)

Java is an object-oriented programming language developed by Sun Microsystems (acquired by Oracle in 2010) that runs on top of Web browser software. Java is one of the most common programming languages for Web development. Sun Microsystems created Java as a write once, run anywhere environment, which means that a programmer can write a Java application once and then run it in multiple environments without any modification. The cross-platform capabilities of Java are based on its portable architecture. Java code is normally stored in pre-processed chunks known as applets that run in a virtual machine environment in the host operating system. This environment has well-defined boundaries, and all interactivity with the host operating system is closely monitored. Sun provides Java run-time environments for most operating systems, from computers to handheld mobile devices to TV set-top boxes. Another advantage of using Java is its on-demand architecture. When a Java application loads, it can dynamically download all its modules or required components via the internet.

When Java applications need to access data outside the Java run-time environment, they use pre-defined application programming interfaces. Java Database Connectivity (JDBC) is an application programming interface that allows a Java program to interact with a wide range of data sources, including relational databases, tabular data sources, spreadsheets and text files. JDBC allows a Java program to establish a connection with a data source, prepare and send the SQL code to the database server, and process the result set.

One main advantage of JDBC is that it allows a company to leverage its existing investment in technology and personnel training. JDBC allows programmers to use their SQL skills to manipulate the data in the company's databases. As a matter of fact, JDBC allows direct access to a database server or access via database middleware. Furthermore, JDBC provides a way to connect to databases through an ODBC driver. Figure 17.7 illustrates the basic JDBC architecture and the various database access styles.
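The three steps named above (establish a connection, prepare and send the SQL code, process the result set) are common to most call-level database interfaces. The sketch below walks through them with Python's built-in sqlite3 module as a stand-in; it is not JDBC code, and the employee table, salaries and function name are invented for illustration.

```python
import sqlite3


def query_employees(connection, min_salary):
    # Prepare and send: the ? placeholder is bound by the driver, so the
    # value travels separately from the SQL text.
    cursor = connection.execute(
        "SELECT name, salary FROM employee WHERE salary >= ? ORDER BY name",
        (min_salary,),
    )
    # Process the result set: iterate over the returned rows.
    return [f"{name}: {salary:.2f}" for name, salary in cursor]


# Establish a connection (an in-memory database, for illustration only).
connection = sqlite3.connect(":memory:")
connection.execute("CREATE TABLE employee (name TEXT, salary REAL)")
connection.executemany(
    "INSERT INTO employee VALUES (?, ?)",
    [("Ndlovu", 5200.0), ("Pillay", 3100.0), ("Adams", 4800.0)],
)

report = query_employees(connection, 4000)
connection.close()
```

In JDBC the same roles are played by a connection obtained from the driver manager, a prepared statement and a result set; only the shape of the interaction is shown here.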
FIGURE 17.7 JDBC architecture
[Diagram: a Java client application calls the JDBC API and the JDBC Driver Manager, which reaches the databases through Java DB drivers (directly or via database middleware) or through the JDBC-ODBC bridge driver and ODBC.]
SOURCE: Course Technology/Cengage Learning
As you see in Figure 17.7, the database access architecture in JDBC is very similar to the ODBC/OLE/ADO.NET architecture. All database access middleware shares similar components and functionality. One advantage of JDBC over other middleware is that it requires no configuration on the client side. The JDBC driver is automatically downloaded and installed as part of the Java applet download. Because Java is a Web-based technology, applications can connect to a database directly using a simple URL. Once the URL is invoked, the Java architecture comes into play, the necessary applets are downloaded to the client (including the database driver and all configuration information), and then the applets are executed securely in the client's run-time environment.

Every day, more and more companies are investing resources to develop and expand their Web presence and are finding ways to do more business on the internet. Such business generates increasing amounts of data to be stored in databases. Java and the .NET framework are part of the trend towards increasing reliance on the internet as a critical business resource. In fact, the internet is likely to become the development platform of the future.
17.1.6 php PHP
or
suited
Hypertext for
Web
Scripting a
through
specific In
One of the to learn.
advantage (including
offers
Call Interface that
use
(OCI
must connect
then
to
SQL
degree
to the
17.2
database.
database
of control
as if
Millions
of people
services
to
all over the
allows
cursors
needed
OCI 11 is the
over
how
operating
Web servers
organisations
usually
The
creation
developed
program
most efficient
way
designed
11 gives using
to
close
is
Oracle
of applications OCI
application,
an application
major
most
addition,
by the
the
statements.
databases.
the In
application
within
short
used. Oracle
server.
of Java.
hold the the
the
OCI 11
data
cursor
and
of connecting
to
and
also
access
the
a higher
execution.
ConneCtIVIty
world
over
that
extensions
ahead
on all
medium
to
databases
use and not difficult
a few
used
PHP is
Server-Side
Web page is
PHP,
supports
to
connect
A typical
that
control
small
database
statements
be
and it
time if to
using
can
a
especially
ODBC
used
MySQL, is free to
It
is
of different
through
websites
PHP)
interface
execution.
SQL
a number
connecting of
for
be used
Oracle
believes
greater
Internet
databases
websites
open the
any
Oracle
gives
over program
DataBase
connecting
process
an
query
more databases,
query,
can
before the
of
Linux
development
that
access
SQL
to
everywhere.
most
ASP. It is
to
cent1
that
Web server
MySQL database
X and
programming
to
of
per
with versions
OS
addition,
module
MySQL,
almost Mac
an application
one or
by the
Oracle
In
of all stages
disconnect
any
supported
calls
of
language
Microsofts
connectivity
with a
and have a shorter
of function
control
extracted
and IIS.
11) is
to
on the
78.9
connect
Windows,
an extension
a series
developer
PHP is
requirements
also
case that
PHP, along to
scripting
an alternative
supports
in the
was reported
it is possible
Apache
have simpler
it
Microsoft
including
PHP
extensions
is that
as
PHP
for this is that
For example,
Another
seen
general-purpose
means that it is interpreted
2019,
key reasons
widely-used
be displayed.
database
systems today,
to
January
a
and is
which
Web browser
Oracle.
is
development
Language,
sent to
for
Preprocessor,
use
the
computers
Web.
and
Web database
Web browser
software
connectivity
opens
to
the
door
to
internet,
new innovative
that:
Permit rapid responses to competitive pressures by bringing new services and products to market quickly.
Increase customer satisfaction through the creation of Web-based support services.
Allow anywhere, anytime data access using mobile smart devices via the internet.
Give fast and effective information dissemination through universal access from across the street or across the globe.
1 Comparison of the usage of PHP vs Java for websites, Server-Side Languages, W3Techs.com, News, Technologies, January 2019. Available: https://w3techs.com/technologies/comparison/pl-java,pl-php
Given those advantages, many IS departments face the need to create universal data access architectures, based on internet standards, to streamline operations and to facilitate decision making. Table 17.3 shows a sample of internet technology characteristics and the benefits they provide.
TABLE 17.3 Characteristics and benefits of internet technologies
Internet Characteristic | Benefit
Hardware and software independence | Savings in equipment/software acquisition; ability to run on most existing equipment; platform independence and portability; no need for multiple platform development
Common and simple user interface | Reduced training time and cost; reduced end-user support cost; no need for multiple platform development
Location independence | Global access through internet infrastructure; reduced requirements (and costs!) for dedicated connections
Rapid development at manageable costs | Availability of multiple development tools; plug-and-play development tools (open standards); more interactive development; reduced development times; relatively inexpensive tools; free client access tools (Web browsers); low entry costs; frequent availability of free Web servers; reduced costs of maintaining private networks; distributed processing and scalability, using multiple servers
In the current business and global information environment, it is easy to see why many database professionals consider the DBMS connection to the internet to be a critical element in IS development. As you will learn in the following sections, database application development and, in particular, the creation and management of user interfaces and database connectivity are profoundly affected by the Web. However, having a Web-based database interface does not negate the database design and implementation issues that were addressed in the previous chapters. In the final analysis, whether you make a purchase by going online or by standing in line, the system-level transaction details are essentially the same, and they require the same basic database structures and relationships. If any immediate lesson is to be learnt, it is this: the effects of bad database design, implementation and management are multiplied in an environment in which transactions may be measured in millions per day, rather than in hundreds per day.
The internet is rapidly changing the way information is generated, accessed and distributed. At the core of this change is the Web's ability to access data in databases (local and remote), the simplicity of the interface, and cross-platform (heterogeneous) functionality. The Web has helped create a new information dissemination standard.
The following sections examine how Web-to-database middleware enables end users to interact with databases over the Web.
17.2.1 Web-to-Database Middleware: Server-Side Extensions
In general, the Web server is the main hub through which all internet services are accessed. For example, when an end user uses a Web browser to query a database dynamically, the client browser requests a Web page. When the Web server receives the page request, it looks for the page on the hard disk; when it finds the page (for example, a stock quote, product catalogue information or an airfare listing), the server sends it back to the client.
Online Content: Client/server systems are covered in detail in Appendix F, Client/Server Systems, located on the online platform for this book.
Dynamic Web pages are at the heart of current generation websites. In this database query scenario, the Web server generates the Web page contents before it sends the page to the client Web browser. The only problem with the preceding query scenario is that the Web server must include the database query result on the page before it sends that page back to the client. Unfortunately, neither the Web browser nor the Web server knows how to connect to and read data from the database. Therefore, to support this type of request (database query), the Web server's capability must be extended so it can understand and process database requests. This job is done through a server-side extension.
A server-side extension is a program that interacts directly with the Web server to handle specific types of requests. In the preceding database query example, the server-side extension program retrieves the data from databases and passes the retrieved data to the Web server, which, in turn, sends the data to the client's browser for display purposes. The server-side extension makes it possible to retrieve and present the query results, but what's more important is that it provides its services to the Web server in a way that is totally transparent to the client browser. In short, the server-side extension adds significant functionality to the Web server and, therefore, to the internet.
A database server-side extension program is also known as Web-to-database middleware. Figure 17.8 shows the interaction between the browser, the Web server and the Web-to-database middleware.
As you examine Figure 17.8, trace the Web-to-database middleware actions:
1 The client browser sends a page request to the Web server.
2 The Web server receives and validates the request. In this case, the server passes the request to the Web-to-database middleware for processing. Generally, the requested page contains some type of scripting language to enable the database interaction.
3 The Web-to-database middleware reads, validates and executes the script. In this case, it connects to the database and passes the query, using the database connectivity layer.
4 The database server executes the query and passes the result back to the Web-to-database middleware.
5 The Web-to-database middleware compiles the result set, dynamically generates an HTML-formatted page that includes the data retrieved from the database, and sends it to the Web server.
6 The Web server returns the just-created HTML page, which now includes the query result, to the client browser.
7 The client browser displays the page on the local computer.
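The request flow traced in the steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the book's implementation: an in-memory SQLite database stands in for the RDBMS, and the function names (`web_server`, `middleware`), the `quote` table and the `/quotes` page are all hypothetical.

```python
import sqlite3
from html import escape


def make_database():
    """Hypothetical stand-in for the RDBMS: an in-memory SQLite database."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE quote (symbol TEXT, price REAL)")
    conn.executemany("INSERT INTO quote VALUES (?, ?)",
                     [("ABC", 12.5), ("XYZ", 40.0)])
    conn.commit()
    return conn


def middleware(conn, sql):
    """Steps 3 to 5: run the page's query and build an HTML-formatted page."""
    rows = conn.execute(sql).fetchall()  # step 4: the database executes the query
    body = "".join(
        "<tr>" + "".join(f"<td>{escape(str(value))}</td>" for value in row) + "</tr>"
        for row in rows
    )
    return f"<html><body><table>{body}</table></body></html>"  # step 5


# Hypothetical registry: each dynamic page maps to the query held in its script.
DYNAMIC_PAGES = {"/quotes": "SELECT symbol, price FROM quote ORDER BY symbol"}


def web_server(conn, path):
    """Steps 2 and 6: validate the request, delegate to the middleware, return HTML."""
    if path not in DYNAMIC_PAGES:
        return 404, "<html><body>Not found</body></html>"
    return 200, middleware(conn, DYNAMIC_PAGES[path])


conn = make_database()
status, page = web_server(conn, "/quotes")  # step 1: the browser's page request
print(status)
print(page)  # step 7: a browser would render this HTML
```

Note that neither the caller (the browser) nor `web_server` contains any SQL or connection logic; that knowledge sits entirely in the middleware function, which is the transparency the text describes.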
FIGURE 17.8 Web-to-database middleware
[Diagram: (1) the client computer sends an HTTP page request over the TCP/IP network; (2) the Web server receives the request; (3) the Web server determines that the page contains script language and passes the script page to the Web-to-database middleware; (4) the middleware connects to the database and passes the query using the database connectivity layer (ADO.NET, ADO, OLE-DB, ODBC); (5) the database server on the RDBMS server computer passes the query results back to the middleware; (6) the middleware formats the results in HTML and passes the page to the Web server; (7) the Web server sends the HTML page back to the client; (8) the query result is displayed in HTML format.]
The interaction between the Web server and the Web-to-database middleware is crucial to the development of a successful internet database implementation. Therefore, the middleware must be well integrated with the other internet services and the components that are involved in its use.
17.2.2 Web Server Interfaces
Extending Web server functionality implies that the Web server and the Web-to-database middleware will properly communicate with each other. (Database professionals often use the word interoperate to indicate that each party to the communication can respond to the communications of the other. This book's use of communicate assumes interoperation.) If a Web server is to communicate successfully with an external program, both programs must use a standard way to exchange messages and to respond to requests. Currently, there are two well-defined Web server interfaces:
Common Gateway Interface (CGI)
Application programming interface (API).
The Common Gateway Interface (CGI) uses script files that perform specific functions based on the client's parameters that are passed to the Web server. The script file is a small program containing commands written in a programming language, usually Perl, C++, C# or Visual Basic. The script file's contents can be used to connect to the database and to retrieve data from it, using the parameters passed by the Web server. Next, the script converts the retrieved data to HTML format and passes the data to the Web server, which sends the HTML-formatted page to the client. The main disadvantage of using CGI scripts is that the script file is an external program that is individually executed for each user request. That scenario decreases system performance. For example, if you have 200 concurrent requests, the script is loaded 200 different times, which takes significant CPU and memory resources away from the Web server. The language and method used to create the script can also affect system performance. For example, performance is degraded by using an interpreted language or by writing the script inefficiently.
An application programming interface (API) is a newer Web server interface standard that is more efficient and faster than a CGI script. APIs are more efficient because they are implemented as shared code or as dynamic link libraries (DLLs). That means the API is treated as part of the Web server program that is dynamically invoked when needed. APIs are faster than CGI scripts because the code resides in memory and there is no need to run an external program for each request. Instead, the same API serves all requests. Another advantage is that an API can use a shared connection to the database instead of creating a new one every time, as is the case with CGI scripts. Although APIs are more efficient in handling requests, they have some disadvantages. Because the APIs share the same memory space as the Web server, an API error can bring down the server. The other disadvantage is that APIs are specific to the Web server and to the operating system. The Web interface architecture is illustrated in Figure 17.9.
Regardless of the type of Web server interface used, the Web-to-database middleware program must be able to connect with the database. That connection can be accomplished in one of two ways:
Use the native SQL access middleware provided by the vendor. For example, you can use SQL*Net if you are using Oracle.
Use the services of general database connectivity standards such as ODBC, OLE-DB, ADO, ADO.NET or JDBC.
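The difference in connection handling between CGI scripts and APIs described above can be made concrete with a small Python sketch. It is a simplified model under stated assumptions: it captures only the connection cost (one new connection per request versus one shared connection), not the cost of starting an external process, and the class and function names are hypothetical.

```python
import sqlite3


class CountingFactory:
    """Opens SQLite connections and counts how many were created (illustrative)."""

    def __init__(self):
        self.opened = 0

    def connect(self):
        self.opened += 1
        conn = sqlite3.connect(":memory:")
        conn.execute("CREATE TABLE t (n INTEGER)")
        conn.execute("INSERT INTO t VALUES (1)")
        return conn


def cgi_style(factory, n_requests):
    """CGI-like handling: every request opens, uses and closes its own connection."""
    results = []
    for _ in range(n_requests):
        conn = factory.connect()
        results.append(conn.execute("SELECT n FROM t").fetchone()[0])
        conn.close()
    return results


def api_style(factory, n_requests):
    """API-like handling: one connection stays in memory and serves all requests."""
    conn = factory.connect()
    try:
        return [conn.execute("SELECT n FROM t").fetchone()[0]
                for _ in range(n_requests)]
    finally:
        conn.close()


cgi_factory, api_factory = CountingFactory(), CountingFactory()
cgi_results = cgi_style(cgi_factory, 200)
api_results = api_style(api_factory, 200)
print(cgi_factory.opened, api_factory.opened)
```

Both styles return the same answers for the 200 requests used in the text's example; they differ only in how many connections had to be set up to produce them.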
FIGURE 17.9 Web server CGI and API interfaces
[Diagram: the client computer connects over the TCP/IP network to the Web server computer, where the Web server invokes either an external CGI program or an API (DLL call); both reach the database on the RDBMS server computer through database connectivity middleware (ADO.NET, ADO, OLE-DB, ODBC).]
17.2.3 The Web Browser
The Web browser is the application software, such as Google Chrome, Apple Safari or Mozilla Firefox, that lets end users navigate (browse) the Web. Each time the end user clicks a hyperlink, the browser generates an HTTP GET page request that is sent to the designated Web server, using the TCP/IP internet protocol. The Web browser's job is to interpret the HTML code that it receives from the Web server and to present the different page components in a standard formatted way. Unfortunately, the browser's interpretation and presentation capabilities are not sufficient to develop Web-based applications.
The Web is a stateless system: at any given time, a Web server does not know the status of any of the clients communicating with it. That is, there is no open communication line between the server and each client accessing it, which, of course, is impractical in a worldwide Web! Instead, client and server computers interact in very short conversations that follow the request-reply model. For example, the browser is concerned only with the current page, so there is no way for the second page to know what was done in the first page. The only time the client and server computers communicate is when the client requests a page (when the user clicks a link) and the server sends the requested page to the client. Once the client receives the page and its components, the client/server communication is ended. Therefore, although you may be browsing a page and think that the communication is open, you are actually just browsing the HTML document stored in the local cache (temporary directory) of your browser. The server does not have any idea what the end user is doing with the document, which data is entered in a form, which option is selected, and so on. On the Web, if you want to act on a client's selection, you need to jump to a new page (go back to the Web server), thus losing track of what was done before.
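A minimal Python sketch may help make the stateless request-reply model concrete. It is illustrative only: requests are modelled as plain dictionaries, and the handler names, the `session_id` field and the in-memory `SESSIONS` store are assumptions introduced here, loosely corresponding to the kind of session management that server-side software can add on top of the stateless Web.

```python
import secrets

# Hypothetical server-side store: session ID -> items selected so far.
SESSIONS = {}


def stateless_handler(request):
    """Sees only the current request; nothing from earlier requests is available."""
    return f"You selected {request.get('item', 'nothing')}"


def session_handler(request):
    """Recovers earlier selections only because the client sends back a session ID."""
    sid = request.get("session_id")
    if sid not in SESSIONS:
        sid = secrets.token_hex(8)  # new conversation: issue a fresh identifier
        SESSIONS[sid] = []
    if "item" in request:
        SESSIONS[sid].append(request["item"])
    return sid, list(SESSIONS[sid])


# Two separate request-reply conversations from the same client.
first_reply = stateless_handler({"item": "pen"})
second_reply = stateless_handler({})  # the first request has been forgotten

sid, items = session_handler({"item": "pen"})
sid, items = session_handler({"session_id": sid, "item": "ink"})
print(first_reply, "|", second_reply, "|", items)
```

The design point is that any memory of earlier pages has to be carried explicitly, either in the request itself or in a store keyed by something the client returns with each request.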
The Web browser, through its use of HTML, does not have computational abilities beyond formatting output text and accepting form field inputs. Even when the browser accepts form field data, there is no way to perform immediate data entry validation. Therefore, to perform such crucial processing in the client, the Web defers to other Web programming languages such as PHP, Java, JavaScript and VBScript. The browser resembles a dumb terminal that displays only data and can perform only rudimentary processing such as accepting form data inputs. To improve the capabilities of the Web browser, you must use plug-ins and other client-side extensions. On the server side, Web application servers provide the necessary processing power.
17.2.4 Client-Side Extensions
Client-side extensions add functionality to the Web browser. Although client-side extensions are available in various forms, the most commonly encountered extensions are:
Plug-ins
Java and JavaScript
ActiveX and VBScript
A plug-in is an external application that is automatically invoked by the browser when needed. Because it is an external application, the plug-in is operating-system-specific. The plug-in is associated with a data object, generally using the file extension, to allow the Web server to handle data properly that are not originally supported. For example, if one of the page components is a .pdf document, the Web server will receive the data, recognise it as a portable document format object and launch Adobe Acrobat Reader to present the document on the client computer.
JavaScript is a scripting language (one that enables the running of a series of commands or macros) that allows Web authors to design interactive sites. Because JavaScript is simpler to generate than Java, it is easier to learn. JavaScript code is embedded in the Web pages. It is downloaded with the Web page and is executed when a specific event takes place, such as a mouse click on an object or a page being loaded from the server into memory.
ActiveX was Microsoft's alternative to Java. ActiveX was a specification for writing programs that ran inside Microsoft's client browser (Internet Explorer). However, despite Microsoft's efforts, ActiveX was not truly cross-platform compatible. Support for ActiveX was dropped, and in 2015 Microsoft released Microsoft Edge, a replacement for Internet Explorer with no ActiveX support.
From the developer's point of view, using routines that permit data validation on the client side is an absolute necessity. For example, when data is entered on a Web form and no data validation is done on the client side, the entire data set must be sent to the Web server. That scenario requires the server to perform all data validation, thus wasting valuable CPU processing cycles. Therefore, client-side data input validation is one of the most basic requirements for Web applications. Most of the data validation routines are done in Java, JavaScript or VBScript.
17.2.5 Web Application Servers
A Web application server is a middleware application that expands the functionality of Web servers by linking them to a wide range of services, such as databases, directory systems and search engines. The Web application server also provides a consistent run-time environment for Web applications. Web application servers can be used to perform the following:
Connect to and query a database from a Web page.
Present database data in a Web page using various formats.
Create dynamic Web search pages.
Create Web pages to insert, update and delete database data.
Enforce referential integrity in the application program logic.
Use simple and nested queries and programming logic to represent business rules.
Web application servers provide features such as:
An integrated development environment with session management and support for persistent application variables
Security and authentication of users through user IDs and passwords
Computational languages to represent and store business logic in the application server
Automatic generation of HTML pages integrated with Java, JavaScript, VBScript, ASP, and so on
Performance and fault-tolerant features
Database access with transaction management capabilities
Access to multiple services, such as file transfers (FTP), database connectivity, email and directory services.
Examples of Web application servers include ColdFusion/JRun by Adobe, WebSphere Application Server by IBM, WebLogic Server by Oracle, Fusion by NetObjects, Visual Studio .NET by Microsoft and WebObjects by Apple. All Web application servers offer the ability to connect Web servers to multiple data sources and other services. They vary in their range of available features, robustness, scalability, compatibility with other Web and database tools, and extent of the development environment.
17.2.6 Web Database Development
Web database development deals with the process of interfacing databases with the Web browser; in short, how to create Web pages that access data in a database. As you learnt earlier in this chapter, multiple Web environments can be used to develop Web database applications. One of the most common Web application development environments is known as LAMP. LAMP is made up of the Linux operating system, the Apache Web server, the MySQL database and the PHP programming language (although Perl and Python can be used instead of PHP). It is often used within organisations that need an effective way of managing organisational data but do not have the time or money to invest in a large-scale, costly Web development project. LAMP allows Web developers to build efficient Web applications that are reliable and stable. Examining the components of LAMP will allow us to see why:
The Linux operating system is open source and can be used to offer cross-platform compatibility. This is important to enable your website to be used across all major browsers and any mobile device.
The Apache Web server is the leading platform in terms of its total number of domains. This is because it allows, with PHP, the development of highly interactive Web applications. In 2018, 34.8 per cent of domains were hosted on Apache Web servers. As of 2018, Microsoft's Web servers power the most sites: 40.65 per cent.2
MySQL databases can be used to store data for both simple and complex websites with varying degrees of database complexity. MySQL allows easy retrieval and capturing of data from the Web.
The programming language PHP is used to link all the components of LAMP. PHP allows the dynamic content of the website to be obtained through accessing data within the MySQL database.
The main benefits of LAMP are that it is easy to program and applications can be developed offline and then deployed onto the Web. Deployment is also relatively straightforward as PHP is easily integrated with the Apache Web server and MySQL. Despite the development of the LAMP components being independent, when combined they offer one of the best solutions for Web database development.
In order to illustrate the use of PHP to retrieve data through a simple query, let's examine a PHP code example. Because this is a database book, the examples focus only on the commands used to interface with the database, rather than the specifics of HTML code.
A Microsoft Access database named Orderdb is used to illustrate the Web-to-database interface examples. The Orderdb database, whose relational diagram is shown in Figure 17.10, was designed to track the purchase orders placed by users in a multidepartment company.
2 Web Server Survey. Available: http://news.netcraft.com/archives/category/web-server-survey/
FIGURE 17.10 The Orderdb relational diagram for the Web database development examples
SOURCE: Course Technology/Cengage Learning
The following example scripts will explain how to use PHP to create a simple Web page to list the VENDOR rows. The examples will use an ODBC data source named RobCor. The ODBC data source was defined using the operating system tools shown in Section 17.1.2. The scripts used in these examples perform two basic tasks:
1 Query the database using standard SQL to retrieve a data set that contains all records in the VENDOR table.
2 Format the records generated in Step 1 in HTML so they are included in the Web page that is returned to the client browser.
Figure 17.11 shows the PHP code to query the VENDOR table.
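Because the PHP listing in Figure 17.11 is an image, the two tasks listed above are sketched here in Python as a rough analogue; this is not the book's PHP code. Everything specific in it is an assumption made for illustration: an in-memory SQLite database replaces the RobCor ODBC data source, and the VENDOR columns (`VEND_ID`, `VEND_NAME`) and sample rows are hypothetical.

```python
import sqlite3
from html import escape


def open_data_source():
    """Stand-in for opening the data source; the columns and rows are made up."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE VENDOR (VEND_ID INTEGER PRIMARY KEY, VEND_NAME TEXT)")
    conn.executemany("INSERT INTO VENDOR VALUES (?, ?)",
                     [(1, "Acme Supplies"), (2, "B & C Tools")])
    conn.commit()
    return conn


def vendor_page(conn):
    """Task 1: query all VENDOR rows. Task 2: format them as an HTML table."""
    cursor = conn.execute("SELECT VEND_ID, VEND_NAME FROM VENDOR ORDER BY VEND_ID")
    lines = ["<table>"]
    for vend_id, vend_name in cursor:  # fetch one row at a time from the result set
        lines.append(f"<tr><td>{vend_id}</td><td>{escape(vend_name)}</td></tr>")
    lines.append("</table>")
    return "\n".join(lines)


conn = open_data_source()   # open a connection and keep its handle
html_page = vendor_page(conn)
conn.close()                # release the connection when the page is built
print(html_page)
```

The shape is the same as the one the text describes for PHP: open a connection, execute the SQL, loop over the result set one row at a time, emit HTML for each row, then close the connection.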
FIGURE 17.11 PHP code to query the VENDOR table
SOURCE: Course Technology/Cengage Learning
In the figure, note that PHP uses multiple tags to query and display the data returned by the query. Take a closer look at the PHP functions:
The ODBC_CONNECT function (line 11) opens a connection to the ODBC data source. A handle to this database is set in the $dbc variable.
The ODBC_EXEC function (line 13) executes the SQL query stored in the $sql variable against the $dbc database connection. The query's result set is stored in the $rs variable.
The WHILE function (line 15) loops through the result set ($rs) and uses the ODBC_FETCH_ROW function to get one row at a time from the result set. Notice that PHP variables start with the dollar sign ($).
The ODBC_RESULT function (lines 17-30) extracts the different values for each field to be displayed and stores them in variables. This function gets a column value from a row in the result set and stores it in a variable.
The ECHO function (lines 32-47) outputs text to the Web page using the variables defined in the previous lines. You can also combine text (HTML code) and PHP variables (lines 33-46) using the . delimiter.
The The
ODBC_CLOSE
previous
examples
Web applications. servers
other
based
format.
That is the
eXtensIBle
efficiency
using
Web.
internet
to
or
create
business
processes
business
entities.
commerce
heading
styles, tags
as the
not
angle
order
a true data
definition
that
Web-enabled
data in
integrate
their
and
with
a standard-based
data
organisations
to increase whether
and services
or services
to
they
a global
can take
a business
and
a consumer
Since
B2B
e-commerce
over
on the
represent, is
the
in
pairs
to
display
to
place
(business-to-consumer
Web
order
start
FOR
and
on the
bold
Web
located
thing
be in the
include
form
features.
to differ
as an item
ID?
of an
HTML
as
well as
formatting
Web page, such
end formatting
SALE in
as typefaces
and
For
the
example,
Arial font:
Web page, there is
number,
only
product
describe
orders
data
how to
elements.
code, display
To solve
no easy
quantity, the
that
extract
or
price
a
Web browser;
order in problem,
way to
a new
from
details
an
markup
HTML it
does
language
was developed.
facilitate
the
exchange
World
Wide
standard
sets
platform. for
same
to
would
different
and use data, tends
the
was expected
looks
data from the
customer can
That
standard
code
integrates
among
SALE,/font.,/strong.
The
1998.
identify
a product
Web browser
how something
would
of business information
Language (XML) is a metalanguage
vendor-independent exchange
businesses.
the transfer
travelling
come
of the
internet. in
between
displayed
describe
date,
designed
the
place
it requires
Markup Language
Markup
XML is
standard
page
document
manipulation
over
exchange
enables
or between
For example,
often
number,
HTML
extensible elements.
of
with one another
the sale of products
B2B)
which businesses
brackets
as Extensible
invoices,
able to
with
Web application
development
market and sell products
wayin
needs to get the
the
be
of systems
company.
face5Arial.FOR
The
permit
known
the
databases
that
(XMl)
to
or
take
order
Web
and they in
If an application
document.
to
HTML tags
,strong.,font
such
must
transactions
companies,
a purchase HTML
details.
following
transactions
among
company
The
order
just
and
features
can communicate
(e-commerce)
or not-for-profit
However, the
from
Until recently,
the
systems
Web pages
multiple
B2C).
e-commerce
document.
of the
more than
new types
(business-to-business
Most
substantially
can interface
that
lanGuaGe
Electronic
for-profit
businesses
you
surface
involve
Clearly,
market of millions of users. E-commerce between
ways the
of XML.
costs.
or private,
connection.
applications
MarKup
the
and reduce
are public
many
systems
on the role
database
of the
They also require
not
are
the
only scratch
Current-generation
systems
Companies
two
examples
applications.
17.3
closes
are just
These
provide.
database
function
Web the
Therefore,
e-commerce
used to represent
of structured Consortium
stage
it is
for
and manipulate data
documents, (W3C)3
giving
XML the
not surprising
such
published
that
as
the
real-world
orders
first appeal
XML is rapidly
and
XML
1.0
of being
becoming
the
applications.
³ Visit the W3C page, at www.w3.org, for additional information about the efforts that have been made to develop the XML standard.
The XML metalanguage allows the definition of new tags, such as <ProdPrice>, to describe the data elements used in an XML document. Given that feature, XML is said to be an extensible language. XML is derived from the Standard Generalised Markup Language (SGML). SGML is an international standard for the publication and distribution of highly complex technical documents, such as those used by the aviation industry and the military services, although SGML documents are too complex and unwieldy for the Web. Just like HTML, which was also derived from SGML, an XML document is a text file, but it has a few very important additional characteristics, as follows:

• XML allows the definition of new tags to describe data elements, such as <ProductId>.
• XML is case sensitive: <ProductID> is not the same as <Productid>.
• XML must be well formed; that is, each opening tag has a corresponding closing tag. For example, the product identification would require the format <ProductId>2345-AA</ProductId>.
• XML tags must be properly nested. For example, a properly nested XML tag might look like this: <Product><ProductId>2345-AA</ProductId></Product>.
• You can use the <!-- and --> symbols to enter comments in the XML document.
• The XML and xml prefixes are reserved for XML only.

XML is not a new version or replacement of HTML. XML is concerned with the description and representation of the data, rather than the way the data are displayed. (Data display remains the job of HTML.) XML provides the semantics that facilitate the sharing, exchange and manipulation of structured documents over organisational boundaries. In short, XML and HTML perform complementary, rather than overlapping, functions. Extensible Hypertext Markup Language (XHTML) is the next generation of HTML based on the XML framework. The XHTML specification expands the HTML standard to include XML features. Although more powerful than HTML, XHTML requires strict adherence to syntax requirements.

As an illustration of the use of XML for data exchange, consider a B2B example in which Company A uses XML to exchange product data with Company B over the internet. Figure 17.12 shows the contents of the productlist.xml document.

FIGURE 17.12 Contents of the productlist.xml document
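The well-formedness, nesting and case-sensitivity rules described in this section can be checked mechanically by any XML parser. The following is a minimal sketch in Python, assuming the standard library's xml.etree.ElementTree parser. The helper name is_well_formed and the sample strings are illustrative assumptions and are not taken from the book's figures. Note that parsing checks only that a document is well formed, not that its tags mean anything to the receiving company.

```python
import xml.etree.ElementTree as ET


def is_well_formed(text):
    """Return True when the parser accepts text as well-formed XML."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


# Properly nested: every opening tag has a matching closing tag.
GOOD = "<Product><ProductId>2345-AA</ProductId></Product>"

# Case sensitivity: <ProductId> is not closed by </Productid>.
BAD_CASE = "<Product><ProductId>2345-AA</Productid></Product>"

# Improper nesting: Product is closed before ProductId.
BAD_NESTING = "<Product><ProductId>2345-AA</Product></ProductId>"

# Comments are allowed inside a well-formed document.
WITH_COMMENT = "<Product><!-- note --><ProductId>2345-AA</ProductId></Product>"

for label, text in [("good", GOOD), ("case", BAD_CASE),
                    ("nesting", BAD_NESTING), ("comment", WITH_COMMENT)]:
    print(label, is_well_formed(text))
```

Only the first and last strings are accepted; the other two are rejected for the reasons given in the comments.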
The XML example shown in Figure 17.12 illustrates several important XML features, as follows:

• The first line represents the XML document declaration, and it is mandatory.
• Every XML document has a root element. In the example, the second line declares the ProductList root element.
• The root element contains child elements or sub-elements. In the example, line three declares Product as a child element of ProductList.
• Each element can contain sub-elements. For example, each Product element is composed of several child elements, represented by P_CODE, P_DESCRIPT, P_INDATE, P_ONHAND, P_MIN and P_PRICE.

Once Company B receives the ProductList.xml document, it can process the document, assuming that it understands the tags created by Company A. The meaning of the XML in the example shown in Figure 17.12 is fairly self-evident, but there is no easy way to validate the data or to check whether the data are complete. For example, you could encounter a P_INDATE value of 25/06/2019, but is that value correct? And what happens if Company B expects a Vendor element as well? How can companies share data descriptions about their business data elements? The next section will show how document type definitions and XML schemas are used to address those concerns.

17.3.1 Document Type Definitions (DTD) and XML schemas

Companies that use B2B transactions must have a way to understand and validate one another's tags. One way to accomplish that task is through the use of Document Type Definitions. A Document Type Definition (DTD) is a file with a .dtd extension that describes XML elements; in effect, a DTD file provides the composition of the database's logical model and defines the syntax rules or valid tags for each type of XML document. (The DTD component is similar to having a public data dictionary for business data.) Companies that intend to engage in e-commerce business transactions must develop and share DTDs. Figure 17.13 shows the productlist.dtd document for the productlist.xml document shown earlier in Figure 17.12.

FIGURE 17.13 Contents of the productlist.dtd document

As you examine Figure 17.13, note that the productlist.dtd file provides definitions of the elements in the productlist.xml document. In particular, note that:

• The first line declares the ProductList root element.
• The ProductList root element has one child, the Product element.
• The plus symbol (+) indicates that Product occurs one or more times within ProductList.
• An asterisk (*) would mean that the child element occurs zero or more times.
• A question mark (?) would mean that the child element is optional.
• The second line describes the Product element.
• The question mark (?) after the P_INDATE and P_MIN indicates that they are optional child elements.
• Lines three to eight show that the Product element has six child sub-elements.
• The #PCDATA keyword represents the actual text data.

To be able to use a DTD file to define elements within an XML document, the DTD must be referenced from within that XML document. Figure 17.14 shows the productlistv2.xml document that includes the reference to the productlist.dtd in the second line.

FIGURE 17.14 Contents of the productlistv2.xml document

As you examine Figure 17.14, note that the P_INDATE and P_MIN elements do not appear in all Product definitions because they were declared to be optional elements. The DTD can be referenced by many XML documents of the same type. For example, if Company A routinely exchanges product data with Company B, it will need to create the DTD only once. All subsequent XML documents will refer to the DTD, and Company B will be able to verify the data being received.

To further demonstrate the use of XML and DTD for e-commerce business data exchanges, assume the case of two companies exchanging order data. Figure 17.15 shows the DTD and XML documents for that scenario.

Although the use of DTDs is a great improvement for data sharing over the Web, a DTD provides only descriptive information for understanding how the elements (root, parent, child, mandatory or optional) relate to one another. A DTD provides limited additional semantic value, such as data type support or data validation rules. That information is very important for database administrators who are in charge of large e-commerce databases. To solve the DTD problem, the W3C published an XML schema standard to provide a better way to describe XML data.
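To make the DTD occurrence rules concrete, the sketch below hand-codes the checks that the productlist.dtd description implies: one or more Product elements (+), four mandatory children and two optional ones (?). It is written in Python with the standard library only and is not a DTD validator; Python's xml.etree.ElementTree does not validate against DTDs. The content model, function names and sample values are assumptions inferred from the bullet list above, not a copy of Figure 17.13.

```python
import xml.etree.ElementTree as ET

# Assumed content model for Product, mirroring the DTD description:
# (child element name, is it mandatory?) in declaration order.
PRODUCT_MODEL = [
    ("P_CODE", True),
    ("P_DESCRIPT", True),
    ("P_INDATE", False),   # "?" in the DTD: optional
    ("P_ONHAND", True),
    ("P_MIN", False),      # "?" in the DTD: optional
    ("P_PRICE", True),
]
ALLOWED = [name for name, _ in PRODUCT_MODEL]


def check_product(product):
    """Return a list of problems found in one Product element."""
    names = [child.tag for child in product]
    problems = [f"unexpected element {n}" for n in names if n not in ALLOWED]
    for name, mandatory in PRODUCT_MODEL:
        count = names.count(name)
        if mandatory and count != 1:
            problems.append(f"{name} must occur exactly once")
        elif not mandatory and count > 1:
            problems.append(f"{name} may occur at most once")
    known = [n for n in names if n in ALLOWED]
    if known != sorted(known, key=ALLOWED.index):
        problems.append("child elements are out of order")
    return problems


def check_product_list(xml_text):
    """Check a document against the rules sketched above."""
    root = ET.fromstring(xml_text)
    if root.tag != "ProductList":
        return ["root element must be ProductList"]
    products = root.findall("Product")
    if not products:  # "+" in the DTD: one or more Product elements
        return ["ProductList needs at least one Product"]
    problems = []
    for number, product in enumerate(products, start=1):
        problems += [f"Product {number}: {p}" for p in check_product(product)]
    return problems


# Hypothetical sample data; the second Product omits the optional children.
SAMPLE = """<ProductList>
  <Product>
    <P_CODE>X-1</P_CODE><P_DESCRIPT>Sample item</P_DESCRIPT>
    <P_INDATE>25/06/2019</P_INDATE><P_ONHAND>5</P_ONHAND>
    <P_MIN>2</P_MIN><P_PRICE>9.99</P_PRICE>
  </Product>
  <Product>
    <P_CODE>X-2</P_CODE><P_DESCRIPT>Another item</P_DESCRIPT>
    <P_ONHAND>7</P_ONHAND><P_PRICE>1.50</P_PRICE>
  </Product>
</ProductList>"""

print(check_product_list(SAMPLE))
```

A real exchange would rely on a validating parser and the shared .dtd file rather than hand-written checks. The sketch only shows what the +, ? and ordering rules require of a document.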
FIGURE 17.15 DTD and XML documents for the order data
[Figure: the OrderData.dtd listing, in which the + sign indicates one or more ORD_PRODS elements, and the OrderData.xml listing, which contains two ORD_PRODS elements.]
The XML schema is an advanced data definition language that is used to describe the structure (elements, data types, relationship types, ranges and default values) of XML data documents. One of the main advantages of an XML schema is that it more closely maps to database terminology and features. For example, an XML schema will be able to define common database types such as date, integer or decimal, minimum and maximum values, list of valid values and required elements. Using the XML schema, a company would be able to validate the data for values that may be out of range, incorrect dates, valid values, and so on. For example, a university application must be able to specify that a grade point average (GPA) value be between zero and 4.0, and it must be able to detect an invalid birth date such as 13/16/1987. (There is no 16th month.) Many vendors are adopting this new standard and are supplying tools to translate DTD documents into XML Schema Definition (XSD) documents. It is widely expected that XML schemas will replace DTD as the method to describe XML data.

Unlike a DTD document, which uses a unique syntax, an XML schema definition (XSD) file uses a syntax that resembles an XML document. Figure 17.16 shows the XSD document for the OrderData XML document.
FIGURE 17.16 The XML schema document for the order data
The code shown in Figure 17.16 is a simplified version of the XML schema document. As you can see, the XML schema syntax is similar to the XML document syntax. In addition, the XML schema introduces additional semantic information for the OrderData XML document, such as string, date and decimal data types; required elements; and minimum and maximum cardinalities for the data elements.
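The kinds of checks that an XML schema expresses declaratively (data types, ranges and valid dates) can be sketched procedurally. The example below uses Python's standard library to apply the GPA range and birth date checks mentioned earlier to a small XML fragment. It is not XSD and does not use a schema processor. The Student, GPA and BirthDate element names are hypothetical, and the dd/mm/yyyy date format is an assumption based on the "no 16th month" remark.

```python
from datetime import datetime
import xml.etree.ElementTree as ET


def valid_gpa(text):
    """True when text is a number between 0 and 4.0 inclusive."""
    try:
        return 0.0 <= float(text) <= 4.0
    except ValueError:
        return False


def valid_date(text):
    """True when text is a real calendar date written as dd/mm/yyyy."""
    try:
        datetime.strptime(text, "%d/%m/%Y")
        return True
    except ValueError:
        return False


def check_student(xml_text):
    """Return the list of rule violations found in one Student element."""
    student = ET.fromstring(xml_text)
    errors = []
    if not valid_gpa(student.findtext("GPA", default="")):
        errors.append("GPA out of range")
    if not valid_date(student.findtext("BirthDate", default="")):
        errors.append("invalid birth date")
    return errors


print(check_student(
    "<Student><GPA>3.2</GPA><BirthDate>13/16/1987</BirthDate></Student>"
))
```

With an XSD, the same rules would be stated once in the schema (for example as a decimal type with minimum and maximum values, and a date type) and enforced by the validating parser rather than by application code.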
17.3.2 XML presentation

One of the main benefits of XML is that it separates data structure from its presentation and processing. By separating data and presentation, you are able to present the same data in different ways, which is similar to having views in SQL. The Extensible Style Language (XSL) specification provides the mechanism to display XML data. XSL is used to define the rules by which XML data are formatted and displayed. The XSL specification is divided into two parts: Extensible Style Language Transformations (XSLT) and XSL style sheets.

Extensible Style Language Transformations (XSLT) describe the general mechanism that is used to extract and process data from one XML document and enable its transformation within another document. Using XSLT, you can extract data from an XML document and convert it into a text file, an HTML Web page or a Web page that is formatted for a mobile device. What the user sees in those cases is actually a view (or HTML representation) of the actual XML data. XSLT can also be used to extract certain elements from an XML document, such as the product codes and product prices, to create a product catalogue. XSLT can even be used to transform an XML document into another XML document.

XSL style sheets define the presentation rules applied to XML elements, something like presentation templates. The XSL style sheet describes the formatting options to apply to XML elements when they are displayed on a browser, smartphone, tablet screen and so on.

FIGURE 17.17 Framework for XML transformations
[Figure: an XML document passes through XSL transformations (extract, convert) and XSL style sheets (apply formatting rules to XML elements) to produce HTML pages or a new XML document. The process can render different web pages for different purposes, such as one page for a web browser and another for a mobile device. XSLT can also be used to transform one XML document into another XML document.]
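The extract-and-convert idea behind XSLT can be illustrated without an XSLT processor. The sketch below is plain Python using xml.etree.ElementTree, not XSLT itself, because the Python standard library does not include an XSLT engine. It derives two "views" from one source document: a new XML catalogue holding only product codes and prices, and an HTML table. The Catalogue and Item element names and the sample product values are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical source document using the element names from productlist.xml.
SOURCE = """<ProductList>
  <Product><P_CODE>A1</P_CODE><P_DESCRIPT>Sample item one</P_DESCRIPT><P_PRICE>10.50</P_PRICE></Product>
  <Product><P_CODE>B2</P_CODE><P_DESCRIPT>Sample item two</P_DESCRIPT><P_PRICE>4.25</P_PRICE></Product>
</ProductList>"""


def to_catalogue(xml_text):
    """Build a new XML document that keeps only each product's code and price."""
    catalogue = ET.Element("Catalogue")
    for product in ET.fromstring(xml_text).findall("Product"):
        item = ET.SubElement(catalogue, "Item", code=product.findtext("P_CODE"))
        item.text = product.findtext("P_PRICE")
    return ET.tostring(catalogue, encoding="unicode")


def to_html(xml_text):
    """Render the same data as an HTML table, one row per product."""
    table = ET.Element("table")
    for product in ET.fromstring(xml_text).findall("Product"):
        row = ET.SubElement(table, "tr")
        for tag in ("P_CODE", "P_DESCRIPT", "P_PRICE"):
            ET.SubElement(row, "td").text = product.findtext(tag)
    return ET.tostring(table, encoding="unicode", method="html")


print(to_catalogue(SOURCE))
print(to_html(SOURCE))
```

The source data are never changed; each function produces a different presentation of them, which is the separation of data and presentation described above. In practice an XSLT style sheet would express the same mapping declaratively.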
Figure 17.17 illustrates the framework used by the different components to translate XML documents into viewable Web pages, an XML document or some other document.

To display the XML document with Windows Internet Explorer (IE) 5.0 or later, enter the URL of the XML document in the browser's address bar. Figure 17.18 is based on the productlist.xml document created earlier. As you examine Figure 17.18, note that IE shows the XML data in a colour-coded, collapsible, tree-like structure. (Actually, this is the IE default style sheet that is used to render XML documents.)
FIGURE 17.18 Displaying XML documents

Internet Explorer also provides data binding of XML data to HTML documents. Figure 17.19 shows the HTML code that is used to bind an XML document to an HTML table. The example uses the <xml> tag to include the XML data in the HTML document, to later bind it to the HTML table. This example works only in IE 5.0 or later.
FIGURE 17.19 XML data binding
17.3.3 SQL/XML and XQuery

As you have just learnt, XML is used to transfer data from a Web-based application to the database and back again. SQL/XML and XQuery are two standard querying languages that are used to retrieve data from a relational database in the XML format. XQuery 1.0 is the W3C language designed for querying XML data, and it is relatively similar to SQL, except it was designed to query semi-structured XML data.

SQL/XML is an extension of SQL that is part of the ANSI/ISO SQL 2011 standard. It is described as an extension because only small additions have been made to the standard SQL language. These additions include:

• XML publishing functions that can be incorporated directly into the SQL query. These functions include:
  - xmlelement( ), which creates an XML element with a specific name
  - xmlattributes( ), which creates a set of XML attributes from the columns within the specific database table(s)
  - xmlroot( ), which creates the root element of an XML document
  - xmlcomment( ), which allows an XML comment to be created
  - xmlpi( ), which allows the creation of an XML processing instruction
  - xmlparse( ), which parses a string as XML and returns the resulting XML value
  - xmlforest( ), which creates a list of XML elements from the columns within the specific database table(s)
  - xmlconcat( ), which combines a list of XML values into one that contains an XML forest
  - xmlagg( ), which aggregates a number of single XML values together to create a single XML forest.
• An XML datatype
• A set of rules to map relational data to XML.

Let's now look at an example. Consider the following SQL query:

    SELECT V.VEND_CODE, V.VEND_CONTACT AS VENDOR_NAME, V.VEND_AREACODE
    FROM VENDOR V;

Figure 17.20 shows the contents of the VENDOR table and the results of this query.

FIGURE 17.20 The contents of the VENDOR and PRODUCT tables and results of the query
Database: Ch17_SaleCo
Table name: VENDOR

VEND_CODE  VEND_CONTACT        VEND_AREACODE  VEND_PHONE
230        Shelly K. Smithson  7325           555-1234
231        James Johnson       0181           123-4536
232        Khaya Sibiya        7325           224-2134
233        Lindiwe Molefe      0113           342-6567
234        Nijan Pillay        0181           123-3324
235        Henry Ortozo        0181           899-3425

Table name: PRODUCT

PROD_CODE  PROD_DESCRIPT                   PROD_PRICE  PROD_ON_HAND  VEND_CODE
001278-AB  Claw hammer                     10.23       23            232
123-21UUY  Houselite chain saw, 16 cm bar  150.09      4             235
QER-34256  Sledge hammer, 16 kg head       14.72       6             231
SRE-657UG  Rat-tail file                   2.36        15            232
ZZX/3245Q  Steel tape, 12 m length         5.36        8             235
Data returned by query

    SELECT V.VEND_CODE, V.VEND_CONTACT AS VENDOR_NAME, V.VEND_AREACODE
    FROM VENDOR V;

VEND_CODE  VENDOR_NAME         VEND_AREACODE
230        Shelly K. Smithson  7325
231        James Johnson       0181
232        Khaya Sibiya        7325
233        Lindiwe Molefe      0113
234        Nijan Pillay        0181
235        Henry Ortozo        0181

In order to display these results as XML, the xmlelement( ) function can be incorporated into the SQL statement like this:

    SELECT XMLELEMENT(NAME "VENDOR",
             XMLELEMENT(NAME "VEND_CODE", V.VEND_CODE),
             XMLELEMENT(NAME "VENDOR_NAME", V.VEND_CONTACT),
             XMLELEMENT(NAME "VEND_AREACODE", V.VEND_AREACODE))
    FROM VENDOR V;

Each row returned by the query corresponds to one VENDOR element, which is represented as:

    <VENDOR>
      <VEND_CODE>230</VEND_CODE>
      <VENDOR_NAME>Shelly K. Smithson</VENDOR_NAME>
      <VEND_AREACODE>7325</VEND_AREACODE>
    </VENDOR>

As you will have seen, the query we have just written is quite complicated. We could rewrite this query using the SQL/XML publishing function xmlforest( ), which creates a list of XML elements from the columns within the VENDOR table. The query would then look like this:

    SELECT XMLELEMENT(NAME "VENDOR",
             XMLFOREST(V.VEND_CODE,
                       V.VEND_CONTACT AS VENDOR_NAME,
                       V.VEND_AREACODE))
    FROM VENDOR V;

Producing XML from SQL queries that contain relational joins requires the use of more XML publishing functions if we want to display the results in a way the user will understand. Suppose we wanted to list all the products that were associated with each vendor. (The contents of the PRODUCT table can be found in Figure 17.20.) This could be achieved using the following SQL query:

    SELECT V.VEND_CODE, V.VEND_CONTACT AS VENDOR_NAME,
           P.PROD_CODE, P.PROD_DESCRIPT
    FROM VENDOR V, PRODUCT P
    WHERE V.VEND_CODE = P.VEND_CODE;
To represent the results of this query as XML, we want to show the vendor details once and then a list of the products that the vendor supplies. In SQL/XML this can be achieved using the publishing function xmlattributes( ) in a subquery that retrieves the products associated with each vendor. Subqueries in SQL/XML are only designed to return one row, so if multiple rows are to be returned they must be aggregated into one single value using the function xmlagg( ). The following SQL/XML query makes use of these publishing functions to display all products associated with each vendor:

    SELECT XMLELEMENT(NAME VENDOR,
             XMLATTRIBUTES(V.VEND_CODE AS VEND_CODE),
             XMLFOREST(V.VEND_CONTACT AS VENDOR_NAME,
                       V.VEND_AREACODE AS AREA),
             XMLELEMENT(NAME PRODUCT,
               (SELECT XMLAGG(XMLELEMENT(NAME PRODUCT,
                         XMLATTRIBUTES(P.PROD_CODE AS PROD_CODE),
                         XMLFOREST(P.PROD_DESCRIPT AS DESCRIPTION)))
                FROM PRODUCT P
                WHERE P.VEND_CODE = V.VEND_CODE)))
           AS "PRODUCTS RELATED TO VENDORS"
    FROM VENDOR V;
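The shape of the XML that the last SQL/XML query is meant to publish (vendor details once, followed by that vendor's products) can be reproduced outside the DBMS. The sketch below uses Python's sqlite3 and xml.etree.ElementTree modules. SQLite does not provide the SQL/XML publishing functions, so the element building is done in Python. The in-memory tables hold a few sample rows modelled on Figure 17.20, and the VENDORS wrapper element and the function names are assumptions added for illustration.

```python
import sqlite3
import xml.etree.ElementTree as ET


def build_db():
    """Create an in-memory database with a few VENDOR and PRODUCT rows."""
    con = sqlite3.connect(":memory:")
    con.executescript("""
        CREATE TABLE VENDOR (VEND_CODE INTEGER PRIMARY KEY,
                             VEND_CONTACT TEXT, VEND_AREACODE TEXT);
        CREATE TABLE PRODUCT (PROD_CODE TEXT PRIMARY KEY,
                              PROD_DESCRIPT TEXT, VEND_CODE INTEGER);
        INSERT INTO VENDOR VALUES (231, 'James Johnson', '0181');
        INSERT INTO VENDOR VALUES (232, 'Khaya Sibiya', '7325');
        INSERT INTO PRODUCT VALUES ('QER-34256', 'Sledge hammer, 16 kg head', 231);
        INSERT INTO PRODUCT VALUES ('001278-AB', 'Claw hammer', 232);
        INSERT INTO PRODUCT VALUES ('SRE-657UG', 'Rat-tail file', 232);
    """)
    return con


def vendors_as_xml(con):
    """Build one VENDOR element per vendor, with nested PRODUCT elements."""
    root = ET.Element("VENDORS")  # wrapper element, an assumption of this sketch
    vendors = con.execute(
        "SELECT VEND_CODE, VEND_CONTACT, VEND_AREACODE "
        "FROM VENDOR ORDER BY VEND_CODE").fetchall()
    for code, contact, area in vendors:
        # Plays the role of XMLELEMENT plus XMLATTRIBUTES.
        vendor = ET.SubElement(root, "VENDOR", VEND_CODE=str(code))
        # Plays the role of XMLFOREST.
        ET.SubElement(vendor, "VENDOR_NAME").text = contact
        ET.SubElement(vendor, "AREA").text = area
        # Plays the role of the correlated subquery with XMLAGG.
        products = con.execute(
            "SELECT PROD_CODE, PROD_DESCRIPT FROM PRODUCT "
            "WHERE VEND_CODE = ? ORDER BY PROD_CODE", (code,)).fetchall()
        for prod_code, descript in products:
            product = ET.SubElement(vendor, "PRODUCT", PROD_CODE=prod_code)
            ET.SubElement(product, "DESCRIPTION").text = descript
    return root


print(ET.tostring(vendors_as_xml(build_db()), encoding="unicode"))
```

A DBMS that supports SQL/XML does this work inside the query itself, which avoids the extra round trip per vendor that the loop above makes.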
An alternative approach to SQL/XML is XQuery. XQuery is a language that can query, store, process and exchange structured or semi-structured XML data. XQuery is used in conjunction with XPath, which is used to navigate through elements and attributes in an XML document. XPath is a major component of W3C's XSLT standard. XQuery includes over 100 built-in functions, including functions for manipulating strings and comparing dates. The following is an example of an XQuery that retrieves a list of products which has been supplied by each vendor:

    for $V in $VENDOR/ROW
    return
    <VENDOR VENDOR_CODE="{$V/VENDOR_CODE}">
      <VEND_NAME>{ string($V/VENDOR_CONTACT) }</VEND_NAME>
      {
        for $P in $PRODUCT/ROW
        where $P/VENDOR_CODE = $V/VENDOR_CODE
        return <PRODUCT PROD_CODE="{$P/PROD_CODE}"
                        PROD_DESCRIPT="{$P/PROD_DESCRIPT}"/>
      }
    </VENDOR>

The XQuery query performs exactly the same query as the last SQL/XML query that we looked at, but as you can see it requires more in-depth knowledge of XML programming. One of the main strengths of XQuery is that it can query data stored inside the database or directly from an XML source. Let's consider a simpler example by using the DVDStore.xml document in Figure 17.21.

FIGURE 17.21 DVDstore.xml document

    <?xml version="1.0" encoding="ISO-8859-1"?>
    <!-- Created by KAC -->
    <dvdstore>
      <dvd category="Children">
        <title>Toy Story</title>
        <year>2005</year>
        <price>9.00</price>
      </dvd>
      <dvd category="Action">
        <title>Indiana Jones</title>
        <year>2001</year>
        <price>15.00</price>
      </dvd>
    </dvdstore>
In order to extract data from XML documents, the doc( ) function is used to open the dvdstore.xml file as shown below:

    doc("dvdstore.xml")

In order to extract data elements, path expressions from XQuery are used. The following example illustrates how the title element would be extracted from the dvdstore.xml document:

    doc("dvdstore.xml")/dvdstore/dvd/title

Executing this function would display the following:

    <title>Toy Story</title>
    <title>Indiana Jones</title>

Writing the path as /dvdstore/dvd/title selects the title child elements of every dvd element under the top-level dvdstore element. If we wanted to extract elements based on a specific condition, for example to select the details of DVDs costing less than twelve rand, we would write:

    doc("dvdstore.xml")/dvdstore/dvd[price<12]
This function would return the following:

    <dvd category="Children">
      <title>Toy Story</title>
      <year>2005</year>
      <price>9.00</price>
    </dvd>

FLWOR expressions are a fundamental part of XQuery. FLWOR is an acronym for For .. Let .. Where .. Order By .. Return. Each part of a FLWOR expression is known as a clause, and only the Return clause is mandatory. An example of a FLWOR expression used for retrieving the title of DVDs costing less than twelve rand is shown below:

    for $y in doc("dvdstore.xml")/dvdstore/dvd
    where $y/price<12
    order by $y/title
    return $y/title

In this expression, the for clause selects all dvd elements under the dvdstore parent element into a variable called $y. Then, the where clause selects only the dvd elements with a price less than twelve rand. The order by clause orders the results alphabetically by title, and the return clause states what should be returned, in this case the title elements. This expression returns the same result as the previous function, except that only the title is returned and the results are ordered. For comparison purposes, the FLWOR expression above can be written as the following SQL query:

    SELECT d.title
    FROM dvd d
    WHERE d.price < 12;

An in-depth look at XQuery is beyond the scope of this book, but additional reading can be found in the further reading section at the end of this chapter.
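For readers without an XQuery processor to hand, the path expression, the price predicate and the FLWOR expression above can be approximated in Python with xml.etree.ElementTree. This is an analogy rather than XQuery: ElementTree supports only a small subset of XPath and has no numeric comparison predicates, so the price test and the ordering are written in ordinary Python. The function names are illustrative assumptions.

```python
import xml.etree.ElementTree as ET

DVDSTORE = """<dvdstore>
  <dvd category="Children">
    <title>Toy Story</title><year>2005</year><price>9.00</price>
  </dvd>
  <dvd category="Action">
    <title>Indiana Jones</title><year>2001</year><price>15.00</price>
  </dvd>
</dvdstore>"""

# Plays the role of doc("dvdstore.xml")/dvdstore.
store = ET.fromstring(DVDSTORE)


def all_titles(store):
    """Analogue of the path expression /dvdstore/dvd/title."""
    return [title.text for title in store.findall("dvd/title")]


def dvds_cheaper_than(store, limit):
    """Analogue of the predicate /dvdstore/dvd[price<limit]."""
    return [dvd for dvd in store.findall("dvd")
            if float(dvd.findtext("price")) < limit]


def flwor_titles(store, limit):
    """Analogue of the FLWOR expression: for, where, order by, return."""
    dvds = store.findall("dvd")                              # for $y in .../dvd
    matches = [d for d in dvds
               if float(d.findtext("price")) < limit]        # where $y/price < limit
    matches.sort(key=lambda d: d.findtext("title"))          # order by $y/title
    return [d.findtext("title") for d in matches]            # return $y/title


print(all_titles(store))
print(flwor_titles(store, 12))
```

Raising the limit above 15 brings both DVDs into the result, which makes the effect of the order by step visible: the titles come back alphabetically rather than in document order.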
17.3.4 XML applications

Now that you have some idea of what XML is, the next question is, how can you use it? What kinds of applications lend themselves particularly well to XML? This section will list some of the uses of XML. Keep in mind that the future use of XML is limited only by the imagination and creativity of developers, designers and programmers.

• B2B exchanges. As noted earlier, XML enables the exchange of B2B data, providing the standard for all organisations that need to exchange data with partners, competitors, the government or customers. In particular, XML is positioned to replace Electronic Data Interchange (EDI) as the standard for the automation of the supply chain because it is less expensive and more flexible.
• Legacy systems integration. XML provides the glue that integrates legacy system data with modern e-commerce Web systems. Web and XML technologies could be used to inject some new life in old but trusted legacy applications. Another example is the use of XML to import transaction data from multiple operational databases to a data warehouse database.
• Web page development. XML provides several features that make it a good fit for certain Web development scenarios. For example, Web portals with large amounts of personalised data can
use XML to different
pull data from
presentation
Database
support.
systems
(Web,
types
mobile
tree
of these structure
query
inside
industries.
accounting
exchanging
databases.4
or form. with
The
Most
store typically on.
contents
database
their
with external
creation
of
such
activities
HR-XML
data
model
XML format.
native format.
would
The
a hierarchical-like
also
require
that
data.
for the
from
standard
new
or generate
native
metadictionaries,
(METS)
market
simple
for
or vocabularies,
human
the
resources
Library
patient
reporting
The
for
entire
industry,
of Congress,
data exchange
language
XML to
and
XML
(XBRL)
structure
sections,
paragraphs, Oracle,
the
the clinical in electronic
standard
for
DB XML
by
DB2
Oracle
its
some
databases
would be
for
the
well suited
structure:
footnotes,
MS SQL
shape
provide
database
charts,
and
data in object
databases
database
dictate
figures,
IBM
to
XML
an XML
would
manage
software
servers.
For example,
books
are
support
middleware engines
relationships.
Berkeley
the
XML data in its
on XML
business
database
databases
is the
enable
XML format
and apply
devices.
would even be able to store
create
include
on the from
of chapters, of XML
to
mobile
information.
XML
of a book.
consists
Examples
XML
to full
of data in complex
the
used
extensible
databases range
using
store
you
data exchange
and the
approaches
data,
queries
standard
and financial
XML interfaces
storage
also
metadictionaries
(CLAIM)
business
data in
Of course,
support
and transmission
systems,
and thus
the
structure. to
XML is
information
record
storing
and stocks)
well as
will be able to integrate
so on)
a XML data type to
a relational
of
encoding
still
weather as
or export
are far-reaching
be extended
Examples
metadata
medical
while
metadictionaries.
and
can import
capabilities
language
Database
XML
queries
as news, computers
XML exchanges
systems,
databases
(such
on desktop
supports
a DBMS can support
implications
the
SQL
sources
pages
data, legacy
These
from
Alternatively,
external
to format
A DBMS that
of systems.
documents
multiple
rules
endnotes,
Server.
to
a book and
An example
so
of a full
(www.oracle.com/database/berkeley-db/xml.html).

XML services. Many companies are already working on the development of a new breed of services based on XML and Web technologies. These services promise to break down the interoperability barriers among systems and companies alike. XML provides the infrastructure that facilitates heterogeneous systems to work together across the desk, the street and the world. Services would use XML and other internet technologies to publish their interfaces. Other services, wanting to interact with existing services, would locate them and learn their vocabulary (service request and replies) to establish a conversation.

One area in which internet, Web, virtualisation and XML technologies work together in innovative ways to leverage IT services is cloud computing.

4 For a comprehensive analysis of XML database products, see XML Database Products by Ronald Bourret at www.rpbourret.com.

17.4 Cloud Computing Services

You have almost certainly heard about the cloud from the thousands of publications and TV ads that have used the term over the years, although it has represented different concepts. In the late 1980s, the term cloud was used by telecommunication companies to describe their data networks. In the late 1990s, during the peak of internet growth, the term cloud depicted the internet itself. Then, in 2006, Google and Amazon began using the term cloud computing to describe a new set of innovative Web-based services. Google, Yahoo, eBay and Amazon were the early adopters of this new computing paradigm. But what exactly is cloud computing?

According to the National Institute of Standards and Technology (NIST),5 cloud computing is a computing model for enabling ubiquitous, convenient, on-demand network access to a shared pool of configurable computer resources (e.g., networks, servers, storage, applications and services) that can be rapidly provisioned and released with minimal management effort or service provider interaction.

The term cloud services is used in this book to refer to the services provided by cloud computing. Cloud services allow any organisation to quickly and economically add information technology services such as applications, storage, servers, processing power, databases and infrastructure to its IT portfolio. Figure 17.22 shows a representation of cloud computing services on the internet.

FIGURE 17.22 Cloud services
[Figure: cloud service providers delivering services over the internet. Labels in the figure: Email, Desktop, Storage, Server, RDBMS, NoSQL, Content Delivery, Simple Messaging, Simple Storage, Simple Queuing, Relational DB, NoSQL DB, Elastic Compute.]
SOURCE: Course Technology/Cengage Learning
5 Recommendations of the National Institute of Standards and Technology, Peter Mell and Timothy Grance, Special Publication 800-145, September 2011.

Cloud computing allows highly specialised, IT-savvy organisations such as Amazon, Google and Microsoft to build high-performance, fault-tolerant, flexible and scalable IT services. These services include applications, storage, servers, processing power, databases and email, which are delivered via the internet to individuals and organisations using a pay-as-you-go price model.
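The pay-as-you-go price model can be made concrete with a little arithmetic. The sketch below is only an illustration: the unit prices, the usage figures and the names `MonthlyUsage`, `pay_as_you_go_total` and `peak_sized_total` are all assumptions made for this example, not rates or APIs of any real provider. It compares paying for each month's actual usage with paying every month for capacity sized to the busiest month.

```python
from dataclasses import dataclass

# Hypothetical unit prices in cents; real providers publish their own rates.
STORAGE_CENTS_PER_GB = 2
COMPUTE_CENTS_PER_HOUR = 10


@dataclass(frozen=True)
class MonthlyUsage:
    storage_gb: int
    compute_hours: int


def monthly_cost(usage: MonthlyUsage) -> int:
    """Cost in cents of one month, charged only for what was used."""
    return (usage.storage_gb * STORAGE_CENTS_PER_GB
            + usage.compute_hours * COMPUTE_CENTS_PER_HOUR)


def pay_as_you_go_total(months: list[MonthlyUsage]) -> int:
    """Total cost in cents when every month is billed on actual usage."""
    return sum(monthly_cost(m) for m in months)


def peak_sized_total(months: list[MonthlyUsage]) -> int:
    """Total cost in cents if every month is billed at the busiest month's level."""
    return max(monthly_cost(m) for m in months) * len(months)


# Eleven quiet months (one server running all month) and one busy month
# (four servers running all month). These figures are made up.
quiet = MonthlyUsage(storage_gb=500, compute_hours=720)
busy = MonthlyUsage(storage_gb=500, compute_hours=2880)
year = [quiet] * 11 + [busy]

print(pay_as_you_go_total(year) / 100)  # yearly cost in currency units
print(peak_sized_total(year) / 100)
```

The peak-sized figure is a deliberately crude stand-in for fixed capacity. An in-house installation also carries purchase, staffing and maintenance costs that this sketch does not model.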
For example,
imagine
that
officer
A few
this
to the
email
systems
infrastructure
from
maintenance.
However,
for
and
Business
fraction and
of the
Microsoft
cost.
maintaining
recovery. If
or
need
more
processing more you
power
scaled
previous
scale
to
of
services
and
to
your
and
is important
Cloud
potential
to turn
enable
a revolution
The technologies technologies
include
However,
cloud
of
minutes. Instead
can employ relational
basic
IT
that
The
in
a
cloud.
Christmas
season,
you
scale
can
subsides,
beauty
you
of cloud
of
for a
managing
tolerance
and
matter
of
If
need
you
minutes. more
you simply down
can
add
as easily
go
services
because
it has the
and technological
barriers
processes
services
could
be done
fault
G Suite
solution
as
back to
is that
your
you
can
intervention.
technologies
business
into
with
minimal
commodity
change
potential
effort
and
computing
cost.
such
way that
to
become
so organisations
services,
not only the
Carr put it so vividly: Cloud
make
the
Web,
messaging,
itself
is still in the
this,
cloud
more
computing
and
(AWS)
or
of spending
Microsoft
have
virtualisation,
or
large
work
a can
In fact,
cloud
as electricity,
companies
is for IT
in
organisations
NoSQL) Azure
amounts
Microsoft
around
remote
for
and
have
gas
do business,
what the invention
Azure
and
buying Figure
Amazon
years
protocols,
into
hardware
cloud
ready
the
and
before it to
you
can log
for
use in
these XML.
can be
services
and software,
depicts
now;
VPN
Currently,
database
17.23
a few
mature further
organisations. a relational
of cash
for
desktop
are tapping
their
model for their IT services.
instance
been
early years and needs to
more
(relational
a pay-per-use
database
use.
financial
that
services
Web Services
you
configuration,
daily chores
storage
or storage
the
electricity.6
Despite
database
in their
As Nicholas
computing
adopted.
advanced
database
busy
add
building
Googles
email
security,
now your
use
more reliable
more importantly,
processing what
to
during the
Even
eliminates
have the
was for
orders
servers.
for
computing
can unit
setup,
can
worry about the
to implement storage
you
patches,
wants to
have implied
software,
era,
and
OS updates,
an administrators
technologies
grid
as
another
only for
database
power
Amazon
pay
hardware,
you do not have to
organisation
would
computing
get a scalable, flexible
additional
without
but the IT business itself.
widely
for
cloud
or years
add
cloud
need
usage
water, and to
of the
just
of a non-profit proposition
up, including
such
months
you
automatically,
changer.
leverage
take
your
Cloud computing game
ground
handle last-minute
Once
levels
down
to
ago,
in todays
IT infrastructure,
units
up.
the
Office 365 and
space,
processing
years
The best part is that
the
What used
you
portfolio.
chief technology
services
operation
IT
the
email
secure
a
in to matter
organisations
cost of provisioning
a
RDS, respectively.
6 Nicholas Carr, The Big Switch: Rewiring the World, from Edison to Google, W.W. Norton & Co., 2009.
Private cloud. servicing
its
organisations
needs.
to
managed
add
IT
cloud.
a common
education.
Private
agility
by internal
Community share
This type ofinternal
own
such
These
computing
The prevalent
share
public
access
provision,
has access
Shared
infrastructure.
possible
by
Lower lower
than
range
from
needed
cloud
Flexible fault
area.
the
military
IT staff
cloud
be
or higher
or an external
services
share
third
party.
a common
set
of
next section.
The characteristics
Amazon,
All cloud
Google,
services
listed
in this
Salesforce,
use internet
provide.
infrastructure
technologies.
pricing.
pricing
and
is
SAP
and
section
and
are
Microsoft.
Web technologies
The basic requirement
Cloud
shared
by
multiple
services
effectively
by the
consumers
managed
The initial
costs
of using
IT infrastructures.
Because from
and scalable
tolerant
as
which is locally
on-premise
benefit
resource
in the
services they
service
virtualisation
variable
to fixed
most
principles.
such
technologies.
35 per cent to 55 per cent
in this
could
is that the
users.
Sharing
provide
an
is
made
organisation
organisation
as if it
of the infrastructure.
building
consumers
infrastructure
group of organisations that
government,
by internal
uses,
of guiding
manage the
The
and
cloud
to the internet.
user
costs
The
dispersed
are:
and
Web and
only
federal
are explored
providers
with a virtual IT infrastructure, were the
services.
managed
organisation
a set
cloud
via internet
deliver
device
IT
geographically
of Cloud services
characteristics
Ubiquitous
to
by large,
party.
of the
can be
characteristics
services
by prominent
third
as agencies
an
17.4.2 Characteristics Cloud
used
to internal
or an external
implementation
core characteristics.
often
This type of cloud is built by and for a specific
trade,
of the
are
and flexibility
staff
The cloud infrastructure
Regardless
shared
cloud is built by an organisation for the sole purpose of
clouds
the
According
depending
Web services
usage
and flexible
pricing
based
on
levels
services.
Cloud services
very reliable.
The
services
services
to
some
on company
lower
minimum
cloud
tend
studies,7
size, although
is
metered
options.
These
to
per
the
savings
could
more research
volume
options
be significantly
and time
range
is
utilisation,
from
pay-as-you-go
of service.
are built on an infrastructure can
scale
up and
down
that is highly
on
demand
scalable,
according
to
demands.
Dynamic servers,
provisioning. processing
and then
adding
The consumer power,
can quickly
storage
and removing
and email,
services
provision
any needed
by accessing
on demand.
This
the
resources,
including
Web management
process
can
also
be
dashboard
automated
via
other
services.
Service
orientation.
services
that
and
be
can
Managed
IT staff.
Cloud computing
use well-known delivered
anytime
operations.
Cloud
The system
7
Copyright Editorial
review
2020 has
The
Compelling (TCO)
Aggarwal,
Partner;
Cengage deemed
TCO
ownership
Learning. that
any
All suppressed
Rights
Case
Laurie
does
May not
is
not materially
be
minimises
managed
Cloud
copied, affect
cloud
Partner;
scanned, the
overall
or
duplicated, learning
in
in and
whole
need
cloud
maintenance
SMB
and
on-premise
Hurwitz
experience.
the
by the
and
Computing
comparing McCabe,
consumers hide the
with specific,
complexity
from
well-defined the end user,
anywhere.
management
for
on providing
These interfaces
computing
perspective
Reserved. content
and
infrastructure
IT staff is free from routine
17
focuses
interfaces.
Cengage
part.
Due Learning
to
extensive
provider. tasks
Mid-Market
electronic reserves
and
expensive
The consumer
so they
in-house
organisations
can focus
Enterprises:
business
& Associates,
or in
for
on other tasks
A 4-year
application
total
cost
development,
of
Sanjeev
2009.
within the organisation. Managed operations apply to organisations that use public clouds and that outsource cloud management to an external third party.

The preceding list is not exhaustive, but it is a starting point for understanding most cloud computing offerings. Although most companies move to cloud services because of cost savings, some companies move to them because they are the best way to gain access to specific IT resources that would otherwise be unavailable. Not all cloud services are the same; in fact, there are several different types, as explained in the next section.

17.4.3 Types of Cloud Services

Cloud services come in different shapes and forms; no single type of service works for all consumers. In fact, cloud services often follow an à la carte model; consumers can choose multiple service options according to their individual needs. These services can build on top of one another to provide sophisticated solutions. Based on the types of services provided, cloud services can be classified by the following categories:

Software as a Service (SaaS). The cloud service provider offers turnkey applications that run in the cloud. Consumers can run the provider's applications internally in their organisations via the Web or any mobile device. The consumer can customise certain aspects of the application but cannot make changes to the application itself. The application is actually shared among users from multiple organisations. Examples of SaaS include Microsoft Office 365, Google Docs, Intuit's TurboTax Online and SCALA digital signage.

Platform as a Service (PaaS). The cloud service provider offers the capability to build and deploy consumer-created applications using the provider's cloud infrastructure. In this scenario, the consumer can build, deploy and manage applications using the provider's cloud tools, languages and interfaces. However, the consumer does not manage the underlying cloud infrastructure. Examples of PaaS include the Microsoft Azure platform with .NET and the Java development environment, and Google App Engine with Python or Java.

Infrastructure as a Service (IaaS). In this case, the cloud service provider offers consumers the ability to provision their own resources on demand; these resources include storage, servers, databases, processing units and even a complete virtualised desktop. The consumer can then add or remove the resources as needed. For example, a consumer can use AWS and provision a server computer that runs Linux and Apache Web server using 64 GB of RAM and 1 TB of storage.

Figure 17.24 illustrates a sample of the different types of cloud services; these services can be accessed from any computing device.

Cloud computing services have evolved in their sophistication and flexibility. The merging of new technologies has enabled the creation of new options such as desktop as a service, which effectively creates a virtual computer on the cloud that can be accessed from any device over the internet. For example, you can use a service such as VirtualBox (www.virtualbox.org) and get a Windows 10 desktop running over the Web for your personal use in a matter of minutes. Moreover, you can access your virtual desktop via the Web browser or using any Remote Desktop Protocol (RDP) application.

FIGURE 17.24 Types of cloud services
[Figure: servers, laptops, tablets, desktops and smart phones reach three layers of cloud services over the internet. Software as a Service: Microsoft Office 365, Google Docs, Google Email, Salesforce CRM Online, SAP Business ByDesign. Platform as a Service: Amazon Web Services, MS Azure Platform, Google App Engine, Amazon Relational Data Service, MS SQL Service, Amazon Simple DB. Infrastructure as a Service: Amazon Web Services, Amazon Elastic Computing Cloud 2 (EC2), Amazon Simple Storage Service (S3), Amazon Elastic Load Balancing Service, Amazon Elastic MapReduce Service.]
SOURCE: Course Technology/Cengage Learning

17.4.4 Cloud Services: Advantages and Disadvantages

Cloud computing has grown remarkably in the past few years. Companies of all sizes are enjoying the advantages of cloud computing, but its widespread adoption is still limited by several factors.8 Table 17.4 summarises the main advantages and disadvantages of cloud computing.

8 Cloud Computing Global Market Outlook 2019 | Global Opportunities, Challenges, Forecast and Strategies To 2028, Global Banking and Finance Review, 2019, Available: www.globalbankingandfinance.com/category/news/cloud-computing-market-outlook-2019-global-opportunities-challenges-forecast-and-strategies-to-2028/
TABLE 17.4 Advantages and disadvantages of cloud computing

Advantages

Low initial costs of entry. Cloud computing has lower cost of entry when compared with the alternative of building in-house.
Scalability/elasticity. It is easy to add and remove resources on demand.
Support for mobile computing. Cloud computing providers support multiple types of mobile computing devices.
Ubiquitous access. Consumers can access the cloud resources from anywhere at any time, as long as they have internet access.
High reliability and performance. Cloud providers build solid infrastructures that otherwise are difficult for the average organisation to leverage.
Fast provisioning. Resources can be provisioned on demand in a matter of minutes with minimal effort.
Managed infrastructure. Most cloud implementations are managed by dedicated internal or external staff. This allows the organisation's IT staff to focus on other areas.

Disadvantages

Issues of security, privacy and compliance. Trusting sensitive company data to external entities is difficult for most data-cautious organisations.
Hidden costs of implementation and operation. It is hard to estimate bandwidth and data migration costs.
Data migration is a difficult and lengthy process. Migrating large amounts of data to and from the cloud infrastructure can be difficult and time-consuming.
Complex licensing schemes. Organisations that implement cloud services are faced with complex licensing schemes and complicated service-level agreements.
Loss of ownership and control. Companies that use cloud services are no longer in complete control of their data. What is the responsibility of the cloud provider if data are breached? Can the vendor use your data without your consent?
Organisation culture. End users tend to be resistant to change. Do the savings justify being dependent on a single provider? Will the cloud provider be around in ten years?
Difficult integration with internal IT system. Configuring the cloud services to integrate transparently with internal authentication and other internal services could be a daunting task.
As the table shows, the top perceived benefit of cloud computing is the lower cost of entry. At the same time, the chief concern of cloud computing is data security and privacy, particularly in companies that deal with sensitive data and are subject to high levels of regulation and compliance.9 This concern leads to the perception that cloud services are mainly implemented in small to medium-sized companies where the risk of service loss is minimal. In fact, some companies that are subject to strict data security regulations tend to favour private clouds rather than public ones.10

One of the biggest growth segments in cloud services is mobile computing. For example, Netflix,11 the video-on-demand trailblazer, announced in 2011 that it had moved significant parts of its IT infrastructure to AWS. Netflix decided to move to the cloud because of the challenges of building IT infrastructure fast enough to keep up with its relentless growth.

9 Are security issues delaying adoption of cloud computing?, Ellen Messmer, Network World, April 27, 2009.
10 Lessons From FarmVille, Charles Babcock, InformationWeek, May 16, 2011.
11 NoSQL at Netflix, Yuri Israilevsky, Director of Cloud and Systems Infrastructure at Netflix, January 28, 2011, http://techblog.netflixx.com/2011/nosql-at-netflix.html.
note Cloud reality Cloud
Check: is the Cloud enterprise
service
types
outages
and sizes
infrastructure to
steal
providers.
thousands
of people
etc.).
and
all over
of dollars in lost
world,
downdetector.com.
have
retrieval
to
seen
in
data
service
such
and restore,
databases
and
to
manipulate
centre
millions
Facebook,
Twitter,
or cost
web services,
of all system
millions
go to
with a live
within reach
hackers
affect
degradation,
all
cloud
http://
outage
map.
development.
of any type
Cloud
of organisation.
over
queries,
indexing,
to
users.
end
SDS
Tabular
and
amounts
XML.
from
At the
of data
provides
cloud
access
have
simple
same
and
stable
expanded
management
ODBC
time,
while controlling
a relatively
vendors
software,
to
and
data
companies
costs
without
and reliable their
platform
business
to
offer
data management
companies
infrastructure
uses
to
stored
standard
of database
database
procedures,
Other features
of all sizes
personnel.
Services
(TDS)
as
data
for
communication
SQL networking for
servers
that
administrators triggers,
such
are available
data
encapsulate Data
uses a cluster
the internet
and exporting
without
This type
of
reporting
SQL
analytical
data
access
as SQL-Net
Server
backup
purposes.
and relational
Microsoft
Typically,
data
administrative
such
alarge
users.
and
synchronisation,
protocols,
provide
and
databases,
for
protocols.
Oracle
inside
the
protocol. interface.
data.
SDS is transparent interfaces
Programmers
asif the
data
disadvantage,
reliable
ADO.NET
evolved
(SDS) refers to a cloud computing-based
SDS typically
services
have
benefits:
programming
the
High level
affect
in
allowed
could
by provider
that is
computing
hardware,
use familiar
SQL data services Highly
at the
using
storage,
programming
database
potential
17
SQL
networking
A common
the
unique
protocols.
continue
data
and data importing
these
TCP/IP
that
(Instagram,
most common
technologies
services;
of in-house
available
Typically,
access
Cloud
functionality
as
are
Standard
remain
breach
services
problems
ever-growing
business
management.
of database
functions
of the
Such incidents
Other incidents
performance
management
processing
features.
relational
some
data
data
SQL data services
high costs
features
to data
manage
deploying
provides
Hosted
databases
data
better
management
provides
subset
most recent
of the
year.
to service interruptions
security
media
data loss,
alist
chapter,
remote
and
that
social
status
size,
this
SQL data services. service
in
iCloud
celebrities.
up-to-date
can find
ways to
developing
the typically
as the
every
universities
Data services
for
sacrificing for
such
well-known
as interruptions
a new dimension
advanced
are looking
public,
in large
service interruption,
of a companys brings
sQl
you
you
are reported
breaches
from
To see the
There,
computing
As
very
such
incidents
data
pictures
can cause
business.
Regardless
are
private
the
breach
from
Some
of
These incidents
17.4.5
security
of organisations,
ready?
write
of failure
embedded
as SQL
were stored locally instead
however, is that
offer the following and
to
such
scalable tolerance
data
a remote
for
are
Programmers Studio.NET
applications
of the
and
to connect
on the internet.
may not be supported
with in-house
a fraction
normally
Visual
location
data types
when compared
database
because
developers. and
code in their
ofin
some specialised
advantages
relational
application ADO.NET
to
One by SDS.
systems:
cost
distributed
and replicated
among
multiple
servers
Chapter
Dynamic
and automatic
Automated
data backup
Dynamic Cloud
creation
providers
running
in
a
scalability
such
matter
development business
data
for
the
service
17.5
Amazon,
Google
Microsoft
the
start
of the
the
and
you
work
The
patterns,
then
our
actually
is
relate
allow have
services
to
907
service
you
to
to
get
worry
your
about
to
be knowledgeable
database fault
enables
database
database
about
to
application to
create
deploy
the
technology
database
server
tolerance,
rapid
and allows them
use the
to relational
own
backups,
services
resources,
is free
access
as
2001, and
accurate
machine
a
one of the founders
mechanism
he published hypertext
could
be
of the
could
our
for
via
design
vision
best a SQL
and
SQL to
of
a very
in
future
of the
that
the
peoples
a
interactions,
management
the typical
way that WWW as:
machine-readable
thoughts,
powerful
through
WWW and director
concepts
of the
so intuitive state
become
working together
of the
describing
his initial
representation analysis
tool,
problems
seeing
which
beset
organisations.12
for
using a variety
the
majority
A traditional
the
Berners-Lee,
(W3C) In
information
easy
of
between
of formats
human
computer,
meanings
have the same
however,
two
including
beings can
concepts
images,
to
understand
only
understand
written
using
multimedia as they
the
syntax
different
and natural
comprehend
the
of the language
natural
languages
that
meaning.
In 2001, Tim Berners-Lee web in
by Tim
person an
of large
of language.
cannot
Web Technologies
storage.
technology
having
need
work and facilitating
WWW represents
semantics
and
do not
of cloud
Consortium
gave
which
and
WeB
between
management
language,
you
with the
The use of SQL data
information
still
understand.
space
patterns in the
Web
actually
interaction
information
Connectivity
applications.
Wide
can
better,
However,
Web was conceived
World
and
tasks.
A consumer at hand.
the seMantIC
computers
in
Even
with limited
rapidly.
is just
The Semantic
and
processes
minutes.
problem
included
of database
maintenance
high-quality
recovery
allocation
businesses
Database
balancing
and disaster
as
of
solutions for
develop
If
and
and routine
solution
load
17
which information
formally
is
given
defined
his idea
well-defined
of a Semantic
meaning,
better
to
Web
Web as An
enabling
extension
computers
of the
and
people
current to
work
cooperation.13 Today,
the
the
Semantic
WWW to share
However,
it is
maintains
its
your
not own
calendar)
reused
we had
you took
across
data relates
combination
to real
data
a specific
between
one
underlying
any
boundaries.
of data
from
different
The framework
allows
schemas
are
of data.
On a daily
bank accounts
applications
would
to
basis,
individuals
and view their
as each
be possible
ongoing
produce
without
such feature
database
it
Web is
which is a model that has a number
example,
a
application
know
what
use
own calendar. manages
you
were
and
doing
(via
photograph.
Sematic
world objects.
as
manage
Web of data,
The aim is to
applications,
and
holidays,
to link a
on the
and industry.14
integration
(rDF),
possible
work
referred
book
always
and
researchers
often
data. If
when
Research
Web is
photographs,
and led
a framework The framework sources
and
is based
of features
data from
by the
that
two
W3C in
allows
will establish develop
on the
resource
for
formats
Description
to
be
and for
modelling
how
Framework
data over the
applications
with
be shared
common
alanguage
for interchanging
different
collaboration
all data to
WWW. For
merged
even if the
different.
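As a rough illustration of the RDF feature mentioned above, that data can be merged even if the underlying schemas are different, the following sketch models statements as plain (subject, predicate, object) triples held in Python sets. It is a simplification for teaching purposes and does not use any RDF library. The `ex:` identifiers, the sample data and the function names are all hypothetical, chosen to echo the photograph and calendar example in the text.

```python
# Each statement is a (subject, predicate, object) triple.
# A photo application and a calendar application keep their own data,
# with different vocabularies, but both express it as triples.
photo_app = {
    ("ex:photo42", "ex:takenOn", "2019-07-14"),
    ("ex:photo42", "ex:takenBy", "ex:alice"),
}
calendar_app = {
    ("ex:event7", "ex:date", "2019-07-14"),
    ("ex:event7", "ex:title", "Holiday in Cape Town"),
    ("ex:event7", "ex:attendee", "ex:alice"),
}


def merge(*graphs):
    """Merging triple sets is a set union; no shared schema is required."""
    merged = set()
    for graph in graphs:
        merged |= graph
    return merged


def events_for_photo(graph, photo):
    """Find titles of calendar events dated on the day the photo was taken."""
    dates = {o for s, p, o in graph if s == photo and p == "ex:takenOn"}
    events = {s for s, p, o in graph if p == "ex:date" and o in dates}
    return sorted(o for s, p, o in graph if s in events and p == "ex:title")


web_of_data = merge(photo_app, calendar_app)
print(events_for_photo(web_of_data, "ex:photo42"))
```

Once the two sets are merged, a question that neither application could answer alone (what was I doing when this photograph was taken?) becomes a simple lookup across the combined triples.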
12 Berners-Lee, T. WWW: Past, Present, and Future, Computer, October 1996 (vol. 29 no. 10), pp. 69-77.
13 Berners-Lee, T., Hendler, J. and Lassila, O. The Semantic Web, Scientific American, May 2001.
14 W3C Semantic Web Activity. Available: www.w3.org/standards/semanticweb/
Cloud computing is a computing model that provides ubiquitous, on-demand access to a shared pool of configurable resources that can be rapidly provisioned.

SQL data services (SDS) refers to a cloud computing-based data management service that provides relational data storage, ubiquitous access and local management to companies of all sizes. This service enables rapid application development for businesses with limited information technology resources. SDS allows rapid deployment of business solutions using standard protocols and common programming interfaces.

The Semantic Web, often referred to as a Web of data, is a framework that allows all data on the WWW to be shared and reused across applications, without any boundaries.
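The SDS point above, that applications reach a hosted relational database through standard protocols and common programming interfaces, can be sketched in a few lines. This is a minimal illustration, not how any particular SDS product is accessed: Python's built-in `sqlite3` module with an in-memory database stands in for the remote service, and the `customer` table, its rows and the function names are invented for the example. The idea it shows is that the application code issues ordinary SQL through a standard connection object, so only the connection details would differ for a database hosted on the internet.

```python
import sqlite3


def open_connection(dsn: str = ":memory:") -> sqlite3.Connection:
    # Assumption: with a hosted SQL data service, the connection details
    # would name a remote server; here an in-memory SQLite database
    # stands in so the example is self-contained.
    return sqlite3.connect(dsn)


def customers_with_balance(conn: sqlite3.Connection, minimum: float):
    """Run an ordinary parameterised SQL query through the connection."""
    cursor = conn.execute(
        "SELECT name, balance FROM customer "
        "WHERE balance >= ? ORDER BY balance DESC",
        (minimum,),
    )
    return cursor.fetchall()


conn = open_connection()
conn.execute("CREATE TABLE customer (name TEXT, balance REAL)")
conn.executemany(
    "INSERT INTO customer VALUES (?, ?)",
    [("Ramas", 0.0), ("Dunne", 345.86), ("Smith", 536.75)],
)
conn.commit()

for name, balance in customers_with_balance(conn, 100):
    print(name, balance)
```

The query function never needs to know where the data are stored, which is the sense in which SDS is described as transparent to application developers.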
Key Terms

ActiveX
ActiveX Data Objects (ADO)
ADO.NET
application programming interface (API)
Call Level Interface (CLI)
client-side extensions
cloud computing
cloud services
common cloud
Common Gateway Interface (CGI)
Data Access Objects (DAO)
data source name (DSN)
database middleware
DataSet
Document Type Definition (DTD)
dynamic link libraries (DLLs)
Extensible Markup Language (XML)
Infrastructure as a Service (IaaS)
Java
Java Database Connectivity (JDBC)
JavaScript
LAMP
Microsoft .NET framework
Object Linking and Embedding for Database (OLE-DB)
Open Database Connectivity (ODBC)
path expressions
Platform as a Service (PaaS)
plug-in
private cloud
public cloud
Remote Data Objects (RDO)
Resource Description Framework (RDF)
script
server-side extension
Software as a Service (SaaS)
SQL data services (SDS)
stateless system
tags
Universal Data Access (UDA)
VBScript
Web-to-database middleware
XML schema
XML schema definition (XSD)

Further Reading

Duckett, J., PHP & MySQL: Server-side Web Development. John Wiley & Sons, 2019.
Fawcett, J., Ayers, D. and Quin, L., Beginning XML, 5th revised edition. John Wiley & Sons, 2012.
Jain, A., The Cloud DBA-Oracle: Managing Oracle Database in the Cloud. Apress, 2017.
Online Content Answers to selected Review Questions and Problems for this chapter are available on the online platform for this book.
Review Questions
1 Give some examples of database connectivity options and what they are used for.
2 What are ODBC, DAO and RDO? How are they related?
3 What is the difference between DAO and RDO?
4 What are the three basic components of the ODBC architecture?
5 Which steps are required to create an ODBC data source name?
6 What is OLE-DB used for, and how does it differ from ODBC?
7 Explain the OLE-DB model based on its two types of objects.
8 How does ADO complement OLE-DB?
9 What is ADO.NET, and what two new features make it important for application development?
10 What is a DataSet, and why is it considered to be disconnected?
11 What are Web server interfaces used for? Give some examples.
12 What does this statement mean: The Web is a stateless system. What implications does a stateless system have for database applications developers?
13 What is a Web application server, and how does it work from a database perspective?
14 What are scripts, and what is their function? (Think in terms of database applications development.)
15 What is XML, and why is it important?
16 What are Document Type Definition (DTD) documents, and what do they do?
17 What are XML Schema Definition (XSD) documents, and what do they do?
18 What is JDBC, and what is it used for?
19 What is cloud computing, and why is it a game changer?
20 Name and contrast the types of cloud computing implementation.
21 Name and describe the most prevalent characteristics of cloud computing services.
22 Using the internet, search for providers of cloud services. Then, classify the types of services they provide (SaaS, PaaS and IaaS).
23 Summarise the main advantages and disadvantages of cloud computing services.
24 Define SQL data services and list their advantages.
25 What is meant by the Semantic Web?
Online Content The databases used in the Problems for this chapter can be found on the online platform for this book.
Problems
In the following exercises, you set up database connectivity using Microsoft Excel.
1 Use Microsoft Excel to connect to the Ch02_InsureCo Microsoft Access database, using ODBC, and retrieve all of the AGENTs.
2 Use Microsoft Excel to connect to the Ch02_InsureCo Microsoft Access database, using ODBC, and retrieve all of the CUSTOMERs.
3 Use Microsoft Excel to connect to the Ch02_InsureCo Microsoft Access database, using ODBC, and retrieve the customers whose AGENT_CODE is equal to 503.
4 Create an ODBC System Data Source Name Ch02_SaleCo, using the Control Panel, Administrative Tools, Data Sources (ODBC) option.
5 Use Microsoft Excel to list all of the invoice lines for Invoice 103, using the Ch02_SaleCo System DSN.
6 Create an ODBC System Data Source Name Ch02_Tinycollege, using the Control Panel, Administrative Tools, Data Sources (ODBC) option.
7 Use Microsoft Excel to list all classes taught in room KLR200, using the Ch02_TinyCollege System DSN.
8 Create a sample XML document and DTD for the exchange of customer data.
9 Create a sample XML document and DTD for the exchange of product and pricing data.
10 Create a sample XML document and DTD for the exchange of order data.
11 Create a sample XML document and DTD for the exchange of student transcript data. Use your college transcript as a sample.
Glossary
A
access plan A set of instructions generated at application compilation time that is created and managed by a DBMS. The access plan predetermines how an application's query will access the database at run time.
ActiveX Microsoft's alternative to Java. A specification for writing programs that will run inside the Microsoft client browser, Internet Explorer. Oriented mainly to Windows applications, it is not portable. It adds controls such as drop-down windows and calendars to Web pages.
ActiveX Data Objects (ADO) A Microsoft object framework that provides a high-level, application-oriented interface to OLE-DB, DAO, and RDO. ADO provides a unified interface to access data from any programming language that uses the underlying OLE-DB objects.
ad hoc query A spur-of-the-moment question.
ADO.NET The data access component of Microsoft's .NET application development framework, which is a component-based platform for developing distributed, heterogeneous, interoperable applications aimed at manipulating any type of data over any network using any operating system and programming language.
aggregate aware A data model that organises data around a central entity based on the way the data will be used.
aggregate ignorant A data model that does not organise data around a central entity based on anticipated usage of the data.
algorithms A process or set of operations in a calculation.
alias An alternative name for a column or table in a SQL statement.
ALTER TABLE The SQL command used to make changes to table structure. When the command is followed by a keyword (ADD or MODIFY), it adds a column or changes column characteristics.
American National Standards Institute (ANSI) The group that accepted the DBTG recommendations and augmented database standards in 1975 through its SPARC committee.
analytical database A database focused primarily on storing historical data and business metrics used exclusively for tactical or strategic decision making.
AND The SQL logical operator used to link multiple conditional expressions in a WHERE or HAVING clause. It requires that all conditional expressions evaluate to true.
anonymous PL/SQL block A PL/SQL block that has not been given a specific name.
application processor See transaction processor (TP).
application programming interface (API) Software through which programmers interact with middleware. An API allows the use of generic SQL code, thereby allowing client processes to be database server-independent.
atomic attribute An attribute that cannot be further subdivided to produce meaningful components. For example, a person's last name attribute cannot be meaningfully subdivided.
atomic transaction property A property that requires all parts of a transaction to be treated as a single, logical unit of work in which all operations must be completed (committed) to produce a consistent database.
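The atomic transaction property can be illustrated with a short sketch. The example below uses Python's built-in sqlite3 module; the account table, its columns, and the transfer function are hypothetical names chosen for illustration and are not taken from this book's databases. A transfer consists of two updates, and if the second one violates a constraint, the first is undone so that the database stays consistent.

```python
import sqlite3


def transfer(conn, src, dst, amount):
    """Move `amount` from account `src` to account `dst` as one unit of work."""
    try:
        # Used as a context manager, the connection commits if the block
        # succeeds and rolls back if an exception escapes from it.
        with conn:
            conn.execute(
                "UPDATE account SET balance = balance + ? WHERE acc_id = ?",
                (amount, dst),
            )
            conn.execute(
                "UPDATE account SET balance = balance - ? WHERE acc_id = ?",
                (amount, src),
            )
        return True
    except sqlite3.IntegrityError:
        # The CHECK constraint rejected the debit; the credit was rolled back.
        return False


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE account ("
    " acc_id INTEGER PRIMARY KEY,"
    " balance REAL NOT NULL CHECK (balance >= 0))"
)
conn.executemany("INSERT INTO account VALUES (?, ?)", [(1, 100.0), (2, 50.0)])
conn.commit()

ok = transfer(conn, 1, 2, 30.0)       # both updates succeed and are committed
failed = transfer(conn, 2, 1, 500.0)  # the debit fails, so the credit is undone
balances = dict(conn.execute("SELECT acc_id, balance FROM account"))
print(ok, failed, balances)
```

The failed transfer leaves both balances exactly as they were after the first transfer, which is the all-or-nothing behaviour the definition describes.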
atomicity See atomic transaction property.
attribute A characteristic of an entity or object. An attribute has a name and a data type.
attribute domain See domain.
attribute hierarchy A top-down data organisation that is used for two main purposes: aggregation and drill-down/roll-up data analysis.
audit log A security feature of a database management system that automatically records a brief description of the database operations performed by all users.
authentication The process through which a DBMS verifies that only registered users can access the database.
authorisation management Procedures that protect and guarantee database security and integrity. Such procedures include user access management, view definition, DBMS access control, and DBMS usage monitoring.
automatic query optimisation A method by which a DBMS finds the most efficient access path for the execution of a query.
AVG A SQL aggregate function that outputs the mean average for a specified column or expression.
B
b-tree index An ordered data structure organised as an upside-down tree.
batch processing A data processing method that runs data processing tasks from beginning to end without any user interaction.
batch update routine A routine that pools transactions into a single batch to update a master table in a single operation.
BETWEEN In SQL, a special comparison operator used to check whether a value is within a range of specified values.
Big Data A movement to find new and better ways to manage large amounts of Web-generated data and derive business insight from it, while simultaneously providing high performance and scalability at a reasonable cost.
binary lock A lock that has only two states: locked (1) and unlocked (0). If a data item is locked by a transaction, no other transaction can use that data item. See also lock.
binary relationship An ER term for an association (relationship) between two entities. For example, PROFESSOR teaches COURSE.
bitmap index An index that uses a bit array (0s and 1s) to represent the existence of a value or condition.
block report In the Hadoop Distributed File System (HDFS), a report sent every six hours by the data node to the name node informing the name node which blocks are on that data node.
Boolean algebra A branch of mathematics that uses the logical operators OR, AND, and NOT.
bottom-up design A design philosophy that begins by identifying individual design components and then aggregates them into larger units. In database design, the process begins by defining attributes and then groups them into entities. Compare to top-down design.
boundaries The external limits to which any proposed system is subjected. These limits include budgets, personnel, and existing hardware and software.
Boyce-Codd normal form (BCNF) A special type of third normal form (3NF) in which every determinant is a candidate key. A table in BCNF must be in 3NF. See also determinant.
bridge entity See composite entity.
BSON (Binary JSON) A computer-readable format for data interchange that expands the JSON format to include additional data types including binary objects.
bucket In a key-value database, a logical collection of related key-value pairs.
business intelligence A comprehensive, cohesive, and integrated set of tools and processes used to capture, collect, integrate, store, and analyse data with the purpose of generating and presenting information to support business decision making.
business rule A description of a policy, procedure, or principle within an organisation. For example, a pilot cannot be on duty for more than 10 hours during a 24-hour period, or a professor may teach up to four classes during a semester.
C
Call Level Interface (CLI) A standard developed by the SQL Access Group for database access.
candidate key A minimal superkey; that is, a key that does not contain a subset of attributes that is itself a superkey. See key.
cardinality A property that assigns a specific value to connectivity and expresses the range of allowed entity occurrences associated with a single occurrence of the related entity.
cascading order sequence A nested ordering sequence for a set of rows, such as a list in which all last names are alphabetically ordered and, within the last names, all first names are ordered.
centralised data allocation A data allocation strategy in which the entire database is stored at one site. Also known as a centralised database.
centralised database A database located at a single site.
centralised design A process in which a single conceptual design is modelled to match an organisation's database requirements. It is typically used when a data component consists of a relatively small number of objects and procedures. Compare to decentralised design.
checkpoint In transaction management, an operation in which the database management system writes all of its updated buffers to disk.
class A collection of similar objects with shared structure (attributes) and behaviour (methods). A class encapsulates an object's data representation and a method's implementation. Classes are organised in a class hierarchy.
class diagram A diagram used to represent data and their relationships in UML object notation.
class diagram notation The set of symbols used in the creation of class diagrams.
class hierarchy The organisation of classes in a hierarchical tree in which each parent class is a superclass and each child class is a subclass. See also inheritance.
client/server architecture The arrangement of hardware and software components to form a system composed of clients, servers, and middleware. The client/server architecture features a user of resources, or a client, and a provider of resources, or a server.
client-side extensions Extensions that add functionality to a Web browser. The most common extensions are plug-ins, Java, JavaScript, ActiveX, and VBScript.
closure A property of relational operators that permits the use of relational algebra operators on existing tables (relations) to produce new relations.
cloud computing A computing model that provides ubiquitous, on-demand access to a shared pool of configurable resources that can be rapidly provisioned.
cloud services The services provided by cloud computing. Cloud services allow any organisation to quickly and economically add information technology services such as applications, storage, servers, processing power, databases, and infrastructure.
cohesivity The strength of the relationships between a module's components. Module cohesivity must be high.
collections In document databases, a logical storage unit that contains similar documents, roughly analogous to a table in a relational database.
column family In a column family database, a collection of columns or super columns related to a collection of rows.
column family database A NoSQL database model that organises data into key-value pairs, in which the value component is composed of a set of columns that vary by row.
column-centric storage A physical data storage technique in which data is stored in blocks, which hold data from a single column across many rows.
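The difference between row-oriented blocks and column-centric storage can be sketched in a few lines of Python. The sample rows, column names, and helper functions below are hypothetical and only illustrate the idea; a real DBMS manages physical blocks on disk in a far more elaborate way.

```python
# Hypothetical sample rows: (cus_code, cus_lname, agent_code).
COLUMN_NAMES = ("cus_code", "cus_lname", "agent_code")
ROWS = [
    (10010, "Ramas", 503),
    (10011, "Dunne", 501),
    (10012, "Smith", 502),
]


def row_centric_blocks(rows, rows_per_block):
    """Each block holds complete rows, so it mixes values from every column."""
    return [rows[i:i + rows_per_block] for i in range(0, len(rows), rows_per_block)]


def column_centric_blocks(rows, column_names):
    """Each block holds the values of a single column across many rows."""
    return {name: [row[i] for row in rows] for i, name in enumerate(column_names)}


row_blocks = row_centric_blocks(ROWS, rows_per_block=2)
column_blocks = column_centric_blocks(ROWS, COLUMN_NAMES)

# A query that needs one column reads one column block here, whereas the
# row-centric layout forces it to read every block.
agent_codes = column_blocks["agent_code"]
print(row_blocks)
print(agent_codes)
```

Reading a single column touches only that column's block in the second layout, which is why the technique suits queries that aggregate a few columns over many rows.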
COMMIT The SQL command that permanently saves data changes to a database.
Common Gateway Interface (CGI) A Web server interface standard that uses script files to perform specific functions based on a client's parameters.
completeness constraint A constraint that specifies whether each entity supertype occurrence must also be a member of at least one subtype. The completeness constraint can be partial or total. Partial completeness means that some supertype occurrences might not be members of any subtype. Total completeness means that every supertype occurrence must be a member of at least one subtype.
composite attribute An attribute that can be further subdivided to yield additional attributes. For example, a phone number such as 615-898-2368 may be divided into an area code (615), an exchange number (898), and a four-digit code (2368). Compare to simple attribute.
composite entity An entity designed to transform an M:N relationship into two 1:M relationships. The composite entity's primary key comprises at least the primary keys of the entities that it connects. Also known as a bridge entity. See also linking table.
composite key A multiple-attribute key.
computer-aided systems engineering (CASE) Tools used to automate part or all of the Systems Development Life Cycle.
conceptual design A process that uses data-modelling techniques to create a model of a database structure that represents real-world objects as realistically as possible. The design is both software- and hardware-independent.
conceptual model The output of the conceptual design process. The conceptual model provides a global view of an entire database and describes the main data objects, avoiding details.
conceptual schema A representation of the conceptual model, usually expressed graphically. See also conceptual model.
concurrency control A DBMS feature that coordinates the simultaneous execution of transactions in a multiprocessing database system while preserving data integrity.
concurrent backup A backup that takes place while one or more users are working on a database.
connectivity The classification of the relationship between entities. Classifications include 1:1, 1:M, and M:N.
consistency A database condition in which all data integrity constraints are satisfied. To ensure consistency of a database, every transaction must begin with the database in a known consistent state. If not, the transaction will yield an inconsistent database that violates its integrity and business rules.
consistent database state A database state in which all data integrity constraints are satisfied.
constraint A restriction placed on data, usually expressed in the form of rules. For example, A student's GPA must be between 0.00 and 4.00. Constraints are important because they help to ensure data integrity.
coordinator The transaction processor (TP) node that coordinates the execution of a two-phase COMMIT in a DDBMS. See also data processor (DP), transaction processor (TP), and two-phase commit protocol.
correlated subquery A subquery that executes once for each row in the outer query.
cost-based optimiser A query optimiser technique that uses an algorithm based on statistics about the objects being accessed, including number of rows, indexes available, index sparsity, and so on.
COUNT A SQL aggregate function that outputs the number of rows containing not null values for a given column or expression, sometimes used in conjunction with the DISTINCT clause.
CREATE INDEX A SQL command that creates indexes on the basis of a selected attribute or attributes.
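The correlated subquery and COUNT entries can be made concrete with a small sketch that runs SQL through Python's built-in sqlite3 module. The line table, its columns, and the sample rows are hypothetical and chosen only for illustration. The subquery refers to the outer row's prod_code, so it is conceptually evaluated once for each outer row; the second query contrasts COUNT(*), COUNT on a column containing a null, and COUNT with DISTINCT.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE line ("
    " inv_number INTEGER PRIMARY KEY,"
    " prod_code TEXT NOT NULL,"
    " units INTEGER)"
)
conn.executemany(
    "INSERT INTO line VALUES (?, ?, ?)",
    [(1001, "A", 1), (1002, "A", 3), (1003, "B", 5), (1004, "B", 5), (1005, "C", None)],
)

# Correlated subquery: the inner query depends on outer_l.prod_code, so its
# result changes with each row of the outer query.
above_avg = conn.execute(
    """
    SELECT inv_number
    FROM line AS outer_l
    WHERE units > (SELECT AVG(units)
                   FROM line AS inner_l
                   WHERE inner_l.prod_code = outer_l.prod_code)
    ORDER BY inv_number
    """
).fetchall()

# COUNT(*) counts rows, COUNT(units) skips nulls, and DISTINCT removes
# duplicate product codes before counting.
counts = conn.execute(
    "SELECT COUNT(*), COUNT(units), COUNT(DISTINCT prod_code) FROM line"
).fetchone()

print(above_avg, counts)
```

Only invoice line 1002 sells more units than the average for its own product, and the null in units is excluded from COUNT(units) but not from COUNT(*).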
CREATE TABLE A SQL command that creates a table's structures using the characteristics and attributes given.
cross join A join that performs a relational product (or Cartesian product) of two tables.
Crow's Foot notation A representation of the entity relationship diagram that uses a three-pronged symbol to represent the many sides of the relationship.
cube cache In multidimensional OLAP, the shared, reserved memory area where data cubes are held. Using the cube cache assists in speeding up data access.
cursor A special construct used in procedural SQL to hold the data rows returned by a SQL query. A cursor may be considered a reserved area of memory in which query output is stored, like an array holding columns and rows. Cursors are held in a reserved memory area in the DBMS server, not in the client computer.
Cypher A declarative query language used in Neo4j for querying a graph database.
D
data Raw facts, or facts that have not yet been processed to reveal their meaning to the end user.
Data Access Objects (DAO) An object-oriented application programming interface used to access MS Access, MS FoxPro, and dBase databases from Visual Basic programs. DAO provides an optimised interface that exposes the functionality of the Jet data engine, on which MS Access is based. The DAO interface can also be used to access other relational-style data sources.
data administrator (DA) The person responsible for managing the entire data resource, whether it is computerised or not. The DA has broader authority and responsibility than the database administrator (DBA). Also known as an information resource manager (IRM).
data allocation In a distributed DBMS, the process of deciding where to locate data fragments.
data anomaly A data abnormality in which inconsistent changes have been made to a database. For example, an employee moves, but the address change is not corrected in all files in the database.
data cache or buffer cache A shared, reserved memory area that stores the most recently accessed data blocks in RAM. The data cache takes advantage of a computer's fast primary memory compared to the slower secondary memory, minimising the number of input/output (I/O) operations between the primary and secondary memory.
data cube The multidimensional data structure used to store and manipulate data in a multidimensional DBMS. The location of each data value in the data cube is based on its x-, y-, and z-axes. Data cubes are static, meaning they must be created before they are used, so they cannot be created by an ad hoc query.
data definition language (DDL) The language that allows a database administrator to define the database structure, schema, and subschema.
data dependence A data condition in which data representation and manipulation are dependent on the physical data storage characteristics.
data dictionary A DBMS component that stores metadata (data about data). Thus, the data dictionary contains the data definition as well as their characteristics and relationships. A data dictionary may also include data that are external to the DBMS. Also known as an information resource dictionary.
data extraction A process used to extract and validate data from an operational database and external data sources prior to their placement in a data warehouse.
data file A named physical storage space that stores a database's data. It can reside in a different directory on a hard disk or on one or more hard disks. All data in a database are stored in data files. A typical enterprise database is normally composed of several data files. A data file can contain rows from one or more tables.
data filtering A process used to extract and validate data from an operational database and external data sources prior to their placement in a data warehouse. See data extraction.
data fragmentation A characteristic of a DDBMS that allows a single object to be broken into two or more segments or fragments. The object might be a user's database, a system database, or a table. Each fragment can be stored at any site on a computer network.
data inconsistency A condition in which different versions of the same data yield different (inconsistent) results.
data independence A condition in which data access is unaffected by changes in the physical data storage characteristics.
data integrity In a relational database, a condition in which the data in the database comply with all entity and referential integrity constraints.
data management A process that focuses on data collection, storage, and retrieval. Common data management functions include addition, deletion, modification, and listing.
data manager (DM) See data processing (DP) manager.
data manipulation language (DML) The set of commands that allows an end user to manipulate the data in the database. The commands include SELECT, INSERT, UPDATE, DELETE, COMMIT, and ROLLBACK.
data mart A small, single-subject data warehouse subset that provides decision support to a small group of people.
data mining A process that employs automated tools to analyse data in a data warehouse and other sources and to proactively identify possible relationships and anomalies.
data model A representation, usually graphic, of a complex real-world data structure. Data models are used in the database design phase of the Database Life Cycle.
data processing (DP) manager A DP specialist who has evolved into a department supervisor. Roles include managing technical and human resources, supervising senior programmers, and troubleshooting the program. Also known as a data manager (DM).
data processor (DP) The resident software component that stores and retrieves data through a DDBMS. The DP is responsible for managing the local data in the computer and coordinating access to that data. See also transaction processor (TP).
data quality A comprehensive approach to ensuring the accuracy, validity, and timeliness of data.
data redundancy A condition in which a data environment contains redundant (unnecessarily duplicated) data.
data replication The storage of duplicated database fragments at multiple sites on a DDBMS. Duplication of the fragments is transparent to the end user. Data replication provides fault tolerance and performance enhancements.
data source name (DSN) A name that identifies and defines an ODBC data source.
data store The component of the decision support system that acts as a database for storage of business data and business model data. The data in the data store have already been extracted and filtered from the external and operational data, and will be stored for access by the end-user query tool for the business data model.
data warehouse An integrated, subject-oriented, time-variant, non-volatile collection of data that provides support for decision making, according to Bill Inmon, the acknowledged father of the data warehouse.
database administrator (DBA) The person responsible for planning, organising, controlling, and monitoring the centralised and shared corporate database. The DBA is the general manager of the database administration department.
database design The process that yields the description of the database structure and determines the database components. Database design is the second phase of the Database Life Cycle.
918
Glossary
database design
development
The process of database
DataSet
and implementation.
In ADO.NET, a disconnected,
memory-resident
database fragment
The
A subset of a distributed
DataSet
representation contains
relationships, database.
Although
at
sites
different
set
the within
of all fragments
fragments
may be stored
a computer
network,
is treated
See also horizontal
as a single
fragmentation
the
database.
that
and vertical
study, and
clients
possible
tuning
requests
while
columns,
rows,
are
Activities to ensure
addressed
making optimum
as quickly
as
use of existing
resources.
Life Cycle (DBLC)
A cycle that traces
of a database
an information
history
system.
tables,
database.
constraints.
DBMS performance
the
fragmentation.
Database
and
of the
within
The cycle is divided
into
design, implementation
evaluation,
operation
A condition in
transactions the lock
six phases: initial
and loading,
testing
maintenance,
and
and
deadlock
deadly
which two
wait indefinitely on a previously
embrace.
for
the
locked
or more other
to release
data item.
Also
called
See also lock.
deadly embrace
See deadlock.
evolution.
decentralised database
management
collection
of programs
structure the
system
and
that
controls
(DBMS)
The
manages the
access
to the
conceptual
database
data
stored
an organisations
in
database
middleware
Database
through
connect
and communicate
connectivity
which application
programs
large
with data repositories.
number
centralised database and
performance
procedures
time
of a database
an end-user minimum
query is
to
the
is,
processed
decision
response
to
ensure
by the
DBMS in the
The process of restoring
a previous
consistent
statement
in
database for the
an
security
security,
integrity,
a
that
system
management,
and recovery
An organisation
and
use
of the
the collection,
of data in
database
access
Atype
to the
of lock
owner
works
online
for
batch
multiuser
Copyright review
2020 has
storage,
Learning. that
any
of the lock
processes
All suppressed
Rights
Reserved. content
does
to
and allows
the
database.
This
but is
unsuitable
for
May not
not materially
(DSS)
An arrangement
See deferred update.
be
copied, affect
scanned, the
overall
or
A process by which a table
from
a higher-level
dependency diagram dependencies (primary within atable.
that restricts
normal form
to
duplicated, learning
yields
Arepresentation of all data key, partial, or transitive)
derived attribute An attribute that does not physically exist within the entity and is derived via an algorithm.
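As a minimal sketch of a derived attribute, the snippet below computes an age from a stored birth date instead of storing the age itself. The function name `derive_age` and its parameters are hypothetical, chosen only for illustration.

```python
from datetime import date


def derive_age(birth_date: date, today: date) -> int:
    """Derive an age in whole years from a stored birth date."""
    years = today.year - birth_date.year
    # Subtract one year if this year's birthday has not happened yet.
    if (today.month, today.day) < (birth_date.month, birth_date.day):
        years -= 1
    return years
```

Because the value is computed on request, it never goes stale, at the cost of a small calculation each time it is read.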
DBMSs.
Compare
a lower-level normal form, usually to increase processing speed. Denormalisation potentially data anomalies.
a database
one user at a time to access
lock
lock
procedures.
write technique.
is changed
of components
environment.
database-level
After
DELETE A SQL command that allows data rows to be deleted from a table.
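The following is an illustrative sketch of DELETE, run against an in-memory SQLite database through Python's standard `sqlite3` module. The `product` table, its columns, and the function name `delete_rows_demo` are assumptions made up for this example.

```python
import sqlite3


def delete_rows_demo():
    """Delete rows that match a condition and report what is left."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE product (p_code TEXT PRIMARY KEY, p_price REAL)")
    conn.executemany(
        "INSERT INTO product VALUES (?, ?)",
        [("A1", 5.0), ("B2", 12.5), ("C3", 40.0)],
    )
    # The WHERE clause restricts which rows are deleted;
    # without it, every row in the table would be removed.
    cur = conn.execute("DELETE FROM product WHERE p_price < ?", (10.0,))
    deleted = cur.rowcount
    remaining = [row[0] for row in conn.execute(
        "SELECT p_code FROM product ORDER BY p_code")]
    conn.close()
    return deleted, remaining
```

Calling `delete_rows_demo()` returns the number of rows deleted together with the product codes that remain.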
The person responsible
backup,
and regulates
and
system
denormalisation
defines
only
of objects
deferred-write
or a transaction.
database.
database
requirements.
of
of a single SQL
program
officer
subsets
deferred update In transaction management, a condition in which transaction operations do not immediately update a physical database. Also called deferred write technique.
state.
The equivalent
application
model
design.
support
deferred
database request
database
to
of computerised tools used to assist managerial decision making within a business.
that
of time.
database recovery
A process in which used
A set of activities
to reduce
system that
amount
database
tuning
designed
is
verification of the views, processes, and constraints, the subsets are then aggregated into a complete design. Such modular designs are typical of complex systems in which the data component has a relatively
database.
software
design design
Glossary
For example, subtracting
the the
description provides
Age attribute birth
a precise,
design trap
with the
real
design trap is known
runs
value See
that
when a is
most
not
database that
Boyce-Codd
normal
values form
in that
whose
related
computer
systems
that the value
dimension used star
to
of attribute
tables
search,
schema.
filter,
or classify table
a given
that
dedicated
to
securing
disaster
subtype
are
availability
that
different
contains
following
and among
schema
The database
of a distributed
database
schema
as seen by the
administrator.
distributed processing Sharing the logical processing of a database over two or more sites connected by a network.
data processors
(DPs) in a distributed
distribution transparency A DDBMS feature that allows a distributed database to look like a single logical database to an end user.
a
failure.
subtype)
from
Document
In
that
and
one
the
pairs in
model
which the
of a tag-encoded
the syntax
rules
or valid tags for each type
of
domain In data modelling, the construct used to organise and describe an attribute's set of possible values.
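One way to picture a domain is as a rule that rejects values outside the permitted set. The sketch below uses a CHECK constraint in an in-memory SQLite database; the `employee` table, the `emp_grade` column, and the allowed values are hypothetical and serve only to illustrate the idea.

```python
import sqlite3


def domain_check_demo():
    """Show a value outside the attribute's domain being rejected."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE employee ("
        " emp_id INTEGER PRIMARY KEY,"
        " emp_grade TEXT CHECK (emp_grade IN ('A', 'B', 'C')))"
    )
    conn.execute("INSERT INTO employee VALUES (1, 'A')")  # inside the domain
    try:
        conn.execute("INSERT INTO employee VALUES (2, 'Z')")  # outside it
        rejected = False
    except sqlite3.IntegrityError:
        rejected = True
    count = conn.execute("SELECT COUNT(*) FROM employee").fetchone()[0]
    conn.close()
    return rejected, count
```

A CHECK constraint is only one possible enforcement mechanism; lookup tables with foreign keys are a common alternative.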
See distributed
A NoSQL database
XML document.
Also
(DDD).
(DDD)
defines
(fragment
database.
data dictionary
databases
data in key-value
document type definition (DTD) A file with a .DTD extension that describes XML elements; in effect, a DTD file describes a document's composition and
A data
description
stores
value component is composed document.
another.
(DDC)
of a distributed
as a distributed
data
distributed transaction A database transaction that accesses data in several remote data processors (DPs) in a distributed database.
qualifying
integrity
a unique
distributed data dictionary data catalogue.
has
both
distributed
database.
entity set.
data catalogue
names, locations)
a
processing
over interconnected
which are
global
several remote
A SQL clause that produces only a list
that
distributed
a
perspectives
(non-overlapping hierarchy,
DISTINCT
data in
sites;
and
distributed request A database request that allows a single SQL statement to access data in several remote data processors (DPs) in a distributed database.
a one-to-many
additional
data
nonoverlapping
of values
up.
The set of DBA activities
a specialisation
known
different
storage
sites.
distributed processing connected
B
A means
within
design,
or a database
subtype
dictionary
database
independent
tables.
provide
management
disjoint
is in
related
physically
fact.
disaster physical
facts
In a star schema
characteristics to
of attribute
B can be looked
with dimension
dimensions
determines
In a data warehouse, tables
The fact
relationship
A
the value
A logically
functions
description
the
knowing
database
independent
(BCNF).
a database
that
more
the
of logically
distributed
row.
The role of a key. In the context of
indicates
or
related
physically
several
governs
processing
determination
statement
more
database management system A DBMS that supports a database
database
table,
in two
across
DDBMS
several
other
A logically or
database
stored
distributed
common
computer.
determines
is
distributed (DDBMS)
identified
way that
in two
sites.
Any attribute in a specific row
directly also
a The
A single-user
on a personal
that
as a fan trap.
database
determinant
in world.
database
stored
distributed
activities
or incompletely
is
sites.
environment.
A problem that occurs is represented
consistent
that
and
of the
operating
is improperly
and therefore
distributed
by
date.
up-to-date,
description
an organisation's
relationship
current
A document that
detailed,
reviewed
desktop
the
of operations
thoroughly define
might be derived
date from
DO-UNDO-REDO protocol A protocol used by a data processor (DP) to roll back or roll forward transactions with the help of a system's transaction log entries.
end-user
create
from
To decompose
components, that
is,
aggregation.
data into
data
at lower
This approach
more atomic
levels
the
support
system
to focus
geographic
areas,
business
types,
and
in
entity
on specific and
so
on.
as tables,
indexes,
and
ERD.
users.
Permanently deletes an index
DROP TABLE
Permanently
The transaction
the permanence Transactions a system
have
failure
been
if the
consistent
completed
database
the
the
SQL
most
database.
Contrast
strategy
at run
information
with static
query
An entity
SQL
statement
is
generated
An environment
not known in
at run
a program required
can
time.
key
entity
the
to
is
that
are
entity
resulting
relationship
semantic
embedded
in
SQL
application
end-user
and
and
end-user
provide
additional
contained such
tool
presents
as
within COBOL,
relationships
model
with the
developed
by P.
A diagram models
entities,
model, a grouping
of
In a generalisation/specialisation of an entity the
subtypes
contain
supertype.
common
the
unique
The entity
characteristics
and
characteristics
of
entity.
In a generalisation/specialisation
a generic
entity
A join
condition tables.
of the
that
of entity
operator
columns
type
characteristics
on an equality
part.
was
relationship
contains
1:M, and level
diagram (ERD)
a subset
or in
A data
(1:1,
conceptual
In a relational
equijoin
data compiled
be
values.
and relations.
subtype
hierarchy,
A data analysis tool
table
value in a
no null
model (ERM)
The
entity supertype
tool.
table
entities.
common
selected
has
hierarchy,
each
languages
of a relational
supertype the
ColdFusion.
query
of extended
model.
SQL statements
presentation
organises
by the
that
ER
programming
C++, ASP, Java,
that
the
a specific
has a unique
at the
an entity
set
entity
application
in
1975.
related
The entity relationship the
concepts
content
of a
nodes.
from
virtual
an entity
occurrence.
key
(ER)
entities
depicts
entity
diagram
considered
See entity instance.
entity relationship
queries.
In a graph database, the representation
EER diagram (EERD)
the
describes
attributes,
between
each entity
and that
of ER diagrams.
Chen in
E relationship
abstract
actually
ER modelling,
relationship
help
that
edge
is
not
The property
M:N) among
environment,
SQL statements
ad hoc
In
occurrence
entity
SQL
but instead SQL
a single
the
which the
a dynamic
it is
in the
by combining
into
as an entity
guarantees
primary
optimisation.
advance,
In
generate
to respond
in
attribute.
ERD.
model that dynamic
also
or event for
and relationships
cluster
because
Also known
that
time,
about
data present
entity type used to
entities
entity integrity
be lost
durability.
The process of
access
up-to-date
proper
See
entities
interrelated object.
row.
state.
will not
has
for
concept,
cluster is formed
entity instance
property that indicates
query optimisation
determining using
be stored.
A virtual
or abstract
deletes a table (and its
of a database's
that
support
needs.
multiple
the final
dynamic
can
An entity
entity
data)
durability
provides
future
data
information
The overall company
which
represent
used to delete database
views,
desired
A person, place, thing,
multiple
DROP INDEX
in
database
entity cluster
such
access
See
up.
A SQL command
A data analysis tool used to
that
store.
expected
which
objects
data
representation,
of
is used primarily
a decision
DROP
queries
enterprise
drill down
also roll
query tool
the
contains
that links that
tables
compares
the
subtypes.
specified
based
eventual consistency A model for database consistency in which updates to the database will propagate through the system so that all data copies will be consistent eventually.
exclusive lock
A lock that is reserved
transaction.
An exclusive
transaction item
requests
and no locks
lock
to
copies
An exclusive
transactions
to
lock
access
the
a
data
a data
data item does
not
existence
other
entities.
In
such
first
cannot
table
because
reference
the
a table
In
a subquery
explicit
cursor
hold
the
return
returns
or
model
entities.
operator
that
that
checks
entity
relationship
semantic entity
model;
that
facts
may
return
zero
constructs, subtypes,
such
and
entity relationship
extended relational that
includes
features
the
in
an inherently
facts linked
with
that
used
permits
to
Unlike other the
to
A
that
their
sales
measurements
figures
costs,
aspect
are
product
used in
units,
table.
business
represent
numeric
or service
business
prices,
data
and revenues.
is
not
expressed
loop
produce
Social
bank
even if
a network
that
place,
defines
additional
thus
entities
Analysing stored data
a characteristic For example,
address,
constitute
party
entities, other
or numeric character
number, all
the
results.
or thing.
Security
when one entity
model.
processing
An alphabetic
the
among
in the
actionable
balance
occurs with other
an association
person,
data
A feature that allows of a DDBMS,
1:M relationships
of characters
manipulate
of a document's
be
dimension
a specific
A design trap that
feedback
to the
markup languages,
manipulation
through
table
fails.
field and
associated
operation
producing
in size automatically
represent
each
Facts commonly
is in two
increments.
data elements. XML
refers
and classified
For example,
fan trap
database
Markup Language (XML)
metalanguage
original
that
of data files to expand predefined
data
A fact table is in a one-to-many
represent
continuous
best
relational
In a DBMS environment,
Extensible
simpler
warehouse,
failure transparency
A model
models
extends
using
to the
model (ERDM)
environment.
of
the star schema
analysis include
node
object-oriented
of the
more
model.
data
focus,
In a data
In a data warehouse, the
sales.
supertypes,
clustering,
structural
ability
of adding
as entity
entity
(ER)
result
view
The specific representation
dimensions.
(values)
or
model
the
business
with a data subset
view of the
relationship
SQL, a cursor created
Sometimes referred to as the enhanced
relationship
Given its
view; the end users
contains
common
or activity.
entity
programmers
schema.
measurements
extended
over the
F
only one row.
(EERM)
data
of structured
environment.
when referencing
could
exchange
of an entity
statement
but
languages,
and invoices
works
schema
an external
exist.
any rows.
of a SQL
more rows,
orders
A manipulate
of a document's
the
environment.
fact table
In procedural
markup
The application
database
external
key
yet
other
as
model
data
global and
table.
output
two
first
SQL, a comparison
whether
not
over the
and
manipulation
such
an external
one or more related
must be created
of the
the
A property
an existence-dependent
EXISTS
does
of structured
internet.
more
must be created
that
can exist apart from a table
or
existence-dependent
existence-independent
Such
one
an environment,
existence-independent loaded
on
Unlike
the
documents
also
A property of an entity
depends
exchange
and invoices
to represent
XML facilitates
external
existence-dependent
used
elements.
elements.
allow
See
the
orders
Markup Language (XML)
XML permits
by any
database.
as
internet.
lock.
whose
such
metalanguage
when
update
are held on that
transaction.
to
data
XML facilitates
documents
by a
is issued
permission
other
that
will
Extensible
other
shared
elements.
database
or group of a
a person's
phone
number,
and
fields.
field-level
lock
transactions
to
require that
the
row.
A lock that allows concurrent access
use
of lock
data access
computer
same
of different
This type
multiuser
the
row
as long
fields
(attributes)
yields
the
but requires
running
as they
under
find()
A MongoDB
from
of
fully replicated
normal
depicted and
in tabular
a primary
flags
It
describes
format,
with
be used
All nonkey
dependent
a required
on the
absence
nulls in
key (FK) whose
another
fourth
or
must
whose
no
multivalued
as a single
database
values
must
and
memory location.
on graph
See
data
key.
full functional an attribute
it is
periodically
divided
It
systems
that
has
backup
dependent
types
that
Data
are
a
stores
data
ensures disaster
does
at their
be atomic
data.
model based
on relationship-rich
and
combined in
edges.
with
any
a SELECT
a
at the
of the
statement.
conceptual
that
over
data
which a used in
changes
on the
in
database
the
the
design
level.
In the a signal
node the
in
Hadoop
name
to
selected
node
is
still
rows.
Distributed
node
to
File System
seconds notify
from the
the
name
node
available.
transparency
to integrate
one logical
experience.
to restrict
sent every three
to the
data
a system
models
or
A clause applied to the output of a
heterogeneity
a
management
affect
hardware
Therefore,
BY operation
heartbeat
management
copied,
on the
will have no effect
GROUP
data
and relational)
does
A condition in
depend
or
A system that
different
not
implementation.
(HDFS),
database
by the
A SQL clause used to create frequency when
hardware
key.
support
determines
stored
said to
of nodes
functions
models
in
on a composite
of database
different
may even
Cengage
of the
network,
supports
deemed
model
A condition in which
DDBMS
(hierarchical,
network.
row.
A NoSQL database
theory
HAVING
dependence
different
A
H
into
failure.
heterogeneous
2020
updated
A full
key but not on any subset
to
of detail represented
a table's
as a collection
distributions
copy of an entire
is functionally
integrates
in
GROUP BY
of
database
of all data after a physical
integrity
A determines
B. The relationship
hardware independence
separate
database
of attribute
equivalent
of granularity
database
be null.
a distributed though
A complete
full recovery
A is
R, an
on an attribute
A DDBMS feature
to treat
saved
review
graph
A table that is in 3NF
even
on
The level
stored
key in
sets
of also
written as A → B.
level
aggregate
database
values
primary
independent
See
Within a relation dependent
of attribute
dependent
granularity
the
more fragments.
full backup
the
transparency
a system
systems
may to
sites.
database.
one value
lowest
match
multiple
allows
fully
Flags
attention
multiple copies
multiple
G
dependencies.
fragmentation
or
values.
stores at
dependence
B, and is
key.
a table.
normal form (4NF)
contains
B is
in
An attribute or attributes in one
values
table
attributes
by designers
by bringing
that
fragment
A if and only if a given value groups
primary
In a DDBMS, the
B is functionally
exactly
alert end users to
or encode
prevent
of a value
foreign table
response,
conditions, to
database
database
attribute
a relation
no repeating
as
DDBMS and homogeneous
database
partially replicated
documents
stage in the
Special codes implemented
specified
two
The first
key identified. are
to trigger
that
(1NF)
process.
relation
and
each
functional
form
normalisation
the
of related records.
a collection.
first
such
microcomputers.
DDBMS.
most flexible
a high level
method to retrieve
systems,
and
See also heterogeneous
within
overhead.
A named collection
computer
minicomputers,
distributed
file
different
mainframes,
several
A feature that allows centralised
DBMSs
into
DDBMS.
heterogeneous integrates
DDBMS
different
management
A system that
types
of centralised
systems
over
heterogeneous
distributed
heterogeneous
DDBMS)
implicit
database
a network.
See
database
also fully
system
in
returns
(fully
and homogeneous
cursor
created
IN
DDBMS.
whose basis
for
An early database
concepts
and
subsequent
model is
in
model
basic
based
database
record
is the
root
relationship
is called
segment.
to the
tree
Each
segment
problem
has
DDBMS
only one type
of centralised
system
over a network.
DDBMS
and fully
database
system
A system
yielding
heterogeneous
different
be avoided.
relational
homonyms
existence
has
index
arises
a transaction-calculating
when
horizontal
fragmentation
database
generally
either
index
should
automatically
alerts
the
design
process rows.
in the
array
host language
be used in
The distributed that
breaks
See also
a table
database
its
primary
processing.
meaning.
Information
keys
are
the
primary
immediate
engineering
A methodology
dependent
A relationship
of corporate
parent
during
that
Also
strategic
making.
goals into
IE focuses
data instead
on the
of the
resource
that
helpful
description
processes.
the
relationship
primary
key
its
the
execution,
commit
as the future
or
in
whole
engineering
basis for
of the
The
(IE) process
planning,
information
of an object
methods
ability
and
point.
experience.
(ISA)
developing,
and
systems.
In the object-oriented
inheritance
a transaction's
architecture
of the information
ability
See data
to inherit
classes
data model,
the
data
structure
above it in the
class
hierarchy. See also class hierarchy.
contains
entity.
reaches
systems
serves
and
manager
(DA).
Inheritance
called
A database update that is
the transaction
All
output
are
in which
identifying
entity's
key of the
(IE)
a company's
controlling
existence-dependent.
immediately
before
of
information
in tables.
or strong
update
performed
consists
into fragment
model, such identifiers
relationship
entities
that
as
A measure of how likely an index query
decision
In an ERM, unique names of each entity to
Cengage
used to
Also known
data and facilitates
translates
In the relational
because
deemed
are generally
data retrieval.
and row
transformed
information
relationship
has
key values
statements.
a strong
2020
of index
The result of processing raw data
administrator
identifying
review
the last
See index.
information
Any language that contains
SQL
identifiers
since
makes the
I
data,
key.
information
even
database
Indexes
selectivity
is to
fragmentation.
embedded
the
the
backup.
data and applications.
related
of data
A process that only backs up
changed
or full
key
index
user
See also synonym.
of unique
vertical
mapped
a set
are updating
up and facilitate
to reveal
instance.
over
control
results.
An ordered
speed
DDBMS).
or automatically
adjustments.
and
values.
A concurrency
ID values (pointers).
distributed
software
and
appropriate
subsets
used to check
of specified
management
heterogeneous
Homonyms
Some
for
to their
a list
The use of the same name to label attributes.
checks
operator
among
backup
incremental
an index
homonym
statement
retrievals
erroneous
incremental
See also heterogeneous
(fully
SQL
a 1:M
it.
that integrates
database
is
(aggregate) functions
data that
homogeneous
when the
value.
while other transactions
The top
below
that
summary
structure
segment
directly
the
This
a segment.
a value
inconsistent
formed
development.
on an upside-down
which each record
model
characteristics
one
SQL
In SQL, a comparison
whether
hierarchical
A cursor that is automatically
procedural
only
inheritance In the object-oriented data model, the ability of an object to inherit the data structure and methods of the classes above it in the class hierarchy.
inner join A join operation in which only rows that meet a given criterion are selected. The join criterion can be an equality condition (natural join or equijoin) or an inequality condition (theta join). The inner join is the most commonly used type of join. Contrast with outer join.
and stored
advantage
compiled
create
their
many
is its
applications
application
input/output
(I/O) request
that reads devices
A low-level
or writes data to
such
as
memory,
and from hard
operation
Java
sources,
video,
sources,
disks,
and
of
one
A SQL command that allows the insertion or
more
data
rows
into
a table
using
by
a
model
In
data abstraction a specific
internal
by the
model
requires
models
internal
adapts the
a designer
the
chosen
to
words,
match
the
conceptual
database
of
join
environment,
pools
and inconsistent
of an internal
to
data
has
created
text
key
managed
by
of a database
transaction
until the
first
several
ways.
primary
key (PK),
of
See also
The
values.
Notation)
A human-readable
data interchange
also
programming language
Microsystems software.
Java
that
runs
that
defines
concept
of
may be classified
on the
in
superkey,
candidate
key,
key,
and foreign
The attributes that form
key.
and
indicators
that
sales
earnings
attribute.
In business
or scale-based
a company's
in reaching
Examples by
per
(KPIs)
numeric
assess
goals.
turnovers, are
or
a primary
The attributes that form a primary prime
or success
operational
on top
applications
based keys
quantifiable
effectiveness
similar
secondary
key performance
An object-oriented
on
a Hadoop
key.
one
J
See
measurements
Web browser
for
dependence;
intelligence,
of the
and report in
share
Object
key attribute
is not
A process based on repetition
Sun
place,
in
procedures.
by
jobs
An entity identifier
functional
iterative
developed
event takes
page
and values in a document.
key attributes
Java
embedded
with the
Columns that join two tables.
format
key.
and
is
an object.
monitor,
generally
ends.
steps
design
K
often
and
transactions
process
on
processing
columns
by
a value.
used by one transaction
other
code
downloaded
to
In the old file system
A property
available
developed
operator used to
of independent,
which a data item
distribute,
JSON (JavaScript
supported
departments.
isolation
click
column(s)
attributes
an attribute
files.
JavaScript
when a specific
data
environment.
to those
constructs
tabular
Web authors
and then
a
of data
A central control program used
accept,
join
A representation
allows
mouse
MapReduce
model.
of information
different
to
the internal
and constraints
In SQL, a comparison
whether
islands
of a database
other
as a
allows
databases,
and text
Websites.
job tracker
The
in
An
that
A scripting language
Web pages,
such
model
them
with a wide range
relational
that
and activated
of
database.
IS NULL check
In
implementation
using
a level
conceptual
representation
DBMS.
schema
model
modelling,
model for implementation.
characteristics
selected
the
that
DBMS
model is the
as seen
the
database
run
(JDBC)
interface
spreadsheets,
Netscape
in
internal
to
including
interactive
subquery.
main
developers
and then
Connectivity
to interact
JavaScript
INSERT
Java's
application
once
programming
program
computer
printers.
Web server.
to let
environments.
Java Database
outer join.
on the
ability
of
promotion,
strategic
KPI are sales
and
product
by employee,
share.
key-value
A data model based on a structure
composed in
of two
which
every
of values.
The
associative
has
data
and
data
also
value
enable
or set called
A NoSQL
component
granularity
take
place
database
page,
of key-value is unintelligible
The body
a specific
familiarity,
subject.
awareness,
information
as it
characteristic
of information Knowledge
and
applies
is that
and facts
understanding
to
new
implies
an
environment.
knowledge
manager
design
the
L left
outer join
join
that
those
yields
that
table.
In a pair of tables to be joined, all the
rows
no
matching
have
For example,
with AGENT including
ones that
row.
LIKE
In
whether
values
outer
outer
an attribute's
string
pattern.
linking
table
a
requirements
and right
value
rows,
matches
logical
check
the
In the relational an
M:M relationship.
See
also
composite
model
mapping transparency
DDBMS in
which
A property
database
access
of a
requires
the
lost
end
know
fragments.
See
location which
database name
locations
lock
and location
requires
database
transaction
requires
See
(Fragment
model for
system,
such as DB2,
Access,
be changed (The
or Ingres.
which the
without
internal
a
affecting
model is
because it is unaffected which
the
in
software
storage
affect
devices
the internal
A concurrency
updates
by
is installed. or operating
model.)
control problem in
are lost
during
the
concurrent
of transactions.
mandatory
also local
transaction
a lock
used to translate
the internal
M
the user to know fragments.
be known.)
a particular
on
data
or Ingres.
transparency.
one
A device that guarantees unique use of a in
is
A condition in
model.
will not
execution
A property of a DDBMS in
access
not
of the
transparency.
data item
has
name
also location
of the
need
mapping
the
transparency
only the
both
can
updates
which user to
DB2,
and is therefore
design
management
a change
systems
a
as
design to the
Oracle, IMS, Informix,
computer
such
Access,
DBMS
into
independence
Therefore,
entity.
local
design
conceptual
the
model for
system,
Logical
hardware-independent
model, a table that
the internal
selected
database
internal
a specified
used to translate
A stage in the design phase
conceptual
SQL Server,
outer join.
is
Oracle, IMS, Informix,
of the
selected
used to
into
phase
DBMS and is therefore design
matches the conceptual
the
locks.
design
management
software-dependent.
matching
operator
text
other
of CUSTOMER
have
join
SQL, a comparison
implements
design
design
including
table,
design to the
Logical
database
that
CUSTOMER
do not
See also
table, in the
join
will yield all of the
the
AGENT
a left
in the left
A stage in the
conceptual
logical
a
and releasing
of the selected
SQL Server,
use. Locking can database,
The way a person views data.
software-dependent.
old knowledge.
for
(attribute).
matches the conceptual
requirements
be derived
to
data item
of lock
levels:
assigning
logical that
execution
the
A DBMS component that is
for
selected
and field
data format
A key
can
following
logical
of
to lock
The level
at the
row,
lock
to
operations
use.
responsible
knowledge
from
own
after the
transactions
lock
DBMS.
about
other
their
the
model.
data as a collection
which the
lock is released
a value,
value
model is
databases
stores
a key
a corresponding
key-value
(KV)
model that
the
key
elements:
or attribute-value
Key-value
pairs in
data
prior
operation. to
data
A
access;
participation occurrence
in another
EMPLOYEE
works in
company's
have
entity.
For example,
a DIVISION.
without
in which
a corresponding
being
(A
an
person
assigned
cannot
to
a
division.)
A relationship must
occurrence
be an employee
the
whole
entity
many-to-many (M:N or *..*) relationship Associations among two or more entities
in
which
associated
entity
one
with
occurrence
many
and one occurrence
associated
map
with
is
of the related
many occurrences
data into
must
entity is
of the first
subtask
within
Mapper
a larger
entity.
contains
the
rows
stores
but
SQL the
created
first
rows
materialised when the
actual
the
the summary
MAX
and
to
rows.
time
the
is run
are stored in the table.
view rows
module
base tables
are
attribute
and
updated
metadata
in
a given
and relationships.
See
also
global,
data
method
In the object-oriented set
of instructions
Methods represent invoked
to
actions,
and are
linked
platform
for
characteristic
of interest
regardless
the
development
any type
of operating
that
of
distributed,
applications of data
over
in rows
and
(columns).
unit,
and is
a system.
(2)
An
which modules are
to
each transaction.
The
uses
database management A database management proprietary
system
attribute
value
and
all that
elements
be
defined
defined in the database
in
arrays of n dimensions
is
required in the
model
has
In
by database and
other
(MPSD)
words,
of online
processing analytical
database
system
with support
for
processors
A scenario
in
single-site which
data
multiple
processes
sharing a single data
elements
by at least
be
processing,
run on different computers repository.
transactions
all data
must be used
management
multiple-site
multivalued attribute An attribute that can have many values for a single entity occurrence.
one
transaction.
known
column.
needed.
model,
online analytical
multiple data processors and transaction at multiple sites.
Defined as All that is needed is
is there
to store
multiple-site processing, multiple-site data (MPMD) A scenario describing a fully distributed
network
and programming
a given
techniques
aimed
any
A SQL aggregate function that yields the
minimal data rule
several
attributes
produce
An extension
database
minimum
of the
processing to multidimensional management systems.
language.
into
to the
A component-based
interoperable
manipulating
data
to
multidimensional
heterogeneous,
must
of horizontal
data fragmentation,
divided
as an autonomous
(MOLAP)
all
one
(1) A design segment that can be
data in matrix-like as cubes.
In a data warehouse, numeric facts that
Microsoft .NET framework
there,
for
may be
unique timestamp
system
user.
MIN
A combination
multidimensional system (MDBMSs)
an action.
messages.
measure a business
at
elements
by at least
data model, a
perform
real-world
through
metrics end
transactions
all data
timestamp value produces an explicit order in which transactions are submitted to the DBMS.)
dictionary.
named
words,
monotonicity A quality that ensures that timestamp values always increase. (The time-stamping approach to scheduling concurrent transactions assigns a
column.
Data about data; that is, data about
characteristics
and
module coupling The extent to which modules are independent of one another.
updated.
value
other
database
must be used
has a subset
sometimes
A SQL aggregate function that yields the
maximum
In
information system component that handles a specific function, such as inventory, orders, or payroll.
The
are automatically
by model,
strategies
implemented
materialised
query
model
a table
each row
generate
The
required in the
needed.
transaction.
vertical
which
command
is
mixed fragmentation
pairs as a
A dynamic table that not only query
Defined as All that is needed is
is there
elements
be defined
database
job.
view
all that
defined in the
A program that performs a map function.
materialised
view is
a set of key-value
and
all data
of a related
The function in a MapReduce job that sorts
and filters
data
of an entity
occurrences
minimal data rule there,
Glossary
For example, store
the
an EMP_DEGREE
string
different
BBA,
degrees
MBA,
requires
NoSQL
might
to indicate
three
A new generation of database
systems
held.
mutual consistency that
attribute
PHD
that
database
rule
all copies
A data replication
of data
fragments
rule
to
that
database
mutual exclusive rule one transaction on the
A condition in
at a time
same
can
own
an
exclusive
on the
management
traditional
relational
is
not
based
on the
management
traditional
relational
model.
NOT
which only
based
A new generation of database
systems
identical.
not
model.
NoSQL
be
is
A SQL logical operator that negates a given predicate.
lock
object.
null In SQL, the absence of an attribute value. Note that a null is not a blank.
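The distinction between a null and a blank can be checked with a short sketch. It assumes a hypothetical `customer` table with an optional `cus_initial` column, and uses Python's built-in `sqlite3` module purely for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customer (cus_code INTEGER PRIMARY KEY, cus_initial TEXT)")
# Customer 1 has no value at all (null); customer 2 has an empty string (blank).
conn.executemany("INSERT INTO customer VALUES (?, ?)",
                 [(1, None), (2, ""), (3, "K")])

null_codes = [r[0] for r in conn.execute(
    "SELECT cus_code FROM customer WHERE cus_initial IS NULL")]
blank_codes = [r[0] for r in conn.execute(
    "SELECT cus_code FROM customer WHERE cus_initial = ''")]
# Comparing with '= NULL' never evaluates to true, so it matches no rows.
equals_null = conn.execute(
    "SELECT cus_code FROM customer WHERE cus_initial = NULL").fetchall()

print(null_codes, blank_codes, equals_null)
```

The null row is found only with `IS NULL`, while the blank row is found with an ordinary equality test.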
N natural join
A relational
by selecting their
only the
common
natural
identifier
implies, forms
of time
in
object
A generally
real-world
objects.
to
day-to-day
for
point
A to
a data
packet
point
end users
to
objects
name
and
network
model
late
1960s
of record with
that
adds
a round
types
an owner
and
part
of
next-generation
record
type
and
a
member
record
(O/R
in a 1:M relationship.
network nodes
partitioning
become
network
node single
The delay imposed
suddenly
unavailable
when
due to
by
a
the
failure.
relationship
which
the
entity
does not contain
primary
entity.
non-key
many relational relational
database
was the
provide
for the
a unified
development
of
management
based
championed
researchers,
response
many within
extended
The ERDM,
database
models
on the
system
of the
to
the
constitutes OODM.
object-oriented
an inherently
This models
simpler
relational
structure.
instance.
non-identifying
parent
A DBMS
model (ERDM).
best features
of a
OLE-DB
to
database
model includes
In a graph database, the representation entity
data.
that
accessing
applications.
DBMS)
relational
Object
middleware for
strategy
framework
Database
Component
database
non-relational
object/relational
type
for
functionality
Microsoft's
object-oriented
sets
with other
Microsoft's
OLE-DB is
represented
as predefined
on
object-oriented
A data model standard created in and relationships
of a real-world
embedded
and Embedding
Based
relational
as a collection
identity,
and the ability to interact
Model (COM),
by the amount
B.
data
a unique
and itself.
(OLE-DB)
first
the
has
Object Linking
vocabulary.
make
An abstract representation that
properties,
As its
business
The delay imposed
required
from
values
entity
key is familiar
of their
network latency trip
common
identifier) for
a natural part
with
attribute(s).
key (natural
accepted
O
operation that links tables
rows
key
of the
dependent
the
Also known
attribute
A relationship primary
object-oriented
in
(many
model
whose
basic
model (OODM) modelling
A data
structure
is an object.
side)
object-oriented database management (OODBMS) Data management software
key of the related
as a weak relationship.
See nonprime
data
to
attribute.
manage
data
in
an
object-oriented
system used
database
model.
non-prime
attribute
An attribute
that is
not part
of
one-to-many relationship
a key. normalisation to
entities
A process
so that
that
assigns
data redundancies
attributes
are reduced
entities
or
are
used
among two or more
by data
models.
one entity instance
many instances
that
relationship,
eliminated.
(1:M or 1..*) Associations
of the
with
entity.
a 1:M
is associated
related
content
In
outer join
one-to-one (1:1 or 1..1) relationship Associations among two or more entities that are used by data models. In a 1:1 relationship, one entity instance is associated with only one instance of the related entity.
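One way a 1:1 relationship is often enforced in a relational table is a foreign key that is also declared unique. The sketch below assumes hypothetical `professor` and `department` tables (a professor chairs at most one department) and uses Python's `sqlite3` module as a stand-in DBMS.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE professor (
    prof_num  INTEGER PRIMARY KEY,
    prof_name TEXT
);
-- UNIQUE on the foreign key limits each professor to one department row.
CREATE TABLE department (
    dept_code      TEXT PRIMARY KEY,
    chair_prof_num INTEGER UNIQUE REFERENCES professor(prof_num)
);
INSERT INTO professor VALUES (1, 'Sample Professor');
INSERT INTO department VALUES ('CIS', 1);
""")

# Assigning the same professor as chair of a second department violates 1:1.
try:
    conn.execute("INSERT INTO department VALUES ('ACCT', 1)")
    second_chair_allowed = True
except sqlite3.IntegrityError:
    second_chair_allowed = False

print(second_chair_allowed)
```

Dropping the `UNIQUE` keyword would turn the same structure into a 1:M relationship, since many departments could then reference one professor.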
a table
analytical
support
system (DSS)
data analysis data
processing
analysis
making,
tools
techniques.
that
Decision
use
retained;
unmatched
Contrast
null.
that
modelling,
an advanced
supports and
transaction
systems
that
processing
operations
support
operations.
databases,
operational
that
support
retained;
unmatched
Contrast
overlapping The
developed
database
access
operational primarily
by
API to
database
to
support
operations.
hints
command
that
provide
(row)
disk
a
optimistic assumption
that
technique
most
for the the
SQL
such
pairs
related
are
table
are
See also left
outer
subtypes a condition
In a
in
which
supertype
can
each
appear
in
as a directly
A diskpage
system
locks
an entire
A diskpage
can
one or
more rows
and from
one or
partial
completeness in
not
which
a fixed
which
be
some
members
or
data for
more tables.
supertype
hierarchy,
occurrences
of any subtype.
In normalisation,
an attribute of the
diskpage,
contain
In a generalisation
dependency
(subset)
has
In this type of lock, the database
of a disk.
in
do not
be described of a disk.
section
partial
on the
operations
can
of a
as 4K, 8K, or 16K.
a condition
management,
based
database
in the
of the
section
management
database
inside
In transaction
control
which
page-level lock
day-to-day
embedded
block,
might
approach
unmatched
In permanent storage, the equivalent
size,
text.
a concurrency
are
one subtype.
addressable
applications.
as a transactional
are
values
hierarchy,
instance
page
Database
to
Special instructions
optimiser
table
P
database.
optimiser
all
with inner join.
entity
or
A database designed
Also known
or production
Microsoft
a company's
related
are
OLTP are known
(ODBC)
Windows
pairs
algebra JOIN operation that
which
specialisation
databases.
middleware
in the
(non-disjoint)
more than
databases,
Connectivity
in
left
day-to-day
transactional
Open Database
query
a company's
Databases
OLTP
(OLTP)
values
unmatched
with inner join.
a table
null.
all
join and right outer join.
decision
research.
online
algebra JOIN operation that
which
A relational
produces
multidimensional
OLAP creates
environment
business
(OLAP)
in
left
outer join
online
as
A relational
produces
is
dependent
primary
a condition
on only
a portion
key.
conflict. optional that
attribute
In
does not require
ER
modelling,
an attribute
a value; therefore,
it can be left
empty.
optional in
participation
which
one
entity
a corresponding
In ER modelling, a condition
occurrence entity
does
occurrence
The SQL logical
expressions
to
ORDER BY output
ascending
operator used to link
expressions
clause. It requires
in
in
database.
has
in
or
HAVING
conditional
A SQL clause that is useful for ordering query
(for
example,
order).
key
attributes
in
For
based
in the
CLASS,
on the
In partitioned
the
participants
databases,
determine
one or
the fragment
more in
will be stored.
data allocation
example,
teaches
a table that
that
to
database
See also fully
CLASS.
of dividing
fragments
not
is
and
partition
strategy
some
multiple sites.
a relationship.
relationship
partitioned
in
of only
at
PROFESSOR
which a row
of a SELECT
WHERE
A distributed
An ER term for entities that
PROFESSOR
multiple
copies
replicated
relationship
be true.
or descending
a
only one of the
which
are stored
teaches
conditional
the
database
database
fragments
participate
a particular
relationship.
OR
replicated
participants
not require
in
partially
are stored
A data allocation
a database
into
at two
two
or
be
or
more
more sites.
partitioning subsets
The process of splitting a table into
of rows
performance that
allows
a system
database
tuning
perform
access
perform
as though
it
were
in
Activities that
more efficiently
make a
in terms
a table,
only, previous
a framework
of fact)
can
about the time span of data
usually
years,
expressed
as current
to
line
with
stored
breaks
standard
extensions
SQL statements
primary
that
is
stored
A block of code
and
and
physical
data format
method
to improve
documents
the
through
the
use
of
and indention.
key (PK)
In the relational
composed
of one
a row.
or
model, an
more
attributes
Also, a candidate
at the
that
key
See also key.
DBMS
prime attribute A key attribute; that is, an attribute that is part of a key or is the whole key. See also key attributes.
The way a computer sees
data.
physical
design
maps the
data
function
and
Because
of the types
hardware,
the
system
private cloud A form of cloud computing in which an internal cloud is built by an organisation to serve its own needs.
A stage of database design that
storage
of a database.
the
(statement
or false.
selected as a unique entity identifier.
server.
(stores)
find()
uniquely identifies
procedural
executed
true
year
or all years.
module (PSM)
which an assertion as either
of retrieved
identifier
persistent
in
mathematics to
MongoDB, a method that can be
the
readability
of a variety
technologies
infrastructure.
be verified
In
management
Used extensively in
provide
pretty()
of storage
The coexistence data
an organisation's
chained
Information
and
predicate logic
a
speed.
periodicity stored
to
persistence storage
within
A DDBMS feature
DBMS.
performance and
of data
transparency
centralised
polyglot
or columns.
access
these
are
supported
by the
of devices
data access
physical
design
software-dependent.
See
characteristics
characteristics
methods
supported
are
hardware-and
also
both physical
a
Procedural Language SQL (PL/SQL) A type of SQL that allows the use of procedural code and in which SQL statements are stored in a database
by
as a single
model.
callable
object
that
can be invoked
by
name. physical independence physical
model
internal
can
be changed
affecting
procedures
the
model
described
the
hardware-and
Platform
as location,
data.
The
path,
physical
and format model is
are
both
as a Service (PaaS) provider
consumer-created
A model in which
can build and deploy
applications
using
the
In the
client-side, invoked
public.
providers
types
policies
application
browser
when
query
that is automatically needed
to
manage
for
statements
manage company
communication
and
of direction
operations
support
of the
that
to the
used
of
code.
A specific
by the
request
end user or the
DBMS.
to
language
SQL
issued
A nonprocedural language
by a DBMS
of a query
form
manipulation
query language
the
organisations
objectives.
review
in the
data
application
are
through
A question or task asked by an end user of a
database
of data.
General
used to
Q
World Wide Web(WWW), a
external by the
specific
of an activity or process.
infrastructure.
plug-in
during
public cloud A form of computing in which the cloud infrastructure is built by a third-party organisation to sell cloud services to the general
software-dependent.
cloud service
be followed
properties In a graph database, the attributes or characteristics of a node or edge that are of interest to the users.
A model in which physical
such for
Series of steps to
the performance
characteristics
cloud
without
which the
model.
physical
the
A condition in
manipulate is
its
data.
that is
An example
SQL.
query optimiser SQL
queries
access
the
access
A DBMS process that analyses
and finds data.
the
The
query
or execution
query result returned
most
efficient
optimiser
plan for the
set
contents;
way to generates
the
query.
The collection
the
JOIN,
PRODUCT,
and
relational (RDBMS)
of data rows
by a query.
a relational
the
RAID
An acronym for Redundant
to
Disks.
create
virtual
individual
the
use
multiple
volumes)
RAID systems
fault
Array of
systems
disks (storage
disks.
improvement,
RAID
from
provide
tolerance,
and
disks
database.
several
a balance
A collection
of related (logically
connected)
data
security,
query
A nested query that joins
data
a single
entity
married to
type.
A relationship
For example,
an EMPLOYEE
of another
or a PART is
IBM
within
an EMPLOYEE
for
is
and summarises
produce
a single
reducer
the results
of
also
to
help
language
creates
and
to the
(SQL)
and
into
provide
concurrent
administration
A graphical
easy
data
application
entities,
representation the
of
attributes
and the relationships
in
model 1970,
users
designers
within
among
the
a
major
A program that performs a reduce
conceptual
based
on
and represents
data
relations. Each relation (table) represented as a matrix of
rows
and columns.
The relations
are related to each other through of common entity characteristics columns).
result.
breakthrough
of its
model is
set theory
intersecting
by E. F. Codd of
because
The relational
as independent is conceptually
map functions
Developed
it represented
and
mathematical
The function in a MapReduce job that
collects
RDBMS
integrity,
databases
entities,
simplicity.
a component
PART.
reduce
and retrieve
entities.
a table
found
locate
dictionary
diagram
relational relationship
software
(queries)
A good
a data
and system
a relational
to itself.
recursive
RDBMS
requests
physically
a query
those
fields.
recursive
The
logical
data.
maintains
relational
record
DIFFERENCE,
programs.
between
two.
to
requested
through
performance
that
and
access,
are SELECT,
UNION,
DIVIDE.
a user's
commands
R
main functions
INTERSECT,
database management system A collection of programs that manages
translates
Independent
eight
PROJECT,
the sharing (values in
function.
redundant the
transaction
transaction
systems
log
to
referential
kept
ensure
will not impair
that
the
integrity tables
matching
or a
Relations to
common
each
table.
have
an invalid
a null Even
relational relational
entry.
through
the
of a
relationship
(a value in a column).
relational
the
A relationship higher.
table
An association degree
or participants
A set of mathematical principles manipulating
that
use
schema The organisation of a database as described by the database
relationship
Relations
sharing
functions
administrator.
model, an entity
as tables.
processing
processing
relational schema The organisation of a relational database as described by the database administrator.
a corresponding
database
other
basis for
have
characteristic
algebra
that form the
review
to
data.
must have either
are implemented
entity
relational
may not
In a relational
are related
key
Analytical
relational databases and familiar relational query tools to store and analyse multidimensional data.
of a disk
by which a
entry in the related
it is impossible
relation
failure
ability to recover
foreign
an attribute
attribute,
physical
online analytical
(ROLAP)
management
A condition
dependent
set.
the
relational
Multiple copies of
by database
DBMSs
entry though
logs
between entities.
The number of entities
associated
with a relationship.
degree can be unary, binary, ternary, or
Remote
Data Objects (RDO)
object-oriented remote
application
database
DAO
and
was optimised such
servers.
ODBC
as
for
to
RDO
direct
deal
A higher-level,
interface
used
uses
access
the
to
to
lower-level
Oracle, and
ROLLBACK
A SQL command that restores the
database
contents
after
databases.
with server-based
MS SQL Server,
access
RDO
the last
DB2.
request
a single
SQL
A DDBMS feature that allows
statement
remote
DP. See
remote
transaction
to
access
also remote
data in
data in
data
which
repeating
group
describing
a group
type
for
a single
example,
a car
interior,
In
key
bottom,
trim,
replica transparency the
existence
of
so
access
of the same For
colors
the
of data from the
user.
data allocation
strategy
which
in
fragments
are stored
replication
versions
place
access
copies
time
reserved cannot in
A data allocation
of one
at several
or
more database
different
words
Oracle
any
SQL, the
tables
right
outer
other
purpose.
is
For
cannot
In a pair of tables
yields
the
table.
CUSTOMER
used
with no
For with
to
matching
example,
AGENT
the
CUSTOMER row. join.
be used
to
scaling
be joined,
a right
outer
will yield
ones
any
A query
that
that
all
do not
join
of
of the
have
distributing
a cluster
up
when
optimisation
query optimisation
algorithm
uses
A query
preset rules
first,
involves
powerful
to
aggregate
and
making the
up the
by
migrating
not
the
structures
servers.
same
with data growth
structure
to
more
systems.
scheduler
matching
the
The DBMS
order in
execution
which concurrent
to
ensure
that
transaction
The scheduler
of database
sequence
component
operations
interleaves
operations
in
establishes
the
a specific
serialisability.
different
data is the exact
storage
AGENT
a
used with the
data
data
of commodity
A method for dealing
are executed.
Rolling
even
A method for dealing with data growth
involves
scaling
See also left outer join and outer
BY clause
table,
page.
portion is calculated
out
across
schema
opposite
of drilling down the data. See also drill down.
same
to
S
example,
values in
In SQL, an OLAP extension
dimensions.
of the same
optimiser
multiplication
all of the rows in the right table,
ones
including
GROUP
rows
on the
technique
that
roll up
are
transactions
correct answer 17.
word INITIAL
join
the
rows,
of rows.
or columns.
including other
set
which
database lock in
concurrent
optimisation
that
a join that
different
rows
allows
query optimisation
Words used by a system that
for
of a given
rule-based
and to improve
and fault tolerance.
be used
name
Replication
locations
blocks,
rules of precedence Basic algebraic rules that specify the order in which operations are performed. For example, operations within parentheses are executed first, so in the equation 2 + (3 × 5), the multiplication portion is calculated first, making the correct answer 17.
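The arithmetic in the precedence example can be checked directly. This is a small sketch in Python, whose operators follow the same conventional precedence; the variable names are illustrative only.

```python
# Parentheses are evaluated first: 3 * 5 = 15, then 2 + 15.
with_parentheses = 2 + (3 * 5)

# Without parentheses, multiplication still binds tighter than addition.
default_precedence = 2 + 3 * 5

# Forcing the addition first gives a different result: 5 * 5.
addition_first = (2 + 3) * 5

print(with_parentheses, default_precedence, addition_first)
```

The first two expressions agree because multiplication already takes precedence over addition; only the third, which regroups the operands, changes the answer.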
sites.
of a database.
in
in
points to determine the best approach to executing a query.
The process of creating and managing
duplicate to
copies
stored
mode based on the rule-based algorithm.
The DDBMS's ability to hide
replicated
A physical data storage
data is
A less restrictive
DBMS
rule-based
for its top,
on.
multiple copies
existed
statement.
all columns
lock
that
row-level trigger A trigger that is executed once for each row affected by the triggering SQL statement. A row-level trigger requires the use of the FOR EACH ROW keywords in the trigger declaration.
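A minimal sketch of a row-level trigger, assuming hypothetical `product` and `price_audit` tables. It uses SQLite through Python's `sqlite3` module as a stand-in; SQLite accepts the FOR EACH ROW keywords, although trigger syntax differs between DBMSs.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE product (p_code TEXT PRIMARY KEY, p_price REAL);
CREATE TABLE price_audit (p_code TEXT, old_price REAL, new_price REAL);

-- Fires once for every row changed by the triggering UPDATE statement.
CREATE TRIGGER trg_price_audit
AFTER UPDATE OF p_price ON product
FOR EACH ROW
BEGIN
    INSERT INTO price_audit VALUES (OLD.p_code, OLD.p_price, NEW.p_price);
END;

INSERT INTO product VALUES ('A1', 10.0), ('B2', 20.0), ('C3', 30.0);
""")

# A single UPDATE statement that affects two of the three rows.
conn.execute("UPDATE product SET p_price = p_price * 2 WHERE p_price >= 20.0")
audit_rows = conn.execute("SELECT COUNT(*) FROM price_audit").fetchone()[0]
print(audit_rows)
```

One statement produced two audit rows, one per affected row, which is the behaviour that distinguishes a row-level trigger from a statement-level trigger.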
request.
occurrence.
multiple
and
the
a characteristic entries
attribute have
to
also remote
a relation,
of multiple
can
requests)
DP. See
from
row-level
transaction.
by several
remote
which
condition
a single
A DDBMS feature that allows
(formed
a single
COMMIT
in
access
a transaction
to the
storage
technique hold
remote
table
row-centric
databases
such
A logical
as tables,
grouping of database objects,
indexes,
views,
and queries, that
are
related
to
single
user
each other.
schema such
to
single
user
to
a
types
set theory
grouping of database objects,
indexes,
views,
other.
and
Usually,
queries,
a schema
that
are
belongs
to
deals a
the
or application.
design,
according
to
operational
with
sets,
basis for
in the
is in
normal form (2NF)
normalisation
process,
1NF and there
(dependencies
purposes.
For
customer
combination
example,
number
appropriate
table
See
also
equivalent
of a file
systems
SELECT
A SQL command or a subset
not likely
models
data
that
modelling
both
structure in
data
the
the
a table.
known
1981,
The
data from
by
the
a data
other
database.
See
The SDM, M. Hammer
world,
in
a single
to
in
which
computers
one
single-site
all processing
is
done
on a single
and all data are stored
local
on the
disk.
database
user
data (SPSD)
A database that supports
at a time.
An attribute that can have
snowflake schema A type of star schema in which dimension tables can have their own dimension
published and
Compare
slice and dice The ability to cut slices off a data cube (drill down or drill up) to perform a more detailed analysis.
real
relationships
as an object.
was developed
allows
the
from
on the
components.
single-valued attribute only one value.
values
The first of a series of data represented
meaningful
processing,
single-user only
yields
in
attribute.
CPU or host computer
type.
that
into
composite
A scenario
data model, the
and their
access
data
held
An attribute that cannot be
subdivided
host
used to retrieve
closely
are
A shared lock to
attribute
single-site
middle initial, match
record
model
more
transaction.
as model.
when a
to read
locks
used
relational
to
but the
tables.
semantic
permission
no exclusive
transactions
simple
key.
of rows
is
and is
in the
also exclusive lock.
key).
key),
will probably
row.
statement
are
name,
In the hierarchical
of all rows
primary
(primary
name, first
add and intranets.
dependencies
customers number
of last
and telephone
a relation
of things,
manipulation
A lock that is issued
and
read-only
A key used strictly for data retrieval
know
SELECT
which
are no partial
key
segment
in
in only part of the
secondary their
The second stage
extensions
Web servers
or groups
data
requests
database
by another
second
to
A part of mathematical science that
transaction
requirements.
Server-side
functionality
shared lock
The part of a system that defines the extent
of the
of requests.
A logical each
scope
belongs
significant
as tables,
related
Usually, a schema
or application.
tables.
D.
The snowflake
normalising
schema
is
usually the result
of
dimension tables.
McLeod.
Software
semi-structured processed
to
sentiment that
data some
positive,
to
negative,
serialisability order
A method of text analysis
determine
a statement
or neutral
state
operations
that
would
creates
have
had been executed
server-side
extension
with the
process
same
produced
instances
if the
the
In
measurement
multidimensional of the
data analysis, a
data density
held in the
data
cube.
specific
which
applications
is low.
sparsity
that interacts
to handle
A model in
sparse data A case in which the number of table attributes is very large but the number of actual data
final
in a serial fashion.
A program
server
been
the
(SaaS)
software independence A property of any model or application that does not depend on the software used to implement it.
a
attitude.
transactions
directly
conveys
A property in which the selected
of transaction
database
if
as a Service
the cloud service provider offers turnkey that run in the cloud.
extent.
analysis
attempts
Data that have already been
specialisation top-down
hierarchy
process
specific
entity
supertype.
subtypes
A shared,
stores the
called
SQL
data
services
that
provide
amount
to
star
the
a given
database.
data
a central
stateless server
system
memory between
the
client
statement-level
more
if the
omitted.
This type
or after
the
static
in
tables.
clients
server.
A SQL trigger that is
of trigger
is
executed
statement
completes,
optimisation
which the
predetermined
A query
access
path to
at compilation
optimisation
SQL
SQL statements
do not
change
SQL in which the
while the
logic or
language.
of data inputs
about
which
before
storage.
A relationship
dependence
data
that
to
keep
occurs
in the
A data characteristic in database
thus requiring
that
schema
changes
affects
in all access
enable
whole
users to
create
database
COMMIT
protocol.
A query that is embedded (or nested)
another
or an inner
discard
relationship
subquery inside
application
is running.
code
The processing
using the two-phase
A style of embedded
Business
of SQL
procedural
decisions
which data to
(2)
form
of
subordinate In a DDBMS, a data processor (DP) node that participates in a distributed transaction
is with
dynamic query optimisation. static
the
by
code.
and table structures, perform various types of data manipulation and data administration, and query the database to extract useful information.
and is
Contrast
statements.
in
make
of commands
before
a database
time.
as indicated
program
Structured Query Language (SQL) A powerful and flexible relational database language composed
are once,
a value,
and
structured data Unstructured data that have been formatted to facilitate storage, use, and information generation.
state
ROW keywords
of procedural
structural independence A data characteristic in which changes in the database schema do not affect data access.
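Structural independence can be illustrated with a small sketch: a query that names its columns keeps working after the table's structure changes. The `agent` table and its columns are hypothetical, and Python's `sqlite3` module is used only as a convenient stand-in DBMS.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE agent (agent_code INTEGER PRIMARY KEY, agent_name TEXT)")
conn.execute("INSERT INTO agent VALUES (501, 'Sample Agent')")

# The application's data access is expressed by column name, not by position.
query = "SELECT agent_name FROM agent WHERE agent_code = 501"
before = conn.execute(query).fetchall()

# Schema change: add a new column to the table.
conn.execute("ALTER TABLE agent ADD COLUMN agent_phone TEXT")

# The unchanged query still returns the same result after the change.
after = conn.execute(query).fetchall()
print(before == after)
```

The query text is identical before and after the `ALTER TABLE`, so the application code that issues it would not need to be modified.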
not reserve
communications
DBMS
best
programs.
a
which a Web of the
SQL
processing to
group
in its
DBMS-specific
which a change
case.
query
mode in
an open
and
data access,
represents table
the
(1) A named collection
on a server
structural
data into
dimension
Web does
FOR EACH
triggering
default
The
trigger
assumed
the
or
and the
used
used to
as a fact
A system in
maintain
are
The
determine
when two entities are existence-dependent; from a database design perspective, this relationship exists whenever the primary key of the related entity contains the primary key of the parent entity.
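The design-level condition described here (the related entity's primary key contains the parent's primary key) can be sketched as follows. The `course` and `class` tables and their columns are assumptions chosen for illustration, with Python's `sqlite3` module as the stand-in DBMS.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled
conn.executescript("""
CREATE TABLE course (crs_code TEXT PRIMARY KEY);
-- The child's primary key includes the parent's primary key (crs_code).
CREATE TABLE class (
    crs_code      TEXT    NOT NULL REFERENCES course(crs_code),
    class_section INTEGER NOT NULL,
    PRIMARY KEY (crs_code, class_section)
);
INSERT INTO course VALUES ('CIS-420');
INSERT INTO class VALUES ('CIS-420', 1);
""")

# Columns that take part in the child's primary key.
pk_columns = [row[1] for row in conn.execute("PRAGMA table_info(class)") if row[5] > 0]

# Existence dependence: a class row cannot exist without its parent course.
try:
    conn.execute("INSERT INTO class VALUES ('XYZ-999', 1)")
    orphan_allowed = True
except sqlite3.IntegrityError:
    orphan_allowed = False

print(pk_columns, orphan_allowed)
```

Because `crs_code` is part of the child's primary key and is also a foreign key, a `class` row is identified through, and cannot outlive, its parent `course` row.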
set of
support
known
one
with it.
to
strong
output.
does not know the status
communicating
end.
Standards
statement
procedure
order
and
minimum
The star schema
with
in
answer
modelling technique
table
that returns
stored
minimum
decision
a relational
1:M relationship
the activity.
multidimensional
using
server
of the
A data
SQL statements
stream
correct
the
and specific
quality
schema
map
at the
A named
procedural
access,
a database.
to
function
another
the
using
describes
for
evaluate
data storage,
returns
of time,
A detailed that
requirements
management
about
statistics
that uses
strategy.
a RETURN
Activities to help
that
of resources
standards
Data
tuning
query
information
stored
relational
amount
instructions
or
and functions.
over the internet.
a SQL
in the least
stored
SQL statements
triggers
based query optimisation A query optimisation technique
uses these
access
memory area that
executed
(SDS)
SQL performance generate
then
cache.
services
management
statistical unique
of the subtypes.
including
procedure
entity
on grouping
reserved
most recently
procedures,
Also
based
statistically algorithm
more
a higher-level
is
and relationships
SQL cache
PL/SQL
lower-level,
from
Specialisation
characteristics
and
A hierarchy based on the
of identifying
query.
Also known
as a nested
query
query.
subschema the
In the network
database
seen
that
produce
the
database.
subtype
the
desired
table
programs from
the
each
data in
to
The attribute in the
that
determines
supertype
to
which
occurrence
of all values for a given
super column
column
only
entity
is related.
that is
composed
lock lock
any row
In
as a file
a command
of other related
scheme that allows
locks
to
access
an entire
a table.
table,
by transaction
markup languages inserted
document
in
should markup
Web browser
preventing
T2 while
for
such as HTML and XML,
a document
to
be formatted. languages
specify
Tags
how
are
used in
and interpreted
presenting
by a
data.
An attribute or attributes that uniquely each
entity
in
a table.
task
See key.
trackers
framework surrogate
key
generally
numeric
A system-assigned
primary
key,
tasks
responsible
object,
relationship;
MapReduce
to running
map and reduce
and auto-incremented.
The use of different names to identify
same
A program in the
on a node.
ternary synonym the
storage space
known
T1 is using the table.
server-side
superkey
Also
at a time
A table-level
the
columns.
identify
data.
A locking
access to
tag
database, a
of a group
related
one transaction
transaction
or expression.
In a column family
In a DBMS, a logical
group
group.
A SQL aggregate function that yields the sum
column
space
used
table-level entity
subtype
model, the portion of application
information
discriminator
supertype
SUM
by the
such
as an entity,
synonyms
should
an attribute,
generally
relationship
an association or a
For example,
be avoided.
An ER term used to describe
(relationship)
between
a CONTRIBUTOR
to a FUND from
three
entities.
contributes
which a RECIPIENT
money
receives
money.
See also homonym.
theta join system
catalogue
A detailed system data
dictionary
that
describes
systems
administrator
for coordinating
all objects
in
inequality
a database.
join
The person responsible
an organisation's
need
for
systems
be
The process
traces The
the
SDLC
database
2NF
history provides
out
and
big
picture
to
development
evaluated.
each
A matrix composed and
columns
an entity
set in the
relational
that model.
rows
that
assigns
data
of time.
top-down
Also
called
a
relation.
design and
that is, it
management, concurrent unique
timestamp
to
the
right
history
to
can
of all
is tracked.
main structures
moves
rights,
data
A design philosophy
then
the
time-variant
when a company's
by defining
part.
attribute;
Data whose values are a
appointments
begins
Cengage
>=) in the
is functionally
a global
For example,
system
or in
>, ,