Concepts, skills and practices of effective programming
Elements of
Programming with Perl
Andrew L. Johnson
MANNING Greenwich (74° w. long.)
For Susanna, Joseph, and Thomas

For electronic browsing and ordering of this and other Manning books, visit http://www.manning.com. The publisher offers discounts on this book when ordered in quantity. For more information, please contact:

Special Sales Department
Manning Publications Co.
32 Lafayette Place
Greenwich, CT 06830
Fax: (203) 661-9018
email: [email protected]
© 2000 by Manning Publications Co. All rights reserved.

No part of this publication may be reproduced, stored in a retrieval system, or transmitted, in any form or by means electronic, mechanical, photocopying, or otherwise, without prior written permission of the publisher.

Many of the designations used by manufacturers and sellers to distinguish their products are claimed as trademarks. Where those designations appear in the book, and Manning Publications was aware of a trademark claim, the designations have been printed in initial caps or all caps.

Recognizing the importance of preserving what has been written, it is Manning's policy to have the books we publish printed on acid-free paper, and we exert our best efforts to that end.
Library of Congress Cataloging-in-Publication Data

Johnson, Andrew L., 1963-
Elements of Programming with Perl / Andrew L. Johnson.
p. cm.
Includes bibliographical references and index (p. 348).
ISBN 1-884777-80-5 (alk. paper)
1. Perl (Computer program language) I. Title.
QA76.73.P22J644 1999
005.13'3--dc21 99-42510 CIP

Manning Publications Co.
32 Lafayette Place
Greenwich, CT 06830

Production services: TIPS Technical Publishing
Copyeditor: Adrianne Harun
Typesetter: Lorraine B. Elder
Cover designer: Leslie Haimes

Printed in the United States of America
1 2 3 4 5 6 7 8 9 19 - CM - 02 01 00 02
contents

preface
acknowledgments

Part I  Introductory elements

1  Introduction
    1.1  On programming
    1.2  On Perl
           Getting started
           Running Perl
           Getting help
    1.3  A bigger picture

2  Writing code
    2.1  Structure
    2.2  Naming
    2.3  Comments
    2.4  Being strict
    2.5  A quick style guide

3  Writing programs
    3.1  A first program
           Specification
           Design
           Coding
           Testing and debugging
           Maintenance
    3.2  faqgrep
    3.3  Exercises

Part II  Essential elements

4  Data: types and variables
    4.1  Scalar data
           Scalar variables
    4.2  Expressions
    4.3  List data
           Array variables
           Hash variables
    4.4  Context
    4.5  References to variables
    4.6  Putting it together
    4.7  Exercises

5  Control structures
    5.1  Selection statements
    5.2  Repetition: loops
    5.3  Logical operators
    5.4  Statement modifiers
    5.5  Putting it together
    5.6  Exercises

6  Simple I/O and text processing
    6.1  File handles
    6.2  Pattern matching
           Matching and substitution operators
           Regex language constructs
           Matching constructs
    6.3  Split and join
    6.4  The DATA file handle
    6.5  Putting it together
    6.6  Exercises

7  Functions
    7.1  Scope
    7.2  Global variables
    7.3  Parameters
    7.4  Return values
    7.5  Designing functions
    7.6  Parameters and references
    7.7  Recursion
    7.8  Putting it together
           Revisiting the mathq program
    7.9  Exercises

8  References and aggregate data structures
           Creating references
           Scope and references
           Nested or multi-dimensional arrays
           Nested hashes
           Mixed structures
    8.3  References to functions
           Closures
    8.4  Nested structures on the fly
    8.5  Review
    8.6  Exercises

9  Documentation
    9.1  User documentation and POD
    9.2  Source code documentation
           Other uses of LP
    9.3  Tangling code
           A simple tangler
           Routine examples
    9.4  Further resources

Part III  Practical elements

10  Regular expressions
    10.1  The basic components
    10.2  The character class
           Search and replace: capitalize headings
           Character class shortcuts
    10.3  Greedy quantifiers: take what you can get
    10.4  Non-greedy quantifiers: take what you need
    10.5  Simple anchors
    10.6  Grouping, capturing, and backreferences
           Prime number regex
    10.7  Other anchors: lookahead and lookbehind
           Inserting commas in a number
    10.8  Exercises

11  Working with text
    11.1  The match operator
           Context of the match operator
    11.2  The substitution operator
    11.3  Strings within strings
    11.4  Translating characters
    11.5  Exercises

12  Working with lists
    12.1  Processing a list
    12.2  Filtering a list
    12.3  Sorting lists
    12.4  Chaining functions
    12.5  Reverse revisited
    12.6  Exercises

13  More I/O
    13.1  Running external commands
    13.2  Reading and writing from/to external commands
    13.3  Working with directories
    13.4  Filetest operators
    13.5  faqgrep revisited
    13.6  Exercises

14  Using modules
    14.1  Installing modules
    14.2  Using modules
    14.3  File::Basename
    14.4  Command line options
    14.5  The dating game
    14.6  Fetching webpages
           Stock quotes and graphs
    14.7  CGI.pm
    14.8  Reuse, don't reinvent
    14.9  Exercises

15  Debugging
    15.1  Debugging by hand
    15.2  The Perl debugger

Part IV  Advanced elements

16  Modular programming
    16.1  Modules and packages
    16.2  Making a module
    16.3  Why make modules?
    16.4  Exercises

17  Algorithms and data structuring
    17.1  Searching
    17.2  Sorting
    17.3  Heap sort
    17.4  Exercises

18  Object-oriented programming and abstract data structures
    18.1  What is OOP?
    18.2  OOP in Perl
           The basics
           Inheritance
    18.3  Abstract data structures
    18.4  Stacks, queues, and linked lists
           Stacks
           Queues
           Linked lists
    18.5  Exercises

19  More OOP examples
    19.1  The heap as an abstract data structure
    19.2  Grades: an object example
    19.3  Exercises

20  What's left?

appendix A  Command line switches
appendix B  Special variables
appendix C  Additional resources
appendix D  Numeric formats
glossary
index
preface

The Norse God Odin had two ravens, Hugin and Munin (Thought and Memory). He would send them out each day to fly to the corners of the earth. At night, they would return and tell him all their secrets. Odin knew how to manage processing and storage resources. Thought and memory, cogitation and recall: these are the important resources you too must tame as a programmer. This book aims to be your guide in this endeavor. By the time you finish this book, you should have the skills to manage your own Hugin and Munin. In other words, you will be able to write your own hugin program to scour the web for interesting information, as well as a munin program to manage and query the database of information you collect.

There are a lot of books about Perl on the market today, and some of them I recommend highly. (See Appendix C, "Additional resources.") However, many authors of these other Perl books assume readers are already familiar with programming. Other authors take the side-effect approach, teaching readers the vocabulary and syntax of the language but offering few guidelines on how to use it effectively. I do not believe that the side-effect approach is an effective means of teaching programming.

This book instead presents the basic elements of programming using the context of the Perl language. I do not assume that you've programmed before, nor do I merely hammer you with syntax and function names. This book is designed to teach you both programming and Perl, from the basics to the more advanced skills you need to become an accomplished Perl programmer.
Audience

This book is intended for two types of readers: those approaching Perl as their first programming language and those who may have learned programming off the cuff but now want a more thorough grounding in programming in general, and Perl in particular.

More people than ever are learning Perl. Undoubtedly, Perl's widespread use for Common Gateway Interface (CGI) and web-client programming contributes to its popularity. Some people need Perl programming skills for their jobs, while others just think Perl is cool. Whatever your motivation, you need to understand up front that this book is not about using Perl for web-related programming, although an example or two illustrates that application of Perl. Instead, this book is about learning how to program using Perl. Once you have that knowledge under your belt, you can apply it to a multitude of problem domains.

This book does not assume that you know what variables, arrays, and loops are, or that you've programmed before. However, familiarity with basic mathematical concepts and logic will certainly be helpful. Readers with no prior programming experience should, of course, begin at the beginning and work their way through the first nine chapters in order. Chapters 10 through 15 are largely independent and can be read in any order. Chapters 16 through 19 introduce advanced Perl concepts. Each of these chapters lays a foundation for the following chapter, so read these four chapters in order. If you are already familiar with elementary Perl programming, you may want to read chapters 2 and 3, and then pick and choose chapters that tackle areas in which you wish to improve. For example, chapters 6, 10, and 11 cover different aspects of regular expressions and matching operators. Chapter 8 covers references.

If you are a competent programmer in another language, this book may still be useful in demonstrating Perl's way of doing things. However, the discussions may not be as concise as you'd like, and the content is not organized as a reference book.
Organization

The book is organized in four main parts, starting with things to consider before you begin programming, followed by the essential aspects of programming with Perl. The third section explores a few of the more practical and Perl-specific areas. Finally, the later chapters introduce more advanced concepts, such as abstract data structures and object-oriented programming using Perl.

Introductory elements
The three chapters in this section provide elementary information on programming and the Perl language. Chapters 2 and 3 also delve into the basics of program structure and design. In chapter 3, we work through two examples, providing a whirlwind tour of the Perl language in the process.

Essential elements
Chapters 4 through 9 cover the essential concepts and structures you need to learn to program effectively. Here you will find everything from variables to loop control constructs to file input and output to basic regular expressions to subroutines to references and nested data structures. When you finish this section of the book, you will have all the tools you need to build real-world applications.

Practical elements
Chapters 10 through 15 take you into areas more specific to the Perl language, exploiting some of Perl's unique and powerful strengths. Here we explore regular expressions in more detail, string and list processing, more input and output techniques, using modules, and the Perl debugger.

Advanced elements
Chapters 16 through 19 provide an introduction to more advanced programming techniques, including building modules and abstract data structures. You are also introduced to object-oriented programming features in Perl. Chapter 20 mentions a few areas not covered in this book and suggests references for further study.

Appendices
The four short appendices cover command-line switches, special Perl variables, additional resources for readers, and a brief explanation of binary, octal, and hexadecimal numeric representation. Following the appendices is a small glossary of technical terms used in this book.
Source code, solutions, and errata

The source code for many of the example programs and modules presented in this book may be obtained from Manning's website. Point your browser to http://www.manning.com/Johnson for links to the online resources for the book, including source packages.

Many chapters have a small number of exercises at the end. There is no appendix of answers to these exercises. However, the web page mentioned above contains a link to a solutions page.

Finally, although we have strived to eliminate mistakes from the manuscript, some errors may have slipped through. The previously mentioned web page contains a link to an online errata sheet listing corrections to errors discovered after the book was published. If you find any errors, please let us know so we can list them on the errata sheet and fix them in a later printing. The errata page lists an email address to which you can submit error reports.
Conventions

In this book, "Perl" (uppercase P) refers to the Perl programming language, while "perl" (lowercase p) refers to the perl compiler/interpreter or the perl distribution. Filenames and URLs appear in italics. Code, program names, and any commands you might issue at the command line prompt appear in a fixed-width font.

Some blocks of code are written using a form of literate programming (LP) syntax to break the code into smaller chunks for presentation (explained in chapters 3 and 9). In these cases, the real Perl code, or the pseudo-code, is in a plain fixed-width font, while the lines representing literate programming syntax are in an italic fixed-width font.

Many of the technical terms introduced in this book are defined in the glossary. When the terms first appear in the text, they are italicized.

You will also see the words foo and bar throughout this book. These are commonly used generic terms in syntax examples to represent the name of a variable or the contents of a string. They are used when the point of the example is not directly related to whatever is being called "foo." You might encounter these terms, and many similar "dummy" words such as foobar, baz, qux, and quux, in publicly available articles and examples on programming.
Author online

Purchase of Elements of Programming with Perl includes free access to a private Internet forum where you can make comments about the book, ask technical questions, and receive help from the author and from other Perl users. To access the forum, point your web browser to www.manning.com/Johnson. There you will be able to subscribe to the forum. This site also provides information on how to access the forum once you are registered, what kind of help is available, and the rules of conduct on the forum.
acknowledgments

No author works in a vacuum, and I am certainly no exception. A large number of people deserve my gratitude for their help and support while I undertook this "little" project.

First I would like to thank Marjan Bace, the publisher, for taking a chance on my ideas, letting me run with them, and reining me in when necessary. He seems to take seriously the view that an author and publisher are partners, and it shows. His pleasant and personal style during our many phone conversations was always a welcome relief from the toils and stresses of making a living while writing a book.

Many other people at Manning contributed considerable efforts to make the book you now hold in your hands something of which I can be proud. I'd like to thank Brian Riley; Ben Kovitz, for his many insightful suggestions; Ted Kennedy, for managing the review process; Mary Piergies, for managing the production process; Syd Brown, for patiently protesting the results of my script to convert the LaTeX chapters into MML format; Adrianne Harun, for her steadfast attention to detail in copyediting; and Robert Kern, Lynanne Fowie, and Lorraine Elder, for turning the manuscript into typeset pages.

A great many reviewers also caught embarrassing errors and glitches and provided helpful comments and suggestions. I would like to thank Tad McClellan, Randy Kobes, Brad Fenwick, Jim Esten, Paul Holser, Dave Cross, Patrick Gardella, Mike Mullins, Michael Weinrich, Peter Murray, Richard Nilsson, Umesh Nair, Vasco Patricio, and Richard Kingston. My brother Brad Johnson also provided crucial advice at a couple of junctures in the book. All of these people are responsible for helping make this a better book. Any problems or errors that remain are my responsibility.

Parts of this book were written, or at least conceptualized, in my notebook while I was at Bellamy's, the local corner pub and restaurant, where I could almost always find one of my preferred beers on tap. Bellamy's offered a quiet table or booth when I needed to work, and a seat at the bar when I needed to rejoin the living. I'd like to thank Linda E., the bartender, for not only serving up good ale, but also being a cheerful friend.

Anyone who uses Perl and enjoys it as much as I do owes a serious debt of gratitude to Larry Wall for creating this gem and giving it to us so freely. Also, many thanks are due to Tom Christiansen, whose long and continued efforts have been instrumental in creating and maintaining a vast and informative documentation set that, in my opinion, remains unmatched in any other software documentation. Thanks are also due to the many and varied regular posters on comp.lang.perl.misc from whom I have learned much, the perl5-porters who continually work to improve Perl, and the rest of the Perl community who keep sharing code and ideas and taking Perl into new territory.

Lastly, but certainly not least, I'd like to thank my mother and three brothers who are always encouraging and never lacking in good advice whether you want it or not; my wife, Susanna, just for being who she is (and for putting up with my irregular hours and countless other faults...um, eccentricities); and my sons, Thomas and Joseph, for their patience and understanding when Dad disappeared into his office, and for continually keeping me in touch with the pleasant side of reality: playing games, reading stories, building snowmen, and other simple pleasures.
Part I
Introductory elements

Chapter 1
Introduction

1.1 On programming
1.2 On Perl
1.3 A bigger picture
To write a story, fiction or fact, one must grasp the basics of the language in use, enough at least so that one can make simple statements that may be understood by readers. One must also be able to string together a series of such statements in a manner that communicates meanings and events in relation to time and space. Stories also have certain compositional elements like a beginning, a middle, and an end, although the presentation need not proceed in that order. Listen to writers talk about writing, and you'll find that writing is seldom the result of any "holistic inspiration" where the writer realizes the entire story at once in the mind and simply writes it down. Much more often, you'll find writers claiming that inspiration is rare and fleeting, and usually comes during the writing process, not the other way around.

Programming is similar in many ways to writing. One must know the basics of the language and how to string together a series of statements such that events and meaning are described in space and time. Structural elements (beginnings, middles, and ends) are also important in program composition. Finally, real inspiration generally occurs while we are immersed in the development process, not before.

Whether programming is an art or a science is often a subject of discussion. But programming is neither an art achievable only by innately talented right-brain visionaries, nor a result of strictly scientific left-brain logic and first principles. It is a craft, like writing, that can be learned, practiced, developed, and honed to a variety of skill levels for a multitude of purposes. Some writers write great literary novels, others write pulp fiction, and many people who are not writers by trade write all kinds of narratives from business reports to postcards to notes on little squares of adhesively backed yellow paper. Similarly, some programmers write elegant programs solving massively complex problems; others produce solid utilitarian code daily; and many, who are not programmers by trade, crank out all manner of tools and big and little programs and go on with their real jobs.

So, if programming is a craft like writing, and writing is about telling narratives, what is programming about? Programming is about solving problems.

1.1 On programming

Computers are mindless devices capable only of doing what they are told. Before we talk about the activities of programming we need to have a basic understanding of what a program is and the role it plays in turning an expensive piece of mindless hardware into a useful device.
Imagine the following scenario: You are taken to a room containing a desk and a chair. On the left of the desk is an "in-box" full of pages of numbers, and on the right is an empty "out-box." In between lies a manila envelope, a calculator, a pad of paper, and a stack of blank forms. You are told to open the envelope and follow the instructions you'll find inside. The simple and explicit instructions tell you to take the first page of numbers from the in-tray and perform certain simple arithmetic operations on particular sets of those numbers and to write the results of these calculations into certain numbered boxes on one of the blank forms. When you've completed one page of numbers, you must place the newly filled out form into the out-tray and begin again with the next page of numbers in the in-tray.

In this scenario, you are operating essentially as a mindless computing device: taking input, performing simple operations, and writing out the results. The instructions are simple and do not require "thought" or interpretation, merely that you can read numbers, operate the calculator, temporarily store intermediate results on the pad of paper, and control the output device (pencil) to write out the results (see figure 1.1).
[Figure 1.1 Processing data. Diagram labels: Input Device, Input, Memory (Instructions, Other Storage (scratch pad)), CPU (brain), ALU (calculator), Output, Output Device]

An important point to note is that you have no idea about what you are doing in this scenario. You may be just one part of a team performing various steps in a complex encryption/decryption scheme, or you might simply be balancing my checkbook. At any rate, you would not likely enjoy this particular activity. In addition to being boring and repetitive (the usual definition of mindless), many such activities nonetheless require attention to small details. They occupy your brain, but don't absolutely free your mind. It would be hard to meditate while performing such a task.
The most important component in the above situation is the set of instructions. The mindless brain can be replaced by a mindless central processing unit (CPU) controlling simple mechanical/electrical devices for input, output, arithmetic operations and temporary storage. However, it takes a mind to conceive a given set of instructions to control such a machine. Anyone who has stayed up late on Christmas Eve trying to put together a "some assembly required" toy for their child can appreciate the problems of working with incomplete, out of order, and badly written instructions.

I said programming is about solving problems, but that is not entirely accurate. You not only have to think of a way to solve a given problem, you also have to develop that solution into a complete step-by-step set of simple instructions for solving the problem. When a method for solving a problem is reduced to a series of simple, repeatable instructions, we call that set of instructions an algorithm. An algorithm can be specified in any language, but regardless of how complex the algorithm might be, each step or instruction must be simple and not subject to interpretation or intuition. An algorithm must also always generate the same result. If each step is performed flawlessly, the outcome is guaranteed. Programming is about creating algorithms to perform simple or complex tasks and translating these algorithms into instructions that a computer can perform.
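
To make the idea of an algorithm concrete, here is a minimal sketch of the in-box/out-box scenario written in Perl. It is only an illustration, not code from this chapter: the page contents, the process_page routine, and the wording of the "form" are all assumptions, and every page is assumed to hold at least one number.

```perl
use strict;
use warnings;

# Hypothetical "in-box": each page is a list of numbers.
my @in_box = ( [ 2, 4, 6 ], [ 10, 20 ], [ 7 ] );

# One fixed sequence of simple steps applied to a single page of numbers.
sub process_page {
    my @numbers = @_;
    my $total   = 0;                    # the "scratch pad"
    foreach my $n (@numbers) {
        $total = $total + $n;           # one simple arithmetic step at a time
    }
    my $average = $total / @numbers;    # an array in numeric context gives its length
    return ( $total, $average );
}

# Take each page from the in-box, fill out a "form", and put it in the out-box.
my @out_box;
foreach my $page (@in_box) {
    my ( $total, $average ) = process_page(@$page);
    push @out_box, "total=$total average=$average";
}

print "$_\n" foreach @out_box;
```

Given the same pages, the steps always produce the same forms, which is the property the text asks of an algorithm.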
The particular instructions that a computer can execute are in a form known as "machine language" or simply "machine code." Machine code instructions consist of sequences of numbers, ultimately reducible to zeros and ones (i.e., binary code). These instructions deal with low level operations such as storing a number into a particular memory location, reading a number out of a particular memory address, or adding a number into an accumulator. For example, the following snippet of machine code causes two numbers, which are stored in memory, to be added together and saved into another memory location for later use:

8B45FC 0345F8 8945F4
Of course, we humans have a hard time trying to formulate algorithms in a machine language so we soon developed a language consisting of abbreviated words to stand for the particular operations a given machine could perform. We could write programs out in this "assembly" language, then translate the language into the corresponding machine code using one machine instruction per each assembly instruction.

Assembly code was easier to write than the machine code it replaced, but because different machine architectures used different machine code, each had to be programmed in its own version of an assembly language. A program written for one machine type had to be rewritten to run on another machine type. Another problem with assembly languages is that they still dealt with the low level instructions of moving around bits of information.
Higher level languages were later developed to allow instructions to express higher level concepts. With a low level language like an assembly language, one had to write individual instructions to place a number from a specific location in memory into a temporary holding area; add a number from a different memory location; and, finally, store the result of that operation in a third location in memory. With a high level language, one could write a single instruction that encapsulates the concept (see figure 1.2).

Machine Code    Assembler Code     High Level Code
8B45FC          movl a, %eax       c = a + b
0345F8          addl b, %eax
8945F4          movl %eax, c

Figure 1.2 Comparison of machine, assembly, and high level languages

Aside from the obvious advantage of making the instructions easier to read and write for humans, compilers were created that translated each high level statement down to the corresponding assembly and machine codes for different machines. This meant that one could write a program once in the high level language and be able to run it on any machine that had a compiler for that language.
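
As a small illustration of the high level column in figure 1.2, the same "add two numbers and store the result" idea can be written as one Perl statement. This is a sketch with assumed names and values; the names $first, $second, and $sum are used here rather than $a, $b, and $c because Perl gives $a and $b a special role in sorting.

```perl
use strict;
use warnings;

# Hypothetical values standing in for the two numbers stored in memory.
my $first  = 17;
my $second = 25;

# One high level statement: fetch both values, add them, and store the result.
my $sum = $first + $second;

print "$first + $second = $sum\n";
```

Where the values live in memory, and which machine instructions do the work, is left entirely to perl.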
Early high level languages included Fortran, designed primarily for mathematical computing; COBOL, for business related programming; and BASIC and Pascal, intended initially as teaching languages. Today, a wealth of high level languages exist from which to choose, each with particular strengths and weaknesses. The remainder of this chapter is about the language that will be taught in this book.

1.2 On Perl

Perl was created by Larry Wall and initially released in 1987. Perl's inception is illustrative of its whole nature (and certainly says something about its creator as well). When Larry was faced with a problem that involved complex text processing and report generation exceeding the capabilities of what could be done easily with standard Unix tools like sed and awk, he made a choice. Rather than restricting his viewpoint to solving this particular task, he saw that his problem was just one of a class of problems for which existing tools and languages provided no simple solutions. So, in the spirit of long-term laziness, Larry Wall created a new tool for solving such problems.

Perl did not simply fill a gap in the existing toolset. Perl incorporated the capabilities of existing tools and borrowed freely from other languages. It soon became the language of choice for getting things done in the Unix environment (and its usefulness soon spread to other operating systems and environments). Put another way, Perl didn't fill one particular gap in the toolset; it became the tool to use for filling gaps everywhere.
Designed not only for practical usefulness, but also for continued expansion, Perl has become a powerful general purpose programming language. In its current incarnation, it offers extremely powerful regular expression enhancements, object-oriented programming, a well defined module system, references, nested data structures, and quite a bit more.

Perhaps some, or even all, of the features I just mentioned don't mean a lot to you at the moment. You want to know what you can do with Perl. Telling you that you can write programs to do anything you want wouldn't be accurate, but it wouldn't be too far from the truth either. Perl isn't well suited for some kinds of programming: for example, writing operating systems, writing device drivers, or doing heavy numeric analysis.

So what is Perl good for? Perl excels at reading, processing, transforming, and writing plain text. Processing text may not seem like a big deal until you realize all the ways you might want a program to interact with textual data. Not only the plain text files that you save on your hard drive, but the file and directory names themselves are representations of textual data. With Perl, you can easily write code that will read in directory listings, create new files or directories, rename files, delete (unlink) files, and more. Automating many kinds of system administration and backup tasks is but one common use of Perl.

Client-server interactions, such as the Structured Query Language (SQL) statements you send to a database server and the results it returns, or the HTTP requests you send to a webserver and the HTML pages it sends back, are all largely plain or simply encoded text. Similarly, communications between a webserver and external programs use a simple, encoded text protocol known as Common Gateway Interface (CGI). A significant portion of Internet sites that provide dynamic content or search capabilities are powered by Perl programs that handle requests from the webserver; turn these into queries for a database's server; collect, transform, and format the resulting data; and pass it back to the webserver. Because Perl makes it so easy to write programs that "glue" work between other programs, Perl is often referred to as a glue language. And because of its wide use sticking various things together throughout the Internet, Perl has also been called the "duct tape" of the Internet.

I mentioned that Perl was designed to grow and that it has a well defined module system. This means that many of the common tasks (and some uncommon ones) have already been encapsulated into modules that are freely available for anyone to use (see the discussion of CPAN later in this chapter).
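
The point made above about file and directory names being textual data can be sketched in a few lines. The following is an illustration only: the matching_files routine, the directory ".", and the ".txt" pattern are assumptions chosen for the example, and the lexical directory handle assumes a reasonably recent Perl 5.

```perl
use strict;
use warnings;

# Return the sorted names of plain files in a directory that match a pattern.
# (Hypothetical helper: the name and its arguments are made up for this sketch.)
sub matching_files {
    my ( $dir, $pattern ) = @_;
    opendir( my $dh, $dir ) or die "Cannot open directory $dir: $!";
    # Keep only entries that are plain files and whose names match the pattern.
    my @names = grep { -f "$dir/$_" && /$pattern/ } readdir($dh);
    closedir($dh);
    return sort @names;
}

# Example: list the names ending in .txt in the current directory.
print "$_\n" foreach matching_files( '.', qr/\.txt$/ );
```

The same filtered list could just as easily feed a rename or unlink step, which is the kind of housekeeping task the text describes.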
When you want to write a CGI program, write client software to do file transfer protocol (FTP), or interact with a database, you can use a well tested and debugged module that provides most of what you need, leaving you only to add the code to deal with your specific task.

Not only do modules make it easy to use Perl for common tasks, but some modules provide extensions that allow you to use Perl for problems for which Perl itself might be less well suited. I mentioned above that Perl may not be a good choice for numeric analysis. However, there are modules designed to extend Perl's handling of numeric data by providing arbitrarily large integers and floating point numbers. There is also the Perl Data Language (PDL) module package, which provides extensions for doing fast mathematical computations on large matrices of numeric data.

Some programming languages are interpreted, meaning that a program written in that language is read by an interpreter program that translates each statement into the appropriate machine code and executes it. In such languages, errors cannot be detected until the program is already running and the interpreter encounters a statement that generates an error. Other languages are compiled, meaning that a whole program is read by a compiler and translated into machine code before the program can be run. In this case, many kinds of errors can be caught before the program is even run. (Some errors, called run-time errors, will still not be caught until the program is running.) Perl is both interpreted and compiled. When you run a Perl program, Perl reads the entire program and compiles it into an internal format (not machine code), then interprets this internal representation like a regular interpreter. Thus, in places in this book, I will sometimes refer to the perl interpreter and sometimes to the perl compiler, but they both refer to perl itself. Perl, however, refers to the Perl language.
The advantage of a compiled/interpreted system is that many kinds of such errors can be detected during the compilation phase, before a program begins executing. Yet you do not need to separately compile and link your code into a binary executable each time you make a change, as you would for a strictly compiled language such as C. You can simply type in your program and run it. The perl compiler/interpreter takes care of the rest. The tradeoff is a little loss in speed. A program compiled into machine executable code runs somewhat faster than one whose statements are individually interpreted with each run through. That said, Perl programs usually run very fast, especially for text processing types of tasks.
Yet another benefit of Perl's interpreted nature is memory management. In many lower level compiled languages, like C, your program is required to deal specifically with allocating and releasing the necessary memory for storing data while your program is running. When you program in Perl, the perl interpreter takes care of allocating extra memory when needed and releasing that memory so it can be used again when it is no longer needed. This doesn't mean you can completely ignore memory issues. You still need to make choices such as whether to read a file completely into memory or read a file one line at a time. But you don't have to worry about actually allocating and releasing the memory yourself.
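The choice mentioned above, reading a file completely into memory versus reading it one line at a time, can be sketched as follows. This is only an illustration under stated assumptions: the sub names are invented for this example, and it uses lexical file handles and an in-memory "file" (opening a reference to a string), which need a perl newer than the one this book was written against (5.8 or later).

```perl
use strict;
use warnings;

# Count lines by reading one line at a time: only the current line
# is held in memory.
sub count_lines_streaming {
    my ($fh) = @_;
    my $count = 0;
    while (defined(my $line = <$fh>)) {
        $count++;
    }
    return $count;
}

# Count lines by reading the whole file into an array first: simple,
# but memory use grows with the size of the file.
sub count_lines_slurped {
    my ($fh) = @_;
    my @lines = <$fh>;
    return scalar @lines;
}

# An in-memory "file" stands in for a real one so the sketch is
# self-contained.
my $text = "first line\nsecond line\nthird line\n";

open(my $in1, '<', \$text) or die "can't open in-memory file: $!";
print "streaming: ", count_lines_streaming($in1), "\n";
close $in1;

open(my $in2, '<', \$text) or die "can't open in-memory file: $!";
print "slurped:   ", count_lines_slurped($in2), "\n";
close $in2;
```

In both cases perl allocates and releases the memory for you; the only decision left to the programmer is how much data to hold at once.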
Later in this chapter, I will continue this discussion of Perl and why it is the greatest thing since peanut butter (sliced bread really wasn't such a big deal until peanut butter arrived on the scene). Right now, let's turn to the more practical matter of ensuring that you have a perl distribution up and running so that you can begin your Perl programming journey.
1.2.1 Getting started
If you are not using a system that has perl installed, you will have to obtain a perl distribution and install it yourself. The latest source distribution can be found on the Comprehensive Perl Archive Network (CPAN) at http://www.perl.com/CPAN/src. Pointers to binary distributions for various platforms can also be found there in the /ports directory. The distributions contain detailed installation information, and the process is usually painless.
Unix-like systems
On a Unix-like system, you will probably want to compile your own version of perl, assuming you have a C compiler installed. The process is usually simple, though it can be a little time consuming. The first thing you want to do, after downloading the latest distribution from the above mentioned CPAN site, is to unpack it, go into the resulting directory, and read the README file for your system and the INSTALL file. This should provide you with enough information to build your own version of perl. Essentially, the process is just
    $ rm -f config.sh Policy.sh
    $ sh Configure
    $ make
    $ make test
    $ make install
The configure step will take awhile. You will have to answer a variety of questions about your system and where you want things installed. Picking the defaults is all that's needed in most cases.
Win/NT systems
On Win32 systems, your best bet is probably to obtain the ActiveState version of perl, which is available at http://www.ActiveState.com/
For Win95, you will probably need to get the DCOM package and install it before starting to install the perl distribution. You can find it at http://www.microsoft.com/com/dcom/dcoml_2/download.asp
Installing the ActiveState version of perl is a matter of double-clicking the archive. The install process will ask a few questions and you should accept the defaults unless you have good reason not to accept them. After this, you should be able to run perl from the command prompt and run the perldoc utility to access the documentation (see later in this chapter).
MacOS
You can get a compiled binary distribution of Perl for the MacOS in the ports section of CPAN: http://www.perl.com/CPAN/ports/index.html#mac. This link should automatically redirect you to a nearby mirror site.
Installing this version involves unpacking the archive, starting the program, and setting a few configuration details that are pointed out in the included README file. The major sections of the documentation should be accessible via the help menu.
1.2.2 Running Perl
Once you have perl installed, creating and running a Perl program is a simple process. The following is a simple, one line program that prints the string "Hello World":
    print "Hello World\n"
Create a new text file (plain text) using any editor with which you are comfortable. This should be a text editor, not a word processing program. You can find a list of decent editors for various platforms by pointing your browser at http://reference.perl.com/query.cgi?editors
Now enter the above statement. Save the file as "first." You can then run the program from the command line as
    perl first
On many Unix-like systems you can create your script to be run as if it is an executable program. This method means adding a special first line to your program and setting the executable bit on the file with the chmod Unix command. Here is the new program:
    #!/usr/bin/perl
    print "Hello World\n"
The first line, called the "pound-bang" or shebang line, starts with the two characters #! followed by the full path to where perl is located on your system; in other words, the absolute directory path to the perl program. If you save this as before and then type
    chmod +x first
at the prompt, you can then invoke the program like:
    first
If the current directory is not in your PATH (the environment variable listing search paths for executable programs), you may need to qualify the above call as ./first.
The ActiveState port comes with a pl2bat utility to turn your perl program into a batch file that can be placed in your PATH and called like any other program.
If you are using MacPerl then you should be able to simply choose new from the file menu, type in the script, and then choose run script from the script menu. The shebang line is not necessary on non-Unix systems, but it is always a good idea to put one in, because perl will check it for command line switches such as -w (see chapter 2). With MacPerl, you can also save the script as a Mac-specific item called a "droplet," which is a version you can execute by double-clicking its icon.
1.2.3 Getting help
Perl is a relatively easy language to learn, but it is not a small language. This book does not attempt to be a reference for the Perl language. If you have perl installed, however, you already have the most up-to-date language reference available. The perl distribution includes a large amount of documentation installed as Unix manpages and/or HTML pages (or some other format, depending on installation configuration details). The raw documentation is in a plain text mark-up format called Plain Old Documentation (POD), and is also readable using the included perldoc utility (or shuck on the Mac).
To view the initial perl pod-page you can enter perldoc perl at your command line prompt. This document provides a list of the remaining sections of the core documentation. Another useful starting page is perldoc perltoc, which provides a more in-depth table of contents of the Perl documentation.
One extremely useful set of documents is the set of perlfaq documents. Very often one begins to tackle a problem by breaking it down into smaller problems and addressing those. However, when learning a new language, some of the smaller problems are often difficult because you do not yet know how to express the solution in the context of the language you are learning. This is when it is time to turn to the Frequently Asked Questions (FAQs).
Do not assume that the FAQs only address simple or "little" questions and that your question will not be found there. Many FAQs do have simple answers, but they are no less valuable for that. On the other hand, there are also many real programming issues addressed in Perl's FAQs. No matter how easy or difficult your particular problem seems to be, you'll often have good luck finding something of use in the FAQs. In chapter 3, we will begin developing a tool to quickly search the FAQs for information we might need.
The FAQs are divided into nine sections, or files, named perlfaq1 to perlfaq9 and are viewable with the perldoc utility. The perltoc page describes each of these and lists all of the questions you will find answers to in each document.
Another source of help is the Usenet community. There are separate newsgroups for discussions on miscellaneous Perl topics (comp.lang.perl.misc), discussions on perl modules (comp.lang.perl.modules), and the Tk graphics toolkit (comp.lang.perl.tk). There is also a moderated group that you can read, but participation requires registration (comp.lang.perl.moderated). If you have never participated in Usenet newsgroups before, I recommend that you first take a look at news.announce.newusers.
Although all of the newsgroups are open to public participation, they are not forums for questions addressed in the perl documentation and FAQs. The people participating in these groups are knowledgeable and helpful, but you are expected to have tried to find answers to your questions in the documentation before turning to the newsgroup. These newsgroups are not free help desks. If you treat them as such, you will likely get ignored or worse.
On a related note, remember, programming is about problem solving. Beginners often approach a programming language as simply another application to learn. This can lead to asking questions like "How do I do "X" in Perl?" When learning a new word processor application, one might formulate a question such as "How do I create footnotes in my documents?" But a programming language is not simply an application. It does not provide single commands or functions for every conceivable problem. It provides ways to formulate solutions to problems.
Before you ask a question, search through the documentation on Perl's built-in functions (see perldoc perlfunc) to see if one meets your needs. Then search the FAQs to see if your question has already been answered. If these two approaches fail, ask yourself how you would go about solving the problem without a computer. For example, recently a question appeared on the comp.lang.perl.misc newsgroup asking (and not for the first time) how to tell if an integer is odd or even. There is no simple even or odd built-in Perl function that solves this problem for you directly. The obvious question to ask yourself is "How do I tell if an integer is even or odd?" Most people simply notice if the last digit is one of 0, 2, 4, 6, or 8. If so, the integer is even. Algorithms often arise from such simple beginnings. You may not succeed in solving your particular problem, or your solution may not be the optimal solution, but this common sense approach is the first step in thinking like a programmer.
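The odd-or-even reasoning above translates almost word for word into Perl. The following is a minimal sketch, not taken from the book; the sub names are made up for illustration. The first sub mirrors the "look at the last digit" observation, and the second shows the more usual arithmetic test using the remainder operator.

```perl
use strict;
use warnings;

# The "by hand" approach from the text: look at the last digit.
sub is_even_by_last_digit {
    my ($n) = @_;
    return $n =~ /[02468]$/ ? 1 : 0;
}

# The arithmetic approach: an even integer leaves no remainder
# when divided by 2.
sub is_even_by_remainder {
    my ($n) = @_;
    return $n % 2 == 0 ? 1 : 0;
}

foreach my $n (7, 10, 123456) {
    my $kind = is_even_by_last_digit($n) ? "even" : "odd";
    print "$n is $kind\n";
}
```

Both subs assume they are handed an integer; neither checks that the input really is one.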
One more avenue, before resorting to asking your question in the newsgroup, is to use one of the Usenet search engines (for example, www.dejanews.com) to search the Perl newsgroups for similar questions that may have been asked and answered in the past. Finally, if you have exhausted all avenues of inquiry without success, ask your question on the appropriate newsgroup. Be sure to include information on what you've tried so the helpful people there will know you are not just looking for free handouts but are actually interested in learning.
Another place to begin exploring a wealth of Perl related information is the Perl home page at www.perl.com. From there you will find links to HTML versions of the Perl documentation and FAQs, plus pointers to the Comprehensive Perl Archive Network (CPAN). At CPAN, you can find and download not only the Perl source distribution, with its standard libraries and modules, but also a large number of contributed modules and scripts for various problem domains.
Jargon
If you are new to programming and/or Usenet in general, you may encounter quite a bit of jargon when using the above-mentioned resources. One very good, and quite extensive, resource you may want to look at is the Jargon File. This is a large dictionary or lexicon of common and not-so-common slang terms found in the hacker community. You can find this file at http://www.tuxedo.org/~esr/jargon/, or use your favorite search engine to locate a copy.
Just to get you started, I'll run through a few terms that you might happen upon as you begin to read the literature. For example, if you do ask a question on the newsgroup and it is answered in the FAQ, you're unlikely to get responses beyond the standard RTFM which means, "Read The F****** Manual." Quite often, people quote material from the "camel" or "llama" books. These are two standard Perl books published by O'Reilly and Associates. The books feature pictures of the respective animals on their covers and are titled Programming Perl (camel) and Learning Perl (llama). See appendix C. Both are very good books, by the way.
A couple of other frequently used terms are "grep," meaning to search, and "grok," meaning to understand. Grep derives from the standard Unix search utility of the same name. If you want to grok Year 2000 (Y2K) issues in Perl, you should grep the FAQs for the relevant entries.
Another common term is "parse." Specifically, parse means to break up a piece of data, such as a string (a sequence of characters or words), according to a specified set of syntactic rules. More generally, parse is often used in the context of simply recognizing and/or extracting particular bits of data from a larger chunk of data.
You may also see reference to "p5p," which refers to the perl5-porters, a group of people responsible for maintaining and upgrading the actual Perl distribution across the many platforms on which it runs. Another group (set of groups, really) is the Perl Mongers, a collection of user groups distributed around the world. (Visit http://www.pm.org to find your nearest group or to start one.) I happen to be a member of the Winnipeg Perl Mongers (Winnipeg.pm for short). A variation of the Monger moniker seems to be Perl M(o|u)nger, which is regular expression talk (you'll learn about regular expressions in chapters 6 and 10) to refer to either Monger or Munger. Munger, then, is the noun form of the verb "to munge," which means essentially, in the context of data processing, to parse, process, slice, dice, julienne, massage, fold, bend, or otherwise mutilate (i.e., manipulate) data.
This little interlude barely scratches the surface of the jargon you will run into in your journey. If nothing else, at least I've warned you that you are entering not only a new area of study, but a new linguistic arena as well. Don't forget to check out the Jargon File mentioned above; it contains much more than mere definitions.
1.3 A bigger picture
The previous sections have dealt with basic information on programming and on Perl. Now we will take a brief step back and take in a larger view. My first choice for a title for this section was "Practical and Philosophical Remarks on Programming, Perl, and the Rest of this Book," but that was a tad long winded, even for me.
Several years ago, I worked as an inshore diver, doing a variety of underwater inspection, repair, and construction tasks, often in fast-moving water. Every job was different, and there was no such thing as a transportable stable work platform that could be used in every situation. Instead, we had the next best thing: a shop with a selection of tools, a wide variety of surplus construction steel (rings, bolts, rods, angle iron, and I-beams), and an arc-welder. This was a hacker workshop. We created reusable bracing and clamping components, and rigged up a variety of different scaffold systems that could be lowered from the barge, positioned, and anchored in various ways to bridge piers of different sizes and shapes, pilings, or dam gate systems. That arc-welder was the key to being able to quickly and solidly connect a variety of components into working solutions we could deploy in the field.
In the programming world, Perl reminds me very much of that workshop and, in particular, the arc-welder. Perl has been nicknamed the Swiss Army chainsaw of programming languages due to its multitude of built-in tools and overall brute force utility. I think a better analogy is that of a Swiss Army arc-welder, one quite capable of hacking out fast, sometimes crude, one-time solutions, as well as building and joining (virtually seamlessly) components for solving more complex or longer term problems.
Perl is not your average everyday programming language. In fact, it is an exceptional everyday programming language. There are currently more high level programming languages out there than you can shake a stick at. Of course, some computer scientists seem to get great pleasure from shaking a lot of sticks anyway, and Perl seems to have more than its fair share of detractors from computer science purists, who complain that Perl is too big, ugly, and redundant.
So what makes Perl so great? No single feature of Perl makes it outstanding, and any program you write using Perl could also be written using another language. Certainly, Perl makes some things easier to accomplish than they might be in other languages, but this can't be all there is to it. There are other languages that offer pretty much the same high level functionality as Perl, and do so in a way that seems to satisfy the above mentioned stick wavers. Yet Perl remains wildly popular. The language continues to evolve, and its user base is still growing unchecked.
What appeal might Perl have beyond the sheer functionality that might first attract programmers from other languages, and why would experienced programmers fall in love with this language if other "better" programming languages exist? I lay the blame squarely on the shoulders of Perl's creator, Larry Wall. (Well, more precisely, slightly above and between Larry's shoulders.) Besides being a computer programmer, Larry also has a background in linguistics. Consequently, the fact that Perl has the qualities of a natural language is no accident. You can read some of Larry's own musings on these qualities at http://kiev.wall.org/~larry/natural.html. Here I will touch upon only a couple of points in this regard.
Two of the things Perl receives criticism for are the richness and the redundancy in the language, which are by no means unrelated issues in Perl. By richness, I mean that Perl is a big language, incorporating and supporting a large number of powerful and specialized features directly in the language. In contrast, other languages tend to be minimal, providing standard libraries for specialized tasks. The list of Perl's specialized features includes regular expressions, pattern match operators, process management, file and directory manipulation, and socket programming tools, to highlight just a few.
Another example of Perl's richness is its ability to provide more than one way of saying or accomplishing the same task. This is redundancy; it exists in natural languages and in Perl. Indeed, the Perl slogan is TMTOWTDI (pronounced "timtoady"), which stands for "There's More Than One Way To Do It."
Critics say that both richness and redundancy make the language harder to learn. But this is only true up to a point. Certainly, I may be able to learn a complete minimalist programming language rapidly, but assuming there are libraries providing additional functionality, I would then still have to learn to use the various libraries to do the things I might want to do. Perl is not really more demanding in this respect. You do not have to learn the entire language before you start, merely the essential elements plus any extra built-in features you find necessary for your present task. Similarly, Perl's redundancy doesn't require that you learn every possible way to say something before you begin. Redundancy merely expands your options. Indeed, the positive consequence of these natural language qualities is not that they make it easier to think in Perl; they make it easier to express your thoughts in Perl.
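To make the "more than one way" slogan concrete, here is a small sketch (not from the book) that adds up a list of numbers in three different ways. The sub names are invented for this illustration; List::Util is a module bundled with current perl distributions, and its sum function is assumed to be available.

```perl
use strict;
use warnings;
use List::Util qw(sum);

# Way 1: an explicit foreach loop.
sub total_with_loop {
    my @values = @_;
    my $total = 0;
    foreach my $value (@values) {
        $total += $value;
    }
    return $total;
}

# Way 2: the same loop written with a statement modifier.
sub total_with_modifier {
    my @values = @_;
    my $total = 0;
    $total += $_ foreach @values;
    return $total;
}

# Way 3: let a module do the work (sum returns undef for an
# empty list, so that case is handled separately).
sub total_with_module {
    my @values = @_;
    return @values ? sum(@values) : 0;
}

my @numbers = (1, 2, 3, 4, 5);
print "loop:     ", total_with_loop(@numbers), "\n";
print "modifier: ", total_with_modifier(@numbers), "\n";
print "module:   ", total_with_module(@numbers), "\n";
```

None of the three is "the" right way; which one reads best depends on the surrounding code and on who will maintain it.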
There is also a strong sense of community among many Perl programmers. More than just a large number of users sharing information on various forums, the Perl community shares a sort of Perl spirit. Perl's "naturalness" lends itself to playfulness: not merely clever programming tricks, but poetry, puns, and other games of the sort people play with natural languages. Of course, this community spirit is more than fun and games, but I think it is significant to note that programmers coming from other languages who have found their calling to be growing tedious have expressed gratitude that Perl has made programming fun again.
The Perl community also holds a strong concept of sharing. Help and advice are given freely on the newsgroups and many talented programmers cooperate on the development and evolution of Perl itself. Beyond that is CPAN, a vast collection of contributed modules from Perl programmers all over the world. Programming, unlike most creative acts, places a high premium on reuse rather than originality. The Perl community is no exception. Several hundred modules are available on CPAN, ranging from database interfaces to development tools to interfaces for several graphics libraries to Internet programming to date manipulation modules and much more. Whenever you find yourself facing a new programming challenge, check out CPAN. Chances are pretty good that someone has written a module that will make your task much easier.
By now you might be wondering when we will stop talking about Perl and start learning to program with it. The next two chapters concentrate on aspects of writing good code and the process of developing programs. In the latter chapter, we follow the development of two Perl programs from initial idea through to working programs. A good deal of Perl will be presented along the way.
CHAPTER 2
Writing code
2.1 Structure
2.2 Naming
2.3 Comments
2.4 Being strict
2.5 A quick style guide
Writing code (that is, typing in the set of instructions in a programming language) is only one part of the programming process. In fact, as we will see in the next chapter, it is a relatively small part of the programming process. But, however small the act of writing code may be in the overall process, it is an important act. The decisions you make here can have a large effect on all other steps in the programming process. How well you write your code affects how easy it is to read, which in turn affects how easy it is to fix (if a problem arises), upgrade, and add additional features.
Perl is often derided as being a "write only" language, meaning that programs written in Perl are inherently hard to read and understand, but this is simply false. Mostly, these accusations arise from people unfamiliar with Perl. Some presumed difficulties arise as the result of all the funky symbols used in Perl (Perl uses every symbol on a standard keyboard for one purpose or another); some are due to the variety of shortcuts Perl offers the programmer; and some occur because people simply write bad code (this happens in every language). Although at first glance, Perl appears to be a difficult language to read, it really isn't. Certainly, it is not terribly difficult to write hard-to-read code in Perl, but it is not terribly difficult to write easy-to-read code in Perl either. This chapter is about writing simple, clean code.
Do not worry if you don't recognize or understand the Perl code in the following sections. They are only examples to illustrate style issues. You will have plenty of opportunity to learn Perl and apply these guidelines in the chapters that follow.
previously, high level languages exist for the benefit of the
programmer, not the computer. The computer does not execute the instructions in a high level language. the particular
code:
first
be translated into the machine language of
computer system where the program
high
last chapter,
They must
level
They provide
will be run.
a level of portability across
instructions because certain operations that
all it
the different machine types that easier for
might take
humans
it
easier for
humans
to write the
several lines of
code can be written in a single simple statement in the high
make
in the
languages offer three advantages over low level machine
have translators for that language; they make
they
As we saw
machine
and
level language;
to read those instructions. This last benefit
is
the
subject of this chapter. It is
level
up
to the
programmer
to take full advantage of whatever facilities the high
language offers to produce easy-to-read programs.
read affects
how easy it
WRITING CODE
is
to write, debug,
and maintain.
How Let's
easy a
program
is
to
begin with structure.
19
2.1 Structure
This book is organized into parts, chapters, sections, subsections, paragraphs, sentences, and words. Whitespace plays a large role in delimiting these elements. You wouldn'twanttoreadabookwithnowhitespaceinitnowwouldyou?
Let's look at an example of bad code written in Perl:
    $s=0; $i=1;while ($i

    if ($response eq 'q' or $response =~ m/^\d+$/) {
        $is_valid = 1;
    } else {
        print "Invalid Input: enter an integer or 'q' to quit\n";
    }
The conditional reads like this: if the response equals q, or the response contains only digits, then set the valid indicator to 1.
Finally, we translate the last chunk, which tests whether we should end the loop, or whether the response was right or wrong.
    if ($response eq 'q') {
        $quit = 1;
    } elsif ($response == $solution) {
        print "Correct\n";
    } else {
        print "Incorrect: $question $solution\n";
    }
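The two conditionals discussed above (validating the response, then deciding whether to quit or report right or wrong) can be tried out in isolation. The sketch below is an illustration only: the sub names and the returned strings are invented here, whereas the book's chunks set $is_valid and $quit and print their messages directly.

```perl
use strict;
use warnings;

# True if the response is 'q' or consists only of digits.
sub is_valid_response {
    my ($response) = @_;
    return ($response eq 'q' or $response =~ m/^\d+$/) ? 1 : 0;
}

# Classify an already validated response as 'quit', 'correct',
# or 'incorrect'.
sub check_response {
    my ($response, $solution) = @_;
    if ($response eq 'q') {
        return 'quit';
    } elsif ($response == $solution) {
        return 'correct';
    } else {
        return 'incorrect';
    }
}

foreach my $response ('42', 'q', '17', 'forty-two') {
    if (is_valid_response($response)) {
        print "$response: ", check_response($response, 42), "\n";
    } else {
        print "$response: Invalid Input: enter an integer or 'q' to quit\n";
    }
}
```

Note the order of the tests: the string comparison against 'q' comes first, so the numeric == comparison only ever sees responses made of digits.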
Once again, performing our chunk substitution from the low level back up to the top level, we have our completed program:
    faqgrep sort
    perlfaq4.pod: How do I sort an array by (anything)?
    perlfaq4.pod: How do I sort a hash (optionally by value instead of key)?
    perlfaq4.pod: How can I always keep my hash sorted?
Users would then know that the information they seek is located in section 4 of the perlfaq documents, which can be read using the command perldoc perlfaq4.
Before we begin designing the program, we need to know a little bit about the format of the perlfaq pod files. The format of these pod files is relatively simple. Any question is contained on a single line and starts with a pod directive that looks like =head2. So we know that, when we read through each file, we only need to search for our pattern on a line that begins with that sequence of characters.
You will also need to know where the perlfaq files were installed on your system. On my system, they were installed into a directory named /usr/local/lib/perl5/5.00502/pod. To find the installation on your system you can type perl -V, which will result in a lot of output regarding the configuration of Perl on your system. Near the end of this output is a list of directories. You should find a pod subdirectory under one of these listed directories.
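Rather than reading the perl -V output by eye, the same directory list can be inspected from inside a program, because it is held in the special array @INC. The following is a hedged sketch only: find_faq_files is a hypothetical helper written for this illustration (it is not the book's "set directory and filename list" chunk), and it assumes the FAQ files are named perlfaq followed by optional digits and a .pod extension, inside a pod subdirectory.

```perl
use strict;
use warnings;

# Return the perlfaq pod files found in a "pod" subdirectory of any
# of the given directories (hypothetical helper, not from the book).
sub find_faq_files {
    my @dirs = @_;
    my @found;
    foreach my $dir (@dirs) {
        my $pod_dir = "$dir/pod";
        next unless -d $pod_dir;
        opendir(my $dh, $pod_dir)
            or die "can't open directory '$pod_dir': $!";
        my @names = readdir $dh;
        closedir $dh;
        foreach my $name (sort @names) {
            push @found, "$pod_dir/$name"
                if $name =~ m/^perlfaq\d*\.pod$/;
        }
    }
    return @found;
}

# @INC is assumed to hold the directory list shown by perl -V.
my @faq_files = find_faq_files(@INC);
print scalar(@faq_files), " perlfaq file(s) found\n";
print "$_\n" foreach @faq_files;
```

On some installations the pod files live elsewhere or are not installed at all, in which case the list simply comes back empty.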
    #!/usr/bin/perl -w
    use strict;
    <<set directory and filename list>>

    <<set search pattern>>=
    my $pattern = $ARGV[0] or die "no pattern given: $!";
as the first element in the @ARGV array can be referred to as $ARGV[0] because in the array. In program
We will explain the or die syntax shortly. In this case, it simply means that if $ARGV[0] is empty or contains a false value, then the program will exit with the given error message.
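The behaviour just described can be observed without running a program from the command line by wrapping the same statement in a sub and catching the failure with eval. This is a sketch for illustration only; get_pattern is a made-up name, and the argument list stands in for @ARGV.

```perl
use strict;
use warnings;

# Return the first argument, or die if it is missing or false.
# The argument list plays the role of @ARGV here.
sub get_pattern {
    my @args = @_;
    my $pattern = $args[0] or die "no pattern given\n";
    return $pattern;
}

my $found = eval { get_pattern('sort') };
print "pattern: $found\n";

# With no arguments the sub dies; eval traps the error in $@.
my $missing = eval { get_pattern() };
print "error: $@" unless defined $missing;
```

One consequence of testing for a "false value" is that a pattern consisting of the single character 0 is rejected too, since Perl treats the string "0" as false.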
We have seen the while and until loop structures in the previous example. Perl has another looping construct that is designed specifically to loop over lists of values. This structure is called the foreach loop. The syntax of the loop is
    foreach variable (list of values) {
        statements;
    }
In this structure, the loop executes once for every value in the list. During each execution, the loop variable is assigned the next value in the list. You may declare your loop variable prior to the loop or directly in the foreach line. To iterate over each file in our array of filenames, we may use
    foreach my $filename (@faq_files) {
    }
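As a small, self-contained illustration of the foreach loop described above, the sketch below loops over a list of file names twice, once with the loop variable declared in the foreach line and once with it declared beforehand. The file names and the longest_name sub are assumptions made up for this example; they are not the book's actual list.

```perl
use strict;
use warnings;

# Return the longest string in a list, visiting each value once.
sub longest_name {
    my @names = @_;
    my $longest = '';
    # Loop variable declared directly in the foreach line.
    foreach my $name (@names) {
        $longest = $name if length($name) > length($longest);
    }
    return $longest;
}

# Hypothetical file names standing in for the real list.
my @faq_files = ('perlfaq1.pod', 'perlfaq4.pod', 'perlfaq9.pod');

# Loop variable declared prior to the loop.
my $filename;
foreach $filename (@faq_files) {
    print "would search $filename\n";
}

print "longest name: ", longest_name(@faq_files, 'perltoc.pod'), "\n";
```

Either form of declaration works; declaring the variable in the foreach line keeps its scope limited to the loop body.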
Before you can read a file, you need to open it and associate it with a file handle. A file handle is a Perl data-type that is associated with an input or output channel. Recall our method for reading input from the keyboard using <STDIN>. Perl automatically opened the file handle STDIN for reading. STDIN reads from what is called the standard input, usually the keyboard by default, unless you've redirected it to be read from elsewhere. Once we open a file and associate it with a file handle, we may read lines from the file using the same syntax: <FILEHANDLE>.
The syntax of the open() function is: open(FILEHANDLE, $filename). Opening a file is a system operation that may fail for reasons such as the file does not exist, or the user does not have permission to read that file. We want to explicitly know if such a failure occurs, so we always check the return value of the open() function. We may do this using an if conditional, but it is more commonly done using a logical or operator like so:
    open(FILEHANDLE, $filename) or die("can't open '$filename': $!");
You will often see this written using the || operator instead of the or operator. The former is just a higher precedence version of the same operator. (See chapter 4 for a discussion of precedence.) When Perl encounters such a statement, it first tries to evaluate the expression on the left, the open() function in this case. If that expression evaluates to true Perl ignores the right hand expression. If the left hand expression fails, Perl then evaluates the right hand expression. In this case, the right hand expression is a call to the die() function, which causes the program to exit and prints its argument to the screen. The special Perl variable $! holds the value of the current system error, so we include it in the string we pass to the die() function to provide a better diagnostic message about what went wrong.
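The open ... or die idiom described above can be exercised with a file that does not exist. This is a minimal sketch under stated assumptions: first_line_of is a hypothetical helper, the file name is assumed not to exist in the current directory, and the three-argument open with a lexical file handle is a later variant of the two-argument form shown in the text.

```perl
use strict;
use warnings;

# Open a file for reading and return its first line, or die with a
# message that includes the system error held in $!.
sub first_line_of {
    my ($filename) = @_;
    open(my $fh, '<', $filename)
        or die("can't open '$filename': $!\n");
    my $line = <$fh>;
    close $fh;
    return $line;
}

# A name that is assumed not to exist, to trigger the failure branch.
my $missing = 'no-such-file-for-this-sketch.txt';
my $line = eval { first_line_of($missing) };
print "open failed as expected: $@" if $@;
```

The exact wording of the system error after the colon depends on the operating system, which is why including $! in the message is more useful than a fixed string.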
We may define our next code chunk as
    <<read files and print matching lines>>=
    open(FILE, $filename) or die("can't open '$filename': $!");
    while (<FILE>) {
    }
    close FILE;
It is important to remember that <> is the input operator and FILE is the file handle being read. You do not close an input operator, just a file handle:
    close FILE;
Perl does some extra magic when we use the input operator as the only thing within a while conditional. It automatically converts the conditional to read
    while (defined($_ = <FILE>)) {
When you read from a file using <FILE>, the input operator (<>) returns an undefined value at the end of the file. So, in this conditional, a line is read from the file and assigned to a special Perl variable, $_. This value is then checked to see if it is defined. In this way, the loop will be executed once for every line in the file, setting $_ to each line in turn. The loop will exit when the end of the file is reached.
use regular expressions again to test for matches against our pattern. This
introduce the substitution operator,
works the same
s/pattem/replacement/. This
match operator with regard
as the
to
matching the pattern, but,
instead of just matching, Perl replaces the found pattern with whatever
is
in the
second half of the s/// operator. The replacement part of the substitution operator
is
just a string,
not a regular expression.
with =head2 and to see
them
if
we
We will use this to find lines beginning
strip off those characters
print out that line from the
without replacing them so we don't
file.
= s/ ''=head2 / / and m/$pattern/) { print " $f ilename \n$_"
if
{
:
}
Notice that we did not use the =~ binding operator with either the substitution or the match operators. This is because when either operator occurs without being bound to a particular variable, it is automatically bound to the special $_ variable, which, remember, is the variable that contains each line in our file due to the magic while condition mentioned above.
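The loop and the implicit use of $_ can be tried without any FAQ files on disk. The sketch below is an illustration under stated assumptions: the sample text is invented, and the file handle is opened on a string held in memory (a lexical handle with three-argument open) rather than on a real file as in the faqgrep program.

```perl
use strict;
use warnings;

# Sample POD-like text, invented for this illustration.
my $text = "=head2 How do I sort an array?\n"
         . "Some answer text about sort.\n"
         . "=head2 How do I read a file?\n";

# Open a read-only file handle on the string instead of a real file.
open(my $fh, '<', \$text) or die "can't open in-memory file: $!";

my $pattern = 'sort';
my @matches;
while (<$fh>) {                  # same as: while (defined($_ = <$fh>))
    # Both operators act on $_ because no =~ binding is given.
    if (s/^=head2// and m/$pattern/) {
        push @matches, $_;
        print "match:$_";
    }
}
close $fh;
```

Only the first line is reported. The second line mentions sort but does not begin with =head2, so the substitution fails and the match is never attempted; the third line is a heading that does not contain the pattern.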
That completes the whole program. All that remains is to insert the code chunks into their relative places to create the whole program listing:

    #!/usr/bin/perl -w
    use strict;
    my $faq_directory = '/usr/local/lib/perl5/5.00502/pod';
    my @faq_files = ('perlfaq1.pod', 'perlfaq2.pod', 'perlfaq3.pod',
                     'perlfaq4.pod', 'perlfaq5.pod', 'perlfaq6.pod',
                     'perlfaq7.pod', 'perlfaq8.pod', 'perlfaq9.pod');
    my $pattern = $ARGV[0] || die "no pattern given: $!";
    foreach my $filename (@faq_files) {
        open(FILE, $filename) or die "can't open '$filename': $!";
        while (<FILE>) {
            if (s/^=head2// and m/$pattern/) {
                print "$filename:\n$_";
            }
        }
        close FILE;
    }
This compiles fine with perl -c but running it to search for a pattern with perl faqgrep sort produced an immediate error about not being able to open a file. This illustrates the kind of extra information that the $! variable can provide. We check our open call and realize that we are trying to open the files without giving the full pathname. We knew we needed the FAQ directory location, and this is where it was needed. We have to change the open function to use the full path and filename when trying to open the file:

    open(FILE, "$faq_directory/$filename") or die "can't open file: $!";

Now running perl faqgrep sort produces the output we showed in our initial specification. We will not proceed further with this script at this time, but we will return to it and modify it to optionally print out the full answers to matching questions rather than just the filename locations.
We have covered a lot of ground in this chapter. We have taken two programming projects from initial ideas through to working programs. We have also been exposed to quite a bit of Perl code in the process. If you had no previous experience with the Perl language prior to this chapter, don't worry if some of it seemed a little over your head. The main purpose of this chapter was not to teach Perl, but to introduce you to the process involved in creating programs.

Too often, it is tempting to jump right into writing code when given a problem, especially if it seems like a simple problem. But simple problems often have a way of requiring more complex solutions than we first imagine. When you sit down to write an essay, you need to first have a clear idea of the topic and purpose or goal of the essay. Then you need to do the necessary research and develop an outline of your arguments. Finally, you can write the essay, and then edit and revise it. Programming has a similar development cycle. The more attention you give to the early specification and design stages, the less time you'll have to spend debugging or redesigning and rewriting your program. There is no single right way to design a program, or to decompose a problem into subproblems. Still, the prevailing wisdom is that you should design your program first before diving in and writing code.
3.3 Exercises

1  Modify the mathq program to keep a running score of right and wrong answers.

2  Design and pseudo-code a program that simulates the rolling of two standard six-sided dice and prints out the total of the roll. Then consider how it might be modified to display the faces of each die rolled, for example: a roll of 3 and 5 might be displayed as

   [Figure: two die faces drawn with # characters, one showing 3 and one showing 5]
PART II
Essential elements

CHAPTER 4
Data: types and variables

4.1 Scalar data
4.2 Expressions
4.3 List data
4.4 Context
4.5 References to variables
4.6 Putting it together
4.7 Exercises
At a basic level, a program operates on data. Not surprisingly, a source of fundamental variation among programming languages is in how they carve up the world of data into meaningful or interesting types of discrete data. Some languages are lumpers and some are splitters, the latter drawing fine-grained distinctions between various kinds of data. For example, in some languages, what we normally think of as simply numeric data (numbers) might be divided into integer numbers and floating point numbers. These may be further divided based on their size (i.e., how much storage space they require in memory). Other languages might differentiate only between numeric data and character data.

These type distinctions are applied in two ways: first, in terms of the operations defined on a given type of data, and second, as restrictions on the types of data a variable may contain. A variable, as we mentioned in the previous chapter, can be thought of as simply a named memory location where a value of a particular type can be stored. In a splitter language, you may have several different types of variables: an integer type that can hold only integers (again, perhaps further divided into short and long integer types based on the size of the integer), a float type for holding floating point numbers, a char type for holding a character, and perhaps other types as well.

Variables may also be classified as primitive (scalar) or structured. Primitive or scalar variables hold a single piece of data; structured variables hold a collection of data. In a splitter language, you may have an array type of variable that is defined to hold a list of integers, and another array variable defined to hold a list of floating point numbers.

Perl is very much a lumper language. It draws a primary distinction between singular or scalar values and plural or list values. Perl has only one variable type for holding single pieces of data: the scalar variable. A scalar variable may hold a number (such as 42 or 3.14159) or a character string (such as h or hello or this string has 29 characters). It may also hold a reference to another variable or memory location (see section 4.5).

Similarly, Perl has a type of variable called an array that can hold an ordered list of scalar values. The scalars need not all be the same type, for example, such a list might be: (42, 'hello', 3.14159). Perl also has another plural variable type called the hash or associative array (see section 4.3.2).
4.1 Scalar data

The simplest way to use data within a program is as literal data, that is, data explicitly represented directly within the program. We have already seen this with our hello world program in chapter 1 where the program printed out the literal string Hello World followed by a newline (represented by \n). Here are a few literal representations of numbers with comments:

    print 42;        # an integer
    print 3.14159;   # a floating point number
    print -2;        # a negative integer
Perl also allows for a few other literal representations of numeric data such as scientific notation:

    2.31e4     # is 2.31 times 10 to the 4th power, or 23100
    2.31e-4    # is 2.31 times 10 to the -4th power, or 0.000231
Additionally, in literal representation only, a number preceded by a zero is taken to be a number in octal (base 8) notation, and a number preceded by an 0x is taken as a number in hexadecimal (base 16) notation. (If you are unfamiliar with binary, octal, and hexadecimal numbers, please refer to appendix D.)

    0213     # is 139 in decimal notation (base ten)
    0x1fa    # is 506 in decimal notation (base ten)

Finally, Perl allows one further notational convenience for representing numbers. We can use underscores with numbers to enhance readability, just as we use commas when writing out large numbers:

    1_369_253            # is 1369253
    7_214_300.312_413    # is 7214300.312413

It is important to realize that these integers and floating point numbers (and character data for that matter) are not internally represented and stored as the sequence of digits you see on your screen (or in this book). They are stored in binary format. Not all floating point decimal numbers have a precise binary representation. The number 0.2, for example, if printed out to 20 decimal places turns out to be a representation of 0.20000000000000001110. This leads to occasions when the results of some mathematical operations are not exactly what one might expect. This is not a Perl problem but a fact of binary representation (see perldoc perlfaq4 for further discussions of such data issues).

String or character data comes in two basic forms in Perl: single-quoted strings and double-quoted strings. Single-quoted strings, delimited with single quotation marks or apostrophes, are the more literal of the two forms. We will look at double-quoted strings first. Recall our Hello World program from chapter 1:

    print("Hello World\n");
This prints out the string of characters Hello World followed by a newline. The \n is a special sequence denoting a newline in a double-quoted string. This is referred to as backslash interpretation. There are several backslash interpretations available within double-quoted strings, as shown in table 4.1:

Table 4.1  Backslash interpretation in double-quoted strings

    \a    alarm              \cX     control X
    \b    backspace          \0nnn   octal byte
    \e    escape             \xnn    hexadecimal byte
    \f    formfeed           \l      lowercase next letter
    \n    newline            \L      lowercase until next \E
    \r    carriage return    \u      uppercase next letter
    \t    tab                \U      uppercase until next \E
    \\    backslash          \Q      backslash non-alphanumerics until next \E
    \"    double quote       \E      end \L, \U, or \Q
For any other character, the backslash means interpret the next character literally (losing any special meaning it may have had). This is usually referred to as escaping a character. So, to include a double-quote character or a backslash character within a double-quoted string, you escape them with a backslash:

    print "This \\ is a backslash";       # prints: This \ is a backslash
    print "Here \" is a double quote";    # prints: Here " is a double quote
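A few of the backslash interpretations from table 4.1 can be tried in a short script. This is only a sketch; the variable names and sample strings are invented for the illustration.

```perl
use strict;
use warnings;

my $tabbed = "name\tvalue\n";        # \t is a tab, \n is a newline
my $quoted = "She said \"hi\"";      # escaped double quotes
my $cased  = "\uhello \Uworld\E!";   # \u: next letter only; \U..\E: a run
my $meta   = "\Qa.b\E";              # \Q backslashes non-alphanumerics

print $tabbed;
print "$quoted\n";
print "$cased\n";                    # prints: Hello WORLD!
print "$meta\n";                     # prints: a\.b
```

The \Q form is mostly useful when a string will later be used inside a regular expression and its punctuation should be taken literally.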
Aside from backslash interpretation, double-quoted strings also allow variable interpolation. This means that a variable in a double-quoted string will be replaced by its present value:

    $variable = 'Hello';
    print "$variable World";     # prints: Hello World

Array variables may also be interpolated, but hash variables are not subject to interpolation in this manner. All scalar variables begin with a $ symbol and all array variables begin with a @ symbol, so a consequence of this interpolation is that if you want to have one of those symbols in your string you must precede it with a backslash so that it is interpreted literally:

    $variable = 'Hello';
    print "\$variable World";    # prints: $variable World
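The following sketch puts scalar interpolation, array interpolation, and the escaped forms side by side. The array @items and the address are invented sample data.

```perl
use strict;
use warnings;

my $variable = 'Hello';
my @items    = ('red', 'green', 'blue');

my $plain   = "$variable World";        # scalar is interpolated
my $listed  = "colors: @items";         # elements joined by spaces
my $literal = "\$variable and \@items stay as typed";
my $address = "user\@example.com";      # @ must be escaped here

print "$plain\n$listed\n$literal\n$address\n";
```

The last line shows the most common place this matters in practice: an unescaped @ in a double-quoted string makes Perl look for an array of that name.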
Single-quoted strings cannot interpolate variables. Single-quoted strings only allow for two special cases of backslash interpretation: a backslash may be used to escape a single quote (i.e., to allow a single-quoted string to contain a single quotation mark), or to escape a backslash as in a double-quoted string.

It is often helpful in terms of maintenance to adopt a style of coding where you only use double-quoted strings when you require a double-quotish interpolation or interpretation not offered by single-quoted strings. Use single quotation marks for all simple strings.

Of course, specifying data literally every time we want to use it in a program would be tedious and error-prone at best. If we could name an item of data, then just use the name to refer to it, we would be much better off. Well, we don't do exactly that, but we can create named containers (i.e., variables) to hold bits of data. Then we can access the data through the name of the container.
4.1.1 Scalar variables

A variable is a container or slot of memory, associated with a name, where you can store data values. Perl's scalar variable type can hold any scalar value: a number, a string, or a reference. We discussed variable naming in chapter 2, but let's quickly review the particular naming rules for variables in Perl here.

A variable name in Perl, apart from its type symbol, must begin with a letter or an underscore character and may be followed by any number of letters, digits, or underscore characters (well, any number less than 255 characters anyway). Perl has a large number of special built-in variables that do not follow these rules. Consequently, you seldom have to worry about giving your variables names that conflict or clash with built-in variable names. The following list provides examples of legal and illegal variable names you may use with your own variables:

    $amount                          legal
    $total_amount                    legal
    $_private                        legal
    $field_3                         legal
    $abc123                          legal
    $This_is_a_LONG_variable_name    legal
    $!var                            illegal (must start with letter or underscore)
    $13_var                          illegal (starts with digits)
While Perl does not force you to declare your variables before you use them, the strict pragma discussed in chapter 2 does force you to declare your variables or use fully qualified variable names (discussed in chapters 7 and 16). The simplest way to declare your variables is with the my declaration:

    my $variable;
    my $name;
    my ($foo, $bar);

As you can see, you can declare a list of variables by using parentheses around a comma-separated list of variables.

Once you've declared a variable, you'll need to know how to assign a value to it. The equals sign (=) is the assignment operator:

    $foo = 42;
    $bar = 'Hello';
    $greeting = "$bar World\n";
    print $greeting;

Figure 4.1 shows the relationship between a variable name and its value in memory during declaration and assignment statements.
[Figure 4.1  Scalar variable declaration and assignment. Columns: Code, Variable Name, Value. Rows: the declaration my $foo; and the assignment $foo = 42;]
Assignment may also be combined with declaration either singly or in list form:

    my $greeting = 'Howdy';
    print "$greeting World\n";
    my ($first, $second) = ('Hello', 'World');
    print "$first $second\n";

It is important to note that assignment writes a value into the variable (i.e., into the memory location), but using a variable, as in the print statements above, does not remove the value from memory, but only accesses the value from the variable.

If programs were confined to the literal data contained within them, they would be of limited use. We need to obtain data from outside the program from, for instance, a user typing at a keyboard or by reading data from a file stored on disk. We will examine this in more depth in chapter 6 when we discuss input and output. For now, we consider only the simple case of obtaining scalar values from a user at the keyboard. We can do this using the input operator and the standard input file handle STDIN as follows:

    my $input;
    print "Enter a value: ";
    $input = <STDIN>;
    print "You entered $input";

In a scalar context, such as assignment to a scalar variable, the <STDIN> operator reads in one line of input from the standard input, which is usually the keyboard unless you've otherwise redirected it. So, in the snippet above, the first print statement prompts the user to enter a value. The assignment statement takes everything the user types at the keyboard up to and including the newline (generated by hitting the enter key) and assigns that input to the variable $input.
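To see that the newline really is part of the value read, here is a small sketch that runs without a keyboard. It assumes an in-memory file handle standing in for STDIN; the simulated input text is invented, and a real program would read from <STDIN> as shown above.

```perl
use strict;
use warnings;

# Simulated keyboard input; a real program would read from STDIN.
my $typed = "42\nignored second line\n";
open(my $in, '<', \$typed) or die "can't open in-memory input: $!";

my $input = <$in>;    # scalar context: reads exactly one line
close $in;

print "You entered $input";    # no \n needed: $input still ends with one
print "length is ", length($input), "\n";
```

The reported length is 3, not 2, because the trailing newline is counted along with the two digits.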
4.2 Expressions

An expression is something that evaluates to a value. It may be a literal value, a variable, a function that returns a value, or the result of an operation on one or more values or expressions. Perl supports the basic mathematical operations you are familiar with, addition, subtraction, multiplication, and division, represented by the operators +, -, *, and / respectively. For example:

    $foo = 3 + 5;
    $bar = ($foo + 7) / 5;

The addition operator returns the value of the sum of its two operands, which in this case are the two simple expressions represented by the literals 3 and 5. In the second line, the division operator also has two operands: the literal value 5 on the right and the result of the expression ($foo + 7) on the left.

Operators also have a relative precedence level associated with them, as demonstrated in the above example. Of the four basic arithmetic operators, multiplication and division have a higher precedence than addition and subtraction, meaning that multiplication and division operations are evaluated before any addition and subtraction. In the example above, we used parentheses to override the standard precedence, causing the addition to take place before the division because parentheses have the highest precedence. Had we not used parentheses, then the value of $foo would have been added to the result of 7 / 5.

Perl has two additional numeric operators: exponentiation and modulus. The exponentiation operator is **. Like the four simple arithmetic operators above (and the modulus operator), it is a binary operator. In other words, it takes two operands. It returns the result of its left operand raised to the power of its right operand:

    $foo = 4 ** 2;       # $foo is 4 raised to the power of 2, or 16
    $bar = $foo ** 3;    # $bar is 16 raised to the power of 3, or 4096

The modulus operator (%) may be less familiar. It returns the remainder after dividing the left operand by the right operand. Both operands are taken to be integers (or converted to integers if necessary by removing any fractional portions). The result of 10 % 3 is 1 because 10 divided by 3 is 3 with 1 left over. 1 is the remainder and the value returned by this expression. If you recall the question of testing an integer to see if it is even or odd, you might realize that this operator can provide a ready means of making such a test. The result of any integer N, modulo 2 will only have a non-zero remainder if N is an odd integer.

Perl also has two binary string operators: concatenation, represented by a dot (.), and repetition, represented by an x. There is also one built-in function that will be handy to know right away, the chomp() function, which removes a trailing newline from a string. This is useful to remove the newline from an input value obtained using the <STDIN> operator described above.

    $foo = 'hello ';
    $bar = $foo . 'world';    # concatenation: $bar is now 'hello world'
    $foo = 'bo' x 3;          # $foo is now 'bobobo'
    print "Enter a value: ";
    $input = <STDIN>;
    chomp($input);            # removes newline from $input
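The precedence rule, the even/odd test with modulus, and the string operators can all be checked in one short script. This is a sketch with invented variable names and values, not code from the chapter's programs.

```perl
use strict;
use warnings;

my $with_parens    = (8 + 7) / 5;    # addition first: 15 / 5 is 3
my $without_parens = 8 + 7 / 5;      # division first: 8 + 1.4 is 9.4

my $remainder = 10 % 3;              # 1

# An integer is odd when dividing it by 2 leaves a remainder.
my $n = 7;
my $kind;
if ($n % 2) { $kind = 'odd' } else { $kind = 'even' }

my $line = ('ab' x 3) . "\n";        # repetition, then concatenation
chomp($line);                        # trailing newline removed

print "$with_parens $without_parens $remainder $kind $line\n";
```

The parentheses around 'ab' x 3 are not strictly required, since repetition binds more tightly than concatenation, but they make the intended grouping obvious to a reader.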
Perl also offers several shorthand assignment operators that combine a scalar operator with the assignment operator. The complete list of these is available in the perlop pod-page (perldoc perlop); here are a couple of examples to illustrate the concept:

    $a = 42;
    $a += 5;           # same as: $a = $a + 5
    $b = 'hello ';
    $b .= 'world';     # same as: $b = $b . 'world'
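The same shorthand pattern applies to the other binary operators introduced in this section. A brief sketch, with invented variable names:

```perl
use strict;
use warnings;

my $count = 10;
$count -= 4;        # same as: $count = $count - 4    (6)
$count *= 3;        # same as: $count = $count * 3    (18)
$count **= 2;       # same as: $count = $count ** 2   (324)

my $rule = '-';
$rule x= 5;         # same as: $rule = $rule x 5      ('-----')

my $greeting = 'hello';
$greeting .= ', world';

print "$count $rule $greeting\n";
```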
You might now wonder what happens if a scalar variable contains a string and you attempt to use a numeric operator such as addition to add it to a number. This is the second application of type distinction I mentioned in the opening section. Numeric operations are only defined for numeric values, and string operations are only defined for string values. In a strongly typed language where each variable can only hold a particular type of value such as an integer or a character string, the compiler can detect an attempt to add two mismatched variables and cause an error before the program is actually run. In Perl, with only one scalar data type for both strings and numbers, such information is not available to the compiler. An attempt to add a string and a number cannot be detected until the program is running and the variables are evaluated for their values.

Perl solves this problem in a very relaxed manner. If a number is used where a string is expected, it is converted to a string. (For example, the number 3.14 becomes the string of characters 3.14.) Similarly, if a string is used where a number is expected, the string is converted to a number according to the following rule: Any leading spaces in the string are ignored. If the first non-space characters are something reasonably interpreted as a number (or a plus or minus sign followed by a number), they are taken as the number. Any trailing non-numeric characters are ignored. If the string does not have an obvious numeric interpretation, a value of 0 is used.

    '3.14'       converts to 3.14
    ' 3.14'      converts to 3.14
    '3.14abc'    converts to 3.14
    'abc123'     converts to 0
    ''           converts to 0

This conversion is a useful device that allows you to read a number from the keyboard, which is read as a string, and use it as a number. Similarly, after you have done some calculations and wish to print out the results, a number is converted back to a string for output.

Run the following example program a few times using different values for input. Try using: 42, 42abc, and hello.

    #!/usr/bin/perl -w
    use strict;
    print "Enter a value: ";
    my $input = <STDIN>;
    my $result = $input + 5;
    print "result is $result\n";

If you are using the -w switch for warnings, which I highly recommend, the program will issue warnings when a string value does not have an obvious numeric interpretation. Try running the preceding script again with the same inputs but without the -w switch.
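If you would rather see the conversion rule applied to several values at once, the following sketch loops over a fixed list instead of prompting for input. The sample strings are invented, and the no warnings 'numeric' line is an addition of this sketch: it silences the warnings inside the loop only, so the results are easy to read.

```perl
use strict;
use warnings;

my @inputs = ('42', '42abc', 'hello', ' 3.5', '-7 apples');
my @results;

foreach my $input (@inputs) {
    # Silence the "isn't numeric" warnings for this block only,
    # so the conversion rule itself is easy to see.
    no warnings 'numeric';
    my $result = $input + 5;
    push @results, $result;
    print "'$input' + 5 is $result\n";
}
```

Removing the no warnings line brings the warnings back for 42abc, hello, and -7 apples, which is usually what you want while developing.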
4.3 List data

A list is simply an ordered collection of scalar values. It is represented literally as a comma-separated list of scalar values or expressions enclosed within parentheses:

    (1, 2, 3, 4)          list of four values
    ('a', 42, "red")      list of three values
    (4 + 5, $foo, 1)      list of three values

A nested list, where a list is inserted within a list, is simply evaluated to a single flat list:

    (1, 2, (3, 4), 5)     is the same as     (1, 2, 3, 4, 5)

We've already seen an example of using a list in the previous discussion of variable declaration using the my declaration. The print() function also takes a list as its argument, though we've only used it so far with a single element in the list:

    print("the value of \$a is $a\n");
    print('the value of $a is ', $a, "\n");    # same output as above

In the second version we've used a list of three scalar values to produce the same output as the first version. The first element is the single-quoted string. (Hence, we did not have to use a backslash to get a $ symbol in the string.) The second element is the value of the variable $a. And the third element is simply a double-quoted string producing a newline. The practical utility of variable interpolation in double-quoted strings is demonstrated here by the simplicity of the first version.

Perl has a convenient list-making function for producing lists of quoted strings, the quote function. Consider the following list:

    ('one', 'two', 'three', 'four')

Creating this type of list in your code can make it difficult to read for long lists, and is prone to the error of forgetting a quote. The qw() function takes a sequence of whitespace-separated "words" and produces a list of quoted "words":

    qw(one two three four)     is the same as:     ('one', 'two', 'three', 'four')

I used quotes in the above paragraph to describe "words" because they don't have to be words in the ordinary sense, just sequences of non-whitespace characters separated by any amount of whitespace. The qw() function does not have to use parentheses to delimit its argument. One can use any non-alphanumeric character. Common choices are slashes or vertical bars:

    qw/one two three/
    qw|one two three|
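Both list flattening and the qw() function are easy to verify. The sketch below stores each list in an array variable (a variable whose name begins with @ and which holds a list) purely so the lists can be counted and compared; the names are invented.

```perl
use strict;
use warnings;

my @nested = (1, 2, (3, 4), 5);           # flattens to five elements
my @quoted = ('one', 'two', 'three', 'four');
my @words  = qw(one two three four);      # same list, fewer quotes
my @bars   = qw|one two   three|;         # extra whitespace is ignored

print scalar(@nested), " elements: @nested\n";
print "same lists\n" if "@quoted" eq "@words";
print scalar(@bars), " words: @bars\n";
```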
4.3.1 Array variables

A list value may be stored in an array variable (see figure 4.2). An array variable is prefixed with an @ symbol (think of it as a stylized "a" for array), but otherwise follows the same naming rules as those for scalar variables mentioned earlier. Assignment to an array variable is the same as for scalar variables. Simply use the assignment operator to assign a list value to an array variable:

    @array = (42, 13, 2);

[Figure 4.2  An array variable is a list of scalars. The figure labels the whole array @array and one element $array[1].]

    @foo = (1, 2, 'three');
    @bar = ($a, $b, 3 + 4);
    @copy = @foo;

Whenever you picture an array, you should keep the list representation in mind. That way you won't find it surprising that you cannot store an array as an element of a list:

    @foo = (1, 2, 'three');
    @bar = ('four', 5, 6);
    @new = (0, @foo, @bar, 7);

is the same as:

    @new = (0, (1, 2, 'three'), ('four', 5, 6), 7);

which resolves to:

    @new = (0, 1, 2, 'three', 'four', 5, 6, 7);

Lists and arrays are ordered. We can access individual elements of them using a subscript notation to refer to the position, or index, of a value in the list:

    @array = (9, 10, 11, 12);
    $second = $array[1];    # assigns 10 to $second
You might find two things odd at this point: why the second position is numbered 1, and why the access into the array is prefixed with a $ instead of an @ symbol. First things first. Perl, like many programming languages, starts counting from zero. This means that the four element list stored in the array has positions or indices numbered from 0 to 3.

On the second point, recall that an array holds a list of scalars, so the type of any given element in the array is always a scalar type denoted by a $ symbol. When we say

    @foo = (42, 12, 2);

this performs the equivalent assignment of

    ($foo[0], $foo[1], $foo[2]) = (42, 12, 2);

Well, it's not quite the same thing. In the first case, the entire array is set to equal that three element list, while in the latter case, only the first three elements are set to those values. If the array previously held ten items, the last seven remain unchanged.
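The difference between assigning to the whole array and assigning to its first three elements shows up clearly when both arrays start with ten items. In this sketch the range operator (1 .. 10) is used only as a quick way to build a ten-item list; it has not been introduced in the text above.

```perl
use strict;
use warnings;

my @whole = (1 .. 10);    # ten items
my @parts = (1 .. 10);    # another ten items

@whole = (42, 12, 2);                              # replaces the entire array
($parts[0], $parts[1], $parts[2]) = (42, 12, 2);   # first three only

print scalar(@whole), " items: @whole\n";
print scalar(@parts), " items: @parts\n";
```

The first array shrinks to three items, while the second still has ten, with only its first three values changed.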
Each element of the array is a scalar in its own right. All the array variable does is allow us to refer to the whole collection as one named list and to perform certain operations on the list itself such as adding or removing elements to the beginning, the middle, or the end of the list.

Before we proceed to discuss some of these array operations, we should realize that the above assignment from a list of scalar values to a list of scalar variables does not just apply to array elements. You can assign a list of scalar values to any list of scalar variables just as we saw earlier when declaring and assigning multiple scalar variables:

    ($foo, $bar) = (42, 12);        # $foo gets 42, $bar gets 12
    ($foo, $bar) = ($bar, $foo);    # swaps $foo and $bar

That last line is possible because the list on the right is evaluated first. That is, the values of the two variables create the list on the right, which is then assigned to the list of variables on the left. I point this out so that you don't mistakenly assume that list assignments happen sequentially ($bar assigned to $foo, then $foo assigned to $bar), which would leave both variables with the same value.

Aside from accessing individual elements, we may access what is referred to as a slice of an array or list, that is, a sublist corresponding to a list of indices into the array:

    @foo = (10, 12, 14, 16);
    @bar = @foo[1,2];    # @bar gets the list (12, 14)
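A short sketch contrasts the list-assignment swap with the sequential assignments it might be mistaken for, and repeats the slice from above. The variables $x and $y are invented for the comparison.

```perl
use strict;
use warnings;

my ($foo, $bar) = (42, 12);
($foo, $bar) = ($bar, $foo);    # right side is built first: a true swap

my ($x, $y) = (42, 12);
$x = $y;                        # sequential version: $x is now 12 ...
$y = $x;                        # ... so $y just gets 12 back

my @foo = (10, 12, 14, 16);
my @bar = @foo[1, 2];           # slice: (12, 14)

print "$foo $bar | $x $y | @bar\n";    # prints: 12 42 | 12 12 | 12 14
```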
When we are accessing a slice (sublist), we use the @ symbol in combination with a subscript list. A slice of an array is still a list or array type value. Also, these slices need not be in a consecutive order:

    @foo = (10, 12, 14, 16);
    @bar = @foo[3,0,1];    # @bar gets the list (16, 10, 12)

There are two pairs of built-in functions for adding and removing elements from the beginning or end of an array. For adding to or removing from the beginning of an array, we use unshift and shift respectively:

    @foo = (11, 12, 13);
    unshift(@foo, 10);        # @foo is now (10, 11, 12, 13)
    unshift(@foo, (8, 9));    # @foo is now (8, 9, 10, 11, 12, 13)
    $bar = shift(@foo);       # $bar gets 8; @foo is now (9, 10, 11, 12, 13)

For working at the end of an array, we use the push and pop functions for adding and removing respectively:

    @foo = (1, 2, 3);
    push(@foo, 4);         # @foo is now (1, 2, 3, 4)
    push(@foo, (5, 6));    # @foo is now (1, 2, 3, 4, 5, 6)
    $bar = pop(@foo);      # $bar gets 6; @foo is now (1, 2, 3, 4, 5)
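One common use of these four functions, offered here as an illustrative sketch rather than anything from the text, is to treat an array as a queue (add at the end, remove from the front) or as a stack (add and remove at the same end). The names and values are invented.

```perl
use strict;
use warnings;

# Queue: push at the end, shift from the front.
my @queue = ();
push(@queue, 'first');
push(@queue, ('second', 'third'));
my $next = shift(@queue);    # 'first'; @queue is now ('second', 'third')

# Stack: push and pop both work at the end.
my @stack = ();
push(@stack, 'a');
push(@stack, 'b');
my $top = pop(@stack);       # 'b'; @stack is now ('a')
unshift(@stack, 'z');        # @stack is now ('z', 'a')

print "$next | @queue | $top | @stack\n";
```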
One last point to make about array variables concerns how they are interpolated within double-quoted strings. If you print an array using print @array, no space or comma separates the values printed. Perl will just treat the array as a comma-separated list of values to be printed. However, if you print an array in a double-quoted string like print "@array", the array will be printed with a space between each value. Both behaviors are good defaults, but can be changed by altering the values of the special variables $, (the output field separator) and $" (the list separator).
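Here is a minimal sketch of changing the two separators. The use of local inside a bare block is an assumption of this sketch (it has not been covered above); it restores the default value when the block ends, so the change does not leak into the rest of a program.

```perl
use strict;
use warnings;

my @array = (1, 2, 3);

my $default = "@array";     # "1 2 3": elements joined by a space
my $custom;
{
    local $" = '-';         # list separator, for this block only
    $custom = "@array";     # "1-2-3"
}
{
    local $, = ', ';        # output field separator used by print
    print @array;           # prints: 1, 2, 3
    print "\n";
}

print "$default and $custom\n";
```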
4.3.2 Hash variables

Perl has one other variable type that holds collections of scalar values: the hash variable (formerly known as an associative array). A hash variable is prefixed with a % symbol. A hash does not have a convenient literal representation in the way an array can be represented as a list, because the elements of a hash are stored in an order completely independent of the order in which they were assigned.

A hash is like an array that is not ordered, where each value is associated with a name, called a key (which must be a string), rather than a positional index. These keys are used as subscripts to access individual elements just as indices are used as subscripts into arrays. But while array subscripts are contained inside square brackets, hash subscripts are contained in curly braces.

Even though a hash is not a list, a list can be used to assign values to a hash. In such a case, the list is taken to be a list of key/value pairs:

    %hash = ('first', 42, 'second', 12);

The hash above has two elements, one corresponding to the key first and one corresponding to the key second:

    %hash = ('first', 42, 'second', 12);
    print "$hash{second}\n";    # prints: 12
Perl has an alternative to the comma operator that is useful when using lists to assign to hash variables, the => operator. This works the same as the comma except that it also automatically causes the value on the left to be quoted:

    %hash = (first => 42, second => 12);
    print "$hash{second}\n";    # prints: 12

The list in the first statement still has four elements, but now it looks more like two pairs of elements separated by a comma, and it is easier to read because the quotes are no longer necessary. Just remember that a hash is not a list, but a list may be interpreted as a hash (see figure 4.3).

    %hash = (first => 42, second => 13);

[Figure 4.3  A hash variable associating keys with scalar values. The figure labels the whole hash %hash and one element $hash{first}.]
Because a hash is not a list, no functions exist to add or remove elements from the beginning or end of a hash: there is no beginning or end of a hash. You can always add a new element by simply assigning a value to a new key in the hash:

    %hash = (first => 42, second => 12);
    $hash{third} = 7;
And you can delete a key (and its value) using the delete function:

    delete $hash{first};    # removes the key 'first' and its value

You can get a list of the keys or the values in a hash using the appropriately named keys() and values() functions. However, do not expect these functions to return the list of keys or values in any particular order. Remember, a hash is not an ordered list. The keys() function merely returns a list of keys depending on the order in which Perl internally stored those keys.

    %hash = (first => 42, second => 12);
    $hash{third} = 7;
    @keys = keys(%hash);
    print "@keys\n";    # printed: first third second
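The hash operations described so far can be tried together in one short script. The following is only a sketch: the %stock hash and its contents are invented for illustration, and it borrows the exists() function, which tests for a key without creating it. Because keys() promises no particular order, the keys are sorted before printing.

```perl
use strict;
use warnings;

# Hypothetical inventory data, invented for this illustration.
my %stock = (apples => 3, pears => 7);

$stock{plums} = 12;      # assigning to a new key adds an element
delete $stock{pears};    # delete removes the key and its value

# keys() promises no particular order, so sort before printing.
my @names = sort keys %stock;
print "@names\n";                        # prints: apples plums

# exists() tests for a key without creating it.
my $has_pears = exists $stock{pears} ? 'yes' : 'no';
print "pears in stock? $has_pears\n";    # prints: pears in stock? no
```

Sorting the keys is a design choice that makes the output repeatable; without it, the order could differ from one run to the next.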
Perl also allows hash slices, similar to array slices, with which you may access a list of values by providing a list of keys as a subscript to the hash. You use an @ symbol, rather than the normal % symbol, for hash slices:

    %hash = (third => 7, first => 42, second => 12);
    @h_slice = @hash{'third', 'second'};    # @h_slice gets (7, 12)

Here, @h_slice is an ordinary array and @hash{'third', 'second'} is the hash slice.

The each() function iterates through key/value pairs in a hash. This is generally done with one of the looping constructs shown in the next chapter.

Hashes are powerful tools for organizing sets of data. We will make heavy use of them in examples in later chapters.

4.4 Context

Perl draws a primary distinction between scalar values and list values. This distinction extends farther than just what type of value may be assigned to a particular type of variable. It also affects how certain expressions are evaluated. Expressions are evaluated within a context that may be either scalar or list (or void, but we won't consider that here). Consider the following:
    @foo = (11, 12, 13);
    $bar = @foo;

In the first statement, we have a list on the right being assigned to an array. An array expects a list so the list is evaluated in a list context. In the second statement, an array on the right is being assigned to a scalar variable. In a strongly typed language, this would be a type mismatch. How can you assign an array to a scalar? The assignment is expecting a scalar value; it is providing scalar context for the array. Perl has a rule for evaluating an array in scalar context: return the number of elements contained in the array. So, in the above example, $bar would be assigned a value of 3.
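To see the two contexts side by side, here is a minimal sketch that assigns the same array three ways. The variable names are our own, and the scalar() function is used to force scalar context explicitly.

```perl
use strict;
use warnings;

my @foo = (11, 12, 13);

my $count   = @foo;           # scalar context: the number of elements
my ($first) = @foo;           # the parentheses make this a list assignment
my $forced  = scalar(@foo);   # scalar() forces scalar context explicitly

print "$count $first $forced\n";   # prints: 3 11 3
```

The only difference between the first two assignments is the pair of parentheses on the left, which is why this is an easy mistake to make.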
Similarly, you might assign a scalar to an array in one of two ways:

    @foo = (12);
    @foo = 12;

The first statement assigns a list with one element to the array @foo. The second statement tries to assign just the scalar value 12 to the array. In this case, the scalar value is evaluated in list context, which produces a single element list, (12), which is assigned to the array.

We saw an example of context in the earlier examples of array assignment:

    @foo = (2, 3, 4);
    @bar = (0, 1, @foo);    # @bar gets (0, 1, 2, 3, 4)

A list provides a list context to any element within, which is why the array @foo is evaluated as a list in the second statement rather than as a scalar.

Many of Perl's built-in functions return different values depending upon the context in which they are called. If a function has different return value behavior, depending on context, the perlfunc pod-page entry for that function will clarify that different behavior.
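The built-in reverse() function is one example of such context-dependent behavior. The sketch below, with made-up data, calls it once in list context and once in scalar context.

```perl
use strict;
use warnings;

my @words = ('ab', 'cd');

my @list_result   = reverse @words;   # list context: elements in reverse order
my $scalar_result = reverse @words;   # scalar context: joined, then reversed by character

print "@list_result\n";     # prints: cd ab
print "$scalar_result\n";   # prints: dcba
```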
4.5 References to variables

Earlier we mentioned that a scalar value can be a number, a string, or a reference, but we never said what a reference is or how to make one. If you think of a variable as a storage bin with a name and an address, then you can think of a reference as a forwarding address. When you store a reference to another variable in a scalar variable, you are not storing that variable's value, but the address where its value is stored.

Internally, Perl maintains lists of variable names and their associated addresses. Whenever you access a variable, Perl looks up the address for that variable name and follows it to the correct storage bin. If you examine the diagrams given earlier in this chapter, you'll see that each variable name is immediately followed by a black circle pointing to a storage bin; that circle represents the address associated with the variable name.

Whenever Perl encounters something like $foo, it expects the $ symbol to be immediately followed by something that evaluates to an address where a scalar value is stored. Perl checks its internal scalar name lists for the name foo. If it finds it, then foo is replaced by its address, and the scalar value in that bin is retrieved (or set if we are assigning something to $foo).
Perl allows you to assign the address of one variable's storage bin, rather than its value, to another variable using the backslash operator. This is called assigning a reference to a variable:

    $foo = 42;
    $bar = \$foo;    # $bar gets address of $foo's bin

It is important to realize that $bar is a scalar variable with its own storage bin. In this case, however, the value stored in that location is the address of another bin. Hence, if you print out the value of $bar you will not see 42 but a representation of the address that looks like: SCALAR(0x8050fe4). This printed representation consists of the type of reference (a reference to a scalar) and a representation of the address. To actually use the reference to access the contents of that storage bin, you need to dereference it. Remember that Perl expects to find something that resolves to an address immediately following the $ symbol, so if we place another $ symbol before our reference variable, it will be followed by something that resolves to an address:
    $foo = 42;
    $bar = \$foo;
    print "$bar\n";     # prints the reference: SCALAR(0x8050fe4)
    print "$$bar\n";    # prints: 42

In the first print statement, Perl sees $bar as a $ followed by a scalar variable name that immediately resolves to an address. Perl looks up the address in its internal tables and prints the value stored there (another address). In the second print statement, Perl discovers a dollar sign followed by another dollar sign. It first evaluates the second dollar sign, which is followed by a valid variable name, and retrieves the value, which is an address as we have already seen. Now the first dollar sign is followed by an address, so Perl follows that address, retrieves the scalar value stored there (42), and prints it out.
Figure 4.4 shows a graphic depiction of a scalar variable containing a value, and another scalar variable containing a reference to that variable. As in the earlier diagrams, an address is denoted by a black circle pointing to a storage bin.

A reference value is always a scalar value, although it may refer to the address of an array value or a hash value:

    @array = (11, 12, 13);
    $aref = \@array;
    print "@$aref\n";

In this case, Perl expects the @ symbol to be followed by either a valid array name, which corresponds to an address, or an address pointing to an array value.
    $scalar = 42;
    $s_ref = \$scalar;

Figure 4.4 A reference to a scalar variable

As in the previous example, $aref evaluates to an address. So, in the print statement above, the @ symbol is followed by an address that Perl uses to retrieve the stored array value. Figure 4.5 depicts an array variable and a scalar variable containing a reference to that variable.

    @array = (42, 13, 2);
    $a_ref = \@array;

Figure 4.5 A reference to an array variable
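A reference can be used to change the original value as well as to read it. The following is a minimal sketch with variable names of our own choosing; it also uses the built-in ref() function, which reports the type of thing a reference points to.

```perl
use strict;
use warnings;

my $value = 42;
my @list  = (11, 12, 13);

my $s_ref = \$value;    # reference to a scalar
my $a_ref = \@list;     # reference to an array

$$s_ref = 43;           # assigning through the reference changes $value itself
push @$a_ref, 14;       # likewise, this appends to @list

print "$value\n";                            # prints: 43
print "@list\n";                             # prints: 11 12 13 14
print ref($s_ref), " ", ref($a_ref), "\n";   # prints: SCALAR ARRAY
```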
At this point, it is only important that you understand the basic concepts of references. Their usefulness won't come into play until later chapters when we deal with passing parameters to functions and creating data structures. We will cover references more fully in those chapters.

4.6 Putting it together

In this chapter, you've been presented with all Perl's basic data and variable types as well as many simple operators and functions for reading in data, manipulating data, and displaying the results. You may not realize it, but already you can write a variety of simple but complete programs using only what has been covered so far. For example, consider a program that asks for a measurement in inches and displays the equivalent measure in centimeters. Here is a version of such a program with excessive comments to remind you of various operations. We convert inches to centimeters using the approximation that 1 inch equals 2.54 centimeters:

    #!/usr/bin/perl -w
    use strict;

    my ($inches, $centimeters);                  # declare variables

    print "Enter a measurement in inches: ";     # display a prompt
    $inches = <STDIN>;                           # read input value from user
    chomp($inches);                              # remove newline from input value

    $centimeters = $inches * 2.54;               # calculate converted value

    print "$inches inches is approximately $centimeters centimeters\n";
You could write more complex programs as well, taking many more values from the standard input and calculating complex mathematical formulas using these values. Such programs may become long because they must repeatedly prompt for a value and read a line of input. The next chapter addresses ways of repeatedly executing the same piece of code to simplify such problems.

I mentioned earlier and repeat here that you will learn far more by writing and running Perl programs than by reading about them. I encourage you to do the exercises that follow.

Quite a few more functions and operators exist in addition to those we have presented here. We will cover many of them in the following chapters. Remember, the perlfunc and perlop pod-pages are always available to you. Use them either to get a second viewpoint on things we've already covered or to take a first look at some functions we haven't yet explored.

4.7 Exercises

1 Write a program that asks for a weight in pounds and displays the equivalent weight in kilograms (1 kilogram equals 2.2 pounds).

2 Write a program that calculates the gross pay of an employee. The program should ask for the hourly rate of the employee, how many regular hours, and how many overtime hours the employee worked. Pay for overtime hours should be calculated at time and a half (1.5 times the regular hourly rate).
CHAPTER 5

Control structures

5.1 Selection statements
5.2 Repetition: loops
5.3 Logical operators
5.4 Statement modifiers
5.5 Putting it together
5.6 Exercises
A program is a series of statements that are, by default, executed in sequence from beginning to end. This sequence of execution is referred to as the flow of control of the program. Not all realistic problems are amenable to being solved by executing a series of instructions in a strictly linear fashion. Some problems require a choice among two or more courses of action, depending on some particular criteria or condition. Other problems require a series of steps to be repeated a certain number of times or until some condition is met.

In the previous chapter, we discussed simple expressions but we did not explicitly address the concept of statements. In short, a statement is an expression (or combination of expressions) followed by a semicolon. For example, a literal value followed by a semicolon is a statement, but it doesn't do anything. It is just a literal value in a void context. Such a statement will generate a warning when using the -w switch (and we are assuming that the -w switch is being used throughout this book). More realistic examples of statements are

    $foo = 12;
    print "$foo\n";

Statements are usually written on a single line, but this is not a requirement of the language. Occasionally, statements are too long to fit reasonably on a single line. Consider the following statement, which might have been used in the solution to the final exercise in the previous chapter:
    $gross_pay = $regular_hours * $hourly_rate + $overtime_hours * $hourly_rate * 1.5;

This is a perfectly valid statement, though it is not particularly readable when written in this fashion. This statement might be better written as

    $gross_pay = $regular_hours  * $hourly_rate +
                 $overtime_hours * $hourly_rate * 1.5;

Using whitespace in this manner helps the reader understand that this is a single statement. An even better way to rewrite this statement is to use three statements:

    $regular_pay  = $regular_hours  * $hourly_rate;
    $overtime_pay = $overtime_hours * $hourly_rate * 1.5;
    $gross_pay    = $regular_pay + $overtime_pay;

Multiple statements may be grouped into compound statements called blocks by enclosing them within a pair of curly braces. A block also introduces a scope on variables declared within it, which simply means that a variable declared in a my declaration within a block only exists within that block. In other words, the variable in question is visible only within that block:

    my $foo = 42;
    {
        my $bar = 13;
        print "$foo and $bar\n";    # both $foo and $bar exist within this block
    }
    print "$foo and $bar\n";        # warning, $bar does not exist here

Scope is an issue we will address more closely in chapter 7; for now we are declaring all of our variables with the my declaration and such variables are private to the block in which they are declared. Blocks can be nested. Your entire program is considered a block (think of Perl automatically putting curly braces at the beginning and end of your source code file).

Aside from introducing scope, a block serves to group statements into a compound statement that can be selected or repeated with the use of control statements.
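Nested blocks can be explored with a small experiment. In this sketch (the names $label and @seen are our own), an inner block declares a variable with the same name as an outer one, and the values seen at each point are recorded in an array.

```perl
use strict;
use warnings;

my $label = 'outer';
my @seen;

{
    my $label = 'inner';      # a separate variable that hides the outer $label here
    push @seen, $label;
    {
        push @seen, $label;   # a nested block sees the nearest enclosing declaration
    }
}
push @seen, $label;           # the inner $label is gone; the outer one is visible again

print "@seen\n";              # prints: inner inner outer
```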
5.1 Selection statements

The basic selection statement in Perl (and many other languages) is the if statement. This statement has the following general syntax:

    if ( Condition ) { Block }

There is no semicolon after the closing brace on a block. Often, the block contains multiple statements and the structure is written as follows (according to the principles laid out in chapter 2):

    if ( Condition ) {
        statement;
        statement;
    }

When Perl encounters an if statement it evaluates the condition. If true, then the block of statements is executed. Otherwise, execution skips the block and continues with the next statement following the block (see figure 5.1).
But what is a condition and what is true and false? The condition is simply any expression. This expression is evaluated in a scalar context to produce a scalar value. Remember, Perl interprets variables and expressions slightly differently, depending upon context (i.e., scalar or list context). When something is evaluated in a scalar context, the result of the evaluation is the scalar value. A value is false if it is undefined (a variable that has no value stored in it), a zero, an empty string, or a string containing only a zero. Every other value is considered true.

Figure 5.1 Flow diagram of an if statement

    my $foo = 1;
    if ( $foo ) {
        print "$foo is a true value\n";
    }

If you run the snippet above as a Perl program, it will execute the print statement inside the if block. If you change the initial assignment to $foo to be one of: 0 (or anything that evaluates to zero; 0.0 or 0e12 for example), '', or '0' then the if block will be skipped.
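These rules can be checked directly. The sketch below wraps an if statement in a small helper named truth(), a name invented here (subroutines are covered properly in a later chapter), and feeds it a handful of values. Note that the strings '0.0' and '00' are true: only the exact string '0' counts as a string containing only a zero.

```perl
use strict;
use warnings;

# truth() is a hypothetical helper written for this sketch.
sub truth {
    my ($value) = @_;
    if ($value) {
        return 'true';
    }
    return 'false';
}

print truth(0),     "\n";   # prints: false
print truth(0.0),   "\n";   # prints: false
print truth(''),    "\n";   # prints: false
print truth('0'),   "\n";   # prints: false
print truth(undef), "\n";   # prints: false
print truth('0.0'), "\n";   # prints: true
print truth('00'),  "\n";   # prints: true
print truth(' '),   "\n";   # prints: true
```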
Often we do not want to merely test a variable's value. We want to evaluate that value in terms of another value. In other words, is the variable's value less than, greater than, or equal to another specific value? There are operators, called relational operators, for each of these tests. These operators come in two forms: one tests numeric relations and one tests string relations (see table 5.1). It is a common mistake to use the wrong type of relational test, especially to mistakenly use a numeric test for string data. Perl will issue a warning about this if -w is being used. (You are using -w, aren't you?)
Table 5.1 Relational operators

    Numeric      String       Meaning
    $a <  $b     $a lt $b     True if $a is less than $b
    $a <= $b     $a le $b     True if $a is less than or equal to $b
    $a >  $b     $a gt $b     True if $a is greater than $b
    $a >= $b     $a ge $b     True if $a is greater than or equal to $b
    $a == $b     $a eq $b     True if $a is equal to $b
    $a != $b     $a ne $b     True if $a is not equal to $b
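The difference between the two forms matters even when both operands look like numbers. In this sketch, with values chosen by us, the same pair is compared once numerically and once as strings.

```perl
use strict;
use warnings;

my ($x, $y) = (10, 9);

my $numeric = 'no';
if ($x < $y) {        # numeric test: 10 is not less than 9
    $numeric = 'yes';
}

my $string = 'no';
if ($x lt $y) {       # string test: "10" sorts before "9", character by character
    $string = 'yes';
}

print "numeric: $numeric, string: $string\n";   # prints: numeric: no, string: yes
```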
You can use the if statement to set up choices. Perl can do one thing if the condition is true or something else if it is false. This form has the general syntax

    if ( Condition ) { Block } else { Block }

Again, this is better written as an indented structure according to principles outlined in chapter 2. Figure 5.2 shows the flow of control in such a structure.

Figure 5.2 Flow diagram of an if/else statement
The following example displays one thing if $foo is greater than $bar, or something else if the relation is false:

    if ( $foo > $bar ) {
        print "$foo is greater than $bar\n";
    } else {
        print "$foo is not greater than $bar\n";
    }
Additionally, you may select one of several blocks to execute by using multiple selection criteria in elsif clauses (note, there is no "e" before the "i" in the elsif keyword), which can be inserted between an if block and an else block (see figure 5.3).

Figure 5.3 Flow diagram of an if/elsif/else statement
In this manner, we can determine if a value is greater than, equal to, or less than another value:

    if ( $foo < $bar ) {
        print "$foo is less than $bar\n";
    } elsif ( $foo > $bar ) {
        print "$foo is greater than $bar\n";
    } else {
        print "$foo must be equal to $bar\n";
    }
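The conditions in such a chain are tested in order, and only the block belonging to the first true condition runs. The sketch below relies on that ordering; the letter_for() helper and its cutoff scores are hypothetical, chosen only for illustration.

```perl
use strict;
use warnings;

# letter_for() and its cutoffs are hypothetical, chosen only for illustration.
sub letter_for {
    my ($score) = @_;
    if ($score >= 80) {
        return 'A';
    } elsif ($score >= 65) {    # only reached when $score is below 80
        return 'B';
    } elsif ($score >= 50) {    # only reached when $score is below 65
        return 'C';
    } else {
        return 'F';
    }
}

print letter_for(91), letter_for(65), letter_for(50), letter_for(12), "\n";   # prints: ABCF
```

Because each test is reached only when the earlier ones have failed, there is no need to write both ends of every range.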
5.2 Repetition: loops

The other primary flow of control structure is the loop. A loop allows you to specify a block of code that is to be repeatedly executed a specific number of times (determinate loop) or until some condition is met (indeterminate loop). Consider a program to calculate the average of ten values input by a user. You could write out ten pairs of statements that prompt for and accept user input. Or you could write the code once and loop over it 10 times. What if you do not know in advance how many values will be averaged? In this case, you must use a loop to repeat the input statements until a special value, called a sentinel value, is entered.

The basic indeterminate loop, which you'll find in many other languages, is the while loop:
    while ( Condition ) { Block }

Here the condition is tested and, if it is true, the statements in the block are executed. Once the block is finished, control returns to the top of the loop and the condition is tested again. The block will be repeatedly executed as long as the condition evaluates to a true value. Once the condition returns a false value, execution will continue with the next statement following the block. Whenever you write an indeterminate loop, you should always double-check the statements in your loop-block to ensure that the conditional will eventually fail. Otherwise, you will have an infinite loop that won't stop running until you kill the program.

Here is an example that calculates the average of an undetermined number of grades:

    #!/usr/bin/perl -w
    use strict;
    my ($average, $total, $count) = (0, 0, 0);
    my $grade = '';

    while ($grade ne 'q') {
        print "Enter a grade or 'q' to quit: ";
        $grade = <STDIN>;
        chomp $grade;
        if ($grade ne 'q') {
            $count = $count + 1;
            $total = $total + $grade;
        }
    }

    # avoid division by zero if nothing entered
    $count ||= 1;
    $average = $total / $count;
    print "The average is $average\n";
In the example above, we first declare our variables for the average, the total, and the count of how many values will be eventually entered. These variables are initialized with values of zero, using the list assignment we saw in the previous chapter. We also declare a variable to hold each grade as it is input and initialize it with the empty string. We then enter the loop statement. If the value of the grade is not equal (ne) to the string q, the block is executed. Inside the block, we prompt the user for a grade, or the sentinel value q, and read this into the grade variable. We chomp() off the newline and use an if statement to determine if the sentinel was entered. If the sentinel was not entered, we add one to our count and add the grade to the total. This entire loop will be repeatedly executed until the user enters a single q, at which point the loop is finished, and the program goes on to calculate the average and display the result.

A while loop may also have an optional continue block immediately following the main loop block.

    while ( Condition ) { Block } continue { Block }

The continue block is executed each time the while loop continues to be executed by default (see figure 5.4). The continue block is not used much in practice. When it is, it is usually in conjunction with additional loop control statements we will discuss later in this chapter. However, we can use the while statement, complete with its continue block, to help define the next kind of loop, the for loop:
    for (initialization; condition; iteration) { Block }

In this statement, the first time the loop is encountered, it evaluates the initialization expression and tests the conditional expression. If the condition is true, the loop body is executed. When the loop body is finished executing, the iteration expression is evaluated, and the condition is tested again.

Figure 5.4 Flow diagram of a while loop statement

An example will help show what happens:
(my $count = 0; $count print "$count\n";
.
and ;v.
.;
    {
        print "still running\n";
    }

    while ( 1 ) {
        print "still running\n";
    }
Infinite loops have no value in programming unless you can somehow force the loop to exit with some other statement. Perl provides a last statement to do just this sort of thing.

    while ( 1 ) {
        print "Enter a value: ";
        my $input = <STDIN>;
        chomp $input;
        if ( $input eq 'q' ) {
            print "You are exiting the loop\n";
            last;
        }
        print "You entered $input, here we go again\n";
    }
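The same sentinel logic can be tried without typing anything. The sketch below is our own variation: it assumes the input is already held in an array (the hypothetical @inputs) and walks through it with an index, leaving the loop with last as soon as the sentinel is seen.

```perl
use strict;
use warnings;

# Hypothetical input, standing in for lines typed by a user.
my @inputs = (70, 85, 'q', 99);

my ($total, $count) = (0, 0);
my $index = 0;

while ($index < @inputs) {        # @inputs in numeric context gives its length
    my $input = $inputs[$index];
    $index = $index + 1;
    if ($input eq 'q') {
        last;                     # leave the loop at once; 99 is never read
    }
    $total = $total + $input;
    $count = $count + 1;
}

print "read $count values, total $total\n";   # prints: read 2 values, total 155
```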
The last statement will force an exit of the immediate enclosing loop. Perl provides two other loop control statements in addition to the last statement, the next and redo statements. Briefly, the next statement causes the rest of the loop block to be skipped and the continue block to be executed. Control then returns to the condition of the loop. In the case of the for loop, the loop body is skipped, and the iteration expression is evaluated again before returning to the conditional expression. A simple example might be to count the number of non-empty lines in a file:
# /usr/bin/perl -w use strict; my $ count =0; while { next if length ($_) $count++ !
(
-
,
.
)
[2] -> [0] \n" "$$aref [2] -> [0] \n" "$$aref [2] [0] \n" "$aref-> [2] [0] \n"
Notice in the
'
,
,
prints: 42
#
;
actually a reference to an
is
;•;
'
;
'Now, consider a nested structure of arrays (This
used between
is
subscript:
13,
$aref->
"
as the final value evaluates to a reference.
the arrow operator ->, which
is
may con-
how much
are to read.
anonymous array or hash values in Perl by using a scalar variable as if it already held a reference. (We could also create anonymous scalars, but anonymous scalars are rarely used in practice.) In other words, if you dereference a scalar variable that did not previously contain a reference to a value, Perl automatically creates an anonymous value of the appropriate type and stores a reference to it in the variable:

    my $variable;
    $variable->[0] = 42;
    print "$variable\n";         # prints: ARRAY(0x804a96c)
    print "$variable->[0]\n";    # prints: 42

Here, the variable $variable did not contain a reference, but when we used it as if it held a reference to an array, Perl automatically created an anonymous array and assigned the value 42 to the first element of that array, storing a reference to that array in the scalar variable.
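This automatic creation is especially handy inside a hash, where each value can spring into existence as an array the first time it is used. The sketch below is only an illustration: the records, their name:assignment:score layout, and the decision to push scores in input order are all our assumptions.

```perl
use strict;
use warnings;

# Hypothetical records in a name:assignment:score layout.
my @lines = ('Bill:1:72', 'Anne:1:85', 'Bill:2:64');

my %students;
foreach my $line (@lines) {
    # The assignment number is read but not used in this sketch.
    my ($name, $number, $score) = split /:/, $line;

    # $students{$name} holds nothing the first time a name is seen;
    # dereferencing it as an array makes Perl create the array for us.
    push @{ $students{$name} }, $score;
}

print "Bill: @{ $students{Bill} }\n";   # prints: Bill: 72 64
print "Anne: @{ $students{Anne} }\n";   # prints: Anne: 85
```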
As we saw in chapter 4, printing out a variable containing a reference just prints out its type and memory address. This way of creating anonymous values is called autovivification and can be quite useful. In fact, we will use this method in our solution to the marking problem given above. Here's a top level breakdown of the program:
    #!/usr/bin/perl -w
    use strict;
    my $data_file = 'scores.txt';
    my %students;
    <<read file and build data structure>>
    ...

This is all relatively straightforward, so let's jump right into the data structure code. We begin by opening our data file and looping through it to build our hash of arrays. After chomp()'ing the newline off a line, we split() it into an array of fields: the first element will be the student name, the second is the assignment number, and the last field is the score for that assignment. We use these fields to build the structure.

    <<read file and build data structure>>=
    open(SCORES, $data_file) || die "can't open file: $!";
    while (<SCORES>) {
        chomp;
        my @fields = split /:/;
    my %employees = (
        asmith => {
                    name => 'Anne Smith',
                    age  => 35,
                    beer => 'Pale Ale',
                  },
        bjones => {
                    name => 'Bill Jones',
                    age  => 21,
                    beer => 'Dark Ale',
                  },
        stims  => {
                    name => 'Sara Tims',
                    age  => 32,
                    beer => 'Wheat Ale',
                  },
    );

    foreach my $id (keys %employees) {
        print "$employees{$id}{name} drinks: $employees{$id}{beer}\n";
    }

This snippet of code produces the following output:

    Sara Tims drinks: Wheat Ale
    Bill Jones drinks: Dark Ale
    Anne Smith drinks: Pale Ale
8.1.3 Mixed structures

Some languages provide a special data type for a variable that can contain a mixed collection of other basic data types, for example the "record" in Pascal or the "struct" in C. Perl's nested structures give you complete flexibility over the kinds of things you can nest. We've already seen one mixed structure with the hash of arrays used for the student's assignment scores. Consider a more elaborate kind of record for each employee than that used in the example above. We will consider just a single employee record here, but the hash could contain any number of records, just as the example above does:

    my %employees = (
        asmith => {
                    name     => 'Anne Smith',
                    age      => 35,
                    children => ['Amanda', 'Amy'],
                    beer     => ['Pale Ale', 'Lager'],
                  },
    );

    print "@{$employees{asmith}{children}}\n";    # prints: Amanda Amy
Scope
8.2
You know
and references
that a lexical variable only exists during
diate enclosing block).
Most
to
hold a value,
importantly,
it
it
(i.e.,
no longer
also stores
memory
is still
can be released and used for something
imme-
to a variable
and
Perl sets aside
extra information about that value.
how many
things are pointing at that
and
how
it is
Perl determines
needed by the program or
else. Let's
,.,
.
if
that
memory
consider the simplest of cases:
;
{
(i.e., its
when
Well,
exists)?
called a reference count,
is
whether a value stored in
my $outer
some
maintains a count of
particular value. This
current scope
So what happens when we take a reference
that variable goes out of scope
memory
its
,
.
:
.
my
$ inner = 42 $outer = \$inner;
}
print
"
$$outer\n"
;
#
prints: 42
Now, you might be thinking, "Hey, how can $outer still refer to $inner after $inner has gone out of scope?" That's a good question. The answer is that $outer never referred to $inner, it just referred to $inner's stored value. Remember, the variable $inner itself is just a name associated with a particular memory location that holds a value. When we take a reference to a variable we get a reference to the memory location to which the variable is associated, not to the variable name itself. Figure 8.4 depicts graphically what is going on before, during, and after the inner scope. The reference count is depicted by how many arrows are pointing at a value.

Figure 8.4 Reference to a variable going out of scope
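The same effect can be observed inside a loop, where a new lexical variable is created on every pass. In the sketch below (the names are ours), each stored reference keeps its own value alive after the loop variable has gone out of scope.

```perl
use strict;
use warnings;

my @refs;
foreach my $n (1 .. 3) {
    my $inner = $n * 10;    # a fresh $inner is created on every pass
    push @refs, \$inner;    # the stored reference keeps that value in use
}

# $inner is out of scope here, yet each stored value is still reachable.
my @values;
foreach my $ref (@refs) {
    push @values, $$ref;
}
print "@values\n";          # prints: 10 20 30
```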
This ability to have a reference to a variable going out of scope is useful when you want to create a reference within a subroutine and store it in a lexical variable while you work with it before returning this reference back to the main program.

    my $foo = get_ref();
    print "@$foo\n";    # prints: 1 2 3

    sub get_ref {
        my $temp = [1, 2, 3];
        return $temp;
    }
    ('arrow syntax');

    sub foo {
        print "$_[0]\n";
    }
To create an anonymous subroutine, you use the sub keyword followed immediately by the block of code that will serve as the subroutine:

    my $sub_ref = sub {
        my $arg = shift @_;
        print "$arg\n";
    };

    $sub_ref->("I'm anonymous");
References to subroutines are convenient for a variety of things, but the most popular usage is probably for creating dispatch tables by creating a hash of function references. Consider an interactive program that asks the user for a command to execute:
    my %dispatch_table = (
        foo  => sub { print "You chose 'foo'\n" },
        bar  => sub { print "You chose 'bar'\n" },
        quit => \&quit,
        q    => \&quit,
        help => \&help,
    );

    print "Enter a command (or 'q' or 'quit' to exit): ";
    while (<STDIN>) {
        chomp;
        my $command = $_;
        if ( $dispatch_table{$command} ) {
            $dispatch_table{$command}->();
        } else {
            print "Illegal command: $command\n";
        }
        print "Enter a command (or 'q' or 'quit' to exit): ";
    }

    sub help {
        print "The available commands are:\n";
        foreach my $com ( keys %dispatch_table ) {
            print "\t$com\n";
        }
    }

    sub quit {
        exit 0;
    }
Here we have two different keys in the hash referring to the same quit function. This is not only allowed, since a "thing" can be referenced any number of times, but encouraged. The alternative would have been to create a duplicate anonymous function for both the q and quit keys in the hash. In other words, Perl would have to make and store two separate, but identical, function references when only one was really necessary.

8.3.1 Closures

When you create an anonymous subroutine in Perl it is deeply bound to its current lexical environment. Such a subroutine is called a closure. Another way to say this is that an anonymous subroutine carries its lexical environment around with it even when that environment is no longer in scope. This is best shown by example:

    sub speak {
        my $saying = shift;
        return sub { print "$saying $_[0]!\n"; };
    }

    my $batman = speak('Indeed');
    my $robin  = speak('Holy');
    $robin->('mackerel');    # prints: Holy mackerel!
    $batman->('Robin');      # prints: Indeed Robin!
Here, the subroutine speak() assigns the first element of the parameter list passed to it to the lexical variable $saying. It then returns a new anonymous subroutine that uses that lexical variable along with the first parameter of whatever will be given to it as an argument. Both $batman and $robin are assigned a new anonymous subroutine from speak(), but with different values of $saying. These anonymous subroutines are then called using the arrow dereference syntax. Each still contains the original value of $saying with which they were created. Thus, even though the referenced subroutines are called in a different scope than where they were created, they act (in terms of any lexical variables used within them) as if they were still within the same scope in which they were created.
we
kinds of things closures are good for are somewhat specialized, but
will consider
one particular usage
here: creating a stream based
on
a particular
mathematical function.
You may or may not be familiar with a mathematical series of numbers called the Fibonacci numbers. The first two Fibonacci numbers are 0 and 1. The next number in the series is obtained by adding together the two previous numbers. So the third number, the result of 0 + 1, is also 1, and the fourth, the result of 1 + 1, is 2. The first ten Fibonacci numbers are 0, 1, 1, 2, 3, 5, 8, 13, 21, 34.

We can envision a recursive definition for the nth Fibonacci number in the sequence: for any position n in the sequence, the Fibonacci number at that position is

    fibo(n) = n                        if n == 0 or n == 1
    fibo(n) = fibo(n-1) + fibo(n-2)    if n > 1

We can translate this definition into a simple recursive subroutine that calculates the Fibonacci number for a given position in the sequence. We could also create an iterative type routine that accomplishes the same task but more quickly. But what if we wanted to be able to print out the first five such numbers, then do some other things, then print out the next five numbers in the sequence? A subroutine that calculates the Fibonacci number for position n would have to be called at least five times, once for each position in the list, and more if it's a recursive function. Then, the program would have to store its position in the sequence in order to continue the sequence later. And, still, it would be repeating a lot of computational work already performed each time it calculated the next position in the sequence.

One answer, probably the best in this case, is to create a separate storage array, a cache, for numbers that have already been computed and have the subroutine utilize this array to continue the sequence from where it left off. An alternative is to use closures to provide a steady stream of Fibonacci numbers that can be continued at any time:
    # fibonacci stream generator:
    sub new_fib_stream {
        my ($current, $next) = (0, 1);
        return sub {
            my $fib = $current;
            ($current, $next) = ($next, $current + $next);
            return $fib;
        };
    }

    # create two new fibonacci streams
    my $fib_stream1 = new_fib_stream();
    my $fib_stream2 = new_fib_stream();

    # print out first 5 fibonacci numbers from stream 1
    foreach (1..5) {
        print $fib_stream1->(), "\n";
    }

    # print out first 10 fibonacci numbers from stream 2
    foreach (1..10) {
        print $fib_stream2->(), "\n";
    }

    # print out next 5 fibonacci numbers in stream 1
    foreach (1..5) {
        print $fib_stream1->(), "\n";
    }
When the new_fib_stream() function is called, it creates a new scope and defines two lexical variables within that scope. It then defines (and returns) an anonymous subroutine that uses those lexical variables. Calling the generator function again creates an entirely new scope with its own lexical variables and returns a new closure that also uses those variables. In this way, you can set up as many completely independent Fibonacci stream closures as you want.

Closures are something of an advanced topic, so we are only introducing the concept here.
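
The cache-based approach described earlier in this section can itself be written with a closure. The following is only a sketch, not code from the book: the name new_fib_cache is hypothetical, and it assumes positions are non-negative integers. The private @cache array survives between calls, so each lookup only computes positions that have not been requested before.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the book): returns a closure that
# looks up Fibonacci numbers by position, caching what it computes.
sub new_fib_cache {
    my @cache = (0, 1);                 # positions 0 and 1 are known
    return sub {
        my $n = shift;                  # assumed to be a non-negative integer
        # extend the cache only as far as needed
        while ($#cache < $n) {
            push @cache, $cache[-1] + $cache[-2];
        }
        return $cache[$n];
    };
}

my $fib_at = new_fib_cache();
print join(', ', map { $fib_at->($_) } 0 .. 9), "\n";
print $fib_at->(20), "\n";
```

Unlike the stream closure, this version allows random access to any position, at the cost of keeping every computed number in memory.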
8.4 Nested structures on the fly

Imagine a file of employee data, with one employee record per line recording employee ID number, first name, last name, department, and type (full or part time):

    142a13:John:Doe:Sales:pt
    971a22:Jane:Doe:Operations:ft
    131b21:Amanda:Smith:Sales:pt
    119d17:Frank:Cannon:Support:pt
    123a12:Ron:Gold:Support:pt
    666s66:Lucy:Kindser:Operations:ft
    777q42:Bob:Norman:Sales:ft

If we wanted to read this file and create an array of arrays, we could simply loop through it, splitting each line on the colons to produce an array. We could then push a reference to that array into our main array:
    #!/usr/bin/perl -w
    use strict;
    open(FILE, 'employee.dat') || die "can't open file: $!";

    # build structure
    my @employees;
    while (<FILE>) {
        chomp;
        my @fields = split /:/;
        push @employees, \@fields;
    }

    # print info from structure
    foreach my $person (@employees) {
        print "Name: $person->[1] $person->[2], ";
        print "Department: $person->[3]\n";
    }
This can be simplified further by using the split function inside an anonymous array constructor:

    my @employees;
    while (<FILE>) {
        chomp;
        push @employees, [split(/:/)];
    }
You can use any expression inside an anonymous array (or hash) constructor, and the expression will be evaluated in a list context. The result of the evaluation will be the contents of the new anonymous array.

In the case of data of this sort, a hash of arrays may be a better choice for a structure:

    #!/usr/bin/perl -w
    use strict;
    open(FILE, 'employee.dat') || die "can't open file: $!";

    # build structure
    my %employees;
    while (<FILE>) {
        chomp;
        my @fields = split /:/;
        my $id = shift @fields;
        $employees{$id} = [@fields];
    }

    # print info from structure
    foreach my $id (keys %employees) {
        print "ID = $id. Name = $employees{$id}->[0]\n";
    }
Or maybe you'd like to read the data into a hash of hashes so that each field could be accessed by a field name:

    #!/usr/bin/perl -w
    use strict;
    open(FILE, 'employee.dat') || die "can't open file: $!";

    # build structure
    my @field_names = qw(fname lname dept type);
    my %employees;
    while (<FILE>) {
        chomp;
        my ($id, @fields) = split /:/;
        @{$employees{$id}}{@field_names} = @fields;
    }

    # print info from structure
    foreach my $id (keys %employees) {
        print "$id: $employees{$id}{fname} $employees{$id}{lname}\n";
    }
Another way to structure this data to generate a report would be to create a hash using department names as the primary key pointing to a hash of ID number keys, which, in turn, each point to a hash of the rest of the record fields. To give you an idea of the structure, here is what part of it would look like to build manually:

    %departments = (
        Sales   => {
            '142a13' => { fname => 'John',
                          lname => 'Doe',
                          type  => 'pt',
                        },
            '131b21' => { fname => 'Amanda',
                          lname => 'Smith',
                          type  => 'pt',
                        },
        },
        Support => {
            '119d17' => { fname => 'Frank',
                          lname => 'Cannon',
                          type  => 'pt',
                        },
        },
    );
    print "$departments{Sales}{'131b21'}{lname}\n";    # prints: Smith

And here is one way we could build such a hash of hashes of hashes on the fly:

    #!/usr/bin/perl -w
    use strict;
    open(FILE, 'employee.dat') || die "can't open file: $!";

    # build structure
    my %departments;
    my @field_names = qw(fname lname type);
    while (<FILE>) {
        chomp;
        my @fields = split /:/;
        my %record;
        @record{@field_names} = @fields[1, 2, 4];
        $departments{$fields[3]}{$fields[0]} = {%record};
    }

    # print employee data by department
    foreach my $dept (keys %departments) {
        print "$dept:\n";
        foreach my $id (keys %{$departments{$dept}}) {
            my $record = $departments{$dept}{$id};
            print "\t$id: $record->{fname} $record->{lname} ";
            print "($record->{type})\n";
        }
    }
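
Once the data is nested by department, questions about a single department become short expressions. The sketch below is an illustration only: it uses a few in-memory records instead of employee.dat, and the helper names build_departments and part_timers are hypothetical. It builds the same department-to-ID-to-record structure and counts the part-time employees in each department.

```perl
use strict;
use warnings;

# Sample records in the same colon-separated layout as employee.dat
my @lines = (
    '142a13:John:Doe:Sales:pt',
    '971a22:Jane:Doe:Operations:ft',
    '131b21:Amanda:Smith:Sales:pt',
    '777q42:Bob:Norman:Sales:ft',
);

# Hypothetical helper: department => { id => { fname, lname, type } }
sub build_departments {
    my %departments;
    foreach my $line (@_) {
        my ($id, $fname, $lname, $dept, $type) = split /:/, $line;
        # intermediate hashes spring into existence as needed
        $departments{$dept}{$id} = {
            fname => $fname,
            lname => $lname,
            type  => $type,
        };
    }
    return \%departments;
}

# Hypothetical helper: count part-time employees in one department
sub part_timers {
    my ($departments, $dept) = @_;
    my @records = values %{ $departments->{$dept} };
    return scalar grep { $_->{type} eq 'pt' } @records;
}

my $departments = build_departments(@lines);
foreach my $dept (sort keys %$departments) {
    print "$dept: ", part_timers($departments, $dept), " part-time\n";
}
```

Sorting the department keys gives a stable report order, which plain keys does not guarantee.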
8.5 Review

Reference Creation

    $scalar   = 42;
    $sc_ref1  = \$scalar;    # explicit reference to $scalar's location.
    $$sc_ref2 = $scalar;     # implicit creation of new scalar location
                             # holding value of $scalar (autovivification)

    @array    = (42, 13);
    $a_ref1   = \@array;     # explicit reference to @array's location.
    $a_ref2   = [@array];    # explicit creation of new array location
                             # holding copy of @array.
    @$a_ref3  = @array;      # implicit creation of new array location
                             # holding copy of @array.

    %hash     = (a => 42, b => 13);
    $h_ref1   = \%hash;      # explicit reference to %hash's location.
    $h_ref2   = {%hash};     # explicit creation of new hash location
                             # holding copy of %hash.
    %$h_ref3  = %hash;       # implicit creation of new hash location
                             # holding copy of %hash.

Dereferencing

    print "$$sc_ref2";          # prints: 42
    print "@$a_ref3";           # prints: 42 13
    print "${$a_ref2}[1]";      # prints: 13
    print "$a_ref2->[1]";       # prints: 13
    my @ary = keys %$h_ref2;
    print "@ary";               # prints: a b
    print "${$h_ref3}{b}";      # prints: 13
    print "$h_ref3->{b}";       # prints: 13
8.6 Exercises

1 Write a routine that reads the following table into a two-dimensional array, such as an array of arrays or a matrix:

    one    two    three
    four   five   six
    seven  eight  nine

Then have the routine transpose the rows and columns to produce the following output:

    one    four   seven
    two    five   eight
    three  six    nine
CHAPTER 9

Documentation

9.1 User documentation and POD
9.2 Source code documentation
9.3 Tangling code
9.4 Further resources
At this point you have acquired the tools and concepts for creating both simple and sophisticated programs using the Perl language. However, unless you document your programs they are incomplete. Some would even consider them "broken." With each program you write, you need to supply both user documentation and source documentation.

User documentation is simply the instruction manual for using your program. Source documentation is the documentation you or another programmer will want when maintaining, revising, or just trying to understand code you or someone else has written. We discussed techniques for source documentation back in chapter 2: trying to make the source code as self-documenting as possible by using good formatting, choosing good variable names (which applies to choosing good names for subroutines and filehandles as well), and by using informative comments. Sometimes all this is not enough. Later in this chapter we will discuss other ways of documenting your source code using Literate Programming (LP) techniques. First, however, we will cover creating user documentation using Perl's standard Plain Old Documentation (POD) format.

9.1 User documentation and POD

User level documentation is, as its name implies, intended for the user of the program or module. For programs, this documentation should cover the essentials of running the program: what arguments it expects, what options it might take, and what outputs it produces. For modules (discussed in parts 3 and 4), the intended user is another Perl programmer. Documentation in that case should cover the programmer interface to the module: the availability of functions and/or methods as well as the calling interface and return values of each of those functions or methods.

The current standard way to provide user level documentation for your Perl programs or modules is to embed the documentation directly in the Perl source file using the POD markup language. POD is a simple markup language that the Perl compiler understands just well enough to ignore, which means you can use it within your script and not worry about it.

Learning to use POD is easy and means your documentation will be available in a standard form, since all of the standard documentation included in the Perl distribution is written in POD format. Using POD to include your documentation within your program means that any user of your program or module automatically has the documentation. These two benefits shouldn't be underestimated. You may be more likely to properly update your documentation when it is right there in the same file as the code rather than in a separate text file.
Let's first consider POD on its own before we discuss embedding it in Perl source code. POD is just a way of marking up plain text so that another utility program (a translator such as pod2html or pod2latex) can convert the pod-source into source code for one or another formatting program. The set of tags is small and you can write your document using POD tags. You can then use the various utility translators to produce LaTeX files, HTML files, PostScript, manpages, or something else for which there is a translator. The set of formatting tags is covered in the perlpod pod-page. We will just consider the basic elements here.

The first set of tags we'll consider are structural tags that indicate headings, lists, and items within a list:

    =head1 This is a level 1 heading

    This is a small paragraph of text below the level 1 heading.

    =head2 This is a level 2 heading

    And this is a paragraph of plain text below the second level heading

    =over 4

    =item *

    first bulleted list paragraph

    =item *

    second bulleted list paragraph

    =back

    =cut

The =head1 and =head2 tags are fairly self explanatory: they create headings and sub-headings. The =over 4 tag starts a list, the number implying an indent size for some formatters. Each item in the list is tagged with an =item tag which, if followed by an asterisk, produces a bulleted list item. A numbered list can be created simply by using increasing digits in place of the asterisk. The =back tag ends the list, and the =cut tag ends the current section of POD until another =xxx style of POD command is encountered.

All the POD command or structure tags need to start at the left margin. Plain paragraphs also need to be flush to the left margin and need to be separated by blank lines. (Note, though, that seemingly blank lines containing spaces can cause problems.) Within a plain paragraph, several formatting tags can be used:
    =head1 Simple Formatting

    Within a plain paragraph some text might be tagged as I<italic> or
    B<bold> or as C<code> (presumably formatted in fixed width font).
    The S<> tag means that the text inside the tag delimiter should
    not be broken across lines at the spaces.

    You can enclose filenames in an F<> tag, index entries using X<>,
    and links with the L<> tag. Links are for manpage references, and
    a tag like: L<blah> would translate to "the blah manpage" but this
    added text can be controlled (see L<perlpod> for further information)

    =cut
Aside from structural elements and plain paragraphs, you can also create "verbatim" paragraphs, where what you write should get typeset as it is with no wrapping, no interpretation of tags, etc. You create verbatim paragraphs by indenting each line:

    =head2 Verbatim Example

    This is a plain paragraph that might be wrapped by the formatter
    and will have B<tags> interpreted in the process. A I<verbatim>
    paragraph, perhaps showing a code example can be achieved by
    indenting:

        #this is a verbatim paragraph
        my $foo = "nothing B<interpreted> in here";
        print $foo;

    =cut
And that's most of what it takes to write simple POD documents. The added bonus of POD is that everything between an =command type tag and an =cut tag is ignored by the Perl compiler. Therefore, it is simple to write your user documentation (in a manpage format we will discuss next) directly within your program or module. By keeping your documentation inside your program you are more likely to remember to update it to reflect changes and new features, and you are always sure that anyone who receives your program also receives the documentation. This user can then use one of the many translators to produce a nicely formatted document for viewing or printing. Or the user can view the document using the perldoc utility included with the Perl distribution. (Some distributions of Perl may not include the perldoc utility: the MacPerl distribution, for example, comes with a viewer called shuck.)

The standard convention for writing embedded POD documentation is to supply at least the minimum sections of a standard Unix-like manpage. The minimal level-1 sections that you should include for a program called foo that solves the world's problems are shown in table 9.1.

Table 9.1 Sections to include in POD documentation

    Section       Description
    -----------   ------------------------------------------------------------
    NAME          The name of the program or module and a few words about
                  what it does.
    SYNOPSIS      A brief usage example of the program showing its calling
                  syntax, or a representative sample of the functions within
                  the module.
    DESCRIPTION   A more extended discussion of the program, possibly with
                  subsections using =head2 tags.
    OPTIONS       If any, put them in a list here, or perhaps under the
                  description above. If you choose the latter option, omit
                  this section.
    FILES         A list of the files used by the program; if none, leave
                  this section out.
    SEE ALSO      A list of manpages for related programs and/or
                  documentation.
    BUGS          If you have bugs in your program that aren't ironed out
                  yet, list them here or risk being told about them again
                  and again...
    AUTHOR        Author's name and contact information, plus any copyright
                  statement you want to include (which also could be under
                  its own level 1 heading).

There are other relatively standard sections you may want to include. (See perlpod.) You can and should add any additional headings and information about your program that you deem appropriate.
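
To make the manpage-style layout concrete, here is a minimal sketch of a script with embedded POD. Everything specific in it is a placeholder chosen for illustration: the program name foo, the section wording, the author line, and the solve routine are assumptions, not text from the book. The point is only that the compiler skips from the first =head1 to the =cut and then carries on with ordinary code.

```perl
#!/usr/bin/perl -w
use strict;

=head1 NAME

foo - solves the world's problems (illustrative placeholder)

=head1 SYNOPSIS

    foo problem [problem ...]

=head1 DESCRIPTION

Prints a one-line "solution" for each problem named on the
command line.

=head1 AUTHOR

A. N. Author (placeholder)

=cut

# Hypothetical routine standing in for the program's real work.
sub solve {
    my ($problem) = @_;
    return "solved: $problem";
}

print solve($_), "\n" foreach @ARGV;
```

If the file is saved as foo, running perldoc on it (where perldoc is installed) should display just the documentation, while running it with perl executes only the code.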
You may include POD almost anywhere within your program. The basic rule is that if the compiler reads a POD tag directive when it was looking for a new statement, it will ignore everything up to and including the next =cut directive it finds. Some authors put all the POD information at the beginning of the program; some place it at the end, and others mix it throughout the source code. Like many things in Perl, it is a choice left largely up to the programmer.

9.2 Source code documentation

Source code documentation was discussed to some extent in chapter 2. That chapter was all about using good style and comments to produce readable, maintainable code. Often those guidelines are all you need. (Some programmers, and I mean well-respected ones, not crackpots, go as far as to say that if your code needs extra comments to be understood, you need to rewrite your code.) Sometimes, however, this is not enough. You may want to include detailed explanations of certain algorithms, provide diagrams, or present the code in an order different from the more linear order the compiler expects. Literate Programming (LP) techniques can be used for such things and more.
LP is neither specific to Perl, nor necessary for programming with Perl, but it can be such a useful tool that I will devote the remainder of this chapter to using LP in your Perl programming.

Literate Programming is a method of programming developed by Donald Knuth in the early 1980s (Knuth, D. E. "Literate Programming." The Computer Journal (27)2:97-111, 1984). The essence of LP is embodied in a quote from Knuth:

    Let us change our traditional attitude to the construction of programs: Instead of imagining that our main task is to instruct a computer what to do, let us concentrate rather on explaining to humans what we want the computer to do. (Knuth, 1984.)

The basic concept is that one should be able to write out the program description and the program source code together in a source file. This can be presented in an order suited to explaining the code to humans. The program source can be extracted from the file and tangled together into its proper order for the compiler. The documentation, then, is the original file, which is run through a process called weaving to produce the description and code in a form ready for typesetting (usually by LaTeX, but other target formats such as HTML can be used by some LP tools).

Quite a few LP systems exist out there. Many are designed for a particular programming language. There are also a few language-independent LP systems. You are already familiar with some of the syntax of one, the noweb system, created by Norman Ramsey. Some Perl programmers use POD as a form of LP, intermixing sections of POD-formatted description within the source code. Personally, I feel that POD is much better suited to documenting the interface of a program than documenting the source code itself. Thus, in this section we will consider the noweb system of LP.
Back in chapter 3 (and again in chapter 5), we used a simple little syntax to name and define chunks of code. That syntax comes from noweb, though we did not use the complete noweb syntax. This syntax allowed us to break our programs up into manageable little units that we could discuss in any order we wished, independent of the order in which the particular lines of code had to be assembled back into the main program (i.e., the root chunk).

The actual syntax to define a chunk of code in noweb begins with double angle brackets enclosing a chunk name, immediately followed by an equals sign (and nothing else on that line). A chunk of documentation is begun with a single @ symbol at the left margin, followed by a space or a newline. A chunk is terminated when a new chunk begins or the end of the file is encountered.

    @
    This is a documentation chunk in which we would explain why
    the following code assigns the answer to the universe (42)
    to the variable $foo then does other stuff and finally
    prints out what $foo divided by 2 is:
    <<chunk>>=
    my $foo = 42;
    <<some other chunk>>
    print "$foo / 2 is $bar\n";
    @
    This is another documentation chunk, in which we would explain
    the significance of dividing $foo by 2 if doing so had any
    significance, which, of course, it doesn't.
    <<some other chunk>>=
    my $bar = $foo / 2;
    @
Now, if the above were to represent a complete (albeit useless) program, saved in a file named useless.nw, then running the tangler (in noweb, the tangler is called notangle) on that source as:

    notangle -Rchunk useless.nw

would produce the following output:

    my $foo = 42;
    my $bar = $foo / 2;
    print "$foo / 2 is $bar\n";

Had the placement of the two chunks been reversed, the tangled output would have been the same. The chunk given on the command line is found and printed, and any referenced chunks are replaced with their definitions. This means you can begin designing your program using high level concepts as chunk names, then design and define each of the chunks in the order that makes the most sense to you, much as we did for the faqgrep and primes programs in chapters 3 and 5 respectively.
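
The find-and-replace behavior just described can be sketched in a few lines of Perl. This is not notangle and not the tangler developed later in the chapter; it is a simplified illustration under stated assumptions: the helper names read_chunks and tangle are hypothetical, the source is held in a string rather than a file, chunk references sit alone on their line, and no check is made for chunks that refer to themselves.

```perl
use strict;
use warnings;

# Hypothetical helper: collect code chunks from noweb-style text.
# Returns a hash reference of chunk name => array of code lines; a
# chunk defined more than once simply has its lines appended.
sub read_chunks {
    my ($text) = @_;
    my (%chunks, $current);
    foreach my $line (split /\n/, $text) {
        if ($line =~ /^<<(.+)>>=\s*$/) {        # a code chunk begins
            $current = $1;
        } elsif ($line =~ /^@(\s|$)/) {         # a documentation chunk begins
            undef $current;
        } elsif (defined $current) {            # a line of code
            push @{ $chunks{$current} }, $line;
        }
    }
    return \%chunks;
}

# Hypothetical helper: return the lines of a chunk with every
# referenced chunk replaced, recursively, by its definition.
sub tangle {
    my ($chunks, $name) = @_;
    die "undefined chunk: $name\n" unless exists $chunks->{$name};
    my @out;
    foreach my $line (@{ $chunks->{$name} }) {
        if ($line =~ /^\s*<<(.+)>>\s*$/) {
            push @out, tangle($chunks, $1);
        } else {
            push @out, $line;
        }
    }
    return @out;
}

my $source = <<'END_NW';
@ Explain the root chunk here.
<<chunk>>=
my $foo = 42;
<<some other chunk>>
print "$foo / 2 is $bar\n";
@ Explain the division here.
<<some other chunk>>=
my $bar = $foo / 2;
@
END_NW

print "$_\n" foreach tangle(read_chunks($source), 'chunk');
```

Because chunks are looked up by name only when they are referenced, the order in which they are defined in the source does not affect the tangled result.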
A particular chunk definition may also be continued (i.e., not fully specified in one place) by simply using the same chunk name when starting another chunk definition. Continued chunks must occur in the order they are to appear in the final tangled code: notangle simply concatenates all continued chunks together to produce a single chunk:

    Documentation stuff...
    <<chunk>>=
    my $foo = 42;
    <<another chunk>>
    @
    More documentation...
    <<another chunk>>=
    my $bar = $foo / 2;
    @
    then we continue defining the original chunk:
    Yet more documentation...
    <<chunk>>=
    print "$foo / 2 is $bar\n";
    @
Tangling the above version produces the same output as the previous example. All the continued component chunks of chunk are concatenated together and then the chunk is printed. Any embedded chunks are replaced with their definitions in the same manner.

The -R option told notangle which code chunk to use as the root chunk. A given noweb source file might hold more than one root chunk. A root chunk represents a complete program or module or something you want to tangle out into its own file. For example, you may write a module or library file that defines several functions. You can write such a module as a literate source file. You can also include in the same source file a program whose purpose is to test each of the functions to ensure they are all working. In this way, you could even have each chunk of testing code follow the chunk (or chunks) of module code that it tests, thus keeping related code close together in the literate source file. Then you could tangle out either the module code or the test code or both.

Of course, we may still make errors in our code and Perl will print error messages to the screen, noting the file name and line number where it found the error (or at least where Perl thinks the error might be). The problem is that we've gone to all the trouble of writing our source code in chunks in a separate file. Hence, the file name and line numbers from the tangled program that we actually run will not be correct. Well, this is not really a problem. Many language compilers or interpreters understand special directives that do nothing except tell the compiler what file it is reading and what line number it is reading. These are useful for lying to the compiler about what file it is reading and what line in that file. We can use them to tell the compiler where the corresponding code is located in the useless.nw file. When an error is encountered, the error messages will point to the corresponding code.
In Perl, a line directive takes the form of a special comment that looks like

    #line 13 "file"

by itself on a line. Actually, it will recognize directives matching the pattern

    /^#\s*line\s+(\d+)\s*(\s"([^"]*)")?/

In that pattern, the $1 variable would hold the line number and the $3 variable would hold the file name. The file name is optional. If the file name is not included, Perl uses whatever it currently thinks is the file name.
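
A quick way to see a line directive at work is to compile a small piece of code that begins with one. The following sketch is an illustration rather than part of the book's tangler: it assumes the snippet is compiled with a string eval, and the file name useless.nw and the message text are placeholders. It also applies the pattern quoted above to a sample directive to show which capture holds which value.

```perl
use strict;
use warnings;

# Compile a snippet that begins with a line directive; the error it
# raises is reported against the file name and line number we supplied.
my $code = join "\n",
    '#line 13 "useless.nw"',
    'die "something went wrong";',
    '';
eval $code;
my $error = $@;
print $error;

# The pattern quoted in the text, applied to a sample directive.
my $directive = '#line 13 "useless.nw"';
my ($line_no, $file);
if ($directive =~ /^#\s*line\s+(\d+)\s*(\s"([^"]*)")?/) {
    ($line_no, $file) = ($1, $3);
}
print "line $line_no of $file\n";
```

The message captured in $error should name useless.nw and line 13, even though no such file exists, which is exactly the kind of "lying to the compiler" a tangler relies on.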
The tangler in noweb can be given a -L option and it will emit line directives into the tangled code referring to positions in the original .nw source file. Using the above useless.nw file as an example, a call to

    notangle -Rchunk -L useless.nw > xxx

would produce the following output in a file named xxx:

    #line 3 "useless.nw"
    my $foo = 42;
    #line 11 "useless.nw"
    my $bar = $foo / 2;
    #line 5 "useless.nw"
    print "$foo / 2 is $bar\n";

That is, with the -L option in effect, a line directive is emitted whenever it enters or returns from a code chunk. Now, if there was an error, such as my $bar = foo / 2; (where the $ symbol on $foo is missing), then running the resulting tangled code as perl -w xxx would produce the following errors:

    Unquoted string "foo" may clash with future reserved word at useless.nw line 11.
    Argument "foo" isn't numeric in divide at useless.nw line 11.
So we could proceed directly to our original noweb source file to find the problem. We only use the tangled code to run the program. We write, debug, revise, and maintain the literate source.

If that were all noweb (and other LP systems) allowed you to do, it would still be beneficial. The original .nw source file would be your program's documentation. However, there is another side of the LP system: "weaving." The noweb system can weave your source file to produce LaTeX or HTML documentation for formatting and viewing. This means you can write the documentation chunks using whatever capabilities the target formatter allows, such as typeset mathematical formulas, diagrams, lists, cross-references, and indexes. Both the LaTeX and HTML backends perform additional formatting and cross-referencing of your actual code chunks, and any identifiers you specify, through the use of command line options to the noweave part of the system.

When you end a code chunk, you may use a special directive to specify a list of identifiers (variables, filehandles, subroutine names, etc.) considered "defined" in that chunk. In the woven version, such identifiers will be cross-referenced and an index can be produced listing all defined identifiers, the chunks where they were defined, and all the chunks that used that identifier. Chunks themselves will also be cross-referenced and indexed for easy reference purposes. You may specify such identifiers by following the @ symbol that ends the code chunk with a space and the %def directive, followed by a space-separated list of identifiers on the same line:

    In a documentation chunk. Here is a code chunk which
    includes identifier definitions marked at the end of the chunk
    <<chunk>>=
    my $foo = 42;
    my $bar = 13;
    @ %def $foo $bar
    Now in another documentation chunk.

Unfortunately, because this book is not being typeset using LaTeX and the noweb system, it is not possible to show you here exactly what the typeset documentation actually looks like. However, I have created noweb source files for a more sophisticated version of the faqgrep program (shown in chapter 13) and the simple tangler program to be shown later in this chapter. These will be available at http://www.manning.com/Johnson/, where you will find a link to additional online resources for the book, including source code and typeset PostScript and Portable Document Format (PDF) versions of the documentation for the two programs mentioned.

9.2.1 Other uses of LP

One good use of an LP style of programming is the presentation of source code for teaching purposes, which is exactly why I have used a form of LP when presenting many of the programs you have encountered thus far in this book.

A couple of related uses arise from the fact that a single literate source file may contain the source code for more than one program, each with its own root chunk. Of what possible use is this? Consider that you are writing a program or module with many small components (chunks). You may also wish to write a comprehensive test program that verifies that each component works correctly. Using LP, you can create two root chunks (one for the program or module, and one for the test suite), then develop each component of the program, followed by its related testing component. In this way, test code remains close to the code it is intended to test, and the documentation can deal with issues at the same time. If a component needs to be fixed or modified, the appropriate test code is easily adjusted at the same time.

This idea applies to test data as well: you may keep chunks of code that deal with particular kinds of data next to chunks of data designed for testing that particular chunk of code. Similarly, a program may read an external configuration file or a file describing a set of parsing rules. As above, the program and the external configuration or parsing data can be written in the same source file in a parallel fashion.
None of this implies that literate code need be contained in a single file (though our tangling script given below in section 9.3.1 does only operate on one file at a time). You are free to use multiple files to create your literate source code for a system. The multiple files do not need to match up with the multiple files ultimately produced by the tangling process. Instead, you might choose to break your literate sources into files along chapter or section boundaries or whatever works best for presenting the source code in a logical fashion. Like Perl, LP is about giving you more freedom and flexibility in how you approach both code design and documentation.

9.3 Tangling code

The following section presents a simple tangle-like program that will work with the noweb syntax we've just described. We call the presentation only partially literate because it lacks all the formatting and cross-referencing that would be available in the real typeset version. (When the literate program is typeset using noweb and LaTeX, cross reference material is automatically added.) We will include a few identifier definition markers so you can see what they look like. Remember, the full plain text literate source and typeset versions of the following program are available for download at http://www.manning.com/Johnson/.

The following program will allow you to try out writing literate source code without fetching and installing the actual noweb system. It will also serve to tie together many of the things you've learned in the previous chapters, and introduce a couple of new functions you haven't seen yet.
9.3.1 A simple tangler

Now that we know what the chunk definition and the reference syntax are, we can build a limited tangler program to allow us to write our Perl programs using noweb's syntax, intermixing code chunks and documentation chunks (we are in a documentation chunk right now) throughout the source file.

We want our tangler to operate similarly to notangle, allowing a -R option to specify the root chunk and a -L option to include line directives. We add two differences. The notangle program simply prints the tangled code to STDOUT so you have to redirect it to a file yourself. We assume that the root chunk name is also the file name you want to use for the tangled code. So, running our tangler program with a root option of -Rblah creates a file named blah and writes the tangled code to it. The second difference is that, if no -R option is given, our program will automatically find all root chunks and print them to their respective files based on their chunk names. A root chunk is any chunk not used inside another chunk definition.

We call our tangler pqtangle for Perl Quick Tangler and write it in a file named pqtangle.nw. Our initial program outline (i.e., our root chunk) looks like
— «pqtangle»= /usr/bin/perl -w use strict; >\s*$/
,:[;
;
}
push
@
{
$chunks { $1 }
}
,
"
$begin_of f set $f lie $line_no " :
:
(a
^
'-:v
When we finished reading through the chunk, we used the autovivification syntax to push a string containing the byte offset, file name, and line number information of the chunk we just parsed. In other words, the %chunks hash contains keys that are chunk names and whose values hold an array of offset/line information strings for every location where that chunk's definition is continued throughout the file. The $1 variable here is the one matched in the outer if statement. The one matched inside the inner while loop was localized to that block which is now out of scope.

At this point, we know the byte offset location where every chunk definition starts in the file and the line number of the first code line in that chunk definition.
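The bookkeeping described above can be sketched in isolation. This is a minimal illustration rather than the tangler's own code: the chunk names, the file name demo.nw, the helper name record_chunk, and the colon-separated layout of the location string are all assumptions made for the example.

```perl
use strict;
use warnings;

# Hypothetical record of where each chunk definition was seen.
my %chunks;

# Record one location for a chunk; the "offset:file:line" layout is assumed.
sub record_chunk {
    my ($name, $offset, $file, $line_no) = @_;
    # Autovivification: $chunks{$name} springs into existence as an
    # array reference the first time we push onto it.
    push @{ $chunks{$name} }, "$offset:$file:$line_no";
}

record_chunk('main loop', 120, 'demo.nw', 14);
record_chunk('main loop', 560, 'demo.nw', 41);   # definition continued later
record_chunk('usage',     310, 'demo.nw', 27);

foreach my $name (sort keys %chunks) {
    print "$name => @{ $chunks{$name} }\n";
}
```

A chunk whose definition is continued in several places simply accumulates several location strings under the same key.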
We also have a record of every chunk used in another chunk and, hence, every chunk that cannot be a root chunk. We now populate our @roots array with the root chunks we need to root
chunk name
the root chunks. since the keys are
@roots array =
($Root)
©roots
{
=
($Root)
;
}
:
;
;
else { foreach my $key (keys %chunks) { push @roots, $key if not $used{$key} }
}
@
Printing out the root chunks for each root
do
is
simply a matter of opening a
chunk name and printing out
>= my $shebang_special = 0; $shebang_special = 1 if $line =~ m/^#!/; @
At this point, we need to create a formatted line directive, substituting the correct information for the placeholders we used in the $Line_dir variable. We format this line directive in a separate function. Then we need to print out this line directive, but only if the current line is not a shebang line or an embedded chunk reference. (In those cases, we would want to print a line directive when processing that chunk.) We use a simple set of logical ORs that terminates at the first true expression, and thus only prints out the line directive when needed:
= my $line_dir; if
($Line_dir) { $line_dir = make_line_dir $line_nijmber $f ilename) $line =~ m/"\s*>\s*$/ || $shebang_special print PROGRAM $line_dir "
;
,
(
.
|
|
"
,',
;
,
,
'
}
• .
Since we have just used the make_line_dir() function, we should go ahead and define it here. This example also illustrates the point about being able to write our literate source in the order that makes sense for discussion. First, let's add it to the subroutine definitions chunk:
=
Now the function to format our line directive is a simple set of substitution operations. The function is passed parameters for the current line number and the file name, which we immediately assign to lexical variables. We then declare a new lexical $line_dir variable and assign it the value of our file-scoped $Line_dir variable, which holds the line directive string with placeholders. Finally, we simply replace the placeholders with their proper values and return the value of the $line_dir variable:
<<sub make_line_dir>>=
sub make_line_dir {
    my ($line_no, $file) = @_;
    my $line_dir = $Line_dir;        # template string with placeholders
    $line_dir =~ s/\%L/$line_no/;    # %L becomes the line number
    $line_dir =~ s/\%F/$file/;       # %F becomes the file name
    $line_dir =~ s/\%\%/%/;          # %% becomes a literal percent sign
    $line_dir =~ s/\%N/\n/;          # %N becomes a newline
    return $line_dir;
}
@
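To see the substitutions at work outside the tangler, here is a small standalone sketch. The template string assigned to $Line_dir is an assumption chosen for illustration (a C-style #line directive); the text does not show which template the program actually uses by default.

```perl
use strict;
use warnings;

# Assumed template: %L is the line number, %F the file name, %N a newline.
my $Line_dir = '#line %L "%F"%N';

sub make_line_dir {
    my ($line_no, $file) = @_;
    my $line_dir = $Line_dir;
    $line_dir =~ s/\%L/$line_no/;
    $line_dir =~ s/\%F/$file/;
    $line_dir =~ s/\%\%/%/;
    $line_dir =~ s/\%N/\n/;
    return $line_dir;
}

# Format a directive for line 42 of a literate source file.
print make_line_dir(42, 'pqtangle.nw');
```

Because the template is copied into a lexical before any substitution, the file-scoped $Line_dir is left untouched and can be reused for every directive.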
In order to tangle out our chunk, we use a loop similar to that used when parsing the chunks in the first place. That is, we continually loop as long as the current line does not match a chunk terminating line. We must make sure we read in another line in both blocks of the if/else statement or we would be looping forever on the same line. Inside the loop, the current line might be an embedded chunk reference, in which case we need to tangle out that embedded chunk. Note, we are capturing the leading whitespace if there is an embedded chunk reference, as well as the chunk name. This way we can call the print_chunk() routine and pass it a string representing the current indentation level so our tangled code has the appropriate indentation. If the line does not contain a chunk reference, we will print the code line (and print out a line directive following it if it was a shebang line):
«tangle out current chunk>>= while ($line if
}
~ m/ " \@\s*$ \@\s\%def / { ($line =~ m/ \s* ?).
is
\u$l/g
/
is
—and
replace the matched text (which is captured) with a space, followed by the letter in uppercase. Let's assume the document is in a file called article.html. We can simply rename this file as article.html.bak and use the following to create a new version of article.html:
    #!/usr/bin/perl -w
    use strict;
    open(INPUT, 'article.html.bak')  || die "can't open file: $!";
    open(OUTPUT, '>article.html')    || die "can't open file: $!";
    while (<INPUT>) {
        if (m/<[Hh][1-3]>/) {
            s/ ([a-z])/ \u$1/g;
        }
        print OUTPUT $_;
    }
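The same substitution can be tried on individual lines without touching any files. The following is only a sketch: the function name cap_heading_line and the sample lines are made up for the example, while the two regular expressions are the ones used in the program above.

```perl
use strict;
use warnings;

# Capitalize each lowercase letter that follows a space, but only on
# lines that contain an opening H1, H2, or H3 tag.
sub cap_heading_line {
    my ($line) = @_;
    if ($line =~ m/<[Hh][1-3]>/) {
        $line =~ s/ ([a-z])/ \u$1/g;
    }
    return $line;
}

print cap_heading_line("<h2>A tale of two cities</h2>\n");
print cap_heading_line("<p>left as it was</p>\n");
```

Note that a word at the very start of the heading is not touched, because the pattern requires a preceding space.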
CHAPTER
10
REGULAR EXPRESSIONS
A more general version of this program would simply read a file given on the command line and print the resulting output on standard output so that it could be redirected to another file:
    #!/usr/bin/perl -w
    use strict;
    while (<>) {
        if (m/<[Hh][1-3]>/) {
            s/ ([a-z])/ \u$1/g;
        }
        print;
    }
If you named this program cap_heads, you could run it from the command line like this:

    perl cap_heads article.html.bak > article.html

This way you can use it to modify other such documents without having to edit the program to replace the filenames.

10.2.2 Character class shortcuts

There are three special escape sequences that stand for commonly used character classes. Each of these has a variant to stand for the corresponding negated character class. We saw each of these and some examples of their use in chapter 6. We repeat them here for review:

Table 10.1 Escape sequences for commonly used character classes

    Escape sequence   Description
    \w                equivalent to: [a-zA-Z0-9_], a word character:
                      any letter or digit or an underscore character
    \W                equivalent to: [^a-zA-Z0-9_], a non-word character:
                      any character that is not a letter, digit, or underscore
    \d                equivalent to: [0-9], any single digit
    \D                equivalent to: [^0-9], any non-digit character
    \s                equivalent to: [ \n\f\r\t], a whitespace character:
                      a space, newline, formfeed, return, or tab character
    \S                equivalent to: [^ \n\f\r\t], a non-whitespace character

10.3 Greedy quantifiers: take what you can get

Another greedy quantifier that operates similarly to the star is the plus (+) quantifier. This one matches one-or-more of the previous components. You can think of
m/f(o+)bar/ as being the same as m/f(oo*)bar/, in that matching one-or-more is the same as matching one thing, then zero-or-more of the same thing. The pattern m/fo*bar/ would match against the string fbar, matching an f and then zero o characters followed by bar. The pattern m/fo+bar/ would not match against fbar because there isn't at least one o following the f in that string.

The star is an indeterminate quantifier that can match any number of characters. The plus is only slightly determinate in that it must match at least one thing, but could match any number of additional characters. Perl also offers a few other greedy quantifiers with varying degrees of indeterminacy. These are listed in table 10.2 along with their meaning and an example with an equivalent formulation using constructs we already know:
Table 10.2 Greedy quantifiers

    Quantifier   Description                  Example
    ?            match zero-or-one time       m/fo?bar/
                                              equivalent to m/f(o|)bar/
    {n}          match exactly n times        m/fo{2}bar/
                                              equivalent to m/foobar/
    {min,}       match min-or-more times      m/fo{2,}bar/
                                              equivalent to m/foo+bar/
    {min,max}    match at least min times,    m/fo{2,4}bar/
                 but at most max times        equivalent to m/f(oooo|ooo|oo)bar/

In the first example above, the equivalent m/f(o|)bar/ might seem strange. All the grouped alternation means is match an o or match nothing. Also, if the ordering of the alternatives in the last example surprised you, remember that these are all greedy quantifiers and will first try to match as much as possible (or as much as they are allowed) before trying lesser amounts.
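One way to convince yourself of the equivalences in the table above is to run both forms against the same strings. The sketch below does that; the helper name agree and the list of test strings are assumptions made for the example.

```perl
use strict;
use warnings;

# Each pair holds a quantified pattern and its longhand equivalent
# from the table of greedy quantifiers.
my @pairs = (
    [ qr/fo?bar/,     qr/f(o|)bar/          ],
    [ qr/fo{2}bar/,   qr/foobar/            ],
    [ qr/fo{2,}bar/,  qr/foo+bar/           ],
    [ qr/fo{2,4}bar/, qr/f(oooo|ooo|oo)bar/ ],
);

# Return true if both patterns agree (match or fail together) on $string.
sub agree {
    my ($short, $long, $string) = @_;
    return ($string =~ $short ? 1 : 0) == ($string =~ $long ? 1 : 0);
}

my @strings = qw(fbar fobar foobar fooobar foooobar fooooobar);
foreach my $string (@strings) {
    my @results = map { agree($_->[0], $_->[1], $string) ? 'same' : 'DIFFER' } @pairs;
    print "$string: @results\n";
}
```

This only checks whether each pair matches or fails together; it says nothing about which form the regex engine handles more efficiently.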
10.4 Non-greedy quantifiers: take what you need

Often, greedy quantifiers are simply too greedy for your intended purpose. Consider trying to match and capture all the text on a line up to the first occurrence of a15:

    $line = "one two three a15 four five six a15 seven\n";
    $line =~ m/(.*)a15/;
    print "$1\n";    # prints: one two three a15 four five six

What happened? The star is greedy and matched all the way to the end of the string and then backtracked until an a15 could match. What we need is something that will match as little as possible and then check the rest of the expression to see if it matches yet. All of the greedy quantifiers have a non-greedy form that is simply the quantifier followed by a question mark. A revised version of our example above using a non-greedy star quantifier would be
    $line = "one two three a15 four five six a15 seven\n";
    $line =~ m/(.*?)a15/;
    print "$1\n";    # prints: one two three

The (.*?) tries to match zero characters followed by an a15, then one character followed by an a15, and so on until it finally matches fourteen characters (one two three, including the trailing space) and succeeds in finding a following a15. The other non-greedy versions of the quantifiers are given in figure 10.1 and operate in a similar manner.
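The two behaviors can be compared side by side. In this sketch the variable names $greedy and $lazy are made up for the example, and the sample line assumes the target text is a15.

```perl
use strict;
use warnings;

my $line = "one two three a15 four five six a15 seven\n";

# Greedy: .* runs to the end of the line, then backs off to the LAST a15.
my ($greedy) = $line =~ m/(.*)a15/;

# Non-greedy: .*? grows one character at a time and stops at the FIRST a15.
my ($lazy) = $line =~ m/(.*?)a15/;

print "greedy: [$greedy]\n";
print "lazy:   [$lazy]\n";
print "lazy length: ", length($lazy), "\n";
```

The brackets in the output make the trailing space visible, which is why the non-greedy capture is fourteen characters long rather than thirteen.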
10.5 Simple anchors

An anchor is a form of zero-width assertion. This means it matches not a character, but a position with certain properties. You have already seen two such elements in chapter 6: the ^ (caret) can be used to match at the beginning of a string, and the $ can be used to match the end of a string. Another anchor type regex element that you've already seen is the word-boundary element \b. This matches at a position between a word character (\w) and a non-word character (\W) or between the beginning or end of a string and a \w character. To get an idea of what it means to match a position rather than a character, let's consider another simple example depicted graphically.
In figure 10.5 we step through running the pattern m/\bfoo\b/ against the strings foo bar and foodbar. At step 1, \b matches at the start of the string (between the start of the string and the f character) in both cases. Because this is a zero-width assertion, the pointer remains pointing at the same position in the target string. In order to show the regex components in the place where they match on the string, we advance the pointer to the next regex component but drop the first component down below the regex. This is simply my way of showing that the \b and the f regex element both are successful at the first position in the string.

The pointer then advances along in the usual manner in each string until we hit the final \b. In the first case, there is a match because the position between the o and the space is again a word-boundary position. Thus the regex succeeds at step 6. In the second string, the position lies between an o and a d, which are both word characters, so the regex fails at this point. (Note: although not shown in the figure, the pointer would return to the beginning and the whole regex would repeatedly attempt to match the first component against every position in the string.)

Figure 10.5 Stepping through a pattern match with anchors

The word-boundary anchor has a complement, the \B anchor, which matches at a position in a string between two word characters or two non-word characters.

When the /m modifier (we will discuss modifiers in the next chapter) is used on a match or substitution operator it means that the ^ and $ anchors can match at the start and end of lines within a multi-line string. The \A and \z anchors match only the beginning and end of a string respectively, regardless of whether it is a multi-line string or not.

The final simple anchor is the \G anchor, which works in conjunction with the /g modifier and anchors the match to the last position matched in a repeated match operation.
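The anchors described in this section can be exercised with a few short matches. This is a sketch only; the sample strings and variable names are invented for the example.

```perl
use strict;
use warnings;

# \b needs a word character on one side and a non-word character
# (or the edge of the string) on the other.
foreach my $string ('foo bar', 'foodbar') {
    my $result = $string =~ m/\bfoo\b/ ? 'matches' : 'fails';
    print "'$string' $result\n";
}

# With /m, ^ and $ match at line boundaries inside the string;
# \A and \z still mean the very start and very end.
my $text = "first line\nsecond line\n";
my @line_starts   = $text =~ m/^(\w+)/mg;    # first word of every line
my @string_starts = $text =~ m/\A(\w+)/mg;   # first word of the string only

print "line starts: @line_starts\n";
print "string starts: @string_starts\n";

# \G anchors each /g match where the previous one ended, so scanning
# stops at the first character that does not fit.
my $digits  = '123abc456';
my @leading = $digits =~ m/\G(\d)/g;
print "leading digits: @leading\n";
```

Without the \G, the last match would skip over abc and also collect 4, 5, and 6.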
10.6 Grouping, capturing, and backreferences

Until now we have only used plain parentheses for grouping subexpressions. A disadvantage of this technique is that anything matched by the parenthesized subexpression is captured and assigned to a special variable based on the position of the parentheses in the overall expression. This takes extra time and memory for the regex machinery, and, often, you are only interested in grouping a subpattern, not capturing its matching text. The form of parenthesization that will only group an expression is the (?:subexpression) form. For example, if you are not interested in capturing the matched text, the earlier example using m/f(u|oo)bar/ would be better written as m/f(?:u|oo)bar/.
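The difference shows up in what ends up in the first capture variable. In this sketch the helper name first_capture is an assumption, and the second pattern adds a capturing group around bar purely to show which group gets number one.

```perl
use strict;
use warnings;

# Return whatever landed in $1 for a successful match, or undef.
sub first_capture {
    my ($string, $regex) = @_;
    return $string =~ $regex ? $1 : undef;
}

# Capturing group: the alternation's text is stored in $1.
print first_capture('fubar', qr/f(u|oo)bar/), "\n";

# Non-capturing group: the alternation is grouped but not stored, so the
# first capturing group is now the one around "bar".
print first_capture('fubar', qr/f(?:u|oo)(bar)/), "\n";
```

Using (?:...) for groups you only need for alternation or quantifying keeps the capture numbering tied to the parts you actually want to extract.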
When capturing parentheses are used, the captured text is stored in a set of special variables ($1, $2, $3 ...) where the digits reflect the order of occurrence of the subexpressions themselves counting from left to right. If the pattern m/(foo(bar)baz)(xyz)/ did match against a target string, then $1 would contain the string foobarbaz, $2 would contain the string bar, and $3 would contain the string xyz. These variables may be used within the replacement part of a substitution or in statements following a successful pattern match. These special variables are global, but automatically localized within their immediate enclosing block. They will retain their values until either another pattern match successfully matches or the current scope or block is exited. This is useful for extracting particular bits of data from within strings of text. (We saw examples of this sort of thing in chapter 6.)

Using capturing parentheses in a pattern also makes the captured text available later within the same pattern using the \1, \2, \3 ... escape sequences. These are called backreferences because they refer back to previously matched text. A simplistic approach to searching a file for double words would be to use a pattern such as m/\b(\w+)\s+\1\b/i. This would match something resembling a word followed by one-or-more whitespace characters, followed by whatever word was matched by the capturing parentheses. The /i modifier tells the match operator to ignore case so we can catch doubled words that might differ in case. If used on multi-line strings (such as when we set the special $/ variable to an empty string to read a file by paragraphs instead of line by line) we can still catch double words even if one is at the end of one line and the other is at the beginning of the next line. This is possible because we used the \s sequence instead of just a space. Remember that the \s sequence represents the character class [ \n\f\r\t] so the two words may be separated by one or more of any of those characters. Also note that we placed a word boundary following the backreference. This ensures that the backreferenced text here is not simply the beginning of a larger word such as in the perfectly logical string This thistle is bristly.
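Both ideas, capture numbering and backreferences, can be checked with a few lines of code. This is a sketch under stated assumptions: the function name has_doubled_word and the sample strings are invented here, while the two patterns are the ones discussed above.

```perl
use strict;
use warnings;

# Capture groups are numbered by the position of their opening parenthesis.
my $target = 'foobarbazxyz';
if (my @captures = $target =~ m/(foo(bar)baz)(xyz)/) {
    print "\$1=$captures[0] \$2=$captures[1] \$3=$captures[2]\n";
}

# Return 1 if $text contains a doubled word, ignoring case and allowing
# any whitespace (including a newline) between the two copies.
sub has_doubled_word {
    my ($text) = @_;
    return $text =~ m/\b(\w+)\s+\1\b/i ? 1 : 0;
}

print has_doubled_word("Paris in the\nThe spring"), "\n";
print has_doubled_word("This thistle is bristly"), "\n";
```

The second call fails to find a doubled word because of the trailing \b: the backreference does match the start of thistle, but no word boundary follows it.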
A program that makes use of this pattern to locate paragraphs containing doubled words and to highlight them somehow can be as simple as

    #!/usr/bin/perl -w
    use strict;
    $/ = "";    # read files in paragraph mode
    while (<>) {
        print if s/\b(\w+)(\s+)(\1)\b/*$1*$2*$3*/gi;
    }
We captured the first occurrence of the word, the intervening whitespace, and the second occurrence of the word into three separate variables so we could replace the text with a few asterisks inserted to make the doubled words stand out. We also used the /g so that we could highlight multiple occurrences of doubled words within a paragraph. Jeffrey Friedl gives a more involved version of this program in his book [1] that allows an intervening tag, such as an HTML tag, between the doubled words, uses ANSI escapes to highlight the text, and prints out only highlighted lines rather than the whole paragraph. You are encouraged to add similar improvements to the above version of the program.
[1] Friedl, Jeffrey. Mastering Regular Expressions. Sebastopol, CA: O'Reilly and Associates, 1997.

10.6.1 Prime number regex

Back in chapter 5, we developed a program to list all the prime numbers from 2 to N (where N was a number entered by the user of the program). That program was straightforward and relatively efficient. Here we will show another program to list prime numbers, one that is neither straightforward nor efficient, but nonetheless a marvelous example of something (of what we're just not sure).

If you visit http://www.dejanews.com and search the comp.lang.perl.misc archives for the terms "Abigail" and "prime," you'll eventually find a rather surprising usage of regular expressions to determine if a given number is prime. Abigail is a frequent poster to the comp.lang.perl.misc newsgroup and this example originated (as far as I know) as a clever little one-line program in one of Abigail's sign-off signatures. Since then, it has been commented on within the group on a few occasions and with a few variations, so you are sure to find quite a few articles when you search the archives. The following is an extended version that lists all the primes from 2 to N (where N is an argument to the program):

    #!/usr/bin/perl -w
    use strict;
    my $N = shift @ARGV;    # get the number
(my $number = 2; $number my ($dir, $file) = / # default initial path $dir 11= return ($dir, $file) (
'
.
'
)
;
}
This version is much simpler overall. A quick benchmark shows that the first version is a little faster (around 18%) than the regex version. So unless you were going to be doing a large number of such operations, the simplicity of the regex version probably outweighs any efficiency concerns your program might have.
The substr() function is also commonly used to pick apart fixed column data files: files where each field of data starts in the same column position and is the same width for every line (or record) in the file. (The unpack() function might be a better choice for this type of task though.) In the following example, we have some fixed column data with a few missing fields. The data consists of a four-character group designation followed by measurement data with field widths of 2, 3, 3, 2, 3, 2, 2, 2, 2, and 3. We have already identified fields where missing data occurs. Now we simply want to extract only the columns of complete data, which we will print as comma-separated data for use in another program:

    #!/usr/bin/perl -w
    use strict;
    while (<DATA>) {
        chomp;
        my @fields = (substr($_,  0, 4), substr($_,  4, 2),
                      substr($_,  9, 3), substr($_, 14, 3),
                      substr($_, 17, 2), substr($_, 21, 2),
                      substr($_, 25, 3));
        print join(',', @fields);
        print "\n";
    }
DATA 120B2212 110622 116953 13 632101 1021911793 3929090 220b26 220b29125111 118952934 096 220bl81231182811596233 63 0093 140D2611810821112882831 092 140D23 1062011291293833096
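Since the text mentions unpack() as a possible alternative, here is a minimal sketch of the same extraction done that way. The template string is derived from the offsets and widths used in the substr() calls above; the function name complete_fields and the sample record are made up for the example (the record is not taken from the data file).

```perl
use strict;
use warnings;

# Template for the layout described above: a 4-character group, then the
# complete columns at offsets 4, 9, 14, 17, 21, and 25. "A" reads
# characters (dropping trailing spaces) and "x" skips characters.
my $template = 'A4 A2 x3 A3 x2 A3 A2 x2 A2 x2 A3';

sub complete_fields {
    my ($record) = @_;
    return unpack $template, $record;
}

# A made-up 28-character record, built field by field so the widths
# (4, then 2, 3, 3, 2, 3, 2, 2, 2, 2, 3) are easy to check.
my $record = join '', qw(120B 22 111 222 33 444 55 66 77 88 999);
print join(',', complete_fields($record)), "\n";
```

One design difference worth knowing: the A format strips trailing spaces from each field, whereas substr() returns the characters exactly as they appear in the record.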
CHAPTER 11 WORKING WITH TEXT

An interesting thing happens when you take a reference to the substr() function: you do not get a reference to the literal substring itself, but to the given region of the string:

    my $string = 'foobar';
    my $ref = \substr($string, 1, 4);
    print "$$ref\n";       # prints: ooba
    $string = 'scoolly';
    print "$$ref\n";       # prints: cool

Here, the reference in $ref is not to the particular substring ooba, but to the four-character slice of $string, starting at position 1 (the second character). So, after we assign a new string to $string, our reference now refers to the four-character slice of this new string.

11.4 Translating characters
Another situation that often arises and seems like a good choice for a substitution operation is translating characters. Assume you want to change all spaces in a string into underscores. One obvious method is to use the substitution operator with the /g modifier:

    $_ = 'this is a string';
    s/ /_/g;
    print;    # prints: this_is_a_string
The translation operator (tr//) accomplishes the same task:

    $_ = 'this is a string';
    tr/ /_/;
    print;    # prints: this_is_a_string

The first thing to realize about the tr// operator is that it does not treat the first part as a regular expression: it treats both portions as lists of characters. The first part is a list of characters for which to search, and the second part is a corresponding list of replacement characters: tr/SEARCHLIST/REPLACEMENTLIST/. By default, it operates on the $_ variable, but it can be bound to any variable using the binding operator:
    tr/abc/cab/;    # replace a with c, b with a, and c with b
    tr/a-z/A-Z/;    # replace lowercase letters with corresponding
                    # uppercase letters
    $string =~ tr/A-Z/ZYXWVUTSRQPONMLKJIHGFEDCBA/;
                    # replace uppercase letters in $string with their
                    # counterparts in a reversed alphabet (a descending
                    # range such as Z-A is not a valid tr// range, so
                    # the replacement list is spelled out)

You can use this as an easy method of doing ROT13 encoding, a simple encoding scheme where the alphabet is divided into two halves and swapped so that a maps to n and b maps to o and vice versa. You can call the following rot13() function to encrypt a string, then call it again on the encrypted string to get the original text back:
    sub rot13 {
        my $string = shift;
        $string =~ tr/a-zA-Z/n-za-mN-ZA-M/;
        return $string;
    }
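A quick round trip shows that applying the function twice restores the original text. The sample string and the variable names $plain and $cipher are made up for this sketch.

```perl
use strict;
use warnings;

sub rot13 {
    my $string = shift;
    # Each half of the alphabet maps onto the other half, for both cases.
    $string =~ tr/a-zA-Z/n-za-mN-ZA-M/;
    return $string;
}

my $plain  = 'Hello, World!';
my $cipher = rot13($plain);

print "$cipher\n";            # the encoded text
print rot13($cipher), "\n";   # encoding again gives back the original
```

Characters that are not in the search list, such as the comma, space, and exclamation mark here, pass through unchanged.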
The tr// function returns the number of characters in the search list that it found. When the replacement list is empty and no modifiers are used, the search list is replicated as the replacement list. This allows you to count characters in a string:

    $count = tr/a//;             # $count gets number of 'a' characters in $_
    $count = tr/aeiouAEIOU//;    # $count gets number of vowels in $_

When the replacement list is shorter than the searchlist, then the last character of the replacement list is repeated to equalize the two lists:

    tr/a-z/ABC/;    # a to A, b to B, c to C and all other lowercase
                    # letters translate to C as well
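Both behaviors are easy to confirm on a variable other than $_. In this sketch the function name count_vowels and the sample strings are assumptions made for the example.

```perl
use strict;
use warnings;

# Count vowels in a string. With an empty replacement list and no
# modifiers, tr// leaves the text as it was and just returns the count.
sub count_vowels {
    my ($text) = @_;
    return ($text =~ tr/aeiouAEIOU//);
}

my $text = 'Perl counts characters quickly';
print count_vowels($text), "\n";

# A short replacement list is padded with its last character, so every
# letter from c onward becomes C.
(my $copy = 'abcxyz') =~ tr/a-z/ABC/;
print "$copy\n";
```

The parenthesized assignment in the last example copies the string first and then applies tr// to the copy, which is a common way to translate without modifying the original.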
Three modifiers may be used with the tr// operator, in the same way that modifiers are used with the match and substitution operators: /c, /d, and /s (which stand for complement, delete, and squash, respectively).

The /c modifier means that the searchlist is taken as a complement; in other words, the list of all characters not in the given searchlist. The /d modifier means that, instead of replicating the last character in the replacement list until it is the same size as the searchlist, all the matching characters that do not have a counterpart in the replacement list are deleted from the target string. The /s modifier means to squash consecutive matching characters with one copy of the replacement character.
    tr/aeiou/*/c;    # replace all non-vowels with an asterisk
    tr/aeiou/x/d;    # replace a with x and delete all other vowels
    tr/aeiou//cd;    # delete all non-vowels
    tr/ //s;         # replace consecutive spaces with a single space

I would like to stress again that the tr// operator does not use regular expressions. The expression tr/a*b/xyz/ does not replace zero-or-more a characters in a row followed by a b with xyz. It replaces each a with an x, each asterisk with a y, and each b with z.

11.5 Exercises

1 Write a function that returns a list of all doubled words (two words repeated such as: "The the way to...") in a string. A doubled word may have different case and be separated by any amount of whitespace including an embedded newline.
2 Write a regex to substitute every occurrence of the word apple with orange only if it is followed by a space and the word peel. Do not change the apple in pineapple.

3 Write a function that prints out a summary of the frequencies of each vowel in a string. For example, if passed the string This is the winter of our discontent, it would print:

    a 0
    e 3
    i 4
    o 3
    u 1
CHAPTER 12

Working with lists

12.1 Processing a list
12.2 Filtering a list
12.3 Sorting lists
12.4 Chaining functions
12.5 Reverse revisited
12.6 Exercises
Lists are an important and powerful feature of the Perl language, so it should come as no surprise that, just as with strings, Perl has a few list manipulation tools up its sleeve.

In earlier chapters, we've seen the essential built-in functions for working with arrays and hashes (push(), pop(), shift(), unshift(), keys(), exists(), to name a few) as well as a few functions and operators for making or joining lists: the range operator, and the split() and join() functions. In this chapter, we examine a few more built-in functions designed explicitly for processing list data: the map(), grep(), and sort() functions. We will also consider more uses of the reverse() function.

12.1 Processing a list

The standard way to process a list of data, or to perform some action on each element of a list, is to iterate through the list in a foreach loop:

    foreach my $item (@list) {
        # do something with $item
    }

This should probably be your first choice when you need to process a list of data, but it does have certain limitations. Consider a simple case where you want to create a new array that contains the value of each element of an existing array multiplied by a factor of two:
    my @list = (1, 2, 3);
    my @new_list;
    foreach my $item (@list) {
        push @new_list, $item * 2;
    }
    print "@list\n";        # prints: 1 2 3
    print "@new_list\n";    # prints: 2 4 6

The map() function allows us to write the preceding code more directly as a list assignment from one list to another:
    my @list = (1, 2, 3);
    my @new_list = map $_ * 2, @list;
    print "@list\n";        # prints: 1 2 3
    print "@new_list\n";    # prints: 2 4 6

The map() function has two basic forms:

    map BLOCK LIST
    map EXPR, LIST
The block or expression is evaluated once for each element in the list argument, and the return of the function is the list of each such evaluation's results. Each time the block or expression is evaluated, the $_ variable is set to the current value in the list. (Actually, $_ is an alias to the element in the list, but changing it is not recommended under most circumstances.)

When a block (a series of statements enclosed in curly braces) is used as the first argument, the return value of each evaluation is the value of the last statement in the block. Also, you do not use a comma between the block argument and the following list:

    @new_list = map {$_ * 2} @list;

The map() function may also be used to transform a list in place. Consider how you might do this with a foreach loop (remembering that the loop variable is an alias to the current element in the list):

    my @list = (1, 2, 3);
    foreach my $item (@list) {
        $item *= 2;    # same as: $item = $item * 2;
    }
    print "@list\n";   # prints: 2 4 6
It is not wise to use this "aliasing" feature too much because it can tend to hide the real object of the code in the indirection. The map() function allows a more direct assignment that is unmistakably changing the contents of the array:

    my @list = (1, 2, 3);
    @list = map {$_ * 2} @list;
    print "@list\n";    # prints: 2 4 6

In this version, it is perfectly clear that @list is being assigned a new list value that is a modification of its previous list value.

The map() function isn't limited to returning just a single element for each element it receives in the list; it can return a scalar or a list value. Consider initializing a hash from an array where each array element is taken to be a key and given a value of 1:

    my @key_list = qw/one two three/;
    my %hash = map { $_ => 1 } @key_list;

This also has the effect of filtering out any duplicate elements in the key list array because there can be only one of each key in a hash.
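The point about map() returning more than one value per element can be sketched with a slightly different hash. The word list and the hash name %length_of are made up for the example.

```perl
use strict;
use warnings;

my @words = qw/one two three two one/;

# Each evaluation of the block returns a two-element list (word, length),
# so map hands back twice as many elements as it received. Assigning
# that flat list to a hash pairs them up and collapses duplicate keys.
my %length_of = map { ($_, length $_) } @words;

foreach my $word (sort keys %length_of) {
    print "$word has $length_of{$word} characters\n";
}
```

Five words go in, ten values come out of map(), and three key/value pairs survive in the hash because two of the words are repeated.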
12.2 Filtering a list

Often we don't want to process a list; we want to filter it, that is, get all the elements that meet some criteria. Another function, similar in syntax to map(), is the grep() function:

    grep BLOCK LIST
    grep EXPR, LIST

Unlike the map() function, which returns a value (or a list of values) for every element in the list, this function returns the list of elements for which the block or expression evaluates to true. (Again, the $_ variable is assigned to each value in the list in turn.) This function is commonly used with a regular expression as its first argument:

    my @list = qw/one two three four/;
    my @new_list = grep m/^t/, @list;
    print "@new_list\n";    # prints: two three

Think of grep() as a filter that allows through only things that pass a test. In the case above, only those elements for which the regex m/^t/ is true (those elements that start with a t) are passed through to the new list.

grep() is by no means limited to using a regex as a first argument. Here's one of the standard FAQ answers for extracting the unique elements in an array:

    my @array = qw/one two two three four three two/;
    my %seen;
    my @unique = grep {! $seen{$_}++} @array;
    print "@unique\n";

Here we use the block form and create a hash to count occurrences of each element. The grep() filter here allows only those elements that we have not seen already.
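Two further uses of the block form are sketched below: filtering on a numeric test, and using grep() in scalar context to count matches. The list of numbers and the variable names are invented for the example.

```perl
use strict;
use warnings;

my @numbers = (3, 8, 15, 4, 23, 42);

# Block form: keep the elements for which the block is true.
my @even = grep { $_ % 2 == 0 } @numbers;

# In scalar context grep returns how many elements passed the test.
my $big_count = grep { $_ > 10 } @numbers;

print "even: @even\n";
print "bigger than 10: $big_count\n";
```

The filtered list keeps the elements in their original order, which is also why the unique-elements idiom above preserves first occurrences.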
12.3 Sorting lists

Perl's built-in sort() function is versatile, allowing you to supply a function (or block) that performs the comparisons or simply to use the default comparison routine, which uses stringwise comparisons. The basic form of the function is

    sort SUBNAME LIST
    sort BLOCK LIST
    sort LIST
To sort an array of strings, you can simply use the default sorting routine:

    @list = qw/one two three four five/;
    @list = sort @list;
    print "@list\n";    # prints: five four one three two
When you want to provide a different sorting method you need to realize how sorting is accomplished. Any sorting method must compare two elements at a time to each other and determine if the first element is larger than, smaller than, or equal to the second element. While sorting the list, the sort() function takes care of figuring out which pairs of elements need to be compared at any given time. When you supply a comparison method, the sort() function places these pairs of elements into the localized variables $a and $b. Your comparison function must return -1 if the first element is smaller, 0 if they are equal, and 1 if the first element is larger (remember the cmp and <=> operators). For example, to duplicate the default sort method using stringwise comparison, you could use
©list = qw/one two three four five/; ©list = sort { $a cmp $b } ©list; print "©list\n";
And,
to
do
it
as a subroutine,
you could use
©list = qw/one two three four five/; ©list = sort stringwise ©list; print "©list\n"; sub stringwise $a cmp $b;
" ' '
•
"
{
}
To sort a list numerically, you would use

    @list = (3, 4, 2, 9, 1);
    @list = sort { $a <=> $b } @list;
    print "@list\n";   # prints: 1 2 3 4 9

To reverse the sense of the sort order, you simply swap the two special sort variables:

    @list = (3, 4, 2, 9, 1);
    @list = sort { $b <=> $a } @list;
    print "@list\n";   # prints: 9 4 3 2 1
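The difference between the default stringwise comparison and a numeric one is easiest to see with numbers of different lengths. The following is a minimal sketch; the sample values and variable names are made up for illustration.

```perl
use strict;
use warnings;

my @nums = (10, 9, 100, 1);

# The default sort compares elements as strings, so "100" sorts before "9".
my @as_strings = sort @nums;

# A numeric comparison block gives the order we usually expect.
my @ascending  = sort { $a <=> $b } @nums;
my @descending = sort { $b <=> $a } @nums;

print "@as_strings\n";   # prints: 1 10 100 9
print "@ascending\n";    # prints: 1 9 10 100
print "@descending\n";   # prints: 100 10 9 1
```

Forgetting the numeric comparison is an easy mistake to make, because the default order looks correct whenever all the numbers have the same number of digits.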
As you know, the keys in a hash are not stored in any particular order, but you often want to retrieve them in some sorted fashion:

    my %family = (Andrew => 35, Sue => 39, Joseph => 14, Thomas => 7);
    foreach my $key (sort keys %family) {
        print "$key is $family{$key}\n";
    }
    __END__
    this prints:
    Andrew is 35
    Joseph is 14
    Sue is 39
    Thomas is 7

To retrieve the keys ordered by their values instead (here, oldest first), compare the hash values inside the sort block:

    my %family = (Andrew => 35, Sue => 39, Joseph => 14, Thomas => 7);
    foreach my $key (sort { $family{$b} <=> $family{$a} } keys %family) {
        print "$key is $family{$key}\n";
    }
    __END__
    this prints:
    Sue is 39
    Andrew is 35
    Joseph is 14
    Thomas is 7
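The same two ideas (sorting the keys themselves, or sorting the keys by looking up their values) can be captured in arrays rather than printed directly. This is a small sketch using the family hash from the text; the array names are assumptions chosen for the example.

```perl
use strict;
use warnings;

my %family = (Andrew => 35, Sue => 39, Joseph => 14, Thomas => 7);

# Keys in reverse alphabetical order.
my @names_desc = reverse sort keys %family;

# Keys ordered by their values, youngest first.
my @youngest_first = sort { $family{$a} <=> $family{$b} } keys %family;

print "@names_desc\n";       # prints: Thomas Sue Joseph Andrew
print "@youngest_first\n";   # prints: Thomas Joseph Andrew Sue
```

Note that the sort block never sees the values directly: $a and $b are always keys, and the block decides what to compare by looking them up in the hash.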
Sometimes, there is more than one field in the data that you wish to sort. Consider a colon-separated data file of first names, last names and ages. We want to sort by last name, then by first name (if the last names are equal), and, lastly, by age (if all else is equal):
    #!/usr/bin/perl -w
    use strict;
    my @data = <DATA>;
    my @sorted = sort myway @data;
    print @sorted;

    sub myway {
        (split /:/, $a)[1] cmp (split /:/, $b)[1]
        ||
        (split /:/, $a)[0] cmp (split /:/, $b)[0]
        ||
        (split /:/, $a)[2] <=> (split /:/, $b)[2];
    }
    __DATA__
    Sue:Johnson:39
    Andrew:Johnson:35
    Bill:Jones:37
    Bill:Jones:36
    Mike:Hammer:45
This makes use of a list facility we have not discussed yet: you can subscript (or take a slice of) a list just as you can with an array. The example above also makes use of the || (logical OR) operator. The entire function is a single Perl statement made up of three expressions that are OR'd together. First, we split the data into lists at the colons and compare the second elements of that pair of lists. If they are equal (cmp returns zero), the expression is false, and we do the next expression to compare the first elements of the pair of lists (the first names). Again, if they are equal, we go on to numerically compare the third element of the pair of lists (age). The || operator short circuits (see chapter 5) so this subroutine returns the result of the first comparison that does not return zero or returns zero if all comparisons are equal.

This is not a very efficient way to perform such multiple field comparisons. For every pair of elements that need to be compared (which can be a large number of comparisons), we are performing between two and six splits on the data. We can greatly improve this if we split the data once into anonymous arrays, then sort by each element of those arrays:
    #!/usr/bin/perl -w
    use strict;
    my @data = <DATA>;
    my @sorted;
    @sorted = map { [$_, split /:/] } @data;
    @sorted = sort myway @sorted;
    @sorted = map { $_->[0] } @sorted;
    print @sorted;

    sub myway {
        $a->[2] cmp $b->[2]
        ||
        $a->[1] cmp $b->[1]
        ||
        $a->[3] <=> $b->[3]
    }
    __DATA__
    Sue:Johnson:39
    Andrew:Johnson:35
    Bill:Jones:37
    Bill:Jones:36
    Mike:Hammer:45
In this example, we've first used the map() function to create an anonymous array for each line of data. This anonymous array contains the actual line of data as the first element, and the list of fields (the result of splitting the line of data) as the next three fields. The resulting list of anonymous arrays is assigned to the @sorted array. We then sort this array of anonymous arrays in the myway() sub, dereferencing each particular field we want to compare. Finally, we extract just the first element of each anonymous array (the data line itself) in another map() function so that the final @sorted array now contains just the lines of data in the sorted order we wanted.
12.4 Chaining functions

One of the most famous Perl idioms is called the "Schwartzian Transform," named after Randal Schwartz. The Schwartzian Transform implements the sorting idea above in a more compact fashion by chaining the three map, sort, map operations together into a single statement. Before we tackle that particular idiom, let's consider what chaining is in the first place.

Chaining is simply using the results of one operation or function as the input into another operation or function. Try to work through the following simple example:
    #!/usr/bin/perl -w
    use strict;
    my @unique;
    my %seen;
    while (<DATA>) {
        chomp;
        push @unique, grep !$seen{$_}++, split ' ';
    }
    print "@unique\n";
    __DATA__
    this is one line of data
    this is another line of data
    this is the last line of data
You might have guessed that this program creates an array of all the unique words in the given data. But do you see how it works? We have chained together three functions in a single statement. Parentheses may help to make the chained statement more understandable for a first reading:

    push(@unique, grep(!$seen{$_}++, split(' ')));

Looked at this way, the push() function has two arguments: the array to push into and a list of things to push into that array. The second argument is the list resulting from the grep() function. The grep() function also has two arguments: the conditional expression (!$seen{$_}++) and a list. The list argument to the grep() function is the list resulting from the split() function (using the special case of splitting on whitespace).

Most often you will not see chained functions with all those parentheses, so let's look at the original version again:
    push @unique, grep !$seen{$_}++, split ' ';
The way to understand chained functions is to read them from right to left one function at a time. Here we could read this in English as "split the line into a list of words, filter that list through the grep() function, and push the resulting list onto the @unique array."

Now let's consider the Schwartzian Transform and the last sorting method given above:
    #!/usr/bin/perl -w
    use strict;
    my @data = <DATA>;
    my @sorted = map { $_->[0] }
                 sort myway
                 map { [$_, split /:/] } @data;
    print @sorted;

    sub myway {
        $a->[2] cmp $b->[2]
        ||
        $a->[1] cmp $b->[1]
        ||
        $a->[3] <=> $b->[3]
    }
    __DATA__
    Sue:Johnson:39
    Andrew:Johnson:35
    Bill:Jones:37
    Bill:Jones:36
    Mike:Hammer:45
Notice how we have reversed the order of the map() calls because the chained series is evaluated from right to left. First, each element of @data is turned into an anonymous array, then the resulting list of these anonymous arrays is sorted by myway, and the sorted list is passed to the leftmost map() function, which simply returns the first element of each anonymous array. We could have even done the whole sorting chain using a block instead of a named subroutine:

    my @sorted = map { $_->[0] }
                 sort {
                     $a->[2] cmp $b->[2]
                     ||
                     $a->[1] cmp $b->[1]
                     ||
                     $a->[3] <=> $b->[3]
                 }
                 map { [$_, split /:/] } @data;
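The same map, sort, map chain works for any sort key that is expensive to compute, not only for split fields. As a small sketch (the word list and the choice of length as the sort key are assumptions made for this example), here the key is computed once per element and carried alongside the original value:

```perl
use strict;
use warnings;

my @words = qw/pear fig banana kiwi apple/;

# Read the chain from the bottom up.
my @sorted =
    map  { $_->[0] }                                      # 3. unwrap the original word
    sort { $a->[1] <=> $b->[1] || $a->[0] cmp $b->[0] }   # 2. by length, then by name
    map  { [ $_, length($_) ] }                           # 1. pair each word with its length
    @words;

print "@sorted\n";   # prints: fig kiwi pear apple banana
```

For a key as cheap as length() the transform gains little; it pays off when computing the key involves a split, a file test, or some other nontrivial work that would otherwise be repeated on every comparison.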
12.5
Reverse revisited
The reverse() function is context sensitive. While we've only used it thus far to reverse a scalar, you should be aware that it takes a list as its argument. We have only used it like this:

    $string = 'halb';
    $string = reverse $string;
    print $string;   # prints: blah

Here, the function is taking $string as a one element list. It does what we expect because it is called in a scalar context. In scalar context, this function concatenates all of its arguments into a single string and reverses that string. Consider the following:

    @array = ('blah');
    @array = reverse @array;
    print "@array\n";   # prints: blah

In a list context, such as this example, the reverse() function reverses the order of the list, leaving each element unchanged. With a one element list such as this, the resulting list is unchanged. With several elements, the two contexts give visibly different results:

    @array = ('one', 'two', 'three');
    @array = reverse @array;
    print "@array\n";    # prints: three two one
    $string = reverse @array;
    print "$string\n";   # prints: enoowteerht
But what do you suppose the following will produce?

    $string = 'halb';
    print reverse $string;

The print() function takes a list and provides a list context, so the reverse() is performed in a list context; i.e., the single element list ($string) is printed in reverse order. This explains the bug I mentioned previously regarding the commify() subroutine in chapter 10. To get the intended meaning, you need to use the scalar() function to explicitly put the reverse() into scalar context:

    $string = 'halb';
    print scalar reverse $string;
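The two behaviors can be put side by side by capturing the results in variables instead of printing them. This is a minimal sketch; the variable names are chosen only for this example.

```perl
use strict;
use warnings;

my @parts = ('ab', 'cd');

# List context: the order of the elements is reversed.
my @list_result = reverse @parts;               # ('cd', 'ab')

# Scalar context: the elements are joined, then the string is reversed.
my $scalar_result = reverse @parts;             # 'dcba'

# Inside a list, reverse() sees list context unless scalar() forces otherwise.
my @unforced = (reverse 'halb');                # ('halb')
my @forced   = (scalar reverse 'halb');         # ('blah')

print "@list_result | $scalar_result | @unforced | @forced\n";
# prints: cd ab | dcba | halb | blah
```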
The reverse() function can be used to print a file in reverse order (by lines):
print reverse ;
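As a self-contained sketch of reversing a file by lines, the example below reads from an in-memory filehandle so that it needs no file on disk. The names $text and $fh are hypothetical, and opening a filehandle on a scalar reference assumes Perl 5.8 or later.

```perl
use strict;
use warnings;

# An in-memory "file" stands in for a real one.
my $text = "first\nsecond\nthird\n";
open(my $fh, '<', \$text) || die "can't open in-memory file: $!";

# In list context the filehandle returns every line; reverse flips their order.
my @lines = reverse <$fh>;
close($fh);

print @lines;   # prints the lines third, second, first
```

Because the whole file is read into a list before it is reversed, this approach is only suitable for files that fit comfortably in memory.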
It can also be used to invert a hash (so long as the values of the hash are unique) to create a new hash that has the values of the former hash as its keys and the keys of the former as the values:

    %mon2num = (
        Jan => 1, Feb => 2,  Mar => 3,  Apr => 4,
        May => 5, Jun => 6,  Jul => 7,  Aug => 8,
        Sep => 9, Oct => 10, Nov => 11, Dec => 12
    );
    %num2mon = reverse %mon2num;
    print "$mon2num{Mar}, $num2mon{3}\n";   # prints: 3, Mar
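The caveat about unique values matters in practice. In the following sketch (the hash contents are invented for illustration), two keys share a value, so the inverted hash ends up with fewer entries than the original:

```perl
use strict;
use warnings;

my %color_of = (apple => 'red', cherry => 'red', lime => 'green');

# Two keys share the value 'red', so only one of them survives the inversion.
my %fruit_of = reverse %color_of;

printf "%d keys before, %d keys after\n",
    scalar(keys %color_of), scalar(keys %fruit_of);
# prints: 3 keys before, 2 keys after
```

Which of the duplicate keys survives depends on the internal ordering of the hash, so it should not be relied upon.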
12.6 Exercises

1 Write a function that, when given a hash, returns the list of hash values sorted by the hash keys. Write the same thing in one line using a map() function.
2 Modify the one-line version above to also filter out any value less than 25.
CHAPTER 13

More I/O

13.1 Running external commands
13.2 Reading and writing from/to external commands
13.3 Working with directories
13.4 Filetest operators
13.5 faqgrep revisited
13.6 Exercises
In chapter 6, we covered the basics of reading from and writing to a file using the built-in open() function. This is not the only way for your programs to read or write data. You can also capture the output of external programs, run external programs (which might write data to files or elsewhere), and open directories and read their contents. This chapter takes us on a brief foray through these alternate I/O mechanisms.
13.1 Running external commands

Running an external program may not seem like an I/O operation, especially if you are not sending that program data or receiving its output. Nevertheless, you are communicating with the world outside your program, causing programs to be run and, perhaps, data to be read from a file or printed to files or the console.

Perl has two mechanisms for running external commands or programs when you are not immediately interested in capturing the output of those commands: the system() and exec() functions. Each of these will run an external program. The difference between the two is that the exec() function causes the external program to replace the currently running Perl program while the system() function spawns another shell process to run the given program and waits for it to complete before continuing.

    system('ls');               # runs the 'ls' command and waits for it to complete
    print "Still running\n";    # will print when 'ls' finishes

    exec('ls');                 # substitutes 'ls' command for current running script
    print "Still running\n";    # will only print if 'ls' failed

We can detect failure in a manner similar to our tests of the open() call, but with one significant difference: the system() function returns 0 (a false value) for success and an error code for failure. To test for failure, you need to use the logical && operator with the die statement:

    system('ls') && die "hmm, 'ls' command failed\n";

So, what happens to the output from the command? In both cases, the standard filehandles are inherited from the Perl program, and the output goes to STDOUT. In the second example above, as indicated in the comments, the print statement will not be executed if the exec() was successful. Because the new process has completely replaced the current Perl program, nothing after the exec() call can be executed.
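When more detail than success or failure is needed, the value returned by system() can be decoded. The sketch below uses $^X (the path of the running perl) as a portable stand-in for an external command, and the exit code 3 is arbitrary; both are assumptions made for the example.

```perl
use strict;
use warnings;

# Run a child process that deliberately exits with code 3.
my $status = system($^X, '-e', 'exit 3');

# system() returns the raw wait status; the command's own exit code
# is held in the high byte.
my $exit_code = $status >> 8;

if ($status == 0) {
    print "command succeeded\n";
}
else {
    print "command failed with exit code $exit_code\n";
}
# prints: command failed with exit code 3
```

Passing the command and its arguments as a list, as done here, also avoids having the shell interpret the command line. See perldoc -f system for the full story on the return value.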
Why would you want to run external programs using these functions? Well, even though you can write routines in Perl to do pretty much anything an external command could do, it is sometimes simpler to just use the external commands. On Unix systems, there are a variety of command line utilities that are often used for such things as sorting, searching, and finding files. Perhaps your program has just collected and printed to a file a large amount of data that you would now like to sort and remove duplicate elements from before exiting the program:

    # ... code that writes lots of raw data to $tmp_file
    exec("sort $tmp_file | uniq > $final_file");
Or, similarly, say you wanted to do the same thing to an existing file before reading it, in which case you would use the system() function so your program could continue after running the given command. The perldocs offer a little more detail regarding these functions and how they process their arguments (see perldoc -f system and perldoc -f exec). More often you want to capture or read the output of external commands. Let's look at some aspects of these last operations.
13.2 Reading and writing from/to external commands

Perl also provides mechanisms to run external commands, to collect or read their output, or to send data from your Perl program into the standard input of an external command.

Backticks are the simplest mechanism for collecting data from an external program. The syntax is to simply enclose the external command in reverse (or opening) single quotation marks, also referred to as backticks:

    $listing = `ls`;     # $listing has output as single string
    @listing = `ls`;     # @listing has output as array of lines
    $listing = `dir`;    # same as above on DOS
    @listing = `dir`;    # same as above on DOS
In scalar context, backticks collect and return the output of the given command as a single string. In list context, the output is returned as a list of lines (where the meaning of "line" is dependent on the current value of the special $/ variable).

Alternately, you can open a file handle onto an external process using the open() function by giving an external command followed by a pipe as an argument:

    open(DIR, 'ls |') || die "can't fork: $!";
    while (<DIR>) {
        chomp;
        print "$_\n" if -s $_ > 5000;
    }
    close(DIR) || die "failed $!";
Here we open the ls process and pipe it into our file handle so we can read one line at a time and print only the names of files that are over 5000 bytes in size (see section 13.4). Note, catching open pipe failures is not straightforward; please see perlfaq8 for a solution to this problem.
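One part of the failure picture can be sketched here: for a piped filehandle, close() waits for the command and returns false if the command exited with a non-zero status. The example assumes the list form of pipe open (available in Perl 5.8 and later on Unix-like systems), a lexical filehandle, and $^X (the running perl) as a stand-in for an external command; none of these appear in the text above.

```perl
use strict;
use warnings;

# The list form of a pipe open runs the command without going through the shell.
open(my $pipe, '-|', $^X, '-e', 'print "$_\n" for qw(alpha beta gamma)')
    or die "can't fork: $!";

my @lines;
while (my $line = <$pipe>) {
    chomp $line;
    push @lines, $line;
}

# close() on a pipe returns false when the command exited with a failure status.
close($pipe) or die "command failed: status $?";

print scalar(@lines), " lines read\n";   # prints: 3 lines read
```

Checking the result of close() catches commands that start but then fail, which a check on open() alone may miss.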
We can also open up a whole pipeline of processes, such as the one we used in the exec() example above:
    open(SORTED, "sort $tmp_file | uniq |") || die "can't fork: $!";
    while (<SORTED>) {
        print;
    }
    close(SORTED) || die "problem with SORTED: $!";
Here we have opened the sort program on the file $tmp_file, piped its output to the uniq utility to remove duplicate lines, and, finally, piped that output to our file handle so we can read it line by line.

This piping mechanism works either way but only one way at a time. We can write data to an external process as well by using a leading pipe symbol:
    open(OUT, "| sort | uniq > $final_file") || die "can't fork: $!";
    while (<>) {    # reads STDIN or files named on the command line (assumed input source)
        print OUT;
    }
    close(OUT) || die "problem with OUT: $!";
In this case, we are sending our data to the sort command, then to the uniq utility, and finally redirecting it to a file. These are standard utilities on Unix systems. The utilities available on your system may differ.

13.3 Working with directories
Although we gave a few examples above of using external commands (ls or dir) to obtain directory listings, Perl also has a built-in set of functions for opening, reading, and closing directories (directory handles) and reading their contents.

    opendir(DIR, '/home/ajohnson') || die "can't: $!";
    my @listing = readdir DIR;
    closedir DIR;

And, if you wanted to print out all the files with a .txt extension, you could use the grep() function to filter the output of the readdir() function:

    opendir(DIR, '/home/ajohnson') || die "can't: $!";
    my @listing = grep /\.txt$/, readdir DIR;
    closedir DIR;
One thing to remember though is that the directory entries produced by readdir() do not contain leading path information, just the actual entries themselves. Thus, you couldn't try to open one of the files just because chances are it doesn't exist in your current directory.

    my $dir = '/home/ajohnson/public_html';
    opendir(DIR, $dir) || die "can't: $!";
    my @listing = grep /\.html$/, readdir DIR;
    closedir DIR;
    foreach my $file (@listing) {
        # open(FILE, $file) || die "can't: $!";    # won't work
        # must use full pathname to the file
        open(FILE, "$dir/$file") || die "can't: $!";
        # more stuff
    }
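The path issue can be tried out safely in a throwaway directory. The sketch below assumes the core File::Temp module and lexical directory and file handles, neither of which is used in the text; the file names are invented for the example.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# Build a scratch directory holding three small files.
my $dir = tempdir(CLEANUP => 1);
foreach my $name (qw(a.html b.html notes.txt)) {
    open(my $out, '>', "$dir/$name") || die "can't create $name: $!";
    print $out "contents of $name\n";
    close($out);
}

opendir(my $dh, $dir) || die "can't open $dir: $!";
my @html = grep { /\.html$/ } readdir $dh;
closedir($dh);
@html = sort @html;

# readdir() returned bare names, so prepend the directory before opening.
my @first_lines;
foreach my $file (@html) {
    open(my $in, '<', "$dir/$file") || die "can't open $dir/$file: $!";
    push @first_lines, scalar <$in>;
    close($in);
}

print "@html\n";   # prints: a.html b.html
```

The sort is there only because readdir() returns entries in no particular order.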
Of course, how do we really know if the $file in the example above is really a file? It might be a directory (or a FIFO or a socket). The filetest operators are often used in conjunction with reading directories to determine what kind of entity a particular directory entry really is.

13.4 Filetest operators
There are several filetest operators that can be used to find out information about a particular directory entry. Table 13.1 on page 230 shows many of the commonly used such tests. (See perldoc perlfunc for the complete list of file test operators.)

Most of these return a simple true or false value and are often used in conditional expressions or grep() expressions. If a filename or file handle is not given, then the test is performed against the filename contained in the default variable ($_):

    my $dir = '/home/ajohnson/bin';
    opendir(DIR, $dir) || die "can't: $!";
    my @listing = grep -f "$dir/$_", readdir DIR;
    closedir DIR;
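A few of the tests can be seen working together in a scratch directory. This is a sketch under stated assumptions: it uses the core File::Temp module, lexical handles, and invented entry names, and it shows only the -f, -d, and -s tests.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# Create one subdirectory and one 100-byte file to test against.
my $dir = tempdir(CLEANUP => 1);
mkdir "$dir/subdir" or die "can't mkdir: $!";
open(my $fh, '>', "$dir/data.txt") or die "can't create file: $!";
print $fh "x" x 100;
close($fh);

opendir(my $dh, $dir) or die "can't open $dir: $!";
my @entries = grep { $_ ne '.' && $_ ne '..' } readdir $dh;
closedir($dh);

my @files = grep { -f "$dir/$_" } @entries;   # plain files only
my @dirs  = grep { -d "$dir/$_" } @entries;   # directories only
my $size  = -s "$dir/data.txt";               # size in bytes

print "files: @files, dirs: @dirs, size: $size\n";
# prints: files: data.txt, dirs: subdir, size: 100
```

As in the text, each test is applied to the full path rather than to the bare name returned by readdir().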
The -M, -A, and -C tests return an age in days (fractional) relative to the start
of the current program. Therefore, to started,
you could
test
if
As a simple example, tories
(
entries in the
readdir
-m 'filename'
$in-> $max]
        }
        ($in->[$i], $in->[$max]) = ($in->[$max], $in->[$i]);
    }
}
Another simple sorting routine is the insertion sort. Here we take the first element of the array and consider it a sorted list of length 1. We then take the second element in the array and insert it into its correct position in the list comprised of the first two array elements. We leave it where it is if it is larger than element 1, or we put it into position 1 and move element 1 up one position. Similarly, we add the third element by finding its proper position and moving any higher elements up one position. Here are the intermediary results for (3, 1, 4, 5, 2):

    1 3 4 5 2    1 inserted, 1 3 sorted
    1 3 4 5 2    4 inserted, 1 3 4 sorted
    1 3 4 5 2    5 inserted, 1 3 4 5 sorted
    1 2 3 4 5    2 inserted, 1 2 3 4 5 sorted

        while (... > $val) {
            $in->[$i+1] = $in->[$i];
            $i = $i - 1;
        }
        $in->[$i+1] = $val;
    }
}
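The description above can be turned into a complete routine along the following lines. This is a sketch consistent with the description, not necessarily the listing as printed: the sub name insertion_sort, the choice to pass an array reference, and the exact loop bounds are assumptions.

```perl
use strict;
use warnings;

# Sorts the array referred to by $in in place, in ascending numeric order.
sub insertion_sort {
    my ($in) = @_;
    for my $j (1 .. $#$in) {
        my $val = $in->[$j];      # the element being inserted
        my $i   = $j - 1;
        # Shift larger elements up one slot to open a gap for $val.
        while ($i >= 0 && $in->[$i] > $val) {
            $in->[$i + 1] = $in->[$i];
            $i = $i - 1;
        }
        $in->[$i + 1] = $val;
    }
    return $in;
}

my @data = (3, 1, 4, 5, 2);
insertion_sort(\@data);
print "@data\n";   # prints: 1 2 3 4 5
```

When the input is already nearly sorted, the inner while loop exits almost immediately, which is why insertion sort does well on such data despite its poor worst case.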
Our final simple sorting algorithm is called the bubble sort because larger elements "bubble up" to the top (or end) of the array. In this routine, we step through each element of the array, comparing it with the next element and swapping the two if the first element is larger than the second. In doing so, the largest element is moved to the end of the list and the rest of the array is slightly closer to being sorted. We continue making such passes until we don't make any swaps that indicate that the array is now sorted. Let's consider one pass on our array of (3, 1, 4, 5, 2):

    3 1 4 5 2
    1 3 4 5 2
    1 3 4 5 2
    1 3 4 5 2
    1 3 4 2 5
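A bubble sort matching this description might look like the following sketch. The sub name bubble_sort, the array-reference argument, and the $swapped flag are assumptions made for the example rather than details taken from the text.

```perl
use strict;
use warnings;

# Sorts the array referred to by $in in place, in ascending numeric order.
sub bubble_sort {
    my ($in) = @_;
    my $swapped = 1;
    while ($swapped) {                 # keep making passes until one has no swaps
        $swapped = 0;
        for my $i (0 .. $#$in - 1) {
            if ($in->[$i] > $in->[$i + 1]) {
                # Swap the neighbors with an array slice assignment.
                @{$in}[$i, $i + 1] = @{$in}[$i + 1, $i];
                $swapped = 1;
            }
        }
    }
    return $in;
}

my @data = (3, 1, 4, 5, 2);
bubble_sort(\@data);
print "@data\n";   # prints: 1 2 3 4 5
```

A common refinement, left out here to keep the sketch short, is to shorten each pass by one element, since the largest remaining value is already in place after every pass.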
[0];
to
the heap while the current element
greater,
and D)
the heap, restoring the heap property
index 0 in the heap
because any nodes further
which child node
C
by one;
needs to be passed two parameters: a reference to
same routine at
is
down through
2),
with,
and the current element
will use this
size
what the implementation of the pushdown
at
To begin
rather than Just assuming
we
size
looping (size
node and decreasing the heap
last
the
for the
we
array.
children nodes and set the
heap until eventually our heap
shown
is
last ele-
no longer a heap because the
elements, and our swap completes the sorting of the array.
pushdown 0 routine
this
in the array.
still
on our new smaller six-element heap.
this smaller
To turn
of the heap-array. This routine uses a
1
current element to the node with which
store
no longer con-
is
not larger than both of its children. To
is
named pushdown
a routine
we
with the
1
decrease the heap size from 7 to 6 so the last element in the array it is
thing to
first
end of the
in the heap, thus putting the largest element at the
sidered as part of the heap, even though
in
an example, we store 7
as
in element 0 of the array because there are seven elements in the heap.
heap-array into a sorted array,
is
build a heap from an input array,
of the heap there. Using the heap in figure 17.3
the size
The
partially sorted.
how we might
how we might produce
that the
is
consider
it is
{
;
while($i $child] $child++
$heap-> $child+l
[$i] >= $heap-> $child] $heap-> $child] ($heap-> [$i] = $heap-> $child] $heap-> $i $i = $child;
if
(
[
{
)
[
last
}
.
' ]
[
,
,
-.
)
[
,
;
)
}
Figure 17.4 The action of pushing an element down the heap

Now that we know how we will sort the heap-array once it's built, we need to figure out how we will build a heap-array from an ordinary array. To do this, we take our ordinary array, unshift() its size into its first element (element 0), and build it into a heap from the bottom up. Let's consider an array containing (4, 9, 7, 10, 8, 12, 6). Figure 17.5 shows this array in binary tree form and in array form after unshifting the size onto the array.
Obviously, this binary tree does not satisfy the heap property. But you'll note that if we call pushdown() on node 3, the last parent node in the structure, this causes the 7 and the 12 to be swapped. Node 3 along with its children is now a small heap. Similarly, calling pushdown() on node 2 causes the 9 and the 10 to be swapped, turning that subtree into a proper heap. Finally, calling pushdown on node 1, the root node, works as it did in our previous example: the 4 in node 1 is swapped with the 12 now in node 3; the 4 we just put in node 3 is tested; the 4 is now swapped with the 7 in node 6; and we have created a heap from our original array.

    Array Representation
    index:  0  1  2  3   4  5   6  7
    value:  7  4  9  7  10  8  12  6

Figure 17.5 Ordinary array shown in heap and array form

Finally, we can construct a heap sort algorithm that sorts an array in place by, first, using one loop to build the heap from the bottom up. Then we use a second loop to swap the first and last nodes of the heap, to reduce the heap size, and to restore the heap property on the new smaller heap that remains. The algorithm, which calls the pushdown() routine shown above, is

    sub heap_sort {
        my $heap = \@_;
        unshift @$heap, scalar @_;
        for (my $i = int($heap->[0] / 2); $i >= 1; $i--) {
            pushdown($heap, $i);
        }
        for (my $i = $heap->[0]; $i >= 2; $i--) {
            ($heap->[1], $heap->[$i]) = ($heap->[$i], $heap->[1]);
            $heap->[0]--;
            pushdown($heap, 1);
        }
        shift @$heap;
    }
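For readers who want to run the whole algorithm, here is a self-contained sketch of both pieces. It follows the heap-array layout described above (size in element 0, nodes in elements 1 and up), but the pushdown() body is a reconstruction written for this example, and this heap_sort() works on a copy and returns a sorted list rather than sorting in place.

```perl
use strict;
use warnings;

# Push the element at index $i down until neither child is larger.
# $heap->[0] holds the heap size; nodes live at indices 1 .. size.
sub pushdown {
    my ($heap, $i) = @_;
    my $size = $heap->[0];
    while (2 * $i <= $size) {                  # stop once node $i has no children
        my $child = 2 * $i;
        # Pick the larger of the two children, if there are two.
        $child++ if $child < $size && $heap->[$child + 1] > $heap->[$child];
        last if $heap->[$i] >= $heap->[$child];
        @{$heap}[$i, $child] = @{$heap}[$child, $i];
        $i = $child;
    }
}

sub heap_sort {
    my ($in) = @_;
    my @heap = (scalar @$in, @$in);            # size first, then the data
    for (my $i = int($heap[0] / 2); $i >= 1; $i--) {
        pushdown(\@heap, $i);                  # build the heap from the bottom up
    }
    for (my $i = $heap[0]; $i >= 2; $i--) {
        @heap[1, $i] = @heap[$i, 1];           # move the largest to the end
        $heap[0]--;                            # shrink the heap by one
        pushdown(\@heap, 1);                   # restore the heap property
    }
    shift @heap;                               # drop the size slot
    return @heap;
}

my @sorted = heap_sort([4, 9, 7, 10, 8, 12, 6]);
print "@sorted\n";   # prints: 4 6 7 8 9 10 12
```

Working on a copy costs extra memory but avoids any surprises from modifying the caller's array through @_.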
Earlier I said that this sorting routine has a running time based on (N * log(base 2) N). Well, log(base 2) N is roughly the height of the heap, so we can say it runs in N * h time, where h is the height of the heap. You'll notice that the height of the heap grows slowly relative to N. For N = 1000, h is 10, and for N = 1,000,000, h is only 20. Thus, it is easy to see that even if the heap sort is doing more underlying work than other routines, as N grows large, the heap sort is certainly not doing it as often: 1000 * 1000 is far more than 1000 * 10. A quick series of benchmarks for various sizes of arrays of random numbers produced the following rough timings:
    N = 50
    Bubble Took: 0.05 seconds
    Insert Took: 0.02 seconds
    Select Took: 0.01 seconds
    Heap Took:   0.03 seconds

    N = 100
    Bubble Took: 0.18 seconds
    Insert Took: 0.06 seconds
    Select Took: 0.06 seconds
    Heap Took:   0.05 seconds

    N = 500
    Bubble Took: 4.79 seconds
    Insert Took: 1.21 seconds
    Select Took: 1.31 seconds
    Heap Took:   0.34 seconds

    N = 1000
    Bubble Took: 19.27 seconds
    Insert Took: 4.84 seconds
    Select Took: 5.26 seconds
    Heap Took:   0.76 seconds

    N = 5000
    Bubble Took: (way too long)
    Insert Took: 122.11 seconds
    Select Took: 132.48 seconds
    Heap Took:   4.43 seconds
Of course, the goal here was not to create the fastest sort routine in Perl code. The built-in sort() function can sort a good deal faster than any of the routines shown here. (For example, the built-in sort routine sorts a list of size 5000 in about 0.29 seconds on the same machine that produced the above timings.)

No, the goal of this chapter was to use sorting algorithms as a context to introduce you to alternate ways of structuring your data that can lead to improved algorithms. We have also touched upon some basic terminology you will encounter in further studies of abstract data structures (nodes, trees, binary trees, heaps) as well as an informal introduction to comparing the order of growth of running times in different algorithms. We will encounter some of these concepts again in the following chapters.
17.4 Exercises

1 Rewrite the pushdown() routine to use recursion rather than a while loop to push an element down the heap. Will this help or hurt the running time of the heap sort or make little difference?
2 An alternate way to build a heap is by insertion (recall how insertion sort built its sorted list). Write a routine that builds a heap by insertion. This will work in a bottom up fashion.
3 Can you think of other problems besides sorting where a heap or heap-like structure might be useful?
CHAPTER 18

Object-oriented programming and abstract data structures

18.1 What is OOP?
18.2 OOP in Perl
18.3 Abstract data structures
18.4 Stacks, queues, and linked lists
18.5 Exercises
Perhaps you've heard of object-oriented programming (OOP) sometime during the past decade. You might have heard that it is just a bunch of horse hooey, or perhaps that it is the greatest productivity advancement since caffeine. OOP does indeed have its share of vocal proponents and detractors. Fortunately, we can happily ignore the extremists and make up our own minds.

This chapter will begin with a brief overview of OOP in general, to give you a feel for the subject and to introduce a few new fancy terms to your vocabulary. We will settle right down into using Perl's OOP features to begin building classes and objects and using them in our programs. Then we'll use classes and objects to create a few well-known abstract data structures.

If you are already familiar with the concepts and terminology of OOP programming, you may want to skip the next section and jump straight to section 18.2 on Perl OOP (straight to the POOP as it were). Perl's OOP capability makes use of references, so, if you had any trouble understanding references, you might want to review chapter 8 and the perlref pod-page. In recent releases of Perl, version 5.00503, there is also a perlreftut, which provides a short tutorial on Perl's references.

Entire books could be written about object-oriented programming and how one might use it effectively. In fact, a great many such books have been written. Indeed, the publisher of the book you are now reading is also publishing a book on object-oriented programming using Perl that should come out around the same time as this book. I tell you this only so you understand that here I can only present a rather abbreviated and informal introduction to OOP: enough to get you started, but not the whole messy enchilada.

18.1 What is OOP?
To begin with, let's take a look at some of the terminology that is commonly used in OOP-speak: abstraction, encapsulation, inheritance, and polymorphism. Big words, but not difficult to grasp. We've already touched upon the first two back when we discussed functions and subroutines in chapter 7. We will discuss each in turn as they come up again in the following discussions.

Typically (at least so far in this book) we approach a programming problem from a perspective we can call procedural or one of algorithmic decomposition, that is, identifying the tasks that need to be accomplished to solve the problem. If we are given a problem, we tend to focus on the verbs in the problem statement, in other words, the actions that need to be performed on the data. In some cases, as with the heap sort routine in the previous chapter, we may look more closely at the data to consider how it might be organized or structured before we decide on a processing approach.
In an OOP approach to a problem, dividing and structuring the data space is a central concern. We want to identify the things with which we will be working. In this case, we tend to focus on the nouns in the problem statement and how we might model them using different data structures and functions.
Let's take a trip back to our childhood and use a few well-known sentences from a children's book to illustrate the OO approach. Let's begin with "See Spot run." (For those who don't remember, or who never read this particular series of books, you should know that Spot is a dog.) Now let's say that we want to write a program to do animation for an online version of this book.

In a traditional approach, we might immediately start sketching out a run() procedure that would move a picture of Spot around the screen while animating Spot's legs. Obviously, this subroutine would need to know quite a lot about how Spot is represented graphically. Consider that the next page of our story says, "See Dick run" (Dick is a human). Our run() won't work with Dick because he has only two legs to animate while Spot has four. Now we need either two subroutines, spot_run() and dick_run(), or one big run() subroutine, called as run(dick) or run(spot), that contains the code to do either animation. Things would get messier still if we had other creatures that were going to run.
Looking at this from an OOP perspective, we might first identify the nouns, Spot and Dick, and their behaviors. In figure 18.1, we show one way we might begin to think about our data types. Notice that in this diagram we have also added category names, "human" and "dog."

    Human                    Dog
    name: Dick               name: Spot
    number of legs: 2        number of legs: 4
    run()                    bark()
    jump()                   run()
    talk()                   jump()

Figure 18.1 Modeling Dick and Spot

If we have modules that define such data types, then in our main book animation program, we need only use these modules to create a Spot object and a Dick object. Then we can ask each object to run on the appropriate pages of the book.

What we have done is created abstractions for our data: humans and dogs. We call these abstractions classes. When we use an abstraction to create a specific instance of a thing (for example, creating an instance of a Dog, called Spot) we call that instance an object. We have also used encapsulation here. Each class encapsulates the data and the behaviors of the objects it defines. We usually refer to the data as the attributes or properties of the object, and the behaviors as methods. The latter are really just ordinary functions.

We could have created a more general class called Mammals that just contained the "name" and "number of legs" attributes and a run() method. Then we could have defined the other two classes as special types of Mammals. This approach would use inheritance, that is, each special subtype inherits the properties of its parent class. This approach would also involve polymorphism because each subtype's run() method would have to be redefined for that particular subtype. Polymorphism means that children classes can alter their properties and methods so they are not identical to their parents.
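To make these terms a little more concrete, here is a rough sketch of how the Mammal idea might look in Perl. Everything in it is an assumption made for illustration: the class names, the attribute names, the use of @ISA for inheritance, and the strings returned by the methods. The package and bless mechanics it relies on are the subject of section 18.2.

```perl
use strict;
use warnings;

package Mammal;
sub new {
    my ($class, %args) = @_;
    # Encapsulation: the object's data lives inside the object itself.
    my $self = { name => $args{name}, legs => $args{legs} };
    return bless $self, $class;
}
sub run {
    my $self = shift;
    return "$self->{name} runs on $self->{legs} legs";
}

package Dog;
our @ISA = ('Mammal');      # inheritance: a Dog is a special type of Mammal
sub bark { return 'Woof' }

package Human;
our @ISA = ('Mammal');
# Polymorphism: Human replaces the inherited run() with its own version.
sub run {
    my $self = shift;
    return "$self->{name} jogs upright";
}

package main;

my $spot = Dog->new(name => 'Spot', legs => 4);
my $dick = Human->new(name => 'Dick', legs => 2);
print $spot->run, "\n";   # prints: Spot runs on 4 legs
print $dick->run, "\n";   # prints: Dick jogs upright
```

The calling code asks each object to run without knowing or caring which version of run() will be used; the object's class decides.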
You can't really appreciate this until you actually start doing it, so without further ado, let's get back to programming.

18.2 OOP in Perl

By creating a class, you are defining a whole new data type to use in your programs. This class defines how the data is stored, how it is accessed, and what functions (i.e., methods) this data type can perform. Think of the array data type: you know how to assign data to particular places in the array and how to access their values again later using array indices. You also know several functions defined to work on arrays: shift, push, pop, unshift, splice. Similarly, you know how to assign key/value pairs to a hash and access those values later. You also know how to use several functions defined to work on hashes: exists, each, keys, values. And you already know how important and useful these data types can be for many purposes.

18.2.1 The basics

So, how do we create a class in Perl? We begin by using the package declaration to define a new namespace, the same way we did when building an ordinary module. This package/module will contain the definition of our class. We will store all this in a file of the same name as the package but with a .pm extension added, just as we did for ordinary modules.
package Student;
Here we have started the definition of a class named Student. This will provide us with a Student data type. We do not have to worry about any exporting when making classes because OOP modules should not export anything.

The first thing this class has to define is a method of creating a new Student object that we can use in our programs. Such a method is called a constructor method. We can name this method anything we want, but most people prefer to call it new(). This method will return a reference to the underlying data structure it creates. Not just any reference, but a reference that has been specially tagged by Perl's bless() function. This blessing mechanism is at the heart of Perl's OOP capability.

    sub new {
        my $class = shift;
        my $self  = {};
        $self->{name}    = undef;
        $self->{courses} = [];
        bless $self, $class;
        return $self;
    }
When a constructor is invoked using an arrow syntax that you'll soon be very familiar with (Student->new('arg1', 'arg2')), the class name is actually passed to the function as the first argument, moving the given arguments back one position in the argument list. In the constructor above, we have shifted off the class name, created a reference to a new anonymous hash, provided a couple of initial values for a name and course key in this hash, blessed this hash reference, and returned it. By blessing it, Perl will always know that this particular hash reference belongs to the package Student; it is now a Student object. If you leave off the second argument to the bless() function, it will default to the current package/class, but using the second argument leaves room for other benefits, as we will see shortly.
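To make the "class name arrives as the first argument" point concrete, here is a small sketch that calls the same constructor twice: once with the arrow syntax and once as a plain function with the class name passed by hand. The args field is a hypothetical addition, included only so that we can see which arguments are left after the class name is shifted off.

```perl
#!/usr/bin/perl
use strict;
use warnings;

{
    package Student;

    sub new {
        my $class = shift;    # the class name arrives first
        my $self  = { name => undef, courses => [] };
        # Hypothetical extra field, only to show where the caller's
        # arguments end up once the class name has been shifted off.
        $self->{args} = [@_];
        bless $self, $class;
        return $self;
    }
}

# Arrow syntax: Perl supplies 'Student' as the first argument for us.
my $s1 = Student->new('arg1', 'arg2');

# The same call written as a plain function call, with the class name
# passed explicitly.
my $s2 = Student::new('Student', 'arg1', 'arg2');

print ref($s1), ' ', ref($s2), "\n";
print "@{ $s1->{args} }\n";
```

Both calls produce a hash reference blessed into Student. The arrow form is the one to use in practice: a plain function call names the package directly, so it cannot take part in the inheritance lookup described later in this chapter.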
We said that a class also defines the behaviors of the object in question, so let's give our Student object the ability to tell us who it is and what courses it is taking. The following two functions go in the Student package:

    sub name {
        my $self = shift;
        $self->{name} = shift if @_;
        return $self->{name};
    }

    sub courses {
        my $self = shift;
        @{$self->{courses}} = @_ if @_;
        return @{$self->{courses}};
    }
In Perl, when you call an object's method, using the arrow syntax shown above for the constructor, the object reference itself is automatically passed as the first argument to the function. So, in the two functions above, we shift off the first argument into a variable that we usually name $self but could call anything we wanted. That variable now holds the reference to the hash. We can access any value in this hash reference in the usual ways. Above, we set the values for these fields if the functions are passed arguments, or we just return the values if no arguments were passed. Some people prefer to create separate functions for setting and retrieving object attributes.

If we have the above package saved in a file named Student.pm and included a final statement of just 1; to return a true value as we did with our modules in chapter 16, we can use this new data type in a program:
    #!/usr/bin/perl -w
    use strict;
    use Student;
    my $student = Student->new();
    $student->name('Bill Jones');
    $student->courses('Math', 'English');
    print $student->name(), "\n";                # prints: Bill Jones
    print join(' ', $student->courses()), "\n";  # prints: Math English

We could have also accessed the name or courses of the student directly because we know that a student is just a hash reference. We could have said print $student->{name} to get the student's name, but this is not a good practice. When you use an object, you should access it only through its documented functions. This way, the underlying structure of the object can be changed in the future without affecting your program. Perhaps in the future we will decide to change the Student class to use an array reference rather than a hash reference. We will change its methods accordingly:
    package Student;

    sub new {
        my $class = shift;
        my $self  = [];
        $self->[0] = undef;    # name
        $self->[1] = [];       # courses
        bless $self, $class;
        return $self;
    }

    sub name {
        my $self = shift;
        $self->[0] = shift if @_;
        return $self->[0];
    }

    sub courses {
        my $self = shift;
        @{$self->[1]} = @_ if @_;
        return @{$self->[1]};
    }

    1;
If our programs had been accessing the name and courses of the student objects by directly accessing the hash keys, the programs would no longer work. However, by only using the documented accessor functions, our programs can continue to work correctly regardless of how we change the underlying data structure in the class:

    print $student->{name}, "\n";    # no longer works
    print $student->name(), "\n";    # still works as advertised
Notice one additional thing about using objects that can be demonstrated with a simple script using our original, hash-based Student class above:

    #!/usr/bin/perl -w
    use strict;
    use Student;
    my $student = Student->new();
    $student->name('Bill Jones');
    $student->courses('Math', 'English');
    my $h_ref = {name => 'Bill Jones', course => 'English'};

    print "$student\n";           # prints: Student=HASH(0x80c44cc)
    print ref($student), "\n";    # prints: Student
    print "$h_ref\n";             # prints: HASH(0x80d1bb0)
    print ref($h_ref), "\n";      # prints: HASH

You can see that although we know the underlying data structure of a Student is a hash reference, we also know that it is not just an ordinary hash reference; it's a hash reference that also knows to which class (package) it belongs. The ref() function returns the type of reference for an ordinary reference as well as the class name of a blessed reference (i.e., an object). We can use this information to create more general constructor functions:
    package Student;

    sub new {
        my $type  = shift;
        my $class = ref($type) || $type;
        my $self  = {
            name    => undef,
            courses => [],
        };
        return bless $self, $class;
    }

    # other methods...

    1;
We can call this constructor in the normal fashion or as an object method using an existing object:

    my $student1 = Student->new();
    $student1->name('Bill Jones');
    my $student2 = $student1->new();

It is important to realize that $student2 in the above example is not a copy of $student1. It is a completely new (and empty) Student object of its own. The $student1 object was only used to access the constructor method. It does not provide any additional parameters to the constructor. The constructor now tests whether it was called using the class name or if it was called from an existing object. If it was called from an object, it gets the class name by using the ref() function.

18.2.2 Inheritance

If you have a general class such as our Student class above, you can derive more specific classes directly from it using inheritance. This means you do not have to create entire new classes that duplicate parts of existing classes. Let's say that we also want a special type of Student to represent a part-time student. This student can only be registered for a maximum of three courses.

In Perl, we implement inheritance using a special array called the @ISA array. This array must be a package global variable, not a lexical variable, so if you are using the strict pragma inside your class modules, you will have to declare this variable using the use vars pragma (see below). The @ISA array holds the names of the classes from which you want to inherit. When an object of your derived class calls a method and that method does not exist in that object's class definition, Perl searches the classes within the @ISA array to try to locate that method. Perl will also search any classes listed in those packages' @ISA arrays as well. To start our new Student class, we can simply inherit everything from the parent class:
    package PT_Student;
    use strict;
    use Student;
    use vars '@ISA';
    @ISA = qw(Student);

    1;
We now have a new class called PT_Student that is exactly the same as our Student class. Because this class does not define any methods, Perl searches the classes listed in the @ISA array to find method calls:

    #!/usr/bin/perl -w
    use strict;
    use PT_Student;
    my $pt_stud = PT_Student->new();
    $pt_stud->name('John Smith');
    $pt_stud->courses(qw/Math English Biology Chemistry/);
    print join(' ', $pt_stud->courses()), "\n";

You can see that this PT_Student class behaves just the same as our Student class, and we didn't have to rewrite all the code in this new class. This is inheritance; in other words, the PT_Student class inherits its functionality from its parent class. But, of course, we don't want our PT_Student class to be exactly the same as its parent; we want to allow a part-time student to have at most three courses in its list. We can do this by creating a new courses() function for the part-time student.
    sub courses {
        my $self = shift;
        if (@_ > 3) {
            die "part time students can only have 3 courses.\n";
        } else {
            @{$self->{courses}} = @_ if @_;
        }
        return @{$self->{courses}};
    }

Now everything about part-time students is the same as that for regular students, except that if you try to set the course list to more than three courses, the part-time student will die with an error message. You might want to do something other than having the student die just for trying to take more courses. Perhaps you would only want to issue a warning message and return a false value.

By redefining the courses() function, we have used polymorphism: our derived class has a slightly different shape, or functionality, than its parent class. Figure 18.2 shows the relation between the Student and PT_Student classes. A PT_Student IS A Student with a modified courses() method.

[Figure 18.2 Relation between Student and PT_Student. Student: name, courses; methods name(), courses(). PT_Student: method courses(); linked to Student by an ISA arrow.]
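The gentler alternative mentioned above (warn and return a false value instead of dying) might look something like the following. It is a sketch rather than the book's code: both classes are placed in one file so the example runs on its own, the parent is trimmed to the two methods needed here, and the choice to leave the existing course list untouched when a request is rejected is an assumption.

```perl
#!/usr/bin/perl
use strict;
use warnings;

{
    package Student;

    sub new {
        my $type  = shift;
        my $class = ref($type) || $type;
        return bless { name => undef, courses => [] }, $class;
    }

    sub courses {
        my $self = shift;
        @{ $self->{courses} } = @_ if @_;
        return @{ $self->{courses} };
    }
}

{
    package PT_Student;
    our @ISA = ('Student');

    # Redefined courses(): warn and refuse instead of dying.
    sub courses {
        my $self = shift;
        if (@_ > 3) {
            warn "part time students can only have 3 courses.\n";
            return;    # empty list (or undef in scalar context): false
        }
        @{ $self->{courses} } = @_ if @_;
        return @{ $self->{courses} };
    }
}

my $pt = PT_Student->new();
my @accepted = $pt->courses(qw/Math English/);
my @rejected = $pt->courses(qw/Math English Biology Chemistry/);
my @current  = $pt->courses();

print "accepted: @accepted\n";
print "current:  @current\n";
```

One caveat of this design is that a student with no courses at all also gets an empty (false) list back, so a caller cannot tell "rejected" from "nothing registered yet" by the return value alone.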
18.3 Abstract data structures

In the last chapter and the previous sections, you've learned a little about how to think about your data in a more abstract way (heaps, trees, students) and about representing your data as objects with behaviors. In this section, we continue our practice with OOP techniques while exploring basic abstract data structures further.

18.4 Stacks, queues, and linked lists

Virtually any programming book that even mentions abstract data structures will almost always give examples of stacks and queues. This book is no exception. Stacks and queues are elementary data structures. Even though their functionality is largely built into Perl, both are useful to know so you can see examples of building simple objects and better appreciate what Perl gives for free.

18.4.1 Stacks

A stack data structure in computer programming is much the same as a stack in your everyday life. Perhaps you have a stack of books lying next to your desk, and perhaps you took this very book off the top of that stack. When you take a break from reading, you might place this book back on top of the stack. This describes the fundamental property of a stack: you only add and remove things from the top. In real life, you might try slipping a book out of the stack from the bottom, or lifting several books at a time off the top, but you risk toppling your stack or injuring your back or both. A stack in the programming world limits you to placing or removing single items from only the top of the stack.

We generally refer to the operation of adding to a stack as pushing an element onto a stack, and the removal operation as popping an element from the stack. We also give a name to this ordering of placement and removal with a stack: Last In, First Out (LIFO). Figure 18.3 depicts the basic stack and its operations.

[Figure 18.3 Graphic representation of a stack, showing the push(item) and pop() operations]
now back to plain, now some bold and bold italic <em> text. And lastly, here is emphasized text containing <em> even more emphasized text, in which case you would probably want the word containing in the previous phrase to be in plain fonts again.

If you were parsing such text, you would not want to simply switch states when you hit a tag. What would you do when you hit the ending tag? What state would you switch to? You need to be able to remember many earlier states as you move into nested states and work your way back out again. With a stack, you can keep pushing any new state you enter onto the stack. Whenever you hit an end tag, you can simply pop the current state off the stack. Whatever state is left on the top of the stack is now your current state.

You may suddenly wonder what all this fuss is about. Don't Perl's arrays already allow this kind of behavior? You are right; they do. But arrays in a great many languages are nothing at all like Perl arrays. Usually, all you can do with arrays in other languages is allocate a size for the array, then store and retrieve values using array indices only (no pushing, popping, splicing, or dicing). Perl's arrays are different and convenient because they can grow or shrink on demand, and they have the functionality of stacks built right in, as well as the functionality of queues, as you will see shortly.
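The state-tracking idea described above can be tried directly with a plain Perl array used as a stack. The sample text and the tag names (b, i) below are invented for the illustration, and the "parser" is deliberately naive: it assumes well-formed, properly nested tags and does no error checking.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Made-up sample input with nested tags.
my $text = 'plain <b>bold <i>bold-italic</i> bold</b> plain';

my @states = ('plain');    # the bottom of the stack is the default state
my @log;

# Splitting on a capturing pattern keeps the tags in the result list.
for my $piece (split /(<[^>]+>)/, $text) {
    next if $piece eq '';
    if ($piece =~ m{^</}) {
        # End tag: leave the current state, but never pop the default.
        pop @states if @states > 1;
    }
    elsif ($piece =~ m{^<(\w+)>$}) {
        # Start tag: enter a new state.
        push @states, $1;
    }
    else {
        # Ordinary text: record it under the state on top of the stack.
        (my $words = $piece) =~ s/^\s+|\s+$//g;
        push @log, "$states[-1]: $words";
    }
}

print "$_\n" for @log;
```

After the inner end tag is seen, the state on top of the stack is once again the outer one, which is exactly the "work your way back out" behavior that simple state switching cannot provide.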
So do we need to bother with creating our own stacks in Perl? Not often, but it is easy to do and, by doing so, we can add a little extra functionality, such as providing a size limit to the stack if desired and automatically producing warnings if we reach the bottom or top of the stack. Besides, creating stacks gives us a chance to demonstrate the use of inheritance when we create our queue object.
To keep things simple, we use Perl's arrays to implement our stack on the inside, but on the outside we just have a stack object that might have an optional size limit and provides only the following functions: push, pop, top, is_empty, and is_full. The top function merely returns the top element in the stack without removing it. This is useful when you want to compare the current state with the previous state without having to pop the previous state off the stack, compare them, and push it back on again.

We use another module in our stack object, the Carp module that comes as part of the Perl distribution. This module allows us to issue warnings using the carp() function or errors using the croak() function from inside our object's methods. These warnings point to the line in our main program that called these methods, as you'll see shortly. To begin with, we start our package and create a simple constructor to return a reference to an anonymous array as the blessed object:

    package Stack;
    use Carp;

    sub new {
        my $type     = shift;
        my $class    = ref($type) || $type;
        my $max_size = shift;
        my $self     = [$max_size];
        return bless $self, $class;
    }
We now have an anonymous array as an object with a maximum size attribute stored in its first position. We can construct our two test methods to test if the stack is empty or full. We assume that, if the maximum size is 0 (no size was given when the object was created), we want a limitless stack so the full test should always fail in that case.

    # $stack->is_empty(); returns true if stack is empty
    sub is_empty {
        my $self = shift;
        return !$#$self;
    }

    # $stack->is_full(); returns true if stack is full
    sub is_full {
        my $self = shift;
        return 0 unless $self->[0];
        return ($#$self == $self->[0]);
    }
What the heck is !$#$self? Well, ! is just the logical not operator, and $#, when used on an array, gives us the index of the last element of that array. $self, our object, is a reference to an array. Thus, is_empty() simply returns the logical negation of the last index of the array; that is, if the last index is 0 (a false value), !0 returns true. Similarly, if the last index is greater than 0, then it is a true value and !true returns false.

With these simple tests in place, we can now implement our remaining functions easily by first testing our stack for the appropriate condition, then using the features already built into Perl's arrays to do the rest. The rest of the module looks like this:
    # $stack->push($item); pushes $item onto stack if stack not full
    sub push {
        my $self = shift;
        my $item = shift;
        if ($self->is_full()) {
            carp "Stack is full:";
            return;
        }
        push @$self, $item;
    }

    # $stack->pop(); pops the top item from the stack if not empty
    sub pop {
        my $self = shift;
        if ($self->is_empty()) {
            carp "Stack is empty:";
            return;
        }
        return pop @$self;
    }

    # $stack->top(); returns the value of the top element if not empty
    sub top {
        my $self = shift;
        if ($self->is_empty()) {
            carp "Stack is empty:";
            return;
        }
        return $self->[$#$self];
    }

    1;
    __END__
Now we save this in a file named Stack.pm, and we write a simple little script to test its functionality:

    #!/usr/bin/perl -w
    use strict;
    use Stack;
    my $st = Stack->new(4);
    # test push to overflow
    for (3, 5, 2, 9, 11) {
        print "pushing: $_\n" if $st->push($_);
    }

    print "popped: 9\n" if $st->pop();
    print "pushed: 42\n" if $st->push(42);
    print 'top is: ', $st->top(), "\n";

    # test pop to underflow
    for (1..5) {
        print 'popped: ', $st->pop(), "\n";
    }
    print $st->top(), "\n";

This script produces the following output:

    $ perl stack.pl
    pushing: 3
    pushing: 5
    pushing: 2
    pushing: 9
    Stack is full: at stack.pl line 7
    popped: 9
    pushed: 42
    top is: 42
    popped: 42
    popped: 2
    popped: 5
    popped: 3
    Stack is empty: at stack.pl line 16
    popped:
    Stack is empty: at stack.pl line 18
Now, notice how the carp() messages point to the line in the script we are running? Had we used ordinary warn() calls, the first message would have been

    Stack is full: at Stack.pm line 26.

which wouldn't help us locate the problem in our script, which is where the problem lies because that's where we are attempting to overflow the stack. The Carp module's croak() function provides a similar alternative to Perl's die() function.

While we won't write a full parser here, we can give an example of using a stack that is a little less trivial than our test script. Consider a program that accepts, parses, and then evaluates simple mathematical expressions involving basic arithmetic. Further, let's assume that this program allows two kinds of parentheses, round and square, to be used, presumably so that the person entering the expression can alternate between brace types to help keep them from mis-entering equations. One of the things you might want to do as a first test for a valid equation is to check for mismatched braces:

    3+[(4*[9-2]-1)/2]    is correctly nested
    3+[(4*[9-2)-1]/2)    is incorrectly nested
A simple subroutine using a stack can be used to work through each token in the statement and push any opening brackets (left brackets) encountered onto the stack. When a closing bracket is encountered, it can pop the last opening bracket from the stack to see if it is of the correct type. Such a routine also catches instances of missing brackets. For readability, these equations are nicely spaced out, but we will split on nothing to allow for equations that are not spaced out. The following script provides the subroutine plus a helper routine for printing errors:
    #!/usr/bin/perl -w
    use strict;
    use Stack;

    my $expression1 = '3+[(4*[9-2]-1)/2]';
    my $expression2 = '3+[(4*[9-2)-1]/2)';

    check_braces($expression1);
    check_braces($expression2);

    sub check_braces {
        my $expr  = shift;
        my $st    = Stack->new();
        my $pos   = 0;
        my $valid = 1;    # assume validity
        my $token;
        foreach $token (split //, $expr) {
            if ($token =~ m/\(|\[/) {
                $st->push($token);
            } elsif ($token =~ m/\)|\]/) {
                die not_valid($token, $expr, $pos) if $st->is_empty();
                my $prev = $st->pop();
                die not_valid($token, $expr, $pos)
                    unless $token eq ')' && $prev eq '('
                        or $token eq ']' && $prev eq '[';
            }
            $pos++;
        }
$token = $st->top(); die not_valid $token, $expr return 1; (
sub not_valid { my $token, $expr $pos = @_; my $ptr = x 19 x $pos return new() $queue->is_empty $queue->is_full
,
.
•
(
(
-
'
)
'
' ,
)
i
With enqueue, we are merely renaming Stack's push function rather than inheriting it:

    # $queue->enqueue($item); shoves $item into queue
    sub enqueue {
        shift->push(shift);
    }

    # $queue->dequeue(); removes and returns front of queue
    sub dequeue {
        my $self = shift;
        if ($self->is_empty()) {
            carp ref($self), " is empty:";
            return;
        }
        return splice(@$self, 1, 1);
    }

    # $queue->front(); returns front of queue (doesn't remove)
    sub front {
        my $self = shift;
        if ($self->is_empty()) {
            carp ref($self), " is empty:";
            return;
        }
        return $self->[1];
    }

    1;
    __END__
The only thing really different here is the carp() statement. Rather than saying "Queue is empty", it says ref($self) is empty. Remember that the ref() function on an object returns the object's class name. We have to go back to the Stack module and change those carp() statements to also use this technique instead of hard-coding in the word stack in the messages because sometimes the $self being used in Stack's functions is no longer a stack but a Queue. After saving the example above in Queue.pm and making the minor changes to the Stack class's carp() statements, we can now run the following test script:
    #!/usr/bin/perl -w
    use strict;
    use Queue;

    my $q = Queue->new(4);
    for (3, 2, 4, 9, 11) {
        $q->enqueue($_);
    }

    print $q->dequeue(), "***\n";
    $q->enqueue(42);
    print $q->front(), "***\n";

    for (1..5) {
        print $q->dequeue(), "*\n";
    }
    print $q->front();

which produces the following output (the asterisks are only to help differentiate different print statements):

    Queue is full: at queue.pl line 7
    3***
    2***
    2*
    4*
    9*
    42*
    Queue is empty: at queue.pl line 15
    Queue is empty: at queue.pl line 17
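For comparison, the same first-in, first-out behavior is available from a bare Perl array with the built-in push and shift functions, which is the sense in which queue functionality is "built right in." The values below are arbitrary sample data.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my @queue;

# Enqueue: add items at the back.
push @queue, 3, 2, 4;

# Front: look at the oldest item without removing it.
my $front = $queue[0];

# Dequeue: remove the oldest item from the front.
my $first = shift @queue;

push @queue, 9;

# Drain the queue; items come out in the order they went in.
my @rest;
while (@queue) {
    push @rest, shift @queue;
}

print "front: $front, first out: $first, then: @rest\n";
```

The Queue class above uses splice(@$self, 1, 1) rather than shift because slot 0 of its array holds the maximum size, so the front of the queue lives at index 1.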
18.4.3 Linked lists

It is hard to appreciate all that Perl gives you for free with its built-in data types. Linked lists are another freebie. In many languages, structures such as arrays are static in nature. You state at the beginning of the program how big you want your array and that's the only array you get to use. What if you aren't sure how big your array will need to be? In Perl, this isn't a problem because Perl itself takes care of this for you by providing you with a dynamic array that can grow and shrink as needed, at least within the memory limits of your computer.

As long as the language is able to allocate new memory on demand, linked lists can be used to solve such dynamic problems by providing a way of pointing to this new memory. Obviously, Perl can allocate new memory on demand during runtime. You can always create new variables in Perl, and Perl can point to this new memory through references to new variables, or anonymous structures. So, while not often needed in most Perl programs, you can create linked lists in Perl readily. Also, while the linked list structure may not be that useful, the same techniques can be used to create other structures such as trees.

So what is a linked list? Consider that you have to read in a variable number of simple inventory records (part-number:name:quantity) and you want to be able to search the list for certain fields. Easy, right? Well, let's say that now you have to do it without using an array or hash to hold the entire list, although you can use small anonymous arrays or hashes to hold individual records as in

    {                           # record as hash
        p_num    => 144,
        name     => 'pencil',
        quantity => 7,
    }

or

    [                           # record as array
        144,
        'pencil',
        7,
    ]

Graphically these records could be represented as in figure 18.4, where we simply assume you have some storage device with slots to hold the different fields, regardless of whether you also store the field names [keys] along with them.

[Figure 18.4 Graphical representation of inventory record: a generic record with part_number, part_name, and quantity slots beside a specific record for the pencil (part number 144)]

The slots that contain the field data could hold any scalar data. A reference to one of these structures is just a scalar value. So, if we can have three fields in each structure, why not another, or even two more?

In figure 18.5, each structure holds two additional fields, one containing a reference to the next structure (if any) and one containing a reference to the previous structure (if any). The small black circles are just our reference depiction from chapter 8, and the arrows are meant to point to the whole next (or previous) structure, not just one field. Each node, or individual record-structure, contains a "link" (a reference) to its next node and its previous node.

[Figure 18.5 Records with additional pointer fields: each record (part_number, part_name, quantity) gains Next and Prev fields that reference the neighboring records, or hold undef at the ends of the list]

If we had a reference to the first node, perhaps stored in a variable, then we have access to this node and, by its reference, its next node, and that node's next node, and so on, until we hit a node whose "next" field is undefined. We could also work backwards following the references in each node's "previous" field. The only thing holding the different records together as a list are the references that link them. Hence, this is called a linked list. Actually, this is a double-linked list because it has links in each direction.
In the above description, I've only described creating a linked list of structures that contain the data itself. A more general approach would be to use a key field, to hold just a key you associate with your data, and a data field, which would hold your data, most likely as an anonymous hash or array representing your record. The key field is then your search field when looking for particular items in the list, as with a hash:

    A node in the linked list:
    (
        previous => (reference to previous item),
        key      => (your key),
        data     => (your data),
        next     => (reference to next item),
    )

In some cases, the key itself might be the only data, but in many cases, you will have a key and a record to go with it. Such a record is called the satellite data for the key. Satellite data was not pertinent previously because our stacks and queues did not need to support a search operation.
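As a rough sketch of the node layout just described, the following builds a small double-linked list out of anonymous hashes and walks it in both directions. Everything beyond the four field names (previous, key, data, next) is an assumption made for the example: the part numbers other than 144 are invented, and a temporary array is used only as a convenient source of sample records.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Sample records as [part-number, name, quantity]; only the first one
# appears in the text, the others are made up.
my @records = ([144, 'pencil', 7], [145, 'pen', 11], [146, 'eraser', 3]);

my ($head, $tail);
for my $rec (@records) {
    my $node = {
        previous => $tail,       # undef for the very first node
        key      => $rec->[0],   # search key
        data     => $rec,        # satellite data
        next     => undef,       # filled in when the next node arrives
    };
    $tail->{next} = $node if $tail;
    $head ||= $node;
    $tail = $node;
}

# Walk forward from the head by following the "next" links...
my @forward;
for (my $n = $head; $n; $n = $n->{next}) {
    push @forward, $n->{key};
}

# ...and backward from the tail by following the "previous" links.
my @backward;
for (my $n = $tail; $n; $n = $n->{previous}) {
    push @backward, $n->{key};
}

print "forward:  @forward\n";
print "backward: @backward\n";
```

Note that nodes pointing at each other in both directions form reference cycles, which Perl's reference counting will not free on its own; that is one more reason a singly linked list is often enough.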
The following package is a partial implementation of a linked list supporting methods to insert new nodes into the list, search the list for a given key, test for emptiness, and return the list of the stored records (i.e., dump the list). Since we do not need to traverse the list in both directions, this package just implements a singly linked list. Each node knows the next node, but no node knows its previous node; say that five times fast.

    package Llist;

    sub new {
        my $type  = shift;
        my $class = ref($type) || $type;
        my $self  = {};
        return bless $self, $class;
    }
All our new function does is create a completely empty node, which we are representing with an anonymous hash. This node (i.e., anonymous hash) is blessed and returned and will be the base node for our list.

The real work is done by the insert() method. One line in our insertion routine here that you haven't seen before looks like this:

    my $data = defined($_[0]) ? $_[0] : $key;

This ? : operator is called the ternary operator and operates rather like an if/else clause in an expression. Its general form is

    (condition) ? (this expression) : (that expression)
    returns: this expression if condition was true
    returns: that expression if condition was false

The above line of code tests if $_[0] is defined. If so, we assign that value to $data. If not, we assign $key to $data instead. We do not simply use

    my $data = shift || $key;

because it is possible that the value we are shifting is 0, which would evaluate as false, and we would wind up assigning $key instead of the perfectly valid data value of 0. In the ternary conditional, we test to see if the argument is defined, not to see if it is true. Thus we avoid this potential problem.

To do our insertion, we need to consider two cases: our first node, self, is empty (thus, the list is empty, and we need only copy our key and data into it), or our first node is not empty. In the second case, we insert by creating a new node and copying the first node's data and next pointer into this new node. Then, we add our new key and data into the first node and stick the new node into our first node's next field. In this way, the list grows like a reversed stack, with new elements being pushed onto the front of the list. Each node actually contains its next node object:
    sub insert {
        my $self = shift;
        my $key  = shift;
        my $data = defined($_[0]) ? shift : $key;
        if ($self->is_empty()) {
            $self->{key}  = $key;
            $self->{data} = $data;
            $self->{next} = undef;
        } else {
            my $node = $self->new();
            $node->{key}  = $self->{key};
            $node->{data} = $self->{data};
            $node->{next} = $self->{next};
            $self->{key}  = $key;
            $self->{data} = $data;
            $self->{next} = $node;
        }
        return 1;
    }

Testing if the list is empty requires only that we test if the first node is empty. We do this by testing to see if the key field is not defined.

    sub is_empty {
        return !defined(shift->{key});
    }
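The reason insert() tests for definedness rather than truth can be checked with a few throwaway functions. The function names below are hypothetical and exist only for this comparison. The third variant uses the defined-or operator (//), which is not used in this book; it is available in Perl 5.10 and later and expresses the same intent more compactly.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Falls back to the key whenever the data is false, including 0.
sub data_with_or {
    my $key  = shift;
    my $data = shift || $key;
    return $data;
}

# Falls back to the key only when no defined data was supplied.
sub data_with_defined {
    my $key  = shift;
    my $data = defined($_[0]) ? shift : $key;
    return $data;
}

# Same intent using defined-or (assumes Perl 5.10 or later).
sub data_with_dor {
    my $key  = shift;
    my $data = shift(@_) // $key;
    return $data;
}

my $or_result      = data_with_or('qty', 0);       # loses the 0
my $defined_result = data_with_defined('qty', 0);  # keeps the 0
my $dor_result     = data_with_dor('qty', 0);      # keeps the 0
my $default_result = data_with_defined('qty');     # falls back to key

print "or: $or_result, defined: $defined_result, ",
      "dor: $dor_result, default: $default_result\n";
```

With a data value of 0, the || version silently substitutes the key, while the other two keep the 0 and fall back to the key only when no data argument is passed at all.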
The search and dump-list methods are similar because they both must be able to traverse the list. The difference between the two methods is that the search will end and return the data associated with that key if it finds the key for which it is looking. The dump method must continue through the whole list:

    sub search {
        my $self = shift;
        my $key  = shift;
        return 0 unless defined $key;
        while (1) {
            return $self->{data} if $self->{key} eq $key;
            last unless defined $self->{next};
            $self = $self->{next};
        }
        return 0;
    }

    sub dump_list {
        my $self = shift;
        my @list = ($self->{data});
        while (1) {
            last unless $self->{next};
            $self = $self->{next};
            push @list, $self->{data};
        }
        return @list;
    }

    1;
    __END__
Notice that, unlike with a hash, there is nothing preventing us from sticking new data in with keys we have already used. Our search method will only return the first found item. We could change our insert routine to disallow identical keys (by first searching to see if the key already exists) or to replace the data of the key if that key already exists in the list rather than adding a new node. (Again, we would have to search the list to find the right node, if any.) The following exercises allow you to add further functionality to this object.
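One way to get the replace-on-duplicate behavior described above is sketched below. It is not the book's code: the package name MyList and the method name insert_or_replace are hypothetical, and the node layout (a blessed hash with key, data, and next fields) is assumed to match the list object in this section.

```perl
#!/usr/bin/perl -w
use strict;

# Hypothetical minimal list class, assuming the same node layout as above:
# each node is a blessed hash with key, data, and next fields.
package MyList;

sub new {
    my $type  = shift;
    my $class = ref($type) || $type;
    return bless { key => undef, data => undef, next => undef }, $class;
}

sub is_empty { return !defined $_[0]->{key} }

# Walk the list; replace the data if the key exists, otherwise
# push a new node onto the front as the plain insert() does.
sub insert_or_replace {
    my ($self, $key, $data) = @_;
    $data = $key unless defined $data;
    unless ($self->is_empty()) {
        for (my $node = $self; $node; $node = $node->{next}) {
            if ($node->{key} eq $key) {
                $node->{data} = $data;
                return 1;
            }
        }
        # Key not found: copy the first node into a new node.
        my $new = $self->new();
        @{$new}{qw(key data next)} = @{$self}{qw(key data next)};
        $self->{next} = $new;
    }
    @{$self}{qw(key data)} = ($key, $data);
    return 1;
}

sub dump_list {
    my $self = shift;
    my @list;
    return @list if $self->is_empty();
    for (my $node = $self; $node; $node = $node->{next}) {
        push @list, $node->{data};
    }
    return @list;
}

package main;

my $list = MyList->new();
$list->insert_or_replace(a => 1);
$list->insert_or_replace(b => 2);
$list->insert_or_replace(a => 3);    # replaces, does not add a node
print join(",", $list->dump_list()), "\n";
```

The design choice here is to search before inserting, which makes every insert cost a full traversal in the worst case; that is the price of hash-like key uniqueness in a simple linked list.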
18.5 Exercises

1 Use perldoc perltoot to read Tom Christiansen's excellent tutorial on object-oriented programming with Perl (or view the HTML page if that's what you prefer to use to read the documentation).

2 Also take a look at the perlobj and perlbot pod-pages.

3 The linked list object could benefit from a method that dumps the keys of the list rather than the values. Add a dump_keys() method to this object.

4 The main thing missing from our linked list is a method to delete a node. For this, you first need to search the list to find the node. Then you must splice it out of the list. This requires that you keep track of the previous node because to splice the current node you need to set the previous node's next field to the current node's next field.
CHAPTER 19
More OOP examples

19.1 The heap as an abstract data structure
19.2 Grades: an object example
19.3 Exercises

In this chapter, we begin by returning to our heap data structure and implement a heap class, and discuss some of its applications. We then turn to a more practical example of creating a few classes. From these classes, we can revisit and extend the example we used at the beginning of chapter 8, building a system to query and make reports from a data base of student assignment and exam grades.
19.1 The heap as an abstract data structure

Before tackling this section, you may wish to familiarize yourself with the heap structure and the algorithms we used in chapter 17 to create a heap and pull items off the top of the heap.

For our Heap class, we will again use an array to store our heap. We will also use the first element of the array to store additional information: in this case, the size of the heap, as well as an anonymous subroutine we can use to compare two keys in the heap. We begin by defining our package and setting up a private hash of comparison functions.
    package Heap;
    use strict;
    my %comp = (
        str  => sub { return $_[0] cmp $_[1] },
        rstr => sub { return $_[1] cmp $_[0] },
        num  => sub { return $_[0] <=> $_[1] },
        rnum => sub { return $_[1] <=> $_[0] },
    );

This hash is private to the Heap package; no other package can access it. We expect the caller of our new() constructor to supply an argument indicating which comparison routine to use. str and rstr are for string comparisons in normal and reverse sorted order, similarly for num and rnum. If no routine is specified, we default to using numeric comparisons:
    sub new {
        my $this  = shift;
        my $class = ref($this) || $this;
        my $comp  = shift || 'num';
        my $self  = [ { size => 0,
                        comp => $comp{$comp},
                      },
                    ];
        return bless $self, $class;
    }
This constructor has set up our initial empty heap as an anonymous array with a hash reference in its first position. This hash reference contains the size of the heap and a reference to the anonymous subroutine for comparisons.
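The private hash of comparison subroutines can be tried outside the Heap class. The sketch below is only an illustration: pick_comp() and top_of() are hypothetical helper names, not part of the module in this chapter, and the sketch dies on an unknown routine name, which is an added assumption.

```perl
#!/usr/bin/perl -w
use strict;

# The same four comparison routines, keyed by name.
my %comp = (
    str  => sub { return $_[0] cmp $_[1] },
    rstr => sub { return $_[1] cmp $_[0] },
    num  => sub { return $_[0] <=> $_[1] },
    rnum => sub { return $_[1] <=> $_[0] },
);

# Hypothetical helper: look up a routine by name, defaulting to 'num'.
sub pick_comp {
    my $name = shift || 'num';
    return $comp{$name} || die "unknown comparison: $name\n";
}

# Hypothetical helper: return the element that the chosen routine
# ranks highest, which is the one a heap would keep on top.
sub top_of {
    my $cmp = pick_comp(shift);
    my $top = shift;
    foreach my $item (@_) {
        $top = $item if $cmp->($item, $top) > 0;
    }
    return $top;
}

print top_of('num',  3, 10, 7), "\n";
print top_of('rnum', 3, 10, 7), "\n";
print top_of('rstr', qw(Brad Andrew John)), "\n";
```

With 'num' the largest number ranks highest, with 'rnum' the smallest, and with 'rstr' the alphabetically first string.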
Unlike our heap model in chapter 17, we wish to be able to insert items into the heap on demand rather than building a heap out of an existing list of items. We will first create a node out of the arguments passed in (a key and a data element) and then call another method to do the actual insertion:

    sub insert {
        my $self = shift;
        my $key  = shift;
        my $data = defined($_[0]) ? shift : $key;
        my $node = { key  => $key,
                     data => $data,
                   };
        $self->_insert_node($node);
    }
Using an underscore to start a method name is just a convention to mark some functions as private to the class. Such methods should never be called from outside the package. Our actual insertion method works as follows. We first increase the size of the heap, which has the effect of adding a new leaf onto our binary tree. We then start moving up the tree from this final element, copying each parent node down one node if it is smaller than the current key. When the next parent node is larger than the current node, we have found the place in the heap to insert the node, which we do.

    sub _insert_node {
        my ($self, $node) = @_;
        my $i    = ++$self->[0]{size};
        my $comp = $self->[0]{comp};
        while ($i > 1 and
               $comp->($self->[parent($i)]{key}, $node->{key}) < 0) {
            $self->[$i] = $self->[parent($i)];
            $i = parent($i);
        }
        $self->[$i] = $node;
    }

When we extract the top element, we don't put the top element at the end of the array. We still need to put something in the top position, so we take the last element of the heap and push it down using the push-down routine from chapter 17.

    sub extract_top {
        my $self = shift;
        return if $self->[0]{size} < 1;
        my $top  = $self->[1]{data};
        my $node = pop @$self;
        $self->[0]{size}--;
        if ($self->[0]{size} > 0) {
            $self->[1] = $node;
            $self->_pushdown(1);
        }
        return $top;
    }
The routine to push an element down the heap is the same as we used in chapter 17, only modified to use the comparison routine.

    sub _pushdown {
        my ($self, $i) = @_;
        my $size = $self->[0]{size};
        my $comp = $self->[0]{comp};
        while ($i <= $size / 2) {
            my $kid = left($i);
            if ($kid < $size and
                $comp->($self->[$kid]{key}, $self->[$kid + 1]{key}) < 0) {
                $kid++;
            }
            last if $comp->($self->[$i]{key}, $self->[$kid]{key}) >= 0;
            ($self->[$i], $self->[$kid]) = ($self->[$kid], $self->[$i]);
            $i = $kid;
        }
    }

The three class methods we use to return the parent, left, and right indices are all extremely simple one-line functions:

    sub parent { return int($_[0] / 2) }
    sub left   { return $_[0] * 2 }
    sub right  { return $_[0] * 2 + 1 }
    1;
    __END__
The following is a simple test script to read in a few lines of colon-separated data and insert anonymous hashes of these records into a heap using the name field as the key. We want to process these records in alphabetical order, so we pass the heap constructor the string rstr to tell it to compare using reverse alphabetical order. Our heap normally puts the largest element on the top, so we need to reverse this idea and put the smallest element on top to get proper alphabetical ordering:

    #!/usr/bin/perl -w
    use strict;
    use Heap;

    my $heap = Heap->new('rstr');
    while (<DATA>) {
        chomp;
        my ($name, $age, $beer) = split /:/;
        $heap->insert($name, { name => $name,
                               age  => $age,
                               beer => $beer,
                             });
    }
    while ($_ = $heap->extract_top()) {
        foreach my $key (keys %$_) {
            print "$key: $_->{$key}\n";
        }
        print "\n";
    }
    __DATA__
    Brad:37:ale
    Andrew:35:ale
    Susanna:40:lager
    John:33:stout
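For readers who want to experiment without the module file, the following is a compact, self-contained sketch of the same idea: an array-based heap whose ordering comes from a comparison subroutine. It is a simplified stand-in, not the Heap module shown in this section. The names heap_new(), heap_insert(), and heap_extract() are hypothetical, the heap stores bare keys rather than key and data nodes, and the comparator is kept directly in slot 0.

```perl
#!/usr/bin/perl -w
use strict;

# Slot 0 holds the comparison routine; elements live in slots 1..n.
sub heap_new { return [ shift ] }

# Add a key as a new leaf, then move it up while it outranks its parent.
sub heap_insert {
    my ($heap, $key) = @_;
    my $cmp = $heap->[0];
    my $i   = @$heap;                  # index of the new last slot
    while ($i > 1 and $cmp->($heap->[int($i / 2)], $key) < 0) {
        $heap->[$i] = $heap->[int($i / 2)];
        $i = int($i / 2);
    }
    $heap->[$i] = $key;
}

# Remove and return the top key, then push the last leaf down from the top.
sub heap_extract {
    my $heap = shift;
    return if @$heap < 2;              # nothing but the comparator
    my $cmp  = $heap->[0];
    my $top  = $heap->[1];
    my $last = pop @$heap;
    my $size = $#$heap;                # elements remaining
    return $top if $size < 1;
    my $i = 1;
    while (2 * $i <= $size) {
        my $kid = 2 * $i;
        $kid++ if $kid < $size
              and $cmp->($heap->[$kid], $heap->[$kid + 1]) < 0;
        last if $cmp->($last, $heap->[$kid]) >= 0;
        $heap->[$i] = $heap->[$kid];
        $i = $kid;
    }
    $heap->[$i] = $last;
    return $top;
}

# Reverse string comparison puts the alphabetically first name on top.
my $heap = heap_new(sub { return $_[1] cmp $_[0] });
heap_insert($heap, $_) foreach qw(Brad Andrew Susanna John);
my @order;
while (defined(my $name = heap_extract($heap))) {
    push @order, $name;
}
print "@order\n";
```

Swapping the comparator for a numeric one (for example, sub { $_[0] <=> $_[1] }) turns the same code into a largest-first heap without touching the insert or extract routines.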
A heap can be used to implement what is known as a priority queue, where you can store a list of things to process according to their priority. These queues are usually dynamic; new elements are inserted while elements from the queue are processed. In the real world, you could consider a hospital emergency room as using a priority queue to process patients. Patients are not treated in order of arrival, but according to the severity of their physical condition. In the computing world, you may have heard of priority queues being used to manage job scheduling on a multi-user computer or as the means of holding jobs in a printer queue. Different users may have different priorities on the system, and their tasks are entered into the queues based on their priority level.

19.2 Grades: an object example

In this section, we will no longer be dealing with standard computer science types of data structures. Instead, we will formulate our own structures to use as objects for a more familiar kind of data processing task.

Recall the problem given at the beginning of chapter 8. We needed to create a report of assignment grades for each student in a class. Here we will consider a more generalized version of the same problem. We want to be able to store grades for students in a given course and retrieve current summary information for each student in the course. As before, we want to be able to enter a student's score for a particular assignment in a plain text file just as these assignments come in and are graded. (Perhaps the exams are performed on the web and automatically graded and appended to this file.) We also want to be able to use our program for more than one course. In this example, we consider two courses: Math-101 and Math-201.

To begin, we define a configuration file format that can be used for each course. The format will specify the total score possible for each assignment and the value of each assignment's contribution to the final course grade. For example, in our Math-101 course, we may decide that we will give three assignments, each scored out of 50, but counting for only 25 percent of the final mark; and one exam marked out of 75 and counting for 25 percent of the final mark. We create our configuration file math-101.cfg as colon-separated records, one for each type of graded work. Each record contains fields for the type of graded work (Assign or Exam), the assignment or exam number, the raw score it is marked out of, and the amount it contributes toward the final grade (expressed simply as a number out of 100). Thus, our math-101.cfg file looks like this:

    Assign:1:50:25
    Assign:2:50:25
    Assign:3:50:25
    Exam:1:75:25

When we record the grades in our grade file for the course (named here math-101) we need to record the student name, the assignment number (1, 2, or 3 for assignments, or E1 for the 1st exam), and the raw score the student obtained. So part of the data file for this class might look like this:

    Bill Jones:2:35
    Anne Smith:3:41
    Sara Tims:2:45
    Sara Tims:3:39
    Bill Jones:1:42
    Bill Jones:E1:72
    Anne Smith:1:42
    Sara Tims:1:41
    Anne Smith:2:47
    Bill Jones:3:41
    Anne Smith:E1:69
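To make the two file formats concrete, the sketch below parses a few configuration and grade records held in strings and computes each score's contribution as score / total * weight. This is an illustration only: the variable names, the nested-hash layout, and the two-decimal formatting are assumptions of the sketch, not the classes developed in this chapter.

```perl
#!/usr/bin/perl -w
use strict;

my @cfg_lines   = ('Assign:1:50:25', 'Exam:1:75:25');
my @grade_lines = ('Bill Jones:1:42', 'Bill Jones:E1:72');

# Build $cfg{type}{number} = [total, weight] from the config records.
my %cfg;
foreach my $line (@cfg_lines) {
    my ($type, $num, $total, $weight) = split /:/, $line;
    $cfg{$type}{$num} = [$total, $weight];
}

# Convert each grade record to its contribution to the final mark.
my @report;
foreach my $line (@grade_lines) {
    my ($name, $num, $score) = split /:/, $line;
    # A leading E marks an exam; anything else is an assignment.
    my $type = ($num =~ s/^E//) ? 'Exam' : 'Assign';
    my ($total, $weight) = @{ $cfg{$type}{$num} };
    my $adjusted = sprintf "%.2f", $score / $total * $weight;
    push @report, "$name $type $num: $score/$total -> $adjusted/$weight";
}
print "$_\n" foreach @report;
```

A score of 42 out of 50 on an assignment worth 25 points contributes 21.00 points; 72 out of 75 on the exam contributes 24.00.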
Now, let's look at the types of things we have in our problem statement: we have courses, students, and assignments (or exams). Let's consider the most general category first, the course category. A given course has a course name, and it contains a list of its students. Thus, we also create a file, with a .std extension, containing the names of all the students in a given class. So, for our Math-101 course, we have the file math-101.std containing

    Bill Jones
    Anne Smith
    Sara Tims
    Frank Howza

So, how might we want to use this course class in our reporting program? Well, keeping things simple and assuming we just want to create a report for the whole class, we can imagine a program that is invoked with the course name as an argument:
    #!/usr/bin/perl -w
    use strict;
    use Course;
    @ARGV || die "You must supply a course name.\n";
    my $class = Course->new($ARGV[0]);
    while (<>) {
        chomp;
        $class->add_student_record(split /:/);
    }
    $class->print_report();
    __END__
Remember, our data file of grades has the same name as the course, so the data file is in $ARGV[0] when the program is called. This program creates a new course, iterates through the data file adding student records to the course, and then prints out a report of the course grades. Presumably, the Course class knows how to add student records and print the report. We make our Course class use a Student class and have it store a list of student objects, one for each name in the class list. Let's begin building our Course class:
    package Course;
    use Student;
    use strict;
    sub new {
        my $type   = shift;
        my $class  = ref($type) || $type;
        my $course = shift;
        my $self   = { course   => $course,
                       number   => 0,
                       students => {},
                     };
        bless $self, $class;
        $self->_configure_course();
        return $self;
    }
This constructor creates a Course object that contains fields for the name of the course, and the number of students, as well as a field that contains an empty hash reference that holds a hash of student names and the student objects.

Much of the real work is done within the configuration routine we call near the end of the constructor. This routine is responsible for reading in the configuration file for the given course and the student list. Thus the routine reads the file and builds a configuration hash that holds the information about each assignment or exam. It creates a new student object for each student in the student list.

    sub _configure_course {
        my $self     = shift;
        my $cfg_file = $self->{course} . '.cfg';
        my %cfg;
        open(CFG, $cfg_file) || die "can't open $cfg_file: $!";
        while (<CFG>) {
            chomp;
            my ($type, @data) = split /:/;
            $cfg{$type}[$data[0]] = [ @data[1,2] ];
            $cfg{$type . '_no'}++;
        }
        close CFG;
        my $stud_file = $self->{course} . '.std';
        open(STD, $stud_file) || die "can't open $stud_file: $!";
        while (<STD>) {
            chomp(my $name = $_);
            $self->{students}{$name} = Student->new(\%cfg, $name);
            $self->{number}++;
        }
        close STD;
    }
We can now add a few accessor type functions to retrieve the data in our course object. These are pretty straightforward:

    # course(): returns the course name
    sub course {
        my $self = shift;
        return $self->{course};
    }

    # number(): returns the number of students in the course
    sub number {
        my $self = shift;
        return $self->{number};
    }

    # student(name): returns the student object associated
    # with the given name
    sub student {
        my $self = shift;
        my $name = shift;
        return $self->{students}{$name} || undef;
    }

    # list(): returns the sorted list of student names
    sub list {
        my $self = shift;
        my @list = map  { $_->[0] }
                   sort { $a->[2] cmp $b->[2] }
                   map  { [$_, split] }
                   keys %{ $self->{students} };
        return @list;
    }
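The map-sort-map chain in list() may be easier to follow in isolation. The sketch below applies the same steps to a plain list of names; by_last_name() is a hypothetical helper name, and like the original it assumes each name is exactly a first name and a last name separated by whitespace.

```perl
#!/usr/bin/perl -w
use strict;

# Read the chain from the bottom up.
sub by_last_name {
    return map  { $_->[0] }                 # 3. keep only the full name
           sort { $a->[2] cmp $b->[2] }     # 2. compare the last names
           map  { [ $_, split ' ', $_ ] }   # 1. build [full, first, last]
           @_;
}

my @names = ('Bill Jones', 'Anne Smith', 'Sara Tims', 'Frank Howza');
print join(", ", by_last_name(@names)), "\n";
```

Splitting each name once, before the sort, avoids repeating the split inside every comparison.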
We need to add the functionality we saw in our main program; namely, we need to be able to add student records and print a class report. For these functions, we assume that the student objects themselves have functions for adding assignments and printing their own reports:

    sub add_student_record {
        my $self    = shift;
        my $name    = shift;
        my $student = $self->student($name);
        $student->add_assignment(@_);
    }

    sub print_report {
        my $self   = shift;
        my $course = $self->course();
        my $number = $self->number();
        print "Class Report: course = $course: students = $number\n";
        foreach my $name ($self->list()) {
            $self->student($name)->print_report();
        }
    }
    1;
    __END__
Don't forget to end your module with a true statement such as the 1 shown above. At this point, we know that we will be passing the configuration hash to the Student constructor, and we know that class will need a method for adding assignments to the student and printing a report on the student. Just as our Course class used the Student class, our Student class makes use of an Assignment class, which we will define later:

    package Student;
    use Assignment;
    use strict;
    sub new {
        my $type  = shift;
        my $class = ref($type) || $type;
        my $cfg   = shift;
        my $name  = shift;
        my $self  = { config      => $cfg,
                      name        => $name,
                      assignments => 0,
                    };
        bless $self, $class;
        return $self;
    }
There is nothing much new in this constructor; we have fields for a reference to the configuration hash and the student's name passed when each new student is created. We also have a field to hold the number of assignments this student has completed.

The method to add an assignment creates new fields in the object: one holding an anonymous array of assignments, and the other holding an anonymous array of exams. These arrays are populated with new Assignment objects, and the assignment count is incremented:
    sub add_assignment {
        my $self = shift;
        my ($num, $score) = @_;
        my $cfg = $self->{config};
        my $type;
        if ($num =~ s/^E//) {
            $type = 'Exam';
        } else {
            $type = 'Assign';
        }
        $self->{$type}[$num] = Assignment->new($cfg, $type, $num, $score);
        $self->{assignments}++;
    }
We also add a couple of accessor functions to this class to return lists of assignment objects (either assignments or exams). We use the grep() function to discard possibly empty elements in either array:

    sub get_assigns {
        my $self = shift;
        return grep {$_} @{ $self->{Assign} };
    }

    sub get_exams {
        my $self = shift;
        return grep {$_} @{ $self->{Exam} };
    }
Our function for printing a student's report is longer, but not complicated. It assumes that a given assignment knows how to print its own report and only needs to maintain a running total of each assignment's contribution toward the final mark (its fscore):

    sub print_report {
        my $self = shift;
        my $cfg  = $self->{config};
        print $self->{name}, ":\n";
        unless ($self->get_assigns() || $self->get_exams()) {
            print "\tNo records for this student\n";
            return;
        }
        my ($ftotal, $a_count, $e_count) = (0, 0, 0);
        foreach my $assign ($self->get_assigns()) {
            print "\t";
            $assign->print_report();
            $ftotal += $assign->fscore();
            $a_count++;
        }
        foreach my $assign ($self->get_exams()) {
            print "\t";
            $assign->print_report();
            $ftotal += $assign->fscore();
            $e_count++;
        }
        if ($cfg->{Assign_no} == $a_count and
            $cfg->{Exam_no}   == $e_count) {
            print "\tFinal Course Grade: $ftotal/100\n";
        } else {
            print "\tIncomplete Record\n";
        }
        print "\n";
    }
    1;
    __END__
Finally, we have our Assignment class. The constructor for this class is fairly simple, passing off the hard work to its own _assign() function:

    package Assignment;
    sub new {
        my $type  = shift;
        my $class = ref($type) || $type;
        my $cfg   = shift;
        my $self  = { config => $cfg };
        bless $self, $class;
        $self->_assign(@_);
        return $self;
    }
    ...
        print ..., ": raw = ", $self->{score}, "/", $self->{raw},
              ": Adjusted = ", $self->{fscore}, "/", $self->{final}, "\n";
    }
    1;
    __END__
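The listing above does not show in full how the adjusted score is worked out. The following is a minimal sketch of an assignment class that is consistent with the report format and the configuration hash used in this chapter. It is a reconstruction under stated assumptions, not the book's code: the package name MyAssignment, the type and number field names, the label mapping from Assign to Assignment, and the report_line() method are all hypothetical, and the configuration hash is assumed to map a type to an array of [total, weight] pairs indexed by number.

```perl
#!/usr/bin/perl -w
use strict;

package MyAssignment;    # hypothetical stand-in for the Assignment class

sub new {
    my $type  = shift;
    my $class = ref($type) || $type;
    my $cfg   = shift;
    my $self  = { config => $cfg };
    bless $self, $class;
    $self->_assign(@_);
    return $self;
}

# Assumed layout: $cfg->{Assign}[1] = [50, 25] means assignment 1 is
# marked out of 50 and is worth 25 points of the final grade.
sub _assign {
    my ($self, $type, $num, $score) = @_;
    my ($raw, $final) = @{ $self->{config}{$type}[$num] };
    $self->{type}   = $type eq 'Assign' ? 'Assignment' : $type;
    $self->{number} = $num;
    $self->{score}  = $score;
    $self->{raw}    = $raw;
    $self->{final}  = $final;
    # Scale the raw score to its weight and round to two decimals.
    $self->{fscore} = sprintf "%.2f", $score / $raw * $final;
}

sub fscore { return $_[0]->{fscore} }

# Returns the report text rather than printing it, to ease testing.
sub report_line {
    my $self = shift;
    return "$self->{type} $self->{number}: raw = $self->{score}/$self->{raw}"
         . ": Adjusted = $self->{fscore}/$self->{final}\n";
}

package main;

my %cfg  = (Assign => [undef, [50, 25]], Exam => [undef, [75, 25]]);
my $work = MyAssignment->new(\%cfg, 'Assign', 1, 42);
print $work->report_line();
```

Storing the rounded fscore once, at construction time, means a report total built by summing fscore values always agrees with the per-line figures that were printed.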
Now we can run our original program given at the start of this section. We already have the data file named math-101, as well as the math-101.cfg and the math-101.std files, in the same directory. We assume we save the program in a file named report.pl and that we have all three of the modules created. We pass it the name of our course, math-101, and run our program:

    $ perl report.pl math-101
    Class Report: course = math-101: students = 4
    Frank Howza:
            No records for this student
    Bill Jones:
            Assignment 1: raw = 42/50: Adjusted = 21.00/25
            Assignment 2: raw = 35/50: Adjusted = 17.50/25
            Assignment 3: raw = 41/50: Adjusted = 20.50/25
            Exam 1: raw = 72/75: Adjusted = 24.00/25
            Final Course Grade: 83/100

    Anne Smith:
            Assignment 1: raw = 42/50: Adjusted = 21.00/25
            Assignment 2: raw = 47/50: Adjusted = 23.50/25
            Assignment 3: raw = 41/50: Adjusted = 20.50/25
            Exam 1: raw = 69/75: Adjusted = 23.00/25
            Final Course Grade: 88/100

    Sara Tims:
            Assignment 1: raw = 41/50: Adjusted = 20.50/25
            Assignment 2: raw = 45/50: Adjusted = 22.50/25
            Assignment 3: raw = 39/50: Adjusted = 19.50/25
            Incomplete Record
You'll notice that the student name Frank Howza has no records. He appears in our class list file, but there were no assignments or exams for him in the data file. Similarly, Sara Tims has no record for her exam, so her report is marked as incomplete.

If we had another course with a completely different set of assignments (and perhaps different students), we could simply create a new configuration file and class list file for that course's data and run the same program on it. For example, our Math-201 course might have only two assignments, both out of 50 points, but with the first one making up 25 percent and the second only 10 percent of the final grade. There are also two exams, both out of 75, with the first contributing 20 percent and the second exam contributing 45 percent to the final grade. The configuration file would look like this:

    Assign:1:50:25
    Assign:2:50:10
    Exam:1:75:20
    Exam:2:75:45
For simplicity's sake, we assume the same class list applies. The data file of grades now appears as

    Bill Jones:1:43
    Sara Tims:2:32
    Sara Tims:1:44
    Anne Smith:1:44
    Anne Smith:2:39
    Bill Jones:E1:75
    Sara Tims:E1:69
    Anne Smith:E1:70
    Bill Jones:2:40
    Bill Jones:E2:75
    Sara Tims:E2:69
    Anne Smith:E2:70
Running our same program with an argument of math-201 now produces a full report for this new course:

    $ perl report.pl math-201
    Class Report: course = math-201: students = 4
    Frank Howza:
            No records for this student
    Bill Jones:
            Assignment 1: raw = 43/50: Adjusted = 21.50/25
            Assignment 2: raw = 40/50: Adjusted = 8.00/10
            Exam 1: raw = 75/75: Adjusted = 20.00/20
            Exam 2: raw = 75/75: Adjusted = 45.00/45
            Final Course Grade: 94.5/100

    Anne Smith:
            Assignment 1: raw = 44/50: Adjusted = 22.00/25
            Assignment 2: raw = 39/50: Adjusted = 7.80/10
            Exam 1: raw = 70/75: Adjusted = 18.67/20
            Exam 2: raw = 70/75: Adjusted = 42.00/45
            Final Course Grade: 90.47/100

    Sara Tims:
            Assignment 1: raw = 44/50: Adjusted = 22.00/25
            Assignment 2: raw = 32/50: Adjusted = 6.40/10
            Exam 1: raw = 69/75: Adjusted = 18.40/20
            Exam 2: raw = 69/75: Adjusted = 41.40/45
            Final Course Grade: 88.2/100
Because the reporting functionality is built into the objects themselves, we can modify the program to provide an interactive query for individual students (or even individual assignments, if we desired). Here is a version that allows you to query the data for individual student reports:

    #!/usr/bin/perl -w
    use strict;
    use Course;
    @ARGV || die "You must supply a course name.\n";
    my $class = Course->new($ARGV[0]);
    while (<>) {
        chomp;
        $class->add_student_record(split /:/);
    }
    while (1) {
        print "Enter a student name [or 'l' for list; 'q' to quit]: ";
        chomp(my $name = <STDIN>);
        last if $name =~ m/^q$/;
        print join("\n", $class->list()), "\n" and next if $name eq 'l';
        if ($class->student($name)) {
            $class->student($name)->print_report();
        } else {
            print "no student by that name\n";
        }
    }
By using a configuration file to define the marking scheme for the assignments and exams, rather than hard-coding them into a script, we not only achieve the generality to use these programs with other courses, but to change our course configuration as well. Occasionally, a teacher will decide that a particular assignment shouldn't count for as much as originally intended. With this system, the teacher can simply reduce the final contribution amount for that assignment and increase the values of one or more of the others respectively in the configuration file. The teacher can then simply rerun the reporting program to see the new results.

19.3 Exercises

1 Another handy feature to have in our grade-tracking system would be the ability to read in data on multiple courses and then query the data base to retrieve and print a record for a given student for all the courses in which that student is registered. Try adding this feature.

2 Examine any programs you may have written while working through this book to see if your data could be modeled in a more object-oriented fashion. Create classes to represent the data and behaviors, and rewrite your program to use these objects.
CHAPTER 20
What's left?

What's left? A lot! You have nearly reached the end of this book, but you're not at the end of the road by any means. As Larry Wall quotes in Perl's own source code, "The road goes ever on and on." As I stated in the introductory chapter, Perl is a large language, and although we have covered a great deal in a short space, there is quite a bit more Perl to discover.

One thing we never discussed is how to use Perl as a command line tool. But we haven't left that out altogether. Appendix A provides a brief overview of some of the more common command line switches with a few examples of running Perl from the command line. Appendix B provides a brief reference on a few of Perl's special built-in variables. (See the perlvar pod-page for a complete listing.)

We did a little bit of network programming using the LWP::Simple module in chapter 14. Perl has low-level socket programming with the socket() function and several other network related functions, as well as a more convenient socket programming interface via the IO::Socket module. To find out more about network programming, you can see the perlipc pod-page (ipc for inter-process communication).

Did you ever want to have your program read in raw source code during run-time and then execute it within your program? Perl's eval() function can be used to do this sort of thing. It can also be used to trap errors raised by other subroutines. Read about eval() in the perlfunc pod-page. Advanced Perl Programming by Sriram Srinivasan [1] has a very good chapter on the uses of eval().

Perl can also do more than just open, read, and close files. With Perl, you can rename() and unlink() (delete) files, change their permissions (chmod()) and ownership (chown()), and create and remove directories (mkdir() and rmdir()) as well. You can also fork() children processes (if your system supports forking), use DBM databases (several different flavors), and even more yet. And, as I keep stressing, CPAN contains hundreds of additional modules providing all manners of additional functionality; never be afraid to look to CPAN for easy solutions. Indeed, if I tried to catalogue everything you can do with Perl, the list would be notable only for what it left out.

To help you further your learning, Appendix C provides a brief list of other Perl resources. This is not an exhaustive list, merely a few signposts in a vast landscape that I hope you will enjoy exploring as much as I do.

My original plan was to leave off by reviewing some of what I think are important rules or guidelines in the design, style, and coding of your programs. But Perl isn't about following the rules and rigidly adhering to the conventions of others, and I know I've broken a few rules in this book that others might consider sacred. So if I have any wisdom at all to impart in my final remarks, it is just to do whatever works for you, and to have fun doing it.

As Larry Wall once said while writing about the natural language aspect of Perl, "You don't learn a natural language even once, in the sense that you never stop learning it."

[1] Srinivasan, Sriram. Advanced Perl Programming. Sebastopol, CA: O'Reilly and Associates, 1997.
APPENDIX A
Command line switches

You are already familiar with the most important command line switch, the -w switch, which turns on warnings. Quite a few additional command line switches exist, many of which are designed to facilitate using Perl as a command line tool. By command line tool, I mean invoking Perl directly from the command line of the shell and supplying one or more statements to perform. Consider the following one liner:

    perl -pi.bak -e 's/red/blue/g' filenames

This combines several switches and takes all the filenames given and replaces every occurrence of red with blue. It also saves the original files with a .bak extension. In this appendix, we will briefly describe the most useful command line switches. (See perldoc perlrun for additional switches and further information.)
-c
When used, Perl will compile the script and check the syntax, but will not run the program. This is a useful first step in testing your program for syntax errors without actually running it.

    $ perl -c programname

-e commandline
Takes commandline, which may be several statements, as the script to run.

    $ perl -e '$blah = shift @ARGV; print "$blah\n"' argument
    argument
-p
This switch causes Perl to construct the following loop around your script (whether your script is in a file or given in a -e argument):

    while (<>) {
        # your script
    } continue {
        print;
    }

Hence, the following three examples are all the same:

    $ perl -p -e 's/red/blue/g' filename

    #!/usr/bin/perl -p
    s/red/blue/g;

    #!/usr/bin/perl
    while (<>) {
        s/red/blue/g;
    } continue {
        print;
    }
-n
Like -p except without the continue block.

    $ perl -n -e 'print if m/foo/' filename

This prints all lines containing the pattern foo in the file filename.
-i[extension]
This means that all the files given on the command line are to be edited in place. In other words, changes are written to the files themselves. If given the optional extension, then the original files are saved with that extension:

    $ perl -p -i.bak -e 's/red/blue/g' filenames

or with the switches combined:

    $ perl -pi.bak -e 's/red/blue/g' filenames
-a
Used with -n or -p to turn on autosplit mode. This causes a split(' ') to be performed as the first statement in the implied loop, and the results of the split are assigned to the array @F. The following three examples are equivalent and will print the second field in each line of data in the given file:

    $ perl -a -n -e 'print "$F[1]\n"' filename
    $ perl -ane 'print "$F[1]\n"' filename

    #!/usr/bin/perl
    while (<>) {
        @F = split ' ';
        print "$F[1]\n";
    }
-Fpattern
Allows you to supply an alternate pattern to split on in autosplit mode (when using -a). To split each line on colons instead of the default whitespace:

    $ perl -an -F/:/ -e 'print "$F[1]\n"' filename
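Written out as a script, the one-liner above behaves roughly like the following sketch. To keep it self-contained it reads from an in-memory file handle rather than a named file (this assumes Perl 5.8 or later); the sample data and the @second array are illustrative additions, not part of what the switches generate.

```perl
#!/usr/bin/perl -w
use strict;

# Sample colon-separated data standing in for the named file.
my $data = "Brad:37:ale\nAndrew:35:ale\n";
open(my $in, '<', \$data) || die "can't open in-memory file: $!";

my @second;
while (<$in>) {
    my @F = split /:/;       # what -a with -F/:/ does for each line
    push @second, $F[1];     # kept only so the result can be inspected
    print "$F[1]\n";
}
close $in;
```

Note that autosplit does not chomp the line, so the last field of each record would still carry its trailing newline.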
-d
This invokes the Perl debugger on the script (see chapter 15).

-Mmodule
Allows you to use a module from the command line:

    $ perl -MCPAN -e 'shell'

Invokes Perl and calls use CPAN; before executing the statement shell, which, under the CPAN.pm module, puts you into an interactive shell mode.
-v
Prints the Perl version information.

-V
Prints a detailed summary of the configuration details used when compiling Perl and prints out the value of the @INC array.
APPENDIX B
Special variables

(See perldoc perlvar for further information.)

Table B.1 Special variables

    Variable   Description
    $_         Default variable for input and pattern matching.
    $.         The current line number of the current or last file handle read.
    $/         Input record separator. Default is a newline.
    $\         Output record separator. Default is empty (nothing is appended).
    $"         List separator. Value printed between items in an array when it
               is interpolated in a double-quoted string. Default is a space.
    $0         The current program name.
    $^W        Current warning value. You can set this within a script to turn
               warnings off and on for particular blocks of code.
    $ARGV      Current file being read from <>.
    @ARGV      Command line arguments.
    @INC       Search paths for use and require statements.
    @F         The autosplit array.
    %ENV       Hash of current environment variables, may be set to change the
               environment for spawned processes.
    %SIG       Hash of signal handlers. (See perldoc perlipc.)
    ARGV       File handle that iterates over @ARGV, also specified with the
               empty input operator: <>
    STDIN      Standard input file handle.
    STDOUT     Standard output file handle.
    DATA       Special file handle referring to data following an __END__ or
               __DATA__ token.
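A few of these variables are easiest to understand by changing them inside a small block. The sketch below is an illustration only: the sample data and variable names are made up, and it reads from an in-memory file handle, which assumes Perl 5.8 or later.

```perl
#!/usr/bin/perl -w
use strict;

my @beers = qw(ale lager stout);

# $" is inserted between array elements interpolated in a string.
my $default = "@beers";
my $custom;
{
    local $" = ', ';
    $custom = "@beers";
}

# $/ decides where a record ends when reading; $. counts records read.
my $data = "Brad:37;Andrew:35;John:33;";
open(my $fh, '<', \$data) || die "can't open in-memory file: $!";
my ($last_record, $count);
{
    local $/ = ';';
    while (my $record = <$fh>) {
        chomp $record;           # chomp removes the current $/ value
        $last_record = $record;
        $count = $.;
    }
}
close $fh;

print "$default\n$custom\n$last_record is record $count\n";
```

Using local inside a bare block restores the previous value of each special variable when the block ends, so the change does not leak into the rest of the program.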
A
F
P
E
N
Additional resources Your
first
line
of investigation should be the documentation and FAQs that are
included with your distribution of Perl, but here are a few additional resources for learning
more about
Perl.
Newsgroups comp.lang.perl.misc.
The primary forum
for discussions
and questions regarding
the Perl language. comp.lang.perl.modules.
A forum for discussions and questions
relating to the copi-
ous existing modules as well as issues surrounding creating your comp. langperl. moderated.
modules.
A moderated forum for Perl discussions.
comp.lang.perl.tk. Discussions involving using Perl
comp.lang.perl.announce.
own
Announcements
with theTk graphical
relevant to the Perl
interface.
community.
Web pages

www.perl.com. Your starting place for exploring the world of Perl. From here you can find links to the Perl documentation, CPAN, and lists of other resources.

reference.perl.com. A reference list of modules, tutorials and other Perl-related things.

www.perl.org. The Perl Institute's homepage, another good place to find Perl news and information. At the time of this writing, the Perl Institute had just dissolved and was being passed on to the Perl Mongers, but the web address will probably remain the same. (If not, check the Perl Mongers page given below.)

theory.uwinnipeg.ca/search/cpan-search.html. A search engine for searching the CPAN archives.

www.pm.org. The Perl Mongers homepage. Visit here to find a Perl Mongers group in your area or to start your own if one doesn't exist near you.
Books and magazines

Christiansen, Tom, Randal Schwartz, and Larry Wall. Programming Perl, 2nd ed. Sebastopol, CA: O'Reilly and Associates, 1996. Also known as "the camel book" (because of the animal on the cover), this is the reference book on the Perl language. It is somewhat out of date (current to Perl version 5.003), and much of the text is duplicated in the included documentation.

Christiansen, Tom, and Nathan Torkington. The Perl Cookbook. Sebastopol, CA: O'Reilly and Associates, 1998. This book is chock-full of recipes and examples and is an asset for any Perl programmer's library.

Cormen, Thomas, Charles Leiserson, and Ronald Rivest. Introduction to Algorithms. Cambridge, MA: MIT Press, 1990. This is an outstanding introduction to algorithms and data structures. If you aren't interested in the analysis of algorithms, the data structures presented later are still easily understood.

The Perl Journal. This quarterly journal is an excellent resource with articles ranging from beginner to advanced to the simply whimsical. See http://tpj.com/ for subscription details and contents of previous issues.
Numeric formats

Our standard number system uses base-10 numbers. If you consider a number such as 42, it really means 4 tens and 2 ones. The rightmost position counts ones and each position leftward in a number counts units 10 times (or base times) greater than the previous position. Thus, 342 is 3 hundreds and 4 tens and 2 ones.

There are three other common bases used for numeric data in computer science: binary, octal, and hexadecimal (usually just called hex). These represent base 2, base 8, and base 16, respectively. Hexadecimal is a little odd because we cannot count up to 15 using single digits, so we use letters instead: the letters "a" to "f" stand for the numbers 10 to 15. The following table shows several numbers written in each of these base formats.
Table D.1 Numeric formats

Decimal (base 10)   Binary (base 2)   Octal (base 8)   Hex (base 16)
0                   0                 0                0
1                   1                 1                1
2                   10                2                2
3                   11                3                3
4                   100               4                4
5                   101               5                5
6                   110               6                6
7                   111               7                7
8                   1000              10               8
9                   1001              11               9
10                  1010              12               a
11                  1011              13               b
12                  1100              14               c
13                  1101              15               d
14                  1110              16               e
15                  1111              17               f
16                  10000             20               10
17                  10001             21               11
32                  100000            40               20
42                  101010            52               2a
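The conversions in table D.1 can be checked from Perl itself. The following is a minimal sketch, assuming a Perl recent enough to support the %b format in sprintf; the variable names are illustrative only.

```perl
use strict;
use warnings;

my $n = 42;

# sprintf formats a decimal number in another base.
my $binary = sprintf '%b', $n;
my $octal  = sprintf '%o', $n;
my $hex    = sprintf '%x', $n;

# Going the other way: hex() reads a hexadecimal string, and oct()
# reads an octal string, or a binary one when it has a leading "0b".
my $from_hex = hex 'ff';
my $from_oct = oct '17';
my $from_bin = oct '0b1010';

print "$n is $binary in binary, $octal in octal, $hex in hex\n";
print "hex ff = $from_hex, octal 17 = $from_oct, binary 1010 = $from_bin\n";
```

The results for 42 should agree with the last row of the table, and the reverse conversions with the rows for 15 and 10.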
Glossary

absolute path. The full, unadulterated directory path to a file, beginning with the root of the file system.

alias. When one variable represents another variable, it is said to be an alias for that variable.

anchor. In regular expressions, an anchor is a special character, or metacharacter, that matches a particular location in a string as opposed to matching a particular character.

argument. What you give your spouse (or child, or parent) when telling that person what to do. Similarly, an argument is data that you give to a program or subroutine that tells the program or subroutine what to work with and how to proceed.

array. A type of variable that holds an ordered list of data.

autovivification. When something comes into existence automatically just by being used as if it has always been there. For example:

    my $foo;
    $foo->{bar} = 'baz';

In the second statement, $foo is used as if it holds a reference to a hash. Since it did not, an anonymous hash is automatically created and stored in $foo so that we can assign a key and value in that hash.

backreference. Used within a regular expression, a backreference is a special sequence consisting of a backslash followed by an integer that stands for the text matched by a preceding set of capturing parentheses of the same number.
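The autovivification entry can be extended with a slightly larger sketch. The variable names below are hypothetical; the point is that nested structures appear on assignment, and that even a test with exists can bring an intermediate level into being.

```perl
use strict;
use warnings;

my $config;                            # undefined to begin with

# Using $config as a hash reference, and the 'colors' value as an array
# reference, creates both structures automatically.
$config->{colors}[1] = 'green';

my $outer_kind = ref $config;            # type of the outer structure
my $inner_kind = ref $config->{colors};  # type of the nested structure
my $count      = scalar @{ $config->{colors} };  # element 0 exists but is undef

# Looking two levels deep autovivifies the first level as a side effect.
my %seen;
my $found_inner = exists $seen{outer}{inner} ? 1 : 0;
my $outer_now   = exists $seen{outer}        ? 1 : 0;

print "$outer_kind containing $inner_kind of $count elements\n";
print "inner found: $found_inner, outer now exists: $outer_now\n";
```

The last two lines of the example are the usual surprise: the inner key is not found, yet the outer key exists afterwards because Perl had to create it in order to look inside.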
bless. More like a baptism really, blessing is the act of dubbing a reference as belonging to a particular package, providing a basis for Perl's object-oriented capabilities.

block. A structural segment of a program consisting of one or more statements that are delimited in Perl by curly braces.

Boolean. A two-valued (usually true or false) property or variable. Perl does not have Boolean variables, but it does evaluate some expressions (such as conditional expressions) in a Boolean context. This means that, while Perl evaluates the expression to its real value, the condition itself is concerned with only whether the expression is true or false.

byte. A byte is an 8-bit piece of binary data, often representing a character in the ASCII character set.

byte offset. As used with the seek() and tell() functions, a position in a file measured in bytes relative to another position in the file (often the beginning or the previous position).

child class. A class that inherits functionality from a parent class. Also called a derived class.

chunk. Hardcore technical jargon from the noweb literate programming tool. Documentation and code are presented in relatively small segments (usually a paragraph or two, or a handful of lines of code) called chunks.

class. A module that defines data and methods that can be used as an object.

closure. An anonymous subroutine that is deeply bound to the lexical environment in which it was generated.

command line argument. Data explicitly passed to a program when it is invoked from the command line.

concatenation. Joining two or more things, usually strings or files.

constructor. Something used to build or create something else. The [ ] and { } used to create anonymous arrays and hashes can be called constructors (or composers). A class method used to create and return an object is called a constructor.

coupling. The degree of interdependency among components (data structures, functions) within a program and/or its modules.
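The closure entry is easier to picture with a small example. This is a sketch only; make_counter is a hypothetical helper name, and each subroutine it returns keeps its own private copy of $count.

```perl
use strict;
use warnings;

# Returns an anonymous subroutine bound to the lexical variable $count
# that existed when the subroutine was generated.
sub make_counter {
    my $count = shift;
    return sub { return $count++ };
}

my $from_one = make_counter(1);
my $from_ten = make_counter(10);

# Each closure advances its own $count independently of the other.
my @got = ($from_one->(), $from_one->(), $from_ten->(), $from_one->());

print "@got\n";
```

Although $count goes out of scope when make_counter returns, it stays alive for as long as the anonymous subroutine that refers to it does.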
debug. De Volkswagen Beetle is sometimes called de bug. In programming, errors in syntax or logic are referred to as bugs. Debugging is the task of isolating the problems and fixing the code.

declare/declaration. Telling the compiler/interpreter about something (such as a variable or subroutine) rather than telling it to do something (i.e., with a statement).

delimiter. Something, usually a character or string, that demarks the beginning and end of something else, such as a record, a field, or a string. The quotation marks you use around a string in Perl are that string's delimiters.

dereference. To undo the reference in order to follow it to where it points.

dispatch table. A hash table of subroutine references that can be called upon (or dispatched) when the corresponding key is used.

encapsulation. Wrapping up data and functions (or methods) into a neat little package that has a simple interface compared with the code that actually does the work.

evaluate. The act of computing the result of an expression.

expression. Any literal, variable, subroutine, operator, or combination of these that evaluates to a value.

flag. A marker or switch that can be set to cause different actions to take place depending on its value.

function. A named piece of code defined in one place that can be called from someplace else (or many places) in the program. Also called subroutines, especially if the code in question isn't used for its return value.

hash. A type of variable that holds an unordered list of key/value pairs.

heap. A data structure that dynamically maintains a partial ordering of the data it contains.

here-document (here-doc). A form of quoting large, multi-line blocks of text.

inheritance. In object-oriented programming, acquiring characteristics (data and/or methods) from a parent class.

initialize. The act of giving an initial value to a variable.

input. Data that comes into a program or subroutine.
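A dispatch table, as defined in this part of the glossary, might look like the following sketch. The operation names and the calculate wrapper are invented for illustration; any set of keys and subroutine references would do.

```perl
use strict;
use warnings;

# Keys are operation names; values are references to the code to run.
my %dispatch = (
    add      => sub { $_[0] + $_[1] },
    subtract => sub { $_[0] - $_[1] },
    multiply => sub { $_[0] * $_[1] },
);

# Look up the handler for the requested key and call it (dispatch it)
# with the remaining arguments.
sub calculate {
    my ($op, @args) = @_;
    my $handler = $dispatch{$op}
        or die "Unknown operation: $op\n";
    return $handler->(@args);
}

print calculate('add', 2, 3), "\n";
print calculate('multiply', 4, 5), "\n";
```

Adding a new operation then means adding one key to the hash rather than another branch to an if/elsif chain.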
interpolate. Replacing a variable with its value within double-quoted strings.

interpret. In terms of programs, an interpreter reads a program (or a compiled form of a program) and executes the statements contained therein. In terms of creating strings, backslash interpretation is the act of reading certain special sequences of characters as standing for some other (usually nonprintable) character.

iterate. To sequentially step through a list of values (or a list of key/value pairs).

key. In a hash, the key is what you use to look up values.

keyword. Any built-in function name (such as print, index, open) or other named language construct (such as while or if) in the Perl language.

literate programming (LP). A method or style of programming that places emphasis on source code documentation. Various tools exist to assist an author in writing more human readable programs.

loop. A flow of control construct which can cause a statement or series of statements (i.e., a block) to be repeated some number of times.

method. A function that is part of a class.

module. A package, defined in its own file, that provides data or functions (or objects and methods) to a program that uses it.

operator. A built-in function, often represented by one or more special symbols.

output. The data that comes out of a program or function.

package. A namespace where you can define variables and functions that won't interfere with the main:: program's namespace or other packages/modules that it uses.

parameter. See argument.

parent class. A class, usually designed to be general in form, that provides basic functionality to its children classes.

POD. The standard markup syntax for embedding documentation within Perl programs. POD stands for plain old documentation.

polymorphism. The ability to be different. Child classes can redefine behaviors inherited from a parent class. Thus, a parent class can have children of many different shapes (i.e., they are polymorphic).
quantifier. A regular expression term referring to the special symbols that denote some number (possibly zero) of occurrences of the previous character or subexpression.

queue. A data structure providing a first-in, first-out (FIFO) processing order, rather like a line at the DMV, except you'll want to process your queue faster than that.

recursion. If defining something in terms of itself makes for a circular definition, then recursion should be thought of as a spiral definition. Defining something recursively means to rigidly define at least one special case and then a mechanism that can be repeatedly applied to turn other cases into one of the special cases.

reference. A pointer to the real data. Or the address where some piece of data is stored in memory.

regular expression. A way of specifying a set of strings (or substrings) by using a variety of pattern-describing symbols and text.

return value. The final resulting value of an expression or function.

satellite data. The organization of chunks of data is often based on just a small segment or piece of data called a key. The remaining chunk of data may be referred to as the satellite data for that key.

scalar. A type of value that is singular in nature, like a single number, a single string, or a single character. A type of variable in Perl that holds a scalar value.

scope. The range or limits within which a variable is active (i.e., exists). Also referred to as the visibility of the variable.

sentinel value. A value used as a flag or switch that can be used to determine when a process or loop should stop iterating.
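The recursion entry above describes a special case plus a mechanism for reducing every other case toward it. A factorial subroutine is a common illustration; the sketch below assumes a non-negative integer argument and does no input checking.

```perl
use strict;
use warnings;

sub factorial {
    my $n = shift;

    # The rigidly defined special case: 0! and 1! are both 1.
    return 1 if $n <= 1;

    # The mechanism: each call hands a smaller number to the next call,
    # so every case eventually becomes the special case.
    return $n * factorial($n - 1);
}

print "5! = ", factorial(5), "\n";
```

Without the special case the calls would never stop, which is the difference between a spiral and a circle in the glossary's wording.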
shebang. Common name for the #! character pair, also called pound bang. The first line of a Perl script or shell script (and many other interpreted scripts) that contains the path to the executable interpreter is often referred to as the shebang line.

slice. A (possibly non-contiguous) subset of a list of values.

stack. A data structure providing first-in last-out processing order, like a stack of dishes next to your sink. Presumably they were stacked there one at a time on top of each other, and you will then wash them one at a time starting from the top of the pile. If you are like me however, your dishwashing structure may resemble something more like a heap or just a pile.

standard input. The input stream for a program, accessible in Perl via the STDIN file handle.

standard output. The output stream for a program, accessible in Perl via the STDOUT file handle.

statement. Telling the computer to do something, such as add two numbers, compare two values, or print a string.

string. A sequence of character data, possibly empty (e.g., the null string).

subroutine. See function.

subscript. A syntactic construct for accessing or referring to elements of an array or hash. $array[2] uses a subscript of 2 to refer to the value stored in the third element of the array (subscripts count from zero). $hash{key} uses a subscript of key to refer to the value stored under that key in the hash.

substitution operator. Like the match operator but also able to replace matched portions of the target string with replacement text.

syntax. How various symbols can be legitimately put together in a given language.

tangle. In literate programming, this is the process of extracting and assembling the program code into a format ready to be compiled and/or interpreted.

variable. A named location of memory where data may be stored and retrieved by name.

weave. In literate programming, this is the process of creating the formal typeset program documentation including any cross reference information specified in the source file.
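The stack and queue entries in this glossary differ only in which end the data leaves from, and an ordinary Perl array can play either role. This is a minimal sketch with made-up data; it uses the built-in push, pop, and shift functions.

```perl
use strict;
use warnings;

# Stack: first-in last-out. Items are added and removed at the same end.
my @stack;
push @stack, 'plate 1', 'plate 2', 'plate 3';
my $top = pop @stack;             # the most recently added plate

# Queue: first-in first-out. Items are added at one end, removed at the other.
my @queue;
push @queue, 'first in line', 'second in line', 'third in line';
my $served = shift @queue;        # the earliest arrival

print "washed: $top; still stacked: ", scalar(@stack), "\n";
print "served: $served; still waiting: ", scalar(@queue), "\n";
```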
www.manning.com/johnson

PROGRAMMING LANGUAGES, PERL

Elements of Programming with Perl
Andrew L. Johnson

Here's a complete introduction to programming using Perl, written so it's accessible to those learning Perl as their first programming language.

With examples ranging from a useful utility to search the Perl FAQs to a web client for tracking and charting stock quotes to an object-oriented student grading system, this book offers you a practical, hands-on approach to learning programming the Perl way.

What's inside
• Style and design issues and the software development cycle
• Full descriptions of Perl's data types, variables, operators, and control structures
• In-depth coverage of Perl's regular expressions
• References and nested data structures
• Documentation and Literate Programming
• Text and list manipulation
• Debugging techniques
• Using and writing modules
• Object-oriented programming and abstract data structures

Andrew L. Johnson has more than 15 years of programming experience. He has published articles in scientific journals and the Linux Journal, and has taught and tutored students in both beginning and advanced programming topics. Andrew has a master's degree in anthropology.

"... straightforward enough for use by the casual reader but complete enough to stand alone as an excellent first learning tool."
Jim Esten, Lead Developer, WebDynamic

"I found myself saying, 'Aha, so THAT'S what that damn obscure Perl doc was trying to say!'"
Brad Fenwick, Software Engineer, Xview Solutions

"... very useful to a newcomer to both Perl and programming."
Randy Kobes, Professor of Physics, University of Winnipeg

"Johnson is very clear, and that sets this book apart from the others."
Patrick Gardella, Vice President, Whetstone Logic, Inc.

"... the only book I've seen so far that teaches programming through Perl. I would certainly recommend it to a newcomer to programming."
Brad Murray, Senior Software Analyst, Alcatel Canada

Author responds on the Web to questions from our readers
Source code available online

MANNING
$34.95 US/$50.95 Canada
ISBN 1-884777-80-5