Linear algebra and its applications 1292092238, 9781292092232

English Pages 576 [579] Year 2015;2016

Linear Algebra and Its Applications
Fifth Edition, Global Edition

David C. Lay
University of Maryland, College Park

with

Steven R. Lay
Lee University

and

Judi J. McDonald
Washington State University

Boston Columbus Indianapolis New York San Francisco Amsterdam Cape Town Dubai London Madrid Milan Munich Paris Montreal Toronto Delhi Mexico City Sao Paulo Sydney Hong Kong Seoul Singapore Taipei Tokyo

Editorial Director: Chris Hoag
Editor in Chief: Deirdre Lynch
Acquisitions Editor: William Hoffman
Acquisitions Editor, Global Editions: Murchana Borthakur
Editorial Assistant: Salena Casha
Program Manager: Tatiana Anacki
Project Manager: Kerri Consalvo
Project Editor, Global Editions: K.K. Neelakantan
Senior Production Manufacturing Controller, Global Editions: Trudy Kimber
Program Management Team Lead: Marianne Stepanian
Project Management Team Lead: Christina Lepre
Media Producer: Jonathan Wooding
Media Production Manager, Global Editions: Vikram Kumar
TestGen Content Manager: Marty Wright
MathXL Content Developer: Kristina Evans
Marketing Manager: Jeff Weidenaar
Marketing Assistant: Brooke Smith
Senior Author Support/Technology Specialist: Joe Vetere
Rights and Permissions Project Manager: Diahanne Lucas Dowridge
Procurement Specialist: Carol Melville
Associate Director of Design: Andrea Nix
Program Design Lead: Beth Paquin
Cover Design: Lumina Datamatics
Cover Image: Liu fuyu/123RF.com

Pearson Education Limited
Edinburgh Gate
Harlow
Essex CM20 2JE
England

and Associated Companies throughout the world

Visit us on the World Wide Web at: www.pearsonglobaleditions.com

© Pearson Education Limited 2016

The rights of David C. Lay, Steven R. Lay, and Judi J. McDonald to be identified as the authors of this work have been asserted by them in accordance with the Copyright, Designs and Patents Act 1988.

Authorized adaptation from the United States edition, entitled Linear Algebra and its Applications, 5th edition, ISBN 978-0-321-98328-4, by David C. Lay, Steven R. Lay, and Judi J. McDonald published by Pearson Education © 2016.

All rights reserved. No part of this publication may be reproduced, stored in a retrieval system, or transmitted in any form or by any means, electronic, mechanical, photocopying, recording or otherwise, without either the prior written permission of the publisher or a license permitting restricted copying in the United Kingdom issued by the Copyright Licensing Agency Ltd, Saffron House, 6–10 Kirby Street, London EC1N 8TS.

All trademarks used herein are the property of their respective owners. The use of any trademark in this text does not vest in the author or publisher any trademark ownership rights in such trademarks, nor does the use of such trademarks imply any affiliation with or endorsement of this book by such owners.

British Library Cataloguing-in-Publication Data
A catalogue record for this book is available from the British Library

10 9 8 7 6 5 4 3 2 1

ISBN 10: 1-292-09223-8
ISBN 13: 978-1-292-09223-2

Typeset by Aptara
Printed and bound by Courier Kendallville in The United States of America.

About the Author

David C. Lay holds a B.A. from Aurora University (Illinois), and an M.A. and Ph.D. from the University of California at Los Angeles. David Lay has been an educator and research mathematician since 1966, mostly at the University of Maryland, College Park. He has also served as a visiting professor at the University of Amsterdam, the Free University in Amsterdam, and the University of Kaiserslautern, Germany. He has published more than 30 research articles on functional analysis and linear algebra.

As a founding member of the NSF-sponsored Linear Algebra Curriculum Study Group, David Lay has been a leader in the current movement to modernize the linear algebra curriculum. Lay is also a coauthor of several mathematics texts, including Introduction to Functional Analysis with Angus E. Taylor, Calculus and Its Applications, with L. J. Goldstein and D. I. Schneider, and Linear Algebra Gems: Assets for Undergraduate Mathematics, with D. Carlson, C. R. Johnson, and A. D. Porter.

David Lay has received four university awards for teaching excellence, including, in 1996, the title of Distinguished Scholar–Teacher of the University of Maryland. In 1994, he was given one of the Mathematical Association of America’s Awards for Distinguished College or University Teaching of Mathematics. He has been elected by the university students to membership in Alpha Lambda Delta National Scholastic Honor Society and Golden Key National Honor Society. In 1989, Aurora University conferred on him the Outstanding Alumnus award. David Lay is a member of the American Mathematical Society, the Canadian Mathematical Society, the International Linear Algebra Society, the Mathematical Association of America, Sigma Xi, and the Society for Industrial and Applied Mathematics. Since 1992, he has served several terms on the national board of the Association of Christians in the Mathematical Sciences.

To my wife, Lillian, and our children, Christina, Deborah, and Melissa, whose support, encouragement, and faithful prayers made this book possible.

David C. Lay

Joining the Authorship on the Fifth Edition

Steven R. Lay

Steven R. Lay began his teaching career at Aurora University (Illinois) in 1971, after earning an M.A. and a Ph.D. in mathematics from the University of California at Los Angeles. His career in mathematics was interrupted for eight years while serving as a missionary in Japan. Upon his return to the States in 1998, he joined the mathematics faculty at Lee University (Tennessee) and has been there ever since. Since then he has supported his brother David in refining and expanding the scope of this popular linear algebra text, including writing most of Chapters 8 and 9. Steven is also the author of three college-level mathematics texts: Convex Sets and Their Applications, Analysis with an Introduction to Proof, and Principles of Algebra.

In 1985, Steven received the Excellence in Teaching Award at Aurora University. He and David, and their father, Dr. L. Clark Lay, are all distinguished mathematicians, and in 1989 they jointly received the Outstanding Alumnus award from their alma mater, Aurora University. In 2006, Steven was honored to receive the Excellence in Scholarship Award at Lee University. He is a member of the American Mathematical Society, the Mathematical Association of America, and the Association of Christians in the Mathematical Sciences.

Judi J. McDonald

Judi J. McDonald joins the authorship team after working closely with David on the fourth edition. She holds a B.Sc. in Mathematics from the University of Alberta, and an M.A. and Ph.D. from the University of Wisconsin. She is currently a professor at Washington State University. She has been an educator and research mathematician since the early 1990s. She has more than 35 publications in linear algebra research journals. Several undergraduate and graduate students have written projects or theses on linear algebra under Judi’s supervision. She has also worked with the mathematics outreach project Math Central (http://mathcentral.uregina.ca/) and continues to be passionate about mathematics education and outreach.

Judi has received three teaching awards: two Inspiring Teaching awards at the University of Regina, and the Thomas Lutz College of Arts and Sciences Teaching Award at Washington State University. She has been an active member of the International Linear Algebra Society and the Association for Women in Mathematics throughout her career and has also been a member of the Canadian Mathematical Society, the American Mathematical Society, the Mathematical Association of America, and the Society for Industrial and Applied Mathematics.

Contents

Preface
A Note to Students

Chapter 1 Linear Equations in Linear Algebra
INTRODUCTORY EXAMPLE: Linear Models in Economics and Engineering
1.1 Systems of Linear Equations
1.2 Row Reduction and Echelon Forms
1.3 Vector Equations
1.4 The Matrix Equation Ax = b
1.5 Solution Sets of Linear Systems
1.6 Applications of Linear Systems
1.7 Linear Independence
1.8 Introduction to Linear Transformations
1.9 The Matrix of a Linear Transformation
1.10 Linear Models in Business, Science, and Engineering
Supplementary Exercises

Chapter 2 Matrix Algebra
INTRODUCTORY EXAMPLE: Computer Models in Aircraft Design
2.1 Matrix Operations
2.2 The Inverse of a Matrix
2.3 Characterizations of Invertible Matrices
2.4 Partitioned Matrices
2.5 Matrix Factorizations
2.6 The Leontief Input–Output Model
2.7 Applications to Computer Graphics
2.8 Subspaces of R^n
2.9 Dimension and Rank
Supplementary Exercises

Chapter 3 Determinants
INTRODUCTORY EXAMPLE: Random Paths and Distortion
3.1 Introduction to Determinants
3.2 Properties of Determinants
3.3 Cramer’s Rule, Volume, and Linear Transformations
Supplementary Exercises

Chapter 4 Vector Spaces
INTRODUCTORY EXAMPLE: Space Flight and Control Systems
4.1 Vector Spaces and Subspaces
4.2 Null Spaces, Column Spaces, and Linear Transformations
4.3 Linearly Independent Sets; Bases
4.4 Coordinate Systems
4.5 The Dimension of a Vector Space
4.6 Rank
4.7 Change of Basis
4.8 Applications to Difference Equations
4.9 Applications to Markov Chains
Supplementary Exercises

Chapter 5 Eigenvalues and Eigenvectors
INTRODUCTORY EXAMPLE: Dynamical Systems and Spotted Owls
5.1 Eigenvectors and Eigenvalues
5.2 The Characteristic Equation
5.3 Diagonalization
5.4 Eigenvectors and Linear Transformations
5.5 Complex Eigenvalues
5.6 Discrete Dynamical Systems
5.7 Applications to Differential Equations
5.8 Iterative Estimates for Eigenvalues
Supplementary Exercises

Chapter 6 Orthogonality and Least Squares
INTRODUCTORY EXAMPLE: The North American Datum and GPS Navigation
6.1 Inner Product, Length, and Orthogonality
6.2 Orthogonal Sets
6.3 Orthogonal Projections
6.4 The Gram–Schmidt Process
6.5 Least-Squares Problems
6.6 Applications to Linear Models
6.7 Inner Product Spaces
6.8 Applications of Inner Product Spaces
Supplementary Exercises

Chapter 7 Symmetric Matrices and Quadratic Forms
INTRODUCTORY EXAMPLE: Multichannel Image Processing
7.1 Diagonalization of Symmetric Matrices
7.2 Quadratic Forms
7.3 Constrained Optimization
7.4 The Singular Value Decomposition
7.5 Applications to Image Processing and Statistics
Supplementary Exercises

Chapter 8 The Geometry of Vector Spaces
INTRODUCTORY EXAMPLE: The Platonic Solids
8.1 Affine Combinations
8.2 Affine Independence
8.3 Convex Combinations
8.4 Hyperplanes
8.5 Polytopes
8.6 Curves and Surfaces

Chapter 9 Optimization (Online)
INTRODUCTORY EXAMPLE: The Berlin Airlift
9.1 Matrix Games
9.2 Linear Programming: Geometric Method
9.3 Linear Programming: Simplex Method
9.4 Duality

Chapter 10 Finite-State Markov Chains (Online)
INTRODUCTORY EXAMPLE: Googling Markov Chains
10.1 Introduction and Examples
10.2 The Steady-State Vector and Google’s PageRank
10.3 Communication Classes
10.4 Classification of States and Periodicity
10.5 The Fundamental Matrix
10.6 Markov Chains and Baseball Statistics

Appendixes
A Uniqueness of the Reduced Echelon Form
B Complex Numbers

Glossary
Answers to Odd-Numbered Exercises
Index
Photo Credits

Preface

The response of students and teachers to the first four editions of Linear Algebra and Its Applications has been most gratifying. This Fifth Edition provides substantial support both for teaching and for using technology in the course. As before, the text provides a modern elementary introduction to linear algebra and a broad selection of interesting applications. The material is accessible to students with the maturity that should come from successful completion of two semesters of college-level mathematics, usually calculus.

The main goal of the text is to help students master the basic concepts and skills they will use later in their careers. The topics here follow the recommendations of the Linear Algebra Curriculum Study Group, which were based on a careful investigation of the real needs of the students and a consensus among professionals in many disciplines that use linear algebra. We hope this course will be one of the most useful and interesting mathematics classes taken by undergraduates.

WHAT'S NEW IN THIS EDITION

The main goals of this revision were to update the exercises, take advantage of improvements in technology, and provide more support for conceptual learning.

1. Support for the Fifth Edition is offered through MyMathLab. MyMathLab, from Pearson, is the world’s leading online resource in mathematics, integrating interactive homework, assessment, and media in a flexible, easy-to-use format. Students submit homework online for instantaneous feedback, support, and assessment. This system works particularly well for computation-based skills. Many additional resources are also provided through the MyMathLab web site.

2. The Fifth Edition includes additional support for concept- and proof-based learning. Conceptual Practice Problems and their solutions have been added so that most sections now have a proof- or concept-based example for students to review. Additional guidance has also been added to some of the proofs of theorems in the body of the textbook.

3. More than 25 percent of the exercises are new or updated, especially the computational exercises. The exercise sets remain one of the most important features of this book, and these new exercises follow the same high standard of the exercise sets from the past four editions. They are crafted in a way that reflects the substance of each of the sections they follow, developing the students’ confidence while challenging them to practice and generalize the new ideas they have encountered.

DISTINCTIVE FEATURES

Early Introduction of Key Concepts

Many fundamental ideas of linear algebra are introduced within the first seven lectures, in the concrete setting of R^n, and then gradually examined from different points of view. Later generalizations of these concepts appear as natural extensions of familiar ideas, visualized through the geometric intuition developed in Chapter 1. A major achievement of this text is that the level of difficulty is fairly even throughout the course.

A Modern View of Matrix Multiplication

Good notation is crucial, and the text reflects the way scientists and engineers actually use linear algebra in practice. The definitions and proofs focus on the columns of a matrix rather than on the matrix entries. A central theme is to view a matrix–vector product Ax as a linear combination of the columns of A. This modern approach simplifies many arguments, and it ties vector space ideas into the study of linear systems.
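The column view of Ax can be illustrated with a short Python sketch. The function names `matvec_by_columns` and `matvec_by_rows` and the sample matrix are assumptions chosen for illustration, not code from the text; the sketch only checks that weighting the columns of A by the entries of x gives the same vector as the entry-by-entry (row) rule.

```python
def matvec_by_columns(A, x):
    """Compute Ax as x[0]*a_1 + ... + x[n-1]*a_n, where a_j is column j of A."""
    m, n = len(A), len(x)
    result = [0] * m
    for j in range(n):
        column_j = [A[i][j] for i in range(m)]  # j-th column of A
        # Add the j-th column, weighted by the j-th entry of x.
        result = [r + x[j] * c for r, c in zip(result, column_j)]
    return result


def matvec_by_rows(A, x):
    """Compute Ax entry by entry: each entry is (row of A) dot x."""
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]


# Illustrative 2 x 3 matrix and a vector of three weights.
A = [[1, 2, -1],
     [0, -5, 3]]
x = [4, 3, 7]

print(matvec_by_columns(A, x))  # [3, 6]
print(matvec_by_rows(A, x))     # [3, 6]
```

Here the product is 4·(1, 0) + 3·(2, −5) + 7·(−1, 3) = (3, 6), so the equation Ax = b has a solution exactly when b is a linear combination of the columns of A.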

Linear Transformations

Linear transformations form a “thread” that is woven into the fabric of the text. Their use enhances the geometric flavor of the text. In Chapter 1, for instance, linear transformations provide a dynamic and graphical view of matrix–vector multiplication.

Eigenvalues and Dynamical Systems

Eigenvalues appear fairly early in the text, in Chapters 5 and 7. Because this material is spread over several weeks, students have more time than usual to absorb and review these critical concepts. Eigenvalues are motivated by and applied to discrete and continuous dynamical systems, which appear in Sections 1.10, 4.8, and 4.9, and in five sections of Chapter 5. Some courses reach Chapter 5 after about five weeks by covering Sections 2.8 and 2.9 instead of Chapter 4. These two optional sections present all the vector space concepts from Chapter 4 needed for Chapter 5.

Orthogonality and Least-Squares Problems

These topics receive a more comprehensive treatment than is commonly found in beginning texts. The Linear Algebra Curriculum Study Group has emphasized the need for a substantial unit on orthogonality and least-squares problems, because orthogonality plays such an important role in computer calculations and numerical linear algebra and because inconsistent linear systems arise so often in practical work.
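To make the remark about inconsistent systems concrete, here is a minimal Python sketch of a least-squares line fit through the normal equations. The function name `least_squares_line` and the four data points are illustrative assumptions; no single line passes through all four points, so the system is inconsistent and the sketch returns the line that minimizes the sum of squared vertical errors.

```python
from fractions import Fraction


def least_squares_line(points):
    """Return (b0, b1) minimizing the sum of (b0 + b1*t - y)^2 over the points."""
    n = len(points)
    s_t = sum(Fraction(t) for t, _ in points)
    s_tt = sum(Fraction(t) * t for t, _ in points)
    s_y = sum(Fraction(y) for _, y in points)
    s_ty = sum(Fraction(t) * y for t, y in points)
    # Normal equations (A^T A) b = A^T y, where row i of A is [1, t_i]:
    #   [ n    s_t  ] [b0]   [ s_y  ]
    #   [ s_t  s_tt ] [b1] = [ s_ty ]
    det = n * s_tt - s_t * s_t
    if det == 0:
        raise ValueError("all t values coincide; the fitted line is not unique")
    # Solve the 2 x 2 system with Cramer's rule.
    b0 = (s_y * s_tt - s_t * s_ty) / det
    b1 = (n * s_ty - s_t * s_y) / det
    return b0, b1


# Illustrative data: four points that do not lie on a common line.
points = [(2, 1), (5, 2), (7, 3), (8, 3)]
b0, b1 = least_squares_line(points)
print(b0, b1)  # 2/7 5/14
```

Exact fractions are used only to keep the arithmetic easy to check by hand. For larger problems, numerical software typically avoids forming A^T A explicitly and works with an orthogonal factorization instead.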

PEDAGOGICAL FEATURES

Applications

A broad selection of applications illustrates the power of linear algebra to explain fundamental principles and simplify calculations in engineering, computer science, mathematics, physics, biology, economics, and statistics. Some applications appear in separate sections; others are treated in examples and exercises. In addition, each chapter opens with an introductory vignette that sets the stage for some application of linear algebra and provides a motivation for developing the mathematics that follows. Later, the text returns to that application in a section near the end of the chapter.

A Strong Geometric Emphasis

Every major concept in the course is given a geometric interpretation, because many students learn better when they can visualize an idea. There are substantially more drawings here than usual, and some of the figures have never before appeared in a linear algebra text. Interactive versions of these figures, and more, appear in the electronic version of the textbook.

Examples

This text devotes a larger proportion of its expository material to examples than do most linear algebra texts. There are more examples than an instructor would ordinarily present in class. But because the examples are written carefully, with lots of detail, students can read them on their own.

Theorems and Proofs

Important results are stated as theorems. Other useful facts are displayed in tinted boxes, for easy reference. Most of the theorems have formal proofs, written with the beginning student in mind. In a few cases, the essential calculations of a proof are exhibited in a carefully chosen example. Some routine verifications are saved for exercises, when they will benefit students.

Practice Problems

A few carefully selected Practice Problems appear just before each exercise set. Complete solutions follow the exercise set. These problems either focus on potential trouble spots in the exercise set or provide a “warm-up” for the exercises, and the solutions often contain helpful hints or warnings about the homework.

Exercises

The abundant supply of exercises ranges from routine computations to conceptual questions that require more thought. A good number of innovative questions pinpoint conceptual difficulties that we have found on student papers over the years. Each exercise set is carefully arranged in the same general order as the text; homework assignments are readily available when only part of a section is discussed. A notable feature of the exercises is their numerical simplicity. Problems “unfold” quickly, so students spend little time on numerical calculations. The exercises concentrate on teaching understanding rather than mechanical calculations. The exercises in the Fifth Edition maintain the integrity of the exercises from previous editions, while providing fresh problems for students and instructors.

Exercises marked with the symbol [M] are designed to be worked with the aid of a “Matrix program” (a computer program, such as MATLAB®, Maple™, Mathematica®, MathCad®, or Derive™, or a programmable calculator with matrix capabilities, such as those manufactured by Texas Instruments).

True/False Questions

To encourage students to read all of the text and to think critically, we have developed 300 simple true/false questions that appear in 33 sections of the text, just after the computational problems. They can be answered directly from the text, and they prepare students for the conceptual problems that follow. Students appreciate these questions, after they get used to the importance of reading the text carefully. Based on class testing and discussions with students, we decided not to put the answers in the text. (The Study Guide tells the students where to find the answers to the odd-numbered questions.) An additional 150 true/false questions (mostly at the ends of chapters) test understanding of the material. The text does provide simple T/F answers to most of these questions, but it omits the justifications for the answers (which usually require some thought).

Writing Exercises

An ability to write coherent mathematical statements in English is essential for all students of linear algebra, not just those who may go to graduate school in mathematics. The text includes many exercises for which a written justification is part of the answer. Conceptual exercises that require a short proof usually contain hints that help a student get started. For all odd-numbered writing exercises, either a solution is included at the back of the text or a hint is provided and the solution is given in the Study Guide, described below.

Computational Topics

The text stresses the impact of the computer on both the development and practice of linear algebra in science and engineering. Frequent Numerical Notes draw attention to issues in computing and distinguish between theoretical concepts, such as matrix inversion, and computer implementations, such as LU factorizations.
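The contrast between matrix inversion as a concept and LU factorization as a computation can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the text's own algorithm: the names `lu_factor` and `lu_solve` are hypothetical, the matrix is an arbitrary example, and the sketch performs no row interchanges, so it fails on a zero pivot. Production software uses pivoting and floating-point arithmetic.

```python
from fractions import Fraction


def lu_factor(A):
    """Factor A = L U with L unit lower triangular (no row interchanges)."""
    n = len(A)
    U = [[Fraction(v) for v in row] for row in A]
    L = [[Fraction(int(i == j)) for j in range(n)] for i in range(n)]
    for k in range(n):
        if U[k][k] == 0:
            raise ZeroDivisionError("zero pivot; this sketch does no row interchanges")
        for i in range(k + 1, n):
            m = U[i][k] / U[k][k]  # multiplier that clears entry (i, k)
            L[i][k] = m
            U[i] = [u_i - m * u_k for u_i, u_k in zip(U[i], U[k])]
    return L, U


def lu_solve(L, U, b):
    """Solve A x = b from A = L U by forward and back substitution."""
    n = len(b)
    y = []
    for i in range(n):  # forward substitution: L y = b
        y.append(Fraction(b[i]) - sum(L[i][j] * y[j] for j in range(i)))
    x = [Fraction(0)] * n
    for i in reversed(range(n)):  # back substitution: U x = y
        s = sum(U[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (y[i] - s) / U[i][i]
    return x


# Illustrative system; the factorization is computed once and can be reused.
A = [[2, 1, 1],
     [4, -6, 0],
     [-2, 7, 2]]
L, U = lu_factor(A)
solution = lu_solve(L, U, [5, -2, 9])
print([int(v) for v in solution])  # [1, 1, 2]
```

The design point is that A is factored once, after which each new right-hand side costs only two triangular solves; the inverse of A is never formed.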

WEB SUPPORT

MyMathLab–Online Homework and Resources

Support for the Fifth Edition is offered through MyMathLab (www.mymathlab.com). MyMathLab from Pearson is the world’s leading online resource in mathematics, integrating interactive homework, assessment, and media in a flexible, easy-to-use format. MyMathLab contains hundreds of algorithmically generated exercises that mirror those in the textbook. Students submit homework online for instantaneous feedback, support, and assessment. This system works particularly well for supporting computation-based skills. Many additional resources are also provided through the MyMathLab web site.

Interactive Textbook

The Fifth Edition of the text is available in an interactive electronic format within MyMathLab.

This web site at www.pearsonglobaleditions.com/lay contains all of the support material referenced below. These materials are also available within MyMathLab.

Review Material

Review sheets and practice exams (with solutions) cover the main topics in the text. They come directly from courses we have taught in the past years. Each review sheet identifies key definitions, theorems, and skills from a specified portion of the text.

Applications by Chapters

The web site contains seven Case Studies, which expand topics introduced at the beginning of each chapter, adding real-world data and opportunities for further exploration. In addition, more than 20 Application Projects either extend topics in the text or introduce new applications, such as cubic splines, airline flight routes, dominance matrices in sports competition, and error-correcting codes. Some mathematical applications are integration techniques, polynomial root location, conic sections, quadric surfaces, and extrema for functions of two variables. Numerical linear algebra topics, such as condition numbers, matrix factorizations, and the QR method for finding eigenvalues, are also included. Woven into each discussion are exercises that may involve large data sets (and thus require technology for their solution).

Getting Started with Technology

If your course includes some work with MATLAB, Maple, Mathematica, or TI calculators, the Getting Started guides provide a “quick start guide” for students. Technology-specific projects are also available to introduce students to software and calculators. They are available on www.pearsonglobaleditions.com/lay and within MyMathLab. Finally, the Study Guide provides introductory material for first-time technology users.

Data Files

Hundreds of files contain data for about 900 numerical exercises in the text, Case Studies, and Application Projects. The data are available in a variety of formats: for MATLAB, Maple, Mathematica, and the Texas Instruments graphing calculators. By allowing students to access matrices and vectors for a particular problem with only a few keystrokes, the data files eliminate data entry errors and save time on homework. These data files are available for download at www.pearsonglobaleditions.com/lay and MyMathLab.

Projects

Exploratory projects for Mathematica™, Maple, and MATLAB invite students to discover basic mathematical and numerical issues in linear algebra. Written by experienced faculty members, these projects are referenced by the icon WEB at appropriate points in the text. The projects explore fundamental concepts such as the column space, diagonalization, and orthogonal projections; several projects focus on numerical issues such as flops, iterative methods, and the SVD; and a few projects explore applications such as Lagrange interpolation and Markov chains.

SUPPLEMENTS

Study Guide

The Study Guide is designed to be an integral part of the course. The icon SG in the text directs students to special subsections of the Guide that suggest how to master key concepts of the course. The Guide supplies a detailed solution to every third odd-numbered exercise, which allows students to check their work. A complete explanation is provided whenever an odd-numbered writing exercise has only a “Hint” in the answers. Frequent “Warnings” identify common errors and show how to prevent them. MATLAB boxes introduce commands as they are needed. Appendixes in the Study Guide provide comparable information about Maple, Mathematica, and TI graphing calculators.

Instructor’s Technology Manuals

Each manual provides detailed guidance for integrating a specific software package or graphing calculator throughout the course, written by faculty who have already used the technology with this text. The following manuals are available to qualified instructors through the Pearson Instructor Resource Center, www.pearsonglobaleditions.com/lay and MyMathLab: MATLAB, Maple, Mathematica, and TI-83+/89.

Instructor’s Solutions Manual

The Instructor’s Solutions Manual contains detailed solutions for all exercises, along with teaching notes for many sections. The manual is available electronically for download in the Instructor Resource Center (www.pearsonglobaleditions.com/lay) and MyMathLab.

PowerPoint® Slides and Other Teaching Tools

A brisk pace at the beginning of the course helps to set the tone for the term. To get quickly through the first two sections in fewer than two lectures, consider using PowerPoint® slides. They permit you to focus on the process of row reduction rather than to write many numbers on the board. Students can receive a condensed version of the notes, with occasional blanks to fill in during the lecture. (Many students respond favorably to this gesture.) The PowerPoint slides are available for 25 core sections of the text. In addition, about 75 color figures from the text are available as PowerPoint slides. The PowerPoint slides are available for download at www.pearsonglobaleditions.com/lay.

TestGen

TestGen (www.pearsonhighered.com/testgen) enables instructors to build, edit, print, and administer tests using a computerized bank of questions developed to cover all the objectives of the text. TestGen is algorithmically based, allowing instructors to create multiple, but equivalent, versions of the same question or test with the click of a button. Instructors can also modify test bank questions or add new questions. The software and test bank are available for download from Pearson Education’s online catalog.

ACKNOWLEDGMENTS I am indeed grateful to many groups of people who have helped me over the years with various aspects of this book. I want to thank Israel Gohberg and Robert Ellis for more than fifteen years of research collaboration, which greatly shaped my view of linear algebra. And it has been a privilege to be a member of the Linear Algebra Curriculum Study Group along with David Carlson, Charles Johnson, and Duane Porter. Their creative ideas about teaching linear algebra have influenced this text in significant ways. Saved for last are the three good friends who have guided the development of the book nearly from the beginning, giving wise counsel and encouragement: Greg Tobin, publisher; Laurie Rosatone, former editor; and William Hoffman, current editor. Thank you all so much. David C. Lay

It has been a privilege to work on this new Fifth Edition of Professor David Lay’s linear algebra book. In making this revision, we have attempted to maintain the basic approach and the clarity of style that has made earlier editions popular with students and faculty. We sincerely thank the following reviewers for their careful analyses and constructive suggestions: Kasso A. Okoudjou, University of Maryland; F. Alberto Grunbaum, University of California - Berkeley; Ed Migliore, University of California - Santa Cruz; Maurice E. Ekwo, Texas Southern University; M. Cristina Caputo, University of Texas at Austin; Esteban G. Tabak, New York University; John M. Alongi, Northwestern University; Martina Chirilus-Bruckner, Boston University. We thank Thomas Polaski, of Winthrop University, for his continued contribution of Chapter 10 online. We thank the technology experts who labored on the various supplements for the Fifth Edition, preparing the
data, writing notes for the instructors, writing technology notes for the students in the Study Guide, and sharing their projects with us: Jeremy Case (MATLAB), Taylor University; Douglas Meade (Maple), University of South Carolina; Michael Miller (TI Calculator), Western Baptist College; and Marie Vanisko (Mathematica), Carroll College. We thank Eric Schulz for sharing his considerable technological and pedagogical expertise in the creation of interactive electronic textbooks. His help and encouragement were invaluable in the creation of the electronic interactive version of this textbook. We thank Kristina Evans and Phil Oslin for their work in setting up and maintaining the online homework to accompany the text in MyMathLab, and for continuing to work with us to improve it. The reviews of the online homework done by Joan Saniuk, Robert Pierce, Doron Lubinsky and Adriana Corinaldesi were greatly appreciated. We also thank the faculty at University of California Santa Barbara, University of Alberta, and Georgia Institute of Technology for their feedback on the MyMathLab course. We appreciate the mathematical assistance provided by Roger Lipsett, Paul Lorczak, Tom Wegleitner and Jennifer Blue, who checked the accuracy of calculations in the text and the instructor’s solution manual. Finally, we sincerely thank the staff at Pearson Education for all their help with the development and production of the Fifth Edition: Kerri Consalvo, project manager; Jonathan Wooding, media producer; Jeff Weidenaar, executive marketing manager; Tatiana Anacki, program manager; Brooke Smith, marketing assistant; and Salena Casha, editorial assistant. In closing, we thank William Hoffman, the current editor, for the care and encouragement he has given to those of us closely involved with this wonderful book. Steven R. Lay and Judi J. McDonald

Pearson would like to thank and acknowledge José Luis Zuleta Estrugo, École Polytechnique Fédérale de Lausanne for contributing to the Global Edition, and Somitra Sanadhya, Indraprastha Institute of Information Technology, Veronique Van Lierde, Al Akhawayn University in Ifrane, and Hossam M. Hassan, Cairo University for reviewing the Global Edition.

A Note to Students This course is potentially the most interesting and worthwhile undergraduate mathematics course you will complete. In fact, some students have written or spoken to us after graduation and said that they still use this text occasionally as a reference in their careers at major corporations and engineering graduate schools. The following remarks offer some practical advice and information to help you master the material and enjoy the course. In linear algebra, the concepts are as important as the computations. The simple numerical exercises that begin each exercise set only help you check your understanding of basic procedures. Later in your career, computers will do the calculations, but you will have to choose the calculations, know how to interpret the results, and then explain the results to other people. For this reason, many exercises in the text ask you to explain or justify your calculations. A written explanation is often required as part of the answer. For odd-numbered exercises, you will find either the desired explanation or at least a good hint. You must avoid the temptation to look at such answers before you have tried to write out the solution yourself. Otherwise, you are likely to think you understand something when in fact you do not. To master the concepts of linear algebra, you will have to read and reread the text carefully. New terms are in boldface type, sometimes enclosed in a definition box. A glossary of terms is included at the end of the text. Important facts are stated as theorems or are enclosed in tinted boxes, for easy reference. We encourage you to read the first five pages of the Preface to learn more about the structure of this text. This will give you a framework for understanding how the course may proceed. In a practical sense, linear algebra is a language. You must learn this language the same way you would a foreign language—with daily work. 
Material presented in one section is not easily understood unless you have thoroughly studied the text and worked the exercises for the preceding sections. Keeping up with the course will save you lots of time and distress!

Numerical Notes We hope you read the Numerical Notes in the text, even if you are not using a computer or graphing calculator with the text. In real life, most applications of linear algebra involve numerical computations that are subject to some numerical error, even though that error may be extremely small. The Numerical Notes will warn you of potential difficulties in using linear algebra later in your career, and if you study the notes now, you are more likely to remember them later. If you enjoy reading the Numerical Notes, you may want to take a course later in numerical linear algebra. Because of the high demand for increased computing power, computer scientists and mathematicians work in numerical linear algebra to develop faster and more reliable algorithms for computations, and electrical engineers design faster and smaller computers to run the algorithms. This is an exciting field, and your first course in linear algebra will help you prepare for it.

Study Guide To help you succeed in this course, we suggest that you purchase the Study Guide. It is available electronically within MyMathLab. Not only will it help you learn linear algebra, it also will show you how to study mathematics. At strategic points in your textbook, the icon SG will direct you to special subsections in the Study Guide entitled “Mastering Linear Algebra Concepts.” There you will find suggestions for constructing effective review sheets of key concepts. The act of preparing the sheets is one of the secrets to success in the course, because you will construct links between ideas. These links are the “glue” that enables you to build a solid foundation for learning and remembering the main concepts in the course. The Study Guide contains a detailed solution to every third odd-numbered exercise, plus solutions to all odd-numbered writing exercises for which only a hint is given in the Answers section of this book. The Guide is separate from the text because you must learn to write solutions by yourself, without much help. (We know from years of experience that easy access to solutions in the back of the text slows the mathematical development of most students.) The Guide also provides warnings of common errors and helpful hints that call attention to key exercises and potential exam questions. If you have access to technology—MATLAB, Maple, Mathematica, or a TI graphing calculator—you can save many hours of homework time. The Study Guide is your “lab manual” that explains how to use each of these matrix utilities. It introduces new commands when they are needed. You can download from the web site www.pearsonhighered.com/lay the data for more than 850 exercises in the text. (With a few keystrokes, you can display any numerical homework problem on your screen.) Special matrix commands will perform the computations for you! 
What you do in your first few weeks of studying this course will set your pattern for the term and determine how well you finish the course. Please read “How to Study Linear Algebra” in the Study Guide as soon as possible. Many students have found the strategies there very helpful, and we hope you will, too.

1

Linear Equations in Linear Algebra

INTRODUCTORY EXAMPLE

Linear Models in Economics and Engineering It was late summer in 1949. Harvard Professor Wassily Leontief was carefully feeding the last of his punched cards into the university’s Mark II computer. The cards contained information about the U.S. economy and represented a summary of more than 250,000 pieces of information produced by the U.S. Bureau of Labor Statistics after two years of intensive work. Leontief had divided the U.S. economy into 500 “sectors,” such as the coal industry, the automotive industry, communications, and so on. For each sector, he had written a linear equation that described how the sector distributed its output to the other sectors of the economy. Because the Mark II, one of the largest computers of its day, could not handle the resulting system of 500 equations in 500 unknowns, Leontief had distilled the problem into a system of 42 equations in 42 unknowns. Programming the Mark II computer for Leontief’s 42 equations had required several months of effort, and he was anxious to see how long the computer would take to solve the problem. The Mark II hummed and blinked for 56 hours before finally producing a solution. We will discuss the nature of this solution in Sections 1.6 and 2.6. Leontief, who was awarded the 1973 Nobel Prize in Economic Science, opened the door to a new era in mathematical modeling in economics. His efforts at Harvard in 1949 marked one of the first significant uses of computers to analyze what was then a large-scale mathematical model. Since that time, researchers in many other fields have employed computers to analyze mathematical models. Because of the massive amounts of data involved, the models are usually linear; that is, they are described by systems of linear equations. The importance of linear algebra for applications has risen in direct proportion to the increase in computing power, with each new generation of hardware and software triggering a demand for even greater capabilities. Computer science is thus intricately linked with linear algebra through the explosive growth of parallel processing and large-scale computations. Scientists and engineers now work on problems far more complex than even dreamed possible a few decades ago. Today, linear algebra has more potential value for students in many scientific and business fields than any other undergraduate mathematics subject! The material in this text provides the foundation for further work in many interesting areas. Here are a few possibilities; others will be described later.

Oil exploration. When a ship searches for offshore oil deposits, its computers solve thousands of separate systems of linear equations every day. The seismic data for the equations are obtained from underwater shock waves created by explosions from air guns. The waves bounce off subsurface rocks and are measured by geophones attached to mile-long cables behind the ship.

Linear programming. Many important management decisions today are made on the basis of linear programming models that use hundreds of variables. The airline industry, for instance, employs linear programs that schedule flight crews, monitor the locations of aircraft, or plan the varied schedules of support services such as maintenance and terminal operations.

Electrical networks. Engineers use simulation software to design electrical circuits and microchips involving millions of transistors. Such software relies on linear algebra techniques and systems of linear equations.

Systems of linear equations lie at the heart of linear algebra, and this chapter uses them to introduce some of the central concepts of linear algebra in a simple and concrete setting. Sections 1.1 and 1.2 present a systematic method for solving systems of linear equations. This algorithm will be used for computations throughout the text. Sections 1.3 and 1.4 show how a system of linear equations is equivalent to a vector equation and to a matrix equation. This equivalence will reduce problems involving linear combinations of vectors to questions about systems of linear equations. The fundamental concepts of spanning, linear independence, and linear transformations, studied in the second half of the chapter, will play an essential role throughout the text as we explore the beauty and power of linear algebra.

1.1 SYSTEMS OF LINEAR EQUATIONS

A linear equation in the variables $x_1, \ldots, x_n$ is an equation that can be written in the form

\[ a_1x_1 + a_2x_2 + \cdots + a_nx_n = b \tag{1} \]

where $b$ and the coefficients $a_1, \ldots, a_n$ are real or complex numbers, usually known in advance. The subscript $n$ may be any positive integer. In textbook examples and exercises, $n$ is normally between 2 and 5. In real-life problems, $n$ might be 50 or 5000, or even larger.

The equations

\[ 4x_1 - 5x_2 + 2 = x_1 \quad\text{and}\quad x_2 = 2\left(\sqrt{6} - x_1\right) + x_3 \]

are both linear because they can be rearranged algebraically as in equation (1):

\[ 3x_1 - 5x_2 = -2 \quad\text{and}\quad 2x_1 + x_2 - x_3 = 2\sqrt{6} \]

The equations

\[ 4x_1 - 5x_2 = x_1x_2 \quad\text{and}\quad x_2 = 2\sqrt{x_1} - 6 \]

are not linear because of the presence of $x_1x_2$ in the first equation and $\sqrt{x_1}$ in the second.

A system of linear equations (or a linear system) is a collection of one or more linear equations involving the same variables, say $x_1, \ldots, x_n$. An example is

\[
\begin{aligned}
2x_1 - x_2 + 1.5x_3 &= 8\\
x_1 - 4x_3 &= -7
\end{aligned}
\tag{2}
\]

A solution of the system is a list $(s_1, s_2, \ldots, s_n)$ of numbers that makes each equation a true statement when the values $s_1, \ldots, s_n$ are substituted for $x_1, \ldots, x_n$, respectively. For instance, $(5, 6.5, 3)$ is a solution of system (2) because, when these values are substituted in (2) for $x_1, x_2, x_3$, respectively, the equations simplify to $8 = 8$ and $-7 = -7$. The set of all possible solutions is called the solution set of the linear system. Two linear systems are called equivalent if they have the same solution set. That is, each solution of the first system is a solution of the second system, and each solution of the second system is a solution of the first.

Finding the solution set of a system of two linear equations in two variables is easy because it amounts to finding the intersection of two lines. A typical problem is

\[
\begin{aligned}
x_1 - 2x_2 &= -1\\
-x_1 + 3x_2 &= 3
\end{aligned}
\]

The graphs of these equations are lines, which we denote by $\ell_1$ and $\ell_2$. A pair of numbers $(x_1, x_2)$ satisfies both equations in the system if and only if the point $(x_1, x_2)$ lies on both $\ell_1$ and $\ell_2$. In the system above, the solution is the single point $(3, 2)$, as you can easily verify. See Figure 1.

FIGURE 1 Exactly one solution.
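The substitution check described above is mechanical, so it can be sketched in a few lines of Python. This is only an illustration: the helper name `is_solution` and the list-of-rows layout are assumptions of the sketch, not notation from the text, and exact fractions are used so that 6.5 and 1.5 are represented without rounding.

```python
from fractions import Fraction

def is_solution(coeffs, rhs, candidate):
    """Return True if `candidate` makes every equation a true statement.

    coeffs[i] holds the coefficients of equation i, rhs[i] its right side.
    """
    return all(
        sum(a * s for a, s in zip(row, candidate)) == b
        for row, b in zip(coeffs, rhs)
    )

# System (2): 2x1 - x2 + 1.5x3 = 8  and  x1 - 4x3 = -7
system2 = [[2, -1, Fraction(3, 2)], [1, 0, -4]]
rhs2 = [8, -7]
candidate2 = (5, Fraction(13, 2), 3)   # the list (5, 6.5, 3)

# The two-line system: x1 - 2x2 = -1  and  -x1 + 3x2 = 3
lines = [[1, -2], [-1, 3]]
rhs_lines = [-1, 3]

print(is_solution(system2, rhs2, candidate2))   # True
print(is_solution(lines, rhs_lines, (3, 2)))    # True
print(is_solution(lines, rhs_lines, (1, 1)))    # False: (1, 1) lies on the first line only
```

A check like this confirms that a given list is a solution; it says nothing about whether other solutions exist.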

Of course, two lines need not intersect in a single point; they could be parallel, or they could coincide and hence “intersect” at every point on the line. Figure 2 shows the graphs that correspond to the following systems:

\[
\text{(a)}\quad
\begin{aligned}
x_1 - 2x_2 &= -1\\
-x_1 + 2x_2 &= 3
\end{aligned}
\qquad\qquad
\text{(b)}\quad
\begin{aligned}
x_1 - 2x_2 &= -1\\
-x_1 + 2x_2 &= 1
\end{aligned}
\]

FIGURE 2 (a) No solution. (b) Infinitely many solutions.

Figures 1 and 2 illustrate the following general fact about linear systems, to be verified in Section 1.2.

A system of linear equations has
1. no solution, or
2. exactly one solution, or
3. infinitely many solutions.

A system of linear equations is said to be consistent if it has either one solution or infinitely many solutions; a system is inconsistent if it has no solution.
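For two equations in two variables, the three cases can be told apart without drawing the lines. The sketch below is one possible way to do it, not the method of this section: it compares cross-products of the coefficients (a quantity the text only reaches later, with determinants), and it assumes each equation really describes a line, so its two coefficients are not both zero. The function name `classify_two_lines` is hypothetical.

```python
def classify_two_lines(eq1, eq2):
    """Classify the system a1*x1 + b1*x2 = c1, a2*x1 + b2*x2 = c2.

    Each equation is a tuple (a, b, c) and is assumed to describe a line,
    i.e. a and b are not both zero.
    """
    a1, b1, c1 = eq1
    a2, b2, c2 = eq2
    if a1 * b2 - a2 * b1 != 0:
        return "exactly one solution"       # the lines cross at a single point
    # Otherwise the coefficient rows are proportional: parallel or identical lines.
    if a1 * c2 - a2 * c1 == 0 and b1 * c2 - b2 * c1 == 0:
        return "infinitely many solutions"  # the two equations describe the same line
    return "no solution"                    # distinct parallel lines

print(classify_two_lines((1, -2, -1), (-1, 3, 3)))  # system of Figure 1
print(classify_two_lines((1, -2, -1), (-1, 2, 3)))  # system (a) of Figure 2
print(classify_two_lines((1, -2, -1), (-1, 2, 1)))  # system (b) of Figure 2
```

The three calls correspond to Figure 1, Figure 2(a), and Figure 2(b), and report one solution, no solution, and infinitely many solutions, respectively.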

Matrix Notation

The essential information of a linear system can be recorded compactly in a rectangular array called a matrix. Given the system

\[
\begin{aligned}
x_1 - 2x_2 + x_3 &= 0\\
2x_2 - 8x_3 &= 8\\
5x_1 - 5x_3 &= 10
\end{aligned}
\tag{3}
\]

with the coefficients of each variable aligned in columns, the matrix

\[
\begin{bmatrix} 1 & -2 & 1\\ 0 & 2 & -8\\ 5 & 0 & -5 \end{bmatrix}
\]

is called the coefficient matrix (or matrix of coefficients) of the system (3), and

\[
\begin{bmatrix} 1 & -2 & 1 & 0\\ 0 & 2 & -8 & 8\\ 5 & 0 & -5 & 10 \end{bmatrix}
\tag{4}
\]

is called the augmented matrix of the system. (The second row here contains a zero because the second equation could be written as $0 \cdot x_1 + 2x_2 - 8x_3 = 8$.) An augmented matrix of a system consists of the coefficient matrix with an added column containing the constants from the right sides of the equations.

The size of a matrix tells how many rows and columns it has. The augmented matrix (4) above has 3 rows and 4 columns and is called a $3 \times 4$ (read “3 by 4”) matrix. If $m$ and $n$ are positive integers, an $m \times n$ matrix is a rectangular array of numbers with $m$ rows and $n$ columns. (The number of rows always comes first.) Matrix notation will simplify the calculations in the examples that follow.
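As a small illustration of this bookkeeping, the coefficient matrix and augmented matrix of system (3) can be stored as nested Python lists. The helper names `augment` and `size` are assumptions of this sketch; the text itself does not prescribe any data structure.

```python
# Coefficient matrix and right-hand sides of system (3).
coefficient_matrix = [
    [1, -2, 1],
    [0, 2, -8],   # the 0 records the missing x1 term in the second equation
    [5, 0, -5],   # the 0 records the missing x2 term in the third equation
]
constants = [0, 8, 10]

def augment(coeffs, rhs):
    """Append the constant from each equation's right side as a final column."""
    return [row + [b] for row, b in zip(coeffs, rhs)]

def size(matrix):
    """Return (number of rows, number of columns); rows come first."""
    return (len(matrix), len(matrix[0]))

augmented = augment(coefficient_matrix, constants)
print(size(coefficient_matrix))  # (3, 3)
print(size(augmented))           # (3, 4)
```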

Solving a Linear System This section and the next describe an algorithm, or a systematic procedure, for solving linear systems. The basic strategy is to replace one system with an equivalent system (i.e., one with the same solution set) that is easier to solve. Roughly speaking, use the x1 term in the first equation of a system to eliminate the x1 terms in the other equations. Then use the x2 term in the second equation to eliminate the x2 terms in the other equations, and so on, until you finally obtain a very simple equivalent system of equations. Three basic operations are used to simplify a linear system: Replace one equation by the sum of itself and a multiple of another equation, interchange two equations, and multiply all the terms in an equation by a nonzero constant. After the first example, you will see why these three operations do not change the solution set of the system.

EXAMPLE 1 Solve system (3).

SOLUTION The elimination procedure is shown here with and without matrix notation, and the results are placed side by side for comparison:

\[
\begin{aligned}
x_1 - 2x_2 + x_3 &= 0\\
2x_2 - 8x_3 &= 8\\
5x_1 - 5x_3 &= 10
\end{aligned}
\qquad
\begin{bmatrix} 1 & -2 & 1 & 0\\ 0 & 2 & -8 & 8\\ 5 & 0 & -5 & 10 \end{bmatrix}
\]

Keep $x_1$ in the first equation and eliminate it from the other equations. To do so, add $-5$ times equation 1 to equation 3. After some practice, this type of calculation is usually performed mentally:

\[
\begin{aligned}
-5 \cdot [\text{equation 1}]&: & -5x_1 + 10x_2 - 5x_3 &= 0\\
+\ [\text{equation 3}]&: & 5x_1 - 5x_3 &= 10\\
[\text{new equation 3}]&: & 10x_2 - 10x_3 &= 10
\end{aligned}
\]

The result of this calculation is written in place of the original third equation:

\[
\begin{aligned}
x_1 - 2x_2 + x_3 &= 0\\
2x_2 - 8x_3 &= 8\\
10x_2 - 10x_3 &= 10
\end{aligned}
\qquad
\begin{bmatrix} 1 & -2 & 1 & 0\\ 0 & 2 & -8 & 8\\ 0 & 10 & -10 & 10 \end{bmatrix}
\]

Now, multiply equation 2 by $\tfrac{1}{2}$ in order to obtain 1 as the coefficient for $x_2$. (This calculation will simplify the arithmetic in the next step.)

\[
\begin{aligned}
x_1 - 2x_2 + x_3 &= 0\\
x_2 - 4x_3 &= 4\\
10x_2 - 10x_3 &= 10
\end{aligned}
\qquad
\begin{bmatrix} 1 & -2 & 1 & 0\\ 0 & 1 & -4 & 4\\ 0 & 10 & -10 & 10 \end{bmatrix}
\]

Use the $x_2$ in equation 2 to eliminate the $10x_2$ in equation 3. The “mental” computation is

\[
\begin{aligned}
-10 \cdot [\text{equation 2}]&: & -10x_2 + 40x_3 &= -40\\
+\ [\text{equation 3}]&: & 10x_2 - 10x_3 &= 10\\
[\text{new equation 3}]&: & 30x_3 &= -30
\end{aligned}
\]

The result of this calculation is written in place of the previous third equation (row):

\[
\begin{aligned}
x_1 - 2x_2 + x_3 &= 0\\
x_2 - 4x_3 &= 4\\
30x_3 &= -30
\end{aligned}
\qquad
\begin{bmatrix} 1 & -2 & 1 & 0\\ 0 & 1 & -4 & 4\\ 0 & 0 & 30 & -30 \end{bmatrix}
\]

Now, multiply equation 3 by $\tfrac{1}{30}$ in order to obtain 1 as the coefficient for $x_3$. (This calculation will simplify the arithmetic in the next step.) The new system has a triangular form (the intuitive term triangular will be replaced by a precise term in the next section):

\[
\begin{aligned}
x_1 - 2x_2 + x_3 &= 0\\
x_2 - 4x_3 &= 4\\
x_3 &= -1
\end{aligned}
\qquad
\begin{bmatrix} 1 & -2 & 1 & 0\\ 0 & 1 & -4 & 4\\ 0 & 0 & 1 & -1 \end{bmatrix}
\]

Eventually, you want to eliminate the $-2x_2$ term from equation 1, but it is more efficient to use the $x_3$ in equation 3 first, to eliminate the $-4x_3$ and $+x_3$ terms in equations 2 and 1. The two “mental” calculations are

\[
\begin{aligned}
4 \cdot [\text{equation 3}]&: & 4x_3 &= -4\\
+\ [\text{equation 2}]&: & x_2 - 4x_3 &= 4\\
[\text{new equation 2}]&: & x_2 &= 0
\end{aligned}
\qquad
\begin{aligned}
-1 \cdot [\text{equation 3}]&: & -x_3 &= 1\\
+\ [\text{equation 1}]&: & x_1 - 2x_2 + x_3 &= 0\\
[\text{new equation 1}]&: & x_1 - 2x_2 &= 1
\end{aligned}
\]

It is convenient to combine the results of these two operations:

\[
\begin{aligned}
x_1 - 2x_2 &= 1\\
x_2 &= 0\\
x_3 &= -1
\end{aligned}
\qquad
\begin{bmatrix} 1 & -2 & 0 & 1\\ 0 & 1 & 0 & 0\\ 0 & 0 & 1 & -1 \end{bmatrix}
\]

Now, having cleaned out the column above the $x_3$ in equation 3, move back to the $x_2$ in equation 2 and use it to eliminate the $-2x_2$ above it. Because of the previous work with $x_3$, there is now no arithmetic involving $x_3$ terms. Add 2 times equation 2 to equation 1 and obtain the system:

\[
\begin{aligned}
x_1 &= 1\\
x_2 &= 0\\
x_3 &= -1
\end{aligned}
\qquad
\begin{bmatrix} 1 & 0 & 0 & 1\\ 0 & 1 & 0 & 0\\ 0 & 0 & 1 & -1 \end{bmatrix}
\]

[Figure: Each of the original equations determines a plane in three-dimensional space. The point $(1, 0, -1)$ lies in all three planes.]

The work is essentially done. It shows that the only solution of the original system is $(1, 0, -1)$. However, since there are so many calculations involved, it is a good practice to check the work. To verify that $(1, 0, -1)$ is a solution, substitute these values into the left side of the original system, and compute:

\[
\begin{aligned}
1(1) - 2(0) + 1(-1) &= 1 - 0 - 1 = 0\\
2(0) - 8(-1) &= 0 + 8 = 8\\
5(1) - 5(-1) &= 5 + 5 = 10
\end{aligned}
\]

The results agree with the right side of the original system, so $(1, 0, -1)$ is a solution of the system.

Example 1 illustrates how operations on equations in a linear system correspond to operations on the appropriate rows of the augmented matrix. The three basic operations listed earlier correspond to the following operations on the augmented matrix.

ELEMENTARY ROW OPERATIONS
1. (Replacement) Replace one row by the sum of itself and a multiple of another row.¹
2. (Interchange) Interchange two rows.
3. (Scaling) Multiply all entries in a row by a nonzero constant.

¹ A common paraphrase of row replacement is “Add to one row a multiple of another row.”

Row operations can be applied to any matrix, not merely to one that arises as the augmented matrix of a linear system. Two matrices are called row equivalent if there is a sequence of elementary row operations that transforms one matrix into the other.

It is important to note that row operations are reversible. If two rows are interchanged, they can be returned to their original positions by another interchange. If a row is scaled by a nonzero constant $c$, then multiplying the new row by $1/c$ produces the original row. Finally, consider a replacement operation involving two rows (say, rows 1 and 2) and suppose that $c$ times row 1 is added to row 2 to produce a new row 2. To “reverse” this operation, add $-c$ times row 1 to (new) row 2 and obtain the original row 2. See Exercises 29–32 at the end of this section.

At the moment, we are interested in row operations on the augmented matrix of a system of linear equations. Suppose a system is changed to a new one via row operations. By considering each type of row operation, you can see that any solution of the original system remains a solution of the new system. Conversely, since the original system can be produced via row operations on the new system, each solution of the new system is also a solution of the original system. This discussion justifies the following statement.

If the augmented matrices of two linear systems are row equivalent, then the two systems have the same solution set.

Though Example 1 is lengthy, you will find that after some practice, the calculations go quickly. Row operations in the text and exercises will usually be extremely easy to perform, allowing you to focus on the underlying concepts. Still, you must learn to perform row operations accurately because they will be used throughout the text. The rest of this section shows how to use row operations to determine the size of a solution set, without completely solving the linear system.
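The three elementary row operations, and the way each one is undone, can be sketched in Python as follows. This is a minimal illustration under stated assumptions: the function names `replace`, `interchange`, and `scale` are hypothetical, rows are numbered from 0 as is usual in Python (so row 1 of the text is index 0), and exact fractions are used to avoid rounding. The sequence of calls replays the steps of Example 1 on the augmented matrix of system (3).

```python
from fractions import Fraction

def replace(m, target, source, c):
    """Replacement: add c times row `source` to row `target` (returns a new matrix)."""
    out = [row[:] for row in m]
    out[target] = [t + c * s for t, s in zip(m[target], m[source])]
    return out

def interchange(m, i, j):
    """Interchange rows i and j (returns a new matrix)."""
    out = [row[:] for row in m]
    out[i], out[j] = out[j], out[i]
    return out

def scale(m, i, c):
    """Scaling: multiply every entry of row i by the nonzero constant c."""
    if c == 0:
        raise ValueError("scaling constant must be nonzero")
    out = [row[:] for row in m]
    out[i] = [c * x for x in m[i]]
    return out

# Augmented matrix of system (3), stored with exact fractions.
start = [[Fraction(x) for x in row]
         for row in [[1, -2, 1, 0], [0, 2, -8, 8], [5, 0, -5, 10]]]

m = replace(start, 2, 0, -5)        # add -5 * (row 1) to row 3
m = scale(m, 1, Fraction(1, 2))     # multiply row 2 by 1/2
m = replace(m, 2, 1, -10)           # add -10 * (row 2) to row 3
m = scale(m, 2, Fraction(1, 30))    # multiply row 3 by 1/30
m = replace(m, 1, 2, 4)             # add 4 * (row 3) to row 2
m = replace(m, 0, 2, -1)            # add -1 * (row 3) to row 1
m = replace(m, 0, 1, 2)             # add 2 * (row 2) to row 1

solution = [row[-1] for row in m]   # last column once the left block is the identity
print(solution == [1, 0, -1])       # True

# Each operation is reversible:
print(replace(replace(start, 2, 0, -5), 2, 0, 5) == start)               # True
print(scale(scale(start, 1, Fraction(1, 2)), 1, 2) == start)             # True
print(interchange(interchange(start, 0, 1), 0, 1) == start)              # True
```

Returning a new matrix from each operation, rather than modifying in place, keeps `start` available for the reversibility checks at the end.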

Existence and Uniqueness Questions Section 1.2 will show why a solution set for a linear system contains either no solutions, one solution, or infinitely many solutions. Answers to the following two questions will determine the nature of the solution set for a linear system. To determine which possibility is true for a particular system, we ask two questions. TWO FUNDAMENTAL QUESTIONS ABOUT A LINEAR SYSTEM 1. Is the system consistent; that is, does at least one solution exist? 2. If a solution exists, is it the only one; that is, is the solution unique? These two questions will appear throughout the text, in many different guises. This section and the next will show how to answer these questions via row operations on the augmented matrix.

EXAMPLE 2 Determine if the following system is consistent:

\[
\begin{aligned}
x_1 - 2x_2 + x_3 &= 0\\
2x_2 - 8x_3 &= 8\\
5x_1 - 5x_3 &= 10
\end{aligned}
\]

SOLUTION This is the system from Example 1. Suppose that we have performed the row operations necessary to obtain the triangular form

\[
\begin{aligned}
x_1 - 2x_2 + x_3 &= 0\\
x_2 - 4x_3 &= 4\\
x_3 &= -1
\end{aligned}
\qquad
\begin{bmatrix} 1 & -2 & 1 & 0\\ 0 & 1 & -4 & 4\\ 0 & 0 & 1 & -1 \end{bmatrix}
\]

At this point, we know x3 . Were we to substitute the value of x3 into equation 2, we could compute x2 and hence could determine x1 from equation 1. So a solution exists; the system is consistent. (In fact, x2 is uniquely determined by equation 2 since x3 has only one possible value, and x1 is therefore uniquely determined by equation 1. So the solution is unique.)
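The substitution argument in Example 2 (find the last variable, then work upward) is often called back-substitution. The following sketch carries it out on the triangular form above. It is an illustration only, and it assumes a square triangular coefficient part with nonzero diagonal entries; the function name `back_substitute` is hypothetical.

```python
from fractions import Fraction

def back_substitute(upper):
    """Solve an n x (n+1) augmented matrix whose coefficient part is upper
    triangular with nonzero diagonal entries, working from the last row up."""
    n = len(upper)
    x = [Fraction(0)] * n
    for i in range(n - 1, -1, -1):
        # Contribution of the variables that are already known.
        known = sum(upper[i][j] * x[j] for j in range(i + 1, n))
        x[i] = Fraction(upper[i][n] - known) / upper[i][i]
    return x

# Triangular form from Example 2.
triangular = [
    [1, -2, 1, 0],
    [0, 1, -4, 4],
    [0, 0, 1, -1],
]
print(back_substitute(triangular) == [1, 0, -1])  # True
```

Because every diagonal entry is nonzero, each step determines exactly one value, which mirrors the uniqueness remark in the example.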

EXAMPLE 3 Determine if the following system is consistent:

\[
\begin{aligned}
x_2 - 4x_3 &= 8\\
2x_1 - 3x_2 + 2x_3 &= 1\\
4x_1 - 8x_2 + 12x_3 &= 1
\end{aligned}
\tag{5}
\]

SOLUTION The augmented matrix is

\[
\begin{bmatrix} 0 & 1 & -4 & 8\\ 2 & -3 & 2 & 1\\ 4 & -8 & 12 & 1 \end{bmatrix}
\]

To obtain an $x_1$ in the first equation, interchange rows 1 and 2:

\[
\begin{bmatrix} 2 & -3 & 2 & 1\\ 0 & 1 & -4 & 8\\ 4 & -8 & 12 & 1 \end{bmatrix}
\]

To eliminate the $4x_1$ term in the third equation, add $-2$ times row 1 to row 3:

\[
\begin{bmatrix} 2 & -3 & 2 & 1\\ 0 & 1 & -4 & 8\\ 0 & -2 & 8 & -1 \end{bmatrix}
\tag{6}
\]

Next, use the $x_2$ term in the second equation to eliminate the $-2x_2$ term from the third equation. Add 2 times row 2 to row 3:

\[
\begin{bmatrix} 2 & -3 & 2 & 1\\ 0 & 1 & -4 & 8\\ 0 & 0 & 0 & 15 \end{bmatrix}
\tag{7}
\]

The augmented matrix is now in triangular form. To interpret it correctly, go back to equation notation:

\[
\begin{aligned}
2x_1 - 3x_2 + 2x_3 &= 1\\
x_2 - 4x_3 &= 8\\
0 &= 15
\end{aligned}
\tag{8}
\]

[Figure: The system is inconsistent because there is no point that lies on all three planes.]

The equation $0 = 15$ is a short form of $0x_1 + 0x_2 + 0x_3 = 15$. This system in triangular form obviously has a built-in contradiction. There are no values of $x_1, x_2, x_3$ that satisfy (8) because the equation $0 = 15$ is never true. Since (8) and (5) have the same solution set, the original system is inconsistent (i.e., has no solution). Pay close attention to the augmented matrix in (7). Its last row is typical of an inconsistent system in triangular form.
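The telltale last row of (7) can be detected mechanically. The sketch below replays the three row operations of Example 3 and then looks for a row whose coefficients are all zero but whose constant is not. The helper name `has_contradiction_row` is an assumption of this sketch. Note that the test is only conclusive after the matrix has been reduced to triangular form: the original matrix of system (5) contains no such row even though the system is inconsistent.

```python
def has_contradiction_row(augmented):
    """Return True if some row reads 0 = b with b nonzero."""
    return any(
        all(a == 0 for a in row[:-1]) and row[-1] != 0
        for row in augmented
    )

# Augmented matrix of system (5).
m5 = [
    [0, 1, -4, 8],
    [2, -3, 2, 1],
    [4, -8, 12, 1],
]

m = [m5[1], m5[0], m5[2]]                          # interchange rows 1 and 2
m[2] = [c - 2 * a for a, c in zip(m[0], m[2])]     # add -2 * (row 1) to row 3, giving (6)
m[2] = [c + 2 * b for b, c in zip(m[1], m[2])]     # add 2 * (row 2) to row 3, giving (7)

print(m[2])                        # [0, 0, 0, 15]
print(has_contradiction_row(m))    # True: the reduced system contains 0 = 15
print(has_contradiction_row(m5))   # False: the contradiction is hidden before reduction
```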

NUMERICAL NOTE In real-world problems, systems of linear equations are solved by a computer. For a square coefficient matrix, computer programs nearly always use the elimination algorithm given here and in Section 1.2, modified slightly for improved accuracy. The vast majority of linear algebra problems in business and industry are solved with programs that use floating point arithmetic. Numbers are represented as decimals $\pm .d_1 \cdots d_p \times 10^r$, where $r$ is an integer and the number $p$ of digits to the right of the decimal point is usually between 8 and 16. Arithmetic with such numbers typically is inexact, because the result must be rounded (or truncated) to the number of digits stored. “Roundoff error” is also introduced when a number such as $1/3$ is entered into the computer, since its decimal representation must be approximated by a finite number of digits. Fortunately, inaccuracies in floating point arithmetic seldom cause problems. The numerical notes in this book will occasionally warn of issues that you may need to consider later in your career.
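Roundoff of this kind is easy to observe. The short sketch below uses Python's built-in `float` type, which stores numbers in binary rather than in the decimal form described in the note, so it illustrates the same phenomenon under a slightly different representation. The `fractions` module is used only to compare the stored value with the exact one.

```python
from fractions import Fraction

third = 1 / 3                       # rounded to the nearest representable float
print(Fraction(third) == Fraction(1, 3))   # False: the stored value is not exactly 1/3

print(0.1 + 0.2 == 0.3)                    # False: the operands and the sum are each rounded
print(abs((0.1 + 0.2) - 0.3) < 1e-12)      # True: the discrepancy is extremely small
```

The last line reflects the point of the note: the errors are real but usually tiny, which is why numerical code typically compares floating point results within a tolerance rather than testing for exact equality.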

PRACTICE PROBLEMS

Throughout the text, practice problems should be attempted before working the exercises. Solutions appear after each exercise set.

1. State in words the next elementary row operation that should be performed on the system in order to solve it. [More than one answer is possible in (a).]

\[
\text{a.}\quad
\begin{aligned}
x_1 + 4x_2 - 2x_3 + 8x_4 &= 12\\
x_2 - 7x_3 + 2x_4 &= -4\\
5x_3 - x_4 &= 7\\
x_3 + 3x_4 &= -5
\end{aligned}
\qquad
\text{b.}\quad
\begin{aligned}
x_1 - 3x_2 + 5x_3 - 2x_4 &= 0\\
x_2 + 8x_3 &= -4\\
2x_3 &= 3\\
x_4 &= 1
\end{aligned}
\]

2. The augmented matrix of a linear system has been transformed by row operations into the form below. Determine if the system is consistent.

\[
\begin{bmatrix} 1 & 5 & 2 & -6\\ 0 & 4 & -7 & 2\\ 0 & 0 & 5 & 0 \end{bmatrix}
\]

3. Is $(3, 4, -2)$ a solution of the following system?

\[
\begin{aligned}
5x_1 - x_2 + 2x_3 &= 7\\
-2x_1 + 6x_2 + 9x_3 &= 0\\
-7x_1 + 5x_2 - 3x_3 &= -7
\end{aligned}
\]

4. For what values of $h$ and $k$ is the following system consistent?

\[
\begin{aligned}
2x_1 - x_2 &= h\\
-6x_1 + 3x_2 &= k
\end{aligned}
\]

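1.1 EXERCISES

Solve each system in Exercises 1–4 by using elementary row operations on the equations or on the augmented matrix. Follow the systematic elimination procedure described in this section.

1. \[ \begin{aligned} x_1 + 5x_2 &= 7\\ -2x_1 - 7x_2 &= -5 \end{aligned} \]

2. \[ \begin{aligned} 2x_1 + 4x_2 &= -4\\ 5x_1 + 7x_2 &= 11 \end{aligned} \]

3. Find the point $(x_1, x_2)$ that lies on the line $x_1 + 5x_2 = 7$ and on the line $x_1 - 2x_2 = -2$. See the figure.

[Figure: the lines $x_1 - 2x_2 = -2$ and $x_1 + 5x_2 = 7$ in the $x_1x_2$-plane.]

4. Find the point of intersection of the lines $x_1 - 5x_2 = 1$ and $3x_1 - 7x_2 = 5$.

Consider each matrix in Exercises 5 and 6 as the augmented matrix of a linear system. State in words the next two elementary row operations that should be performed in the process of solving the system.

5. \[ \begin{bmatrix} 1 & -4 & 5 & 0 & 7\\ 0 & 1 & -3 & 0 & 6\\ 0 & 0 & 1 & 0 & 2\\ 0 & 0 & 0 & 1 & -5 \end{bmatrix} \]

6. \[ \begin{bmatrix} 1 & -6 & 4 & 0 & -1\\ 0 & 2 & -7 & 0 & 4\\ 0 & 0 & 1 & 2 & -3\\ 0 & 0 & 3 & 1 & 6 \end{bmatrix} \]

In Exercises 7–10, the augmented matrix of a linear system has been reduced by row operations to the form shown. In each case, continue the appropriate row operations and describe the solution set of the original system.

7. \[ \begin{bmatrix} 1 & 7 & 3 & -4\\ 0 & 1 & -1 & 3\\ 0 & 0 & 0 & 1\\ 0 & 0 & 1 & -2 \end{bmatrix} \]

8. \[ \begin{bmatrix} 1 & -4 & 9 & 0\\ 0 & 1 & 7 & 0\\ 0 & 0 & 2 & 0 \end{bmatrix} \]

9. \[ \begin{bmatrix} 1 & -1 & 0 & 0 & -4\\ 0 & 1 & -3 & 0 & -7\\ 0 & 0 & 1 & -3 & -1\\ 0 & 0 & 0 & 2 & 4 \end{bmatrix} \]

10. \[ \begin{bmatrix} 1 & -2 & 0 & 3 & -2\\ 0 & 1 & 0 & -4 & 7\\ 0 & 0 & 1 & 0 & 6\\ 0 & 0 & 0 & 1 & -3 \end{bmatrix} \]

Solve the systems in Exercises 11–14.

11. \[ \begin{aligned} x_2 + 4x_3 &= -5\\ x_1 + 3x_2 + 5x_3 &= -2\\ 3x_1 + 7x_2 + 7x_3 &= 6 \end{aligned} \]

12. \[ \begin{aligned} x_1 - 3x_2 + 4x_3 &= -4\\ 3x_1 - 7x_2 + 7x_3 &= -8\\ -4x_1 + 6x_2 - x_3 &= 7 \end{aligned} \]

13. \[ \begin{aligned} x_1 - 3x_3 &= 8\\ 2x_1 + 2x_2 + 9x_3 &= 7\\ x_2 + 5x_3 &= -2 \end{aligned} \]

14. \[ \begin{aligned} x_1 - 3x_2 &= 5\\ -x_1 + x_2 + 5x_3 &= 2\\ x_2 + x_3 &= 0 \end{aligned} \]

Determine if the systems in Exercises 15 and 16 are consistent. Do not completely solve the systems.

15. \[ \begin{aligned} x_1 + 3x_3 &= 2\\ x_2 - 3x_4 &= 3\\ -2x_2 + 3x_3 + 2x_4 &= 1\\ 3x_1 + 7x_4 &= -5 \end{aligned} \]

16. \[ \begin{aligned} x_1 - 2x_4 &= -3\\ 2x_2 + 2x_3 &= 0\\ x_3 + 3x_4 &= 1\\ -2x_1 + 3x_2 + 2x_3 + x_4 &= 5 \end{aligned} \]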

17. Do the three lines x1 4x2 D 1, 2x1 x2 D 3, and x1 3x2 D 4 have a common point of intersection? Explain. 18. Do the three planes x1 C 2x2 C x3 D 4, x2 x3 D 1, and x1 C 3x2 D 0 have at least one common point of intersection? Explain. In Exercises 19–22, determine the value(s) of h such that the matrix is the augmented matrix of a consistent linear system.     1 h 4 1 h 3 19. 20. 3 6 8 2 4 6     1 3 2 2 3 h 21. 22. 4 h 8 6 9 5 In Exercises 23 and 24, key statements from this section are either quoted directly, restated slightly (but still true), or altered in some way that makes them false in some cases. Mark each statement True or False, and justify your answer. (If true, give the approximate location where a similar statement appears, or refer to a definition or theorem. If false, give the location of a statement that has been quoted or used incorrectly, or cite an example that shows the statement is not true in all cases.) Similar true/false questions will appear in many sections of the text.

23. a. Every elementary row operation is reversible.
    b. A $5 \times 6$ matrix has six rows.
    c. The solution set of a linear system involving variables $x_1, \ldots, x_n$ is a list of numbers $(s_1, \ldots, s_n)$ that makes each equation in the system a true statement when the values $s_1, \ldots, s_n$ are substituted for $x_1, \ldots, x_n$, respectively.
    d. Two fundamental questions about a linear system involve existence and uniqueness.

24. a. Elementary row operations on an augmented matrix never change the solution set of the associated linear system.
    b. Two matrices are row equivalent if they have the same number of rows.
    c. An inconsistent system has more than one solution.
    d. Two linear systems are equivalent if they have the same solution set.

25. Find an equation involving $g$, $h$, and $k$ that makes this augmented matrix correspond to a consistent system:
\[ \begin{bmatrix} 1 & -4 & 7 & g \\ 0 & 3 & -5 & h \\ -2 & 5 & -9 & k \end{bmatrix} \]

26. Construct three different augmented matrices for linear systems whose solution set is $x_1 = -2$, $x_2 = 1$, $x_3 = 0$.

27. Suppose the system below is consistent for all possible values of $f$ and $g$. What can you say about the coefficients $c$ and $d$? Justify your answer.
\[ \begin{aligned} x_1 + 3x_2 &= f \\ cx_1 + dx_2 &= g \end{aligned} \]

28. Suppose $a$, $b$, $c$, and $d$ are constants such that $a$ is not zero and the system below is consistent for all possible values of $f$ and $g$. What can you say about the numbers $a$, $b$, $c$, and $d$? Justify your answer.
\[ \begin{aligned} ax_1 + bx_2 &= f \\ cx_1 + dx_2 &= g \end{aligned} \]

In Exercises 29–32, find the elementary row operation that transforms the first matrix into the second, and then find the reverse row operation that transforms the second matrix into the first.

29. $\begin{bmatrix} 0 & -2 & 5 \\ 1 & 4 & -7 \\ 3 & -1 & 6 \end{bmatrix}, \quad \begin{bmatrix} 1 & 4 & -7 \\ 0 & -2 & 5 \\ 3 & -1 & 6 \end{bmatrix}$

30. $\begin{bmatrix} 1 & 3 & -4 \\ 0 & -2 & 6 \\ 0 & -5 & 9 \end{bmatrix}, \quad \begin{bmatrix} 1 & 3 & -4 \\ 0 & 1 & -3 \\ 0 & -5 & 9 \end{bmatrix}$

31. $\begin{bmatrix} 1 & -2 & 1 & 0 \\ 0 & 5 & -2 & 8 \\ 4 & -1 & 3 & -6 \end{bmatrix}, \quad \begin{bmatrix} 1 & -2 & 1 & 0 \\ 0 & 5 & -2 & 8 \\ 0 & 7 & -1 & -6 \end{bmatrix}$

32. $\begin{bmatrix} 1 & 2 & -5 & 0 \\ 0 & 1 & -3 & -2 \\ 0 & -3 & 9 & 5 \end{bmatrix}, \quad \begin{bmatrix} 1 & 2 & -5 & 0 \\ 0 & 1 & -3 & -2 \\ 0 & 0 & 0 & -1 \end{bmatrix}$

An important concern in the study of heat transfer is to determine the steady-state temperature distribution of a thin plate when the temperature around the boundary is known. Assume the plate shown in the figure represents a cross section of a metal beam, with negligible heat flow in the direction perpendicular to the plate. Let $T_1, \ldots, T_4$ denote the temperatures at the four interior nodes of the mesh in the figure. The temperature at a node is approximately equal to the average of the four nearest nodes: to the left, above, to the right, and below.² For instance,
\[ T_1 = (10 + 20 + T_2 + T_4)/4, \quad \text{or} \quad 4T_1 - T_2 - T_4 = 30 \]

[Figure: a square mesh with interior nodes 1 and 2 in the top row and 4 and 3 in the bottom row; the boundary temperatures are 10°, 20°, 30°, and 40°.]

33. Write a system of four equations whose solution gives estimates for the temperatures $T_1, \ldots, T_4$.

34. Solve the system of equations from Exercise 33. [Hint: To speed up the calculations, interchange rows 1 and 4 before starting “replace” operations.]

² See Frank M. White, Heat and Mass Transfer (Reading, MA: Addison-Wesley Publishing, 1991), pp. 145–149.

SOLUTIONS TO PRACTICE PROBLEMS

1. a. For “hand computation,” the best choice is to interchange equations 3 and 4. Another possibility is to multiply equation 3 by $1/5$. Or, replace equation 4 by its sum with $-1/5$ times row 3. (In any case, do not use the $x_2$ in equation 2 to eliminate the $4x_2$ in equation 1. Wait until a triangular form has been reached and the $x_3$ terms and $x_4$ terms have been eliminated from the first two equations.)
   b. The system is in triangular form. Further simplification begins with the $x_4$ in the fourth equation. Use the $x_4$ to eliminate all $x_4$ terms above it. The appropriate

28

CHAPTER 1

Linear Equations in Linear Algebra

step now is to add 2 times equation 4 to equation 1. (After that, move to equation 3, multiply it by $1/2$, and then use the equation to eliminate the $x_3$ terms above it.)

2. The system corresponding to the augmented matrix is
\[ \begin{aligned} x_1 + 5x_2 + 2x_3 &= -6 \\ 4x_2 - 7x_3 &= 2 \\ 5x_3 &= 0 \end{aligned} \]

The third equation makes $x_3 = 0$, which is certainly an allowable value for $x_3$. After eliminating the $x_3$ terms in equations 1 and 2, you could go on to solve for unique values for $x_2$ and $x_1$. Hence a solution exists, and it is unique. Contrast this situation with that in Example 3.

3. It is easy to check if a specific list of numbers is a solution. Set $x_1 = 3$, $x_2 = 4$, and $x_3 = -2$, and find that
\[ \begin{aligned} 5(3) - (4) + 2(-2) &= 15 - 4 - 4 = 7 \\ -2(3) + 6(4) + 9(-2) &= -6 + 24 - 18 = 0 \\ -7(3) + 5(4) - 3(-2) &= -21 + 20 + 6 = 5 \end{aligned} \]

Although the first two equations are satisfied, the third is not, so $(3, 4, -2)$ is not a solution of the system. Notice the use of parentheses when making the substitutions. They are strongly recommended as a guard against arithmetic errors.

[Figure: the point $(3, 4, -2)$ in $x_1x_2x_3$-space. Since $(3, 4, -2)$ satisfies the first two equations, it is on the line of the intersection of the first two planes. Since $(3, 4, -2)$ does not satisfy all three equations, it does not lie on all three planes.]
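The substitution check in Practice Problem 3 can be sketched in a few lines of Python. The helper names `left_hand_sides` and `is_solution` are illustrative, not from the text. The coefficients are read off from the three substitutions shown in the solution; the right-hand sides are assumptions (the first two are the values the text says are satisfied, and the third is assumed to be $-7$, since the text states only that it is not satisfied).

```python
def left_hand_sides(coeffs, candidate):
    """Substitute the candidate values into the left side of each equation."""
    return [sum(a * s for a, s in zip(row, candidate)) for row in coeffs]


def is_solution(coeffs, rhs, candidate):
    """True when every equation becomes a true statement after substitution."""
    return left_hand_sides(coeffs, candidate) == list(rhs)


# Coefficients taken from the worked substitutions above.
coeffs = [[5, -1, 2], [-2, 6, 9], [-7, 5, -3]]
# Assumed right-hand sides: 7 and 0 are satisfied per the text; -7 is an assumption.
rhs = [7, 0, -7]
candidate = (3, 4, -2)

print(left_hand_sides(coeffs, candidate))
print(is_solution(coeffs, rhs, candidate))
```

Writing the substitution as a sum over explicit (coefficient, value) pairs plays the same role as the parentheses recommended in the solution: each product is formed before any signs are combined.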

\[ \begin{aligned} 2x_1 - x_2 &= h \\ 0 &= k + 3h \end{aligned} \]

If $k + 3h$ is nonzero, the system has no solution. The system is consistent for any values of $h$ and $k$ that make $k + 3h = 0$.

1.2 ROW REDUCTION AND ECHELON FORMS

This section refines the method of Section 1.1 into a row reduction algorithm that will enable us to analyze any system of linear equations.¹ By using only the first part of the algorithm, we will be able to answer the fundamental existence and uniqueness questions posed in Section 1.1.

The algorithm applies to any matrix, whether or not the matrix is viewed as an augmented matrix for a linear system. So the first part of this section concerns an arbitrary rectangular matrix and begins by introducing two important classes of matrices that include the “triangular” matrices of Section 1.1. In the definitions that follow, a nonzero row or column in a matrix means a row or column that contains at least one nonzero entry; a leading entry of a row refers to the leftmost nonzero entry (in a nonzero row).

¹ The algorithm here is a variant of what is commonly called Gaussian elimination. A similar elimination method for linear systems was used by Chinese mathematicians in about 250 B.C. The process was unknown in Western culture until the nineteenth century, when a famous German mathematician, Carl Friedrich Gauss, discovered it. A German engineer, Wilhelm Jordan, popularized the algorithm in an 1888 text on geodesy.

DEFINITION

A rectangular matrix is in echelon form (or row echelon form) if it has the following three properties:
1. All nonzero rows are above any rows of all zeros.
2. Each leading entry of a row is in a column to the right of the leading entry of the row above it.
3. All entries in a column below a leading entry are zeros.

If a matrix in echelon form satisfies the following additional conditions, then it is in reduced echelon form (or reduced row echelon form):
4. The leading entry in each nonzero row is 1.
5. Each leading 1 is the only nonzero entry in its column.

An echelon matrix (respectively, reduced echelon matrix) is one that is in echelon form (respectively, reduced echelon form). Property 2 says that the leading entries form an echelon (“steplike”) pattern that moves down and to the right through the matrix. Property 3 is a simple consequence of property 2, but we include it for emphasis.

The “triangular” matrices of Section 1.1, such as
\[ \begin{bmatrix} 2 & -3 & 2 & 1 \\ 0 & 1 & -4 & 8 \\ 0 & 0 & 0 & 5/2 \end{bmatrix} \quad \text{and} \quad \begin{bmatrix} 1 & 0 & 0 & 29 \\ 0 & 1 & 0 & 16 \\ 0 & 0 & 1 & 3 \end{bmatrix} \]
are in echelon form. In fact, the second matrix is in reduced echelon form. Here are additional examples.

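EXAMPLE 1 The following matrices are in echelon form. The leading entries ($\blacksquare$) may have any nonzero value; the starred entries ($*$) may have any value (including zero).
\[ \begin{bmatrix} \blacksquare & * & * & * \\ 0 & \blacksquare & * & * \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \end{bmatrix}, \quad \begin{bmatrix} 0 & \blacksquare & * & * & * & * & * & * & * & * \\ 0 & 0 & 0 & \blacksquare & * & * & * & * & * & * \\ 0 & 0 & 0 & 0 & \blacksquare & * & * & * & * & * \\ 0 & 0 & 0 & 0 & 0 & \blacksquare & * & * & * & * \\ 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & \blacksquare & * \end{bmatrix} \]

The following matrices are in reduced echelon form because the leading entries are 1’s, and there are 0’s below and above each leading 1.
\[ \begin{bmatrix} 1 & 0 & * & * \\ 0 & 1 & * & * \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \end{bmatrix}, \quad \begin{bmatrix} 0 & 1 & * & 0 & 0 & 0 & * & * & 0 & * \\ 0 & 0 & 0 & 1 & 0 & 0 & * & * & 0 & * \\ 0 & 0 & 0 & 0 & 1 & 0 & * & * & 0 & * \\ 0 & 0 & 0 & 0 & 0 & 1 & * & * & 0 & * \\ 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 1 & * \end{bmatrix} \]

Any nonzero matrix may be row reduced (that is, transformed by elementary row operations) into more than one matrix in echelon form, using different sequences of row operations. However, the reduced echelon form one obtains from a matrix is unique. The following theorem is proved in Appendix A at the end of the text.

THEOREM 1 Uniqueness of the Reduced Echelon Form

Each matrix is row equivalent to one and only one reduced echelon matrix.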


If a matrix A is row equivalent to an echelon matrix U , we call U an echelon form (or row echelon form) of A ; if U is in reduced echelon form, we call U the reduced echelon form of A . [Most matrix programs and calculators with matrix capabilities use the abbreviation RREF for reduced (row) echelon form. Some use REF for (row) echelon form.]
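The five properties in the definition translate almost directly into two predicates. The following is a minimal sketch, assuming a matrix stored as a list of rows with exact entries (integers or `Fraction`s); the function names are illustrative. Property 3 is not tested separately because, as noted above, it follows from property 2.

```python
from fractions import Fraction


def leading_index(row):
    """Column index of the leftmost nonzero entry, or None for a zero row."""
    for j, entry in enumerate(row):
        if entry != 0:
            return j
    return None


def is_echelon(M):
    """Check properties 1 and 2 (property 3 then holds automatically)."""
    seen_zero_row = False
    previous_lead = -1
    for row in M:
        lead = leading_index(row)
        if lead is None:
            seen_zero_row = True
            continue
        if seen_zero_row:            # property 1: nonzero row below a zero row
            return False
        if lead <= previous_lead:    # property 2: leading entries must move right
            return False
        previous_lead = lead
    return True


def is_reduced_echelon(M):
    """Check properties 4 and 5 on top of echelon form."""
    if not is_echelon(M):
        return False
    for i, row in enumerate(M):
        lead = leading_index(row)
        if lead is None:
            continue
        if row[lead] != 1:           # property 4: each leading entry is 1
            return False
        if any(M[r][lead] != 0 for r in range(len(M)) if r != i):
            return False             # property 5: leading 1 alone in its column
    return True


# The two "triangular" matrices quoted from Section 1.1.
first = [[2, -3, 2, 1], [0, 1, -4, 8], [0, 0, 0, Fraction(5, 2)]]
second = [[1, 0, 0, 29], [0, 1, 0, 16], [0, 0, 1, 3]]
print(is_echelon(first), is_reduced_echelon(first))
print(is_echelon(second), is_reduced_echelon(second))
```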

Pivot Positions

When row operations on a matrix produce an echelon form, further row operations to obtain the reduced echelon form do not change the positions of the leading entries. Since the reduced echelon form is unique, the leading entries are always in the same positions in any echelon form obtained from a given matrix. These leading entries correspond to leading 1’s in the reduced echelon form.

DEFINITION

A pivot position in a matrix $A$ is a location in $A$ that corresponds to a leading 1 in the reduced echelon form of $A$. A pivot column is a column of $A$ that contains a pivot position.

In Example 1, the squares ($\blacksquare$) identify the pivot positions. Many fundamental concepts in the first four chapters will be connected in one way or another with pivot positions in a matrix.

EXAMPLE 2 Row reduce the matrix $A$ below to echelon form, and locate the pivot columns of $A$.
\[ A = \begin{bmatrix} 0 & -3 & -6 & 4 & 9 \\ -1 & -2 & -1 & 3 & 1 \\ -2 & -3 & 0 & 3 & -1 \\ 1 & 4 & 5 & -9 & -7 \end{bmatrix} \]

SOLUTION Use the same basic strategy as in Section 1.1. The top of the leftmost nonzero column is the first pivot position. A nonzero entry, or pivot, must be placed in this position. A good choice is to interchange rows 1 and 4 (because the mental computations in the next step will not involve fractions).
\[ \begin{bmatrix} 1 & 4 & 5 & -9 & -7 \\ -1 & -2 & -1 & 3 & 1 \\ -2 & -3 & 0 & 3 & -1 \\ 0 & -3 & -6 & 4 & 9 \end{bmatrix} \]
(The pivot is the 1 in row 1; column 1 is the pivot column.)

Create zeros below the pivot, 1, by adding multiples of the first row to the rows below, and obtain matrix (1) below. The pivot position in the second row must be as far left as possible, namely, in the second column. Choose the 2 in this position as the next pivot.
\[ \begin{bmatrix} 1 & 4 & 5 & -9 & -7 \\ 0 & 2 & 4 & -6 & -6 \\ 0 & 5 & 10 & -15 & -15 \\ 0 & -3 & -6 & 4 & 9 \end{bmatrix} \tag{1} \]
(The pivot is the 2 in row 2; column 2 is the next pivot column.)

Add $-5/2$ times row 2 to row 3, and add $3/2$ times row 2 to row 4.
\[ \begin{bmatrix} 1 & 4 & 5 & -9 & -7 \\ 0 & 2 & 4 & -6 & -6 \\ 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & -5 & 0 \end{bmatrix} \tag{2} \]

The matrix in (2) is different from any encountered in Section 1.1. There is no way to create a leading entry in column 3! (We can’t use row 1 or 2 because doing so would destroy the echelon arrangement of the leading entries already produced.) However, if we interchange rows 3 and 4, we can produce a leading entry in column 4.
\[ \begin{bmatrix} 1 & 4 & 5 & -9 & -7 \\ 0 & 2 & 4 & -6 & -6 \\ 0 & 0 & 0 & -5 & 0 \\ 0 & 0 & 0 & 0 & 0 \end{bmatrix} \qquad \text{General form:} \quad \begin{bmatrix} \blacksquare & * & * & * & * \\ 0 & \blacksquare & * & * & * \\ 0 & 0 & 0 & \blacksquare & * \\ 0 & 0 & 0 & 0 & 0 \end{bmatrix} \]
(The pivot in row 3 is $-5$; the pivot columns are columns 1, 2, and 4.)

The matrix is in echelon form and thus reveals that columns 1, 2, and 4 of $A$ are pivot columns.
\[ A = \begin{bmatrix} 0 & -3 & -6 & 4 & 9 \\ -1 & -2 & -1 & 3 & 1 \\ -2 & -3 & 0 & 3 & -1 \\ 1 & 4 & 5 & -9 & -7 \end{bmatrix} \tag{3} \]
(The pivot positions are row 1 of column 1, row 2 of column 2, and row 3 of column 4.)

A pivot, as illustrated in Example 2, is a nonzero number in a pivot position that is used as needed to create zeros via row operations. The pivots in Example 2 were 1, 2, and $-5$. Notice that these numbers are not the same as the actual elements of $A$ in the highlighted pivot positions shown in (3). With Example 2 as a guide, we are ready to describe an efficient procedure for transforming a matrix into an echelon or reduced echelon matrix. Careful study and mastery of this procedure now will pay rich dividends later in the course.
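As a rough illustration of Example 2, the sketch below runs a forward elimination on $A$ with exact fractions and reports the pivot columns. It is a simplified variant, not the hand computation above: it always takes the first available nonzero entry as the pivot (an assumption made for brevity), so the intermediate rows and the pivots themselves differ from those in the example, while the pivot columns agree. The function name `echelon_form` is illustrative.

```python
from fractions import Fraction


def echelon_form(rows):
    """Forward elimination. Returns (echelon matrix, 0-based pivot column indices)."""
    M = [[Fraction(x) for x in row] for row in rows]
    n_rows, n_cols = len(M), len(M[0])
    pivot_cols = []
    r = 0  # row in which the next pivot will be placed
    for c in range(n_cols):
        if r == n_rows:
            break
        # First row at or below r with a nonzero entry in column c, if any.
        p = next((i for i in range(r, n_rows) if M[i][c] != 0), None)
        if p is None:
            continue  # no pivot in this column
        M[r], M[p] = M[p], M[r]  # interchange
        for i in range(r + 1, n_rows):  # row replacement below the pivot
            factor = M[i][c] / M[r][c]
            M[i] = [a - factor * b for a, b in zip(M[i], M[r])]
        pivot_cols.append(c)
        r += 1
    return M, pivot_cols


A = [
    [0, -3, -6, 4, 9],
    [-1, -2, -1, 3, 1],
    [-2, -3, 0, 3, -1],
    [1, 4, 5, -9, -7],
]
U, pivot_cols = echelon_form(A)
print([c + 1 for c in pivot_cols])  # pivot columns, numbered from 1
```

That a different sequence of row operations yields the same pivot columns is consistent with the remark that leading entries occupy the same positions in every echelon form of a given matrix.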

The Row Reduction Algorithm

The algorithm that follows consists of four steps, and it produces a matrix in echelon form. A fifth step produces a matrix in reduced echelon form. We illustrate the algorithm by an example.

EXAMPLE 3 Apply elementary row operations to transform the following matrix first into echelon form and then into reduced echelon form:
\[ \begin{bmatrix} 0 & 3 & -6 & 6 & 4 & -5 \\ 3 & -7 & 8 & -5 & 8 & 9 \\ 3 & -9 & 12 & -9 & 6 & 15 \end{bmatrix} \]

SOLUTION

STEP 1 Begin with the leftmost nonzero column. This is a pivot column. The pivot position is at the top.
\[ \begin{bmatrix} 0 & 3 & -6 & 6 & 4 & -5 \\ 3 & -7 & 8 & -5 & 8 & 9 \\ 3 & -9 & 12 & -9 & 6 & 15 \end{bmatrix} \]
(Column 1 is the pivot column.)

STEP 2 Select a nonzero entry in the pivot column as a pivot. If necessary, interchange rows to move this entry into the pivot position.

Interchange rows 1 and 3. (We could have interchanged rows 1 and 2 instead.)
\[ \begin{bmatrix} 3 & -9 & 12 & -9 & 6 & 15 \\ 3 & -7 & 8 & -5 & 8 & 9 \\ 0 & 3 & -6 & 6 & 4 & -5 \end{bmatrix} \]
(The pivot is the 3 in row 1.)

STEP 3 Use row replacement operations to create zeros in all positions below the pivot.

As a preliminary step, we could divide the top row by the pivot, 3. But with two 3’s in column 1, it is just as easy to add $-1$ times row 1 to row 2.
\[ \begin{bmatrix} 3 & -9 & 12 & -9 & 6 & 15 \\ 0 & 2 & -4 & 4 & 2 & -6 \\ 0 & 3 & -6 & 6 & 4 & -5 \end{bmatrix} \]

STEP 4 Cover (or ignore) the row containing the pivot position and cover all rows, if any, above it. Apply steps 1–3 to the submatrix that remains. Repeat the process until there are no more nonzero rows to modify.

With row 1 covered, step 1 shows that column 2 is the next pivot column; for step 2, select as a pivot the “top” entry in that column.
\[ \begin{bmatrix} 3 & -9 & 12 & -9 & 6 & 15 \\ 0 & 2 & -4 & 4 & 2 & -6 \\ 0 & 3 & -6 & 6 & 4 & -5 \end{bmatrix} \]
(The pivot is the 2 in row 2; column 2 is the new pivot column.)

For step 3, we could insert an optional step of dividing the “top” row of the submatrix by the pivot, 2. Instead, we add $-3/2$ times the “top” row to the row below. This produces
\[ \begin{bmatrix} 3 & -9 & 12 & -9 & 6 & 15 \\ 0 & 2 & -4 & 4 & 2 & -6 \\ 0 & 0 & 0 & 0 & 1 & 4 \end{bmatrix} \]

When we cover the row containing the second pivot position for step 4, we are left with a new submatrix having only one row:
\[ \begin{bmatrix} 3 & -9 & 12 & -9 & 6 & 15 \\ 0 & 2 & -4 & 4 & 2 & -6 \\ 0 & 0 & 0 & 0 & 1 & 4 \end{bmatrix} \]
(The pivot is the 1 in row 3, column 5.)

Steps 1–3 require no work for this submatrix, and we have reached an echelon form of the full matrix. If we want the reduced echelon form, we perform one more step.

STEP 5 Beginning with the rightmost pivot and working upward and to the left, create zeros above each pivot. If a pivot is not 1, make it 1 by a scaling operation.

The rightmost pivot is in row 3. Create zeros above it, adding suitable multiples of row 3 to rows 2 and 1.
\[ \begin{bmatrix} 3 & -9 & 12 & -9 & 0 & -9 \\ 0 & 2 & -4 & 4 & 0 & -14 \\ 0 & 0 & 0 & 0 & 1 & 4 \end{bmatrix} \quad \begin{array}{l} \leftarrow \text{Row 1} + (-6) \cdot \text{row 3} \\ \leftarrow \text{Row 2} + (-2) \cdot \text{row 3} \\ \ \end{array} \]

The next pivot is in row 2. Scale this row, dividing by the pivot.
\[ \begin{bmatrix} 3 & -9 & 12 & -9 & 0 & -9 \\ 0 & 1 & -2 & 2 & 0 & -7 \\ 0 & 0 & 0 & 0 & 1 & 4 \end{bmatrix} \quad \leftarrow \text{Row 2 scaled by } \tfrac{1}{2} \]

Create a zero in column 2 by adding 9 times row 2 to row 1.
\[ \begin{bmatrix} 3 & 0 & -6 & 9 & 0 & -72 \\ 0 & 1 & -2 & 2 & 0 & -7 \\ 0 & 0 & 0 & 0 & 1 & 4 \end{bmatrix} \quad \leftarrow \text{Row 1} + (9) \cdot \text{row 2} \]

Finally, scale row 1, dividing by the pivot, 3.
\[ \begin{bmatrix} 1 & 0 & -2 & 3 & 0 & -24 \\ 0 & 1 & -2 & 2 & 0 & -7 \\ 0 & 0 & 0 & 0 & 1 & 4 \end{bmatrix} \quad \leftarrow \text{Row 1 scaled by } \tfrac{1}{3} \]

This is the reduced echelon form of the original matrix.

The combination of steps 1–4 is called the forward phase of the row reduction algorithm. Step 5, which produces the unique reduced echelon form, is called the backward phase.
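The forward and backward phases can be sketched together as one short routine. This is an illustrative implementation under stated assumptions, not the text's exact hand procedure: it uses exact `Fraction` arithmetic and picks the first nonzero entry in each pivot column, so the intermediate matrices differ from those in Example 3. Because the reduced echelon form is unique (Theorem 1), the final result must agree.

```python
from fractions import Fraction


def rref(rows):
    """Reduced echelon form via a forward phase (steps 1-4) and a backward phase (step 5)."""
    M = [[Fraction(x) for x in row] for row in rows]
    n_rows, n_cols = len(M), len(M[0])
    pivots = []  # (row, column) of each pivot position
    r = 0
    # Forward phase: build an echelon form, left to right.
    for c in range(n_cols):
        if r == n_rows:
            break
        p = next((i for i in range(r, n_rows) if M[i][c] != 0), None)
        if p is None:
            continue
        M[r], M[p] = M[p], M[r]
        for i in range(r + 1, n_rows):
            factor = M[i][c] / M[r][c]
            M[i] = [a - factor * b for a, b in zip(M[i], M[r])]
        pivots.append((r, c))
        r += 1
    # Backward phase: start at the rightmost pivot and work up and to the left.
    for pr, pc in reversed(pivots):
        pivot = M[pr][pc]
        M[pr] = [a / pivot for a in M[pr]]  # scale so the pivot becomes 1
        for i in range(pr):                 # create zeros above the pivot
            factor = M[i][pc]
            M[i] = [a - factor * b for a, b in zip(M[i], M[pr])]
    return M


example3 = [
    [0, 3, -6, 6, 4, -5],
    [3, -7, 8, -5, 8, 9],
    [3, -9, 12, -9, 6, 15],
]
for row in rref(example3):
    print([str(x) for x in row])
```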

NUMERICAL NOTE In step 2 above, a computer program usually selects as a pivot the entry in a column having the largest absolute value. This strategy, called partial pivoting, is used because it reduces roundoff errors in the calculations.
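The pivot selection rule described in the Numerical Note can be sketched as a small helper. The name `choose_pivot_row` is hypothetical; the sketch only shows the selection step, not a full floating-point elimination routine.

```python
def choose_pivot_row(M, start_row, col):
    """Partial pivoting: among rows start_row, start_row+1, ..., return the index
    of the row whose entry in column col has the largest absolute value.
    Returns None when every candidate entry is zero (no pivot in this column)."""
    best = max(range(start_row, len(M)), key=lambda i: abs(M[i][col]))
    return best if M[best][col] != 0 else None


M = [
    [0.0, 3.0, -6.0],
    [1e-8, -7.0, 8.0],
    [3.0, -9.0, 12.0],
]
# Row 2 (index 2) is chosen over the tiny entry 1e-8 in row 1.
print(choose_pivot_row(M, 0, 0))
```

Choosing the tiny entry `1e-8` as a pivot would be legal for exact arithmetic, but dividing by it would magnify any roundoff already present, which is the situation the strategy is meant to avoid.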


Solutions of Linear Systems

The row reduction algorithm leads directly to an explicit description of the solution set of a linear system when the algorithm is applied to the augmented matrix of the system. Suppose, for example, that the augmented matrix of a linear system has been changed into the equivalent reduced echelon form
\[ \begin{bmatrix} 1 & 0 & -5 & 1 \\ 0 & 1 & 1 & 4 \\ 0 & 0 & 0 & 0 \end{bmatrix} \]
There are three variables because the augmented matrix has four columns. The associated system of equations is
\[ \begin{aligned} x_1 - 5x_3 &= 1 \\ x_2 + x_3 &= 4 \\ 0 &= 0 \end{aligned} \tag{4} \]

The variables $x_1$ and $x_2$ corresponding to pivot columns in the matrix are called basic variables. The other variable, $x_3$, is called a free variable. Whenever a system is consistent, as in (4), the solution set can be described explicitly by solving the reduced system of equations for the basic variables in terms of the free variables. This operation is possible because the reduced echelon form places each basic variable in one and only one equation. In (4), solve the first equation for $x_1$ and the second for $x_2$. (Ignore the third equation; it offers no restriction on the variables.) This gives $x_1 = 1 + 5x_3$ and $x_2 = 4 - x_3$, with $x_3$ free.

[The source text breaks off here and resumes in Section 8.5.]

Thus $\mathbf{p}_{13}$ does not satisfy the second inequality, which shows that $\mathbf{p}_{13}$ is not in $P$. In conclusion, the minimal representation of the polytope $P$ is
\[ \left\{ \begin{bmatrix} 0 \\ 0 \end{bmatrix}, \begin{bmatrix} 7 \\ 0 \end{bmatrix}, \begin{bmatrix} 3 \\ 5 \end{bmatrix}, \begin{bmatrix} 5 \\ 3 \end{bmatrix}, \begin{bmatrix} 0 \\ 6 \end{bmatrix} \right\}. \]

The remainder of this section discusses the construction of two basic polytopes in $\mathbb{R}^3$ (and higher dimensions). The first appears in linear programming problems, the subject of Chapter 9. Both polytopes provide opportunities to visualize $\mathbb{R}^4$ in a remarkable way.

Simplex

A simplex is the convex hull of an affinely independent finite set of vectors. To construct a $k$-dimensional simplex (or $k$-simplex), proceed as follows:

0-simplex $S^0$: a single point $\{\mathbf{v}_1\}$
1-simplex $S^1$: $\operatorname{conv}(S^0 \cup \{\mathbf{v}_2\})$, with $\mathbf{v}_2$ not in $\operatorname{aff} S^0$
2-simplex $S^2$: $\operatorname{conv}(S^1 \cup \{\mathbf{v}_3\})$, with $\mathbf{v}_3$ not in $\operatorname{aff} S^1$
⋮
$k$-simplex $S^k$: $\operatorname{conv}(S^{k-1} \cup \{\mathbf{v}_{k+1}\})$, with $\mathbf{v}_{k+1}$ not in $\operatorname{aff} S^{k-1}$

The simplex $S^1$ is a line segment. The triangle $S^2$ comes from choosing a point $\mathbf{v}_3$ that is not in the line containing $S^1$ and then forming the convex hull with $S^1$.

494

CHAPTER 8

The Geometry of Vector Spaces v1

FIGURE 6 [The simplexes $S^0$, $S^1$, $S^2$, and $S^3$, with vertices $\mathbf{v}_1, \ldots, \mathbf{v}_4$.]

(See Figure 6.) The tetrahedron $S^3$ is produced by choosing a point $\mathbf{v}_4$ not in the plane of $S^2$ and then forming the convex hull with $S^2$.

Before continuing, consider some of the patterns that are appearing. The triangle $S^2$ has three edges. Each of these edges is a line segment like $S^1$. Where do these three line segments come from? One of them is $S^1$. One of them comes by joining the endpoint $\mathbf{v}_2$ to the new point $\mathbf{v}_3$. The third comes from joining the other endpoint $\mathbf{v}_1$ to $\mathbf{v}_3$. You might say that each endpoint in $S^1$ is stretched out into a line segment in $S^2$.

The tetrahedron $S^3$ in Figure 6 has four triangular faces. One of these is the original triangle $S^2$, and the other three come from stretching the edges of $S^2$ out to the new point $\mathbf{v}_4$. Notice too that the vertices of $S^2$ get stretched out into edges in $S^3$. The other edges in $S^3$ come from the edges in $S^2$. This suggests how to “visualize” the four-dimensional $S^4$.

The construction of $S^4$, called a pentatope, involves forming the convex hull of $S^3$ with a point $\mathbf{v}_5$ not in the 3-space of $S^3$. A complete picture is impossible, of course, but Figure 7 is suggestive: $S^4$ has five vertices, and any four of the vertices determine a facet in the shape of a tetrahedron. For example, the figure emphasizes the facet with vertices $\mathbf{v}_1$, $\mathbf{v}_2$, $\mathbf{v}_4$, and $\mathbf{v}_5$ and the facet with vertices $\mathbf{v}_2$, $\mathbf{v}_3$, $\mathbf{v}_4$, and $\mathbf{v}_5$. There are five such facets. Figure 7 identifies all ten edges of $S^4$, and these can be used to visualize the ten triangular faces.

FIGURE 7 The 4-dimensional simplex $S^4$ projected onto $\mathbb{R}^2$, with two tetrahedral facets emphasized.

Figure 8 shows another representation of the 4-dimensional simplex $S^4$. This time the fifth vertex appears “inside” the tetrahedron $S^3$. The highlighted tetrahedral facets also appear to be “inside” $S^3$.

FIGURE 8 The fifth vertex of $S^4$ is “inside” $S^3$.
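The counts quoted for $S^4$ (five vertices, ten edges, ten triangular faces, five tetrahedral facets) can be reproduced by listing vertex subsets. The sketch below relies on the standard fact, consistent with the statement that any four of the vertices determine a facet, that each $k$-face of a simplex is spanned by $k + 1$ of its vertices. The labels and the helper name `k_faces` are illustrative.

```python
from itertools import combinations

vertices = ["v1", "v2", "v3", "v4", "v5"]  # the five vertices of S^4


def k_faces(vertices, k):
    """All k-faces of a simplex, each given by the k + 1 vertices that span it."""
    return list(combinations(vertices, k + 1))


names = ["vertices", "edges", "triangular faces", "tetrahedral facets"]
for k, name in enumerate(names):
    print(name, len(k_faces(vertices, k)))

# The two facets emphasized in Figure 7 appear in the list of 3-faces.
facets = k_faces(vertices, 3)
print(("v1", "v2", "v4", "v5") in facets, ("v2", "v3", "v4", "v5") in facets)
```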

Hypercube

Let $I_i = \overline{\mathbf{0}\mathbf{e}_i}$ be the line segment from the origin $\mathbf{0}$ to the standard basis vector $\mathbf{e}_i$ in $\mathbb{R}^n$. Then for $k$ such that $1 \le k \le n$, the vector sum²
\[ C^k = I_1 + I_2 + \cdots + I_k \]
is called a $k$-dimensional hypercube.

² The vector sum of two sets $A$ and $B$ is defined by $A + B = \{\mathbf{c} : \mathbf{c} = \mathbf{a} + \mathbf{b} \text{ for some } \mathbf{a} \in A \text{ and } \mathbf{b} \in B\}$.

To visualize the construction of $C^k$, start with the simple cases. The hypercube $C^1$ is the line segment $I_1$. If $C^1$ is translated by $\mathbf{e}_2$, the convex hull of its initial and final positions describes a square $C^2$. (See Figure 9.) Translating $C^2$ by $\mathbf{e}_3$ creates the cube $C^3$. A similar translation of $C^3$ by the vector $\mathbf{e}_4$ yields the 4-dimensional hypercube $C^4$.

FIGURE 9 Constructing the cube $C^3$.

Again, this is hard to visualize, but Figure 10 shows a 2-dimensional projection of $C^4$. Each of the edges of $C^3$ is stretched into a square face of $C^4$. And each of the square faces of $C^3$ is stretched into a cubic face of $C^4$. Figure 11 shows three facets of $C^4$. Part (a) highlights the cube that comes from the left square face of $C^3$. Part (b) shows the cube that comes from the front square face of $C^3$. And part (c) emphasizes the cube that comes from the top square face of $C^3$.

FIGURE 10 $C^4$ projected onto $\mathbb{R}^2$.

FIGURE 11 Three of the cubic facets of $C^4$, shown in parts (a), (b), and (c).

Figure 12 shows another representation of $C^4$ in which the translated cube is placed “inside” $C^3$. This makes it easier to visualize the cubic facets of $C^4$, since there is less distortion.

FIGURE 12 The translated image of $C^3$ is placed “inside” $C^3$ to obtain $C^4$.

Altogether, the 4-dimensional cube $C^4$ has eight cubic faces. Two come from the original and translated images of $C^3$, and six come from the square faces of $C^3$ that are stretched into cubes. The square 2-dimensional faces of $C^4$ come from the square faces of $C^3$ and its translate, and the edges of $C^3$ that are stretched into squares. Thus there are $2 \times 6 + 12 = 24$ square faces. To count the edges, take 2 times the number of edges in $C^3$ and add the number of vertices in $C^3$. This makes $2 \times 12 + 8 = 32$ edges in $C^4$. The vertices in $C^4$ all come from $C^3$ and its translate, so there are $2 \times 8 = 16$ vertices.

One of the truly remarkable results in the study of polytopes is the following formula, first proved by Leonhard Euler (1707–1783). It establishes a simple relationship between the number of faces of different dimensions in a polytope. To simplify the statement of the formula, let $f_k(P)$ denote the number of $k$-dimensional faces of an $n$-dimensional polytope $P$.³

Euler’s formula:
\[ \sum_{k=0}^{n-1} (-1)^k f_k(P) = 1 + (-1)^{n-1} \]

In particular, when $n = 3$, $v - e + f = 2$, where $v$, $e$, and $f$ denote the number of vertices, edges, and facets (respectively) of $P$.
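The counting argument for $C^4$ generalizes to a recursion: every $k$-face of $C^n$ is either a $k$-face of $C^{n-1}$ or of its translate, or a $(k-1)$-face of $C^{n-1}$ stretched by the translation. The sketch below applies that recursion and checks the totals against Euler's formula. The function names are illustrative, and treating $C^{n-1}$ itself as its own single $(n-1)$-face is a bookkeeping assumption used to count the two "end" facets.

```python
def hypercube_face_counts(n):
    """Return [f_0, ..., f_{n-1}] for the hypercube C^n using
    f_k(C^n) = 2 * f_k(C^(n-1)) + f_(k-1)(C^(n-1))."""
    counts = [2]  # C^1 is a segment with two endpoints
    for dim in range(2, n + 1):
        prev = counts + [1]  # count C^(dim-1) itself as one (dim-1)-face
        counts = [
            2 * prev[k] + (prev[k - 1] if k > 0 else 0)
            for k in range(dim)
        ]
    return counts


def euler_sum(counts):
    """Alternating sum f_0 - f_1 + f_2 - ... from Euler's formula."""
    return sum((-1) ** k * f for k, f in enumerate(counts))


for n in (3, 4):
    counts = hypercube_face_counts(n)
    print(n, counts, euler_sum(counts), 1 + (-1) ** (n - 1))
```

For $n = 4$ the list reproduces the 16 vertices, 32 edges, 24 square faces, and 8 cubic faces counted above.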

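PRACTICE PROBLEM

Find the minimal representation of the polytope $P$ defined by the inequalities $A\mathbf{x} \le \mathbf{b}$ and $\mathbf{x} \ge \mathbf{0}$, when
\[ A = \begin{bmatrix} 1 & 3 \\ 1 & 2 \\ 2 & 1 \end{bmatrix} \quad \text{and} \quad \mathbf{b} = \begin{bmatrix} 12 \\ 9 \\ 12 \end{bmatrix}. \]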

8.5 EXERCISES

1. Given points $\mathbf{p}_1 = \begin{bmatrix} 1 \\ 0 \end{bmatrix}$, $\mathbf{p}_2 = \begin{bmatrix} 2 \\ 3 \end{bmatrix}$, and $\mathbf{p}_3 = \begin{bmatrix} -1 \\ 2 \end{bmatrix}$ in $\mathbb{R}^2$, let $S = \operatorname{conv}\{\mathbf{p}_1, \mathbf{p}_2, \mathbf{p}_3\}$. For each linear functional $f$, find the maximum value $m$ of $f$ on the set $S$, and find all points $\mathbf{x}$ in $S$ at which $f(\mathbf{x}) = m$.
   a. $f(x_1, x_2) = x_1 - x_2$
   b. $f(x_1, x_2) = x_1 + x_2$
   c. $f(x_1, x_2) = -3x_1 + x_2$

2. Given points $\mathbf{p}_1 = \begin{bmatrix} 0 \\ -1 \end{bmatrix}$, $\mathbf{p}_2 = \begin{bmatrix} 2 \\ 1 \end{bmatrix}$, and $\mathbf{p}_3 = \begin{bmatrix} 1 \\ 2 \end{bmatrix}$ in $\mathbb{R}^2$, let $S = \operatorname{conv}\{\mathbf{p}_1, \mathbf{p}_2, \mathbf{p}_3\}$. For each linear functional $f$, find the maximum value $m$ of $f$ on the set $S$, and find all points $\mathbf{x}$ in $S$ at which $f(\mathbf{x}) = m$.
   a. $f(x_1, x_2) = x_1 + x_2$
   b. $f(x_1, x_2) = x_1 - x_2$
   c. $f(x_1, x_2) = -2x_1 + x_2$

3. Repeat Exercise 1 where $m$ is the minimum value of $f$ on $S$ instead of the maximum value.

4. Repeat Exercise 2 where $m$ is the minimum value of $f$ on $S$ instead of the maximum value.

In Exercises 5–8, find the minimal representation of the polytope defined by the inequalities $A\mathbf{x} \le \mathbf{b}$ and $\mathbf{x} \ge \mathbf{0}$.

5. $A = \begin{bmatrix} 1 & 2 \\ 3 & 1 \end{bmatrix}$, $\mathbf{b} = \begin{bmatrix} 10 \\ 15 \end{bmatrix}$

6. $A = \begin{bmatrix} 2 & 3 \\ 4 & 1 \end{bmatrix}$, $\mathbf{b} = \begin{bmatrix} 18 \\ 16 \end{bmatrix}$

7. $A = \begin{bmatrix} 1 & 3 \\ 1 & 1 \\ 4 & 1 \end{bmatrix}$, $\mathbf{b} = \begin{bmatrix} 18 \\ 10 \\ 28 \end{bmatrix}$

8. $A = \begin{bmatrix} 2 & 1 \\ 1 & 1 \\ 1 & 2 \end{bmatrix}$, $\mathbf{b} = \begin{bmatrix} 8 \\ 6 \\ 7 \end{bmatrix}$

9. Let $S = \{(x, y) : x^2 + (y - 1)^2 \le 1\} \cup \{(3, 0)\}$. Is the origin an extreme point of $\operatorname{conv} S$? Is the origin a vertex of $\operatorname{conv} S$?

10. Find an example of a closed convex set $S$ in $\mathbb{R}^2$ such that its profile $P$ is nonempty but $\operatorname{conv} P \ne S$.

11. Find an example of a bounded convex set $S$ in $\mathbb{R}^2$ such that its profile $P$ is nonempty but $\operatorname{conv} P \ne S$.

12. a. Determine the number of $k$-faces of the 5-dimensional simplex $S^5$ for $k = 0, 1, \ldots, 4$. Verify that your answer satisfies Euler’s formula.
    b. Make a chart of the values of $f_k(S^n)$ for $n = 1, \ldots, 5$ and $k = 0, 1, \ldots, 4$. Can you see a pattern? Guess a general formula for $f_k(S^n)$.

13. a. Determine the number of $k$-faces of the 5-dimensional hypercube $C^5$ for $k = 0, 1, \ldots, 4$. Verify that your answer satisfies Euler’s formula.
    b. Make a chart of the values of $f_k(C^n)$ for $n = 1, \ldots, 5$ and $k = 0, 1, \ldots, 4$. Can you see a pattern? Guess a general formula for $f_k(C^n)$.

14. Suppose $\mathbf{v}_1, \ldots, \mathbf{v}_k$ are linearly independent vectors in $\mathbb{R}^n$ $(1 \le k \le n)$. Then the set $X^k = \operatorname{conv}\{\pm\mathbf{v}_1, \ldots, \pm\mathbf{v}_k\}$ is called a $k$-crosspolytope.
    a. Sketch $X^1$ and $X^2$.
    b. Determine the number of $k$-faces of the 3-dimensional crosspolytope $X^3$ for $k = 0, 1, 2$. What is another name for $X^3$?
    c. Determine the number of $k$-faces of the 4-dimensional crosspolytope $X^4$ for $k = 0, 1, 2, 3$. Verify that your answer satisfies Euler’s formula.
    d. Find a formula for $f_k(X^n)$, the number of $k$-faces of $X^n$, for $0 \le k \le n - 1$.

15. A $k$-pyramid $P^k$ is the convex hull of a $(k - 1)$-polytope $Q$ and a point $\mathbf{x} \notin \operatorname{aff} Q$. Find a formula for each of the following in terms of $f_j(Q)$, $j = 0, \ldots, n - 1$.
    a. The number of vertices of $P^n$: $f_0(P^n)$.
    b. The number of $k$-faces of $P^n$: $f_k(P^n)$, for $1 \le k \le n - 2$.
    c. The number of $(n - 1)$-dimensional facets of $P^n$: $f_{n-1}(P^n)$.

In Exercises 16 and 17, mark each statement True or False. Justify each answer.

16. a. A polytope is the convex hull of a finite set of points.
    b. Let $\mathbf{p}$ be an extreme point of a convex set $S$. If $\mathbf{u}, \mathbf{v} \in S$, $\mathbf{p} \in \overline{\mathbf{u}\mathbf{v}}$, and $\mathbf{p} \ne \mathbf{u}$, then $\mathbf{p} = \mathbf{v}$.
    c. If $S$ is a nonempty convex subset of $\mathbb{R}^n$, then $S$ is the convex hull of its profile.
    d. The 4-dimensional simplex $S^4$ has exactly five facets, each of which is a 3-dimensional tetrahedron.

17. a. A cube in $\mathbb{R}^3$ has exactly five facets.
    b. A point $\mathbf{p}$ is an extreme point of a polytope $P$ if and only if $\mathbf{p}$ is a vertex of $P$.
    c. If $S$ is a nonempty compact convex set and a linear functional attains its maximum at a point $\mathbf{p}$, then $\mathbf{p}$ is an extreme point of $S$.
    d. A 2-dimensional polytope always has the same number of vertices and edges.

18. Let $\mathbf{v}$ be an element of the convex set $S$. Prove that $\mathbf{v}$ is an extreme point of $S$ if and only if the set $\{\mathbf{x} \in S : \mathbf{x} \ne \mathbf{v}\}$ is convex.

19. If $c \in \mathbb{R}$ and $S$ is a set, define $cS = \{c\mathbf{x} : \mathbf{x} \in S\}$. Let $S$ be a convex set and suppose $c > 0$ and $d > 0$. Prove that $cS + dS = (c + d)S$.

20. Find an example to show that the convexity of $S$ is necessary in Exercise 19.

21. If $A$ and $B$ are convex sets, prove that $A + B$ is convex.

22. A polyhedron (3-polytope) is called regular if all its facets are congruent regular polygons and all the angles at the vertices are equal. Supply the details in the following proof that there are only five regular polyhedra.
    a. Suppose that a regular polyhedron has $r$ facets, each of which is a $k$-sided regular polygon, and that $s$ edges meet at each vertex. Letting $v$ and $e$ denote the numbers of vertices and edges in the polyhedron, explain why $kr = 2e$ and $sv = 2e$.
    b. Use Euler’s formula to show that $\dfrac{1}{s} + \dfrac{1}{k} = \dfrac{1}{2} + \dfrac{1}{e}$.
    c. Find all the integral solutions of the equation in part (b) that satisfy the geometric constraints of the problem. (How small can $k$ and $s$ be?)

    For your information, the five regular polyhedra are the tetrahedron (4, 6, 4), the cube (8, 12, 6), the octahedron (6, 12, 8), the dodecahedron (20, 30, 12), and the icosahedron (12, 30, 20). (The numbers in parentheses indicate the numbers of vertices, edges, and faces, respectively.)

³ A proof when $n = 3$ is presented in Steven R. Lay, Convex Sets and Their Applications (New York: John Wiley & Sons, 1982; Mineola, NY: Dover Publications, 2007), p. 131.

SOLUTION TO PRACTICE PROBLEM

The matrix inequality $A\mathbf{x} \le \mathbf{b}$ yields the following system of inequalities:

(a) $x_1 + 3x_2 \le 12$
(b) $x_1 + 2x_2 \le 9$
(c) $2x_1 + x_2 \le 12$

The condition $\mathbf{x} \ge \mathbf{0}$ places the polytope in the first quadrant of the plane. One vertex is $(0, 0)$. The $x_1$-intercepts of the three lines (when $x_2 = 0$) are 12, 9, and 6, so $(6, 0)$ is a vertex. The $x_2$-intercepts of the three lines (when $x_1 = 0$) are 4, 4.5, and 12, so $(0, 4)$ is a vertex.

How do the three boundary lines intersect for positive values of $x_1$ and $x_2$? The intersection of (a) and (b) is at $\mathbf{p}_{ab} = (3, 3)$. Testing $\mathbf{p}_{ab}$ in (c) gives $2(3) + 1(3) = 9 < 12$, so $\mathbf{p}_{ab}$ is in $P$. The intersection of (b) and (c) is at $\mathbf{p}_{bc} = (5, 2)$. Testing $\mathbf{p}_{bc}$ in (a) gives $1(5) + 3(2) = 11 < 12$, so $\mathbf{p}_{bc}$ is in $P$. The intersection of (a) and (c) is at $\mathbf{p}_{ac} = (4.8, 2.4)$. Testing $\mathbf{p}_{ac}$ in (b) gives $1(4.8) + 2(2.4) = 9.6 > 9$. So $\mathbf{p}_{ac}$ is not in $P$.

Finally, the five vertices (extreme points) of the polytope are $(0, 0)$, $(6, 0)$, $(5, 2)$, $(3, 3)$, and $(0, 4)$. These points form the minimal representation of $P$. This is displayed graphically in Figure 13.

FIGURE 13 [The polytope $P$ in the $x_1x_2$-plane, bounded by the lines (a), (b), and (c) and the coordinate axes.]
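The procedure used in this solution (intersect boundary lines in pairs, then test each intersection point against the remaining inequalities) can be sketched as a brute-force search. This is an illustration for small planar problems only, with hypothetical helper names; it is not an efficient method for larger systems.

```python
from fractions import Fraction
from itertools import combinations

# Each constraint (a1, a2, b) means a1*x1 + a2*x2 <= b.
constraints = [
    (1, 3, 12),   # (a)
    (1, 2, 9),    # (b)
    (2, 1, 12),   # (c)
    (-1, 0, 0),   # x1 >= 0
    (0, -1, 0),   # x2 >= 0
]


def intersection(c1, c2):
    """Intersection of two boundary lines by Cramer's rule, or None if parallel."""
    (a, b, e), (c, d, f) = c1, c2
    det = a * d - b * c
    if det == 0:
        return None
    return (Fraction(e * d - b * f, det), Fraction(a * f - e * c, det))


def feasible(point):
    """True when the point satisfies every inequality."""
    x1, x2 = point
    return all(a1 * x1 + a2 * x2 <= b for a1, a2, b in constraints)


found = set()
for c1, c2 in combinations(constraints, 2):
    point = intersection(c1, c2)
    if point is not None and feasible(point):
        found.add(point)

vertices = sorted(found)
print([(str(x), str(y)) for x, y in vertices])
```

Exact fractions are used so that a point such as $(4.8, 2.4)$ is rejected by a true comparison rather than by a floating-point tolerance.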

8.6 CURVES AND SURFACES

For thousands of years, builders used long thin strips of wood to create the hull of a boat. In more recent times, designers used long, flexible metal strips to lay out the surfaces of cars and airplanes. Weights and pegs shaped the strips into smooth curves called natural cubic splines. The curve between two successive control points (pegs or weights) has a parametric representation using cubic polynomials. Unfortunately, such curves have the property that moving one control point affects the shape of the entire curve, because of physical forces that the pegs and weights exert on the strip. Design engineers had long wanted local control of the curve, in which movement of one control point would affect only a small portion of the curve. In 1962, a French automotive engineer, Pierre Bézier, solved this problem by adding extra control points and using a class of curves now called by his name.

Bézier Curves

The curves described below play an important role in computer graphics as well as engineering. For example, they are used in Adobe Illustrator and Macromedia Freehand, and in application programming languages such as OpenGL. These curves permit a program to store exact information about curved segments and surfaces in a relatively small number of control points. All graphics commands for the segments and surfaces have only to be computed for the control points. The special structure of these curves also speeds up other calculations in the “graphics pipeline” that creates the final display on the viewing screen.

Exercises in Section 8.3 introduced quadratic Bézier curves and showed one method for constructing Bézier curves of higher degree. The discussion here focuses on quadratic and cubic Bézier curves, which are determined by three or four control points, denoted by $\mathbf{p}_0$, $\mathbf{p}_1$, $\mathbf{p}_2$, and $\mathbf{p}_3$. These points can be in $\mathbb{R}^2$ or $\mathbb{R}^3$, or they can be represented by homogeneous forms in $\mathbb{R}^3$ or $\mathbb{R}^4$. The standard parametric descriptions of these curves, for $0 \le t \le 1$, are
\[ \mathbf{w}(t) = (1 - t)^2\mathbf{p}_0 + 2t(1 - t)\mathbf{p}_1 + t^2\mathbf{p}_2 \tag{1} \]
\[ \mathbf{x}(t) = (1 - t)^3\mathbf{p}_0 + 3t(1 - t)^2\mathbf{p}_1 + 3t^2(1 - t)\mathbf{p}_2 + t^3\mathbf{p}_3 \tag{2} \]

Figure 1 shows two typical curves. Usually, the curves pass through only the initial and terminal control points, but a Bézier curve is always in the convex hull of its control points. (See Exercises 21–24 in Section 8.3.)

FIGURE 1 Quadratic and cubic Bézier curves.
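Equations (1) and (2) can be evaluated directly. The following minimal sketch works coordinate by coordinate on points given as tuples; the control points are hypothetical sample values chosen for illustration, not taken from Figure 1.

```python
def quadratic_bezier(p0, p1, p2, t):
    """Equation (1): w(t) = (1-t)^2 p0 + 2t(1-t) p1 + t^2 p2."""
    s = 1 - t
    return tuple(s * s * a + 2 * t * s * b + t * t * c
                 for a, b, c in zip(p0, p1, p2))


def cubic_bezier(p0, p1, p2, p3, t):
    """Equation (2): x(t) = (1-t)^3 p0 + 3t(1-t)^2 p1 + 3t^2(1-t) p2 + t^3 p3."""
    s = 1 - t
    return tuple(s ** 3 * a + 3 * t * s ** 2 * b + 3 * t ** 2 * s * c + t ** 3 * d
                 for a, b, c, d in zip(p0, p1, p2, p3))


# Hypothetical control points in the plane.
p0, p1, p2, p3 = (0, 0), (1, 2), (3, 2), (4, 0)

print(cubic_bezier(p0, p1, p2, p3, 0))    # initial control point
print(cubic_bezier(p0, p1, p2, p3, 1))    # terminal control point
print(cubic_bezier(p0, p1, p2, p3, 0.5))  # an interior point of the curve
print(quadratic_bezier(p0, p1, p2, 0.5))
```

The weights in each formula are nonnegative and sum to 1 for $0 \le t \le 1$, which is why every computed point is a convex combination of the control points.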

Bézier curves are useful in computer graphics because their essential properties are preserved under the action of linear transformations and translations. For instance, if $A$ is a matrix of appropriate size, then from the linearity of matrix multiplication, for $0 \le t \le 1$,

$$\begin{aligned}
A\mathbf{x}(t) &= A\left[(1-t)^3\,\mathbf{p}_0 + 3t(1-t)^2\,\mathbf{p}_1 + 3t^2(1-t)\,\mathbf{p}_2 + t^3\,\mathbf{p}_3\right] \\
&= (1-t)^3 A\mathbf{p}_0 + 3t(1-t)^2 A\mathbf{p}_1 + 3t^2(1-t) A\mathbf{p}_2 + t^3 A\mathbf{p}_3
\end{aligned}$$

The new control points are $A\mathbf{p}_0, \ldots, A\mathbf{p}_3$. Translations of Bézier curves are considered in Exercise 1.

The curves in Figure 1 suggest that the control points determine the tangent lines to the curves at the initial and terminal control points. Recall from calculus that for any parametric curve, say $\mathbf{y}(t)$, the direction of the tangent line to the curve at a point $\mathbf{y}(t)$ is given by the derivative $\mathbf{y}'(t)$, called the tangent vector of the curve. (This derivative is computed entry by entry.)
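The invariance of a cubic Bézier curve under a linear map can be checked numerically: transforming a point of the curve should give the same result as evaluating the curve whose control points are $A\mathbf{p}_0, \ldots, A\mathbf{p}_3$. The sketch below assumes a hypothetical rotation matrix and sample control points; all names are illustrative.

```python
def bezier_point(points, t):
    """Evaluate the cubic Bézier curve (2) at parameter t."""
    s = 1.0 - t
    w = [s**3, 3 * t * s**2, 3 * t**2 * s, t**3]
    dim = len(points[0])
    return tuple(sum(w[i] * points[i][k] for i in range(4)) for k in range(dim))


def apply(A, v):
    """Matrix-vector product A v, with A stored as a list of rows."""
    return tuple(sum(A[i][j] * v[j] for j in range(len(v))) for i in range(len(A)))


# Hypothetical data: a 90-degree rotation and four control points in R^2.
A = [[0.0, -1.0], [1.0, 0.0]]
P = [(0.0, 0.0), (1.0, 2.0), (3.0, 2.0), (4.0, 0.0)]
t = 0.25

lhs = apply(A, bezier_point(P, t))               # transform the curve point
rhs = bezier_point([apply(A, p) for p in P], t)  # curve of transformed points
```

The two results agree up to rounding, so a graphics program only needs to transform the control points rather than every point of the curve.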

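EXAMPLE 1 Determine how the tangent vector of the quadratic Bézier curve $\mathbf{w}(t)$ is related to the control points of the curve, at $t = 0$ and $t = 1$.

SOLUTION Write the weights in equation (1) as simple polynomials:

$$\mathbf{w}(t) = (1 - 2t + t^2)\,\mathbf{p}_0 + (2t - 2t^2)\,\mathbf{p}_1 + t^2\,\mathbf{p}_2$$

Then, because differentiation is a linear transformation on functions,

$$\mathbf{w}'(t) = (-2 + 2t)\,\mathbf{p}_0 + (2 - 4t)\,\mathbf{p}_1 + 2t\,\mathbf{p}_2$$

So

$$\begin{aligned}
\mathbf{w}'(0) &= -2\mathbf{p}_0 + 2\mathbf{p}_1 = 2(\mathbf{p}_1 - \mathbf{p}_0) \\
\mathbf{w}'(1) &= -2\mathbf{p}_1 + 2\mathbf{p}_2 = 2(\mathbf{p}_2 - \mathbf{p}_1)
\end{aligned}$$

The tangent vector at $\mathbf{p}_0$, for instance, points from $\mathbf{p}_0$ to $\mathbf{p}_1$, but it is twice as long as the segment from $\mathbf{p}_0$ to $\mathbf{p}_1$. Notice that $\mathbf{w}'(0) = \mathbf{0}$ when $\mathbf{p}_1 = \mathbf{p}_0$. In this case, $\mathbf{w}(t) = (1 - t^2)\,\mathbf{p}_1 + t^2\,\mathbf{p}_2$, and the graph of $\mathbf{w}(t)$ is the line segment from $\mathbf{p}_1$ to $\mathbf{p}_2$.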


Connecting Two Bézier Curves

Two basic Bézier curves can be joined end to end, with the terminal point of the first curve $\mathbf{x}(t)$ being the initial point $\mathbf{p}_2$ of the second curve $\mathbf{y}(t)$. The combined curve is said to have $G^0$ geometric continuity (at $\mathbf{p}_2$) because the two segments join at $\mathbf{p}_2$. If the tangent line to curve 1 at $\mathbf{p}_2$ has a different direction than the tangent line to curve 2, then a “corner,” or abrupt change of direction, may be apparent at $\mathbf{p}_2$. See Figure 2.

FIGURE 2 $G^0$ continuity at $\mathbf{p}_2$.

To avoid a sharp bend, it usually suffices to adjust the curves to have what is called $G^1$ geometric continuity, where both tangent vectors at $\mathbf{p}_2$ point in the same direction. That is, the derivatives $\mathbf{x}'(1)$ and $\mathbf{y}'(0)$ point in the same direction, even though their magnitudes may be different. When the tangent vectors are actually equal at $\mathbf{p}_2$, the tangent vector is continuous at $\mathbf{p}_2$, and the combined curve is said to have $C^1$ continuity, or $C^1$ parametric continuity. Figure 3 shows $G^1$ continuity in (a) and $C^1$ continuity in (b).

FIGURE 3 (a) $G^1$ continuity and (b) $C^1$ continuity.

EXAMPLE 2 Let $\mathbf{x}(t)$ and $\mathbf{y}(t)$ determine two quadratic Bézier curves, with control points $\{\mathbf{p}_0, \mathbf{p}_1, \mathbf{p}_2\}$ and $\{\mathbf{p}_2, \mathbf{p}_3, \mathbf{p}_4\}$, respectively. The curves are joined at $\mathbf{p}_2 = \mathbf{x}(1) = \mathbf{y}(0)$.

a. Suppose the combined curve has $G^1$ continuity (at $\mathbf{p}_2$). What algebraic restriction does this condition impose on the control points? Express this restriction in geometric language.
b. Repeat part (a) for $C^1$ continuity.

SOLUTION
a. From Example 1, $\mathbf{x}'(1) = 2(\mathbf{p}_2 - \mathbf{p}_1)$. Also, using the control points for $\mathbf{y}(t)$ in place of $\mathbf{w}(t)$, Example 1 shows that $\mathbf{y}'(0) = 2(\mathbf{p}_3 - \mathbf{p}_2)$. $G^1$ continuity means that $\mathbf{y}'(0) = k\,\mathbf{x}'(1)$ for some positive constant $k$. Equivalently,

$$\mathbf{p}_3 - \mathbf{p}_2 = k(\mathbf{p}_2 - \mathbf{p}_1), \quad \text{with } k > 0 \tag{3}$$


Geometrically, (3) implies that $\mathbf{p}_2$ lies on the line segment from $\mathbf{p}_1$ to $\mathbf{p}_3$. To prove this, let $t = (k + 1)^{-1}$, and note that $0 < t < 1$. Solve for $k$ to obtain $k = (1 - t)/t$. When this expression is used for $k$ in (3), a rearrangement shows that $\mathbf{p}_2 = (1 - t)\,\mathbf{p}_1 + t\,\mathbf{p}_3$, which verifies the assertion about $\mathbf{p}_2$.

b. $C^1$ continuity means that $\mathbf{y}'(0) = \mathbf{x}'(1)$. Thus $2(\mathbf{p}_3 - \mathbf{p}_2) = 2(\mathbf{p}_2 - \mathbf{p}_1)$, so $\mathbf{p}_3 - \mathbf{p}_2 = \mathbf{p}_2 - \mathbf{p}_1$, and $\mathbf{p}_2 = (\mathbf{p}_1 + \mathbf{p}_3)/2$. Geometrically, $\mathbf{p}_2$ is the midpoint of the line segment from $\mathbf{p}_1$ to $\mathbf{p}_3$. See Figure 3.

Figure 4 shows $C^1$ continuity for two cubic Bézier curves. Notice how the point joining the two segments lies in the middle of the line segment between the adjacent control points.

FIGURE 4 Two cubic Bézier curves.
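The conditions found in Example 2 translate into a short test on the three control points around the join of two quadratic curves. The sketch below is one possible way to classify the join, assuming the tangent formulas $\mathbf{x}'(1) = 2(\mathbf{p}_2 - \mathbf{p}_1)$ and $\mathbf{y}'(0) = 2(\mathbf{p}_3 - \mathbf{p}_2)$ from Example 1. The function name, the tolerance, and the use of the Cauchy–Schwarz equality to detect parallel vectors are choices made here for illustration.

```python
def sub(a, b):
    """Componentwise difference a - b."""
    return tuple(x - y for x, y in zip(a, b))


def join_continuity(p1, p2, p3, tol=1e-12):
    """Classify the join at p2 of quadratic curves {p0,p1,p2} and {p2,p3,p4}.

    Returns "C1" if p3 - p2 == p2 - p1, "G1" if p3 - p2 = k (p2 - p1)
    with k > 0 (equation (3)), and "G0" otherwise.
    """
    a = sub(p2, p1)   # half of x'(1)
    b = sub(p3, p2)   # half of y'(0)
    if all(abs(x - y) <= tol for x, y in zip(a, b)):
        return "C1"
    dot = sum(x * y for x, y in zip(a, b))
    aa = sum(x * x for x in a)
    bb = sum(y * y for y in b)
    # Parallel with the same direction: (a.b)^2 == |a|^2 |b|^2 and a.b > 0.
    if dot > 0 and abs(aa * bb - dot * dot) <= tol * max(1.0, aa * bb):
        return "G1"
    return "G0"


# Hypothetical control points near the join.
midpoint_case = join_continuity((1, 2), (2, 2), (3, 2))
same_direction_case = join_continuity((1, 2), (2, 2), (5, 2))
corner_case = join_continuity((1, 2), (2, 2), (3, 0))
```

In the first call $\mathbf{p}_2$ is the midpoint of the segment from $\mathbf{p}_1$ to $\mathbf{p}_3$, in the second it merely lies on that segment, and in the third the three points are not collinear.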

Two curves have $C^2$ (parametric) continuity when they have $C^1$ continuity and the second derivatives $\mathbf{x}''(1)$ and $\mathbf{y}''(0)$ are equal. This is possible for cubic Bézier curves, but it severely limits the positions of the control points. Another class of cubic curves, called B-splines, always have $C^2$ continuity because each pair of curves share three control points rather than one. Graphics figures using B-splines have more control points and consequently require more computations. Some exercises for this section examine these curves.

Surprisingly, if $\mathbf{x}(t)$ and $\mathbf{y}(t)$ join at $\mathbf{p}_3$, the apparent smoothness of the curve at $\mathbf{p}_3$ is usually the same for both $G^1$ continuity and $C^1$ continuity. This is because the magnitude of $\mathbf{x}'(t)$ is not related to the physical shape of the curve. The magnitude reflects only the mathematical parameterization of the curve. For instance, if a new vector function $\mathbf{z}(t)$ equals $\mathbf{x}(2t)$, then the point $\mathbf{z}(t)$ traverses the curve from $\mathbf{p}_0$ to $\mathbf{p}_3$ twice as fast as the original version, because $2t$ reaches 1 when $t$ is .5. But, by the chain rule of calculus, $\mathbf{z}'(t) = 2 \cdot \mathbf{x}'(2t)$, so the tangent vector to $\mathbf{z}(t)$ at $\mathbf{p}_3$ is twice the tangent vector to $\mathbf{x}(t)$ at $\mathbf{p}_3$.

In practice, many simple Bézier curves are joined to create graphics objects. Typesetting programs provide one important application, because many letters in a type font involve curved segments. Each letter in a PostScript® font, for example, is stored as a set of control points, along with information on how to construct the “outline” of the letter using line segments and Bézier curves. Enlarging such a letter basically requires multiplying the coordinates of each control point by one constant scale factor. Once the outline of the letter has been computed, the appropriate solid parts of the letter are filled in. Figure 5 illustrates this for a character in a PostScript font. Note the control points.


FIGURE 5 A PostScript character.

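Matrix Equations for Bézier Curves

Since a Bézier curve is a linear combination of control points using polynomials as weights, the formula for $\mathbf{x}(t)$ may be written as

$$\begin{aligned}
\mathbf{x}(t) &= \begin{bmatrix} \mathbf{p}_0 & \mathbf{p}_1 & \mathbf{p}_2 & \mathbf{p}_3 \end{bmatrix}
\begin{bmatrix} (1-t)^3 \\ 3t(1-t)^2 \\ 3t^2(1-t) \\ t^3 \end{bmatrix} \\
&= \begin{bmatrix} \mathbf{p}_0 & \mathbf{p}_1 & \mathbf{p}_2 & \mathbf{p}_3 \end{bmatrix}
\begin{bmatrix} 1 - 3t + 3t^2 - t^3 \\ 3t - 6t^2 + 3t^3 \\ 3t^2 - 3t^3 \\ t^3 \end{bmatrix} \\
&= \begin{bmatrix} \mathbf{p}_0 & \mathbf{p}_1 & \mathbf{p}_2 & \mathbf{p}_3 \end{bmatrix}
\begin{bmatrix} 1 & -3 & 3 & -1 \\ 0 & 3 & -6 & 3 \\ 0 & 0 & 3 & -3 \\ 0 & 0 & 0 & 1 \end{bmatrix}
\begin{bmatrix} 1 \\ t \\ t^2 \\ t^3 \end{bmatrix}
\end{aligned}$$

The matrix whose columns are the four control points is called a geometry matrix, $G$. The $4 \times 4$ matrix of polynomial coefficients is the Bézier basis matrix, $M_B$. If $\mathbf{u}(t)$ is the column vector of powers of $t$, then the Bézier curve is given by

$$\mathbf{x}(t) = G M_B \mathbf{u}(t) \tag{4}$$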
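Equation (4) can be tried out with exact arithmetic. The following sketch stores $M_B$ as a list of rows and the geometry matrix as a list of its four columns (the control points). The function names and the sample points are assumptions made for illustration. Using `Fraction` avoids rounding, so the weights $M_B\mathbf{u}(t)$ can be compared exactly with the factored forms $(1-t)^3, \ldots, t^3$.

```python
from fractions import Fraction

# Bézier basis matrix M_B, as in the text.
M_B = [
    [1, -3, 3, -1],
    [0, 3, -6, 3],
    [0, 0, 3, -3],
    [0, 0, 0, 1],
]


def weights_from_matrix(t):
    """Return M_B u(t), where u(t) = (1, t, t^2, t^3)."""
    u = [1, t, t**2, t**3]
    return [sum(M_B[i][j] * u[j] for j in range(4)) for i in range(4)]


def bezier_matrix_form(G, t):
    """Evaluate x(t) = G M_B u(t); G lists the four control points."""
    w = weights_from_matrix(t)
    dim = len(G[0])
    return tuple(sum(w[i] * G[i][k] for i in range(4)) for k in range(dim))


# Hypothetical control points and parameter value.
G = [(0, 0), (1, 2), (3, 2), (4, 0)]
t = Fraction(3, 10)
w = weights_from_matrix(t)
x_half = bezier_matrix_form(G, Fraction(1, 2))
```

Changing only `M_B` would give another family of parametric cubics with the same calling code, which is the point of the factored form.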

Other parametric cubic curves in computer graphics are written in this form, too. For instance, if the entries in the matrix $M_B$ are changed appropriately, the resulting curves are B-splines. They are “smoother” than Bézier curves, but they do not pass through any of the control points. A Hermite cubic curve arises when the matrix $M_B$ is replaced by a Hermite basis matrix. In this case, the columns of the geometry matrix consist of the starting and ending points of the curves and the tangent vectors to the curves at those points.¹

The Bézier curve in equation (4) can also be “factored” in another way, to be used in the discussion of Bézier surfaces. For convenience later, the parameter $t$ is replaced by a parameter $s$:

$$\begin{aligned}
\mathbf{x}(s) &= \mathbf{u}(s)^T M_B^T \begin{bmatrix} \mathbf{p}_0 \\ \mathbf{p}_1 \\ \mathbf{p}_2 \\ \mathbf{p}_3 \end{bmatrix}
= \begin{bmatrix} 1 & s & s^2 & s^3 \end{bmatrix}
\begin{bmatrix} 1 & 0 & 0 & 0 \\ -3 & 3 & 0 & 0 \\ 3 & -6 & 3 & 0 \\ -1 & 3 & -3 & 1 \end{bmatrix}
\begin{bmatrix} \mathbf{p}_0 \\ \mathbf{p}_1 \\ \mathbf{p}_2 \\ \mathbf{p}_3 \end{bmatrix} \\
&= \begin{bmatrix} (1-s)^3 & 3s(1-s)^2 & 3s^2(1-s) & s^3 \end{bmatrix}
\begin{bmatrix} \mathbf{p}_0 \\ \mathbf{p}_1 \\ \mathbf{p}_2 \\ \mathbf{p}_3 \end{bmatrix}
\end{aligned} \tag{5}$$

¹ The term basis matrix comes from the rows of the matrix that list the coefficients of the blending polynomials used to define the curve. For a cubic Bézier curve, the four polynomials are $(1-t)^3$, $3t(1-t)^2$, $3t^2(1-t)$, and $t^3$. They form a basis for the space $\mathbb{P}_3$ of polynomials of degree 3 or less. Each entry in the vector $\mathbf{x}(t)$ is a linear combination of these polynomials. The weights come from the rows of the geometry matrix $G$ in (4).

This formula is not quite the same as the transpose of the product on the right of (4), because $\mathbf{x}(s)$ and the control points appear in (5) without transpose symbols. The matrix of control points in (5) is called a geometry vector. This should be viewed as a $4 \times 1$ block (partitioned) matrix whose entries are column vectors. The matrix to the left of the geometry vector, in the second part of (5), can be viewed as a block matrix, too, with a scalar in each block. The partitioned matrix multiplication makes sense, because each (vector) entry in the geometry vector can be left-multiplied by a scalar as well as by a matrix. Thus, the column vector $\mathbf{x}(s)$ is represented by (5).

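Bézier Surfaces

A 3D bicubic surface patch can be constructed from a set of four Bézier curves. Consider the four geometry matrices

$$\begin{aligned}
&\begin{bmatrix} \mathbf{p}_{11} & \mathbf{p}_{12} & \mathbf{p}_{13} & \mathbf{p}_{14} \end{bmatrix} \\
&\begin{bmatrix} \mathbf{p}_{21} & \mathbf{p}_{22} & \mathbf{p}_{23} & \mathbf{p}_{24} \end{bmatrix} \\
&\begin{bmatrix} \mathbf{p}_{31} & \mathbf{p}_{32} & \mathbf{p}_{33} & \mathbf{p}_{34} \end{bmatrix} \\
&\begin{bmatrix} \mathbf{p}_{41} & \mathbf{p}_{42} & \mathbf{p}_{43} & \mathbf{p}_{44} \end{bmatrix}
\end{aligned}$$

and recall from equation (4) that a Bézier curve is produced when any one of these matrices is multiplied on the right by the following vector of weights:

$$M_B \mathbf{u}(t) = \begin{bmatrix} (1-t)^3 \\ 3t(1-t)^2 \\ 3t^2(1-t) \\ t^3 \end{bmatrix}$$

Let $G$ be the block (partitioned) $4 \times 4$ matrix whose entries are the control points $\mathbf{p}_{ij}$ displayed above. Then the following product is a block $4 \times 1$ matrix, and each entry is a Bézier curve:

$$G M_B \mathbf{u}(t) = \begin{bmatrix}
\mathbf{p}_{11} & \mathbf{p}_{12} & \mathbf{p}_{13} & \mathbf{p}_{14} \\
\mathbf{p}_{21} & \mathbf{p}_{22} & \mathbf{p}_{23} & \mathbf{p}_{24} \\
\mathbf{p}_{31} & \mathbf{p}_{32} & \mathbf{p}_{33} & \mathbf{p}_{34} \\
\mathbf{p}_{41} & \mathbf{p}_{42} & \mathbf{p}_{43} & \mathbf{p}_{44}
\end{bmatrix}
\begin{bmatrix} (1-t)^3 \\ 3t(1-t)^2 \\ 3t^2(1-t) \\ t^3 \end{bmatrix}$$

In fact,

$$G M_B \mathbf{u}(t) = \begin{bmatrix}
(1-t)^3\,\mathbf{p}_{11} + 3t(1-t)^2\,\mathbf{p}_{12} + 3t^2(1-t)\,\mathbf{p}_{13} + t^3\,\mathbf{p}_{14} \\
(1-t)^3\,\mathbf{p}_{21} + 3t(1-t)^2\,\mathbf{p}_{22} + 3t^2(1-t)\,\mathbf{p}_{23} + t^3\,\mathbf{p}_{24} \\
(1-t)^3\,\mathbf{p}_{31} + 3t(1-t)^2\,\mathbf{p}_{32} + 3t^2(1-t)\,\mathbf{p}_{33} + t^3\,\mathbf{p}_{34} \\
(1-t)^3\,\mathbf{p}_{41} + 3t(1-t)^2\,\mathbf{p}_{42} + 3t^2(1-t)\,\mathbf{p}_{43} + t^3\,\mathbf{p}_{44}
\end{bmatrix}$$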


Now fix $t$. Then $G M_B \mathbf{u}(t)$ is a column vector that can be used as a geometry vector in equation (5) for a Bézier curve in another variable $s$. This observation produces the Bézier bicubic surface:

$$\mathbf{x}(s, t) = \mathbf{u}(s)^T M_B^T\, G\, M_B \mathbf{u}(t), \quad \text{where } 0 \le s, t \le 1 \tag{6}$$

The formula for $\mathbf{x}(s, t)$ is a linear combination of the sixteen control points. If one imagines that these control points are arranged in a fairly uniform rectangular array, as in Figure 6, then the Bézier surface is controlled by a web of eight Bézier curves, four in the “$s$-direction” and four in the “$t$-direction.” The surface actually passes through the four control points at its “corners.” When it is in the middle of a larger surface, the sixteen-point surface shares its twelve boundary control points with its neighbors.

FIGURE 6 Sixteen control points for a Bézier bicubic surface patch.
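Expanding the block product in (6) gives $\mathbf{x}(s,t) = \sum_i \sum_j b_i(s)\, b_j(t)\, \mathbf{p}_{ij}$, where $b_1, \ldots, b_4$ are the four cubic weights. The sketch below evaluates that double sum in plain Python. The grid of control points is hypothetical, and indices are zero-based, so `G[i][j]` stands for $\mathbf{p}_{i+1,\,j+1}$.

```python
def b(t):
    """The cubic weights (1-t)^3, 3t(1-t)^2, 3t^2(1-t), t^3."""
    s = 1 - t
    return [s**3, 3 * t * s**2, 3 * t**2 * s, t**3]


def surface_point(G, s, t):
    """Evaluate the bicubic patch (6) at (s, t) for a 4x4 grid of 3D points."""
    bs, bt = b(s), b(t)
    return tuple(
        sum(bs[i] * bt[j] * G[i][j][k] for i in range(4) for j in range(4))
        for k in range(3)
    )


# Hypothetical control points: a regular grid with made-up heights.
G = [[(float(i), float(j), float((i * j) % 3)) for j in range(4)]
     for i in range(4)]

corner_00 = surface_point(G, 0, 0)
corner_01 = surface_point(G, 0, 1)
corner_11 = surface_point(G, 1, 1)
inside = surface_point(G, 0.5, 0.5)
```

At the parameter corners only one product of weights is nonzero, which is why the patch passes through its four corner control points.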

Approximations to Curves and Surfaces

In CAD programs and in programs used to create realistic computer games, the designer often works at a graphics workstation to compose a “scene” involving various geometric structures. This process requires interaction between the designer and the geometric objects. Each slight repositioning of an object requires new mathematical computations by the graphics program. Bézier curves and surfaces can be useful in this process because they involve fewer control points than objects approximated by many polygons. This dramatically reduces the computation time and speeds up the designer’s work.

After the scene composition, however, the final image preparation has different computational demands that are more easily met by objects consisting of flat surfaces and straight edges, such as polyhedra. The designer needs to render the scene, by introducing light sources, adding color and texture to surfaces, and simulating reflections from the surfaces. Computing the direction of a reflected light at a point $\mathbf{p}$ on a surface, for instance, requires knowing the directions of both the incoming light and the surface normal, the vector perpendicular to the tangent plane at $\mathbf{p}$. Computing such normal vectors is much easier on a surface composed of, say, tiny flat polygons than on a curved surface whose normal vector changes continuously as $\mathbf{p}$ moves. If $\mathbf{p}_1$, $\mathbf{p}_2$, and $\mathbf{p}_3$ are adjacent vertices of a flat polygon, then the surface normal is just plus or minus the cross product $(\mathbf{p}_2 - \mathbf{p}_1) \times (\mathbf{p}_2 - \mathbf{p}_3)$. When the polygon is small, only one normal vector is needed for rendering the entire polygon. Also, two widely used shading routines, Gouraud shading and Phong shading, both require a surface to be defined by polygons.

As a result of these needs for flat surfaces, the Bézier curves and surfaces from the scene composition stage now are usually approximated by straight line segments and


polyhedral surfaces. The basic idea for approximating a Bézier curve or surface is to divide the curve or surface into smaller pieces, with more and more control points.
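The normal of a flat polygon mentioned above, $\pm(\mathbf{p}_2 - \mathbf{p}_1) \times (\mathbf{p}_2 - \mathbf{p}_3)$, takes only a few arithmetic operations. This is a minimal sketch with hypothetical vertex data; the helper names are illustrative, and the sign of the result depends on the vertex ordering, as the text notes.

```python
def sub(a, b):
    """Componentwise difference a - b of two 3D points."""
    return tuple(x - y for x, y in zip(a, b))


def cross(a, b):
    """Cross product a x b in R^3."""
    return (
        a[1] * b[2] - a[2] * b[1],
        a[2] * b[0] - a[0] * b[2],
        a[0] * b[1] - a[1] * b[0],
    )


def polygon_normal(p1, p2, p3):
    """(p2 - p1) x (p2 - p3) for adjacent vertices p1, p2, p3."""
    return cross(sub(p2, p1), sub(p2, p3))


# Hypothetical adjacent vertices of a polygon lying in the plane z = 0.
p1, p2, p3 = (0, 0, 0), (1, 0, 0), (1, 1, 0)
n = polygon_normal(p1, p2, p3)
```

For these vertices the result is a vector along the $z$-axis, perpendicular to both edges that meet at $\mathbf{p}_2$.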

Recursive Subdivision of Bézier Curves and Surfaces

Figure 7 shows the four control points $\mathbf{p}_0, \ldots, \mathbf{p}_3$ for a Bézier curve, along with control points for two new curves, each coinciding with half of the original curve. The “left” curve begins at $\mathbf{q}_0 = \mathbf{p}_0$ and ends at $\mathbf{q}_3$, at the midpoint of the original curve. The “right” curve begins at $\mathbf{r}_0 = \mathbf{q}_3$ and ends at $\mathbf{r}_3 = \mathbf{p}_3$.

FIGURE 7 Subdivision of a Bézier curve.

Figure 8 shows how the new control points enclose regions that are “thinner” than the region enclosed by the original control points. As the distances between the control points decrease, the control points of each curve segment also move closer to a line segment. This variation-diminishing property of Bézier curves depends on the fact that a Bézier curve always lies in the convex hull of the control points.

FIGURE 8 Convex hulls of the control points.

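The new control points are related to the original control points by simple formulas. Of course, $\mathbf{q}_0 = \mathbf{p}_0$ and $\mathbf{r}_3 = \mathbf{p}_3$. The midpoint of the original curve $\mathbf{x}(t)$ occurs at $\mathbf{x}(.5)$ when $\mathbf{x}(t)$ has the standard parameterization,

$$\mathbf{x}(t) = (1 - 3t + 3t^2 - t^3)\,\mathbf{p}_0 + (3t - 6t^2 + 3t^3)\,\mathbf{p}_1 + (3t^2 - 3t^3)\,\mathbf{p}_2 + t^3\,\mathbf{p}_3 \tag{7}$$

for $0 \le t \le 1$. Thus, the new control points $\mathbf{q}_3$ and $\mathbf{r}_0$ are given by

$$\mathbf{q}_3 = \mathbf{r}_0 = \mathbf{x}(.5) = \tfrac{1}{8}(\mathbf{p}_0 + 3\mathbf{p}_1 + 3\mathbf{p}_2 + \mathbf{p}_3) \tag{8}$$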

The formulas for the remaining “interior” control points are also simple, but the derivation of the formulas requires some work involving the tangent vectors of the curves. By definition, the tangent vector to a parameterized curve $\mathbf{x}(t)$ is the derivative $\mathbf{x}'(t)$. This vector shows the direction of the line tangent to the curve at $\mathbf{x}(t)$. For the Bézier curve in (7),

$$\mathbf{x}'(t) = (-3 + 6t - 3t^2)\,\mathbf{p}_0 + (3 - 12t + 9t^2)\,\mathbf{p}_1 + (6t - 9t^2)\,\mathbf{p}_2 + 3t^2\,\mathbf{p}_3$$

for $0 \le t \le 1$. In particular,

$$\mathbf{x}'(0) = 3(\mathbf{p}_1 - \mathbf{p}_0) \quad \text{and} \quad \mathbf{x}'(1) = 3(\mathbf{p}_3 - \mathbf{p}_2) \tag{9}$$


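Geometrically, $\mathbf{p}_1$ is on the line tangent to the curve at $\mathbf{p}_0$, and $\mathbf{p}_2$ is on the line tangent to the curve at $\mathbf{p}_3$. See Figure 8. Also, from $\mathbf{x}'(t)$, compute

$$\mathbf{x}'(.5) = \tfrac{3}{4}(-\mathbf{p}_0 - \mathbf{p}_1 + \mathbf{p}_2 + \mathbf{p}_3) \tag{10}$$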

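Let $\mathbf{y}(t)$ be the Bézier curve determined by $\mathbf{q}_0, \ldots, \mathbf{q}_3$, and let $\mathbf{z}(t)$ be the Bézier curve determined by $\mathbf{r}_0, \ldots, \mathbf{r}_3$. Since $\mathbf{y}(t)$ traverses the same path as $\mathbf{x}(t)$ but only gets to $\mathbf{x}(.5)$ as $t$ goes from 0 to 1, $\mathbf{y}(t) = \mathbf{x}(.5t)$ for $0 \le t \le 1$. Similarly, since $\mathbf{z}(t)$ starts at $\mathbf{x}(.5)$ when $t = 0$, $\mathbf{z}(t) = \mathbf{x}(.5 + .5t)$ for $0 \le t \le 1$. By the chain rule for derivatives,

$$\mathbf{y}'(t) = .5\,\mathbf{x}'(.5t) \quad \text{and} \quad \mathbf{z}'(t) = .5\,\mathbf{x}'(.5 + .5t) \quad \text{for } 0 \le t \le 1 \tag{11}$$

From (9) with $\mathbf{y}'(0)$ in place of $\mathbf{x}'(0)$, from (11) with $t = 0$, and from (9), the control points for $\mathbf{y}(t)$ satisfy

$$3(\mathbf{q}_1 - \mathbf{q}_0) = \mathbf{y}'(0) = .5\,\mathbf{x}'(0) = \tfrac{3}{2}(\mathbf{p}_1 - \mathbf{p}_0) \tag{12}$$

From (9) with $\mathbf{y}'(1)$ in place of $\mathbf{x}'(1)$, from (11) with $t = 1$, and from (10),

$$3(\mathbf{q}_3 - \mathbf{q}_2) = \mathbf{y}'(1) = .5\,\mathbf{x}'(.5) = \tfrac{3}{8}(-\mathbf{p}_0 - \mathbf{p}_1 + \mathbf{p}_2 + \mathbf{p}_3) \tag{13}$$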

Equations (8), (9), (10), (12), and (13) can be solved to produce the formulas for $\mathbf{q}_0, \ldots, \mathbf{q}_3$ shown in Exercise 13. Geometrically, the formulas are displayed in Figure 9. The interior control points $\mathbf{q}_1$ and $\mathbf{r}_2$ are the midpoints, respectively, of the segment from $\mathbf{p}_0$ to $\mathbf{p}_1$ and the segment from $\mathbf{p}_2$ to $\mathbf{p}_3$. When the midpoint of the segment from $\mathbf{p}_1$ to $\mathbf{p}_2$ is connected to $\mathbf{q}_1$, the resulting line segment has $\mathbf{q}_2$ in the middle!

FIGURE 9 Geometric structure of new control points. (The figure also marks the midpoint $\tfrac{1}{2}(\mathbf{p}_1 + \mathbf{p}_2)$.)
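The midpoint relations in Figure 9, together with the corresponding relations for $\mathbf{r}_1$ and $\mathbf{q}_3$ stated in Exercises 14 and 15, give a subdivision step that uses only sums and division by 2. The sketch below is one way to organize it; the function names and the sample control points are assumptions for illustration.

```python
def mid(a, b):
    """Midpoint of the segment from a to b."""
    return tuple((x + y) / 2 for x, y in zip(a, b))


def subdivide(p0, p1, p2, p3):
    """Split a cubic Bézier curve at t = .5 into left and right halves."""
    q1 = mid(p0, p1)      # midpoint of p0 p1
    m = mid(p1, p2)       # midpoint of p1 p2 (the extra point in Figure 9)
    r2 = mid(p2, p3)      # midpoint of p2 p3
    q2 = mid(q1, m)       # midpoint of q1 and m
    r1 = mid(m, r2)       # midpoint of m and r2
    q3 = mid(q2, r1)      # q3 = r0, the point x(.5)
    return (p0, q1, q2, q3), (q3, r1, r2, p3)


# Hypothetical control points in R^2.
P = [(0.0, 0.0), (1.0, 2.0), (3.0, 2.0), (4.0, 0.0)]
left, right = subdivide(*P)

# Equation (8), computed directly for comparison with left[3].
x_half = tuple((a + 3 * b + 3 * c + d) / 8 for a, b, c, d in zip(*P))
```

The shared point `left[3]` agrees with equation (8), and the two outer endpoints are the original $\mathbf{p}_0$ and $\mathbf{p}_3$.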

This completes one step of the subdivision process. The “recursion” begins, and both new curves are subdivided. The recursion continues to a depth at which all curves are sufficiently straight. Alternatively, at each step the recursion can be “adaptive” and not subdivide one of the two new curves if that curve is sufficiently straight. Once the subdivision completely stops, the endpoints of each curve are joined by line segments, and the scene is ready for the next step in the final image preparation.

A Bézier bicubic surface has the same variation-diminishing property as the Bézier curves that make up each cross-section of the surface, so the process described above can be applied in each cross-section. With the details omitted, here is the basic strategy. Consider the four “parallel” Bézier curves whose parameter is $s$, and apply the subdivision process to each of them. This produces four sets of eight control points; each set determines a curve as $s$ varies from 0 to 1. As $t$ varies, however, there are eight curves, each with four control points. Apply the subdivision process to each of these sets of four points, creating a total of 64 control points. Adaptive recursion is possible in this setting, too, but there are some subtleties involved.²

² See Foley, van Dam, Feiner, and Hughes, Computer Graphics: Principles and Practice, 2nd Ed. (Boston: Addison-Wesley, 1996), pp. 527–528.
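The adaptive recursion for a single curve can be sketched as follows for points in $\mathbb{R}^2$. The text does not specify how “sufficiently straight” is measured, so the flatness test used here (the distance of the two interior control points from the chord through the endpoints), the tolerance, and the depth limit are all assumptions made for this illustration.

```python
import math


def mid(a, b):
    return tuple((x + y) / 2 for x, y in zip(a, b))


def subdivide(p0, p1, p2, p3):
    """One subdivision step at t = .5, using midpoints only."""
    q1, m, r2 = mid(p0, p1), mid(p1, p2), mid(p2, p3)
    q2, r1 = mid(q1, m), mid(m, r2)
    q3 = mid(q2, r1)
    return (p0, q1, q2, q3), (q3, r1, r2, p3)


def dist_to_line(p, a, b):
    """Distance from p to the line through a and b (or to a if a == b)."""
    dx, dy = b[0] - a[0], b[1] - a[1]
    length = math.hypot(dx, dy)
    if length == 0.0:
        return math.hypot(p[0] - a[0], p[1] - a[1])
    return abs(dx * (p[1] - a[1]) - dy * (p[0] - a[0])) / length


def flatten(ctrl, tol=0.01, depth=0, max_depth=12):
    """Return polyline vertices approximating the curve with control points ctrl."""
    p0, p1, p2, p3 = ctrl
    flat = max(dist_to_line(p1, p0, p3), dist_to_line(p2, p0, p3)) <= tol
    if flat or depth >= max_depth:
        return [p0, p3]               # join the endpoints by a line segment
    left, right = subdivide(p0, p1, p2, p3)
    # Drop the duplicated join point between the two halves.
    return (flatten(left, tol, depth + 1, max_depth)[:-1]
            + flatten(right, tol, depth + 1, max_depth))


# Hypothetical curves: one curved, one whose control points are collinear.
curved = flatten(((0.0, 0.0), (1.0, 2.0), (3.0, 2.0), (4.0, 0.0)))
straight = flatten(((0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (3.0, 0.0)))
```

A curve whose control points already lie on a line is not subdivided at all, while the curved example is split until each piece passes the flatness test.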


PRACTICE PROBLEMS

A spline usually refers to a curve that passes through specified points. A B-spline, however, usually does not pass through its control points. A single segment has the parametric form

$$\mathbf{x}(t) = \tfrac{1}{6}\left[(1-t)^3\,\mathbf{p}_0 + (3t^3 - 6t^2 + 4)\,\mathbf{p}_1 + (-3t^3 + 3t^2 + 3t + 1)\,\mathbf{p}_2 + t^3\,\mathbf{p}_3\right] \tag{14}$$

for $0 \le t \le 1$, where $\mathbf{p}_0$, $\mathbf{p}_1$, $\mathbf{p}_2$, and $\mathbf{p}_3$ are the control points. When $t$ varies from 0 to 1, $\mathbf{x}(t)$ creates a short curve that lies close to the segment $\overline{\mathbf{p}_1\mathbf{p}_2}$. Basic algebra shows that the B-spline formula can also be written as

$$\mathbf{x}(t) = \tfrac{1}{6}\left[(1-t)^3\,\mathbf{p}_0 + \bigl(3t(1-t)^2 - 3t + 4\bigr)\,\mathbf{p}_1 + \bigl(3t^2(1-t) + 3t + 1\bigr)\,\mathbf{p}_2 + t^3\,\mathbf{p}_3\right] \tag{15}$$

This shows the similarity with the Bézier curve. Except for the $1/6$ factor at the front, the $\mathbf{p}_0$ and $\mathbf{p}_3$ terms are the same. The $\mathbf{p}_1$ component has been increased by $-3t + 4$ and the $\mathbf{p}_2$ component has been increased by $3t + 1$. These components move the curve closer to $\overline{\mathbf{p}_1\mathbf{p}_2}$ than the Bézier curve. The $1/6$ factor is necessary to keep the sum of the coefficients equal to 1. Figure 10 compares a B-spline with a Bézier curve that has the same control points.

FIGURE 10 A B-spline segment and a Bézier curve.

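1. Show that the B-spline does not begin at $\mathbf{p}_0$, but $\mathbf{x}(0)$ is in $\operatorname{conv}\{\mathbf{p}_0, \mathbf{p}_1, \mathbf{p}_2\}$. Assuming that $\mathbf{p}_0$, $\mathbf{p}_1$, and $\mathbf{p}_2$ are affinely independent, find the affine coordinates of $\mathbf{x}(0)$ with respect to $\{\mathbf{p}_0, \mathbf{p}_1, \mathbf{p}_2\}$.
2. Show that the B-spline does not end at $\mathbf{p}_3$, but $\mathbf{x}(1)$ is in $\operatorname{conv}\{\mathbf{p}_1, \mathbf{p}_2, \mathbf{p}_3\}$. Assuming that $\mathbf{p}_1$, $\mathbf{p}_2$, and $\mathbf{p}_3$ are affinely independent, find the affine coordinates of $\mathbf{x}(1)$ with respect to $\{\mathbf{p}_1, \mathbf{p}_2, \mathbf{p}_3\}$.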

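8.6 EXERCISES

1. Suppose a Bézier curve is translated to $\mathbf{x}(t) + \mathbf{b}$. That is, for $0 \le t \le 1$, the new curve is

$$\mathbf{x}(t) = (1-t)^3\,\mathbf{p}_0 + 3t(1-t)^2\,\mathbf{p}_1 + 3t^2(1-t)\,\mathbf{p}_2 + t^3\,\mathbf{p}_3 + \mathbf{b}$$

Show that this new curve is again a Bézier curve. [Hint: Where are the new control points?]

2. The parametric vector form of a B-spline curve was defined in the Practice Problems as

$$\mathbf{x}(t) = \tfrac{1}{6}\left[(1-t)^3\,\mathbf{p}_0 + \bigl(3t(1-t)^2 - 3t + 4\bigr)\,\mathbf{p}_1 + \bigl(3t^2(1-t) + 3t + 1\bigr)\,\mathbf{p}_2 + t^3\,\mathbf{p}_3\right]$$

for $0 \le t \le 1$, where $\mathbf{p}_0$, $\mathbf{p}_1$, $\mathbf{p}_2$, and $\mathbf{p}_3$ are the control points.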

a. Show that for $0 \le t \le 1$, $\mathbf{x}(t)$ is in the convex hull of the control points.
b. Suppose that a B-spline curve $\mathbf{x}(t)$ is translated to $\mathbf{x}(t) + \mathbf{b}$ (as in Exercise 1). Show that this new curve is again a B-spline.

3. Let $\mathbf{x}(t)$ be a cubic Bézier curve determined by points $\mathbf{p}_0$, $\mathbf{p}_1$, $\mathbf{p}_2$, and $\mathbf{p}_3$.
a. Compute the tangent vector $\mathbf{x}'(t)$. Determine how $\mathbf{x}'(0)$ and $\mathbf{x}'(1)$ are related to the control points, and give geometric descriptions of the directions of these tangent vectors. Is it possible to have $\mathbf{x}'(1) = \mathbf{0}$?
b. Compute the second derivative $\mathbf{x}''(t)$ and determine how $\mathbf{x}''(0)$ and $\mathbf{x}''(1)$ are related to the control points. Draw a figure based on Figure 10, and construct a line segment that points in the direction of $\mathbf{x}''(0)$. [Hint: Use $\mathbf{p}_1$ as the origin of the coordinate system.]

4. Let $\mathbf{x}(t)$ be the B-spline in Exercise 2, with control points $\mathbf{p}_0$, $\mathbf{p}_1$, $\mathbf{p}_2$, and $\mathbf{p}_3$.
a. Compute the tangent vector $\mathbf{x}'(t)$ and determine how the derivatives $\mathbf{x}'(0)$ and $\mathbf{x}'(1)$ are related to the control points. Give geometric descriptions of the directions of these tangent vectors. Explore what happens when both $\mathbf{x}'(0)$ and $\mathbf{x}'(1)$ equal $\mathbf{0}$. Justify your assertions.
b. Compute the second derivative $\mathbf{x}''(t)$ and determine how $\mathbf{x}''(0)$ and $\mathbf{x}''(1)$ are related to the control points. Draw a figure based on Figure 10, and construct a line segment that points in the direction of $\mathbf{x}''(1)$. [Hint: Use $\mathbf{p}_2$ as the origin of the coordinate system.]

5. Let $\mathbf{x}(t)$ and $\mathbf{y}(t)$ be cubic Bézier curves with control points $\{\mathbf{p}_0, \mathbf{p}_1, \mathbf{p}_2, \mathbf{p}_3\}$ and $\{\mathbf{p}_3, \mathbf{p}_4, \mathbf{p}_5, \mathbf{p}_6\}$, respectively, so that $\mathbf{x}(t)$ and $\mathbf{y}(t)$ are joined at $\mathbf{p}_3$. The following questions refer to the curve consisting of $\mathbf{x}(t)$ followed by $\mathbf{y}(t)$. For simplicity, assume that the curve is in $\mathbb{R}^2$.
a. What condition on the control points will guarantee that the curve has $C^1$ continuity at $\mathbf{p}_3$? Justify your answer.
b. What happens when $\mathbf{x}'(1)$ and $\mathbf{y}'(0)$ are both the zero vector?

6. A B-spline is built out of B-spline segments, described in Exercise 2. Let $\mathbf{p}_0, \ldots, \mathbf{p}_4$ be control points. For $0 \le t \le 1$, let $\mathbf{x}(t)$ and $\mathbf{y}(t)$ be determined by the geometry matrices $[\,\mathbf{p}_0\ \mathbf{p}_1\ \mathbf{p}_2\ \mathbf{p}_3\,]$ and $[\,\mathbf{p}_1\ \mathbf{p}_2\ \mathbf{p}_3\ \mathbf{p}_4\,]$, respectively. Notice how the two segments share three control points. The two segments do not overlap, however; they join at a common endpoint, close to $\mathbf{p}_2$.
a. Show that the combined curve has $G^0$ continuity, that is, $\mathbf{x}(1) = \mathbf{y}(0)$.

b. Show that the curve has $C^1$ continuity at the join point, $\mathbf{x}(1)$. That is, show that $\mathbf{x}'(1) = \mathbf{y}'(0)$.

7. Let $\mathbf{x}(t)$ and $\mathbf{y}(t)$ be Bézier curves from Exercise 5, and suppose the combined curve has $C^2$ continuity (which includes $C^1$ continuity) at $\mathbf{p}_3$. Set $\mathbf{x}''(1) = \mathbf{y}''(0)$ and show that $\mathbf{p}_5$ is completely determined by $\mathbf{p}_1$, $\mathbf{p}_2$, and $\mathbf{p}_3$. Thus, the points $\mathbf{p}_0, \ldots, \mathbf{p}_3$ and the $C^2$ condition determine all but one of the control points for $\mathbf{y}(t)$.

8. Let $\mathbf{x}(t)$ and $\mathbf{y}(t)$ be segments of a B-spline as in Exercise 6. Show that the curve has $C^2$ continuity (as well as $C^1$ continuity) at $\mathbf{x}(1)$. That is, show that $\mathbf{x}''(1) = \mathbf{y}''(0)$. This higher-order continuity is desirable in CAD applications such as automotive body design, since the curves and surfaces appear much smoother. However, B-splines require three times the computation of Bézier curves, for curves of comparable length. For surfaces, B-splines require nine times the computation of Bézier surfaces. Programmers often choose Bézier surfaces for applications (such as an airplane cockpit simulator) that require real-time rendering.

9. A quartic Bézier curve is determined by five control points, $\mathbf{p}_0$, $\mathbf{p}_1$, $\mathbf{p}_2$, $\mathbf{p}_3$, and $\mathbf{p}_4$:

$$\mathbf{x}(t) = (1-t)^4\,\mathbf{p}_0 + 4t(1-t)^3\,\mathbf{p}_1 + 6t^2(1-t)^2\,\mathbf{p}_2 + 4t^3(1-t)\,\mathbf{p}_3 + t^4\,\mathbf{p}_4 \quad \text{for } 0 \le t \le 1$$

Construct the quartic basis matrix $M_B$ for $\mathbf{x}(t)$.

10. The “B” in B-spline refers to the fact that a segment $\mathbf{x}(t)$ may be written in terms of a basis matrix, $M_S$, in a form similar to a Bézier curve. That is,

$$\mathbf{x}(t) = G M_S \mathbf{u}(t) \quad \text{for } 0 \le t \le 1$$

where $G$ is the geometry matrix $[\,\mathbf{p}_0\ \mathbf{p}_1\ \mathbf{p}_2\ \mathbf{p}_3\,]$ and $\mathbf{u}(t)$ is the column vector $(1, t, t^2, t^3)$. In a uniform B-spline, each segment uses the same basis matrix, but the geometry matrix changes. Construct the basis matrix $M_S$ for $\mathbf{x}(t)$.

In Exercises 11 and 12, mark each statement True or False. Justify each answer.

11. a. The cubic Bézier curve is based on four control points.
b. Given a quadratic Bézier curve $\mathbf{x}(t)$ with control points $\mathbf{p}_0$, $\mathbf{p}_1$, and $\mathbf{p}_2$, the directed line segment $\mathbf{p}_1 - \mathbf{p}_0$ (from $\mathbf{p}_0$ to $\mathbf{p}_1$) is the tangent vector to the curve at $\mathbf{p}_0$.
c. When two quadratic Bézier curves with control points $\{\mathbf{p}_0, \mathbf{p}_1, \mathbf{p}_2\}$ and $\{\mathbf{p}_2, \mathbf{p}_3, \mathbf{p}_4\}$ are joined at $\mathbf{p}_2$, the combined Bézier curve will have $C^1$ continuity at $\mathbf{p}_2$ if $\mathbf{p}_2$ is the midpoint of the line segment between $\mathbf{p}_1$ and $\mathbf{p}_3$.

12. a. The essential properties of Bézier curves are preserved under the action of linear transformations, but not translations.
b. When two Bézier curves $\mathbf{x}(t)$ and $\mathbf{y}(t)$ are joined at the point where $\mathbf{x}(1) = \mathbf{y}(0)$, the combined curve has $G^0$ continuity at that point.
c. The Bézier basis matrix is a matrix whose columns are the control points of the curve.

Exercises 13–15 concern the subdivision of a Bézier curve shown in Figure 7. Let $\mathbf{x}(t)$ be the Bézier curve, with control points $\mathbf{p}_0, \ldots, \mathbf{p}_3$, and let $\mathbf{y}(t)$ and $\mathbf{z}(t)$ be the subdividing Bézier curves as in the text, with control points $\mathbf{q}_0, \ldots, \mathbf{q}_3$ and $\mathbf{r}_0, \ldots, \mathbf{r}_3$, respectively.

13. a. Use equation (12) to show that $\mathbf{q}_1$ is the midpoint of the segment from $\mathbf{p}_0$ to $\mathbf{p}_1$.
b. Use equation (13) to show that

$$8\mathbf{q}_2 = 8\mathbf{q}_3 + \mathbf{p}_0 + \mathbf{p}_1 - \mathbf{p}_2 - \mathbf{p}_3.$$

c. Use part (b), equation (8), and part (a) to show that $\mathbf{q}_2$ is the midpoint of the segment from $\mathbf{q}_1$ to the midpoint of the segment from $\mathbf{p}_1$ to $\mathbf{p}_2$. That is, $\mathbf{q}_2 = \tfrac{1}{2}\left[\mathbf{q}_1 + \tfrac{1}{2}(\mathbf{p}_1 + \mathbf{p}_2)\right]$.

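14. a. Justify each equal sign:

$$3(\mathbf{r}_3 - \mathbf{r}_2) = \mathbf{z}'(1) = .5\,\mathbf{x}'(1) = \tfrac{3}{2}(\mathbf{p}_3 - \mathbf{p}_2).$$

b. Show that $\mathbf{r}_2$ is the midpoint of the segment from $\mathbf{p}_2$ to $\mathbf{p}_3$.
c. Justify each equal sign: $3(\mathbf{r}_1 - \mathbf{r}_0) = \mathbf{z}'(0) = .5\,\mathbf{x}'(.5)$.
d. Use part (c) to show that $8\mathbf{r}_1 = -\mathbf{p}_0 - \mathbf{p}_1 + \mathbf{p}_2 + \mathbf{p}_3 + 8\mathbf{r}_0$.
e. Use part (d), equation (8), and part (a) to show that $\mathbf{r}_1$ is the midpoint of the segment from $\mathbf{r}_2$ to the midpoint of the segment from $\mathbf{p}_1$ to $\mathbf{p}_2$. That is, $\mathbf{r}_1 = \tfrac{1}{2}\left[\mathbf{r}_2 + \tfrac{1}{2}(\mathbf{p}_1 + \mathbf{p}_2)\right]$.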

15. Sometimes only one half of a Bézier curve needs further subdividing. For example, subdivision of the “left” side is accomplished with parts (a) and (c) of Exercise 13 and equation (8). When both halves of the curve $\mathbf{x}(t)$ are divided, it is possible to organize calculations efficiently to calculate both left and right control points concurrently, without using equation (8) directly.
a. Show that the tangent vectors $\mathbf{y}'(1)$ and $\mathbf{z}'(0)$ are equal.
b. Use part (a) to show that $\mathbf{q}_3$ (which equals $\mathbf{r}_0$) is the midpoint of the segment from $\mathbf{q}_2$ to $\mathbf{r}_1$.
c. Using part (b) and the results of Exercises 13 and 14, write an algorithm that computes the control points for both $\mathbf{y}(t)$ and $\mathbf{z}(t)$ in an efficient manner. The only operations needed are sums and division by 2.

16. Explain why a cubic Bézier curve is completely determined by $\mathbf{x}(0)$, $\mathbf{x}'(0)$, $\mathbf{x}(1)$, and $\mathbf{x}'(1)$.

17. TrueType® fonts, created by Apple Computer, use quadratic Bézier curves, while PostScript® fonts, created by Adobe Systems, use cubic Bézier curves. The cubic curves provide more flexibility for typeface design, but it is important that every typeface using quadratic curves can be transformed into one that uses cubic curves. Suppose that $\mathbf{w}(t)$ is a quadratic curve, with control points $\mathbf{p}_0$, $\mathbf{p}_1$, and $\mathbf{p}_2$.
a. Find control points $\mathbf{r}_0$, $\mathbf{r}_1$, $\mathbf{r}_2$, and $\mathbf{r}_3$ such that the cubic Bézier curve $\mathbf{x}(t)$ with these control points has the property that $\mathbf{x}(t)$ and $\mathbf{w}(t)$ have the same initial and terminal points and the same tangent vectors at $t = 0$ and $t = 1$. (See Exercise 16.)
b. Show that if $\mathbf{x}(t)$ is constructed as in part (a), then $\mathbf{x}(t) = \mathbf{w}(t)$ for $0 \le t \le 1$.

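18. Use partitioned matrix multiplication to compute the following matrix product, which appears in the alternative formula (5) for a Bézier curve:

$$\begin{bmatrix} 1 & 0 & 0 & 0 \\ -3 & 3 & 0 & 0 \\ 3 & -6 & 3 & 0 \\ -1 & 3 & -3 & 1 \end{bmatrix}
\begin{bmatrix} \mathbf{p}_0 \\ \mathbf{p}_1 \\ \mathbf{p}_2 \\ \mathbf{p}_3 \end{bmatrix}$$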

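SOLUTIONS TO PRACTICE PROBLEMS

1. From equation (14) with $t = 0$, $\mathbf{x}(0) \ne \mathbf{p}_0$ because

$$\mathbf{x}(0) = \tfrac{1}{6}\left[\mathbf{p}_0 + 4\mathbf{p}_1 + \mathbf{p}_2\right] = \tfrac{1}{6}\mathbf{p}_0 + \tfrac{2}{3}\mathbf{p}_1 + \tfrac{1}{6}\mathbf{p}_2.$$

The coefficients are nonnegative and sum to 1, so $\mathbf{x}(0)$ is in $\operatorname{conv}\{\mathbf{p}_0, \mathbf{p}_1, \mathbf{p}_2\}$, and the affine coordinates with respect to $\{\mathbf{p}_0, \mathbf{p}_1, \mathbf{p}_2\}$ are $\left(\tfrac{1}{6}, \tfrac{2}{3}, \tfrac{1}{6}\right)$.

2. From equation (14) with $t = 1$, $\mathbf{x}(1) \ne \mathbf{p}_3$ because

$$\mathbf{x}(1) = \tfrac{1}{6}\left[\mathbf{p}_1 + 4\mathbf{p}_2 + \mathbf{p}_3\right] = \tfrac{1}{6}\mathbf{p}_1 + \tfrac{2}{3}\mathbf{p}_2 + \tfrac{1}{6}\mathbf{p}_3.$$

The coefficients are nonnegative and sum to 1, so $\mathbf{x}(1)$ is in $\operatorname{conv}\{\mathbf{p}_1, \mathbf{p}_2, \mathbf{p}_3\}$, and the affine coordinates with respect to $\{\mathbf{p}_1, \mathbf{p}_2, \mathbf{p}_3\}$ are $\left(\tfrac{1}{6}, \tfrac{2}{3}, \tfrac{1}{6}\right)$.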
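The two solutions can be confirmed with exact arithmetic by evaluating the coefficient polynomials in equation (14) at $t = 0$ and $t = 1$. The sketch below is only a check of the arithmetic; the function name is an assumption, and `Fraction` is used so that the weights come out as exact rationals.

```python
from fractions import Fraction


def bspline_weights(t):
    """Weights of p0, p1, p2, p3 in equation (14) at parameter t."""
    coeffs = (
        (1 - t)**3,
        3 * t**3 - 6 * t**2 + 4,
        -3 * t**3 + 3 * t**2 + 3 * t + 1,
        t**3,
    )
    return [Fraction(1, 6) * c for c in coeffs]


w_start = bspline_weights(Fraction(0))    # weights for x(0)
w_end = bspline_weights(Fraction(1))      # weights for x(1)
w_inside = bspline_weights(Fraction(1, 3))
```

At $t = 0$ the weight on $\mathbf{p}_3$ vanishes and at $t = 1$ the weight on $\mathbf{p}_0$ vanishes, which is why each endpoint lies in the convex hull of only three control points.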

APPENDIX

A Uniqueness of the Reduced Echelon Form

THEOREM

Uniqueness of the Reduced Echelon Form
Each $m \times n$ matrix $A$ is row equivalent to a unique reduced echelon matrix $U$.

PROOF The proof uses the idea from Section 4.3 that the columns of row-equivalent matrices have exactly the same linear dependence relations.

The row reduction algorithm shows that there exists at least one such matrix $U$. Suppose that $A$ is row equivalent to matrices $U$ and $V$ in reduced echelon form. The leftmost nonzero entry in a row of $U$ is a “leading 1.” Call the location of such a leading 1 a pivot position, and call the column that contains it a pivot column. (This definition uses only the echelon nature of $U$ and $V$ and does not assume the uniqueness of the reduced echelon form.)

The pivot columns of $U$ and $V$ are precisely the nonzero columns that are not linearly dependent on the columns to their left. (This condition is satisfied automatically by a first column if it is nonzero.) Since $U$ and $V$ are row equivalent (both being row equivalent to $A$), their columns have the same linear dependence relations. Hence, the pivot columns of $U$ and $V$ appear in the same locations. If there are $r$ such columns, then since $U$ and $V$ are in reduced echelon form, their pivot columns are the first $r$ columns of the $m \times m$ identity matrix. Thus, corresponding pivot columns of $U$ and $V$ are equal.

Finally, consider any nonpivot column of $U$, say column $j$. This column is either zero or a linear combination of the pivot columns to its left (because those pivot columns are a basis for the space spanned by the columns to the left of column $j$). Either case can be expressed by writing $U\mathbf{x} = \mathbf{0}$ for some $\mathbf{x}$ whose $j$th entry is 1. Then $V\mathbf{x} = \mathbf{0}$, too, which says that column $j$ of $V$ is either zero or the same linear combination of the pivot columns of $V$ to its left. Since corresponding pivot columns of $U$ and $V$ are equal, columns $j$ of $U$ and $V$ are also equal. This holds for all nonpivot columns, so $V = U$, which proves that $U$ is unique.
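The theorem can be illustrated, though of course not proved, by reducing two row-equivalent matrices and comparing the results. The sketch below is a plain Gauss–Jordan routine with exact `Fraction` arithmetic. The function name and the two sample matrices are assumptions; the rows of `B` are built from the rows of `A` as row 3, row 1, and row 1 plus twice row 2, so the two matrices have the same row space.

```python
from fractions import Fraction


def rref(rows):
    """Return the reduced echelon form of a matrix given as a list of rows."""
    A = [[Fraction(x) for x in row] for row in rows]
    m, n = len(A), len(A[0])
    r = 0                                   # index of the next pivot row
    for c in range(n):
        pivot = next((i for i in range(r, m) if A[i][c] != 0), None)
        if pivot is None:
            continue                        # no pivot in this column
        A[r], A[pivot] = A[pivot], A[r]     # interchange
        lead = A[r][c]
        A[r] = [x / lead for x in A[r]]     # scale to get a leading 1
        for i in range(m):
            if i != r and A[i][c] != 0:     # clear the rest of the column
                f = A[i][c]
                A[i] = [x - f * y for x, y in zip(A[i], A[r])]
        r += 1
        if r == m:
            break
    return A


# Hypothetical row-equivalent matrices.
A = [[1, 2, 3, 4], [2, 4, 7, 9], [3, 6, 10, 13]]
B = [[3, 6, 10, 13], [1, 2, 3, 4], [5, 10, 17, 22]]

U = rref(A)
V = rref(B)
```

Although the two reductions use different row operations, they end at the same reduced echelon matrix, with pivot columns 1 and 3.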


APPENDIX

B Complex Numbers

A complex number is a number written in the form

$$z = a + bi$$

where $a$ and $b$ are real numbers and $i$ is a formal symbol satisfying the relation $i^2 = -1$. The number $a$ is the real part of $z$, denoted by $\operatorname{Re} z$, and $b$ is the imaginary part of $z$, denoted by $\operatorname{Im} z$. Two complex numbers are considered equal if and only if their real and imaginary parts are equal. For example, if $z = 5 + (-2)i$, then $\operatorname{Re} z = 5$ and $\operatorname{Im} z = -2$. For simplicity, we write $z = 5 - 2i$.

A real number $a$ is considered as a special type of complex number, by identifying $a$ with $a + 0i$. Furthermore, arithmetic operations on real numbers can be extended to the set of complex numbers.

The complex number system, denoted by $\mathbb{C}$, is the set of all complex numbers, together with the following operations of addition and multiplication:

$$(a + bi) + (c + di) = (a + c) + (b + d)i \tag{1}$$

$$(a + bi)(c + di) = (ac - bd) + (ad + bc)i \tag{2}$$

These rules reduce to ordinary addition and multiplication of real numbers when $b$ and $d$ are zero in (1) and (2). It is readily checked that the usual laws of arithmetic for $\mathbb{R}$ also hold for $\mathbb{C}$. For this reason, multiplication is usually computed by algebraic expansion, as in the following example.

EXAMPLE 1

$$\begin{aligned}
(5 - 2i)(3 + 4i) &= 15 + 20i - 6i - 8i^2 \\
&= 15 + 14i - 8(-1) \\
&= 23 + 14i
\end{aligned}$$

That is, multiply each term of $5 - 2i$ by each term of $3 + 4i$, use $i^2 = -1$, and write the result in the form $a + bi$.

Subtraction of complex numbers $z_1$ and $z_2$ is defined by

$$z_1 - z_2 = z_1 + (-1)z_2$$

In particular, we write $-z$ in place of $(-1)z$.

The conjugate of $z = a + bi$ is the complex number $\bar{z}$ (read as “$z$ bar”), defined by

$$\bar{z} = a - bi$$

Obtain $\bar{z}$ from $z$ by reversing the sign of the imaginary part.

EXAMPLE 2 The conjugate of $-3 + 4i$ is $-3 - 4i$; write $\overline{-3 + 4i} = -3 - 4i$.

Observe that if $z = a + bi$, then

\[ z\bar{z} = (a + bi)(a - bi) = a^2 - abi + bai - b^2 i^2 = a^2 + b^2 \tag{3} \]

Since $z\bar{z}$ is real and nonnegative, it has a square root. The absolute value (or modulus) of $z$ is the real number $|z|$ defined by

\[ |z| = \sqrt{z\bar{z}} = \sqrt{a^2 + b^2} \]

If $z$ is a real number, then $z = a + 0i$, and $|z| = \sqrt{a^2}$, which equals the ordinary absolute value of $a$.

Some useful properties of conjugates and absolute value are listed below; $w$ and $z$ denote complex numbers.

1. $\bar{z} = z$ if and only if $z$ is a real number.
2. $\overline{w + z} = \bar{w} + \bar{z}$.
3. $\overline{wz} = \bar{w}\,\bar{z}$; in particular, $\overline{rz} = r\bar{z}$ if $r$ is a real number.
4. $z\bar{z} = |z|^2 \ge 0$.
5. $|wz| = |w||z|$.
6. $|w + z| \le |w| + |z|$.
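As a sanity check rather than a proof, the six properties can be spot-checked on one pair of numbers with Python's built-in `complex` type. The sample values `w` and `z` are arbitrary choices made for this sketch.

```python
import math

# Arbitrary sample values chosen for illustration.
w, z = 3 + 4j, 5 - 2j

# Property 1: a number equals its conjugate exactly when it is real.
assert (7 + 0j).conjugate() == 7 + 0j
assert z.conjugate() != z

# Properties 2 and 3: conjugation respects sums and products.
assert (w + z).conjugate() == w.conjugate() + z.conjugate()
assert (w * z).conjugate() == w.conjugate() * z.conjugate()

# Property 4: z times its conjugate is the real number |z|^2.
assert z * z.conjugate() == 29 + 0j
assert math.isclose(abs(z) ** 2, 29)

# Property 5: the absolute value is multiplicative.
assert math.isclose(abs(w * z), abs(w) * abs(z))

# Property 6: the triangle inequality.
assert abs(w + z) <= abs(w) + abs(z)
```

The exact comparisons work here because the sample values have small integer parts; with arbitrary floating point inputs, `math.isclose` would be the safer comparison throughout.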

If $z \ne 0$, then $|z| > 0$ and $z$ has a multiplicative inverse, denoted by $1/z$ or $z^{-1}$ and given by

\[ \frac{1}{z} = z^{-1} = \frac{\bar{z}}{|z|^2} \]

Of course, a quotient $w/z$ simply means $w \cdot (1/z)$.

EXAMPLE 3 Let $w = 3 + 4i$ and $z = 5 - 2i$. Compute $z\bar{z}$, $|z|$, and $w/z$.

SOLUTION From equation (3),

\[ z\bar{z} = 5^2 + (-2)^2 = 25 + 4 = 29 \]

For the absolute value, $|z| = \sqrt{z\bar{z}} = \sqrt{29}$. To compute $w/z$, first multiply both the numerator and the denominator by $\bar{z}$, the conjugate of the denominator. Because of (3), this eliminates the $i$ in the denominator:

\[ \frac{w}{z} = \frac{3 + 4i}{5 - 2i} = \frac{3 + 4i}{5 - 2i} \cdot \frac{5 + 2i}{5 + 2i} = \frac{15 + 6i + 20i - 8}{5^2 + (-2)^2} = \frac{7 + 26i}{29} = \frac{7}{29} + \frac{26}{29}i \]
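The division in Example 3 can be reproduced exactly with rational arithmetic. In this sketch the helper name `divide` and the use of integer pairs for complex numbers are assumptions made for illustration; `fractions.Fraction` from the standard library keeps the result in the form $7/29 + (26/29)i$ instead of a rounded decimal.

```python
from fractions import Fraction


def divide(w, z):
    """Return w/z for complex numbers given as (real, imaginary) integer pairs.

    The numerator and denominator are multiplied by the conjugate of z, so
    the denominator becomes the real number a^2 + b^2 from equation (3).
    """
    (c, d), (a, b) = w, z
    denom = a * a + b * b  # z times its conjugate
    if denom == 0:
        raise ZeroDivisionError("z must be nonzero")
    # (c + di)(a - bi) = (ca + db) + (da - cb)i
    return (Fraction(c * a + d * b, denom), Fraction(d * a - c * b, denom))


# Example 3: w = 3 + 4i, z = 5 - 2i
quotient = divide((3, 4), (5, -2))
print(quotient)  # (Fraction(7, 29), Fraction(26, 29))
```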

Geometric Interpretation

Each complex number $z = a + bi$ corresponds to a point $(a, b)$ in the plane $\mathbb{R}^2$, as in Figure 1. The horizontal axis is called the real axis because the points $(a, 0)$ on it correspond to the real numbers. The vertical axis is the imaginary axis because the points $(0, b)$ on it correspond to the pure imaginary numbers of the form $0 + bi$, or simply $bi$. The conjugate of $z$ is the mirror image of $z$ in the real axis. The absolute value of $z$ is the distance from $(a, b)$ to the origin.

FIGURE 1 The complex conjugate is a mirror image. (The figure plots $z = a + bi$ and $\bar{z} = a - bi$ against the real and imaginary axes.)

Addition of complex numbers $z = a + bi$ and $w = c + di$ corresponds to vector addition of $(a, b)$ and $(c, d)$ in $\mathbb{R}^2$, as in Figure 2.

FIGURE 2 Addition of complex numbers.

To give a graphical representation of complex multiplication, we use polar coordinates in $\mathbb{R}^2$. Given a nonzero complex number $z = a + bi$, let $\varphi$ be the angle between the positive real axis and the point $(a, b)$, as in Figure 3, where $-\pi < \varphi \le \pi$. The angle $\varphi$ is called the argument of $z$; we write $\varphi = \arg z$. From trigonometry,

\[ a = |z| \cos\varphi, \qquad b = |z| \sin\varphi \]

and so

\[ z = a + bi = |z|(\cos\varphi + i\sin\varphi) \]

FIGURE 3 Polar coordinates of $z$.

If $w$ is another nonzero complex number, say,

\[ w = |w|(\cos\vartheta + i\sin\vartheta) \]

then, using standard trigonometric identities for the sine and cosine of the sum of two angles, one can verify that

\[ wz = |w||z|[\cos(\vartheta + \varphi) + i\sin(\vartheta + \varphi)] \tag{4} \]

See Figure 4. A similar formula may be written for quotients in polar form. The formulas for products and quotients can be stated in words as follows.

FIGURE 4 Multiplication with polar coordinates.

The product of two nonzero complex numbers is given in polar form by the product of their absolute values and the sum of their arguments. The quotient of two nonzero complex numbers is given by the quotient of their absolute values and the difference of their arguments.

FIGURE Multiplication by $i$. (The figure shows $z = 3 + i$ with argument $\varphi$ and $iz$ with argument $\varphi + \pi/2$.)

EXAMPLE 4
a. If $w$ has absolute value 1, then $w = \cos\vartheta + i\sin\vartheta$, where $\vartheta$ is the argument of $w$. Multiplication of any nonzero number $z$ by $w$ simply rotates $z$ through the angle $\vartheta$.
b. The argument of $i$ itself is $\pi/2$ radians, so multiplication of $z$ by $i$ rotates $z$ through an angle of $\pi/2$ radians. For example, $3 + i$ is rotated into $(3 + i)i = -1 + 3i$.
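Formula (4) and the rotation in Example 4(b) can be illustrated numerically. The sketch below uses the standard-library functions `cmath.polar` and `cmath.rect` to convert between rectangular and polar form; the function name `multiply_polar` is an assumption made for this illustration.

```python
import cmath
import math


def multiply_polar(w, z):
    """Multiply w and z via formula (4): multiply moduli, add arguments."""
    r_w, theta = cmath.polar(w)  # |w| and arg w
    r_z, phi = cmath.polar(z)    # |z| and arg z
    return cmath.rect(r_w * r_z, theta + phi)


# Example 4(b): i has modulus 1 and argument pi/2.
assert math.isclose(abs(1j), 1.0)
assert math.isclose(cmath.phase(1j), math.pi / 2)

# Multiplying 3 + i by i rotates it through pi/2, giving -1 + 3i
# (up to floating point rounding).
rotated = multiply_polar(1j, 3 + 1j)
assert cmath.isclose(rotated, -1 + 3j)
```

Because the trigonometric functions are evaluated in floating point, the comparison uses `cmath.isclose` rather than exact equality.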

Powers of a Complex Number

Formula (4) applies when $z = w = r(\cos\varphi + i\sin\varphi)$. In this case

\[ z^2 = r^2(\cos 2\varphi + i\sin 2\varphi) \]

and

\[ z^3 = z \cdot z^2 = r(\cos\varphi + i\sin\varphi) \cdot r^2(\cos 2\varphi + i\sin 2\varphi) = r^3(\cos 3\varphi + i\sin 3\varphi) \]

In general, for any positive integer $k$,

\[ z^k = r^k(\cos k\varphi + i\sin k\varphi) \]

This fact is known as De Moivre’s Theorem.
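De Moivre’s Theorem can be checked on a small case. The sketch below is an illustration, not a general-purpose routine: the function name `power_de_moivre` and the sample value $z = 1 + i$ are assumptions. Since $(1 + i)^2 = 2i$, the eighth power is $(2i)^4 = 16$.

```python
import cmath


def power_de_moivre(z, k):
    """Return z**k for a positive integer k via z^k = r^k (cos k*phi + i sin k*phi)."""
    r, phi = cmath.polar(z)  # r = |z|, phi = arg z
    return cmath.rect(r ** k, k * phi)


z = 1 + 1j  # r = sqrt(2), phi = pi/4
result = power_de_moivre(z, 8)

# Agrees with the hand computation and with repeated multiplication,
# up to floating point rounding.
assert cmath.isclose(result, 16)
assert cmath.isclose(result, z ** 8)
```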

Complex Numbers and $\mathbb{R}^2$

Although the elements of $\mathbb{R}^2$ and $\mathbb{C}$ are in one-to-one correspondence, and the operations of addition are essentially the same, there is a logical distinction between $\mathbb{R}^2$ and $\mathbb{C}$. In $\mathbb{R}^2$ we can only multiply a vector by a real scalar, whereas in $\mathbb{C}$ we can multiply any two complex numbers to obtain a third complex number. (The dot product in $\mathbb{R}^2$ doesn’t count, because it produces a scalar, not an element of $\mathbb{R}^2$.) We use scalar notation for elements in $\mathbb{C}$ to emphasize this distinction.

FIGURE The real plane $\mathbb{R}^2$, with the points $(2, 4)$, $(-1, 2)$, $(4, 0)$, $(-3, -1)$, and $(3, -2)$, beside the complex plane $\mathbb{C}$, with the corresponding points $2 + 4i$, $-1 + 2i$, $4 + 0i$, $-3 - i$, and $3 - 2i$.

Glossary

A

adjugate (or classical adjoint): The matrix adj A formed from a square matrix A by replacing the $(i, j)$-entry of A by the $(i, j)$-cofactor, for all i and j, and then transposing the resulting matrix.

affine combination: A linear combination of vectors (points in $\mathbb{R}^n$) in which the sum of the weights involved is 1.

affine dependence relation: An equation of the form $c_1\mathbf{v}_1 + \cdots + c_p\mathbf{v}_p = \mathbf{0}$, where the weights $c_1, \ldots, c_p$ are not all zero, and $c_1 + \cdots + c_p = 0$.

affine hull (or affine span) of a set S: The set of all affine combinations of points in S, denoted by aff S.

affinely dependent set: A set $\{\mathbf{v}_1, \ldots, \mathbf{v}_p\}$ in $\mathbb{R}^n$ such that there are real numbers $c_1, \ldots, c_p$, not all zero, such that $c_1 + \cdots + c_p = 0$ and $c_1\mathbf{v}_1 + \cdots + c_p\mathbf{v}_p = \mathbf{0}$.

affinely independent set: A set $\{\mathbf{v}_1, \ldots, \mathbf{v}_p\}$ in $\mathbb{R}^n$ that is not affinely dependent.

affine set (or affine subset): A set S of points such that if p and q are in S, then $(1 - t)\mathbf{p} + t\mathbf{q} \in S$ for each real number t.

affine transformation: A mapping $T : \mathbb{R}^n \to \mathbb{R}^m$ of the form $T(\mathbf{x}) = A\mathbf{x} + \mathbf{b}$, with A an $m \times n$ matrix and b in $\mathbb{R}^m$.

algebraic multiplicity: The multiplicity of an eigenvalue as a root of the characteristic equation.

angle (between nonzero vectors u and v in $\mathbb{R}^2$ or $\mathbb{R}^3$): The angle $\vartheta$ between the two directed line segments from the origin to the points u and v. Related to the scalar product by $\mathbf{u} \cdot \mathbf{v} = \|\mathbf{u}\| \|\mathbf{v}\| \cos\vartheta$.

associative law of multiplication: $A(BC) = (AB)C$, for all A, B, C.

attractor (of a dynamical system in $\mathbb{R}^2$): The origin when all trajectories tend toward $\mathbf{0}$.

augmented matrix: A matrix made up of a coefficient matrix for a linear system and one or more columns to the right. Each extra column contains the constants from the right side of a system with the given coefficient matrix.

auxiliary equation: A polynomial equation in a variable r, created from the coefficients of a homogeneous difference equation.

B

back-substitution (with matrix notation): The backward phase of row reduction of an augmented matrix that transforms an echelon matrix into a reduced echelon matrix; used to find the solution(s) of a system of linear equations.

backward phase (of row reduction): The last part of the algorithm that reduces a matrix in echelon form to a reduced echelon form.

band matrix: A matrix whose nonzero entries lie within a band along the main diagonal.

barycentric coordinates (of a point p with respect to an affinely independent set $S = \{\mathbf{v}_1, \ldots, \mathbf{v}_k\}$): The (unique) set of weights $c_1, \ldots, c_k$ such that $\mathbf{p} = c_1\mathbf{v}_1 + \cdots + c_k\mathbf{v}_k$ and $c_1 + \cdots + c_k = 1$. (Sometimes also called the affine coordinates of p with respect to S.)

basic variable: A variable in a linear system that corresponds to a pivot column in the coefficient matrix.

basis (for a nontrivial subspace H of a vector space V): An indexed set $\mathcal{B} = \{\mathbf{v}_1, \ldots, \mathbf{v}_p\}$ in V such that: (i) $\mathcal{B}$ is a linearly independent set and (ii) the subspace spanned by $\mathcal{B}$ coincides with H, that is, $H = \operatorname{Span}\{\mathbf{v}_1, \ldots, \mathbf{v}_p\}$.

$\mathcal{B}$-coordinates of x: See coordinates of x relative to the basis $\mathcal{B}$.

best approximation: The closest point in a given subspace to a given vector.

bidiagonal matrix: A matrix whose nonzero entries lie on the main diagonal and on one diagonal adjacent to the main diagonal.

block diagonal (matrix): A partitioned matrix $A = [A_{ij}]$ such that each block $A_{ij}$ is a zero matrix for $i \ne j$.

block matrix: See partitioned matrix.

block matrix multiplication: The row–column multiplication of partitioned matrices as if the block entries were scalars.

block upper triangular (matrix): A partitioned matrix $A = [A_{ij}]$ such that each block $A_{ij}$ is a zero matrix for $i > j$.

boundary point of a set S in $\mathbb{R}^n$: A point p such that every open ball in $\mathbb{R}^n$ centered at p intersects both S and the complement of S.

bounded set in $\mathbb{R}^n$: A set that is contained in an open ball $B(\mathbf{0}, \delta)$ for some $\delta > 0$.

$\mathcal{B}$-matrix (for T): A matrix $[T]_{\mathcal{B}}$ for a linear transformation $T : V \to V$ relative to a basis $\mathcal{B}$ for V, with the property that $[T(\mathbf{x})]_{\mathcal{B}} = [T]_{\mathcal{B}} [\mathbf{x}]_{\mathcal{B}}$ for all x in V.

C

Cauchy–Schwarz inequality: $|\langle \mathbf{u}, \mathbf{v} \rangle| \le \|\mathbf{u}\| \|\mathbf{v}\|$ for all u, v.

change of basis: See change-of-coordinates matrix.

change-of-coordinates matrix (from a basis $\mathcal{B}$ to a basis $\mathcal{C}$): A matrix $P_{\mathcal{C} \leftarrow \mathcal{B}}$ that transforms $\mathcal{B}$-coordinate vectors into $\mathcal{C}$-coordinate vectors: $[\mathbf{x}]_{\mathcal{C}} = P_{\mathcal{C} \leftarrow \mathcal{B}} [\mathbf{x}]_{\mathcal{B}}$. If $\mathcal{C}$ is the standard basis for $\mathbb{R}^n$, then $P_{\mathcal{C} \leftarrow \mathcal{B}}$ is sometimes written as $P_{\mathcal{B}}$.

characteristic equation (of A): $\det(A - \lambda I) = 0$.

characteristic polynomial (of A): $\det(A - \lambda I)$ or, in some texts, $\det(\lambda I - A)$.

Cholesky factorization: A factorization $A = R^T R$, where R is an invertible upper triangular matrix whose diagonal entries are all positive.

closed ball (in $\mathbb{R}^n$): A set $\{\mathbf{x} : \|\mathbf{x} - \mathbf{p}\| \le \delta\}$ in $\mathbb{R}^n$, where p is in $\mathbb{R}^n$ and $\delta > 0$.

closed set (in $\mathbb{R}^n$): A set that contains all of its boundary points.

codomain (of a transformation $T : \mathbb{R}^n \to \mathbb{R}^m$): The set $\mathbb{R}^m$ that contains the range of T. In general, if T maps a vector space V into a vector space W, then W is called the codomain of T.

coefficient matrix: A matrix whose entries are the coefficients of a system of linear equations.

cofactor: A number $C_{ij} = (-1)^{i+j} \det A_{ij}$, called the $(i, j)$-cofactor of A, where $A_{ij}$ is the submatrix formed by deleting the ith row and the jth column of A.

cofactor expansion: A formula for det A using cofactors associated with one row or one column, such as for row 1: $\det A = a_{11}C_{11} + \cdots + a_{1n}C_{1n}$.

column–row expansion: The expression of a product AB as a sum of outer products: $\operatorname{col}_1(A)\operatorname{row}_1(B) + \cdots + \operatorname{col}_n(A)\operatorname{row}_n(B)$, where n is the number of columns of A.

column space (of an $m \times n$ matrix A): The set Col A of all linear combinations of the columns of A. If $A = [\mathbf{a}_1 \cdots \mathbf{a}_n]$, then $\operatorname{Col} A = \operatorname{Span}\{\mathbf{a}_1, \ldots, \mathbf{a}_n\}$. Equivalently, $\operatorname{Col} A = \{\mathbf{y} : \mathbf{y} = A\mathbf{x} \text{ for some } \mathbf{x} \text{ in } \mathbb{R}^n\}$.

column sum: The sum of the entries in a column of a matrix.

column vector: A matrix with only one column, or a single column of a matrix that has several columns.

commuting matrices: Two matrices A and B such that $AB = BA$.

compact set (in $\mathbb{R}^n$): A set in $\mathbb{R}^n$ that is both closed and bounded.

companion matrix: A special form of matrix whose characteristic polynomial is $(-1)^n p(\lambda)$ when $p(\lambda)$ is a specified polynomial whose leading term is $\lambda^n$.

complex eigenvalue: A nonreal root of the characteristic equation of an $n \times n$ matrix.

complex eigenvector: A nonzero vector x in $\mathbb{C}^n$ such that $A\mathbf{x} = \lambda\mathbf{x}$, where A is an $n \times n$ matrix and $\lambda$ is a complex eigenvalue.

component of y orthogonal to u (for $\mathbf{u} \ne \mathbf{0}$): The vector $\mathbf{y} - \dfrac{\mathbf{y} \cdot \mathbf{u}}{\mathbf{u} \cdot \mathbf{u}}\mathbf{u}$.

composition of linear transformations: A mapping produced by applying two or more linear transformations in succession. If the transformations are matrix transformations, say left-multiplication by B followed by left-multiplication by A, then the composition is the mapping $\mathbf{x} \mapsto A(B\mathbf{x})$.

condition number (of A): The quotient $\sigma_1/\sigma_n$, where $\sigma_1$ is the largest singular value of A and $\sigma_n$ is the smallest singular value. The condition number is $+\infty$ when $\sigma_n$ is zero.

conformable for block multiplication: Two partitioned matrices A and B such that the block product AB is defined: The column partition of A must match the row partition of B.

consistent linear system: A linear system with at least one solution.

constrained optimization: The problem of maximizing a quantity such as $\mathbf{x}^T A\mathbf{x}$ or $\|A\mathbf{x}\|$ when x is subject to one or more constraints, such as $\mathbf{x}^T\mathbf{x} = 1$ or $\mathbf{x}^T\mathbf{v} = 0$.

consumption matrix: A matrix in the Leontief input–output model whose columns are the unit consumption vectors for the various sectors of an economy.

contraction: A mapping $\mathbf{x} \mapsto r\mathbf{x}$ for some scalar r, with $0 \le r \le 1$.

controllable (pair of matrices): A matrix pair $(A, B)$ where A is $n \times n$, B has n rows, and $\operatorname{rank} [\, B \;\; AB \;\; A^2B \;\; \cdots \;\; A^{n-1}B \,] = n$. Related to a state-space model of a control system and the difference equation $\mathbf{x}_{k+1} = A\mathbf{x}_k + B\mathbf{u}_k$ $(k = 0, 1, \ldots)$.

convergent (sequence of vectors): A sequence $\{\mathbf{x}_k\}$ such that the entries in $\mathbf{x}_k$ can be made as close as desired to the entries in some fixed vector for all k sufficiently large.

convex combination (of points $\mathbf{v}_1, \ldots, \mathbf{v}_k$ in $\mathbb{R}^n$): A linear combination of vectors (points) in which the weights in the combination are nonnegative and the sum of the weights is 1.

convex hull (of a set S): The set of all convex combinations of points in S, denoted by conv S.

convex set: A set S with the property that for each p and q in S, the line segment $\overline{\mathbf{pq}}$ is contained in S.

coordinate mapping (determined by an ordered basis $\mathcal{B}$ in a vector space V): A mapping that associates to each x in V its coordinate vector $[\mathbf{x}]_{\mathcal{B}}$.

coordinates of x relative to the basis $\mathcal{B} = \{\mathbf{b}_1, \ldots, \mathbf{b}_n\}$: The weights $c_1, \ldots, c_n$ in the equation $\mathbf{x} = c_1\mathbf{b}_1 + \cdots + c_n\mathbf{b}_n$.

coordinate vector of x relative to $\mathcal{B}$: The vector $[\mathbf{x}]_{\mathcal{B}}$ whose entries are the coordinates of x relative to the basis $\mathcal{B}$.

covariance (of variables $x_i$ and $x_j$, for $i \ne j$): The entry $s_{ij}$ in the covariance matrix S for a matrix of observations, where $x_i$ and $x_j$ vary over the ith and jth coordinates, respectively, of the observation vectors.

covariance matrix (or sample covariance matrix): The $p \times p$ matrix S defined by $S = (N - 1)^{-1} BB^T$, where B is a $p \times N$ matrix of observations in mean-deviation form.

Cramer’s rule: A formula for each entry in the solution x of the equation $A\mathbf{x} = \mathbf{b}$ when A is an invertible matrix.

cross-product term: A term $c x_i x_j$ in a quadratic form, with $i \ne j$.

cube: A three-dimensional solid object bounded by six square faces, with three faces meeting at each vertex.

D

decoupled system: A difference equation $\mathbf{y}_{k+1} = A\mathbf{y}_k$, or a differential equation $\mathbf{y}'(t) = A\mathbf{y}(t)$, in which A is a diagonal matrix. The discrete evolution of each entry in $\mathbf{y}_k$ (as a function of k), or the continuous evolution of each entry in the vector-valued function $\mathbf{y}(t)$, is unaffected by what happens to the other entries as $k \to \infty$ or $t \to \infty$.

design matrix: The matrix X in the linear model $\mathbf{y} = X\boldsymbol{\beta} + \boldsymbol{\epsilon}$, where the columns of X are determined in some way by the observed values of some independent variables.

determinant (of a square matrix A): The number det A defined inductively by a cofactor expansion along the first row of A. Also, $(-1)^r$ times the product of the diagonal entries in any echelon form U obtained from A by row replacements and r row interchanges (but no scaling operations).

diagonal entries (in a matrix): Entries having equal row and column indices.

diagonalizable (matrix): A matrix that can be written in factored form as $PDP^{-1}$, where D is a diagonal matrix and P is an invertible matrix.

diagonal matrix: A square matrix whose entries not on the main diagonal are all zero.

difference equation (or linear recurrence relation): An equation of the form $\mathbf{x}_{k+1} = A\mathbf{x}_k$ $(k = 0, 1, 2, \ldots)$ whose solution is a sequence of vectors, $\mathbf{x}_0, \mathbf{x}_1, \ldots$.

dilation: A mapping $\mathbf{x} \mapsto r\mathbf{x}$ for some scalar r, with $1 < r$.

dimension:
of a flat S: The dimension of the corresponding parallel subspace.
of a set S: The dimension of the smallest flat containing S.
of a subspace S: The number of vectors in a basis for S, written as dim S.
of a vector space V: The number of vectors in a basis for V, written as dim V. The dimension of the zero space is 0.

discrete linear dynamical system: A difference equation of the form $\mathbf{x}_{k+1} = A\mathbf{x}_k$ that describes the changes in a system (usually a physical system) as time passes. The physical system is measured at discrete times, when $k = 0, 1, 2, \ldots$, and the state of the system at time k is a vector $\mathbf{x}_k$ whose entries provide certain facts of interest about the system.

distance between u and v: The length of the vector $\mathbf{u} - \mathbf{v}$, denoted by $\operatorname{dist}(\mathbf{u}, \mathbf{v})$.

distance to a subspace: The distance from a given point (vector) v to the nearest point in the subspace.

distributive laws: (left) $A(B + C) = AB + AC$, and (right) $(B + C)A = BA + CA$, for all A, B, C.

domain (of a transformation T): The set of all vectors x for which $T(\mathbf{x})$ is defined.

dot product: See inner product.

dynamical system: See discrete linear dynamical system.

E

echelon form (or row echelon form, of a matrix): An echelon matrix that is row equivalent to the given matrix.

echelon matrix (or row echelon matrix): A rectangular matrix that has three properties: (1) All nonzero rows are above any row of all zeros. (2) Each leading entry of a row is in a column to the right of the leading entry of the row above it. (3) All entries in a column below a leading entry are zero.

eigenfunctions (of a differential equation $\mathbf{x}'(t) = A\mathbf{x}(t)$): A function $\mathbf{x}(t) = \mathbf{v}e^{\lambda t}$, where v is an eigenvector of A and $\lambda$ is the corresponding eigenvalue.

eigenspace (of A corresponding to $\lambda$): The set of all solutions of $A\mathbf{x} = \lambda\mathbf{x}$, where $\lambda$ is an eigenvalue of A. Consists of the zero vector and all eigenvectors corresponding to $\lambda$.

eigenvalue (of A): A scalar $\lambda$ such that the equation $A\mathbf{x} = \lambda\mathbf{x}$ has a solution for some nonzero vector x.

eigenvector (of A): A nonzero vector x such that $A\mathbf{x} = \lambda\mathbf{x}$ for some scalar $\lambda$.

eigenvector basis: A basis consisting entirely of eigenvectors of a given matrix.

eigenvector decomposition (of x): An equation, $\mathbf{x} = c_1\mathbf{v}_1 + \cdots + c_n\mathbf{v}_n$, expressing x as a linear combination of eigenvectors of a matrix.

elementary matrix: An invertible matrix that results by performing one elementary row operation on an identity matrix.

elementary row operations: (1) (Replacement) Replace one row by the sum of itself and a multiple of another row. (2) Interchange two rows. (3) (Scaling) Multiply all entries in a row by a nonzero constant.

equal vectors: Vectors in $\mathbb{R}^n$ whose corresponding entries are the same.

equilibrium prices: A set of prices for the total output of the various sectors in an economy, such that the income of each sector exactly balances its expenses.

equilibrium vector: See steady-state vector.

equivalent (linear) systems: Linear systems with the same solution set.

exchange model: See Leontief exchange model.

existence question: Asks, “Does a solution to the system exist?” That is, “Is the system consistent?” Also, “Does a solution of $A\mathbf{x} = \mathbf{b}$ exist for all possible b?”

expansion by cofactors: See cofactor expansion.

explicit description (of a subspace W of $\mathbb{R}^n$): A parametric representation of W as the set of all linear combinations of a set of specified vectors.

extreme point (of a convex set S): A point p in S such that p is not in the interior of any line segment that lies in S. (That is, if x, y are in S and p is on the line segment $\overline{\mathbf{xy}}$, then $\mathbf{p} = \mathbf{x}$ or $\mathbf{p} = \mathbf{y}$.)

F

factorization (of A): An equation that expresses A as a product of two or more matrices.

final demand vector (or bill of final demands): The vector d in the Leontief input–output model that lists the dollar values of the goods and services demanded from the various sectors by the nonproductive part of the economy. The vector d can represent consumer demand, government consumption, surplus production, exports, or other external demand.

finite-dimensional (vector space): A vector space that is spanned by a finite set of vectors.

flat (in $\mathbb{R}^n$): A translate of a subspace of $\mathbb{R}^n$.

flexibility matrix: A matrix whose jth column gives the deflections of an elastic beam at specified points when a unit force is applied at the jth point on the beam.

floating point arithmetic: Arithmetic with numbers represented as decimals $\pm .d_1 \cdots d_p \times 10^r$, where r is an integer and the number p of digits to the right of the decimal point is usually between 8 and 16.

flop: One arithmetic operation $(+, -, *, /)$ on two real floating point numbers.

forward phase (of row reduction): The first part of the algorithm that reduces a matrix to echelon form.

Fourier approximation (of order n): The closest point in the subspace of nth-order trigonometric polynomials to a given function in $C[0, 2\pi]$.

Fourier coefficients: The weights used to make a trigonometric polynomial as a Fourier approximation to a function.

Fourier series: An infinite series that converges to a function in the inner product space $C[0, 2\pi]$, with the inner product given by a definite integral.

free variable: Any variable in a linear system that is not a basic variable.

full rank (matrix): An $m \times n$ matrix whose rank is the smaller of m and n.

fundamental set of solutions: A basis for the set of all solutions of a homogeneous linear difference or differential equation.

fundamental subspaces (determined by A): The null space and column space of A, and the null space and column space of $A^T$, with $\operatorname{Col} A^T$ commonly called the row space of A.

G

Gaussian elimination: See row reduction algorithm.

general least-squares problem: Given an $m \times n$ matrix A and a vector b in $\mathbb{R}^m$, find $\hat{\mathbf{x}}$ in $\mathbb{R}^n$ such that $\|\mathbf{b} - A\hat{\mathbf{x}}\| \le \|\mathbf{b} - A\mathbf{x}\|$ for all x in $\mathbb{R}^n$.

general solution (of a linear system): A parametric description of a solution set that expresses the basic variables in terms of the free variables (the parameters), if any. After Section 1.5, the parametric description is written in vector form.

Givens rotation: A linear transformation from $\mathbb{R}^n$ to $\mathbb{R}^n$ used in computer programs to create zero entries in a vector (usually a column of a matrix).

Gram matrix (of A): The matrix $A^T A$.

Gram–Schmidt process: An algorithm for producing an orthogonal or orthonormal basis for a subspace that is spanned by a given set of vectors.

H

homogeneous coordinates: In $\mathbb{R}^3$, the representation of $(x, y, z)$ as $(X, Y, Z, H)$ for any $H \ne 0$, where $x = X/H$, $y = Y/H$, and $z = Z/H$. In $\mathbb{R}^2$, H is usually taken as 1, and the homogeneous coordinates of $(x, y)$ are written as $(x, y, 1)$.

homogeneous equation: An equation of the form $A\mathbf{x} = \mathbf{0}$, possibly written as a vector equation or as a system of linear equations.

homogeneous form of (a vector) v in $\mathbb{R}^n$: The point $\tilde{\mathbf{v}} = \begin{bmatrix} \mathbf{v} \\ 1 \end{bmatrix}$ in $\mathbb{R}^{n+1}$.

Householder reflection: A transformation $\mathbf{x} \mapsto Q\mathbf{x}$, where $Q = I - 2\mathbf{u}\mathbf{u}^T$ and u is a unit vector $(\mathbf{u}^T\mathbf{u} = 1)$.

hyperplane (in $\mathbb{R}^n$): A flat in $\mathbb{R}^n$ of dimension $n - 1$. Also: a translate of a subspace of dimension $n - 1$.

I

identity matrix (denoted by I or $I_n$): A square matrix with ones on the diagonal and zeros elsewhere.

ill-conditioned matrix: A square matrix with a large (or possibly infinite) condition number; a matrix that is singular or can become singular if some of its entries are changed ever so slightly.

image (of a vector x under a transformation T): The vector $T(\mathbf{x})$ assigned to x by T.

implicit description (of a subspace W of $\mathbb{R}^n$): A set of one or more homogeneous equations that characterize the points of W.

Im x: The vector in $\mathbb{R}^n$ formed from the imaginary parts of the entries of a vector x in $\mathbb{C}^n$.

inconsistent linear system: A linear system with no solution.

indefinite matrix: A symmetric matrix A such that $\mathbf{x}^T A\mathbf{x}$ assumes both positive and negative values.

indefinite quadratic form: A quadratic form Q such that $Q(\mathbf{x})$ assumes both positive and negative values.

infinite-dimensional (vector space): A nonzero vector space V that has no finite basis.

inner product: The scalar $\mathbf{u}^T\mathbf{v}$, usually written as $\mathbf{u} \cdot \mathbf{v}$, where u and v are vectors in $\mathbb{R}^n$ viewed as $n \times 1$ matrices. Also called the dot product of u and v. In general, a function on a vector space that assigns to each pair of vectors u and v a number $\langle \mathbf{u}, \mathbf{v} \rangle$, subject to certain axioms. See Section 6.7.

inner product space: A vector space on which is defined an inner product.

input–output matrix: See consumption matrix.

input–output model: See Leontief input–output model.

interior point (of a set S in $\mathbb{R}^n$): A point p in S such that for some $\delta > 0$, the open ball $B(\mathbf{p}, \delta)$ centered at p is contained in S.

intermediate demands: Demands for goods or services that will be consumed in the process of producing other goods and services for consumers. If x is the production level and C is the consumption matrix, then $C\mathbf{x}$ lists the intermediate demands.

interpolating polynomial: A polynomial whose graph passes through every point in a set of data points in $\mathbb{R}^2$.

invariant subspace (for A): A subspace H such that $A\mathbf{x}$ is in H whenever x is in H.

inverse (of an $n \times n$ matrix A): An $n \times n$ matrix $A^{-1}$ such that $AA^{-1} = A^{-1}A = I_n$.

inverse power method: An algorithm for estimating an eigenvalue $\lambda$ of a square matrix, when a good initial estimate of $\lambda$ is available.

invertible linear transformation: A linear transformation $T : \mathbb{R}^n \to \mathbb{R}^n$ such that there exists a function $S : \mathbb{R}^n \to \mathbb{R}^n$ satisfying both $T(S(\mathbf{x})) = \mathbf{x}$ and $S(T(\mathbf{x})) = \mathbf{x}$ for all x in $\mathbb{R}^n$.

invertible matrix: A square matrix that possesses an inverse.

isomorphic vector spaces: Two vector spaces V and W for which there is a one-to-one linear transformation T that maps V onto W.

isomorphism: A one-to-one linear mapping from one vector space onto another.

K

kernel (of a linear transformation $T : V \to W$): The set of x in V such that $T(\mathbf{x}) = \mathbf{0}$.

Kirchhoff’s laws: (1) (voltage law) The algebraic sum of the RI voltage drops in one direction around a loop equals the algebraic sum of the voltage sources in the same direction around the loop. (2) (current law) The current in a branch is the algebraic sum of the loop currents flowing through that branch.

L

ladder network: An electrical network assembled by connecting in series two or more electrical circuits.

leading entry: The leftmost nonzero entry in a row of a matrix.

least-squares error: The distance $\|\mathbf{b} - A\hat{\mathbf{x}}\|$ from b to $A\hat{\mathbf{x}}$, when $\hat{\mathbf{x}}$ is a least-squares solution of $A\mathbf{x} = \mathbf{b}$.

least-squares line: The line $y = \hat{\beta}_0 + \hat{\beta}_1 x$ that minimizes the least-squares error in the equation $\mathbf{y} = X\boldsymbol{\beta} + \boldsymbol{\epsilon}$.

least-squares solution (of $A\mathbf{x} = \mathbf{b}$): A vector $\hat{\mathbf{x}}$ such that $\|\mathbf{b} - A\hat{\mathbf{x}}\| \le \|\mathbf{b} - A\mathbf{x}\|$ for all x in $\mathbb{R}^n$.

left inverse (of A): Any rectangular matrix C such that $CA = I$.

left-multiplication (by A): Multiplication of a vector or matrix on the left by A.

left singular vectors (of A): The columns of U in the singular value decomposition $A = U\Sigma V^T$.

length (or norm, of v): The scalar $\|\mathbf{v}\| = \sqrt{\mathbf{v} \cdot \mathbf{v}} = \sqrt{\langle \mathbf{v}, \mathbf{v} \rangle}$.

Leontief exchange (or closed) model: A model of an economy where inputs and outputs are fixed, and where a set of prices for the outputs of the sectors is sought such that the income of each sector equals its expenditures. This “equilibrium” condition is expressed as a system of linear equations, with the prices as the unknowns.

Leontief input–output model (or Leontief production equation): The equation $\mathbf{x} = C\mathbf{x} + \mathbf{d}$, where x is production, d is final demand, and C is the consumption (or input–output) matrix. The jth column of C lists the inputs that sector j consumes per unit of output.

level set (or gradient) of a linear functional f on $\mathbb{R}^n$: A set $[f : d] = \{\mathbf{x} \in \mathbb{R}^n : f(\mathbf{x}) = d\}$.

linear combination: A sum of scalar multiples of vectors. The scalars are called the weights.

linear dependence relation: A homogeneous vector equation where the weights are all specified and at least one weight is nonzero.

linear equation (in the variables $x_1, \ldots, x_n$): An equation that can be written in the form $a_1x_1 + a_2x_2 + \cdots + a_nx_n = b$, where b and the coefficients $a_1, \ldots, a_n$ are real or complex numbers.

linear filter: A linear difference equation used to transform discrete-time signals.

linear functional (on $\mathbb{R}^n$): A linear transformation f from $\mathbb{R}^n$ into $\mathbb{R}$.

linearly dependent (vectors): An indexed set $\{\mathbf{v}_1, \ldots, \mathbf{v}_p\}$ with the property that there exist weights $c_1, \ldots, c_p$, not all zero, such that $c_1\mathbf{v}_1 + \cdots + c_p\mathbf{v}_p = \mathbf{0}$. That is, the vector equation $c_1\mathbf{v}_1 + c_2\mathbf{v}_2 + \cdots + c_p\mathbf{v}_p = \mathbf{0}$ has a nontrivial solution.
linearly independent (vectors): An indexed set fv1 ; : : : ; vp g with the property that the vector equation c1 v1 C c2 v2 C    C cp vp D 0 has only the trivial solution, c1 D    D cp D 0. linear model (in statistics): Any equation of the form y D Xˇ C , where X and y are known and ˇ is to be chosen to minimize the length of the residual vector, . linear system: A collection of one or more linear equations involving the same variables, say, x1 ; : : : ; xn . linear transformation T (from a vector space V into a vector space W ): A rule T that assigns to each vector x in V a unique vector T .x/ in W , such that (i) T .u C v/ D T .u/ C T .v/ for all u; v in V , and (ii) T .c u/ D cT .u/ for all u in V and all scalars c . Notation:

A12

Glossary

T W V ! W ; also, x 7! Ax when T W Rn ! Rm and A is the standard matrix for T . line through p parallel to v:

The set fp C t v W t in Rg.

migration matrix: A matrix that gives the percentage movement between different locations, from one period to the next.

loop current: The amount of electric current flowing through a loop that makes the algebraic sum of the RI voltage drops around the loop equal to the algebraic sum of the voltage sources in the loop.

minimal spanning set (for a subspace H ): A set B that spans H and has the property that if one of the elements of B is removed from B, then the new set does not span H .

lower triangular matrix: diagonal.

Moore–Penrose inverse:

A matrix with zeros above the main

lower triangular part (of A): A lower triangular matrix whose entries on the main diagonal and below agree with those in A. LU factorization: The representation of a matrix A in the form A D LU where L is a square lower triangular matrix with ones on the diagonal (a unit lower triangular matrix) and U is an echelon form of A.

M magnitude (of a vector):

See norm.

main diagonal (of a matrix): column indices. mapping:

The entries with equal row and

See transformation.

Markov chain: A sequence of probability vectors x0 , x1 , x2 ; : : : ; together with a stochastic matrix P such that xkC1 D P xk for k D 0; 1; 2; : : : : matrix:

A rectangular array of numbers.

matrix equation: An equation that involves at least one matrix; for instance, Ax D b. matrix for T relative to bases B and C : A matrix M for a linear transformation T W V ! W with the property that ŒT .x/C D M ŒxB for all x in V , where B is a basis for V and C is a basis for W . When W D V and C D B, the matrix M is called the B-matrix for T and is denoted by ŒT B .

matrix of observations: A p  N matrix whose columns are observation vectors, each column listing p measurements made on an individual or object in a specified population or set. matrix transformation: A mapping x 7! Ax, where A is an m  n matrix and x represents any vector in Rn .

maximal linearly independent set (in V ): A linearly independent set B in V such that if a vector v in V but not in B is added to B, then the new set is linearly dependent. mean-deviation form (of a matrix of observations): A matrix whose row vectors are in mean-deviation form. For each row, the entries sum to zero.

m  n matrix:

A matrix with m rows and n columns. See pseudoinverse.

multiple regression: A linear model involving several independent variables and one dependent variable.

N nearly singular matrix:

An ill-conditioned matrix.

negative definite matrix: A symmetric matrix A such that xTAx < 0 for all x ¤ 0. negative definite quadratic form: that Q.x/ < 0 for all x ¤ 0. negative semidefinite matrix: xTAx  0 for all x.

A quadratic form Q such

A symmetric matrix A such that

negative semidefinite quadratic form: such that Q.x/  0 for all x.

A quadratic form Q

nonhomogeneous equation: An equation of the form $A\mathbf{x} = \mathbf{b}$ with $\mathbf{b} \neq \mathbf{0}$, possibly written as a vector equation or as a system of linear equations.

nonsingular (matrix): An invertible matrix.

nontrivial solution: A nonzero solution of a homogeneous equation or system of homogeneous equations.

nonzero (matrix or vector): A matrix (with possibly only one row or column) that contains at least one nonzero entry.

norm (or length, of $\mathbf{v}$): The scalar $\|\mathbf{v}\| = \sqrt{\mathbf{v} \cdot \mathbf{v}} = \sqrt{\langle \mathbf{v}, \mathbf{v} \rangle}$.

normal equations: The system of equations represented by $A^T A\mathbf{x} = A^T\mathbf{b}$, whose solution yields all least-squares solutions of $A\mathbf{x} = \mathbf{b}$. In statistics, a common notation is $X^T X\boldsymbol{\beta} = X^T\mathbf{y}$.

normalizing (a nonzero vector $\mathbf{v}$): The process of creating a unit vector $\mathbf{u}$ that is a positive multiple of $\mathbf{v}$.

normal vector (to a subspace $V$ of $\mathbb{R}^n$): A vector $\mathbf{n}$ in $\mathbb{R}^n$ such that $\mathbf{n} \cdot \mathbf{x} = 0$ for all $\mathbf{x}$ in $V$.

null space (of an $m \times n$ matrix $A$): The set $\operatorname{Nul} A$ of all solutions to the homogeneous equation $A\mathbf{x} = \mathbf{0}$. $\operatorname{Nul} A = \{\mathbf{x} : \mathbf{x} \text{ is in } \mathbb{R}^n \text{ and } A\mathbf{x} = \mathbf{0}\}$.
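The normal equations entry in this section can be illustrated with a small sketch. It assumes the simplest case, a least-squares line $y = \beta_0 + \beta_1 x$, so that $X^T X$ is $2 \times 2$ and can be solved by hand. The function name `least_squares_line` and the sample data are illustrative assumptions, not taken from the text.

```python
def least_squares_line(xs, ys):
    """Fit y = b0 + b1*x by solving the normal equations X^T X beta = X^T y.

    X is the design matrix whose rows are (1, x_i).
    """
    n = len(xs)
    # Entries of X^T X = [[n, sx], [sx, sxx]] and X^T y = [sy, sxy].
    sx = sum(xs)
    sxx = sum(x * x for x in xs)
    sy = sum(ys)
    sxy = sum(x * y for x, y in zip(xs, ys))
    det = n * sxx - sx * sx  # zero exactly when all x_i coincide
    if det == 0:
        raise ValueError("columns of X are linearly dependent")
    # Solve the 2x2 system by Cramer's rule.
    b0 = (sxx * sy - sx * sxy) / det
    b1 = (n * sxy - sx * sy) / det
    return b0, b1


# Illustrative data: four points in the plane.
b0, b1 = least_squares_line([2, 5, 7, 8], [1, 2, 3, 3])
```

For larger models one would normally hand the system to a linear-algebra library rather than use Cramer's rule; the closed form is used here only to keep the sketch dependency-free.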

mean-deviation form (of a vector): A vector whose entries sum to zero.

mean square error: The error of an approximation in an inner product space, where the inner product is defined by a definite integral.

O

observation vector: The vector $\mathbf{y}$ in the linear model $\mathbf{y} = X\boldsymbol{\beta} + \boldsymbol{\epsilon}$, where the entries in $\mathbf{y}$ are the observed values of a dependent variable.

one-to-one (mapping): A mapping $T : \mathbb{R}^n \to \mathbb{R}^m$ such that each $\mathbf{b}$ in $\mathbb{R}^m$ is the image of at most one $\mathbf{x}$ in $\mathbb{R}^n$.

onto (mapping): A mapping $T : \mathbb{R}^n \to \mathbb{R}^m$ such that each $\mathbf{b}$ in $\mathbb{R}^m$ is the image of at least one $\mathbf{x}$ in $\mathbb{R}^n$.

open ball $B(\mathbf{p}, \delta)$ in $\mathbb{R}^n$: The set $\{\mathbf{x} : \|\mathbf{x} - \mathbf{p}\| < \delta\}$ in $\mathbb{R}^n$, where $\delta > 0$.

open set $S$ in $\mathbb{R}^n$: A set that contains none of its boundary points. (Equivalently, $S$ is open if every point of $S$ is an interior point.)

origin: The zero vector.

orthogonal basis: A basis that is also an orthogonal set.

orthogonal complement (of $W$): The set $W^\perp$ of all vectors orthogonal to $W$.

orthogonal decomposition: The representation of a vector $\mathbf{y}$ as the sum of two vectors, one in a specified subspace $W$ and the other in $W^\perp$. In general, a decomposition $\mathbf{y} = c_1\mathbf{u}_1 + \cdots + c_p\mathbf{u}_p$, where $\{\mathbf{u}_1, \ldots, \mathbf{u}_p\}$ is an orthogonal basis for a subspace that contains $\mathbf{y}$.

orthogonally diagonalizable (matrix): A matrix $A$ that admits a factorization, $A = PDP^{-1}$, with $P$ an orthogonal matrix ($P^{-1} = P^T$) and $D$ diagonal.

orthogonal matrix: A square invertible matrix $U$ such that $U^{-1} = U^T$.

orthogonal projection of $\mathbf{y}$ onto $\mathbf{u}$ (or onto the line through $\mathbf{u}$ and the origin, for $\mathbf{u} \neq \mathbf{0}$): The vector $\hat{\mathbf{y}}$ defined by $\hat{\mathbf{y}} = \dfrac{\mathbf{y} \cdot \mathbf{u}}{\mathbf{u} \cdot \mathbf{u}}\,\mathbf{u}$.

orthogonal projection of $\mathbf{y}$ onto $W$: The unique vector $\hat{\mathbf{y}}$ in $W$ such that $\mathbf{y} - \hat{\mathbf{y}}$ is orthogonal to $W$. Notation: $\hat{\mathbf{y}} = \operatorname{proj}_W \mathbf{y}$.

orthogonal set: A set $S$ of vectors such that $\mathbf{u} \cdot \mathbf{v} = 0$ for each distinct pair $\mathbf{u}, \mathbf{v}$ in $S$.

orthogonal to $W$: Orthogonal to every vector in $W$.

orthonormal basis: A basis that is an orthogonal set of unit vectors.

orthonormal set: An orthogonal set of unit vectors.

outer product: A matrix product $\mathbf{u}\mathbf{v}^T$ where $\mathbf{u}$ and $\mathbf{v}$ are vectors in $\mathbb{R}^n$ viewed as $n \times 1$ matrices. (The transpose symbol is on the "outside" of the symbols $\mathbf{u}$ and $\mathbf{v}$.)

overdetermined system: A system of equations with more equations than unknowns.

P

parallel flats: Two or more flats such that each flat is a translate of the other flats.

parallelogram rule for addition: A geometric interpretation of the sum of two vectors $\mathbf{u}$, $\mathbf{v}$ as the diagonal of the parallelogram determined by $\mathbf{u}$, $\mathbf{v}$, and $\mathbf{0}$.

parameter vector: The unknown vector $\boldsymbol{\beta}$ in the linear model $\mathbf{y} = X\boldsymbol{\beta} + \boldsymbol{\epsilon}$.

parametric equation of a line: An equation of the form $\mathbf{x} = \mathbf{p} + t\mathbf{v}$ ($t$ in $\mathbb{R}$).

parametric equation of a plane: An equation of the form $\mathbf{x} = \mathbf{p} + s\mathbf{u} + t\mathbf{v}$ ($s$, $t$ in $\mathbb{R}$), with $\mathbf{u}$ and $\mathbf{v}$ linearly independent.

partitioned matrix (or block matrix): A matrix whose entries are themselves matrices of appropriate sizes.

permuted lower triangular matrix: A matrix such that a permutation of its rows will form a lower triangular matrix.

permuted LU factorization: The representation of a matrix $A$ in the form $A = LU$ where $L$ is a square matrix such that a permutation of its rows will form a unit lower triangular matrix, and $U$ is an echelon form of $A$.

pivot: A nonzero number that either is used in a pivot position to create zeros through row operations or is changed into a leading 1, which in turn is used to create zeros.

pivot column: A column that contains a pivot position.

pivot position: A position in a matrix $A$ that corresponds to a leading entry in an echelon form of $A$.

plane through $\mathbf{u}$, $\mathbf{v}$, and the origin: A set whose parametric equation is $\mathbf{x} = s\mathbf{u} + t\mathbf{v}$ ($s$, $t$ in $\mathbb{R}$), with $\mathbf{u}$ and $\mathbf{v}$ linearly independent.

polar decomposition (of $A$): A factorization $A = PQ$, where $P$ is an $n \times n$ positive semidefinite matrix with the same rank as $A$, and $Q$ is an $n \times n$ orthogonal matrix.

polygon: A polytope in $\mathbb{R}^2$.

polyhedron: A polytope in $\mathbb{R}^3$.

polytope: The convex hull of a finite set of points in $\mathbb{R}^n$ (a special type of compact convex set).

positive combination (of points $\mathbf{v}_1, \ldots, \mathbf{v}_m$ in $\mathbb{R}^n$): A linear combination $c_1\mathbf{v}_1 + \cdots + c_m\mathbf{v}_m$, where all $c_i \geq 0$.

positive definite matrix: A symmetric matrix $A$ such that $\mathbf{x}^T A\mathbf{x} > 0$ for all $\mathbf{x} \neq \mathbf{0}$.

positive definite quadratic form: A quadratic form $Q$ such that $Q(\mathbf{x}) > 0$ for all $\mathbf{x} \neq \mathbf{0}$.

positive hull (of a set $S$): The set of all positive combinations of points in $S$, denoted by $\operatorname{pos} S$.

positive semidefinite matrix: A symmetric matrix $A$ such that $\mathbf{x}^T A\mathbf{x} \geq 0$ for all $\mathbf{x}$.

positive semidefinite quadratic form: A quadratic form $Q$ such that $Q(\mathbf{x}) \geq 0$ for all $\mathbf{x}$.

power method: An algorithm for estimating a strictly dominant eigenvalue of a square matrix.

principal axes (of a quadratic form $\mathbf{x}^T A\mathbf{x}$): The orthonormal columns of an orthogonal matrix $P$ such that $P^{-1}AP$ is diagonal. (These columns are unit eigenvectors of $A$.) Usually the columns of $P$ are ordered in such a way that the corresponding eigenvalues of $A$ are arranged in decreasing order of magnitude.

principal components (of the data in a matrix $B$ of observations): The unit eigenvectors of a sample covariance matrix $S$ for $B$, with the eigenvectors arranged so that the corresponding eigenvalues of $S$ decrease in magnitude. If $B$ is in mean-deviation form, then the principal components are the right singular vectors in a singular value decomposition of $B^T$.

probability vector: A vector in $\mathbb{R}^n$ whose entries are nonnegative and sum to one.
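The power method entry in this section names an algorithm without spelling it out. The sketch below shows one common form of it, under stated assumptions: the iterate is rescaled at each step by its entry of largest absolute value, the matrix has a strictly dominant eigenvalue, and the starting vector is not orthogonal to the dominant eigenvector. The function name `power_method`, the fixed step count, and the sample matrix are illustrative choices, not taken from the text.

```python
def power_method(A, x0, steps=50):
    """Estimate a strictly dominant eigenvalue of the square matrix A.

    Returns (mu, x): mu approximates the eigenvalue and x an
    eigenvector scaled so that its largest entry is 1.
    """
    x = list(x0)
    mu = 0.0
    for _ in range(steps):
        # y = A x, computed row by row.
        y = [sum(a * xj for a, xj in zip(row, x)) for row in A]
        # mu is the entry of y with the largest absolute value.
        mu = max(y, key=abs)
        if mu == 0:
            raise ValueError("iterate collapsed to the zero vector")
        # Rescale so the largest entry of the next iterate is 1.
        x = [yi / mu for yi in y]
    return mu, x


# Illustrative matrix with eigenvalues 7 and 1.
mu, x = power_method([[6, 5], [1, 2]], [0, 1])
```

The error shrinks roughly like the ratio of the second-largest to the largest eigenvalue magnitude per step, so a fixed step count is adequate only when that ratio is well below 1.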

product $A\mathbf{x}$: The linear combination of the columns of $A$ using the corresponding entries in $\mathbf{x}$ as weights.

production vector: The vector in the Leontief input–output model that lists the amounts that are to be produced by the various sectors of an economy.

profile (of a set $S$ in $\mathbb{R}^n$): The set of extreme points of $S$.

projection matrix (or orthogonal projection matrix): A symmetric matrix $B$ such that $B^2 = B$. A simple example is $B = \mathbf{v}\mathbf{v}^T$, where $\mathbf{v}$ is a unit vector.

proper subset of a set $S$: A subset of $S$ that does not equal $S$ itself.

proper subspace: Any subspace of a vector space $V$ other than $V$ itself.

pseudoinverse (of $A$): The matrix $VD^{-1}U^T$, when $UDV^T$ is a reduced singular value decomposition of $A$.

Q

QR factorization: A factorization of an $m \times n$ matrix $A$ with linearly independent columns, $A = QR$, where $Q$ is an $m \times n$ matrix whose columns form an orthonormal basis for $\operatorname{Col} A$, and $R$ is an $n \times n$ upper triangular invertible matrix with positive entries on its diagonal.

quadratic Bézier curve: A curve whose description may be written in the form $\mathbf{g}(t) = (1 - t)\mathbf{f}_0(t) + t\,\mathbf{f}_1(t)$ for $0 \leq t \leq 1$, where $\mathbf{f}_0(t) = (1 - t)\mathbf{p}_0 + t\,\mathbf{p}_1$ and $\mathbf{f}_1(t) = (1 - t)\mathbf{p}_1 + t\,\mathbf{p}_2$. The points $\mathbf{p}_0$, $\mathbf{p}_1$, $\mathbf{p}_2$ are called the control points for the curve.

quadratic form: A function $Q$ defined for $\mathbf{x}$ in $\mathbb{R}^n$ by $Q(\mathbf{x}) = \mathbf{x}^T A\mathbf{x}$, where $A$ is an $n \times n$ symmetric matrix (called the matrix of the quadratic form).
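The quadratic Bézier curve entry above is a formula that translates directly into code. The sketch below evaluates $\mathbf{g}(t)$ by the same two-stage interpolation as in the definition. The function name `quadratic_bezier`, the helper `lerp`, and the sample control points are illustrative assumptions.

```python
def quadratic_bezier(p0, p1, p2, t):
    """Evaluate g(t) = (1 - t) f0(t) + t f1(t) for control points p0, p1, p2."""

    def lerp(a, b):
        # Pointwise (1 - t) a + t b for two points given as tuples.
        return tuple((1 - t) * ai + t * bi for ai, bi in zip(a, b))

    f0 = lerp(p0, p1)    # f0(t) = (1 - t) p0 + t p1
    f1 = lerp(p1, p2)    # f1(t) = (1 - t) p1 + t p2
    return lerp(f0, f1)  # g(t)  = (1 - t) f0(t) + t f1(t)


# Illustrative control points in the plane.
midpoint = quadratic_bezier((0, 0), (1, 2), (2, 0), 0.5)
```

At $t = 0$ and $t = 1$ the curve passes through $\mathbf{p}_0$ and $\mathbf{p}_2$; the middle control point $\mathbf{p}_1$ generally does not lie on the curve.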

R

range (of a linear transformation $T$): The set of all vectors of the form $T(\mathbf{x})$ for some $\mathbf{x}$ in the domain of $T$.

rank (of a matrix $A$): The dimension of the column space of $A$, denoted by $\operatorname{rank} A$.

Rayleigh quotient: $R(\mathbf{x}) = (\mathbf{x}^T A\mathbf{x})/(\mathbf{x}^T\mathbf{x})$. An estimate of an eigenvalue of $A$ (usually a symmetric matrix).

recurrence relation: See difference equation.

reduced echelon form (or reduced row echelon form): A reduced echelon matrix that is row equivalent to a given matrix.

reduced echelon matrix: A rectangular matrix in echelon form that has these additional properties: The leading entry in each nonzero row is 1, and each leading 1 is the only nonzero entry in its column.

reduced singular value decomposition: A factorization $A = UDV^T$, for an $m \times n$ matrix $A$ of rank $r$, where $U$ is $m \times r$ with orthonormal columns, $D$ is an $r \times r$ diagonal matrix with the $r$ nonzero singular values of $A$ on its diagonal, and $V$ is $n \times r$ with orthonormal columns.

regression coefficients: The coefficients $\beta_0$ and $\beta_1$ in the least-squares line $y = \beta_0 + \beta_1 x$.

regular solid: One of the five possible regular polyhedrons in $\mathbb{R}^3$: the tetrahedron (4 equal triangular faces), the cube (6 square faces), the octahedron (8 equal triangular faces), the dodecahedron (12 equal pentagonal faces), and the icosahedron (20 equal triangular faces).

regular stochastic matrix: A stochastic matrix $P$ such that some matrix power $P^k$ contains only strictly positive entries.

relative change or relative error (in $\mathbf{b}$): The quantity $\|\Delta\mathbf{b}\|/\|\mathbf{b}\|$ when $\mathbf{b}$ is changed to $\mathbf{b} + \Delta\mathbf{b}$.

repellor (of a dynamical system in $\mathbb{R}^2$): The origin when all trajectories except the constant zero sequence or function tend away from $\mathbf{0}$.

residual vector: The quantity $\boldsymbol{\epsilon}$ that appears in the general linear model: $\mathbf{y} = X\boldsymbol{\beta} + \boldsymbol{\epsilon}$; that is, $\boldsymbol{\epsilon} = \mathbf{y} - X\boldsymbol{\beta}$, the difference between the observed values and the predicted values (of $y$).

Re $\mathbf{x}$: The vector in $\mathbb{R}^n$ formed from the real parts of the entries of a vector $\mathbf{x}$ in $\mathbb{C}^n$.

right inverse (of $A$): Any rectangular matrix $C$ such that $AC = I$.

right-multiplication (by $A$): Multiplication of a matrix on the right by $A$.

right singular vectors (of $A$): The columns of $V$ in the singular value decomposition $A = U\Sigma V^T$.

roundoff error: Error in floating point arithmetic caused when the result of a calculation is rounded (or truncated) to the number of floating point digits stored. Also, the error that results when the decimal representation of a number such as 1/3 is approximated by a floating point number with a finite number of digits.

row–column rule: The rule for computing a product $AB$ in which the $(i, j)$-entry of $AB$ is the sum of the products of corresponding entries from row $i$ of $A$ and column $j$ of $B$.

row equivalent (matrices): Two matrices for which there exists a (finite) sequence of row operations that transforms one matrix into the other.

row reduction algorithm: A systematic method using elementary row operations that reduces a matrix to echelon form or reduced echelon form.

row replacement: An elementary row operation that replaces one row of a matrix by the sum of the row and a multiple of another row.

row space (of a matrix $A$): The set $\operatorname{Row} A$ of all linear combinations of the vectors formed from the rows of $A$; also denoted by $\operatorname{Col} A^T$.

row sum: The sum of the entries in a row of a matrix.

row vector: A matrix with only one row, or a single row of a matrix that has several rows.

row–vector rule for computing $A\mathbf{x}$: The rule for computing a product $A\mathbf{x}$ in which the $i$th entry of $A\mathbf{x}$ is the sum of the products of corresponding entries from row $i$ of $A$ and from the vector $\mathbf{x}$.
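The row reduction algorithm entry in this section can be made concrete with a short sketch. It is a Gauss–Jordan variant that clears entries above and below each pivot in a single pass, rather than a separate forward and backward phase, and it uses exact rational arithmetic to avoid roundoff error. The function name `rref` and its return convention (reduced matrix plus zero-based pivot column indices) are illustrative assumptions.

```python
from fractions import Fraction


def rref(rows):
    """Return (reduced echelon form, zero-based pivot columns) of a matrix."""
    M = [[Fraction(v) for v in row] for row in rows]
    ncols = len(M[0]) if M else 0
    pivot_cols = []
    r = 0  # index of the row that will receive the next pivot
    for c in range(ncols):
        # Look for a nonzero entry in column c at or below row r.
        p = next((i for i in range(r, len(M)) if M[i][c] != 0), None)
        if p is None:
            continue  # no pivot in this column
        M[r], M[p] = M[p], M[r]               # interchange
        pivot = M[r][c]
        M[r] = [v / pivot for v in M[r]]      # scaling: leading entry becomes 1
        for i in range(len(M)):
            if i != r and M[i][c] != 0:
                factor = M[i][c]
                # Row replacement: row_i <- row_i - factor * row_r.
                M[i] = [vi - factor * vr for vi, vr in zip(M[i], M[r])]
        pivot_cols.append(c)
        r += 1
        if r == len(M):
            break
    return M, pivot_cols


R, pivots = rref([[1, 2, 3, 4], [4, 5, 6, 7], [6, 7, 8, 9]])
```

For the sample matrix the pivot columns are the first and second, and the third row reduces to zeros.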

S

saddle point (of a dynamical system in $\mathbb{R}^2$): The origin when some trajectories are attracted to $\mathbf{0}$ and other trajectories are repelled from $\mathbf{0}$.

same direction (as a vector $\mathbf{v}$): A vector that is a positive multiple of $\mathbf{v}$.

sample mean: The average $\mathbf{M}$ of a set of vectors, $\mathbf{X}_1, \ldots, \mathbf{X}_N$, given by $\mathbf{M} = (1/N)(\mathbf{X}_1 + \cdots + \mathbf{X}_N)$.

scalar: A (real) number used to multiply either a vector or a matrix.

scalar multiple of $\mathbf{u}$ by $c$: The vector $c\mathbf{u}$ obtained by multiplying each entry in $\mathbf{u}$ by $c$.

scale (a vector): Multiply a vector (or a row or column of a matrix) by a nonzero scalar.

Schur complement: A certain matrix formed from the blocks of a $2 \times 2$ partitioned matrix $A = [A_{ij}]$. If $A_{11}$ is invertible, its Schur complement is given by $A_{22} - A_{21}A_{11}^{-1}A_{12}$. If $A_{22}$ is invertible, its Schur complement is given by $A_{11} - A_{12}A_{22}^{-1}A_{21}$.

Schur factorization (of $A$, for real scalars): A factorization $A = URU^T$ of an $n \times n$ matrix $A$ having $n$ real eigenvalues, where $U$ is an $n \times n$ orthogonal matrix and $R$ is an upper triangular matrix.

set spanned by $\{\mathbf{v}_1, \ldots, \mathbf{v}_p\}$: The set $\operatorname{Span}\{\mathbf{v}_1, \ldots, \mathbf{v}_p\}$.
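One way to see where the Schur complement in this section arises, assuming $A_{11}$ is invertible, is the block factorization below. The symbol $S$ for the Schur complement of $A_{11}$ is introduced here only for illustration; multiplying out the three factors on the right recovers the four blocks of $A$.

```latex
S = A_{22} - A_{21}A_{11}^{-1}A_{12},
\qquad
\begin{bmatrix} A_{11} & A_{12} \\ A_{21} & A_{22} \end{bmatrix}
=
\begin{bmatrix} I & 0 \\ A_{21}A_{11}^{-1} & I \end{bmatrix}
\begin{bmatrix} A_{11} & 0 \\ 0 & S \end{bmatrix}
\begin{bmatrix} I & A_{11}^{-1}A_{12} \\ 0 & I \end{bmatrix}
```

Because the two outer factors are triangular with identity blocks on the diagonal, they are invertible, so under this assumption $A$ is invertible exactly when $S$ is.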

signal (or discrete-time signal): A doubly infinite sequence of numbers, $\{y_k\}$; a function defined on the integers; belongs to the vector space $\mathbb{S}$.

similar (matrices): Matrices $A$ and $B$ such that $P^{-1}AP = B$, or equivalently, $A = PBP^{-1}$, for some invertible matrix $P$.

similarity transformation: A transformation that changes $A$ into $P^{-1}AP$.

simplex: The convex hull of an affinely independent finite set of vectors in $\mathbb{R}^n$.

singular (matrix): A square matrix that has no inverse.

singular value decomposition (of an $m \times n$ matrix $A$): $A = U\Sigma V^T$, where $U$ is an $m \times m$ orthogonal matrix, $V$ is an $n \times n$ orthogonal matrix, and $\Sigma$ is an $m \times n$ matrix with nonnegative entries on the main diagonal (arranged in decreasing order of magnitude) and zeros elsewhere. If $\operatorname{rank} A = r$, then $\Sigma$ has exactly $r$ positive entries (the nonzero singular values of $A$) on the diagonal.

singular values (of $A$): The (positive) square roots of the eigenvalues of $A^T A$, arranged in decreasing order of magnitude.

size (of a matrix): Two numbers, written in the form $m \times n$, that specify the number of rows ($m$) and columns ($n$) in the matrix.

solution (of a linear system involving variables $x_1, \ldots, x_n$): A list $(s_1, s_2, \ldots, s_n)$ of numbers that makes each equation in the system a true statement when the values $s_1, \ldots, s_n$ are substituted for $x_1, \ldots, x_n$, respectively.

solution set: The set of all possible solutions of a linear system. The solution set is empty when the linear system is inconsistent.

$\operatorname{Span}\{\mathbf{v}_1, \ldots, \mathbf{v}_p\}$: The set of all linear combinations of $\mathbf{v}_1, \ldots, \mathbf{v}_p$. Also, the subspace spanned (or generated) by $\mathbf{v}_1, \ldots, \mathbf{v}_p$.

spanning set (for a subspace $H$): Any set $\{\mathbf{v}_1, \ldots, \mathbf{v}_p\}$ in $H$ such that $H = \operatorname{Span}\{\mathbf{v}_1, \ldots, \mathbf{v}_p\}$.

spectral decomposition (of $A$): A representation
$$A = \lambda_1\mathbf{u}_1\mathbf{u}_1^T + \cdots + \lambda_n\mathbf{u}_n\mathbf{u}_n^T$$
where $\{\mathbf{u}_1, \ldots, \mathbf{u}_n\}$ is an orthonormal basis of eigenvectors of $A$, and $\lambda_1, \ldots, \lambda_n$ are the corresponding eigenvalues of $A$.

spiral point (of a dynamical system in $\mathbb{R}^2$): The origin when the trajectories spiral about $\mathbf{0}$.

stage-matrix model: A difference equation $\mathbf{x}_{k+1} = A\mathbf{x}_k$ where $\mathbf{x}_k$ lists the number of females in a population at time $k$, with the females classified by various stages of development (such as juvenile, subadult, and adult).

standard basis: The basis $\mathcal{E} = \{\mathbf{e}_1, \ldots, \mathbf{e}_n\}$ for $\mathbb{R}^n$ consisting of the columns of the $n \times n$ identity matrix, or the basis $\{1, t, \ldots, t^n\}$ for $\mathbb{P}_n$.

standard matrix (for a linear transformation $T$): The matrix $A$ such that $T(\mathbf{x}) = A\mathbf{x}$ for all $\mathbf{x}$ in the domain of $T$.

standard position: The position of the graph of an equation $\mathbf{x}^T A\mathbf{x} = c$, when $A$ is a diagonal matrix.

state vector: A probability vector. In general, a vector that describes the "state" of a physical system, often in connection with a difference equation $\mathbf{x}_{k+1} = A\mathbf{x}_k$.

steady-state vector (for a stochastic matrix $P$): A probability vector $\mathbf{q}$ such that $P\mathbf{q} = \mathbf{q}$.

stiffness matrix: The inverse of a flexibility matrix. The $j$th column of a stiffness matrix gives the loads that must be applied at specified points on an elastic beam in order to produce a unit deflection at the $j$th point on the beam.

stochastic matrix: A square matrix whose columns are probability vectors.

strictly dominant eigenvalue: An eigenvalue $\lambda_1$ of a matrix $A$ with the property that $|\lambda_1| > |\lambda_k|$ for all other eigenvalues $\lambda_k$ of $A$.

submatrix (of $A$): Any matrix obtained by deleting some rows and/or columns of $A$; also, $A$ itself.

subspace: A subset $H$ of some vector space $V$ such that $H$ has these properties: (1) the zero vector of $V$ is in $H$; (2) $H$ is closed under vector addition; and (3) $H$ is closed under multiplication by scalars.

supporting hyperplane (to a compact convex set $S$ in $\mathbb{R}^n$): A hyperplane $H = [f : d]$ such that $H \cap S \neq \varnothing$ and either $f(\mathbf{x}) \leq d$ for all $\mathbf{x}$ in $S$ or $f(\mathbf{x}) \geq d$ for all $\mathbf{x}$ in $S$.

symmetric matrix: A matrix $A$ such that $A^T = A$.
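The steady-state vector entry in this section can be approximated numerically. The sketch below repeatedly applies a stochastic matrix to a starting probability vector. It assumes the matrix is regular, so that the iterates converge, and it uses a fixed number of steps instead of a convergence test. The function name `steady_state` and the sample matrix are illustrative assumptions.

```python
def steady_state(P, q0, steps=200):
    """Approximate a probability vector q with P q = q by iterating q <- P q.

    P is a column-stochastic matrix given as a list of rows;
    q0 is any starting probability vector.
    """
    q = list(q0)
    for _ in range(steps):
        # One step of the Markov chain: q_{k+1} = P q_k.
        q = [sum(p * qj for p, qj in zip(row, q)) for row in P]
    return q


# Illustrative 2 x 2 stochastic matrix (each column sums to 1).
q = steady_state([[0.9, 0.3], [0.1, 0.7]], [1.0, 0.0])
```

The exact steady-state vector can also be found by solving $(P - I)\mathbf{q} = \mathbf{0}$ and rescaling the solution so that its entries sum to one.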

system of linear equations (or a linear system): A collection of one or more linear equations involving the same set of variables, say, $x_1, \ldots, x_n$.

T

tetrahedron: A three-dimensional solid object bounded by four equal triangular faces, with three faces meeting at each vertex.

total variance: The trace of the covariance matrix $S$ of a matrix of observations.

trace (of a square matrix $A$): The sum of the diagonal entries in $A$, denoted by $\operatorname{tr} A$.

trajectory: The graph of a solution $\{\mathbf{x}_0, \mathbf{x}_1, \mathbf{x}_2, \ldots\}$ of a dynamical system $\mathbf{x}_{k+1} = A\mathbf{x}_k$, often connected by a thin curve to make the trajectory easier to see. Also, the graph of $\mathbf{x}(t)$ for $t \geq 0$, when $\mathbf{x}(t)$ is a solution of a differential equation $\mathbf{x}'(t) = A\mathbf{x}(t)$.

transfer matrix: A matrix $A$ associated with an electrical circuit having input and output terminals, such that the output vector is $A$ times the input vector.

transformation (or function, or mapping) $T$ from $\mathbb{R}^n$ to $\mathbb{R}^m$: A rule that assigns to each vector $\mathbf{x}$ in $\mathbb{R}^n$ a unique vector $T(\mathbf{x})$ in $\mathbb{R}^m$. Notation: $T : \mathbb{R}^n \to \mathbb{R}^m$. Also, $T : V \to W$ denotes a rule that assigns to each $\mathbf{x}$ in $V$ a unique vector $T(\mathbf{x})$ in $W$.

translation (by a vector $\mathbf{p}$): The operation of adding $\mathbf{p}$ to a vector or to each vector in a given set.

transpose (of $A$): An $n \times m$ matrix $A^T$ whose columns are the corresponding rows of the $m \times n$ matrix $A$.

trend analysis: The use of orthogonal polynomials to fit data, with the inner product given by evaluation at a finite set of points.

triangle inequality: $\|\mathbf{u} + \mathbf{v}\| \leq \|\mathbf{u}\| + \|\mathbf{v}\|$ for all $\mathbf{u}$, $\mathbf{v}$.

triangular matrix: A matrix $A$ with either zeros above or zeros below the diagonal entries.

trigonometric polynomial: A linear combination of the constant function 1 and sine and cosine functions such as $\cos nt$ and $\sin nt$.

trivial solution: The solution $\mathbf{x} = \mathbf{0}$ of a homogeneous equation $A\mathbf{x} = \mathbf{0}$.

U

uncorrelated variables: Any two variables $x_i$ and $x_j$ (with $i \neq j$) that range over the $i$th and $j$th coordinates of the observation vectors in an observation matrix, such that the covariance $s_{ij}$ is zero.

underdetermined system: A system of equations with fewer equations than unknowns.

uniqueness question: Asks, "If a solution of a system exists, is it unique, that is, is it the only one?"

unit consumption vector: A column vector in the Leontief input–output model that lists the inputs a sector needs for each unit of its output; a column of the consumption matrix.

unit lower triangular matrix: A square lower triangular matrix with ones on the main diagonal.

unit vector: A vector $\mathbf{v}$ such that $\|\mathbf{v}\| = 1$.

upper triangular matrix: A matrix $U$ (not necessarily square) with zeros below the diagonal entries $u_{11}, u_{22}, \ldots$.

V

Vandermonde matrix: An $n \times n$ matrix $V$ or its transpose, when $V$ has the form
$$V = \begin{bmatrix} 1 & x_1 & x_1^2 & \cdots & x_1^{n-1} \\ 1 & x_2 & x_2^2 & \cdots & x_2^{n-1} \\ \vdots & \vdots & \vdots & & \vdots \\ 1 & x_n & x_n^2 & \cdots & x_n^{n-1} \end{bmatrix}$$

variance (of a variable $x_j$): The diagonal entry $s_{jj}$ in the covariance matrix $S$ for a matrix of observations, where $x_j$ varies over the $j$th coordinates of the observation vectors.

vector: A list of numbers; a matrix with only one column. In general, any element of a vector space.

vector addition: Adding vectors by adding corresponding entries.

vector equation: An equation involving a linear combination of vectors with undetermined weights.

vector space: A set of objects, called vectors, on which two operations are defined, called addition and multiplication by scalars. Ten axioms must be satisfied. See the first definition in Section 4.1.

vector subtraction: Computing $\mathbf{u} + (-1)\mathbf{v}$ and writing the result as $\mathbf{u} - \mathbf{v}$.
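The Vandermonde matrix entry in this section describes a fixed pattern of powers, which a short sketch can build directly. The function name `vandermonde` is an illustrative assumption; the matrix is returned as a list of rows, one row per value $x_i$.

```python
def vandermonde(xs):
    """Build the n x n Vandermonde matrix whose i-th row is
    (1, x_i, x_i^2, ..., x_i^(n-1)) for the n values in xs."""
    n = len(xs)
    return [[x ** j for j in range(n)] for x in xs]


# Illustrative values x_1, x_2, x_3 = 1, 2, 3.
V = vandermonde([1, 2, 3])
```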

W

weighted least squares: Least-squares problems with a weighted inner product such as
$$\langle \mathbf{x}, \mathbf{y} \rangle = w_1^2 x_1 y_1 + \cdots + w_n^2 x_n y_n.$$

weights: The scalars used in a linear combination.

Z

zero subspace: The subspace $\{\mathbf{0}\}$ consisting of only the zero vector.

zero vector: The unique vector, denoted by $\mathbf{0}$, such that $\mathbf{u} + \mathbf{0} = \mathbf{u}$ for all $\mathbf{u}$. In $\mathbb{R}^n$, $\mathbf{0}$ is the vector whose entries are all zeros.

Answers to Odd-Numbered Exercises

Chapter 1

Section 1.1, page 26

1. The solution is $(x_1, x_2) = (-8, 3)$, or simply $(-8, 3)$.

3. $(4/7, 9/7)$

5. Replace row 2 by its sum with 3 times row 3, and then replace row 1 by its sum with 5 times row 3.

7. The solution set is empty.

9. $(4, 8, 5, 2)$

11. Inconsistent

13. $(5, 3, 1)$

15. Consistent

17. The three lines have one point in common.

19. $h \neq 2$

21. All $h$

23. Mark a statement True only if the statement is always true. Giving you the answers here would defeat the purpose of the true–false questions, which is to help you learn to read the text carefully. The Study Guide will tell you where to look for the answers, but you should not consult it until you have made an honest attempt to find the answers yourself.

25. $k + 2g + h = 0$

27. The row reduction of $\begin{bmatrix} 1 & 3 & f \\ c & d & g \end{bmatrix}$ to $\begin{bmatrix} 1 & 3 & f \\ 0 & d - 3c & g - cf \end{bmatrix}$ shows that $d - 3c$ must be nonzero, since $f$ and $g$ are arbitrary. Otherwise, for some choices of $f$ and $g$ the second row could correspond to an equation of the form $0 = b$, where $b$ is nonzero. Thus $d \neq 3c$.

29. Swap row 1 and row 2; swap row 1 and row 2.

31. Replace row 3 by row 3 + (−4) row 1; replace row 3 by row 3 + (4) row 1.

33. $$\begin{aligned} 4T_1 - T_2 - T_4 &= 30 \\ -T_1 + 4T_2 - T_3 &= 60 \\ -T_2 + 4T_3 - T_4 &= 70 \\ -T_1 - T_3 + 4T_4 &= 40 \end{aligned}$$

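Section 1.2, page 37

1. Reduced echelon form: a and b. Echelon form: d. Not echelon: c.

3. $\begin{bmatrix} 1 & 0 & -1 & -2 \\ 0 & 1 & 2 & 3 \\ 0 & 0 & 0 & 0 \end{bmatrix}$. Pivot cols 1 and 2: $\begin{bmatrix} 1 & 2 & 3 & 4 \\ 4 & 5 & 6 & 7 \\ 6 & 7 & 8 & 9 \end{bmatrix}$.

5. $\begin{bmatrix} \blacksquare & * \\ 0 & \blacksquare \end{bmatrix}$, $\begin{bmatrix} \blacksquare & * \\ 0 & 0 \end{bmatrix}$, $\begin{bmatrix} 0 & \blacksquare \\ 0 & 0 \end{bmatrix}$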