Mathematical Approaches to Molecular Structural Biology

Table of contents:
Front Cover
Mathematical Approaches to Molecular Structural Biology
Copyright Page
Dedication
Contents
About the author
Preface
Acknowledgments
Table of symbols
1 Mathematical preliminaries
1.1 Functions
1.1.1 Algebraic functions
1.1.2 Trigonometric functions
1.1.3 Exponential and logarithmic functions
1.1.4 Complex number and functions
1.2 Vectors
1.2.1 Concept of vector in physics
1.2.2 Vector as an ordered set of numbers
1.2.3 Mathematical viewpoint of vector
1.3 Matrices and determinants
1.3.1 Systems of linear equations
Gaussian elimination
1.3.2 Matrices
1.3.3 Determinants
Definiteness of a symmetric matrix
1.4 Calculus
1.4.1 Differentiation
Simple algebraic functions
1.4.2 Integration
Integration involving exponential functions
Integration involving logarithmic functions
Integration by substitution
Integration by parts
1.4.3 Multivariate function
1.5 Series and limits
1.5.1 Taylor series
1.5.2 Fourier series
Exercise 1
Further reading
2 Vector spaces and matrices
2.1 Linear systems
Exercises 2.1
2.2 Sets and subsets
2.2.1 Set
Some relevant notations
2.2.2 Subset
Exercise 2.2
2.3 Vector spaces and subspaces
2.3.1 Vector space
Vector space of m×n matrices
2.3.2 Vector subspaces
2.3.3 Null space/row space/column space
Exercise 2.3
2.4 Linear combination/linear independence
Generalized concept
Exercise 2.4
2.5 Basis vectors
The standard basis for m×n matrices
Exercise 2.5
2.6 Dimension and rank
Exercise 2.6
2.7 Inner product space
Norm
Distance
Dot product
Exercise 2.7
2.8 Orthogonality
Orthogonal and orthonormal set
Coordinates relative to orthogonal basis
Orthogonal projection
Exercise 2.8
2.9 Mapping and transformation
Basic matrix transformations
Exercise 2.9
2.10 Change of basis
Exercise 2.10
Further reading
3 Matrix decomposition
3.1 Eigensystems from different perspectives
3.1.1 A stable distribution vector
3.1.2 System of linear differential equations
Exercise 3.1
3.2 Eigensystem basics
Nonuniqueness of eigenvectors
Computing eigenvectors
Eigenvalues of some special matrices
Linear independence of eigenvectors
Eigendecomposition
Geometric intuition for eigendecomposition
Diagonalization
Invertibility of matrix P
Diagonalizability of a matrix
Orthogonal diagonalization
Projection matrix and spectral decomposition
Exercise 3.2
3.3 Singular value decomposition
Eigendecomposition and singular value decomposition compared
Exercises 3.3
Further reading
4 Vector calculus
4.1 Derivatives of univariate functions
4.2 Derivatives of multivariate functions
Partial derivatives
Critical points and local extrema
4.3 Gradients of scalar- and vector-valued functions
Vector-valued function expressed as a matrix transformation
4.4 Gradients of matrices
4.5 Higher-order derivatives – Hessian
Optimization
4.6 Linearization and multivariate Taylor series
Exercise 4
Further Reading
5 Integral transform
5.1 Fourier transform
5.2 Dirac delta function
Derivative of the δ-function
The δ-function in 3D
Fourier series and the δ-function
Fourier transform and the δ-function
Dirac comb
5.3 Convolution and deconvolution
5.4 Discrete Fourier transform
5.5 Laplace transform
Exercise 5
Further reading
6 Probability and statistics
6.1 Probability—definitions and properties
6.1.1 Probability function
A complement
Uniform probability measure
6.1.2 Conditional probability
Independence of events
Bayes’ theorem
6.2 Random variables and distribution
6.2.1 Discrete random variable
The Bernoulli and binomial distributions
The Poisson distribution
6.2.2 Continuous random variable
Cumulative distribution function
The uniform distribution
The exponential distribution
The normal distribution
6.2.3 Transformation of random variables
Linear transformations of random variables
6.2.4 Expectation and variance
Expectation of a discrete random variable
Expectation of a continuous random variable
6.3 Multivariate distribution
6.3.1 Bivariate distribution
Marginal distribution
6.3.2 Generalized multivariate distribution
6.4 Covariance and correlation
Covariance matrix
Multivariate normal distribution
6.5 Principal component analysis
Principal component analysis and singular value decomposition
Exercise 6
Further reading
7 X-ray crystallography
7.1 X-ray scattering
7.1.1 Electromagnetic waves
7.1.2 Thomson scattering
7.1.3 Compton scattering
7.2 Scattering by an atom
7.3 Diffraction from a crystal – Laue equations
7.3.1 Lattice and reciprocal lattice
7.3.2 Structure factor
7.3.3 Bragg’s law
7.4 Diffraction and Fourier transform
7.5 Convolution and diffraction
7.6 The electron density equation
7.6.1 Phase problem and the Patterson function
7.6.2 Isomorphous replacement
7.6.3 Electron density sharpening
Exercise 7
Further reading
8 Cryo-electron microscopy
8.1 Quantum physics
8.1.1 Wave–particle duality
8.1.2 Schrödinger equation
8.1.3 Hamiltonian
8.2 Wave optics of electrons—scattering
8.3 Theory of image formation
8.3.1 Electrodynamics of lens system
8.3.2 Image formation
8.4 Image processing by multivariate statistical analysis—principal component analysis
8.4.1 Hyperspace and data cloud
8.4.2 Distance metrics
Euclidean metric
Chi-square metric (χ2–metric)
Modulation metric
8.4.3 Data compression
8.5 Clustering
8.5.1 Hierarchical clustering
8.5.2 K-means
8.6 Maximum likelihood
Exercise 8
Reference
Further reading
9 Biomolecular structure and dynamics
9.1 Comparison of biomolecular structures
9.1.1 Definition of the problem
9.1.2 Quaternions
9.1.3 Quaternion rotation operator
9.1.4 Minimization of residual
9.2 Conformational optimization
9.2.1 Born–Oppenheimer approximation
9.2.2 Biomolecular geometry optimization
Newton–Raphson method
Conjugate gradient method
9.3 Molecular dynamics
9.3.1 Basic theory
9.3.2 Computation of molecular dynamics trajectory
9.4 Normal mode analysis
9.4.1 Oscillatory systems
9.4.2 Normal mode analysis theory
9.4.3 Elastic network models
Gaussian network model
Anisotropic network model
Exercise 9
References
Further reading
Index
Back Cover
Mathematical Approaches to Molecular Structural Biology

Subrata Pal

Academic Press is an imprint of Elsevier
125 London Wall, London EC2Y 5AS, United Kingdom
525 B Street, Suite 1650, San Diego, CA 92101, United States
50 Hampshire Street, 5th Floor, Cambridge, MA 02139, United States
The Boulevard, Langford Lane, Kidlington, Oxford OX5 1GB, United Kingdom

Copyright © 2023 Elsevier Inc. All rights reserved.

No part of this publication may be reproduced or transmitted in any form or by any means, electronic or mechanical, including photocopying, recording, or any information storage and retrieval system, without permission in writing from the publisher. Details on how to seek permission, further information about the Publisher's permissions policies and our arrangements with organizations such as the Copyright Clearance Center and the Copyright Licensing Agency, can be found at our website: www.elsevier.com/permissions.

This book and the individual contributions contained in it are protected under copyright by the Publisher (other than as may be noted herein).

Notices

Knowledge and best practice in this field are constantly changing. As new research and experience broaden our understanding, changes in research methods, professional practices, or medical treatment may become necessary. Practitioners and researchers must always rely on their own experience and knowledge in evaluating and using any information, methods, compounds, or experiments described herein. In using such information or methods they should be mindful of their own safety and the safety of others, including parties for whom they have a professional responsibility.

To the fullest extent of the law, neither the Publisher nor the authors, contributors, or editors, assume any liability for any injury and/or damage to persons or property as a matter of products liability, negligence or otherwise, or from any use or operation of any methods, products, instructions, or ideas contained in the material herein.

ISBN: 978-0-323-90397-4

For information on all Academic Press publications visit our website at https://www.elsevier.com/books-and-journals

Publisher: Andre G. Wolff
Acquisitions Editor: Michelle Fisher
Editorial Project Manager: Tracy I. Tufaga
Production Project Manager: Swapna Srinivasan
Cover Designer: Vicky Pearson
Cover Image Designer: Sharmishtha Pal
Typeset by MPS Limited, Chennai, India

Dedication

To my parents



About the author

Subrata Pal

Bachelor of Science in Physics, Calcutta University, Kolkata, India, 1972.
Master of Science in Physics, Calcutta University, Kolkata, India, 1974.
Ph.D. in Molecular Biology, Calcutta University, Kolkata, India, 1982.
Assistant Professor, Physics, in a Calcutta University-affiliated college, 1979-82.
Visiting Fellow, National Institutes of Health, National Cancer Institute, Bethesda, MD, United States, 1984-87.
Fellow/Claudia Adams Barr Investigator, Harvard Medical School, Dana-Farber Cancer Institute, Boston, MA, United States, 1988-92.
Associate Professor, Molecular Biology, Jadavpur University, Kolkata, India, 1993-2001.
Professor, Molecular Biology, Genomics and Proteomics, Jadavpur University, Kolkata, India, 2001-15.
Claudia Adams Barr special investigator award for basic contribution to cancer research at the Dana-Farber Cancer Institute (Harvard Medical School), Boston, MA, United States, 1991.


Preface

With the discovery of the double-helical structure of DNA in 1953 and, a few years later, the structures of the proteins myoglobin and hemoglobin, molecular biology rose to the status of molecular structural biology (MSB). Since then, scientific literature and databases have been flooded with biomolecular structural information, for which three different physics-based techniques, namely X-ray crystallography (XRC), nuclear magnetic resonance spectroscopy, and cryo-electron microscopy (cryoEM), have been essentially responsible. It is now possible to mechanistically explain and predict functions of biomolecules based on their structures. As technologies improved and molecules became discernible at the atomic scale, it was further revealed that biological macromolecules are intrinsically flexible and naturally exist in multiple conformations. This conformational flexibility is important for their function. Biomolecular functional dynamics has become amenable to investigation by the application of physical principles aided by mathematical and statistical tools. Furthermore, rapid developments in computer hardware and software have remarkably facilitated mathematical analysis of biomolecular structure and dynamics. Needless to say, mathematical approaches are not a "total substitute" for experimental investigations in MSB but, beyond any doubt, an unchallenged complement.

Concomitantly, molecular biology courses are also going through a desirable "transformation"—they are no longer restricted to mere description of biological phenomena but are adopting a more mechanistic format. Further, some of the interdisciplinary courses at the graduate level, such as structural biology, macromolecular machines, protein folding, molecular recognition and interaction, etc., being built on the structural dynamics of biomolecules, are becoming intensely dependent on mathematics and statistics. It is true that most of these courses have mathematics as a prerequisite. However, it is also not unexpected that in such "exclusive" mathematics courses, students are exposed to a variety of topics which are important for different areas of science and technology but may not all be required for MSB. Consequently, it is likely that the students may not be able to do equal justice to all the topics and, in the process, those who later go on to MSB-related courses discount the importance of the topics which subsequently turn out to be indispensable.

This book has attempted to sort out and succinctly offer those topics in mathematics and statistics which build the foundation of a discourse in MSB and orient the reader in the appropriate direction. Further, in order to reinforce the "orientation," the last three chapters of the book illustrate how this mathematical background can be applied to a few of the most important MSB problems.

It is well known that both XRC and cryoEM are capable of solving biomacromolecular structures at atomic resolution. Invariably, they are both dependent on rigorous mathematical and computational analysis. Chapter 7 discusses the basic physics of X-ray diffraction and goes on to describe how diffraction data from macromolecular crystals are channelized into structural models based on the mathematical theory of Fourier transform and convolution. CryoEM is based on the scattering of electron waves by the object under investigation. In Chapter 8, the quantum physics of electron scattering is briefly reviewed. Further, the chapter discusses how the issues of heterogeneity and noise in cryoEM have been addressed by statistical approaches—principal component analysis and multivariate statistical analysis. Chapter 9 highlights the essential mathematics underlying some selected and widely used computational techniques to investigate biomolecular structure and dynamics. These include the quaternion approach to biomolecular structure comparison, molecular dynamics simulations to predict the movement of each atom in a molecular system, and normal mode analysis, which treats the constituent atoms of a biomolecule as a set of simple harmonic oscillators.

Undoubtedly, each of the above-stated topics in MSB deserves an entire book for the coverage of all its aspects. Nonetheless, the intention of the presentation has been to restrict it to the fundamentals, and care has been taken to avoid overloading. The book should be useful for students at different levels (advanced undergraduate, graduate, or research) in "regular" molecular (structural) biology or related courses, who would not like to restrict themselves to the narrative aspects of biomolecular phenomena but take a keen interest in the mechanistic and mathematical aspects of MSB.

Subrata Pal

Acknowledgments

The book is dedicated to the memory of my father, who introduced me to mathematics in my early childhood, and my mother, in whose tenderness I grew up. I respectfully express my indebtedness to the late Professor Binayak Dutta-Roy, from whom I learnt quantum mechanics. Plenty of thanks to Michelle Fisher, Megan Ashdown, Tracy Tufaga, and the entire Elsevier team for their unhesitating help and cooperation in addressing my queries and difficulties right from the inception of the project to the publication of the book. My wife has been very supportive of the effort in all possible ways. And, last but not least, I am extremely proud of my daughter Sharmishtha, who has designed the cover page of the book.

Table of symbols

x, y, z, a, b, c, α, β, γ, λ    scalars
x, y, z, u, v, r                vectors
A, B, C                         matrices
x^T                             transpose of a vector
A^T                             transpose of a matrix
A^(-1)                          inverse of a matrix
ℤ                               integers
ℕ                               natural numbers
ℝ                               real numbers
ℂ                               complex numbers
ℝ^n                             n-dimensional vector space of real numbers
ℝ^(m×n)                         m × n ordered array of real numbers
a := b                          a defined as b
a =: b                          b defined as a
≫                               much greater than
>                               greater than
≥                               greater than or equal to
≈                               approximately equal to
≤                               less than or equal to
<                               less than
≪                               much less than
⇒                               implies
⇔                               if and only if

CHAPTER 1  Mathematical preliminaries

Structural dynamics–function correlation of biomacromolecules is perhaps the most important theme of molecular structural biology. The last three chapters of the book attempt to illustrate the mathematical approaches used to investigate this structure–function paradigm, which currently dominates the field of theoretical and computational molecular biology. This chapter reviews the fundamental mathematical concepts that are needed for that purpose. To begin with, functions—algebraic, trigonometric, exponential and logarithmic, and complex—which are used to describe physical systems (including biological systems), are recollected. This is followed by the introduction of vectors and matrices, which are of paramount importance in the study of cryo-electron microscopy of biomolecules in particular, and of biomolecular structure and dynamics in general. The nature of changes in a function, with respect to the variable(s) on which it depends, is discussed in the section on calculus.

1.1 Functions

In biological literature, we have seen that amino acids are often symbolized by single letters. For example, the letter G denotes the amino acid glycine, K denotes lysine, and so on. Clearly, we have two sets, one consisting of the letters and the other of the amino acids, and a defined "correspondence" so that, corresponding to a letter in the first set, a unique amino acid can be identified in the second set. This special kind of correspondence between two sets is called a function—the first set is called the domain while the second set is the range of the function.

Definition 1.1.1 A function is a correspondence between one set, called the domain, and a second set, called the range, such that each member of the domain corresponds to exactly one member of the range.

1.1.1 Algebraic functions

Physical systems (including biological systems) and their dynamics are quantitatively described in terms of "observables" or entities which can be measured. In a specific physical problem, we may have an entity, for example, the distance of an object (represented geometrically by a point P) with respect to a reference point O, and the value of the entity is denoted by a variable x. Now, it may happen that there is another entity (say, a force on the object), denoted by a variable y, which is related to x. The relation between the two variables may be expressed as

y = f(x)   (1.1)

Formally, it is said that if there is a unique value of y for each value of x, then y is a function of x. The set of all permitted values of x is called the domain and that of all permitted values of y, the range. Being a function of a single variable, f(x) is also called a univariate function, which can be used to describe and analyze a one-dimensional system. Graphically, it is represented by a line. Examples of two univariate functions are shown in Fig. 1.1.

FIGURE 1.1 Graphical representation of univariate algebraic functions.


On the other hand, if there is a unique value of x for each value of y, we can write

x = g(y)   (1.2)

which is defined as the inverse function; g(y) is also a univariate function.

However, we may have a function f(x, y) which depends on two variables x and y. In this case, it is required that for any pair of values (x, y), f(x, y) has a well-defined value. The notion can be extended to a function f(x1, x2, ..., xn) that depends on n variables x1, x2, ..., xn. Functions of two variables can be represented by a surface in three-dimensional space; functions with a higher number of variables are usually difficult to visualize. Functions involving more than one variable are called multivariate.

In certain physical problems, a function can be expressed as a polynomial in x:

f(x) = a0 + a1x + a2x^2 + a3x^3 + ... + an−1x^(n−1) + anx^n   (1.3)

When f(x) is set equal to zero, the polynomial equation

a0 + a1x + a2x^2 + a3x^3 + ... + an−1x^(n−1) + anx^n = 0   (1.4)

is satisfied by specific values of x known as the roots of the equation. The integer n > 0 is known as the degree of the polynomial and of the equation. The coefficients a0, a1, ..., an (an ≠ 0) are real quantities determined by the physical properties of the system under study.

In (1.4), if n = 1, the equation takes the form of a linear equation

a0 + a1x = 0   (1.5)

whose solution (root) is given by α1 = −a0/a1.

For n = 2, (1.4) becomes a quadratic equation

a0 + a1x + a2x^2 = 0   (1.6)

the roots of which are

α1,2 = (−a1 ± √(a1^2 − 4a0a2)) / (2a2)   (1.7)

n = 3 gives a cubic equation.
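As an illustrative aside (Python code is not part of the original text), the root formula (1.7) can be evaluated numerically; cmath is used so that the square root is defined even when a1^2 < 4a0a2:

```python
import cmath

def quadratic_roots(a0, a1, a2):
    """Roots of a0 + a1*x + a2*x**2 = 0, per Eq. (1.7)."""
    disc = cmath.sqrt(a1**2 - 4 * a0 * a2)  # complex sqrt handles negative arguments
    return (-a1 + disc) / (2 * a2), (-a1 - disc) / (2 * a2)

print(quadratic_roots(2, -3, 1))  # x^2 - 3x + 2 = 0: roots 2 and 1
print(quadratic_roots(2, -2, 1))  # x^2 - 2x + 2 = 0: roots 1 ± i (see Section 1.1.4)
```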

1.1.2 Trigonometric functions

Physical systems that involve the periodic motion of a point can be represented by trigonometric functions, which are also periodic in nature. Here, the point in question is considered as the projection of another point moving along a circle. To illustrate, let us consider a circle in the xy-plane (Fig. 1.2). The radius of the circle is r and its center is at the origin of the xy-coordinate system. P′ is a point on the circle, the coordinates of which are given by (x, y); OP′ makes an angle φ with the x-axis. P is the projection of P′ on the x-axis.


FIGURE 1.2 Position of point P′ indicated by Cartesian coordinates (x, y) and polar coordinates (r, φ).

As P′ moves counterclockwise along the circle starting from the x-axis, φ increases from 0 to 2π radians or 360 degrees. With P′(x, y) associated with angle φ, we have the following definitions:

cos φ = OP/OP′ = x/r
sin φ = PP′/OP′ = y/r
tan φ = PP′/OP = y/x   (1.8)

and the reciprocal relations

sec φ = 1/cos φ = r/x
cosec φ = 1/sin φ = r/y
cot φ = 1/tan φ = x/y   (1.9)

If the circle is of unit radius, that is, r = 1,

x = cos φ,  y = sin φ,  y/x = tan φ
1/x = sec φ,  1/y = csc φ,  x/y = cot φ

All these relations are collectively called trigonometric functions.

Referring to Fig. 1.2, one can see that the point P′ can also be represented by polar coordinates r (the radial coordinate) and φ (the angular coordinate). The relations between the polar coordinates and Cartesian coordinates are given by

x = r cos φ,  y = r sin φ   (1.10)


Further, from (1.8), we have cos 0 = cos 2π = 1 and cos π = −1. Therefore it can easily be visualized that as P′ moves along the circle from 0 to 2π, its projection P moves along the x-axis from +1 to −1 and back to +1. The cycle repeats between 2π and 4π, and so on. It can be seen that

f(φ + 2nπ) = f(φ)   (1.11)

where n = 1, 2, 3, .... f(φ) = cos φ is, therefore, a periodic function, 2π being the period. Similarly, sin φ is also a periodic function. Fig. 1.2 also shows that

sin^2 φ + cos^2 φ = 1   (1.12)
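The polar-to-Cartesian relations in (1.10) lend themselves to a short numerical check (an editorial Python sketch, not from the book):

```python
import math

def polar_to_cartesian(r, phi):
    """Eq. (1.10): x = r cos(phi), y = r sin(phi)."""
    return r * math.cos(phi), r * math.sin(phi)

def cartesian_to_polar(x, y):
    """Inverse map; atan2 picks the quadrant of phi correctly."""
    return math.hypot(x, y), math.atan2(y, x)

x, y = polar_to_cartesian(2.0, math.pi / 3)
print(cartesian_to_polar(x, y))  # recovers (2.0, pi/3)
```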

The basic trigonometric functions, as given previously, are periodic. Hence, their inverses are not single-valued. However, by restricting the domain to an appropriate interval, each of the inverses can be defined as a (single-valued) function:

sin^(−1) x = y   if and only if   x = sin y and −π/2 ≤ y ≤ π/2
cos^(−1) x = y   if and only if   x = cos y and 0 ≤ y ≤ π
tan^(−1) x = y   if and only if   x = tan y and −π/2 < y < π/2
cot^(−1) x = y   if and only if   x = cot y and 0 < y < π
sec^(−1) x = y   if and only if   x = sec y and 0 ≤ y ≤ π, y ≠ π/2
csc^(−1) x = y   if and only if   x = csc y and −π/2 ≤ y ≤ π/2, y ≠ 0
(1.13)

1.1.3 Exponential and logarithmic functions

We have seen that a polynomial contains terms like an x^n, where x is the variable and n is a fixed number (an integer). However, there can be occasions when a function will contain terms like n^x, where n is a fixed number (n > 0 and n ≠ 1) and the variable x is in the exponent. For example, in a steadily growing and dividing bacterial culture, the number of cells after x generations is given by N = N0 2^x, where N0 is the initial number of cells. Such functions are called exponential functions.

Definition 1.1.2 An exponential function f is given by

f(x) = a^x   (1.14)

where x is a real number and the base a > 0, a ≠ 1.

Example 1.1.1: Commonly used exponentials are 10^x and 2^x, where the bases are, respectively, 10 and 2. However, the most commonly used exponential is e^x, where the number e (≈ 2.718281828459), which can be defined as the limiting sum of an infinite series (see Section 1.5), is called the natural base.


All exponential functions follow the same rules of manipulation:

n^(x+y) = n^x · n^y   (1.15a)

and

n^(xy) = (n^x)^y = (n^y)^x   (1.15b)

The exponential function f(x) = a^x is "one-to-one" with domain (−∞, ∞) and range (0, ∞). Hence, it does have an inverse function, defined as the logarithmic function with base a.

Definition 1.1.3 If x = a^y, then y = log_a x, a > 0, a ≠ 1. The number a is called the logarithmic base.

The logarithmic function that uses e as its base is called the natural logarithm and is denoted by log_e x or ln x. So, we have

ln(e^x) = x,  x ∈ ℝ;   e^(ln x) = x,  x > 0   (1.16a)

and

log10(10^x) = x,  x ∈ ℝ;   10^(log10 x) = x,  x > 0   (1.16b)

The rule for conversion between log10 and ln is

ln x = (ln 10) log10 x = (2.3026...) log10 x   (1.17)
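A one-line numerical check of the conversion rule (1.17) (an editorial Python sketch, not part of the book):

```python
import math

x = 42.0
lhs = math.log(x)                    # ln x
rhs = math.log(10) * math.log10(x)   # (ln 10) * log10 x, Eq. (1.17)
print(lhs, rhs, math.isclose(lhs, rhs))  # 3.7376..., 3.7376..., True
```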

1.1.4 Complex number and functions

Mathematical analysis of a periodic system has been greatly facilitated by the introduction of what are known as complex numbers and complex functions. To understand what a complex number is, let us consider a function

f(z) = z^2 − 3z + 2   (1.18)

For f(z) = 0, (1.18) becomes

z^2 − 3z + 2 = 0   (1.19)

which is a quadratic equation whose roots (solutions) are given, in accordance with (1.7), as

z1 = 1 and z2 = 2   (1.20)

However, for another function

g(z) = z^2 − 2z + 2 = 0   (1.21)

the solutions will be given as

z1, z2 = 1 ± √(−1)   (1.22)

The problem can be understood by plotting the functions f(z) and g(z) against z (Fig. 1.3). It is seen that f(z) intersects the z-axis (a real line) at 1 and 2, and the roots are, therefore, real. However, g(z) does not intersect the z-axis.

FIGURE 1.3 Graphical solution of equations.

We may note here that the real line or continuum can be considered to be composed of an infinite number of points, each of which is represented by a real number. The set ℝ of real numbers includes both rational and irrational numbers. It is obvious that while the first term in (1.22) is real, the second term is not. The problem has been solved by introducing the concept of an imaginary number (or the basic complex number). An imaginary number has been defined as

i = √(−1)   (1.23)

so that a generalized complex number can be written as

z = x + iy   (1.24)

where x and y are real numbers. x is the real part, denoted by Re(z) = x, and y is the imaginary part, denoted by Im(z) = y. Just as a real number is visualized as a point on an infinite straight line, a complex number can be considered a point on an infinite (complex) plane (Fig. 1.4).

FIGURE 1.4 Representation of a complex number z.

z* = x − iy is called the complex conjugate of z = x + iy. The magnitude or modulus |z| of a complex number is given by

|z|^2 = z z* = x^2 + y^2   (1.25)

and the argument of the complex number is denoted by

arg z = tan^(−1)(y/x)   (1.26)

With reference to Fig. 1.4, (1.24) can be written as

z = r(cos φ + i sin φ)   (1.27)

Later in this chapter we shall see that

e^(iφ) = cos φ + i sin φ

so that a complex number can be represented in the polar form

z = r e^(iφ)   (1.28)

where r and φ can be identified with |z| and arg z, respectively. The polar representation of a complex number is easier to manipulate.
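Python's standard cmath module mirrors these definitions directly; the following sketch (editorial, not from the book) round-trips a complex number between the forms (1.24) and (1.28):

```python
import cmath

z = 1 + 1j                       # one of the roots 1 ± i found in (1.22)
r, phi = cmath.polar(z)          # r = |z|, phi = arg z, as in Eq. (1.28)
print(r, phi)                    # 1.4142..., 0.7853... (= pi/4)
print(cmath.rect(r, phi))        # back to the x + iy form of Eq. (1.24)
print((z * z.conjugate()).real)  # |z|^2 = z z*, Eq. (1.25): 2.0
```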

1.2 Vectors

Mathematical analysis of biomolecular structure requires that each atom is described by a position vector written as ri = (xi, yi, zi)^T, while the conformation of the molecule, containing N atoms, is represented by a 3N-dimensional vector r = (x1, y1, z1, ..., xN, yN, zN)^T. An electron microscopic image (or measurement) of a biomacromolecule is considered a point (or vector) in a multidimensional space—the coordinates of the point are defined by the intensities of the pixels (or voxels).

1.2.1 Concept of vector in physics

Vectors can be looked at from different perspectives. In physics, we know that a certain entity can be completely specified by a single number, called its magnitude, together with the unit in which it is measured—such entities are called scalars. Examples are mass, temperature, energy, and so on. However, several entities, called vectors, require both magnitude and direction to be specified completely. A vector is represented graphically using an arrow—the length of the arrow, drawn to scale, specifies the magnitude while the direction of the arrowhead specifies the direction of the vector quantity. Mathematicians call this a "geometric vector"—the tail and the tip of the arrow being the "initial point" and "terminal point," respectively.

If a vector v in a plane or two-dimensional space (2-space) is positioned with its initial point at the origin of a rectangular coordinate system, it is completely (both in magnitude and direction) determined by the coordinates (v1, v2) of the terminal point. These coordinates are called the components of v with respect to the coordinate system. The arrow can be moved around, but as long as its length and the direction it is pointing in remain the same, it is the same vector. The arrow from an initial point (a, b) to a terminal point (a + v1, b + v2) is equivalent to that from the origin to (v1, v2).

A vector can be denoted by a; its magnitude will be denoted by |a|. A unit vector, that is, a vector whose magnitude is unity, in the direction of a will be

â = a/|a|   (1.29)

Multiplication of a vector by a scalar, say λ, gives a vector λa in the same direction as the original but with the magnitude changed λ times.

The scalar product (or dot product) of two vectors a and b, denoted by a · b, is given by

a · b = |a||b| cos φ,  0 ≤ φ ≤ π   (1.30)

where φ is the angle between the two vectors with their tails at the same point. Evidently, a and b are perpendicular (or orthogonal) if a · b = 0. A set of vectors S is said to be orthonormal if every vector in S has magnitude unity and the vectors are mutually orthogonal. In the three-dimensional Cartesian coordinate system, i, j, and k are considered to be orthonormal unit vectors in the directions of x, y, and z, respectively:

i · i = j · j = k · k = 1

and

i · j = j · k = k · i = 0


Any vector in three-dimensional space can be considered a linear combination of the unit vectors:

a = i ax + j ay + k az   (1.31)

(ax, ay, az) are the components of a.

In contrast to the scalar product, the vector product (also called cross product) of two vectors a and b is a third vector denoted as

c = a × b   (1.32)

whose magnitude is given by

|c| = |a||b| sin φ,  0 ≤ φ ≤ π   (1.33)

The vector c is orthogonal to both a and b, and its direction can be conceptualized in analogy to the motion of a corkscrew. As the corkscrew turns a up to b, it advances in the direction of c (Fig. 1.5).

FIGURE 1.5 The vector product.

Example 1.2.1: If we have two vectors written in component form

a = (ax, ay, az) and b = (bx, by, bz)

then, in component form, the dot product of the two vectors is given by

a · b = ax bx + ay by + az bz   (1.34)

and the cross product by

a × b = (ay bz − az by, az bx − ax bz, ax by − ay bx)

      = [ ay bz − az by ]
        [ az bx − ax bz ]
        [ ax by − ay bx ]   (1.35)
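The component formulas (1.34) and (1.35) are exactly what NumPy computes; a small sketch (editorial, not from the book):

```python
import numpy as np

a = np.array([1.0, 2.0, 3.0])
b = np.array([4.0, 5.0, 6.0])

print(np.dot(a, b))    # Eq. (1.34): 1*4 + 2*5 + 3*6 = 32
c = np.cross(a, b)     # Eq. (1.35): [-3.  6. -3.]
print(c)
print(np.dot(c, a), np.dot(c, b))  # both 0: c is orthogonal to a and b
```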


1.2.2 Vector as an ordered set of numbers

From another perspective, a vector v in two dimensions (2-space, ℝ^2) is an ordered pair of real numbers, denoted by

[ v1 ]
[ v2 ]   or   (v1, v2)

Synonymously, it can be considered a point in ℝ^2. Vectors in ℝ^2 follow certain basic algebraic rules:

1. Vector equality: Two vectors u = [u1, u2]^T and v = [v1, v2]^T are equal if and only if u1 = v1 and u2 = v2.

2. Vector addition: The sum w of two vectors u = [u1, u2]^T and v = [v1, v2]^T is defined by

w = u + v = [ u1 + v1 ] = [ w1 ]
            [ u2 + v2 ]   [ w2 ]   (1.36)

(1.36) can be demonstrated by the "parallelogram rule for vector addition" (Fig. 1.6A).

3. Scalar multiplication: The product of a vector v = [v1, v2]^T with a scalar λ ∈ ℝ is defined by

λv = λ [ v1 ] = [ λv1 ]
       [ v2 ]   [ λv2 ]   (1.37)

and is shown in Fig. 1.6B.

It is to be noted that for u, v ∈ ℝ^2, the sum w ∈ ℝ^2, and the scaled product λv ∈ ℝ^2 as well.

11

12

CHAPTER 1 Mathematical preliminaries

Similarly, a vector v in three-dimensional space (3-space, ℝ3) can be represented by an arrow in a three-dimensional Cartesian coordinate system. It can also be denoted as a 3-tuple 0

1 v1 @ v2 A or v3



v1 v2 v3



For u, v A ℝ3 the same rules of vector equality, vector addition, and scalar multiplication are valid, and both u 1 v 5 w, λ v A ℝ3. Extending the idea of a vector in 2-space or 3-space, a “generalized vector” can be represented by a “generalized arrow” or a “generalized point” in n-dimensional space (n-space, ℝn), where n is a positive integer. Alternatively, an n-dimensional vector is an ordered list of n real numbers (n-tuple). It can be written in the comma-delimited form v 5 ðv1 ; v2 ; . . .; vn Þ

or, in the row-vector form as

 v 5 v1 v2 . . . vn

or, in the column-vector form as 2

3 v1 6 v2 7 7 v56 4 ^ 5 vn

The numbers in the n-tuple (v1, v2,. . ., vn) can be considered as either the coordinates of the generalized point or the components of the generalized vector. Introducing here the term “transpose,” it can be said that for a vector v 5 [v1 v2 . . . vn], its transpose is denoted by 2

3 v1 6 v2 7 7 vT 5 6 4 ^ 5 vn

Alternatively, for 2

3 v1 6 v2 7 7 v56 4 ^ 5 vn

 T v 5 v1 v2 . . . vn ;

In other words, transposing a row-vector results in a column-vector while transposing a column-vector results in a row-vector. The basic rules of vector algebra as followed in 2-space and 3-space can also be extended to n-space as well. So, for vectors u, v A ℝn and a scalar

1.2 Vectors

λ A ℝ, u (u1, u2,. . ., un) 5 v (v1, v2,. . ., vn) (that is, u and v are equal or equivalent) if and only if u1 5 v1 ;

u2 5 v2 ;. . .;

un 5 vn

Further, 2

3 2 u1 1 v1 w1 6 u2 1 v2 7 6 w2 7 6 u1v56 4 ^ 554 ^ un 1 vn wn

3 7 75 w 5

wAℝn

(1.38)

and 2

3 2 λv1 λu1 6 λu2 7 6 λv2 6 6 7 λv 5 4 λu 5 4 ^ 5 ^ λun λvn

3 7 7 5

λu; λvAℝn

(1.39)

1.2.3 Mathematical viewpoint of vector

Thus it can be seen that vectors u and v in ℝ^n can be added together to produce w, and multiplied by a scalar to yield λu and λv, respectively, all of w, λu, and λv being in ℝ^n. Based on these two properties of addition and scalar multiplication, an abstract mathematical viewpoint generalizes the concept of vectors—these are "special objects" that can be added together or multiplied by scalars to produce "objects of the same kind." Besides the geometric vectors described previously, the following are also examples of vector objects:

1. Polynomials. Two polynomials, when added together, will result in another polynomial; either of the two, if multiplied by a scalar, will also produce a polynomial. Hence, polynomials can be considered vectors.
2. Color. In the RGB color model, a 3-vector represents a color. For example, (255, 0, 0) is red, (0, 255, 0) is green, and (0, 0, 255) is blue. These colors are added and scaled up or down to produce different colors.
3. Signals. A time series or signal (such as an audio or a video signal) can be represented by an n-vector. When two audio signals are added, the result is a new audio signal. If an audio signal is scaled, a new audio signal is obtained as well.

Evidently, addition and scalar multiplication can be used in combination to form new vectors. A vector w ∈ ℝ^n is said to be a linear combination of vectors v1, v2, ..., vk in ℝ^n if it can be expressed in the form

w = λ1 v1 + λ2 v2 + ... + λk vk   (1.40)

where the scalars λ1, λ2, ..., λk are called the coefficients of the linear combination.
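A linear combination per (1.40) is a one-liner in NumPy (an editorial sketch, not from the book):

```python
import numpy as np

v1 = np.array([1.0, 0.0, 2.0])
v2 = np.array([0.0, 1.0, 1.0])

# w = 2*v1 + (-3)*v2: a linear combination with coefficients 2 and -3
w = 2 * v1 + (-3) * v2
print(w)  # [ 2. -3.  1.]
```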


Scalar multiplication of a vector by the number 0 results in a special vector called the zero vector:

0 = [0, 0, ..., 0]^T = 0v = 0 [v1, v2, ..., vn]^T

It has been said earlier that the length of a vector, represented by an arrow drawn to scale, specifies its magnitude. A common mathematical synonym for length is the term "norm" (also called the Euclidean norm). For a vector v = (v1, v2, ..., vn) ∈ ℝ^n, the norm (or length or magnitude) is denoted by ||v|| and defined by the expression

||v|| = √(v1^2 + v2^2 + ... + vn^2)   (1.41)

It follows from the above equation that

a. ||v|| ≥ 0
b. ||v|| = 0 if and only if v = 0

Further, if k is a scalar, then

||kv|| = |k| ||v||

A vector having norm 1 is called a unit vector. If u ∈ ℝ^n is a nonzero vector, a unit vector that is in the same direction as u is defined as

û = (1/||u||) u   (1.42)

It can be shown that ||û|| = 1. Hence it appears that multiplication of a nonzero vector by the reciprocal of its magnitude produces a unit vector. The process is called normalizing a vector.
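Normalizing per (1.41) and (1.42) in code (an editorial Python sketch, not from the book; the guard against the zero vector reflects property (b) above):

```python
import numpy as np

def normalize(u):
    """Scale a nonzero vector by the reciprocal of its norm, Eq. (1.42)."""
    n = np.linalg.norm(u)  # Euclidean norm, Eq. (1.41)
    if n == 0:
        raise ValueError("the zero vector cannot be normalized")
    return u / n

u_hat = normalize(np.array([3.0, 4.0]))
print(u_hat, np.linalg.norm(u_hat))  # [0.6 0.8] 1.0
```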

1.3 Matrices and determinants

Mathematical analysis of a problem, whether in physics, chemistry, economics, or molecular structural biology, can often be reduced to solving a system of linear equations. If the system is organized in the form of a rectangular array of numbers, called a "matrix," its solution can be obtained by performing appropriate operations on this matrix.

1.3.1 Systems of linear equations

A linear equation, in two dimensions, is written as

ax + by = c,  both a, b ≠ 0   (1.43)

and, in three dimensions, as

ax + by + cz = d,  all a, b, c ≠ 0   (1.44)

In (1.43), a and b are, respectively, the coefficients of the (unknown) variables x and y, while c is the constant term; in (1.44), a, b, and c are, respectively, the coefficients of the (unknown) variables x, y, and z, d being the constant term. More generally, a linear equation in n dimensions can be expressed in the form

a1 x1 + a2 x2 + ... + an xn = b   (1.45)

where a1, a2, ..., an denote real numbers called the coefficients of the (unknown) variables x1, x2, ..., xn, respectively, and b is also a number, called the constant term of the equation. Such an equation does not contain any products or roots of variables; the variables appear only to the first power and not as arguments of trigonometric, logarithmic, or exponential functions. In the special case where b = 0, the equation takes the form

a1 x1 + a2 x2 + ... + an xn = 0   (1.46)

called a homogeneous linear equation in the variables x1, x2, ..., xn.

A sequence s1, s2, ..., sn of n numbers is called a solution of (1.45) if

a1 s1 + a2 s2 + ... + an sn = b

that is, the equation is satisfied when the substitutions x1 = s1, x2 = s2, ..., xn = sn are made.

A system of linear equations, or briefly a linear system, is a finite set of linear equations involving the same variables. The generalized form of a linear system can be written as

a11 x1 + a12 x2 + ... + a1n xn = b1
a21 x1 + a22 x2 + ... + a2n xn = b2
...
am1 x1 + am2 x2 + ... + amn xn = bm   (1.47)

Eq. (1.47) can be compactly represented by

[ a11  a12  ...  a1n ] [ x1 ]   [ b1 ]
[ a21  a22  ...  a2n ] [ x2 ] = [ b2 ]
[  ⋮    ⋮         ⋮  ] [ ⋮  ]   [ ⋮  ]
[ am1  am2  ...  amn ] [ xn ]   [ bm ]   (1.48)

along with the rule of multiplication

[ a11  a12  ...  a1n ] [ x1 ]   [ a11 x1 + a12 x2 + ... + a1n xn ]
[ a21  a22  ...  a2n ] [ x2 ] = [ a21 x1 + a22 x2 + ... + a2n xn ]
[  ⋮    ⋮         ⋮  ] [ ⋮  ]   [               ⋮                ]
[ am1  am2  ...  amn ] [ xn ]   [ am1 x1 + am2 x2 + ... + amn xn ]   (1.49)


We already know that

x = [x1, x2, ..., xn]^T and b = [b1, b2, ..., bm]^T

are the column vectors, and

A = [ a11  a12  ...  a1n ]
    [ a21  a22  ...  a2n ]
    [  ⋮    ⋮         ⋮  ]
    [ am1  am2  ...  amn ]

is defined as an m × n matrix. It can also be written in the compact notation

[aij]m×n or [aij]

The numbers aij are called the elements of the matrix A. Eq. (1.48) can also be written as

A x = b   (1.50)
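To make the abbreviation concrete, here is a minimal Python sketch (an editorial illustration using NumPy, not from the book) showing that the product A x in (1.50) reproduces the left-hand sides of (1.47):

```python
import numpy as np

# an arbitrary 3 x 2 coefficient matrix A and a vector x of unknowns
A = np.array([[1.0, 2.0],
              [3.0, 4.0],
              [5.0, 6.0]])
x = np.array([1.0, -1.0])

# row i of A @ x equals ai1*x1 + ai2*x2, the multiplication rule (1.49)
print(A @ x)  # [-1. -1. -1.]
```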

A is called the coefficient matrix.

A sequence of numbers is called a solution to a system of equations such as (1.47) if it is a solution to every equation in the system. A system may have no solution at all, a unique solution, or an infinite family of solutions. For example, the system x + y = 5, x + y = −3 has no solution since the sum of two numbers cannot be 5 and −3 simultaneously. A system without a solution is called inconsistent, while a system with at least one solution is called consistent.

Usually, a system of equations with a greater number of variables than equations has infinitely many solutions. For example, the linear system

2x − 3y + 4z = 5
−x + 2y + 3z = 3

can be solved by setting z = s (arbitrary) as

x = 19 − 17s
y = 11 − 10s
z = s

The equations are satisfied for all choices of s. Similarly, the equation

3x − y + 2z = 6

can be solved as

x = s
y = 3s + 2t − 6
z = t

(s and t arbitrary). The quantities s and t in the above examples are called parameters. The set of solutions in each case, said to be in parametric form, is called the general solution to the system.

When only two or three variables are involved, a system can be solved graphically or by simple algebraic operations. However, if more than three variables are involved, a graphical solution is not possible and, further, with an increasing number of unknowns the complexity of the algebra also increases. The computation in such cases necessitates suitable notations and standardized procedures. For this purpose, the system in (1.48) can be further abbreviated as

[ a11  a12  ...  a1n | b1 ]
[ a21  a22  ...  a2n | b2 ]
[  ⋮    ⋮         ⋮  |  ⋮ ]
[ am1  am2  ...  amn | bm ]   (1.51)

which is called the augmented matrix for the system. In this matrix the constants bi form the last column. Having this notation for a system of linear equations (1.48), the standardized procedure for solving the system is based on the following principle: two such systems are considered equivalent if they have the same set of solutions. Therefore a system is solved by deriving a series of systems, each derived from, and equivalent to, the preceding one, ending with a system that is easy to solve. The operations carried out are simple and do not change the set of solutions.

To illustrate, let us see how a specific system of linear equations is solved by (elementary) operations carried out on the equations in each step; the corresponding augmented matrix is shown after each step.

2x − 3y + 5z = 16
−x + 2y − 2z = −6
3x + y − z = 4

[ 2  −3   5 | 16 ]
[ −1  2  −2 | −6 ]
[ 3   1  −1 |  4 ]

The first equation is added to 2 times the second equation to obtain

2x − 3y + 5z = 16
y + z = 4
3x + y − z = 4

[ 2  −3   5 | 16 ]
[ 0   1   1 |  4 ]
[ 3   1  −1 |  4 ]

3 times the first equation is added to −2 times the third equation to obtain

2x − 3y + 5z = 16
y + z = 4
−11y + 17z = 40

[ 2   −3    5 | 16 ]
[ 0    1    1 |  4 ]
[ 0  −11   17 | 40 ]

11 times the second equation is added to the third equation to obtain

2x − 3y + 5z = 16
y + z = 4
28z = 84

[ 2  −3   5 | 16 ]
[ 0   1   1 |  4 ]
[ 0   0  28 | 84 ]

The third equation is multiplied by 1/28 to obtain

2x − 3y + 5z = 16
y + z = 4
z = 3

[ 2  −3  5 | 16 ]
[ 0   1  1 |  4 ]
[ 0   0  1 |  3 ]

3 times the second equation is added to the first equation to obtain

2x + 8z = 28
y + z = 4
z = 3

[ 2  0  8 | 28 ]
[ 0  1  1 |  4 ]
[ 0  0  1 |  3 ]

−8 times the third equation is added to the first equation and −1 times the third equation is added to the second equation to obtain

2x = 4
y = 1
z = 3

[ 2  0  0 | 4 ]
[ 0  1  0 | 1 ]
[ 0  0  1 | 3 ]

The first equation is multiplied by 1/2 to obtain

x = 2
y = 1
z = 3

[ 1  0  0 | 2 ]
[ 0  1  0 | 1 ]
[ 0  0  1 | 3 ]

So, the solution is x = 2, y = 1, z = 3.

Let us now generalize, in the following definition, the concept of the elementary operations carried out to generate a series of equivalent systems.

Definition 1.3.1 The following algebraic operations are called elementary operations:

1. Interchanging two equations.
2. Multiplying an equation by a nonzero number.
3. Adding a multiple of one equation to another equation.

The above definition is associated with the following theorem: Theorem 1.3.1 If a sequence of elementary operations is performed on a system of linear equations, the resulting system has the same set of solutions as the original; hence the two systems, initial and final, are equivalent.

1.3 Matrices and determinants

The corresponding operations to manipulate the augmented matrix are called elementary row operations. Definition 1.3.2 Elementary row operations on a matrix are the following: 1. Interchanging two rows. 2. Multiplying a row by a nonzero number. 3. Adding a multiple of one row to another row.

Gaussian elimination In the above illustration, a linear system in the unknowns x, y, and z has been solved by reducing the augmented matrix to the form 2

1 40 0

0 0 1 0 0 1

 

3 5



where the asterisks represent arbitrary numbers. It appears to be in a “suitable” form from where the corresponding equations are easy to solve. Such matrices have the following properties: 1. Zero rows (consisting entirely of zeros), if any, are grouped together at the bottom. 2. In each nonzero row the first nonzero entry from the left is a 1. This is called the leading 1 for that row. 3. Each leading 1 is to the right of all leading 1 s in the rows above it. A matrix satisfying these three conditions is said to be in row-echelon form. A row-echelon matrix is said to be in reduced row-echelon form if it satisfies the following additional condition: 4. Each column that has a leading 1, contains zeros everywhere else. Examples of matrices in row-echelon form 

1  0 0





1

2

3 1    40 1   5 0 0 0 1

2

3 1   40 1 05 0 0 1

Examples of matrices in reduced row-echelon form 

1  0 0

0 1



2

3 1 0  0 40 1  05 0 0 0 1

2

3 1 0 0 40 1 05 0 0 1

It is to be noted that every matrix can be brought to (reduced) row-echelon form by a sequence of elementary row operations.

19

20

CHAPTER 1 Mathematical preliminaries

Example 1.3.1: Let us consider the matrix 2

22 24 22

1 A542 1

21 1 2

3 3 0 5 23

By a couple of row operations, the matrix A can be brought to the rowechelon form 2

22 0 0

1 Are 5 4 0 0

21 1 0

and, in another step, to the reduced form 2

22 0 0 1 0 0

1 Arre 5 4 0 0

3 3 22 5 0

3 1 22 5 0

It can be seen that the number r of leading 1s is the same in each of these rowechelon matrices. In general, it is true that r is dependent only on the original matrix A and not on the way A is brought to row-echelon form. Hence the following definition. Rank. The rank of a matrix A is the number of leading 1s in any row-echelon matrix to which A can be brought by row operations.

1.3.2 Matrices As seen previously, rectangular arrays of numbers are often used to abbreviate systems of linear equations. However, as we shall see later in the book, they are used in other important contexts as well. For example, it may be necessary to align two biomacromolecule to compare their structures. It has been mentioned earlier that the conformation of a molecule can be represented by a vector. Hence, the problem boils down to the alignment of two vectors. Here, we have a simple example to illustrate the alignment of two vectors. Example 1.3.2: Let us consider two vectors v1 5 (2, 3) and v2 5 (23, 2) in 2-space with their initial point at the origin. Clearly, v1 needs to be rotated by 90 degrees counterclockwise to align with v2. This can be achieved by taking the following 2 3 2 array 

a b c d



where a 5 0, b 5 21, c 5 1 and d 5 0, along with a rule of multiplication 

a b c d

    x ax 1 by 5 y cx 1 dy

and carrying out the operation 

0 1

21 0

    2 23 5 3 2

1.3 Matrices and determinants

Here, we have a formal mathematical definition of a matrix: Definition 1.3.3 A real-valued m 3 n matrix A is an m  n-tuple of elements aij, i 5 1, 2,. . ., m, j 5 1, 2,. . ., n (m, n A ℕ) arranged in a rectangular array consisting of m rows and n columns. The numbers in the array, aij are called the entries in the matrix. The set of all real-valued m 3 n matrices is denoted by ℝ m x n or ℝ m n.

Example 1.3.3: Matrices of different sizes 3 3 2 6 7 65 4 5 24 1

2

2

5 2 4

332

 6 6 1 6 4

134

3

0 pffiffiffi e 7 20:5 4 0 24 434

π 1 0 2 2

3

47 7 7 65 5

2

3 22 6 7 4 0 5 4 331

Evidently, just as a vector is a one-dimensional array of numbers, a matrix is defined as a two-dimensional array of numbers. In other words, a vector is an array of scalars, while a matrix is an array of vectors. A 1 3 n matrix, with only one row, is equivalent to a row-vector as shown earlier—hence it is also called a row matrix. Similarly, an m 3 1 matrix, with just one column, is called either a column-vector or a column matrix. Square matrices. A, as in Definition 1.3.3, will be a square matrix if m 5 n and called a square matrix of order m (or n). This type of matrices is of particular importance in matrix applications and analysis. A fourth-order square matrix is shown in the above example. A square matrix is called a diagonal matrix if all the entries off the main diagonal are zero. An n 3 n diagonal matrix can be written as 2

d1 6 0 D56 4 ^ 0

3 ? 0 ? 0 7 7 & ^ 5 ? dn

0 d2 ^ 0

(1.52)

A square matrix in which all the entries above the main diagonal are zero is called a lower triangular matrix and a square matrix wherein all the entries below the main diagonal are zero is called an upper triangular matrix. Either is a triangular matrix. The following are representations of a lower triangular and an upper triangular matrix. 2

a11

0

?

0

6a 0 6 21 a22 ? 6 4 ^ ^ & ^ an1 an2 ? ann lower triangular

3 2 7 7 7 5

a11

6 0 6 6 4 ^ 0

a12

?

a1n

a22 ? a2n ^ & ^ 0 ? ann upper triangular

3 7 7 7 5

21

22

CHAPTER 1 Mathematical preliminaries

It is worthwhile to look into the rules of operations on matrices for discussion on their applications in later chapters.



 Equality. Two matrices A 5 aij and B 5 bij are defined to be equal if they have the same size and their corresponding entries are equal, that is, aij 5 bij

(1.53)

the equalities holding for all values of i and j. Addition and subtraction. The sum (or difference) of two matrices A A ℝm 3 n and B A ℝm 3 n is defined as the sum (or difference) of their corresponding entries, that is, 2

a11 A6B54 ^ am1

3 2 b11 a1n ^ 564 ^ ? amn bm1 ?

3 2 a11 6 b11 b1n ^ 554 ^ ? bmn am1 6 bm1 ?

3 a1n 6 b1n 5 ^ ? amn 6 bmn (1.54) ?

Scalar product. If A 5 [aij] is a matrix and λ a scalar, the scalar product λ A is obtained by multiplying each entry of the matrix A by λ. λ A is also called a scalar multiple of A. That is, for A A ℝm 3 n 2 λ a11

  λ A 5 λ aij 5 λ aij 5 4 ^ λ am1

3 λ a1n ^ 5 ? λ amn ?

(1.55)

Product of matrices. For matrices A A ℝm 3 l, B A ℝl 3 n, each element cij of the product C 5 A B A ℝm 3 n is equal to the vector dot product of row i of A with column j of B. That is, 2

cij 5 ai Ubj 5 ai1 ai2

cij 5

l X

aik bkj ;

3 b1j 6 b2j 7 7 ? ail 6 4 ^ 5 blj

i 5 1; 2; . . .; m;

j 5 1; 2; . . .; n

(1.56)

k51

Matrices can be multiplied only if their neighboring dimensions match. For matrices P A ℝk 3 l, Q A ℝl 3 m, V A ℝm 3 k, the products P Q A ℝk 3 m, Q V A ℝl 3 k and V P A ℝm 3 l are defined while V Q, Q P, and P V are not defined. Following the rules of matrix multiplication, we can write from (1.49) 3 2 a11 a11 x1 1 a12 x2 1 ? 1 a1n xn 6 a21 x1 1 a22 x2 1 ? 1 a2n xn 7 6 a21 7 5 x1 6 A x56 5 4 4 ^ ? am1 x1 1 am2 x2 1 ? 1 amn xn am1 2

3

3 2 a12 a1n 7 6 a22 7 6 a2n 7 1 x2 6 7 6 5 4 ^ 5 1 ?xn 4 ^ am2 amn 2

3 7 7 (1.57) 5

From the above equation, it can be said that if a matrix A A ℝm 3 n, and a vector x A ℝn 3 1, the product A x can be expressed as a linear combination of the column vectors of A where the coefficients are the entries of x.

1.3 Matrices and determinants

Zero matrices. A matrix whose entries are all zero is known as a zero matrix. It is usually denoted by 0. Some examples of zero matrices are Example 1.3.4: Zero matrices 2

0 60 6 40 0

3 0 07 7 05 0

0 0

0 0





0 0

0 0 0 0 0 0



Apparently, the zero matrix plays same role in a matrix equation as the number 0 in a numerical equation. However, it should be noted that a zero matrix is not the same as the number 0. The following matrix addition A10501A5A

(1.58)

will be valid only if A and 0 are matrices with the same size. For the sake of clarity, sometimes an m 3 n zero matrix is denoted by 0m 3 n. Multiplication of two nonzero matrices can result in a zero matrix. For example, if  A5

0 2 0 3



 and

B5





AB 5

0 0 0 0

5 3 0 0



Identity matrix. A square matrix with ones on the diagonal and zeroes elsewhere is defined as an identity matrix. Example 1.3.5: A 4 3 4 identity matrix 2

1 60 I4 5 6 40 0

0 1 0 0

0 0 1 0

3 0 07 7Aℝ4 3 4 05 1

For a matrix A A ℝm 3 n AIn 5 A

and

Im A 5 A

(1.59)

Transpose. For any m 3 n matrix, A, the transpose, denoted by A , is defined to be the n 3 m matrix obtained by interchanging the rows and columns of A. In other words, for A A ℝm 3 n, the matrix B A ℝn 3 m with bij 5 aji is known as the transpose of A. That is, B 5 AT. For T

2

a11 6 a21 6 A54 ^ am1

a12 a22 ^ am2

3 ? a1n ? a2n 7 7; ? ^ 5 ? amn

2

a11 6 a12 T 6 A 54 ^ a1n

a21 a22 ^ a2n

3 ? am1 ? am2 7 7 ? ^ 5 ? amn

23

24

CHAPTER 1 Mathematical preliminaries

Example 1.3.6: 2

3 3 2 For B 5 4 5 6 5; 24 1

For C 5 6

22

 BT 5

3 5 2 6

24 1



3 6 6 22 7 7 CT 5 6 4 1 5 0 2

 1 0 ;

Transposition of a matrix can also be shown as 2

2 42 2

u v w

3T 2 j j 2 25 54u v j j 2

3 j w5 j

Here are some properties of the transpose Theorem 1.3.2 Provided the sizes of the matrices permit the stated operations, (AT)T 5 A (A 1 B)T 5 AT 1 BT (A 2 B)T 5 AT 2 BT (k A)T 5 k AT

1. 2. 3. 4.

In addition, we have the following important law for transposition. Theorem 1.3.3 Reverse order law for transposition Provided the matrix product A B is defined, ðABÞT 5 BT AT

(1.60)

Symmetric matrix. A matrix A is called symmetric, if A 5 AT, that is, aij 5 aji. A matrix A is called skew-symmetric, if A 5 2AT, that is, aij 5 2aji. If A is an m 3 n matrix, AT is an n 3 m matrix. Evidently, A 5 AT. n 5 m. Hence, A symmetric (or skew-symmetric) matrix A is necessarily a square matrix. It is to be noted that for every m 3 n matrix A, 

AT A

T

T

5 AT AT 5 AT A

(1.61a)

and 

AAT

T

T

5 AT AT 5 AAT

(1.61b)

1.3 Matrices and determinants

which gives the following theorem. Theorem 1.3.4 For every matrix A A ℝm 3 n, the products AT A and A AT are symmetric matrices.

Example 1.3.7: Let 

3 5 2 6

24 1



Then, 2

3 3 2  3 5 AT A 5 4 5 65 2 6 24 1

2  13 24 5 4 27 1 210

27 61 214

3 210 214 5 17

and 2 3    3 2 50 32 24 4 5 655 32 41 1 24 1



3 5 AA 5 2 6 T

Evidently, both AT A and A AT are symmetric matrices. Inverse. Let us consider a square matrix A A ℝn 3 n. If there exists a matrix B A ℝn 3 n such that AB 5 BA 5 In

then A is said to be regular, invertible, or nonsingular, and B is known as the inverse of A. In the absence of any such matrix B, the matrix A is noninvertible or singular. If A is invertible and B is the inverse of A, it is also true that B is invertible and A is the inverse of B. Let there be two matrices  A5

 a12 Aℝ2 3 2 ; a22

a11 a21

and

B5

1 λ



a22 2a21

 2a12 Aℝ2 3 2 a11

where λ is a scalar. Multiplying A with B, we obtain  1 a11 a22 2 a12 a21 0 λ

AB 5

0 a11 a22 2 a12 a21

 5

1 ða11 a22 2 a12 a21 Þ I2 λ

If λ 5 (a11 a22 2 a12 a21)6¼0, A B 5 I2 5 A A21so that 21

A

1 5 λ



a22 2a21

2a12 a11



if and only if (a11 a22 2 a12 a21)6¼0.

1 5 a11 a22 2 a12 a21



a22 2a21

2a12 a11

 (1.62)

25

26

CHAPTER 1 Mathematical preliminaries

Example 1.3.8:



   0 1 21 1 is inverse of 1 1 1 0   1 1 is singular 2 2

In general, for any square matrix A, the inverse A21 (if it exists) is a square matrix of the same size as A with the following property: AA21 5 I 5 A21 A

(1.63)

The reverse order law also holds for inverses of matrix products. Theorem 1.3.5 If matrices A and B are of the same size and both are invertible, then A B is invertible and ðABÞ21 5 B21 A21

(1.64)

Eq. (1.64) is extendible to cases where three or more matrices are involved. Theorem 1.3.6 A product of any number of invertible matrices of the same size is invertible, and the inverse of the product is equal to the product of the inverses in reverse order.

Applying (1.60) to (1.64), we have 

AA21

T

 T 5 IT 5 I 5 A21 AT

(1.65)

and the following theorem. Theorem 1.3.7 For an invertible matrix A, the transpose of the inverse equals the inverse of the transpose, that is (A21)T is the inverse of AT.

(1.62) leads to the definition of the determinant as we shall see next.

1.3.3 Determinants The quantity (a11 a22 2 a12 a21) is defined as the determinant of the matrix A A ℝ2 3 2. That is  a det ðAÞ 5  11 a21

 a12  5 a11 a22 2 a12 a21 a22 

(1.66)

1.3 Matrices and determinants

The relationship between the determinant and invertibility of a 2 3 2 matrix as evident in (1.62) holds for n 3 n matrices as well. Theorem 1.3.8 A matrix A A ℝn 3 n is invertible if and only if det (A)6¼0.

Determinants are defined for only square matrices. For a matrix A A ℝ3 3 3, the determinant, of order 3, is denoted by   a11  det ðAÞ 5  a21  a31

a12 a22 a32

a13 a23 a33

     

and, in general, for A A ℝn 3 n,

  a11  a det ðAÞ 5  21  ^  an1

a12 a22 ^ an2

? a1n ? a2n & ^ ? ann

       

To calculate the determinant, it is necessary to introduce the notions of the minor and the cofactor of a matrix. Using cofactors, we can write a 3 3 3 determinant in terms of 2 3 2 determinants, a 4 3 4 determinant in terms of 3 3 3 determinants,. . ., and an n 3 n determinant in terms of (n1) 3 (n1) determinants. Minor and cofactor: For a square matrix A A ℝn 3 n, the minor Mij of the element aij is defined to be the determinant of the (n1) 3 (n1) submatrix that remains after removing the ith row and jth column from A. The associated cofactor, Cij 5 (21) i1j Mij. For a 2 3 2 matrix 

a A 5 11 a21

a12 a22



it is very easy to see that M11 5 a22 ; C11 5 a22 M12 5 a21 ; C12 5 2 a21 M21 5 a12 ; C21 5 2 a12 M22 5 a11 ; C22 5 a11

Then, (1.66) can also be written as det ðAÞ 5 a11 C11 1 a12 C12 5 a21 C21 1 a22 C22 5 a11 C11 1 a21 C21 5 a12 C12 1 a22 C22

27

28

CHAPTER 1 Mathematical preliminaries

It can be noted that in each of the four equations the determinant is expressed as the sum of the products of the elements of any row or column and their corresponding cofactors—the elements and cofactors all correspond to the same row or same column of A. Such a rule of cofactor expansion, which is also called Laplace expansion, can be extended to higher order determinants as well. For A A ℝ3 3 3, there are six expansions such as a11C11 1 a12C12 1 a13C13 (expanded along the first row), or a13C13 1 a23C23 1 a33C33 (expanded along the third column). Each of these expansions can be further elaborated as det ðAÞ 5 a11 C11 1 a12 C12 1 a13 C13 5 a11 ða22 a33 2 a23 a32 Þ 2 a12 ða21 a33 2 a23 a31 Þ 1 a13 ða21 a32 2 a22 a31 Þ 5 a11 a22 a33 1 a12 a23 a31 1 a13 a21 a32 2 a13 a22 a31 2 a12 a21 a33 2 a11 a23 a32

and so on. All six expansions give the same value for det (A). In general, for a matrix A A ℝn 3 n, the number obtained by multiplying the entries in any row or column of A by the corresponding cofactors and adding the resulting products is called the determinant of A, and the sums themselves are called cofactor expansions of A. That is, det ðAÞ 5 a1j C1j 1 a2j C2j 1 ? 1 anj Cnj

[cofactor expansion along the jth column] det ðAÞ 5 ai1 Ci1 1 ai2 Ci2 1 ? 1 ain Cin

[cofactor expansion along the ith row] It is to be noted that whichever row or column is chosen, the same result is obtained.

Definiteness of a symmetric matrix As we shall see later, in Chapter 4, the concept of definiteness of a symmetric matrix is important in laying down the rules for optimization. Here, we consider how a specific property of the determinant is related to the concept. Definition 1.3.4 Principal submatrix. In a matrix A A ℝn 3 n, the kth principal submatrix is the submatrix consisting of the first k rows and columns of A.

Example 1.3.8: In the 4 3 4 matrix 2

3 6 e 6 A54 20:5 0

p0ffiffiffi 7 4 24

π 0 2 2

3 1 47 7 65 5

1.4 Calculus

½3

is the first principal submatrix  3 p0ffiffiffi is the second principal submatrix e 7 2 3 3 p0ffiffiffi π 4 e 7 0 5 is the third principal submatrix 20:5 4 2 

The definiteness of a symmetric matrix is related to the determinants of its submatrices in accordance with the following theorem. Theorem 1.3.9 A symmetric matrix A A ℝn 3 n is 1. positive definite if and only if the determinants of all its principal submatrices are positive; 2. negative definite if and only if the determinants of its principal submatrices alternate between negative and positive values, beginning with a negative value for the determinant of the first principal submatrix; and 3. indefinite if and only if it is neither positive definite nor negative definite and at least one principal submatrix has a positive determinant and at least one a negative determinant.

1.4 Calculus Most often, it is not enough to just describe a physical system by a function of one or more variables. To understand the dynamics of the system, it is important to determine how the function changes with respect to the variable/variables on which it depends. The branch of mathematics that is concerned with changes in a function in a precise manner is calculus. It consists of two subbranches—differential calculus, which computes the change in the function (derivative) brought about by an infinitesimal (approaching zero) change in a variable and integral calculus, which is concerned with the summation (integration) of the products of the derivative and the infinitesimal changes in the variable within defined “limits.”

1.4.1 Differentiation Differentiation, as it has been indicated previously, is a limiting process and hence it is restricted by the rule of limit. Therefore, before moving on to the definition of a derivative that is obtained in the process of differentiation, it will be reasonable to have a basic understanding of the concept of limit. We have here the formal definition of the limit of a function.

29

30

CHAPTER 1 Mathematical preliminaries

Definition 1.4.1 As x approaches a, the limit of f (x) is l, written as Lim f ðxÞ 5 l x-a

(1.67)

is defined as follows: given any real number ε .0, there exists another real number δ .0 such that for 0 , |xa| , δ, |f (x)l| , ε. The limit must be a unique real number.

Now, if x changes to x 1 Δ x, where Δx is a very small amount, leading to a change Δ f in the value of the function, that is, Δ f 5 f (x 1 Δ x)f (x), then the (first) derivative of f (x) is defined as follows. Definition 1.4.2 The derivative of a function f (x) at x is defined by f 0 ðxÞ 5

df ðxÞ Δf f ðx 1 ΔxÞ 2 f ðxÞ 5 lim 5 lim Δx-0 Δx Δx-0 dx Δx

(1.68)

provided that the limit exists and is independent of the direction from which ε approaches zero. If the limit exists at x 5 a, the function is said to be differentiable at a.

The derivative of f (x) can be visualized graphically as the gradient or slope of the function at x. It can be seen from Fig. 1.7 that as Δx-0, Δf/Δx approaches the gradient given by tan φ. Since f 0 ðxÞ is also a function of x, its derivative with respect to x, that is, the second derivative of f (x) can also be defined.

FIGURE 1.7 The gradient or slope of a function f(x) at x.

1.4 Calculus

Definition 1.4.3 The second derivative of a function f (x) at x is defined by f v ðxÞ 5

d df ðxÞ d2 f ðxÞ f 0 ðx 1 ΔxÞ 2 f 0 ðΔxÞ 5 5 lim 2 Δx-0 dx dx dx Δx

(1.69)

provided the limit exists.

For most applications of the differentiation, one does not need to start from the first principles (1.68). Here, we have presented the rules and techniques, in the form of theorems which can nevertheless be deduced from the first principles, to evaluate the derivatives of simple to complicated functions.

Simple algebraic functions Theorem 1.4.1 For any real number k, if f (x) 5 x k, then d d f ðxÞ 5 xk 5 kUxk21 dx dx

Example 1.4.1: d 2 x 5 2x; dx

d 5 x 5 5x4 ; dx

d dx

  1 1 52 2 x x

Theorem 1.4.2 The derivative of a constant is zero, that is d c50 dx

Example 1.4.2:  d  3 x 2 5 5 3x2 2 0 5 3x2 dx

Simple algebraic functions are not sufficient to describe and analyze atomic and molecular structure and dynamics. In most cases, the application of trigonometric, exponential, and logarithmic functions is useful and necessary. Here, the derivatives of these functions have been given (without proof). All these derivatives can be deduced from the first principles.

31

32

CHAPTER 1 Mathematical preliminaries

Theorem 1.4.3 The derivatives of trigonometric functions d ðsin xÞ 5 cos x dx

d ðcosec xÞ 5 2 cosec x cot x dx

d ðcos xÞ 5 2 sin x dx d ðtan xÞ 5 sec2 x dx

d ðsec xÞ 5 sec x tan x dx d ðcot xÞ 5 2 cosec 2 x dx

Theorem 1.4.4 The derivative of exponential function For the exponential function f (x) 5 e x, d x e 5 ex ; dx

or

f 0 ðxÞ 5 f ðxÞ

Theorem 1.4.5 The derivative of the natural logarithmic function If x . 0 and f (x) 5 ln x, then f 0 ðxÞ 5

d 1 ln x 5 dx x

In many cases, one has to deal with a function which is the product of two functions, quotient of two functions, or a function “within” another function. Here below are the rules of differentiation of such functions. Theorem 1.4.6 The product rule If F (x) be a product of two functions f (x) and g (x), that is F (x) 5 f (x)  g (x), then     d d d g ðxÞ 1 g ðxÞU f ðxÞ F 0 ðxÞ 5 ½f ðxÞUg ðxÞ 5 f ðxÞU dx dx dx also expressed as F 0 ðxÞ 5 f ðxÞUg0 ðxÞ 1 g ðxÞUf 0 ðxÞ

Example 1.4.3 a. f 5 constant 5 15, g 5 g (x) 5 x4, F (x) 5 f  g 5 15 x4 d d  4 d F ðxÞ 5 15 x 1 x4 ð15Þ 5 60 x3 1 0 5 60 x3 dx dx dx

(1.70)

1.4 Calculus

b. f (x) 5 x5 1 5, g (x) 5 x4 1 3 x2 1 7, F (x) 5 f (x)  g (x) 5 (x5 1 5) (x4 1 3 x2 1 7)  d  4   d  5  d F ðxÞ 5 x5 1 5 x 1 3x2 1 7 1 x4 1 3x2 1 7 x 15 dx dx dx       5 x5 1 5 4x3 1 6x 1 x4 1 3x2 1 7 5x4

c. f (x) 5 5x3, g (x) 5 sin x, F (x) 5 f (x)  g (x) 5 5x3 sin x d d d  3 F ðxÞ 5 5x3 U ðsin xÞ 1 sin xU 5x dx dx dx 5 5x3 cos x 1 15x2 sin x

Theorem 1.4.7 The quotient rule If F ðxÞ 5

Example 1.4.4:

f ðxÞ ; g ðxÞ

f ðxÞ 5 ex ; d d F ðxÞ 5 dx dx

then F 0 ðxÞ 5

g ðxÞ 5 x3 ;

g ðxÞUf 0 ðxÞ 2 f ðxÞUg0 ðxÞ ½g ðxÞ2

F ðxÞ 5

(1.71)

f ðxÞ ex 5 g ðxÞ x

 x e x3 Uex 2 ex U3x2 ex ðx 2 3Þ 5 5 3 6 x x x4

Theorem 1.4.8 The chain rule If F (x) 5 g { f (x)}, then dF dg df 5 dx df dx

(1.72)

also written as F 0 ðxÞ 5 g0 ðf ðxÞÞUf 0 ðxÞ

Example 1.4.5:

f ðxÞ 5 x2 ;

2 F ðxÞ 5 g f ðxÞ 5 ex

d d 2 2 2 F ðxÞ 5 ex 5 ex U2x 5 2x ex dx dx

Example 1.4.6: Differentiation using the quotient rule and the chain rule combined. 2

F ðxÞ 5

ex x2

33

34

CHAPTER 1 Mathematical preliminaries

d 2 d  2 2 0 1 x2 U ex 2 ex U x 2 dx dx d d @ex A 5 F ðxÞ 5 x4 dx dx x2  2  2 2 x2 U2x ex 2 ex U2x 2 ex x2 2 1 5 5 x4 x3

1.4.2 Integration An integral has been conceived as the area under a curve in an f (x) versus x plot. There are two kinds of integral—definite and indefinite. The definite integral of a function f (x) sets the limits for integration and denoted as ðb f ðxÞdx

(1.73)

a

where x 5 a is the lower limit and x 5 b is the upper limit of integration. To arrive at a formal definition of the definite integral, let us consider the graph of f (x) as in Fig. 1.8. Let the finite interval [a, b] 5 [x0, xn] be subdivided into a large number (say, n) of subintervals [xi1, xi], i 5 1, 2,. . ., n, each of width

FIGURE 1.8 Defining a definite integral by subdividing the interval [a, b].

1.4 Calculus

Δx. Let xi be any point in [xi1, xi], so that the height of the corresponding rectangle is f ðxi Þ. The total area of n subintervals is given by the Riemann sum n X

f ðxi ÞΔx

(1.74)

i51

The area under the curve f (x) on [a, b] is then given by A 5 lim

n X

n-N

f ðxi ÞΔx

(1.75)

i51

So, we have the following definition of the definite integral. Definition 1.4.4 If a function f (x) be defined on an interval [a, b], the definite integral of f from a to b is given by ðb n X f ðxÞ dx 5 lim f ðxi ÞΔx (1.76) a

n-N

i51

provided the limit exists. In such case, f (x) is said to be integrable on [a, b] and called the integrand.

The indefinite integral can be considered an inverse of the derivative—hence, antidifferentiation is the process of differentiation in reverse. Thus if f (x) is the derivative of F (x) with respect to x, that is, dF ðxÞ 5 f ðxÞ dx

(1.77)

then F (x) is the indefinite integral of f (x) expressed as ðx F ðxÞ 5

f ðxÞ dx

(1.78)

where the lower limit is arbitrary. It should be noted that the differentiation does not have a unique inverse—any function F (x) obeying (1.78) can be an indefinite integral of f (x). However, any two such integrals will differ only by an arbitrary constant. Definition 1.6.4 An indefinite integral of f (x) can be formally written as ð f ðxÞ dx 5 F ðxÞ 1 c where c is known as the constant of integration. c can be evaluated from what are known as boundary conditions.

(1.79)

35

36

CHAPTER 1 Mathematical preliminaries

Some useful rules of antidifferentiation are listed in the following theorem. Theorem 1.4.9 Rules of antidifferentiation 1. Constant rule

ð m dx 5 mx 1 c;

d ðmx 1 cÞ 5 m dx

since

2. Power rule ð xn dx 5

1 n11 x 1 c; n11

n 6¼ 2 1;

3. Natural logarithm rule ð 1 dx 5 ln x 1 c; x

d dx

since



 xn11 1 c 5 xn n11

x . 0;

since

d 1 ðln x 1 cÞ 5 dx x

α 6¼ 0;

since

d dx

4. Exponential (base e) rule ð

1 eαx dx 5 eαx 1 c; a



 1 αx e 1 c 5 eαx a

Example 1.4.7: Some integrations result in inverse trigonometric functions. ð dx x 1. pffiffiffiffiffiffiffiffiffiffiffiffiffiffi 5 sin 21 1 c 2 2 a ð a 2x dx 1 21 x 1c 2. 5 tan 2 2 a a ða 1x dx 1 x pffiffiffiffiffiffiffiffiffiffiffiffiffiffi 5 sec 21 1 c 3. a a x x 2 2 a2

Integration involving exponential functions The exponential function f (x) 5 ex is its own derivative and its own integral—this property makes it one of the most manageable functions in calculus. Exponential functions can be integrated using the following rules. ð

ex dx 5 ex 1 c ð ax dx 5

ax 1c ln a

Integration involving logarithmic functions The integrals involving logarithmic functions follow very simple rules as given next. ð

  f 0 ðxÞ 5 lnf ðxÞ 1 c f ðxÞ

1.4 Calculus

ð ln x dx 5 x ln x 2 x 1 c 5 x ðln x 2 1Þ 1 c ð loga x dx 5

If f (x) 5 x,

ð

x ðln x 2 1Þ 1 c ln a

f 0 ðxÞ dx 5 5 lnjxj 1 c f ðxÞ x

Integration by substitution Sometimes a substitution of variables can make a complicated integral relatively simpler. This is illustrated in the following example. Example 1.4.8: To evaluate the integral

Ð

2x dx ð11x2 Þ2

We can make a substitution u 5 1 1 x2, so that du 5 2x dx, and ð

2xdx ð11x2 Þ2

ð

5

du 1 52 1c u2 u

52

1 1 c ðon reverese substitutionÞ 1 1 x2

Example 1.4.9: To evaluate ð

5 dx x 2 10

Substituting u 5 x 2 10, du 5 dx. Therefore ð

ð 5 du dx 5 5 5 5 lnjuj 1 c 5 5 lnjx 2 10j 1 c x 2 10 u

x 6¼ 0

Integration by parts When a function can be considered as a product of two separate functions or parts, the integration can be carried out based on the product rule of differentiation (1.70). Integrating both sides of (1.70) with respect to x ð

ð dv du dx 1 v dx dx dx ð ð 5 u dv 1 v du

uv 5

u

37

38

CHAPTER 1 Mathematical preliminaries

So, we have the following theorem for integration. Theorem 1.4.10 The integration-by-parts rule is ð ð u dv 5 uv 2 v du

(1.80)

Ð Example 1.4.10: To evaluate x sin x dx Let u 5 x and dv/dx 5 sin x so that du/dx 5 1 and v 5 2cos x. Hence, the integral ð

ð

x sin x dx 5

ð

u dv 5 x ð 2cos xÞ 2

ð1Þ ð 2cos xÞ dx 5 2 x cos x 1 sin x 1 c

Ð Example 1.4.11: To evaluate x ln x dx Let u 5 ln x and dv 5 x dx 2 so that du 5 1x dx and v 5 x2 . Hence, the integral ð

x ln x dx 5 ln xU 5

x2 2 2

ð

ð x2  1  x2 1 dx 5 ln x 2 x dx 2 2 x 2

x2 x2 ln x 2 1 c 2 4

1.4.3 Multivariate function In (1.68), we have defined the derivative of a function one variable. This definition needs to be amended in the case of a function of two variables f (x, y). Here, we introduce the concept of a “partial derivative” and the corresponding notations @/@x and @/@y. Definition 1.6.5 The partial derivative of f (x, y) with respect to x is formally defined as   @f f ðx 1 Δx; yÞ 2 f ðx; yÞ 5 lim @x y Δx-0 Δx

(1.81)

provided that the limit exists. It can be seen that y is treated as a constant. Similarly, the partial derivative of f (x, y) with respect to y (x being held constant) can be expressed as   @f f ðx; y 1 ΔyÞ 2 f ðx; yÞ (1.82) 5 lim @y x Δy-0 Δy also contingent on the existence of a limit.

1.4 Calculus

Further extending the definitions to the general n-variable function f (x1, x2,. . ., xn) ½f ðx1 ; x2 ; . . .; xi 1 Δxi . . .; xn Þ 2 f ðx1 ; x2 ; . . .; xi ; . . .; xn Þ @f ðx1 ; x2 ; . . .; xn Þ 5 lim Δxi -0 @xi Δxi

(1.83)

provided that the limit exists. Just as we have seen the derivates of a function with multiple variables, the integral of a function with more than one variable can also be defined. Let us consider a function of two variables f (x, y) defined over the domain [a, b] 3 [c, d]. For the integral of a function of one variable, we had divided the interval [a, b] into a number of subintervals along a straight line. Here, we consider the limits of integration to be represented by a closed curve in a (two-dimensional) region R in the xy-plane and subdivide the region   into m 3 n subregions Rij (i 5 1,2,. . ., m;  j 5 1,2,. . ., n) each of area ΔA. Let xij ; yij be any arbitrary point in Rij. A double Riemann sum for f (x, y) over R is given by n X m    X f xij ; yij UΔA

(1.84)

j51 i51

which leads to the definition of the surface integral of f (x, y) over the region R. Definition 1.6.6 If a function f (x, y) be defined over a domain [a, b] 3 [c, d], then the double integral of f (x, y) over the domain [a, b] 3 [c, d] is given by ðd ðb c

f ðx; yÞ dx dy 5 lim

m;n-N

a

n X m    X f xij ; yij UΔA

(1.85)

j51 i51

provided the limit exists.

The form of the double integral in (1.85) implies that first integration is carried out with respect to x from a to b, treating y as constant, followed by integration with respect to y from c to d. However, if dx and dy, as well as the limits of integration, are interchanged, the double integral ðb ðd f ðx; yÞ dy dx a

c

produces the same result. Moreover, the double integral can also be written in any of the following forms ðd

ðb dy

c

ðb f ðx; yÞ dx

a

or

ðd dx

a

f ðx; yÞ dy c

as it is understood that each integral symbol acts on everything to its right.

39

40

CHAPTER 1 Mathematical preliminaries

Example 1.4.12: To evaluate ð2 ð1 x2 y2 dx dy 1

0

Integrating first with respect to x and then with respect to y, we have ð2

ð1 dy

1

x2 y2 dx 5

ð2 dy

0

1

 3 2 1 ð 2  2   3 2 x y y y 7 dy 5 5 5 3 0 3 9 1 9 1

If the order of integration is interchanged, we obtain ð1

ð2 dx

0

x2 y2 dy 5

1

ð1 dx 0

 2 3 2 ð 1  2   3 1 x y 7x 7x 7 dx 5 5 5 9 3 1 3 9 0 0

1.5 Series and limits There are several examples in physical sciences where the function describing a system is expressed as a series. This facilitates the solution of a relatively complicated function by means of approximation. A series may contain a finite or infinite number of terms. If a series contains an infinite sequence of terms u1, u2, u3,. . ., the ith partial sum is defined as si 5

i X

un

(1.86)

n51

If the partial sum si converges to a limit as i-N, that is, lim si 5 S

i-N

the infinite series N X

un

n51

is said to be convergent and the sum is given by the limit S. In general, the terms of a series can be complex numbers. In such case the partial sum Si will also be complex and can be expressed as Si 5 Xi 1 iYi

(1.87)

where Xi and Yi are the partial sums of the real and imaginary terms separately. Xi and Yi are real. In a certain type of series, each term depends on a variable, say x, and the series assumes the general form P ðxÞ 5 a0 1 a1 x 1 a2 x2 1 a3 x3 1 . . .

(1.88)

1.5 Series and limits

where a0, a1, a2, etc. are constants. This type of series, occurring frequently in physical problems, is called a power series.

1.5.1 Taylor series Using Taylor’s theorem, a function can be expressed as a power series in x, known as a Taylor series. The theorem, and hence the series, is applicable for functions which are continuous and differentiable within the domain of interest. Theorem 1.5.1 If f (x) has n 1 1 continuous derivatives on an interval containing a, then for any x in the interval " # n X f ðkÞ ðaÞ ðx2aÞk 1 Rn11 ðxÞ (1.89) f ðxÞ 5 k! k50 where the error term Rn11 ðxÞ 5

f ðn11Þ ðcÞ ðx2aÞn11 ðn 1 1Þ!

for some c between a and x.

If lim Rn ðxÞ 5 0

n-N

the infinite Taylor series converges to f ðxÞ 5

~ X f ðkÞ ðaÞ k50

k!

5 f ðaÞ 1

ðx2aÞk (1.90)

f 0 ðaÞ f v ðaÞ ðx 2 aÞ 1 ðx2aÞ2 1 ? 1! 2!

Example 1.5.1: Let us obtain the Taylor series of f ðxÞ 5

1 12x

about x 5 2. We calculate f 0 ðxÞ 5

1 2 6 ; f v ðxÞ 5 ; f 0 v ðxÞ 5 ;...; f 2 3 ð12xÞ ð12xÞ ð12xÞ4

ðkÞ

ðxÞ 5

k! ð12xÞk11

so that f 0 ð2Þ 5 1;

f v ð2Þ 5 2 2;

f 0 v ð2Þ 5 6; . . . ;

f ðkÞ ð2Þ 5 2 1k11 k!

and f ðxÞ 5

1 5 2 1 1 ðx 2 2Þ 2 ðx22Þ2 1 ðx22Þ3 1 ? 1 ð21Þk11 ðx22Þk 1 ? 12x

41

42

CHAPTER 1 Mathematical preliminaries

If the reference point is set to be zero in (1.90) the expansion becomes a Maclaurin series f ðxÞ 5 f ð0Þ 1 x f 0 ð0Þ 1

N k X x2 x ðk Þ f v ð0Þ 1 ? 5 f ð0Þ 2! k! k50

(1.91)

Eq. (1.91) can be used to expand various functions such as exponential, logarithmic, and trigonometric into infinite (power) series. For an exponential function, we have ex 5 1 1 x 1

N k X x2 x3 x 1 1?5 2! 3! k! k50

(1.92)

The Maclaurin series for the trigonometric functions sin x and cos x are sin x 5 x 2

x3 x5 x2n11 1 ? 1 ð21Þn 1? 3! 5! ð2n 1 1Þ!

(1.93)

x2 x4 x2n 1 ? 1 ð21Þn 1? 2! 4! 2n!

(1.94)

cos x 5 1 2

If x 5 iφ, where φ is real, then φ2 iφ3 φ4 iφ5 2 1 1 2? 2! 3! 4! 5! 0 1 0 1 2 4 3 5 φ φ φ φ 5 @1 2 1 2 ?A 1 i @ φ 2 1 2 ?A 2! 4! 3! 5!

eiφ 5 1 1 iφ 2

(1.95)

That is, eiφ 5 cos φ 1 i sin φ

(1.96)

We cannot find a Maclaurin expansion for ln x since the function does not exist at x 5 0. However, it is possible to obtain the Maclaurin expansion of ln (1 1 x), as shown in the following example. Example 1.5.2: To find the Maclaurin expansion of f (x) 5 ln (1 1 x). We have f 0 ðxÞ 5

ð21Þk11 ðk 2 1Þ! 1 1 2 ; f v ðxÞ 5 2 ; f 0 v ðxÞ 5 ; ?; f k ðxÞ 5 2 3 11x ð11xÞ ð11xÞ ð11xÞk

so that f ðxÞ 5 ln ð1 1 xÞ 5 x 2

x2 x3 ð21Þk11 k 1 1?1 x 1? 2 3 k

1.5.2 Fourier series Besides power series, a function may also be represented by a sum of sine and cosine terms. Such a representation is known as a Fourier series. One of the

1.5 Series and limits

conditions a function has to fulfill to be expanded as a Fourier series is that it must be periodic. Let f (x) be a 2 L-periodic function defined in the interval [ 2 L, L] and outside the interval by f (x 1 2 L) 5 f (x). The Fourier series or Fourier expansion is then given by f ðxÞ 5

N  X a0 sπx sπx 1 bs sin 1 as cos L L 2 s51

(1.97)

where the Fourier coefficients as and bs are given by as 5

1 L

bs 5

1 L

ðL f ðxÞ cos

sπx dx L

s 5 0; 1; 2; . . .

(1.98a)

f ðxÞ sin

sπx dx L

s 5 0; 1; 2; . . .

(1.98b)

2L

ðL

2L

The following orthogonality property of the sinusoidal functions makes calculation of Fourier expansion possible ðL cos

sπx

dx 5 0 5

ðL sin

sπx dx L

L 2L  sπx rπx L s5r cos dx 5 cos 0 s 6¼ r L L 2L ðL sπx rπx cos dx 5 0 sin L L 2L  ðL sπx rπx L s5r sin dx 5 sin 0 s 6¼ r L L 2L 2L

ðL

The Fourier series in (1.97) can also be written in a complex form as N X

f ðxÞ 5

cs eiπsx=L

(1.99)

f ðxÞ e2iπsx=L dx

(1.100)

s52N

where cs 5

1 2L

ðL 2L

Example 1.5.3: Let us expand f (x) 5 x2, 0 , x , 2π, in a Fourier series if the period is 2π. We can calculate the Fourier coefficients from (1.98a,b) as follows. Here the period is 2 L 5 2π, hence L 5 π. If s 5 0, a0 5

1 π

ð 2π 0

x2 dx 5

8π2 3

43

44

CHAPTER 1 Mathematical preliminaries

and further, using the orthogonality relations, as 5

1 L

ð 2L f ðxÞ cos 0

sπx 1 dx 5 L π

ð 2π x2 cos s x dx 0

2 32π  2cos s x   2sin s x  1 4  2  sin s x  5 5 4; x 2ð2xÞ 12 5 π s s2 s3 s2

s 6¼ 0

0

Similarly, we can calculate bs 5 2

4π s

Thus we have the Fourier expansion f ðxÞ 5 x2 5

 N  X 4π2 4 4π sin s x 1 cos s x 2 s2 s 3 s51

Exercise 1 1. Find u  v for the following vectors. a. u 5 (2, 2 6, 0, 3), v 5 ( 2 4, 1, 4, 3) b. u 5 ( 2 3, 6, 0), v 5 (2, 2 4, 1) c. u 5 (5, 7), v 5 ( 2 2, 4) d. u 5 (2, 2 4, 2), v 5 (1, 1, 1) 2. Calculate the cross product w 5 u 3 v for the following vectors and show that for each case w is perpendicular to both u and v. a. u 5 (1, 2, 2 -2), v 5 (3, 0, 1) b. u 5 ( 2 6, 4, 2), v 5 (3, 1, 5) c. u 5 (2, 2 1, 4), v 5 (1, 3, 7) 3. Solve the following systems of equations using augmented matrices a. 3x 1 4y 1 z 5 1 2x 1 3y 5 0 4x 1 3y 2 z 5 2 2 b. x 1 y 1 2z 5 9 2x 1 4y 2 3z 5 1 3x 1 6y 2 5z 5 0  2 4 22 4. If A 5 ; show that ATA and AAT are both symmetric but ATA 3 5 1 6¼ A AT. 5. Find2 if the following matrices are invertible 3 3 24 22 a. 4 0 1 1 5 26 7 5

Further reading

2

3 22 3 6 21 5 2 1 3 2 22 5 c. 4 23 1 1 5 2 7 24 Find the derivatives with respect to x of the following functions: a. f (x) 5 2 x b. f (x) 5 x2 e x c. f (x) 5 x2 ln x d. f ðxÞ 5 lnx3x e. f ðxÞ 5 sinx x Evaluate the following integrals Ð1 a. 0 xex dx Ð2 b. 1 x2 ln x dx Ð ln x c. Ð x dx 2 d. Ð 2xex dx e. x sin x dx Evaluate Ð 3 Ð 2 the double integrals a. 21 1 x2 y dx dy Ð1 Ðx b. 0 x2 x y2 dx dy Ð1 Ð1 c. 0 0 x2 ey dx dy Obtain the Taylor series expansions of f ðxÞ 5 1x about x 5 1 and x 5 3. Find the Maclaurin expansion of f (x) 5 cos2 x up to powers of x4. Compute the Fourier series for a. f ðxÞ 5 x; -π # x # π: 0 22 , x , 0 b. f ðxÞ 5 period 5 4 2 0,x,2 1 b. 4 5 3 2

6.

7.

8.

9. 10. 11.

Further reading Anton, H., Rorres, C., 2014. Elementary Linear Algebra, 11th (ed.) Wiley. Arfken, G.B., Weber, H.J., Harris, F.E., 2013. Mathematical Methods for Physicists: A Comprehensive Guide, 7th (ed.) Elsevier Academic Press. Bittinger, M.L., Ellenbogen, D.J., Surgent, S.A., 2012. Calculus and Its Applications, 10th (ed.) Addison-Wesley. Nicholson, W.K., 2021. Linear Algebra with Applications, Open Edition. Creative Common License (CC BY-NC-SA). Riley, K.F., Hobson, M.P., Bence, S.J., 2006. Mathematical methods for physics and engineering, 3rd (ed.) Cambridge University Press, Cambridge, U.K.

45

This page intentionally left blank

CHAPTER

Vector spaces and matrices

2

In Chapter 1, we have looked at vectors from three different perspectives. First, they were viewed as directed line segments (arrows). Next, vectors were considered as ordered pairs and ordered triple of real numbers, which could be visualized in respectively, two-dimensional and three-dimensional rectangular coordinate systems and extending the notion further by going beyond our visual experience, as n-tuples of real numbers. Besides, we have also characterized vectors as “special objects” that can be added together or multiplied by scalars to produce “objects of the same kind.” In this chapter, the notion will be formalized by introducing vector space, a “structured space” where vectors reside. “Vector spaces” and “linear spaces” are considered to be synonymous.

2.1 Linear systems In real world, there are very few physical elements that display truly linear characteristics. Nonetheless, as nonlinear dynamical equations are relatively difficult to solve, nonlinear systems are usually approximated by linear equations (linearization). With a large number of mathematical tools available for analysis, linear systems have found useful applications across a wide range of disciplines, including molecular structural biology. “Linearity” is closely related to “proportionality”; it is the property of a mathematical relationship, say, a function f (x) where x A ℝ, that can be graphically represented by a straight line. Generalized for n dimension, linearity requires the property of a function being compatible with addition and scaling (superposition principle), that is, f ðx 1 yÞ 5 f ðxÞ 1 f ðyÞ

and f ðλ xÞ 5 λ f ðxÞ

where λ is a real or complex number, x and y can be elements of an n-dimensional vector space. Example 2.1.1: Let us consider a differential equation which has relevance in many oscillating systems. L ðxÞ 5 0 Mathematical Approaches to Molecular Structural Biology. DOI: https://doi.org/10.1016/B978-0-323-90397-4.00002-0 © 2023 Elsevier Inc. All rights reserved.

(2.1)

47

48

CHAPTER 2 Vector spaces and matrices

where L5a

d2 d 1b 1c dt dt2

(2.2)

a, b, and c are constants. Looking at each term on the right-hand side, we find c ðx 1 yÞ 5 cx 1 cy; b

d dx dy d2 d2 x d2 y ðx 1 yÞ 5 b 1 b and a 2 ðx 1 yÞ 5 a 2 1 a 2 dt dt dt dt dt dt

so that, L ðx 1 yÞ 5 L ðxÞ 1 L ðyÞ

(2.3)

L ðkxÞ 5 kL ðxÞ

(2.4)

Also,

where k is a constant. Hence, (2.1) is a linear differential equation representing a linear oscillating system.

Exercises 2.1 1. State, giving reasons, if the following systems are linear/nonlinear. a. f (x) 5 k2x b. f (x) 5 kx2 2. In the following differential equations, x is the independent variable and y is the dependent variable. Identify which of the equations are nonlinear. 2 a. ddt2y 1 y 5 0 2

b. y ddt2y 1 y 5 0 2

c. x ddt2y 1 y 5 0 2

d. x2 ddxy2 1 x dy dx 5 sin x 3. In the following partial differential equations, u is the dependent variable and x and y are independent variables. Comment on their linearity/nonlinearity. 2 a. @@xu2 1 sin y 5 0 @u 2 b. x @u @x 1 y @y 1 u 5 0 4. Are the following equations governing the motion of a pendulum linear/ nonlinear? 2 a. ddt2θ 1 gl sin θ 5 0 2 b. ddt2θ 1 gl θ 5 0

2.2 Sets and subsets As we proceed to define a vector space, it will be useful to have some basic notion of “sets” and “subsets”—a notion which is considered to be somewhat “primitive”

2.2 Sets and subsets

or “basic” in modern mathematics. Therefore neither a set nor its content is defined, they are to be understood intuitively. Yet, even for the sake of intuitive understanding, one needs at least an “informal” description of a set and its properties.

2.2.1 Set Stating simply, a set is a collection of definite, distinct objects which are called members or elements of the set. Some simple examples of a set would be a set of nucleotides, a set of acidic amino acids, etc. There is precisely one set with no members at all—it is called the empty set or null set. A set with just one member is called a singleton or a singleton set.

Some relevant notations Usually sets are denoted by capital letters A, B, C, . . . . . . and the members of a set by small letters a, b, c, . . . . . . If x is an element of set X, we write x A X. If set X is a member of set Y, we write X A Y. If z is not a member of set X, we write z 2 = X. A set is generally specified by using curly braces. The same set can be described in three different ways: a. by listing all its members, S 5 {2, 4, 6, 8} b. by stating a property of its members S 5 {x | x is a natural number between 1 and 9 that is divisible by 2} or, S 5 {y | y is an even number between 1 and 9} c. defining a set of recursive rules to identify its members (i) 2 A S (ii) if x A S, then x 1 2 A S, (x , 10) (iii) nothing else belongs to S. Example 2.2.1: Some frequently used sets: 1. 2. 3. 4. 5. 6.

ℕ 5 {0, 1, 2. 3, . . . . . . .} the set of all natural numbers N1 5 {n A N: n . 0} ℤ 5 {0, 21, 11, 22, 12, . . . . . . . . .} the set of all integers ℝ 5 the set of all real numbers ℝn 5 the set of all n-dimensional vectors of real numbers ℂ 5 the set of all complex numbers Two sets S and T are equal if they have exactly the same members. For example, S 5 f5; 9; 3; 8g and T 5 f3; 9; 5; 8g

The order of listing of the members does not matter.

49

50

CHAPTER 2 Vector spaces and matrices

Venn diagrams. Sets and relationship between them are graphically represented by Venn diagrams. A set is shown as a circle or oval (Fig. 2.1) and its members are represented as objects included within the circle (or oval). Examples 2.2.2: N 5 {0, 1, 2, 3, 4, 5, 6, 7, 8, 9}

V 5 {a, e, i, o, u}.

Union. There are three operations, each of which takes two sets and return a new set—(a) union, (b) intersection, and (c) set-theoretic difference. The union of two sets S and T is the collection of all elements which belong to either S or T, or to both S and T (Fig. 2.2). Symbolically, S , T 5 fx: ðxASÞ or ðxATÞg 5 x: ðxASÞ 3 ðxATÞ

Example 2.2.3: If S 5 {2, 3, 9, 5, 6, 1} and T 5 {5, 4, 8, 3, 7}, then S , T 5 f1; 2; 3; 4; 5; 6; 7; 8; 9g

FIGURE 2.1 Sets N and V as described in the text.

FIGURE 2.2 Union of two sets S and T.

2.2 Sets and subsets

FIGURE 2.3 Intersection of two sets S and T (shaded).

Intersection. The intersection of two sets S and T is the collection of all elements which belong to both S and T (Fig. 2.3). Symbolically, S - T 5 fx: ðxASÞ and ðxAT Þg 5 x: ðxASÞ X ðxAT Þ

Example 2.2.4: If S 5 {2, 3, 9, 5, 6, 1} and T 5 {5, 4, 8, 3, 7}, as above, then S - T 5 f3; 5g

Disjoint sets. If two sets have no elements in common, they are said to be disjoint. That is, S and T are disjoint if and only if S - T 5 [ Set-theoretic difference. This operation subtracts the elements of one set from another. The difference between two sets S and T is the collection of elements in S that are not in T (Fig. 2.4). It is expressed as   S 2 T 5 x: ðxASÞ X x 2 =T

Example 2.2.5: If S 5 {2, 3, 9, 5, 6, 1} and T 5 {5, 4, 8, 3, 7}, as above, then S 2 T 5 f1; 2; 6; 9g

2.2.2 Subset For two sets A and B, it is said that A is a subset of B if and only if every element of A is also an element of B. The relation between two such sets is denoted by A D B. If A D B and A 6¼ B, we call A a proper subset of B and denote the relation by A C B.

51

52

CHAPTER 2 Vector spaces and matrices

FIGURE 2.4 Set-theoretic difference between S and T (shaded).

FIGURE 2.5 A C B (shaded).

It can be proved that two sets S and T are equal if and only if each is a subset of the other. Symbolically, S 5 T3 ðSDTÞ X ðTDSÞ

Example 2.2.6: If A 5 {1, 3, 5, 7, 9} and B 5 {1, 2, 3, 4, 5, 6, 7, 8, 9} then A C B (Fig. 2.5). Example 2.2.7: If T 5 {x, y, z}, then T has eight different subsets. [ fxg x; y y; z

y fz; xg

f zg x; y; z

The empty set [ is a subset of every set. In fact, it can be shown that a set S with n elements has 2n subsets.

2.3 Vector spaces and subspaces

Exercise 2.2 1. If S is a set of vectors A ℝ4, identify which of the following vectors are not members of S. ð3; 7; 2 2; 0; 1Þ; ð0; 3; 2 2; 2Þ; ð0; 0; 0; 0Þ; ð7; 2 5; 2; 2 1Þ; ð0; 0; 0Þ; ð1; 2; 2 1; 0; 0Þ; ð0; 0; 0; 1Þ

2. S is a set of polynomials P (x) having a root x 5 1. Identify which of the following polynomials belong to the set. a. 2 2x23x 1 5 b. 2x27x 1 1 c. 2 2x2 1 3x 1 5 d. x6 1 4x35x e. x5 1 4x35 3. S is the set of all points (x, y) in ℝ2 and T is a set of all points for which x $ 0 and y # 0. Is S a subset of T ? Is T a subset of ℝ2? 4. S 5 {all points in the xy-plane} T 5 {all points in the yz-plane} What is S - T ? 5. S is a set of k number of vectors in ℝn. How many subsets does S have?

2.3 Vector spaces and subspaces 2.3.1 Vector space The formal definition of a vector space stipulates the relationship between the following: (a) two sets—V, a nonempty set of vectors, and F, a scalar field (either a field ℝ of real numbers or a field ℂ of complex numbers), and (b) two algebraic operations— addition and scalar multiplication. In a general notion of a vector space, scalars can come from a mathematical structure known as a “field.” They can be either real numbers or complex numbers—vector spaces with real scalars are called real vector spaces and those with complex scalars are called complex vector spaces. Definition 2.3.1 Axiomatic definition of a vector space. V is called a vector space over F, if the following axioms are satisfied: 1. 2. 3. 4. 5.

x 1 y A V for all x, y A V (closure property for vector addition). (x 1 y) 1 z 5 x 1 (y 1 z) for every x, y, z A V. x 1 y 5 y 1 x for every x, y A V. There is an element 0 A V, called zero vector, such that x 1 0 5 x for every x A V. For every x A V, there is an element (2x) A V such that x 1 (2x) 5 0 (Continued)

53

54

CHAPTER 2 Vector spaces and matrices

6. 7. 8. 9. 10.

k x A V for all k A F and x A V (closure property for scalar multiplication). k (x 1 y) 5 k x 1 k y for every k A F and all x, y A V. (k 1 m) x 5 k x 1 m x for all k, m A F and every x A V. (k m) x 5 k (m x) for all k, m A F and every x A V I x 5 x for every x A V.

Here are some examples of vector spaces. Example 2.3.1: If V contains a single element denoted by 0 and defined by 0 1 0 5 0 and λ 0 5 0

for all scalars λ, it is called the zero vector space. Example 2.3.2: Real vector space ℝn. For u, v A ℝn and λ A ℝ u 1 v 5 ðu1 ; u2 ; . . . . . . : ; un Þ 1 ðv1 ; v2 ; . . . . . . ; vn Þ 5 ðu1 1 v1 ; u2 1 v2 ; . . . . . . : : ; un 1 vn Þ A ℝn

and λ u 5 ðλ u1 ; λ u2 ; . . . . . . : ; λ un Þ A ℝn

that is, the set V 5 ℝ is closed under addition and scalar multiplication. ℝn is a finite-dimensional vector space. n

Example 2.3.3: Infinite-dimensional vector space. Let u and v be infinite sequences of real numbers. u 5 ðu1 ; u2 ; . . . . . . ; un ; . . . . . .Þ and v 5 ðv1 ; v2 ; . . . . . . ; vn ; . . . . . .Þ

For u, v A V, and λ A ℝ, u 1 v 5 ðu1 ; u2 ; . . . . . . ; un ; . . .Þ 1 ðv1 ; v2 ; . . . . . . ; vn ; . . .Þ 5 ðu1 1 v1 ; u2 1 v2 ; . . . . . . ; un 1 vn ; . . .Þ AV

and λ u 5 ðλ u1 ; λ u2 ; . . . . . . : ; λ un ; . . . . . .ÞAV

The set V containing objects which are infinite sequences of real numbers is denoted by ℝ ~ . Example 2.3.4: Vector space of polynomials. Let us consider a polynomial P ðxÞ 5 a0 1 a1 x 1 a2 x2 1 ?? 1 an xn

where a0, a1, a2, . . . . . ., an are real numbers called the coefficients of the polynomial. If all the coefficients are zero, it is called a zero polynomial. Now, for n $ 1, let Pn denote the set of all polynomials of degree # n. It can be seen that the sums and scalar multiple of all polynomials in Pn are again in Pn. Then, Pn, together with the zero polynomial, constitutes a vector space.

2.3 Vector spaces and subspaces

Vector space of m 3 n matrices

For matrices A, B A ℝm 3 n, m, n A ℕ, addition:

A1B5

2

3 2 3 2 b11 ? b1n a11 1 b11 a11 ? a1n 4 ^ & ^514 ^ & ^554 ^ am1 ? amn bm1 ? bmn am1 1 bm1

3 ? a1n 1 b1n & ^ 5A ℝ m 3 n ? amn 1 bmn

and 2

3 2 3 a11 ? a1n λa11 ? λa1n multiplication by a scalar λ: λ A 5 λ4 ^ & ^554 ^ & ^ 5A ℝm 3 n am1 ? amn λam1 ? λamn

that is, the vector space V 5 ℝm 3 n is closed under addition (axiom 1) and scalar multiplication (axiom 6). It can be shown that the other axioms 25 and 710 are also valid for V 5 ℝm 3 n. The vector space of 2 3 2 matrices, which is easier to numerically illustrate, is a special case of V 5 ℝm 3 n where m 5 n 5 2. It is to be noted that V must include  022 5

0 0 0 0



2.3.2 Vector subspaces Vector subspaces are sets within the original vector space with the property that when vector space operations (addition and scalar multiplication) are performed within an individual subspace, we never leave it, that is, the “closure conditions” apply. Vector subspaces have been found to be useful for dimensionality reduction. Definition 2.3.2 Formal definition of vector subspace. Let us consider a nonempty subset S of a vector space V over scalar field F. If S is also a vector space over F under addition and multiplication defined on V, then S is said to be a subspace of V. To rephrase, S D V, S 6¼ Ø, is a subspace of V if and only if (a) x, y A S . x 1 y A S (b) x A S .λ x A S for all λ A F.

Here are some examples of vector subspaces. Example 2.3.5: Trivial subspace. If in a vector space, S 5 {0} is the subset that contains the zero vector only, S is the zero subspace of V since

55

56

CHAPTER 2 Vector spaces and matrices

0 1 0 5 0 and λ 0 5 0 for any scalar λ S is also called a trivial subspace of V. Similarly, V is a trivial subspace of itself. Example 2.3.6: Subspaces of ℝ2. If S is a straight line through the origin in ℝ2, then adding two vectors (say, x and y) on the line or multiplying a vector on the line by a scalar produces a vector on the same line (Fig. 2.6), that is S is closed under addition and scalar multiplication. Therefore straight lines passing through the origin are subspaces of ℝ2. However, straight lines not passing through the origin cannot be subspaces as they do not contain the zero vector. Example 2.3.7: Subspaces of ℝ3. As in ℝ2, in ℝ3 straight lines passing through the origin are subspaces. In addition, any plane P passing through the origin is also a subspace of ℝ3 (Fig. 2.7).

FIGURE 2.6 (A) S is closed under addition and (B) S is closed under scalar multiplication.

FIGURE 2.7 P is a subspace of ℝ3.

2.3 Vector spaces and subspaces

Example 2.3.8: ℝ2 not a subspace of ℝ3. The vectors in ℝ3 have three entries whereas the vectors in ℝ2 have only two. Hence ℝ2 is not a subspace of ℝ3. However, the set 82 3 9 < s = W 5 4 t 5: s; t A R : ; 0

which appears to be equivalent to a plane passing through the origin is actually a subset of ℝ3. It can be seen that the zero vector is in W. Besides, W is closed under both addition and scalar multiplication since these operations on vectors in W invariably produce vectors, the third entries of which are zero. Hence, W is a subspace of ℝ3. Example 2.3.9: A subset that is not a subspace. In ℝ2 consider the subset W5

   s : s $ 0; t $ 0 t

The zero vector is in W, and W is closed under vector addition. However, for a vector (say) v 5 (1, 1) A W, (21) v 5 (21, 21) 2 = W. That is, W is not closed under scalar multiplication. Hence, W is not a subspace of ℝ2.

2.3.3 Null space/row space/column space Let us consider a matrix A A ℝm 3 n 2

3 a11 a12 UUUUUUUUUU a1n 6 a21 a22 UUUUUUUUUU a2n 7 6 7 A56 UUUUUUUUUUUUUUUUUUUU 7 6 7 4 UUUUUUUUUUUUUUUUUUUU 5 am1 am2 : : : : amn

If we write r1 5 ½a11

a12 ?? a1n 

r2 5 ½a21

a22 ?? a2n 

UUUUUUUUUUUUUU UUUUUUUUUUUUUU rm 5 ½am1

am2 ?? amn 

then r1, r2, . . . . . . ., rm A ℝn, formed by the rows of A, are called the row vectors of A.

57

58

CHAPTER 2 Vector spaces and matrices

Similarly, if we write 2

2 2 3 3 3 a11 a12 a1n 6 a21 7 6 a22 7 6 a2n 7 6 6 6 7 7 7 7 c2 5 6 U 7; . . . . . . ; cn 5 6 U 7 c1 5 6 U 6 6 6 7 7 7 4 U5 4 U5 4 U5 am1 am2 amn

then c1, c2, . . . . . . , cn A ℝm, formed by the columns of A, are called the column vectors of A. Example 2.3.10: . Let us consider a 4 3 3 matrix 2

4 6 21 A¼6 4 5 1

0 3 23 4

3 2 17 7 25 0

The row vectors of A are r1 5 ½4 0 2; r2 5 ½ 21 3 1; r3 5 ½5 2 3 2; and r4 5 ½1 4 0

and the column vectors of A are 3 3 2 2 3 4 0 2 6 217 6 37 617 7 7 6 6 7 c1 5 6 4 5 5; c2 5 4 2 3 5; c3 5 4 2 5 1 4 0 2

We can now define three important vector spaces associated with a matrix. Definition 2.3.3. For a matrix A A ℝm 3 n, the subspace of ℝn spanned by the row vectors is called its row space, while the subspace of ℝm spanned by the column vectors is called its column space. Further, the solution space of the homogeneous system of equations Ax 5 0 is called the null space of A. The null space contains all possible linear combinations of the elements in ℝn that produce 0 A ℝm.

Evidently, the product Ax can be expressed as a linear combination of the column vectors Ax 5 x1 c1 1 x2 c2 1 ?? 1 xn cn

(2.5)

so that a linear system of m equations of n unknows can be written as Ax 5 x1 c1 1 x2 c2 1 ?? 1 xn cn 5 b

Therefore, b is in the column space of A. Let us illustrate this in the following example.

(2.6)

Exercise 2.3

Example 2.3.11: Let Ax 5 b be the linear system 2

2 4 21 3

3 32 3 2 16 5 x1 2 2 54 x2 5 5 4 2 6 5 4 x3 21

23 2 1

and let us determine if b is in the column space of A. It has to be shown that b can be expressed as a linear combination of the column vectors of A. Now, on solving the system it can be found that x1 5 2; x2 5 1; x3 5 3

With these values we have 3 2 3 2 2 3 3 16 5 23 2 Ax 5 24 2 1 5 1 14 2 5 1 34 2 2 5 5 4 2 6 5 5 b 4 21 1 3 2

The above example illustrates a specific case where a vector can be expressed as a linear combination of a set of three other vectors. In the following section the generalized concept of linear combination and linear independence of vectors will be discussed.

Exercise 2.3 1. State, giving reasons, if each of the following sets equipped with the given operations is a vector space. a. The set of all real numbers with standard operations of addition and multiplication. b. The set of all pairs of real numbers (m, n) where m $ 0, with the standard operations on ℝ2. c. The set of all 2 3 2 matrices of the form 

λ1 0 0 λ2



2. Are the following sets, along with given operations, vector spaces? Give reasons for your answer. a. The set of all polynomials of degree $ 4 together with 0; operations of P. b. The set of 1, y, y2, . . . . . . . . .; operations of P. 3. Is the straight line y 5 5x 1 3 a subspace of ℝ2? 4. State, giving reasons, if U 5 {x f (x) | f (x) in P3} a subspace of P3. 5. Are the following subspaces of ℝ3? a. All vectors of the form (u1, u2, u3), where u2 5 u1 1 u3. b. All vectors of the form (x1, x2, 0). 6. A set of 2 3 2 invertible matrices is not a subspace of M22—explain.


2.4 Linear combination/linear independence

Now that we have some preliminary idea about vector spaces, we can expect that it should be possible to find in a vector space a set of vectors with which every vector can be represented by adding them together and scaling them. In order to do so it is necessary to introduce the concepts of linear combination and linear independence. In a rectangular xy-coordinate system $i = (1, 0)$ and $j = (0, 1)$ are the standard unit vectors. A vector $v = (v_1, v_2)$ in the plane can be expressed as

$$v(v_1, v_2) = a_1 i + a_2 j = a_1(1, 0) + a_2(0, 1) \tag{2.7}$$

which can also be written as

$$a_1 i + a_2 j - v(v_1, v_2) = 0$$

where $a_1 = v_1$ and $a_2 = v_2$. However, the equation

$$a_1 i + a_2 j = 0$$

has only the trivial solution, that is, $a_1 = a_2 = 0$, since $i$ and $j$ are independent of each other (one cannot be expressed in terms of the other). On the other hand, $v$ is dependent on $i$ and $j$ and, as (2.7) shows, it can be expressed as a linear combination of $i$ and $j$. It is to be noted that a pair of vectors is linearly dependent if and only if one is a scalar multiple of the other. Similarly, a vector $v \in \mathbb{R}^3$ can be expressed as a linear combination of $i = (1, 0, 0)$, $j = (0, 1, 0)$, and $k = (0, 0, 1)$:

$$v(v_1, v_2, v_3) = a_1 i + a_2 j + a_3 k = a_1(1, 0, 0) + a_2(0, 1, 0) + a_3(0, 0, 1) \tag{2.8}$$

and the equation

$$a_1 i + a_2 j + a_3 k = 0$$

has only the trivial solution,

$$a_1 = a_2 = a_3 = 0$$

Generalized concept

A vector $w \in V$ is said to be a linear combination of the vectors $w_1, w_2, \ldots, w_r \in V$ if it can be expressed in the form

$$w = \lambda_1 w_1 + \lambda_2 w_2 + \cdots + \lambda_r w_r = \sum_{i=1}^{r}\lambda_i w_i \tag{2.9}$$

where $\lambda_1, \lambda_2, \ldots, \lambda_r \in \mathbb{R}$. These scalars are called the coefficients of the linear combination.

If $S = \{w_1, w_2, \ldots, w_r\}$ is a nonempty set of vectors in a vector space $V$, it can be proved that the set $W$ of all possible linear combinations of the vectors in $S$ is a subspace of $V$. Further, the subspace $W$ of $V$, which contains all possible linear combinations of the vectors in $S$, is called the subspace of $V$ generated by $S$ and denoted as

$$W = \mathrm{span}\{w_1, w_2, \ldots, w_r\}\quad\text{or}\quad W = \mathrm{span}(S)$$

Then, $w \in \mathrm{span}\{w_1, w_2, \ldots, w_r\}$.

A set $S = \{w_1, w_2, \ldots, w_r\}$ of two or more vectors in a vector space $V$ is said to be a linearly independent set if no vector in $S$ can be expressed as a linear combination of the others. On the other hand, the set $S$ is linearly dependent if at least one member of the set can be expressed as a linear combination of the rest of the set. The above definition of linear dependence/independence can be restated as follows:

Definition 2.4.1 A set of vectors $S = \{w_1, w_2, \ldots, w_r\} \subset V$ is said to be linearly dependent if there is a nontrivial linear combination such that

$$\sum_{i=1}^{r}\lambda_i w_i = 0 \tag{2.10}$$

with at least one $\lambda_i \ne 0$. If only the trivial solution exists, that is, $\lambda_1 = \lambda_2 = \cdots = \lambda_r = 0$, the set $S = \{w_1, w_2, \ldots, w_r\}$ is said to be linearly independent.

In Chapter 1 (Section 1.5), we have seen that for an $n \times n$ invertible matrix $A$, the homogeneous system $Ax = 0$ has only the trivial solution $x = 0$. Now, if $c_1, c_2, \ldots, c_n$ are the columns of $A$ and we write

$$x = \begin{bmatrix} x_1\\ x_2\\ \vdots\\ x_n \end{bmatrix}$$

then, in the sense of (2.10),

$$Ax = x_1 c_1 + x_2 c_2 + \cdots + x_n c_n = \sum_{i=1}^{n}x_i c_i = 0$$

and we have the following theorem.

Theorem 2.4.1 The columns $\{c_1, c_2, \ldots, c_n\}$ of a matrix $A$ form a linearly independent set if and only if $Ax = 0$ implies $x = 0$.

Hence, it can be said that the columns of an invertible matrix $A$ are linearly independent. Similarly, it can be shown that the rows of an invertible matrix $A$ are linearly independent. It is to be noted that while invertibility relates only to square matrices, Theorem 2.4.1 holds for any $m \times n$ matrix.

Example 2.4.1: Consider vectors $u = (2, -1, 2, 1)$ and $v = (3, 4, -1, 1) \in \mathbb{R}^4$. Let us determine if each of the vectors $w = (0, -11, 8, 1)$ and $w' = (2, 3, 1, 2) \in \mathbb{R}^4$ can be expressed as a linear combination of $u$ and $v$.

Solution. For $w$ to be a linear combination of $u$ and $v$, there must be scalars $k_1$ and $k_2$ such that

$$w = k_1 u + k_2 v$$

that is,

$$\begin{bmatrix} 0\\ -11\\ 8\\ 1 \end{bmatrix} = k_1\begin{bmatrix} 2\\ -1\\ 2\\ 1 \end{bmatrix} + k_2\begin{bmatrix} 3\\ 4\\ -1\\ 1 \end{bmatrix}$$

Equating corresponding components gives the equations

$$\begin{aligned} 2k_1 + 3k_2 &= 0\\ -k_1 + 4k_2 &= -11\\ 2k_1 - k_2 &= 8\\ k_1 + k_2 &= 1 \end{aligned}$$

This linear system has a solution: $k_1 = 3$, $k_2 = -2$, so that

$$w = 3u - 2v$$

Similarly, for $w'$ to be a linear combination of $u$ and $v$, there must be scalars $k_1$ and $k_2$ such that

$$w' = k_1 u + k_2 v$$

that is,

$$\begin{bmatrix} 2\\ 3\\ 1\\ 2 \end{bmatrix} = k_1\begin{bmatrix} 2\\ -1\\ 2\\ 1 \end{bmatrix} + k_2\begin{bmatrix} 3\\ 4\\ -1\\ 1 \end{bmatrix}$$

Equating corresponding components gives the equations

$$\begin{aligned} 2k_1 + 3k_2 &= 2\\ -k_1 + 4k_2 &= 3\\ 2k_1 - k_2 &= 1\\ k_1 + k_2 &= 2 \end{aligned}$$


It can be verified that this system of equations is "inconsistent", that is, it has no solution. Consequently, $w'$ cannot be expressed as a linear combination of $u$ and $v$.

Example 2.4.2: To show if the vectors $v_1 = (3, 0, -3, 6)$, $v_2 = (0, 2, 3, 1)$, $v_3 = (0, -2, -2, 0)$, and $v_4 = (-2, 1, 2, 1) \in \mathbb{R}^4$ are linearly independent (or dependent). The set of vectors $S = \{v_1, v_2, v_3, v_4\}$ is linearly independent if and only if the only coefficients satisfying the vector equation

$$\lambda_1 v_1 + \lambda_2 v_2 + \lambda_3 v_3 + \lambda_4 v_4 = 0$$

are $\lambda_1 = \lambda_2 = \lambda_3 = \lambda_4 = 0$. The above equation can be rewritten in component form as

$$\lambda_1(3, 0, -3, 6) + \lambda_2(0, 2, 3, 1) + \lambda_3(0, -2, -2, 0) + \lambda_4(-2, 1, 2, 1) = (0, 0, 0, 0)$$

which yields the homogeneous linear system

$$\begin{aligned} 3\lambda_1 - 2\lambda_4 &= 0\\ 2\lambda_2 - 2\lambda_3 + \lambda_4 &= 0\\ -3\lambda_1 + 3\lambda_2 - 2\lambda_3 + 2\lambda_4 &= 0\\ 6\lambda_1 + \lambda_2 + \lambda_4 &= 0 \end{aligned}$$

It can be verified that the system has only the trivial solution, that is,

$$\lambda_1 = 0,\quad \lambda_2 = 0,\quad \lambda_3 = 0,\quad \lambda_4 = 0$$

so that it can be concluded that $v_1$, $v_2$, $v_3$, and $v_4$ are linearly independent.
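The rank criterion implicit in Theorem 2.4.1 gives a quick numerical test of independence. Below is a minimal NumPy sketch for Example 2.4.2; it stacks $v_1, \ldots, v_4$ as the columns of a matrix and checks that the rank equals the number of vectors:

```python
import numpy as np

# Columns are the vectors v1..v4 of Example 2.4.2
V = np.array([[ 3,  0,  0, -2],
              [ 0,  2, -2,  1],
              [-3,  3, -2,  2],
              [ 6,  1,  0,  1]], dtype=float)

# The set is linearly independent iff V @ lam = 0 has only the trivial
# solution, i.e., iff the rank of V equals the number of vectors
print(np.linalg.matrix_rank(V))          # 4 -> independent
print(abs(np.linalg.det(V)) > 1e-12)     # equivalently, nonzero determinant
```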

Exercise 2.4

1. Let $x = (2, 1, -1)$, $y = (1, 0, 1)$ and $z = (1, 1, -2)$. Determine if each of the vectors $u = (4, 3, 4)$ and $v = (3, 0, 3)$ can be expressed as a linear combination of $x$, $y$, and $z$.
2. Let $u = (2, 1, 0, 3)$ and $v = (-1, 2, 1, 0) \in \mathbb{R}^4$. Determine if each of the vectors $w = (3, 0, -1, 6)$ and $w' = (2, 1, -1, 4) \in \mathbb{R}^4$ can be expressed as a linear combination of $u$ and $v$.
3. Given
   $$A = \begin{bmatrix} 2 & -1 & 3\\ 0 & 2 & -1\\ 1 & 4 & 0 \end{bmatrix},\qquad x = \begin{bmatrix} 2\\ 3\\ 4 \end{bmatrix}$$
   write the product $Ax$ as a linear combination of the column vectors of $A$.
4. $x = (2, 1, -1, 2)$ and $y = (1, 2, -1, 3) \in \mathbb{R}^4$. Determine if each of the vectors $u = (4, 5, -3, 8)$ and $v = (3, 2, 5, -4) \in \mathbb{R}^4$ lies in $W = \mathrm{span}\{x, y\}$.


5. Determine if each of the following subsets is independent.
   a. $\{(1, 1, 1), (1, 0, 0), (1, -1, 1)\}$ in $\mathbb{R}^3$
   b. $\{(1, 0, 1, 0), (1, 1, 0, 0), (0, 1, 0, 1), (0, 0, 1, 1)\}$ in $\mathbb{R}^4$
6. Do the columns of the matrix
   $$A = \begin{bmatrix} 1 & -2 & 2\\ 2 & 1 & 3\\ 1 & 2 & -1 \end{bmatrix}$$
   form a linearly independent set in $\mathbb{R}^3$?

2.5 Basis vectors

We have seen above that the unit vectors $i$ and $j$ are "linearly independent" and "span the 2-space"; that is, any vector in 2-space can be expressed as a linear combination of $i$ and $j$. Similarly, the unit vectors $i$, $j$, and $k$, which are also linearly independent, "span" the 3-space. It can also be said that the vectors $i$ and $j$ form a "basis" for the 2-space while the vectors $i$, $j$, and $k$ form a "basis" for the 3-space. Now, we are in a position to extend the concept of "basis vectors" to general vector spaces. A vector space $V$ can be either finite dimensional or infinite dimensional. In a finite-dimensional vector space, there is a finite set of vectors that spans the space. In an infinite-dimensional vector space, no such set exists.

Definition 2.5.1 If $S = \{v_1, v_2, \ldots, v_r\}$ is a linearly independent set of vectors that spans a finite-dimensional vector space $V$, then $S$ is called a basis for $V$.

It is to be noted that if any vector is dropped from a basis, the remaining set will no longer span the space. A basis for a space, therefore, is a "minimal spanning set." The basis is, at the same time, a "maximal linearly independent set": adding one more vector to the set makes it linearly dependent. The following theorem states the uniqueness of the basis representation.

Theorem 2.5.1 If $S = \{v_1, v_2, \ldots, v_n\}$ is a basis for a vector space $V$, every vector $v \in V$ can be expressed in the form $v = c_1 v_1 + c_2 v_2 + \cdots + c_n v_n$ in exactly one way.

In such a case, the scalars $c_1, c_2, \ldots, c_n$ are called the coordinates of $v$ relative to the basis $S$. The vector $(c_1, c_2, \ldots, c_n) \in \mathbb{R}^n$ constructed from the coordinates and denoted by

$$(v)_S = (c_1, c_2, \ldots, c_n) \in \mathbb{R}^n$$

is called the coordinate vector of $v$ relative to $S$.


Theorem 2.5.1 establishes a one-to-one correspondence between vectors in $V$ and vectors in $\mathbb{R}^n$.

Example 2.5.1: The vectors $u_1 = (1, -1)$ and $u_2 = (2, 3)$ form a basis for $\mathbb{R}^2$. Let us find the coordinate vector of $v = (-1, -9)$ relative to the basis $S = \{u_1, u_2\}$.

Solution. $v$ can be expressed as a linear combination of the vectors in $S$:

$$v = c_1 u_1 + c_2 u_2,\qquad \begin{bmatrix} -1\\ -9 \end{bmatrix} = c_1\begin{bmatrix} 1\\ -1 \end{bmatrix} + c_2\begin{bmatrix} 2\\ 3 \end{bmatrix}$$

From the above, we have a system of linear equations

$$\begin{aligned} c_1 + 2c_2 &= -1\\ -c_1 + 3c_2 &= -9 \end{aligned}$$

The system can be solved to obtain $c_1 = 3$, $c_2 = -2$, so the coordinate vector

$$(v)_S = (3, -2)$$

Example 2.5.2: Let us consider a basis $S = \{u_1, u_2, u_3\}$ for $\mathbb{R}^3$ where $u_1 = (1, 0, 0)$, $u_2 = (2, 2, 0)$, and $u_3 = (3, 3, 3)$. In a manner similar to the earlier example, it is possible to find the vector $v \in \mathbb{R}^3$ whose coordinate vector relative to $S$ is $(v)_S = (3, -2, 1)$.

Solution. $v$ can be expressed as a linear combination of the vectors in $S$

$$v = c_1 u_1 + c_2 u_2 + c_3 u_3$$

where $c_1, c_2, c_3$ are the coordinates of the vector $v$ relative to $S$. So, we have

$$v = 3\begin{bmatrix} 1\\ 0\\ 0 \end{bmatrix} + (-2)\begin{bmatrix} 2\\ 2\\ 0 \end{bmatrix} + 1\begin{bmatrix} 3\\ 3\\ 3 \end{bmatrix} = \begin{bmatrix} 2\\ -1\\ 3 \end{bmatrix}$$

which can also be written as $v = (2, -1, 3)$.

The standard basis for $\mathbb{R}^n$ is constituted by the standard unit vectors

$$e_1 = (1, 0, 0, \ldots, 0),\quad e_2 = (0, 1, 0, \ldots, 0),\quad \ldots,\quad e_n = (0, 0, 0, \ldots, 1)$$

and every vector $v = (v_1, v_2, \ldots, v_n) \in \mathbb{R}^n$ can be expressed as

$$v = v_1 e_1 + v_2 e_2 + \cdots + v_n e_n$$

In the particular case where $n = 3$, the unit vectors that form the standard basis for $\mathbb{R}^3$ are

$$i = (1, 0, 0),\quad j = (0, 1, 0),\quad k = (0, 0, 1)$$


which we have seen earlier. Evidently, any vector $v = (v_1, v_2, v_3) \in \mathbb{R}^3$ can be expressed as

$$v = (v_1, v_2, v_3) = v_1(1, 0, 0) + v_2(0, 1, 0) + v_3(0, 0, 1)$$
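Finding coordinates relative to a basis amounts to solving a linear system whose columns are the basis vectors. The following NumPy sketch illustrates this with the data of Example 2.5.1:

```python
import numpy as np

# Basis of Example 2.5.1: columns are u1 = (1, -1) and u2 = (2, 3)
U = np.array([[ 1, 2],
              [-1, 3]], dtype=float)
v = np.array([-1, -9], dtype=float)

# Coordinates of v relative to S solve U c = v
c = np.linalg.solve(U, v)
print(c)          # [ 3. -2.]  i.e., (v)_S = (3, -2)

# Conversely, recover v from its coordinate vector (Example 2.5.2 style)
print(U @ c)      # [-1. -9.]
```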

Example 2.5.3: To show if the vectors $v_1 = (3, 0, -6)$, $v_2 = (-4, 1, 7)$, and $v_3 = (-2, 1, 5)$ form a basis for $\mathbb{R}^3$.

We have to show that $S = \{v_1, v_2, v_3\}$ is linearly independent and spans $\mathbb{R}^3$. To prove linear independence, it must be shown that the vector equation

$$a_1 v_1 + a_2 v_2 + a_3 v_3 = 0$$

has only the trivial solution, that is, $a_1 = a_2 = a_3 = 0$. The above equation can be expressed as a homogeneous linear system

$$Ma = 0$$

where the coefficient matrix

$$M = \begin{bmatrix} 3 & -4 & -2\\ 0 & 1 & 1\\ -6 & 7 & 5 \end{bmatrix}$$

and $a = (a_1, a_2, a_3)$. The homogeneous system has only the trivial solution if $\det(M) \ne 0$. We can calculate $\det(M) = 6 \ne 0$. Further, to prove that $S$ spans $\mathbb{R}^3$, it must be shown that every vector $b = (b_1, b_2, b_3) \in \mathbb{R}^3$ can be expressed as a linear combination of $v_1, v_2, v_3$, that is,

$$c_1 v_1 + c_2 v_2 + c_3 v_3 = b$$

which corresponds to the nonhomogeneous system

$$Mc = b$$

where $c = (c_1, c_2, c_3)$ and $M$ is the same coefficient matrix. The nonhomogeneous system will be consistent for all values of $b_1$, $b_2$, and $b_3$ only if $\det(M) \ne 0$, which has already been shown above. So, we can conclude that the vectors $v_1$, $v_2$, and $v_3$ form a basis for $\mathbb{R}^3$.

The standard basis for m × n matrices

Earlier we have introduced the vector space $V = \mathbb{R}^{m\times n}$ of $m \times n$ matrices. Let us now consider the vector space $M_{22}$ of $2 \times 2$ matrices and determine if the matrices

$$M_1 = \begin{bmatrix} 1 & 0\\ 0 & 0 \end{bmatrix},\quad M_2 = \begin{bmatrix} 0 & 1\\ 0 & 0 \end{bmatrix},\quad M_3 = \begin{bmatrix} 0 & 0\\ 1 & 0 \end{bmatrix},\quad M_4 = \begin{bmatrix} 0 & 0\\ 0 & 1 \end{bmatrix}$$

form a basis for $M_{22}$.


Here, it has to be shown that the matrices are linearly independent and span $M_{22}$. Linear independence is proved if the equation

$$a_1 M_1 + a_2 M_2 + a_3 M_3 + a_4 M_4 = 0 \tag{2.11}$$

where 0 is the $2 \times 2$ zero matrix, has only the trivial solution. The matrix form of (2.11) is

$$a_1\begin{bmatrix} 1 & 0\\ 0 & 0 \end{bmatrix} + a_2\begin{bmatrix} 0 & 1\\ 0 & 0 \end{bmatrix} + a_3\begin{bmatrix} 0 & 0\\ 1 & 0 \end{bmatrix} + a_4\begin{bmatrix} 0 & 0\\ 0 & 1 \end{bmatrix} = \begin{bmatrix} 0 & 0\\ 0 & 0 \end{bmatrix}$$

or

$$\begin{bmatrix} a_1 & a_2\\ a_3 & a_4 \end{bmatrix} = \begin{bmatrix} 0 & 0\\ 0 & 0 \end{bmatrix}$$

which shows $a_1 = a_2 = a_3 = a_4 = 0$; that is, (2.11) has only the trivial solution. Further, to prove that the matrices span $M_{22}$, it must be shown that every $2 \times 2$ matrix

$$P = \begin{bmatrix} a & b\\ c & d \end{bmatrix}$$

can be expressed as

$$c_1 M_1 + c_2 M_2 + c_3 M_3 + c_4 M_4 = P$$

which, in matrix form, can be written as

$$c_1\begin{bmatrix} 1 & 0\\ 0 & 0 \end{bmatrix} + c_2\begin{bmatrix} 0 & 1\\ 0 & 0 \end{bmatrix} + c_3\begin{bmatrix} 0 & 0\\ 1 & 0 \end{bmatrix} + c_4\begin{bmatrix} 0 & 0\\ 0 & 1 \end{bmatrix} = \begin{bmatrix} a & b\\ c & d \end{bmatrix}$$

or

$$\begin{bmatrix} c_1 & c_2\\ c_3 & c_4 \end{bmatrix} = \begin{bmatrix} a & b\\ c & d \end{bmatrix}$$

This equation has the solution

$$c_1 = a,\quad c_2 = b,\quad c_3 = c,\quad c_4 = d$$

indicating that the matrices span $M_{22}$. Thus we can conclude that the matrices $M_1$, $M_2$, $M_3$, and $M_4$ form a basis for $M_{22}$. In general, the $mn$ different matrices whose entries are all zero except for a single entry of 1 form the standard basis for $M_{mn}$.
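Because the coordinates of a matrix relative to this standard basis are simply its entries, the basis test can be phrased as a rank computation on flattened matrices. The sketch below shows the idea; the matrix `P` is an arbitrary illustrative choice:

```python
import numpy as np

# Standard basis of M22, each matrix flattened to a vector in R^4
M = [np.array([[1, 0], [0, 0]]),
     np.array([[0, 1], [0, 0]]),
     np.array([[0, 0], [1, 0]]),
     np.array([[0, 0], [0, 1]])]
V = np.column_stack([m.ravel() for m in M]).astype(float)

# Rank 4 means the four matrices are linearly independent and span M22
print(np.linalg.matrix_rank(V))        # 4

# Coordinates of P = [[a, b], [c, d]] relative to this basis are (a, b, c, d)
P = np.array([[5, 7], [2, 9]], dtype=float)
print(np.linalg.solve(V, P.ravel()))   # [5. 7. 2. 9.]
```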

Exercise 2.5

1. $u_1 = (2, -1)$ and $u_2 = (1, 1)$ form a basis for $\mathbb{R}^2$. Find the coordinate vector of $v = (3, 1)$ relative to the basis $S = \{u_1, u_2\}$.
2. Determine if the vectors $v_1 = (1, 2, 1)$, $v_2 = (-2, 1, 2)$ and $v_3 = (2, 3, -1)$ form a basis for $\mathbb{R}^3$.


3. Consider a basis $S = \{u_1, u_2, u_3\}$ for $\mathbb{R}^3$ where $u_1 = (1, 1, 0)$, $u_2 = (0, 1, 1)$, and $u_3 = (1, 0, 1)$. Find the vector $v \in \mathbb{R}^3$ whose coordinate vector relative to $S$ is $(v)_S = (3, -2, 1)$.
4. Given
   $$A_1 = \begin{bmatrix} 1 & 1\\ 0 & 1 \end{bmatrix},\quad A_2 = \begin{bmatrix} 0 & 1\\ 1 & 0 \end{bmatrix},\quad A_3 = \begin{bmatrix} 0 & 0\\ 1 & 1 \end{bmatrix},\quad A_4 = \begin{bmatrix} 0 & 0\\ 0 & 1 \end{bmatrix}$$
   show that $S = \{A_1, A_2, A_3, A_4\}$ is a basis for $M_{22}$. Express
   $$A = \begin{bmatrix} 1 & 1\\ 1 & 1 \end{bmatrix}$$
   as a linear combination of $A_1$, $A_2$, $A_3$, and $A_4$ and find the coordinate vector of $A$ relative to $S$.

2.6 Dimension and rank

In geometry, $\mathbb{R}^3$ is considered to be three-dimensional, planes are said to be two-dimensional, and lines one-dimensional. However, the concept of "dimension" can be formalized based on the following theorem.

Theorem 2.6.1 Let $\{v_1, v_2, \ldots, v_n\}$ be any basis for an n-dimensional vector space $V$. Then,
a. if a set in $V$ has more than $n$ vectors, it is linearly dependent, and
b. if a set in $V$ has fewer than $n$ vectors, it does not span $V$.

Evidently, the criteria for linear independence and spanning imply that unless a set in $V$ has exactly $n$ vectors it cannot be a basis. Further, it can also be proved that all bases for a finite-dimensional vector space have the same number of vectors. We have seen that the standard basis for $\mathbb{R}^n$ has $n$ vectors (two vectors for $\mathbb{R}^2$, three vectors for $\mathbb{R}^3$, and so on). There is thus a link between the number of vectors in a basis and the dimension of a vector space. Therefore, the following can be considered as a definition of dimension.

Definition 2.6.1 The dimension of a finite-dimensional vector space $V$ is defined to be the number of vectors in a basis for $V$ and is denoted by $\dim(V)$.

Example 2.6.1: The dimension of the space spanned by a linearly independent set of vectors $S = \{v_1, v_2, \ldots, v_r\}$ is equal to the number of vectors in that set, that is,

$$\dim[\mathrm{span}\{v_1, v_2, \ldots, v_r\}] = r$$

The zero vector space is zero-dimensional.


Example 2.6.2: Let us consider the homogeneous system

$$\begin{aligned} x_1 + 3x_2 - 2x_3 \phantom{- 2x_4} + 2x_5 &= 0\\ 2x_1 + 6x_2 - 5x_3 - 2x_4 + 4x_5 &= 0\\ 5x_3 + 10x_4 \phantom{+ 4x_5} &= 0\\ 2x_1 + 6x_2 \phantom{- 5x_3} + 8x_4 + 4x_5 &= 0 \end{aligned}$$

and try to determine the dimension of its solution space. The augmented matrix for the given system is

$$\begin{bmatrix} 1 & 3 & -2 & 0 & 2 & 0\\ 2 & 6 & -5 & -2 & 4 & 0\\ 0 & 0 & 5 & 10 & 0 & 0\\ 2 & 6 & 0 & 8 & 4 & 0 \end{bmatrix}$$

the reduced row-echelon form of which, generated by elementary row operations, is

$$\begin{bmatrix} 1 & 3 & 0 & 4 & 2 & 0\\ 0 & 0 & 1 & 2 & 0 & 0\\ 0 & 0 & 0 & 0 & 0 & 0\\ 0 & 0 & 0 & 0 & 0 & 0 \end{bmatrix}$$

The solution to the system in vector form can be calculated to be

$$(x_1, x_2, x_3, x_4, x_5) = (-3r - 4s - 2t,\ r,\ -2s,\ s,\ t)$$

which can be alternatively written as

$$\begin{bmatrix} x_1\\ x_2\\ x_3\\ x_4\\ x_5 \end{bmatrix} = r\begin{bmatrix} -3\\ 1\\ 0\\ 0\\ 0 \end{bmatrix} + s\begin{bmatrix} -4\\ 0\\ -2\\ 1\\ 0 \end{bmatrix} + t\begin{bmatrix} -2\\ 0\\ 0\\ 0\\ 1 \end{bmatrix}$$

where $r$, $s$, and $t$ are arbitrary parameters. This shows that the vectors

$$v_1 = (-3, 1, 0, 0, 0),\quad v_2 = (-4, 0, -2, 1, 0),\quad v_3 = (-2, 0, 0, 0, 1)$$

span the solution space. It can be shown that the equation

$$k_1 v_1 + k_2 v_2 + k_3 v_3 = 0$$

is satisfied only when $k_1 = k_2 = k_3 = 0$. Hence, the vectors $v_1$, $v_2$, and $v_3$ are linearly independent and the solution space has dimension 3.
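Assuming SciPy is available, the dimension of the solution space in Example 2.6.2 can be checked numerically; `scipy.linalg.null_space` returns an orthonormal basis of the null space (it differs from $v_1$, $v_2$, $v_3$ above but spans the same space):

```python
import numpy as np
from scipy.linalg import null_space

# Coefficient matrix of the homogeneous system in Example 2.6.2
A = np.array([[1, 3, -2,  0, 2],
              [2, 6, -5, -2, 4],
              [0, 0,  5, 10, 0],
              [2, 6,  0,  8, 4]], dtype=float)

# Columns of N form an orthonormal basis for the solution (null) space
N = null_space(A)
print(N.shape[1])             # 3 -> the solution space has dimension 3
print(np.allclose(A @ N, 0))  # True
```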


Example 2.6.3: Dimension of row space and column space. Let us consider a matrix in its row-echelon form

$$A_{re} = \begin{bmatrix} 1 & 3 & -1 & 0\\ 0 & 1 & 2 & 0\\ 0 & 0 & 0 & 1 \end{bmatrix}$$

It can be clearly seen that the row vectors with leading 1s (the nonzero row vectors) are

$$r_1 = [\,1\ 3\ -1\ 0\,],\quad r_2 = [\,0\ 1\ 2\ 0\,],\quad r_3 = [\,0\ 0\ 0\ 1\,]$$

Let us determine the condition for

$$k_1 r_1 + k_2 r_2 + k_3 r_3 = 0 \tag{2.12}$$

or, in component form,

$$k_1[\,1\ 3\ -1\ 0\,] + k_2[\,0\ 1\ 2\ 0\,] + k_3[\,0\ 0\ 0\ 1\,] = [\,0\ 0\ 0\ 0\,]$$

Equating the fourth entries, we find $k_3 = 0$, so that we have from (2.12)

$$k_1 r_1 + k_2 r_2 = 0$$

Then, equating the first entries, we have $k_1 = 0$, and finally, from the second entries, $k_2 = 0$. That is, (2.12) is satisfied only if $k_1 = k_2 = k_3 = 0$. Hence, the row vectors $r_1$, $r_2$, and $r_3$ are linearly independent and form a basis for the row space of $A_{re}$. As defined earlier, the dimension of a vector space $V$ is equal to the number of vectors in a basis for $V$. Hence,

$$\dim(\text{row space of } A_{re}) = 3$$

The column vectors of $A_{re}$ are

$$c_1 = \begin{bmatrix} 1\\ 0\\ 0 \end{bmatrix},\quad c_2 = \begin{bmatrix} 3\\ 1\\ 0 \end{bmatrix},\quad c_3 = \begin{bmatrix} -1\\ 2\\ 0 \end{bmatrix},\quad c_4 = \begin{bmatrix} 0\\ 0\\ 1 \end{bmatrix}$$

of which $c_1$, $c_2$, and $c_4$ contain the leading 1s of the row vectors while $c_3$ does not. If we choose $c_1$, $c_2$, and $c_3$, we find that for the equation

$$l_1 c_1 + l_2 c_2 + l_3 c_3 = 0$$

which can be written as

$$\begin{bmatrix} 1 & 3 & -1\\ 0 & 1 & 2\\ 0 & 0 & 0 \end{bmatrix}\begin{bmatrix} l_1\\ l_2\\ l_3 \end{bmatrix} = 0$$

the determinant

$$\begin{vmatrix} 1 & 3 & -1\\ 0 & 1 & 2\\ 0 & 0 & 0 \end{vmatrix} = 0$$

indicating that the vectors $c_1$, $c_2$, and $c_3$ are not linearly independent. However, if we choose $c_1$, $c_2$, and $c_4$, we have for the equation

$$l_1 c_1 + l_2 c_2 + l_4 c_4 = 0$$

or,

$$\begin{bmatrix} 1 & 3 & 0\\ 0 & 1 & 0\\ 0 & 0 & 1 \end{bmatrix}\begin{bmatrix} l_1\\ l_2\\ l_4 \end{bmatrix} = 0$$

the determinant

$$\begin{vmatrix} 1 & 3 & 0\\ 0 & 1 & 0\\ 0 & 0 & 1 \end{vmatrix} = 1 \ne 0$$

So, the vectors $c_1$, $c_2$, and $c_4$, which contain the leading 1s of the row vectors, are linearly independent and form a basis for the column space of $A_{re}$. Hence,

$$\dim(\text{column space of } A_{re}) = 3$$

The dimension of the null space of $A_{re}$ is given by the solution space of $A_{re}x = 0$. The augmented matrix for this homogeneous system

$$\begin{bmatrix} 1 & 3 & -1 & 0 & 0\\ 0 & 1 & 2 & 0 & 0\\ 0 & 0 & 0 & 1 & 0 \end{bmatrix}$$

can be reduced to

$$\begin{bmatrix} 1 & 0 & -7 & 0 & 0\\ 0 & 1 & 2 & 0 & 0\\ 0 & 0 & 0 & 1 & 0 \end{bmatrix}$$

from which we obtain $(x_1, x_2, x_3, x_4) = r\,(7, -2, 1, 0)$, where $r$ is an arbitrary parameter. Thus we find that the solution space has a single basis vector and, therefore, the dimension of the null space of $A_{re}$ is equal to 1. It is to be noted that for the given $A_{re}$, which has four columns,

$$\dim(\text{row/column space}) + \dim(\text{null space}) = 4$$

In general, if a matrix $A_{re}$ is in row-echelon form, then
a. The nonzero row vectors form a basis for the row space of $A_{re}$.
b. The column vectors with the leading 1s of the row vectors form a basis for the column space of $A_{re}$.
c. dim(column space of $A_{re}$) = dim(row space of $A_{re}$).

Further, it can be proved that elementary row operations, as defined in Chapter 1, do not affect either the row space or the null space of a matrix. Therefore, if $A_{re}$ is the row-echelon form of the matrix $A$,

$$\dim(\text{row space of } A) = \dim(\text{row space of } A_{re})$$
$$\dim(\text{column space of } A) = \dim(\text{column space of } A_{re})$$

and we have the following theorem.


Theorem 2.6.2 For a matrix A, the row space and the column space have the same dimension.

Now we can formalize the definition of the rank that was given preliminarily in Chapter 1.

Definition 2.6.2 The common dimension of the row space and the column space of a matrix $A$ is called its rank and denoted by rank($A$). Besides, the dimension of the null space of a matrix $A$ is called its nullity and denoted by nullity($A$).

The result of Example 2.6.3 is generalized in the following theorem.

Theorem 2.6.3 For a matrix A with n columns,

$$\mathrm{rank}(A) + \mathrm{nullity}(A) = n$$
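Theorem 2.6.3 can be verified numerically for any matrix. The helper below is an illustrative function of ours (not from the text); it computes the rank and nullity and asserts that they sum to the number of columns, here applied to the row-echelon matrix of Example 2.6.3:

```python
import numpy as np
from scipy.linalg import null_space

def rank_and_nullity(A):
    """Return (rank, nullity) and check Theorem 2.6.3: rank + nullity = n."""
    A = np.asarray(A, dtype=float)
    rank = np.linalg.matrix_rank(A)
    nullity = null_space(A).shape[1]
    assert rank + nullity == A.shape[1]
    return rank, nullity

# The row-echelon matrix of Example 2.6.3
Are = np.array([[1, 3, -1, 0],
                [0, 1,  2, 0],
                [0, 0,  0, 1]])
print(rank_and_nullity(Are))   # (3, 1), and 3 + 1 = 4 columns
```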

Exercise 2.6

1. Consider the following homogeneous system

$$\begin{aligned} x_1 + 3x_2 - x_3 &= 0\\ 2x_1 + 6x_2 - 2x_3 &= 0\\ 3x_1 + 9x_2 - 3x_3 &= 0 \end{aligned}$$

and determine the dimension of its solution space.
2. Determine the dimensions of the row space and column space of a matrix whose row-echelon form is
$$A_{re} = \begin{bmatrix} 1 & 2 & 1 & -1\\ 0 & 1 & 3 & 0\\ 0 & 0 & 0 & 1 \end{bmatrix}$$
3. Row reduce each of the following matrices and determine its rank:
$$A = \begin{bmatrix} 1 & -2 & 1 & 0\\ 2 & -1 & 0 & 2\\ 0 & -2 & 3 & 1 \end{bmatrix}$$


$$B = \begin{bmatrix} 2 & -3 & 5 & 16\\ -1 & 2 & -2 & -6\\ 3 & 1 & -1 & 4 \end{bmatrix}$$
4. Show for each of the following matrices that the sum of the rank and the nullity is equal to the number of columns.
a. $$A = \begin{bmatrix} 2 & 1 & 0 & 4\\ 3 & -2 & 0 & 1\\ 1 & 4 & -1 & 5 \end{bmatrix}$$
b. $$B = \begin{bmatrix} 1 & 1 & 2 & 9\\ 2 & 4 & -3 & 1\\ 3 & 6 & -5 & 0 \end{bmatrix}$$

2.7 Inner product space

In Chapter 1 we have seen that a vector can be represented by an arrow in 2-space or 3-space (mathematicians call this a "geometric vector"). For geometric interpretation of the concepts in $\mathbb{R}^2$ and $\mathbb{R}^3$, it is necessary to define the length of a vector, and the distance and angle between two vectors. These definitions are then extended to nonvisual higher-dimensional spaces.

Norm

It has been said earlier that the length of a vector, represented by an arrow drawn to scale, specifies its magnitude. A common mathematical synonym for length is the term "norm" (also called the Euclidean norm). As indicated in Fig. 2.8, for a vector $u = (u_1, u_2) \in \mathbb{R}^2$, the norm (or length or magnitude) is obtained from the Pythagorean theorem as

$$\|u\| = \sqrt{u_1^2 + u_2^2} \tag{2.13}$$

FIGURE 2.8 Norm of a vector in ℝ2 (A) and ℝ3 (B).

Similarly, for a vector $u = (u_1, u_2, u_3) \in \mathbb{R}^3$,

$$\|u\| = \sqrt{u_1^2 + u_2^2 + u_3^2} \tag{2.14}$$

The definition can be extended to $n$-space so that for a vector $u = (u_1, u_2, \ldots, u_n) \in \mathbb{R}^n$, the Euclidean vector norm can be written as

$$\|u\| = (u^T u)^{1/2} = \sqrt{u_1^2 + u_2^2 + \cdots + u_n^2} \tag{2.15}$$

It follows from the above equation that
a. $\|u\| \ge 0$
b. $\|u\| = 0$ if and only if $u = 0$

Further, if $k$ is a scalar, then $\|ku\| = |k|\,\|u\|$.

Definition 2.7.1 A vector having norm 1 is called a unit vector. If $u \in \mathbb{R}^n$ is a nonzero vector, the unit vector in the same direction as $u$ is defined as

$$\hat{u} = \frac{1}{\|u\|}\,u \tag{2.16}$$

It can be shown that $\|\hat{u}\| = 1$. Hence, multiplication of a nonzero vector by the reciprocal of its magnitude produces a unit vector. The process is called normalizing a vector.

Example 2.7.1: To find a unit vector $\hat{u}$ that has the same direction as $u = (-4, 2, 4)$.

Solution. The vector $u$ has length

$$\|u\| = \sqrt{(-4)^2 + 2^2 + 4^2} = 6$$

From (2.16),

$$\hat{u} = \frac{1}{\|u\|}\,u = \frac{1}{6}(-4, 2, 4) = \left(-\frac{2}{3}, \frac{1}{3}, \frac{2}{3}\right)$$

It can be verified that $\|\hat{u}\| = 1$.
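A short NumPy illustration of Eqs. (2.15) and (2.16), using the vector of Example 2.7.1:

```python
import numpy as np

u = np.array([-4.0, 2.0, 4.0])    # the vector of Example 2.7.1

norm = np.linalg.norm(u)          # Euclidean norm, Eq. (2.15)
u_hat = u / norm                  # normalizing u, Eq. (2.16)

print(norm)                       # 6.0
print(u_hat)                      # [-0.6667  0.3333  0.6667] approx.
print(np.linalg.norm(u_hat))      # 1.0, a unit vector
```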


Distance

From the basic concept of a vector (as introduced in Chapter 1), one can say that the terms "vector" and "point" are interchangeable. As such, the point in three-dimensional space that is defined as the origin is the vector $0 = (0, 0, 0)$. Hence, for the vector $u$ associated with the point $P = (u_1, u_2, u_3)$, the length $\|u\|$ given by (2.14) is also defined to be the "distance" from the origin to $P$. Now, let us consider two vectors $u$ and $v$ in $\mathbb{R}^2$ denoted respectively by points $P_1 = (u_1, u_2)$ and $P_2 = (v_1, v_2)$. Vector subtraction gives the distance between the two vectors.

Theorem 2.7.1 If u and v have a common tail, then u − v is the vector from the tip of v to the tip of u.

The distance $d$ between the two vectors is given by

$$d = \|\overrightarrow{P_1P_2}\| = \|u - v\| = \sqrt{(u_1 - v_1)^2 + (u_2 - v_2)^2} \tag{2.17}$$

Similarly, the distance between vectors $u = (u_1, u_2, u_3)$ and $v = (v_1, v_2, v_3) \in \mathbb{R}^3$ is defined to be

$$d(u, v) = \|u - v\| = \sqrt{(u_1 - v_1)^2 + (u_2 - v_2)^2 + (u_3 - v_3)^2} \tag{2.18}$$

Example 2.7.2: For two points $P_1$ and $P_2$ in $\mathbb{R}^3$ represented by vectors $u = (2, -1, 3)$ and $v = (1, 1, 4)$, respectively, the vector $\overrightarrow{P_1P_2} = u - v = (1, -2, -1)$ and

$$d = \sqrt{1^2 + (-2)^2 + (-1)^2} = \sqrt{6}$$

Definition 2.7.2 Considering $u = (u_1, u_2, \ldots, u_n)$ and $v = (v_1, v_2, \ldots, v_n)$ as points in $\mathbb{R}^n$, the distance (Fig. 2.9) between u and v is expressed as

$$d(u, v) = \|u - v\| = \sqrt{(u_1 - v_1)^2 + (u_2 - v_2)^2 + \cdots + (u_n - v_n)^2}$$

FIGURE 2.9 Distance between two vectors.
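For completeness, Example 2.7.2 can be reproduced with `numpy.linalg.norm`:

```python
import numpy as np

# Example 2.7.2: distance between the points represented by u and v
u = np.array([2.0, -1.0, 3.0])
v = np.array([1.0,  1.0, 4.0])

print(u - v)                  # [ 1. -2. -1.]
print(np.linalg.norm(u - v))  # 2.449... = sqrt(6), Eq. (2.18)
```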


Dot product

For the purpose of defining the dot product between two vectors $u$ and $v$ in $\mathbb{R}^2$ or $\mathbb{R}^3$, the two vectors are positioned in such a way that their initial points coincide. Let $\theta$ be the angle between them. The dot product (also called the Euclidean inner product) of $u$ and $v$ is then defined as

$$u \cdot v = \|u\|\,\|v\|\cos\theta$$

If either $u = 0$ or $v = 0$, $u \cdot v = 0$. When necessary, the angle between the vectors $u$ and $v$ can be obtained from

$$\cos\theta = \frac{u \cdot v}{\|u\|\,\|v\|}$$

In component form, the dot product between two nonzero vectors $u = (u_1, u_2, u_3)$ and $v = (v_1, v_2, v_3) \in \mathbb{R}^3$ can be written as

$$u \cdot v = u_1 v_1 + u_2 v_2 + u_3 v_3$$

Example 2.7.3: For $u = (2, -3, 5, 4)$ and $v = (1, -4, 1, 0) \in \mathbb{R}^4$, the dot product

$$u \cdot v = (2)(1) + (-3)(-4) + (5)(1) + (4)(0) = 19$$

Extending the operation to $\mathbb{R}^n$, the dot product (also called the Euclidean inner product) between two vectors $u = (u_1, u_2, \ldots, u_n)$ and $v = (v_1, v_2, \ldots, v_n)$ is defined as

$$u \cdot v = u_1 v_1 + u_2 v_2 + \cdots + u_n v_n$$

This can also be written in the form

$$u^T v = \sum_{i=1}^{n} u_i v_i \in \mathbb{R}$$

which is a scalar called the standard inner product for $\mathbb{R}^n$.

Multiplication of a vector by a matrix can be depicted in terms of dot products of vectors. Consider a 4-vector

$$u = \begin{bmatrix} u_1\\ u_2\\ u_3\\ u_4 \end{bmatrix}$$

and a $3 \times 4$ matrix

$$A = \begin{bmatrix} r_1\\ r_2\\ r_3 \end{bmatrix}$$

where

$$r_1 = (r_{11}, r_{12}, r_{13}, r_{14}),\quad r_2 = (r_{21}, r_{22}, r_{23}, r_{24}),\quad r_3 = (r_{31}, r_{32}, r_{33}, r_{34})$$

The product of $A$ and $u$ can be computed as

$$Au = \begin{bmatrix} r_1\\ r_2\\ r_3 \end{bmatrix}u = \begin{bmatrix} r_1 \cdot u\\ r_2 \cdot u\\ r_3 \cdot u \end{bmatrix} = \begin{bmatrix} r_{11}u_1 + r_{12}u_2 + r_{13}u_3 + r_{14}u_4\\ r_{21}u_1 + r_{22}u_2 + r_{23}u_3 + r_{24}u_4\\ r_{31}u_1 + r_{32}u_2 + r_{33}u_3 + r_{34}u_4 \end{bmatrix}$$

Thus we see that each entry of $Au$ is the dot product of the corresponding row (vector) of $A$ with $u$.

So far, the definition of the dot product or inner product has been coordinate-dependent. However, it needs to be appreciated that dot products are more general concepts with specific properties. Therefore, to formulate a more general, coordinate-free definition, the notion of a dot product/inner product can be extended to general real vector spaces based on its essential properties.

Definition 2.7.3 General inner product. An inner product on a vector space $V$ is a function that maps each ordered pair of vectors $u, v \in V$ to a scalar $\langle u, v\rangle$ such that the following axioms are satisfied for all vectors $u, v, w \in V$ and all scalars $\lambda$:

a. $\langle u, v\rangle = \langle v, u\rangle$ (symmetry)
b. $\langle u + v, w\rangle = \langle u, w\rangle + \langle v, w\rangle$ (additivity)
c. $\langle \lambda u, v\rangle = \lambda\langle u, v\rangle$ (homogeneity)
d. $\langle u, u\rangle \ge 0$, and $\langle u, u\rangle = 0$ if and only if $u = 0$ (positivity)

A real vector space with an inner product $\langle\cdot,\cdot\rangle$ is called a real inner product space.
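As a concrete illustration of Definition 2.7.3, the sketch below defines a weighted inner product on $\mathbb{R}^2$ (the weights 2 and 3, and the sample vectors, are arbitrary illustrative choices, not from the text) and checks the four axioms numerically:

```python
import numpy as np

# A weighted inner product on R^2: <u, v> = 2*u1*v1 + 3*u2*v2.
# Any positive weights satisfy the four axioms of Definition 2.7.3.
W = np.diag([2.0, 3.0])

def inner(u, v):
    return u @ W @ v

u, v, w = np.array([1.0, 2.0]), np.array([-1.0, 4.0]), np.array([3.0, 0.0])
lam = 2.5

print(np.isclose(inner(u, v), inner(v, u)))                    # symmetry
print(np.isclose(inner(u + v, w), inner(u, w) + inner(v, w)))  # additivity
print(np.isclose(inner(lam * u, v), lam * inner(u, v)))        # homogeneity
print(inner(u, u) > 0)                                         # positivity
```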

Similarly, the earlier coordinate-dependent definitions of the norm of a vector and the distance between two vectors can be generalized for real inner product spaces as follows.

Definition 2.7.4 If u and v are vectors in a real inner product space V and $\lambda$ is a scalar, the norm (or length) of u is defined as

$$\|u\| = \sqrt{\langle u, u\rangle}$$

and the distance between u and v is defined as

$$d(u, v) = \|u - v\| = \sqrt{\langle u - v,\, u - v\rangle}$$

provided that the following conditions are satisfied:
a. $\|u\| \ge 0$, with equality if and only if $u = 0$
b. $\|\lambda u\| = |\lambda|\,\|u\|$
c. $d(u, v) = d(v, u)$
d. $d(u, v) \ge 0$, with equality if and only if $u = v$


With the help of inner products, we can depict the geometry of a vector space by defining the angle between two vectors. In a real inner product space $V$, the angle between two nonzero vectors $u, v \in V$ is defined to be the number $\theta \in [0, \pi]$ such that

$$\cos\theta = \frac{\langle u, v\rangle}{\|u\|\,\|v\|} \tag{2.19}$$
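Equation (2.19) translates directly into code. The sketch below is ours (the function name `angle` is illustrative); it works for any inner product passed in, defaulting to the standard dot product:

```python
import numpy as np

def angle(u, v, inner=np.dot):
    """Angle (radians) between u and v under a given inner product, Eq. (2.19)."""
    cos_theta = inner(u, v) / np.sqrt(inner(u, u) * inner(v, v))
    return np.arccos(np.clip(cos_theta, -1.0, 1.0))  # clip guards rounding

u = np.array([1.0, 0.0])
v = np.array([1.0, 1.0])
print(np.degrees(angle(u, v)))   # 45.0
```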

Exercise 2.7

1. Determine the norm of $u$, and a unit vector that has the same direction as $u$.
   a. $u = (2, 3, 4)$
   b. $u = (1, 2, 3, 1, 4)$
   c. $u = (2, -2, -1, 4)$
2. Determine the Euclidean distance between $u$ and $v$ and the cosine of the angle between the two vectors.
   a. $u = (2, -1, 2)$, $v = (2, 0, 1)$
   b. $u = (4, 0, -2)$, $v = (3, 2, 0)$
   c. $u = (1, 0, -2, 1, 2)$, $v = (2, 1, -1, 0, -2)$
3. Given
   $$A = \begin{bmatrix} 2 & -1 & 3\\ 4 & 0 & 2\\ -1 & 2 & 1 \end{bmatrix},\quad u = \begin{bmatrix} 1\\ -2\\ -1 \end{bmatrix},\quad v = \begin{bmatrix} 2\\ -1\\ 0 \end{bmatrix}$$
   show, using dot products, that $Au \cdot v = u \cdot A^T v$.

2.8 Orthogonality

Two vectors are perpendicular if the angle between them is $\pi/2$, that is, they are at right angles to each other. A right angle is visually perceivable in $\mathbb{R}^2$ and $\mathbb{R}^3$, but not in higher dimensions. Orthogonality is the generalization of the concept of perpendicularity. Geometrically, we consider orthogonal vectors with respect to a specific inner product. In an inner product space $V$, two vectors $u$ and $v$ are said to be orthogonal (to each other) if and only if $\langle u, v\rangle = 0$; this is denoted by writing $u \perp v$. For $\mathbb{R}^n$ with the standard inner product, $u \perp v$ means $u^T v = 0$. Further, if $\|u\| = 1 = \|v\|$, that is, $u$ and $v$ are unit vectors, the vectors are said to be orthonormal.

Example 2.8.1: The two vectors $u = (3, 1, -2, 4)$ and $v = (2, 0, 1, -1) \in \mathbb{R}^4$ are orthogonal since

$$u \cdot v = (3)(2) + (1)(0) + (-2)(1) + (4)(-1) = 0$$


Orthogonal and orthonormal set

A set $S = \{u_1, u_2, \ldots, u_n\}$ of two or more vectors in a real inner product space $V$ is said to be an orthogonal set if all pairs of distinct vectors in the set are orthogonal. That is,

$$u_i \perp u_j \quad\text{for all } i \ne j$$

In addition, if $\|u_i\| = 1$ for each $i$, so that

$$\langle u_i, u_j\rangle = \begin{cases} 1 & \text{when } i = j\\ 0 & \text{when } i \ne j \end{cases} \tag{2.20}$$

the set is said to be orthonormal.

Example 2.8.2: Let us consider a set of vectors.
The elements of the Kirchhoff (connectivity) matrix $\Gamma$ are given by

$$\Gamma_{ij} = \begin{cases} -1 & i \ne j,\ r_{ij} \le r_c\\ 0 & i \ne j,\ r_{ij} > r_c\\ -\sum_{i,\,i\ne j}\Gamma_{ij} & i = j \end{cases} \tag{9.94}$$


Applying statistical mechanics, we can obtain the relation

$$C = 3k_BT\,\Gamma^{-1} \tag{9.95}$$

where the covariance matrix

$$C = \begin{bmatrix} \langle(\Delta r_1)^2\rangle & \langle\Delta r_1 \cdot \Delta r_2\rangle & \cdots & \langle\Delta r_1 \cdot \Delta r_N\rangle\\ \langle\Delta r_2 \cdot \Delta r_1\rangle & \langle(\Delta r_2)^2\rangle & \cdots & \langle\Delta r_2 \cdot \Delta r_N\rangle\\ \vdots & \vdots & \ddots & \vdots\\ \langle\Delta r_N \cdot \Delta r_1\rangle & \langle\Delta r_N \cdot \Delta r_2\rangle & \cdots & \langle(\Delta r_N)^2\rangle \end{bmatrix} \tag{9.96}$$

In GNM, the spring constant is assumed to be the same for all interacting pairs, $\gamma_{ij} = \gamma$. Hence, the overall interaction potential is given by

$$V_{GNM}(r) = \frac{1}{2}\sum_{ij}\gamma\,\bigl(\Delta r_{ij} - \Delta r_{ij}^0\bigr)\cdot\bigl(\Delta r_{ij} - \Delta r_{ij}^0\bigr)\qquad \text{if } \bigl|\Delta r_{ij}^0\bigr| = \bigl|r_j^0 - r_i^0\bigr| \le r_c \tag{9.97}$$
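To make the GNM construction concrete, here is a minimal NumPy sketch under stated assumptions: Cα coordinates supplied as an N × 3 array (a real run would read them from a PDB file), an illustrative 7 Å cutoff, and the spring constant and $3k_BT$ prefactor set to 1. It builds $\Gamma$ from Eq. (9.94) and, since $\Gamma$ is singular (it has zero modes), uses the pseudo-inverse in place of $\Gamma^{-1}$ in Eq. (9.95):

```python
import numpy as np

def kirchhoff(coords, r_c=7.0):
    """Build the GNM Kirchhoff (connectivity) matrix of Eq. (9.94)."""
    d = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=-1)
    gamma = np.where(d <= r_c, -1.0, 0.0)        # off-diagonal contacts
    np.fill_diagonal(gamma, 0.0)                 # remove self-contacts
    np.fill_diagonal(gamma, -gamma.sum(axis=1))  # diagonal from Eq. (9.94)
    return gamma

# Toy coordinates (angstroms) standing in for C-alpha positions
coords = np.random.default_rng(0).uniform(0.0, 20.0, size=(30, 3))
G = kirchhoff(coords)

# The diagonal of C is proportional to each residue's mean-square fluctuation
C = np.linalg.pinv(G)      # up to the 3*kB*T prefactor of Eq. (9.95)
msf = np.diag(C)
print(msf[:5])
```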

Anisotropic network model

In ANM, most of the assumptions are similar to those in GNM except that here the interatomic interactions are considered to be anisotropic. Hence, the potential energy is given by

$$V_{ANM}(r) = \frac{1}{2}\sum_{ij}\gamma_{ij}\,\bigl(\Delta r_{ij} - \Delta r_{ij}^0\bigr)\cdot\bigl(\Delta r_{ij} - \Delta r_{ij}^0\bigr) \tag{9.98}$$

but, in this case, $\gamma_{ij}$ is not the same for all atom pairs $(i, j)$. Evidently, while the GNM modes contain information only about the magnitude of the fluctuations, the ANM modes retain the directional information as well.

NMA has been widely used in the investigation of biomolecular structure and dynamics. Some of its applications include (1) domain motion of the GroEL chaperone subunit (Skaerven et al., 2011), (2) mechanisms of antibody-antigen recognition (Keskin, 2007), (3) retrieval of protein dynamics information from the PDB (Wako and Endo, 2017), (4) dynamics of the ribosomal tunnel (Kurkcuoglu et al., 2009), and (5) ligand binding (Ma, 2005). However, this is only a tiny fraction of the globally published literature in the area.

Exercise 9

1. Utilizing the definitions of the inner product and the cross product in $\mathbb{R}^3$, calculate the product of two quaternions $p$ and $q$, and show that the set of quaternions is closed under multiplication.
2. Let us consider a unit quaternion $q$ and define an operator on a vector $v \in \mathbb{R}^3$ as
   $$R_q(v) = q\,v\,q^*$$
   Using the basic algebraic properties of a quaternion, show that the operator $R_q(v)$ changes neither the length nor the direction of $v$.
3. Let the potential energy of a diatomic molecule be
   $$U(r) = \varepsilon\left[\left(\frac{\sigma}{r}\right)^{12} - 2\left(\frac{\sigma}{r}\right)^{6}\right]$$
   where $r$ is the interatomic separation. Find the value of $r$ for which $U$ is minimum. What is the value of $U_{min}$?
4. Write down the expressions for the nuclear kinetic energy and the electron kinetic energy operators of the molecule $H_2^+$ and from these expressions quantitatively justify the Born-Oppenheimer approximation.
5. How is the Maxwell-Boltzmann distribution of molecular speeds relevant in the computation of an MD trajectory?
6. Consider a coupled oscillator system with two objects, each of mass $m$, and three springs, each with spring constant $k$, as shown in Fig. 9.4. The displacements of the objects from their equilibrium positions at any instant of time $t$ are given in terms of coordinates $x = (x_1, x_2)$. Write down the equations of motion for the two objects and find the solutions in terms of $x$ and normal coordinates.
7. Write down the Hamiltonian for a system consisting of a set of $n$ coupled simple harmonic oscillators. How is the Hamiltonian transformed so that it represents a set of $n$ uncoupled independent harmonic oscillators?

FIGURE 9.4 Exercise 9.6.


References

Keskin, O., 2007. Binding induced conformational changes of proteins correlate with their intrinsic fluctuations: a case study of antibodies. BMC Struct. Biol. 7, 31. Available from: https://doi.org/10.1186/1472-6807-7-31.
Kurkcuoglu, O., et al., 2009. Collective dynamics of the ribosomal tunnel revealed by elastic network modeling. Proteins 75 (4), 837–845.
Ma, J., 2005. Usefulness and limitations of normal mode analysis in modeling dynamics of biomolecular complexes. Structure 13, 373–380.
Skaerven, L., Martinez, A., Reuter, N., 2011. Principal component and normal mode analysis of proteins; a quantitative comparison using the GroEL subunit. Proteins 79, 232–243.
Wako, H., Endo, S., 2017. Normal mode analysis as a method to derive protein dynamics information from the Protein Data Bank. Biophys. Rev. 9, 877–893.

Further reading

Allen, M.P., 2004. Introduction to molecular dynamics simulation. In: Attig, N., et al. (Eds.), Computational Soft Matter: From Synthetic Polymers to Proteins, NIC Series, vol. 23, pp. 1–28. ISBN: 3-00-012641-4.
Atkins, P., Friedman, R., 2011. Molecular Quantum Mechanics. Oxford University Press.
Bahar, I., et al., 2010. Global dynamics of proteins: bridging between structure and function. Annu. Rev. Biophys. 39, 23–42.
Bahar, I., et al., 2010. Normal mode analysis of biomolecular structures: functional mechanisms of membrane proteins. Chem. Rev. 110 (3), 1463–1497.
Coutsias, E.A., Wester, M.J., 2019. RMSD and symmetry. J. Comput. Chem. 40 (15), 1496–1508.
Schlegel, H.B., 2011. Geometry optimization. WIREs Comput. Mol. Sci. 1 (5), 790–809.
