A Numerical Library in C for Scientists and Engineers 084937376X, 9780849373763

This extensive library of computer programs-written in C language-allows readers to solve numerical problems in areas of


English Pages 816 [798] Year 1994


Table of contents :
A Numerical Library in C for Scientists and Engineers
Dedication
Contents
Chapter 1 Elementary Procedures
1.1 Real vector and matrix - Initialization
A. inivec
B. inimat
C. inimatd
D. inisymd
E. inisymrow
1.2 Real vector and matrix - Duplication
A. dupvec
B. dupvecrow
C. duprowvec
D. dupveccol
E. dupcolvec
F. dupmat
1.3 Real vector and matrix - Multiplication
A. mulvec
B. mulrow
C. mulcol
D. colcst
E. rowcst
1.4 Real vector vector products
A. vecvec
B. matvec
C. tamvec
D. matmat
E. tammat
F. mattam
G. seqvec
H. scaprd1
I. symmatvec
1.5 Real matrix vector products
A. fulmatvec
B. fultamvec
C. fulsymmatvec
D. resvec
E. symresvec
1.6 Real matrix matrix products
A. hshvecmat
B. hshcolmat
C. hshrowmat
D. hshvectam
E. hshcoltam
F. hshrowtam
1.7 Real vector and matrix - Elimination
A. elmvec
B. elmcol
C. elmrow
D. elmveccol
E. elmcolvec
F. elmvecrow
G. elmrowvec
H. elmcolrow
I. elmrowcol
J. maxelmrow
1.8 Real vector and matrix - Interchanging
A. ichvec
B. ichcol
C. ichrow
D. ichrowcol
E. ichseqvec
F. ichseq
1.9 Real vector and matrix - Rotation
A. rotcol
B. rotrow
1.10 Real vector and matrix - Norms
A. infnrmvec
B. infnrmrow
C. infnrmcol
D. infnrmmat
E. onenrmvec
F. onenrmrow
G. onenrmcol
H. onenrmmat
I. absmaxmat
1.11 Real vector and matrix - Scaling
reascl
1.12 Complex vector and matrix - Multiplication
A. comcolcst
B. comrowcst
1.13 Complex vector and matrix - Scalar products
A. commatvec
B. hshcomcol
C. hshcomprd
1.14 Complex vector and matrix - Elimination
A. elmcomveccol
B. elmcomcol
C. elmcomrowvec
1.15 Complex vector and matrix - Rotation
A. rotcomcol
B. rotcomrow
C. chsh2
1.16 Complex vector and matrix - Norms
comeucnrm
1.17 Complex vector and matrix - Scaling
A. comscl
B. sclcom
1.18 Complex monadic operations
A. comabs
B. comsqrt
C. carpol
1.19 Complex dyadic operations
A. commul
B. comdiv
1.20 Long integer arithmetic
A. lngintadd
B. lngintsubtract
C. lngintmult
D. lngintdivide
E. lngintpower
Chapter 2 Algebraic Evaluations
2.1 Evaluation of polynomials in Grunert form
A. pol
B. taypol
C. norderpol
D. derpol
2.2 Evaluation of general orthogonal polynomials
A. ortpol
B. ortpolsym
C. allortpol
D. allortpolsym
E. sumortpol
F. sumortpolsym
2.3 Evaluation of Chebyshev polynomials
A. chepolsum
B. oddchepolsum
C. chepol
D. allchepol
2.4 Evaluation of Fourier series
A. sinser
B. cosser
C. fouser
D. fouser1
E. fouser2
F. comfouser
G. comfouser1
H. comfouser2
2.5 Evaluation of continued fractions
jfrac
2.6 Transformation of polynomial representation
A. polchs
B. chspol
C. polshtchs
D. shtchspol
E. grnnew
F. newgrn
G. lintfmpol
2.7 Operations on orthogonal polynomials
intchs
Chapter 3 Linear Algebra
3.1 Full real general matrices
3.1.1 Preparatory procedures
A. dec
B. gsselm
C. onenrminv
D. erbelm
E. gsserb
F. gssnri
3.1.2 Calculation of determinant
determ
3.1.3 Solution of linear equations
A. sol
B. decsol
C. solelm
D. gsssol
E. gsssolerb
3.1.4 Matrix inversion
A. inv
B. decinv
C. inv1
D. gssinv
E. gssinverb
3.1.5 Iteratively improved solution
A. itisol
B. gssitisol
C. itisolerb
D. gssitisolerb
3.2 Real Symmetric positive definite matrices
3.2.1 Preparatory procedures
A. chldec2
B. chldec1
3.2.2 Calculation of determinant
A. chldeterm2
B. chldeterm1
3.2.3 Solution of linear equations
A. chlsol2
B. chlsol1
C. chldecsol2
D. chldecsol1
3.2.4 Matrix inversion
A. chlinv2
B. chlinv1
C. chldecinv2
D. chldecinv1
3.3 General real symmetric matrices
3.3.1 Preparatory procedure
decsym2
3.3.2 Calculation of determinant
determsym2
3.3.3 Solution of linear equations
A. solsym2
B. decsolsym2
3.4 Real full rank overdetermined systems
3.4.1 Preparatory procedures
A. lsqortdec
B. lsqdglinv
3.4.2 Least squares solution
A. lsqsol
B. lsqortdecsol
3.4.3 Inverse matrix of normal equations
lsqinv
3.4.4 Least squares with linear constraints
A. lsqdecomp
B. lsqrefsol
3.5 Other real matrix problems
3.5.1 Solution of overdetermined systems
A. solsvdovr
B. solovr
3.5.2 Solution of underdetermined systems
A. solsvdund
B. solund
3.5.3 Solution of homogeneous equation
A. homsolsvd
B. homsol
3.5.4 Pseudo-inversion
A. psdinvsvd
B. psdinv
3.6 Real sparse non-symmetric band matrices
3.6.1 Preparatory procedure
decbnd
3.6.2 Calculation of determinant
determbnd
3.6.3 Solution of linear equations
A. solbnd
B. decsolbnd
3.7 Real sparse non-symmetric tridiagonal matrices
3.7.1 Preparatory procedures
A. dectri
B. dectripiv
3.7.2 Solution of linear equations
A. soltri
B. decsoltri
C. soltripiv
D. decsoltripiv
3.8 Sparse symmetric positive definite band matrices
3.8.1 Preparatory procedure
chldecbnd
3.8.2 Calculation of determinant
chldetermbnd
3.8.3 Solution of linear equations
A. chlsolbnd
B. chldecsolbnd
3.9 Symmetric positive definite tridiagonal matrices
3.9.1 Preparatory procedure
decsymtri
3.9.2 Solution of linear equations
A. solsymtri
B. decsolsymtri
3.10 Sparse real matrices - Iterative methods
conjgrad
3.11 Similarity transformation
3.11.1 Equilibration - real matrices
A. eqilbr
B. baklbr
3.11.2 Equilibration - complex matrices
A. eqilbrcom
B. baklbrcom
3.11.3 To Hessenberg form - real symmetric
A. tfmsymtri2
B. baksymtri2
C. tfmprevec
D. tfmsymtri1
E. baksymtri1
3.11.4 To Hessenberg form - real asymmetric
A. tfmreahes
B. bakreahes1
C. bakreahes2
3.11.5 To Hessenberg form - complex Hermitian
A. hshhrmtri
B. hshhrmtrival
C. bakhrmtri
3.11.6 To Hessenberg form - complex non-Hermitian
A. hshcomhes
B. bakcomhes
3.12 Other transformations
3.12.1 To bidiagonal form - real matrices
A. hshreabid
B. psttfmmat
C. pretfmmat
3.13 The (ordinary) eigenvalue problem
3.13.1 Real symmetric tridiagonal matrices
A. valsymtri
B. vecsymtri
C. qrivalsymtri
D. qrisymtri
3.13.2 Real symmetric full matrices
A. eigvalsym2
B. eigsym2
C. eigvalsym1
D. eigsym1
E. qrivalsym2
F. qrisym
G. qrivalsym1
3.13.3 Symmetric matrices - Auxiliary procedures
A. mergesort
B. vecperm
C. rowperm
3.13.4 Symmetric matrices - Orthogonalization
orthog
3.13.5 Symmetric matrices - Iterative improvement
symeigimp
3.13.6 Asymmetric matrices in Hessenberg form
A. reavalqri
B. reaveches
C. reaqri
D. comvalqri
E. comveches
3.13.7 Real asymmetric full matrices
A. reaeigval
B. reaeig1
C. reaeig3
D. comeigval
E. comeig1
3.13.8 Complex Hermitian matrices
A. eigvalhrm
B. eighrm
C. qrivalhrm
D. qrihrm
3.13.9 Complex upper-Hessenberg matrices
A. valqricom
B. qricom
3.13.10 Complex full matrices
A. eigvalcom
B. eigcom
3.14 The generalized eigenvalue problem
3.14.1 Real asymmetric matrices
A. qzival
B. qzi
C. hshdecmul
D. hestgl3
E. hestgl2
F. hsh2col
G. hsh3col
H. hsh2row3
I. hsh2row2
J. hsh3row3
K. hsh3row2
3.15 Singular values
3.15.1 Real bidiagonal matrices
A. qrisngvalbid
B. qrisngvaldecbid
3.15.2 Real full matrices
A. qrisngval
B. qrisngvaldec
3.16 Zeros of polynomials
3.16.1 Zeros of general real polynomials
A. zerpol
B. bounds
3.16.2 Zeros of orthogonal polynomials
A. allzerortpol
B. lupzerortpol
C. selzerortpol
D. alljaczer
E. alllagzer
3.16.3 Zeros of complex polynomials
comkwd
Chapter 4 Analytic Evaluations
4.1 Evaluation of an infinite series
A. euler
B. sumposseries
4.2 Quadrature
4.2.1 One-dimensional quadrature
A. qadrat
B. integral
4.2.2 Multidimensional quadrature
tricub
4.2.3 Gaussian quadrature - General weights
A. reccof
B. gsswts
C. gsswtssym
4.2.4 Gaussian quadrature - Special weights
A. gssjacwghts
B. gsslagwghts
4.3 Numerical differentiation
4.3.1 Calculation with difference formulas
A. jacobnnf
B. jacobnmf
C. jacobnbndf
Chapter 5 Analytic Problems
5.1 Non-linear equations
5.1.1 Single equation - No derivative available
A. zeroin
B. zeroinrat
5.1.2 Single equation - Derivative available
zeroinder
5.1.3 System of equations - No Jacobian matrix
A. quanewbnd
B. quanewbnd1
5.2 Unconstrained optimization
5.2.1 One variable - No derivative
minin
5.2.3 One variable - Derivative available
mininder
5.2.4 More variables - Auxiliary procedures
A. linemin
B. rnk1upd
C. davupd
D. fleupd
5.2.5 More variables - No derivatives
praxis
5.2.6 More variables - Gradient available
A. rnk1min
B. flemin
5.3 Overdetermined nonlinear systems
5.3.1 Least squares - With Jacobian matrix
A. marquardt
B. gssnewton
5.4 Differential equations - Initial value problems
5.4.1 First order - No derivatives right hand side
A. rk1
B. rke
C. rk4a
D. rk4na
E. rk5na
F. multistep
G. diffsys
H. ark
I. efrk
5.4.2 First Order - Jacobian matrix available
A. efsirk
B. eferk
C. liniger1vs
D. liniger2
E. gms
F. impex
5.4.3 First Order - Several derivatives available
A. modifiedtaylor
B. eft
5.4.4 Second order - No derivatives right hand side
A. rk2
B. rk2n
C. rk3
D. rk3n
5.4.5 Initial boundary value problem
arkmat
5.5 Two point boundary value problems
5.5.1 Linear methods - Second order self adjoint
A. femlagsym
B. femlag
C. femlagspher
5.5.2 Linear methods - Second order skew adjoint
femlagskew
5.5.3 Linear methods - Fourth order self adjoint
femhermsym
5.5.4 Non-linear methods
nonlinfemlagskew
5.6 Two-dimensional boundary value problems
5.6.1 Elliptic special linear systems
A. richardson
B. elimination
5.6 Parameter estimation in differential equations
5.6.1 Initial value problems
peide
Chapter 6 Special Functions
6.1 Elementary functions
6.1.1 Hyperbolic functions
A. arcsinh
B. arccosh
C. arctanh
6.1.2 Logarithmic functions
logoneplusx
6.2 Exponential integral
6.2.1 Exponential integral
A. ei
B. eialpha
C. enx
D. nonexpenx
6.2.2 Sine and cosine integral
A. sincosint
B. sincosfg
6.3 Gamma function
A. recipgamma
B. gamma
C. loggamma
D. incomgam
E. incbeta
F. ibpplusn
G. ibqplusn
H. ixqfix
I. ixpfix
J. forward
K. backward
6.4 Error function
A. errorfunction
B. nonexperfc
C. inverseerrorfunction
D. fresnel
E. fg
6.5 Bessel functions of integer order
6.5.1 Bessel functions J and Y
A. bessj0
B. bessj1
C. bessj
D. bessy01
E. bessy
F. besspq0
G. besspq1
6.5.2 Bessel functions I and K
A. bessi0
B. bessi1
C. bessi
D. bessk01
E. bessk
F. nonexpbessi0
G. nonexpbessi1
H. nonexpbessi
I. nonexpbessk01
J. nonexpbessk
6.6 Bessel functions of real order
6.6.1 Bessel functions J and Y
A. bessjaplusn
B. bessya01
C. bessyaplusn
D. besspqa01
E. besszeros
F. start
6.6.2 Bessel functions I and K
A. bessiaplusn
B. besska01
C. besskaplusn
D. nonexpbessiaplusn
E. nonexpbesska01
F. nonexpbesskaplusn
6.6.3 Spherical Bessel functions
A. spherbessj
B. spherbessy
C. spherbessi
D. spherbessk
E. nonexpspherbessi
F. nonexpspherbessk
6.6.4 Airy functions
A. airy
B. airyzeros
Chapter 7 Interpolation and Approximation
7.1 Real data in one dimension
7.1.1 Interpolation with general polynomials
newton
7.1.2 Approximation in infinity norm
A. ini
B. sndremez
C. minmaxpol
Worked Examples
Examples for chapter 1 procedures
hshcomcol, hshcomprd
elmcomcol
rotcomcol
comabs
comsqrt
carpol
commul
comdiv
lngintadd, lngintsubtract, lngintmult, lngintdivide, lngintpower
Examples for chapter 2 procedures
derpol
allortpol
chepolsum
oddchepolsum
chepol, allchepol
fouser
jfrac
chspol, polchs
polshtchs, shtchspol
newgrn, grnnew
lintfmpol
intchs
Examples for chapter 3 procedures
determ, gsselm
decsol
gsssol
gsssolerb
decinv
gssinv
gssinverb
gssitisol
gssitisolerb
chldec2, chlsol2, chlinv2
chldec1, chlsol1, chlinv1
chldecsol2, chldeterm2, chldecinv2
chldecsol1, chldeterm1, chldecinv1
determsym2
decsolsym2
lsqortdec, lsqsol, lsqdglinv
lsqortdecsol
lsqinv
lsqdecomp, lsqrefsol
solovr
solund
homsol
psdinv
solbnd, decbnd, determbnd
decsolbnd
decsoltri
soltripiv
decsoltripiv
chlsolbnd, chldecbnd, chldetermbnd
chldecsolbnd, chldetermbnd
decsolsymtri
conjgrad
eqilbrcom
hshhrmtri
valsymtri, vecsymtri
eigsym1
symeigimp
comvalqri, comveches
reaeig3
eighrm
qrihrm
valqricom
qricom
eigcom
qzival
qzi
qrisngvaldec
zerpol, bounds
allzerortpol
lupzerortpol
selzerortpol
alljaczer
alllagzer
comkwd
Examples for chapter 4 procedures
euler
sumposseries
qadrat
integral
tricub
reccof
gsswtssym
gssjacwghts
gsslagwghts
jacobnnf
jacobnmf
jacobnbndf
Examples for chapter 5 procedures
zeroin
zeroinrat
zeroinder
quanewbnd1
minin
mininder
praxis
rnk1min, flemin
marquardt
gssnewton
rk1
rke
rk4a
rk4na
rk5na
multistep
diffsys
ark
efrk
efsirk
eferk
liniger1vs
liniger2
gms
impex
modifiedtaylor
eft
rk2
rk2n
rk3
rk3n
arkmat
femlagsym
femlag
femlagspher
femlagskew
femhermsym
nonlinfemlagskew
richardson
elimination
peide
Examples for chapter 6 and 7
ei
eialpha
enx, nonexpenx
sincosint, sincosfg
recipgamma
gamma
loggamma
incomgam
incbeta
ibpplusn
ibqplusn
errorfunction, nonexperfc
inverseerrorfunction
fresnel, fg
bessj0, bessj1, bessj
bessy01
bessy
besspq0, besspq1
bessi, bessk
bessk01
nonexpbessk01
nonexpbessk
bessjaplusn
besspqa01
besszeros
spherbessi, nonexpspherbessi
spherbessk, nonexpspherbessk
airy
airyzeros
newton
ini
sndremez
minmaxpol
Appendix A: References
Appendix B: Prototype Declarations
Appendix C: Procedure Descriptions
Appendix D: Memory Management Utilities

A Numerical Library in C

for Scientists and Engineers

H.T. Lau, Ph.D.

CRC Press Boca Raton Ann Arbor London Tokyo

Copyright 1995 by CRC Press, Inc

LIMITED WARRANTY CRC Press warrants the physical diskette(s) enclosed herein to be free of defects in materials and workmanship for a period of thirty days from the date of purchase. If within the warranty period CRC Press receives written notification of defects in materials or workmanship, and such notification is determined by CRC Press to be correct, CRC Press will replace the defective diskette(s). The entire and exclusive liability and remedy for breach of this Limited Warranty shall be limited to replacement of defective diskette(s) and shall not include or extend to any claim for or right to cover any other damages, including but not limited to, loss of profit, data, or use of the software, or special, incidental, or consequential damages or other similar claims, even if CRC Press has been specifically advised of the possibility of such damages. In no event will the liability of CRC Press for any damages to you or any other person ever exceed the lower suggested list price or actual price paid for the software, regardless of any form of the claim. CRC Press SPECIFICALLY DISCLAIMS ALL OTHER WARRANTIES, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO, ANY IMPLIED WARRANTY OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE. Specifically, CRC Press makes no representation or warranty that the software is fit for any particular purpose and any implied warranty of merchantability is limited to the thirty-day duration of the Limited Warranty covering the physical diskette(s) only (and not the software) and is otherwise expressly and specifically disclaimed. Since some states do not allow the exclusion of incidental or consequential damages, or the limitation on how long an implied warranty lasts, some of the above may not apply to you

DISCLAIMER OF WARRANTY AND LIMITS OF LIABILITY: The author(s) of this book have used their best efforts in preparing this material. These efforts include the development, research, and testing of the theories and programs to determine their effectiveness. NEITHER THE AUTHOR(S) NOR THE PUBLISHER MAKE WARRANTIES OF ANY KIND, EXPRESS OR IMPLIED, WITH REGARD TO THESE PROGRAMS OR THE DOCUMENTATION CONTAINED IN THIS BOOK, INCLUDING WITHOUT LIMITATION WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE. NO LIABILITY IS ACCEPTED IN ANY EVENT FOR ANY DAMAGES, INCLUDING INCIDENTAL OR CONSEQUENTIAL DAMAGES, LOST PROFITS, COSTS OF LOST DATA OR PROGRAM MATERIAL, OR OTHERWISE IN CONNECTION WITH OR ARISING OUT OF THE FURNISHING, PERFORMANCE, OR USE OF THE PROGRAMS IN THIS BOOK.

Library of Congress Cataloging-in-Publication Data

Lau, H. T. (Hang Tong), 1952-
Numerical library in C for scientists and engineers / Hang-Tong Lau.
p. cm.
Includes bibliographical references and index.
ISBN 0-8493-7376-X
1. C (Computer program language) I. Title.
QA76.73.C15L38 1994
519.4'0285'53--dc20

94-37928 CIP

This book contains information obtained from authentic and highly regarded sources. Reprinted material is quoted with permission, and sources are indicated. A wide variety of references are listed. Reasonable efforts have been made to publish reliable data and information, but the author(s) and the publisher cannot assume responsibility for the validity of all materials or for the consequences of their use. Neither this book nor any part may be reproduced or transmitted in any form or by any means, electronic or mechanical, including photocopying, microfilming, and recording, or by any information storage or retrieval system, without prior permission in writing from the publisher. CRC Press, Inc.'s consent does not extend to copying for general distribution, for promotion, for creating new works, or for resale. Specific permission must be obtained in writing from CRC Press for such copying. Direct all inquiries to CRC Press, Inc., 2000 Corporate Blvd. N.W., Boca Raton, Florida 33431.

© 1995 by CRC Press, Inc.

No claim to original U.S. Government works
International Standard Book Number 0-8493-7376-X
Library of Congress Card Number 94-37928
Printed in the United States of America 1 2 3 4 5 6 7 8 9 0
Printed on acid-free paper


To my wife, Helen, and the children Matthew, Lawrence and Tabia for their love and support


1. Elementary Procedures

This chapter contains elementary operations for vectors and matrices such as the assignment of initial values to vectors and matrix slices, duplication and interchange of such slices, rotations and reflections. The procedures are taken from [Dek68, DekHo68]. Most of them are used in subsequent chapters. The elementary procedures are all quite short; prepared versions may be re-coded in assembly language, making optimal use of machine capabilities.

1.1 Real vector and matrix - Initialization

A. inivec

Initializes part of a vector a by setting $a_i = x$ ($i = l, l+1, \ldots, u$).

Function Parameters:

void inivec (l,u,a,x)
l,u: int; lower and upper index of the vector a, respectively;
a: float a[l:u]; the array to be initialized;
x: float; initialization constant.

void inivec (int l, int u, float a[], float x)
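The body of this routine did not survive in the text above. The following is a minimal sketch of what such a routine could look like, based only on the parameter description (set a[i] = x for i = l, ..., u). The name inivec_sketch is hypothetical, and the code is not claimed to be the book's own listing.

```c
#include <assert.h>

/* Hypothetical sketch: set a[l..u] to x, following the parameter
   description above. Both bounds are inclusive; nothing is written
   when l > u. */
void inivec_sketch(int l, int u, float a[], float x)
{
	assert(a != 0);
	for (; l <= u; l++)
		a[l] = x;
}
```

Because the bounds are passed explicitly, the same call can initialize a slice of a larger array without touching its neighbours.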

B. inimat

Initializes a rectangular submatrix a by setting $a_{i,j} = x$ ($i = lr, lr+1, \ldots, ur$; $j = lc, lc+1, \ldots, uc$).

Function Parameters:

void inimat (lr,ur,lc,uc,a,x)
lr,ur,lc,uc: int; lower and upper row-index, and lower and upper column-index of the matrix a, respectively;
a: float a[lr:ur,lc:uc]; the matrix to be initialized;
x: float; initialization constant.

void inimat(int lr, int ur, int lc, int uc, float **a, float x)

{
	int j;

	for ( ; lr [...]

[...] =1; i--) {
		t=u[i]+carry;
		carry = (t < 0) ? -1 : 0;
		difference[i]=t-carry*BASE;
	}
	if (carry == -1) {
		difference[0]=0;
		return;
	}
	i=1;
	j=lu;
	while ((difference[i] == 0) && (j > 1)) {
		j--;
		i++;
	}
	difference[0]=j;
	if (j < lu)
		for (i=1; i [...]

[...] =1; i--) {
			t=u[j]*v[i]+product[j+i]+carry;
			carry=t/BASE;
			product[j+i]=t-carry*BASE;
		}
		product[j]=carry;
	}
	if (product[1] == 0) {
		for (i=2; i<=luv; i++) product[i-1]=product[i];
		luv--;
	}
	product[0]=luv;

lngintdivide

Forms the quotient and remainder from two multilength integers, each expressed in the form

where the $i_j$ are single length nonnegative integers, and B is a single length positive integer:

$q_0 = u_0 - v_0 + 1$; $r_0 \le v_0$; $B^2 + B$ not greater than the largest integer having a machine representation.

Function Parameters:

void lngintdivide (u,v,quotient,remainder)
u,v: int u[0:u[0]], v[0:v[0]];
	entry: u contains the dividend, v the divisor (≠0);
quotient,remainder: int quotient[0:u[0]-v[0]+1], remainder[0:v[0]];
	exit: results of the division, u and v remain unchanged.

Method: see the function lngintpower.

#define BASE 100

t

/ * value of B in the above * /

int *allocate-integer-vector(int, int); void free-integer-vector(int * , int); int lu,lv,vl,diff,i,t,scale,d,ql,j, carry,*uu,*a;

if (lv == 1) { carry=O; for (i=l; ic=lu; i++) { t=carry*BASE+u[i]; quotient [il=t/vl; carry=t-quotient[il *vl;

I

iemainder [Ol =l; remainder [ll =carry; if (quotient[ll == 0) { for (i=2; ic=lu; i++) quotient [i-11=quotient [il ; quotient [O]=lu - ((lu == 1) ? 0 : 1) ; } else quotient [O]=lu; return;

if (lu c lv) { quotient [Ol=l; quotient [l]= O ; for (i=O; i 1) ( carry=O ; for (i=l; i=O; i--) z [il=u[il ; while (1) { n=exp/2; if (n+n ! = exp) ( lngintrnult ( y ,z,h) ; for (i=h[O]; i>=O; i--) y[il =h[il ; if (n == 0) { for (i=y[Ol ; i>=O; i--) result [il=y [il ; free-integer-vector (y,0); free-integer-vector(z,O) ; free-integer-vector(h,O) ; return;

1

1

lngintmult (z,z,h) ; for (i=h[O]; i>=O; i--) z [il=h [il ; exp=n;

1

1


2. Algebraic Evaluations 2.1 Evaluation of polynomials in Grunert form

A. pol

Computes the sum of the polynomial

    p(x) = a_0 + a_1 x + ... + a_n x^n

using Horner's rule. The error growth is given by a linear function of the degree of the polynomial [Wi63].

Function Parameters:
    float pol (n,x,a)
pol: delivers the value of p(x) above;
n: int; the degree of the polynomial;
x: float; the argument of the polynomial;
a: float a[0:n]; entry: the coefficients of the polynomial.

float pol(int n, float x, float a[])
{
    float r;

    r=0.0;
    for (; n>=0; n--) r=r*x+a[n];
    return (r);
}
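The nested multiplication performed by pol can be illustrated in isolation. The following is a minimal sketch, assuming double precision instead of the library's float and leaving the argument n unmodified; the name horner_eval is hypothetical and not part of the library.

```c
/* Evaluate p(x) = a[0] + a[1]*x + ... + a[n]*x^n by Horner's rule.
   horner_eval is an illustrative name, not a library function. */
double horner_eval(int n, double x, const double a[])
{
    double r = 0.0;
    int i;

    /* Work from the highest coefficient down: r = (...(a[n]*x + a[n-1])*x + ...) + a[0] */
    for (i = n; i >= 0; i--)
        r = r * x + a[i];
    return r;
}
```

For the coefficients {1, 2, 3} and x = 2 the loop forms 3, then 3*2+2 = 8, then 8*2+1 = 17, using n multiplications and n additions.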

B. taypol

Computes the values of the terms x^j D^j p(x)/j! (j=0,...,k≤n) where

    p(x) = a_0 + a_1 x + ... + a_n x^n.

Function Parameters:
    void taypol (n,k,x,a)
n: int; the degree of the polynomial;
k: int; the first k terms of the above are to be calculated;
x: float; the argument of the polynomial;
a: float a[0:n];
    entry: the coefficients of the polynomial;
    exit: the j-th term x^j*(j-th derivative)/j! is delivered in a[j], j=0,1,...,k≤n, the other elements of a are generally altered.

Method:

The method of evaluation is given in [ShT74]. The more sophisticated algorithm based on divisors of n+l in [ShT74] was not implemented because of the more complex appearance of the implementation and because of the difficulty in choosing the most efficient divisor. In this implementation of the one-parameter family of algorithms, the linear number of multiplications is preserved. See [Wo74] for the k-th normalized derivative.

void taypol (int n, int k, float x, float a [I

)

I i

int i,j,nml; float xj,aa,h; if (X ! = 0.0) { xj=l; for (j=l;j=j; i--) h = a[il += h; I

I

} else { for (; k>=l; n--1 a[kl=O;

1

1

C. norderpol

Computes the first k normalized derivatives D^j p(x)/j! (j=0,...,k≤n) of the polynomial

    p(x) = a_0 + a_1 x + ... + a_n x^n.

Function Parameters:
    void norderpol (n,k,x,a)
n: int; the degree of the polynomial;
k: int; the first k normalized derivatives (j-th derivative / j factorial) are to be calculated;
x: float; the argument of the polynomial;
a: float a[0:n];
    entry: the coefficients of the polynomial;
    exit: the j-th normalized derivative is delivered in a[j], j=0,1,...,k≤n, the other elements of a are generally altered.

Method: see the function taypol.

void norderpol (int n, int k, float x, float a [I)

I

float "allocate-real-vector(int, int); void free-real-vector(f1oat *, int); int i,j,nml;


f l o a t x j , a a , h, *xx; if

(X

! = 0.0)

{

xx=allocate-real-vector(0,n); xj=l; f o r ( j = l ; j=l; i - - ) d=a[il / (b [il+d) ; return (d+b[Ol);

1

2.6 Transformation of polynomial representation

A. polchs

Given the a_k, derives the b_k occurring in the relationship

    a_0 + a_1 x + ... + a_n x^n = b_0 T_0(x) + b_1 T_1(x) + ... + b_n T_n(x),

the T_k(x) being Chebyshev polynomials.

Function Parameters:
    void polchs (n,a)
n: int; the degree of the polynomial;
a: float a[0:n];
    entry: the coefficients of the power sum;
    exit: the coefficients of the Chebyshev sum.

Method:

Although the transformation of representations of polynomials could have been obtained by fast evaluation and fast interpolation, the algorithm of Hamming [H73] was implemented here because of its simple appearance.

void polchs(int n, float a[])
{
    int k,l,twopow;

    if (n > 1) {
        twopow=2;
        for (k=1; k<=n-2; k++) {
            a[k] /= twopow;
            twopow *= 2;
        }
        a[n-1]=2.0*a[n-1]/twopow;
        a[n] /= twopow;
        a[n-2] += a[n];
        for (k=n-2; k>=1; k--) {
            a[k-1] += a[k+1];
            a[k]=2.0*a[k]+a[k+2];
            for (l=k+1; l<=n-2; l++) a[l] += a[l+2];
        }
    }
}

B. chspol

Given the b_k, derives the a_k occurring in the relationship

    a_0 + a_1 x + ... + a_n x^n = b_0 T_0(x) + b_1 T_1(x) + ... + b_n T_n(x),

the T_k(x) being Chebyshev polynomials.

Function Parameters:
    void chspol (n,a)
n: int; the degree of the polynomial;
a: float a[0:n];
    entry: the coefficients of the Chebyshev sum;
    exit: the coefficients of the power sum.

Method: see the function polchs.

void chspol (int n, float a [I) 1

int k,1,twopow; if (n > 1) ( for (k=O;kc=n-2; k++) { for (l=n-2;l>=k; I--) a[ll - = a[1+21; twopow=2; for (k=l;kc=n-2; k++) { a [k] *= twopow; twopow *= 2; 1


C. polshtchs

Given the a_k, derives the b_k occurring in the relationship

    a_0 + a_1 x + ... + a_n x^n = b_0 S_0(x) + b_1 S_1(x) + ... + b_n S_n(x),

the S_k(x) being shifted Chebyshev polynomials defined by S_k(x) = T_k(2x-1), T_k(x) being a Chebyshev polynomial.

Function Parameters:
    void polshtchs (n,a)
n: int; the degree of the polynomial;
a: float a[0:n];
    entry: the coefficients of the power sum;
    exit: the coefficients of the shifted Chebyshev sum.

Functions used: lintfmpol, polchs.

Method: see the function polchs.

void polshtchs(int n, float a[])
{
    void lintfmpol(float, float, int, float []);
    void polchs(int, float []);

    lintfmpol(0.5,0.5,n,a);
    polchs(n,a);
}

D. shtchspol

Given the b_k, derives the a_k occurring in the relationship

    a_0 + a_1 x + ... + a_n x^n = b_0 S_0(x) + b_1 S_1(x) + ... + b_n S_n(x),

the S_k(x) being shifted Chebyshev polynomials defined by S_k(x) = T_k(2x-1), T_k(x) being a Chebyshev polynomial.

Function Parameters:
    void shtchspol (n,a)
n: int; the degree of the polynomial;
a: float a[0:n];
    entry: the coefficients of the shifted Chebyshev sum;
    exit: the coefficients of the power sum.

Functions used: lintfmpol, chspol.

Method: see the function polchs.

void shtchspol(int n, float a[])
{
    void chspol(int, float []);
    void lintfmpol(float, float, int, float []);

    chspol(n,a);
    lintfmpol(2.0,-1.0,n,a);
}

E. grnnew

Given the coefficients a_i occurring in the polynomial

    p(x) = a_0 + a_1 x + ... + a_n x^n

and the tabulation points x_i, computes the divided differences δ_j in the equivalent Newton series representation

    p(x) = δ_0 + δ_1 (x-x_0) + δ_2 (x-x_0)(x-x_1) + ... + δ_n (x-x_0)(x-x_1)...(x-x_{n-1}).

Function Parameters:
    void grnnew (n,x,a)
n: int; the degree of the polynomial;
x: float x[0:n-1]; entry: the interpolation points, values of x_i above;
a: float a[0:n];
    entry: the coefficients of the power sum;
    exit: the coefficients of the Newton sum, values of δ_j above.

Method: see the function polchs.

void grnnew(int n, float x[], float a[])
{
    int k,l;

    for (k=n-1; k>=0; k--)
        for (l=n-1; l>=n-1-k; l--)
            a[l] += a[l+1]*x[n-1-k];
}
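The conversion from power form to Newton form can be checked by evaluating both forms at the same point. The sketch below assumes double precision instead of float; power_to_newton repeats the loop structure of grnnew, while newton_eval is a hypothetical helper added only for the check.

```c
/* In-place conversion of power-sum coefficients a[0..n] into Newton-form
   coefficients for the points x[0..n-1]; same loops as grnnew, in double. */
void power_to_newton(int n, const double x[], double a[])
{
    int k, l;

    for (k = n - 1; k >= 0; k--)
        for (l = n - 1; l >= n - 1 - k; l--)
            a[l] += a[l + 1] * x[n - 1 - k];
}

/* Hypothetical helper: evaluate
   d[0] + d[1](t-x[0]) + ... + d[n](t-x[0])...(t-x[n-1])
   by nested multiplication. */
double newton_eval(int n, const double x[], const double d[], double t)
{
    double r = d[n];
    int j;

    for (j = n - 1; j >= 0; j--)
        r = r * (t - x[j]) + d[j];
    return r;
}
```

For p(x) = 1 + 2x + 3x^2 with points x_0 = 1, x_1 = 2 the Newton coefficients are 6, 11, 3, and both forms give p(3) = 34.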

F. newgrn

Given the coefficients δ^j f(x_0), together with the values of the arguments x_i from which they are formed, in the truncated Newton interpolation series

    δ^0 f(x_0) + δ^1 f(x_0) (x-x_0) + ... + δ^n f(x_0) (x-x_0)(x-x_1)...(x-x_{n-1}),

computes the coefficients c_i, i=0,...,n, in the equivalent polynomial form

    c_0 + c_1 x + ... + c_n x^n.

Function Parameters:
    void newgrn (n,x,a)
n: int; the degree of the polynomial;
x: float x[0:n-1]; entry: the interpolation points, values of x_i above;
a: float a[0:n];
    entry: the coefficients of the Newton sum, values of δ^j f(x_0);
    exit: the coefficients of the power sum, values of c_i above.

Function used: elmvec.

Method: see the function polchs.

void newgrn(int n, float x[], float a[])
{
    void elmvec(int, int, int, float [], float [], float);
    int k;

    for (k=n-1; k>=0; k--) elmvec(k,n-1,1,a,a,-x[k]);
}

G. lintfmpol

Given the a_i occurring in the polynomial expression

    a_0 + a_1 x + ... + a_n x^n

and p, q, derives the b_j occurring in the equivalent expression

    b_0 + b_1 y + ... + b_n y^n

where x=py+q.

Function Parameters:
    void lintfmpol (p,q,n,a)
p,q: float; entry: defining the linear transformation of the independent variable x=py+q; p=0 gives the value of the polynomial with argument q;
n: int; the degree of the polynomial;
a: float a[0:n];
    entry: the coefficients of the power sum in x, values of a_i above;
    exit: the coefficients of the power sum in y, values of b_j above.

Function used: norderpol.

Method: see the function polchs.

void lintfmpol(float p, float q, int n, float a[])
{
    void norderpol(int, int, float, float []);
    int k;
    float ppower;

    norderpol(n,n,q,a);
    ppower=p;
    for (k=1; k<=n; k++) {
        a[k] *= ppower;
        ppower *= p;
    }
}
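The substitution x=py+q can be sketched without the library: first expand the polynomial about q, so that a[k] holds the k-th normalized derivative at q, then scale a[k] by p^k. The sketch below obtains the expansion by repeated synthetic division, which is a stand-in for the call of norderpol and not necessarily the method that function uses; the name linear_transform_poly and the use of double are assumptions.

```c
/* Replace the coefficients a[0..n] of a polynomial in x by those of the
   same polynomial expressed in y, where x = p*y + q.
   linear_transform_poly is an illustrative name, not a library function. */
void linear_transform_poly(double p, double q, int n, double a[])
{
    int i, j;
    double ppower = 1.0;

    /* Taylor shift by repeated synthetic division:
       afterwards a[k] = (k-th derivative at q) / k! */
    for (j = 0; j < n; j++)
        for (i = n - 1; i >= j; i--)
            a[i] += q * a[i + 1];

    /* (x-q)^k = p^k * y^k, so scale the k-th coefficient by p^k */
    for (i = 1; i <= n; i++) {
        ppower *= p;
        a[i] *= ppower;
    }
}
```

With p = 2 and q = 1 the polynomial x^2 becomes (2y+1)^2 = 1 + 4y + 4y^2.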

2.7 Operations on orthogonal polynomials

intchs

Given the real coefficients a_j in the expansion

    f(x) = a_0 T_0(x) + a_1 T_1(x) + ... + a_n T_n(x),

where T_j(x) is the Chebyshev polynomial of the first kind of degree j, those in the expansion

    ∫ f(x) dx = b_1 T_1(x) + b_2 T_2(x) + ... + b_{n+1} T_{n+1}(x)    (up to an additive constant)

are derived.

Function Parameters:
    void intchs (n,a,b)
n: int; the degree of the polynomial represented by the Chebyshev series;
a,b: float a[0:n], b[1:n+1];
    entry: the coefficients of the Chebyshev series, values of a_j above;
    exit: the coefficients of the integral Chebyshev series, values of b_j above.

Method:

For a description of the algorithm see [Cle62, FoP68].

void intchs(int n, float a[], float b[])
{
    int i;
    float h,l,dum;

    if (n == 0) {
        b[1]=a[0];
        return;
    }
    if (n == 1) {
        b[2]=a[1]/4.0;
        b[1]=a[0];
        return;
    }
    h=a[n];
    dum=a[n-1];
    b[n+1]=h/((n+1)*2);
    b[n]=dum/(n*2);
    for (i=n-1; i>=2; i--) {
        l=a[i-1];
        b[i]=(l-h)/(2*i);
        h=dum;
        dum=l;
    }
    b[1]=a[0]-h/2.0;
}
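The coefficient relations used by intchs can be written directly: b_1 = a_0 - a_2/2 and b_j = (a_{j-1} - a_{j+1})/(2j) for j ≥ 2, with a_j taken as zero for j > n. The sketch below is an assumption-laden restatement of those relations in double precision; the name cheb_integrate is hypothetical, b[0] is unused, and the additive constant of the integral is not represented.

```c
/* Coefficients b[1..n+1] of the integrated Chebyshev series, given the
   coefficients a[0..n] of the series itself. b must have n+2 elements;
   b[0] is left untouched. cheb_integrate is an illustrative name. */
void cheb_integrate(int n, const double a[], double b[])
{
    int j;

    for (j = 1; j <= n + 1; j++) {
        double lower = a[j - 1];                       /* a_{j-1} */
        double upper = (j + 1 <= n) ? a[j + 1] : 0.0;  /* a_{j+1}, zero beyond degree n */

        if (j == 1)
            lower *= 2.0;      /* T_0 integrates to T_1 with weight 1, not 1/2 */
        b[j] = (lower - upper) / (2.0 * j);
    }
}
```

For f = T_0 + T_2 this gives b_1 = 1/2, b_2 = 0 and b_3 = 1/6, in agreement with ∫T_2 dx = T_3/6 - T_1/2.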


3. Linear Algebra 3.1 Full real general matrices 3.1.1 Preparatory procedures

A. dec Decomposes the nxn matrix A in the form LU=PA, where L is lower triangular, U is unit upper triangular and P is a permutation matrix.

Function Parameters: void dec (a,n,aux,p) a: float a[l:n,l:n]; entry: the matrix to be decomposed; the calculated lower triangular matrix and unit upper triangular matrix with its exit: unit diagonal omitted; n: int; the order of the matrix; aux: float aux[l:3]; entry: aux[2]: a relative tolerance: a reasonable choice for this value is an estimate of the relative precision of the matrix elements; however, it should not be chosen smaller than the machine precision; exit: awc[l]: if R is the number of elimination steps performed (see aux[3J), then aux[l] equals 1 if the determinant of the principal submatrix of order R is positive, else aux[l] equals -1; aux[3]: the number of elimination steps performed; if aux[3] max) -max=aid; 1 I

rgrow += max; for (r=l; r aux[lO] then the process has been broken off because the number of iterations exceeded the value given in aux[l2]; the 1-norm of the residual vector r above; aux[l3]: ri: int ri[l:n]; entry: the pivotal row-indices, as produced by gsselm; ci: int ci[l:n]; entry: the pivotal column-indices, as produced by gsselm; b: float b[l:n]; entry: the right hand side of the linear system; exit: the calculated solution of the linear system. Functions used: Method:

solelm, inivec, dupvec.

If the condition of the matrix is not too bad then the precision of the calculated solution will be of the order of the precision asked for in aux[10]. If the condition of the matrix is very bad then this process will possibly not converge or, in exceptional cases, converge to a useless result. If the user wants to make certain about the precision of the calculated solution then the function itisolerb should be used. itisol leaves a, lu, ri and ci unaltered, so after one call of gsselm several calls of itisol may follow to calculate the solution of several linear systems with the same matrix but different right hand sides.

void itisol(f1oat **a, float **lu, int n, float aux[l, int ri[], int ci[], float b[l) ( float *allocate-real-vector(int, int); void free-real-vector(f1oat * , int); void solelm(f1oat * * , int, int [I, int [I, float [I ) void inivec (int, int, float [ I , float); void dupvec (int, int, int, float [I , float [I ) ; int i,j,iter,maxiter; float maxerx,erx,nrmres,nrms~l,r,rr,*res,*sol; double dtemp;

;

res=allocate~real~vector(l,n); sol=allocate-real-vector(1,n) ; maxerx=erx=aux[lo1 ; maxiter=aux [l21 ; inivec(l,n,sol,O.O); dupvec(l,n,O,res,b); iter=l; do { solelm(lu,n,ri,ci, res) ; erx=nrmsol=nrmres=O.O; for (i=l; i=l; i--) { r=l.O/a Iil [il ; il=i+l: dupvecrow(il,n,i,u,a); for (j=n; js=il; j - - ) j,a,u))*r; a[i] [j] = -(tamvec(il,j,j,a,u)+matvec(j+l,n, a[i] [i]=(r-matvec(il,n,i,a,u)) *r;

B. chlinvl

Calculates the inverse X of a symmetric positive definite matrix A, provided that the matrix has been decomposed (A = U^T U, where U is the Cholesky matrix) by a successful call of chldecl or chldecsoll. The upper triangular part of the Cholesky matrix must be given columnwise in a one-dimensional array. The inverse X is obtained from the conditions that X be symmetric and UX be a lower triangular matrix whose main diagonal elements are the reciprocals of the diagonal elements of U. The upper triangular elements of X are calculated by back substitution.

Function Parameters:
    void chlinvl (a,n)
a: float a[1:(n+1)n/2];
    entry: the upper triangular part of the Cholesky matrix as produced by chldecl or chldecsoll must be given columnwise in array a;
    exit: the upper triangular part of the inverse matrix is delivered columnwise in array a;
n: int; entry: the order of the matrix.

Functions used: seqvec, symmatvec.

void chlinvl (float a [I , int n)

I float *allocate-real-vector(int, int); void free-real-vector(f1oat *, int); float seqvec (int, int, int, int, float [ I , float [I float symmatvec (int, int, int, float [I , float [I ) ; int i,ii,il,j,ij; float r,*u;

;

u=allocate-real-vector (1,n); ii= ( (n+l)*n)/2; for (i=n; i>=l; i--) { r=l.0/a [iil ; il=i+l; ij=ii+i; for (j=il; j a h , set P(')=Z, s = l .

bl: obtain a =maxA /:

1

(1 sk,rsC k+;

b2: if

~ A , ( f ) , l o > a hset ~, P(~=I, s = l .

cl: if

~A;,!l>ao,

set P("z~,~, s = l .

dl: set P ( ~ = I , , ~ ,s = 2 .


1

;

With

the choice of α = (1 + 17^(1/2))/8 above leads to the element growth inequality

When D_{i,i+1}≠0, L_{i+1,i}=0, so that if the block structure of D is known, the elements of D and L may be stored in the closed upper triangular part of the two dimensional array in which the elements of A are stored at entry. At exit from decsym2, the successive locations of the one-dimensional integer array p in the parameter list of decsym2 contain not only the pivot reference integers associated with the permutation matrices P above, but also information concerning the block structure of D: if p[i]>0 and p[i+1]=0, D_{i,i+1}≠0 and L_{i+1,i}=0. Upon successful exit from decsym2, the successive locations of the one-dimensional real array detaux contain numbers which are of use in computing the determinant of A. If p[i], p[i+1] > 0, detaux[i] = D_{i,i}; if p[i] > 0, p[i+1] = 0, detaux[i] = 1 and detaux[i+1] is the determinant of the 2x2 block

    ( D_{i,i}     D_{i,i+1}   )
    ( D_{i+1,i}   D_{i+1,i+1} )

If g is the number of 1x1 blocks in D, so that h=(n-g)/2 is the number of 2x2 blocks, and u, v and w are respectively the numbers of positive, negative and zero 1x1 blocks, then u+h is the number of positive eigenvalues of A, v+h the number of negative eigenvalues, and w the number of zero eigenvalues (these numbers are delivered in successive locations of aux at exit from decsym2). The decision as to whether a 1x1 block D_{i,i} is zero is governed by the small real number τ allocated to the real variable tol upon call of decsym2: if |D_{i,i}| < τ, D_{i,i} is taken to be zero.
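The eigenvalue counts described above follow from the block counts by simple arithmetic. The sketch below restates that bookkeeping; the type inertia_t and the function inertia_from_blocks are hypothetical names introduced for illustration and do not appear in the library.

```c
/* Numbers of positive, negative and zero eigenvalues (the inertia). */
typedef struct {
    int positive;
    int negative;
    int zero;
} inertia_t;

/* n: order of the matrix; u, v, w: numbers of positive, negative and
   zero 1x1 blocks of D. Each of the h = (n-g)/2 blocks of order 2
   contributes one positive and one negative eigenvalue. */
inertia_t inertia_from_blocks(int n, int u, int v, int w)
{
    inertia_t r;
    int g = u + v + w;      /* number of 1x1 blocks */
    int h = (n - g) / 2;    /* number of 2x2 blocks */

    r.positive = u + h;
    r.negative = v + h;
    r.zero = w;
    return r;
}
```

For n = 5 with one positive, one negative and one zero 1x1 block there is a single 2x2 block, giving two positive, two negative and one zero eigenvalue.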

Function Parameters:
    void decsym2 (a,n,tol,aux,p,detaux)
a: float a[1:n,1:n];
    entry: the symmetric coefficient matrix;
    exit: the elements of the LDL^T decomposition of a are stored in the upper triangular part of a; D is a block diagonal matrix with blocks of order 1 or 2; for a block of order 2 we always have D_{i,i+1}≠0 and L_{i+1,i}=0, so that D and L^T fit in the upper triangular part of a; the strictly lower triangular part of a is left undisturbed;
n: int; entry: the order of the matrix;
tol: float; entry: a relative tolerance used to control the calculation of the block diagonal elements, the value of τ above;
aux: int aux[2:5];
    exit:
    aux[2]: if the matrix is symmetric then 1; otherwise 0, and no decomposition of a is performed;
    aux[3]: if the matrix is symmetric then the number of its positive eigenvalues, otherwise 0; if aux[3]=n then the matrix is positive definite;
    aux[4]: if the matrix is symmetric then the number of its negative eigenvalues, otherwise 0; if aux[4]=n then the matrix is negative definite;
    aux[5]: if the matrix is symmetric then the number of its zero eigenvalues, otherwise 0; if aux[5]=0 then the matrix is symmetric and non-singular;
p: int p[1:n];
    exit: a vector recording (1) the interchanges performed on array a during the computation of the decomposition and (2) the block structure of D; if p[i] > 0 and p[i+1] = 0 a 2x2 block has been found (D_{i,i+1}≠0 and L_{i+1,i}=0);
detaux: float detaux[1:n];
    exit: if p[i]>0 and p[i+1]>0 then detaux[i] equals the exit value of a[i,i]; if p[i]>0 and p[i+1]=0 then detaux[i]=1 and detaux[i+1] equals the value of the determinant of the corresponding 2x2 diagonal block as determined by decsym2.

Functions used: elmrow, ichrow, ichrowcol.

Method:

The function decsym2 computes the LDL^T decomposition of a symmetric matrix according to a method due to Bunch, Kaufman and Parlett [BunK77, BunKP76]. For the inertia problem it is important that decsym2 can accept singular matrices. However, in order to find the number of zero eigenvalues of singular matrices, the singular value decomposition might be preferred.

void decsym2(float **a, int n, float tol, int aux[l , int p [I , float detaux [I) i

void elmrow(int, int, int, int, float **, float **, float) ; void ichrow(int, int, int, int, float * * ) ; void ichrowcol (int, int, int, int, float * * ) ; int i,j,m,ipl,ip2,onebyone,sym; float det,s,t,alpha,lambda,sigma,aii,aipl,aipli,temp; aux[3] =aux[41 =O; sym=1; i=O; while (sym && (i s n)) { i++; j=i; while (sym && (j < n)) { j++; sym = sym && (a[il [jl == a[jl [il);

,

I

I

if (sym) aux[2] 4 ; else { auxl21 =O; aux [ 5 1 =n; return; \ alpha=(l.O+sqrt(17.0))/8.0;

p In1 =n; i=l; while (i s n) { ipl=i+l; ip2=i+2; aii=fabs(a[il [il ) ; p [il=i; lambda=fabs(a[il [ipll ) j=ipl;


;

for (m=ip2; m lambda) ( j=m; lambda=fabs(a[il [ml ) ; \ t=alpha*lambda; onebyone=l; if (aii c t) ( sigma=lambda; for (m=ipl; mc=j-1; m++) if (fabs(a[ml [jl) > sigma) sigma=fabs(ah1 [jl ) for (m=j+l; mc=n; m++) if (fabs(a[j] [m]) > sigma) sigma=fabs(a [jl [ml ) if (sigma*aii c lambda) ( if (alpha*sigma c fabs(aCj1 [jl)) ( ichrow(j+l,n,i,, ],a) ; ichrowcol(ip1,J-l,i, j,a); t=a [i] [i]; a[il [il=a[jl [jl; a[jl [jl=t; p [il=j; ) else ( if (j > ipl) ( ichrow(j+l,n,ipl, j,a); ichrowcol(ip2,j-l,ipl,j,a); t=a [il [il ; a[il [il=a[jl [jl ; a[jl [jl=t; t=a[il [jl ; a [i] [j]=a [il [ipll ; a [il [ipll=t; temp=a [il Lip11 ; det=a [il [il*a [ipll [ipll -temp*temp; aipli=a [il [ipll/det; aii=a [il [il /det; aipl=a [ipll Lip11 /det; p [il=j; p [ipl]=O; detaux [il=l.0; detaux[ipll =det; for (j=ip2; jc=n; j++) { s=aipli*a[ipll jl -aipl*a [il [jI ; t=aipli*a[il [jl -aii*aLip11 [jl ; elmrow(j,n,j,i,a,a,s); elmrow(j,n,j,ipl,a,a,t); a[il [jl=s; a[ipll [jl=t; 1

if (onebyone) { if (to1 < fabs (a[il [il ) ) ( aii=a [il [il ; detaux lil =a [il [il ; if (aii > 0.0) aux I31 ++; else aux Ill ++; for (j=ipl; j 0.0) aux [31++; else aux [41++;

)

{

3.3.2 Calculation of determinant

Calculates the determinant of a symmetric matrix A, det(A). The function decsym2 should be called to perform the LDL^T decomposition of the symmetric matrix. Given the values of the determinants of the blocks of the nxn block diagonal matrix D (which has 1x1 or 2x2 blocks of elements on its principal diagonal and zeros elsewhere) and the number m of zero eigenvalues of A, determsym2 evaluates det(A). If m ≠ 0 then det(A)=0, otherwise det(A) is simply the product of the above determinants of blocks.

Function Parameters:
    float determsym2 (detaux,n,aux)
determsym2: delivers the calculated value of the determinant of the matrix;
detaux: float detaux[1:n]; entry: the array detaux as produced by decsym2;
n: int; the order of the array detaux (= the order of the matrix);
aux: int aux[2:5]; entry: the array aux as produced by decsym2.

float determsym2(float detaux[], int n, int aux[])
{
    int i;
    float det;

    if (aux[5] > 0)
        det=0.0;
    else {
        det=1.0;
        for (i=1; i<=n; i++) det *= detaux[i];
    }
    return (det);
}

3.3.3 Solution of linear equations

Solves a symmetric system of linear equations, assuming that the matrix has been decomposed into LDL^T form by a call of decsym2.

Function Parameters:
    void solsym2 (a,n,b,p,detaux)
a: float a[1:n,1:n]; entry: the LDL^T decomposition of A as produced by decsym2;
n: int; entry: the order of the matrix;
b: float b[1:n];
    entry: the right hand side of the system of linear equations;
    exit: the solution of the system;
p: int p[1:n]; entry: a vector recording the interchanges performed on the matrix by decsym2, p also contains information on the block structure of the matrix as decomposed by decsym2;
detaux: float detaux[1:n]; entry: the array detaux as produced by decsym2.

Functions used: matvec, elmvecrow.

void solsym2(float **a, int n, float b[l, int p[l, float detaux[l) 1

float matvec(int, int, int, float * * , float [I); void elmvecrow (int, int, int, float [I , float * * , float); int i,ii,k,ipl,pi,pii; float det,temp,save; i=l; while (i < n) { ipl=i+l; pi=p [il ; save=b[pi]; if (p[ipll s 0) { b [pi]=b [il ; b [i]=save/a [il [il ; elmvecrow(ipl,n, i,b,a,save) ; i=ipl; ) else { temp=b [il ; b [pi]=b [ipll ; det=detaux[ipll ; b [i]= (tempfa[ipll [ipll -save*a[il [ipll ) /det; b [ipl]= (save*a[il [il -temp*a[il [ipll ) /det; elmvecrow(i+2,n,i,b,a, temp) ; elrnvecrow(i+2,n,ipl,b,a,save) ; i += 2;

1

if (i == n) ( b [il / = a [il [il ; i=n-1; } else i=n-2; while (i > 0 ) ( if (p[il == 0) ii=i-1; else ii=i; for (k=ii; k norm) norm=w; 1

&aux [51=sqrt(norm); eps=aux121 *w; for (k=l; k sigma) { sigma=sum[ jI ; kpiv=j ; I

if (fcpiv != k) sum [kpiv]=sum [kl ; ichcol (l,n,k, kpiv,a); 1 ci [kl=kpiv; akk=a [kl [kl ; sigma=tammat(ken, k,k,a,a) ; w=sqrt(sigma); aidk=aid[k]= ( (akk c 0.0) ? w if (W c eps) { aux[3] =k-1; break;

:

-w);

1

beta=l.0/ (sigma-akk*aidk); a [k] [k]=akk-aidk; for (j=k+l; jc=m; j++) { elmcol (k,n,j,k,a,a,-beta*tammat(k,n, k,j,a,a)) temp=a [kl [ jI ; sum[j] - = temp*temp;

1

;

1

free-real-vector(sum,l);

1

B. lsqdglinv

Computes the principal diagonal elements of the inverse of A^T A, where A is the coefficient matrix of a linear least squares problem. It is assumed that A has been decomposed after calling lsqortdec successfully. These values can be used for the computation of the standard deviations of least squares solutions.

Function Parameters:
    void lsqdglinv (a,m,aid,ci,diag)
a,m,aid,ci: see lsqortdec; the contents of a, aid and ci should be produced by a successful call of lsqortdec;
diag: float diag[1:m]; exit: the diagonal elements of the inverse of A^T A, where A is the matrix of the linear least squares problem.

Functions used: vecvec, tamvec.

void lsqdglinv(float **a, int m, float aid[] , int ci [I , float diag [I) 1

float vecvec(int, int, int, float [I, float 11 ) float tamvec (int, int, int, float **, float [I ) int j,k,cik; float w;

; ;

for (k=l; k=l; k--) b [kl= (b[kl -matvec(k+l,m,k,a, b) ) /aid[k] ; for (k=m; k>=l; k--1 { cik=ci[kl ; if (cik ! = k) { w=b [kl ;


b [kl =b [cikl ; b [cikl=w;

B. lsqortdecsol

Computes the least squares solution of an overdetermined system Ax = b (n linear equations in m unknowns), and computes the principal diagonal elements of the inverse of A^T A. The matrix A is first reduced to the column permuted form A', where R = QA' (see lsqortdec), by calling lsqortdec and, if this call is successful, the least squares solutions are determined by lsqsol and the principal diagonal elements are calculated by lsqdglinv.

Function Parameters: void lsqortdecsol (a,n, rn,am,diag, b) a: float a[l:n,l:m]; entry: the coefficient matrix of the linear least square problem; exit: in the upper triangle of a (the elements a[i,j] with i sigma) ( sigma=sum[jI ; kpiv=j ; if (kpiv ! = k) { sum [kpiv]=sum [kl ; ichcol(l,n,k,kpiv,a) ; 1 1

ci [kl=kpiv; akk=a [kl [kl ; sigma=tammat(k,nr, k,k,a,a); w=sqrt (sigma); aidk=aid[k]= ( (akk c 0.0) ? w if (W c eps) { aux [31=k-1; break;

:

-w);

1J

beta=l.0/ (sigma-akk*aidk) ; a [kl [kl =akk-aidk; for (j=k+l; jc=m; j + + ) ( elmcol (k,nr, j,k,a,a,-beta*tammat(k,nr, k,j ,a,a) ) temp=a [kl [ jI ; sum[jl - = temp*temp; 1 I

if (k == nl) for (j=nl+l; jc=n; j++) for (s=l; sc=m; s++) { nr = (s > nl) ? nl : s-1; w=a[jl [sl-matmat(l,nr, j,s,a,a);


;

a [j] [sl = (s > nl) ? w

1

:

w/aid [sl ;

I

f ree-real-vector (sum,1);

1

B. lsqrefsol

Solves a constrained least squares problem consisting of the determination of that x ∈ R^m which minimizes ||r_2||_E, where r_2 = b_2 - A_2 x, b_2 ∈ R^{n_2} and A_2 is an n_2×m matrix, subject to the condition that A_1 x = b_1, where b_1 ∈ R^{n_1}, A_1 being n_1×m (n_1+n_2=n). The required solution satisfies the equation Bz=h where [...] λ being a vector of Lagrange multipliers.

It is assumed that the components of the vectors u associated with the elementary reflectors defining orthogonal matrices Q_1 and Q_2 and those of an upper triangular matrix R, together with pivot reference integers associated with a permutation matrix P, have all been obtained by means of a successful call of lsqdecomp. lsqrefsol first obtains a numerical solution z^(1) of the equation Bz=h, and then uses an iterative scheme of the form

    f^(s) = h - B z^(s),   B δz^(s) = f^(s),   z^(s+1) = z^(s) + δz^(s),   s=1,2,...

to obtain a refined solution to this equation, and in so doing a refined estimate of x. z^(1) and, at each stage, δz^(s) are derived by the solution process outlined in the documentation to lsqdecomp. The above iterative scheme [BjG67] is terminated if either (a) ||δz^(s)||_E ≤ ε ||z^(s)||_E, where ε is a small real tolerance prescribed by the user, or (b) s=smax, where the integer smax is also prescribed by the user. The least squares solutions of several overdetermined systems with the same constraints and coefficient matrix can be solved by successive calls of lsqrefsol with different right hand sides.

Function Parameters:
    void lsqrefsol (a,qr,n,m,n1,aux,aid,ci,b,ldx,x,res)
a: float a[1:n,1:m]; entry: the original least squares matrix, where the first n1 rows should form the constraint matrix (i.e. the first n1 equations are to be strictly satisfied);
qr: float qr[1:n,1:m]; entry: the QR decomposition of the original least squares matrix as delivered by a successful call of lsqdecomp;
n: int; entry: the number of rows of the matrices a and qr;
m: int; entry: the number of columns of the matrices a and qr;
n1: int; entry: number of linear constraints;
aux: float aux[2:7];
    entry:
    aux[2]: contains a relative tolerance (value of ε above) as a criterion to stop iterative refining; if the Euclidean norm of the correction is smaller than aux[2] times the current approximation of the solution then the iterative refining is stopped;
    aux[6]: maximum number of iterations allowed (value of smax above), usually aux[6]=5 will be sufficient;
    exit:
    aux[7]: the number of iterations performed (the last value of s for which a correction term δz^(s) is determined in the above);
aid: float aid[1:m]; entry: the diagonal elements of the upper triangular matrix as delivered by a successful call of lsqdecomp;
ci: int ci[1:m]; entry: the pivotal indices as produced by lsqdecomp;
b: float b[1:n]; entry: the right hand side of the least squares problem; the first n1 elements form the right hand sides of the constraints;
ldx: float *; the Euclidean norm of the last correction of the solution (the value of ||δx^(s)||_E for the last δz^(s) determined in the above, δx^(s) being formed from the last m components of δz^(s));
x: float x[1:m]; exit: the solution vector;
res: float res[1:n]; exit: the residual vector corresponding to the solution.

Functions used: vecvec, matvec, tamvec, elmveccol, ichcol.

void lsqrefsol(f1oat **a, float **qr, int n, int m, int nl, float aux[l, float aid[], int ci [I, float b[l , float *ldx, float x [I , float res [I )

I float *allocate-real-vector (int, int) ; void free-real-vector(f1oat *, int) ; float vecvec (int, int, int, float [I , float [I ) ; float matvec (int, int, int, float **, float [I ) ; float tamvec(int, int, int, float **, float [I ) ; void ichcol (int, int, int, int, float * * ) ; void elmveccol(int, int, int, float [I, float **, float); int i,j,k,s,startup; float cl,nexve,ndx,ndr,d,corrnorm,*f,*g; double dtemp;


g=allocate-real-vector(1,m); for (j=l; jc=m; j++) { s=ci [jl ; if (s ! = j) ichcol(l,n,j,s,a); 1 for (j=l; jc=m; j++) x[jl=g[jI=O.O; for (i=l; ic=n; i++) { res[il=0.0; f [il=b [il ;

startup = (k c= 1) ; ndx=ndr=O.O; if (k ! = 0) { for (i=l; ic=n; i++) res[il += f [il; for (s=l; sc=m; s++) ( x[sI += g [sl ; dtemp=O.O; for (i=l; ic=n; i++) dtemp += (double)a [il [s] (double)res Iil ; d=dtemp; g[sl =(-d-tamvec(1,s-l,s,qr,g)) /aid[sl ; :or

I

(i=l; i nl) ? res [il : 0 .O; for (s=l; sc=m; s++) dtemp += (double)a [il [sl* (double)x [sl ; f [il=(double)b [il -dtemp;

1

Aexve=sqrt(vecvec(l,m,0 ,x,x)+vecvec (l,n, 0,res,res)) ; for (s=l; sc=nl; s++) elmveccol (s,nl,s,f,qr,tamvec(s,nl,s,qr,f)/ (qr[sl [sl *aid[sl ) ) for (i=nl+l; ic=n; i++) ; f [il - = matvec(l,nl,i,qr,f) for (s=nl+l; sc=m; s++) elmveccol(s,n,s,f,qr,tamvec(s,n,s,qr,f)/(qr[sl [sl*aid[sl 1 ) ; for (i=l: 1c=m: i++) /

g [s]= (gls]-matvec(s+l,m, s,qr,g)) /aid[sl ndx += g [sl *gIsl ;

;

1 for (s=m; s>=nl+l; s--) elmveccol(s,n,s,f,qr,tamvec(s,n,s,qr,f)/(qr~sl [sl*aid[sl) for (s=l; sc=nl; s++) f Is1 - = tamvec(nl+l,n,s,qr,f); for (s=nl; s>=l; s--) elmveccol (s,nl,s,f,qr, tamvec(s,nl,s,qr,f)/(qr[sl [sl *aidIs aux [7]=k; for (i=l; ic=n; i++) ndr += f [il*f[il ; corrnorm=sqrt(ndx+ndr); k++; ) while (startup I / (corrnorm>aux[21 *nexve && kc=aux[61)) ; *ldx=sqrt(ndx); for (s=m; s>=l; s--) ( j=ci [s]; if (j ! = s) { cl=x[jl ; x[jl=x[sl ; x [sl=c1; ichcol(l,n,j,s,a);

1

1

free-real-vector (f, 1) ; free-real-vector (g,1) ;

1


;

3.5 Other real matrix problems 3.5.1 Solution of overdetermined systems

A. solsvdovr

Solves an overdetermined system of linear equations. solsvdovr determines that x ∈ R^n with minimum ||x|| which minimizes ||Ax-b||, where A is a real mxn matrix, b ∈ R^m (m ≥ n), the matrices U, Λ, V occurring in the singular value decomposition A = UΛV^T being available, where U is an mxn column orthogonal matrix (U^T U = I), Λ is an nxn diagonal matrix whose diagonal elements are the singular values of A (λ_i, i=1,...,n), and V is an nxn orthogonal matrix (V^T V = V V^T = I). The analytic solution of the above problem is x = A^+ b, where A^+ is the pseudo-inverse of A; numerically, x = VΛ^(-1)U^T b, where Λ^(-1) is a diagonal matrix whose successive diagonal elements are (λ_i)^-1 if λ_i > δ, and 0 otherwise, δ being a small positive real number prescribed by the user. The two stages in the determination of x are the formation of b' = Λ^(-1)U^T b and that of x = Vb'.

Function Parameters:
    void solsvdovr (u,val,v,m,n,x,em)
u: float u[1:m,1:n]; entry: the matrix U in the singular values decomposition UΛV^T;
val: float val[1:n]; entry: the singular values (diagonal elements λ_i of Λ);
v: float v[1:n,1:n]; entry: the matrix V in the singular values decomposition;
m: int; entry: the length of the right hand side vector;
n: int; entry: the number of unknowns, n should satisfy n ≤ m;
x: float x[1:m];
    entry: the right hand side vector;
    exit: the solution vector;
em: float em[6:6]; entry: the minimal non-neglectable singular value (value of δ in the above).

Functions used: matvec, tamvec.

Method:

See [WiR71] for the solution of an overdetermined system of linear equations.
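The two stages described above, b' = Λ^(-1)U^T b followed by x = Vb', can be sketched independently of the library. The sketch below assumes 0-based, row-major storage in flat arrays, double precision, a separate output vector x, and a caller-supplied work array of length n; the name svd_min_norm_solve is hypothetical and the interface differs from that of solsvdovr, which overwrites the right hand side.

```c
/* Minimum-norm least squares solution from a given singular value
   decomposition A = U * diag(val) * V^T.
   u: m*n values (row-major), val: n values, v: n*n values (row-major),
   delta: singular values <= delta are treated as zero,
   b: right hand side (m values), x: solution (n values),
   work: scratch space for n values. */
void svd_min_norm_solve(int m, int n, const double *u, const double *val,
                        const double *v, double delta,
                        const double *b, double *x, double *work)
{
    int i, j, k;

    /* Stage 1: work = pseudo-inverse of diag(val) applied to U^T b */
    for (i = 0; i < n; i++) {
        double s = 0.0;

        if (val[i] > delta) {
            for (k = 0; k < m; k++)
                s += u[k * n + i] * b[k];
            s /= val[i];
        }
        work[i] = s;
    }

    /* Stage 2: x = V * work */
    for (i = 0; i < n; i++) {
        double s = 0.0;

        for (j = 0; j < n; j++)
            s += v[i * n + j] * work[j];
        x[i] = s;
    }
}
```

Discarding the components belonging to singular values not exceeding delta is what makes the result the minimum-norm solution when A is rank deficient.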

void solsvdovr(float **u, float val[], float **v, int m, int n, float x[], float em[])
{
    float *allocate_real_vector(int, int);
    void free_real_vector(float *, int);
    float matvec(int, int, int, float **, float []);
    float tamvec(int, int, int, float **, float []);
    int i;
    float min,*x1;

    [...]
    for (i=1; i<=n; i++)
        x1[i] = (val[i] <= min) ? 0.0 : tamvec(1,m,i,u,x)/val[i];
    for (i=1; i [...]

[...] $i$, and $M_{i,i}=1$; (b) an upper triangular band matrix $U$ with $U_{i,j}=0$ when $j>i+lw+rw$ or $i>j$; and (c) a sequence of pivot reference integers $p(i)$ ($i=1,\ldots,n$) associated with a permutation matrix $P$, such that $MU=PA$, by Gauss elimination using partial pivoting.

The method used involves the recursive construction of matrices $A^{(i)}$, with $A^{(1)}=A$. At the i-th stage the elements $(A_{k,j})^{(i)}$ ($j=1,\ldots,i-1$; $k=j+1,\ldots,n$) are zero and $(A_{k,j})^{(i)}=U_{k,j}$ ($k=1,\ldots,i-1$; $j=k,k+1,\ldots,n$). Then
(a) the smallest integer $l$ for which $|(A_{l,i})^{(i)}| \ge |(A_{k,i})^{(i)}|$ for $k \ge i$ is determined;
(b) $p(i)$ is set equal to $l$;
(c) rows $i$ and $l$ of $A^{(i)}$ are interchanged (the i-th row of $A^{(i+1)}$ has now been determined);
(d) with $M_{k,i}=(A_{k,i})^{(i)}/(A_{i,i})^{(i+1)}$, row(k) of $A^{(i)}$ is replaced by row(k) - $M_{k,i}$ row(i) to form row(k) of $A^{(i+1)}$ ($k=i+1,\ldots,\min(n,i+lw)$). (The elements $(A_{k,i})^{(i+1)}$, $k>i$, are thus zero.)
The process is arrested if $\delta>\delta_i$, where $\delta_i=|(A_{i,i})^{(i+1)}| \, / \, \|\text{i-th row of } A^{(1)}\|_2$, and $\delta$ is a small positive real number prescribed by the user.


Function Parameters:
    void decbnd (a,n,lw,rw,aux,m,p)
    a:    float a[1:(lw+rw)*(n-1)+n];
          entry: a contains rowwise the band elements of the band matrix in such a way that the (i,j)-th element of the matrix is given in a[(lw+rw)*(i-1)+j], i=1,...,n and j=max(1,i-lw),...,min(n,i+rw); the values of the remaining elements of a are irrelevant;
          exit: the band elements of the Gaussian eliminated matrix, which is an upper triangular band matrix U with (lw+rw) codiagonals, are rowwise delivered in a as follows: the (i,j)-th element of U is a[(lw+rw)*(i-1)+j], i=1,...,n and j=i,...,min(n,i+lw+rw);
    n:    int;
          entry: the order of the band matrix;
    lw:   int;
          entry: number of left codiagonals of a;
    rw:   int;
          entry: number of right codiagonals of a;
    aux:  float aux[1:5];
          entry: aux[2]: a relative tolerance to control the elimination process (value of $\delta$ above);
          exit: aux[1]: if successful, gives the sign of the determinant of the matrix (+1 or -1);
                aux[3]: if successful then aux[3]=n; otherwise aux[3]=i-1, where the reduction process was terminated at stage i;
                aux[5]: if successful, gives the minimum absolute value of pivot(i) divided by the Euclidean norm of the i-th row (value of min $\delta_i$, $1 \le i \le n$, in the above);
    m:    float m[1:lw*(n-2)+1];
          exit: the Gaussian multipliers (values of $M_{i,j}$ above) of all eliminations in such a way that the i-th multiplier of the j-th step is m[lw*(j-1)+i-j];
    p:    int p[1:n];
          exit: the pivotal indices.

Functions used: vecvec, elmvec, ichvec.
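The rowwise band storage used for a can be made concrete with a small index helper. The sketch below only restates the formula a[(lw+rw)*(i-1)+j] from the parameter list; the function names band_index and band_length are hypothetical and do not belong to the library.

```c
#include <assert.h>

/* Position of matrix element (i,j), both 1-based, in the packed array,
   following the formula a[(lw+rw)*(i-1)+j] from the parameter list. */
int band_index(int i, int j, int lw, int rw)
{
    assert(j >= i - lw && j <= i + rw);   /* (i,j) must lie inside the band */
    return (lw + rw) * (i - 1) + j;
}

/* Highest index used in the packed array: (lw+rw)*(n-1)+n. */
int band_length(int n, int lw, int rw)
{
    assert(n >= 1 && lw >= 0 && rw >= 0);
    return (lw + rw) * (n - 1) + n;
}
```

For a tridiagonal matrix of order 4 (lw=rw=1) the elements (1,1), (1,2), (2,1), (2,2), (2,3) land in positions 1 to 5, so consecutive band elements of consecutive rows are adjacent in a.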

#include <math.h>

void decbnd(float a[], int n, int lw, int rw, float aux[], float m[], int p[])
{
    float *allocate_real_vector(int, int);
    void free_real_vector(float *, int);
    float vecvec(int, int, int, float [], float []);
    void elmvec(int, int, int, float [], float [], float);
    void ichvec(int, int, int, float []);
    int i,j,k,kk,kk1,pk,mk,ik,lw1,f,q,w,w1,w2,nrw,iw,sdet;
    float r,s,eps,min,*v;

    [...]
    q=lw-1;
    for (i=2; i<=lw; i++) {
        q--;
        iw += w1;
        for (j=iw-q; j<=iw; j++) a[j]=0.0;
    }
    iw = -w2;
    q = -lw;
    for (i=1; i<=n; i++) {
        iw += w;
        if (i <= lw1) iw--;
        q += w;
        if (i > nrw) q--;
        v[i]=sqrt(vecvec(iw,q,0,a,a));
    }
    eps=aux[2];
    min=1.0;
    kk = -w1;
    mk = -lw;
    if (f > nrw) w2 += nrw-f;
    for (k=1; k<=n; k++) {
        if (f < n) f++;
        ik = kk += w;
        mk += lw;
        s=fabs(a[kk])/v[k];
        pk=k;
        kk1=kk+1;
        for (i=k+1; i<=f; i++) {
            ik += w1;
            m[mk+i-k]=r=a[ik];
            a[ik]=0.0;
            r=fabs(r)/v[i];
            if (r > s) {
                s=r;
                pk=i;
            }
        }
        if (s < min) min=s;
        if (s < eps) {
            aux[3]=k-1;
            aux[5]=s;
            aux[1]=sdet;
            free_real_vector(v,1);
            return;
        }
        if (k+w2 >= n) w2--;
        p[k]=pk;
        if (pk != k) {
            v[pk]=v[k];
            pk -= k;
            ichvec(kk1,kk1+w2,pk*w1,a);
            sdet = -sdet;
            r=m[mk+pk];
            m[mk+pk]=a[kk];
            a[kk]=r;
        } else
            r=a[kk];
        if (r < 0.0) sdet = -sdet;
        iw=kk1;
        lw1=f-k+mk;
        for (i=mk+1; i<=lw1; i++) {
            s = m[i] /= r;
            iw += w1;
            elmvec(iw,iw+w2,kk1-iw,a,a,-s);
        }
    }
    aux[3]=n;
    aux[5]=min;
    aux[1]=sdet;
    free_real_vector(v,1);
}


3.6.2 Calculation of determinant

determbnd

Calculates the determinant of the Gaussian eliminated upper triangular matrix, provided with the correct sign, that is delivered by decbnd or decsolbnd. determbnd should not be called when overflow can be expected.

Function Parameters:
    float determbnd (a,n,lw,rw,sgndet)
    determbnd: delivers the determinant of the band matrix;
    a:       float a[1:(lw+rw)*(n-1)+n];
             entry: the contents of a are produced by decbnd or decsolbnd;
    n:       int;
             entry: the order of the band matrix;
    lw:      int;
             entry: number of left codiagonals of a;
    rw:      int;
             entry: number of right codiagonals of a;
    sgndet:  int;
             entry: the sign of the determinant as delivered in aux[1] by decbnd, if the elimination was successful.

#include <math.h>

float determbnd(float a[], int n, int lw, int rw, int sgndet)
{
    int i,l;
    float p;

    l=1;
    p=1.0;
    lw += rw+1;
    for (i=1; i<=n; i++) {
        p=a[l]*p;
        l += lw;
    }
    return (fabs(p)*sgndet);
}
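The loop in determbnd walks the diagonal of U in the packed array with stride lw+rw+1, starting at a[1]. A stand-alone sketch of the same idea is given below; the name band_determinant_sketch and the use of double are assumptions made for illustration, and the array is 1-based with a[0] unused, as in the library.

```c
#include <assert.h>
#include <math.h>

/* Sketch only: product of the diagonal of U taken from the packed array
   (1-based, a[0] unused), with the sign of the determinant supplied
   separately, as decbnd delivers it in aux[1]. */
double band_determinant_sketch(const double a[], int n, int lw, int rw,
                               int sgndet)
{
    assert(n >= 1 && (sgndet == 1 || sgndet == -1));
    int stride = lw + rw + 1;         /* distance between diagonal elements */
    double p = 1.0;
    for (int i = 1, l = 1; i <= n; i++, l += stride) p *= a[l];
    return fabs(p) * sgndet;
}
```

Because the result is a plain product of n pivots, it can overflow or underflow for large n, which is why the description warns against calling determbnd when overflow can be expected.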

3.6.3 Solution of linear equations

A. solbnd

Calculates the solution of a system of linear equations, provided that the matrix has been decomposed by a successful call of decbnd. The solution of the linear system is obtained by carrying out the elimination, for which the Gaussian multipliers are saved, on the right hand side, and by solving the new system with the upper triangular band matrix, as produced by decbnd, by back substitution. The solutions of several systems with the same coefficient matrix can be obtained by successive calls of solbnd.

Function Parameters:
    void solbnd (a,n,lw,rw,m,p,b)
    a,n,lw,rw,m,p: see decbnd;
          entry: the contents of the arrays a,m,p are as produced by decbnd;
    b:    float b[1:n];
          entry: the right hand side of the system of linear equations.

Functions used: vecvec, elmvec.

void solbnd(float a[], int n, int lw, int rw, float m[], int p[], float b[])
{
    float vecvec(int, int, int, float [], float []);
    void elmvec(int, int, int, float [], float [], float);
    int f,i,k,kk,w,w1,w2,shift;
    float s;

    f=lw;
    shift = -lw;
    w1=lw-1;
    for (k=1; k<=n; k++) {
        if (f < n) f++;
        shift += w1;
        i=p[k];
        s=b[i];
        if (i != k) {
            b[i]=b[k];
            b[k]=s;
        }
        elmvec(k+1,f,shift,b,m,-s);
    }
    w1=lw+rw;
    w=w1+1;
    kk=(n+1)*w-w1;
    w2 = -1;
    shift=n*w1;
    for (k=n; k>=1; k--) {
        kk -= w;
        shift -= w1;
        if (w2 < w1) w2++;
        b[k]=(b[k]-vecvec(k+1,k+w2,shift,b,a))/a[kk];
    }
}

B. decsolbnd

Calculates the solution of a system of linear equations by Gaussian elimination with partial pivoting if the coefficient matrix is in band form and is stored rowwise in a one-dimensional array. decsolbnd performs Gaussian elimination in the same way as decbnd, meanwhile also carrying out the elimination with the given right hand side. The solution of the eliminated system is obtained by back substitution.

Function Parameters:
    void decsolbnd (a,n,lw,rw,aux,b)
    a,n,lw,rw,aux: see decbnd;
    b:    see solbnd.

Functions used: vecvec, elmvec, ichvec.

void decsolbnd(float a[], int n, int lw, int rw, float aux[], float b[])
{
    float *allocate_real_vector(int, int);
    void free_real_vector(float *, int);
    float vecvec(int, int, int, float [], float []);
    [...]
    sdet=1;
    w1=lw+rw;
    w=w1+1;
    w2=w-2;
    iw=0;
    nrw=n-rw;
    lw1=lw+1;
    q=lw-1;
    for (i=2; i<=lw; i++) {
        q--;
        iw += w1;
        for (j=iw-q; j<=iw; j++) a[j]=0.0;
    }
    [...]
    if (f > nrw) w2 += nrw-f;
    for (k=1; k<=n; k++) {
        [...]
        if (k+w2 >= n) w2--;
        if (pk != k) {
            v[pk]=v[k];
            pk -= k;
            ichvec(kk1,kk1+w2,pk*w1,a);
            sdet = -sdet;
            r=b[k]; b[k]=b[pk+k]; b[pk+k]=r;
            r=m[pk]; m[pk]=a[kk]; a[kk]=r;
        } else
            r=a[kk];
        iw=kk1;
        lw1=f-k;
        if (r < 0.0) sdet = -sdet;
        for (i=1; i<=lw1; i++) {
            s = m[i] /= r;
            iw += w1;
            elmvec(iw,iw+w2,kk1-iw,a,a,-s);
            b[k+i] -= b[k]*s;
        }
    }
    aux[3]=n;
    aux[5]=min;
    kk=(n+1)*w-w1;
    w2 = -1;
    shift=n*w1;
    for (k=n; k>=1; k--) {
        kk -= w;
        shift -= w1;
        if (w2 < w1) w2++;
        b[k]=(b[k]-vecvec(k+1,k+w2,shift,b,a))/a[kk];
    }
    aux[1]=sdet;
    free_real_vector(m,0);
    free_real_vector(v,1);
}

3.7 Real sparse non-symmetric tridiagonal matrices

3.7.1 Preparatory procedures

A. dectri

Given the $n \times n$ tridiagonal matrix $T$ ($T_{i,j}=0$ for $|i-j|>1$) obtains a lower bidiagonal matrix $L$ ($L_{i,j}=0$ for $i>j+1$ and $j>i$) and unit upper bidiagonal matrix $U$ ($U_{i,i}=1$, $U_{i,j}=0$ for $i>j$ and $j>i+1$) such that $T=LU$. The columns of $L$ and rows of $U$ are determined in succession. If at stage k, $|L_{k,k}|$ [...]

    [...]
    while (count > 0 && ni > 0) {
        count--;
        im=i-1;
        i1=i+1;
        c=sqrt(tammat(p,im,i,i,a,a)+tammat(i1,q,i,i,a,a));
        r=sqrt(mattam(p,im,i,i,a,a)+mattam(i1,q,i,i,a,a));
        if (c*omega <= r*eps) {
            inter[t]=i;
            ni=q-p;
            t++;
            if (p != i) {
                ichcol(1,n,p,i,a);
                ichrow(1,n,p,i,a);
                di=d[i]; d[i]=d[p]; d[p]=di;
            }
            p++;
        } else if (r*omega <= c*eps) {
            inter[t] = -i;
            ni=q-p;
            t++;
            if (q != i) {
                ichcol(1,n,q,i,a);
                ichrow(1,n,q,i,a);
                di=d[i]; d[i]=d[q]; d[q]=di;
            }
            q--;
        } else {
            exponent=log(r/c)*factor;
            if (fabs(exponent) > 1.0) {
                ni=q-p;
                c=pow(2.0,exponent);
                r=1.0/c;
                d[i] *= c;
                for (j=1; j<=im; j++) {
                    a[j][i] *= c;
                    a[i][j] *= r;
                }
                [...]

B. baklbr

Given the diagonal elements of the $n \times n$ diagonal matrix $D$, the pivot reference integers associated with a permutation matrix $P$, and a set of column vectors $v^{(i)}$ ($i=n1,\ldots,n2$), constructs the sequence of vectors $u^{(i)}=PDv^{(i)}$ ($i=n1,\ldots,n2$). (If $v^{(i)}$ is an eigenvector of $PDAD^{-1}P^{-1}$, $u^{(i)}$ is an eigenvector of $A$.)

Function Parameters:
    void baklbr (n,n1,n2,d,inter,vec)
    n:      int;
            entry: the length of the vectors to be transformed;
    n1,n2:  int;
            entry: the serial numbers of the first and last vector to be transformed;
    d:      float d[1:n];
            entry: the main diagonal of the transforming diagonal matrix of order n, as produced by eqilbr;
    inter:  int inter[1:n];
            entry: information defining the possible interchanging of some rows and columns, as produced by eqilbr;
    vec:    float vec[1:n,n1:n2];
            entry: the n2-n1+1 vectors of length n to be transformed;
            exit: the n2-n1+1 vectors of length n resulting from the back transformation.

Function used: ichrow.

    void ichrow(int, int, int, int, float **);
    int i,j,k,p,q;
    float di;

    q=n;
    for (i=1; i<=n; i++) {
        di=d[i];
        if (di != 1)
            for (j=n1; j [...]
        [...]
        if (k > 0)
            p++;
        else if (k < 0)
            q--;
    }
    for (i=p-1+n-q; i>=1; i--) {
        k=inter[i];
        if (k > 0) {
            [...]
        } else {
            [...]

3.11.2 Equilibration - complex matrices

A. eqilbrcom

Equilibrates an $n \times n$ complex matrix $A$ by determining a real diagonal matrix $D$ whose diagonal elements are integer powers of 2, and a permutation matrix $P$ such that if $C=PDAD^{-1}P^{-1}$, the diagonal elements of $C'C-CC'$ ($C'$ denotes the conjugate transpose of $C$) are approximately zero.

With $j(k)=1,2,\ldots,n,1,2,\ldots$ (i.e. $j(k)=((k-1) \bmod n)+1$, $k=1,2,\ldots$) a sequence of diagonal matrices $D_k$, where the j(k)-th diagonal element of $D_k$ is $p_k$ ($p_k$ being an integer power of 2) and all others 1, together with a sequence of matrices $C_k$ for which $C_k=D_kC_{k-1}D_k^{-1}$, $C_0=A$, are determined. $p_k$ is determined by the condition that the Euclidean norms of the j(k)-th column and j(k)-th row of $C_k$ have approximately equal values. If all off-diagonal elements of either the j(k)-th column or the j(k)-th row of $C_{k-1}$ are nearly zero, the row and column in question are interchanged with the next pair for which this is not so. The process is terminated if either (a) $1/2 < p_k < 2$ for one whole cycle of the j(k), or (b) $k=kmax+1$, where kmax is an integer prescribed by the user [DekHo68, Os60, PaR69].

Function Parameters:
    void eqilbrcom (a1,a2,n,em,d,inter)
    a1,a2:  float a1[1:n,1:n], a2[1:n,1:n];
            entry: the real part and imaginary part of the matrix to be equilibrated must be given in the arrays a1 and a2, respectively;
            exit: the real part and the imaginary part of the equilibrated matrix are delivered in the arrays a1 and a2, respectively;
    n:      int;
            entry: the order of the given matrix;
    em:     float em[0:7];
            entry: em[0]: the machine precision;
                   em[6]: the maximum allowed number of iterations (value of kmax above);
            exit:  em[7]: the number of iterations performed;
    d:      float d[1:n];
            exit: the scaling factors of the diagonal similarity transformation;
    inter:  int inter[1:n];
            exit: information defining the possible interchanging of some rows and the corresponding columns.

Functions used: ichcol, ichrow, tammat, mattam.

void eqilbrcom(float **a1, float **a2, int n, float em[], float d[], int inter[])
{
    void ichcol(int, int, int, int, float **);
    void ichrow(int, int, int, int, float **);
    float tammat(int, int, int, int, float **, float **);
    float mattam(int, int, int, int, float **, float **);
    int i,p,q,j,t,count,exponent,ni,im,i1;
    float c,r,eps,di;

    eps=em[0]*em[0];
    t=p=1;
    q=ni=i=n;
    count=em[6];
    for (j=1; j<=n; j++) {
        d[j]=1.0;
        inter[j]=0;
    }
    i = (i < q) ? i+1 : p;
    while (count > 0 && ni > 0) {
        count--;
        im=i-1;
        i1=i+1;
        c=tammat(p,im,i,i,a1,a1)+tammat(i1,q,i,i,a1,a1)+
            tammat(p,im,i,i,a2,a2)+tammat(i1,q,i,i,a2,a2);
        r=mattam(p,im,i,i,a1,a1)+mattam(i1,q,i,i,a1,a1)+
            mattam(p,im,i,i,a2,a2)+mattam(i1,q,i,i,a2,a2);
        if (c/eps [...]

    [...]
    for ([...]; r1>=1; r1--) {
        d[r]=a[r][r];
        x=tammat(1,r-2,r,r,a,a);
        a1=a[r1][r];
        if (sqrt(x) <= machtol) {
            b0=b[r1]=a1;
            bb[r1]=b0*b0;
            a[r][r]=1.0;
        } else {
            bb0=bb[r1]=a1*a1+x;
            b0 = (a1 > 0.0) ? -sqrt(bb0) : sqrt(bb0);
            a1=a[r1][r]=a1-b0;
            w=a[r][r]=1.0/(a1*b0);
            for (j=1; j<=r1; j++)
                b[j]=(tammat(1,j,j,r,a,a)+matmat(j+1,r1,j,r,a,a))*w;
            elmveccol(1,r1,r,b,a,tamvec(1,r1,r,a,b)*w*0.5);
            for (j=1; j<=r1; j++) {
                elmcol(1,j,j,r,a,a,b[j]);
                elmcolvec(1,j,j,a,b,a[j][r]);
            }
            [...]
    }
    d[1]=a[1][1];
    a[1][1]=1.0;
    b[n]=bb[n]=0.0;
}

baksymtri2

Performs the back substitutions upon the intermediate numbers produced by tfmsymtri2.

Function Parameters:
    void baksymtri2 (a,n,n1,n2,vec)
    a:      float a[1:n,1:n];
            entry: the data for the back transformation, as produced by tfmsymtri2, must be given in the upper triangular part of a;
    n:      int;
            entry: the order of the given matrix;
    n1,n2:  int;
            entry: the lower and upper bound, respectively, of the column numbers of vec;
    vec:    float vec[1:n,n1:n2];
            entry: the vectors on which the back transformation has to be performed;
            exit: the transformed vectors.

Functions used: tammat, elmcol.

void baksymtri2(float **a, int n, int n1, int n2, float **vec)
{
    float tammat(int, int, int, int, float **, float **);
    void elmcol(int, int, int, int, float **, float **, float);
    int j,k;
    float w;

    for (j=2; j [...]

[...] $j+1$) with real nonnegative subdiagonal elements $H_{j+1,j}$. The $u^{(j)}$ are determined by imposing the conditions that the first j elements of $u^{(j)}$ are zero and that, with $A^{(1)}=A$, $A^{(j+1)}=E^{(j)}A^{(j)}E^{(j)}$ has zeros in positions $i=j+2,\ldots,n$ of the j-th column ($j=1,\ldots,n-2$). $D$ is determined by imposing the condition that the elements of the subdiagonal of $D^{-1}A^{(n-1)}D$ are absolute values of those of $A^{(n-1)}$. For further details see [Mu66, Wi65].

Function Parameters:
    void hshcomhes (ar,ai,n,em,b,tr,ti,del)
    ar,ai:  float ar[1:n,1:n], ai[1:n,1:n];
            entry: the real part and the imaginary part of the matrix to be transformed must be given in the arrays ar and ai, respectively;
            exit: the real part and the imaginary part of the upper triangle of the resulting upper Hessenberg matrix are delivered in the corresponding parts of the arrays ar and ai, respectively; data for the Householder back transformation are delivered in the strict lower triangles of the arrays ar and ai;
    n:      int;
            entry: the order of the given matrix;
    em:     float em[0:1];
            entry: em[0]: the machine precision;
                   em[1]: an estimate of the norm of the complex matrix (for example, the sum of the infinity norms of the real part and imaginary part of the matrix);
    b:      float b[1:n-1];
            exit: the real nonnegative subdiagonal of the resulting upper Hessenberg matrix;
    tr,ti:  float tr[1:n], ti[1:n];
            exit: the real part and the imaginary part of the diagonal elements of a diagonal similarity transformation are delivered in the arrays tr and ti, respectively;
    del:    float del[1:n-2];
            exit: information concerning the sequence of Householder matrices.

Functions used: hshcomcol, matmat, elmrowcol, hshcomprd, carpol, commul, comcolcst, comrowcst.

void hshcomhes(float **ar, float **ai, int n, float em[], float b[], float tr[], float ti[], float del[])
{
    float matmat(int, int, int, int, float **, float **);
    void elmrowcol(int, int, int, int, float **, float **, float);
    void hshcomprd(int, int, int, int, int, float **, float **, float **, float **, float);
    void comcolcst(int, int, int, float **, float **, float, float);
    void comrowcst(int, int, int, float **, float **, float, float);
    void carpol(float, float, float *, float *, float *);
    void commul(float, float, float, float, float *, float *);
    int hshcomcol(int, int, int, float **, float **, float, float *, float *, float *, float *);
    int r,rm1,i,nm1;
    float tol,t,xr,xi;

    nm1=n-1;
    t=em[0]*em[1];
    tol=t*t;
    rm1=1;
    for (r=2; r<=nm1; r++) {
        if (hshcomcol(r,n,rm1,ar,ai,tol,&(b[rm1]),&(tr[r]),&(ti[r]),&t)) {
            for (i=1; i [...]
    [...]
    if (n > 1) carpol(ar[n][nm1],ai[n][nm1],&(b[nm1]),&(tr[n]),&(ti[n]));
    rm1=1;
    tr[1]=1.0;
    ti[1]=0.0;
    for (r=2; r<=n; r++) {
        commul(tr[rm1],ti[rm1],tr[r],ti[r],&(tr[r]),&(ti[r]));
        comcolcst(1,rm1,r,ar,ai,tr[r],ti[r]);
        comrowcst(r+1,n,r,ar,ai,tr[r],-ti[r]);
        rm1=r;
    }
}

B. bakcomhes

Given m complex vectors $u^{(j)}$ ($m \le n-2$), an $n \times n$ complex diagonal matrix $D$ and a sequence of complex column vectors $v^{(k)}$ ($k=n1,\ldots,n2$), computes the vectors $w^{(k)}=EDv^{(k)}$ ($k=n1,\ldots,n2$) where [...]

(If $v^{(k)}$ ($k=n1,\ldots,n2$) are eigenvectors of $H=D^{-1}EAED$, $w^{(k)}$ ($k=n1,\ldots,n2$) are corresponding eigenvectors of $A$.) For further details see [Mu66, Wi65].

Function Parameters:
    void bakcomhes (ar,ai,tr,ti,del,vr,vi,n,n1,n2)
    ar,ai,tr,ti,del: float ar[1:n,1:n], ai[1:n,1:n], tr[1:n], ti[1:n], del[1:n-2];
            entry: the data for the back transformation as produced by hshcomhes;
    vr,vi:  float vr[1:n,n1:n2], vi[1:n,n1:n2];
            entry: the back transformation is performed on the eigenvectors with the real parts given in vr and the imaginary parts given in vi;
            exit: the real parts and imaginary parts of the resulting eigenvectors are delivered in the columns of vr and vi, respectively;
    n:      int;
            entry: the order of the matrix of which the eigenvectors are calculated;
    n1,n2:  int;
            entry: the eigenvectors corresponding to the eigenvalues with indices n1,...,n2 are to be transformed.

Functions used: comrowcst, hshcomprd.

void bakcomhes(float **ar, float **ai, float tr[], float ti[], float del[], float **vr, float **vi, int n, int n1, int n2)
{
    void hshcomprd(int, int, int, int, int, float **, float **, float **, float **, float);
    void comrowcst(int, int, int, float **, float **, float, float);
    int i,r,rm1;
    float h;

    for (i=2; i<=n; i++) comrowcst(n1,n2,i,vr,vi,tr[i],ti[i]);
    r=n-1;
    for (rm1=n-2; rm1>=1; rm1--) {
        h=del[rm1];
        if (h > 0.0) hshcomprd(r,n,n1,n2,rm1,vr,vi,ar,ai,h);
        r=rm1;
    }
}

3.12 Other transformations

3.12.1 To bidiagonal form - real matrices

A. hshreabid

Reduces an $m \times n$ real matrix $A$ to bidiagonal form $B$. With $A=A_1$, $u^{(1)}$ is so chosen that all elements but the first in the first column of

$A_1' = (I - 2u^{(1)}u^{(1)T}/u^{(1)T}u^{(1)})A_1$

are zero; $v^{(1)}$ is so chosen that all elements but the first two of the first row in

$A_1'' = A_1'(I - 2v^{(1)}v^{(1)T}/v^{(1)T}v^{(1)})$

are zero; the first row and column are stripped from $A_1''$ to produce the $(m-1) \times (n-1)$ matrix $A_2$, and the process is repeated.

Function Parameters:
    void hshreabid (a,m,n,d,b,em)
    a:    float a[1:m,1:n];
          entry: the given matrix;
          exit: data concerning the premultiplying and postmultiplying matrices;
    m:    int;
          entry: the number of rows of the given matrix;
    n:    int;
          entry: the number of columns of the given matrix;
    d:    float d[1:n];
          exit: the diagonal of the bidiagonal matrix (diagonal of B above);
    b:    float b[1:n];
          exit: the superdiagonal of the bidiagonal matrix is delivered in b[1:n-1];
    em:   float em[0:1];
          entry: em[0]: the machine precision;
          exit:  em[1]: the infinity norm of the original matrix.

Functions used: tammat, mattam, elmcol, elmrow.

Method: hshreabid slightly improves a part of a procedure (svd) of Golub and Reinsch [WiR71] by skipping a transformation if the column or row is already in the desired form (i.e. if the sum of the squares of the elements that ought to be zero is smaller than a certain constant). In svd the transformation is skipped only if the norm of the full row or column is small enough. As a result, some ill-defined transformations are skipped in hshreabid. Moreover, if a transformation is skipped, a zero is not stored in the diagonal or superdiagonal; instead, the value that would have been found if the column or row were already in the desired form is stored.


#include <math.h>

void hshreabid(float **a, int m, int n, float d[], float b[], float em[])
{
    float tammat(int, int, int, int, float **, float **);
    float mattam(int, int, int, int, float **, float **);
    void elmcol(int, int, int, int, float **, float **, float);
    void elmrow(int, int, int, int, float **, float **, float);
    int i,j,i1;
    float norm,machtol,w,s,f,g,h;

    norm=0.0;
    for (i=1; i<=m; i++) {
        w=0.0;
        for (j=1; j<=n; j++) w += fabs(a[i][j]);
        if (w > norm) norm=w;
    }
    machtol=em[0]*norm;
    em[1]=norm;
    for (i=1; i<=n; i++) {
        i1=i+1;
        s=tammat(i1,m,i,i,a,a);
        if (s < machtol)
            d[i]=a[i][i];
        else {
            f=a[i][i];
            s += f*f;
            d[i] = g = (f < 0.0) ? sqrt(s) : -sqrt(s);
            h=f*g-s;
            a[i][i]=f-g;
            for (j=i1; j<=n; j++)
                elmcol(i,m,j,i,a,a,tammat(i,m,i,j,a,a)/h);
        }
        if (i < n) {
            s=mattam(i1+1,n,i,i,a,a);
            if (s < machtol)
                b[i]=a[i][i1];
            else {
                f=a[i][i1];
                s += f*f;
                b[i] = g = (f < 0.0) ? sqrt(s) : -sqrt(s);
                h=f*g-s;
                a[i][i1]=f-g;
                for (j=i1; j<=m; j++)
                    elmrow(i1,n,j,i,a,a,mattam(i1,n,i,j,a,a)/h);
            }
        }
    }
}

B. psttfmmat

Computes the postmultiplying matrix from the intermediate results generated by hshreabid.

Function Parameters:
    void psttfmmat (a,n,v,b)
    a:    float a[1:n,1:n];
          entry: the data concerning the postmultiplying matrix, as generated by hshreabid;
    n:    int;
          entry: the number of columns and rows of a;
    v:    float v[1:n,1:n];
          exit: the postmultiplying matrix;
    b:    float b[1:n];
          exit: the superdiagonal as generated by hshreabid.

Functions used: matmat, elmcol.

void psttfmmat(float **a, int n, float **v, float b[])
{
    float matmat(int, int, int, int, float **, float **);
    void elmcol(int, int, int, int, float **, float **, float);
    int i,i1,j;
    float h;

    i1=n;
    v[n][n]=1.0;
    for (i=n-1; i>=1; i--) {
        h=b[i]*a[i][i1];
        if (h < 0.0) {
            for (j=i1; j [...]

[...] $j+1$) by inverse iteration [DekHo68, Vr66, Wi65]. Starting with $x^{(0)}=(1,1,\ldots,1)^T$, vectors $y^{(k)}$ and $x^{(k)}$ are produced by use of the scheme

$(A-\lambda I)y^{(k)}=x^{(k)}, \quad x^{(k+1)}=y^{(k)}/\|y^{(k)}\|_2 \quad (k=0,1,\ldots).$

The process is terminated if either (a) $\|(A-\lambda I)x^{(k)}\|_2 \le \|A\| \cdot tol$, where tol is a tolerance prescribed by the user, or (b) $k=kmax+1$, where kmax is an integer prescribed by the user.

Function Parameters:
    void reaveches (a,n,lambda,em,v)
    a:       float a[1:n,1:n];
             entry: the elements of the real upper Hessenberg matrix must be given in the upper triangle and the first subdiagonal of array a;
             exit: the Hessenberg part of array a is altered;
    n:       int;
             entry: the order of the given matrix;
    lambda:  float;
             entry: the given real eigenvalue of the upper Hessenberg matrix (value of $\lambda$ above);
    em:      float em[0:9];
             entry: em[0]: the machine precision;
                    em[1]: a norm of the given matrix;
                    em[6]: the tolerance used for the eigenvector (value of tol above, em[6] > em[0]); the inverse iteration ends if the Euclidean norm of the residue vector is smaller than em[1]*em[6];
                    em[8]: the maximum allowed number of iterations (value of kmax above, for example, em[8]=5);
             exit:  em[7]: the Euclidean norm of the residue vector of the calculated eigenvector;
                    em[9]: the number of inverse iterations performed; if em[7] remains larger than em[1]*em[6] during em[8] iterations then the value em[8]+1 is delivered;
    v:       float v[1:n];
             exit: the calculated eigenvector is delivered in v.

Functions used: vecvec, matvec.

#include <math.h>

void reaveches(float **a, int n, float lambda, float em[], float v[])
{
    int *allocate_integer_vector(int, int);
    void free_integer_vector(int *, int);
    float vecvec(int, int, int, float [], float []);
    float matvec(int, int, int, float **, float []);
    int i,i1,j,count,max,*p;
    float m,r,norm,machtol,tol;

    p=allocate_integer_vector(1,n);
    norm=em[1];
    machtol=em[0]*norm;
    tol=em[6]*norm;
    max=em[8];
    a[1][1] -= lambda;
    for (i=1; i<=n-1; i++) {
        i1=i+1;
        r=a[i][i];
        m=a[i1][i];
        if (fabs(m) < machtol) m=machtol;
        p[i] = (fabs(m) [...]
            [...] : a[i1][j]-lambda)-m*a[i][j];
            [...] (j > i1) ? a[i1][j] : a[i1][j]-lambda;
                a[i1][j]=a[i][j]-m*r;
                a[i][j]=r;
            }
        [...]
    }
    if (fabs(a[n][n]) < machtol) a[n][n]=machtol;
    for (j=1; j<=n; j++) v[j]=1.0;
    count=0;
    do {
        count++;
        if (count > max) break;
        for (i=1; i<=n-1; i++) {
            i1=i+1;
            if (p[i])
                v[i1] -= a[i1][i]*v[i];
            else {
                r=v[i1];
                v[i1]=v[i]-a[i1][i]*r;
                v[i]=r;
            }
        }
        for (i=n; i>=1; i--)
            v[i]=(v[i]-matvec(i+1,n,i,a,v))/a[i][i];
        r=1.0/sqrt(vecvec(1,n,0,v,v));
        for (j=1; j<=n; j++) v[j] *= r;
    } while (r > tol);
    em[7]=r;
    em[9]=count;
    free_integer_vector(p,1);
}

C. reaqri

Computes all eigenvalues $\lambda_j$ (assumed to be real) and corresponding eigenvectors $u^{(j)}$ ($j=1,\ldots,n$) of the $n \times n$ real upper Hessenberg matrix $A$ ($A_{i,j}=0$ for $i>j+1$) by single QR iteration. The eigenvectors are calculated by a direct method [see DekHo68], in contrast with reaveches which uses inverse iteration. If the Hessenberg matrix is not too ill-conditioned with respect to its eigenvalue problem then this method yields numerically independent eigenvectors and is competitive with inverse iteration as to accuracy and computation time.


Function Parameters:
    int reaqri (a,n,em,val,vec)
    reaqri: given the value 0 provided that the process is completed within em[4] iterations; otherwise reaqri is given the value k, the number of eigenvalues and eigenvectors not calculated;
    a:    float a[1:n,1:n];
          entry: the elements of the real upper Hessenberg matrix must be given in the upper triangle and the first subdiagonal of array a;
          exit: the Hessenberg part of array a is altered;
    n:    int;
          entry: the order of the given matrix;
    em:   float em[0:5];
          entry: em[0]: the machine precision;
                 em[1]: a norm of the given matrix;
                 em[2]: the relative tolerance used for the QR iteration (em[2] > em[0]); if the absolute value of some subdiagonal element is smaller than em[1]*em[2] then this element is neglected and the matrix is partitioned;
                 em[4]: the maximum allowed number of iterations (for example, em[4]=10*n);
          exit:  em[3]: the maximum absolute value of the subdiagonal elements neglected;
                 em[5]: the number of QR iterations performed; if the iteration process is not completed within em[4] iterations then the value em[4]+1 is delivered and in this case only the last n-k elements of val and the last n-k columns of vec are approximated eigenvalues and eigenvectors of the given matrix, where k is delivered in reaqri;
    val:  float val[1:n];
          exit: the eigenvalues of the given matrix are delivered in val;
    vec:  float vec[1:n,1:n];
          exit: the calculated eigenvectors corresponding to the eigenvalues in val[1:n] are delivered in the columns of vec.

Functions used: matvec, rotcol, rotrow.

#include <math.h>

int reaqri(float **a, int n, float em[], float val[], float **vec)
{
    float *allocate_real_vector(int, int);
    void free_real_vector(float *, int);
    float matvec(int, int, int, float **, float []);
    void rotcol(int, int, int, int, float **, float, float);
    void rotrow(int, int, int, int, float **, float, float);
    int m1,i,i1,m,j,q,max,count;
    float w,shift,kappa,nu,mu,r,tol,s,machtol,elmax,t,delta,det,*tf;

    tf=allocate_real_vector(1,n);
    machtol=em[0]*em[1];
    tol=em[1]*em[2];
    max=em[4];
    count=0;
    elmax=0.0;
    m=n;
    for (i=1; i<=n; i++) {
        vec[i][i]=1.0;
        for (j=i+1; j<=n; j++) vec[i][j]=vec[j][i]=0.0;
    }
    do {
        m1=m-1;
        i=m;
        do {
            q=i;
            i--;
        } while ((i >= 1) ? (fabs(a[i+1][i]) > tol) : 0);
        if (q > 1)
            if (fabs(a[q][q-1]) > elmax) elmax=fabs(a[q][q-1]);
        if (q == m) {
            val[m]=a[m][m];
            m=m1;
        } else {
            delta=a[m][m]-a[m1][m1];
            det=a[m][m1]*a[m1][m];
            if (fabs(delta) < machtol)
                s=sqrt(det);
            else {
                w=2.0/delta;
                s=w*w*det+1.0;
                s = (s [...]
            [...] : a[q][m]/s;
            [...]
                if (count > max) {
                    em[3]=elmax;
                    em[5]=count;
                    free_real_vector(tf,1);
                    return m;
                }
                [...]
                a[q][q] -= shift;
                for (i=q; i [...]
                    [...]
                    if (i > q) {
                        a[i][i-1]=kappa*nu;
                        w=kappa*mu;
                    } else
                        w=kappa;
                    mu=a[i][i]/kappa;
                    nu=a[i1][i]/kappa;
                    a[i][i]=w;
                    rotrow(i1,n,i,i1,a,mu,nu);
                    rotcol(1,i,i,i1,a,mu,nu);
                    a[i][i] += shift;
                    rotcol(1,n,i,i1,vec,mu,nu);
                }
                a[m][m1]=a[m][m]*nu;
                a[m][m]=a[m][m]*mu+shift;
            }
        }
    } while (m > 0);
    for (j=n; j>=2; j--) {
        tf[j]=1.0;
        t=a[j][j];
        for (i=j-1; i>=1; i--) {
            delta=t-a[i][i];
            tf[i]=matvec(i+1,j,i,a,tf)/
                ((fabs(delta) < machtol) ? machtol : delta);
        }
        for (i=1; i [...]

[...] $j+1$) by the double QR iteration of Francis [DekHo68, Fr61, Wi65].

Function Parameters:
    int comvalqri (a,n,em,re,im)
    comvalqri: given the value 0 provided that the process is completed within em[4] iterations; otherwise comvalqri is given the value k, the number of eigenvalues not calculated;
    a:      float a[1:n,1:n];
            entry: the elements of the real upper Hessenberg matrix must be given in the upper triangle and the first subdiagonal of array a;
            exit: the Hessenberg part of array a is altered;
    n:      int;
            entry: the order of the given matrix;
    em:     float em[0:5];
            entry: em[0]: the machine precision;
                   em[1]: a norm of the given matrix;
                   em[2]: the relative tolerance used for the QR iteration (em[2] > em[0]); if the absolute value of some subdiagonal element is smaller than em[1]*em[2] then this element is neglected and the matrix is partitioned;
                   em[4]: the maximum allowed number of iterations (for example, em[4]=10*n);
            exit:  em[3]: the maximum absolute value of the subdiagonal elements neglected;
                   em[5]: the number of QR iterations performed; if the iteration process is not completed within em[4] iterations then the value em[4]+1 is delivered and in this case only the last n-k elements of re and im are approximate eigenvalues of the given matrix, where k is delivered in comvalqri;
    re,im:  float re[1:n], im[1:n];
            exit: the real and imaginary parts of the calculated eigenvalues of the given matrix are delivered in re, im[1:n], the members of each nonreal complex conjugate pair being consecutive.

#include <math.h>

int comvalqri(float **a, int n, float em[], float re[], float im[])
{
    int i,j,p,q,max,count,n1,p1,p2,imin1,i1,i2,i3,b;
    float disc,sigma,rho,g1,g2,g3,psi1,psi2,aa,e,k,s,norm,machtol2,tol,w;

    norm=em[1];
    w=em[0]*norm;
    machtol2=w*w;
    tol=em[2]*norm;
    max=em[4];
    count=0;
    w=0.0;
    do {
        i=n;
        do {
            q=i;
            i--;
        } while ((i >= 1) ? (fabs(a[i+1][i]) > tol) : 0);
        if (q > 1)
            if (fabs(a[q][q-1]) > w) w=fabs(a[q][q-1]);
        if (q >= n-1) {
            n1=n-1;
            if (q == n) {
                re[n]=a[n][n];
                im[n]=0.0;
            } else {
                sigma=a[n][n]-a[n1][n1];
                rho = -a[n][n1]*a[n1][n];
                disc=sigma*sigma-4.0*rho;
                if (disc > 0.0) {
                    disc=sqrt(disc);
                    s = -2.0*rho/(sigma+((sigma >= 0.0) ? disc : -disc));
                    re[n]=a[n][n]+s;
                    re[n1]=a[n1][n1]-s;
                    im[n]=im[n1]=0.0;
                } else {
                    re[n]=re[n1]=(a[n1][n1]+a[n][n])/2.0;
                    im[n1]=sqrt(-disc)/2.0;
                    im[n] = -im[n1];
                }
            }
        } else {
            count++;
            if (count > max) break;
            n1=n-1;
            sigma=a[n][n]+a[n1][n1]+sqrt(fabs(a[n1][n-2]*a[n][n1])*em[0]);
            rho=a[n][n]*a[n1][n1]-a[n][n1]*a[n1][n];
            i=n-1;
            do {
                p1=i1=i;
                i--;
            } while ((i-1 >= q) ? (fabs(a[i][i-1]*a[i1][i]*
                    (fabs(a[i][i]+a[i1][i1]-sigma)+fabs(a[i+2][i1]))) >
                    fabs(a[i][i]*((a[i][i]-sigma)+
                    a[i][i1]*a[i1][i]+rho))*tol) : 0);
            p=p1-1;
            p2=p+2;
            for (i=p; i [...]

[...] $j+1$) by inverse iteration [DekHo68, Vr66, Wi65]. Starting with $x^{(0)}=(1,1,\ldots,1)^T$, vectors $y^{(k)}$ and $x^{(k)}$ are produced by use of the scheme

$(A-\lambda I)y^{(k)}=x^{(k)}, \quad x^{(k+1)}=y^{(k)}/\|y^{(k)}\|_2 \quad (k=0,1,\ldots).$

The process is terminated if either (a) $\|(A-\lambda I)x^{(k)}\|_2 \le \|A\| \cdot tol$, where tol is a tolerance prescribed by the user, or (b) $k=kmax+1$, where kmax is an integer prescribed by the user.

Function Parameters:
void comveches (a,n,lambda,mu,em,u,v)
a: float a[1:n,1:n];
    entry: the elements of the real upper Hessenberg matrix must be given in the upper triangle and the first subdiagonal of array a;
    exit: the Hessenberg part of array a is altered;
n: int;
    entry: the order of the given matrix;
lambda, mu: float;
    real and imaginary part of the given eigenvalue (values of $\lambda$ and $\mu$ above);
em: float em[0:9];
    entry:
    em[0]: the machine precision;
    em[1]: a norm of the given matrix;
    em[6]: the tolerance used for eigenvector (value of tol above, em[6] > em[0]); the inverse iteration ends if the Euclidean norm of the residue vector is smaller than em[1]*em[6];
    em[8]: the maximum allowed number of iterations (value of kmax above, for example, em[8]=5);
    exit:
    em[7]: the Euclidean norm of the residue vector of the calculated eigenvector;
    em[9]: the number of inverse iterations performed; if em[7] remains larger than em[1]*em[6] during em[8] iterations then the value em[8]+1 is delivered;
u,v: float u[1:n], v[1:n];
    exit: the real and imaginary parts of the calculated eigenvector are delivered in the arrays u and v.

Functions used:

vecvec, matvec, tamvec.

void comveches(float **a, int n, float lambda, float mu, float em[], float u[], float v[])
{
    float *allocate_real_vector(int, int);
    int *allocate_integer_vector(int, int);
    void free_integer_vector(int *, int);
    void free_real_vector(float *, int);
    float vecvec(int, int, int, float [], float []);
    float matvec(int, int, int, float **, float []);
    float tamvec(int, int, int, float **, float []);
    int i,i1,j,count,max,*p;
    float aa,bb,d,m,r,s,w,x,y,norm,machtol,tol,*g,*f;

    p=allocate_integer_vector(1,n);
    g=allocate_real_vector(1,n);
    f=allocate_real_vector(1,n);
    norm=em[1];
    machtol=em[0]*norm;
    tol=em[6]*norm;
    max=em[8];
    for (i=2; i

    em[2]: the relative tolerance used for the QR iteration (em[2] > em[0]);
    em[4]: the maximum allowed number of iterations (for example, em[4]=10n);
    em[6]: the tolerance used for the eigenvectors (value of $\tau$ above; em[6] > em[2]); for each eigenvector the inverse iteration ends if the Euclidean norm of the residue vector is smaller than em[1]*em[6];
    em[8]: the maximum allowed number of inverse iterations for the calculation of each eigenvector (value of kmax above; for example, em[8]=5);
    exit:
    em[1]: the infinity norm of the equilibrated matrix;
    em[3]: the maximum absolute value of the subdiagonal elements neglected;
    em[5]: the number of QR iterations performed; if the iteration process is not completed within em[4] iterations then the value em[4]+1 is delivered and in this case only the last n-k elements of val and the last n-k columns of vec are approximate eigenvalues and eigenvectors of the given matrix, where k is delivered in reaeig1;
    em[7]: the maximum Euclidean norm of the residues of the calculated eigenvectors of the transformed matrix;
    em[9]: the largest number of inverse iterations performed for the calculation of some eigenvector; if, for some eigenvector, the Euclidean norm of the residue remains larger than em[1]*em[6], then the value em[8]+1 is delivered; nevertheless the eigenvectors may then very well be useful, this should be judged from the value delivered in em[7] or from some other test;
val: float val[1:n];
    exit: the eigenvalues of the given matrix are delivered in monotonically decreasing order;
vec: float vec[1:n,1:n];
    exit: the calculated eigenvectors corresponding to the eigenvalues in val[1:n] are delivered in the columns of vec.

Functions used:

eqilbr, tfmreahes, bakreahes2, baklbr, reavalqri, reaveches, reascl.

int reaeig1(float **a, int n, float em[], float val[], float **vec)
{
    int *allocate_integer_vector(int, int);
    float *allocate_real_vector(int, int);
    float **allocate_real_matrix(int, int, int, int);
    void free_integer_vector(int *, int);
    void free_real_vector(float *, int);
    void free_real_matrix(float **, int, int, int);
    void tfmreahes(float **, int, float [], int []);
    void bakreahes2(float **, int, int, int, int [], float **);
    void eqilbr(float **, int, float [], float [], int []);
    void baklbr(int, int, int, float [], int [], float **);
    int reavalqri(float **, int, float [], float []);
    void reaveches(float **, int, float, float [], float []);
    void reascl(float **, int, int, int);
    int i,k,max,j,l,*ind,*ind0;
    float residu,r,machtol,*d,*v,**b;

    ind=allocate_integer_vector(1,n);
    ind0=allocate_integer_vector(1,n);
    d=allocate_real_vector(1,n);
    v=allocate_real_vector(1,n);
    b=allocate_real_matrix(1,n,1,n);
    residu=0.0;
    max=0;
    eqilbr(a,n,em,d,ind0);
    tfmreahes(a,n,em,ind);
    for (i=1; i<=n; i++)
        for (j=((i == 1) ? 1 : i-1); j<=n; j++) b[i][j]=a[i][j];
    k=reavalqri(b,n,em,val);
    for (i=k+1; i<=n; i++)
        for (j=i+1; j<=n; j++)
            if (val[j] > val[i]) {
                r=val[i];
                val[i]=val[j];
                val[j]=r;
            }
    machtol=em[0]*em[1];
    for (l=k+1; l<=n; l++) {
        if (l > 1)
            if (val[l-1]-val[l] < machtol) val[l]=val[l-1]-machtol;
        for (i=1; i<=n; i++)
            for (j=((i == 1) ? 1 : i-1); j<=n; j++) b[i][j]=a[i][j];
        reaveches(b,n,val[l],em,v);
        if (em[7] > residu) residu=em[7];
        if (em[9] > max) max=em[9];
        for (j=1; j<=n; j++) vec[j][l]=v[j];
    }
    em[7]=residu;
    em[9]=max;
    bakreahes2(a,n,k+1,n,ind,vec);
    baklbr(n,k+1,n,d,ind0,vec);
    reascl(vec,n,k+1,n);
    free_integer_vector(ind,1);
    free_integer_vector(ind0,1);
    free_real_vector(d,1);
    free_real_vector(v,1);
    free_real_matrix(b,1,n,1);
    return k;
}

C. reaeig3

Determines all eigenvalues $\lambda_j$ (assumed to be real) and corresponding eigenvectors $u^{(j)}$ $(j=1,\ldots,n)$ of a real n×n matrix A by equilibration to the form $A'=PDAD^{-1}P^{-1}$ (calling eqilbr), transformation to similar real upper Hessenberg form H (calling tfmreahes), computation of the eigenvalues of H by QR iteration and direct determination of the eigenvectors of H (calling reaqri); if all eigenvalues of H have been determined, there follow back transformation from the eigenvectors of H to the eigenvectors of A' (calling bakreahes2), further back transformation from the eigenvectors of A' to those of A (calling baklbr) and finally scaling of the eigenvectors of A in such a way that the element of maximum size in each eigenvector is 1 (calling reascl). The procedure reaeig3 should be used only if all eigenvalues are real.

Function Parameters:
int reaeig3 (a,n,em,val,vec)
reaeig3: given the value 0 provided that the process is completed within em[4] iterations; otherwise reaeig3 is given the value k, of the number of eigenvalues not calculated;
a: float a[1:n,1:n];
    entry: the matrix whose eigenvalues and eigenvectors are to be calculated;
    exit: the array elements are altered;
n: int;
    entry: the order of the given matrix;
em: float em[0:5];
    entry:
    em[0]: the machine precision;
    em[2]: the relative tolerance used for the QR iteration (em[2] > em[0]);
    em[4]: the maximum allowed number of QR iterations (for example, em[4]=10n);
    exit:
    em[1]: the infinity norm of the equilibrated matrix;
    em[3]: the maximum absolute value of the subdiagonal elements neglected;
    em[5]: the number of QR iterations performed; if the iteration process is not completed within em[4] iterations then the value em[4]+1 is delivered and in this case only the last n-k elements of val are approximate eigenvalues of the given matrix and no useful eigenvectors are delivered, where k is delivered in reaeig3;
val: float val[1:n];
    exit: the eigenvalues of the given matrix are delivered;
vec: float vec[1:n,1:n];
    exit: the calculated eigenvectors corresponding to the eigenvalues in val[1:n] are delivered in the columns of vec.

Functions used:

eqilbr, tfmreahes, bakreahes2, baklbr, reaqri, reascl.

int reaeig3(float **a, int n, float em[], float val[], float **vec)
{
    int *allocate_integer_vector(int, int);
    float *allocate_real_vector(int, int);
    void free_integer_vector(int *, int);
    void free_real_vector(float *, int);
    void tfmreahes(float **, int, float [], int []);
    void bakreahes2(float **, int, int, int, int [], float **);
    void eqilbr(float **, int, float [], float [], int []);
    void baklbr(int, int, int, float [], int [], float **);
    void reascl(float **, int, int, int);
    int reaqri(float **, int, float [], float [], float **);
    int i,*ind,*ind0;
    float *d;

    ind=allocate_integer_vector(1,n);
    ind0=allocate_integer_vector(1,n);
    d=allocate_real_vector(1,n);
    eqilbr(a,n,em,d,ind0);
    tfmreahes(a,n,em,ind);
    i=reaqri(a,n,em,val,vec);
    if (i == 0) {
        bakreahes2(a,n,1,n,ind,vec);
        baklbr(n,1,n,d,ind0,vec);
        reascl(vec,n,1,n);
    }
    free_integer_vector(ind,1);
    free_integer_vector(ind0,1);
    free_real_vector(d,1);
    return i;
}

D. comeigval

Determines the real and complex eigenvalues $\lambda_j$ $(j=1,\ldots,n)$ of a real n×n matrix A by equilibration to the form $A'=PDAD^{-1}P^{-1}$ (calling eqilbr), transformation to similar real upper Hessenberg form H (calling tfmreahes) and computation of the eigenvalues of H by double QR iteration (calling comvalqri).

Function Parameters:
int comeigval (a,n,em,re,im)
comeigval: given the value 0 provided that the process is completed within em[4] iterations; otherwise comeigval is given the value k, of the number of eigenvalues not calculated;
a: float a[1:n,1:n];
    entry: the matrix whose eigenvalues are to be calculated;
    exit: the array elements are altered;
n: int;
    entry: the order of the given matrix;
em: float em[0:5];
    entry:
    em[0]: the machine precision;
    em[2]: the relative tolerance used for the QR iteration (em[2] > em[0]);
    em[4]: the maximum allowed number of iterations (for example, em[4]=10n);
    exit:
    em[1]: the infinity norm of the equilibrated matrix;
    em[3]: the maximum absolute value of the subdiagonal elements neglected;
    em[5]: the number of QR iterations performed; if the iteration process is not completed within em[4] iterations then the value em[4]+1 is delivered and in this case only the last n-k elements of re and im are approximate eigenvalues of the given matrix, where k is delivered in comeigval;
re,im: float re[1:n], im[1:n];
    exit: the real and imaginary parts of the calculated eigenvalues of the given matrix are delivered in re, im[1:n], the members of each nonreal complex conjugate pair being consecutive.

Functions used:

eqilbr, tfmreahes, comvalqri.
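Before a routine such as comeigval is called, the entry slots of em have to be filled. A minimal sketch of such a setup follows; the helper name init_em_qr is hypothetical, taking FLT_EPSILON as "the machine precision" is an assumption of this example, and the factor 10 for em[2] is merely one choice that satisfies em[2] > em[0]. Only the value 10n for em[4] comes from the parameter description.

```c
#include <assert.h>
#include <float.h>

/* Fill the entry slots of em[0:5] for a QR-based eigenvalue routine.
   Hypothetical helper; the tolerance values are illustrative choices. */
void init_em_qr(float em[6], int n)
{
    assert(n >= 1);
    em[0] = FLT_EPSILON;           /* machine precision (assumed value) */
    em[2] = 10.0f * FLT_EPSILON;   /* relative QR tolerance, must exceed em[0] */
    em[4] = (float)(10 * n);       /* iteration limit, the example value 10n */
    em[1] = em[3] = em[5] = 0.0f;  /* exit slots, overwritten by the routine */
}
```

After the call, a nonzero return value k of the eigenvalue routine means that only elements k+1 through n of re and im hold approximate eigenvalues, and em[5] then equals em[4]+1.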

int comeigval(float **a, int n, float em[], float re[], float im[])
{
    int *allocate_integer_vector(int, int);
    float *allocate_real_vector(int, int);
    void free_integer_vector(int *, int);
    void free_real_vector(float *, int);
    void eqilbr(float **, int, float [], float [], int []);
    void tfmreahes(float **, int, float [], int []);
    int comvalqri(float **, int, float [], float [], float []);
    int i,*ind,*ind0;
    float *d;

    ind=allocate_integer_vector(1,n);
    ind0=allocate_integer_vector(1,n);
    d=allocate_real_vector(1,n);
    eqilbr(a,n,em,d,ind0);
    tfmreahes(a,n,em,ind);
    i=comvalqri(a,n,em,re,im);
    free_integer_vector(ind,1);
    free_integer_vector(ind0,1);
    free_real_vector(d,1);
    return i;
}

E. comeig1

Determines the real and complex eigenvalues $\lambda_j$ and corresponding eigenvectors $u^{(j)}$ $(j=1,\ldots,n)$ of a real n×n matrix A. A is equilibrated to the form $A'=PDAD^{-1}P^{-1}$ (by means of a call of eqilbr) and transformed to similar real upper Hessenberg form H (by means of a call of tfmreahes). The eigenvalues of H are then computed by double QR iteration (by means of a call of comvalqri). The real eigenvectors and complex eigenvectors of H are determined either by direct use of an iterative scheme of the form

$$(H-\lambda_j I)\,y^{(k)} = x^{(k)}, \qquad x^{(k+1)} = y^{(k)}/\|y^{(k)}\| \qquad (1)$$

or by delayed application of such a scheme. If $\min|\lambda_j-\lambda_i| \le \varepsilon\|H\|$ $(j \ne i)$ for $\lambda_i$ ranging over the previously determined eigenvalues, $\lambda_j$ is replaced in (1) by $\mu_j$, where $\min|\mu_j-\lambda_i| = \varepsilon\|H\|$, $\varepsilon$ being the value of the machine precision supplied by the user. The inverse iteration scheme is terminated if (a) $\|(H-\lambda_j I)x^{(k)}\| \le \|H\|\tau$, where $\tau$ is a relative tolerance prescribed by the user, or (b) k=kmax+1, where the integer value of kmax, the maximum permitted number of inverse iterations, is also prescribed by the user. The above inverse iteration is performed, when $\lambda_j$ is real, by reaveches, and when $\lambda_j$ is complex, by comveches. The eigenvectors $v^{(j)}$ of the equilibrated matrix A' are then obtained from those, $x^{(j)}$, of H by back transformation (by means of a call of bakreahes2) and the eigenvectors $w^{(j)}$ of the original matrix A are recovered from the $v^{(j)}$ by means of a call of baklbr. Finally, the $w^{(j)}$ are scaled to $u^{(j)}$ by imposing the condition that the largest element of $u^{(j)}$ is 1 (by means of a call of comscl).
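The replacement of an eigenvalue that lies too close to a previously treated one can be illustrated on its own. The sketch below follows the loop structure of the comeig1 listing (step of twice the current separation, and a coarser separation based on the QR tolerance if the same neighbour is hit twice), but the function name separated_shift, the 0-based indexing and the parameter names are assumptions of this example.

```c
#include <assert.h>

/* Return the (possibly shifted) real part to be used for eigenvalue i,
   so that it is separated from the eigenvalues 0..i-1 already treated.
   eps is the machine precision, qr_tol the relative QR tolerance and
   norm a norm of the matrix. Hypothetical helper, 0-based indices. */
double separated_shift(const double re[], const double im[], int i,
                       double eps, double qr_tol, double norm)
{
    double x, y, neps = eps * norm, t1, t2;
    int j, pj = -1, moved;

    assert(i >= 0);
    x = re[i];
    y = im[i];
    do {
        moved = 0;
        for (j = 0; j < i; j++) {
            t1 = x - re[j];
            t2 = y - im[j];
            if (t1 * t1 + t2 * t2 <= neps * neps) {
                if (pj == j) neps = qr_tol * norm; /* same neighbour again */
                else pj = j;
                x += 2.0 * neps;                   /* move the real part */
                moved = 1;
                break;
            }
        }
    } while (moved);
    return x;
}
```

With two coincident real eigenvalues and norm 1, the second one is moved by 2*eps, after which the distance test no longer triggers.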

Function Parameters:
int comeig1 (a,n,em,re,im,vec)
comeig1: given the value 0 provided that the process is completed within em[4] iterations; otherwise comeig1 is given the value k, of the number of eigenvalues and eigenvectors not calculated;
a: float a[1:n,1:n];
    entry: the matrix whose eigenvalues and eigenvectors are to be calculated;
    exit: the array elements are altered;
n: int;
    entry: the order of the given matrix;
em: float em[0:9];
    entry:
    em[0]: the machine precision (the value of $\varepsilon$ above);
    em[2]: the relative tolerance used for the QR iteration (em[2] > em[0]);
    em[4]: the maximum allowed number of iterations (for example, em[4]=10n);
    em[6]: the tolerance used for the eigenvectors (value of $\tau$ above; em[6] > em[2]); for each eigenvector the inverse iteration ends if the Euclidean norm of the residue vector is smaller than em[1]*em[6];
    em[8]: the maximum allowed number of inverse iterations for the calculation of each eigenvector (value of kmax above; for example, em[8]=5);
    exit:
    em[1]: the infinity norm of the equilibrated matrix;
    em[3]: the maximum absolute value of the subdiagonal elements neglected;
    em[5]: the number of QR iterations performed; if the iteration process is not completed within em[4] iterations then the value em[4]+1 is delivered and in this case only the last n-k elements of re, im and columns of vec are approximate eigenvalues and eigenvectors of the given matrix, where k is delivered in comeig1;
    em[7]: the maximum Euclidean norm of the residues of the calculated eigenvectors of the transformed matrix;
    em[9]: the largest number of inverse iterations performed for the calculation of some eigenvector; if the Euclidean norm of the residue for one or more eigenvectors remains larger than em[1]*em[6], then the value em[8]+1 is delivered; nevertheless the eigenvectors may then very well be useful, this should be judged from the value delivered in em[7] or from some other test;
re,im: float re[1:n], im[1:n];
    exit: the real and imaginary parts of the calculated eigenvalues of the given matrix are delivered in arrays re[1:n] and im[1:n], the members of each nonreal complex conjugate pair being consecutive;
vec: float vec[1:n,1:n];
    exit: the calculated eigenvectors are delivered in the columns of vec; an eigenvector corresponding to a real eigenvalue given in array re is delivered in the corresponding column of array vec; the real and imaginary part of an eigenvector corresponding to the first member of a nonreal complex conjugate pair of eigenvalues given in the arrays re, im are delivered in the two consecutive columns of array vec corresponding to this pair (the eigenvectors corresponding to the second members of nonreal complex conjugate pairs are not delivered, since they are simply the complex conjugate of those corresponding to the first member of such pairs).

Functions used:

eqilbr, tfmreahes, bakreahes2, baklbr, reaveches, comscl, comvalqri, comveches.
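The column layout of vec for complex conjugate pairs is easy to get wrong, so a small unpacking helper may be useful. The sketch below assumes 0-based arrays of a fixed illustrative size; the name unpack_eigenvector and the constant NMAX are hypothetical. It returns the real and imaginary parts of the eigenvector belonging to eigenvalue j, conjugating the stored vector when j is the second member of a pair.

```c
#include <assert.h>

#define NMAX 8  /* illustrative size limit, an assumption of this sketch */

/* Extract the eigenvector for eigenvalue j (0-based) from the packed
   columns of vec. im[] holds the imaginary parts of the eigenvalues,
   with members of a conjugate pair stored consecutively. */
void unpack_eigenvector(int n, const double im[], double vec[NMAX][NMAX],
                        int j, double vr[], double vi[])
{
    int i = 0, r;

    assert(n >= 1 && n <= NMAX && j >= 0 && j < n);
    /* walk over real eigenvalues (1 column) and pairs (2 columns);
       on exit i == j for a real eigenvalue or a first pair member,
       and i == j+1 for a second pair member */
    while (i < j) i += (im[i] != 0.0) ? 2 : 1;
    for (r = 0; r < n; r++) {
        if (im[j] == 0.0) {            /* real eigenvalue */
            vr[r] = vec[r][j];
            vi[r] = 0.0;
        } else if (i == j) {           /* first member of a pair */
            vr[r] = vec[r][j];
            vi[r] = vec[r][j + 1];
        } else {                       /* second member: conjugate */
            vr[r] = vec[r][j - 1];
            vi[r] = -vec[r][j];
        }
    }
}
```

The test on im[j] == 0.0 relies on real eigenvalues being delivered with an exactly zero imaginary part, which is how the QR listing stores them.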

int comeig1(float **a, int n, float em[], float re[], float im[], float **vec)
{
    int *allocate_integer_vector(int, int);
    float *allocate_real_vector(int, int);
    float **allocate_real_matrix(int, int, int, int);
    void free_integer_vector(int *, int);
    void free_real_vector(float *, int);
    void free_real_matrix(float **, int, int, int);
    void eqilbr(float **, int, float [], float [], int []);
    void tfmreahes(float **, int, float [], int []);
    void bakreahes2(float **, int, int, int, int [], float **);
    void baklbr(int, int, int, float [], int [], float **);
    void reaveches(float **, int, float, float [], float []);
    void comscl(float **, int, int, int, float []);
    int comvalqri(float **, int, float [], float [], float []);
    void comveches(float **, int, float, float, float [], float [], float []);
    int i,j,k,pj,itt,again,*ind,*ind0;
    float x,y,max,neps,**ab,*d,*u,*v,temp1,temp2;

    eqilbr(a,n,em,d,ind0);
    tfmreahes(a,n,em,ind);
    for (i=1; i<=n; i++)
        for (j=((i == 1) ? 1 : i-1); j<=n; j++) ab[i][j]=a[i][j];
    k=comvalqri(ab,n,em,re,im);
    neps=em[0]*em[1];
    max=0.0;
    itt=0;
    for (i=k+1; i<=n; i++) {
        x=re[i];
        y=im[i];
        pj=0;
        again=1;
        do {
            for (j=k+1; j<=i-1; j++) {
                temp1=x-re[j];
                temp2=y-im[j];
                if (temp1*temp1+temp2*temp2 <= neps*neps) {
                    if (pj == j)
                        neps=em[2]*em[1];
                    else
                        pj=j;
                    x += 2.0*neps;
                    again = (!again);
                    break;
                }
            }
            again = (!again);
        } while (again);
        re[i]=x;
        for (i=1; i

j);
    exit: the array elements are altered;
n: int;
    entry: the order of the given matrix;
numval: int;
    entry: eigvalhrm calculates the largest numval eigenvalues of the Hermitian matrix (the value of n' above);
val: float val[1:numval];
    exit: in array val the largest numval eigenvalues are delivered in monotonically nonincreasing order;
em: float em[0:3];
    entry:
    em[0]: the machine precision (value of $\varepsilon$ above);
    em[2]: the relative tolerance used for the eigenvalues (value of $\rho$ above); more precisely, the tolerance for each eigenvalue $\lambda$ is $|\lambda|$*em[2]+em[1]*em[0];
    exit:
    em[1]: an estimate of a norm of the original matrix;
    em[3]: the number of iterations performed.

Functions used:

hshhrmtrival, valsymtri.

void eigvalhrm(float **a, int n, int numval, float val[], float em[])
{
    float *allocate_real_vector(int, int);
    void free_real_vector(float *, int);
    void hshhrmtrival(float **, int, float [], float [], float []);
    void valsymtri(float [], float [], int, int, int, float [], float []);
    float *d,*bb;

    d=allocate_real_vector(1,n);
    bb=allocate_real_vector(1,n-1);
    hshhrmtrival(a,n,d,bb,em);
    valsymtri(d,bb,n,1,numval,val,em);
    free_real_vector(d,1);
    free_real_vector(bb,1);
}

B. eighrm

Determines the n' largest eigenvalues $\lambda_j$ $(j=1,\ldots,n')$ and corresponding eigenvectors $u^{(j)}$ $(j=1,\ldots,n')$ of the n×n Hermitian matrix A. A is first reduced to similar real tridiagonal form T by a call of hshhrmtri. The required eigenvalues of this tridiagonal matrix are then obtained by a call of valsymtri. The corresponding eigenvectors of T are determined by inverse iteration and (possibly) Gram-Schmidt orthogonalization by a call of vecsymtri; the required eigenvectors of A are then recovered from those of T by means of a call of bakhrmtri.

Function Parameters:
void eighrm (a,n,numval,val,vecr,veci,em)
a: float a[1:n,1:n];
    entry: the real part of the upper triangle of the Hermitian matrix must be given in the upper triangular part of a (the elements a[i,j], i≤j); the imaginary part of the strict lower triangle of the Hermitian matrix must be given in the strict lower part of a (the elements a[i,j], i>j);
    exit: the array elements are altered;
n: int;
    entry: the order of the given matrix;
numval: int;
    entry: eighrm calculates the largest numval eigenvalues of the Hermitian matrix (the value of n' above);
val: float val[1:numval];
    exit: in array val the largest numval eigenvalues are delivered in monotonically nonincreasing order;
vecr,veci: float vecr[1:n,1:numval], veci[1:n,1:numval];
    exit: the calculated eigenvectors; the complex eigenvector with real part vecr[1:n,i] and the imaginary part veci[1:n,i] corresponds to the eigenvalue val[i], i=1,...,numval;
em: float em[0:9];
    entry:
    em[0]: the machine precision;
    em[2]: the relative tolerance used for the eigenvalues; more precisely, the tolerance for each eigenvalue $\lambda$ is $|\lambda|$*em[2]+em[1]*em[0];
    em[4]: the orthogonalization parameter (for example, em[4]=0.01);
    em[6]: the tolerance for the eigenvectors;
    em[8]: the maximum number of inverse iterations allowed for the calculation of each eigenvector;
    exit:
    em[1]: an estimate of a norm of the original matrix;
    em[3]: the number of iterations performed;
    em[5]: the number of eigenvectors involved in the last Gram-Schmidt orthogonalization;
    em[7]: the maximum Euclidean norm of the residues of the calculated eigenvectors;
    em[9]: the largest number of inverse iterations performed for the calculation of some eigenvector; if, however, for some calculated eigenvector, the Euclidean norm of the residues remains greater than em[1]*em[6], then em[9]=em[8]+1.

Functions used:

hshhrmtri, valsymtri, vecsymtri, bakhrmtri.
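The storage convention for the Hermitian input (real parts in the upper triangle including the diagonal, imaginary parts of the strict lower triangle below it) can be sketched as a small packing routine. The names pack_hermitian, hr, hi and NMAX are hypothetical, and the arrays are 0-based here, whereas the library arrays are indexed from 1.

```c
#include <assert.h>

#define NMAX 8  /* illustrative size limit, an assumption of this sketch */

/* Pack a Hermitian matrix H = hr + i*hi into one real array a:
   a[i][j] = Re H[i][j] for i <= j, a[i][j] = Im H[i][j] for i > j. */
void pack_hermitian(int n, double hr[NMAX][NMAX], double hi[NMAX][NMAX],
                    double a[NMAX][NMAX])
{
    int i, j;

    assert(n >= 1 && n <= NMAX);
    for (i = 0; i < n; i++)
        for (j = 0; j < n; j++)
            a[i][j] = (i <= j) ? hr[i][j] : hi[i][j];
}
```

Because H is Hermitian, the discarded entries carry no extra information: the real part is symmetric and the imaginary part is antisymmetric with a zero diagonal.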

void eighrm(float **a, int n, int numval, float val[], float **vecr, float **veci, float em[])
{
    float *allocate_real_vector(int, int);
    void free_real_vector(float *, int);
    void hshhrmtri(float **, int, float [], float [], float [], float [], float [], float []);
    void valsymtri(float [], float [], int, int, int, float [], float []);
    void vecsymtri(float [], float [], int, int, int, float [], float **, float []);
    void bakhrmtri(float **, int, int, int, float **, float **, float [], float []);
    float *bb,*tr,*ti,*d,*b;

    bb=allocate_real_vector(1,n-1);
    tr=allocate_real_vector(1,n-1);
    ti=allocate_real_vector(1,n-1);
    d=allocate_real_vector(1,n);
    b=allocate_real_vector(1,n);
    hshhrmtri(a,n,d,b,bb,em,tr,ti);
    valsymtri(d,bb,n,1,numval,val,em);
    b[n]=0.0;
    vecsymtri(d,b,n,1,numval,val,vecr,em);
    bakhrmtri(a,n,1,numval,vecr,veci,tr,ti);
    free_real_vector(bb,1);
    free_real_vector(tr,1);
    free_real_vector(ti,1);
    free_real_vector(d,1);
    free_real_vector(b,1);
}

C. qrivalhrm

Determines all eigenvalues $\lambda_j$ of the n×n Hermitian matrix A. A is first reduced by a similarity transformation to real tridiagonal form T by calling hshhrmtrival. The eigenvalues $\lambda_j$ of T are then determined by QR iteration using a call of qrivalsymtri.

Function Parameters:
int qrivalhrm (a,n,val,em)
qrivalhrm: given the value 0 provided the QR iteration is completed within em[4] iterations; otherwise, qrivalhrm is given the number of eigenvalues, k, not calculated and only the last n-k elements of val are approximate eigenvalues of the original Hermitian matrix;
a: float a[1:n,1:n];
    entry: the real part of the upper triangle of the Hermitian matrix must be given in the upper triangular part of a (the elements a[i,j], i≤j); the imaginary part of the strict lower triangle of the Hermitian matrix must be given in the strict lower part of a (the elements a[i,j], i>j);
    exit: the array elements are altered;
n: int;
    entry: the order of the given matrix;
val: float val[1:n];
    exit: the calculated eigenvalues;
em: float em[0:5];
    entry:
    em[0]: the machine precision;
    em[2]: the relative tolerance used for the QR iteration;
    em[4]: the maximum allowed number of iterations;
    exit:
    em[1]: an estimate of a norm of the original matrix;
    em[3]: the maximum absolute value of the codiagonal elements neglected;
    em[5]: the number of iterations performed; em[5]=em[4]+1 when qrivalhrm≠0.

Functions used:

hshhrmtrival, qrivalsymtri.

int qrivalhrm(float **a, int n, float val[], float em[])
{
    float *allocate_real_vector(int, int);
    void free_real_vector(float *, int);
    void hshhrmtrival(float **, int, float [], float [], float []);
    int qrivalsymtri(float [], float [], int, float []);
    int i;
    float *bb;

    bb=allocate_real_vector(1,n);
    hshhrmtrival(a,n,val,bb,em);
    bb[n]=0.0;
    i=qrivalsymtri(val,bb,n,em);
    free_real_vector(bb,1);
    return i;
}

D. qrihrm

Determines all eigenvalues $\lambda_j$ and corresponding eigenvectors $u^{(j)}$ of the n×n Hermitian matrix A. A is first reduced to similar real tridiagonal form T by a call of hshhrmtri. The eigenvalues $\lambda_j$ and corresponding eigenvectors $v^{(j)}$ of T are then determined by QR iteration, using a call of qrisymtri. The eigenvectors $u^{(j)}$ of A are then obtained from the $v^{(j)}$ by means of a call of bakhrmtri.

Function Parameters:
int qrihrm (a,n,val,vr,vi,em)
qrihrm: qrihrm=0, provided the process is completed within em[4] iterations; otherwise, qrihrm is given the number of eigenvalues, k, not calculated and only the last n-k elements of val are approximate eigenvalues and the columns of the arrays vr, vi[1:n,n-k:n] are approximate eigenvectors of the original Hermitian matrix;
a: float a[1:n,1:n];
    entry: the real part of the upper triangle of the Hermitian matrix must be given in the upper triangular part of a (the elements a[i,j], i≤j); the imaginary part of the strict lower triangle of the Hermitian matrix must be given in the strict lower part of a (the elements a[i,j], i>j);
    exit: the array elements are altered;
n: int;
    entry: the order of the given matrix;
val: float val[1:n];
    exit: the calculated eigenvalues;
vr,vi: float vr[1:n,1:n], vi[1:n,1:n];
    exit: the calculated eigenvectors; the complex eigenvector with real part vr[1:n,i] and the imaginary part vi[1:n,i] corresponds to the eigenvalue val[i], i=1,...,n;
em: float em[0:5];
    entry:
    em[0]: the machine precision;
    em[2]: the relative tolerance for the QR iteration;
    em[4]: the maximum allowed number of iterations (for example, em[4]=10n);
    exit:
    em[1]: an estimate of a norm of the original matrix;
    em[3]: the maximum absolute value of the codiagonal elements neglected;
    em[5]: the number of iterations performed; em[5]=em[4]+1 when qrihrm≠0.

Functions used:

hshhrmtri, qrisymtri, bakhrmtri.

int qrihrm(float **a, int n, float val[], float **vr, float **vi, float em[])
{
    float *allocate_real_vector(int, int);
    void free_real_vector(float *, int);
    void hshhrmtri(float **, int, float [], float [], float [], float [], float [], float []);
    int qrisymtri(float **, int, float [], float [], float [], float []);
    void bakhrmtri(float **, int, int, int, float **, float **, float [], float []);
    int i,j;
    float *b,*bb,*tr,*ti;

    b=allocate_real_vector(1,n);
    bb=allocate_real_vector(1,n);
    tr=allocate_real_vector(1,n-1);
    ti=allocate_real_vector(1,n-1);
    hshhrmtri(a,n,val,b,bb,em,tr,ti);
    for (i=1; i<=n; i++) {
        vr[i][i]=1.0;
        for (j=i+1; j<=n; j++) vr[i][j]=vr[j][i]=0.0;
    }
    b[n]=bb[n]=0.0;
    i=qrisymtri(vr,n,val,b,bb,em);
    bakhrmtri(a,n,i+1,n,vr,vi,tr,ti);
    free_real_vector(b,1);
    free_real_vector(bb,1);
    free_real_vector(tr,1);
    free_real_vector(ti,1);
    return i;
}

3.13.9 Complex upper-Hessenberg matrices

A. valqricom

Determines the eigenvalues $\lambda_j$ $(j=1,\ldots,n)$ of an n×n complex upper Hessenberg matrix A ($A_{ij}=0$ for i>j+1) with real subdiagonal elements $A_{j+1,j}$, by QR iteration. For further details see the documentation of the procedure qricom.

Function Parameters:
int valqricom (a1,a2,b,n,em,val1,val2)
valqricom: given the value 0 provided the process is completed within em[4] iterations; otherwise given the number, k, of eigenvalues not calculated and only the last n-k elements of the arrays val1 and val2 are approximate eigenvalues of the upper Hessenberg matrix;
a1,a2: float a1[1:n,1:n], a2[1:n,1:n];
    entry: the real part and the imaginary part of the upper triangle of the upper Hessenberg matrix must be given in the corresponding parts of the arrays a1 and a2;
    exit: the array elements in the upper triangle of a1 and a2 are altered;
b: float b[1:n-1];
    entry: the subdiagonal of the upper Hessenberg matrix;
    exit: the elements of b are altered;
n: int;
    entry: the order of the given matrix;
em: float em[0:5];
    entry:
    em[0]: the machine precision;
    em[1]: an estimate of the norm of the upper Hessenberg matrix (e.g. the sum of the infinity norms of the real and imaginary parts of the matrix);
    em[2]: the relative tolerance for the QR iteration;
    em[4]: the maximum allowed number of iterations (e.g. 10n);
    exit:
    em[3]: the maximum absolute value of the subdiagonal elements neglected;
    em[5]: the number of iterations performed; em[5]=em[4]+1 in the case valqricom≠0;
val1,val2: float val1[1:n], val2[1:n];
    exit: the real part and the imaginary part of the calculated eigenvalues are delivered in val1 and val2, respectively.

Functions used:

comkwd, rotcomrow, rotcomcol, comcolcst.
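The norm estimate suggested for em[1], the sum of the infinity norms of the real and imaginary parts of the matrix, can be computed as sketched below. The function names inf_norm and hessenberg_norm_estimate, the constant NMAX and the 0-based full-matrix storage are assumptions of this example; the library itself expects the matrix split over a1, a2 and b as described in the parameter list.

```c
#include <assert.h>
#include <math.h>

#define NMAX 8  /* illustrative size limit, an assumption of this sketch */

/* Infinity norm (maximum absolute row sum) of an n x n matrix. */
static double inf_norm(int n, double m[NMAX][NMAX])
{
    double best = 0.0, sum;
    int i, j;

    for (i = 0; i < n; i++) {
        sum = 0.0;
        for (j = 0; j < n; j++) sum += fabs(m[i][j]);
        if (sum > best) best = sum;
    }
    return best;
}

/* Estimate for em[1]: infinity norm of the real part plus infinity
   norm of the imaginary part of the matrix hr + i*hi. */
double hessenberg_norm_estimate(int n, double hr[NMAX][NMAX],
                                double hi[NMAX][NMAX])
{
    assert(n >= 1 && n <= NMAX);
    return inf_norm(n, hr) + inf_norm(n, hi);
}
```

The result is an upper bound for the infinity norm of the complex matrix, which is sufficient for scaling the tolerance em[1]*em[2] used in the deflation test.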

int valqricom(float **a1, float **a2, float b[], int n, float em[], float val1[], float val2[])
{
    void comcolcst(int, int, int, float **, float **, float, float);
    void rotcomcol(int, int, int, int, float **, float **, float, float, float);
    void rotcomrow(int, int, int, int, float **, float **, float, float, float);
    void comkwd(float, float, float, float, float *, float *, float *, float *);
    int nm1,i,i1,q,q1,max,count;
    float r,z1,z2,dd1,dd2,cc,g1,g2,k1,k2,hc,a1nn,a2nn,aij1,aij2,
          ai1i,kappa,nui,mui1,mui2,muim11,muim12,nuim1,tol;

    tol=em[1]*em[2];
    max=em[4];
    count=0;
    r=0.0;
    if (n > 1) hc=b[n-1];
    do {
        nm1=n-1;
        i=n;
        do {
            q=i;
            i--;
        } while ((i >= 1) ? (fabs(b[i]) > tol) : 0);
        if (q > 1)
            if (fabs(b[q-1]) > r) r=fabs(b[q-1]);
        if (q == n) {
            val1[n]=a1[n][n];
            val2[n]=a2[n][n];
            n=nm1;
            if (n > 1) hc=b[n-1];
        } else {
            dd1=a1[n][n];
            dd2=a2[n][n];
            cc=b[nm1];
            comkwd((a1[nm1][nm1]-dd1)/2.0,(a2[nm1][nm1]-dd2)/2.0,
                   cc*a1[nm1][n],cc*a2[nm1][n],&g1,&g2,&k1,&k2);
            if (q == nm1) {
                val1[nm1]=g1+dd1;
                val2[nm1]=g2+dd2;
                val1[n]=k1+dd1;
                val2[n]=k2+dd2;
                n -= 2;
                if (n > 1) hc=b[n-1];
            } else {
                count++;
                if (count > max) break;
                z1=k1+dd1;
                z2=k2+dd2;
                if (fabs(cc) > fabs(hc)) z1 += fabs(cc);
                hc=cc/2.0;
                i=q1=q+1;
                aij1=a1[q][q]-z1;
                aij2=a2[q][q]-z2;
                ai1i=b[q];
                kappa=sqrt(aij1*aij1+aij2*aij2+ai1i*ai1i);
                mui1=aij1/kappa;
                mui2=aij2/kappa;
                nui=ai1i/kappa;
                a1[q][q]=kappa;
                a2[q][q]=0.0;
                a1[q1][q1] -= z1;
                a2[q1][q1] -= z2;
                rotcomrow(q1,n,q,q1,a1,a2,mui1,mui2,nui);
                rotcomcol(q,q,q,q1,a1,a2,mui1,-mui2,-nui);
                a1[q][q] += z1;
                a2[q][q] += z2;
                for (i1=q1+1; i1

j+1) with real subdiagonal elements $A_{j+1,j}$. A is transformed into a complex upper triangular matrix U by means of the Francis QR iteration [Fr61, Wi65]. The diagonal elements of U are its eigenvalues; the eigenvectors of U are obtained by solving the associated upper triangular system, and the eigenvectors of A are recovered by back transformation.

Function Parameters:
int qricom (a1,a2,b,n,em,val1,val2,vec1,vec2)
qricom: given the value 0 provided the process is completed within em[4] iterations; otherwise given the number, k, of eigenvalues not calculated and only the last n-k elements of the arrays val1 and val2 are approximate eigenvalues of the upper Hessenberg matrix and no useful eigenvectors are delivered;
a1,a2: float a1[1:n,1:n], a2[1:n,1:n];
    entry: the real part and the imaginary part of the upper triangle of the upper Hessenberg matrix must be given in the corresponding parts of the arrays a1 and a2;
    exit: the array elements in the upper triangle of a1 and a2 are altered;
b: float b[1:n-1];
    entry: the real subdiagonal of the upper Hessenberg matrix;
    exit: the elements of b are altered;
n: int;
    entry: the order of the given matrix;
em: float em[0:5];
    entry:
    em[0]: the machine precision;
    em[1]: an estimate of the norm of the upper Hessenberg matrix (e.g. the sum of the infinity norms of the real and imaginary parts of the matrix);
    em[2]: the relative tolerance for the QR iteration;
    em[4]: the maximum allowed number of iterations (e.g. 10n);
    exit:
    em[3]: the maximum absolute value of the subdiagonal elements neglected;
    em[5]: the number of iterations performed; em[5]=em[4]+1 in the case qricom≠0;
val1,val2: float val1[1:n], val2[1:n];
    exit: the real part and the imaginary part of the calculated eigenvalues are delivered in val1 and val2, respectively;
vec1,vec2: float vec1[1:n,1:n], vec2[1:n,1:n];
    exit: the eigenvectors of the upper Hessenberg matrix; the eigenvector with real part vec1[1:n,j] and imaginary part vec2[1:n,j] corresponds to the eigenvalue val1[j]+val2[j]*i, j=1,...,n.

Functions used:

comkwd, rotcomrow, rotcomcol, comcolcst, comrowcst, matvec, commatvec, comdiv.

int qricom(f1oat **al, float **a2, float b[l , int n, float em[], float vall[l, float va12[1, float **vecl, float **vec2) float *allocate-real-vector(int, int); void free-real-vector(f1oat *, int);

Copyright 1995 by CRC Press, Inc

void comkwd(float, float, float, float, float *, float *, float * , float * ) ; void rotcornrow(int, int, int, int, float **, float **, float, float, float); void rotcorncol (int, int, int, int, float **, float **, float, float, float); void comcolcst(int, int, int, float * * , float **, float, float); void comrowcst (int, int, int, float **, float **, float, float) ; float matvec(int, int, int, float **, float [I) ; void comrnatvec(int, int, int, float **, float **, float [I, float [I, float * , float * ) ; void comdiv(float, float, float, float, float * , float * ) ; int m,nrnl,i,il,j ,q,ql,max,count; float r,zl,22,ddl,dd2,cc,pl,p2,tl,t2,deltal,delta2,mvl,rnv2,h,hl, h2,gl,g2,kl,k2,hc,aij12,aij22,alnn,a2nn,ai~l,ai]2,aili, kappa,nui,muil,mui2,rnuimll,rnuirnl2,nuiml,tol,rnachtol, *tfl,*tf2; tfl=allocate-real-vector(1,n); tf2=allocate-real-vector(1,n); tol=ern[ll *em [21 ; machtol=em [OI*em [ll ; max=em [41 ; count=O; r=O . 0 ; m=n; if (n > 1) hc=b [n-11; for (i=l; i tol) : 0); if (q > 1) if (fabs(b[q-11) > r) r=fabs (b[q-11 ; if (q == n) { vall [n]=a1 [nl [nl ; val2 11-11 =a2 [nl [nl ; n=nml ; if (n > 1) hc=b[n-11; ) else ( d d k a l [nl [nl ; dd2=a2 11-11 [nl ; cc=b [nmll ; pl= (a1[nmll [nrnll -ddl)*O.5; p2=(a2 [nmll [nmll -dd2)*O.S; comkwd (pl,p2,cc*al [nmll [nl , cc*a2 [nmll [nl ,&gl,&g2,&kl,&k2) ; if (q == nml) { a1 [nl [nl =vall [n]=gl+ddl; a2 [nl [nl =val2 [n]=g2+dd2; a1 [ql [q]=vall [ql =kl+ddl; a2 [q] [q]=va12 [ql =k2+dd2; kappa=sqrt(kl*kl+k2*k2+cc*cc); nui=cc/kappa; muil=kl/kappa; mui2=k2/kappa; aijl=al [ql [nl ; aij2=a2 [ql [nl ; hl=muil*muil-mui2*mui2; h2=2.0*muil*mui2; h = -nui*2.0; a1 [ql [n]=h* (pl*rnuil+p2*rnui2)-nui*nui*cc+aijl*hl+aij2*h2; a2 [ql [n]=h* (p2*rnuil-pl*mui2)+aij2*hl-aijl*h2; rotcomrow (q+2,m,q,n,al,a2,muil,mui2,nui); rotcomcol (1,q-1,q,n,al,a2,muil,-mui2,-nui); rotcomcol(l,m,q,n,vecl,vec2,muil,-rnui2,-nui); n - = 2;


if (n > 1) hc=b In-11 ; b[q]=O.O; ) else ( count++; if (count > rnax) ( em131 =r; em 151 =count; free-real-vector (tf1,1) ; free-real-vector (tf2,l); return n;

1 Ll=kl+ddl; z2=k2+dd2; if (fabs(cc) > fabs(hc)) zl += fabs(cc); hc=cc/2.0; ql=q+l; aijl=al [ql [ql-21; aij2=a2 [ql [ql- z2;

nui=aili/kappa; a1 [ql [ql=kappa; a2 [ql [ql=O .O; a1 [qll [qll - = zl; a2 [qll [qll - = 22 ; rotcomrow(ql,m,q,ql,al,a2,muil,mui2,nui) ; rotcomcol(1,q,q,ql,al,a2,muil,-rnui2,- n u i ; a1 [ql [ql += zl; a2 [ql [ql += 22; rotcomcol(l,m,q,ql,vecl,vec2,muil,-mui2,-nui) ; for (i=ql; i=2; j--) { tfl[jl=l.O; tf2 [jl=O.O; tl=al[jl [jl; t2=a2 [jl [jl ; for (i=j-1; i>=l; i--) { deltal=tl-a1[il [il ; delta2=t2-a2[il [il ; commatvec(i+l,j,i,al,a2,tfl,tf2,&mvl,&mv2) ; if (fabs(delta1) c machtol && fabs(delta2) c machtol) { tf 1 [il=mvl/machtol; tf2 [il=mv2/rnachtol; ) else corndiv(rnvl,rnv2,deltal,delta2,&tfl~il,&tf2 [ i l ) ; for (i=l; ic=m; i++) commatvec(1,j ,i,vecl,vec2,tfl,tf2,&vecl [il [jl ,&vec2 [il [jl )

;

em[3]=r; em 151 =count; free-real-vector (tfl,1); free-real-vector (tf2,l); return n;

1

3.13.10 Complex full matrices

A. eigvalcom

Computes the eigenvalues λ_j (j=1,...,n) of an n×n complex matrix A. A is first transformed to equilibrated form A' (by means of a call of eqilbrcom); A' is then transformed to complex upper Hessenberg form H with real subdiagonal elements (by means of a call of hshcomhes). The eigenvalues λ_j (j=1,...,n) of H are then determined by QR iteration (by means of a call of valqricom).

Function Parameters:
    int eigvalcom (ar,ai,n,em,valr,vali)
eigvalcom: given the value 0 provided the process is completed within em[4] iterations; otherwise given the number, k, of eigenvalues not calculated, in which case only the last n-k elements of the arrays valr and vali are approximate eigenvalues of the original matrix;
ar,ai:  float ar[1:n,1:n], ai[1:n,1:n];
        entry: the real part and the imaginary part of the matrix must be given in the arrays ar and ai, respectively;
        exit:  the array elements of ar and ai are altered;
n:      int;
        entry: the order of the given matrix;
em:     float em[0:7];
        entry:
        em[0]: the machine precision;
        em[2]: the relative tolerance for the QR iteration;
        em[4]: the maximum allowed number of QR iterations (e.g. 10n);
        em[6]: the maximum allowed number of iterations for equilibrating the original matrix (e.g. em[6]=n*n/2);
        exit:
        em[1]: the Euclidean norm of the equilibrated matrix;
        em[3]: the maximum absolute value of the subdiagonal elements neglected in the QR iteration;
        em[5]: the number of QR iterations performed; em[5]=em[4]+1 in the case eigvalcom≠0;
        em[7]: the number of iterations performed for equilibrating the original matrix;
valr,vali: float valr[1:n], vali[1:n];
        exit: the real part and the imaginary part of the calculated eigenvalues are delivered in valr and vali, respectively.

Functions used:

eqilbrcom, comeucnrm, hshcomhes, valqricom.

int eigvalcom(float **ar, float **ai, int n, float em[], float valr[], float vali[])
{
    int *allocate_integer_vector(int, int);
    float *allocate_real_vector(int, int);
    void free_integer_vector(int *, int);
    void free_real_vector(float *, int);
    void hshcomhes(float **, float **, int, float [], float [], float [], float [], float []);
    float comeucnrm(float **, float **, int, int);
    void eqilbrcom(float **, float **, int, float [], float [], int []);
    int valqricom(float **, float **, float [], int, float [], float [], float []);
    int i,*ind;
    float *d,*b,*del,*tr,*ti;

    ind=allocate_integer_vector(1,n);
    d=allocate_real_vector(1,n);
    b=allocate_real_vector(1,n);
    del=allocate_real_vector(1,n);
    tr=allocate_real_vector(1,n);
    ti=allocate_real_vector(1,n);
    /* equilibrate, reduce to Hessenberg form, then run the QR iteration */
    eqilbrcom(ar,ai,n,em,d,ind);
    em[1]=comeucnrm(ar,ai,n-1,n);
    hshcomhes(ar,ai,n,em,b,tr,ti,del);
    i=valqricom(ar,ai,b,n,em,valr,vali);
    free_integer_vector(ind,1);
    free_real_vector(d,1);
    free_real_vector(b,1);
    free_real_vector(del,1);
    free_real_vector(tr,1);
    free_real_vector(ti,1);
    return i;
}
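The entry fields of em described above can be filled by a small helper before calling eigvalcom. The following is a minimal sketch under stated assumptions, not part of the library: the name init_em_complex_eig is hypothetical, FLT_EPSILON is assumed to be an acceptable machine precision for em[0], and the tolerance 1.0e-5 for em[2] is an illustrative choice. Only the values for em[4] and em[6] follow the examples given in the parameter list.

```c
#include <assert.h>
#include <float.h>

/* Fills the entry fields of em[0:7] as documented for eigvalcom.
   Hypothetical helper: the name and the tolerance value are assumptions. */
void init_em_complex_eig(float em[8], int n)
{
    assert(n >= 1);
    em[0] = FLT_EPSILON;          /* machine precision (assumed: float epsilon) */
    em[2] = 1.0e-5f;              /* relative QR tolerance (illustrative value) */
    em[4] = (float)(10 * n);      /* maximum QR iterations, "e.g. 10n" */
    em[6] = (float)(n * n / 2);   /* maximum equilibration iterations, "e.g. n*n/2" */
    /* em[1], em[3], em[5], em[7] are exit fields; cleared here for tidiness. */
    em[1] = em[3] = em[5] = em[7] = 0.0f;
}
```

After the call, em[5] and em[7] report the work actually done, so comparing em[5] with em[4] is one way to see how close the QR iteration came to its limit.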

B. eigcom

Computes the eigenvalues λ_j and eigenvectors u^(j) (j=1,...,n) of an n×n complex matrix A. A is first transformed to equilibrated form A' (by means of a call of eqilbrcom); A' is then transformed to complex upper Hessenberg form H with real subdiagonal elements (by means of a call of hshcomhes). The eigenvalues λ_j and eigenvectors v^(j) (j=1,...,n) of H are then determined by QR iteration (by means of a call of qricom), and the u^(j) are then recovered from the v^(j) by two successive back transformations (by means of calls of bakcomhes and baklbrcom).


Function Parameters:

    int eigcom (ar,ai,n,em,valr,vali,vr,vi)
eigcom: given the value 0 provided the process is completed within em[4] iterations; otherwise given the number, k, of eigenvalues not calculated, in which case only the last n-k elements of the arrays valr and vali are approximate eigenvalues of the original matrix and no useful eigenvectors are delivered;
ar,ai:  float ar[1:n,1:n], ai[1:n,1:n];
        entry: the real part and the imaginary part of the matrix must be given in the arrays ar and ai, respectively;
        exit:  the array elements of ar and ai are altered;
n:      int;
        entry: the order of the given matrix;
em:     float em[0:7];
        entry:
        em[0]: the machine precision;
        em[2]: the relative tolerance for the QR iteration;
        em[4]: the maximum allowed number of QR iterations (e.g. 10n);
        em[6]: the maximum allowed number of iterations for equilibrating the original matrix (e.g. em[6]=n*n/2);
        exit:
        em[1]: the Euclidean norm of the equilibrated matrix;
        em[3]: the maximum absolute value of the subdiagonal elements neglected in the QR iteration;
        em[5]: the number of QR iterations performed; em[5]=em[4]+1 in the case eigcom≠0;
        em[7]: the number of iterations performed for equilibrating the original matrix;
valr,vali: float valr[1:n], vali[1:n];
        exit: the real part and the imaginary part of the calculated eigenvalues are delivered in valr and vali, respectively;
vr,vi:  float vr[1:n,1:n], vi[1:n,1:n];
        exit: the eigenvectors of the matrix; the normalized eigenvector with real part vr[1:n,j] and imaginary part vi[1:n,j] corresponds to the eigenvalue valr[j]+vali[j]*i, j=1,...,n.

Functions used:

eqilbrcom, comeucnrm, hshcomhes, qricom, bakcomhes, baklbrcom, sclcom.

int eigcom(float **ar, float **ai, int n, float em[], float valr[], float vali[], float **vr, float **vi)
{
    int *allocate_integer_vector(int, int);
    float *allocate_real_vector(int, int);
    void free_integer_vector(int *, int);
    void free_real_vector(float *, int);
    void eqilbrcom(float **, float **, int, float [], float [], int []);
    float comeucnrm(float **, float **, int, int);
    void hshcomhes(float **, float **, int, float [], float [], float [], float [], float []);
    int qricom(float **, float **, float [], int, float [], float [], float [], float **, float **);
    void bakcomhes(float **, float **, float [], float [], float [], float **, float **, int, int, int);
    void baklbrcom(int, int, int, float [], int [], float **, float **);
    void sclcom(float **, float **, int, int, int);
    int i,*ind;
    float *d,*b,*del,*tr,*ti;

    ind=allocate_integer_vector(1,n);
    d=allocate_real_vector(1,n);
    b=allocate_real_vector(1,n);
    del=allocate_real_vector(1,n);
    tr=allocate_real_vector(1,n);
    ti=allocate_real_vector(1,n);
    eqilbrcom(ar,ai,n,em,d,ind);
    em[1]=comeucnrm(ar,ai,n-1,n);
    hshcomhes(ar,ai,n,em,b,tr,ti,del);
    i=qricom(ar,ai,b,n,em,valr,vali,vr,vi);
    if (i == 0) {
        /* back transformations and scaling only when all eigenvalues were found */
        bakcomhes(ar,ai,tr,ti,del,vr,vi,n,1,n);
        baklbrcom(n,1,n,d,ind,vr,vi);
        sclcom(vr,vi,n,1,n);
    }
    free_integer_vector(ind,1);
    free_real_vector(d,1);
    free_real_vector(b,1);
    free_real_vector(del,1);
    free_real_vector(tr,1);
    free_real_vector(ti,1);
    return i;
}

3.14 The generalized eigenvalue problem

3.14.1 Real asymmetric matrices

A. qzival

Solves the generalized matrix eigenvalue problem Ax=λBx by means of QZ iteration [MS71]. Given two n×n matrices A and B, qzival determines complex α^(j) and real β^(j) such that β^(j)A-α^(j)B is singular (j=1,...,n). QZ iteration may fail to converge in the determination of a sequence α^(i), β^(i) (i=1,...,m). Such failure is signalled at return from qzival by the allocation of the value -1 to the array elements iter[i] (i=1,...,m) and the allocation of a nonnegative value, that of the number of iterations required in each case, to the array elements iter[j] (j=m+1,...,n) to signal success in the determination of the remaining pairs of numbers α^(j), β^(j). In particular, if iter[1] contains the value 0 at exit, then the computations have been completely successful.

If QZ iteration is completely successful (i.e. m=0 above), A and B are both reduced by unitary transformations to quasi-upper triangular form U and upper triangular form V respectively: U has 1×1 or 2×2 blocks on the principal diagonal, but U_{i,j}=0 for i>j+1 and U_{i+1,i}=0 for those elements not belonging to the blocks; V_{i,j}=0 for i>j. The sets α^(j), β^(j) (j=m+1,...,n) may contain complex conjugate pairs in the sense that for some k≥m+1, α^(k)/β^(k) = conjugate of {α^(k+1)/β^(k+1)}; the components of such pairs are stored consecutively in the output arrays alfr, alfi and beta.

Function Parameters:
    void qzival (n,a,b,alfr,alfi,beta,iter,em)
n:     int;
       entry: the number of rows and columns of the matrices a and b;
a:     float a[1:n,1:n];
       entry: the given matrix;
       exit:  a quasi upper triangular matrix (value of U above);
b:     float b[1:n,1:n];
       entry: the given matrix;
       exit:  an upper triangular matrix (value of V above);
alfr:  float alfr[1:n];
       exit:  the real part of α^(j) given in alfr[j], j=m+1,...,n, in the above;
alfi:  float alfi[1:n];
       exit:  the imaginary part of α^(j) given in alfi[j], j=m+1,...,n, in the above;
beta:  float beta[1:n];
       exit:  the value of β^(j) given in beta[j], j=m+1,...,n, in the above;
iter:  int iter[1:n];
       exit:  trouble indicator and iteration counter, see above; if iter[1]=0 then no trouble is signalized;
em:    float em[0:1];
       entry:
       em[0]: the smallest positive machine number;
       em[1]: the relative precision of elements of a and b.
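Since β^(j)A-α^(j)B is singular, each converged pair corresponds to the eigenvalue λ = α/β of Ax=λBx whenever β is nonzero. The sketch below shows one way a caller might turn a single output triple into an eigenvalue. It is an illustration only: the function name gen_eig_ratio and the status codes are hypothetical, and treating beta equal to zero as an infinite eigenvalue is an assumption of this sketch rather than something stated in the text.

```c
#include <assert.h>

/* Hypothetical status codes for one (alfr, alfi, beta, iter) entry. */
enum { GEN_EIG_FINITE = 0, GEN_EIG_INFINITE = 1, GEN_EIG_NOT_CONVERGED = 2 };

/* Converts one qzival output entry into lambda = (alfr + i*alfi)/beta.
   re and im are written only when GEN_EIG_FINITE is returned. */
int gen_eig_ratio(float alfr, float alfi, float beta, int iter,
                  float *re, float *im)
{
    assert(re != 0 && im != 0);
    if (iter < 0)                 /* iter[j] = -1 signals failure for this pair */
        return GEN_EIG_NOT_CONVERGED;
    if (beta == 0.0f)             /* no finite ratio exists (assumed convention) */
        return GEN_EIG_INFINITE;
    *re = alfr / beta;
    *im = alfi / beta;
    return GEN_EIG_FINITE;
}
```

Keeping α and β separate until this last step avoids overflow in the library itself; the division is done only where the caller actually needs λ.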

Functions used:

elmcol, hshdecmul, hestgl2, hsh2col, hsh3col, hsh2row2, hsh3row2, chsh2, hshvecmat, hshvectam.

void qzival(int n, float **a, float **b, float alfr[], float alfi[], float beta[], int iter[], float em[])
{
    void elmcol(int, int, int, int, float **, float **, float);
    void hshdecmul(int, float **, float **, float);
    void hestgl2(int, float **, float **);
    void hsh2row2(int, int, int, int, float, float, float **, float **);
    void hsh3row2(int, int, int, float, float, float, float **, float **);
    void hsh2col(int, int, int, int, float, float, float **, float **);
    void hsh3col(int, int, int, int, float, float, float, float **, float **);
    void chsh2(float, float, float, float, float *, float *, float *);
    void hshvecmat(int, int, int, int, float, float [], float **);
    void hshvectam(int, int, int, int, float, float [], float **);
    int i,q,m,m1,q1,j,k,k1,k2,k3,km1,stationary,goon,l,out;
    float dwarf,eps,epsa,epsb,
        anorm,bnorm,ani,bni,constt,a10,a20,a30,b11,b22,b33,b44,a11,
        a12,a21,a22,a33,a34,a43,a44,b12,b34,old1,old2,

an,bn,e,c,d,er,ei,allr,al1i,al2r,a12ira2lr, a21i,a22r,a22i, cz,szr,szi,cq,sqr,sqi,ssr,ssi,tr,ti,bdr,bdi,r; dwarf=em [OI ; eps=em Ill ; hshdecmul (n,a,b,dwarf) ; hestgl2 (n,a,b) ; anorm=bnorm=O.O; for (i=l; i 1) ? fabs(a[il [i-11) : 0.0; for (j=i; j >

==


anorm) anorm=ani; bnorm) bnorm=bni; 0.0) anorm=eps;

if (bnorm == 0.0) bnorm=eps; epsa=eps*anorm; epsb=eps*bnorm; m=n; out=O; do I , I=q=m; while ((i > 1) ? fabs(a[il [i-11) > epsa q=i-1;

:

0) (

I--.

1

if (q > 1) a [q] [q-ll=0.0; goon=l; while (goon) ( if (q s = m-1) { m=q-1 ; goon=0 ; ) else ( if (fabs(b[ql [ql ) c= epsb) {

q=q1;

) else (

goon=O; ml=m-1; ql=q+l; constt=0.75; (iter[ml ) ++; stationary = (iter[ml == 1) ? 1 : (fabs(a [ml [m-11 ) >= constt*oldl && fabs(a[m-11 [m-21) >= constt*old2); if (iter[ml > 30 && stationary) ( for (i=l; ic=m; i++) iter[il = out=1; break;

1

if (iter[m] == 10 a10=0.0; a20=1.0; a30=1.1605; 2

&&

stationary) (

----

blllb [ql [ql ; b22 = (fabs(b[qll [qll ) < epsb) ? epsb : b [qll [qll ; b33 = (fabs (b[mll [mll ) c epsb) ? epsb : b [mll [mll ; b44 = ( fabs (b[ml [ml ) c epsb) ? epsb : b [ml [ml ; all=a [ql [ql/bll; a12=a Iql [qll/b22; a21=a [qll [ql/bll; a22=a [qll [qll/b22; a33=a [mll [mll/b33 ; a34=a [mll [ml /b44; a43=a [ml [mll/b33 ; a44=a [ml [ml/b44 ; bl2=b [ql [qll/b22; b34=b [mll [ml /b44; alO=((a33-all)*(a44-all)-a34*a43+a43*b34*all)/a21+ a12-all*b12; - (a33-all)- (a44-all)+a43*b34; a20= (a22-all-a21*b12) a30=a [q+21 [qll /b22;

I

oldkf abs (a[ml Em-11 ) ; old2=fabs(a[m-11[m-21); for (k=q; kc=ml; k++) { kl=k+l; k2=k+2; k3 = (k+3 > m) ? m : k+3; if (k == q) hsh3col (kml,km1.n. k,a10,a20,a30,a,b) ; else ( hsh3col (kml,kml,n,k, a [kl [kmll ,


.

a [kll [kmll ,a[k21 [kmll ,a,b ) ; a [kl] [kmll=a [k21 [kmll=O .O;

\

) else { hsh2col (kml,kml,n,k,a [kl [kmll ,a[kll kmll ,a,b) ; a [kll [kml]=O .O; 1

1

) /* goon loop * / if (out) break; ) while (m >= 3) ; do (

(m > 1) ? (a[ml [m-11 == 0) : 1) { alfr [ml=a [ml [ml ; beta [ml=b [ml [ml ; alfi [ml=O.0; m-- . ) else { l=m-1; if (fabs(b[ll [ll ) = fabs (a2l)+fabs (a22) hsh2row2 (l,m,m, l,al2,all,a,b); else hsh2row2 (l,m,m, 1,a22,a2l,a,b) ; if (an >= fabs(e)*bn) hsh2col(l,l,n,l,b 111 Ell ,b[ml [I1 ,a,b);

if

(


a [m] [I]=b [ml [ll=O.0; alfr [ll=a [ll [ll ; alfr [ml =a [ml [ml ; beta [l]=b [ll [ll ; beta [m]=b [ml [ml ; alfi [m]=alfi [lI=O.0; ) else { er=e+c; ei=sqrt (-d); allr=all-erfbll; alli=ei*bll; a12r=a12-er*b12; a12i=ei*b12; a21r=a21; a2li=0.0; a22r=a22-er*b22; a22i=ei*b22; if (fabs(allr)+fabs (alli)+fabs (al2r)+fabs ( a fabs (a2lr)+fabs (a22r)+fibs (a22i)) chsh2 (al2r,al2i,-allr,-alli,&cz,&szr,&szi) ; else chsh2 (a22r,a22i,-a21r,-a2li,&cz,&szr,&szi) ; if (an >= (fabs(er)+fabs (ei)) *bn)

>=

chsh2(cz*bll+szr*bl2,szi*b12,szr*b22,szi*b22,

&cq,&sqr,&sqi) ; else chsh2(cz*all+szr*al2,szi*al2,cz*a2l+szr*a22,

szi*a22,&cq,&sqr,&sqi) ; ssr=sqr*szr+sqi*szi; ssi=sqr*szi-sqi*szr; tr=cq*cz*all+cq*szr*al2+sqr*cz*a2l+ssr*a22; ti=cq*szi*al2-sqi*cz*a21+ssi*a22; bdr=cq*cz*bll+cq*szr*b12+ssr*b22;

bdi=cq*szi*bl2+ssi*b22; r=sqrt (bdr*bdr+bdi*bdi); beta [ll =bn*r; alfr [ll =an* (tr*bdr+ti*bdi)/r; alfi [ll=an* (tr*bdi-ti*bdr)/r; tr=ssr*all-sqr*cz*al2-cq*szr*a2l+cq*cz*a22; ti = -ssi*all-sqi*c~*al2+cq*szi*a21; bdr=ssr*bll-sqr*cz*b12+cq*cz*b22; bdi = -ssi*bll-sqi*cz*bl2;

r=sqrt(bdr*bdr+bdi*bdi); beta [ml=bn*r; alfr [ml =an* (tr*bdr+ti*bdi)/r; alfi [ml =an* (tr*bdi-ti*bdr)/r;

B. qzi

Solves the generalized matrix eigenvalue problem Ax=λBx by means of QZ iteration [MS71]. The procedure qzi applies the same method as qzival. Given two n×n matrices A and B, qzi determines complex α^(j) and real β^(j) such that β^(j)A-α^(j)B is singular and vectors x^(j) such that β^(j)Ax^(j)=α^(j)Bx^(j) (j=1,...,n), the latter being normalized by the condition that max(|real(x_i^(j))|, |imag(x_i^(j))|)=1 (1≤i≤n) for each x^(j) and either real(x_i^(j))=1 or imag(x_i^(j)) for some x_i^(j).

With regard to the determination of the α^(j), β^(j), the remarks made in the documentation to qzival apply with equal force here. In particular, (a) QZ iteration may fail to converge in the determination of a sequence α^(i), β^(i) (i=1,...,m); this failure is signalled by the insertion of -1 in the array elements iter[i] (i=1,...,m), and for those α^(j), β^(j) that are determined, the number of QZ iterations required in each case is allocated to the array elements iter[j] (j=m+1,...,n); (b) a quasi-upper triangular matrix U and an upper triangular matrix V are produced; and (c) the sets α^(j), β^(j) (j=m+1,...,n) may contain complex conjugate pairs in the sense that for some k≥m+1, α^(k)/β^(k) = conjugate of {α^(k+1)/β^(k+1)}, and the components of such pairs are stored consecutively in the output arrays alfr, alfi and beta.

Function Parameters:
    void qzi (n,a,b,x,alfr,alfi,beta,iter,em)
n:     int;
       entry: the number of rows and columns of the matrices a, b and x;
a:     float a[1:n,1:n];
       entry: the given matrix;
       exit:  a quasi upper triangular matrix (value of U above);
b:     float b[1:n,1:n];
       entry: the given matrix;
       exit:  an upper triangular matrix (value of V above);
x:     float x[1:n,1:n];
       entry: the n×n unit matrix;
       exit:  the matrix of eigenvectors (components of x^(j) above); the eigenvectors are stored in x as follows: if alfi[m]=0 then x[.,m] is the m-th real eigenvector; otherwise, for each pair of consecutive columns, x[.,m] and x[.,m+1] are the real and imaginary parts of the m-th complex eigenvector, and x[.,m] and -x[.,m+1] are the real and imaginary parts of the (m+1)-st complex eigenvector; the eigenvectors are normalized such that the largest component is 1 or 1+0*i;
alfr:  float alfr[1:n];
       exit:  the real part of α^(j) given in alfr[j], j=m+1,...,n, in the above;
alfi:  float alfi[1:n];
       exit:  the imaginary part of α^(j) given in alfi[j], j=m+1,...,n, in the above;
beta:  float beta[1:n];
       exit:  the value of β^(j) given in beta[j], j=m+1,...,n, in the above;
iter:  int iter[1:n];
       exit:  trouble indicator and iteration counter, see qzival; if iter[1]=0 then no trouble is signalized;
em:    float em[0:1];
       entry:
       em[0]: the smallest positive machine number;
       em[1]: the relative precision of elements of a and b.
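The packed storage of eigenvectors in x can be unpacked column by column. The following is a minimal sketch, assuming the 1-based array layout used throughout this library and assuming that complex eigenvalues always occupy two consecutive positions with nonzero alfi; the function name unpack_qzi_vector is hypothetical and is not a library routine.

```c
#include <assert.h>

/* Extracts eigenvector number m (1-based) from the packed columns of x.
   x is indexed x[1..n][1..n]; alfi, re and im are indexed [1..n].
   Returns 1 if the eigenvector is complex and 0 if it is real. */
int unpack_qzi_vector(float **x, const float alfi[], int n, int m,
                      float re[], float im[])
{
    int i, j = 1;
    assert(m >= 1 && m <= n);
    while (j <= n) {
        if (alfi[j] == 0.0f) {            /* real eigenvector in column j */
            if (j == m) {
                for (i = 1; i <= n; i++) { re[i] = x[i][j]; im[i] = 0.0f; }
                return 0;
            }
            j++;
        } else {                          /* complex pair in columns j, j+1 */
            assert(j < n);
            if (m == j || m == j + 1) {
                /* second member of the pair takes the negated imaginary part */
                float sign = (m == j) ? 1.0f : -1.0f;
                for (i = 1; i <= n; i++) {
                    re[i] = x[i][j];
                    im[i] = sign * x[i][j + 1];
                }
                return 1;
            }
            j += 2;
        }
    }
    return 0;                             /* not reached for valid input */
}
```

Walking the columns from the left is what decides whether position m opens or closes a pair; looking at alfi[m] alone cannot tell the two cases apart.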

Functions used:

matmat, hshdecmul, hestgl3, hsh2col, hsh2row3, hsh3row3, hsh3col, chsh2, comdiv.

void qzi(int n, float **a, float **b, float **x, float alfr[], float alfi[], float beta[], int iter[], float em[])
{
    float matmat(int, int, int, int, float **, float **);
    void hshdecmul(int, float **, float **, float);
    void hestgl3(int, float **, float **, float **);
    void hsh2row3(int, int, int, int, int, float, float, float **, float **, float **);
    void hsh3row3(int, int, int, int, float, float, float, float **, float **, float **);
    void hsh2col(int, int, int, int, float, float, float **, float **);


void hsh3col(int, int, int, int, float, float, float, float **, float * * ) ; void chsh2 (float, float, float, float, float *, float *, float * ) ; void comdiv(float, float, float, float, float * , float * ) ; int i,q,m,ml,ql,j,k,kl,k2,k3,kml,stationary,goon,l,mr,mi,ll,out; float dwarf,eps,epsa,epsb, anorm,bnorm,ani,bni,constt,a10,a20,a30,bll,b22,b33,b44,all,

a12,a21,a22,a33,a34,a43,a44,b12,b34,oldl,old2, an,bn,e,c,d,er,ei,allr,alli,a12r,a12i,a21rta21i,a22r,a22i, cz,szr,szi,cq,sqr,sqi,ssr,ssi,tr,ti,bdr,bdi,r, betm,alfm,sl,sk,tkk,tkl,tlk,tll,almi,almr,slr,sli,skr,ski, dr,di,tkkr,tkki,tklr,tkli,tlkr,tlki,tllr,tlli,s; dwarf=em [O]; eps=em [ll ; hshdecmul (n,a,b,dwarf) ; hestgl3 (n,a,b,x); anorm=bnorm=O.O; for (i=l; i 1) ? fabs(a[il [i-11) : 0.0; for (j=i; j anorm) anorm=ani; if (bni > bnorm) bnorm=bni;

1

if (anorm == 0.0) anorm=eps; if (bnorm == 0.0) bnorm=eps; epsa=eps*anorm; epsb=eps*bnorm; m=n ; out=O; do { , 1=q=m; while ((i > 1) ? fabs(aLi.1 [i-11) > epsa q=i-1; i--;

:

0) {

1

if (q > 1) a [ql [q-11=O.O; goon=l; while (goon) { if (q >= m-1) { m=q-1; goon=O ; } else { if (fabs(b[ql [ql ) c = epsb) { b [ql [ql=O . 0 ;

q=qz; ) else { goon=O; ml=m-1 ; ql=q+l; constt=0.75; (iter[ml ) ++; stationary = (iter[m] == 1) ? 1 : (fabs(a[m] [m-11) >= constt*oldl && fabs (a[m-11[m-21) >= constt*old2); if (iter[ml > 30 && stationary) { for (i=l; i= 3); do

Iif

((m > 1) ? (a[ml [m-11 == 0) : 1) { alfr Em1 =a lml [ml ; beta [ml=b [ml [ml ; alfi [ml=0.0; m--; } else { l=m-1; if (fabs(b[ll [ll = 0.0) { e += ( (c < 0.0) ? c-sqrt(dl : c+sqrt (dl) ; all - = e*bll; a12 - = e*b12; a22 - = e*b22; if (fabs(all)+fabs (al2) >= fabs (a2l)+fabs (a22)) hsh2row3 (l,m,m,n, 1,a12,all,a,b,x); else hsh2row3 (l,m,m,n, 1,a22,a21,a,b,x); if (an >= fabs(e)*bn) hsh2col (l,l,n,l,b[l][11 ,b[ml Ill ,a,b); else hsh2col(l,l,n,l,a[ll [l],a[ml [l],a,b); a [ml Ill =b [ml 111 =O.0; alfr [ll =a [ll [I]; alfr [ml =a [ml [ml ; beta [ll=b [ll [ll ; beta [ml =b [ml [ml ; alfi [ml =alfi [ll=0.0; } else { er=e+c; ei=sqrt (-dl; allr=all-er*bll; alli=ei*bll; a12r=a12-er*b12; al2i=ei*bl2; a21r=a21 ; a21i=0.0; a22r=a22-er*b22; a22i=ei*b22; if (fabs(allr)+fabs (alli)+fabs (al2r)+£abs ( a >= fabs (a2lr)+fabs (a22r)+fabs (a22i)) chsh2 (alzr,al2i,-allr,-alli,&cz,&szr,&szi) ; else chsh2 (a22r,a22i,-a21r,-a2li,&cz,&szr,&szi) ; if (an >= (fabs(er)+fabs (ei)) *bn) chsh2(cz*bll+szr*bl2,szi*b12,szr*b22,szi*b22,

&cq,&sqr,&sqi) ; else chsh2(cz*all+szr*a~2,szi*al2,cz*a2l+szr*a22, szi*a22,&cq,&sqr,&sqi) ; ssr=sqr*szr+sqi*szi; ssi=sqr*szi-sqi*szr; tr=cq*cz*all+cq*~zr*al2+sqr*cz*a2l+ssr*a22; ti=cq*szi*al2-sqi*cz*a21+ssi*a22; bdr=cq*cz*bll+cq*szr*b12+ssr*b22;

bdi=cq*szi*bl2+ssi*b22; r=sqrt(bdr*bdr+bdi*bdi); beta [ll =bn*r; alfr [ll =an* (tr*bdr+ti*bdi)/r; alfi [ll=an* (trfbdi-ti*bdr) /r; tr=ssr*all-sqr*cz*al2-~q*szr*a2l+cq*cz*a22; ti = -ssi*all-sqi*cz*al2+cq*szi*a21; bdr=ssr*bll-sqr*cz*bl2+cq*cz*b22;

bdi = -ssi*bll-sqi*cz*bl2; r=sqrt (bdr*bdr+bdi*bdi); beta [ml =bn*r;


alfr [m]=an* (tr*bdr+ti*bdi)/r; alfi [m]=an* (tr*bdi-ti*bdr) /r;

,

I

>

0) ;

i

} while (m

for (m=n: m>=l: m--) (alfi[ml == 0.0) ( alfm=alfr [ml ; betm=beta [ml ; b [ml [ml=l . 0 ; ll=m; for (l=m-1; 1>=1; I - - ) { sl=0.0; for (j=ll; j=1; 1--) { slr=sli=O.O; for (j=ll; jc=m; j + + ) { tr=betm*a[ll [jl -almr*b[ll [jl ; ti = -almi*b[ll [jl; slr += tr*b [ jl [mrl -ti*b[ j I [mil ; sli += tr*b [jI [mil+ti*b [ jI [mrl ; (1 ! = 1) ? (betm*a[ll [l-11 == 0.0) : 1) { dr=betm*a[ll [l]-almr*b [ll [ll ; di = -almi*b[ll [ll ; comdiv(-slr,-sli,dr,di,&b[ll[mrl ,&b[ll [mil ) } else { k=l-1; skr=ski=O.O; for (j=ll; j= fabs (tlkr)) comdiv(-skr-tklr*b[ll [mrl+tkli*b [ll [mil , -ski-tklr*b[ll [mil -tkli*b111 [mrl , tkkr,tkki,&b [kl [mrl ,&b [kl [mil ) ; else comdiv(-slr-tllr*b[ll [mrl+tlli*b [ll [mil , -sli-tllr*b[ll [mil -tlli*b111 [mrl , tlkr,tlki,&b [kl [mrl , &b [kl [mil ) ;

I

for (m=n; m>=l; m--) for (k=l; k=l; m--) { s=o.o; if (alfi[ml == 0.0) { for (k=l; k= s) {

1

for ) else for '(k=l; k= S) {

l

;

J

for (k=l; k dwakfj { r = (b[kl [k] c 0.0) ? -sqrt(r+b[kl [kl*b [kl [kl ) sart (r+b[kl [kl*b [kl [kl) ;

:

v[kl =l.O; for (j=kl; jc=n; j++) v[jl=b[jl [kl/t; hshvecmat (k.n,kl,n.c,v,b) ; . . . . hshvecmat (k)n,l,n,c,v,a) ;

free-real-vector (v,1) ;

1

Given an n×n matrix A and an n×n upper triangular matrix U, obtains vectors u_1^(i), u_2^(i) such that with

    Q_1 = \prod_{i=1}^{n'-1} ( I - 2 u_1^{(i)} u_1^{(i)T} / (u_1^{(i)T} u_1^{(i)}) )

and Q_2 similarly defined, Q_1AQ_2=H is an upper Hessenberg matrix and Q_1UQ_2=U' is an upper triangular matrix and, also given an n×n matrix X, forms X'=Q_1XQ_2. hestgl3 is used in qzi.

Function Parameters:
    void hestgl3 (n,a,b,x)
n:     int;
       entry: the order of the given matrices;
a:     float a[1:n,1:n];
       entry: the given matrix;
       exit:  the upper Hessenberg matrix (value of H above);
b:     float b[1:n,1:n];
       entry: the given upper triangular matrix (value of U above);
       exit:  the upper triangular matrix (value of U' above);
x:     float x[1:n,1:n];
       entry: the given matrix (value of X above);
       exit:  the transformed matrix (value of X' above).

Functions used:

hsh2col, hsh2row3.

void hestgl3(int n, float **a, float **b, float **x)
{
    float *allocate_real_vector(int, int);
    void free_real_vector(float *, int);
    void hsh2col(int, int, int, int, float, float, float **, float **);
    void hsh2row3(int, int, int, int, int, float, float, float **, float **, float **);
    int nm1,k,l,k1,l1;

    if (n > 2) {
        /* clear the strict lower triangle of b */
        for (k=2; k<=n; k++)
            for (l=1; l<=k-1; l++) b[k][l]=0.0;
        nm1=n-1;
        k=1;
        for (k1=2; k1<=nm1; k1++) {
            l1=n;
            for (l=n-1; l>=k1; l--) {
                /* annihilate a[l1][k] by a row reflection, then restore b */
                hsh2col(k,l,n,l,a[l][k],a[l1][k],a,b);
                a[l1][k]=0.0;
                hsh2row3(1,n,l1,n,l,b[l1][l1],b[l1][l],a,b,x);
                b[l1][l]=0.0;
                l1=l;
            }
            k=k1;
        }
    }
}

Given an n×n matrix A and an n×n upper triangular matrix U, obtains vectors u_1^(i), u_2^(i) such that with

    Q_1 = \prod_{i=1}^{n'-1} ( I - 2 u_1^{(i)} u_1^{(i)T} / (u_1^{(i)T} u_1^{(i)}) )

and Q_2 similarly defined, Q_1AQ_2=H is an upper Hessenberg matrix and Q_1UQ_2=U' is an upper triangular matrix. hestgl2 is used in qzival.


Function Parameters:
    void hestgl2 (n,a,b)
n:     int;
       entry: the order of the given matrices;
a:     float a[1:n,1:n];
       entry: the given matrix;
       exit:  the upper Hessenberg matrix (value of H above);
b:     float b[1:n,1:n];
       entry: the given upper triangular matrix (value of U above);
       exit:  the upper triangular matrix (value of U' above).

Functions used:

hsh2col, hsh2row2.

void hestgl2(int n, float **a, float **b)
{
    float *allocate_real_vector(int, int);
    void free_real_vector(float *, int);
    void hsh2col(int, int, int, int, float, float, float **, float **);
    void hsh2row2(int, int, int, int, float, float, float **, float **);
    int nm1,k,l,k1,l1;

    if (n > 2) {
        /* clear the strict lower triangle of b */
        for (k=2; k<=n; k++)
            for (l=1; l<=k-1; l++) b[k][l]=0.0;
        nm1=n-1;
        k=1;
        for (k1=2; k1<=nm1; k1++) {
            l1=n;
            for (l=n-1; l>=k1; l--) {
                /* annihilate a[l1][k] by a row reflection, then restore b */
                hsh2col(k,l,n,l,a[l][k],a[l1][k],a,b);
                a[l1][k]=0.0;
                hsh2row2(1,n,l1,l,b[l1][l1],b[l1][l],a,b);
                b[l1][l]=0.0;
                l1=l;
            }
            k=k1;
        }
    }
}

(a) Given the values of two elements M_{k,j} (k=i,i+1) belonging to a certain column of a rectangular matrix M, determines a vector v such that all rows except the i-th and (i+1)-th of M and M'=EM agree, where E=I-2vv^T/v^Tv and M'_{i+1,j}=0 and, (b) given the elements A_{k,j} (k=i,i+1; j=la,...,u) of the rectangular matrix A, determines the corresponding elements of A'=EA and, (c) given the elements B_{k,j} (k=i,i+1; j=lb,...,u) of the rectangular matrix B, determines the corresponding elements of B'=EB. hsh2col is used in qzival and qzi.

Function Parameters:
    void hsh2col (la,lb,u,i,a1,a2,a,b)
la:    int;
       entry: the lower bound of the running column subscript of a (value of la above);
lb:    int;
       entry: the lower bound of the running column subscript of b (value of lb above);
u:     int;
       entry: the upper bound of the running column subscript of a and b (value of u above);
i:     int;
       entry: the lower bound of the running row subscript of a and b (value of i above); i+1 is the upper bound;
a1,a2: float;
       entry: a1 and a2 are the i-th and (i+1)-th component of the vector to be transformed, respectively (values of M_{k,j} (k=i,i+1) above);
a:     float a[i:i+1,la:u];
       entry: the given matrix (value of A above);
       exit:  the transformed matrix (value of A' above);
b:     float b[i:i+1,lb:u];
       entry: the given matrix (value of B above);
       exit:  the transformed matrix (value of B' above).

Function used: hshvecmat.

void hsh2col(int la, int lb, int u, int i, float a1, float a2, float **a, float **b)
{
    float *allocate_real_vector(int, int);
    void free_real_vector(float *, int);
    void hshvecmat(int, int, int, int, float, float [], float **);
    float *v,d1,d2,s1,s2,r,d,c;

if (a2 ! = 0.0) { v=allocate-realvector(i,i+l); dl=fabs (al); d2=fabs (a2); sl = (a1 >= 0.0) ? 1.0 : -1.0; s2 = (a2 >= 0.0) ? 1.0 : -1.0; if (d2 = 0.0) ? 1.0 : -1.0; s2 = (a2 >= 0.0) ? 1.0 : -1.0; s3 = (a3 >= 0.0) ? 1.0 : -1.0; if (dl >= d2 && dl >= d3) ( r2=d2/dl; r3 =d3/dl ; d=sqrt(l.O+rZ*rZ+r3*r3); c = -1.0-(l.O/d); d=l.O/(l.O+d); v [i+l]=sl*sZ*rZ*d; v [i+21=sl*s3*r3*d; } else if (d2 >= dl && d2 >= d3) { rl=dl/d2;


;

r3=d3/d2; d=sqrt(l.O+rl*rl+r3*r3); c = -1.0- (sl*rl/d); d=l.0/ (rl+d); v [i+ll=sl*s2*d; v [i+21=sl*s3*r3*d; } else { rl=dl/d3 ; r2=d2/d3 ; d=sqrt(l.O+rl*rl+r2*r2); c = -1.0- (sl*rl/d); d=l.0/ (rl+d); v [i+ll=sl*s2*r2*d; v[i+2] =sl*s3*d; 1 v[il =l.O; hshvecmat (i,i+2,la,u,c,v,a); hshvecmat (i,i+2,lb,u,c,v,b) ; free-real-vector (v,i) ;

1

1

(a) Given the values of two elements M_{i,k} (k=j,j+1) belonging to a certain row of a rectangular matrix M, determines a vector v such that all columns except the j-th and (j+1)-th of M and M'=ME agree, where E=I-2vv^T/v^Tv and M'_{i,j}=0 and, (b) given the elements A_{i,k} (i=l,...,ua; k=j,j+1) of the rectangular matrix A, determines the corresponding elements of A'=AE and, (c) given the elements B_{i,k} (i=l,...,ub; k=j,j+1) of the rectangular matrix B, determines the corresponding elements of B'=BE and, (d) given the elements X_{i,k} (i=l,...,ux; k=j,j+1) of the rectangular matrix X, determines the corresponding elements of X'=XE. hsh2row3 is used in qzi.

Function Parameters:
    void hsh2row3 (l,ua,ub,ux,j,a1,a2,a,b,x)
l:     int;
       entry: the lower bound of the running row subscript of a, b and x (value of l above);
ua:    int;
       entry: the upper bound of the running row subscript of a (value of ua above);
ub:    int;
       entry: the upper bound of the running row subscript of b (value of ub above);
ux:    int;
       entry: the upper bound of the running row subscript of x (value of ux above);
j:     int;
       entry: the lower bound of the running column subscript of a, b and x (value of j above); j+1 is the upper bound;
a1,a2: float;
       entry: a1 and a2 are the j-th and (j+1)-th component of the vector to be transformed, respectively (values of M_{i,k} (k=j,j+1) above);
a:     float a[l:ua,j:j+1];
       entry: the given matrix (value of A above);
       exit:  the transformed matrix (value of A' above);
b:     float b[l:ub,j:j+1];
       entry: the given matrix (value of B above);
       exit:  the transformed matrix (value of B' above);
x:     float x[l:ux,j:j+1];
       entry: the given matrix (value of X above);
       exit:  the transformed matrix (value of X' above).

Function used: hshvectam.

void hsh2row3(int l, int ua, int ub, int ux, int j, float a1, float a2, float **a, float **b, float **x)
{
    float *allocate_real_vector(int, int);
    void free_real_vector(float *, int);
    void hshvectam(int, int, int, int, float, float [], float **);
    float *v,d1,d2,s1,s2,r,d,c;

    if (a2 != 0.0) {
        v=allocate_real_vector(j,j+1);
        d1=fabs(a1);
        d2=fabs(a2);
        s1 = (a1 >= 0.0) ? 1.0 : -1.0;
        s2 = (a2 >= 0.0) ? 1.0 : -1.0;
        if (d2 <= d1) {
            r=d2/d1;
            d=sqrt(1.0+r*r);
            c = -1.0-1.0/d;
            v[j]=s1*s2*r/(1.0+d);
        } else {
            r=d1/d2;
            d=sqrt(1.0+r*r);
            c = -1.0-r/d;
            v[j]=s1*s2/(r+d);
        }

(a) Given the values of two elements M_{i,k} (k=j,j+1) belonging to a certain row of a rectangular matrix M, determines a vector v such that all columns except the j-th and (j+1)-th of M and M'=ME agree, where E=I-2vv^T/v^Tv and M'_{i,j}=0 and, (b) given the elements A_{i,k} (i=l,...,ua; k=j,j+1) of the rectangular matrix A, determines the corresponding elements of A'=AE and, (c) given the elements B_{i,k} (i=l,...,ub; k=j,j+1) of the rectangular matrix B, determines the corresponding elements of B'=BE. hsh2row2 is used in qzival.

Function Parameters:
    void hsh2row2 (l,ua,ub,j,a1,a2,a,b)
l:     int;
       entry: the lower bound of the running row subscript of a and b (value of l above);
ua:    int;
       entry: the upper bound of the running row subscript of a (value of ua above);
ub:    int;
       entry: the upper bound of the running row subscript of b (value of ub above);
j:     int;
       entry: the lower bound of the running column subscript of a and b (value of j above); j+1 is the upper bound;
a1,a2: float;
       entry: a1 and a2 are the j-th and (j+1)-th component of the vector to be transformed, respectively (values of M_{i,k} (k=j,j+1) above);
a:     float a[l:ua,j:j+1];
       entry: the given matrix (value of A above);
       exit:  the transformed matrix (value of A' above);
b:     float b[l:ub,j:j+1];
       entry: the given matrix (value of B above);
       exit:  the transformed matrix (value of B' above).

Function used: hshvectam.

#include <math.h>

void hsh2row2(int l, int ua, int ub, int j, float a1, float a2, float **a, float **b)
{
    float *allocate_real_vector(int, int);
    void free_real_vector(float *, int);
    void hshvectam(int, int, int, int, float, float [], float **);
    float *v,d1,d2,s1,s2,r,d,c;

    if (a2 != 0.0) {
        v=allocate_real_vector(j,j+1);
        d1=fabs(a1);
        d2=fabs(a2);
        s1 = (a1 >= 0.0) ? 1.0 : -1.0;
        s2 = (a2 >= 0.0) ? 1.0 : -1.0;
        /* scale by the larger of |a1|, |a2| to avoid overflow */
        if (d2 <= d1) {
            r=d2/d1;
            d=sqrt(1.0+r*r);
            c = -1.0-1.0/d;
            v[j]=s1*s2*r/(1.0+d);
        } else {
            r=d1/d2;
            d=sqrt(1.0+r*r);
            c = -1.0-r/d;
            v[j]=s1*s2/(r+d);
        }
        v[j+1]=1.0;
        hshvectam(l,ua,j,j+1,c,v,a);
        hshvectam(l,ub,j,j+1,c,v,b);
        free_real_vector(v,j);
    }
}

(a) Given the values of three elements $M_{i,k}$ (k=j,j+1,j+2) belonging to a certain row of a rectangular matrix M, determines a vector v such that all columns except the j-th, (j+1)-th and (j+2)-th of M and M'=ME agree, where $E = I - 2vv^T/v^Tv$ and $M'_{i,j}=M'_{i,j+1}=0$ and, (b) given the elements $A_{i,k}$ (i=l,...,u; k=j,j+1,j+2) of the rectangular matrix A, determines the corresponding elements of A'=AE and, (c) given the elements $B_{i,k}$ (i=l,...,u; k=j,j+1,j+2) of the rectangular matrix B, determines the corresponding elements of B'=BE and, (d) given the elements $X_{i,k}$ (i=l,...,ux; k=j,j+1,j+2) of the rectangular matrix X, determines the corresponding elements of X'=XE. hsh3row3 is used in qzi.

Function Parameters:

void hsh3row3 (l,u,ux,j,a1,a2,a3,a,b,x)

l: int; entry: the lower bound of the running row subscript of a, b and x (value of l above);
u: int; entry: the upper bound of the running row subscript of a and b (value of u above);
ux: int; entry: the upper bound of the running row subscript of x (value of ux above);
j: int; entry: the lower bound of the running column subscript of a, b and x (value of j above); j+2 is the upper bound;
a1,a2,a3: float; entry: a1, a2 and a3 are the j-th, (j+1)-th and (j+2)-th component of the vector to be transformed, respectively (values of $M_{i,k}$ (k=j,j+1,j+2) above);
a: float a[l:u,j:j+2]; entry: the given matrix (value of A above); exit: the transformed matrix (value of A' above);
b: float b[l:u,j:j+2]; entry: the given matrix (value of B above); exit: the transformed matrix (value of B' above);
x: float x[l:ux,j:j+2]; entry: the given matrix (value of X above); exit: the transformed matrix (value of X' above).

Function used: hshvectam.

#include <math.h>

void hsh3row3(int l, int u, int ux, int j, float a1, float a2, float a3,
        float **a, float **b, float **x)
{
    float *allocate_real_vector(int, int);
    void free_real_vector(float *, int);
    void hshvectam(int, int, int, int, float, float [], float **);
    float *v,c,d1,d2,d3,s1,s2,s3,r1,r2,r3,d;

    if (a2 != 0.0 || a3 != 0.0) {
        v=allocate_real_vector(j,j+2);
        d1=fabs(a1);
        d2=fabs(a2);
        d3=fabs(a3);
        s1 = (a1 >= 0.0) ? 1.0 : -1.0;
        s2 = (a2 >= 0.0) ? 1.0 : -1.0;
        s3 = (a3 >= 0.0) ? 1.0 : -1.0;
        if (d1 >= d2 && d1 >= d3) {
            r2=d2/d1;
            r3=d3/d1;
            d=sqrt(1.0+r2*r2+r3*r3);
            c = -1.0-(1.0/d);
            d=1.0/(1.0+d);
            v[j+1]=s1*s2*r2*d;
            v[j]=s1*s3*r3*d;
        } else if (d2 >= d1 && d2 >= d3) {
            r1=d1/d2;
            r3=d3/d2;
            d=sqrt(1.0+r1*r1+r3*r3);
            c = -1.0-(s1*r1/d);
            d=1.0/(r1+d);
            v[j+1]=s1*s2*d;
            v[j]=s1*s3*r3*d;
        } else {
            r1=d1/d3;
            r2=d2/d3;
            d=sqrt(1.0+r1*r1+r2*r2);
            c = -1.0-(s1*r1/d);
            d=1.0/(r1+d);
            v[j+1]=s1*s2*r2*d;
            v[j]=s1*s3*d;
        }
        v[j+2]=1.0;
        hshvectam(l,u,j,j+2,c,v,a);
        hshvectam(l,u,j,j+2,c,v,b);
        hshvectam(l,ux,j,j+2,c,v,x);
        free_real_vector(v,j);
    }
}

(a) Given the values of three elements $M_{i,k}$ (k=j,j+1,j+2) belonging to a certain row of a rectangular matrix M, determines a vector v such that all columns except the j-th, (j+1)-th and (j+2)-th of M and M'=ME agree, where $E = I - 2vv^T/v^Tv$ and $M'_{i,j}=M'_{i,j+1}=0$ and, (b) given the elements $A_{i,k}$ (i=l,...,u; k=j,j+1,j+2) of the rectangular matrix A, determines the corresponding elements of A'=AE and, (c) given the elements $B_{i,k}$ (i=l,...,u; k=j,j+1,j+2) of the rectangular matrix B, determines the corresponding elements of B'=BE. hsh3row2 is used in qzival.

Function Parameters:

void hsh3row2 (l,u,j,a1,a2,a3,a,b)

l: int; entry: the lower bound of the running row subscript of a and b (value of l above);
u: int; entry: the upper bound of the running row subscript of a and b (value of u above);
j: int; entry: the lower bound of the running column subscript of a and b (value of j above); j+2 is the upper bound;
a1,a2,a3: float; entry: a1, a2 and a3 are the j-th, (j+1)-th and (j+2)-th component of the vector to be transformed, respectively (values of $M_{i,k}$ (k=j,j+1,j+2) above);
a: float a[l:u,j:j+2]; entry: the given matrix (value of A above); exit: the transformed matrix (value of A' above);
b: float b[l:u,j:j+2]; entry: the given matrix (value of B above); exit: the transformed matrix (value of B' above).

Function used: hshvectam.

#include <math.h>

void hsh3row2(int l, int u, int j, float a1, float a2, float a3,
        float **a, float **b)
{
    float *allocate_real_vector(int, int);
    void free_real_vector(float *, int);
    void hshvectam(int, int, int, int, float, float [], float **);
    float *v,c,d1,d2,d3,s1,s2,s3,r1,r2,r3,d;

    if (a2 != 0.0 || a3 != 0.0) {
        v=allocate_real_vector(j,j+2);
        d1=fabs(a1);
        d2=fabs(a2);
        d3=fabs(a3);
        s1 = (a1 >= 0.0) ? 1.0 : -1.0;
        s2 = (a2 >= 0.0) ? 1.0 : -1.0;
        s3 = (a3 >= 0.0) ? 1.0 : -1.0;
        if (d1 >= d2 && d1 >= d3) {
            r2=d2/d1;
            r3=d3/d1;
            d=sqrt(1.0+r2*r2+r3*r3);
            c = -1.0-(1.0/d);
            d=1.0/(1.0+d);
            v[j+1]=s1*s2*r2*d;
            v[j]=s1*s3*r3*d;
        } else if (d2 >= d1 && d2 >= d3) {
            r1=d1/d2;
            r3=d3/d2;
            d=sqrt(1.0+r1*r1+r3*r3);
            c = -1.0-(s1*r1/d);
            d=1.0/(r1+d);
            v[j+1]=s1*s2*d;
            v[j]=s1*s3*r3*d;
        } else {
            r1=d1/d3;
            r2=d2/d3;
            d=sqrt(1.0+r1*r1+r2*r2);
            c = -1.0-(s1*r1/d);
            d=1.0/(r1+d);
            v[j+1]=s1*s2*r2*d;
            v[j]=s1*s3*d;
        }
        v[j+2]=1.0;
        hshvectam(l,u,j,j+2,c,v,a);
        hshvectam(l,u,j,j+2,c,v,b);
        free_real_vector(v,j);
    }
}

3.15 Singular values

3.15.1 Real bidiagonal matrices

A. qrisngvalbid

Computes, by use of a variant of the QR algorithm, the singular values of a bidiagonal n×n matrix A, i.e. the elements $d_1,\dots,d_n$ of the diagonal matrix D for which $A=UDV^T$, where $U^TU=V^TV=I$ (n×n unit matrix).

Function Parameters:

int qrisngvalbid (d,b,n,em)

qrisngvalbid: given the number of singular values not found, i.e. a number not equal to zero if the number of iterations exceeds em[4];
d: float d[1:n]; entry: the diagonal of the bidiagonal matrix; exit: the singular values;
b: float b[1:n]; entry: the super diagonal of the bidiagonal matrix in b[1:n-1];
n: int; entry: the length of b and d;
em: float em[1:7];
    entry:
    em[1]: the infinity norm of the matrix;
    em[2]: the relative precision in the singular values;
    em[4]: the maximal number of iterations to be performed;
    em[6]: the minimal non-neglectable singular value;
    exit:
    em[3]: the maximal neglected superdiagonal element;
    em[5]: the number of iterations performed;
    em[7]: the numerical rank of the matrix; i.e. the number of singular values greater than or equal to em[6].

Method:

The method is described in detail in [WiR71]. qrisngvalbid is a rewriting of part of the procedure SVD published there.

int qrisngvalbid(float d[], float b[], int n, float em[])
{
    int n1,k,k1,i,i1,count,max,rnk;
    float tol,bmax,z,x,y,g,h,f,c,s,min;

    tol=em[2]*em[1];
    count=0;
    bmax=0.0;
    max=em[4];
    min=em[6];
    rnk=n;
    do {
        k=n;
        n1=n-1;
        while (1) {
            k--;
            if (k [...] = tol) {
                if (fabs(d[k]) < tol) {
                    c=0.0;
                    s=1.0;
                    for (i=k; i [...]

and $Q(z) = R(z) - P(z)$ [...].

Let m of the $\alpha_i$, namely $\alpha_{k(i)}$ (i=1,...,m), lie near to each other; let $\prod^{(k)}$ denote a product $\prod_{i=1}^{n}$ from which the terms with suffix k(i) (i=1,...,m) have been removed; set [...] and $\beta_i = \alpha_i - \gamma$ (i=1,...,n). For z on the circle C with center $\gamma$ and suitable radius r, [...]

If $|Q(z)|$ [...]

    [...] =1; j--) rc[j] -= xk*rc[j-1];
    else {
        k++;
        if (k [...] =1; j--) rce[j] += rce[j-1]*sqrt(zk);
        for (j=k; j>=2; j--) rc[j] += xk*rc[j-1]+zk*rc[j-2];
        rc[1] += xk*rc[0];
    }
    }
    rc[0]=rce[0];
    corr=1.06*FLT_MIN;
    for (i=1; i [...] = 2) {
        j -= clust[j-1];
        temp1=recent-recentre[j];
        temp2=imcent-imcentre[j];
        h=sqrt(temp1*temp1+temp2*temp2);
        if (h < bound[i]+bound[j] && h [...]
            index1=j;
            index2=i;
            min=h;
        [...] =k; j--) {
            place=j+1-k;
            re[j]=wa1[place];
            im[j]=wa2[place];
            bound[j]=boundin;
            clust[j]=clustin;
            recentre[j]=recent;
            imcentre[j]=imcent;
        }
        free_real_vector(wa1,1);
        free_real_vector(wa2,1);
    } /* end of shift */
    k=clust[index1]+clust[k];
    kcluster(k,index1,n,rc,re,im,recentre,imcentre,bound,clust);
    }
    }
    free_real_vector(rc,0);
    free_real_vector(c,0);
    free_real_vector(rce,0);
    free_real_vector(clust,1);
}

void kcluster(int k, int m, int n, float rc[], float re[], float im[],
        float recentre[], float imcentre[], float bound[], float clust[])
{
    /* this function is used internally by BOUNDS */
    int i,stop,l,nonzero;
    float recent,imcent,d,prod,rad,gr,r,*dist,s,h1,h2,temp1,temp2;

    dist=allocate_real_vector(m,m+k-1);
    recent=re[m];
    imcent=im[m];
    stop=m+k-1;
    l = (imcent == 0.0) ? 0 : ((imcent > 0.0) ? 1 : -1);
    nonzero = (l != 0);
    for (i=m+1; i<=stop; i++) {
        recent += re[i];
        if (nonzero) {
            nonzero=(l == ((im[i] == 0.0) ? 0 : ((im[i]>0.0) ? 1 : -1)));
            imcent += im[i];
        }
    }
    recent /= k;
    imcent = (nonzero ? imcent/k : 0.0);
    d=0.0;
    rad=0.0;
    for (i=m; i<=stop; i++) {
        recentre[i]=recent;
        imcentre[i]=imcent;
        temp1=re[i]-recent;
        temp2=im[i]-imcent;
        dist[i]=sqrt(temp1*temp1+temp2*temp2);
        if (d < dist[i]) d=dist[i];
    }
    s=sqrt(recent*recent+imcent*imcent);
    h1=rc[1];
    h2=rc[0];
    for (i=2; i [...]

    [...] =1; i--)
        if (c[i]+fabs(b[i]) > nrm) nrm=c[i]+fabs(b[i]);
    if (n > 1)
        nrm = (nrm+1 >= c[n-1]+fabs(b[n-1])) ? nrm+1.0 : (c[n-1]+fabs(b[n-1]));
    em[1]=nrm;
    for (i=n; i>=1; i--) d[i]=b[i-1];
    valsymtri(d,c,n,n1,n2,zer,em);
    em[5]=em[3];
    free_real_vector(d,1);
}

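D. alljaczer

Calculates all zeros of the n-th Jacobi polynomial $P_n^{(\alpha,\beta)}(x)$, see [AbS65]. The Jacobi polynomials satisfy the recursion

$P_0^{(\alpha,\beta)}(x) = 1$, $P_1^{(\alpha,\beta)}(x) = \tfrac12(\alpha+\beta+2)x + \tfrac12(\alpha-\beta)$,

$2(k+1)(k+\alpha+\beta+1)(2k+\alpha+\beta)\,P_{k+1}^{(\alpha,\beta)}(x) = (2k+\alpha+\beta+1)\big((2k+\alpha+\beta)(2k+\alpha+\beta+2)x + \alpha^2-\beta^2\big)P_k^{(\alpha,\beta)}(x) - 2(k+\alpha)(k+\beta)(2k+\alpha+\beta+2)\,P_{k-1}^{(\alpha,\beta)}(x)$ (k=1,2,...)

and the coefficient of $x^k$ in $P_k^{(\alpha,\beta)}(x)$ is

$c_k = \dfrac{\Gamma(2k+\alpha+\beta+1)}{2^k\,k!\,\Gamma(k+\alpha+\beta+1)}$.

The polynomials $p_k(x) = P_k^{(\alpha,\beta)}(x)/c_k$ satisfy the recursion

$p_0(x) = 1$, $p_1(x) = x - a_0$, $p_{k+1}(x) = (x-a_k)p_k(x) - b_k p_{k-1}(x)$ (k=1,2,...),

with $a_0 = (\beta-\alpha)/(\alpha+\beta+2)$, $a_k = (\beta^2-\alpha^2)/\big((2k+\alpha+\beta)(2k+\alpha+\beta+2)\big)$ and $b_k = 4k(k+\alpha)(k+\beta)(k+\alpha+\beta)/\big((2k+\alpha+\beta)^2((2k+\alpha+\beta)^2-1)\big)$.

The roots of $p_n(x)$ (i.e. those of $P_n^{(\alpha,\beta)}(x)$) may be obtained by a call of allzerortpol. However, for the special case in which $\alpha=\beta$,

$P_{2m}^{(\alpha,\alpha)}(x) = c_m P_m^{(\alpha,-1/2)}(2x^2-1)$, $P_{2m+1}^{(\alpha,\alpha)}(x) = d_m\,x\,P_m^{(\alpha,1/2)}(2x^2-1)$,

where $c_m$ and $d_m$ are independent of x. Thus, in this special case, the determination of the roots of $P_n^{(\alpha,\beta)}(x)$ may slightly be simplified.

Function Parameters:

void alljaczer (n,alfa,beta,zer)

n: int; entry: the upper bound of the array zer, n ≥ 1;
alfa, beta: float; entry: the parameters of the Jacobi polynomial (values of α and β above); alfa, beta > -1;
zer: float zer[1:n]; exit: the zeros of the n-th Jacobi polynomial with parameters alfa and beta.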

Function used: allzerortpol.
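The three-term recursion for the Jacobi polynomials can be checked numerically. The sketch below is an illustration only and is not part of the library: the name jacobi_eval is hypothetical, double precision is used, and the standard recurrence of [AbS65] with starting values $P_0$ and $P_1$ is assumed.

```c
/* Evaluate the Jacobi polynomial P_n^{(alfa,beta)}(x) by the three-term
   recursion; requires alfa, beta > -1 and n >= 0. */
double jacobi_eval(int n, double alfa, double beta, double x)
{
    double p0 = 1.0;
    double p1 = 0.5 * (alfa + beta + 2.0) * x + 0.5 * (alfa - beta);
    double p2, s, lhs, mid, rhs;
    int k;

    if (n == 0) return p0;
    for (k = 1; k < n; k++) {
        s = 2.0 * k + alfa + beta;
        lhs = 2.0 * (k + 1.0) * (k + alfa + beta + 1.0) * s;  /* factor of P_{k+1} */
        mid = (s + 1.0) * (s * (s + 2.0) * x + alfa * alfa - beta * beta);
        rhs = 2.0 * (k + alfa) * (k + beta) * (s + 2.0);      /* factor of P_{k-1} */
        p2 = (mid * p1 - rhs * p0) / lhs;
        p0 = p1;
        p1 = p2;
    }
    return p1;
}
```

For alfa = beta = 0 the Jacobi polynomials reduce to the Legendre polynomials, so jacobi_eval(2, 0.0, 0.0, 0.5) should agree with $(3x^2-1)/2$ at x = 0.5. Values returned by a zero finder such as alljaczer could be verified in the same way, by checking that the polynomial is close to zero at each reported root.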

#include <math.h>
#include <float.h>

void alljaczer(int n, float alfa, float beta, float zer[])
{
    float *allocate_real_vector(int, int);
    void free_real_vector(float *, int);
    void allzerortpol(int, float [], float [], float [], float []);
    int i,m;
    float sum,min,*a,*b,em[6],gamma,zeri;

    if (alfa == beta) {
        a=allocate_real_vector(0,n/2);
        b=allocate_real_vector(0,n/2);
        m=n/2;
        if (n != 2*m) {
            gamma=0.5;
            zer[m+1]=0.0;
        } else
            gamma = -0.5;
        min=0.25-alfa*alfa;
        sum=alfa+gamma+2.0;
        a[0]=(gamma-alfa)/sum;
        a[1]=min/sum/(sum+2.0);
        b[1]=4.0*(1.0+alfa)*(1.0+gamma)/sum/sum/(sum+1.0);
        for (i=2; i<=m-1; i++) {
            sum=i+i+alfa+gamma;
            a[i]=min/sum/(sum+2.0);
            sum *= sum;
            b[i]=4.0*i*(i+alfa+gamma)*(i+alfa)*(i+gamma)/sum/(sum-1.0);
        }
        em[0]=FLT_MIN;
        em[2]=FLT_EPSILON;
        em[4]=6*m;
        allzerortpol(m,a,b,zer,em);
        for (i=1; i<=m; i++) {
            zer[i] = zeri = -sqrt((1.0+zer[i])/2.0);
            zer[n+1-i] = -zeri;
        }
        free_real_vector(a,0);
        free_real_vector(b,0);
    } else {
        a=allocate_real_vector(0,n);
        b=allocate_real_vector(0,n);
        min=(beta-alfa)*(beta+alfa);
        sum=alfa+beta+2.0;
        b[0]=0.0;
        a[0]=(beta-alfa)/sum;
        a[1]=min/sum/(sum+2.0);
        b[1]=4.0*(1.0+alfa)*(1.0+beta)/sum/sum/(sum+1.0);
        for (i=2; i [...]

n: [...]
[...] -1;
zer: float zer[1:n]; exit: the zeros of the n-th Laguerre polynomial with parameter alfa.

Function used: allzerortpol.

#include <float.h>

void alllagzer(int n, float alfa, float zer[])
{
    float *allocate_real_vector(int, int);
    void free_real_vector(float *, int);
    void allzerortpol(int, float [], float [], float [], float []);
    int i;
    float *a,*b,em[6];

    a=allocate_real_vector(0,n);
    b=allocate_real_vector(0,n);
    b[0]=0.0;
    a[n-1]=n+n+alfa-1.0;
    for (i=1; i<=n-1; i++) {
        a[i-1]=i+i+alfa-1.0;
        b[i]=i*(i+alfa);
    }
    em[0]=FLT_MIN;
    em[2]=FLT_EPSILON;
    em[4]=6*n;
    allzerortpol(n,a,b,zer,em);
    free_real_vector(a,0);
    free_real_vector(b,0);
}
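The arrays a and b filled in by alllagzer can be read as recursion coefficients of a monic orthogonal polynomial. The sketch below assumes the convention $p_{k+1}(x) = (x-a_k)p_k(x) - b_k p_{k-1}(x)$ with $a_k = 2k+\alpha+1$ and $b_k = k(k+\alpha)$, which matches the assignments in the listing; the function name monic_laguerre is hypothetical and double precision is used.

```c
/* Evaluate the monic polynomial p_n defined by
   p_0 = 1, p_{k+1}(x) = (x - a_k) p_k(x) - b_k p_{k-1}(x),
   with a_k = 2k + alfa + 1 and b_k = k (k + alfa). */
double monic_laguerre(int n, double alfa, double x)
{
    double pm = 0.0;   /* p_{k-1}, unused at k = 0 because b_0 = 0 */
    double p = 1.0;    /* p_k */
    double pn;
    int k;

    for (k = 0; k < n; k++) {
        pn = (x - (2.0 * k + alfa + 1.0)) * p - k * (k + alfa) * pm;
        pm = p;
        p = pn;
    }
    return p;
}
```

For n = 2 and alfa = 0 this gives $x^2 - 4x + 2$, whose zeros $2 \pm \sqrt{2}$ are the values one would expect a routine such as alllagzer to deliver in zer[1:2].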

3.16.3 Zeros of complex polynomials

comkwd

Determines the roots g and k of the quadratic equation $z^2-2pz-q=0$ with complex p and q.

Function Parameters:

void comkwd (pr,pi,qr,qi,gr,gi,kr,ki)

pr,pi,qr,qi: float; entry: pr, qr are the real parts and pi, qi are the imaginary parts of the coefficients of the quadratic equation: $z^2$ - 2(pr+pi*i)z - (qr+qi*i) = 0;
gr,gi,kr,ki: float *; exit: the real parts and the imaginary parts of the roots are delivered in gr, kr and gi, ki, respectively; moreover, the modulus of gr+gi*i is greater than or equal to the modulus of kr+ki*i.

Functions used: commul, comdiv, comsqrt.

#include <math.h>

void comkwd(float pr, float pi, float qr, float qi,
        float *gr, float *gi, float *kr, float *ki)
{
    void commul(float, float, float, float, float *, float *);
    void comdiv(float, float, float, float, float *, float *);
    void comsqrt(float, float, float *, float *);
    float hr,hi;

    if (qr == 0.0 && qi == 0.0) {
        *kr = *ki = 0.0;
        *gr = pr*2.0;
        *gi = pi*2.0;
        return;
    }
    if (pr == 0.0 && pi == 0.0) {
        comsqrt(qr,qi,gr,gi);
        *kr = -(*gr);
        *ki = -(*gi);
        return;
    }
    if (fabs(pr) > 1.0 || fabs(pi) > 1.0) {
        comdiv(qr,qi,pr,pi,&hr,&hi);
        comdiv(hr,hi,pr,pi,&hr,&hi);
        comsqrt(1.0+hr,hi,&hr,&hi);
        commul(pr,pi,hr+1.0,hi,gr,gi);
    } else {
        comsqrt(qr+(pr+pi)*(pr-pi),qi+pr*pi*2.0,&hr,&hi);
        if (pr*hr+pi*hi > 0.0) {
            *gr = pr+hr;
            *gi = pi+hi;
        } else {
            *gr = pr-hr;
            *gi = pi-hi;
        }
    }
    /* the product of the roots equals -q, hence k = -q/g */
    comdiv(-qr,-qi,*gr,*gi,kr,ki);
}

4. Analytic Evaluations

4.1 Evaluation of an infinite series

A. euler

Applies the Euler transformation to the series $\sum_i a_i$.

The course of the computations is determined by two parameters, both prescribed by the user: eps, a real tolerance; tim, a positive integer specifying the number of consecutive transformed sums whose agreement to within the tolerance eps is taken to imply that the last of them is an acceptable approximation to the Euler sum of the original series. A set of numbers $M_{i,j}$, a sequence of integers J(i), and sequences of partial sums $S_i$ and terms $t_i$ for which $S_{i+1}=S_i+t_{i+1}$ are computed as follows. Initially $M_{1,1}=a_1$, J(1)=0, $S_1=\tfrac12 a_1$. For i ≥ 1, and with $M_{i+1,1}=a_{i+1}$, the numbers $M_{i+1,j+1} = \tfrac12(M_{i,j}+M_{i+1,j})$ (j=1,...,J(i)) are computed. If $|M_{i+1,J(i)+1}| < |M_{i,J(i)+1}|$ then J(i+1)=J(i)+1 and $t_{i+1}=\tfrac12 M_{i+1,J(i)+1}$; otherwise J(i+1)=J(i) and $t_{i+1}=M_{i+1,J(i)+1}$. If $|t_{i+1}|$ [...] I+tim-1 for some I>0, the process [...]

n: [...]
[...] > -1;
x: float x[1:n]; exit: x[i] is the i-th zero of the n-th Jacobi polynomial;
w: float w[1:n]; exit: w[i] is the Gauss-Christoffel number associated with the i-th zero of the n-th Jacobi polynomial.

Functions used: gamma, alljaczer.

Method: See [AbS65, St72].

void gssjacwghts(int n, float alfa, float beta, float x[], float w[])
{
    float *allocate_real_vector(int, int);
    void free_real_vector(float *, int);
    float gamma(float);
    void alljaczer(int, float, float, float []);
    int i,j,m;
    float r0,r1,r2,s,h0,alfa2,xi,*a,*b,min,sum,alfabeta,temp;

    if (alfa == beta) {
        b=allocate_real_vector(1,n-1);
        alljaczer(n,alfa,alfa,x);
        alfa2=2.0*alfa;
        temp=gamma(1.0+alfa);
        h0=pow(2.0,alfa2+1.0)*temp*temp/gamma(alfa2+2.0);
        b[1]=1.0/sqrt(3.0+alfa2);
        m=n-n/2;
        for (i=2; i [...]

            [...] 3)
                w=mb;
            else {
                if (mb == 0.0)
                    tol=0.0;
                else if (mb < 0.0)
                    tol = -tol;
                p=(b-a)*fb;
                if (first) {
                    q=fa-fb;
                    first=0;
                } else {
                    fdb=(fd-fb)/(d-b);
                    fda=(fd-fa)/(d-a);
                    p *= fda;
                    q=fdb*fa-fda*fb;
                }
                [...]
                if (ext == 3) p *= 2.0;
                w=(p<FLT_MIN || p [...]
    [...] = 0.0) : (fb [...] = 0.0) ? (fb <= 0.0) : (fb >= 0.0));
}

5.1.2 Single equation - Derivative available

zeroinder

Given the values, $x_0$ and $y_0$ say, of the end points of an interval assumed to contain a zero of the function f(x), zeroinder attempts to find, by use of a combination of interpolation and extrapolation based upon the use of a fractional linear function, and of bisection, values, $x_n$ and $y_n$ say, of the end points of a smaller interval containing the zero. At successful exit $f(x_n)f(y_n) \le 0$, $|f(x_n)| \le |f(y_n)|$ and $|x_n-y_n| \le 2*tol(x_n)$, where tol is a tolerance function prescribed by the user (see the documentation of zeroin). The algorithm used by zeroinder is similar to that used by zeroin, with the difference that the estimate l(b,a) of the zero of f(x) obtained by linear interpolation between the two function values at x=a and x=b is replaced by an estimate obtained by interpolation based upon use of a fractional linear function (x-u)/(vx+w) whose value agrees with that of f(x) at x=a and x=b, and whose derivative is also equal in value to that of f(x) at one of these points. zeroinder is to be preferred to zeroin or zeroinrat if the derivative is (much) cheaper to evaluate than the function.

The number of evaluations of fx, dfx and tolx is at most $4*\log_2(|x-y|/tau)$, where x and y are the argument values given upon entry and tau is the minimum of the tolerance function tolx on the initial interval (i.e. zeroinder requires at most 4 times the number of steps required for bisection). If upon entry x and y satisfy $f(x)*f(y) \le 0$ then convergence is guaranteed and the asymptotic order of convergence is 2.414 for a simple zero of f.
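The fractional linear interpolation step can be written out explicitly. Requiring that r(x) = (x-u)/(vx+w) matches f at a and b and f' at b, and eliminating v and w, gives the zero estimate u = b + p/q with p = fb*(fb-fa) and q = fa*dfb - fb*(fb-fa)/(b-a). The sketch below is an illustration under that derivation; the function name fraclin_zero is hypothetical, double precision is used, and none of the safeguards of zeroinder (bisection, tolerance handling, extrapolation counters) are included.

```c
/* Zero of the fractional linear function (x-u)/(vx+w) that matches
   f(a) = fa, f(b) = fb and f'(b) = dfb. Assumes a != b and q != 0. */
double fraclin_zero(double a, double fa, double b, double fb, double dfb)
{
    double d = (fb - fa) / (b - a);   /* slope of the secant through a and b */
    double p = fb * d * (b - a);
    double q = fa * dfb - fb * d;

    return b + p / q;
}
```

For a function that is itself fractional linear the estimate is exact: with f(x) = (x-1)/(x+1), a = 2 and b = 3, one has fa = 1/3, fb = 1/2, dfb = 1/8, and the sketch returns the zero x = 1. For f(x) = x the formula reduces to the ordinary secant estimate.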

Function Parameters:

int zeroinder (x,y,fx,dfx,tolx)

zeroinder: on exit zeroinder is given a nonzero value when a sufficiently small subinterval of J containing a zero of the function f(x) has been found; otherwise zeroinder is given the value zero;
x: float *; entry: one endpoint of interval J in which a zero is searched for; exit: a value approximating the zero within the tolerance 2*tol(x) when zeroinder has a nonzero value;
y: float *; entry: the other endpoint of interval J in which a zero is searched for; upon entry x < y as well as y < x is allowed; exit: the other straddling approximation of the zero, i.e. upon exit the values of y and x satisfy (a) $f(x)*f(y) \le 0$, (b) $|x-y| \le 2*tol(x)$ and (c) $|f(x)| \le |f(y)|$ when zeroinder has a nonzero value;
fx: float (*fx)(x); entry: defines function f as a function depending on the actual parameter for x;
dfx: float (*dfx)(x); entry: defines derivative df of f as a function depending on the actual parameter for x;
tolx: float (*tolx)(x); entry: defines the tolerance function tol which may depend on the actual parameter for x; one should choose tolx positive and never smaller than the precision of the machine's arithmetic at x, i.e. in this arithmetic x+tolx and x-tolx should always yield values distinct from x; otherwise the procedure may get into a loop.

#include <math.h>
#include <float.h>

int zeroinder(float *x, float *y, float (*fx)(float),
        float (*dfx)(float), float (*tolx)(float))
{
    int ext,extrapolate;
    float b,fb,dfb,a,fa,dfa,c,fc,dfc,d,w,mb,tol,m,p,q;

    b = *x;
    fb=(*fx)(*x);
    dfb=(*dfx)(*x);
    a = *x = *y;
    fa=(*fx)(*x);
    dfa=(*dfx)(*x);
    c=a;
    fc=fa;
    dfc=dfa;
    ext=0;
    extrapolate=1;
    while (extrapolate) {
        if (fabs(fc) < fabs(fb)) {
            a=b;
            fa=fb;
            [...]
        tol=(*tolx)(*x);
        m=(c+b)*0.5;
        mb=m-b;
        if (fabs(mb) > tol) {
            if (ext > 2)
                w=mb;
            else {
                if (mb == 0.0)
                    tol=0.0;
                else if (mb < 0.0)
                    tol = -tol;
                d = (ext == 2) ? dfa : (fb-fa)/(b-a);
                p=fb*d*(b-a);
                q=fa*dfb-fb*d;
                if (p < 0.0) {
                    p = -p;
                    q = -q;
                }
                w=(p [...]
            [...] = 0.0) : (fb [...]
                c=a;
                fc=fa;
                dfc=dfa;
                ext=0;
            } else
                ext = (w == mb) ? 0 : ext+1;
        } else
            break;
    [...]
    [...] ((p [...] = 0.0) ? (fb [...] = 0.0));

5.1.3 System of equations - No Jacobian matrix

A. quanewbnd

Solves systems of non-linear equations of which the Jacobian is known to be a band matrix and an approximation of the Jacobian is assumed to be available at the initial guess. The method used is the same as given in [Bro71].

quanewbnd computes an approximation to the solution x of the system of equations f(x)=0 ($f, x \in R^n$) for which the components $J_{i,j}=\partial f_i(x)/\partial x_j$ of the associated Jacobian matrix are zero when j>i+rw or i>j+lw, an approximation $x_0$ to x and an approximation $J_0$ to $\partial f(x_0)/\partial x_0$ being assumed to be available. At the i-th stage of the iterative method used, $\delta_i$ is obtained by solving (decsolbnd is called for this purpose) the band matrix system of equations

$J_i\delta_i = -f(x_i)$    (1)

with $x_{i+1} = x_i + \delta_i$, and with $p^{(j)} = (0,\dots,0,\delta_{i,h(j)},\dots,\delta_{i,k(j)},0,\dots,0)$ where h(j)=max(1,j-lw), k(j)=min(n,j+rw), and with $I_j$ being the matrix whose (j,j)-th element is unity, all other elements being zero, and (the bars denoting Euclidean norms) with $\Delta^{(j)} = I_j f(x_{i+1})(p^{(j)})^T/\|p^{(j)}\|^2$ if $\|p^{(j)}\|^2 \ge \varepsilon^2\|\delta_i\|^2$, and 0 otherwise ($\varepsilon$ being the machine precision), $J_{i+1}$ is obtained from the relationship

$J_{i+1} = J_i + \sum_{j=1}^{n}\Delta^{(j)}$;

formula (1) is reapplied and the process continued.

Function Parameters:

void quanewbnd (n,lw,rw,x,f,jac,funct,in,out)

n: int; entry: the number of independent variables; the number of equations should also be equal to n;
lw: int; entry: the number of codiagonals to the left of the main diagonal of the Jacobian;
rw: int; entry: the number of codiagonals to the right of the main diagonal of the Jacobian;
x: float x[1:n]; entry: an initial estimate of the solution of the system that has to be solved; exit: the calculated solution of the system;
f: float f[1:n]; entry: the values of the function components at the initial guess; exit: the values of the function components at the calculated solution;
jac: float jac[1:(lw+rw)*(n-1)+n]; entry: an approximation of the Jacobian at the initial estimate of the solution; an approximation of the (i,j)-th element of the Jacobian is given in jac[(lw+rw)*(i-1)+j], for i=1,...,n and j=max(1,i-lw),...,min(n,i+rw); exit: an approximation to the Jacobian at the calculated solution;
funct: int (*funct)(n,l,u,x,f); entry: the meaning of the parameters of the function funct is as follows:
    n: the number of independent variables of the function f;
    l,u: int; the lower and upper bound of the function component subscript;
    x: the independent variables are given in x[1:n];
    f: after a call of funct the function components f[i], i=l,...,u, should be given in f[l:u];
    exit: if the value of funct is zero then the execution of quanewbnd will be terminated, while the value of out[5] is set equal to 2;
in: float in[0:4];
    entry:
    in[0]: the machine precision;
    in[1]: the relative precision asked for;
    in[2]: the absolute precision asked for; if the value delivered in out[5] equals zero then the last correction vector d, say, which is a measure for the error in the solution, satisfies the inequality $\|d\| \le \|x\|*in[1]+in[2]$, whereby x denotes the calculated solution, given in array x, and $\|.\|$ denotes the Euclidean norm; however, we cannot guarantee that the true error in the solution satisfies this inequality, especially if the Jacobian is (nearly) singular at the solution;
    in[3]: the maximum value of the norm of the residual vector allowed; if out[5]=0 then this residual vector r, say, satisfies: $\|r\| \le in[3]$;
    in[4]: the maximum number of function component evaluations allowed; u-l+1 function component evaluations are counted for each call of funct(n,l,u,x,f); if out[5]=1 then the process is terminated because the number of evaluations exceeded the value given in in[4];
out: float out[1:5];
    exit:
    out[1]: the Euclidean norm of the last step accepted;
    out[2]: the Euclidean norm of the residual vector at the calculated solution;
    out[3]: the number of function component evaluations performed;
    out[4]: the number of iterations carried out;
    out[5]: the integer value delivered in out[5] gives some information about the termination of the process;
        out[5]=0: the process is terminated in a normal way; the last step and the norm of the residual vector satisfy the conditions (see in[2], in[3]); if out[5]≠0 then the process is terminated prematurely;
        out[5]=1: the number of function component evaluations exceeds the value given in in[4];
        out[5]=2: a call of funct delivered the value zero;
        out[5]=3: the approximation to the Jacobian matrix turns out to be singular.
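The one-dimensional storage of the band Jacobian described under jac can be made concrete with a few helper functions. The names band_index, band_length and in_band below are hypothetical and serve only to restate the documented layout: element (i,j) lives at position (lw+rw)*(i-1)+j, and the array has (lw+rw)*(n-1)+n entries.

```c
/* Position of Jacobian element (i,j) in the one-dimensional array jac
   (1-based, as in the parameter description). */
int band_index(int lw, int rw, int i, int j)
{
    return (lw + rw) * (i - 1) + j;
}

/* Number of entries needed to store the whole band. */
int band_length(int n, int lw, int rw)
{
    return (lw + rw) * (n - 1) + n;
}

/* Nonzero when (i,j) lies inside the band, i.e. max(1,i-lw) <= j <= min(n,i+rw). */
int in_band(int n, int lw, int rw, int i, int j)
{
    return i >= 1 && i <= n && j >= 1 && j <= n && j >= i - lw && j <= i + rw;
}
```

For a tridiagonal Jacobian of order 4 (lw = rw = 1) the array has 10 entries, element (1,1) is stored first and element (4,4) last. The elements within the band are stored row after row without gaps: the index of the first band element of row i+1 is one more than the index of element (i,i+rw).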

Functions used: mulvec, dupvec, vecvec, elmvec, decsolbnd.

#include <math.h>

void quanewbnd(int n, int lw, int rw, float x[], float f[], float jac[],
        int (*funct)(int, int, int, float [], float []),
        float in[], float out[])
{
    float *allocate_real_vector(int, int);
    void free_real_vector(float *, int);
    float vecvec(int, int, int, float [], float []);
    void elmvec(int, int, int, float [], float [], float);
    void mulvec(int, int, int, float [], float [], float);
    void dupvec(int, int, int, float [], float []);
    void decsolbnd(float [], int, int, int, float [], float []);
    int l,it,fcnt,fmax,err,b,i,j,k,r,m;
    float macheps,reltol,abstol,tolres,nd,mz,res,*delta,mul,crit,
            *pp,*s,aux[6],*lu;

    delta=allocate_real_vector(1,n);
    nd=0.0;
    macheps=in[0];
    reltol=in[1];
    abstol=in[2];
    tolres=in[3];
    fmax=in[4];
    mz=macheps*macheps;
    it=fcnt=0;
    b=lw+rw;
    l=(n-1)*b+n;
    b++;
    res=sqrt(vecvec(1,n,0,f,f));
    err=0;
    while (1) {
        if (err != 0 || (res < tolres &&
                sqrt(nd) < sqrt(vecvec(1,n,0,x,x))*reltol+abstol)) break;
        it++;
        if (it != 1) {
            /* update jac */
            pp=allocate_real_vector(1,n);
            s=allocate_real_vector(1,n);
            crit=nd*mz;
            for (i=1; i<=n; i++) pp[i]=delta[i]*delta[i];
            r=k=1;
            m=rw+1;
            for (i=1; i [...] crit) elmvec(k,m-j,j,jac,delta,f[i]/mul);
                k += b;
                if (i > lw)
                    r++;
                else
                    k--;
                if (m < n) m++;
            }
            free_real_vector(pp,1);
            free_real_vector(s,1);
        }
        /* direction */
        lu=allocate_real_vector(1,l);
        aux[2]=macheps;
        mulvec(1,n,0,delta,f,-1.0);
        dupvec(1,l,0,lu,jac);
        decsolbnd(lu,n,lw,rw,aux,delta);
        free_real_vector(lu,1);
        if (aux[3] != n) {
            err=3;
            break;
        } else {
            elmvec(1,n,0,x,delta,1.0);
            nd=vecvec(1,n,0,delta,delta);
            /* evaluate */
            fcnt += n;
            if (!((*funct)(n,1,n,x,f))) {
                err=2;
                break;
            }
            if (fcnt > fmax) err=1;
            res=sqrt(vecvec(1,n,0,f,f));
        }
    }
    out[1]=sqrt(nd);
    out[2]=res;
    out[3]=fcnt;
    out[4]=it;
    out[5]=err;
    free_real_vector(delta,1);
}

B. quanewbnd1

Solves systems of non-linear equations of which the Jacobian is known to be a band matrix and an approximation to the Jacobian at the initial guess is calculated using forward differences.

Computes an approximation to the solution x of the system of equations f(x)=0 ($f, x \in R^n$) for which the components $J_{i,j}=\partial f_i(x)/\partial x_j$ of the associated Jacobian matrix are zero when j>i+rw or i>j+lw, an approximation $x_0$ to x being assumed to be available. An approximation $J_0$ to $\partial f(x_0)/\partial x_0$, based upon the use of first order finite difference approximations using equal increments d for all components, is first obtained by means of a call of jacobnbndf, and quanewbnd is then called to obtain the desired approximation.
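A forward-difference approximation of a band Jacobian can be sketched as follows. This is a simplified illustration, not jacobnbndf: the names fd_band_jacobian, vecfun and tridiag are hypothetical, double precision is used, and one full function evaluation is spent per variable, whereas a dedicated routine may use the band structure to evaluate fewer components. The result is stored with element (i,j) at jac[(lw+rw)*(i-1)+j], with all arrays used from index 1.

```c
#include <stdlib.h>

/* Hypothetical callback type: fills f[1..n] from x[1..n]. */
typedef void (*vecfun)(int n, const double x[], double f[]);

/* Example system with a tridiagonal Jacobian: f_i = 2 x_i - x_{i-1} - x_{i+1}. */
void tridiag(int n, const double x[], double f[])
{
    int i;

    for (i = 1; i <= n; i++) {
        f[i] = 2.0 * x[i];
        if (i > 1) f[i] -= x[i - 1];
        if (i < n) f[i] -= x[i + 1];
    }
}

/* Forward differences with a fixed increment d; f holds the function
   values at x on entry. Column j only affects rows j-rw .. j+lw. */
void fd_band_jacobian(int n, int lw, int rw, double x[], const double f[],
                      double d, vecfun fun, double jac[])
{
    double *fd = malloc((n + 1) * sizeof *fd);
    double xj;
    int i, j, lo, hi;

    if (fd == NULL) return;
    for (j = 1; j <= n; j++) {
        xj = x[j];
        x[j] = xj + d;               /* perturb one variable */
        fun(n, x, fd);
        x[j] = xj;                   /* restore it */
        lo = (j - rw < 1) ? 1 : j - rw;
        hi = (j + lw > n) ? n : j + lw;
        for (i = lo; i <= hi; i++)
            jac[(lw + rw) * (i - 1) + j] = (fd[i] - f[i]) / d;
    }
    free(fd);
}
```

For the linear example system the differences reproduce the exact Jacobian up to rounding: the diagonal entries are 2 and the codiagonal entries are -1. An array produced this way has the layout that quanewbnd expects in its parameter jac.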

Function Parameters:

void quanewbnd1 (n,lw,rw,x,f,funct,in,out)

n: int; entry: the number of independent variables; the number of equations should also be equal to n;
lw: int; entry: the number of codiagonals to the left of the main diagonal of the Jacobian;
rw: int; entry: the number of codiagonals to the right of the main diagonal of the Jacobian;
x: float x[1:n]; entry: an initial estimate of the solution of the system that has to be solved; exit: the calculated solution of the system;
f: float f[1:n]; exit: the values of the function components at the calculated solution;
funct: int (*funct)(n,l,u,x,f); entry: the meaning of the parameters of the function funct is as follows:
    n: the number of independent variables of the function f;
    l,u: int; the lower and upper bound of the function component subscript;
    x: the independent variables are given in x[1:n];
    f: after a call of funct the function components f[i], i=l,...,u, should be given in f[l:u];
    exit: if the value of funct is zero then the execution of quanewbnd1 will be terminated, while the value of out[5] is set equal to 2;
in: float in[0:5];
    entry:
    in[0]: the machine precision;
    in[1]: the relative precision asked for;
    in[2]: the absolute precision asked for; if the value delivered in out[5] equals zero then the last correction vector d, say, which is a measure for the error in the solution, satisfies the inequality $\|d\| \le \|x\|*in[1]+in[2]$, whereby x denotes the calculated solution, given in array x, and $\|.\|$ denotes the Euclidean norm; however, we cannot guarantee that the true error in the solution satisfies this inequality, especially if the Jacobian is (nearly) singular at the solution;
    in[3]: the maximum value of the norm of the residual vector allowed; if out[5]=0 then this residual vector r, say, satisfies: $\|r\| \le in[3]$;
    in[4]: the maximum number of function component evaluations allowed; u-l+1 function component evaluations are counted for each call of funct(n,l,u,x,f); if out[5]=1 then the process is terminated because the number of evaluations exceeded the value given in in[4];
    in[5]: the Jacobian matrix at the initial guess is approximated using forward differences, with a fixed increment to each variable that equals the value given in in[5];
out: float out[1:5];
    exit:
    out[1]: the Euclidean norm of the last step accepted;
    out[2]: the Euclidean norm of the residual vector at the calculated solution;
    out[3]: the number of function component evaluations performed;
    out[4]: the number of iterations carried out;
    out[5]: the integer value delivered in out[5] gives some information about the termination of the process;
        out[5]=0: the process is terminated in a normal way; the last step and the norm of the residual vector satisfy the conditions (see in[2], in[3]); if out[5]≠0 then the process is terminated prematurely;
        out[5]=1: the number of function component evaluations exceeds the value given in in[4];
        out[5]=2: a call of funct delivered the value zero;
        out[5]=3: the approximation of the Jacobian matrix turns out to be singular.

Functions used: jacobnbndf, quanewbnd.

void quanewbnd1(int n, int lw, int rw, float x[], float f[],
			int (*funct)(int, int, int, float[], float[]),
			float in[], float out[])
{
	float *allocate_real_vector(int, int);
	void free_real_vector(float *, int);
	float quanewbnd1s(int);
	float quanewbnd1t(float, int);
	void quanewbnd(int, int, int, float [], float [], float [],
			int (*)(int, int, int, float[], float[]),
			float [], float []);
	void jacobnbndf(int, int, int, float [], float [], float [],
			float (*)(int), int (*)(int, int, int, float[], float[]));
	int k;
	float *jac;

	/* ... */
	in[4] -= k;
	quanewbnd1t(in[5], 1);
	jacobnbndf(n,lw,rw,x,f,jac,quanewbnd1s,funct);
	quanewbnd(n,lw,rw,x,f,jac,funct,in,out);
	in[4] += k;
	out[3] += k;
	free_real_vector(jac,1);
}

float quanewbnd1s(int i)
{
	/* this function is used internally by QUANEWBND1 */

	float quanewbnd1t(float, int);

	return (quanewbnd1t(0.0,0));
}

float quanewbnd1t(float x, int i)
{
	/* this function is used internally by QUANEWBND1 */

	static float y;

	y = (i ? x : y);
	return (y);
}
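The description of in[5] above says that the initial Jacobian is approximated by forward differences with one fixed increment for every variable. The following is a minimal sketch of that idea for a small dense Jacobian, not the band-matrix routine jacobnbndf itself. The names residual_fn, forward_diff_jacobian and example_residual are hypothetical, arrays are 0-based, and double is used instead of float.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical residual type: fills f[0..n-1] from x[0..n-1]. */
typedef void (*residual_fn)(size_t n, const double x[], double f[]);

/* Approximates the dense n-by-n Jacobian by forward differences, using
   the same fixed increment h for every variable (the role played by
   in[5] in the text). jac is row-major: jac[i*n+j] ~ d f_i / d x_j.
   f0 must hold the residual at x; work is scratch space of length n. */
void forward_diff_jacobian(size_t n, double x[], const double f0[],
                           double h, residual_fn fun,
                           double jac[], double work[])
{
    size_t i, j;

    assert(h != 0.0);
    for (j = 0; j < n; j++) {
        double saved = x[j];

        x[j] = saved + h;          /* perturb one variable */
        fun(n, x, work);
        x[j] = saved;              /* restore it exactly */
        for (i = 0; i < n; i++)
            jac[i * n + j] = (work[i] - f0[i]) / h;
    }
}

/* Example residual: f0 = x0^2 - x1, f1 = x0 + 3*x1. */
void example_residual(size_t n, const double x[], double f[])
{
    assert(n == 2);
    f[0] = x[0] * x[0] - x[1];
    f[1] = x[0] + 3.0 * x[1];
}
```

At x = (1, 2) with h = 1e-6 the sketch gives a Jacobian close to [[2, -1], [1, 3]]. The error of order h in the first entry is the usual price of a one-sided difference.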

5.2 Unconstrained optimization

5.2.1 One variable - No derivative

minin


Determines a point $x \in [a,b]$ at which the real valued function $f(x)$ assumes a minimum value. It is assumed that, for some point $\mu \in (a,b)$, either (a) $f$ is strictly monotonically decreasing on $[a,\mu)$ and strictly monotonically increasing on $[\mu,b]$ or (b) these two intervals may be replaced by $[a,\mu]$ and $(\mu,b]$ respectively. Use is made of function values alone.

The method used involves the determination of five sequences of points $a_n$, $b_n$, $v_n$, $w_n$, $x_n$, and further auxiliary numbers $u_n$, $p_n$ and $q_n$ ($n=0,1,\ldots$). Initially $a_0 = a$, $b_0 = b$, $v_0 = w_0 = x_0 = a_0 + \frac{1}{2}(3-\sqrt{5})(b_0-a_0)$. At the n-th stage of the computations a local minimum is known to lie in $[a_n,b_n]$ and $x_n \in [a_n,b_n]$. If $\max(x_n-a_n,\, b_n-x_n) \le 2\,tol(x_n)$, where $tol(x)$ is a tolerance function prescribed by the user, the computations terminate. $x_n$ is such that $f(x_n) \le f(x_r)$ either (a) for $r=0,1,\ldots,n-1$ or (b) for $r=0,1,\ldots,n-2$ with $f(x_n) = f(x_{n-1})$; $f(w_n) \le f(x_r)$ for $r=0,1,\ldots,n-2$ in case (a) and for $r=0,1,\ldots,n-3$ in case (b).

If the computations are not terminated, $p_n$, $q_n$ are determined such that $x_n + p_n/q_n$ is a turning point of the parabola through $(v_n,f(v_n))$, $(w_n,f(w_n))$ and $(x_n,f(x_n))$:

$$p_n = \pm\left[(x_n-v_n)^2\,(f(x_n)-f(w_n)) - (x_n-w_n)^2\,(f(x_n)-f(v_n))\right]$$
$$q_n = \mp 2\left[(x_n-v_n)\,(f(x_n)-f(w_n)) - (x_n-w_n)\,(f(x_n)-f(v_n))\right]$$

If either $|p_n/q_n| \le tol(x_n)$ or $q_n=0$ or $x_n+p_n/q_n \notin (a_n,b_n)$ then

$$u_{n+1} = \tfrac{1}{2}(\sqrt{5}-1)\,x_n + \tfrac{1}{2}(3-\sqrt{5})\,a_n \quad \text{if } x_n \ge \tfrac{1}{2}(a_n+b_n)$$
$$u_{n+1} = \tfrac{1}{2}(\sqrt{5}-1)\,x_n + \tfrac{1}{2}(3-\sqrt{5})\,b_n \quad \text{if } x_n < \tfrac{1}{2}(a_n+b_n)$$

... $f(x_n)$ then (a) $a_{n+1}=u_{n+1}$ and $b_{n+1}=b_n$ if $u_{n+1}$ ...

		... (*a-z)*q && p < (*b-z)*q) {
			d=p/q;
			u=z+d;
			if (u-(*a) < t || (*b)-u < t)
				d = ((z < m) ? tol : -tol);
		} else {
			e = ((z < m) ? *b : *a) - z;
			d=c*e;
		}
		u = *x = z + ((fabs(d) >= tol) ...
		fu=(*fx)(*x);
		if (fu ...

... 0. Termination occurs when $b_{n+1}-a_{n+1} \le 3\,tol(u_{n+1})$, where $tol(x)$ is a tolerance function prescribed by the user. The user should be aware that the choice of tolx may strongly affect the behavior of the algorithm, although convergence to a point for which the given function is minimal on the interval is assured. The asymptotic behavior will usually be fine as long as the numerical function is strictly $\delta$-unimodal on the given interval (see [Bre73]) and the tolerance function satisfies $tol(x) \ge \delta$ for all x in the given interval. Let the values of dfx at the beginning and end points of the initial interval be denoted by dfa and dfb, respectively; then finding a global minimum is only guaranteed if the function is convex and dfa <= 0 and dfb >= 0. If these conditions are not satisfied then a local minimum or a minimum at one of the end points might be found.

Function Parameters:
float mininder (x,y,fx,dfx,tolx)
mininder: delivers the calculated minimum value of the function, defined by fx, on the interval with endpoints a and b;
x: float *; entry: one of the end points of the interval on which the function has to be minimized; exit: the calculated approximation of the position of the minimum;
y: float *; entry: the other end point of the interval on which the function has to be minimized; exit: a value such that |x-y| <= 3*tol(x);
fx: float (*fx)(x); entry: the function is given by the actual parameter fx which depends on x;
dfx: float (*dfx)(x); entry: the derivative of the function is given by the actual parameter dfx which depends on x; fx and dfx are evaluated successively for a certain value of x;
tolx: float (*tolx)(x); entry: defines the tolerance function which may depend on x; a suitable tolerance function is: |x|*re+ae, where re is the relative precision desired and ae is an absolute precision which should not be chosen equal to zero.
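The tolerance function suggested above, |x|*re+ae, can be written directly as a callback with the float (*tolx)(float) shape that mininder expects. This is a minimal sketch: the constants RE and AE and the names tolx_example and interval_small_enough are hypothetical, and the second function only restates the termination test (interval length below three times the tolerance) described in the text.

```c
#include <assert.h>
#include <math.h>

/* Hypothetical precisions; AE must not be zero (see the text). */
static const float RE = 1.0e-6f;   /* relative precision re */
static const float AE = 1.0e-8f;   /* absolute precision ae */

/* Tolerance function of the suggested form |x|*re + ae. */
float tolx_example(float x)
{
    assert(AE > 0.0f);
    return fabsf(x) * RE + AE;
}

/* Termination test: the bracketing interval [a,b] is accepted once
   its length drops below 3*tol(x). */
int interval_small_enough(float a, float b, float x)
{
    return (b - a) < 3.0f * tolx_example(x);
}
```

A pointer to tolx_example could then be passed as the last argument of a call such as mininder(&x, &y, fx, dfx, tolx_example). Keeping AE strictly positive prevents the tolerance from collapsing to zero when the minimum lies at x = 0.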

#include <math.h>

float mininder(float *x, float *y, float (*fx)(float),
			float (*dfx)(float), float (*tolx)(float))
{
	int sgn;
	float a,b,c,fa,fb,fu,dfa,dfb,dfu,e,d,tol,ba,z,p,q,s;

	if (*x <= *y) {
		a = *x;
		fa=(*fx)(*x);
		dfa=(*dfx)(*x);
		b = *x = *y;
		fb=(*fx)(*x);
		dfb=(*dfx)(*x);
	} else {
		b = *x;
		fb=(*fx)(*x);
		dfb=(*dfx)(*x);
		a = *x = *y;
		fa=(*fx)(*x);
		dfa=(*dfx)(*x);
	}
	c=(3.0-sqrt(5.0))/2.0;
	d=b-a;
	e=d*2.0;
	z=e*2.0;
	while (1) {
		ba=b-a;
		tol=(*tolx)(*x);
		if (ba < tol*3.0) break;
		if (fabs(dfa) <= fabs(dfb)) {
			*x=a;
			sgn=1;
		} else {
			*x=b;
			sgn = -1;
		}
		if (dfa <= 0.0 && dfb >= 0.0) {
			/* cubic interpolation through (a,fa,dfa) and (b,fb,dfb) */
			z=(fa-fb)*3.0/ba+dfa+dfb;
			s=sqrt(z*z-dfa*dfb);
			p = (sgn == 1) ? dfa-s-z : dfb+s-z;
			p *= ba;
			q=dfb-dfa+s*2.0;
			z=e;
			e=d;
			d = (fabs(p) ...
		...
		... >= fabs(z*0.5) || fabs(d) > ba*0.5) {
			/* golden section step */
			e=ba;
			d=c*ba*sgn;
		}
		*x += d;
		fu=(*fx)(*x);
		dfu=(*dfx)(*x);
		if (dfu >= 0.0 || (fu >= fa && dfa <= 0.0)) {
			b = *x;
			fb=fu;
			dfb=dfu;
		} else {
			a = *x;
			fa=fu;
			dfa=dfu;
		}
	}
	if (fa <= fb) {
		*x=a;
		*y=b;
		return fa;
	} else {
		*x=b;
		*y=a;
		return fb;
	}
}

5.2.4 More variables - Auxiliary procedures

A. linemin

Determines the real value $\alpha_{min}$ (>0) of $\alpha$ for which $f(\alpha) = F(x_0+\alpha d)$ attains a minimum, where $F(x)$ is a real valued function of $x \in R^n$; $x_0 \in R^n$ is fixed, and $d \in R^n$ is a fixed direction. It is assumed that the partial derivatives $g_i(x) = \partial F(x)/\partial x_i$ ($i=1,\ldots,n$) may easily be computed; we then have $f'(\alpha) = df(\alpha)/d\alpha = d^T g(x_0+\alpha d)$.

The method employed utilises an iterative process, based upon cubic interpolation (see [Dav59]), for locating a minimum of the real-valued function $f$. The process involves an initial approximation $\alpha_0$, three real sequences $u_k$, $v_k$ and $y_k$, a fixed real parameter $\mu \in (0,\frac{1}{2})$, and relative and absolute tolerances, $\varepsilon_r$ and $\varepsilon_a$ respectively, used to terminate the process. Initially $y_0=u_0=0$, $v_0=\alpha_0$. At step k:

(a) if $f'(v_k) \ge 0$, compute
$$y = v_k - (v_k-u_k)\,\frac{f'(v_k)+w-z}{f'(v_k)-f'(u_k)+2w}$$
where
$$z = \frac{3\,(f(u_k)-f(v_k))}{v_k-u_k} + f'(u_k) + f'(v_k), \qquad w = \left(z^2 - f'(u_k)\,f'(v_k)\right)^{1/2};$$
set $\varepsilon_k = \|x_0 + y\,d\|\,\varepsilon_r + \varepsilon_a$; $y_{k+1}$ is determined as follows: (i) if $(y-u_k) < \varepsilon_k$ then $y_{k+1} = u_k+\varepsilon_k$, else (ii) if $(v_k-y) < \varepsilon_k$ then $y_{k+1}=v_k-\varepsilon_k$, else (iii) $y_{k+1}=y$; $u_{k+1}$ and $v_{k+1}$ are determined as follows: (iv) if $f'(y_{k+1}) \ge 0$ then $v_{k+1}=y_{k+1}$ and $u_{k+1}=u_k$; else (v) $v_{k+1}=v_k$ and $u_{k+1}=y_{k+1}$.

(b) if $f'(v_k) < 0$ then (i) if $(f(v_k)-f(0))/v_k f'(0) > \mu$ (now $0 < v_k < \alpha_{min}$) set $u_{k+1}=v_k$ and $y_{k+1}=v_{k+1}=2v_k$; else (ii) (now $u_k < \alpha_{min} < v_k$) set $y_{k+1} = \frac{1}{2}(u_k+v_k)$ and if $f'(y_{k+1}) \ge 0$ or $(f(y_{k+1})-f(0))/y_{k+1} f'(0) > \mu$ then $v_{k+1}=y_{k+1}$ and $u_{k+1}=u_k$, else $v_{k+1}=v_k$ and $u_{k+1}=y_{k+1}$.

The above process is terminated if either
I. $\frac{1}{2}|v_k - u_k| < \|x_0 + y_{k+1}d\|\,\varepsilon_r + \varepsilon_a$,
II. $\mu \le (f(y_k)-f(0))/y_k f'(0) \le 1-\mu$, or
III. the maximal permitted number of simultaneous evaluations of $f(\alpha)$ and $f'(\alpha)$ is exceeded.
In cases I and II, $y_{k+1}$ is an acceptable approximation to $\alpha_{min}$.

The direction vector $d$ and the initial values $x_0$, $f(0)=F(x_0)$, and $f'(0)=d^T g(x_0)$ must be supplied by the user. The user must also prescribe the value of the variable strongsearch: if this is nonzero then the stopping criteria I and III alone are used; if it is zero then all criteria are used. The stopping criterion used when the value of strongsearch is zero is described in [Fle63] and [GolP67]. A detailed description of this procedure is given in [Bus72b].

Function Parameters:
void linemin (n,x,d,nd,alfa,g,funct,f0,f1,df0,df1,evlmax,strongsearch,in)
n: int; entry: the number of variables of the given function f;
x: float x[1:n]; entry: a vector x0, such that f is decreasing in x0, in the direction given by d; exit: the calculated approximation of the vector for which f is minimal on the line defined by: x0 + alfa*d, (alfa>0);
d: float d[1:n]; entry: the direction of the line on which f has to be minimized;
nd: float; entry: the Euclidean norm of the vector given in d[1:n];
alfa: float *; the independent variable that defines the position on the line on which f has to be minimized; this line is defined by x0 + alfa*d, (alfa>0); entry: an estimate alfa0 of the value for which h(alfa)=F(x0+alfa*d) is minimal; exit: the calculated approximation alfam of the value for which h(alfa) is minimal;
g: float g[1:n]; exit: the gradient of f at the calculated approximation of the minimum;
funct: float (*funct)(n,x,g); entry: a call of funct should effectuate: 1. funct=f(x); 2. the value of g[i], (i=1,...,n), becomes the value of the i-th component of the gradient of f at x;
f0: float; entry: the value of h(0), (see alfa);
f1: float *; entry: the value of h(alfa0), (see alfa); exit: the value of h(alfam), (see alfa);
df0: float; entry: the value of the derivative of h at alfa=0;
df1: float *; entry: the value of the derivative of h at alfa=alfa0; exit: the value of the derivative of h at alfa=alfam;
evlmax: int *; entry: the maximum allowed number of calls of funct; exit: the number of times funct has been called;
strongsearch: int; entry: if the value of strongsearch is nonzero then the process makes use of two stopping criteria: A: the number of times funct has been called exceeds the given value of evlmax; B: an interval is found with length less than two times the prescribed precision, on which a minimum is expected; if the value of strongsearch is zero then the process also makes use of a third stopping criterion: C: mu <= (h(alfak)-h(alfa0))/(alfak*df0) < 1 - mu, whereby alfak is the current iterate and mu a prescribed constant;
in: float in[1:3]; entry:
in[1]: relative precision, eps_r, necessary for the stopping criterion B (see strongsearch);
in[2]: absolute precision, eps_a, necessary for the stopping criterion B (see strongsearch); the prescribed precision, eps, at alfa=alfak is given by: eps = ||x0+alfa*d|| eps_r + eps_a, where ||.|| denotes the Euclidean norm;
in[3]: the parameter mu necessary for stopping criterion C; this parameter must satisfy: 0 ...

	... 0);
	while (1) {
		if (illc) {
			/* random step to get off resulting valley */
			for (i=1; i<=n; i++) {
				s=z[i]=(0.1*ldt+t2*pow(10.0,kt))*(rand()/(float)RAND_MAX-0.5);
				elmveccol(1,n,i,x,v,s);
			}

1

for (k2=k; k2c=n; k2++) { sl=fx; s=O . 0; praxismin (k2,2,& (d[k2l), &s,&fx,0, n,x,v,&qa,&qb,&qc,qdO,qdl,qO,ql, &nf, &nl,&fx,m2,m4,dmin,ldt,reltol,abstol,small,h,funct~; s = illc ? d[k21* (s+z[k21 ) * (s+z[k21) : sl-fx; if (df c s) { df=s; kl=k2;

1 if

( ! illc &&

illc=l:


df c fabs (100.O*macheps*fx))

else break; 1 ?or (k2=1; k2=k; i--) ( for (j=l; j ktm) { out [l]=O.0; emergency=l;

1

1

if (emergency) break; / * quad * / sax; fx=qf1 ; qf 1=s; qdl=O .O; for (i=l; i FLT-MIN) && (nl >=3*n*n)) ( praxismin (0,2,&s,&l, &q£l,1, n,x,v,&qa,&qb,&qc,qdO,qdl,qO,ql,&nf,

&nl,&fx,m2,m4,dmin,ldt,reltol,abstol,small,h,funct); qa=l* (1-qdl)/ (qdO*(qdO+qdl)) ; qb= (l+qd~) * (qdl-1)/ (qd~*qdl) ; qc=l* (l+qdO)/ (qdl*(qdO+qdl)) ; } else { fx=qf1 ; qa=qb=0.0; qc=1.0 ; 1

qdO=qdl; for (i=l; i 1.0) { s=vlarge; for (i=l; i s1) s=sl;

1

for (i=l;ic=n; i++) ( sl=s/z[il ; z [il=l.O/sl; if (z[il > scbd) { sl=l.O/scbd; z [il=scbd; 1 I

mulrow(l,n,i,i,v,v,sl);

1

1

for (i=l; ic=n; i++) ichrowcol (i+l,n, i,i,v); em [o]=em [21=macheps; em [ 4 ] =lO*n; em [6]=vsmall; dupmat (l,n, l,n,a,v); if (qrisngvaldec(a,n,n,d,v.em) ! = 0) { out [ll=2.0; emergency=l;

1

if (emergency) break; if (scbd > 1.0! { for (i=l;i dmin; if (nf >= rnaxf) ( out [ll=l.0; break;

1

I

out [21=fx; out [41=nf; out [S]=nl; out [61=ldt; free-real-vector (d,1) ; free-real-vector (y,1) ; free-real-vector (z,l); free-real-vector (q0,l); free-real-vector (ql,1) ; free-real-matrix(v, 1,n,1); free-real-matrix (a,1,n,1);

I void praxismin(int j, int nits, float *d2, float *xl, float *fl, int fk, int n, float x[l , float **v, float *qa, float *qb, float *qc, float qd0, float qdl, float q0 [ I , float ql [I , int *nf, int *nl, float *fx, float m2, float m4, float dmin, float ldt, float reltol, float abstol, float small, float h, float (*funct)(int, float [I ) )

/ * this function is internally used by PRAXIS * / float praxisflin(float, int, int, float [I, float **, float *, float *, float *, float, float, float [I , float (*)(int, float[])); float [I, int *, int k,dz,loop; float x2,xm,fO,f2,fm,dl,t2,s,sfl,sxl; sf1 = *fl; SX1 = *XI; k=O ; xm=O .o; fO = fm = *fx; dz = *d2 c reltol; s=sqrt(vecvec(l,n,O,x,x)) ; t2=m4*sqrt(fabs(*fx)/ (dz ? dmin s=s*m4+abstol; if (dz && (t2 > s)) t2=s; if (t2 c small) t2=small; if (t2 > 0.01*h) t2=0.01*h; if (fk && (*fl c= fm)) { xm = *xl; fm = *fl;

:

*d2)+s*ldt)+m2*ldt;

I

if (!fk 1 1 (fabs(*xl) c t2)) ( *Xl = (*x1 > 0.0) ? t2 : -t2; qa,qb,qc,qdO,qdl,q0,ql,nf,funct); *fl=praxisflin (*xl,j ,n,x,v,

loop=1; while (loop) ( if (dz) { / * evaluate praxisflin at another point and estimate the second derivative * / x2 = (fO c *fl) ? -(*xl) : (*x1)*2.0; f2=praxisflin (x2,j , n,x,v,qa,qb,qc,qdO ,qdl,q0,ql,nf , funct) if (f2 c= fm) ( xm=x2; fm=f2; I

/ * estimate first derivative at 0 * / dl= ( (*fl)-fO)/ (*XI)- (*XI)* (*d2);


;

dz=l; x2 = (*d2 c= small) ? ((dl c 0.0) ? h : -h) : -0.5*dl/(*d21 ; if (fabs(x2) > h) x2 = (x2 > 0.0) ? h : -h; while (1) { b qc,qdO,qdl,qO,ql,nf,funct); f2=praxisflin(x2,j ,n,x,v, if (k < nits && f2 > £0) ' k++ ; if (£0 c *£I && (*xl)*x2 > 0.0) break; x2=0.5*~2; ) else { loop=0; break;

qatq

else fm=f2; *d2 = (fabs(x2*(x2-(*XI))) > small) ? ( (x2*( (*fl)-fO)- (*xl)* (fm-£0) ) / ( (*xl)*x2* ( (*XI)-x2)) ) ((k > 0) ? 0.0 : *d2); if (*d2 c= small) *d2=small; *x1=x2; *fx=fm; if (sf1 c *fx) { *fx=sf1; *x1=sx1:

:

float praxisflin(float l, int j, int n, float x[], float **v,
			float *qa, float *qb, float *qc, float qd0, float qd1,
			float q0[], float q1[], int *nf,
			float (*funct)(int, float[]))
{
	/* this function is internally used by PRAXISMIN */

	int i;
	float *t,result;

	t=allocate_real_vector(1,n);
	if (j > 0)
		for (i=1; i<=n; i++) t[i]=x[i]+l*v[i][j];
	else {
		/* search along parabolic space curve */
		*qa=l*(l-qd1)/(qd0*(qd0+qd1));
		*qb=(l+qd0)*(qd1-l)/(qd0*qd1);
		*qc=l*(l+qd0)/(qd1*(qd0+qd1));
		for (i=1; i<=n; i++)
			t[i]=(*qa)*q0[i]+(*qb)*x[i]+(*qc)*q1[i];
	}
	(*nf)++;
	result=(*funct)(n,t);
	free_real_vector(t,1);
	return result;
}
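For j > 0, praxisflin evaluates the objective at the point x + l*v[.][j], that is, along the j-th column of the direction matrix, and counts the evaluation. The sketch below isolates that one step. The names objective_fn, eval_along_column and sum_of_squares are hypothetical, arrays are 0-based with the matrix stored row-major in a flat array, and double replaces float.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical objective type: value of F at x[0..n-1]. */
typedef double (*objective_fn)(int n, const double x[]);

/* Evaluates F at x + lambda * (column j of v), where v is an n-by-n
   matrix stored row-major, and increments the evaluation counter nf. */
double eval_along_column(int n, const double x[], const double v[],
                         int j, double lambda, objective_fn f, int *nf)
{
    double *t, result;
    int i;

    assert(n > 0 && j >= 0 && j < n);
    t = malloc((size_t)n * sizeof *t);
    assert(t != NULL);
    for (i = 0; i < n; i++)
        t[i] = x[i] + lambda * v[i * n + j];
    (*nf)++;
    result = f(n, t);
    free(t);
    return result;
}

/* Example objective: F(x) = sum of x_i^2. */
double sum_of_squares(int n, const double x[])
{
    double s = 0.0;
    int i;

    for (i = 0; i < n; i++)
        s += x[i] * x[i];
    return s;
}
```

With x = (1, 2), v the identity matrix, j = 1 and lambda = -2, the trial point is (1, 0) and the returned value is 1. The line search in praxismin then only ever deals with a function of the single variable lambda.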

5.2.6 More variables - Gradient available

A. rnk1min

Determines a vector $x \in R^n$ for which the real valued function F(x) attains a minimum. It is assumed that the partial derivatives $g_i(x) = \partial F(x)/\partial x_i$ ($i=1,\ldots,n$) may easily be computed. rnk1min is suitable, in particular, for use in connection with problems for which the n×n Hessian matrix G(x), whose components are $G_{ij}(x) = \partial^2 F(x)/\partial x_i \partial x_j$ ($i,j=1,\ldots,n$), is almost


singular at the minimum. With H(x)=G(x)-', and the initial vector x(O) prescribed, the sequence of vectors x@) produced by use of the Newton scheme xF") = xF) - H(xFqg(x ") (kO, 1,...) under certain conditions converges quadratically. Use of this scheme requires the evaluation and inversion of a Hessian matrix at each stage, and in order to avoid this rnklmin determines a sequence of vectors xF) by use of a scheme of the form xF") = x" - aF)~#g(xF))in which the metric p)is an approximation to H(xF9, and is corrected by use of a simple updating formula of the form HF")= @) + CF) where cF)is of rank one (or possibly rank two), and the aF)are suitably determined real numbers. The user is asked to provide at outset the initial approximation vector x(O) and values of the machine precision e, the required relative and absolute precisions, e, and e, respectively, in the determination of the minimum, a descent parameter y in the range O 0, i.e. the derivative of F(x) in the direction be taken as just given then (89Tg(~F9 8)at the point xm would be decreasing. To avoid this, @) is decomposed in the form H") = UhUT, where U is a matrix of eigenvectors, and A an nxn diagonal matrix of eigenvalues hi(i=l,...,n); defining 1 A ( to be the diagonal matrix with diagonal elements I hiI (i=l, ...,n), and K@ by fl = UI A I UT (SOthat fl)is positive definite) we take 8) = -KWg(xF9 when condition (1) is violated. and determine the smallest integer 11. If k=O, set @) = min(1, 2(F,,-F(~(~'))/(-d(")~g(x(~')), rLO for which either (d09Tg(x(0)+2'e(0)d'09 2 0 or - F ( X ( ~ ~ ) / ~ ' B ( ~ ) Tg(x' ( ~ ~ ~09 < y (~(x'~)-2'e'~)d(~~ = 2'tf0) for this r. F(x) now decreases and then increases upon the line and set ~=x(~)+acf0l (a>O). An approximation to the value of a specifying the point at which F is minimum upon this line is determined by cubic interpolation (see the theory of linemin and [Dav59]) or (if (do))Tg(x(0)+B(O)do~ < 0 by bisection. 
For the value of a so determined y s (F(x~)+cY~') - F ( x ~ ) ) ) / c x ( ~ ~ (Ix 1-p. (2) If h 0 , set P) = 11 SF-') 1 111 dF)1 if ky (3) set aF)= BF). If condition (3) is violated, the distance from x=xm to the minimum of F(x) upon the line x m + a 8 ) (OPO) is overestimated by setting a=BF), and a value of a


in the range 0 tol) ( if (ext > 2) w=mb; else ( if (mb == 0.0) tol=O.O; else if (mb c 0.0) to1 = -tol; p= (bb-a)*fb; if (ext = 0.0) : (fb c= 0.0)) { c=a; fc=fa; ext=O; ) else ext = (w == mb) ? 0 : ext+l; ) else break;

1

)* end of finding zero * /

1

	s = iv ? *x : *y;
	if (iv)
		rk4arkstep(x,xl,s-xl,y,yl,zl,fxy,3,0,
				&k0,&k1,&k2,&k3,&k4,&k5,&discr,mu);
	else
		rk4arkstep(y,yl,s-yl,x,xl,zl,fxy,3,1,
				&k0,&k1,&k2,&k3,&k4,&k5,&discr,mu);
	d[3]=(*x);
	d[4]=(*y);
}

void rk4arkstep(float *x, float xl, float h, float *y, float yl,
			float zl, float (*fxy)(float, float), int d, int invf,
			float *k0, float *k1, float *k2, float *k3, float *k4,
			float *k5, float *discr, float mu)
{
	/* this function is internally used by RK4A */

	...
			*k0=(invf ? (1.0/(*fxy)(*x,*y)) : (*fxy)(*x,*y))*h;
		} else if (d == 1)
			*k0=zl*h;
		else
			*k0 *= mu;
		*x=xl+h/4.5;
		*y=yl+(*k0)/4.5;
		*k1=(invf ? (1.0/(*fxy)(*x,*y)) : (*fxy)(*x,*y))*h;
		*x=xl+h/3.0;
		*y=yl+((*k0)+(*k1)*3.0)/12.0;
		*k2=(invf ? (1.0/(*fxy)(*x,*y)) : (*fxy)(*x,*y))*h;
		*x=xl+h*0.5;
		*y=yl+((*k0)+(*k2)*3.0)/8.0;
		*k3=(invf ? (1.0/(*fxy)(*x,*y)) : (*fxy)(*x,*y))*h;
		*x=xl+h*0.8;
		*y=yl+((*k0)*53.0-(*k1)*135.0+(*k2)*126.0+(*k3)*56.0)/125.0;
		*k4=(invf ? (1.0/(*fxy)(*x,*y)) : (*fxy)(*x,*y))*h;
		if (d <= 1) {
			*x=xl+h;
			*y=yl+((*k0)*133.0-(*k1)*378.0+(*k2)*276.0+
					(*k3)*112.0+(*k4)*25.0)/168.0;
			*k5=(invf ? (1.0/(*fxy)(*x,*y)) : (*fxy)(*x,*y))*h;
			*discr=fabs((*k0)*21.0-(*k2)*162.0+(*k3)*224.0-
					(*k4)*125.0+(*k5)*42.0)/14.0;
			return;
		}
	...
	*x=xl+h;
	*y=yl+(-(*k0)*63.0+(*k1)*189.0-(*k2)*36.0-
			(*k3)*112.0+(*k4)*50.0)/28.0;
	*k5=(invf ? (1.0/(*fxy)(*x,*y)) : (*fxy)(*x,*y))*h;
	*y=yl+((*k0)*35.0+(*k2)*162.0+(*k4)*125.0+(*k5)*14.0)/336.0;
}

Solves an initial value problem for a system of first order ordinary differential equations $dx_j(x_0)/dx_0 = f_j(x)$ ($j=1,\ldots,n$), of which the derivative components are supposed to become large, e.g. in the neighbourhood of singularities, by means of a 5-th order Runge-Kutta method [Z64]. The system is assumed to be non-stiff.

rk4na integrates the given system of differential equations from $x_0 = a$ ($x_j(a)$, $j=1,\ldots,n$, being given) and the direction of integration (i.e. with $l$ specified, $0 \le l \le n$, $x_l$ increasing or decreasing) prescribed at outset until, to within stated tolerances, a zero of the function $b(x)$ is encountered. ($b(x)$ is, of course, effectively a function of the single variable $x_0$.) The role of the function b is similar to that played in the implementation of rk4a; now, for example, we may take $b(x)$ to be $x_1 - x_2$, to determine the point at which $x_1(x_0)$ and $x_2(x_0)$ become equal.

At each integration step, the quantities $|dx_j(x_0)/dx_0|$ ($j=0,1,\ldots,n$) are inspected, and the variable of integration is taken to be $x_{j'}$, where $j'$ is that value of $j$ corresponding to the maximum of these quantities. The system of equations actually solved has the form

$$dx_j/dx_{j'} = f_j(x)/f_{j'}(x) \qquad (j=0,1,\ldots,n;\ j \ne j'),$$

where $f_0(x) = 1$.

Function Parameters:
void rk4na (x,xa,b,fxj,e,d,fi,n,l,pos)
x: float x[0:n]; entry: x[0] is the independent variable, x[1],...,x[n] are the dependent variables; exit: the solution at b=0;
xa: float xa[0:n]; entry: the initial values of x[0],...,x[n];
b: float (*b)(n,x); b depends on x[0],...,x[n]; if the equation b=0 is satisfied within a certain tolerance (see parameter e), the integration is terminated; b is evaluated and tested for change of sign at the end of each step;
fxj: float (*fxj)(n,j,x); fxj depends on x[0],...,x[n] and j, defining the right hand side of the differential equation; at each call it delivers: dx[j]/dx[0];
e: float e[0:2n+3]; entry: e[2j] and e[2j+1], 0<=j<=n, are the relative and the absolute tolerance, respectively, associated with x[j]; e[2n+2] and e[2n+3] are the relative and absolute tolerance used in the determination of the zero of b;
d: float d[0:n+3]; After completion of each step we have: d[0] is the number of steps skipped; d[2] is the step length; d[j+3] is the last value of x[j], j=0,...,n;
fi: int; entry: if fi is nonzero then the integration is started with initial condition x[j] = xa[j]; if fi is zero then the integration is continued with x[j] = d[j+3];
n: int; entry: the number of equations;
l: int; entry: an integer to be supplied by the user, 0<=l<=n (see pos);
pos: int; entry: if fi is nonzero then the integration starts in such a way that x[l] increases if pos is nonzero and x[l] decreases if pos is zero; if fi is zero then pos is of no significance.
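The description above says that at each step the component with the largest |dx_j/dx_0| becomes the variable of integration, and that the remaining derivatives are then taken with respect to it. A minimal sketch of that selection and rescaling follows. The names pick_integration_variable and rescale_derivatives are hypothetical, the vector deriv[0..n] is assumed to hold dx_j/dx_0 with deriv[0] = 1, and double replaces float.

```c
#include <assert.h>
#include <math.h>

/* Returns the index j' in 0..n for which |deriv[j]| is largest; ties
   keep the lowest index. deriv[j] holds dx_j/dx_0, so deriv[0] is 1. */
int pick_integration_variable(int n, const double deriv[])
{
    int j, best = 0;
    double max = fabs(deriv[0]);

    assert(n >= 0);
    for (j = 1; j <= n; j++)
        if (fabs(deriv[j]) > max) {
            max = fabs(deriv[j]);
            best = j;
        }
    return best;
}

/* Converts dx_j/dx_0 into dx_j/dx_j' by dividing by deriv[jp]. */
void rescale_derivatives(int n, double deriv[], int jp)
{
    double pivot = deriv[jp];
    int j;

    assert(pivot != 0.0);
    for (j = 0; j <= n; j++)
        deriv[j] /= pivot;
}
```

For deriv = (1, -50, 3) the sketch selects j' = 1, and after rescaling the vector is (-0.02, 1, -0.06). Every rescaled component has modulus at most one, which is what keeps the step well behaved near a singularity of the original system.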

void rk4na(float x[], float xa[], float (*b)(int, float[]),
			float (*fxj)(int, int, float[]), float e[], float d[],
			int fi, int n, int l, int pos)
{
	float *allocate_real_vector(int, int);
	float **allocate_real_matrix(int, int, int, int);
	void free_real_vector(float *, int);
	void free_real_matrix(float **, int, int, int);
	void rk4narkstep(float, int, int, int, float,
			float (*)(int, int, float[]), float [], float [],
			float [], float [], float **);
	int j,i,iv,iv0,fir,first,rej,change,t,next,ext,extrapolate;
	float h,cond0,cond1,fhm,absh,tol,fh,max,x0,x1,s,hmin,hl,mu,mu1,
			p,fzero,*xl,*discr,*y,**k,e1[3],
			c,fc,bb,fb,a,fa,dd,fd,fdb,fda,w,mb,m,q;

for ( L O ; ic=n; i++) d[i+3] =xa [il ; d[O] =d[ZI=O.O; 1 d[ll=0.0; for (i=O; ic=n; i++) x[i] =xl [il=d[i+31 ; iv=d [OI ;


first=fir=l; y[OI=1.0; next=O; change=l; while (1) ( if (!change) ( while (1) { absh=fabs(h); if (absh < hmin) ( h = (h > 0.0) ? hmin absh=fabs(h);

:

,

rk4narkstep (h,i,n,iv,mu,fxj,x,xl, y,discr,k); rej=O; fhm=0.0; for (i=O;i fhm) fhm=fh; 1 mu=l.O/(l.O+fhm)+0.45; if (!rej) break; if (absh = xe-(*x)) { last=l; h=xe- (*x); (*x)=xe; } else (*x) += h; / * newton iteration * / itnum=O; while (1) { itnum++; if ( (*evaluate)(itnum)) ( (*jacobian)(m,j,y,sigmal,sigma2); l~niger2coef(m,j,a,aux,pi,h,*sigmal,*sigma2, & C O , & C ~ , & C ~ , & C ~ ,;& C ~ ) } else if (itnum == 1 && h ! = hl) liniger2coef(m,j,a,aux,pi,h,*sigmal,*sigma2, &CO,&Cl,&C2,&C3,& ~ 4 ;) for (i=l; i 40.0) r=bl/ (bl-2.0); else ( ex=exp (-bl); r=bl* (1.0-ex)/ (bl-2.O+(bl+2.O)*ex); 1 p=r/3.0-2.0/bl; ] else if (fabs(bl-b2) < bl*bl*l.Oe-6) doublefit=l; else { if (bl > 40.0) r=bl/ (bl-2.0); else ( ex=exp (-bl); r=bl* (1.0-ex)/ (bl-2.O+(bl+2.0)*ex); I

rl=r*bl; if (b2 > 40.0) r=b2/(b2-2.0); else { ex=exp ( -b2) ; r=b2* (1.0-ex)/ (b2-2.O+(b2+2.0)*ex); 1

if (doublefit) ( bl=O.5* (bl+b2); if (bl > 40.0) r=bl/ (bl-2.0); else { ex=exp (-bl); r=bl* (1.0-ex)/ (bl-2.O+(bl+2.0)*ex);

I

rl=r; if (bl > 40.0) ex=O.O; r2=bl/(1.0-ex); r2=1.0-ex*r2*r2; q=l.0/ (rl*rl*r2); p=rl*q-2.0/bl; 1 J

*cO *cl *c2 *c3 *c4 for

I

) = 0.25*h*h*(p+q);

= 0.5*h* (l.O+p); = h- (*cl); = 0.25*h*h* (q-p);

= 0.5*h*p;

(i=l; i= xe) break; / * step size * / xo= (*x); hO=hl; if ((*n = 1.1) ? a*hO : hO; count=O; reeval= (a c = 0.9 && nsjevl ! = 1); countl = (a >= 1.0 I I reeval) ? 0 : countl+l; if (countl == 10) ( countko ; reeval=l; hl=a*hO; 1

) el'se { hl=h; reeval= ( (nsjev == nsjevl)

&&

!strategy

&&

!linear);

1 J

if (strategy) hl = (hl > hrnax) ? hmax (*x) += hl; if ( (*x) >= xe) ( hl=xe-x0; (*x)=xe ;

:

((hl < hmin) ? hmin

1

if ( (*n 0) && !linear); kchange--; if (update) / * operator construction 4 /


:

hl) ;

gmsopconstruct (&reeval,&update,r,hjac.h2jac2,rqz8 y,aux, ri,ci,lu,jev,&nsjevl,delta,&alfa,h1,h0,&sl,&s2, jacobian); if (!linear) gmscoefficient(&xll,&x10,xO,change,*n,&ql,&q2,hl, alfa,bdl,bd2,strategy); / * next integration step */ for (1=2; 1>=1; I--) { dupvec(l*r+l, (l+l)*r,-r,yl,yl); dupvec(l*r+l, (1+1)*r,-r,f1,fl);

kree-integer-vector (ri,1) ; free-integer-vector (ci,1) ; free-realvector (yl,1) ; f ree-real-vector (yo,1) ; free-real-vector (yl,1) ; free real vector(f1,l); free-realPmatrix(bdl, l,3,1); free~real~matrix (bd2,1,3,1) ; free-real-matrix(hjac, 1,r.1); free-real-matrix(h2 jac2,l,r,1); free-real-matrix (rqz,1,r,1) ;

I void gmsopconstruct(int *reeval, int *update, int r, float **hjac, float **h2jac2, float **rqz, float y[l, float a m [I , int ri [ I , int ci [I, int *lu, int *jev, int *nsjevl, float *delta, float *alfa, float hl, float hO, float *sl, float *s2, void (*jacobian)(int, float **, float [I, float * ) ) 1

\

/ * this function is internally used by GMS * / int i,j; float a,al,zl,e,q; if (*reeval) { (*jacobian)(r,hjac,y,delta) ; (*lev)++; *nsjevl = 0; if (*delta c= 1.0e-15) (*alfa)=1.0/3.0; else ( zl=hl*(*delta); a=zl*zl+l2.0; al=6.0*zl; if (fabs(z1) < 0.1) (*alfa)=(zl*zl/l40.0-1.0)*zl/30.0;

,

else if (zl c -33.0) (*alfa)= (a+al)/ (3.0*21*(2.0+21)) ; else { e=exp (21); (*alfa)= ( (a-al)*e-a-al)/ ( ( ( 2 .O-21)*e-2.O-21) *zl*3.O);

I

a=hl/h0; al=a*a; if (treeval) a=hl; if (a != 1.0) for (j=i; jc=r; j++) colcst(l,r,j,hjac,a); for (i=l; ic=r; i++) { for (j=l; jc=r; j++) { q=h2jac2Iil [j]=(*reeval ? matmat (l,r,i,j,hjac,hjac) h2jac2 [il [jl *all ; rqz [il [jl=(*s2)*q;


:

void gmscoefficient(f1oat *xll, float *x10, float xO, int change, int n, float *ql, float *q2, float hl, float alfa, float **bdl, float **bd2, int strategy) I

/ * this function is internally used by GMS * / float a,q12,q22,qlq2,x12; x12= (*xl1); (*xl1)= (*xlO); ( *x10) =xo ; if (change) { if (n > 2) { (*ql)= ( (*xll)- (*xlO)) /hl; (*q2)= (x12-(*x10)) /h1; I

void gmsdiffscheme(int k, int count, int r, float El[], float yl [I , int *n, int *nsjevl, float y0 [I , float alfa, float **bdl, float **bd2, float hl, float y[] , float **hjac, float **h2jac2, float **rqz, int ri [ I , int ci [I , float *delta, void (*derivative)(int, float [I, float * ) ) { / * this function is internally used by GMS * / int i,l; if (count != 1) { dupvec(l,r,O,fl,yl) ; (*derivative)(r,f1,delta); (*n)++; (*nsjevl)++;

1

mulvec(1,r,O,yO,yl, (l.O-alfa)/2.0-bdl[11[kl); for ( 1 ~ 2 ; l 0) { / * backward differences * / impexbackdiff (n,ul,u3,wl,w2,w3,sl,s2,s3,r,rf); (*update)(weights,s2,n);

1

1 eci=4; Mstp : if (hnew ! = h2) { eci=l; / * change of information * / cl=hnew/h2; c2=c1*c1; c3=c2*c1; kof 121 [21=cl; kof [21 [31=(cl-c2)/2 .O; kof [21 [41=~3/6.0-c2/2 .O+c1/3.O; kof 1 - 31 - [31 - - =c2: . kof [31 [41=c2-c3; kof [41 141 =c3; for (i=l; ic=n; i++) ul [il=r 121 [il+r [31 [il/2.O+r[41 [il/3.0; alfl=matvec(l,n,l,rf,ul)/vecvec(l,n,~,ul,ul~ ;

alf= (alf+alfl)*cl; for (i=l; i s) s=x;

\

1

else' s=sqrt (vecvec(mO,m, 0,c,c)) *hstart = (*eta)/s;

;

do (:

/ * difference scheme * / hi=1.0; sigmal=(*sigma)(*t,mO,m); phil=phi; / * step size * / if (!start) ( / * local error bound * / s=o.o; if (norm == 1) for (j=mO; j s) S=X; else' s=sqrt (vecvec(m0,m,0,u,u) ) ; *eta = (*aeta)(*t,mO,m)+(*reta) (*t,mO,m)*s; if (start) hl=h2=hacc=(*hstart);


ec2=ecl=l.0; kid; startd; ) else if (kl < 3) ( hacc=pow( (*eta)/ (*rho),l.O/q)*h2; if (hacc > lO.O*hZ) hacc=lO.O*h2; else kl++ ; ) else { a= (hO*(ec2-ecl)-hl* (ecl-ecO)) / (h2*h0-hl*hl); h=h2* ( (*eta c *rho) ? pow( (*eta)/ (*rho),l.O/q) if ( a > 0 . 0 ) ( b=(ec2-ecl-a*(h2-hl))/hl;

:

alfa) ;

hacc=O.0; hmax=h; / * find zero * / bO=hacc; fb=pow (hacc,q)* (a*hacc+b*(*t)+cc)- (*eta aO=hacc=h; fa=pow (hacc,q) * (a*hacc+b*(*t)+cc)- (*eta cO=aO ; fc=fa; ext=O; extrapolate=l; while (extrapolate) { if (fabs(fc) c fabs(fb)) ( if (cO ! = aO) { dO=aO; fd=fa; 1

I

tol=l.0e-3*h2; mm=(cO+bO)*0.5; mb=mm-b0; if (fabs(mb) > tol) { if (ext > 2) w=mb; else ( if (mb == 0.0) tol=O.O; else if (mb c 0.0) to1 = -tol; p0= (b0-a0)*fb; if (ext c = 1) qO=fa-fb; else { fdb= (fd-fb)/ (do-bO); fda= (fd-fa)/ (do-a0); pO *= fda; qO=fdb*fa-fda*fb; ) if (PO c 0.0) ( po = -PO; qo = -qo; 1

) dO=aO ; fd=fa; aO=bO ; fa=fb; hacc = bO += w; fb=pow (hacc,q) * ( (a*hacc+b* (*t)+cc) - (*eta); if ((fc >= 0.0) ? (fb >= 0.0) : (fb c= 0.0)) ( co=aO:


fc=fa; ext=O; ) else ext = (w == mb) ? 0 } else break;

ext+l;

:

h=c0; if (!((fc >= 0.0) ? (fb c = 0.0) hacc=hmax; } else hacc=h; if (hacc c 0.5*h2) hacc=0.5*h2;

:

(fb >= 0.0)))

1 if (hacc c hmin) hacc=hmin; if~~(h*$igmal> 1.0) ( a=fabs ( (*diameter)(*t,mO,m)/ S ~ ~ ~ ~ ~ + F L T - E P S I/2.0; LON) b=2.0*fabs (sin(phil) ; if (hstab c 1.0e-14: (*t)) break; if (h > hstab) h=hstab; hcr=h2*h2/hl; if (kl > 2 && fabs(h-hcr) c FLT-EPSILON*hcr) h = (h c hcr) ? hcr* (1.0-FLT-EPSILON) : hcr*(l.O+FLT-EPSILON); if ((*t)+h > te) { last=l; *hstart = h; h=te- (*t);

1

hO=hl; hl=h2; h2=h; / * coefficient * / b=h*sigmal; bl=b*cos (phil); bb=b*b; if (fabs(b) c 1.0e-3) ( beta2=0.5-bb/24.0; beta3=1.0/6.0+b1/12.0; betha[31 =O.5+bl/3.O; } else if (bl c -40.0) ( beta2=(-2.0*b1-4.0*bl*bl/bb+l.O)/bb; beta3=(1.0+2.0*bl/bb)/bb;

betha [3]=l.0/bb; } else { e=exp (bl)/bb; b2=b*sin(phil); beta2=(-2.0*b1-4.0*bl*bl/bb+l.O)/bb; beta3=(1.0+2.0*bl/bb)/bb;

if (fabs(b2/b) c 1.0e-5) ( beta2 - = e* (bl-3.0); beta3 += e* (bl-2.0) /bl; betha [3]=l.O/bb+e*(b1-1.0); ) else { beta2 - = e*sin(b2-3.0*phil)/b2*b; beta3 += e*sin (b2-2.0*phil)/b2; betha [3]=l.O/bb+e*sin(b2-phil)/b2*b;

1

1

beta [ll =betha [ll =l.0; beta [21=beta2; beta [31=beta3; betha [21=l.0-bb*beta3; b=fabs (b); q = (b c 1.5) ? 4.0-2.0*b/3.0 : ((b c 6.0) ? (30.0-2.0*b)/9.0: 2.0); for (i=l; ic=3; i++) ( hi *= h; ; if (i > 1) (*derivative)(*t,mO,m,i,c) / * local error construction * /


if (i =P 1) inivec(m0,m,r0,0.0); if (i < 4) elmvec(mO,rn,O,ro,c,betha[il*hi); if (i = s 4) { elmvec(mO,m,O,ro,c,-h); s=o.o; if (norm == 1) for (j=mO; j S) S=X; J

else s=sqrt(vecvec(mO,m,0,ro,ro)) *rho=s; ecO=ecl; ecl=ec2; ec2= (*rho)/pow (h,q) ;

;

1

elmvec (mO,m,O,u,c,beta[il *hi);

I

t2= (*t); (*k)++; if (last) ( last=O; (*t)=te; start=l; ) else (*t) += h; dupvec (m0,m,0 ,c ,u) ; (*derivative)(*t,mO,rn, 1,c) ; / * local error construction * / elm~ec(mO,m,0,r0,c,-h); s=o.o; if (norm == 1) for (j=mO; j s) S=X; 1

else s=sqrt(vecvec(m0,m,0,ro,ro)) ; *rho=s; ecO=ecl; ecl=ec2; ec2= (*rho)/pow (h,q) ; (*out)(*t,te,mO,m,u, *k,*eta,*rho); ) while (*t ! = te) ; f ree-real-vector (ro,mO);

1

5.4.4 Second order - No derivatives right hand side

Solves an initial value problem for a single second order ordinary differential equation d²y/dx² = f(x,y,dy/dx), from x=a to x=b, y(a) and dy(a)/dx being given, by means of a 5-th order Runge-Kutta method with steplength and error control [Z64].

Function Parameters:
void rk2 (x,a,b,y,ya,z,za,fxyz,e,d,fi)
x: float *; the independent variable;
a: float; entry: the initial value of x;
b: float; entry: the end value of x, (b <= a is allowed);
y: float *; the dependent variable; exit: the value of y(x) at x = b;
ya: float; entry: the initial value of y at x=a;
z: float *; the derivative dy/dx; exit: the value of z(x) at x = b;
za: float; entry: the initial value of dy/dx at x=a;
fxyz: float (*fxyz)(x,y,z); the right hand side of the differential equation; fxyz depends on x, y, z, giving the value of d²y/dx²;
e: float e[1:4]; entry: e[1] and e[3] are used as relative, e[2] and e[4] are used as absolute tolerances for y and dy/dx, respectively;
d: float d[1:5]; exit:
d[1]: the number of steps skipped;
d[2]: the last step length used;
d[3]: equal to b;
d[4]: equal to y(b);
d[5]: equal to dy/dx, for x = b;
fi: int; entry: if fi is nonzero then the integration starts at x=a with a trial step b-a; if fi is zero then the integration is continued with, as initial conditions, x=d[3], y=d[4], z=d[5], and a, ya and za are ignored.
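The parameter list above treats the second order equation as a pair (y, z) with z = dy/dx, and fxyz delivers d²y/dx². The sketch below illustrates that reformulation with the classical fixed-step 4-th order Runge-Kutta method. This is a simpler scheme swapped in for illustration only, not the 5-th order method with step control used by rk2. The names rhs_fn, rk4_second_order and oscillator are hypothetical, and double replaces float.

```c
#include <assert.h>

/* Hypothetical right hand side type: returns d2y/dx2 = f(x, y, z),
   where z = dy/dx. */
typedef double (*rhs_fn)(double x, double y, double z);

/* Integrates y'' = f(x,y,y') from a to b in a fixed number of steps
   with the classical 4-th order Runge-Kutta method applied to the
   first order system y' = z, z' = f(x,y,z). On entry *y and *z hold
   y(a) and y'(a); on exit they hold the approximations at b. */
void rk4_second_order(double a, double b, int steps,
                      double *y, double *z, rhs_fn f)
{
    double h, x;
    int i;

    assert(steps > 0);
    h = (b - a) / steps;
    x = a;
    for (i = 0; i < steps; i++) {
        double k1y = *z;
        double k1z = f(x, *y, *z);
        double k2y = *z + 0.5 * h * k1z;
        double k2z = f(x + 0.5 * h, *y + 0.5 * h * k1y, *z + 0.5 * h * k1z);
        double k3y = *z + 0.5 * h * k2z;
        double k3z = f(x + 0.5 * h, *y + 0.5 * h * k2y, *z + 0.5 * h * k2z);
        double k4y = *z + h * k3z;
        double k4z = f(x + h, *y + h * k3y, *z + h * k3z);

        *y += h * (k1y + 2.0 * k2y + 2.0 * k3y + k4y) / 6.0;
        *z += h * (k1z + 2.0 * k2z + 2.0 * k3z + k4z) / 6.0;
        x = a + (i + 1) * h;
    }
}

/* Example: y'' = -y, whose solution with y(0)=1, y'(0)=0 is cos(x). */
double oscillator(double x, double y, double z)
{
    (void)x;
    (void)z;
    return -y;
}
```

Starting from y = 1, z = 0 and integrating from 0 to 1 in 100 steps, the sketch returns values close to cos(1) and -sin(1). A routine with step control such as rk2 would instead choose the step lengths itself from the tolerances in e.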

void rk2(float *x, float a, float b, float *y, float ya, float *z,
         float za, float (*fxyz)(float, float, float),
         float e[], float d[], int fi)
{
    int last,first,reject,test,ta,tb;
    float e1,e2,e3,e4,xl,yl,zl,h,ind,hmin,hl,absh,k0,k1,k2,k3,k4,
          k5,discry,discrz,toly,tolz,mu,mu1,fhy,fhz;

    if (fi) {
        d[3]=a;
        d[4]=ya;
        d[5]=za;
    }
    d[1]=0.0;
    xl=d[3];
    yl=d[4];
    zl=d[5];
    if (fi) d[2]=b-d[3];
    absh=h=fabs(d[2]);
    if (b-xl < 0.0) h = -h;
    ind=fabs(b-xl);
    hmin=ind*e[1]+e[2];
    hl=ind*e[3]+e[4];
    if (hl < hmin) hmin=hl;
    e1=e[1]/ind;
    e2=e[2]/ind;
    e3=e[3]/ind;
    e4=e[4]/ind;
    first=1;
    /* the initialization of test and last is missing from the extracted
       listing; it is reconstructed here from the way the loop uses them */
    test=1;
    if (fi) {
        last=1;
        test=0;
    }
    while (1) {
        if (test) {
            absh=fabs(h);
            if (absh < hmin) {
                h = (h > 0.0) ? hmin : -hmin;
                absh=hmin;
            }
            ta=(h >= b-xl);
            tb=(h >= 0.0);
            if ((ta && tb) || (!(ta || tb))) {
                d[2]=h;
                last=1;
                h=b-xl;
                absh=fabs(h);
            } else
                last=0;
        }
        test=1;
        *x=xl;
        *y=yl;
        *z=zl;
        k0=(*fxyz)(*x,*y,*z)*h;
        *x=xl+h/4.5;
        *y=yl+(zl*18.0+k0*2.0)/81.0*h;
        *z=zl+k0/4.5;
        k1=(*fxyz)(*x,*y,*z)*h;
        *x=xl+h/3.0;
        *y=yl+(zl*6.0+k0)/18.0*h;
        *z=zl+(k0+k1*3.0)/12.0;
        k2=(*fxyz)(*x,*y,*z)*h;
        *x=xl+h*0.5;
        *y=yl+(zl*8.0+k0+k2)/16.0*h;
        *z=zl+(k0+k2*3.0)/8.0;
        k3=(*fxyz)(*x,*y,*z)*h;
        *x=xl+h*0.8;
        *y=yl+(zl*100.0+k0*12.0+k3*28.0)/125.0*h;
        *z=zl+(k0*53.0-k1*135.0+k2*126.0+k3*56.0)/125.0;
        k4=(*fxyz)(*x,*y,*z)*h;
        *x = (last ? b : xl+h);
        *y=yl+(zl*336.0+k0*21.0+k2*92.0+k4*55.0)/336.0*h;
        *z=zl+(k0*133.0-k1*378.0+k2*276.0+k3*112.0+k4*25.0)/168.0;
        k5=(*fxyz)(*x,*y,*z)*h;
        discry=fabs((-k0*21.0+k2*108.0-k3*112.0+k4*25.0)/56.0*h);
        discrz=fabs(k0*21.0-k2*162.0+k3*224.0-k4*125.0+k5*42.0)/14.0;
        toly=absh*(fabs(zl)*e1+e2);
        tolz=fabs(k0)*e3+absh*e4;
        reject=(discry > toly || discrz > tolz);
        fhy=discry/toly;
        fhz=discrz/tolz;
        if (fhz > fhy) fhy=fhz;
        mu=1.0/(1.0+fhy)+0.45;
        if (reject) {
            if (absh <= hmin) {
                /* a step of minimal length is skipped */
                d[1] += 1.0;
                *y=yl;
                *z=zl;
                first=1;
                if (b == *x) break;
                xl = *x;
                yl = *y;
                zl = *z;
            } else
                h *= mu;
        } else {
            if (first) {
                first=0;
                hl=h;
                h *= mu;
            } else {
                fhy=mu*h/hl+mu-mu1;
                hl=h;
                h *= fhy;
            }
            mu1=mu;
            *y=yl+(zl*56.0+k0*7.0+k2*36.0-k4*15.0)/56.0*hl;
            *z=zl+(-k0*63.0+k1*189.0-k2*36.0-k3*112.0+k4*50.0)/28.0;
            k5=(*fxyz)(*x,*y,*z)*hl;
            *y=yl+(zl*336.0+k0*35.0+k2*108.0+k4*25.0)/336.0*hl;
            *z=zl+(k0*35.0+k2*162.0+k4*125.0+k5*14.0)/336.0;
            if (b == *x) break;
            xl = *x;
            yl = *y;
            zl = *z;
        }
    }
    /* the closing statements are missing from the extracted listing; they
       are reconstructed from the documented exit values of d[2],...,d[5] */
    if (!last) d[2]=h;
    d[3] = (*x);
    d[4] = (*y);
    d[5] = (*z);
}

B. rk2n

Solves an initial value problem for a system of second order ordinary differential equations d²y_j(x)/dx² = f_j(x,y,dy(x)/dx), (j=1,2,...,n), from x=a to x=b, y(a) and dy(a)/dx being given, by means of a 5-th order Runge-Kutta method [Z64]. Upon completion of a call of rk2n we have: x=d[3]=b, y[j]=d[j+3], the value of the dependent variables for x=b, z[j]=d[n+j+3], the value of the derivatives of y[j] at x=b, j=1,...,n. rk2n uses as its minimal absolute step length hmin = min(e[2*j-1]*k+e[2*j]) with 1 ≤ j ≤ 2*n and k = |b - (if fi is nonzero then a else d[3])|. If a step of length |h| ≤ hmin is rejected then a step sign(h)*hmin is skipped. A step is rejected if the absolute value of the computed discretization error is greater than (|z[j]|*e[2*j-1]+e[2*j])*|h|/k or if that term is greater than (|fxyz[j]|*e[2*(j+n)-1]+e[2*(j+n)])*|h|/k, for any value of j, 1 ≤ j ≤ n.

En,?) - En,?) = (tn,YO- t n , T l k ) ~ n The functions 4,, and 4,, (n=l, ...JV, l=l,...,k-1; n=N, I=l,...,k) above are taken to be respectively and zero polynomials of degree k over the interval I, and In (n=l, elsewhere in [a,b]; 4,, (n=1, ...,N- 1) is a polynomial of degree k over In, another such polynomial over In+,, and is zero elsewhere in [a,b]. These polynomials are fixed by the conditions ~ o @ l , o o=)4,(a) = 1, 4,@,,,") = 0 (v=l ,...,k) (O 5 v 5 k ( ~z I)) 44,@n.,F!, =0 n n = for n=1, ...,N, I=l, ...,k-I; n=N, k l , ...,k, and +n,k@n,kfk)) = I 4n,k@n,,fi!, = 0 (v=O, ...,k-I), 4n,k@n+l,?))= 0 (v=I,...,k)


for n=1,

...a1. Setting

where the superscripts (v) and (τ,v) imply that the terms corresponding to r=v and r=τ, r=v are to be omitted, the polynomials $\phi_{n,l}$ are given explicitly by the formulae

+,,

(X E I,;

I < k)

Their derivatives at the displaced Radau-Lobatto points lying in the intervals over which they are not identically zero are given by D+n,/@n,,F!, = '(v,l) An (1 < k) Wn,,@n,,F!, = '(v,k) / An, D$n,,@n+,,Fq = X'(v,O) An+l. Over intervals in which the + , , are zero, the derivative of +,,, is, of course, also zero. The numbers X(v), X1(v,l) are problem independent; they may be, and have been, computed in advance. The first of the boundary conditions (2) implies that


When n=1,...,N-1, l=1,...,k; n=N, l=1,...,k-1, the expression denoted by square brackets in formula (5) vanishes, since $\phi_{n,l}(a) = \phi_{n,l}(b) = 0$. With n=1, the following term is equal to

for l=1,...,k; when n > 1, this term vanishes. For n=1,...,N, l=1,...,k-1 the next term is equal to

and when n=1,...,N-1, l=k it is equal to the sum of the above expression with l=k and

The value of the expression upon the right hand side of equation (5) is ~ p ' f @ ~ , p q for + A,,+,Jfor n=1, ...,N-1, n=l, I=0,...,k, and n=1, ...,N, l=l, ...,k-1, and is equal to w,,@'~@~,,@))(A, Z=k. The second of the boundary conditions (2) implies that

In summary, the two boundary conditions (2) and Nk-1 equations derived from equation (3) with n=1, ...,N-1, I=l, ...,k and n=N, I=l, ...,k-1 yield Nk+l linear equations for the Nk+l coefficients a, and amj(m=1, ...A,j=l, ...,k). The first k-1 of these equations involve a, and aij(j=l, ...,k) alone. By transferring the terms involving a, and a,,, to the right hand side, and premultiplying by the inverse of the matrix multiplying the vector (a,,,,...,al,k~,),this vector my be isolated. Its components may be eliminated from the next equation which involves (j=l ,...,k) alone. (j=l ,...,k). The following k-1 equations involve the two sets a I j and The vector (a,,, ...,a,,,) may again be isolated and its components eliminated from the equation from which a,,,...,a,,,, have been eliminated, and also from the next equation which involves the two sets a,j, ajj (j=l, ...,k). The process may be continued, and a tridiagonal system of equations for a, and a,, (m=1,...,PI) is obtained. The process of


elimination may be carried out as soon as each complete set of equations involving $a_{m,1},\ldots,a_{m,k-1}$ has been constructed (computer storage space is thereby saved). When k=1, no elimination takes place; when k=2 and 3 (the other two values of k permitted by femlagsym) the matrix elimination process is particularly simple and is programmed independently (rather than by calls of matrix operation procedures).

The tridiagonal system of equations is solved by a method of Babuska [see Bab72]. Stripping some redundant sequences from the original exposition, the set of equations Ay=f, where

$$A_{i+1,i} = c_i, \quad A_{i,i+1} = b_i \quad (i=1,\ldots,n-1),$$
$$A_{1,1} = \tau_1 - b_1, \quad A_{i,i} = \tau_i - c_{i-1} - b_i \quad (i=2,\ldots,n-1), \quad A_{n,n} = \tau_n - c_{n-1},$$

may be solved by means of the construction of four sequences $g_i$, $\chi_i$, $g_i^*$ and $\chi_i^*$ as follows: with $\chi_1 = \tau_1$, $g_1 = f_1$,

$$g_{i+1} = f_{i+1} - g_i c_i/(\chi_i - b_i), \quad \chi_{i+1} = \tau_{i+1} - \chi_i c_i/(\chi_i - b_i) \quad (i=1,\ldots,n-1)$$

and with $\chi_n^* = \tau_n$, $g_n^* = f_n$,

$$g_i^* = f_i - b_i g_{i+1}^*/(\chi_{i+1}^* - c_i), \quad \chi_i^* = \tau_i - b_i \chi_{i+1}^*/(\chi_{i+1}^* - c_i) \quad (i=n-1,\ldots,1)$$

and then

$$y_i = (g_i + g_i^* - f_i)/(\chi_i + \chi_i^* - \tau_i) \quad (i=n,\ldots,1).$$

The above process can be further economized (three one-dimensional arrays, with another to contain the $y_i$, are required). The coefficients $a_0$, $a_{n,k}$ (n=1,...,N) obtained by means of the above process are, in order, the required approximations $y_n$ (n=0,...,N) (since $\phi_{m,k}(x_n) = \delta_{m,n}$).
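The two sweeps described above can be sketched in a few lines of C. This is an illustration under stated assumptions, not the code used by femlagsym: the function name babuska_solve is hypothetical, arrays are 0-based, double precision is used, both sweeps are stored or accumulated without the storage economies mentioned in the text, and no safeguard against a vanishing denominator is included.

```c
#include <assert.h>
#include <stdlib.h>

/* Solves A y = f, where (0-based indices)
     A[i+1][i] = c[i],  A[i][i+1] = b[i]        (i = 0..n-2),
     A[0][0]     = tau[0] - b[0],
     A[i][i]     = tau[i] - c[i-1] - b[i]       (i = 1..n-2),
     A[n-1][n-1] = tau[n-1] - c[n-2].
   Returns 0 on success, -1 if workspace cannot be allocated. */
int babuska_solve(int n, const double tau[], const double b[],
                  const double c[], const double f[], double y[])
{
    double *chi, *g, chis, gs, pp;
    int i;

    assert(n >= 2);
    chi = malloc((size_t)n * sizeof *chi);
    g = malloc((size_t)n * sizeof *g);
    if (chi == NULL || g == NULL) {
        free(chi);
        free(g);
        return -1;
    }
    /* forward sweep: sequences chi and g */
    chi[0] = tau[0];
    g[0] = f[0];
    for (i = 0; i < n - 1; i++) {
        pp = c[i] / (chi[i] - b[i]);
        g[i+1] = f[i+1] - g[i] * pp;
        chi[i+1] = tau[i+1] - chi[i] * pp;
    }
    /* backward sweep: starred sequences, combined with the forward
       sequences into y as soon as each value is available */
    chis = tau[n-1];
    gs = f[n-1];
    y[n-1] = (g[n-1] + gs - f[n-1]) / (chi[n-1] + chis - tau[n-1]);
    for (i = n - 2; i >= 0; i--) {
        pp = b[i] / (chis - c[i]);
        gs = f[i] - gs * pp;
        chis = tau[i] - chis * pp;
        y[i] = (g[i] + gs - f[i]) / (chi[i] + chis - tau[i]);
    }
    free(chi);
    free(g);
    return 0;
}
```

For example, with tau = {1,1,1,1}, b = c = {-1,-1,-1} and f = {0,2,3,5} the matrix has diagonal 2, 3, 3, 2 and off-diagonals -1, and the sketch returns y = {1,2,3,4}.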


Function Parameters:
void femlagsym (x,y,n,p,r,f,order,e)
x: float x[0:n];
    entry: a = x_0 < x_1 < ... < x_n = b is a partition of the interval [a,b];
y: float y[0:n];
    exit: y[i], i=0,1,...,n, is the approximate solution at x[i] of the differential equation (1) with boundary conditions (2);
n: int;
    entry: the upper bound of the arrays x and y (value of N above), n > 1;
p: float (*p)(x);
    the procedure to compute p(x), the coefficient of Dy(x) in equation (1);
r: float (*r)(x);
    the procedure to compute r(x), the coefficient of y(x) in equation (1);
f: float (*f)(x);
    the procedure to compute f(x), the right hand side of equation (1);
order: int;
    entry: order denotes the order of accuracy required for the approximate solution of the differential equation; let h = max(x[i]-x[i-1]), then |y[i] - y(x[i])| ≤ c*h^order, i=0,...,n; order can be chosen equal to 2, 4 or 6 only;
e: float e[1:6];
    entry: e[1],...,e[6] describe the boundary conditions (values of e_i, i=1,...,6, in (2)); e[1] and e[4] are not allowed to vanish both.
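The entry conditions in this parameter list are not checked by the routine itself, so a caller may wish to verify them first. The following is a hedged sketch of such a check; the function name check_femlagsym_args and its return codes are hypothetical and not part of the library. As in the listings, e is indexed from 1 to 6 (so the caller passes an array of at least 7 elements) and x from 0 to n.

```c
#include <assert.h>
#include <stddef.h>

/* Returns 0 when the documented entry conditions hold, otherwise a small
   positive code identifying the first violated condition:
     1: n > 1 is violated
     2: x[0] < x[1] < ... < x[n] is violated
     3: order is not 2, 4 or 6
     4: e[1] and e[4] both vanish */
int check_femlagsym_args(const float x[], int n, int order, const float e[])
{
    int i;

    assert(x != NULL && e != NULL);
    if (n <= 1) return 1;
    for (i = 1; i <= n; i++)
        if (!(x[i-1] < x[i])) return 2;
    if (order != 2 && order != 4 && order != 6) return 3;
    if (e[1] == 0.0f && e[4] == 0.0f) return 4;
    return 0;
}
```

A caller would typically invoke the check once before calling femlagsym and treat any nonzero result as a usage error.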

void femlagsym(float x[], float y[], int n, float (*p)(float),
               float (*r)(float), float (*f)(float), int order, float e[])


{
    float *allocate_real_vector(int, int);
    void free_real_vector(float *, int);
    int l,l1;
    float xl1,xl,h,a12,b1,b2,tau1,tau2,ch,tl,g,yl,pp,p1,p2,p3,p4,
          r1,r2,r3,r4,f1,f2,f3,f4,e1,e2,e3,e4,e5,e6,*t,*sub,*chi,*gi,
          h2,x2,h6,h15,b3,tau3,c12,c32,a13,a22,a23,x3,h12,h24,det,
          c13,c42,c43,a14,a24,a33,a34,b4,tau4,aux;

1=1; xl=x 101 ; el=e [ll ; e2=e [21 ; e3=e [31 ; e4=e [41 ; e5=e [51 ; e6=e [61 ; while (1 c = n ) { 11=1-1; xll=xl; xl=x [ll ; h=xl-xll; if (order == 2) { / * element mat vec evaluation 1 * / if (1 == 1) { p2= (*p)( ~ 1 1;) r2= (*r)(xll); f2= (*f)(xl1);

h2=h/2.0; bl=h2*fl; b2=h2*f2; taul=h2*rl; tau2=h2*r2; a12 = - 0.5*(pl+p2)/h; } else if (order == 4) { / * element mat vec evaluation 2 * / if (1 == 1) ( p3= (*p)(xll); r3= (*r)(xll); 1

f 3 = ( * f ) ( x l l );

h6=h/6.0; h15=h/1.5; pl=p3; p2=(*p) (x2 p3= (*p)(xl rl=r3; r2= (*r)(x2 r3= (*r)(xl fl=f3 ; f2=(*f)(x2 f3= (*f)(xl bl=h6*f1; b2=h15*f2; b3=h6*f3:


a23 = - (pl/3.O+p3)*2.0/h; c12 = -a12/a22; c32 = -a23/a22; a12=a13+c32*a12; bl += c12*b2; b2=b3+c32*b2; taul += c12*tau2; tau2=tau3+c32*tau2; } else { / * element mat vec evaluation 3 * / if (1 == 1) { pe= (*p) ( x n ); r4= (*r) (xll); f4= (*f) (xl1); I

I

            x2=xl1+0.27639320225*h;
            x3=xl-x2+xl1;
            h12=h/12.0;
            h24=h/2.4;
            p1=p4;
            p2=(*p)(x2);
            p3=(*p)(x3);
            p4=(*p)(xl);
            r1=r4;
            r2=(*r)(x2);
            r3=(*r)(x3);
            r4=(*r)(xl);
            f1=f4;
            f2=(*f)(x2);
            f3=(*f)(x3);
            f4=(*f)(xl);
            b1=h12*f1;
            b2=h24*f2;
            b3=h24*f3;
            b4=h12*f4;
            tau1=h12*r1;
            tau2=h24*r2;
            tau3=h24*r3;
            tau4=h12*r4;
            a12 = -(4.04508497187450*p1+0.57581917135425*p3+
                    0.25751416197911*p4)/h;
            a13=(1.5450849718747*p1-1.5075141619791*p2+
                    0.6741808286458*p4)/h;
            a14=((p2+p3)/2.4-(p1+p4)/2.0)/h;
            a22=(5.454237476562*p1+p3/0.48+0.79576252343762*p4)/h+tau2;
            a23 = -(p1+p4)/(h*0.48);
            a24=(0.67418082864575*p1-1.50751416197910*p3+
                    1.54508497187470*p4)/h;
            a33=(0.7957625234376*p1+p2/0.48+5.454237476562*p4)/h+tau3;
            a34 = -(0.25751416197911*p1+0.57581917135418*p2+
                    4.0450849718747*p4)/h;
            det=a22*a33-a23*a23;
            c12=(a13*a23-a12*a33)/det;
            c13=(a12*a23-a13*a22)/det;
            c42=(a23*a34-a24*a33)/det;
            c43=(a24*a23-a34*a22)/det;
            tau1 += c12*tau2+c13*tau3;
            tau2=tau4+c42*tau2+c43*tau3;
            a12=a14+c42*a12+c43*a13;
            b1 += c12*b2+c13*b3;
            b2=b4+c42*b2+c43*b3;
        }
        if (l == 1 || l == n) {
            /* boundary conditions */
            if (l == 1 && e2 == 0.0) {
                tau1=1.0;
                b1=e3/e1;
                b2 -= a12*b1;
                tau2 -= a12;
                a12=0.0;
            } else if (l == 1 && e2 != 0.0) {
                aux=p1/e2;
                tau1 -= aux*e1;
                b1 -= e3*aux;
            } else if (l == n && e5 == 0.0) {
                tau2=1.0;
                b2=e6/e4;
                b1 -= a12*b2;
                tau1 -= a12;
                a12=0.0;
            } else if (l == n && e5 != 0.0) {
                aux=p2/e5;
                tau2 += aux*e4;
                b2 += aux*e6;
            }
        }

        /* forward babushka */
        if (l == 1) {
            chi[0]=ch=tl=tau1;
            t[0]=tl;
            gi[0]=g=yl=b1;
            y[0]=yl;
            sub[0]=a12;
            pp=a12/(ch-a12);
            ch=tau2-ch*pp;
            g=b2-g*pp;
            tl=tau2;
            yl=b2;
        } else {
            chi[l1] = ch += tau1;
            gi[l1] = g += b1;
            sub[l1]=a12;
            pp=a12/(ch-a12);
            ch=tau2-ch*pp;
            g=b2-g*pp;
            t[l1]=tl+tau1;
            tl=tau2;
            y[l1]=yl+b1;
            yl=b2;
        }
        l++;
    }
    /* backward babushka */
    pp=yl;
    y[n]=g/ch;
    g=pp;
    ch=tl;
    l=n-1;
    while (l >= 0) {
        pp=sub[l];
        pp /= (ch-pp);
        tl=t[l];
        ch=tl-ch*pp;
        yl=y[l];
        g=yl-g*pp;
        y[l]=(gi[l]+g-yl)/(chi[l]+ch-tl);
        l--;
    }
    free_real_vector(t,0);
    free_real_vector(sub,0);
    free_real_vector(chi,0);
    free_real_vector(gi,0);
}

B. femlag

Solves a second order self-adjoint linear two point boundary value problem by means of Galerkin's method with continuous piecewise polynomials [Bab72, BakH76, He75, StF73]. femlag computes approximations y_n (n=0,...,N) to the values at the points x = x_n, where -∞ < a = x_0 < x_1 < ... < x_N = b < ∞, of the real valued function y(x) which satisfies the equation

$$-D^2y(x) + r(x)y(x) = f(x) \qquad (1)$$

where D=d/dx, with boundary conditions

$$e_1 y(a) + e_2 Dy(a) = e_3, \quad e_4 y(b) + e_5 Dy(b) = e_6 \quad (e_1, e_4 \neq 0). \qquad (2)$$

It is assumed that r(x) ≥ 0 (a ≤ x ≤ b); if e_2 = e_5 = 0, this condition may be relaxed to r(x) > -(π/(b-a))² (a ≤ x ≤ b). The general theory underlying the method used is that described in femlagsym. Now, however, (since p(x) in equation (1) of femlagsym is unity) the term

in expression (7) of femlagsym becomes

The value of the sum in this expression is problem independent and may be determined in advance (a similar remark holds with respect to expression (6) in femlagsym); the computations may be simplified.

Function Parameters:
void femlag (x,y,n,r,f,order,e)
x: float x[0:n];
    entry: a = x_0 < x_1 < ... < x_n = b is a partition of the segment [a,b];
y: float y[0:n];
    exit: y[i], i=0,1,...,n, is the approximate solution at x[i] of the differential equation (1) with boundary conditions (2);
n: int;
    entry: the upper bound of the arrays x and y (value of N above), n > 1;
r: float (*r)(x);
    the procedure to compute r(x), the coefficient of y(x) in equation (1);
f: float (*f)(x);
    the procedure to compute f(x), the right hand side of equation (1);
order: int;
    entry: order denotes the order of accuracy required for the approximate solution of the differential equation; let h = max(x[i]-x[i-1]), then |y[i] - y(x[i])| ≤ c*h^order, i=0,...,n; order can be chosen equal to 2, 4 or 6 only;
e: float e[1:6];
    entry: e[1],...,e[6] describe the boundary conditions (values of e_i, i=1,...,6, in (2)); neither e[1] nor e[4] is allowed to vanish.

void femlag(float x[], float y[], int n, float (*r)(float),
            float (*f)(float), int order, float e[])

{
    float *allocate_real_vector(int, int);
    void free_real_vector(float *, int);
    int l,l1;
    float xl1,xl,h,a12,b1,b2,tau1,tau2,ch,tl,g,yl,pp,e1,e2,e3,e4,e5,
          e6,*t,*sub,*chi,*gi,f2,r2,r1,f1,h2,r3,f3,x2,h6,h15,b3,tau3,
          c12,a13,a22,a23,r4,f4,x3,h12,h24,det,c13,c42,c43,a14,a24,
          a33,a34,b4,tau4;

1=1; xl=x[O] ; el=e [ll ; e2=e [21 ; e3=e 131 ; e4=e [41 ; e5=e [51 ; e6=e 161 ; while (1 c = n) { 11=1-1; xll=xl; xl=x [ll ; h=xl-xll; if (order == 2) { / * element mat vec evaluation 1 * / if (1 == 1) ( f2=(*f)(xl1); r2= (*r)(xll); I

a12 = -l.O/h; h2=h/2.0; rl=r2; r2= (*r)(xl); fl=f2; f2= (*f)(xl); bl=h2*f1; b2=h2*f2; taul=h2*rl; tau2=h2*r2; } else if (order == 4) ( / * element mat vec evaluation 2 * / if (1 == 1) { r3= (*r)(xll); f3= (*f)(xl1);

1

x2= (x11+x1)/2.0; h6=h/6.0; h15=h/1.5; rl=r3; r2= (*r)(x2); r3=(*r) (xl); fl=f3 ; f2=(*f) (x2); f3=(*f)(xl); bl=h6*fl; b2=h15*f2; b3=h6*f3; taul=h6*rl; tau2=h15*r2; tau3=h6*r3; a12 = a23 = -8.O/h/3.0; a13 = -a12/8.0; a22 = -2.O*a12+tau2; cl2 = -a12/a22; a12=a13+c12*a12; b2 * = ~ 1 2 ; bl += b2; b2 += b3; tau2 *= c12; taul += tau2; tau2=tau3+tau2; } else ( / * element mat vec evaluation 3 * /


if (1 == 1) { r4= (*r)(xll); f4= (*f)(xl1);

1

x2=~11+0.27639320225*h; x3=xl-x2+x11; rl=r4; r2= (*r)(x2); r3= (*r)(x3); r4= (*r)(xl); fl=f4; f2=(*f) (x2); f3=(*f) (x3); f4=(*f) (xl); h12=h/12.0; h24=h/2.4; bl=hl2*fl; b2=h24*£2; b3=h24*£3; b4=h12*£4; taul=hl2*rl; tau2=h24*r2; tau3=h24*r3; tau4=h12*r4; a12 = a34 = -4.8784183052078/h; al3=a24=0.7117516385412/h; a14 = -0.16666666666667/h;

a23=25.O*al4; a22 = -2.O*a23+tau2; a33 = -2.O*a23+tau3; det=a22*a33-a23*a23; c12=(a13*a23-a12*a33)/det; c13=(a12*a23-a13*a22)/det; c42=(a23*a34-a24*a33)/det; c43= (a24*a23-a34*a22)/det; taul += c12*tau2+~13*tau3; tau2=tau4+~42*tau2+~43*tau3; a12=a14+~42*a12+~43*a13;

bl += c12*b2+c13*b3; bZ=b4+~42*b2+~43*b3; (1 == 1 I I 1 == n) { / * boundary conditions * / if (1 == 1 && e2 == 0.0) ( taukl.0; bl=e3/el; b2 - = a12*bl: tau2 - = a12; a12=0.0; ) else if (1 == 1 && e2 ! = 0.0) ( taul - = el/e2; bl - = e3/e2; } else if (1 == n && e5 == 0.0) { tau2=l.0; b2=e6/e4; bl - = a12*b2; taul - = a12; a12=0.0; } else if (1 == n &&e5 ! = 0.0) ( tau2 += e4/e5; b2 += e6/eS;

1

1

/ * forward babushka * /

if (1 == 1) { chi [O]=ch=tl=taul; t [Ol=tl; gi [OI=g=yl=bl; y [Ol=yl; sub [Ol=a12; pp=al2/ (ch-al2); ch=tau2-ch*pp; g=b2-g*pp; tl=tau2:


yl=b2 ; ) else { chi 1111 = ch += taul; gi [lll = g += bl; sub [lll=al2; pp=al2/ (ch-al2); ch=tau2-ch*pp; g=b2-g*pp; t 1111 =tl+taul; tl=tau2; y [lll =yl+bl; yl=b2; 1

)* backward babushka * /

l=n-1; while ( 1 z = 0) { pp=sub 111 ; PP / = (ch-pp); tl=t [ll :

    free_real_vector(t,0);
    free_real_vector(sub,0);
    free_real_vector(chi,0);
    free_real_vector(gi,0);
}

C. femlagspher

Solves a second order self-adjoint linear two point boundary value problem with spherical coordinates by means of Galerkin's method with continuous piecewise polynomials [Bab72, BakH76, He75, StF73]. femlagspher computes approximations y_n (n=0,...,N) to the values at the points x = x_n, where -∞ < a = x_0 < x_1 < ... < x_N = b < ∞, of the real valued function y(x) which satisfies the equation

$$-D(x^{nc}Dy(x))/x^{nc} + r(x)y(x) = f(x) \qquad (1)$$

where D=d/dx, with boundary conditions

$$e_1 y(a) + e_2 Dy(a) = e_3, \quad e_4 y(b) + e_5 Dy(b) = e_6 \quad (e_1, e_4 \neq 0). \qquad (2)$$

It is assumed that r(x) and f(x) are sufficiently smooth on [x_0,x_N] except at the grid points; furthermore r(x) should be nonnegative. The solution is approximated by a function which is continuous on the closed interval [x_0,x_N] and a polynomial of degree less than or equal to k on each segment [x_{j-1},x_j] (j=1,...,N). This piecewise polynomial is entirely determined by the values it has at the knots x_j and on k-1 interior knots on each segment [x_{j-1},x_j]. These values are obtained by the solution of an order+1 diagonal linear system with a specially structured matrix. The entries of the matrix and the vector are inner products which are approximated by some piecewise k-point Gaussian quadrature. The evaluation of the matrix and the vector is done segment by segment: on each segment the contributions to the entries of the matrix and the vector are computed and embedded in the global matrix and vector. Since the function values on the interior points of each segment are not coupled with the function values outside that segment, the resulting linear system can be reduced to a tridiagonal system by means of static condensation. The final tridiagonal system, since it is of finite difference type, is solved by means of Babuska's method. For further details, see the documentation of femlagsym.
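Static condensation, as described above, can be sketched for the simplest case of a quadratic element with one interior unknown. The example below is a generic illustration under stated assumptions: the names Condensed, condense3 and recover_interior are hypothetical, the element matrix is taken to be symmetric, and the bookkeeping differs from that of the library listings, which overwrite the element entries in place.

```c
#include <assert.h>

/* Condensed 2x2 system in the two end-point unknowns u1 and u3:
     k11*u1 + k13*u3 = r1
     k13*u1 + k33*u3 = r3 */
typedef struct {
    double k11, k13, k33, r1, r3;
} Condensed;

/* Eliminates the interior unknown u2 from the symmetric element system
     | a11 a12 a13 | |u1|   |b1|
     | a12 a22 a23 | |u2| = |b2|
     | a13 a23 a33 | |u3|   |b3|
   using the multipliers c12 = -a12/a22 and c32 = -a23/a22. */
Condensed condense3(double a11, double a12, double a13,
                    double a22, double a23, double a33,
                    double b1, double b2, double b3)
{
    Condensed s;
    double c12, c32;

    assert(a22 != 0.0);
    c12 = -a12/a22;
    c32 = -a23/a22;
    s.k11 = a11 + c12*a12;
    s.k13 = a13 + c12*a23;
    s.k33 = a33 + c32*a23;
    s.r1 = b1 + c12*b2;
    s.r3 = b3 + c32*b2;
    return s;
}

/* Recovers the interior value once the end-point values are known. */
double recover_interior(double a12, double a22, double a23,
                        double b2, double u1, double u3)
{
    assert(a22 != 0.0);
    return (b2 - a12*u1 - a23*u3)/a22;
}
```

Because the interior unknown of one segment never appears in the equations of another segment, this elimination can be carried out element by element before assembly, which is what leaves a tridiagonal global system.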

Function Parameters:
void femlagspher (x,y,n,nc,r,f,order,e)
x: float x[0:n];
    entry: a = x_0 < x_1 < ... < x_n = b is a partition of the interval [a,b];
y: float y[0:n];
    exit: y[i], i=0,1,...,n, is the approximate solution at x[i] of the differential equation (1) with boundary conditions (2);
n: int;
    entry: the upper bound of the arrays x and y (value of N above), n > 1;
nc: int;
    entry: if nc = 0, Cartesian coordinates are used;
           if nc = 1, polar coordinates are used;
           if nc = 2, spherical coordinates are used;
r: float (*r)(x);
    the procedure to compute r(x), the coefficient of y(x) in equation (1);
f: float (*f)(x);
    the procedure to compute f(x), the right hand side of equation (1);
order: int;
    entry: order denotes the order of accuracy required for the approximate solution of the differential equation; let h = max(x[i]-x[i-1]), then |y[i] - y(x[i])| ≤ c*h^order, i=0,...,n; order can be chosen equal to 2 or 4 only;
e: float e[1:6];
    entry: e[1],...,e[6] describe the boundary conditions (values of e_i, i=1,...,6, in (2)); e[1] and e[4] are not allowed to vanish both.

void femlagspher(float x[], float y[], int n, int nc,
                 float (*r)(float), float (*f)(float), int order, float e[])
{

float *allocate-real-vector(int, int); void free-real-vector(f1oat *, int) ; int 1,ll; float xll,xl,h,al2,bl,b2,taul,tau2,ch,tl,g,yl,pp,tau3,b3,al3,a22, a23,c32,cl2,el,e2,e3,e4,e5,e6,*t,*sub,*chi,*gi,xm,vl,vr,wl, wr,pr,rm,fm,xl2,xlxr,xr2,xlm,xrm,vlm,vrm,wlm,wrm,flm,frm,rlm, rrm,pll,pl2,pl3,prl,pr2,pr3,qll,q12,q13,rlmpll,rlmpl2,rrmprl, rrmpr2 ,vlmqll,vlmql2,vrmqrl,vrmqr2,qrl, qr2, qr3, a,a2, a3,a4, b,b4,p4h,p2,p3,p4,auxl,aux2,a5,a6,a7,a8,b5,b6,b7,b8,ab4, a2b3,a3b2,a4b,p5,p8,p8h,aux,plm,prrn;


e5=e [51 ; e6=e [61 ; while (1 O ) { plm= (xlm-xll) /h; prm= (xrm-xll) /h; aux=2.O*plm-1.0; pll=aux* (plm-1.0); pl3=aux*plm; p12=1.0-pll-p13; aux=2.0*prm-1.0; prl=aux* (prm-1.0); pr3=aux*prm; pr2A.0-prl-pr3; aux=4.0*plm; qll=aux-3.0; ql3=aux-1.0; q12 = -qll-q13; aux=4.0*prm; qrl=aux-3.0; qr3=aux-1.0; qr2 = -qrl-qr3; 1 wlm=h*vlm; wrm=h*vrm; vlm / = h; vrm / = h; flm= (*f)(xlm)*wlm; f rm=wrm* (*f) (xrm); rlm= (*r)(xlm)*wlm; rrm=wrm* (*r)(xrm); taul=pll*rlm+prl*rrm; tau2=pl2*rlm+pr2*rrm; tau3=pl3*rlm+pr3*rrm; bl=pll*flm+prl*frm; b2=pl2*flm+pr2*frm; b3=~13*flm+~r3*frm;


1

bl += c12*b2; b2=b3+~32*b2; taul += c1Zctau2; tauZ=tau3+~32*tau2;

if (1 == 1 I I 1 == n) ( / * boundary conditions * / if (1 == 1 && e2 == 0.0) { taul=l.0; bl=e3/el; b2 - = a12*bl; tau2 - = al2; a12=0.0; ) else if (1 == 1 && e2 ! = aux=((nc == 0) ? 1.0 : bl - = e3*aux; taul - = el*aux; ) else if (1 == n && e5 == tau24.0; b2=e6/e4; bl - = al2*b2; tau1 - = a12; a12=0.0; ) else if (1 == n && e5 ! = aux= ( (nc == 0) ? 1.0 : tau2 += aux*e4; b2 += aux*e6;

0.0) ( pow(x[O],nc))/e2; 0.0) {

0.0) { pow (x[nl , nc) ) /e5;

)* forward babushka * /

if (1 == 1) { chi [OI=ch=tl=taul; t [Ol=tl; gi [Ol=g=yl=bl; y [Ol=yl; sub 101 =al2; pp=al2/ (ch-al2); ch=tau2-ch*pp; g=b2-g*pp; tl=tau2; yl=b2; ) else { chi [lll = ch += taul; gi [lll = g += bl; sub [lll=al2; pp=al2/ (ch-al2); ch=tau2-ch*pp; g=b2-g*pp; t [lll=tl+taul; tl=tau2; y[lll =yl+bl; yl=b2;

j * backward babushka * / P P = Y;~ Y [nl=g/ch; la-1; while (1 >= 0) { pp=sub Ell ; pp / = (ch-pp); tl=t [ll ; ch=tl-ch*pp; yl=y[ll ; g=yl-g*pp; y[l] =(gi [ll+g-yl)/ (chi[ll+ch-tl); I--;

1

    free_real_vector(t,0);
    free_real_vector(sub,0);
    free_real_vector(chi,0);
    free_real_vector(gi,0);
}

5.5.2 Linear methods - Second order skew adjoint

femlagskew

Solves a second order skew-adjoint linear two point boundary value problem by means of Galerkin's method with continuous piecewise polynomials [Bab72, BakH76, He75, StF73]. femlagskew computes approximations y_n (n=0,...,N) to the values at the points x = x_n, where -∞ < a = x_0 < x_1 < ... < x_N = b < ∞, of the real valued function y(x) which satisfies the equation

$$-D^2y(x) + q(x)Dy(x) + r(x)y(x) = f(x) \qquad (1)$$

where D=d/dx, with boundary conditions

$$e_1 y(a) + e_2 Dy(a) = e_3, \quad e_4 y(b) + e_5 Dy(b) = e_6 \quad (e_1, e_4 \neq 0). \qquad (2)$$

It is assumed that
(a) r(x) ≥ 0 (a ≤ x ≤ b); if e_2 = e_5 = 0 then this condition may be relaxed to r(x) > -(π/(b-a))² (a ≤ x ≤ b);
(b) q(x) is not allowed to have very large values in some sense: the product q(x)*(x_j - x_{j-1}) should not be too large on the interval [x_{j-1},x_j], otherwise the boundary value problem may degenerate to a singular perturbation or boundary layer problem, for which either special methods or a suitably chosen grid are needed;
(c) q(x), r(x) and f(x) are required to be sufficiently differentiable on the domain of the boundary value problem; however, the derivatives are allowed to have discontinuities at the grid points, in which case the order of accuracy (2, 4 or 6) is preserved;
(d) if q(x) and r(x) satisfy the inequality r(x) ≥ Dq(x)/2, the existence of a unique solution is guaranteed, otherwise this remains an open question.

The general theory underlying the method used is that described in femlagsym. Now a term q(x)Dy(x) is being treated (it is absent from equation (1) in femlagsym); however, the simplification described in femlag is possible.

Function Parameters:

void femlagskew (x,y,n,q,r,f,order,e)
x: float x[0:n];
    entry: a = x_0 < x_1 < ... < x_n = b is a partition of the segment [a,b];
y: float y[0:n];
    exit: y[i], i=0,1,...,n, is the approximate solution at x[i] of the differential equation (1) with boundary conditions (2);
n: int;
    entry: the upper bound of the arrays x and y (value of N above), n > 1;
q: float (*q)(x);
    the procedure to compute q(x), the coefficient of Dy(x) in equation (1);
r: float (*r)(x);
    the procedure to compute r(x), the coefficient of y(x) in equation (1);
f: float (*f)(x);
    the procedure to compute f(x), the right hand side of equation (1);
order: int;
    entry: order denotes the order of accuracy required for the approximate solution of the differential equation; let h = max(x[i]-x[i-1]), then |y[i] - y(x[i])| ≤ c*h^order, i=0,...,n; order can be chosen equal to 2, 4 or 6 only;
e: float e[1:6];
    entry: e[1],...,e[6] describe the boundary conditions (values of e_i, i=1,...,6, in (2)); neither e[1] nor e[4] is allowed to vanish.
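Assumption (b) in the description of femlagskew asks that the product q(x)*(x_j - x_{j-1}) should not be too large on any segment. The text gives no numerical threshold, so the sketch below only computes the quantity and leaves the judgement to the caller. The names max_q_times_h and q_example are hypothetical, and q is sampled only at the two ends of each segment, which is an assumption that may miss a peak of q in the interior.

```c
#include <assert.h>

/* Hypothetical convection coefficient used for illustration: q(x) = 10x. */
float q_example(float x)
{
    return 10.0f*x;
}

/* Returns the largest value of |q| * (x[j] - x[j-1]) over the segments
   j = 1..n, with |q| taken as the larger of its values at the two end
   points of the segment. x is indexed 0..n as in the parameter list. */
float max_q_times_h(const float x[], int n, float (*q)(float))
{
    float worst = 0.0f;
    int j;

    assert(n >= 1);
    for (j = 1; j <= n; j++) {
        float h = x[j] - x[j-1];
        float ql = q(x[j-1]);
        float qr = q(x[j]);
        float v;

        if (ql < 0.0f) ql = -ql;
        if (qr < 0.0f) qr = -qr;
        v = h * (ql > qr ? ql : qr);
        if (v > worst) worst = v;
    }
    return worst;
}
```

If the returned value is judged too large for the problem at hand, the grid can be refined where q is large before femlagskew is called.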

void femlagskew(float x[], float y[], int n, float (*q)(float),
                float (*r)(float), float (*f)(float), int order, float e[])
{
    float *allocate_real_vector(int, int);
    void free_real_vector(float *, int);
    int l,l1;
    float xl1,xl,h,a12,a21,b1,b2,tau1,tau2,ch,tl,g,yl,pp,e1,e2,e3,e4,
          e5,e6,*t,*super,*sub,*chi,*gi,q2,r2,f2,q1,r1,f1,h2,s12,q3,
          r3,f3,s13,s22,x2,h6,h15,c12,c32,a13,a31,a22,a23,a32,b3,
          tau3,q4,r4,f4,s14,s23,x3,h12,h24,det,c13,c42,c43,a14,a24,
          a33,a34,a41,a42,a43,b4,tau4;

1=1; xl=x [O]; el=e [ll ; e2=e [21 ; e3=e [ 3 1 ; e4=e [41 ; e5=e [51 ; e6=e [61 ; while (1 < = n) { xll=xl; 11=1-1; xl=x [ll ; h=xl-xll; if (order == 2) { / * element mat vec evaluation 1 * / if (1 == 1) { q2= (*q)( ~ 1 1;) r2= (*r)(xll); f2=(*f)(xl1); 1 J

h2=h/2 .O; sl2 = -l.O/h; q1=q2 ; q2= (*q)(xl); r k r 2; r2= (*r)(xl); fl=f2; f2=(*f)(xl); bl=h2*f1; b2=h2*f2; taul=h2*rl; tau2=h2*r2; a12=s12+q1/2.0; a2ks12-q2/2.0; ) else if (order == 4) { / * element mat vec evaluation 2 * / if (1 == 1) { q3= (*q)( ~ 1 1;) r3= (*r)(xll); f3= (*f)( ~ 1 1;) 1


h6=h/6.0; h15=h/1.5; ql=q3 ; q2= (*q)(x2); q3= (*q)(xl); rl=r3 ; r2= (*r)(x2); r3= (*r)(xl); fl=f3 ; f2= (*f)(x2); f3= (*f)(xl); bl=h6*f 1; b2=h15*f2; b3=h6*f3; taul=h6*rl; tau2=h15*r2; tau3=h6*r3; sl2 = -1.O/h/0.375; S13 = - ~ 1 2 / 8 . 0 ; 522 = -2.0*512; a12=s12+q1/1.5; a13=s13-q1/6.0; a21=s12-q2/1.5; a23=s12+q2/1.5; a22=s22+tau2; a31=s13+q3/6.0; a32=s12-q3/1.5; c12 = -a12/a22; c32 = -a32/a22; a12=a13+c12*a23; a21=a31+c32*a21; bl += c12*b2; b2=b3+c32*b2; tau1 += cl2*tau2; tau2=tau3+c32*tau2; } else { / * element mat vec evaluation 3 * / if (1 == 1) { q4= (*q)( ~ 1 1;) r4= (*r)(xll); f4= (*f)(xl1); 1 ~2=~11+0.27639320225*h; x3=x1-x2+x11; h12=h/12.0; h24=h/2.4; gl=q4 ; q2= (*q)(x2); q3= (*q)(x3); q4= (*q)(xl); rl=r4 ; r2= (*r) (x2); r3= (*r) (x3); r4= (*r)(xl); fl=f4; f2=(*f)(x2); f3=(*f) (x3); f4=(*f)(xl); ,912 = -4.8784183052080/h; ~13=0.7117516385414/h; s14 = -0.16666666666667/h; s23=25.O*sl4; ~ 2 2 = -2.0*~23; bl=hl2*fl; b2=h24*£2; b3=h24*f3; b4=h12*£4; taul=hl2*rl; tau2=h24*r2; tau3=h24*r3; tau4=h12*r4; al2=~12+0.67418082864578*ql; a13=~13-0.25751416197912*q1; a14=s14+q1/12.0;


a21=~12-0.67418082864578*q2;

a22=s22+tau2; a23=~23+0.9316949906249O*q2; a24=s13-0.25751416197912*q2;

a3l=sl3+O.25751416l979l2*q3; a32=s23-0.93169499062490*q3;

a33=s22+tau3; a34=~12+0.67418082864578*q3;

a41=s14-q4/12.0; a42=~13+0.25751416197912*q4; a43=s12-0.67418082864578*q4;

det=a22*a33-a23*a32; c12=(a13*a32-a12*a33)/det; c13=(a12*a23-a13*a22)/det; c42=(a32*a43-a42*a33)/det; c43=(a42*a23-a43*a22)/det; taul += c12*tau2+c13*tau3; tau2=tau4+~42*tau2+~43*tau3; a12=a14+~12*a24+~13*a34; a2l=a41+~42*aZl+c43*a31; bl += c12*b2+c13*b3;

b2=b4+~42*b2+~43*b3;

1

if (1 == 1 I I 1 == n) { / * boundary conditions * / if (1 == 1 && e2 == 0.0) { taukl.0 ; bl=e3/el; a12=0.0; ) else if (1 == 1 && e2 ! = 0.0) { taul - = el/e2; bl - = e3/e2; ) else if (1 == n && e5 == 0.0) { tau2=l.0 ; a21=0.0; b2=e6/e4; ) else if (1 == n && e5 ! = 0.0) ( tau2 += e4/e5; b2 += e6/e5;

,

I

/ * forward babushka * /

if (1 == 1) { chi [O]=ch=tl=taul; t [O]=tl; gi [O]=g=yl=bl; y [Ol=yl; sub [OI=a21; super [Ol=al2; pp=a2l/ (ch-al2); ch=tau2-ch*pp; g=b2-g*pp; tl=tau2; yl=b2; ) else { chi [lll = ch += taul; gi [lll = g += bl; sub [lll=a21; super [lll=al2; pp=a2l/ (ch-al2); ch=tau2-ch*pp; g=b2-g*pp; t [lll=tl+taul; tl=tau2; y 1111 =yl+bl; yl=b2;

i * backward babushka * / pp=yl; y [nl=g/ch; g=PP; ch=tl;


l=n-1; while (1 >= 0) ( pp=super [ll/(ch-sub[11 ) ; tl=t [ll ; ch=tl-ch*pp; yl=y [ll ; g=yl-g*pp; y[l] = (gi[ll+g-yl)/(chi [ll+ch-tl); I--;

I

    free_real_vector(t,0);
    free_real_vector(super,0);
    free_real_vector(sub,0);
    free_real_vector(chi,0);
    free_real_vector(gi,0);
}

5.5.3 Linear methods - Fourth order self adjoint

femhermsym

Solves a fourth order self-adjoint linear two point boundary value problem by means of Galerkin's method with continuous differentiable piecewise polynomial functions [BakH76, He75, StF73]. femhermsym computes approximations y_n and dy_n (n=0,...,N) to the values at the points x = x_n, where -∞ < a = x_0 < x_1 < ... < x_N = b < ∞, of the real valued function y(x) and its derivative Dy(x), where y satisfies the equation

$$-D^2(p(x)D^2y(x)) - D(q(x)Dy(x)) + r(x)y(x) = f(x) \qquad (1)$$

where D=d/dx, with boundary conditions

$$y(a) = e_1, \quad Dy(a) = e_2, \quad y(b) = e_3, \quad Dy(b) = e_4. \qquad (2)$$

It is assumed that
(a) p(x) should be positive on the interval [x_0,x_N], and q(x) and r(x) should be nonnegative there;
(b) p(x), q(x), r(x) and f(x) are required to be sufficiently smooth on the interval [x_0,x_N] except at the knots, where discontinuities of the derivatives are allowed; in that case the order of accuracy is preserved.

The solution is approximated by a function which is continuously differentiable on the closed interval [x_0,x_N] and a polynomial of degree less than or equal to k (k = 1 + order/2) on each closed segment [x_{j-1},x_j] (j=1,...,N). This function is entirely determined by the values of the zeroth and first derivative at the knots x_j and by the value it has at k-3 interior knots on each closed segment [x_{j-1},x_j]. The values of the function and its derivative at the knots are obtained by the solution of an order+1 diagonal linear system of (k-1)N-2 unknowns. The entries of the matrix and the vector are inner products which are approximated by piecewise k-point Lobatto quadrature. The evaluation of the matrix and the vector is performed segment by segment. If k>3 then the resulting linear system can be reduced to a pentadiagonal system by means of static condensation. This is possible because the function values at the interior knots on each segment [x_{j-1},x_j] do not depend on function values outside that segment. The final pentadiagonal system, since the matrix is symmetric positive definite, is solved by means of Cholesky's decomposition method.

The theory upon which the method of solving the above boundary value problem is based is an extension of that described in femlagsym. The function y(x) is now approximated not, as was the case for the method of femlagsym (see formula (4) there), by a function which is simply continuous and equal to a polynomial of degree k over the interval [x_{n-1},x_n], but by a (necessarily continuous) function whose first derivative possesses these properties. Radau-Lobatto quadrature is used, systems of linear equations are derived, and coefficients concerning internal points of [x_{n-1},x_n] are eliminated just as is described in femlagsym. Now, however, the resulting system of linear equations involving coefficients equal to the y_n and dy_n is pentadiagonal and is solved by Cholesky's decomposition method.

Function Parameters:
void femhermsym (x,y,n,p,q,r,f,order,e)
x: float x[0:n];
    entry: a = x_0 < x_1 < ... < x_n = b is a partition of the segment [a,b];
y: float y[1:2n-2];
    exit: y[2i-1] is an approximation to y(x[i]), y[2i] is an approximation to Dy(x[i]), where y(x) is the solution of the equation (1) with boundary conditions (2);
n: int;
    entry: the upper bound of the array x (value of N above), n > 1;
p: float (*p)(x);
    the procedure to compute p(x), the coefficient of D²y(x) in equation (1); p(x) should be strictly positive;
q: float (*q)(x);
    the procedure to compute q(x), the coefficient of Dy(x) in equation (1); q(x) should be nonnegative;
r: float (*r)(x);
    the procedure to compute r(x), the coefficient of y(x) in equation (1); r(x) should be nonnegative;
f: float (*f)(x);
    the procedure to compute f(x), the right hand side of equation (1);
order: int;
    entry: order denotes the order of accuracy required for the approximate solution of the differential equation; let h = max(x[i]-x[i-1]), then |y[2i-1] - y(x[i])| ≤ c1*h^order, |y[2i] - Dy(x[i])| ≤ c2*h^order, i=1,...,n-1; order can be chosen equal to 4, 6 or 8 only;
e: float e[1:4];
    entry: e[1],...,e[4] describe the boundary conditions (values of e_i, i=1,...,4, in (2)).

Function used: chldecsolbnd.
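The layout of the exit array y can be illustrated with a small sketch. It assumes, reading the listing of femhermsym, that e[1], e[2] hold y(a), Dy(a) and e[3], e[4] hold y(b), Dy(b), and it uses a plain cubic Hermite interpolant between neighboring nodes. The names hermite_cubic, node_data and femherm_interp are hypothetical helpers, not part of the library, and for order 6 or 8 the library's internal approximation is of higher degree, so this is only an illustration of how y[2i-1] and y[2i] pair up.

```c
/* Hypothetical helper (not a library routine): cubic Hermite
   interpolation on [x0,x1] from end values y0,y1 and end slopes d0,d1. */
static float hermite_cubic(float x0, float x1, float y0, float d0,
                           float y1, float d1, float x)
{
    float h = x1 - x0;
    float t = (x - x0) / h;
    float t2 = t * t, t3 = t2 * t;
    float h00 = 2.0f * t3 - 3.0f * t2 + 1.0f;
    float h10 = t3 - 2.0f * t2 + t;
    float h01 = -2.0f * t3 + 3.0f * t2;
    float h11 = t3 - t2;
    return h00 * y0 + h10 * h * d0 + h01 * y1 + h11 * h * d1;
}

/* Value and slope at node i (0..n): interior nodes come from
   y[2i-1], y[2i]; the end nodes are assumed to come from e[1..4]. */
static void node_data(int i, int n, const float y[], const float e[],
                      float *val, float *der)
{
    if (i == 0) {
        *val = e[1]; *der = e[2];
    } else if (i == n) {
        *val = e[3]; *der = e[4];
    } else {
        *val = y[2 * i - 1]; *der = y[2 * i];
    }
}

/* Evaluate the piecewise cubic at xx, with x[0] <= xx <= x[n]. */
float femherm_interp(const float x[], const float y[], int n,
                     const float e[], float xx)
{
    int i = 1;
    float ya, da, yb, db;
    while (i < n && xx > x[i]) i++;      /* locate segment [x[i-1],x[i]] */
    node_data(i - 1, n, y, e, &ya, &da);
    node_data(i, n, y, e, &yb, &db);
    return hermite_cubic(x[i - 1], x[i], ya, da, yb, db, xx);
}
```

Because a cubic Hermite interpolant reproduces cubic polynomials, feeding it nodal data of y(x) = x^3 returns the exact function values between the nodes, which gives a quick check of the indexing.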

void femhermsym(float x[], float y[], int n, float (*p)(float),
      float (*q)(float), float (*r)(float), float (*f)(float),
      int order, float e[])
{
   float *allocate_real_vector(int, int);
   void free_real_vector(float *, int);
   void chldecsolbnd(float [], int, int, float [], float []);
   void femhermsymeval(int, int, float (*)(float), float (*)(float),
         float (*)(float), float (*)(float), float *, float *,
         float *, float *, float *, float *, float *, float *,
         float *, float *, float *, float *, float *, float *,
         float *, float);
   int l,n2,v,w;
   float *a,em[4],a11,a12,a13,a14,a22,a23,a24,a33,a34,a44,ya,yb,za,
         zb,b1,b2,b3,b4,d1,d2,e1,r1,r2,xl1,xl;


   l=1;
   w=v=0;
   n2=n+n-2;
   a=allocate_real_vector(1,8*(n-1));
   xl1=x[0];
   xl=x[1];
   ya=e[1];
   za=e[2];
   yb=e[3];
   zb=e[4];
   /* element matvec evaluation */
   femhermsymeval(order,l,p,q,r,f,&a11,&a12,&a13,&a14,&a22,
         &a23,&a24,&a33,&a34,&a44,&b1,&b2,&b3,&b4,&xl,xl1);
   em[2]=FLT_EPSILON;
   r1=b3-a13*ya-a23*za;
   d1=a33;
   d2=a44;
   r2=b4-a14*ya-a24*za;
   e1=a34;
   l++;
   while (l < n) {
      xl1=xl;
      xl=x[l];
      /* element matvec evaluation */
      femhermsymeval(order,l,p,q,r,f,&a11,&a12,&a13,&a14,&a22,
            &a23,&a24,&a33,&a34,&a44,&b1,&b2,&b3,&b4,&xl,xl1);
      a[w+1]=d1+a11;
      a[w+4]=e1+a12;
      a[w+7]=a13;
      a[w+10]=a14;
      a[w+5]=d2+a22;
      a[w+8]=a23;
      a[w+11]=a24;
      a[w+14]=0.0;
      y[v+1]=r1+b1;
      y[v+2]=r2+b2;
      r1=b3;
      r2=b4;
      v += 2;
      w += 8;
      d1=a33;
      d2=a44;
      e1=a34;
      l++;
   }
   l=n;
   xl1=xl;
   xl=x[l];
   /* element matvec evaluation */
   femhermsymeval(order,l,p,q,r,f,&a11,&a12,&a13,&a14,&a22,
         &a23,&a24,&a33,&a34,&a44,&b1,&b2,&b3,&b4,&xl,xl1);
   y[n2-1]=r1+b1-a13*yb-a14*zb;
   y[n2]=r2+b2-a23*yb-a24*zb;
   a[w+1]=d1+a11;
   a[w+4]=e1+a12;
   a[w+5]=d2+a22;
   chldecsolbnd(a,n2,3,em,y);
   free_real_vector(a,1);

}

void femhermsymeval(int order, int l, float (*p)(float),
      float (*q)(float), float (*r)(float), float (*f)(float),
      float *a11, float *a12, float *a13, float *a14, float *a22,
      float *a23, float *a24, float *a33, float *a34, float *a44,
      float *b1, float *b2, float *b3, float *b4, float *xl, float xl1)
{
   /* this function is internally used by FEMHERMSYM */

   static float p3,p4,p5,q3,q4,q5,r3,r4,r5,f3,f4,f5;

   if (order == 4) {
      float x2,h,h2,h3,p1,p2,q1,q2,r1,r2,f1,f2,b11,b12,b13,b14,b22,
            b23,b24,b33,b34,b44,s11,s12,s13,s14,s22,s23,s24,s33,s34,


)* element bending matrix * / pl=p3 ; p2=(*p) (x2); p3= (*p)(*xl); bll=6.O* (pl+p3); b12=4.0*p1+2.0*p3; b13 = -bll; b14=bll-b12; b22=(4.0*pl+p2+p3)/1.5; b23 = -b12; b24=b12-b22; b33=bll; b34 = -b14; b44=b14-b24; / * element stiffness matrix * / sll=l :5*q2; s12=q2/4.0; ~ 1 3 = -Sll; s14=S12; s24=q2/24.0; ~22=q1/6.0+~24; s23 = - ~ 1 2 ; s33=sll; 534 = -s12; s44=~24+q3/6.0; / * element mass matrix * / rl=r3; r2= (*r)(x2); r3= (*r)(*xl); mll= (rl+r2)/6.0; m12=r2/24.0; m13=r2/6.0; m14 = -ml2; m22=r2/96.0; m23 = -m14; m24 = -m22; m33=(r2+r3)/6.0; m3 4=ml4; m44=m22; / * element load vector * / fl=f3; f2=(*f) (x2); f3=(*f) (*xl); *bl=h*(f1+2.0*f2)/6.0; *b3=h* (f3+2.O*f2)/6.O; *b2=h2*£2/12.0; *b4 = - (*b2); *all=bll/h3+sll/h+mll*h; *a12=b12/h2+~12+m12*h2; *a13=b13/h3+~13/h+ml3*h; *a14=b14/h2+~14+m14*h2; *a22=b22/h+s22*h+m22*h3; *a23=b23/h2+~23+m23*h2; *a24=b24/h+s24*h+m24*h3; *a34=b34/hZ+s34+m34*h2; *a33=b33/h3+~33/h+m33*h; *a44=b44/h+s44*h+m44*h3; } else if (order == 6) ( float h,h2,h3 ,x2,x3,pl,p2,p3,ql,q2,q3,rl,r2,r3,fl,£2,£3,bll,b12, b13,b14,b15,b22,b23,b24,b25,b33,b34,b35,b44,b45,b55,s11,


~12,~13,~14,~15,~22,~23,~24,~25,~33,~34,~35,~44,~45,~55, mll,m12,m13 ,m14,m15,m22,m23,m24,m25,m33,m34,m35,m44,m45, m55,a15,a25,a35,a45,a55,cl,c2,~3,~4,bS; if (1 == 1) { p 4 = (*p) ( ~ 1 1;) q4= (*q)( ~ 1 1;) r4= (*r)(xll); f4= (*f) (xl1);

      }

h= (*xl)-xll; h2=h*h; h3 =h*h2 ; x2=0.27639320225*h+xll; ~ 3 = x 1 1 (*xl) + -x2 ; / * element bending matrix * / pl=p4 ; p2= (*p) (x2); p3=(*p) (x3);

b23=6.6666666666667e0*~1-3.-7791278464167e0*p2+ 2.4579451308295e-l*p3+3.6666666666667eO*p4; b25 = - (b12+b23) ; b24 = - (b22+b23+b25/2.0); b33=8.3333333333333eO*pl+1.4422084194666el*p2+ 1.1124913866726e-l*p3+4.0333333333333el*p4; b35 = - (b13+b33); b34 = - (b23+b33+b35/2.0); b45 = - (b14+b34); b44 = - (b24+b34+b45/2.0); b55 = -(b15+b35); / * element stiffness matrix * / ql=q4 ; q2= (*q) (x2); q3= (*q)(x3);

8.3333333333338eL2*q4; s45 = - ( s 1 4 + ~ 3 4;) 955 = - ( s 1 5 + ~ 3 5 ;) / * element mass matrix * / r b r 4; r2= (*r) (x2); r3= (*r) (x3); r4= (*r) (*xl); m11=8.3333333333333e-2*r1+1.0129076086083e-l*r2+ 7.3759058058380e-3*r3; m12=1.3296181273333e-2*r2+1.3704853933353e-3*r3; m13 = -2.7333333333333e-2*(r2+r3); m14=5.0786893258335e-3*r2+3.5879773408333e-3*r3; m15=1.3147987115999e-1*r2-3.5479871159991e-2*r3; m22=1.7453559925000e-3*r2+2.5464400750059e-4*r3; m23 = -3.5879773408336e-3*r2-5.0786893258385e-3*r3;


m24=6.6666666666667e-4*(r2+r3); m25=1.7259029213333e-2*r2-6.5923625466719e-3*r3; m33=7.375905805838Oe-3*r2+1.0129076086083e-l*r3+

8.3333333333333e-2*r4; m34 = -1.3704853933333e-3*r2-1.3296181273333e-2*r3; m35 = -3.5479871159992e-2*r2+1.3147987115999e-l*r3; m44=2.5464400750008e-4*r2+1.7453559924997e-3*r3; m45=6.5923625466656e-3*r2-1.7259029213330e-2*r3; m55=0.17066666666667eO*(r2+r3);

/ * element load vector * / fl=f4 ; f2=(*f) (x2); f3=(*f) (x3); f4=(*f) (*xl);

*bl=8.3333333333333e-2*f1+2.0543729868749e-l*f2-

5.5437298687489e-2*f3; *b2=2.6967233145832e-2*£2-1.0300566479175e-2*f3;

b5=2.6666666666667e-1*(f2+f3); *all=h2*(h2*mll+sll)+bll; *a12=h2*(h2*ml2+~12)+b12;

*a13=h2*(hZ*ml3+sl3)+bl3; *a14=h2*(h2*m14+~14)+b14;

a15=h2* (h2*m15+s15)+bl5; *a22=h2*(h2*m22+~22)+b22; *a23=h2*(h2*m23+~23)+b23; *a24=h2*(h2*m24+~24)+b24; a25=h2* (h2*m25+s25)+b25; *a33=h2*(h2*m33+~33)+b33; *a34=h2*(h2*m34+~34)+b34; a35=h2* (h2*m35+s35)+b35; *a44=hZ*(h2*m44+~44)+b44; a45=h2* (h2*m45+545)+b45; a55=hZ*(h2*m55+~55)+b55; / * static condensation * / cl=a15/a55; c2=a25/a55; c3=a35/a55; c4=a45/a55; *bl= ( (*bl)-cl*b5)*h; *b2= ( (*b2)- ~ 2 * b 5*h2; ) ) *b3= ( (*b3)- ~ 3 * b 5*h; ) *b4= ( (*b4)- ~ 4 * b 5*h2; *all= ( (*all)-cl*al5)/h3; *a12= ( (*a12)-cl*a25)/h2; *a13= ( (*al3)-cl*a35)/h3; *a14= ( (*a14)-cl*a45)/h2; *a22= ( (*a22)-c2*a25)/h; *a23= ( (*a23)-c2*a35)/h2; *a24= ( (*a24)-cZ*a45)/h; *a33=( (*a33)-c3*a35)/h3; *a34=( (*a34)-c3*a45)/h2; *a44= ( (*a44)-c4*a45)/h; } else { float x2,x3,x4,h,h2,h3,pl,p2,p3,p4,q1,q2,q3,q4,rl,r2,r3,r4, fl,f2,f3,f4,bll,b12,b13,b14,b15,b16,b22,b23,b24,b25,b26, b33,b34,b35,b36,b44,b45,b46,b55,b56,b66,s11,s12,s13,s14, S15,~16,~22,~23,~24,~25,~26,~33,~34,~35,~36,~44,~45,~46, s55,s56,s66,mll,m12,m13,m14,m15,m16,m22,m23,m24,m25,m26, m33,m34,m35,m36,m44,m45,m46,m55,m56,m66,~15,~16,~25,~26, c35,c36,c45,c46,b5,b6,a15,a16,a25,a26,a35,a36,a45,a46,

a55,a56,a66,det; if (1 == 1) { p5= (*p)( ~ 1 1;) q5= (*q)( ~ 1 1;) r5= (*r)(xll); f5=(*f) (xl1);

      }

h= (*xl)-xll; h2=h*h; h3=h*h2; x2=xll+h*0.172673164646;


x3=xll+h/2.0; x4=x11+ (*XI)-x2; / * element bending matrix * / pl=p5; p2= (*p) (x2); p3= (*p) (x3); p4=(*p) (x4); p5= (*p) (*xl); bll=105.8*p1+9.8*p5+7.3593121303513e-2*p2+ 2.2755555555556el*p3+7.0565656088553eO*p4; b12=27.6*p1+1.4*p5-3.41554824811e-l*p2+ 2.8444444444444eO*p3+1.0113960946522eO*p4; b13 = -32.2*(pl+p5)-7.2063492063505e-l*(p2+p4)+ 2.2755555555556el*p3; bl4=4.6*p1+8.4*~5+1.0328641222944e-l*p22.8444444444444eO*p3-3.3445562534992eO*p4; b15 = - (bll+bl3); b16 = - (blZ+b13+bl4+b15/2.0); b22=7.2*pl+0.2*p5+1.5851984028581eO*p2+ 3.5555555555556e-l*p3+1.4496032730059e-1*p4; b23 = -8.4*p1-4.6*~5+3.3445562534992eO*p2+ 2.8444444444444eOlp3-1.0328641222944e-1*p4; b24=1.2* (pl+p5)-4.7936507936508e-l* (p2+p4)3.5555555555556e-1*p3; b25 = - (b12+b23); b26 = - (b22+b23+b24+b25/2.0); b33=7.0565656088553eO*p2+2.2755555555556el*p3+ 7.3593121303513e-2*p4+105.8*p5+9.8*pl; b34 = -1.4*p1-27.6*p5-1.0113960946522eO*p22.8444444444444eO*p3+3.41554824811OOe-1*p4; b35 = - (b13+b33); b36 = - (b23+b33+b34+b35/2.0); b44=7.2*p5+pl/5.0+1.4496032730059e-l*p2+ 3.5555555555556e-l*p3+1.585198402858leO*p4; b45 = - (bl4+b34); b46 = - (b24+b34+b44+b45/2.0); b55 = - (bl5+b35); b56 = -(b16+b36); b66 = - (b26+b36+b46+b56/2.0); / * element stiffness matrix * / ql=q5 ; q2= (*q)(x2); q3=(*q) (x3); q4= (*q)(x4); q5= (*q)(*xl); ~11=3.0242424037951eO*q2+3.1539909130065e-2*q4; ~12=1.2575525581744e-1*q2+4.1767169716742e-3*q4; 913 = -3.0884353741496e-l*(q2+q4); s14=4.0899041243062e-2*q2+1.2842455355577e-2*q4; SlS = - ( ~ 1 3 + ~ l l ) ; ~16=5.9254861177068e-1*q2+6.0512612719116e-2*q4; s22=5.2292052865422e-3*q2+5.5310763862796e-4*q4+q1/20.0; 523 = -1.2842455355577e-2*q2-4.0899041243062e-2*q4; s24=1.7006802721088e-3*(q2+q4); S25 = - ( ~ 1 2 + ~ 2;3 ) s26=2.4639593097426e-2*q2+8.0134681270641e-3*q4; s33=3.1539909130065e-2*q2+3.0242424037951e0*q4; s34 = -4.1767169716742e-3*q2-1.2575525581744e-l*q4; S35 = - ( ~ 1 3 + ~ 3 3 ) ; s36 = -6.0512612719116e-2*q2-5.9254861177068e-l*q4; 
s44=5.5310763862796e-4*q2+5.2292052865422e-3*q4+q5/20.0; S45 = - ( ~ 1 4 + ~ 3 4 ) ; ~46=8.0134681270641e-3*q2+2.4639593097426e-2*q4; S55 = - ( ~ 1 5 + ~ 3;5 ) s56 = - ( s 1 6 + ~ 3 6 ) ; ~66=1.1609977324263e-l*(q2+q4)+3.5555555555556e-l*q3; / * element mass matrix * / rl=r5 ; r2= (*r)(x2); r3= (*r)(x3); r4= (*r)(x4); r5= (*r)(*xl); m11=9.7107020727310e-2*r2+1.5810259199180e-3*r4+r1/20.0; m12=8.2354889460254e-3*r2+2.1932154960071e-4*r4;


m13=1.2390670553936e-2*(r2+r4); m14 = -1.7188466249968e-3*r2-1.0508326752939e-3*r4; m15=5.3089789712119e-2*r2+6.7741558661060e-3*r4; m16 = -1.7377712856076e-2*r2+2.2173630018466e-3*r4; m22=6.9843846173145e-4*r2+3.0424512029349e-5*r4; m23=1.0508326752947e-3*r2+1.7188466249936e-3*r4; m24 = -1.4577259475206e-4*(r2+r4); m25=4.5024589679127e-3*r2+9.3971790283374e-4*r4; m26 = -1.4737756452780e-3*r2+3.0759488725998e-4*r4; m33=1.5810259199209e-3*r2+9.7107020727290e-2*r4+r5/20.0; m34 = - 2 . 1 9 3 2 1 5 4 9 6 0 1 3 1 e - 4 * r 2 - 8 . 2 3 5 4 e - 3 * r 4 ; m35=6.7741558661123e-3*r2+5.3089789712112e-2*r4; m36 = -2.2173630018492e-3*r2+1.7377712856071e-2*r4; m44=3.0424512029457e-5*r2+6.9843846173158e-4*r4; m45 = -9.3971790283542e-4*r2-4.5024589679131e-3*r4; m46=3.0759488726060e-4*r2-1.4737756452778e-3*r4; m55=2.9024943310657e-2*(r2+r4)+3.5555555555556e-l*r3; m56=9.5006428402050e-3*(r4-r2); m66=3.1098153547125e-3*(r2+r4); / * element load vector * / fkf.5; f2=(*f) (x2); f3=(*f) (x3); f4=(*f) (x4); f5= (*f)(*xl); *b1=1.6258748099336e-l*f2+2.0745852339969e-2*f4+fl/20.0; *b2=1.3788780589233e-2*£2+2.8778860774335e-3*f4; *b3=2.0745852339969e-2*f2+1.6258748099336e-l*f4+f5/20.0; *b4 = -2.8778860774335e-3*f2-1.3788780589233e-2*£4; b5=(£2+f4)/11.25+3.5555555555556e-1*£3; b6=2.9095718698132e-2*(f4-f2); *all=h2* (h2*mll+sll)+bll; *a12=h2*(h2*mlZ+s12)+b12; *a13=h2*(h2*m13+s13)+b13; *a14=h2*(h2*m14+~14)+b14; a15=h2* (hZ*m15+s15)+b15; a16=h2* (h2*m16+s16)+bl6; *a22=hZ*(h2*m22+~22)+b22; *a23=h2*(hZ*m23+~23)+b23; *a24=h2*(h2*m24+~24)+b24; a25=h2*(h2*m25+~25)+b25; a26=h2* (h2*m26+s26)+b26; *a33=h2*(h2*m33+~33)+b33; *a34=h2*(h2*m34+~34)+b34; a35=h2* (h2*m35+s35)+b35; a36=h2* (hZ*m36+s36)+b36; *a44=h2*(h2*m44+~44)+b44; a45=h2* (h2*m45+s45)+b45; a46=h2* (h2*m46+s46)+b46; a55=h2*(h2*m55+~55)+b55; a56=h2*(h2*m56+~56)+b56; a66=hZ* (h2*m66+s66)+b66; / * static condensation * / det = -a55*a66+a56*a56; c15=(a15*a66-a16*a56)/det; cl6= (a16*a55-a15*a56)/det; c25= (a25*a66-a26*a56)/det; 
c26=(a26*a55-a25*a56)/det; c35=(a35*a66-a36*a56)/det; c36=(a36*a55-a35*a56)/det; c45=(a45*a66-a46*a56)/det; c46=(a46*a55-a45*a56)/det:


5.5.4 Non-linear methods

nonlinfemlagskew

Solves a nonlinear two point boundary value problem with spherical coordinates [Bab72, Bak77, BakH76, StF73]. nonlinfemlagskew solves the differential equation

$$D(x^{nc}\,Dy(x))/x^{nc} = f(x, y, Dy(x)), \qquad a < x < b, \qquad (1)$$

where $D = d/dx$, with boundary conditions

$$e_1 y(a) + e_2 Dy(a) = e_3, \qquad e_4 y(b) + e_5 Dy(b) = e_6 \qquad (e_1, e_4 \neq 0). \qquad (2)$$

The functions $f$, $\partial f/\partial y$ and $\partial f/\partial z$ are required to be sufficiently smooth in their variables on the interior of every segment $[x_i, x_{i+1}]$ $(i=0,\ldots,n-1)$. Let $y^{[0]}(x)$ be some initial approximation of $y(x)$; then the nonlinear problem is solved by successively solving

$$-D(x^{nc}\,Dg^{[k]}(x))/x^{nc} + f_y(x, y^{[k]}(x), Dy^{[k]}(x))\,g^{[k]}(x) + f_z(x, y^{[k]}(x), Dy^{[k]}(x))\,Dg^{[k]}(x) = D(x^{nc}\,Dy^{[k]}(x))/x^{nc} - f(x, y^{[k]}(x), Dy^{[k]}(x)), \qquad x_0 < x < x_n,$$

with boundary conditions

$$e_1 g^{[k]}(x_0) + e_2 Dg^{[k]}(x_0) = 0, \qquad e_4 g^{[k]}(x_n) + e_5 Dg^{[k]}(x_n) = 0$$

with Galerkin's method (see femlagsym) and putting

$$y^{[k+1]}(x) = y^{[k]}(x) + g^{[k]}(x), \qquad k = 0, 1, \ldots$$

This is the so-called Newton-Kantorovich method.
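The correction scheme, in which a linearized problem is solved for g and the result is added to the current approximation, has a simple scalar analogue. The sketch below applies it to a single equation F(y) = 0; it is not the library routine, and the names newton_correct, sq_minus_two and sq_minus_two_deriv, as well as the stopping rule, are assumptions made for illustration.

```c
#include <math.h>

/* Hypothetical test equation F(y) = y*y - 2 and its derivative. */
static float sq_minus_two(float y) { return y * y - 2.0f; }
static float sq_minus_two_deriv(float y) { return 2.0f * y; }

/* Scalar analogue of the correction scheme: solve the linearized
   equation dF(y)*g = -F(y) for the correction g, then update y. */
float newton_correct(float (*F)(float), float (*dF)(float),
                     float y, int maxit, float tol)
{
    int k;
    for (k = 0; k < maxit; k++) {
        float g = -F(y) / dF(y);   /* correction from the linearized problem */
        y += g;                    /* next approximation: y + g */
        if (fabsf(g) <= tol * (1.0f + fabsf(y)))
            break;                 /* correction is negligible */
    }
    return y;
}
```

In nonlinfemlagskew the same pattern is applied to functions rather than numbers: each correction is itself the Galerkin solution of a linear two point boundary value problem.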

Function Parameters:
        void nonlinfemlagskew (x,y,n,f,fy,fz,nc,e)
x:      float x[0:n];
        entry: a = x[0] < x[1] < ... < x[n] = b is a partition of the segment [a,b];
y:      float y[0:n];
        entry: y[i], i=0,1,...,n, is an initial approximate solution at x[i] of the differential equation (1) with boundary conditions (2);
        exit: y[i], i=0,1,...,n, is the Galerkin solution at x[i] of the differential equation (1) with boundary conditions (2);
n:      int;
        entry: the upper bound of the arrays x and y, n > 1;
f:      float (*f)(x,y,z);
        float x,y,z;
        the procedure to compute the right hand side of (1); (f(x,y,z) is the right hand side of equation (1));
fy:     float (*fy)(x,y,z);
        float x,y,z;
        the procedure to compute fy(x,y,z), the derivative of f with respect to y;
fz:     float (*fz)(x,y,z);
        float x,y,z;
        the procedure to compute fz(x,y,z), the derivative of f with respect to z;
nc:     int;
        entry: if nc = 0, Cartesian coordinates are used;
               if nc = 1, polar coordinates are used;
               if nc = 2, spherical coordinates are used;
e:      float e[1:6];
        entry: e[1],...,e[6] describe the boundary conditions (values of e_i, i=1,...,6, in (2)); e[1] and e[4] are not allowed to vanish both.

Function used: dupvec.
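The effect of nc is easiest to see in the element weights vl and vr that the listing of nonlinfemlagskew computes for each segment [xl1,xl]. The sketch below copies those three cases into a small function and compares them with (1/h) times the integral of x^nc multiplied by the left or right linear hat function, evaluated by Simpson's rule (which is exact here because the integrand has degree at most 3). The reading of vl and vr as such integrals is an observation made for this illustration, and element_weights, xpow and weights_by_simpson are hypothetical names.

```c
/* Element weights as computed in the listing for one segment [xl1,xl]. */
void element_weights(int nc, float xl1, float xl, float *vl, float *vr)
{
    if (nc == 0) {
        *vl = *vr = 0.5f;
    } else if (nc == 1) {
        *vl = (xl1 * 2.0f + xl) / 6.0f;
        *vr = (xl1 + xl * 2.0f) / 6.0f;
    } else {
        float xl12 = xl1 * xl1 / 12.0f;
        float xl1xl = xl1 * xl / 6.0f;
        float xl2 = xl * xl / 12.0f;
        *vl = 3.0f * xl12 + xl1xl + xl2;
        *vr = 3.0f * xl2 + xl1xl + xl12;
    }
}

/* x^nc for nc = 0, 1, 2 */
static float xpow(float x, int nc)
{
    return nc == 0 ? 1.0f : (nc == 1 ? x : x * x);
}

/* Hypothetical check: (1/h) * integral over [xl1,xl] of x^nc * hat(x),
   by Simpson's rule. The left hat is 1 at xl1, 1/2 at the midpoint and
   0 at xl; the right hat is its mirror image. */
void weights_by_simpson(int nc, float xl1, float xl, float *vl, float *vr)
{
    float xm = 0.5f * (xl1 + xl);
    *vl = (xpow(xl1, nc) + 4.0f * 0.5f * xpow(xm, nc)) / 6.0f;
    *vr = (4.0f * 0.5f * xpow(xm, nc) + xpow(xl, nc)) / 6.0f;
}
```

For the segment [1,2] with nc = 2 both routes give vl = 11/12 and vr = 17/12.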

void nonlinfemlagskew(float x[], float y[], int n,
      float (*f)(float, float, float),
      float (*fy)(float, float, float),
      float (*fz)(float, float, float),
      int nc, float e[])
{
   float *allocate_real_vector(int, int);
   void free_real_vector(float *, int);
   void dupvec(int, int, int, float [], float []);
   int l,l1,it;
   float xl1,xl,h,a12,a21,b1,b2,tau1,tau2,ch,tl,g,yl,pp,
         zl1,zl,e1,e2,e4,e5,eps,rho,*t,*super,*sub,*chi,*gi,*z,
         xm,vl,vr,wl,wr,pr,qm,rm,fm,xl12,xl1xl,xl2,zm,zaccm;

   t=allocate_real_vector(0,n-1);
   super=allocate_real_vector(0,n-1);
   sub=allocate_real_vector(0,n-1);
   chi=allocate_real_vector(0,n-1);
   gi=allocate_real_vector(0,n-1);
   z=allocate_real_vector(0,n);
   dupvec(0,n,0,z,y);
   e1=e[1];
   e2=e[2];
   e4=e[4];
   e5=e[5];
   it=1;
   do {
      l=1;
      xl=x[0];
      zl=z[0];
      while (l <= n) {
         xl1=xl;
         l1=l-1;
         xl=x[l];
         h=xl-xl1;
         zl1=zl;
         zl=z[l];
         /* element mat vec evaluation 1 */
         if (nc == 0)
            vl=vr=0.5;
         else if (nc == 1) {
            vl=(xl1*2.0+xl)/6.0;
            vr=(xl1+xl*2.0)/6.0;
         } else {
            xl12=xl1*xl1/12.0;
            xl1xl=xl1*xl/6.0;
            xl2=xl*xl/12.0;
            vl=3.0*xl12+xl1xl+xl2;
            vr=3.0*xl2+xl1xl+xl12;
         }
         wl=h*vl;
         wr=h*vr;
         pr=vr/(vl+vr);
         xm=xl1+h*pr;
         zm=pr*zl+(1.0-pr)*zl1;
         zaccm=(zl-zl1)/h;
         qm=(*fz)(xm,zm,zaccm);
         rm=(*fy)(xm,zm,zaccm);
         fm=(*f)(xm,zm,zaccm);
         tau1=wl*rm;
         tau2=wr*rm;
         b1=wl*fm-zaccm*(vl+vr);
         b2=wr*fm+zaccm*(vl+vr);


         a12 = -(vl+vr)/h+vl*qm+(1.0-pr)*pr*rm*(wl+wr);
         a21 = -(vl+vr)/h-vr*qm+(1.0-pr)*pr*rm*(wl+wr);
         if (l == 1 || l == n) {
            /* boundary conditions */
            if (l == 1 && e2 == 0.0) {
               tau1=1.0;
               b1=a12=0.0;
            } else if (l == 1 && e2 != 0.0) {
               tau1 -= e1/e2;
            } else if (l == n && e5 == 0.0) {
               tau2=1.0;
               b2=a21=0.0;
            } else if (l == n && e5 != 0.0) {
               tau2 += e4/e5;
            }
         }
         /* forward babushka */
         if (l == 1) {
            chi[0]=ch=tl=tau1;
            t[0]=tl;
            gi[0]=g=yl=b1;
            y[0]=yl;
            sub[0]=a21;
            super[0]=a12;
            pp=a21/(ch-a12);
            ch=tau2-ch*pp;
            g=b2-g*pp;
            tl=tau2;
            yl=b2;
         } else {
            chi[l1] = ch += tau1;
            gi[l1] = g += b1;
            sub[l1]=a21;
            super[l1]=a12;
            pp=a21/(ch-a12);
            ch=tau2-ch*pp;
            g=b2-g*pp;
            t[l1]=tl+tau1;
            tl=tau2;
            y[l1]=yl+b1;
            yl=b2;
         }
         l++;
      }
      /* backward babushka */
      pp=yl;
      y[n]=g/ch;
      g=pp;
      ch=tl;
      l=n-1;
      while (l >= 0) {
         pp=super[l]/(ch-sub[l]);
         tl=t[l];
         ch=tl-ch*pp;
         yl=y[l];
         g=yl-g*pp;
         y[l]=(gi[l]+g-yl)/(chi[l]+ch-tl);
         l--;
      }

      eps=0.0;
      rho=1.0;
      for (l=0; l ... rho);
   dupvec(0,n,0,y,z);
   free_real_vector(t,0);
   free_real_vector(super,0);
   free_real_vector(sub,0);
   free_real_vector(chi,0);
   free_real_vector(gi,0);
   free_real_vector(z,0);
}

5.6 Two-dimensional boundary value problems

5.6.1 Elliptic special linear systems

A. richardson

Solves a system of linear equations with a coefficient matrix having positive real eigenvalues by means of a non-stationary second order iterative method: Richardson's method [CoHHS73, Vh68]. Since Richardson's method is particularly suitable for solving a system of linear equations that is obtained by discretizing a two-dimensional elliptic boundary value problem, the procedure richardson is programmed in such a way that the solution vector is given as a two-dimensional array u[j,l], lj ≤ j ≤ uj, ll ≤ l ≤ ul. The coefficient matrix is not stored, but each row corresponding to a pair (j,l) is generated when needed. richardson can also be used to determine the eigenvalue of the coefficient matrix corresponding to the dominant eigenfunction.

richardson either (a) determines an approximation $u_n$ to the solution of the system of equations

$$Au = f \qquad (1)$$

where $A$ is an $N \times N$ matrix (N=(uj-lj+1)*(ul-ll+1)) whose eigenvalues $\lambda_i$ are such that $0 < \lambda_1 < \lambda_2 < \ldots < \lambda_N = b$, and $u$ is a vector whose components are stored in the locations (j,l) (j=lj,lj+1,...,uj; l=ll,ll+1,...,ul) of a two-dimensional real array U, or (b) estimates the eigenvalue $\lambda_1$ of the matrix $A$ just described.

In both cases a sequence of vectors $u_k$ $(k=1,\ldots,n)$ is constructed from an initial member $u_0$ (if the parameter inap is given the value nonzero upon call of richardson, the components of $u_0$ are those stored in the locations of U prior to call; if inap is given the value zero, $u_0$ is the unit vector (1,1,...,1)) by means of the recursion

$$u_1 = \beta_0 u_0 - \omega_0 v_0, \qquad u_{k+1} = \beta_k u_k + (1 - \beta_k) u_{k-1} - \omega_k v_k \qquad (2)$$

where

$$v_k = A u_k - f \qquad (3)$$

and, with $0 < a < b$,

$$y_0 = (a+b)/(b-a), \quad \beta_0 = 1, \quad \omega_0 = 2/(b+a),$$
$$\beta_k = 2 y_0 T_k(y_0)/T_{k+1}(y_0), \quad \omega_k = 4 T_k(y_0)/((b-a) T_{k+1}(y_0)),$$

$T_k(y) = \cos(k \arccos(y))$ being a Chebyshev polynomial.

Individually the $u_k$, $v_k$ satisfy the recursions

$$u_1 = (\beta_0 - \omega_0 A) u_0 + \omega_0 f, \qquad u_{k+1} = (\beta_k - \omega_k A) u_k + (1 - \beta_k) u_{k-1} + \omega_k f,$$
$$v_1 = (\beta_0 - \omega_0 A) v_0, \qquad v_{k+1} = (\beta_k - \omega_k A) v_k + (1 - \beta_k) v_{k-1}.$$

The residue vectors $v_k$ are then given explicitly by the formula $v_k = C_k(a,b;A) v_0$ where

$$C_k(a,b;\lambda) = T_k((b+a-2\lambda)/(b-a)) / T_k((b+a)/(b-a)).$$
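The recursion can be sketched for a small dense system as follows. This is not the library routine: it stores the matrix explicitly, works in double precision, and uses the hypothetical names residual_dense and richardson_sketch. It takes beta_0 = 1, omega_0 = 2/(b+a), beta_k = 2*y0*T_k(y0)/T_(k+1)(y0) and omega_k = 4*T_k(y0)/((b-a)*T_(k+1)(y0)) with y0 = (a+b)/(b-a), and generates T_k(y0) with the three-term recurrence T_(k+1) = 2*y0*T_k - T_(k-1).

```c
/* v = A*u - f for a dense n-by-n matrix stored row by row
   (hypothetical helper; the library generates rows on demand). */
static void residual_dense(int n, const double *A, const double *u,
                           const double *f, double *v)
{
    int i, j;
    for (i = 0; i < n; i++) {
        double s = -f[i];
        for (j = 0; j < n; j++)
            s += A[i * n + j] * u[j];
        v[i] = s;
    }
}

/* u holds u0 on entry and the last iterate on exit;
   uold and v are work arrays of length n. */
void richardson_sketch(int n, const double *A, const double *f,
                       double a, double b, int niter,
                       double *u, double *uold, double *v)
{
    double y0 = (a + b) / (b - a);
    double tkm1 = 1.0, tk = y0;               /* T_0(y0), T_1(y0) */
    double beta = 1.0, omega = 2.0 / (b + a); /* beta_0, omega_0 */
    int i, k;

    for (i = 0; i < n; i++)
        uold[i] = u[i];
    for (k = 0; k < niter; k++) {
        residual_dense(n, A, u, f, v);
        if (k > 0) {
            double tkp1 = 2.0 * y0 * tk - tkm1;   /* T_(k+1)(y0) */
            beta = 2.0 * y0 * tk / tkp1;
            omega = 4.0 * tk / ((b - a) * tkp1);
            tkm1 = tk;
            tk = tkp1;
        }
        for (i = 0; i < n; i++) {
            double unew = beta * u[i] + (1.0 - beta) * uold[i] - omega * v[i];
            uold[i] = u[i];
            u[i] = unew;
        }
    }
}
```

For the matrix with rows (2,1) and (1,2), whose eigenvalues are 1 and 3, the bounds a = 0.9 and b = 3.1 enclose the spectrum and the iterates converge to the solution of the system. Since T_k(y0) grows geometrically with k, a production version would work with ratios of successive values rather than the values themselves.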


Of all k-th degree polynomials $p_k$ with $p_k(0)=1$, $C_k(a,b;\lambda)$ is that for which $\max |p_k(\lambda)|$ $(a \le \lambda \le b)$ is a minimum. If, in the above, $a < \lambda_1$, $\|v_k\|$ tends to zero as k increases, and $u_k$ tends to the solution $u$ of equation (1). The mean rate of convergence over k steps may be defined to be (where ln is the natural logarithm)

$$-\ln(\|v_k\| / \|v_0\|)/k$$

and the measure of this quantity computed during the implementation of richardson is

$$r_k = -(1/(2k)) \left[ \ln\{\|v_k\|_2 / \|v_0\|_2\} + \ln\{\|v_k\|_\infty / \|v_0\|_\infty\} \right].$$

If, however, $\lambda_1 < a$, $\|v_k\|$ may not tend to zero, but an estimate of $\lambda_1$ may be extracted from the $u_k$ and $v_k$. If, in the decomposition of $v_0$ in terms of the eigenvectors $e_i$ of $A$,

$$v_0 = \sum_i c_i e_i,$$

$c_1 \neq 0$, then $\lambda_k'$, where

$$\lambda_k' = \mu_k \{\sqrt{ab} - \mu_k\} / \{(\sqrt{a}+\sqrt{b})^2/4 - \mu_k\}, \qquad \mu_k = \|v_k\| / \|u_k - u_{k-1}\|,$$

tends to $\lambda_1$. The estimates used in the implementation of richardson are

$$\lambda_k = (\lambda_k^{(2)} + \lambda_k^{(\infty)})/2$$

where, with $\tau = 2, \infty$ in both cases,

$$\lambda_k^{(\tau)} = \mu_k^{(\tau)} \{\sqrt{ab} - \mu_k^{(\tau)}\} / \{(\sqrt{a}+\sqrt{b})^2/4 - \mu_k^{(\tau)}\}, \qquad \mu_k^{(\tau)} = \|v_k\|_\tau / \|u_k - u_{k-1}\|_\tau.$$

In both cases it is required that b should be an upper bound for the eigenvalues $\lambda_i$ of $A$: it is remarked that, denoting the elements of $A$ by $A_{i,j}$, for all $\lambda_i$

Function Parameters:
        void richardson (u,lj,uj,ll,ul,inap,residual,a,b,n,discr,k,rateconv,domeigval,out)
u:      float u[lj:uj,ll:ul];
        after each iteration the approximate solution calculated by richardson is stored in u;
        entry: if inap is chosen to be nonzero then an initial approximation of the solution, otherwise arbitrary;
        exit: the final approximation of the solution;
lj,uj:  int;
        entry: lower and upper bound for the first subscript of u;
ll,ul:  int;
        entry: lower and upper bound for the second subscript of u;
inap:   int;
        entry: if the user wishes to introduce an initial approximation then inap should be chosen to be nonzero; choosing inap to be zero has the effect that all components of u are set equal to 1 before the first iteration is performed;
residual: void (*residual)(lj,uj,ll,ul,u);
        suppose that the system of equations at hand is Au=f; for any entry u the procedure residual should calculate the residual Au-f in each point j,l, where lj ≤ j ≤ uj, ll ≤ l ≤ ul, and substitute these values in the array u;
a,b:    float;
        entry: if one wishes to find the solution of the boundary value problem then in a and b the user should give a lower and upper bound for the eigenvalues for which the corresponding eigenfunctions in the eigenfunction expansion of the residual Au-f, with u equal to the initial approximation, should be reduced; if the dominant eigenvalue is to be found then one should choose a greater than this eigenvalue;
n:      int *;
        entry: gives the total number of iterations to be performed;
discr:  float discr[1:2];
        exit: after each iteration richardson delivers in discr[1] the Euclidean norm of the residual (value of ||v_n||_2 above), and in discr[2] the maximum norm of the residual (value of ||v_n||_∞ above);
k:      int *;
        exit: counts the number of iterations richardson is performing;
rateconv: float *;
        exit: after each iteration the average rate of convergence is assigned to rateconv (value of r_n above);
domeigval: float *;
        exit: after each iteration the dominant eigenvalue, if present, is assigned to domeigval (value of λ_n above); if there is no dominant eigenvalue then the value of domeigval is meaningless; this manifests itself by showing no convergence to a fixed value;
out:    void (*out)(u,lj,uj,ll,ul,n,discr,k,rateconv,domeigval);
        by this procedure one has access to the following quantities: for 0 ≤ k ≤ n the k-th iterand in u, the Euclidean and maximum norm of the k-th residual in discr[1] and discr[2], respectively; for 0 ...

        ... if out[1] > 0 and nbp > 0 then par[m+1:m+nbp] contains the values of the newly introduced parameters before the process broke off;
res:    float res[1:nobs+nbp];
        exit: res[1:nobs] contains the residual vector at the calculated minimum (res[i], i=1,...,nobs, contains the difference between the computed and the observed value of the i-th observation in the above); if out[1] > 0 and nbp > 0 then res[nobs+1:nobs+nbp] contains the additional continuity requirements at the break-points before the process broke off (res[nobs+i], i=1,...,nbp, contains the i-th continuity requirement in the above);
bp:     int bp[0:nbp];
        entry: bp[i], i=1,...,nbp, should correspond to the index of that time of observation which will be used as a break-point (1 ≤ bp[i] ≤ nobs); the break-points have to be ordered such that bp[i] ≤ bp[j] if i ≤ j;
        exit: with normal termination of the process bp[1:nbp] contains no information; otherwise, if out[1] > 0 and nbp > 0 then bp[i], i=1,...,nbp, contains the index


        of that time of observation which was used as a break-point before the process broke off;
jtjinv: float jtjinv[1:m,1:m];
        exit: the inverse of the matrix J'*J where J denotes the matrix of partial derivatives dres[i]/dpar[k] (i=1,...,nobs; k=1,...,m) and J' denotes the transpose of J; this matrix can be used if additional information about the result is required; e.g. statistical data such as the covariance matrix, correlation matrix and confidence intervals can easily be calculated from jtjinv and out[2];
in:     float in[0:6];
        entry:
        in[0]: the machine precision;
        in[1]: the ratio: the minimal steplength for the integration of the differential equations divided by the distance between two neighboring observations; mostly, a suitable value is 10^-4;
        in[2]: the relative local error bound for the integration process; this value should satisfy in[2] ≤ in[3]; this parameter controls the accuracy of the numerical integration; mostly, a suitable value is in[3]/100;
        in[3]: the relative tolerance for the difference between the Euclidean norm of the ultimate and penultimate residual vector; see in[4] below;
        in[4]: the absolute tolerance for the difference between the Euclidean norm of the ultimate and penultimate residual vector; the process is terminated if the improvement of the sum of squares is less than in[3] * (sum of squares) + in[4] * in[4]; in[3] and in[4] should be chosen in accordance with the relative and absolute errors in the observations; note that the Euclidean norm of the residual vector is defined as the square root of the sum of squares;
        in[5]: the maximum number of times that the integration of the differential equations is performed;
        in[6]: a starting value used for the relation between the gradient and the Gauss-Newton direction; if the problem is well conditioned then a suitable value for in[6] will be 0.01; if the problem is ill conditioned then in[6] should be greater, but the value of in[6] should satisfy: in[0] < in[6] ≤ 1/in[0];
out:    float out[1:7];
        exit:
        out[1]: this value gives information about the termination of the process;
                out[1]=0: normal termination;
                if out[1] > 0 then the process has been broken off and this may occur because of the following reasons:
                out[1]=1: the number of integrations performed exceeded the number given in in[5];
                out[1]=2: the differential equations are very nonlinear; during an integration the value of in[1] was decreased by a factor 10000 and it is advised to decrease in[1], although this will increase computing time;
                out[1]=3: a call of deriv delivered the value zero;
                out[1]=4: a call of jacdfdy delivered the value zero;
                out[1]=5: a call of jacdfdp delivered the value zero;
                out[1]=6: the precision asked for cannot be attained; this precision is possibly chosen too high, relative to the precision in which the residual vector is calculated (see in[3]);


        out[2]: the Euclidean norm of the residual vector calculated with values of the unknowns delivered;
        out[3]: the Euclidean norm of the residual vector calculated with the initial values of the unknown variables;
        out[4]: the number of integrations performed, needed to obtain the calculated result; if out[4] = 1 and out[1] > 0 then the matrix jtjinv cannot be used;
        out[5]: the maximum number of times that the requested local error bound was exceeded in one integration; if it is a large number then it may be better to decrease the value of in[1];
        out[6]: the improvement of the Euclidean norm of the residual vector in the last integration step of the process of Marquardt;
        out[7]: the condition number of J'*J, i.e. the ratio of its largest to smallest eigenvalues;
deriv:  int (*deriv)(n,m,par,y,t,df);
        entry: par[1:m] contains the current values of the unknowns and should not be altered; y[1:n] contains the solutions of the differential equations at time t and should not be altered;
        exit: an array element df[i] (i=1,...,n) should contain the right hand side of the i-th differential equation;
        after a successful call of deriv, the procedure should deliver the value nonzero; however, if deriv delivers the value zero then the process is terminated (see out[1]); hence, proper programming of deriv makes it possible to avoid calculation of the right hand side with values of the unknown variables which cause overflow in the computation;
jacdfdy: int (*jacdfdy)(n,m,par,y,t,fy);
        float fy[1:n,1:n];
        entry: for parameters par, y and t, see deriv above;
        exit: an array element fy[i,j] (i,j=1,...,n) should contain the partial derivative of the right hand side of the i-th differential equation with respect to y[j], i.e. df[i]/dy[j];
        the integer value should be assigned to this procedure in the same way as is done for the value of deriv;
jacdfdp: int (*jacdfdp)(n,m,par,y,t,fp);
        float fp[1:n,1:m];
        entry: for parameters par, y and t, see deriv above;
        exit: an array element fp[i,j] should contain the partial derivative of the right hand side of the i-th differential equation with respect to par[j], i.e. df[i]/dpar[j];
        the integer value should be assigned to this procedure in the same way as is done for the value of deriv;
callystart: void (*callystart)(n,m,par,y,ymax);
        entry: par[1:m] contains the current values of the unknown variables and should not be altered;
        exit: y[1:n] should contain the initial values of the corresponding differential equations; the initial values may be functions of the unknown variables par; in that case, the initial values of dy/dpar also have to be supplied; note that dy[i]/dpar[j] corresponds with y[5*n+j*n+i] (i=1,...,n, j=1,...,m);
              ymax[i], i=1,...,n, should contain a rough estimate to the maximal absolute value of y[i] over the integration interval;
data:   void (*data)(nobs,tobs,obs,cobs);
        this procedure takes the data to fit into the procedure peide;
        entry: nobs has the same meaning as in peide;
        exit:
        tobs: float tobs[0:nobs]; the array element tobs[0] should contain the time, corresponding to the initial values of y given in the procedure callystart; an array element tobs[i], 1 ≤ i ≤ nobs, should contain the i-th time of observation; the observations have to be ordered such that tobs[i] ≤ tobs[j] if i ≤ j;
        cobs: int cobs[1:nobs]; an array element cobs[i] should contain the component of y observed at time tobs[i]; note that 1 ≤ cobs[i] ≤ n;
        obs:  float obs[1:nobs]; an array element obs[i] should contain the observed value of the component cobs[i] of y at the time tobs[i];
monitor: void (*monitor)(post,ncol,nrow,par,res,weight,nis);
        int post,ncol,nrow,weight,nis;
        this procedure can be used to obtain information about the course of the iteration process; if no intermediate results are desired then a dummy procedure satisfies; inside peide, the procedure monitor is called at two different places and this is denoted by the value of post;
        post=1: monitor is called after an integration of the differential equations; at this place are available: the current values of the unknown variables par[1:ncol], where ncol=m+nbp, the calculated residual vector res[1:nrow], where nrow=nobs+nbp, and the value of nis, which is the number of integration steps performed during the solution of the last initial value problem;
        post=2: monitor is called before a minimization of the Euclidean norm of the residual vector with the Levenberg-Marquardt algorithm is started; available are the current values of par[1:ncol] and the value of the weight, with which the continuity requirements at the break-points are added to the original least squares problem.

Functions used: inivec, inimat, mulvec, mulrow, dupvec, dupmat, vecvec, matvec, elmvec, sol, dec, mulcol, tamvec, mattam, qrisngvaldec.
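The return-value convention of deriv can be illustrated with a one-equation model. The sketch below assumes the hypothetical differential equation y1' = -par[1]*y1 (n = 1, m = 1) and follows the argument list of the deriv parameter as it appears in the prototype of peide; the function name decay_deriv and the particular overflow guard are assumptions made for illustration. Arrays are indexed from 1, as elsewhere in the library.

```c
#include <math.h>
#include <float.h>

/* Hypothetical right hand side y1' = -par[1]*y1.
   Returns nonzero on success and zero if the product could
   overflow, which makes peide terminate with out[1]=3. */
int decay_deriv(int n, int m, float par[], float y[], float t, float df[])
{
    (void)n; (void)m; (void)t;   /* unused in this simple model */
    /* guard: refuse arguments whose product would exceed the float range */
    if (fabsf(par[1]) > 1.0f && fabsf(y[1]) > FLT_MAX / fabsf(par[1]))
        return 0;
    df[1] = -par[1] * y[1];
    return 1;
}
```

The procedures jacdfdy and jacdfdp can report failure in the same way, giving out[1]=4 and out[1]=5 respectively.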

void peide(int n, int m, int nobs, int *nbp, float parll, float res[l, int bpll, float **jtjinv, float in [I , float out [I , int (*deriv)(int,int,float [I,float [I,float,float [I) , int (*jacdfdy)(int,int,float [I ,float [I ,float,float * * ) , int (*jacdfdp)(int,int,float [I ,float [I , float,float * * ) , void (*callystart)(int,int,float [I,float [I,float [I ) , void (*data)(int,float [I , float [I , int [I ) , void (*monitor)(int,int,int,float [ I , float [I ,int,int)) int *allocate-integer-vector(int, int) ; float *allocate-real-vector(int, int); float **allocate-real-matrix(int, int, int, int); void free-integer-vector(int *, int); void free-real-vector(f1oat *, int) ; void free-real-rnatrix(f1oat **, int, int, int) ; int peidefunct(int nrow, int ncol, float par[], float res[l, int n, int m, int nobs, int *nbp, int first, int *sec,


int *max, int *nis, float epsl, int weight, int bp 11 , float save [I , float pax[] , float y [I , float **yp, float **fy, float **fp, int cobsll, float tobs[l, float obscl, float in[], float aux[l, int clean, int (*deriv)(int,int,float [I ,float [I ,float,float [I ) , int (*jacdfdy)(int,int,float 11 ,float [I , float,float * * ) , int ( *jacdfdp) (int,int , float [I , float [I , float,float * * ) , void (*callystart)(int,int,float [I , float 11 , float [I ) , void (*monitor)(int,int,int,float [I,float [I,int,int)) ; void inivec (int, int, float [I , float); void inimat(int, int, int, int, float **, float); void mulvec (int, int, int, float [I , float [I , float); void mulrow(int, int, int, int, float **, float **, float); void dupvec (int, int, int, float 11 , float 11 ) ; void dupmat (int, int, int, int, float **, float * * ) ; float vecvec (int, int, int, float [I, float [I ) ; float matvec(int, int, int, float **, float [I ) ; void elmvec (int, int, int, float [I, float [I, float) ; void sol(f1oat **, int, int [I, float [I); void dec (float **, int, float [I , int [I ) ; void mulcol(int, int, int, int, float * * , float **, float); float tamvec (int, int, int, float **, float [I ) ; float mattam(int, int, int, int, float * * , float * * ) ; int qrisngvaldec (float **, int,int,float [I , float **, float [I ) ; int i,j,weight,ncol,nrow,away,max,nfe,nis,*cobs, first,sec,clean,nbp~ld,maxfe,fe,it,err,emergency; float epsl,resl,in3,in4,fac3,fac4,aux[4l,*obs,*save,*tobs, **yp,*ymax,*y,**fy,**fp,w,**aid,temp, ~,~~,~2,mu,res2,fpar,fparpres,lambda,lambdamin,p,pw~ reltolres,abstolres,em[8],*val,*b,*bb,*parpres,**jaco; static float save1[35]=(1.0, 1.0, 9.0, 4.0, 0.0, 2.0/3.0, 1.0, 1.0/3.0, 36.0, 20.25, 1.0, 6.0/11.0, 1.0, 6.0/11.0, 1.0/11.0, 84.028, 53.778, 0.25, 0.48, 1.0, 0.7, 0.2, 0.02, 156.25, 108.51, 0.027778, 120.0/274.0, 1.0, 225.0/274.0, 85.0/274.0, 15.0/274.0, 1.0/274.0, 0.0, 187.69, 0.0047361); nbpold= ( *nbp) ; cobs=allocate~integer~vector(1,nobs); 
obs=allocate~real~vector(l,nobs); save=allocate-real-vector(-38,6*n); ; tobs=allocate~real~vector(0,nobs) ymax=allocate-real-vector(1,n); y=allocate real vector (l,6*n*(nbpold+m+l)) ; yp=allocat~reai~matrix(1,nbpold+nobs,l,nbpold+m) ; fy=allocate-real-matrix(l,n,l,n); fp=allocate real matrix (1,n,l,m+nbpold); aid=allocat~rea~~matrix(1,m+nbpold,1,m+nbpold); for (i=O; i 0) ; aux 121 =FLT-EPSILON; epsl=l.OelO; out 111 =o. 0; bp [Ol =max=O; / * smooth integration without break-points * / if (!peidefunct(nobs,m,par,res, n,m,nobs,nbp,first,&sec,&max,&nis,epsl,weight,bp, save,ymax,y,yp,fy,fp,cobs,tobs,obs,in,aux,clean,deriv, jacdfdy,jacdfdp,callystart,monitor)) goto Escape; reslzsqrt (vecvec(1,nobs ,0,res , res)) ; nfe=l; if (in151 == 1.0) { out Ell =l.O; goto Escape; 1

if (clean) { first=l; clean=O ; fac3=sqrt(sqrt(in[3l /resl)) fac4=sqrt(sqrt(in[41/resl))


; ;

epsl=resl*fac4; if ( !peidefunct(nobs,m,par, res, n,m,nobs,nbp,first,&sec,&max,&nis,epsl,weight,bp, save,ymax,y,yp,fy,fp,cobs,tobs,obs,in,aux,clean,deriv, jacdfdy,jacdfdp,callystart,monitor)) goto Escape;

first=O; ) else. nfe=0; ncol=m+(*nbp); nrow=nobs+ (*nbp); sec=l; in3=in [31 ; in4=in[41 ; in [31=resl; weight=away=O; out [41=out [51=w=0-0; temp=sqrt(weight)+l.0; weight=temp*temp; while (weight ! = 16 && *nbp > 0) ( if (away == 0 && w ! = 0.0) ( / * if no break-points were omitted then one function function evaluation is saved * / w=weight/w; for (i=nobs+l;ic=nrow; i++) ( for (j=l; jc=ncol; j++) yp[il [jl *= w; res [il *= w;

1 in [31 *= fac3*weight; in [41=epsl; (*monitor)(2,ncol,nrow,par,res,weight,nis); / * marquardt's method * / val=allocate~real~vector(1,ncol); b=allocate-real-vector(1,ncol); bb=allocate-real-vector(1,ncol); parpres=allocate-real-vector(1,ncol); jaco=allocate~real~matrix(l,nrow,l,ncol);

w=lO.O ; w2=0.5 ; mu=O.01; ww = (in[61 c 1.0e-7) ? 1.0e-8 : l.0e-l*in[61; em [Ol=em[21 =em[61 =in 101 ; em[4] =lO*ncol; reltolres=in[31 ; abstolres=in[41*in [41 ; maxfe=in [51 ; err=O; fe=it=l; p=fpar=res2=0.0; pw = -log(ww*in[Ol ) /2.30; if ( !peidefunct(nrow,ncol,par,res, n,m,nobs,nbp,first,&sec,&max,&nis,epsl, weight,bp,save,ymax,y,yp,fy,fp,cobs,tobs,obs, in,aux,clean,deriv,jacdfdy,jacdfdp,

callystart,monitor)) err=3; else ( fpar=vecvec(l,nrow,0,res,res) ; out [31=sqrt (fpar); emergency=O; it=l; do ( dupmat (1,nrow,1,ncol,jaco,yp) ; i=qrisngvaldec(jaco,nrow,ncol,val,aid,em); if (it == 1) lambdazin [6]*vecvec (l,ncol,O,val,val); else if (p == 0.0) lambda *= w2; for (i=l; ic=ncol; i++) b [i]=val [il*tamvec (l,nrow, i,jaco,res) ; while (1) { for (i=l; i= pw) ( err=4; ernergency=l; break; 1

) elke ( dupvec (1,ncol,0,par,parpres) ; fpar=fparpres; break;

1

if (emergency) break; it++; ) while (fpar>abstolres && res2>reltolres*fpar+abstolres); for (i=l; i tol) ( fails++; if (h > l.l*hmin) ( if (fails > 2) { peidereset (n,k,hmin,hmax,hold,xold,y,save,&ch,&x, &h,&decompose) ; goto Newstart; ) else ( / * calculate step and order * / peidestep (n,k,fails,tolup,toldwn,tol,error,delta, lastdelta,y,ymax,&knew,&chnew) ; if (knew ! = k) ( k=knew; peideorder(n,k,eps,a,save,&tol,&tolup,

&toldwn,&tolconv,&aO,&decompose) ;

1

bh *= chnew; peidereset (n,k,hmin,hmax,hold,xold,y,save,&ch,&x, &h,&decompose) ;

) el$e { if (k == 1) { / * violate eps criterion * / save[-21 += 1.0; same=4; goto Errortestok;

I

k=1; peidereset (n,k,hmin,hmax,hold,xold,y,save,&ch,&x, &h,&decompose) ; peideorder (n,k,eps,a,save,&tol,&tolup, &toldwn,&tolconv,&aO,&decompose) ; same=2;

) el'se Errortestok: fails=O; for (i=l; i 1.1) ( if (k ! = knew) { if (knew > k)


I

same=k+l; if (chnew*h > hmax) chnew=hmax/h; h *= chnew; C=1.0; for (j=n; jc=k*n; j += n) { c *= chnew; rnulvec(j+l,j+n,O,y,y,c); decompose=l; ) else same=lO;

1

(*nis)++;

/ * start of an integration step of yp * /

if (clean) { hold=h; xold=x; kpold=k; ch=l.0 ; dupyec(l,k*n+n,O,save,y) ; ) else { (h ! = hold) { ch=h/hold; c=1.0; for (j=n6+nnpar; jc=kpold*nnpar+n6; j += nnpar) { c *= ch; for (i=j+l; ic=j+nnpar; i++) y[il *= c;

\

xold=x; kpold=k; ch=l.0 ; dupvec(l,k*n+n,O,save,y) ; / * evaluate jacobian * / evaluate=O; decompose=evaluated=l; if ( ! (*jacdfdy)(n,m,par,y,x,f~)) { save[-31=4.0; goto Finish;

)* decompose jacobian * / decompose=O; c = -aO*h; for (j=l; jc=n; j++) { for (i=l; ic=n; i++) jacob[il [jl=fy[il [jl*c; jacob[jl [jl += 1.0;

1

dec(jacob,n,aux,p) ; if ( ! (*jacdfdp)(n,m,par,y,x,fp))( save[-31=5.0; goto Finish;

1 if (npar > m) inimat (l,n,m+l,npar, fp,0.0) ; / * prediction * / for ( 1 ~ 0 ;lc=k-1; 1++) for (j=k-1;j>=l; j--) elmvec(j*nnpar+n6+l,j*nnpar+n6+nnpar,nnpart y,y,1.0); / * correction * / for (j=l; js=npar; j++) ( jSn= (j+S)*n; dupvec(l,n,jSn,yO,y); for (i=l; ic=n; i++) df [i]=h*(fp[il[jl+matvec(l,n,i,fy,yO))y [nnpar+j5n+il ;


sol(jacob,n,p,df) ; for (1=0; l= t) { / * calculate a row of the jacobian matrix and an element of the residual vector * / tobsdif = (tobs[iil -x)/h; cobsii=cobs[iil ; res [iil=peideinterpol(cobsii,n,k,tobsdif,y)-obs[iil ; if (!clean) ( for (i=l; i 0.0) ? 0.5*logoneplusx(2.0*ax/(l.0-ax)) : -0.5*logoneplusx(2.0*ax/(l.O-ax))));

6.1.2 Logarithmic functions

logoneplusx

Computes the function ln(1 + x) for x > -1. For values of x near zero, computing ln(1 + x) by use of the formulae z = 1 + x, ln(1 + x) = ln(z) loses relative accuracy (since 1 + x ≈ 1 for small x); the use of logoneplusx avoids this loss [HaCL68]. For x < -0.2929 or x > 0.4142, ln(1 + x) is evaluated by use of the standard function ln directly; otherwise, a polynomial expression of the form

$$\ln(1+x) \approx z \sum_{i=0}^{6} c_i z^{2i}, \qquad z = \frac{x}{x+2},$$

is used, and for small x loss of relative accuracy does not take place.

Function Parameters:
float logoneplusx (x)
logoneplusx: delivers the value of ln(1+x);
x: float; entry: the real argument of ln(1+x), x > -1.

#include <math.h>

float logoneplusx(float x)
{
	float y,z;

	if (x == 0.0)
		return 0.0;
	else if (x < -0.2928 || x > 0.4142)
		return log(1.0+x);
	else {
		z=x/(x+2.0);
		y=z*z;
		return z*(2.0+y*(0.666666666663366+y*(0.400000001206045+y*
			(0.285714091590488+y*(0.22223823332791+y*
			(0.1811136267967+y*0.16948212488))))));
	}
}
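The substitution z = x/(x+2) used above rests on the identity ln(1+x) = ln((1+z)/(1-z)) = 2(z + z^3/3 + z^5/5 + ...). The following is a minimal double-precision sketch of that idea, not the library routine: it sums the plain Taylor coefficients 2/(2i+1) rather than the tuned coefficients of logoneplusx, and the name log1p_series and its terms argument are hypothetical.

```c
#include <assert.h>
#include <math.h>

/* Illustrative only: ln(1+x) = 2*(z + z^3/3 + z^5/5 + ...), z = x/(x+2).
   Uses the untuned Taylor coefficients 2/(2i+1), not the library's. */
double log1p_series(double x, int terms)
{
	double z, y, zpow, sum = 0.0;
	int i;

	assert(x > -1.0 && terms > 0);
	z = x / (x + 2.0);
	y = z * z;
	zpow = z;                         /* z^(2i+1), starting at i = 0 */
	for (i = 0; i < terms; i++) {
		sum += 2.0 * zpow / (2 * i + 1);
		zpow *= y;
	}
	return sum;
}
```

Because x enters only through z, no term 1 + x is ever formed, so a tiny x keeps its full relative accuracy. The C99 function log1p offers a convenient reference value for comparison.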

6.2 Exponential integral

6.2.1 Exponential integral

A. ei

Calculates the exponential integral

$$\mathrm{Ei}(x) = \int_{-\infty}^{x} \frac{e^{t}}{t}\,dt,$$

where the integral is to be interpreted as the Cauchy principal value. When x > 0, the related function

$$E_1(x) = \int_{x}^{\infty} \frac{e^{-t}}{t}\,dt$$

may be obtained from Ei by use of the relationship $E_1(x) = -\mathrm{Ei}(-x)$. For x=0 the integral is undefined and the procedure will cause overflow.

The exponential integral is computed by means of the rational Chebyshev approximations given in [AbS65, CodT68, CodT69]. Only ratios of polynomials with equal degree l are considered.

Function Parameters:
float ei (x)
ei: delivers the value of the exponential integral;
x: float; entry: the argument of the integral.

Functions used: chepolsum, pol, jfrac.

float ei(float x)
{
	float chepolsum(int, float, float []);
	float pol(int, float, float []);
	float jfrac(int, float [], float []);
	float p[8],q[8];

if (X > 24.0) { p [O]= 1.00000000000058; q[l] = 1.99999999924131; q[21 = -2.99996432944446; p [l]=x-3.OOOOOOl6782085; q [31 = -7.90404992298926; p [21 =x-5.00140345515924; q[41 = -4.31325836146628; p [31 =x-7.49289167792884; q[51 = 2.95999399486831eZ; p t41 =x-3.08336269051763el; p[5] =x-1.39381360364405; qt61 = -6.74704580465832; p 161 =x+8. 91263822573708; qt71 = 1.04745362652468e3; p [71 =x-5.31686623494482el; return exp(x)* (1.0+jfrac(7,q,p)/x)/x; ) else if (x > 12.0) { p LO1 = 9.99994296074708e-1; q[ll = 1.00083867402639; p [l]=X-1.95022321289660; q[2] = -3.43942266899870; p [21=~+1.75656315469614; q[31 = 2.89516727925135el; p [31 =x+l.79601688769252el; q [4] = 7.60761148007735e2; p [4]=x-3.23467330305403el; q[51 = 2.57776384238440el; p [Sl =x-8.28561994140641; q[61 = 5.72837193837324el; p [61 =x-1.86545454883399el; qt71 = 6.95000655887434el; p [7l =x-3 .48334653602853; return exp(x)*jfrac(7,q,p)/x; ) else if (x > 6.0) { q [l] = 5.27468851962908e-1; p [O]= 1.00443109228078; q t21 = 2.73624119889328e3; p 111 =x-4.32531132878l35el; q[3] = 1.43256738121938el; p [21 =x+6.01217990830080el; q[4] = 1.00367439516726e3; p Dl =x-3.31842531997221el; q[51 = -6.25041161671876; p 141 =x+2.50762811293561el; p [51=~+9.30816385662165; qt61 = 3.00892648372915eZ; a[71- = 3.93707701852715: P [61 =x-2.19010233854880el: [71 =x-2 .18086381520724; . return exp(x)*jfrac(7,q,p)/x; ) else if (x > 0.0) { float t,r,xO,xmxO; q[Ol = -8.26271498626055e7 p [O]= -1.95773036904548e8; p 111 = 3.89280421311201e6; q[ll = 8.91925767575612e7 v t21= -2.21744627758845e7; a I21 = -2.49033375740540e7

p


a -

else { float z,z2; p fO] = 0.837207933976075el; p [l] = -0.652268740837103el; p [21 = 0.569955700306720; q[Ol = 0.418603966988037el; q[l] = -0.465669026080814el; q[21 = O.lel; z=xmxo/ (x+x0); z2=z*z; t=z*pol(2,z2,p)/pol(2,z2,q); keturn t+xmxO*r;

) else if (x > -1.0) { float y; p fO] = -4.41785471728217e4; q[0] =7.65373323337614e4; p [l] = 5.77217247139444e4; q[1] =3.25971881290275e4; p [Z] = 9.93831388962037e3; q[2] =6.10610794245759e3; p[31 = 1.84211088668000e3; q[3]=6.35419418378382e2; p [4] = 1.01093806161906e2; q [4]=3.72298352833327el; p [5l = 5.03416184097568; q [5l =l. 0; y = -x; return log(y)-pol(5,y,p)/po1(5,y,q); ) else if (x > -4.0) ( float y; p [01=8.67745954838444e-8; q[01=1.0; p [I]=9.9999551930139Oe-1; q[1] =1.28481935379157el; p I21 =l. 18483lO5554946el; q [2]=5.64433569561803el; p I31 =4.55930644253390el; q [3]=I. 06645183769914e2; p [4]=6.99279451291003el; q [4]=8.9731109712529Oel; p [51=4.2520203476884lel; q[5] =3.1497184917044lel; p [61=8.83671808803844; sf61=3.79559003762122; p[71=4.01377664940665e-1; q[7]=9.08804569188869e-2; y = -1.o/x; return -exp(x)*pol(7,y,pl/pol(5,y,q); ) else ( float y; p [Ol = -9.99999999998447e-1; q [OI =1.0; q [11=2.86271060422192el; p [ll = -2.66271060431811el; p [ZI = -2.41055827097015e2; q[21=~.92310039388533e2; p [31 = -8.95927957772937e~; q[31=1.33278537748257e3; q[41=2.77761949509163e3; p [41 = -1.29885688756484e3; p [51 = -5.45374158883133e2; q[51=2.40401713225909e3; p [61 = -5.66575206533869; q[61=6.3165748328080Oe2; y = -1.o/x; return -exp(x)*y* (l.O+y*pol(6,y,p)/pol(5,y,q));

1

1

B. eialpha

Calculates a sequence of integrals [AbS65] of the form

$$\alpha_i(x) = \int_1^{\infty} e^{-xt}\, t^{i}\,dt \qquad (i = 0, \ldots, n)$$

by use of the recursion

$$\alpha_0(x) = x^{-1} e^{-x}, \qquad \alpha_i(x) = \alpha_0(x) + (i/x)\,\alpha_{i-1}(x) \qquad (i = 1, \ldots, n).$$

Function Parameters:
void eialpha (x,n,alpha)
x: float; entry: the real x in the integrand;
n: int; entry: the integer n in the integrand;
alpha: float alpha[0:n]; exit: the value of the integral is stored in alpha[i], i=0,...,n.

#include <math.h>

void eialpha(float x, int n, float alpha[])
{
	int k;
	float a,b,c;

	c=1.0/x;
	a=exp(-x);
	b=alpha[0]=a*c;
	for (k=1; k<=n; k++) alpha[k]=b=(a+k*b)*c;
}

C. enx

... to compute the required values of the functions (1). The successive convergents $C_r(x)$ of expansion (2) are computed until $|1 - C_r(x)/C_{r+1}(x)| \le \delta$, where δ is the value of the machine precision.

Function Parameters:
void enx (x,n1,n2,a)
x: float; entry: the real positive x in the integrand;
n1,n2: int; entry: lower and upper bound, respectively, of the integer n in the integrand;
a: float a[n1:n2]; exit: the value of the integral is stored in a[i], i=n1,...,n2.

Functions used: ei, nonexpenx.

void enx(f1oat x, int nl, int n2, float a[]) 1

if (X 1) e=exp (-x); for (i=2; i= nl) a[il =w; 1

1

eLe { int i,n; float w,e,an; n=ceil (x); if (n 1.0e-15*w); ) else { float *allocate-real-vector(int, int) ; void free-real-vector(f1oat *, int); void nonexpenx (float, int, int, float [I ) float *b; b=allocate-real-vector(n,n); nonexpenx (x,n,n,b) ; w=b [nl*exp ( -x); free-real-vector (b,n) ; 1 I if (nl == n2 && nl == n) a [nl =w; else { e=exp(-x); an=w; if (n = nl) a [nl=w; for (i=n-1;i>=nl; i--1 { w= (e-i*w)/x; if (i = nl) a[il =w;

1

D. nonexpenx

Calculates a sequence of integrals of the form

$$\alpha_n(x) = e^{x} \int_1^{\infty} \frac{e^{-xt}}{t^{n}}\,dt \qquad (n = n1, \ldots, n2;\ x > 0). \tag{1}$$

The value of $\alpha_{n_0}(x)$, where $n_0 = \lceil x \rceil$, is first computed using the methods (a) and (b) described in the documentation to enx if $n_0 \le 10$ (calling enx for this purpose) and the continued fraction expansion (2) of that documentation if $n_0 > 10$. Thereafter, the recursion

$$\alpha_{n+1}(x) = n^{-1}\{1 - x\,\alpha_n(x)\}$$

is used as described in the documentation to enx to compute the required values of the functions (1). See [AbS65, CodT68, CodT69, G61, G73].

Function Parameters:
void nonexpenx (x,n1,n2,a)
x: float; entry: the real positive x in the integrand;
n1,n2: int; entry: lower and upper bound, respectively, of the integer n in the integrand;
a: float a[n1:n2]; exit: the value of the integral is stored in a[i], i=n1,...,n2.

Function used: enx.

void nonexpenx(f1oat x, int nl, int n2, float a [ I I 1

)

int i,n; float w,an; n = (X C = 1.5) ? 1 : ceil (x); if (n =l; i - - ) { u=uo ; uO=z*uO+b [il -ul; u1=u; 1

I

1

1

return (uO*y+0.491415393029387-ul)* (x-1.0)* (x-2.O)+f;

D. incomgam

Computes the incomplete gamma functions based on Padé approximations [AbS65, Lu70]. incomgam evaluates the functions

$$\gamma(a,x) = \int_0^{x} t^{a-1} e^{-t}\,dt, \qquad \Gamma(a,x) = \int_x^{\infty} t^{a-1} e^{-t}\,dt \qquad (a > 0,\ x \ge 0)$$

to a prescribed relative accuracy ε. If (a) a, x < 3 or (b) x < a and a ≥ 3, γ(a,x) is computed by use of a continued fraction derived from a power series in ascending powers of x for this function, and Γ(a,x) is determined from the relationship Γ(a,x) = Γ(a) - γ(a,x). If neither of the above conditions holds, Γ(a,x) is computed by use of a continued fraction derived from an asymptotic series in descending powers of x for this function, and the relationship γ(a,x) = Γ(a) - Γ(a,x) is used. The relative accuracy of the results depends not only on the quantity ε, but also on the accuracy of the functions exp and gamma. Especially for large values of x and a, the desired accuracy cannot be guaranteed.

Function Parameters:
void incomgam (x,a,klgam,grgam,gam,eps)
x: float; entry: the independent argument x, x ≥ 0;
a: float; entry: the independent parameter a, a > 0;
klgam: float *; exit: the integral γ(a,x) is delivered in klgam;
grgam: float *; exit: the integral Γ(a,x) is delivered in grgam;
gam: float; entry: the value of Γ(a); for this expression, the procedure gamma may be used;
eps: float; entry: the desired relative accuracy (value of ε above); the value of eps should not be smaller than the machine accuracy.

void incorngarn(f1oat x, float a, float *klgam, float *grgam, float gam, float eps) {

int n; float cO,cl,c2,do,dl,d2 ,x2,ax,p,q,r,s,rl,r2,scf; s=exp(-x+a*log(x)) ; scf=FLT-MAX; if (X C = ((a c 3.0) ? 1.0 : a)) ( x2=x*x; ax=a*x; d0=1.0; p=a; co=s; dl= (a+l.0)* (a+2.0-x); cl=( (a+l.O)* (a+2.0)+x) *s; r2=cl/dl; n=l ; do ( p += 2.0; q= (p+1.0)* (p*(p+2.0)-ax); r=n* (n+a)* (p+2.0)*x2; c2= (q*cl+r*cO)/p; d2= (q*dl+r*dO)/p; rl=r2: r2=c2/d2; co=c1; c1=c2; dO=dl; dl=d2; if (fabs(cl) > scf I I fabs (dl) > scf cO / = scf; Cl / = scf; do / = scf; dl / = scf;

1

} while' (fabs( (r2-rl)/r2) > eps) ; *klgam = r2/a; *grgam = gam- (*klgarn); } else ( co=a*s; c1= (1.O+X)*co; q=x+2.0-a; dO=x; dl=x*q; rl=cl/dl; n=l; do ( q += 2.0; r=n* (n+l-a); c2=q*cl-r*c0; d2=q*dl-r*dO; rl=r2;


r2=c2/d2; co=c1; c1=c2 ; dO=dl ; dl=d2 ; if (fabs (cl) > scf co / = scf; Cl / = scf; do / = scf; dl /= S C ~ ;

II

fabs (dl) > scf) {

E. incbeta

The incomplete beta function is defined as

$$B_x(p,q) = \int_0^{x} t^{p-1} (1-t)^{q-1}\,dt,$$

p > 0, q > 0, 0 ≤ x ≤ 1, and the incomplete beta function ratio is $I_x(p,q) = B_x(p,q)/B_1(p,q)$. incbeta computes $I_x(p,q)$ for 0 ≤ x ≤ 1, p > 0, q > 0 by use of the continued fraction corresponding to formula 26.5.8 in [AbS65], see also [G64, G67]. If x > 0.5 then the relation $I_x(p,q) = 1 - I_{1-x}(q,p)$ is used. The value of the continued fraction is approximated by the convergent $C_r(x)$, where r is the smallest integer for which $|1 - C_{r-1}(x)/C_r(x)| \le \varepsilon$, ε being a small positive real number supplied by the user. It is advised to use in incbeta only small values of p and q, say 0

0 ; eps: float; entry: the desired relative accuracy (value of e above); the value of eps should not be smaller than the machine accuracy. Function used: gamma.

#include <math.h>

float incbeta(float x, float p, float q, float eps)
{
	float gamma(float);
	int m,n,neven,recur;
	float g,f,fn,fn1,fn2,gn,gn1,gn2,dn,pq;

	if (x == 0.0 || x == 1.0)
		return x;
	else {
		if (x > 0.5) {
			f=p;
			p=q;
			q=f;
			x=1.0-x;
			recur=1;
		} else
			recur=0;
		g=fn2=0.0;
		m=0;
		pq=p+q;
		f=fn1=gn1=gn2=1.0;
		neven=0;
		n=1;
		do {
			if (neven) {
				m++;
				dn=m*x*(q-m)/(p+n-1.0)/(p+n);
			} else
				dn = -x*(p+m)*(pq+m)/(p+n-1.0)/(p+n);
			g=f;
			fn=fn1+dn*fn2;
			gn=gn1+dn*gn2;
			neven=(!neven);
			f=fn/gn;
			fn2=fn1;
			fn1=fn;
			gn2=gn1;
			gn1=gn;
			n++;
		} while (fabs((f-g)/f) > eps);
		f=f*pow(x,p)*pow(1.0-x,q)*gamma(p+q)/gamma(p+1.0)/gamma(q);
		if (recur) f=1.0-f;
		return f;
	}
}

F. ibpplusn

The incomplete beta function is defined as

$$B_x(p,q) = \int_0^{x} t^{p-1} (1-t)^{q-1}\,dt,$$

p > 0, q > 0, 0 ≤ x ≤ 1, and the incomplete beta function ratio is $I_x(p,q) = B_x(p,q)/B_1(p,q)$. ibpplusn computes $I_x(p+n,q)$ for n=0,1,...,nmax, 0 ≤ x ≤ 1, p > 0, q > 0 (see [G64, G67]). In [G64] the procedure ibpplusn is called "incomplete beta q fixed". There is no control on the parameters x, p, q, nmax for their intended ranges.

Function Parameters:
void ibpplusn (x,p,q,nmax,eps,i)
x: float; entry: this argument should satisfy: 0 ≤ x ≤ 1;
p: float; entry: the parameter p, p > 0; it is advised to take 0 < p ≤ 1;
q: float; entry: the parameter q, q > 0;
nmax: int; entry: nmax indicates the maximum number of function values $I_x(p+n,q)$ to be generated;
eps: float; entry: the desired relative accuracy (value of ε above); the value of eps should not be smaller than the machine accuracy;
i: float i[0:nmax]; exit: i[n] = $I_x(p+n,q)$ for n=0,1,...,nmax.

Functions used: ixqfix, ixpfix.

void ibpplusn(f1oat x, float p, float q, int nmax, float eps, float i [I ) void ixqfix (float, float, float, int, float, float [I) void ixpfix (float, float, float, int, float, float [I) int n;

; ;

if (X == 0.0 I I x == 1.0) for (n=O; n 0, 0 5 x 5 1, and the incomplete beta function ratio is I,(p,q) = B,(p,q) /B,(p,q). ibqplusn computes I,(p,q+n) for n=O,l,...,nmax, 0 5 x 5 I, p > 0, q > 0 (see [G64, G671). In [G64] the procedure ibqplusn is called "incomplete beta p fixed". There is no control on the parameters x, p, q, nmax for their intended ranges. Function Parameters:

void ibqplusn (x,p,q,nmax,eps,i)
x: float; entry: this argument should satisfy: 0 ≤ x ≤ 1;
p: float; entry: the parameter p, p > 0;
q: float; entry: the parameter q, q > 0; it is advised to take 0 < q ≤ 1;
nmax: int; entry: nmax indicates the maximum number of function values $I_x(p,q+n)$ to be generated;
eps: float; entry: the desired relative accuracy (value of ε above); the value of eps should not be smaller than the machine accuracy;
i: float i[0:nmax]; exit: i[n] = $I_x(p,q+n)$ for n=0,1,...,nmax.

Functions used: ixqfix, ixpfix.

void ibqplusn(f1oat x, float p, float q, int nmax, float eps, float i[l) I

if (X == 0.0 1 1 x == 1.0) for (n=O;ns=nmax; n++) i [nl=x; else { if (X 1-x, yet another similar expansion with j=12 is used. is used, and over the range OS* Function Parameters: void inverseerrorfunction (x,oneminx,inverfl x: float; entry: the argument of inverseerrorfunction(x); it is necessary that -1 < x < 1; if I x I > 0.8 then the value of x is not used in the procedure; oneminx: float; entry: if Ix 1 5 0.8 then the value of oneminx is not used in the procedure; if 1x1 > 0.8 then oneminx has to contain the value of 1-1x1; in the case that I X I is in the neighborhood of 1, cancellation of digits take place in the calculation of 1- I x 1 ; if the value 1- I x 1 is known exactly from another source then oneminx has to contain this value, which will give better results; inverj float *; exit: the result of the procedure. Function used: chepolsum.

#include <math.h>
#include <float.h>

void inverseerrorfunction(float x, float oneminx, float *inverf)
{
	float chepolsum(int, float, float []);
	float absx,p,betax,a[24];

	absx=fabs(x);
	if (absx > 0.8 && oneminx > 0.2) oneminx=0.0;
	if (absx <= 0.8) {
		a[0] = 0.992885376618941;  a[1] = 0.120467516143104;
		a[2] = 0.016078199342100;  a[3] = 0.002686704437162;
		a[4] = 0.000499634730236;  a[5] = 0.000098898218599;
		a[6] = 0.000020391812764;  a[7] = 0.000004327271618;
		a[8] = 0.000000938081413;  a[9] = 0.000000206734720;
		a[10] = 0.000000046159699; a[11] = 0.000000010416680;
		a[12] = 0.000000002371501; a[13] = 0.000000000543928;
		a[14] = 0.000000000125549; a[15] = 0.000000000029138;
		a[16] = 0.000000000006795; a[17] = 0.000000000001591;
		a[18] = 0.000000000000374; a[19] = 0.000000000000088;
		a[20] = 0.000000000000021; a[21] = 0.000000000000005;
		*inverf = chepolsum(21,x*x/0.32-1.0,a)*x;
	} else if (oneminx >= 25.0e-4) {
		a[0] = 0.912158803417554;   a[1] = -0.016266281867664;
		a[2] = 0.000433556472949;   a[3] = 0.000214438570074;
		a[4] = 0.000002625751076;   a[5] = -0.000003021091050;
		a[6] = -0.000000012406062;  a[7] = 0.000000062406609;
		a[8] = -0.000000000540125;  a[9] = -0.000000001423208;
		a[10] = 0.000000000034384;  a[11] = 0.000000000033584;
		a[12] = -0.000000000001458; a[13] = -0.000000000000810;
		a[14] = 0.000000000000053;  a[15] = 0.000000000000020;
		betax=sqrt(-log((1.0+absx)*oneminx));
		p = -1.54881304237326*betax+2.56549012314782;
		p=chepolsum(15,p,a);
		*inverf = (x < 0.0) ? -betax*p : betax*p;
	} else if (oneminx >= 5.0e-16) {
		a[0] = 0.956679709020493;   a[1] = -0.023107004309065;
		a[2] = -0.004374236097508;  a[3] = -0.000576503422651;
		a[4] = -0.000010961022307;  a[5] = 0.000025108547025;
		a[6] = 0.000010562336068;   a[7] = 0.000002754412330;
		a[8] = 0.000000432484498;   a[9] = -0.000000020530337;
		a[10] = -0.000000043891537; a[11] = -0.000000017684010;
		a[12] = -0.000000003991289; a[13] = -0.000000000186932;
		a[14] = 0.000000000272923;  a[15] = 0.000000000132817;
		a[16] = 0.000000000031834;  a[17] = 0.000000000001670;
		a[18] = -0.000000000002036; a[19] = -0.000000000000965;
		a[20] = -0.000000000000220; a[21] = -0.000000000000010;
		a[22] = 0.000000000000014;  a[23] = 0.000000000000006;
		betax=sqrt(-log((1.0+absx)*oneminx));
		p = -0.559457631329832*betax+2.28791571626336;
		p=chepolsum(23,p,a);
		*inverf = (x < 0.0) ? -betax*p : betax*p;
	} else if (oneminx >= FLT_MIN) {
		a[0] = 0.988575064066189;   a[1] = 0.010857705184599;
		a[2] = -0.001751165102763;  a[3] = 0.000021196993207;
		a[4] = 0.000015664871404;   a[5] = -0.000000519041687;
		a[6] = -0.000000037135790;  a[7] = 0.000000001217431;
		a[8] = -0.000000000176812;  a[9] = -0.000000000011937;
		a[10] = 0.000000000000380;  a[11] = -0.000000000000066;
		a[12] = -0.000000000000009;
		betax=sqrt(-log((1.0+absx)*oneminx));
		p = -9.19999235883015/sqrt(betax)+2.79499082012460;
		p=chepolsum(12,p,a);
		*inverf = (x < 0.0) ? -betax*p : betax*p;
	} else
		*inverf = (x > 0.0) ? 26.0 : -26.0;
}

D. fresnel

Evaluates the Fresnel integrals

$$C(x) = \int_0^{x} \cos\left(\tfrac{\pi}{2} t^2\right) dt, \qquad S(x) = \int_0^{x} \sin\left(\tfrac{\pi}{2} t^2\right) dt.$$

If |x| ≤ 1.2 then S(x) is approximated by means of a Chebyshev rational function (see [Cod68]) of the form

$$S(x) \approx x^{3}\, \frac{\sum_{i=0}^{j} a_i x^{4i}}{\sum_{i=0}^{j} b_i x^{4i}}$$

and C(x) by a similar function of the form

$$C(x) \approx x\, \frac{\sum_{i=0}^{j} c_i x^{4i}}{\sum_{i=0}^{j} d_i x^{4i}}$$

where j=4. When 1.2 < |x| ≤ 1.6, similar approximations with j=5 are used. Over the range 1.6 < |x|, the functions f(x) and g(x) occurring in the formulae

$$S(x) = \tfrac{1}{2} - f(x)\cos(\pi x^2/2) - g(x)\sin(\pi x^2/2)$$
$$C(x) = \tfrac{1}{2} + f(x)\sin(\pi x^2/2) - g(x)\cos(\pi x^2/2)$$

are evaluated by means of a call of fg, and the values of S(x) and C(x) are obtained by use of these relationships.

Function Parameters:
void fresnel (x,c,s)
x: float; entry: the real argument of C(x) and S(x);
c: float *; exit: the value of C(x);
s: float *; exit: the value of S(x).

Function used: fg.

#include <math.h>

void fresnel(float x, float *c, float *s)
{
	void fg(float, float *, float *);
	float absx,x3,x4,a,p,q,f,g,c1,s1;

	absx=fabs(x);
	if (absx <= 1.2) {
		a=x*x;
		x3=a*x;
		x4=a*a;
		p=(((5.47711385682687e-6*x4-5.28079651372623e-4)*x4+
			1.76193952543491e-2)*x4-1.99460898826184e-1)*x4+1.0;
		q=(((1.18938901422876e-7*x4+1.55237885276994e-5)*x4+
			1.09957215025642e-3)*x4+4.72792112010453e-2)*x4+1.0;
		*c = x*p/q;
		p=(((6.71748466625141e-7*x4-8.45557284352777e-5)*x4+
			3.87782123463683e-3)*x4-7.07489915144523e-2)*x4+
			5.23598775598299e-1;
		q=(((5.95281227678410e-8*x4+9.62690875939034e-6)*x4+
			8.17091942152134e-4)*x4+4.11223151142384e-2)*x4+1.0;
		*s = x3*p/q;
	} else if (absx <= 1.6) {
		a=x*x;
		x3=a*x;
		x4=a*a;
		p=((((-5.68293310121871e-8*x4+1.02365435056106e-5)*x4-
			6.71376034694922e-4)*x4+1.91870279431747e-2)*x4-
			2.07073360335324e-1)*x4+1.00000000000111e0;
		q=((((4.41701374065010e-10*x4+8.77945377892369e-8)*x4+
			1.01344630866749e-5)*x4+7.88905245052360e-4)*x4+
			3.96667496952323e-2)*x4+1.0;
		*c = x*p/q;
		p=((((-5.76765815593089e-9*x4+1.28531043742725e-6)*x4-
			1.09540023911435e-4)*x4+4.30730526504367e-3)*x4-
			7.37766914010191e-2)*x4+5.23598775598344e-1;
		q=((((2.05539124458580e-10*x4+5.03090581246612e-8)*x4+
			6.87086265718620e-6)*x4+6.18224620195473e-4)*x4+
			3.53398342767472e-2)*x4+1.0;
		*s = x3*p/q;
	} else if (absx < 1.0e15) {
		fg(x,&f,&g);
		a=x*x;
		a=(a-floor(a/4.0)*4.0)*1.57079632679490;
		c1=cos(a);
		s1=sin(a);
		a = (x < 0.0) ? -0.5 : 0.5;
		*c = f*s1-g*c1+a;
		*s = -f*c1-g*s1+a;
	} else
		*c = *s = ((x > 0.0) ? 0.5 : -0.5);
}

E. fg

Evaluates the functions f(x), g(x) related to the Fresnel integrals by means of the formulae

$$f(x) = \{\tfrac{1}{2} - S(x)\}\cos(\pi x^2/2) - \{\tfrac{1}{2} - C(x)\}\sin(\pi x^2/2)$$
$$g(x) = \{\tfrac{1}{2} - C(x)\}\cos(\pi x^2/2) + \{\tfrac{1}{2} - S(x)\}\sin(\pi x^2/2).$$

When |x| ≤ 1.6 the functions S(x) and C(x) are evaluated by means of a call of fresnel, and the values of f(x) and g(x) are obtained by use of the above relationships. When 1.6 < |x| ≤ 1.9, f(x) is approximated by use of a Chebyshev rational function (see [Cod68]) of the form

and g(x) by use of a similar function of the form

with i=4 and j=5. When 1.9 < |x| ≤ 2.4, similar expansions with i=j=5 are used. When |x| > 2.4, the approximating functions are

for f(x) and

for g(x).

Function Parameters:
void fg (x,f,g)
x: float; entry: the real argument of f(x) and g(x);
f: float *; exit: the value of f(x);
g: float *; exit: the value of g(x).

Function used: fresnel.

void fg(f1oat x, float *f, float *g) 1

void fresnel (float, float *, float * ) ; float absx,c, s,cl, sl,a, xinv,x3inv, c4,p, q; absx=fabs (x); if (absx 0; p l : float *; exit: the value of P,(x); q l : float *; exit: the value of Q,(x). x:

Functions used:

bessj 1, bessy0l.

void besspql (float x, float *pl, float *ql) I

if ( ~ ~ 8 . {0 ) float bess j1 (float); void bessyOl(float, float *, float * ) ; float b,cosx,sinx,jlx,yl; b=sqrt (x)*1.25331413731550; bessy0l (x,&jlx, &yl) ; j lx=bessj1 (x); X - = 0.785398163397448; cosx=cos (x); sinx=sin (x); *pl = b*(jlx*sinx-yl*cosx); *ql = b*(jlx*cosx+yl*sinx); ) else { int i; float x2,bO,bl,b2,y; static float arl[lll={0.10668e-15, -0.72212e-15, 0.545267e-14, -0.4684224e-13, 0.46991955e-12, -0.570486364e-11, 0.881689866e-10, -0.187189074911e-8, 0.6177633960644e-7, -0.39872843004889e-5, 0.89898983308594e-3); static float ar2[ll]={-0.10269e-15, 0.65083e-15, -0.456125e-14, 0.3596777e-13, -0.32643157e-12, 0.351521879e-11, -0.4686363688e-10, 0.82291933277e-9, -0.2095978138408e-7, 0.9138615~579555e-6,-0.96277235491571e-4); y=8.O/X; x=2.o*y*y-1.0; xz=x+x; bl=b2=0.0; for (i=O; ic=10; i++) { bO=x2*bl-b2+arl [il ; b2=bl; bl=bO ;


1

*pl = x*bl-b2+1.0009030408600137; bl=b2=0.0; for (i=O; i 15, the function I, '(x) = e"Io(x) is evaluated by means of a call of nonexpbessio, and the relationship Io(x) = e'l,'(x) is used. Function Parameters: float bessiO (x) bessi0: delivers the modified Bessel function of the first kind of order zero with argument x;

x: float; entry: the argument of the Bessel function.

Function used: nonexpbessi0.

#include <math.h>

float bessi0(float x)
{
	if (x == 0.0) return 1.0;
	if (fabs(x) <= 15.0) {
		float z,denominator,numerator;
		z=x*x;
		numerator=(z*(z*(z*(z*(z*(z*(z*(z*(z*(z*(z*(z*(z*(z*
			0.210580722890567e-22+0.380715242345326e-19)+
			0.479440257548300e-16)+0.435125971262668e-13)+
			0.300931127112960e-10)+0.160224679395361e-7)+
			0.654858370096785e-5)+0.202591084143397e-2)+
			0.463076284721000e0)+0.754337328948189e2)+
			0.830792541809429e4)+0.571661130563785e6)+
			0.216415572361227e8)+0.356644482244025e9)+
			0.144048298227235e10);
		denominator=(z*(z*(z-0.307646912682801e4)+
			0.347626332405882e7)-0.144048298227235e10);
		return -numerator/denominator;
	} else {
		float nonexpbessi0(float);
		return exp(fabs(x))*nonexpbessi0(x);
	}
}


B. bessi1

Computes the value of the modified Bessel function of the first kind of order one $I_1(x)$. For |x| ≤ 15, a Chebyshev rational function approximation [Bla74] of the form

$$\frac{\sum_{k=0}^{14} a_k x^{2k}}{\sum_{k=0}^{2} b_k x^{2k}}$$

for the function $x^{-1} I_1(x)$ is used, and the value of $I_1(x)$ is recovered. When |x| > 15, the function $I_1'(x) = e^{-|x|} I_1(x)$ is evaluated by means of a call of nonexpbessi1, and the relationship $I_1(x) = e^{|x|} I_1'(x)$ is used.

Function Parameters:
float bessi1 (x)
bessi1: delivers the modified Bessel function of the first kind of order one with argument x;
x: float; entry: the argument of the Bessel function.

Function used: nonexpbessi1.

#include <math.h>

float bessi1(float x)
{
	if (x == 0.0) return 0.0;
	if (fabs(x) <= 15.0) {
		float z,denominator,numerator;
		z=x*x;
		denominator=z*(z-0.222583674000860e4)+0.136293593052499e7;
		numerator=(z*(z*(z*(z*(z*(z*(z*(z*(z*(z*(z*(z*(z*(z*
			0.207175767232792e-26+0.257091905584414e-23)+
			0.306279283656135e-20)+0.261372772158124e-17)+
			0.178469361410091e-14)+0.963628891518450e-12)+
			0.410068906847159e-9)+0.135455228841096e-6)+
			0.339472890308516e-4)+0.624726195127003e-2)+
			0.806144878821295e0)+0.682100567980207e2)+
			0.341069752284422e4)+0.840705772877836e5)+
			0.681467965262502e6);
		return x*(numerator/denominator);
	} else {
		float nonexpbessi1(float);
		return exp(fabs(x))*nonexpbessi1(x);
	}
}

C. bessi

Generates an array of modified Bessel functions of the first kind of order j, $I_j(x)$, j=0,...,n. The functions $I_j'(x) = e^{-|x|} I_j(x)$, j=0,...,n, are first evaluated by means of a call of nonexpbessi, and the required values of $I_j(x)$ are recovered by multiplication by $e^{|x|}$.

Function Parameters:
void bessi (x,n,i)
x: float; entry: the argument of the Bessel functions;
n: int; entry: the upper bound of the indices of the array i;
i: float i[0:n]; exit: i[j] contains the value of the modified Bessel function of the first kind of order j, j=0,...,n.

Function used: nonexpbessi.

#include <math.h>

void bessi(float x, int n, float i[])
{
	if (x == 0.0) {
		i[0]=1.0;
		for (; n>=1; n--) i[n]=0.0;
	} else {
		void nonexpbessi(float, int, float []);
		float expx;
		expx=exp(fabs(x));
		nonexpbessi(x,n,i);
		for (; n>=0; n--) i[n] *= expx;
	}
}

D. bessk01

Computes the modified Bessel functions of the third kind of orders zero and one: $K_0(x)$ and $K_1(x)$ for x > 0. For 0 < x < 1.5, $K_0(x)$ and $K_1(x)$ are computed by use of truncated versions of the Taylor series expansions [AbS65]

$$K_0(x) = \sum_{k=0}^{\infty} \frac{(x^2/4)^k}{(k!)^2} \left\{ \ln(2/x) - \gamma + H_k \right\}$$

$$K_1(x) = \frac{1}{x} - \frac{x}{4} \sum_{k=0}^{\infty} \frac{(x^2/4)^k}{k!\,(k+1)!} \left\{ 2\ln(2/x) - 2\gamma + H_k + H_{k+1} \right\}$$

where γ is Euler's constant and $H_k = 1 + 1/2 + \cdots + 1/k$ (with $H_0 = 0$).

For x ≥ 1.5, the functions $K_j'(x) = e^{x} K_j(x)$, j=0,1, are evaluated by a call of nonexpbessk01, and the relationship $K_j(x) = e^{-x} K_j'(x)$ is used.

Function Parameters:
void bessk01 (x,k0,k1)
x: float; entry: the argument of the Bessel functions; x > 0;
k0: float *; exit: k0 has the value of the modified Bessel function of the third kind of order zero with argument x;
k1: float *; exit: k1 has the value of the modified Bessel function of the third kind of order one with argument x.

Function used: nonexpbessk01.

#include <math.h>

void bessk01(float x, float *k0, float *k1)
{
	if (x <= 1.5) {
		int k;
		float c,d,r,sum0,sum1,t,term,t0,t1;
		sum0=d=log(2.0/x)-0.5772156649015328606;
		sum1 = c = -1.0-2.0*d;
		r=term=1.0;
		t=x*x/4.0;
		k=1;
		do {
			term *= t*r*r;
			d += r;
			c -= r;
			r=1.0/(k+1);
			c -= r;
			t0=term*d;
			t1=term*c*r;
			sum0 += t0;
			sum1 += t1;
			k++;
		} while (fabs(t0/sum0)+fabs(t1/sum1) > 1.0e-15);
		*k0 = sum0;
		*k1 = (1.0+t*sum1)/x;
	} else {
		void nonexpbessk01(float, float *, float *);
		float expx;
		expx=exp(-x);
		nonexpbessk01(x,k0,k1);
		*k1 *= expx;
		*k0 *= expx;
	}
}

E. bessk

Generates an array of modified Bessel functions of the third kind of order j, $K_j(x)$, j=0,...,n, for x > 0. The functions $K_0(x)$ and $K_1(x)$ are first evaluated by means of a call of bessk01, and the recursion [AbS65]
$K_{j+1}(x) = K_{j-1}(x) + (2j/x)K_j(x)$
is then used.

Function Parameters:

void bessk (x,n,k)
x: float; entry: the argument of the Bessel functions; x > 0;
n: int; entry: the upper bound of the indices of array k; n ≥ 0;
k: float k[0:n]; exit: k[j] is the value of the modified Bessel function of the third kind of order j with argument x, j=0,...,n.

Function used: bessk01.

#include <math.h>

void bessk(float x, int n, float k[])
{
    void bessk01(float, float *, float *);
    int i;
    float k0,k1,k2;

    bessk01(x,&k0,&k1);
    k[0]=k0;
    if (n > 0) k[1]=k1;
    x=2.0/x;
    /* upward recursion K(j+1) = K(j-1) + (2j/x) K(j) */
    for (i=2; i<=n; i++) {
        k[i]=k2=k0+x*(i-1)*k1;
        k0=k1;
        k1=k2;
    }
}

F. nonexpbessk01

Computes the modified Bessel functions of the third kind of orders zero and one, $K_0(x)$ and $K_1(x)$, for x > 0, multiplied by $e^{x}$. nonexpbessk01 evaluates the functions $K_j'(x) = e^{x}K_j(x)$ for x > 0, j=0,1. For 0 < x ≤ 1.5, the functions $K_0(x)$ and $K_1(x)$ are computed by a call of bessk01 and the $K_j'$ are evaluated by use of their defining relationship. For 1.5 < x ≤ 5, the trapezoidal rule (see [Hu64] of bessi0) is used to evaluate the integrals (see [AbS65] of bessi0)

with p = 2/5. For x > 5, truncated Chebyshev expansions (see [Cle62, Lu69]) are used.

Function Parameters:

void nonexpbessk01 (x,k0,k1)
x: float; entry: the argument of the Bessel functions; x > 0;
k0: float *; exit: k0 has the value of the modified Bessel function of the third kind of order zero with argument x, multiplied by $e^{x}$;
k1: float *; exit: k1 has the value of the modified Bessel function of the third kind of order one with argument x, multiplied by $e^{x}$.

Function used: bessk01.

void nonexpbesskOl(float x, float *kO, float *kl) if (X c= 1.5) { void bessk0l (float, float *, float * ) ; float expx; expx=exp (x); bessk01 (x,kO,kl) ; *kO *= expx; *kl *= expx; ) else if ( x c= 5.0) { int i,r; float t2,~1,~2,terml,term2,sqrtexpr,exph2,~2; static float fac[201=(0.90483741803596, 0.67032004603564, 0.40656965974060, 0.20189651799466, 0.82084998623899e-1, 0.27323722447293e-1, 0.74465830709243e-2, 0.16615572731739e-2, 0.30353913807887e-3, 0.45399929762485e-4, 0.55595132416500e-5, 0.55739036926944e-6, 0.45753387694459e-7, 0.30748798795865e-8, 0.16918979226151e-9, 0.76218651945127e-11, 0.28111852987891e-12, 0.84890440338729e-14, 0.2098791048793e-15, 0.42483542552916e-17); s1=0.5; s2=0.0; r=O 0; x2 =x+x; exph2=l. O/sqrt (5.0*x); for (i=O; i 0; int; entry: the upper bound of the indices of array k; n 2 0; float k[O:n]; kD] is value of the modified Bessel function of the third kind of order j exit: multiplied by 8 , j=O ,...,n.

Function used: nonexpbessk01.

void nonexpbessk(f1oat x, int n, float k[l)

t

void nonexpbesskOl(float, float int i; float k0,kl,k2 ; nonexpbessk01 (x,hk0, &kl) ; k [01 =k0; if (n > 0) k[ll =kl; x=2.o/x; for (i=2; iO, a20. For x < 3, the above functions are evaluated by use of truncated Taylor series (see [T76a]). For x23, the functions P,(x), Q,(x) occurring in the formula Y,(x) = (2/(1rx))"~{~,(x)sin(x - (a+%))?r/2) + Q,(x)cos(x - (a+%)z/2)) are evaluated for a=a. a + l by means of a call of besspqa01; the values of Ya(x), Y,+,(x) are then recovered.

Function Parameters:

void bessya01 (a,x,ya,ya1)
a: float; entry: the order;
x: float; entry: this argument should satisfy x > 0;
ya: float *; exit: the Neumann function of order a and argument x;
ya1: float *; exit: the Neumann function of order a+1.

Functions used: bessy01, recipgamma, besspqa01.

#include void bessyaOl(f1oat a, float x, float *ya, float *yal) I if ( ~ = = o . o{ ) void bessyOl(float, float *, float * ) ; bessy0l (x;ya,yal) ; } else ( int n,na,rec,rev; float b,c,d.e,f,g,h,p,pi,q,r,s; pi=4.0*atan(l.O); na=floor(a+0.5); rec = (a >= 0.5);


rev = (a c -0.5); if (rev I I rec) a - = na; if (a == -0.5) { p=sqrt (2.0/pi/x); f=p*sin(x); g = -p*cos(x); ) else if (x c 3.0) { float recipgamma (float, float * , float * ) ; b=x/2.O ; d = -log(b); e=a*d; c = (fabs(a) c 1.0e-8) ? l.O/pi : a/sin(a*pi) ; s = (fabs(e) c 1.0e-8) ? 1.0 : sinh(e)/e; e=exp (e); g=recipgamma(a,&p,&q)*e; e=(e+l.O/e)/2 .O; f=2.0*c*(p*e+q*s*d); e=a*a; p=g*c; q=l.O/g/pi; c=a*pi/2.0; r = (fabs(c) c 1.0e-8) ? 1.0 : sin(c)/c; r *= pi*c*r; c=1.0; d = -b*b; *ya = f+r*q; *yal = p; n=l; do { f= (f*n+p+q)/ (n*n-e); c=c*d/n; p / = (n-a); q / = (n+a); g=c* (f+r*q); h=c*p-n*g; *ya += g; *yal += h; n++; ) while (fabs(g/(l.~+fabs(*ya)) )+fabs(h/(l.O+fabs*yal1 ) > 1.0e-15); f = -(*ya); g = - (*yal)/b; } else { void besspqaol (float,float,float * , float *, float *, float * ) ; b=x-pi*(a+O.5)/2.0; c=cos(b); s=sin(b); d=sqrt(2.0/x/pi); besspqaol (a,x, &p,&q,&b,&h) ; f=d* (p*s+q*c); g=d* (h*s-b*c);

1

if (rev) { x=2.o/x; na = -na-1; for (n=O; nc=na; n++) { h=x* (a-n)*f-g; g=f; f=h;

1

) else if (rec) { x=2.o/x; for (n=l; nc=na; n++) ( h=x* (a+n)*g-f; f=g; g=h;


C. bessyaplusn

Generates an array of Bessel functions of the second kind of order a+n, $Y_{a+n}(x)$, n=0,...,nmax, for x > 0, a ≥ 0. The values of the functions $Y_a(x)$, $Y_{a+1}(x)$ are first obtained by means of a call of bessya01, and the recursion
$Y_{a+n+1}(x) = -Y_{a+n-1}(x) + \{2(n+a)/x\}Y_{a+n}(x)$
is then used.

Function Parameters:

void bessyaplusn (a,x,nmax,yan)
a: float; entry: the order;
x: float; entry: the argument value, x > 0;
nmax: int; entry: the upper bound of the indices of the array yan; nmax ≥ 0;
yan: float yan[0:nmax]; exit: the values of the Bessel functions of the second kind of order a+n, for argument x, are assigned to yan[n], 0 ≤ n ≤ nmax.

Function used: bessya01.

void bessyaplusn(float a, float x, int nmax, float yan[])
{
    void bessya01(float, float, float *, float *);
    int n;
    float y1;

    bessya01(a,x,&yan[0],&y1);
    a -= 1.0;
    x=2.0/x;
    if (nmax > 0) yan[1]=y1;
    /* with a reduced by 1 and x = 2/x, (a+n)*x equals 2(a+n-1)/x */
    for (n=2; n<=nmax; n++) yan[n] = -yan[n-2]+(a+n)*x*yan[n-1];
}

D. besspqa01

This procedure is an auxiliary procedure for the computation of the Bessel functions for large values of their argument. besspqa01 evaluates the functions $P_\alpha(x)$, $Q_\alpha(x)$ occurring in the formulae
$J_\alpha(x) = (2/(\pi x))^{1/2}\{P_\alpha(x)\cos(x - (\alpha+\tfrac12)\pi/2) - Q_\alpha(x)\sin(x - (\alpha+\tfrac12)\pi/2)\}$
$Y_\alpha(x) = (2/(\pi x))^{1/2}\{P_\alpha(x)\sin(x - (\alpha+\tfrac12)\pi/2) + Q_\alpha(x)\cos(x - (\alpha+\tfrac12)\pi/2)\}$
for $\alpha = a, a+1$ (a ≥ 0 and x > 0). If x < 3 then, with $\chi = x - (a+\tfrac12)\pi/2$, the formulae
$P_a(x) = (\pi x/2)^{1/2}\{Y_a(x)\sin\chi + J_a(x)\cos\chi\}$
$Q_a(x) = (\pi x/2)^{1/2}\{Y_a(x)\cos\chi - J_a(x)\sin\chi\}$
$P_{a+1}(x) = (\pi x/2)^{1/2}\{J_{a+1}(x)\sin\chi - Y_{a+1}(x)\cos\chi\}$
$Q_{a+1}(x) = (\pi x/2)^{1/2}\{J_{a+1}(x)\cos\chi + Y_{a+1}(x)\sin\chi\}$
are used, bessjaplusn and bessya01 being called to evaluate the functions $J_a, \ldots, Y_{a+1}$. When x ≥ 3, Chebyshev expansions are used (see [T76a, W45]).


Function Parameters:

void besspqa01 (a,x,pa,qa,pa1,qa1)
a: float; entry: the order;
x: float; entry: this argument should satisfy x > 0;
pa: float *; exit: the value of $P_a(x)$;
qa: float *; exit: the value of $Q_a(x)$;
pa1: float *; exit: the value of $P_{a+1}(x)$;
qa1: float *; exit: the value of $Q_{a+1}(x)$.

Functions used: besspq0, besspq1, bessjaplusn, bessya01.

void besspqaOl(f1oat a, float x, float *pa, float *qa, float *pal, float *qal) ( if (a == 0.0) { void besspq0 (float, float *, float * ) ; void besspql (float, float *, float * ) ; besspq0 (x,pa, qa) ; besspql (x,pal,qal) ; ) else ( int n,na,rec,rev; float b,pi,pO,qO; pi=4.0*atan(l.0) ; rev = (a < -0.5); if (rev) a = -a-1.0; rec = (a >= 0.5); if (rec) ( na=floor (a+0.5); a - = na; if ( a == -0.5) ( *pa = *pal = 1.0; *qa ,= *qal = 0.0; ) else if ( x >= 3.0) ( float c,d,e,f,g,p,q,r,s, temp; c=0.25-a*a; b=x+x; f=r=l.O; g = -x; s=o . 0 ; temp=x*cos(a*pi)/pi*l.Oe15; e=temp*temp; n=2 ; do ( d= (n-l+c/n); p= (2*n*f+b*g-d*r) / (n+1); q= (2*n*g-b*f-d*s) / (n+1); r=f; f=p; s=g; n++; } while ( (p*p+q*q)*n*n < e) ; e=f*f+g*g; p= (r*f+s*g)/e; q= (s*f-r*g) /e;


g=q; n--; while (n > 0) ( r=(n+l)* (2.0-p)-2.O; s=b+ (n+l)*q; d= (n-l+c/n)/ (r*r+s*s); p=d*r; q=d*s; e=f; f=p*(e+l.O)-g*q; g=q* (e+l.0)+p*g; n--;

1

f += 1.0; d=f*f+g*g; *pa = f/d; *qa = -g/d; d=a+0.5-p; q += x; *pal = ( (*pa)*q- (*qa)*d)/x; *qal = ( (*qa)*q+ (*pa)*d)/x; ) else ( void bessjaplusn (float, float, int, float [I ) ; void bessyaOl(float, float, float *, float * ) ; float c,s,chi,ya,yal,ja[21; b=sqrt (pi*x/2.0); chi=x-pi*(a/2.0+0.25); c=cos (chi); s=sin(chi); bessyaol (a,x, &ya,&yal) ; bessjaplusn(a,x,1,ja) ; *pa = b* (ya*s+c*ja[Ol ) ; *qa = b*(c*ya-s*ja[Ol ) ; *pal = b* (s*ja[ll -c*yal); *qal = b* ( c * ja [ll+s*yal); I

if (rec) ( x=2.o/x; b= (a+l.O)*x; for (n=l; nc=na; n++) ( p0= (*pa)- (*qal)*b; q0= (*qa)+ (*pal)*b; *pa = *pal; *pal = PO; *qa = *qal; *qal = q0; b += x;

1

if (rev) { pO = *pal; *pal = *pa; *pa = PO; q0 = *qal; *qal = *qa; *qa = qO;

1

1

1

E. besszeros

Calculates the first n zeros of a Bessel function of the first or the second kind or its derivative. besszeros computes the first n zeros of either 1) $J_a(z)$ or 2) $Y_a(z)$ or 3) $dJ_a(z)/dz$ or 4) $dY_a(z)/dz$ (a ≥ 0). Which of the above sets of zeros is derived is determined by the value of the parameter d upon call: thus 1 for the zeros of $J_a(z)$, and so on. Each zero is obtained by use of an initial approximation derived from an asymptotic expansion [AbS65], and subsequent improvement by use of a Newton-Raphson process [T76b, T78].

If a < 3, then with $\mu = 4a^2$, the s-th zero of $J_a(z)$ or $Y_a(z)$ is given by
$z_{a,s} = \beta - (\mu-1)/(8\beta) - 4(\mu-1)(7\mu-31)/\{3(8\beta)^3\} - 32(\mu-1)(83\mu^2-982\mu+3779)/\{15(8\beta)^5\} - \ldots$
where $\beta = (s + a/2 - 1/4)\pi$ for $J_a(z)$ and $\beta = (s + a/2 - 3/4)\pi$ for $Y_a(z)$. Similarly the s-th zero of $dJ_a(z)/dz$ or $dY_a(z)/dz$ is given by
$z'_{a,s} = \beta' - (\mu+3)/(8\beta') - 4(7\mu^2+82\mu-9)/\{3(8\beta')^3\} - 32(83\mu^3+2075\mu^2-3039\mu+3537)/\{15(8\beta')^5\} - \ldots$
where $\beta' = (s + a/2 - 3/4)\pi$ for $dJ_a(z)/dz$ and $\beta' = (s + a/2 - 1/4)\pi$ for $dY_a(z)/dz$.

If a ≥ 3, then with w(u) defined by
$(2/3)(-u)^{3/2} = \{w(u)^2 - 1\}^{1/2} - \arccos(w(u)^{-1})$
i.e. $w(u) = 1/\cos\phi(u)$ where $\tan\phi(u) - \phi(u) = (2/3)(-u)^{3/2}$, the s-th zero of $J_a(z)$ or $Y_a(z)$ is given by
$z_{a,s} = a\,w[-a^{-2/3} f\{3\pi(4s-2k-1)/8\}]$
where
$f\{x\} = x^{2/3}(1 + (5/48)x^{-2} - (5/36)x^{-4} + \ldots)$
and k=0 for $J_a(z)$ and k=1 for $Y_a(z)$. Similarly, the s-th zero of $dJ_a(z)/dz$ or $dY_a(z)/dz$ is given by
$z'_{a,s} = a\,w[-a^{-2/3} g\{3\pi(4s-2k-1)/8\}]$
where
$g\{x\} = x^{2/3}(1 - (7/48)x^{-2} + (35/288)x^{-4} + \ldots)$
and now k=1 for $dJ_a(z)/dz$ and k=0 for $dY_a(z)/dz$.

Function Parameters:

void besszeros (a,n,z,d)
a: float; entry: the order of the Bessel function, a ≥ 0;
n: int; entry: the number of zeros to be evaluated, n ≥ 1;
z: float z[1:n]; exit: z[j] is the j-th zero of the selected Bessel function;
d: int; entry: the choice of d determines the type of the Bessel function of which the zeros are computed: if d=1 then $J_a(z)$; if d=2 then $Y_a(z)$; if d=3 then $dJ_a(z)/dz$; if d=4 then $dY_a(z)/dz$.

Function used: besspqa01.

void besszeros(f1oat a, int n, float z[l , int d)

I

void besspqaOl(float, float, float *, float *, float *, float * ) ; int j,s; float aa,a2,b,bb,c,chi,co,mu,mu2,mu3,mu4,p,pi,pa,pal,pO,pl,ppl, q,qa,qal,ql,qql,ro,si,t,tt,u,v,wrx,xx,x4,y,fl,fi;


pi=4 .O*atan(l.0) ; aa=a*a; mu=4.0*aa; mu2=mu*mu; mu3=mu*mu2; mu4=mu2*mu2; if ( d < 3 ) { p=7.0*mu-31.0; pO=mu-1.0; pl=4.0*(253.0*mu2-3722.0*mu+17869.0)/15.0/p*pO; ql=8.0*(83.0*mu2-982.0*mu+3779.0)/5.0/p;

) else ( p=7.0*mu2+82.0*mu-9.0; pO=mu+3.0; pl=(4048.0*mu4+131264.0*mu3-221984.0*mu2417600.0*mu+1012176.0)/60.0/p; ql=1.6*(83.0*mu3+2075.0*mu2-3039.0*mu+3537.O~/p; \J

t = (d == 1 I / d = = 4) ? 0.25 tt=4.0*t; if ( d c 3 ) { ppl=5.0/48.0; qql = -5.0/36.0; ) else ( ppl = -7.0/48.0; qql=35.0/288.0;

:

0.75;

I

for (s=l; s= 3.0*a-8.0) { b= (s+a/2.0-t)*pi; c=l.O/b/b/64.0; x=b-1.O/b/8.0*(PO-pl*c)/ (1.0-ql*c) ; ) else { if (S == 1) x = ((d == 1) ? -2.33811 : ((d == 2) ? -1.17371 ((d == 3) ? -1.01879 : -2.29444))); else ( x=y* (4.O*s-tt); v=l.O/x/x; x = -pow(x,2.0/3.0)* (1.O+V* (ppl+qql*v)) ;

1

:

u=x*bb; yy=2.0/3.0*p0~(-~, 1.5) ; if (yy == 0.0) fi=O.O; else if (yy > 1.0e5) fi=1.570796; else { float ~ , P , P P ; if (yy ~ 1 . 0 ){ p=pow(3.0*yy,1.0/3 .O); PP=P*P; ) i1575.0); p * = (I.o+pp* (-210.0+pp*(27.0-2.0*pp) ) else {


j=O; do { xx=x*x; x4=xx*xx; a2=aa-xx; besspqaol (a,x,&pa,&qa,&pal,&qal) ; chi=x-pi*(a/2.0+0.25); si=sin(chi); co=cos(chi); ro = ( (d == 1) ? (pa*co-qa*si)/ (pal*si+qal*co) : / (qal*si-pal*co) : ( (d == 2) ? (pa*si+qa*co) / (pa*co-qa*si) : ( (d == 3) ? a/x- (pal*si+qal*co) a/x- (qal*si-pal*co) / (pa*si+qa*co)) ) j++; if (d < 3) { u=ro; p= (1.0-4.0*a2) /6.0/x/ (2.0*a+1.0); q=(2.O* (xx-mu)-l.0-6.0*a)/3.0/~/(2.O*a+l.O); } else { u = -xx*ro/a2; v=2.0*x*a2/(aa+xx)/3.0; w=a2*a2*a2; q=v* (1.0+(rnu2+3Z.O*mu*xx+48.O*x4)/32.O/w) ; p=v*(1.0+(-mu2+4O.O*mu*xx+48.O*x4)/64.O/w);

) ;

t,

w=u* (l.O+p*ro)/ (l.O+q*ro); X += w; } while (fabs(w/x) > 1.0e-13 && j

c

5);

F. start

This is an auxiliary procedure which computes a starting value of an algorithm used in several Bessel function procedures. Certain stable methods for evaluating functions $f_0, f_1, \ldots$ which satisfy a three term recurrence relationship of the form $f_{k+1} + a_k f_k + b_k f_{k-1} = 0$ require a value of ν to be determined such that the value of the continued fraction (expansion (1)) and that of its convergent should agree. (The theory of such methods [see T76b], which involve backward recursion, is described in outline in the documentation to bessj, which concerns the case in which $f_k = J_k(x)$; the same methods are also implemented by bessjaplusn (for which $f_k = J_{a+k}(x)$), by nonexpbessi (for which $f_k = I_k(x)$), by nonexpbessiaplusn (for which $f_k = e^{-|x|}I_{a+k}(x)$), by spherbessj (for which $f_k = (\pi/(2x))^{1/2}J_{k+1/2}(x)$), and by nonexpspherbessi (for which $f_k = e^{-x}(\pi/(2x))^{1/2}I_{k+1/2}(x)$).)

The above requirement is equivalent to the condition that the tail t(ν) of expansion (1) should be negligible. For the special cases considered, t(ν) represents the ratio of two Bessel functions of contiguous orders. Estimates of t(ν) may be obtained by the use of such formulae as
$J_r(x) = \{2\pi r\tanh(\alpha)\}^{-1/2}\exp[r\{\tanh(\alpha) - \alpha\}]$
where $r = x\cosh(\alpha)$, and
$I_r(x) = (2\pi r)^{-1/2}(1+z^2)^{-1/4}e^{r\phi(z)}$
where $z = x/r$ and
$\phi(z) = (1+z^2)^{1/2} + \ln[z/\{1 + (1+z^2)^{1/2}\}]$

Function Parameters:

int start (x,n,t)
start: a starting value for the Miller algorithm for computing an array of Bessel functions;
x: float; entry: the argument of the Bessel functions, x > 0;
n: int; entry: the number of Bessel functions to be computed, n ≥ 0;
t: int; entry: the type of Bessel function in question; t=0 corresponds to ordinary Bessel functions; t=1 corresponds to modified Bessel functions.

#include <math.h>

int start(float x, int n, int t)
{
    int s;
    float p,q,r,y;

    s=2*t-1;
    p=36.0/x-t;
    r=n/x;
    if (r > 1.0 || t == 1) {
        q=sqrt(r*r+s);
        r=r*log(q+r)-q;
    } else
        r=0.0;
    q=18.0/x+r;
    r = (p > q) ? p : q;
    p=sqrt(2.0*(t+r));
    p=x*((1.0+r)+p)/(1.0+p);
    y=0.0;
    /* iterate until successive estimates p and q agree to within 1 */
    do {
        y=p;
        p /= x;
        q=sqrt(p*p+s);
        p=x*(r+q)/log(p+q);
        q=y;
    } while (p > q || p < q-1.0);
    return ((t == 1) ? floor(p+1.0) : -floor(-p/2.0)*2);
}

6.6.2 Bessel functions I and K

A. bessiaplusn

Generates an array of modified Bessel functions of the first kind of order a+j, $I_{a+j}(x)$ (0 ≤ j ≤ n, 0 ≤ a < 1). When x=0 the above functions are evaluated directly; when a=0 or a=0.5 the procedure bessi or nonexpspherbessi, as appropriate, is used. Otherwise the functions $e^{-|x|}I_{a+j}(x)$ (j=0,...,n) are evaluated by means of a call of nonexpbessiaplusn, and the values of the functions $I_{a+j}(x)$ (j=0,...,n) are recovered by multiplication by $e^{|x|}$.

Function Parameters:

void bessiaplusn (a,x,n,ia)
a: float; entry: the noninteger part of the order of the Bessel functions; 0 ≤ a < 1;
x: float; entry: the argument value of the Bessel functions, x ≥ 0;
n: int; entry: the upper bound of the indices of the array ia; n ≥ 0;
ia: float ia[0:n]; exit: ia[j] is assigned the value of the modified Bessel function of the first kind of order a+j and argument x, $I_{a+j}(x)$, 0 ≤ j ≤ n.

Functions used: nonexpbessiaplusn, bessi, nonexpspherbessi.

#include <math.h>

void bessiaplusn(float a, float x, int n, float ia[])
{
    if (x == 0.0) {
        ia[0] = (a == 0.0) ? 1.0 : 0.0;
        for (; n>=1; n--) ia[n]=0.0;
    } else if (a == 0.0) {
        void bessi(float, int, float []);
        bessi(x,n,ia);
    } else if (a == 0.5) {
        void nonexpspherbessi(float, int, float []);
        float c;
        /* 0.7978... = sqrt(2/pi) */
        c=0.797884560802865*sqrt(fabs(x))*exp(fabs(x));
        nonexpspherbessi(x,n,ia);
        for (; n>=0; n--) ia[n] *= c;
    } else {
        void nonexpbessiaplusn(float, float, int, float []);
        float expx;
        expx=exp(fabs(x));
        nonexpbessiaplusn(a,x,n,ia);
        for (; n>=0; n--) ia[n] *= expx;
    }
}

B. besska01

Computes the modified Bessel functions of the third kind of order a and a+1, $K_a(x)$ and $K_{a+1}(x)$, for x > 0, a ≥ 0. For 0 < x < 1, $K_a(x)$ and $K_{a+1}(x)$ are computed by using Taylor series (see [T75]). For x ≥ 1 the procedure nonexpbesska01 is called.

Function Parameters:

void besska01 (a,x,ka,ka1)
a: float; entry: the order;
x: float; entry: this argument should satisfy x > 0;
ka: float *; exit: the value of the modified Bessel function of the third kind of order a and argument x;
ka1: float *; exit: the value of the modified Bessel function of the third kind of order a+1 and argument x.

Functions used: bessk01, recipgamma, nonexpbesska01.

yoid besskaOl(f1oat a, float x, float *ka, float *kal) if (a == 0.0) { void bessk0l (float, float bessk0l (x,ka,kal) ; ) else ( int n,na,rec,rev; float f,g,h,pi; pi=4 .O*atan(l.O); rev = (a < -0.5); if (rev) a = -a-1.0; rec = (a >= 0.5); if (rec) { na=floor (a+0.5); a - = na;

*, float

*) ;

I

if (a == 0.5) f=g=sqrt (pi/x/Z.O)*exp (-x); else if ( x < 1.0) { float recipgamma (float, float float al,b,c,d,e,p,q,s; b=x/2.0; d = -log(b); e=a*d; c=a*pi; c = (fabs(c) < 1.0e-15) ? 1.0 s = (fabs(e) < 1.0e-15) ? 1.0 e=exp (e); al= (e+l.0/e)/2 .O; g=recipgarnma (a,&p,&q) *e; *ka = f = c* (p*al+q*s*d); e=a*a; p=o.s*g*c; q=o.5/g; c=1.0;


*, float

:

:

*) ;

c/sin(c); sinh(e)/e;

n++; ) while (h/(*ka)+fabs(g)/(*kal) > 1.0e-15); f= (*ka); g= (*kalj/b; } else { void nonexpbesskaOl(float, float, float *, float float expon; expon=exp( -x); nonexpbesskaol(a,x, ka,kal); f=expon*(*ka); g=expon*(*kal);

*);

I

if (rec) ( x=2.o/x; for (n=l;n 0, a 2 0. The values of the functions Ka(x), Ka+,(x) are first obtained by means of a call of besskaol, and the recursion Ka+n+,(x) = Ka+n-I&) + J2(n+a)/xIKa+n(x) is then used.

Function Parameters:

void besskaplusn (a,x,nmax,kan)
a: float; entry: the order; a ≥ 0;
x: float; entry: the argument value, x > 0;
nmax: int; entry: the upper bound of the indices of the array kan; nmax ≥ 0;
kan: float kan[0:nmax]; exit: the values of the modified Bessel functions of the third kind of order a+n, for argument x, are assigned to kan[n], 0 ≤ n ≤ nmax.

Function used: besska01.

void besskaplusn(f1oat a, float x, int nmax, float kan[l) ( void besskaOl(float, float, float *, float * ) ; int n; float kl;


I

besska01 (a,x,&kan [OI ,&kl) ; a - = 1.0; x=2.o/x; if (nmax > 0) kan[ll =kl; for (n=2; nO, a2O. Thus, apart from the exponential factor, the functions are the same as those computed by besska0l. nonexpbessku01 evaluates the functions K,'(x) = e'K,(x) (a=a,a+l). For O= 0.5); if (rec) { na=floor (a+O.5) ; a - = na; 1

if (a == -0.5) f=g=sqrt(pi/x/2.0); else if (x c 1.0) { void besskaOl(float, float, float *, float float expon; expomexp (x); besska0l (a,x,ka, kal);


*);

f=expon*(*ka); g=expon*(*kal); } else { float b,c,e,p,q; c=O.25-a*a; b=x+x; g=l.O; f=O.O; e=cos (a*pi)/pi*x*l.0e15; n=l; do I h= (2.O* (n+x)*g-(n-l+c/n)*f) / (n+l); f=g; g=h; n++ ; } while (h*n < el; p=q=f/g; e=b-2.0: q=p*(1.0+q); n--; } while (n > 0); f=sqrt (pi/b)/ (l.O+q); g=f* (a+x+O.5-p)/x; 1 if (rec) { x=2.o/x; for (n=l; n 0; n: int; entry: the upper bound of the indices of array k; n 2 0; k: float k[O:n]; exit: kb] is the value of the modified spherical Bessel function l$+0.5'(~), j=O, ...,n.

x:

Function used: nonexpspherbessk.


#include <math.h>

void spherbessk(float x, int n, float k[])
{
    void nonexpspherbessk(float, int, float []);
    float expx;
    expx=exp(-x);
    nonexpspherbessk(x,n,k);
    /* remove the factor exp(x) carried by the scaled values */
    for (; n>=0; n--) k[n] *= expx;
}

E. nonexpspherbessi

Calculates the modified spherical Bessel functions multiplied by $e^{-x}$. nonexpspherbessi evaluates
$I'_{j+0.5}(x) = e^{-x}(\pi/(2x))^{1/2}I_{j+0.5}(x)$, j=0,...,n,
where $I_{j+0.5}(x)$ denotes the modified Bessel function of the first kind of order j+0.5, for x ≥ 0. The ratio of two subsequent elements is computed using a backward recurrence formula according to Miller's method (see [G67]). Since the zeroth element is known to be $(1-e^{-2x})/(2x)$, the other elements follow immediately. The starting value is computed by start.

Function Parameters:

void nonexpspherbessi (x,n,i)
x: float; entry: the argument of the Bessel functions; x ≥ 0;
n: int; entry: the upper bound of the indices of array i; n ≥ 0;
i: float i[0:n]; exit: i[j] is the value of the modified spherical Bessel function $I'_{j+0.5}(x)$, j=0,...,n.

Function used: start.

void nonexpspherbessi(f1oat x, int n, float i[l) ( if (X == 0.0) ( i [Ol=l.O; for ( ; n>=l; n--) i[nl=O.O; ) else ( int start (float, int, int); int m; float x2,r; x2=x+x; i[O] = x2 = ((x == 0.0) ? 1.0 : ((x2 < 0.7) ? sinh(x)/(x*exp(x)) : (1.0-exp(-x2))/~2)); if (n ! = 0) { r=O.0; m=start (x,n,1); for ( ; m>=l; m--) ( r=l.O/( (m+m+l)/x+r); if (m 0; n: int; entry: the upper bound of the indices of array k; n 2 0; k: float k[O:n]; exit: k/j] is the value of the modified spherical Bessel function K,+o,5't(x), j=O, ...,n.

x:

void nonexpspherbessk(f1oat x, int n, float k[l)

l int i; float ki,kil,kiZ; X=l.o/x; k [O]=ki2=~*1.5707963267949; if (n ! = 0) ( k [l]=kil=ki2*(l.O+x); for (i=2; i= -5.0 && z c= 8.0) { u=v=t=uc=vc=tc=1.0; s=sc=o.5; n=3 ; x=z*z*z; while (fabs(u)+fabs (v)+fabs(s)+fabs(t) > 1.0e-18) { u=u*x/ (n*(n-1)) ; v=v*x/ (n*(n+l)) ; s=s*x/(n*(n+2) ; t=t*x/ (n*(n-2)) ; UC += u; VC += v; SC += s; tc += t; *bi=sqrt3*(cl*uc+c2*z*vc); *bid=sqrt3*(cl*z*z*sc+c2*tc); if ( ~ ~ 2 . {5 ) *ai=cl*uc-c2*z*vc; *aid=cl*sc*z*z-c2*tc; return; 1

I

kl=k2=k3=k4=0.0; sqrtz=sqrt(fabs(2))

;

zt=0.666666666666667*fab~(Z)*sqrtZ;

c=sqrtlopi/sqrt(sqrtz); if (Z c 0.0) { Z = -2; co=cos(zt-pio4); si=sin(zt-pio4); for (1=1; 1c=10; I++) ( wwl=ww [ll ; pl=xx [ll/zt; pl2=pl*pl; pl1=1.O+p12; pl3=pll*pll; kl += wwl/pll; k2 += wl*pl/pll; k3 += wl*pl* (1.0+pl*(2.O/zt+pl)) /pl3; ) /zt)/pl3; k4 += wwl* (-1.0-pl*(l.~+pl*(zt-pl) 1 1

*ai=c* (co*kl+si*k2); *aid=0.25*(*ai)/z-c*sqrtz*(co*k3+si*k4); *bi=c*(co*k2-si*kl); *bid=O.25*(*bi)/z-c*sqrtz*(co*k4-si*k3); } else { if (z c 9.0)


expzt=exp (zt); else { expzt=l. 0; *expon=zt; I

B. airyzeros

Computes the zeros and associated values of the Airy functions Ai(x) and Bi(x), and their derivatives. Denoting by $a_s$, $a_s'$, $b_s$, $b_s'$ the s-th zero of Ai(x), dAi(x)/dx, Bi(x), dBi(x)/dx respectively (see [AbS65]):
$a_s = -f\{3\pi(4s-1)/8\}$, $a_s' = -g\{3\pi(4s-3)/8\}$,
$b_s = -f\{3\pi(4s-3)/8\}$, $b_s' = -g\{3\pi(4s-1)/8\}$,
where the functions f and g are defined in the documentation of airy. The appropriate member of the above set of approximations is used as the first iterate in a quadratic interpolation process for determining the zeros in question. The values of the Airy functions (and the associated values delivered in the array vai) are calculated by means of the procedure airy.

Function Parameters:

float airyzeros (n,d,zai,vai)
airyzeros: delivers the n-th zero of the selected Airy function;
d: int; entry: an integer which selects the required Airy function, d = 0, 1, 2 or 3;
zai: float zai[1:n]; exit: zai[j] contains the j-th zero of the selected Airy function: if d=0 then Ai(x), if d=1 then (d/dx)Ai(x), if d=2 then Bi(x), if d=3 then (d/dx)Bi(x);
vai: float vai[1:n]; exit: vai[j] contains the value at x = zai[j] of the following function: if d=0 then (d/dx)Ai(x), if d=1 then Ai(x), if d=2 then (d/dx)Bi(x), if d=3 then Bi(x).


Function used: airy.

float airyzeros(int n, int d, float zaill, float vai[l) I' void airy(float,float *,float *,float *,float *,float *,int); int a,found,i; float c,e,r,zaj.zak.vaj,daj.kaj,zz; a=((d == 0) 1 1 (d == 2)); r = (d == 0 1 1 d == 3) ? -1.17809724509617 : -3.53429173528852; airy(O.O,&zaj,&vaj,&daj,&kaj, &zz,l); for (i=l; i H" then a) if sign g,, = sign gs(o)then a') if H y Ig,, I , S(0) is replaced by I' and a") if sign g,. = gS(, and H" > Jg,(,, 1 , S(n) is replaced by I"; b) if sign g,. # sign gs(o)then b') if H' > Ig,(,,) 1 , S(k) is replaced by S(k-I), k=n, ...,1 , and S(0) is replaced by I' and b") (with the new value of S(n)) if sign g,. = sign gs(n)and H" > I g,(,,) 1 , S(n) is replaced by I". If H" 1 H' then similar modifications take place. sndremez is an auxiliary procedure used in minmaxpol.

Function Parameters:

void sndremez (n,m,s,g,em)
n,m: int; entry: the number of points to be exchanged is smaller than or equal to n+1; the reference set contains the numbers 0,1,...,m (m ≥ n);
s: int s[0:n]; entry: in s one must give n+1 (strictly) monotone increasing numbers out of 0,1,...,m; exit: n+1 (strictly) monotone increasing numbers out of the numbers 0,1,...,m;
g: float g[0:m]; entry: in array g[0:m] one must give the function values;
em: float em[0:1]; entry: 0 < em[0] ≤ g[i], i=0,...,m; exit: em[1] = infinity norm of array g[0:m].

Function used: infnrmvec.

float infnrmvec(int, int, int *, float [I); int sO,sn,sjpl,i,j,k,up,low,nml; float max,msjpl,hi,hj ,he,abse, h, templ,temp2; sO=sjpl=s [Ol ; he=em [O]; low=sO+l; max=msjpl=abse=fabs(he); nml=n-1;


for (j=O; j max) max=h; if (h > abse) if (he*g[il > 0.0) ( s[jl = (msjpl < h) ? i sjpl=s[j+ll ; msjpl=abse; ) else { s[jl=sjpl; sjpl=i; msjpl=h;

:

sjpl;

I

else ( s [jl=sjpl; sjpl-s j+ll ; msjpl=abse;

J

sn=s [nl ; s [nl=sjpl; hi=infnrmvec(O,sO-1,&i,g); hj=infnrmvec(sn+l,m,&j ,g); if (j > m) ,j=m; if (hi > h)) { if (hi > max) max=hi; templ = (g[il == 0.0) ? 0.0 : ((g[i] > 0.0) ? 1.0 : -1.0); temp2 = (g[s[Oll==O.O) ? 0.0 : ((g[s[Oll>O.O) ? 1.0 : -1.0); if (templ == temp2) ( if (hi > fabs(g[s [OI 1)) (

L ~ s e{ if (hi > fabs(g[s[nll ) ) { s[n] = (g[jl/g[s[nmlll > 1.0) ? j : s[nmll; for (k=nml; k>=l; k--) s [kl=s [k-11; s [Ol=i; \

I

) else { if (hj > max) max=hj; templ = (g[jl == 0.0) ? 0.0 : ((g[j] > 0.0) ? 1.0 : -1.0); temp2 = (g[s[nll==O.O)? 0.0 : ((g[s[nll>O.O) ? 1.0 : -1.0); if (templ == temp2) { if (hj > fabs(g[s [nll ) ) { s [nl=j; if (g[il/g[s[Oll > 1.0) s[Ol=i; I

) else if (hj > fabs (g[s [OI1 ) ) { s[Ol = (g[il/g[s[ll] > 1.0) ? i : s[ll; for (k=l; k n). The method used [Me641 involves the iterative construction of polynomials

for which

where s(k,j), j=O, ...,n, is a strictly increasing sequence selected from O,l,...,m. The discrepancies

' .

at all points j=0,...,m are then inspected, and from these a sequence $g_{k+1,j} = g_{k,s(k+1,j)}$, j=0,...,n, is constructed which a) possesses the alternating property sign $g_{k+1,j}$ = -sign $g_{k+1,j-1}$, j=1,...,n, and b) for which $|g_{k+1,j}| \ge |g_{k,j}|$, j=0,...,n, with (unless the process is to be terminated) $|g_{k+1,j}| > |g_{k,j}|$ for at least one j (for the details of this construction, see the documentation to sndremez). The coefficients $\{c_{k+1,i}\}$ in the succeeding polynomial are determined (by use of Newton's interpolation formula, and subsequent reduction to the given polynomial form) from condition (2) with k replaced by k+1. Initially s(0,j), j=0,...,n, is a sequence of integer approximations to the points at which $T_n((2x-m)/m)$, where $T_n(y) = \cos\{n\arccos(y)\}$ is a Chebyshev polynomial (see the documentation of ini), assumes its maximal absolute values in the range 0 ≤ x ≤ m. The procedure ini is used to set the s(0,j).

Function Parameters:

void minmaxpol (n,m,y,fy,co,em)
n: int; entry: the degree of the approximating polynomial; n ≥ 0;
m: int; entry: the number of reference function values is m+1;
y,fy: float y[0:m], fy[0:m]; entry: fy[i] is the function value at y[i], i=0,...,m;
co: float co[0:n]; exit: the coefficients of the approximating polynomial (co[i] is the coefficient of $y^i$);
em: float em[0:3]; entry: em[2]: maximum allowed number of iterations, say 10*n+5; exit: em[0]: the difference of the given function and the polynomial in the first approximation point; em[1]: the infinity norm of the error of approximation over the discrete interval; em[3]: the number of iterations performed.

Functions used: elmvec, dupvec, newton, pol, newgrn, ini, sndremez.

void minmaxpol (int n, int m, float y [I , float fy [I , float co [I , float em [I ) { int *allocate-integer-vector(int, int); float *allocate-real-vector(int, int); void free-integer-vector (int *, int); void free-real-vector(f1oat *, int); void elmvec(int, int, int, float [I, float [I, float); void dupvec(int, int, int, float [I , float [I ) ; float pol (int, float, float [I ) ; void newton (int, float [I , float [I ) ; void newgrn (int, float [I , float [I ) ; void ini (int, int, int 11 ) ; void sndremez(int, int, int [I, float [I, float [I); int np~,k,pomk,count,cnt,j,mi,*s,sjml,sj,sO,up; float e,abse,abseh,*x,*b,*coef,*g;

npl=n+l; ini (npl,m,s) ; mi=em [21 ; abse=O .0; T t = l ; pomk=l; for (k=O; k 3 :

RETA le-01 1.0e-02 1.0e-03 1.0e-04

The van der Pol equation $d^2y/dx^2 = 10(1-y^2)(dy/dx) - y$, x ≥ 0, with y = 2, dy/dx = 0 at x = 0, can be integrated by rk2. At the points x = 9.32386578, 18.86305405, 28.40224162, 37.94142918 the derivative dy/dx vanishes. Note that the computations are carried out in double precision by using the macro statement: #define float double.
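The first of the quoted turning points can be reproduced independently of rk2. The sketch below is an assumption-laden stand-in, not the library routine: it uses the classical fixed-step fourth-order Runge-Kutta method on the equivalent first-order system y' = z, z' = 10(1-y^2)z - y, and locates the first change of sign of z from negative to non-negative by linear interpolation. The names vdp_accel and vdp_first_turning_point and the step size are illustrative choices.

```c
#include <math.h>

/* Second derivative of the van der Pol solution: z' = 10(1-y^2)z - y. */
static double vdp_accel(double y, double z)
{
    return 10.0 * (1.0 - y * y) * z - y;
}

/* Hypothetical helper: integrates from x = 0, y = 2, z = dy/dx = 0 with
   fixed step h and returns the first x > 0 at which dy/dx passes from
   negative to non-negative; returns -1.0 if none is found. */
double vdp_first_turning_point(double h)
{
    double x = 0.0, y = 2.0, z = 0.0;
    long step;

    for (step = 0; step < 100000000L; step++) {
        double k1y = z;
        double k1z = vdp_accel(y, z);
        double k2y = z + 0.5 * h * k1z;
        double k2z = vdp_accel(y + 0.5 * h * k1y, z + 0.5 * h * k1z);
        double k3y = z + 0.5 * h * k2z;
        double k3z = vdp_accel(y + 0.5 * h * k2y, z + 0.5 * h * k2z);
        double k4y = z + h * k3z;
        double k4z = vdp_accel(y + h * k3y, z + h * k3z);
        double y_new = y + h * (k1y + 2.0 * k2y + 2.0 * k3y + k4y) / 6.0;
        double z_new = z + h * (k1z + 2.0 * k2z + 2.0 * k3z + k4z) / 6.0;

        if (z < 0.0 && z_new >= 0.0)          /* sign change of dy/dx */
            return x + h * (-z) / (z_new - z);
        x = (step + 1) * h;
        y = y_new;
        z = z_new;
    }
    return -1.0;
}
```

With a step such as h = 1.0e-4 the returned abscissa should lie close to the first value listed above, x = 9.32386578.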

#define float double

float fxyz(float x, float y, float z)
{
    /* right-hand side of y'' = 10(1-y*y)y' - y, with z = y' */
    return 10.0*(1.0-y*y)*z-y;
}

void main 0 void rk2(float *, float, float, float *, float, float * , float, float ( * ) (float, float, float), float [I, float [I, int); int i,fi; float x,y,z,e[SI,d[61, b[4]=(9.32386578, 18.86305405, 28.40224162, 37.94142918);

.

e [I]=e [21=e [31=e [41=e [51=l.Oe-8; printf (ItRK2delivers : \nW) ; for (i=O; iO; the result is multiplied by ex. NONEXPBESSK calculates the modified bessel functions of the third kind of order 1, 1=0,...,n, with argument x, x>O; the result is multiplied by ex. BESSJAPLUSN calculates the bessel functions of the first kind of order a+k, O&%, O=lr; i--) free((char*) (m[il+lc)); free ( (char*) (m+lr)) ;

1
