Rudolf Rabenstein Maximilian Schäfer
Multidimensional Signals and Systems
Theory and Foundations
Rudolf Rabenstein
Multimedia Communications and Signal Processing
Friedrich-Alexander-Universität Erlangen-Nürnberg (FAU)
Erlangen, Germany

Maximilian Schäfer
Digital Communications
Friedrich-Alexander-Universität Erlangen-Nürnberg (FAU)
Erlangen, Germany
ISBN 978-3-031-26513-6    ISBN 978-3-031-26514-3 (eBook)
https://doi.org/10.1007/978-3-031-26514-3

© Springer Nature Switzerland AG 2023

This work is subject to copyright. All rights are reserved by the Publisher, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation, broadcasting, reproduction on microfilms or in any other physical way, and transmission or information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology now known or hereafter developed.

The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant protective laws and regulations and therefore free for general use.

The publisher, the authors, and the editors are safe to assume that the advice and information in this book are believed to be true and accurate at the date of publication. Neither the publisher nor the authors or the editors give a warranty, express or implied, with respect to the material contained herein or for any errors or omissions that may have been made. The publisher remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

This Springer imprint is published by the registered company Springer Nature Switzerland AG. The registered company address is: Gewerbestrasse 11, 6330 Cham, Switzerland.
Preface
This book presents the theory and foundations of multidimensional signals and systems. It takes a comprehensive approach without reference to a specific application field. It is followed up by a second volume with details on applications in different areas.

The book is intended for graduate students in various disciplines of engineering and science, for doctoral students, and for researchers in engineering or applied physics. Readers are expected to have a solid knowledge of mathematics at the undergraduate level and of basic physics and its applications in either electrical or mechanical engineering. Knowledge of one-dimensional signals and systems is an advantage but no strict prerequisite.

The presented material is based on a one-semester course on multidimensional signals and systems that has been taught for several years at the Chair of Multimedia Communications and Signal Processing at the Friedrich-Alexander-Universität Erlangen-Nürnberg (FAU) in Germany.

Instructors planning to teach a course based on this book might want to consider the previous knowledge of their audience. Students from science, mathematics, mechanical engineering, or computer science with no previous exposure to one-dimensional signals and systems will benefit from the review in Chap. 3. Students of electrical engineering, communications, or control theory with knowledge of undergraduate signals and systems can proceed from the overview in Chap. 2 directly to Chap. 4 on signal spaces. It may be covered in detail for students without previous knowledge of Hilbert spaces or treated only briefly otherwise.

The theory of multidimensional signals starts in Chap. 5 and proceeds with multidimensional transformations and sampling in the following chapters. Discrete and continuous multidimensional systems follow in Chaps. 8 and 9, respectively. The Sturm-Liouville transformation as a tool for boundary-value problems is presented in detail in Chap. 10, with solution methods and associated discrete-time algorithms in the final Chaps. 11 and 12. A selection of applications from the upcoming second volume is a good complement to the theory and foundations of multidimensional signals and systems in this volume.
The interest in the content of this book grew out of research performed in funded projects at the Faculty of Engineering at the FAU and at the Physics Department of the University of Siegen in Germany. The project partners from all over Europe that have participated over the years are too numerous to mention.

This book would not have been possible without the direct support of many people. First of all, we would like to thank Peter Steffen, a teacher and a friend, for his introduction to the subject early on and for proofreading the entire manuscript. The first author has learned a lot from working with his doctoral students Lutz Trautmann, Stefan Petrausch, Sascha Spors, Achim Kuntz, Paolo Annibale, Maximilian Schäfer, and Christian Strobl. The foundations laid out here have been acquired with support from André Kaup, head of the Chair of Multimedia Communications and Signal Processing, and from Robert Schober, head of the Institute of Digital Communications, both at the FAU. Research on audio engineering including two European projects has been conducted in close cooperation with Walter Kellermann, also at the FAU.

Our sincere thanks go to Susan Evans from Springer Nature for initiating this project and for patiently supporting it through the various stages of completion. She also arranged a review of an early draft by anonymous experts, who suggested various valuable improvements.

Most figures in this book have been prepared with TikZ [4] and its various packages, including the signalflow library [3], and with material on 3D graphics [2]. Some colormaps are based on colorbrewer [1]. We thank Edward H. Adelson for his friendly permission to use the flowergarden sequence.

Erlangen, Germany
April 2023
Rudolf Rabenstein
Maximilian Schäfer
References

1. Brewer, C.A.: Colorbrewer. https://colorbrewer2.org/. Accessed April 3, 2022
2. Miani, M.: Example: Spherical and cartesian grids. https://texample.net/tikz/examples/spherical-and-cartesian-grids. Accessed April 3, 2022
3. Ochs, K.: Example: Signal flow building blocks. https://texample.net/tikz/examples/signal-flow-building-blocks/. Accessed April 3, 2022
4. Tantau, T.: The TikZ and PGF packages (2022). https://pgf-tikz.github.io/pgf/pgfmanual.pdf. Accessed April 3, 2022
Contents

1 Introduction
   References

2 Overview on Multidimensional Signals
   2.1 One-Dimensional Signals
      2.1.1 One-Dimensional Time-Dependent Signals
      2.1.2 One-Dimensional Space-Dependent Signals
   2.2 Two-Dimensional Signals
      2.2.1 Two-Dimensional Space-Dependent Signals
      2.2.2 Two-Dimensional Space- and Time-Dependent Signals
   2.3 Three-Dimensional Signals
      2.3.1 Three-Dimensional Space-Dependent Signals
      2.3.2 Three-Dimensional Space- and Time-Dependent Signals
   2.4 Four-Dimensional Signals
   2.5 Higher Dimensional Signals
   2.6 Properties of Multidimensional Signals
   2.7 Multidimensional Systems
      2.7.1 Autonomous Systems and Input-Output-Systems
      2.7.2 Linear, Time-, and Shift-Invariant Systems
      2.7.3 Mathematical Formulation of Multidimensional Systems
   2.8 Overview on the Next Chapters
   2.9 Problems
   References

3 Elements from One-Dimensional Signals and Systems
   3.1 Convolution and Impulse Response
      3.1.1 An Introductory Example
      3.1.2 Delta Impulse
      3.1.3 Convolution
   3.2 Fourier Transformation
      3.2.1 Eigenfunctions of Linear and Time-Invariant Systems
      3.2.2 Definition of the Fourier Transformation
      3.2.3 Correspondences
      3.2.4 Properties
      3.2.5 Summary of Correspondences and Properties of the Fourier Transformation
   3.3 Sampling
      3.3.1 Sampling of Continuous-Time Functions
      3.3.2 Spectrum of a Sampled Signal
   3.4 Differential Equations and Transfer Functions
      3.4.1 A Light Example
      3.4.2 State Space Systems
      3.4.3 Conclusion and Outlook
   3.5 Problems
   References

4 Signal Spaces
   4.1 Foundations
      4.1.1 Vectors, Functions, and Signals
      4.1.2 Topics from Signal Processing
   4.2 Introduction to Signal Spaces
      4.2.1 What Is a Signal Space?
      4.2.2 Scalar Product
      4.2.3 Norm, Distance, and Angle
      4.2.4 Completeness
      4.2.5 Hilbert Spaces
      4.2.6 Extension to Generalized Functions
   4.3 Orthogonality
      4.3.1 Definition
      4.3.2 Perpendicularity in Two Dimensions
      4.3.3 Expansion into Basis Vectors
      4.3.4 Gram-Schmidt Orthogonalization
   4.4 Duality and Biorthogonality
      4.4.1 Dual Spaces and Biorthogonal Signal Spaces
      4.4.2 Sets of Biorthogonal Vectors
      4.4.3 General Biorthogonal Signal Spaces
   4.5 Signal Transformations
      4.5.1 General Procedure
      4.5.2 Discrete Fourier Transformation
      4.5.3 Fourier Series
      4.5.4 Discrete-Time Fourier Transformation
      4.5.5 Fourier Transformation
      4.5.6 Discrete Cosine Transformation
      4.5.7 Summary of the Fourier-Type Transformations
      4.5.8 A Review of the Notation
      4.5.9 Signal Transformations with Complex Frequency Variables
   4.6 Problems
   References

5 Multidimensional Signals
   5.1 Properties
      5.1.1 Basic Properties
      5.1.2 Separable Signals
      5.1.3 Symmetrical Signals
      5.1.4 Coordinate Systems
      5.1.5 Symmetry and Separability in Different Coordinate Systems
   5.2 Convolution
      5.2.1 Review of One-Dimensional Convolution
      5.2.2 Definitions and Notation
      5.2.3 Convolution of Separable Signals
      5.2.4 2D Convolution and Imaging
   5.3 Distributions
      5.3.1 A Note on Dimensionality
      5.3.2 Two-Dimensional Point Impulses
      5.3.3 Two-Dimensional Line Impulses
      5.3.4 Properties of Line Impulses
      5.3.5 Ring Impulses
      5.3.6 Combinations of Line Impulses
      5.3.7 Applications of 2D Impulses
      5.3.8 Point Impulses in Different Coordinate Systems
      5.3.9 Review of Delta Impulses
   5.4 Problems
   References

6 Multidimensional Transformations
   6.1 Fourier Transformation in Cartesian Coordinates
      6.1.1 Definition of the 2D Fourier Transformation
      6.1.2 Fourier Transforms of Frequently Used Functions
      6.1.3 Basic Properties of the 2D Fourier Transformation
   6.2 Affine Mappings
      6.2.1 Motivation
      6.2.2 Affine Mappings in Two Dimensions
      6.2.3 Properties of Affine Mappings
      6.2.4 Coordinate Transformation in Multiple Integrals
   6.3 More Properties of the 2D Fourier Transformation
      6.3.1 Affine Theorem
      6.3.2 Special Cases
      6.3.3 Projection Slice Theorem
      6.3.4 Radon Transformation
      6.3.5 Back-Projection
      6.3.6 Differentiation
      6.3.7 Summary of Correspondences and Properties
   6.4 Fourier Transformation in Non-Cartesian Coordinates
   6.5 Fourier Transformation in Polar Coordinates
      6.5.1 Notation and Overview
      6.5.2 Coordinate Transformation
      6.5.3 Angular Series Expansion
      6.5.4 Hankel Transformation
      6.5.5 Summary of the Fourier Transformation in Polar Coordinates
   6.6 Fourier Transformation in Spherical Coordinates
      6.6.1 Notation
      6.6.2 Coordinate Transformation
      6.6.3 Angular Series Expansion
      6.6.4 Summary of the Fourier Transformation in Spherical Coordinates
   6.7 Other 2D Transformations
      6.7.1 2D Discrete-Time Fourier Transformation
      6.7.2 2D z-Transformation
   6.8 Problems
   References

7 Multidimensional Sampling
   7.1 Rectangular Sampling of Two-Dimensional Signals
      7.1.1 Rectangular Sampling of a Continuous 2D Signal
      7.1.2 2D Rectangular Sampling in Vector Notation
   7.2 2D Sampling on General Sampling Grids
      7.2.1 A Glimpse of Lattice Theory
      7.2.2 Definition of a General 2D Sampling Grid
      7.2.3 Repetition Pattern for General 2D Sampling
      7.2.4 Summary of Non-Rectangular Sampling and Spectral Repetition
      7.2.5 An Extended Example for Non-Rectangular Sampling
   7.3 Aliasing
      7.3.1 Aliasing in One Dimension
      7.3.2 Aliasing in Two Dimensions
   7.4 Summary of 1D and 2D Sampling
   7.5 Frequently Used Sampling Lattices
      7.5.1 Introduction
      7.5.2 Sampling Density
      7.5.3 2D Continuous Functions
      7.5.4 Rectangular Sampling
      7.5.5 Diagonal Sampling
      7.5.6 Hexagonal Sampling
      7.5.7 Summary and Outlook
   7.6 Problems
   References

8 Discrete Multidimensional Systems
   8.1 Discrete Finite Impulse Response Systems
      8.1.1 Discrete Convolution
      8.1.2 Definition of 2D FIR Systems
      8.1.3 Order of Computations
      8.1.4 Typical 2D FIR Systems
      8.1.5 Transfer Functions of 2D FIR Systems
      8.1.6 Stability of 2D FIR Systems
   8.2 Discrete Infinite Impulse Response Systems
      8.2.1 Definition of 2D IIR Systems
      8.2.2 Order of Computations
      8.2.3 Transfer Functions of 2D IIR Systems
      8.2.4 Stability of 2D IIR Systems
      8.2.5 Application of 2D IIR Systems
   8.3 Discretization of Differential Equations
      8.3.1 Continuous Poisson Equation
      8.3.2 The Method of Weighted Residuals
      8.3.3 Numerical Methods
   8.4 Iterative Methods for the Solution of Systems of Linear Equations
      8.4.1 Finite Difference Approximation
      8.4.2 Iterative Solution of Large Systems of Linear Equations
      8.4.3 Classical Iteration Methods
      8.4.4 Convergence
      8.4.5 Review of Iterative Matrix Inversion
   8.5 Multidimensional Systems and MIMO Systems
      8.5.1 Multiple-Input Multiple-Output Systems
      8.5.2 Relations Between Multidimensional Systems and MIMO Systems
      8.5.3 Conclusions
   8.6 Summary and Outlook
   8.7 Problems
   References

9 Continuous Multidimensional Systems
   9.1 Distributed Parameter Systems
      9.1.1 Lumped and Distributed Parameter Systems
      9.1.2 Electrical Transmission Line
      9.1.3 Telegraph Equation
      9.1.4 Special Cases
      9.1.5 Initial and Boundary Conditions
      9.1.6 Summary
   9.2 Scalar Linear Partial Differential Equations
      9.2.1 Spatio-Temporal Domain
      9.2.2 Initial-Boundary-Value Problems for a Scalar Variable
      9.2.3 Partial Differential Equations
      9.2.4 Initial Conditions
      9.2.5 Boundary Conditions
      9.2.6 Telegraph Equation
   9.3 Vector-Valued Linear Partial Differential Equations
      9.3.1 Coupled Partial Differential Equations
      9.3.2 A Note on Analogies of Physical Variables
      9.3.3 Boundary Conditions
   9.4 Vector-Valued and Scalar Partial Differential Equations
      9.4.1 Converting a Vector Representation into a Scalar Representation
      9.4.2 Converting a Scalar Representation into a Vector Representation
      9.4.3 Transformation of the Dependent Variables
   9.5 General Solution
      9.5.1 Review of One-Dimensional Systems
      9.5.2 Solution of Multidimensional Systems
   9.6 Problems
   References

10 Sturm-Liouville Transformation
   10.1 Introductory Example
      10.1.1 Physical Problem
      10.1.2 Laplace Transformation
      10.1.3 Finite Fourier-Sine Transformation
      10.1.4 Transfer Function Description
      10.1.5 Inverse Fourier Sine Transformation
      10.1.6 Inverse Laplace Transformation
      10.1.7 Solution in the Space-Time Domain
      10.1.8 Review
   10.2 Spatial Differentiation Operators
      10.2.1 Initial-Boundary-Value Problem
      10.2.2 Spatial Differentiation Operator and its Adjoint Operator
      10.2.3 Eigenfunctions of the Spatial Operators
      10.2.4 Recapitulation of the Eigenvalue Problems
   10.3 Spatial Transformation
      10.3.1 Eigenfunctions and Basis Functions
      10.3.2 Definition of the Sturm-Liouville Transformation
      10.3.3 Differentiation Theorem
      10.3.4 Application to the Initial-Boundary-Value Problem
      10.3.5 Transfer Function Description
   10.4 Green’s Functions
      10.4.1 Green’s Function for the Initial Value
      10.4.2 Green’s Function for the Excitation Function
      10.4.3 Green’s Function for the Boundary Value
      10.4.4 Summary of the Green’s Functions
      10.4.5 Response to an Impulse
   10.5 Propagator
      10.5.1 Definition
      10.5.2 Properties
   10.6 Review of the Solution of Initial-Boundary-Value Problems
   10.7 Continuous Multidimensional Systems with Space-Dependent Coefficients
      10.7.1 Introduction
      10.7.2 Formulation in the Time-Domain and Frequency-Domain
      10.7.3 Adjoint Operator
      10.7.4 Eigenvalue Problems
      10.7.5 Sturm-Liouville Transformation
      10.7.6 Transfer Function Description
   10.8 Classical Sturm-Liouville Problems
      10.8.1 Derivation from a 2 × 2 Eigenvalue Problem
      10.8.2 Properties
      10.8.3 Solution of Classical Sturm-Liouville Problems
      10.8.4 Legendre Polynomials
      10.8.5 Associated Legendre Functions
      10.8.6 Bessel Functions
   10.9 Problems
   References

11 Solution Methods
   11.1 Solution of Eigenvalue Problems
      11.1.1 An Introductory Example
      11.1.2 Definition and Properties of the Matrix Exponential
      11.1.3 Representations of the Matrix Exponential by Finite Sums
      11.1.4 Matrix Representation of a 1D Continuous-Time System
      11.1.5 Calculation of the Matrix Exponential
      11.1.6 Summary and Examples
   11.2 Calculation of Further Quantities
      11.2.1 Matrix Exponential of the Adjoint Operator
      11.2.2 Normalization Factor Nµ
      11.2.3 Boundary Terms
      11.2.4 Conclusions
   11.3 Solve Initial-Boundary-Value Problems in Seven Steps
      11.3.1 Seven Step Procedure
      11.3.2 Example for the Solution of Sturm-Liouville Problems
   11.4 Problems
   References

12 Algorithmic Implementation
   12.1 State-Space Representation of Continuous Multidimensional Systems
      12.1.1 Infinite-Dimensional Linear Operators
      12.1.2 State Space Representation
      12.1.3 Relation to the Green’s Functions and to the Propagator
      12.1.4 Enumeration of the Eigenvalues and Eigenfunctions
   12.2 Time Discretization
      12.2.1 Bilinear Transformation
      12.2.2 Impulse Invariant Transformation
      12.2.3 Comparing Bilinear and Impulse Invariant Transformation
      12.2.4 Discrete-Time Algorithmic Structure
   12.3 Outlook
   12.4 Problems
   References

Solutions to the Problems
Index
Acronyms
1D       One-dimensional
2D       Two-dimensional
3D       Three-dimensional
nD       n-dimensional
(n + 1)D n space dimensions plus time
(n + t)D Alternative for (n + 1)D
ADC      Analog-to-digital converter
BC       Boundary condition
BIBO     Bounded-input bounded-output
DC       Direct current
DCT      Discrete cosine transform
DFT      Discrete Fourier transform
DTFT     Discrete-time Fourier transform
FAU      Friedrich-Alexander-Universität Erlangen-Nürnberg
FD       Finite difference
FEM      Finite element method
FIR      Finite impulse response
FS       Fourier series
FT       Fourier transform
IBVP     Initial-boundary value problem
IC       Initial condition
IIR      Infinite impulse response
LSI      Linear shift invariant
LTI      Linear time invariant
MIMO     Multi-input multi-output
PDE      Partial differential equation
RMS      Root mean square
SL       Sturm-Liouville
w.r.t.   With respect to
Chapter 1
Introduction
The theory of signals and systems is a well-established subject in the curricula of electrical engineering, communications, signal processing, and control theory. It provides indispensable tools for the analysis and design of complex technical systems as well as for understanding natural processes or for describing interdependencies in the economy. Concepts from signals and systems are widely applicable because they abstract from particular physical effects. Instead they rely on the unifying power of mathematical models in the form of differential or difference equations, impulse responses, and transfer functions, to name just a few. Suitable input-output descriptions of discrete-time systems often lead directly to efficient signal processing algorithms. In short, the theory of signals and systems is the universal workhorse for the representation, analysis, and design of dynamical systems.

However, most courses and textbooks on signals and systems are confined to time-dependent quantities and do not consider space or other independent variables. The resulting one-dimensional signals are most suitable for applications in communications, control, economics, biology, and other fields where only the temporal evolution is of interest. Nevertheless, one-dimensional signals and systems are only a subset. There are also many examples of signals which depend on two or more independent variables: Images of all kinds are described by two dimensions in space, video frame sequences depend on space and time. The propagation of acoustic or electromagnetic waves in fluid or solid media requires one to three spatial dimensions plus time. The same is true for the transport of matter by diffusion or for the transport of heat by conduction.

The mathematical tools for functions with two and more independent variables have long been provided. Differentiation with respect to space is expressed by the operators gradient, divergence, and rotation. Spatial integration is described by line, surface, and volume integrals. Their interrelations are formulated by the integral theorems of Gauss, Green, and Stokes. These mathematical techniques were of paramount importance for the mathematical foundation of physics, e.g. [11, 12, 28].

However, engineering applications of spatial signals with analog means had initially been based on simpler methods. Early examples are optical photography and analog television. This situation changed with the advent of digital computing. It enabled a broad application of processing techniques for signals with two and more independent variables, henceforth called multidimensional signals.

An early application of multidimensional signal processing has been the exploration of oil and gas fields based on seismic migration of waves in solid stratified media [5, 6, 10]. The required geophysical data were recorded by spatially distributed sensors on the earth's surface. The general theory for processing data with a continuous time variable and one or more discrete space variables emerged as array signal processing [22, 30, 33]. Other application fields boosted by digital storage and computing were signal processing [4, 16, 23, 40], image and video processing [7, 21, 25, 36, 37, 41], medical imaging [26], and more recently spatial sound reproduction and digital sound synthesis [2, 17, 20, 31, 32, 38, 39, 42, 43].

In parallel to these application-driven evolutions, the theory of multidimensional systems has been further developed from the viewpoints of operator theory [9, 24], infinite-dimensional system theory [13, 14, 29], and distributed-parameter systems [8], with applications in control theory [15, 18, 19, 34, 35], applied physics [27], and electrical engineering [3].

Owing to this historical development, the literature dealing with multidimensional signals and systems has branched out into different directions. The theoretical work is divided into books founded on operator theory and infinite-dimensional system theory [8, 9, 13, 14, 24, 29] and into books which extend the well-known theory of one-dimensional linear systems to two and more dimensions [4, 16, 23, 40]. On the application side exist well-separated communities in image and video processing, medical imaging, audio engineering, and control with disjoint journals and conferences. Consequently, the field of multidimensional signals and systems is much less coherent than the theory of one-dimensional signals and systems as familiar from many undergraduate engineering curricula.

The remainder of this book gives an introduction to the field of multidimensional signals and systems with two objectives in mind:

• The theory shall be accessible to readers with a solid background in one-dimensional signals and systems as well as in the topics of undergraduate mathematics as taught to engineers. All further theoretical concepts are developed in the course of the book.
• The presentation is not geared towards a particular application field. Instead, the intention is to show how the theoretical concepts of multidimensional signals and systems can be applied to various practical situations.

This volume introduces the basic theory of multidimensional signals and systems. A review of some elements from one-dimensional signals and systems serves as a starting point for the generalization to multiple dimensions. Then discrete and continuous multidimensional systems are covered along with methods and algorithms for their implementation. Although this volume is mainly devoted to theory, it uses elements from several applications as examples, where appropriate.

A separate volume [1] reviews and expands the theory from the viewpoint of different application fields. The theory of electrical transmission lines is covered as a classical example of distributed parameter systems in engineering. Broad consideration is given to the generation and propagation of acoustic signals in time and space because these topics link multidimensional theory with current practical applications in audio engineering. An emerging field is molecular communication, which relies on the transport of small particles such as molecules rather than the propagation of waves. Therefore, particle diffusion is presented as another connection between physics and multidimensional signals and systems.
References

1. Rabenstein, R., Schäfer, M.: Multidimensional Signals and Systems: Applications. Springer Nature, Heidelberg, Berlin (to appear)
2. Ahrens, J.: Analytic Methods of Sound Field Synthesis. T-Labs Series in Telecommunication Services. Springer, Berlin (2012)
3. Antonini, G., Orlandi, A., Pignari, S.A.: Review of Clayton R. Paul studies on multiconductor transmission lines. IEEE Transactions on Electromagnetic Compatibility 55(4), 639–647 (2013). https://doi.org/10.1109/TEMC.2013.2265038
4. Bamler, R.: Mehrdimensionale lineare Systeme. Springer, Berlin (1989)
5. Berkhout, A.J.: Seismic Migration. Imaging of Acoustic Energy by Wave Field Extrapolation, Developments in Solid Earth Geophysics, vol. 12. Elsevier Scientific Publ., Amsterdam (1980)
6. Berkhout, A.J.: Applied Seismic Wave Theory, Advances in Exploration Geophysics, vol. 1. Elsevier, Amsterdam (1987)
7. Bracewell, R.N.: Fourier Analysis and Imaging. Kluwer Academic/Plenum Publishers, New York (2003)
8. Butkovskiy, A.: Structural Theory of Distributed Systems. Ellis Horwood Ltd., Chichester, England (1983)
9. Churchill, R.V.: Operational Mathematics, 3rd edn. McGraw-Hill, Boston, Massachusetts (1972)
10. Claerbout, J.F.: Fundamentals of Geophysical Data Processing. With Applications to Petroleum Prospecting. McGraw-Hill International Series in the Earth and Planetary Sciences. McGraw-Hill, New York (1976)
11. Courant, R., Hilbert, D.: Methods of Mathematical Physics, Vol. 1, 1st English edn. Wiley, New York (1989, 1937). https://onlinelibrary.wiley.com/doi/book/10.1002/9783527617210
12. Courant, R., Hilbert, D.: Methods of Mathematical Physics, Vol. 2, Partial Differential Equations. Wiley Classics Library. Interscience Publishers, New York (1989). https://onlinelibrary.wiley.com/doi/book/10.1002/9783527617234
13. Curtain, R., Zwart, H.: An Introduction to Infinite-Dimensional Systems Theory. Springer-Verlag, New York (1995)
14. Curtain, R.F., Pritchard, A.J.: Infinite Dimensional Linear Systems Theory, Lecture Notes in Control and Information Sciences, vol. 8. Springer, Berlin (1978)
15. Deutscher, J.: Zustandsregelung verteilt-parametrischer Systeme. Springer, Berlin (2012)
16. Dudgeon, D.E., Mersereau, R.M.: Multidimensional Digital Signal Processing. Prentice-Hall, Englewood Cliffs, NJ (1984)
17. Fazi, F.M., Nelson, P.A.: A multi-channel audio system based on the theory of integral equations. The Journal of the Acoustical Society of America 125(4), 2543–2543 (2009). https://doi.org/10.1121/1.4783610
18. Franke, D.: Systeme mit örtlich verteilten Parametern. Eine Einführung in die Modellbildung, Analyse und Regelung. Hochschultext. Springer, Berlin u.a. (1987)
19. Gilles, E.: Systeme mit verteilten Parametern: Einf. in d. Regelungstheorie. Methoden der Regelungstechnik. Oldenbourg (1973)
20. Helwani, K., Spors, S., Buchner, H.: The synthesis of sound figures. Multidimensional Systems and Signal Processing (2013). https://doi.org/10.1007/s11045-013-0261-4
21. Jain, A.K.: Fundamentals of Digital Image Processing. Prentice-Hall, Englewood Cliffs, NJ (1989)
22. Johnson, D.H., Dudgeon, D.E.: Array Signal Processing. Concepts and Techniques. Prentice Hall Signal Processing Series. PTR Prentice Hall, Englewood Cliffs, NJ (1993)
23. Kaczorek, T.: Two-Dimensional Linear Systems. Springer, Berlin (1985)
24. Kato, T.: Perturbation Theory for Linear Operators. Springer-Verlag, Berlin, Germany (1976)
25. Lim, J.S.: Two-Dimensional Signal and Image Processing. Prentice-Hall, Upper Saddle River, NJ (1990)
26. Maier, A., Steidl, S., Christlein, V., Hornegger, J. (eds.): Medical Imaging Systems. An Introductory Guide, Lecture Notes in Computer Science, vol. 11111. Springer Open, Cham (2018). https://doi.org/10.1007/978-3-319-96520-8
27. Mikhailov, M.D., Özişik, M.N.: Unified Analysis and Solutions of Heat and Mass Diffusion. Dover Publications, Inc., New York (1994)
28. Morse, P.M., Feshbach, H.: Methods of Theoretical Physics. International Series in Pure and Applied Physics. McGraw-Hill; Kogakusha Comp., New York, NY; Tokyo (1953)
29. Pazy, A.: Semigroups of Linear Operators and Applications to Partial Differential Equations. Springer, New York (1983). https://doi.org/10.1007/978-1-4612-5561-1
30. Pillai, S.U.: Array Signal Processing. Springer, New York u.a. (1989)
31. Poletti, M.A.: A unified theory of horizontal holographic sound systems. J. Audio Eng. Soc. 48(12), 1155–1182 (2000). http://www.aes.org/e-lib/browse.cfm?elib=12033
32. Poletti, M.A.: Three-dimensional surround sound systems based on spherical harmonics. J. Audio Eng. Soc. 53(11), 1004–1025 (2005). http://www.aes.org/e-lib/browse.cfm?elib=13396
33. Rafaely, B.: Fundamentals of Spherical Array Processing, Springer Topics in Signal Processing, vol. 16, 2nd edn. Springer, Cham, Switzerland (2019)
34. Rogers, E., Galkowski, K., Owens, D.H.: Control Systems Theory and Applications for Linear Repetitive Processes, vol. 349. Springer (2007). https://eprints.soton.ac.uk/263634/
35. Rogers, E., Galkowski, K., Paszke, W., Moore, K.L., Bauer, P.H., Hladowski, L., Dabkowski, P.: Multidimensional control systems: case studies in design and evaluation. Multidimensional Systems and Signal Processing 26(4), 895–939 (2015)
36. Schroeder, H., Blume, H.: One- and Multidimensional Signal Processing. Algorithms and Applications in Image Processing. Wiley, Chichester (2000)
37. Smirnov, A.: Processing of Multidimensional Signals. Digital Signal Processing. Springer, Berlin (1999)
38. Spors, S., Wierstorf, H., Raake, A., Melchior, F., Frank, M., Zotter, F.: Spatial sound with loudspeakers and its perception: A review of the current state. IEEE Proceedings 101(9), 1920–1938 (2013). https://doi.org/10.1109/JPROC.2013.2264784
39. Teutsch, H.: Modal Array Signal Processing: Principles and Applications of Acoustic Wavefield Decomposition. No. 348 in Lecture Notes in Control and Information Sciences. Springer, Berlin (2007)
40. Tzafestas, S.G. (ed.): Multidimensional Digital Signal Processing. Marcel Dekker Inc., New York and Basel (1984)
41. Woods, J.: Multidimensional Signal, Image, and Video Processing and Coding, 2nd edn. Elsevier, Amsterdam (2012)
42. Wu, Y.J., Abhayapala, T.D.: Spatial multizone soundfield reproduction: Theory and design. IEEE Transactions on Audio, Speech, and Language Processing 19(6), 1711–1720 (2011). https://doi.org/10.1109/TASL.2010.2097249
43. Zhang, W., Abhayapala, T.D., Betlehem, T., Fazi, F.M.: Analysis and control of multi-zone sound field reproduction using modal-domain approach. The Journal of the Acoustical Society of America 140(3), 2134–2144 (2016). https://doi.org/10.1121/1.4963084
Chapter 2
Overview on Multidimensional Signals
Signals are functions or sequences of numbers which represent information. They show the variation of physical or abstract quantities in dependence on one or more independent variables, like time, space, frequency, or others. This chapter serves as a first introduction to the rich world of different kinds of multidimensional signals. It gives an overview of various types of one- and multidimensional signals, presents signals on bounded and unbounded domains, and discusses the influence of initial conditions and boundary conditions.
2.1 One-Dimensional Signals

One-dimensional signals have a single independent variable, which may be either time, space, or another physical quantity. Important differences between kinds of signals show up even in one dimension. One-dimensional signals are often abbreviated as 1D signals.
2.1.1 One-Dimensional Time-Dependent Signals

Many physical or technical signals do not have a well-determined starting time or end time. Examples are the temperature at a certain location, speech and music signals from a radio program, or the background noise received by an antenna. These signals are best modelled as existing on an unbounded domain for the time variable t, i.e. for −∞ < t < ∞. Nevertheless, their observation time is often restricted by technical means, e.g. by the available memory for storing the recorded data. The length and position of the observation interval, however, have no influence on the observed signal. Such a situation is shown in Fig. 2.1 for a noise signal with different observation intervals.
Other signals represent the response to a certain event, like a mechanical impact or switching an electrical device on or off. This event is the starting point for a signal which extends in one direction only, e.g. 0 < t < ∞. Figure 2.2 shows the voltage of a discharging capacitor, where different initial voltages at t = 0 lead to different time signals.
Fig. 2.1 A 1D noise signal v(t) in an unbounded domain. Different shades of grey indicate different observation intervals. The signal values do not depend on the observation interval
Fig. 2.2 1D signals v(t) with different initial values at t = 0 as shown by different shades of grey. They represent the exponential discharge of a capacitor. The signal values depend on the initial value
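For readers who want to reproduce the behaviour in Fig. 2.2 numerically, a minimal sketch follows. The component values R and C are illustrative assumptions, not taken from the text; the sketch simply evaluates v(t) = V0 exp(−t/(RC)) for t > 0 with several initial voltages V0:

```python
import numpy as np

# Illustrative values, not taken from the text: R in ohms, C in farads.
R, C = 1.0e3, 1.0e-3        # time constant tau = R*C = 1 s
tau = R * C

t = np.linspace(0.0, 5.0 * tau, 500)   # observe five time constants

# Different initial voltages at t = 0 lead to different time signals,
# as indicated by the different shades of grey in Fig. 2.2.
for V0 in (1.0, 0.5, 0.25):
    v = V0 * np.exp(-t / tau)          # v(t) = V0 * exp(-t/(R*C)) for t > 0
    print(f"V0 = {V0:4.2f} V  ->  v(tau) = {v[np.searchsorted(t, tau)]:.3f} V")
```

Each choice of V0 produces a different signal, while the decay rate is fixed by the time constant RC.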
2.1.2 One-Dimensional Space-Dependent Signals

One-dimensional signals on unbounded domains also exist for space-dependent signals. As an example, Fig. 2.3 shows the height profile of a fictitious landscape as observed in different finite intervals of the space variable x. Similar to Fig. 2.1, the observed profile is the same in both observation intervals.

On the other hand, there are also space-dependent signals which exist only inside of a bounded domain. Figure 2.4 shows two examples of a beam which is supported at both ends and bends downward under the weight of a load. The deflection profile exists only between the supported points, i.e. either in the interval x1 < x < x4 (grey profile) or in the shorter interval x2 < x < x3 (black profile). The shape of the profile depends on the length of the interval, i.e. on the distance of the end points. Further, it depends on the mechanical conditions at these boundary points; here the end points are simply supported, i.e. the deflection is zero. Unlike in Figs. 2.1 and 2.3, the choice of the interval changes the profile considerably.

Fig. 2.3 A 1D spatial profile in an unbounded domain. As in Fig. 2.1, the observation intervals are distinguished by their grey values. The observed profile does not depend on the observation interval

Fig. 2.4 1D spatial profiles of two loaded beams with different end points. The shape of the bent profile depends on the distance of the end points
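The dependence of the deflection profile on the boundary points, as in Fig. 2.4, can be made concrete with a small numerical sketch. It assumes a simply supported Euler-Bernoulli beam under a uniformly distributed load q (the text does not specify the type of load), for which the classical closed-form deflection over a span L is v(x) = −q x (L³ − 2L x² + x³)/(24EI); all parameter values are illustrative:

```python
import numpy as np

# Illustrative parameters, not taken from the text.
E_I = 1.0          # flexural rigidity E*I (assumed)
q   = 1.0          # uniformly distributed load per unit length (assumed)

def deflection(x, x_left, x_right):
    """Downward deflection of a simply supported Euler-Bernoulli beam
    under a uniform load q, with zero deflection at both end points."""
    L  = x_right - x_left          # span between the supports
    xi = x - x_left                # local coordinate along the span
    return -q * xi * (L**3 - 2.0 * L * xi**2 + xi**3) / (24.0 * E_I)

# Two spans as in Fig. 2.4: the longer span (x1..x4) bends more
# than the shorter span (x2..x3) under the same load.
for x_left, x_right in ((0.0, 4.0), (1.0, 3.0)):
    x = np.linspace(x_left, x_right, 201)
    v = deflection(x, x_left, x_right)
    print(f"span {x_right - x_left:.0f}: max deflection {v.min():.4f}")
```

Shortening the span reduces the maximum deflection drastically (it scales with L⁴), which matches the qualitative difference between the grey and black profiles in Fig. 2.4.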
2.2 Two-Dimensional Signals

The transition from one to two dimensions expands the variety of possible signals. Obvious candidates for two-dimensional signals are images of all kinds, where the two dimensions are given by spatial coordinates, e.g. by length and width. But also the shapes of thin elastic bodies like plates or membranes, the deformation of surfaces, water waves, etc. constitute two-dimensional signals, where the static deflection of an extended body depends on two independent spatial variables. Two-dimensional signals of this kind are often abbreviated as 2D signals.

Another class of two-dimensional signals depends on time as one coordinate and on a one-dimensional space coordinate. An example is the propagation of signals in long cables or thin optical fibers. The length of the cable provides the one-dimensional space variable; the second dimension is the time axis. Two-dimensional signals with mixed space and time variables are denoted as (1+1)D signals or also as (1+t)D signals to emphasize the time dependence.

Signals with two space variables may also be described by different coordinate systems, where Cartesian coordinates with length and width and polar coordinates with radius and an angular coordinate are the most popular ones. The choice of the coordinate system is often suggested by the shape of the considered spatial region, e.g. Cartesian coordinates for rectangular regions and polar coordinates for circular domains. However, there are also elliptic and further, more exotic, two-dimensional coordinate systems.
2.2.1 Two-Dimensional Space-Dependent Signals

Figures 2.5 and 2.6 show two-dimensional space-dependent signals of different characteristics. Figure 2.5 displays an image which constitutes an observation of a view without a given boundary. Reducing the size of the observation window (as shown by the white frames) does not alter the content of the image. The object in the smallest frame looks the same in the larger frames. For rectangular images, the Cartesian coordinate system is most suitable.

Figure 2.6 shows two different shapes of the vibrations of a circular membrane. The membrane is fixed at the boundary, as e.g. for a drumhead. Here, the size of the vibration pattern increases with the radius of the circle. In other words, the two-dimensional signal inside of the boundary is determined by the area enclosed by the boundary. Outside of the boundary, it is not defined. For circular arrangements of this kind, a polar coordinate system is most suitable.
Fig. 2.5 Image with different observation windows. Reducing the size of the observation window does not reduce the size of the displayed objects

Fig. 2.6 Deflection of a circular drumhead at a specific instant of time. Reducing the diameter of the drumhead reduces the wavelength of the vibration pattern. The deflection is shown in normalized units
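Vibration patterns like those in Fig. 2.6 can be sketched from the standard modal solution of a clamped circular membrane, u(r, φ) = J_m(α_mn r/R) cos(mφ), where α_mn is the n-th zero of the Bessel function J_m of order m. The following minimal sketch assumes illustrative mode indices and radius:

```python
import numpy as np
from scipy.special import jv, jn_zeros

# Illustrative sketch of one vibration mode of a circular membrane
# that is fixed at the boundary r = R (zero deflection at the rim).
R    = 1.0              # membrane radius (assumed)
m, n = 2, 1             # angular / radial mode indices (assumed)

alpha = jn_zeros(m, n)[-1]          # n-th zero of the Bessel function J_m

r   = np.linspace(0.0, R, 50)
phi = np.linspace(0.0, 2.0 * np.pi, 72)
rr, pp = np.meshgrid(r, phi)

# Mode shape: zero at the rim because J_m(alpha) = 0 by construction.
u = jv(m, alpha * rr / R) * np.cos(m * pp)

print(f"deflection at the rim:  {np.max(np.abs(u[:, -1])):.2e}")  # ~ 0
print(f"peak deflection inside: {np.max(np.abs(u)):.3f}")
```

Since the radial argument is α_mn r/R, shrinking the radius R compresses the pattern, consistent with the remark in the caption of Fig. 2.6.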
2.2.2 Two-Dimensional Space- and Time-Dependent Signals

An example of a (1+1)D signal v(x, t) depending on time t and on one space coordinate x is the propagation of an impulse v(x, t) = f(x − ct) along a transmission line or an optical fiber. The signal v(x, t) describes an impulse of the shape f, for instance the bell shape in Fig. 2.7. The impulse may be traced by considering one special value, e.g. v(x, t) = f0 = f(0), which is attained for x = ct. This condition describes a line in the x, t-plane with a slope defined by the propagation speed c, as shown in Fig. 2.7. The constant value f0 travels with increasing time over a linearly increasing distance x. In this idealised model of impulse propagation there are no bounds on time and space, i.e. 0 < t < ∞ and −∞ < x < ∞.

Real transmission lines have finite length ℓ, i.e. 0 < x < ℓ. Pulses travelling toward one end of a line are partially reflected and go on in the reverse direction. This effect can be minimized by proper termination but not totally eliminated in practical situations. Figure 2.8 shows a transmission line of length ℓ, excited by a pulse at the location x0 at t = 0. One half of this pulse travels to the right and one half to the left, where it is soon reflected at the boundary at x = 0. The same happens a little later at the boundary at x = ℓ. This way, the propagating pulses change their directions with each reflection at the boundary. They traverse the pair of zigzag paths shown in Fig. 2.8. The exact amplitude and phase shift of each reflected pulse depends on the conditions at the boundaries, i.e. open ends or termination by a short circuit or by an impedance.
Fig. 2.7 Propagation of a bell-shaped impulse with a constant speed c. The center value of the impulse moves to different points in the x, t-plane for x_n = c t_n, n = 0, 1, 2, 3
Fig. 2.8 Propagation of bell-shaped impulses starting at x_0. Two pulses travel to the left and to the right and change their direction upon reflection at the boundaries at x = 0 and x = ℓ
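The linear relation x = ct between the position of the peak and time can also be checked numerically. The following minimal sketch, with an assumed Gaussian bell shape f and an assumed speed c (neither is prescribed by the text), samples the (1+1)D signal v(x, t) = f(x − ct) at three instants and confirms that the peak travels over a linearly increasing distance.

```python
import numpy as np

# Sketch: a bell-shaped impulse travelling with constant speed c,
# v(x, t) = f(x - c t). The Gaussian shape and the value of c are
# illustrative assumptions, not taken from the text.
c = 1.0
f = lambda s: np.exp(-s**2)

x = np.linspace(-3.0, 9.0, 1201)      # unbounded space, truncated for sampling
for tn in (0.0, 2.0, 4.0):            # observation times t0, t1, t2
    v = f(x - c * tn)                 # snapshot of the (1+1)D signal at time tn
    print(f"t = {tn}: peak at x = {x[np.argmax(v)]:.2f} (expected {c * tn})")
```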
2.3 Three-Dimensional Signals The variety of signals expands further for three dimensions. Examples of signals which depend on three space coordinates are static field quantities of all kinds, like temperature, pressure, electrostatic and magnetostatic fields. Some advanced imaging techniques record depth information along with familiar two-dimensional images. Medical imaging strives to create full three-dimensional images of the body. Three-dimensional signals are abbreviated as 3D signals.
The recording of movies and videos creates signals which depend on two space coordinates and on time. Since movies are shot frame by frame, the time axis is discrete. Each individual frame is a 2D image where the coordinates for length and width are either continuous (for classical photographic film) or discrete (in digital video cameras). Signals of this kind are abbreviated as (2+1)D or (2+t)D signals. The deflection of a drumhead is shown in Fig. 2.6 only for one instant of time. Actually, the vibration of the drumhead as a function of time also creates a (2+t)D signal.
2.3.1 Three-Dimensional Space-Dependent Signals Shapes and figures in three-dimensional space are hard to reproduce on a 2D piece of paper. Shading, specular reflections, and perspective projections support the human perception of the three-dimensional character of a 2D picture. Figure 2.9 shows color images which are perceived by the observer as complex three-dimensional shapes.
Fig. 2.9 Pictorial representation of three-dimensional shapes defined by a complex-valued mathematical function (spherical harmonics Y_1^1(θ, ϕ), Y_3^2(θ, ϕ), Y_5^4(θ, ϕ) with azimuth angle ϕ and zenith angle θ). The specular reflections are not related to the 3D signal. They have been applied as a spatial cue to enhance the 3D perception of human observers. See [1, Chap. 5] for details on the spherical harmonic functions
2.3.2 Three-Dimensional Space- and Time-Dependent Signals A short video sequence is considered as an example of a (2+1)D signal. Figure 2.10 shows three frames recorded by a camera moving from left to right along a scene with a flowergarden between a tree in the foreground and a row of houses in the background.
Fig. 2.10 Three images from the MPEG flowergarden video test sequence [4, 5]
The variation of the content of the view field over time is shown in Fig. 2.11. The continuous time axis corresponds to the continuous movement of the viewer. A camera, however, records 2D images at discrete points of time, such that the time axis is actually discrete, see Fig. 2.12.
Signals with two space and one time dimension arise not only in video sequences. Also the sound pressure of a wave travelling in a certain direction constitutes a (2+1)D signal. Figure 2.13 shows a wave with a straight wave front (plane wave) travelling in a certain direction in the x-y-plane of a Cartesian coordinate system. The resulting (2+1)D signal is shown as a sequence of three 2D static wave shapes at three consecutive points in time t_n, n = 1, 2, 3. The wave front reaches certain points along its direction of propagation at certain instants of time. There are no boundaries which could restrict the propagation of the wave (free field propagation). The situation is similar to Fig. 2.7 for a (1+1)D signal, with the pulse replaced by a plane wave.
Fig. 2.11 View of the flowergarden scene as a continuous (2+1)D signal. As the view varies with time, the branches of the tree move from right to the left (see top surface of the cube). In a similar way, first the large lamp post and then the small one vanish at the left boundary (see left surface) according to the three images from Fig. 2.10 [3]
Fig. 2.12 The flowergarden video [4, 5] as a sequence of individual images recorded at discrete time steps t_n, n = 1, 2, 3, . . .
If a wave propagates within an enclosure, then the boundaries reflect a part of the wave's energy back into the enclosure. The character of the reflection depends on the conditions at the boundaries. Figure 2.14 shows the (2+1)D signal of an expanding circular wave as a sequence of three 2D static wave shapes at three consecutive points in time t_n, n = 1, 2, 3. Different from its (1+1)D counterpart in Fig. 2.8, the superposition of an expanding circular wave and its increasing number of reflections quickly forms a complex interference pattern.
Fig. 2.13 Free field propagation of a plane wave from lower left to upper right. The spatial domain is unbounded; the thin lined frames indicate the observation windows. The amplitude is shown in normalized units with positive values at the top of the colorbar
Fig. 2.14 Propagation of a circular wave within an enclosure with reflections at the boundaries (indicated by thick lined frames). These reflections cause complex interference patterns. The amplitude is shown in normalized units with positive values at the top of the colorbar
2.4 Four-Dimensional Signals The (2+1)D signals discussed in Sect. 2.3 often result from a simplification of three-dimensional space by a projection to two dimensions. Such simplifications may be justified by the problem at hand, or they are dictated by the bare necessity of casting the temporal evolution of a three-dimensional scene into one of the representations of Figs. 2.10, 2.11, and 2.12. In general, descriptions of the world around us result in signals that depend on the three spatial dimensions and on time, or in short in (3+1)D or (3+t)D signals. An example is medical imaging of the beating heart. Apart from the technical difficulty of recording enough 3D spatial data at short intervals during the heartbeat cycle, the result is a (3+1)D signal. Since such signals are hard to visualize in full, projections to two spatial dimensions are applied, as discussed above.
2.5 Higher Dimensional Signals The possible number of dimensions does not stop at four. Particular applications employ signal representations with far more dimensions. The physical meaning of these additional independent variables is not confined to time or space. An example is the plenoptic function as applied in computer vision and computer graphics [2]. It attempts to answer the question: what is the amount of light from a certain direction which reaches an observer at a given position? Furthermore, the plenoptic function records variations with respect to colour and to time. The amount of light is expressed by the radiance l, which is measured as power per area and per solid angle. The observer may be a human eye or any kind of camera. Figure 2.15 depicts the geometrical arrangement of the deployed coordinate systems. The position of the observer in 3D space is recorded in Cartesian coordinates x, y, and z. At this position the observer receives light from different directions, expressed in terms of the zenith angle (or alternatively the elevation angle) θ and the azimuth angle ϕ of a spherical coordinate system. The received radiation may be of different wavelength λ and its amount may vary with time t. The result is the radiance l as a function of seven variables, the plenoptic function l(x, y, z, θ, ϕ, λ, t).
Fig. 2.15 Coordinate systems for the position and view direction of an observer. The position is given by the Cartesian coordinates x, y, and z and the view direction by the spherical coordinates θ and ϕ. The observed object's color is determined by the wavelength λ
Due to its high number of dimensions, the plenoptic function is used as a theoretical concept rather than as the description of technical systems. Several practical applications can be derived from the plenoptic function by reducing the number of dimensions down to five or four based on suitable assumptions [2].
2.6 Properties of Multidimensional Signals The preceding sections have shown several examples of one- and multidimensional signals. There are one-dimensional signals without a well-determined starting point, e.g. noise signals. Their signal values are not affected by the choice of the observation interval. But there are also signals which depend on an initial condition at a given starting point, e.g., exponential decay (see Fig. 2.2).
16
2 Overview on Multidimensional Signals
The same distinctions apply to space- or space- and time-dependent multidimensional signals. Images of all kinds are observations of the world around us. The content of the image is not altered by the choice of the view field, e.g. by zooming in or out. There is just more or less to see. On the other hand, there are signals, like the vibrations of a drumhead in Fig. 2.6 or the sound field in Fig. 2.14, which are determined by the shape and size of their boundaries and furthermore by the conditions for reflections at these boundaries. Increasing the size of a drumhead affects the signal at all points within the boundary; the effect can be heard as a lowering of the pitch of the drum's sound. In the same way, the sound field inside an auditorium can be made less reverberant by covering the walls with absorbing material. For time- and space-dependent problems, both initial conditions for the time axis and boundary conditions in space apply. This distinction between signals on unbounded domains and those with initial and boundary conditions in time and space is specific to multidimensional signals.
However, there are also other classifications which multidimensional signals inherit from one-dimensional ones. Time-dependent signals may be continuous-time or discrete-time, where discrete-time signals often result from a sampling process. Similarly, there are continuous-space and discrete-space signals. The latter are often delivered by the sensors of digital cameras. The signal amplitude, too, may be continuous or discrete, where discrete amplitudes often result from quantization of continuous ones. Real-world signals which are continuous in time, space, and amplitude are called analog signals. Signals suitable for storage in digital memory must be discrete in time, space, and amplitude and are called digital signals. Furthermore, the signal amplitudes may be deterministic, e.g., as given by the solution of a differential equation, or random for unpredictable signals. Sensor signals from measurements are real-valued, but mathematical descriptions can sometimes be formulated more elegantly for complex-valued signals.
2.7 Multidimensional Systems Systems model natural or technical processes which relate different signals to each other. Systems can be classified according to the nature of these signals, e.g., systems which connect digital signals are called digital systems. Hence, the classifications for multidimensional signals discussed in Sect. 2.6 carry over to multidimensional systems. However, systems can also be distinguished by their mathematical formulation.
2.7.1 Autonomous Systems and Input-Output-Systems A very general form of a mathematical system model is given by
f(v_1, v_2, \ldots, v_N) = 0 .    (2.1)
The function f describes how the signals v_n, n = 1, \ldots, N, are related to each other. The zero on the right-hand side indicates that these signals are in a state of dynamic balance. Systems of this kind are also called autonomous systems. The (3+1)D signals v_n = v_n(x, t) depend on the 3D vector x of space coordinates and on the time variable t and possibly on other independent variables. The motion of the celestial bodies in the solar system is an example of an autonomous system. A more detailed investigation of technical and other systems requires breaking the description down into connected subsystems, where the involved signals are grouped into input signals u and output signals v. For N output signals v_1, v_2, \ldots, v_N and one input signal u, the system description of an input-output-system reads
f(v_1, v_2, \ldots, v_N) = u .    (2.2)
The classification into input and output signals is not unique, since the output signal of one system may serve as an input signal to another system. The function f may be given in a variety of different forms. Frequently used mathematical formulations are presented in Sect. 2.7.3. But f may also be given as the schematic of an electrical circuit or as the mechanical description of a machine. In any case, the output v as a function of the input u is also of interest. For a system with one input and one output, the corresponding relations are

f(v) = u , \qquad S(u) = v , \qquad \text{(block diagram: } u \rightarrow S \rightarrow v \text{)}    (2.3)
The notation S(u) = v as well as the block diagram on the right indicate that the output v is a function S of the input u. Except for very simple systems, the function S is not easy to determine. Even if the function f is well defined by a mathematical expression, no closed form for S might exist. Furthermore, for initial-boundary-value problems, the output signal also depends on the initial and boundary values, see Sect. 2.7.3.2. Indeed, determining the input-output relation S from f is one of the important tasks in one- and multidimensional signals and systems. Even if S is not explicitly known, it is nevertheless useful as a theoretical concept, since it allows general system properties to be formulated in a concise fashion. This feature is exploited in Sect. 2.7.2.
2.7.2 Linear, Time-, and Shift-Invariant Systems The system descriptions (2.1), (2.2), and (2.3) are too general for practical calculations. The relations between input and output can be formulated in more detail with some further assumptions on linearity and time- or shift-invariance.
Linearity Consider the output signals v_1 and v_2 in response to the input signals u_1 and u_2 as S(u_1) = v_1 and S(u_2) = v_2. In general, it is not possible to predict the response to the sum of input signals u_1 + u_2 or to the weighted sum a_1 u_1 + a_2 u_2, where a_1 and a_2 are real or complex numbers. However, there are systems for which the response to the weighted sum is equal to the weighted sum of the responses

S(a_1 u_1 + a_2 u_2) = a_1 S(u_1) + a_2 S(u_2) = a_1 v_1 + a_2 v_2 .    (2.4)
These systems are called linear systems.
Time-Invariance Consider the output signal v(t) in response to the input signal u(t) as S(u(t)) = v(t). In general, it is not possible to predict the response to an input signal u(t − τ) which has been delayed, i.e., time-shifted, because the system might have changed its properties in the meantime. However, if a system is invariant to time shifts, then the response to the delayed input signal is simply the delayed output signal

S(u(t - \tau)) = v(t - \tau) .    (2.5)
These systems are called time-invariant systems.
Shift-Invariance The concept of invariance against time shifts can be generalized to shifts with respect to the other independent variables of a multidimensional system. Systems for which a relation similar to (2.5) holds for a certain independent variable are called shift-invariant with respect to this variable. The definitions of linearity in (2.4), time-invariance in (2.5) and shift-invariance can be formulated in a similar manner also for discrete-time systems. Systems which are both linear and time-invariant are abbreviated as LTI systems; systems which are linear and shift-invariant as LSI systems. Time-invariance does not imply shift-invariance with respect to the spatial variables and vice versa.
Furthermore, the concepts of linearity, time- and shift-invariance are idealizations. If at all, linearity holds only within certain amplitude ranges for the input and output signals, while time- and shift-invariance hold only within certain shift ranges. Nevertheless, these assumptions permit the derivation of convenient mathematical tools in the time and frequency domains which are not available otherwise. Both defining properties can also be checked numerically, as sketched below.
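The sketch below tests (2.4) and (2.5) empirically for a simple discrete-time system. The choice of the moving-average test system, the signal lengths, and the weights are illustrative assumptions, not part of the text.

```python
import numpy as np

# Sketch: empirical tests of linearity (2.4) and shift-invariance (2.5)
# for a discrete-time system S. Any candidate system can be plugged in.
def S(u):
    return np.convolve(u, [0.5, 0.5])[: len(u)]   # simple LTI example system

rng = np.random.default_rng(0)
u1, u2 = rng.standard_normal(32), rng.standard_normal(32)
a1, a2 = 2.0, -3.0

# linearity: response to a weighted sum equals the weighted sum of responses
lin_ok = np.allclose(S(a1 * u1 + a2 * u2), a1 * S(u1) + a2 * S(u2))

# shift-invariance: delaying the input delays the output by the same amount
tau = 5
u_shift = np.roll(u1, tau); u_shift[:tau] = 0.0
shift_ok = np.allclose(S(u_shift)[tau:], S(u1)[:-tau])

print(lin_ok, shift_ok)   # True, True for this LTI example
```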
2.7.3 Mathematical Formulation of Multidimensional Systems Multidimensional systems are often described in terms of difference equations or differential equations. These formulations make it possible to draw on a rich set of rigorous mathematical tools. Some examples of simple difference and differential equations are highlighted in the following subsections.
2.7.3.1 Difference Equations Difference equations establish relations between the values of discrete-time and discrete-space signals. The independent variables are integer values, often resulting from sampling the respective continuous signals. Difference equations result from the general system description (2.2) by specifying the function f through delay and shift operations as shown below. The simple first order difference equation (2.6) represents a 1D system. It calculates a new value v(k + 1) of the discrete-time signal from the previous value v(k) and the input signal u(k). The discrete time variable is denoted by k and c_0 is a constant coefficient
f(v(k)) = v(k+1) - c_0\, v(k) = u(k),    k > 0 ,    (2.6)
v(0) = v_0 ,    k = 0 .    (2.7)
To ensure a unique solution of this difference equation, an initial condition is specified in (2.7) with an arbitrary initial value v_0.
A difference equation for a (1+1)D system is given in (2.8). The variable v(k, n) depends on the discrete time variable k as in (2.6) and additionally on the discrete space variable n. The function f(v(k, n)) connects the value v(k + 1, n) at the next time step to the two values v(k, n − 1) and v(k, n + 1) at the neighbouring spatial positions n ± 1 of the current time step, f(v(k, n)) = v(k+1, n) − c_{−1} v(k, n−1) − c_1 v(k, n+1), with the constant coefficients c_{−1} and c_1. The initial condition (2.9) specifies the values of v(k, n) at k = 0 as a function v_0(n) of the space variable n

v(k+1, n) - c_{-1}\, v(k, n-1) - c_1\, v(k, n+1) = u(k, n),    k > 0,\ 0 < n < N ,    (2.8)
v(0, n) = v_0(n),    k = 0,\ 0 < n < N ,    (2.9)
v(k, 0) = v_1(k), \quad v(k, N) = v_2(k),    k > 0,\ n = 0 \text{ and } n = N .    (2.10)
Since the space variable is restricted to the range 0 ≤ n ≤ N, boundary conditions are also required for a unique solution, e.g., as in (2.10). The boundary values v_1(k) at the left boundary n = 0 and v_2(k) at the right boundary n = N may depend on time k.
Difference equations are an attractive system description if they lead to an algorithm for the step-wise calculation of the output signal. For example, the difference equation (2.8) is easily rearranged as

v(k+1, n) = c_{-1}\, v(k, n-1) + c_1\, v(k, n+1) + u(k, n) .    (2.11)
Starting from the initial values v0 (n) at k = 0, new values v(k, n) for k > 0 can be calculated from those at the previous time step k − 1, the input signal u(k, n), and by considering the boundary conditions. However, not every difference equation leads to a computable algorithm.
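A minimal sketch of such a step-wise algorithm is given below. It iterates (2.11) over the interior points and imposes the boundary conditions (2.10); the coefficient values and the impulse-like initial condition mirror the setting of Problem 2.4 and are assumptions for illustration.

```python
import numpy as np

# Sketch: step-wise evaluation of the difference equation (2.11) with
# boundary conditions (2.10) and initial condition v0(n).
N, K = 8, 4                      # spatial range 0..N, time steps 0..K-1
c_m1 = c_p1 = 0.5                # constant coefficients c_{-1}, c_{1} (assumed)
u = np.zeros((K, N + 1))         # zero input u(k, n)
v = np.zeros((K, N + 1))
v[0, 4] = 1.0                    # initial condition: v0(n) = 0 except v0(4) = 1

for k in range(K - 1):
    for n in range(1, N):        # interior points 0 < n < N
        v[k + 1, n] = c_m1 * v[k, n - 1] + c_p1 * v[k, n + 1] + u[k, n]
    v[k + 1, 0] = 0.0            # boundary values v1(k) = 0
    v[k + 1, N] = 0.0            # boundary values v2(k) = 0

print(v)                         # the unit value spreads to both sides
```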
2.7.3.2 Differential Equations Differential equations are the appropriate system description for continuous-time and continuous-space systems. Here, the independent variables are real numbers. Therefore the function f in (2.2) is built from differential operators with respect to time and space. A first order differential equation corresponding to the difference equation (2.6) is given by (2.12). The time variable is denoted by t, while T_0 is a time constant. The derivative with respect to time is denoted by d/dt. An initial value v_0 at t = 0 is given by the initial condition (2.13)
f(v(t)) = T_0 \frac{d}{dt} v(t) + v(t) = u(t),    t > 0 ,    (2.12)
v(t) = v_0 ,    t = 0 .    (2.13)
The combination of Eqs. (2.12) and (2.13) is called an initial-value problem. It describes a 1D system specified by the ordinary differential equation (2.12). The general form of v(t) is shown in Fig. 2.2 for a fixed value of the time constant T_0 and different initial values v_0. An example of a (1+1)D system dependent on time t and one space coordinate x is given by (2.14). The quantity c is a constant coefficient. The initial condition (2.15) for t = 0 is determined by the initial value v_0(x) as a function of x.
f(v(x, t)) = \frac{\partial^2}{\partial t^2} v(x, t) - c^2 \frac{\partial^2}{\partial x^2} v(x, t) = u(x, t),    t \ge 0,\ x_1 \le x \le x_2 ,    (2.14)
v(x, 0) = v_0(x),    t = 0,\ x_1 \le x \le x_2 ,    (2.15)
v(x_1, t) = v_1(t),    t > 0,\ x = x_1 ,
v(x_2, t) = v_2(t),    t > 0,\ x = x_2 .    (2.16)
Since the space variable is bounded by the lower and upper limits x_1 and x_2, respectively, boundary conditions (2.16) are also required. The boundary values v_1(t) and v_2(t) depend on time. The system description (2.14) contains partial derivatives with respect to time (∂/∂t) and to space (∂/∂x) and constitutes a partial differential equation. It is the simplest form of the so-called wave equation, since its solutions have a wave-like character. Example solutions are the impulses shown in Figs. 2.7 and 2.8. Its counterpart in two or three spatial dimensions is shown in (2.17). The vector x of length two or three contains the components of a position vector in 2D or 3D space. Consequently, the spatial differentiation operator is more involved than in (2.14). It performs a second order differentiation of v(x, t) by forming first the gradient and then its divergence. The spatial region on which (2.17) is defined may have a more complicated shape than the 1D interval in (2.14). In 2D space it may be a square, a disc, or an irregular shape. Figures 2.6, 2.13, and 2.14 show solutions of the wave equation in two spatial dimensions. In 3D the spatial region may be the interior of a cube, of a sphere or,
say, of a potato. Without defining this shape further, it is called the volume V, while its boundary ∂V is given by the surface of the cube, sphere, or potato.
f(v(x, t)) = \frac{\partial^2}{\partial t^2} v(x, t) - c^2\, \mathrm{div}\, \mathrm{grad}\, v(x, t) = u(x, t),    t > 0,\ x \in V ,    (2.17)
v(x, 0) = v_0(x),    t = 0,\ x \in V ,    (2.18)
v(x, t) = v_b(x, t),    t > 0,\ x \in \partial V .    (2.19)
Then the initial condition (2.18) is specified by the initial value v0 (x) for x ∈ V, while the boundary condition (2.19) is specified by the boundary value vb (x, t) which is valid only on the boundary x ∈ ∂V. Each of the sets of Eqs. (2.14)–(2.16) as well as (2.17)–(2.19) contains a partial differential equation, an initial condition and a boundary condition. Problems of this kind are called initial-boundary-value problems. Their solution is determined by the input signal u, the initial value v0 and the boundary values. E.g., the output signal of (2.17)–(2.19) is given in terms of (2.3) by v(x, t) = S (u(x, t), v0 (x), vb (x, t)). As already discussed in Sects. 2.1, 2.2, and 2.3, multidimensional signals can also be defined in unbounded domains −∞ < t < ∞ or −∞ < x < ∞. Then no initial or boundary values apply. Instead other restrictions might be imposed, such as finite energy or finite power.
2.8 Overview on the Next Chapters This book is the first one in a set of two volumes. Chapter 3 reviews some elements from 1D signals and systems that are revisited in later chapters. The general concept of signal spaces is presented in Chap. 4. Chapter 5 introduces multidimensional signals in greater detail than in this chapter. Transformations for multidimensional signals are treated by Chap. 6, followed by Chap. 7 on sampling of multidimensional signals. Then Chaps. 8 and 9 cover discrete and continuous multidimensional systems, respectively. Chapter 10 introduces the Sturm-Liouville transformation as a general tool for multidimensional boundary-value problems. Its application requires solution methods presented in Chap. 11. The implementation of the resulting discrete-time algorithm in Chap. 12 completes this volume. A second volume [1] discusses selected applications of multidimensional signals and systems. After a review of some material from this volume in [1, Chap. 1], [1, Chap. 2] presents the propagation of pulses on electrical transmission lines. The creation of musical sounds from the physical description of vibrating and oscillating bodies is shown in [1, Chap. 3]. Sound propagation in free space and in enclosures is investigated in [1, Chap. 4] with applications to spatial sound reproduction in [1, Chap. 5]. The diffusion of particles with respect to molecular communication is covered by [1, Chap. 6].
2.9 Problems
Problems 2.1 and 2.2 are concerned with finding the function S from (2.3). These simple cases can be solved by techniques from Laplace and z-transformation. Problems 2.3 and 2.4 cover an alternate approach for systems with discrete variables.
2.1. Find the function v(k) = S(u(k), v_0) for f(v(k)) from (2.6).
2.2. Find the function v(t) = S(u(t), v_0) for the initial value problem (2.12), (2.13).
2.3. Determine an algorithm for the step-wise calculation of v(k) for the function f(v(k)) from (2.6). Calculate v(k) for k = 0, 1, 2, 3 and compare with the result from Problem 2.1.
2.4. Calculate v(k, n) from (2.8)–(2.11) for k = 0, 1, 2, 3 and N = 8 for u(k, n) = 0, v_1(k) = 0, v_2(k) = 0. The initial condition is given by v_0(n) = 0 except for v_0(4) = 1.
2.5. Show that the differential equation (2.12) for v_0 = 0 is solved by

v(t) = \frac{1}{T_0} \int_0^t e^{-\frac{t-\tau}{T_0}}\, u(\tau)\, d\tau .    (2.20)
References
1. Rabenstein, R., Schäfer, M.: Multidimensional Signals and Systems: Applications. Springer Nature, Heidelberg, Berlin (to appear)
2. Chan, S.C.: Plenoptic function. In: K. Ikeuchi (ed.) Computer Vision: A Reference Guide, pp. 618–623. Springer US, Boston, MA (2014). https://doi.org/10.1007/978-0-387-31439-6_7
3. Girod, B., Rabenstein, R., Stenger, A.: Signals and Systems. Wiley, Chichester (2001)
4. MIT BCS Perceptual Science Group: Three frames of original flower garden sequence (MPEG suite). http://persci.mit.edu/demos/jwang/garden-layer/orig-seq.html. Accessed on Apr. 25, 2022
5. Wang, J., Adelson, E.: Representing moving images with layers. IEEE Transactions on Image Processing 3(5), 625–638 (1994). https://doi.org/10.1109/83.334981
Chapter 3
Elements from One-Dimensional Signals and Systems
This chapter provides a short introduction to selected elements of one-dimensional signals and systems. It serves as a foundation for reference in the following chapters. No complete coverage is intended, since readers are assumed to have previous knowledge of signals and systems. The number of available textbooks in this classical field is vast; a small selection includes, e.g., [2, 4–11]. Rather than repeating a basic introduction to signals and systems, emphasis is placed here on signal spaces and signal transformations to support the generalization to signal transformations in higher dimensions in later chapters.
3.1 Convolution and Impulse Response The concepts of the convolution of two signals and of the impulse response of a system are intrinsically linked and therefore they are jointly discussed here. The presentation starts with an introductory example to develop the fundamental topics from scratch. Then delta impulses are introduced and convolution is presented in more depth.
3.1.1 An Introductory Example The convolution of two signals follows naturally from the system response to a unit impulse. This concept is introduced here in three steps. At first, some simple operations with integer numbers are performed. Then these considerations are extended to sequences of numbers and finally formulated for continuous-time functions.
3.1.1.1 Integer Numbers This introductory example starts with integer numbers, not with signals. It assumes a rather restricted knowledge of integers: only the addition of integers is known, but no further operations on integers. A special case of addition is the decomposition of integers into a sum of ones, e.g. 3 = 1 + 1 + 1. Thus, the list of properties reads
(+) addition of integers,
(D) decomposition of integers into a sum of ones.
There is a system S which processes integer numbers, e.g., S(m) = n for integer values m and n (see Fig. 3.1). The knowledge of this system is also restricted, making it act more like a magic box than a well-defined system. Only the response of the box to the integer 1 is known to be 4, i.e., S(1) = 4; the response to other integer numbers is unknown (see Fig. 3.1). Further, the box behaves additively, i.e., S(m_1 + m_2) = S(m_1) + S(m_2). Thus, the known properties of the system S are
(U) S(1) = 4,
(A) S(m_1 + m_2) = S(m_1) + S(m_2).
Of interest is now the response of the magic box to an integer value different from 1. Figure 3.1 displays the response to 3 as an example.
Fig. 3.1 Unknown system S for processing integer numbers
Next, on the basis of the known signal properties (+) and (D) and the known system properties (U) and (A), the following conclusions can be drawn

S(3) = S(1 + 1 + 1) = S(1) + S(1) + S(1) = 4 + 4 + 4 = 12 .
The first equality sign is justified by the decomposition into ones (D), the second equality sign follows from additivity (A), the third one from the system response to 1 (U), and the fourth one from addition of integers (+). The final conclusion is that the magic box performs an operation not defined so far, namely multiplication of 3 by 4. Multiplication by other integers follows in the same way.
3.1.1.2 Sequences of Numbers The next step in the introductory example considers sequences of numbers u(k) as already addressed in Sect. 2.7.3.1, Eq. (2.6). When the index k indicates discrete time, then u(k) is also called a discrete-time signal. For brevity, sequences are addressed as discrete-time signals in the sequel, but the results hold also for other meanings of the index k, e.g. as a step in a spatial direction or as the address of a memory cell.
Again, only a restricted set of operations on sequences is assumed. They are based on the properties for operations on integers as introduced before. However, the time dependence requires some additional properties. At first, the addition of sequences is defined as an element-wise addition u_1(k) + u_2(k), ∀k. Then the multiplication of a sequence u(k) by a real or complex number a is explained as element-wise multiplication a u(k). Note that all elements of the sequence u(k) are multiplied by the same number a; no multiplication of two sequences is defined. Finally, an equivalent to the decomposition of an integer into a sum of ones is required. If 1 is regarded as the "unit integer", then an arbitrary sequence u(k) needs to be decomposed into "unit sequences". To this end, (3.1) defines a so-called delta sequence which is equal to 1 for k = 0 and equal to 0 for all other values of k

\delta(k) = \begin{cases} 1 & k = 0 \\ 0 & \text{else.} \end{cases}    (3.1)
Time-shifted versions δ(k − κ) are equal to 1 for k = κ and zero otherwise. Then an arbitrary sequence u(k) can be decomposed into

u(k) = \ldots + u(-1)\,\delta(k+1) + u(0)\,\delta(k) + u(1)\,\delta(k-1) + \ldots = \sum_{\kappa=-\infty}^{\infty} u(\kappa)\,\delta(k-\kappa) .    (3.2)
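For finite-length sequences, the decomposition (3.2) can be verified directly. The short sketch below rebuilds an arbitrarily chosen example sequence from weighted, shifted delta sequences.

```python
import numpy as np

# Sketch: decomposition (3.2) of a finite sequence into shifted unit
# sequences. The example sequence is an arbitrary assumption.
u = np.array([3.0, -1.0, 4.0, 1.0, 5.0])
delta = lambda k: np.where(k == 0, 1.0, 0.0)   # delta sequence (3.1)

k = np.arange(len(u))
u_rebuilt = sum(u[kappa] * delta(k - kappa) for kappa in range(len(u)))
print(np.allclose(u, u_rebuilt))   # True: the weighted unit sequences add up to u
```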
Note that the index for the sequence u(k) is k, not κ. Also the unit sequences δ(k − κ) are indexed in k, while κ denotes the time shift w.r.t. δ(k). Therefore the shifted unit sequence δ(k − κ) in (3.2) is multiplied by a complex number u(κ) which does not depend on the index k. However, the multipliers u(κ) are values of the sequence u(k). The list of properties of sequences is now
(+) addition of sequences,
(×) multiplication of a sequence by a number,
(D) decomposition of sequences into a sum of unit sequences.
The decomposition (D) according to (3.2) includes addition of sequences (+) and multiplication of a sequence by a number (×). To extend the example on integer numbers further, a system S is considered which turns an input sequence u(k) into an output sequence v(k). This system is characterized by its response h(k) to a unit sequence δ(k) and by further properties, see Fig. 3.2, left hand side. The response h(k) is called the impulse response of the system S. The further properties of S are now determined such that the response v(k) to an arbitrary input sequence u(k) can be expressed with the help of the impulse response h(k).
In the same way as for integer numbers, additivity is also required for sequences, i.e., S(u_1(k) + u_2(k)) = S(u_1(k)) + S(u_2(k)). Since also the multiplication of a sequence u(k) by a number a is explained, S is required to be a homogeneous function of degree one, i.e. S(a u(k)) = a S(u(k)). Additivity and homogeneity are often combined into a single property: a system S is called linear if the superposition principle holds, i.e., S(a_1 u_1(k) + a_2 u_2(k)) = a_1 S(u_1(k)) + a_2 S(u_2(k)), see (2.4). The superposition principle affects only sequence values u_1(k) and u_2(k) of the same index k. To establish also relations between different time steps, it is assumed that the system S does not change its properties with time, or in other words, that it is a time-invariant system. Then the response to a delayed input u(k − κ) is equal to the delayed response, i.e., from S(u(k)) = v(k) follows S(u(k − κ)) = v(k − κ), ∀κ. Systems which are both linear and time-invariant are also called LTI systems (see Sect. 2.7.2). The list of system properties is now complete
(U) S(δ(k)) = h(k),
(L) S(a_1 u_1(k) + a_2 u_2(k)) = a_1 S(u_1(k)) + a_2 S(u_2(k)),
(TI) S(u(k)) = v(k) ⟹ S(u(k − κ)) = v(k − κ).
Fig. 3.2 Unknown system S for processing sequences of numbers
Based upon the property (D) of sequences and upon the properties (U), (L), and (TI) of systems, the response v(k) of a system S to the input u(k) is determined as (see Fig. 3.2, right hand side)

S(u(k)) \overset{D}{=} S\Big( \sum_{\kappa=-\infty}^{\infty} u(\kappa)\,\delta(k-\kappa) \Big) \overset{L}{=} \sum_{\kappa=-\infty}^{\infty} u(\kappa)\, S\big(\delta(k-\kappa)\big) \overset{U,TI}{=} \sum_{\kappa=-\infty}^{\infty} u(\kappa)\, h(k-\kappa) = v(k) .    (3.3)
The first equality sign is justified by the decomposition (D) of u(k) into unit sequences δ(k), the second equality sign follows from linearity (L), and the third equality sign uses the response to a unit sequence (U) and time-invariance (TI). The final result is that the response of a linear and time-invariant (LTI) system with the impulse response h(k) is given by

v(k) = S(u(k)) = \sum_{\kappa=-\infty}^{\infty} u(\kappa)\, h(k-\kappa) = u(k) * h(k) .    (3.4)
This operation is known as convolution of the input sequence u(k) and the impulse response h(k). It is abbreviated by a star ∗.
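For finite-length sequences, the convolution sum (3.4) reduces to a finite sum. The following sketch evaluates it directly and compares the result with numpy's built-in convolution; the example signals are assumptions.

```python
import numpy as np

# Sketch: direct evaluation of the convolution sum (3.4) for
# finite-length sequences, checked against numpy.
u = np.array([1.0, 2.0, 3.0])          # input sequence (assumed example)
h = np.array([0.5, 0.25])              # impulse response (assumed example)

v = np.zeros(len(u) + len(h) - 1)
for k in range(len(v)):
    for kappa in range(len(u)):
        if 0 <= k - kappa < len(h):
            v[k] += u[kappa] * h[k - kappa]

print(np.allclose(v, np.convolve(u, h)))   # True
```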
As already noted in Sect. 2.7.2, the assumed properties of linearity, time- (and shift-) invariance are idealizations which hold at best within certain amplitude or index ranges.
3.1.1.3 Continuous Functions It is tempting to transfer the results for sequences, i.e., for discrete-time signals, to continuous-time signals (again, time may be substituted by any other one-dimensional independent variable). Indeed, the concepts of addition of functions and multiplication of a function by a number carry over directly from the discrete-time case presented in Sect. 3.1.1.2 as (+) and (×). However, the decomposition of a function poses problems, since there is no immediate continuous-time equivalent for the discrete-time unit sequence δ(k). The formal counterpart of (3.2) in the continuous-time domain would be

u(t) = \int_{-\infty}^{\infty} u(\tau)\, \delta(t-\tau)\, d\tau .    (3.5)
However, there is no continuous-time function δ(t) which satisfies (3.5) for all integrable functions u(t) and all values of the time variable t. Establishing a decomposition of the form (3.5) requires extending the range of classical functions by so-called distributions. They are introduced in Sect. 3.1.2. This insight completes the introductory example.
3.1.2 Delta Impulse This section introduces the so-called delta impulse δ(t), also called unit impulse or Dirac impulse.¹ The wording carefully avoids the designation as function, since the delta impulse is not a function in the classical sense. It is a special case of a so-called generalized function or distribution. However, with some caution, many calculations with delta impulses can be performed similarly to those with functions.
3.1.2.1 Classical Functions and Generalized Functions Functions in the classical sense assign to each value of the argument t a value of the function f (t). This assignment may be evaluated by calculation rules as for the complex exponential function e(t) or by a piecewise mapping as for the rectangular function rect(t) in (3.6)
¹ Paul Dirac, 1902–1984, physicist.
e(t) = e^{i\omega t} , \qquad \mathrm{rect}(t) = \begin{cases} 1 & |t| \le \frac{1}{2} \\ 0 & \text{else,} \end{cases} \qquad \int f(t)\, \delta(t)\, dt .    (3.6)
Generalized functions like the delta impulse δ(t) are only defined by their effect on classical functions in the form of an integration like in (3.6). The evaluation of this integral requires a proper definition of the delta impulse.
3.1.2.2 Definition The starting point for the definition of the delta impulse δ(t) are impulse functions, which are functions in the classical sense. Equations (3.7)–(3.9) list a few choices. They have in common that their width is scalable by a parameter T and that their integral is unity independent of T

d_1(t;T) = \frac{1}{T}\,\mathrm{rect}\!\left(\frac{t}{T}\right) ,    (3.7)

d_2(t;T) = \begin{cases} \frac{t+T}{T^2} & -T \le t < 0 \\ \frac{-t+T}{T^2} & 0 \le t < T \\ 0 & \text{else,} \end{cases}    (3.8)

d_3(t;T) = \frac{1}{\sqrt{\pi}\,T}\, e^{-t^2/T^2} .    (3.9)

The delta impulse δ(t) is defined by the limit T → 0 of such impulse functions

\delta(t) = \lim_{T\to 0} d_n(t;T), \qquad n = 1, 2, 3 ,    (3.10)

where the unit area is preserved in the limit

\int_{-\infty}^{\infty} \delta(t)\, dt = 1 .    (3.11)

Integration of the delta impulse up to a variable upper limit t is evaluated here with d_1(t;T)

\int_{-\infty}^{t} \delta(\theta)\, d\theta = \lim_{T\to 0} \frac{1}{T} \int_{-\infty}^{t} \mathrm{rect}\!\left(\frac{\theta}{T}\right) d\theta = \lim_{T\to 0} \begin{cases} 1 & t > +\frac{T}{2} \\ 0 & t < -\frac{T}{2} \end{cases} = \begin{cases} 1 & t > 0 \\ 0 & t < 0 \end{cases} = \varepsilon(t) .    (3.12)

The integration of the delta impulse up to t results in the unit step function ε(t).
Sifting Property The sifting property is important for the decomposition of a function into unit pulses as already envisioned in (3.5). It is derived by evaluating the effect of a shifted delta impulse δ(t − t_0) on a function f(t). Performing this evaluation with d_1(t;T) gives
\int_{-\infty}^{\infty} f(t)\,\delta(t-t_0)\, dt = \lim_{T\to 0} \frac{1}{T} \int_{-\infty}^{\infty} f(t)\, \mathrm{rect}\!\left(\frac{t-t_0}{T}\right) dt = \lim_{T\to 0} \frac{1}{T} \int_{t_0-\frac{T}{2}}^{t_0+\frac{T}{2}} f(t)\, dt = f(t_0) .    (3.13)

The last integral in (3.13) calls for some attention. Its limits follow from the definition of the rectangular function according to (3.6) by expanding the inequality for the magnitude of the argument into a double inequality and solving for t

\mathrm{rect}\!\left(\frac{t-t_0}{T}\right) = \begin{cases} 1 & \left|\frac{t-t_0}{T}\right| \le \frac{1}{2} \\ 0 & \text{else,} \end{cases} \qquad -\frac{1}{2} \le \frac{t-t_0}{T} \le \frac{1}{2} , \qquad t_0 - \frac{T}{2} \le t \le t_0 + \frac{T}{2} .    (3.14)

Further, as T → 0, the rectangular impulse function d_1(t;T) gets narrower while maintaining its unit area, see Fig. 3.3. In the limit, only the constant value f(t_0) remains, which can be removed from the integral. To this end, the function f(t) must be continuous around t_0, as indicated in Fig. 3.3.
Fig. 3.3 Illustration of the sifting property of the delta impulse δ(t)
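The limit process in (3.13) can also be observed numerically. The sketch below approximates the integral of f(t) d_1(t − t_0; T) for shrinking T; the test function f and the point t_0 are assumed examples. Since d_1 equals 1/T on a window of width T, the integral is just the average of f over that window.

```python
import numpy as np

# Sketch: numerical illustration of the sifting property (3.13) with the
# rectangular impulse function d1(t;T) from (3.7). f and t0 are assumptions.
f = np.cos
t0 = 0.7
for T in (1.0, 0.1, 0.001):
    t = np.linspace(t0 - T / 2, t0 + T / 2, 10001)
    approx = f(t).mean()   # = (1/T) * integral of f over the window of width T
    print(f"T = {T}: {approx:.6f}   (f(t0) = {f(t0):.6f})")
```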
The extraction of a single value f(t_0) from the function f(t) resembles the effect of a sieve with a single hole at t_0, which explains the name sifting property. It simplifies the derivation of further properties of the delta impulse.
Superposition Operations on delta impulses like addition and multiplication by a constant cannot be defined in the classical way. They are only explained by integration. Two different expressions for delta impulses are identical if they have the same effect on a continuous function upon integration. An example is the superposition or linear combination of delta impulses: a δ(t) + b δ(t), where a, b ∈ C. Two different integral expressions are now evaluated with the sifting rule as
∞
∞ ∞ aδ(t) + bδ(t) f (t) dt = a δ(t) f (t) dt + b δ(t) f (t) dt = a f (0) + b f (0)
−∞
−∞
∞ −∞
(a + b) δ(t) f (t) dt = (a + b)
−∞
∞
δ(t) f (t) dt
= (a + b) f (0), = (a + b) f (0) .
−∞
Obviously, the superposition a δ(t) + b δ(t) and the multiplication (a + b) δ(t) have the same effect on an arbitrary continuous function f (t). Thus both expressions are identical in the sense of generalized functions
a\,\delta(t) + b\,\delta(t) = (a+b)\,\delta(t) .    (3.15)
Derivation Calculating the derivative of a function requires certain properties to ensure differentiability. These properties are not satisfied for delta impulses. Nevertheless, a generalized form of derivation can be introduced. The triangle-shaped impulse function d_2(t;T) from (3.8) is useful for this purpose. It is piecewise differentiable with the time derivative

\frac{d}{dt} d_2(t;T) = \dot{d}_2(t;T) = \begin{cases} \frac{1}{T^2} & -T < t < 0 \\ -\frac{1}{T^2} & 0 < t < T \\ 0 & \text{else.} \end{cases}

3.4 Differential Equations and Transfer Functions

As an introductory example, consider the electrical circuit in Fig. 3.6. The output voltage v(t) for t > 0 shall be determined as the solution of a differential equation. The output voltage v(t) depends not only on the input voltage u(t) but also on the initial value u_{C,i} of the capacitor voltage u_C(t) for t = 0, which is caused by an initial charge of the capacitor.
Fig. 3.6 Electrical circuit with three resistors and one capacitor C, input voltage u(t) and output voltage v(t), two meshes M1 and M2 and one node N
The analysis of this circuit starts from three balance equations:
• The input voltage u(t), the voltage drop across the resistor R_1, and the capacitor voltage u_C(t), all counted in the same direction, have to add up to zero.
• The capacitor voltage u_C(t) and the voltage drops across the resistors R_2 and R_3, all counted in the same direction, have to add up to zero.
• The current i_1(t) through the resistor R_1, the current i_2(t) through the resistor R_2, and the current which charges the capacitor, all counted in the same direction w.r.t. the node N, have to add up to zero.
The first two balance equations are mesh equations, requiring that the voltages along the closed meshes M1 and M2, respectively, are zero. The third equation is a node equation, which requires that the sum of all currents into the node N is zero. These conditions for meshes and nodes are known as Kirchhoff's laws. Finally, the output signal v(t) follows from Ohm's law at resistor R_3. The balance equations resulting from these considerations are detailed in (3.95) to (3.98). The time derivative in (3.97) is abbreviated by a dot.

u(t) = R_1\, i_1(t) + u_C(t)    mesh M1,    (3.95)
u_C(t) = (R_2 + R_3)\, i_2(t)    mesh M2,    (3.96)
C \frac{d}{dt} u_C(t) = C\, \dot{u}_C(t) = i_1(t) - i_2(t)    node N,    (3.97)
v(t) = R_3\, i_2(t)    resistor R_3 .    (3.98)
These individual equations are now combined into certain standard forms for the representation of ordinary differential equations. Solving the mesh equations for the currents i_1(t) and i_2(t) and inserting into the node equation gives a differential equation for the voltage at the capacitor

C\, \dot{u}_C(t) + \frac{1}{R_0}\, u_C(t) = \frac{1}{R_1}\, u(t), \qquad R_0 = R_1\, \frac{R_2 + R_3}{R_1 + R_2 + R_3} .    (3.99)
This form emphasizes the separate influence of the capacitance C and the resistance R_0 for later reference in Sect. 10.2.1, e.g., in (10.25). From (3.96) and (3.98) follows that the output signal v(t) is a fraction of the capacitor voltage

v(t) = R_3\, i_2(t) = \frac{R_3}{R_2 + R_3}\, u_C(t) .    (3.100)
Both Eqs. (3.99) and (3.100) are combined and complemented with the initial voltage at the capacitor

\dot{u}_C(t) = a\, u_C(t) + b\, u(t), \quad u_C(0) = u_{C,i} , \qquad a = -\frac{1}{R_0 C}, \quad b = \frac{1}{R_1 C} ,    (3.101)
v(t) = c\, u_C(t), \qquad c = \frac{R_3}{R_2 + R_3} .    (3.102)
These equations resemble the so-called state space representation to be discussed in more detail in Sect. 3.4.2. By elimination of u_C(t) from (3.99) and (3.100) follows a differential equation which links the output signal v(t) directly to the input signal u(t)

\dot{v}(t) = a\, v(t) + cb\, u(t), \qquad v(0) = v_i = c\, u_{C,i} .    (3.103)
Together with the initial condition, the differential equation (3.103) constitutes an initial-value problem (IVP).
3.4.1.2 Solution of the Homogeneous Differential Equation At first, the solution of the homogeneous differential equation is considered, i.e., Eq. (3.103) for u(t) = 0

\dot{v}(t) = a\, v(t), \qquad v(0) = v_i .    (3.104)
In this case, the circuit is an autonomous system (see (2.1)) and the output v(t) is the response to the initial value v_i only. A general approach to the solution of (3.104) applies an operator P(t) to the initial value v_i to obtain the output v(t)

v(t) = P(t)\, v(0) = P(t)\, v_i .    (3.105)
Since the operator P(t) propagates the initial value at t = 0 to the observation of the output at t > 0, it is called a propagator. For time-invariant systems, the propagator does not depend on an absolute time axis but only on the time difference between initial value and observation. That means that an initial value at t = t_0 and an observation of a causal system at time t > t_0 are connected by a propagator P(t − t_0). For time-variant systems, the propagator depends on both t_0 and t. Unless noted otherwise, only time-invariant systems are considered in the sequel. Since (3.104) is a linear differential equation, v_i is a scalar constant and v(t) a time-dependent scalar function. Therefore the propagator P(t) must be a scalar multiplier. Two required properties follow directly from (3.105). First, for t = 0 the output is equal to the initial value (see (3.104)), such that v(0) = P(0) v(0). Thus P(0) = 1 is an identity. Second, since (3.104) describes a time-invariant system, the time count can be restarted at a certain time t_0 with v(t_0) as an initial value and with a new time variable τ, such that
P(\tau)\, v(t_0) = P(\tau)\, P(t_0)\, v(0) = v(\tau + t_0) = P(\tau + t_0)\, v(0) .    (3.106)
Therefore, the propagator P(t) must be a scalar function of time with the properties

P(0) = 1, \qquad P(t_1)\, P(t_0) = P(t_0 + t_1) \quad \text{for } t_0, t_1 \ge 0 ,    (3.107)
where τ has been replaced by t_1. Having established the required properties, P(t) must now be represented by a suitable function. Since exponential functions satisfy the properties (3.107), P(t) is expressed by (3.108) with the yet unknown parameter s_0

P(t) = e^{s_0 t}, \qquad v(t) = e^{s_0 t}\, v_i, \qquad \dot{v}(t) = s_0\, e^{s_0 t}\, v_i, \qquad t \ge 0 .    (3.108)
Thus v(t) evolves exponentially from the initial value v_i, and since the exponential function is differentiable, also the time derivative \dot{v}(t) is easily calculated. Inserting v(t) and \dot{v}(t) from (3.108) into the differential equation (3.103) turns it into an algebraic equation for the unknown parameter s_0, such that the solution of the initial-value problem (3.104) for a < 0 is the familiar exponential decay

s_0 = a, \qquad P(t) = e^{a t}, \qquad v(t) = P(t)\, v_i = e^{a t}\, v_i, \qquad t \ge 0 .    (3.109)
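The propagator solution (3.109) can be cross-checked against a direct numerical integration of (3.104). The sketch below uses a simple forward-Euler scheme with assumed values for a and v_i.

```python
import numpy as np

# Sketch: forward-Euler simulation of the homogeneous problem (3.104),
# compared with the propagator solution (3.109). a and vi are assumptions.
a, vi = -2.0, 1.0
dt, K = 1e-4, 10000              # step size and number of steps

v = vi
for _ in range(K):
    v += dt * a * v              # v'(t) = a v(t)

t_end = K * dt
print(v, np.exp(a * t_end) * vi)   # both approximately e^{a t} vi = P(t) vi
```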
3.4.1.3 Solution of the Inhomogeneous Differential Equation With the solution to the homogeneous problem (3.104), the solution to the inhomogeneous problem (3.103) could be obtained by a particular solution of (3.103). As an alternative, Laplace transformation is used to turn the differential equation (3.103) into an algebraic one. The initial value is set to zero here: v_i = 0. The Laplace transforms of the input signal u(t) and the output signal v(t)

U(s) = \mathcal{L}\{u(t)\}, \qquad V(s) = \mathcal{L}\{v(t)\}    (3.110)
turn (3.103) into an algebraic equation which is solved for the transfer function H(s)

(s - a)\, V(s) = cb\, U(s), \qquad H(s) = \frac{V(s)}{U(s)} = c\,(s-a)^{-1}\, b = \frac{bc}{s-a} .    (3.111)
The sequence of the factors in the product c(s − a)^{-1} b is arbitrary, since multiplication of scalars is commutative, in contrast to multiplication of matrices. Inverse Laplace transformation of the transfer function gives the impulse response h(t)

h(t) = \mathcal{L}^{-1}\{H(s)\} = \mathcal{L}^{-1}\left\{ \frac{bc}{s-a} \right\} = bc\, e^{a t} = bc\, P(t), \qquad t \ge 0 ,    (3.112)

which is a scaled version of the propagator P(t) for the homogeneous problem. From the solution in the frequency domain results the time domain solution as a convolution of the input signal with the impulse response
52
3 Elements from One-Dimensional Signals and Systems
V(s) = H(s)U(s),
.
v(t) = u(t) ∗ h(t) =
t 0
h(t − θ) u(θ) dθ .
(3.113)
3.4.1.4 Solution of the Initial Value Problem Since the initial value problem (3.103) describes a linear system, its solution is just the superposition of the solutions of the homogeneous problem (3.104) and the inhomogeneous problem with zero initial value vi = 0. From (3.108) and (3.113) follows the complete solution expressed either by the impulse response h(t) or by the propagator P(t) as t 1 1 h(t) vi + h(t) ∗ u(t) = h(t) vi + h(t − θ) u(θ) dθ, 0 bc bc t v(t) = P(t) vi + bc P(t) ∗ u(t) = P(t) vi + bc P(t − θ) u(θ) dθ . v(t) =
.
0
(3.114) (3.115)
The above problem discussed one of the most simple differential equations one can think of. Nevertheless it allows to study the essential tools for the solution of linear differential equations with constant coefficients. Not only the tools that have been introduced here but also the general form of the solution carry over to higher order differential equations.
3.4.2 State Space Systems The state space representation is a standard technique in signals and systems and control theory. It expresses any higher order scalar linear differential equation as a first order differential equation for vector-valued variables. Therefore it is most suitable to generalize the results from the simple circuit in Sect. 3.4.1 to higher order differential equations.
3.4.2.1 State Space Representation The standard form of a state space representation is given by Eqs. (3.116) and (3.117). The vectors u(t) and x(t) contain the input and output signals, while x(t) is the state vector with initial value xi x˙ (t) = A x(t) + B u(t), v(t) = C x(t) + D u(t) .
.
x(0) = xi ,
(3.116) (3.117)
The matrix A is the state matrix, while B and C are the input and output matrices. The matrix D describes the direct connection between input and output. All matrices are of appropriate sizes. Together they are called the state space matrices.
3.4 Differential Equations and Transfer Functions
53
Eq. (3.116) is the state equation and Eq. (3.117) the output equation. Both equations are referred to as state space representation or simply as the state equations. Similar to (3.103), also the state equations constitute an initial-value problem. Example 3.1 (State Space Representation of an Electrical Circuit). The state space representation of the electrical circuit from Sect. 3.4.1 is given by (3.101) and (3.102). The state vector x(t) has only one entry and the state space matrices are scalars x(t) = uC (t)
.
A = a,
B = b,
C = c,
D=0.
(3.118)
The single state variable uC (t) is the voltage at the only storage element, the capaci
tor in Fig. 3.6.
3.4.2.2 Solution of the Homogeneous State Equation The approach for the solution of the homogeneous state equation with zero input u(t) = 0 is similar as for the scalar case in (3.105) x(t) = P(t) x(0) = P(t) xi .
(3.119)
.
The propagator P(t) is now a matrix which maps the initial value onto the current solution x(t).
3.4.2.3 Definition of the Propagator P(t) In the scalar case discussed in Sect. 3.4.1.2, the operator P(t) has been defined by an exponential function. For the state space representation, a similar operator in matrix form is required which inherits the essential properties from the scalar case. To this end, the matrix At is inserted into the Taylor series expansion of the exponential function. The resulting matrix is called the matrix exponential and constitutes a matrix-valued propagator P(t) = eA t =
.
∞ 1 νν 1 1 A t = I + A t + A 2 t2 + A 3 t3 + · · · ν! 2 6 ν=0
.
(3.120)
This convergent series in A t is an analytical function and is differentiable w.r.t. time for all values of t.
3.4.2.4 Properties of the Propagator P(t) The properties of the scalar operator P(t) follow simply from the calculation rules of the exponential function. For the matrix case, these properties have to be derived from the Taylor series expansion (3.120).
54
3 Elements from One-Dimensional Signals and Systems
Identity The identity operator P(0) follows immediately from (3.120) for t = 0 P(0) = I .
(3.121)
.
Time Derivative The time derivative of the propagator follows from (3.120) by some manipulations with the individual terms and the substitution κ = ν − 1. The result is the direct extension of the scalar case in (3.108) .
∞ ∞ d 1 ν ν−1 1 d =A P(t) = eA t = A νt A ν−1 tν−1 dt dt ν! (ν − 1)! ν=1 ν=1
=A
∞ 1 κκ A t = A eA t = A P(t) . κ! κ=0
(3.122)
Product In the scalar case holds e p eq = e p+q for p, q ∈ C. Therefore it is reasonable to assume that the product of two matrix exponentials eP and eQ is equal to the P +Q Q) , if the two matrices P and Q commute. This claim is matrix exponential e(P proven by evaluation of the product eP eQ with the series expansion (3.120) ⎛∞ ⎞ ∞ ∞ ⎞⎛ ∞ ⎜⎜⎜ 1 m ⎟⎟⎟ ⎜⎜⎜ ⎟⎟ 1 1 m n 1 P Q n Q ⎟⎟⎟⎠ = P Q . P ⎟⎟⎠ ⎜⎜⎝ (3.123) .e e = ⎜⎜⎝ n! m! n! m! m=0 m=0 n=0 n=0 The idea is to convert one of the summations into the binomial expansion for the matrix sum P + Q . This conversion is achieved by a reassignment of the index pairs (m, n) such that one of the summation runs over a finite index range. The relation between the original index pairs (m, n) and the reassigned pairs (μ, ν) is described by an affine mapping between the (m, n)-plane and the (μ, ν)-plane (more on affine mappings in Sect. 6.2) .
& ' & '& ' & ' & ' m 1 −1 μ 1 −1 = = μ+ ν, n 0 1 ν 0 1
m = μ − ν, n = ν,
μ = 0, . . . , ∞, ν = 0, . . . , μ.
(3.124)
Figure 3.7 shows a geometric visualization of this re-indexing process. Note that the basis vectors in the (m, n)-plane are horizontal and vertical, while in the (μ, ν)plane they are horizontal and diagonal. Both ways of indexing address the same set of points as illustrated in Fig. 3.7. As an example consider the index pair marked by a circle. It is described both by (m, n) = (1,2) as well as by (μ, ν) = (3,2). From Fig. 3.7 follows that the black points addressed by m = 0, . . . , ∞ and n = 0, . . . , ∞ are alternatively indexed by μ = 0, . . . , ∞ and ν = 0, . . . , μ. Now the indices (m, n) in (3.123) are substituted by (μ, ν) according to (3.124)
3.4 Differential Equations and Transfer Functions
55
n 3 2
ν
1 0
1 0
2
1
3
m
00
μ 1
2
3
Fig. 3.7 Indexing of the same set of points in the (m, n)-plane (left) and in the (μ, ν)-plane (right). In the (μ, ν)-plane, ν runs from 0 to μ, while in the (m, n)-plane both indices run from 0 to ∞ μ ∞
1 μ−ν ν 1 P Q (μ − ν)! ν! μ=0 ν=0 μ ∞ ∞ 1 μ μ−ν ν 1 Q P + Q )μ = eP +Q (P = . P Q = μ μ! μ! μ=0 ν=0 μ=0
eP eQ =
.
(3.125)
The binomial expansion in the last line holds only if the matrices P and Q commute, see, e.g., for μ = 2 2 P P + Q )(P P + Q ) = P 2 + P Q + QP + Q 2 = P 2 + 2P P Q + Q 2 , (3.126) (P . + Q ) = (P
where the last identity is only true for P Q = QP . In short, is has been proven that Q eP eQ = eP +Q
iff
.
P Q = QP .
(3.127)
This result is now specialized to the propagator P(t) from (3.120). Setting P = A t1 ,
Q = A t0 ,
.
P Q = Q P = A 2 t0 t1
(3.128)
shows that the matrices A t0 and A t1 commute. From (3.127) follows immediately eA t0 eA t1 = eA (t0 +t1 )
.
and
P(t0 )P(t1 ) = P(t0 + t1 ) .
(3.129)
Q The proof of (3.127) can also be performed in the reverse direction eP +Q = eP eQ , see, e.g., [3, Ex. 2.1.3].
Summary Obviously, the propagator P(t) possesses the same essential properties as in the scalar case, since from (3.121), (3.122) and (3.129) follows P(0) = I,
.
d P(t) = A P(t), dt
P(t1 ) P(t0 ) = P(t1 + t0 ) .
(3.130)
Therefore the propagator P(t) defined by (3.120) is suitable for expressing the solution of the state space system (3.116) and (3.117). It provides a link to the theory of semigroups as discussed, e.g., in [3].
56
3 Elements from One-Dimensional Signals and Systems
3.4.2.5 Similarity Transformation The state space representation in (3.4.2.1) is not unique since for a given input u(t), the same output v(t) is generated also by other state space matrices. They result from a transformation of the state vector x(t) into a new state vector x˜ (t) by a nonsingular matrix M . It turns the state matrices from (3.116) and (3.117) into a new set of state matrices for the state vector x˜ (t) M x˜ (t) = x(t),
.
A , B , C , D −→ M −1AM , M −1B , C M , D .
(3.131)
The matrix A undergoes a similarity transformation A → M −1AM . The requirement that A is nonsingular ensures that its column vectors are linear independent and thus form a basis for the space of vectors x(t). Of special interest are similarity transformations where the transformation matrix is composed of the eigenvectors of the matrix A . If A of size M × M has only single eigenvalues sμ , μ = 1, . . . , M, then the corresponding eigenvectors kμ are combined into a matrix K . The similarity transformation with the transformation matrix K yields a diagonal matrix D with the eigenvalues sμ on the main diagonals μ = 1, . . . , M, .A kμ = sμ kμ , K = k1 , . . . , kμ , . . . k M , (3.132) K −1 , A = K DK
D = diag{s1 , . . . , sμ , . . . , s M } .
Since the integer powers of A are expressed by the powers of D ν
ν K −1 = K DνK −1 , .A = K DK also the matrix exponential eA t is expressed by eDt ⎛∞ ⎞ ⎜⎜⎜ 1 ν ν ⎟⎟⎟ −1 At ⎜ D t ⎟⎟⎠ K = K eDtK −1 , .e = K ⎜⎝ ν! ν=0
(3.133)
(3.134)
(3.135)
where eDt is a diagonal matrix of scalar exponential functions with the eigenvalues sμ in the exponent eDt = diag{e s1 t , . . . , e sμ t , . . . , e sM t } .
.
(3.136)
Reducing the size of A to 1 × 1 shows the coincidence with the scalar case in Sect. 3.4.1.2. With the matrix exponential (3.120) in its diagonalized form (3.135), the solution (3.119) of the homogeneous problem turns into x(t) = eA t xi = K eDtK −1 xi .
.
(3.137)
A state space representation with diagonal state matrix D results from a state transformation (3.131) with M = K . It leads to a simple form of the solution in terms of
3.4 Differential Equations and Transfer Functions
57
a diagonal matrix exponential according to (3.136) K x˜ (t) = x(t),
K x˜ i = xi ,
.
x˜ (t) = eDt x˜ i .
(3.138)
3.4.2.6 Interpretation of the State Transformation as a Signal Transformation An alternate representation of the state transformation (3.138) results when the mapping between the state vectors x(t) and x˜ (t) is described as a signal transformation T x˜ (t) = K −1 x(t) = T {x(t)},
.
x(t) = K x˜ (t) = T
−1
{˜x(t)},
x˜ i = T {xi }, xi = T
−1
{˜xi }.
(3.139) (3.140)
This transformation adds a new interpretation to the solution of the homogeneous state equation (3.137): The initial state xi is transformed into a domain where the matrix exponential has a diagonal form and allows a decoupled representation of the solution. The solution x(t) is recovered by an inverse transformation x(t) = K eDt K −1 xi = T −1 {eDt T {xi }} .
.
(3.141)
Signal transformations are discussed in detail in Sect. 4.5. That the transformation T in (3.139) and (3.140) actually qualifies as a signal transformation is shown in Sect. 4.5.1 by Example 4.12.
3.4.2.7 Propagator P(t) in the Transform Domain The expression for the state vector in (3.141) serves also as a representation of the propagator P(t) in the transform domain. From P(t){xi } = eA t xi = T −1 {eDt T {xi }}
.
(3.142)
follows that the propagator P(t){xi } performs • a matrix multiplication of the vector xi with the matrix exponential eA t , • a matrix multiplication with the diagonal matrix exponential eDt in a transform domain T . The representation of P(t) in the transform domain provides also an elegant proof of the third property in (3.130) " # " # P(t1 ) P(t0 ){xi } = P(t1 ) T −1 {eDt0 T {xi }} " " ## = T −1 eDt1 T T −1 {eDt0 T {xi }} " # = T −1 eDt1 eDt0 T {xi } " # = T −1 eD(t1 +t0 ) T {xi } = P(t1 + t0 ){xi },
.
(3.143)
58
3 Elements from One-Dimensional Signals and Systems
with eDt1 eDt0 = diag{e s1 t1 e s1 t0 , . . . , e sμ t1 e sμ t0 , . . . , e sM t1 e sM t0 }
.
= diag{e s1 (t1 +t0 ) , . . . , e sμ (t1 +t0 ) , . . . , e sM (t1 +t0 ) } = eD(t1 +t0 ) .
(3.144)
While the proof in Eq. (3.143) is more elegant, the one in Sect. 3.4.2.4 is more general because no assumptions like single eigenvalues were made there.
3.4.2.8 Laplace Transfer Function Similar to (3.103), also the state equations (3.116) and (3.117) constitute an initialvalue problem. Its input and its output are linked by a matrix-valued transfer function, which is obtained by one-sided Laplace transformation. At this point, a careful distinction between the different transformations is necessary. In the component xμ (t) of the state vector x(t), there are two variables: the position μ of this component within the state vector and the time variable t. The transformation T , defined by a matrix multiplication in (3.139), is a transformation along the variable μ, since it labels the columns of the matrix K −1 and the rows of the column vector x(t). On the other hand, there is Laplace transformation L along the time variable. Application of the Laplace transformation to (3.116) and (3.117) turns the state space representation into the Laplace transform domain .
sX(s) = A X(s) + B U(s) + xi , V(s) = C X(s) + D U(s).
(3.145) (3.146)
Solving (3.145) for X(s) and inserting into (3.146) gives the solution of the state equation in the Laplace domain X(s) = (sI − A )−1B U(s) + (sI − A )−1 xi .
(3.147)
.
Comparing the homogeneous case (U(s) = 0) to the corresponding time domain solution in (3.119) and (3.120) X(s) = (sI − A )−1 xi
.
x(t) = P(t)xi = eA t xi ,
t ≥ 0,
(3.148)
A)−1 is the Laplace transform of the propagator P(t) shows that the expression (sI −A resp. the matrix exponential (sI − A )−1
.
P(t) = eA t ,
t≥0.
(3.149)
An infinite-dimensional extension of (sI − A )−1 will appear again in Sect. 12.1.1.3 as resolvent operator. Inserting the solution for the state vector (3.147) into the output equation (3.146) gives the output V(s) in dependence of the input U(s) and the initial state xi
3.4 Differential Equations and Transfer Functions
59
C (sI − A )−1B + D ) U(s) + C (sI − A )−1 xi . V(s) = (C
.
(3.150)
The Laplace transfer function H(s) between input and output follows as V(s) = H(s)U(s),
.
H(s) = C (sI − A )−1B + D ,
(3.151)
with the impulse response h(t) as time domain equivalent H(s)
.
h(t) = C eA tB + D δ(t),
t≥0.
(3.152)
The transfer function is invariant w.r.t. a similarity transformation (3.131) which includes also a diagonalization by the matrix K from (3.132). Inserting (3.133) into (3.151) and rearranging by standard matrix operations gives a representation by the diagonal matrix D from (3.133) K −1 )−1B + D = (C C K )(sI − D)−1 (K K −1B ) + D , H(s) . = C (sI − K DK where the inverse matrix (sI − D)−1 is given by % $ 1 1 1 −1 . .(sI − D) = diag ,..., ,..., s − s1 s − sμ s − sM
(3.153)
(3.154)
Thus the diagonalization of the state space representation according to Sect. 3.4.2.5 results in a partial fraction expansion of the transfer function H(s). Expressing the diagonalization by a signal transformation in Eq. (3.141) provides a link to transfer functions of continuous multidimensional systems in Sect. 10.3.5.
3.4.2.9 Solution of the State Equations The impulse response h(t) and the matrix exponential eA t are the key elements to formulate the solution of the state equations in the time domain v(t) = C eA t xi + h(t) ∗ u(t) t = C eA t xi + C eA (t−θ)B u(θ) dθ + D u(t) 0 t B u(θ) dθ + D u(t) , = C P(t)xi + C P(t − θ)B
(3.155)
.
0
(3.156) t≥0.
(3.157)
Especially the formulations in (3.156) and (3.157) show that it is the matrix exponential eA t , respective its abstract representation by the propagator P(t), which determine the time evolution of the output signal v(t).
60
3 Elements from One-Dimensional Signals and Systems
3.4.2.10 Discrete-Time State Space Representation For completeness, it is shortly mentioned that there is also a discrete-time state space representation. In the same way as Eqs. (3.116) and (3.117) it is given by x[n + 1] = A x[n] + B u[n], v[n] = C x[n] + D u[n] .
x[0] = xi ,
.
(3.158) (3.159)
These difference equations permit a direct solution by updating the state equation from n to n + 1 and by substituting previous values starting from n = 0 x[1] = A xi + B u[0],
(3.160)
x[2] = A xi + AB u[0] + B u[1] .
(3.161)
.
2
Repeating this process and applying the output equation (3.159) gives the solution for arbitrary values of n x[n] = A n xi +
n−1
.
A n−ν−1B u[ν],
(3.162)
ν=0
y[n] = C A n xi + C
n−1
A n−ν−1B u[ν] + D u[n] .
(3.163)
ν=0
This solution is also established by induction. Comparing (3.162) for u[n] = 0 and the corresponding relation (3.119) for the continuous case shows that the propagator P[n] for the discrete state equation is given by P[n] = A n
such that
.
x[n] = P[n]xi .
(3.164)
The properties P[0] = A 0 = I,
.
P[n1 ]P[n0 ] = A n1 A n0 = A n1 +n0 = P[n1 + n0 ]
(3.165)
follow directly from the properties of matrix multiplication. The solution (3.162) and (3.163) of the state equations (3.158) and (3.159) adopts a form similar to (3.157) x[n] = P[n]xi +
n−1
.
B u[ν], P[n − ν − 1]B
(3.166)
ν=0
y[n] = C P[n]xi + C
n−1
B u[ν] + D u[n] . P[n − ν − 1]B
(3.167)
ν=0
The input-output behaviour for xi = 0 is characterized by the matrix of impulse responses h[n], which follows from (3.163), and by the matrix of transfer functions H(z) which follows by z-transformation
3.5 Problems
⎧ ⎪ ⎪ ⎨D .h[n] = ⎪ ⎪ ⎩C A n−1B
61
n = 0, n > 0,
H(z) = C (zI − A )−1B + D .
(3.168)
The propagator P[n] and the general solution in (3.162) and (3.163) or (3.166) and (3.167) are of minor practical importance compared to the continuous-time case. Instead, it is more straightforward to use (3.158) and (3.159) for a direct step-wise computation of the state vector x[n] and the output v[n] from previous values.
3.4.3 Conclusion and Outlook Some conclusions are in place after this review of systems described by linear differential equations and of the corresponding transfer functions. Apparently, the presentation included some redundancy, because more tools and approaches were introduced than absolutely necessary. First, the state matrix A defines the matrix exponential eA t , an essential quantity to compute the output in terms of input and initial state. In addition, the effect of the matrix exponential has also been expressed by the propagator P(t), which is also solely determined by the matrix A . Indeed, A can be extracted from P(t) by differentiation, see (3.130). Second, the similarity transformation in Sect. 3.4.2.5 has been additionally expressed by the signal transformation T . It provides an interesting link to the signal spaces discussed in Sect. 4.4. However, the transformation T is just another way of writing the corresponding similarity transformation. Why introducing more theoretical equipment than actually needed? Chap. 9 presents multidimensional systems described by partial differential equations. The discussion will rely heavily on the approaches for one-dimensional systems discussed in this chapter. However, the tools and methods from 1D systems do not carry over to multidimensional systems too easily. It turns out that a transformation is a more versatile tool than a matrix multiplication and that an operator is more general than a Taylor series. This is why these extra methods have been introduced. These are the extra lanes promised at the beginning of Sect. 3.4 and they will be travelled in Chap. 9.
3.5 Problems 3.1. Derive linear scaling of the delta impulse as a special case of nonlinear scaling. 3.2. Express δ b(t) for b(t) =
t2 −t02 t0
by one or more shifted delta impulses.
3.3. Is b(t) = at2 a useful function to define delta impulses as δ b(t) ?
62
3 Elements from One-Dimensional Signals and Systems
3.4. Draw comb: of each impulse
three impulses
at least 1 1 t , (c) X τ − , (b) X − (a) X t−T/2 T 2 2 . T 3.5. Show the commutativity of the discrete-time convolution (3.4) in parallel to the continuous-time case. 3.6. Calculate the Fourier transform X(ω) of x(t) = cos ω0 t without integration using only Tables 3.3 and 3.4. 3.7. Derive relation
(3.82) for time spacing T and spectral repetition ωs from Ω with τ = Tt and Ω = ωs T . X(τ) X 2π 3.8. Analyse the electrical circuit from Fig. 3.8 with input u(t) and output v(t) and the component values R1 = 2R, R2 = 3R, C1 = C2 = C. Determine 1. 2. 3. 4.
the matrices A , B , C , D of its state space representation, the eigenvalues sμ and the matrix K of eigenvectors of the state matrix A , the propagator P(t), and the transfer function H(s) and the impulse response h(t).
R1 u(t)
R2
i1 (t) uC1 (t)
C1
i2 (t)
uC2 (t)
C2 v(t)
Fig. 3.8 Electrical circuit with two resistors R1 and R2 , two capacitors C1 and C2 , input voltage u(t) and output voltage v(t)
References 1. Bracewell, R.N.: Fourier Analysis and Imaging. Kluwer Academic/Plenum Publishers, New York (2003) 2. Chen, C.T.: Signals and Systems, 3 edn. Oxford University Press, Oxford, UK (2004) 3. Curtain, R., Zwart, H.: An Introduction to Infinite-Dimensional Systems Theory. Springer-Verlag, New York (1995) 4. Girod, B., Rabenstein, R., Stenger, A.: Signals and Systems. Wiley, Chichester (2001) 5. Haykin, S., Veen, B.V.: Signals and Systems, 2002 edn. Wiley, Chichester, UK (2002) 6. Kamen, E., Heck, B.S.: Fundamentals of Signals and Systems Using the Web and MATLAB, 3 edn. Pearson, London, UK (2007)
References
63
7. Lathi, B., Green, R.: Linear Signals and Systems, 3 edn. Oxford University Press, Oxford, UK (2017) 8. Mitra, S.K.: Signals and Systems. Oxford University Press, Oxford, UK (2016) 9. Poularikas, A.D.: Signals and Systems Primer with MATLAB. CRC Press, Boca Raton, USA (2007) 10. Wasylkiwskyj, W.: Signals and Transforms in Linear Systems Analysis. Springer-Verlag, New York (2013) 11. Yarlagadda, R.K.R.: Analog and Digital Signals and Systems. Springer, New York, NY (2010)
Chapter 4
Signal Spaces
Signal processing uses a variety of different transformations between time and frequency domain. The Fourier transformation introduced in Sect. 3.2 is just one of many examples. In each case, the pair of forward and inverse transformations must be well defined to guarantee meaningful results. But apart from transformations there are also other operations that must be rigorously justified. One example is the approximation of certain signals by a limited number of basis functions. Another example is the minimization of error functions by iterative approaches. Both cases need to answer questions like: How close is a current solution to a given target? The proper definition of these and other operations requires a unifying description of different kinds of signals. To this end, functional analysis provides mathematical tools for a general treatment of real or complex numbers, discrete-time sequences, continuous-time functions, or vectors with numbers, sequences, or functions as elements. This set of tools is rich and powerful but also complex to handle in its full generality. However, not any arbitrary set of elements is suitable as a carrier of information. Almost any signal processing application deals with information which is either directly represented by physical quantities or could at least in principal be represented that way. Physical quantities are subject to certain restrictions imposed by the fundamental laws of physics like the various conservation laws, most notable conservation of energy. Finite energy content restricts the temporal variation of a signal or—in different words—its spectral content. Energy is related to square-integrability which in turn leads to the Hilbert space of square-integrable functions and the Hilbert space of square-summable sequences, see Sects. 4.1.2.3 and 4.1.2.4. These Hilbert spaces possess well-structured mathematical properties of highly practical relevance. Engineering design does, however, not always start from physical realizability. As an example, time division multiplex systems transmit multiple signals in parallel where each signal is assigned to a dedicated time slot within a specified time frame. The individual signals are extracted by rectangular time windows similar to (3.7). Although ideal switching from zero to a constant value in no time is not technically © Springer Nature Switzerland AG 2023 R. Rabenstein, M. Sch¨afer, Multidimensional Signals and Systems, https://doi.org/10.1007/978-3-031-26514-3 4
65
66
4 Signal Spaces
realizable, the rectangular window still allows for a simple design of complex systems. Realizability issues are taken care of at a later stage of the design process by trading in some of the system’s efficiency. A similar example is the ideal bandpass in frequency division multiplex systems. Also the delta impulse should be a permissible signal, irrespective of its physical realizability. The same holds for constant signals and complex exponentials although they are not square-integrable. It is therefore reasonable to look for signal spaces which possess the benign properties of the Hilbert spaces of square-integrable functions or square-summable sequences but which include also generalized functions and other idealized signals. This section describes a route to arrive at the definition of such signal spaces. The focus lies on the practical importance of its structure from an engineering point of view. Mathematically rigorous introductions are available in the rich literature on functional analysis [3, 11]. At first some foundations from signal processing are reviewed in Sect. 4.1. Then signal spaces are introduced in Sect. 4.2. Sections 4.3 and 4.4 proceed with introducing the concepts of orthogonality and biorthogonality. They are known from the various signal transformations in signal processing as detailed in Sect. 4.5.
4.1 Foundations This section prepares the ground for the introduction of signal spaces in Sect. 4.2. At first, the designations vector, function and signal are reconsidered in Sect. 4.1.1. Then some selected topics from signal processing are reviewed in Sect. 4.1.2.
4.1.1 Vectors, Functions, and Signals The terms vector, sequence, function, and signal are sometimes used interchangeably or with distinctions that are not clearly defined. This section explains their similarities and differences from a view point of signal processing. Figure 4.1 lists vectors, sequences, and continuous-time functions by their increasing number of dimensions. The first line shows a two-dimensional vector (d = 2) which may be regarded as the position vector of a point in the xy-plane, i.e. in the 2D space. However, its two values may also be arranged as a short sequence of two numbers. The numbering of these elements (0,1 as shown or alternatively 1,2) is often a matter of convention. Very similar, a three-dimensional vector (d = 3) may address a point in the 3D space or its elements may be arranged as a sequence of three numbers. Vectors may have more than three elements and these can be arranged as a sequence. However, for d > 3 there is no longer a visual representation as a point in space. Nevertheless, it is common to address these vectors as points in a ddimensional space, where space is now used in a more abstract sense. In particular, the sampling values v(kT ) of a continuous-time function v(t) are a sequence of
4.1 Foundations
67
numbers, but often it is convenient to a arrange them in vector form to facilitate processing by vector-matrix operations. The number of dimensions d is then equal to the number of sampling values. In signal processing, sampling values are often arranged in vectors or blocks of d = 512, d = 1024 and more values. When the sampling interval gets shorter and shorter, the number of sampling values for a certain interval of the time axis increases further and further. In the limit, the signal is represented by a continuous-time function v(t) with finite duration. Its Fourier series expansion consists of potentially infinitely many terms. Thus, it is justified to include also continuous-time functions into the list in Fig. 4.1 as a limiting case with infinitely many dimensions or d → ∞. y
v
v v= 1 v2 v1 v = v2 v3 v1 .. . vd
x
0
1
0
1
d=2
d
y v x
d=3
d
2
z no visual representation for d > 3
d
d>3
v = v(kT )
t
d3
v = v(t)
t
d→∞
v
0
1
2
3
4
Fig. 4.1 Vectors, sequences, and functions sorted by their increasing number d of dimensions
The compilation in Fig. 4.1 shows that vectors and finite sequences are equivalent. Sequences may also have an infinite number of elements, e.g., the sample values of a function of infinite duration. Also continuous-time functions can be regarded as infinite-dimensional. The term vector space is quite descriptive for two and three dimensions. In mathematics this term is also used for higher and even infinite dimensions. Thus the term vector space covers all representations in Fig. 4.1. For continuous-time functions, a vector space is also called a function space. In signal processing, both sequences and functions are called signals to emphasize their role as carriers of information. Consequently, also the term signal space is used for the representations in Fig. 4.1. However, vector or function or signal spaces are not just a collection of vectors, functions or signals. These spaces require some additional properties to be discussed next.
68
4 Signal Spaces
4.1.2 Topics from Signal Processing A signal space is a collection of signals with well-defined rules for a minimal set of operations. A more refined definition follows in Sect. 4.2.1. These calculations with signals and some further topics are discussed here from the viewpoint of signal processing. These topics will be reviewed from a mathematical perspective when signal spaces and their properties are introduced. At first the superposition of signals is discussed by a practical example. Then the geometrical terms distance and angle are reviewed in their signal processing context. In a similar way, power and energy as physical terms are connected to square-integrable and square-summable signals.
4.1.2.1 Superposition of Signals An example is the superposition principle for sequences from Sect. 3.1.1.2 which requires that scaling of a sequence by a number and addition of scaled sequences is defined. A similar case is considered now in Example 4.1. It shows that these operations need not only be well-defined in a mathematical sense, but that they are also important from an engineering point of view. Example 4.1 (Mixing Console). A frequent task in audio engineering is the combination of several recorded signals into one output signal. The recorded signals are typically singing voices or the sounds of musical instruments. The output signal shall contain a well-balanced mixture of the input signals. v1 (t)
v2 (t)
v3 (t)
v4 (t)
v(t) = c1
c2
c3
4 n =1
cn vn (t)
c4
Fig. 4.2 Mixing console for audio signals. The signals vn (t), n = 1, 2, 3, 4 on the top are the input signals, the signal v(t) on the right is the output signal. The vertical lines with the black rectangles are the so-called faders. Their positions determine the values of the weighting factors cn
This procedure is technically realized by a mixing console as shown in Fig. 4.2. The signals vn (t) for n = 1, 2, 3, 4 are the input signals. The output signal v(t) is a linear combination of the input signals, where the weighting factors cn are set by manual positioning of the so-called faders. The operations of scaling by cn and addition of the scaled input signals are technically realized by the internal circuitry of the mixing console.
4.1 Foundations
69
There are two different mathematical entities in Example 4.1: signals (vn (t), v(t)) and weighting factors (cn ). It has been tacitly assumed that the signals are of finite power in order not to overdrive the internal electronic components. The weighting factors are real numbers limited by the range of the faders, e.g., 0 ≤ cn ≤ 1. A slight generalization from these technically imposed restriction leads to the requirements of a signal space.
4.1.2.2 Distance and Angle There are more complex signal processing tasks than the linear combination implemented in the mixing console from Example 4.1. A frequent problem is to approximate a specified function by a simpler one. An example is shown in Fig. 4.3 where a rectangular function is approximated in the range −T < t < T by a finite sum of cosines (M < ∞) as rect
t T
.
1 + cμ cos μπ Tt . 2 μ=1 M
≈
(4.1)
Both approximations in Fig. 4.3 seem to be reasonable but it is not immediately clear which one should be regarded as “better” in a certain sense. Therefore some metric is needed to quantify the distance of the approximation to the specified target. Note that the cosine functions in (4.1) are functions of time t while the coefficients cμ are real numbers. 1
1
−T − T 2
0
T 2
T
t
−T − T 2
0
T 2
T
t
Fig. 4.3 Different approximations of a rectangular function by a truncated Fourier series (M = 7): rectangular window (left), Hann window (right)
Besides the distance, also the angle between two signals is important. The angle between two 2D vectors follows from simple geometric considerations, as shown on the left of Fig. 4.4. Since the angle between two vectors does not depend on their length, a length of unity is assumed for simplicity. Then the vectors a and b can be represented by complex numbers a and b with magnitude one. The angle ϕ in between is the difference of the individual angles α and β a = eiα ,
.
b = eiβ ,
b = a eiϕ
for
ϕ = β − α.
(4.2)
70
4 Signal Spaces
A closely related notion is the phase angle between two sinusoidal signals, see Fig. 4.4 (right). It follows from the angle ϕ defined in Fig. 4.4 (left) by considering a and b as complex amplitudes of complex exponential functions. Their real parts, or equally their imaginary parts, reveal the phase difference ϕ sin(ω0 t − α)
b
sin(ω0 t − β) ω0 t
ϕ β α
a
ϕ
Fig. 4.4 Angles between elements of a vector space: 2D vectors a and b (left) and sinusoidal functions (right). The axis for the sinusoidal functions is labeled in ω0 t instead of t to show angles rather than time
{a∗ eiω0 t } = sin(ω0 t − α),
.
{b∗ eiω0 t } = sin(ω0 t − β) = sin(ω0 t − α − ϕ) .
(4.3)
Distance and angle between signals are shown in Figs. 4.3 and 4.4 for simple cases. Their definition for more general signals, possibly in multiple dimensions, still needs to be established. The required mathematical tools are presented in Sect. 4.2.3.
4.1.2.3 Power and Energy So far, there are no restrictions on the permissible signals as introduced in Sect. 4.1.1. However, not any conceivable mathematical function makes sense for the purpose of signal processing. Signals are either direct sensor or actuator signals or they represent a physical quantity in a more abstract sense. Therefore signals are governed by fundamental laws of physics like conservation of energy. Fig. 4.5 Voltage v(t) and current i(t) at a resistor. Ohm’s law describes the relation between voltage and current
i(t) v(t)
R
i(t) =
1 R v(t)
As a most simple example consider the resistor in Fig. 4.5. The voltage source on the left hand side supplies a voltage v(t) which drives a current i(t) through the resistor. It is related to the voltage v(t) by Ohm’s law. The resistor dissipates the electrical energy delivered by the voltage source into heat. This process consumes electrical power p(t) which can be calculated from (4.4). The complex conjugation v∗ (t) has been included for complex valued representations of the electrical quantities
4.1 Foundations
71 .
p(t) = v∗ (t)i(t) =
1 ∗ 1 v (t)v(t) = |v(t)|2 . R R
(4.4)
The total energy E delivered by the source follows by integration w.r.t. time according to (4.5). For sure, this energy is finite .
E=
∞ −∞
p(t) dt =
1 ∞ |v(t)|2 dt < ∞ . R −∞
(4.5)
Although the scope of this example is restricted, it contains a fundamental truth: The energy represented by physical quantities is finite. Indeed, the physical dimension of E in (4.5) is W s = J, i.e., the physical dimension of energy. However, reducing the finiteness of E to its mathematical core shows an important property of the signal v(t): The energy E in (4.5) is finite because v(t) is a square-integrable function (or quadratically integrable function). Reducing the further considerations to squareintegrable functions is therefore no severe restriction for all signals which actually do or at least potentially could represent physical quantities. A similar argument holds for square-summable sequences. The assumptions of square-integrability and square-summability turn out to be very helpful for further refining the signal spaces introduced in Sect. 4.2.1.
4.1.2.4 Square-Integrability and Square-Summability The properties of signals which represent physical quantities with finite energy motivate to consider the signal space of square-integrable signals for continuous-time signals v(t) and the signal space of square-summable signals for discrete-time signals v(k) ∞ .
v∗ (t)v(t) dt =
−∞
∞
|v(t)|2 dt < ∞,
−∞
∞
v∗ (k)v(k) =
k=−∞
∞
|v(k)|2 < ∞ . (4.6)
k=−∞
The inequality in (4.6) makes only sense for real scalar values. Therefore the conjugate transpose v∗ (t) is necessary to make the product v∗ (t)v(t) = |v(t)|2 real for complex v(t). When vector-valued signals v(t) are considered, then the product for the squareT integration with a real and scalar result takes the form v∗ (t)v(t), where vT denotes transposition of a vector v. The joint operation of complex conjugation and transpoT sition is denoted as v∗ = vH . Similarly vH (k) denotes the complex conjugate transposition of a vector-valued sequence v(k). The conditions for square-integrability of vector-valued continuous-time signals v(t) and for square-integrability of vectorvalued discrete-time signals v(k) now take the form ∞ .
−∞
vH (t)v(t) dt < ∞,
∞ k=−∞
vH (k)v(k) < ∞ .
(4.7)
72
4 Signal Spaces
For signals that are not defined for the complete range −∞ < t < ∞ of the independent variable t but on a bounded domain x0 < x < x1 of, say, a space variable x (see Fig. 2.4), the property of square-integrability is defined accordingly. The same holds for sequences on a bounded domain k0 ≤ k ≤ k1 x1 .
x0
∗
v (x)v(x) dx =
x1
|v(x)|2 dx < ∞,
x0
k1 k=k0
v∗ (k)v(k) =
k1
|v(k)|2 < ∞ .
(4.8)
k=k0
The properties of square-integrability and square-summability lead directly to the definition of the scalar product in Sect. 4.2.2.
4.2 Introduction to Signal Spaces This section starts off with the basic idea of a signal space in Sect. 4.2.1. Then follow suitable scalar products in Sect. 4.2.2 and further the norm of a function or a sequence, as well as the angle and the distance between two different functions or sequences in Sect. 4.2.3. Some considerations are required for functions or sequences which are defined as the limit of a converging series of functions or sequences in Sect. 4.2.4 before Sect. 4.2.5 finally introduces Hilbert spaces of square-integrable functions and square-integrable sequences. The discussion of suitable signal spaces is completed by the extension to generalized functions in Sect. 4.2.6.
4.2.1 What Is a Signal Space? The selected topics from signal processing discussed in Sect. 4.1.2 serve now as a blueprint for more abstract definitions. This section introduces the idea of a signal space which is further refined in later sections. A signal space is a set of signals for which scaling of a signal by a number (scalar multiplication) and addition of signals are defined. To distinguish signals from numbers, signals are written in boldface. For the sake of a uniform notation, the boldface style represents vectors of continuous-time functions as well as vectors of discrete-time functions (sequences) with one or more elements. Thus also scalar signals v(t) or sequences v(k) are included in the boldface notation and consequently, the arguments t or k are frequently omitted.
4.2.1.1 Operations in a Signal Space A signal space is defined as a set of signals v with two basic operations: scaling of a signal v by a number c and addition of signals
4.2 Introduction to Signal Spaces
v = c v0 ,
.
73
v = v1 + v2 .
(4.9)
The coefficients c are real or complex numbers for which the basic arithmetic operations addition, subtraction, multiplication, and division are defined. This includes the existence of a neutral element of the addition c = 0 and of a neutral element of the multiplication c = 1 as well as the laws of distributivity, commutativity, and associativity. For the signals v exists a neutral signal 0 with the property v + 0 = v. The following mathematical expressions are often encountered in the relevant literature: A signal space is defined as a set of signals over the field of real or complex numbers, for which addition of vectors and scalar multiplication are defined (see Eq. (4.9)). Since addition and scalar multiplication allow to define a linear combination (v = c1 v1 + c2 v2 ), a signal space is also called a linear space. Depending on the nature of the signals, a signal space is also called a vector space or a function space. The requirements for signals v and coefficients c to constitute a space ensure that basic algebraic operations can be carried out. Therefore a signal space is called a set of signals with an algebraic structure.
4.2.1.2 Basis of a Signal Space Linear combination allows to represent elements v of a signal space in terms of other elements v1 , v2 , . . . and in terms of the values of the coefficients c1 , c2 , . . . . It is often convenient to characterize a signal space by a minimal set of elements vn , n ∈ N which generate any other element v just by the choice of the corresponding coefficients cn in a linear combination. Such a minimal set of linear independent elements is called the basis of the signal space. The number of these basis elements is the dimension of the signal space. Some simple examples are shown in Fig. 4.1, where vectors of dimension 2 and 3 are represented by the coordinate vectors in a Cartesian coordinate system. Also vectors and discrete-time signals of higher dimension can be represented in the same way. The coordinate vectors are the basis vectors of the respective signal space and the coordinate values are the coefficients. The choice of a basis is by no means unique. Even for two dimensions, a Cartesian coordinate system can be shifted, rotated, or sheared to obtain a different basis. The examples in Sect. 4.3.3 show different choices of the basis vectors in two dimensions. Furthermore, there are other coordinate systems (polar coordinates, elliptical coordinates, etc.) which also serve as bases. Exploiting this freedom to choose the basis according to the geometry of a given problem may simplify its practical solution considerably. The last entry in Fig. 4.1 shows an example from the signal space of continuous functions on a finite interval. Here the Fourier expansion into complex exponentials (or sines and cosines) serves as a linear combination with the complex exponentials as basis functions vn and the Fourier coefficients as coefficients cn . Since it takes infinitely many series terms in the Fourier series expansion to represent any
74
4 Signal Spaces
continuous function on a given finite interval the dimension of this signal space is infinite.
4.2.2 Scalar Product Addition and scalar multiplication of signals is enough to describe the superposition of signals as implemented in the mixing console of Example 4.1 in Sect. 4.1.2.1. However, the other topics in Sect. 4.1.2 require more operations on the elements of a signal space. Therefore the scalar product is introduced here. More operations will follow in Sect. 4.2.3.
4.2.2.1 Definition Consider a signal space with elements u, v over the field of real or complex numbers c. A scalar product is a mapping which assigns to any two elements u, v a real or complex number c. It is written as u, v and has the properties (4.10)–(4.13)
u, v = v, u∗ ,
.
u1 + u2 , v = u1 , v + u2 , v,
cu, v = c u, v,
u, u ≥ 0,
u, u = 0 ⇔ u = 0 .
(4.10) (4.11) (4.12) (4.13)
The scalar product describes the relation between two signals u and v by a real or complex number. Interchanging u and v within the scalar product yields the complex conjugate result (4.10). The scalar product is additive (4.11). Scalar multiplication of the first element gives a multiplication of the scalar product with the same scalar (4.12). From (4.10) and (4.12) follows u, cv = c∗ u, v. Finally, the scalar product of two identical signals yields a real and positive number, except for the zero element, for which holds 0, 0 = 0. Therefore, u, u is positive definit.
4.2.2.2 Schwarz Inequality An important inequality follows from the nonnegative number
u + cv, u + cv = u, u + cv + cv, u + cv = u + cv, u∗ + c u + cv, v∗ =
.
= u, u + c∗ u, v + c u, v∗ + cc∗ v, v ≥ 0 .
(4.14)
Some complex conjugations in the second line have been eliminated: In the first and the fourth term because u, u and v, v are real and in the second term due to (4.10). The second and the fourth term cancel if c adopts the special value
4.2 Introduction to Signal Spaces
c=−
.
u, v
v, v
75
u, u v, v ≥ | u, v|2 .
(4.15)
The result in (4.15) is known as Schwarz inequality.
4.2.2.3 Square-Integrable Functions and Square-Summable Sequences Equations (4.10)–(4.13) describe the requirements which a scalar product has to fulfill, but they do not specify, how the scalar product u, v should be calculated. In fact, any calculation rule for u, v with the properties (4.10)–(4.13) is a valid scalar product. These rather general statements can be formulated in more detail for signal spaces of square-integrable functions, respective square-summable sequences according to Sect. 4.1.2.4. Extending the condition for square-integrability (4.7) from one signal v(t) to two signals u(t) and v(t) gives a mapping of two elements of a signal space to a complex number c. This statement holds equally for square-summable sequences ∞ .
vH (t)u(t) dt = c,
−∞
∞
vH (k)u(k) = c .
(4.16)
k=−∞
Checking the requirements (4.10)–(4.13) for scalar products shows that both the integral as well as the sum in (4.16) are valid scalar products. The property (4.10) follows by comparing vH u and uH v. The order of u and v in the product vH u ensures property (4.11). Property (4.12) follows directly from integration or summation. Finally, (4.13) is given by square-integrability or square-summability. Indeed, for v = u, the number c is real and positive and satisfies (4.7). Only for u(t) = 0, respective u(k) = 0, holds c = 0. The result of these considerations leads to the following statement: Scalar products in the signal spaces of square-integrable functions respective square-summable sequences are given by
u(t), v(t) =
∞
.
−∞
vH (t)u(t) dt,
u(k), v(k) =
∞
vH (k)u(k) .
(4.17)
k=−∞
As noted before, other definitions lead also to valid scalar products, as long as the properties (4.10)–(4.13) are satisfied. Indeed, also the alternative formulations for square-integrability and square-summability (4.6) and (4.8) are useful to define scalar products. The specific choice depends on the nature of the signals and the domains on which they are defined. For some applications also scalar weighting functions w(x) are included, e.g.,
76
4 Signal Spaces
u(x), v(x) =
x1
.
w(x) vH (x)u(x) dx ,
x0 < x1 .
(4.18)
x0
The weighting function must be nonnegative, w(x) ≥ 0, to ensure property (4.13). Some further mathematical designations are often used: The scalar product is also called inner product. Signal spaces with a scalar product are called inner product spaces. For vectors u and v with real-valued elements, the notation u, v = u · v is used and then the scalar product is called a dot product.
4.2.3 Norm, Distance, and Angle For the approximations of the rectangular function in Fig. 4.3 it cannot be decided which one is better than the other one, unless there is a measure for the quality of such approximations. Such a measure may be defined as the distance between two functions. Then an approximation problem can be formulated as: Minimize the distance of an approximating function to a target function. If the approximating function is defined as a series with a finite number of terms then the quality of the approximation should increase by adding more terms to decrease the distance to the target. The first step towards the definition of a distance is the introduction of a norm as a measure for a single signal v. The distance between two signals u and v follows then as the norm of their difference. In general, the norm is independent of the existence of a scalar product. However, once a scalar product is defined, it induces a corresponding norm and distances are expressed by scalar products. Furthermore, a scalar product serves also to define an angle between two signals. Simple examples have already been given in Sect. 4.1.2.2. A more general definition follows here. Distance and angle are no topics from algebra, instead they describe the geometrical arrangement of objects. In mathematical terms, distance and angle can be introduced by endowing a signal space with an additional topologic structure.
4.2.3.1 Definition of a Norm In general, a norm assigns to an element v of a signal space a real and non-negative number ||v|| with the properties ||u + v|| ≤ ||u|| + ||v||, ||cv|| = |c| ||v||, ||v|| ≥ 0, ||v|| = 0 ⇔ v = 0 .
.
(4.19) (4.20) (4.21)
Property (4.19) says that the norm of the sum of two vectors is equal or less than the sum of their individual norms. It is also called the triangle inequality, see Fig. 4.6 for 2D vectors. Property (4.20) regulates scaling by an element c from the associated
4.2 Introduction to Signal Spaces
77
field of numbers. It states that the norm is an absolutely homogeneous function. Property (4.21) requires that the norm is positive definit. u+ v
||u
|| 2 +v ||v||2
v2 u
||u||2
v2 u
v
v1
ϕ
||u − v|| d (u 2 , v)
v
v1
Fig. 4.6 Left: triangle inequality for 2D vectors v and u and their Euclidian norms || • ||2 . Right: distance d(u, v) and angle ϕ between the 2D vectors v and u
4.2.3.2 L p-Norms Similar as for the scalar product, (4.19)–(4.21) define the requirements which a norm has to fulfill, but they do not specify how to calculate the norm. The situation is obvious in the space of 2D vectors with the components v1 and v2 , where the norm of a vector can be associated with its length in the sense of Euclidian geometry. Then the norm ||v||2 of a vector v is given by (4.22). However, there are also other ways for defining the norm of a vector, e.g., ||v||1 as sum of the absolute values of the components v1 and v2 . Both are special cases of the L p -norm ||v|| p of vectors with N elements for N = 2 and p = 1, 2 ||v||1 = |v1 | + |v2 |,
.
√ ||v||2 = v21 + v22 = vT v,
⎡ N ⎤ 1p ⎢⎢⎢ ⎥⎥ p ||v|| p = ⎢⎢⎣ |vn | ⎥⎥⎥⎦ .
(4.22)
n=1
Also p > 2 is possible and in the limit p → ∞ the L∞ -norm is a measure for the element with the greatest magnitude ||v||∞ = max |vn |. 1≤n≤N
On the other hand, in compressed sensing and machine learning, the function ||v|| p is used with 0 ≤ p < 1. Strictly speaking, it is not a norm because it does not satisfy condition (4.19). Figure 4.7 shows the location of constant L p -norm ||v|| p = 1 for vectors of length N = 2 and different values of p. Fig. 4.7 Solid lines: locations of constant L p -norm ||v|| p = 1 for p = 1, 2, 4, 8, ∞ (from inside out). Dashed line:
1 ||v|| p = |v1 | p + |v2 | p p = 1 1 for p = 2
v2
1
−1
1
−1
v1
78
4 Signal Spaces
4.2.3.3 Norms in Inner Product Spaces The expressions defined in (4.22) possess the properties (4.19)–(4.21) and are valid norms. A special case for p = 2 is the L2 -norm or Euclidian norm. Its definition can be written as a scalar product as already noted for the norm of vectors in (4.22). Indeed, in a signal space with a scalar product (also called an inner product space) it is convenient to define the norm in terms of the scalar product .||v||2 =
v, v . (4.23) The required properties (4.20) and (4.21) follow directly from (4.10), (4.11) and (4.13). Property (4.19) follows from ||u + v|| as ||u + v||2 = u + v, u + v = u, u + u, v + v, u + v, v
.
≤ u, u + | u, v| + | v, u| + v, v ≤ u, u + 2 u, u v, v + v, v = ||u||2 + 2||u|| ||v|| + |v||2 = (||u|| + ||v||)2 .
(4.24)
The first inequality holds because the real sum u, v + v, u has been replaced by the sum of the absolute values of the individual terms. The second inequality is due to the Schwarz inequality (4.15). Thus any scalar product can be used to define a corresponding norm according to (4.23).
4.2.3.4 Norms for Square-Integrable Functions and Square-Summable Sequences This general result on norms can be refined for square-integrable functions respective square-summable sequences, where different forms of scalar products have been defined by (4.17) and (4.18). The norm of square-integrable functions respective square-summable sequences with the scalar product (4.17) is given by ∞ ∞ H vH (k)v(k) . (4.25) .||v(t)||2 = v (t)v(t) dt , ||v(k)||2 = −∞
k=−∞
Different forms of the norm follow from other definitions of the scalar product. E.g., the definition (4.18) with a weighting function w(x) gives rise to technically relevant norms, as shown in the following two examples. Example 4.2 (Root-Mean-Square Value). Scalar signals defined on an interval [x0 , x1 ] with a scalar product according to (4.18) and a constant weighting function w(x) = (x1 − x0 )−1 have the following norms
4.2 Introduction to Signal Spaces
||v(t)||2 =
.
x1 1 |v(t)|2 dt , x1 − x0 x0
79
||v(k)||2 =
k1 1 |v(k)|2 . k1 − k0 k=k
(4.26)
0
These norms are known as Root-Mean-Square values or RMS-values. The roots and squares are obvious in (4.26), the mean value is represented by a weighted integration respective summation. Thus the RMS-value of a signal constitutes a norm. Example 4.3 (Energy of an Electrical Signal). The energy of the voltage across a resistor from (4.5) corresponds to a squared norm when the scalar product is defined with a constant weighting w(x) = R−1 .
E = ||v(t)||22 = v(t), v(t) =
∞ 1 |v(t)|2 dt . R −∞
(4.27)
The finite energy dissipated by the resistor in Fig. 4.5 can now be expressed by the squared norm of the voltage v(t). The norm of a signal is a measure for the size of the signal. Loosely spoken, the norm answers the question: How big is a signal? A different, but related question is: What is the distance between two signals? Also this question is answered with the help of the norm.
4.2.3.5 Distance and Metric Distance and angle for 2D vectors are shown in Fig. 4.6. The scalar product and the associated norm allow to generalize their immediate geometrical meaning to arbitrary signal spaces. The distance d(u, v) between two elements u and v of an inner product space is defined by the norm of their difference d(u, v) = ||u − v||2 ,
.
d(u, v) = d(v, u) .
(4.28)
From the property (4.20) for c = −1 follows that the distance is commutative. For square-integrable functions respective square-summable sequences the distance adopts different forms depending on the nature of the signals u and v. For 2D vectors the distance corresponds to the Euclidian distance of the endpoints of both vectors (see Fig. 4.6) v u1 , v = 1 , d(u, v) = ||u − v||2 = |u1 − v1 |2 + |u2 − v2 |2 . (4.29) .u = u2 v2 The distance of the scalar signals u(t) and v(t) is given by the distance measure
80
4 Signal Spaces
⎡ ∞ ⎤ 12 ⎢⎢⎢ ⎥⎥ 2 ⎥ ⎢⎢ ⎥⎥⎥ . .d(u(t), v(t)) = ||u(t) − v(t)||2 = ⎢ |u(t) − v(t)| dt ⎣ ⎦
(4.30)
−∞
It is suitable for evaluating the quality of the approximations in Fig. 4.3. A distance measure for a signal space is called a metric of this space and signal spaces with a distance measure are called metric spaces.
4.2.3.6 Angle From the Schwarz inequality (4.15) for real valued signals u and v follows by taking the square root, using the definition of the norm (4.23), and by resolving the absolute value | u, v| into a double inequality (compare (3.14)) .
−1≤
u, v ≤1. ||u|| ||v||
(4.31)
The range of this expression between −1 and 1 suggests to identify it with a cosine function as .
u, v = cos ϕ . ||u|| ||v||
(4.32)
For vectors, the angle ϕ is the angle between the vectors u and v, similar to the distance d(u − v) between these vectors. Fig. 4.6 shows a 2D example. When the angle ϕ is known then the scalar product u, v is easily expressed by the norms of u and v and their angle
u, v = ||u|| ||v|| cos ϕ .
.
(4.33)
A special case are vectors with ϕ = π2 such that cos ϕ = 0. Then the scalar product u, v is zero and the vectors u and v are called orthogonal. The concept of orthogonality is further discussed in Sect. 4.3. The idea of an angle between two signals does not easily generalize to complex signals u and v, because in general their scalar product u, v is also complex and cannot be expressed by an inequality as in (4.31). There are different definitions of an angle between complex valued signals [8], none of which is of major importance for multidimensional systems and signals.
4.2.3.7 Law of Cosines An application of distance and angle is the law of cosines. It is derived from the square of the distance d(u − v) = ||u − v|| of real signals u and v as
4.2 Introduction to Signal Spaces
81
||u − v||2 = u − v, u − v = u, u − v, u − u, v + v, v
.
= u, u − 2 u, v + v, v = ||u||2 + ||v||2 − 2||u|| ||v|| cos ϕ .
(4.34)
If u and v are vectors then u, v, and ||u − v|| define a triangle (compare Fig. 4.6). Renaming the length of the sides as ||u|| = a, ||v|| = b, ||u − v|| = c gives (4.34) the form of the law of cosines c2 = a2 + b2 − 2ab cos ϕ .
.
(4.35)
It includes Pythagoras’ theorem c2 = a2 + b2 as a special case for ϕ = π2 , i.e. when u and v are orthogonal.
4.2.4 Completeness Figure 4.3 shows attempts to approximate a rectangle function by a finite series of weighted cosine functions. The question discussed in Sect. 4.1.2.2 was how the quality of the approximation can be measured. This question has been answered in Sect. 4.2.3.5 by the introduction of the distance d(u, v) between two signals. Here arises a new question for the rectangle function u(t) and its approximations uN (t) by N cosine functions and a constant c0 u(t) = rect
.
t , T
uN (t) = c0 +
t . cn cos nπ T n=1
N
(4.36)
Can the rectangle function u(t) be approximated arbitrary well in the sense that the distance d(u, uN ) tends to zero for N → ∞? This is not a trivial question, since the cosine functions are differentiable ad infinitum while the rectangle function has discontinuities at t = ± T2 and is thus not differentiable at all. Even when the distance between the rectangle function and its approximation vanishes in the limit, the property of differentiability is lost. Actually, this is not a major problem since differentiability is not among the essential properties assumed so far. However, this observation raises another suspicion: An approximation by an infinite series might lead to a result for which the initial assumptions of squareintegrability or square-summability do not hold anymore. Then the carefully constructed signal space would collapse: The definitions of the scalar product were meaningless, as well as norm and distance which are based on the scalar product. To avoid such a situation, it is save to use signal spaces where all approximations by infinite series lead to signals for which scalar product, norm and distance are still well-defined. Signal spaces with this property are called complete spaces. The mathematical literature formulates this property more generally by considering any sequence of functions un (t) for which the distance d(un , un+1 ) becomes arbitrarily small as n → ∞. Sequences with this property are called fundamental sequences or Cauchy sequences and converge to a limit u(t) in the sense of the induced norm. A
82
4 Signal Spaces
signal space is complete if it contains both un (t) and the limit u(t). Such spaces are closed in the sense that no function is left out because it could not be approximated by an infinite series. It can be shown that the construction of the signal space starting from a squareintegrability or square-summability, and the definition of corresponding scalar products, norms and distances leads to a complete signal space. Proofs can be found in the literature on functional analysis. The special properties of such spaces have already been collected in Sects. 4.2.1–4.2.4 and are compiled next.
4.2.5 Hilbert Spaces This section compiles from Sects. 4.2.1–4.2.4 the assumptions and definitions that lead to a well-structured signal space. The starting point are signals v in the form of square-integrable functions or square-summable sequences (4.7) ∞ .
vH (t)v(t) dt < ∞,
−∞
∞
vH (k)v(k) < ∞ .
(4.37)
k=−∞
A set of these signals becomes a signal space when multiplication of a signal v by a real or complex number c and the addition of two signals v1 and v2 are defined. v = c v0 ,
.
v = v1 + v2 .
(4.38)
The definition of addition and scalar multiplication endows the set of signals with an algebraic structure. This definition does not only satisfy the demand of mathematical rigor, it also tells how to build signal processing devices like the mixing console from Example 4.1 in Sect. 4.1.2. The properties of square-integrability or square-summability lead to the definition of a corresponding scalar product which, in turn, induces a norm as a metric for the size of a signal. Furthermore, the norm of the difference of two signals is a metric for their distance. Similarly, the relation between the scalar product of two signals and the product of their norms is a metric for the angle between two signals. This way, the signal space is endowed with a topologic structure and is called a metric space. It can be shown that this metric space is complete. Signal spaces with all these properties are called Hilbert spaces.1 Their properties are compiled in (4.39)–(4.43). The steps that lead to a Hilbert space are shown in graphical form in Fig. 4.8.
1
David Hilbert, 1862–1943.
4.2 Introduction to Signal Spaces
scalar product
.
83
u(t), v(t) =
u(k), v(k) =
∞
vH (t)u(t) dt
−∞ ∞
vH (k)u(k)
(4.39) (4.40)
k=−∞
norm distance angle
||v||2 =
v, v
d(u, v) = ||u − v||2 cos ϕ =
u, v ||u||2 ||v||2
(4.41) (4.42) (4.43)
Again, norm, distance, and angle are not only elements of mathematical structure. They also play important roles for solving problems in signal processing, as is shown shortly. A frequent task in signal processing is to minimize an error signal by an iterative procedure. It requires a criterion for stopping the iteration once a satisfying accuracy has been reached. The norm of the error signal serves this purpose because it tells when the error falls below a certain threshold. Another frequent task is the approximation of a desired signal by a series with a finite number of terms. The number of terms can be determined by evaluating the distance between the approximation and the desired target signal. Both tasks, iterative minimization as well as approximation by a truncated series are only meaningful if they actually converge to either zero or to a desired target. Indeed, convergence is ensured here by the completeness of the defined metric space. Finally, the angle between two signals introduces orthogonality, a concept which is of paramount importance for designing signal transformations. This topic is explored further in Sect. 4.3. In summary, a Hilbert space is a well-organized environment for all standard signal processing calculations. It has been introduced here for 1D signals but the basic definitions of integrability, summability, and the elements which define a Hilbert space can also be formulated for multidimensional signals. Fig. 4.8 Organization of signal spaces with algebraic and topologic structure: a signal space is a set of signals where scalar multiplication and addition are defined (algebraic structure). The definition of a scalar product turns it into an inner product space and induces norm and distance (topologic structure). If the resulting metric space is complete, then it is called a Hilbert space
Hilbert space completeness metric space norm and distance inner product space scalar product signal space
84
4 Signal Spaces
4.2.6 Extension to Generalized Functions The Hilbert space introduced in Sect. 4.2.5 is a standard space for signals as they occur in reality. The restriction to physical processes with finite energy ensures the mathematical property of square-integrability. However engineering design sometimes requires abstraction from reality. Examples are the impulse response or a signal with a constant value or trigonometric signals which extend from −∞ to ∞. The impulse response introduced in Sect. 3.1.3.1 is the response to an idealized impulse (delta impulse) which is derived from a classical impulse function by a limiting process as shown in Sect. 3.1.2.2. The delta impulse can take the place of a continuous-time function within an integral relation like(4.17) to define a scalar product δ(t), v(t). However, the norm of a delta impulse δ(t), δ(t) according to (4.23) is not explained. Nevertheless, delta impulses are very popular for the description of system responses and or of sampling processes, as discussed in Sects. 3.1.3.1 and 3.3. Therefore, they should not be excluded from a useful signal space. Another example for abstraction from reality are signals with a constant value, e.g., v(t) = v0 for −∞ < t < ∞. Such a signal is convenient for describing a voltage source in a direct current (DC) power supply, see Fig. 4.5. Unfortunately, there is no constant voltage source which would deliver power forever. In physical terms, it would require infinite energy, in mathematical terms, the integral (4.5) does not exist. On the other hand, for calculations with signals and systems, a constant signal value is much more convenient than working with finite duration signals. This predicament is resolved by a technique described in [4, Chapter 2]. The idea is to start with square-integrable functions of a certain duration and then to either contract the duration to zero or to extend it to infinity. The duration can be defined in different ways, e.g. as the standard deviation of a Gaussian shaped impulse or as the non-zero domain of a rectangle function. The following Example 4.4 demonstrates the technique from [4] using Gaussian functions with their well-known properties of differentiability and square integrability. Then the same approach is applied to rectangle functions in the context of signal spaces. Indeed, it has already been used for the calculations in Sect. 3.2.3 to find correspondences of the Fourier transform. Example 4.4 (Generalized Functions and their Fourier Transforms). This example is adapted from [4, Example 5–7]. Consider first a pair of Fourier transforms fn (t) Fn (ω) as given in (4.44). Both are of Gaussian nature as in (3.9). The integer n appears as a factor in the exponent of fn (t) but as a divisor in Fn (ω) ω2 n −nt2 e . fn (t) = e− 4n = Fn (ω), n∈N. (4.44) π ∞ The scaling factor for fn (t) has been chosen such that −∞ fn (t) dt = 1 and hence Fn (0) = 1 ∀n ∈ N. In addition there are two square integrable functions g(t) and G(ω) which are differentiable infinitely many times. Here, G(ω) is not required to be a Fourier transform of g(t). Then the following relations hold
4.2 Introduction to Signal Spaces
l. im
n→∞
∞ −∞
fn (t)g(t) dt = g(0),
85
lim
n→∞
∞ −∞
Fn (ω)G(ω) dω =
∞ −∞
G(ω) dω , (4.45)
which show that lim fn (t) = δ(t) ,
lim Fn (ω) = 1 .
n→∞
n→∞
A formal proof of the equation on the left of (4.45) is given in [4].
(4.46)
A constant in the time domain and a delta impulse δ(ω) in the frequency domain are established in the same fashion. Indeed, Example 4.4 re-establishes the definition of a delta impulse from the impulse function d3 (t; T ) in Sect. 3.1.2.2, where the limit process has been performed for a continuous variable T . Here, an integer variable n is used instead. The result is the same, but this example shows how to embed the limit process into the framework of completeness of signal spaces, see Sect. 4.2.4. As a conclusion, delta impulses, constants, and complex exponentials can be used for the calculation of scalar products much in the same way as square integrable functions. Since delta impulses only act on classical functions, there can only be one delta impulse in a scalar product, i.e. δ(t), f (t). It is now shown how the Fourier transforms of delta impulses and constants are elegantly formulated as scalar products.
4.2.6.1 Delta Impulse The Fourier transform of a rectangle function has already been calculated in (3.59). The integrand on the bounded interval − T2 ≤ t ≤ T2 is square-integrable in the sense of (4.6), such that the Fourier integral constitutes a scalar product according to (4.17). Then the result from (3.59) (up to a constant T ) can be written as a scalar product t T/2 1 t t sin (ωT/2) 1 1 −iωt iωt = rect = rect e rect ,e F. dt = . −T/2 T (ωT/2) T T T T T (4.47) For T → 0 the rectangle function turns into a delta-impulse by the procedure from Sect. 3.1.2.2. Then the evaluation of the scalar product requires the sifting property of the delta impulse and gives the result from (3.77) ∞ .F {δ(t)} = δ(t) e−iωt dt = δ(t), eiωt = 1 . (4.48) −∞
Since δ(t) = 0 for t 0 the limits of the integration do not matter as long as they enclose t = 0. Therefore the integration is extended to infinity as required by the definition of the Fourier transformation.
86
4 Signal Spaces
4.2.6.2 Constant Value The Fourier transform of a constant value can be developed from a rectangle function as well. Different from (4.47) the height of the rectangle is independent of T , such that the result of (3.59) applies t t t T/2 sin (ωT/2) rect F rect . (4.49) = e−iωt dt = rect , eiωt = T −T/2 (ωT/2) T T T
.
For T → ∞ the nonzero part of the rectangle function and the limits of integration extend to −∞ ≤ t ≤ ∞ while the result of the integration approaches the generalized function 2π δ(ω) (see (3.60)–(3.62)) F {1} =
∞
.
1 e−iωt dt = 1, eiωt = 2π δ(ω) .
(4.50)
−∞
It has to be observed that (4.50) does not represent a scalar product in the strict sense of Sect. 4.2.2, since the result is not a well-defined scalar value from the field of complex numbers but a generalized function of ω. Therefore some authors avoid the term scalar product for situations like (4.50) and use instead the designations closure relation or—for summations of discrete terms—sum orthogonality [2]. Here the name scalar product is retained nevertheless and used in a wider sense also for relations like (4.50). This somewhat loose terminology is adopted in the same spirit as distributions are addressed as generalized functions, see Sect. 3.1.2.1.
4.2.6.3 Signal Spaces with Generalized Functions These considerations lead to the conclusion that the Hilbert space of squareintegrable functions can be extended to include also generalized functions. In the same way also the meaning of the scalar product is extended to allow for distributions as in (4.50). These extensions modify the signal space such that it also includes generalized functions and their Fourier transforms. In particular, Fourier transformation adopts the form of a—possibly generalized—scalar product between a—possibly generalized—function and a complex exponential. Mathematically rigorous presentations of this procedure are found in [4]. The term signal space is now used in this wider sense which encompasses generalized functions, a generalized form of scalar products, and Fourier transformation.
4.3 Orthogonality The basis of a signal space has been introduced in Sect. 4.2.1.2, where the different cases of Fig. 4.1 served as examples. This idea is obvious for the first two cases
4.3 Orthogonality
87
with a Cartesian coordinate systems with two or three axes. Cartesian coordinates are convenient to handle because their axes are perpendicular to each other. This convenience extends to more than three dimensions when the geometric concept of perpendicularity is generalized and abstracted as orthogonality.
4.3.1 Definition The elements u and v from a signal space are called orthogonal if their scalar product is zero . u, v = 0 ⇔ u and v are orthogonal (4.51) For real-valued signals u, v, the scalar product can be expressed by an angle ϕ as in (4.43). Then orthogonality indicates that two signals are aligned in some sense, e.g. vectors are perpendicular in the sense of geometry or sinusoidal signals have a phase difference of 90◦ . Note that the definition of orthogonality in (4.51) is not based on an angle. Therefore this definition is valid also for complex-valued signals. Nevertheless, geometric visualizations in two or three dimensions are often helpful. They are frequently used to illustrate otherwise abstract concepts. Therefore Sect. 4.3.2 starts with some simple 2D arrangements. They hold equally also for three spatial dimensions. The following sections investigate orthogonality for higher dimensions and for continuous functions.
4.3.2 Perpendicularity in Two Dimensions Orthogonality is initially a concept from geometry, where it is also known as perpendicularity. Figure 4.9 shows a few basic arrangements for elements of 2D geometry. On the left are two lines which intersect at an angle of 90◦ . These lines are called perpendicular. The figure in the center is a line, a point A, and its perpendicular to the line. The perpendicular and the line intersect at an angle of 90◦ . The length of the perpendicular cn is the distance of the point A from the line.
A ·
· cn
x2
x
n
· x0 cn
· x1
Fig. 4.9 Left: intersection of two lines at an angle of 90◦ . Center: perpendicular of a point A to a line. Right: normal vector n of a line and distance cn of the line from the origin
The right hand side of Fig. 4.9 shows a line x and a vector n perpendicular to the line x. When n is a unit vector, i.e. ||n||2 = 1, then it is called the normal vector to
88
4 Signal Spaces
the line x. The set of points x = [x1 x2 ]T on the line is defined by its Hessian normal form, which can be written as a scalar product xT n = x, n = ||x||2 cos ϕ = cn .
.
(4.52)
The position vector x0 denotes the foot of the perpendicular of the origin to the line x. Its scalar product with the normal vector is given by (4.33) as
x0 , n = ||x0 ||2 cos ϕ. Since both the perpendicular of the origin as well as the normal vector n point in the same direction, the angle ϕ between x0 and n is either zero or π, i.e., cos ϕ = ±1. The sign depends on whether the origin lies on the same side of x where the normal vector n points to or not. Therefore cn = ±||x0 ||2 is the distance of the origin from the line x. The sign of cn indicates the position of the origin relative to the line. These simple considerations show that perpendicularity is a helpful concept even for simple problems of 2D geometry.
4.3.3 Expansion into Basis Vectors The benefit of orthogonality is explained here for the representation of a 2D vector by orthogonal and non-orthogonal basis vectors. Indeed the calculation of expansion coefficients is greatly simplified if the basis vectors are orthogonal.
4.3.3.1 Orthogonal and Non-orthogonal Basis Vectors The representation of a real-valued 2D vector v by two basis vectors e1 and e2 is shown in Fig. 4.10. For a given vector v with the elements v1 and v2 the expansion coefficients vˆ 1 and vˆ 2 with respect to the basis vectors e1 and e2 shall be determined. The 2D example shall only exemplify the situation for practical problems where the number of elements is much larger. Note that vectors with 1000 elements and more are not uncommon in signal processing. vˆ 2 e2
v
e2
e1
vˆ 1 e1
Fig. 4.10 Representation of a vector v by two basis vectors e1 and e2 . The basis vectors are not assumed to be orthogonal
4.3 Orthogonality
89
By setting v = vˆ 1 e1 + vˆ 2 e2 the problem turns into the solution of a linear system of equations with the unknowns vˆ 1 and vˆ 2 " vˆ 1 ! v .v ˆ 1 e1 + vˆ 2 e2 = e1 e2 (4.53) =v= 1 . vˆ 2 v2 The set of linear equations is now rewritten by forming the scalar product •, eμ in (4.53) for μ = 1, 2
ˆv1 e1 + vˆ 2 e2 , e1 = vˆ 1 e1 , e1 + vˆ 2 e2 , e1 = v, e1 ,
ˆv1 e1 + vˆ 2 e2 , e2 = vˆ 1 e1 , e2 + vˆ 2 e2 , e2 = v, e2 ,
.
or in matrix notation
.
e1 , e1 e2 , e1 vˆ 1
v, e1 = .
e1 , e2 e2 , e2 vˆ 2
v, e2
(4.54) (4.55)
(4.56)
The solution of this system of 2 × 2 equations is an easy task, but it gets rather involved for more unknowns. Even worse, for infinite dimensions, the concept of matrix inversion is not applicable at all. However, it is obvious from (4.56) that matrix inversions can be avoided if all off-diagonal elements are zero, i.e. if the basis vectors are pairwise orthogonal
eν , eμ = 0
.
for μ ν .
(4.57)
Then the expansion coefficients follow by solving each Eqs. (4.54) and (4.55) separately vˆ =
. 1
v, e1 ,
e1 , e1
vˆ 2 =
v, e2 .
e2 , e2
(4.58)
The result can be calculated even simpler if the basis vectors e1 and e2 satisfy ⎧ ⎪ ⎪ ⎨1 μ = ν, . eν , eμ = δμν = ⎪ (4.59) ⎪ ⎩0 μ ν. Basis vectors with this property are called pairwise orthonormal. The case distinction in (4.59) is avoided by the use of the Kronecker symbol δμν . The expansion coefficients are now simply the scalar products of the given vector v with the basis vectors eμ vˆ = v, e1 ,
. 1
vˆ 2 = v, e2 .
(4.60)
A set of orthogonal basis functions can always be converted into a set of orthonormal basis functions by suitable scaling of eν and eμ .
90
4 Signal Spaces
The bottomline of these considerations is that orthogonality of the basis functions is essential for avoiding large matrix inversion. On the other hand, orthonormality is a further simplification but not essential.
4.3.3.2 The Gramian Matrix The matrix of scalar products in (4.56) is called the Gramian matrix G which corresponds to the set of basis vectors e1 and e2
e1 , e1 e2 , e1 .G = . (4.61)
e1 , e2 e2 , e2 Since any reasonable basis consists of linear independent vectors, the Gramian matrix is non-singular, which ensures the solution of the linear system (4.56). Furthermore property (4.10) requires that e2 , e1 = e1 , e2 ∗ , i.e., the Gramian matrix exhibits conjugate symmetry and thus the Gramian matrix is a normal matrix. The Gramian matrix of an orthogonal basis is a diagonal matrix and for an orthonormal basis, the Gramian matrix is the identity matrix. The definition (4.61) and the above properties hold also for signal spaces of higher dimension with a larger set of basis vectors.
4.3.3.3 Examples for Different Kinds of Basis Vectors The representation of 2D vectors by basis functions is shown here by three examples. Figure 4.11 displays three arrangements: Basis vectors e1 and e2 in a Cartesian coordinate system, basis vectors e1 and e2 in a rotated coordinate system, and basis vectors e1 and e2 in a non-orthogonal coordinate system. The Examples 4.5, 4.6, and 4.7 show the representation of an arbitrary vector v in each of these coordinate systems.
e2 vˆ 2 e2
vˆ 2 e2 v e1 vˆ 1 e1
e2
vˆ 1 e1
e 2 v
vˆ 2 e2
vˆ 1 e1
e1
Fig. 4.11 Coordinate systems for the Examples 4.5, 4.6, and 4.7
v e 1
4.3 Orthogonality
91
Example 4.5 (Cartesian Coordinate System). The basis vectors e1 and e2 for the Cartesian coordinate system shown on the right of Fig. 4.11 are orthonormal 1 0 .e1 = , e2 = ,
eν , eμ = δμν . (4.62) 0 1 The coefficients vˆ 1 and vˆ 2 follow from (4.60) as ! " v vˆ = v, e1 = eT1 v = 1 0 1 = v1 , v2 ! " v1 vˆ 2 = v, e2 = eT2 v = 0 1 = v2 . v2
. 1
(4.63) (4.64)
In this trivial example, the vector v is obviously represented by its elements. The Gramian matrix is the 2 × 2 identity matrix. Example 4.6 (Rotated Orthogonal Coordinate System). The basis vectors e1 and e2 for the rotated coordinate system from the center of Fig. 4.11 are orthonormal as well 1 1 1 1 e2 =
eν , eμ = δμν . (4.65) .e1 = √ , √ , −1 1 2 2 The rotation matrix between the basis vectors e1 , e2 and e1 , e2 is recovered as ! " " 1 1 1 10 cos π4 sin π4 ! = . e e (4.66) = √ π π e1 e2 . 1 2 − sin 4 cos 4 2 −1 1 0 1 The coefficients vˆ 1 and vˆ 2 w.r.t. basis vectors e1 and e2 follow again from (4.60) as ! " v T .v ˆ 1 = v, e1 = e 1 v = √1 2 1 −1 1 = (v1 − v2 ) √1 2 , (4.67) v2 ! " v1 T 1 √ 1 1 = (v1 + v2 ) √1 2 . (4.68) vˆ 2 = v, e2 = e 2 v, = 2 v2 This result is checked by evaluating the weighted superposition of the basis vectors 1 1 1 1 √ 1 √ + (v1 + v2 ) √ .v ˆ 1 e1 + vˆ 2 e2 = (v1 − v2 ) √1 2 2 1 2 −1 2 1 v1 + v2 1 v1 − v2 v + = 1 =v. = v2 2 −v1 + v2 2 v1 + v2 The coefficients vˆ 1 and vˆ 2 correctly represent the vector v although they are different from its elements in a Cartesian coordinate system. The Gramian matrix is the 2 × 2 identity matrix.
92
4 Signal Spaces
Example 4.7 (Non-orthogonal Basis Vectors). The basis vectors e1 and e2 for the coordinate system on the right of Fig. 4.11 are not orthogonal since for 2 1 .e1 = , e2 = , (4.69) 0 1 the Gramian matrix T T T e e e e
e , e e2 , e1 21 21 42 = 1 T 1 2 T 1 = G = 1 1 = . e 1 e 2 e 2 e2
e1 , e2 e2 , e2 01 01 22
(4.70)
is real valued and symmetric, but not diagonal. Further, inspection of Fig. 4.11 shows that the angle between the basis vectors is different from 90◦ . Its actual value follows from (4.43) .
e1 , e2 1 = √ = cos π4 = cos 45◦ . ||e1 ||2 ||e2 ||2 2
(4.71)
Since the basis vectors are not orthogonal, the coefficients vˆ 1 and vˆ 2 w.r.t. basis vectors e1 and e2 cannot be calculated from (4.58) or (4.60). Instead the linear system (4.56) has to be solved ! "−1 v1 2 1−1 v1 1 1 −1 v1 1 (v1 − v2 ) v1 . . = 2 = e1 e2 = = v2 v2 v2 01 v2 2 0 2 v2 The coefficients vˆ 1 and vˆ 2 indeed represent the vector v 1 v 2 1 = 1 =v. + v2 .v ˆ 1 e1 + vˆ 2 e2 = 2 (v1 − v2 ) v2 1 0
(4.72)
An expansion into non-orthogonal basis vectors is possible, but all expansion coefficients have to be determined simultaneously as in (4.72). An explicit matrix inversion like in (4.72) is not necessary, but the solution of a linear system of equations still requires considerable numerical effort for vectors of high dimension. These examples have shown that an expansion into orthogonal basis vectors is preferable over a non-orthogonal basis. The most simple case are orthonormal basis vectors, where the expansion coefficients are the scalar products from (4.60). If basis vectors are orthogonal but not orthonormal, then they can be made orthonormal by individual scaling of each basis vector. However, if the basis vectors are not orthogonal then a corresponding set of orthogonal basis vectors can be obtained by a dedicated orthogonalization procedure to be discussed next.
4.3 Orthogonality
93
4.3.4 Gram-Schmidt Orthogonalization The Gram-Schmidt orthogonalization procedure is a standard method to turn a set of arbitrary basis vectors into a set of orthogonal basis vectors. Its most important tool, the projection of a vector, is introduced first. Then the orthogonalization procedure is introduced and an introductory example with 2D vectors is given. However, another application is of more interest to signal processing: the expansion of a continuous function into basis functions. Starting from a polynomial expansion, the Gram-Schmidt orthogonalization procedure is applied to derive a special set of orthogonal polynomials. In this section, they serve as a further example, but they are required as basis functions in Sect. 8.3.3, as solution of a special Sturm-Liouville problem in Sect. 10.8.4, and in [1, Chap. 5] for 3D-systems in spherical coordinates.
4.3.4.1 Projection The projection of a vector v onto a line defined by the vector u is a vector in the direction of u with length ||v|| cos ϕ. It is denoted as pu (v). Figure 4.12 illustrates the situation for 2D vectors. v
Fig. 4.12 Projection pu (v) of a vector v on a line defined by the vector u. The vectors pu (v) and v − pu (v) are orthogonal
v) v − pu ( u pu (v)
ϕ
||v|| cos
ϕ
The projection pu (v) is given by its length ||v|| cos ϕ times a unit vector in the direction of u. Since cos ϕ is related to the scalar product u, v by (4.43), the projection pu (v) can be expressed in various ways as p (v) = ||v|| cos ϕ
. u
u, v
u, v
u, v u u u= = u. = ||v|| 2 ||u|| ||v|| ||u||
u, u ||u|| ||u||
(4.73)
Inspection of the vectors pu (v) and v − pu (v) in Fig. 4.12 suggests that they are orthogonal. This assertion is shown calculating the scalar product and observing its properties (4.10) and (4.11)
v − pu (v), pu (v) = v, pu (v) − pu (v), pu (v)
.
=
u, v | u, v|2
u, u = 0 .
v, u −
u, u
u, u2
(4.74)
94
4 Signal Spaces
Note that v − pu (v), pu (v) = 0 holds also for complex vectors. The angle ϕ in Fig. 4.12 has been introduced to display the real-valued case, it is not used anymore in (4.74). The orthogonality of pu (v) and v − pu (v) is the cornerstone for the following orthogonalization procedure.
4.3.4.2 Gram-Schmidt-Orthogonalization Procedure The Gram-Schmidt orthogonalization procedure turns a set of linear independent vectors vn into a set of orthogonal vectors un . It relies on the orthogonality property (4.74) of the projection. The individual steps are shown in (4.75). The left column uses the general notation pu (v) for the projection, the right column uses the notation with scalar products from (4.73) u = v1 ,
u1 = v1 ,
. 1
u2 = v2 − pu1 (v2 ), u3 = v3 − pu1 (v3 ) − pu2 (v3 ), uN = vN −
N−1 n=1
pun (vN ),
u1 , v2 u1 ,
u1 , u1
u2 , v3
u1 , v3 u3 = v3 − u2 , u1 −
u2 , u2
u1 , u1 N−1 un , vN uN = vN − (4.75) un .
un , un n=1 u2 = v2 −
The procedure starts with selecting u1 = v1 . The second step ensures that u2 is orthogonal to u1 by exploiting (4.74) for v2 = v and u1 = u. The following steps apply the same principle to each additional dimension. The following example reconsiders the situation from Example 4.7 with nonorthogonal basis vectors e1 and e2 and applies the Gram-Schmidt orthogonalization procedure to find an orthogonal basis. Example 4.8 (Gram-Schmidt-Orthogonalization for 2D Vectors). The procedure ! "T from (4.75) constructs an orthogonal basis for the non-orthogonal vectors e1 = 2 0 and ! "T e2 = 1 1 2 , 0
eT u1 eT e1
u1 , e2 1 u1 = 2T u1 = T2 e1 = , pu1 (e2 ) = 0 u1 u1 e1 e1
u1 , u1 1 1 0 2 1 − = u1 = , u2 = . u2 = e2 − pu1 (e2 ) = 1 0 1 0 0
u = e1 =
. 1
The resulting orthogonal basis is given by u1 and u2 . It could be turned into an orthonormal basis by scaling the first basis vector u1 by a factor of 12 . This simple example demonstrated the application of the Gram-Schmidt-Orthogonalization to 2D vectors. However, since the procedure is defined by (4.75) in
4.3 Orthogonality
95
terms of projections and scalar products, it can be applied to any metric signal space with a scalar product.
4.3.4.3 Expansion of a Function into Powers Not just vectors in space but any signal in a Hilbert space can be represented by a set of functions. If these functions constitute a basis in the respective space, then they are called basis functions and often denoted as eν (x). Then the representation of a real-valued function f (x) of a real variable x by the basis functions eν (x) reads .
f (x) =
∞
fˆν eν (x) ,
(4.76)
ν=0
with the expansion coefficients fˆν . A popular expansion is the truncated Taylor series which results in a polynomial expansion with the basis functions eν (x) = xν .
f (x) =
∞
fˆν xν = fˆ0 + fˆ1 x + fˆ2 x2 . . . .
(4.77)
ν=0
Since the powers of x exceed all limits (lim x→±∞ xν → ±∞), the approximation of f (x) needs to be restricted to a finite interval, e.g. |x| ≤ 1. Then a useful scalar product is given by 1 . f (x), eν (x) = f (x) eν (x) dx . (4.78) −1
Now some questions arise: • Are the basis functions eν (x) = xν orthogonal w.r.t. the scalar product (4.78)? • In the case they are not orthogonal, how would an orthogonal basis look like? The answer to the first question is given by the calculation of the scalar product '1 xμ+ν+1 ''' . eμ (x), eν (x) = x , x = x x dx = ' −1 μ + ν + 1 '−1 ⎧ 2 ⎪ ⎪ μ+ν+1 ⎪ ⎪ 1 − (−1) ⎨ μ + ν + 1 μ + ν even =⎪ . = ⎪ ⎪ μ+ν+1 ⎪ ⎩0 μ + ν odd μ
ν
1
μ ν
(4.79)
The result shows that there are indices μ ν for which xμ , xν 0. Thus the powers of x do not constitute an orthogonal basis. The second question is answered by invoking the Gram-Schmidt orthogonalization procedure. The scalar products xμ , xν are calculated from the result of (4.79)
96
4 Signal Spaces
u = e0 (x) = x0 = 1,
. 0
u0 , e1
1, x · 1 = x − 0 = x, u0 = x −
1, 1
u0 , u0
u0 , e2
u1 , e2 u2 = e2 (x) − u0 − u1 =
u0 , u0
u1 , u1
u1 = e1 (x) −
2
1, x2
x, x2 1 ·1− · x = x2 − 3 − 0 = x2 − , 2 3
1, 1
x, x
u0 , e3
u2 , e3
u1 , e3 u3 = e3 (x) − u1 − u2 = u0 −
u1 , u1
u2 , u2
u0 , u0
= x2 −
= x2 −
x2 − 31 , x3
1, x3
x, x3 ·1− ·x− · (x2 − 13 )
1, 1
x, x
x2 − 13 , x2 − 13 2 5 2 3
= x3 − 0 −
x − 0 = x3 − 53 x .
(4.80)
The resulting set of polynomials Pν (x) is often scaled such that .
Pν (x) =
uν (x) uν (1)
Pν (1) = 1 .
(4.81)
The polynomials Pν (x) are scaled but they are not orthonormal. Proceeding with the orthogonalization in (4.80) gives a set of polynomials with increasing order. They are listed here for ν = 0, . . . , 5 in the unscaled and scaled version u (x) = 1, u1 (x) = x,
P0 (x) = 1, P1 (x) = x,
u2 (x) = x2 − 13 ,
P2 (x) = 12 (3x2 − 1),
u3 (x) = x3 − 35 x,
P3 (x) = 12 (5x3 − 3x),
u4 (x) = x4 − 67 x2 +
P4 (x) = 18 (35x4 − 30x2 + 3),
u5 (x) = x5 −
P5 (x) = 18 (63x5 − 70x3 + 15x) .
. 0
3 35 , 5 10 3 9 x + 21 x
(4.82)
The set of scaled polynomials is known as the Legendre polynomials and conventionally denoted as Pn (x). Figure 4.13 illustrates their behaviour. Note that |Pn (x)| ≤ 1 for |x| ≤ 1 due to scaling as in (4.81).
4.4 Duality and Biorthogonality
97
Pn (x) P0
1
P1 P2 P3
0 -1 P4
x P5
1
-1 Fig. 4.13 Legendre polynomials Pn (x) for n = 0, . . . , 5
The Legendre polynomials are a member of the family of classical orthogonal polynomials (Legendre, Hermite, Laguerre, Jacobi, and Chebyshev polynomials). They differ in the integration range and in a possible weighting function of the scalar product (4.78) and in the scaling factor (4.81). At this point, the Legendre polynomials serve as an example for orthogonal polynomials obtained by Gram-Schmidt orthogonalization. They appear again in Sect. 8.3.3 where Example 8.6 discusses a Galerkin method for the approximate solution of differential equations and in Sect. 10.8.4 as solution of a classical SturmLiouville problem. Further they play in important role in the treatment of multidimensional systems in spherical coordinates [1, Chap. 5], where they adopt the form Pn (cos θ). Thus the integration interval in (4.78) is an appropriate choice.
4.4 Duality and Biorthogonality The concept of orthogonality is a special case of a more general approach which requires two different signal spaces, each with its own set of basis functions. The additional signal space is called the dual space. The existence of a dual space and its properties are usually investigated for general vector spaces [3, 5, 6, 12]. Since Hilbert spaces have already been introduced, duality is introduced here with the focus on scalar products.
4.4.1 Dual Spaces and Biorthogonal Signal Spaces Given a Hilbert space H, its scalar product defines a mapping v˜ of a function v of H to the underlying field of complex numbers
98
4 Signal Spaces
v˜ (v) = v, v˜ ∈ C .
(4.83)
.
The mapping itself is given by the scalar product v, v˜ where both v and v˜ are elements of H. However, these two elements play distinct roles in the mapping v˜ : The function v˜ defines the mapping v˜ and the function v is its argument. For each function v˜ there is a mapping v˜ defined by the scalar product (4.83). Thus the set of all mappings v˜ is endowed with the algebraic structure of the scalar product, compare the requirements for a vector space (4.9) and the properties (4.11) and (4.12). This means that the set of all mappings v˜ is a vector space H˜ of its own. Since the mappings v˜ are defined w.r.t. the elements v of the Hilbert space H, their vector space H˜ is called the dual space to the Hilbert space H. For Hilbert spaces, the dual space H˜ appears in two different guises: On the one hand, its elements are mappings v˜ from H to the field of complex numbers C. On the other hand, each element v˜ of H˜ is defined by a function v˜ of H. It is therefore convenient to identify the elements of H˜ by their functions v˜ and to speak simply of the dual space with elements v˜ rather than the mappings v˜ defined by v˜ . Thus there are two vector spaces, the Hilbert space H with the functions v and the dual space H˜ with the functions v˜ . Since both v and v˜ are elements of H, they must be closely related. This relation is established by representing v and v˜ by their respective basis functions. In a Hilbert space with basis functions eμ for μ ∈ Z, each function v is expressed as a weighted sum of basis functions with expansion coefficients Vμ , see (4.84). In the same way also the functions v˜ are expressed in terms of the basis functions of the dual space e˜ ν with the expansion coefficients V˜ ν v=
.
Vμ eμ ,
v˜ =
μ
V˜ ν e˜ ν .
(4.84)
ν
The basis functions eμ are called the primal basis and the basis functions e˜ ν are the dual basis. Both bases are related by the requirement that the scalar product of their basis functions satisfies .
eμ , e˜ ν = δμν .
(4.85)
This relation closely resembles the condition for orthogonality (4.59), however here are basis functions eμ and e˜ ν from two different bases. Therefore, functions eμ and e˜ ν which satisfy (4.85) are called biorthogonal. Note, that neither the functions eμ nor the functions e˜ ν are required to be orthogonal by themselves. The condition for biorthogonality allows to express the expansion coefficients from (4.84) in terms of (4.85) as ˜ν = . v, e Vμ eμ , e˜ ν = Vν ,
˜v, eμ = (4.86) V˜ ν ˜eν , eμ , = V˜ μ . μ
ν
Then the expansions (4.84) adopt the simple forms .v =
v, e˜ μ eμ , v˜ =
˜v, eν e˜ ν . μ
ν
(4.87)
4.4 Duality and Biorthogonality
99
Orthogonality is included in this concept as a special case. For an orthonormal basis eμ of a Hilbert space H, Eq. (4.85) holds for e˜ ν = eν and the expansion (4.87) is simply v=
.
v, eμ eμ .
(4.88)
μ
Examples 4.5 and 4.6 have considered basic cases of orthogonal expansions. The following examples highlight biorthogonality in different kinds of vector spaces. Example 4.9 (2×2 Matrix). A non-orthogonal basis for 2D vectors has already been discussed in Example 4.7. Here, the dual basis is determined. The vectors e1 = e1 and e2 = e2 from Example 4.7 as well as the yet unknown vectors e˜ 1 and e˜ 2 of the dual basis are combined into the matrices E and E˜ " ! " ! E˜ = e˜ 1 e˜ 2 . .E = e1 e2 , (4.89) Then the condition for biorthogonality (4.85) for all four pairs of vectors can be ˜ expressed simultaneously by the product of E and E˜ and solved for E H ( ) ˜ H E = I, .E E˜ = E−1 = EH −1 = E−H . (4.90) For the numerical values from Example 4.7 follows 1 1 1 1 0 21 ˜ .E = , e˜ 1 = , , E= 01 2 −1 2 2 −1
0 e˜ 2 = . 1
(4.91)
The dual basis vectors e˜ 1 and e˜ 2 satisfy the biorthogonality condition (4.85) as shown in Fig. 4.14. Primal and dual vectors are revisited in Sect. 7.2.1 in the context of lattice theory. Fig. 4.14 Basis vectors e1 and e2 (black) and dual basis vectors e˜ 1 and e˜ 2 (gray) from (4.91) of Example 4.9.The grid size is one unit. Note that e1 and e˜ 2 are perpendicular, as well as e2 and e˜ 1 , such that e1 , e˜ 2 = 0 and e2 , e˜ 1 = 0. On the other hand, e1 , e˜ 1 = e2 , e˜ 2 = 1
e˜ 2
e2
e1 e˜ 1
Example 4.10 (Polynomial). Section 4.3.4.3 discussed the expansion of a function f (x) into a power series. This example investigates a truncated version with three terms
100
4 Signal Spaces
.
f (x) =
2
e0 (x) = 1,
fˆμ eμ (x),
e1 (x) = x,
e2 (x) = x2 .
(4.92)
μ=0
The dual basis is determined from the scalar product (4.78) and the approach e˜ (x) =
2
. ν
pν,n xn ,
eμ (x), e˜ ν (x) =
n=0
2 n=0
1 pν,n
xμ+n dx = δμν .
(4.93)
−1
The values of the integral are given in (4.79), such that the coefficients pν,n follow as solutions of ⎡ ⎤⎡ ⎤ ⎡ ⎤ ⎡ ⎤ ⎢⎢⎢ e0 , e˜ 0 e1 , e˜ 0 e2 , e˜ 0 ⎥⎥⎥ ⎢⎢⎢ p0,0 p0,1 p0,2 ⎥⎥⎥ ⎢⎢⎢2 0 23 ⎥⎥⎥ ⎢⎢⎢1 0 0⎥⎥⎥ ⎢⎢⎢ ⎥⎥⎥ ⎢⎢⎢ 2 ⎥⎥⎥ ⎢⎢⎢ ⎥⎥⎥ ⎢⎢⎢ ⎥ ˜ 1 e1 , e˜ 1 e2 , e˜ 1 ⎥⎥ = ⎢⎢ p1,0 p1,1 p1,2 ⎥⎥ ⎢⎢ 0 3 0 ⎥⎥ = ⎢⎢ 0 1 0⎥⎥⎥⎥ . . ⎢ e0 , e (4.94) ⎢⎣ ⎦⎣2 ⎦ ⎦ ⎣ ⎣ ⎦ p2,0 p2,1 p2,2 3 0 25
e0 , e˜ 2 e1 , e˜ 2 e2 , e˜ 2 0 0 1 Finally the dual basis results as e˜ (x) =
. 0
3 3 − 5x2 , 8
e˜ 1 (x) =
3 x, 2
e˜ 2 (x) =
15 −1 + 3x2 . 8
(4.95)
The basis functions from (4.92) and the dual basis (4.95) satisfy (4.85). ν=0
μ=0
-1
0.5
0.5 μ=1
μ=2
ν=1
1
1
-1
-1
0.5
1
x
-1
1
ν=2
1
x
1 -1
1 x
1
-1
x
1
-1
1
x
1
-1
1 x
1
1 -1
1
x
x
x
Fig. 4.15 Products eμ (x) e˜ ν (x) of pairs of basis functions eμ (x) from (4.92) and their duals e˜ ν (x) from (4.95) of Example 4.10
4.4 Duality and Biorthogonality
101
Figure 4.15 shows the products eμ (x) e˜ ν (x) of one basis function eμ (x) from (4.92) and one dual basis function e˜ ν (x) from (4.95) for μ = 0, 1, 2 and ν = 0, 1, 2. The functions on the main diagonal clearly have a positive area for −1 < x < 1 while the integrals on the off diagonals are zero, either because the functions have odd ( ) symmetry (μ, ν) = (1, 0), (0, 1), (1, 2), (2, 1) or because positive and negative ar( ) eas cancel (ν, μ) = (0, 2), (2, 0) . The exact values of the scalar product are given by (4.93). Examples 4.9 and 4.10 have covered non-orthogonal basis functions and their dual basis. The following Example 4.11 shows that the consideration of the dual basis provides additional insight also for expansions into orthogonal basis functions. Example 4.11 (Orthogonal Matrix). Example 4.6 considered an orthonormal coordinate system. Here, orthogonality is sacrificed for simplicity to avoid square roots in √ the definition of the basis vectors. The basis vectors are now chosen as eμ = en / 2 and combined into a matrix E 1 1 1 1 1 1 1 , e2 = , E= . .e1 = (4.96) 2 1 2 −1 2 1 −1 The vectors e1 and e2 are orthogonal but not orthonormal since EH E = 12 I. Therefore the inverse matrix is given by E−1 = 2EH and the dual basis follows similar to (4.90) from " 1 1 ! 1 1 H ˜ ˜ 1 e˜ 2 = , e˜ 1 = .E = 2E = e , e˜ 2 = . (4.97) 1 −1 1 −1 Since E is real and symmetric, EH = E and thus the conjugate transpose in (4.97) is not necessary. Nevertheless, this notation is kept here to facilitate later generalization of this example.
Fig. 4.16 Basis vectors e1 and e2 (black) and dual basis vectors e˜ 1 and e˜ 2 (gray) from (4.97) of Example 4.11. The grid size is half a unit. The basis vectors eμ and the dual basis vectors e˜ ν differ only by a scaling factor of 2
e˜ 1 e1 e2 e˜ 2
Figure 4.16 shows the primal basis vectors e1 and e2 according to (4.96) and dual basis vectors e˜ 1 and e˜ 2 from (4.97). Since e1 and e2 constitute already an orthogonal basis, the dual basis vectors differ only by a factor of 2 from the primal basis. The primal and the dual basis can also be interpreted in terms of columns and ˜ see Fig. 4.17. The primal basis vectors e1 and e2 are rows of the matrices E and E, the columns of the matrix E and the dual basis vectors e˜ 1 and e˜ 2 are the columns
102
4 Signal Spaces
˜ Vice versa, the rows of E are the scaled (and complex conjugate) of the matrix E. ˜ are the scaled (and complex conjugate) vectors of the dual basis, while the rows of E vectors of the primal basis. e0
n
→
e1
1 2
1 2
1 2
− 12
e˜ 0 1 H e˜ 2 0
↓m
E=
1
m
→
e˜ 1 1
E˜ = 2EH =
1 H e˜ 2 1
2eH0 ↓n
1
−1
2eH1
Fig. 4.17 Matrices E and E˜ from Example 4.11. The basis vectors eμ and the dual basis vectors e˜ ν correspond to column and row vectors of the matrix E, respectively (up to a scaling factor)
Comparing Examples 4.9 and 4.11 leads to the following conclusions: • If the primal basis is not orthogonal then the dual basis is different from the primal one (see Fig. 4.14). Nevertheless, both bases are closely related by the condition of biorthogonality (4.85). • If the primal basis is orthogonal but not orthonormal then the dual basis vectors are a scaled version of the primal ones (see Fig. 4.16). • If the primal basis is orthonormal, then the primal basis and the dual basis are identical. For orthogonal bases, the formal distinction between the primal basis and the dual basis is often neglected. Only the primal basis is used with appropriate scaling factors where needed but without dedicated reference to a dual basis. After this short introduction to biorthogonal signal spaces, the Examples 4.9 to 4.11 are generalized at first to matrices of arbitrary size in Sect. 4.4.2 and then to general signal spaces in Sect. 4.4.3.
4.4.2 Sets of Biorthogonal Vectors If the elements v in (4.84) are column vectors with N elements then there are also N primal basis vectors eμ and N dual basis vectors e˜ ν , as well as N expansion coefficients Vm . To benefit from vector-matrix notation, the expansion coefficients Vμ are arranged in a column vector V and the basis vectors eμ and e˜ ν become the column vectors of the matrices E and E˜ "T " ! " ! ! V. = V1 . . . Vμ . . . VN , E = e1 . . . eμ . . . eN , E˜ = e˜ 1 . . . e˜ ν . . . e˜ N . (4.98) The scalar product of two vectors v and w is given by v, w = wH v. Then Eqs. (4.86) and (4.87) become
4.4 Duality and Biorthogonality
103
Vν = v, e˜ ν = e˜ Hν v,
v=
.
N
Vμ eμ ,
(4.99)
μ=1
or more elegantly in matrix notation with (4.98) H
V = E˜ v,
v = EV .
.
(4.100)
H
For row μ and column ν of the matrix product E˜ E holds ! H " ˜ E = e˜ Hν eμ = eμ , e˜ ν = δμν . . E
(4.101)
μν
The last equality is true because of the biorthogonality condition (4.85). Thus the H matrix product E˜ E is equal to the identity matrix and E˜ can be obtained easily H E˜ E = I
.
E˜ = E−H .
(4.102)
If in addition E is orthogonal but not necessarily orthonormal, then E˜ is a scaled version of E with c ∈ R EH E = cI
.
1 E˜ = E . c
(4.103)
The results (4.102) and (4.103) generalize (4.90) in Example 4.9 and (4.97) in Example 4.11. So far, the dual space for v˜ from (4.84) has not been used explicitly. Its role appears in a natural form when adopting vector-matrix notation also for the expansion coefficients V˜ n as in (4.98). Then Eqs. (4.86) and (4.87) become
.
(4.104)
Comparing the first and the third column in (4.104) shows that the vector v from the ˜ in the dual space and vice primal space is associated with the vector of coefficients V versa V with v˜ . These pairs evolve by left and right multiplication with the matrices H E˜ and E, respectively. Therefore the primal and the dual space are connected to the ˜ H and E. This result generalizes the corresponding observarows and columns of E tion from Fig. 4.17. The primal and the dual space coincide for orthonormal bases, i.e. for E˜ = E, as is confirmed by comparison of the first and the second column in (4.104). In the signal space of vectors, matrix notation provides a tool for the elegant formulation of the relations between a vector v of the signal space and the vector V of its expansion coefficients, see (4.100). The same holds for the relations between primal und dual space, see (4.104). H In signal processing applications, V = E˜ v is called the analysis equation because it represents a vector v by its expansion coefficients. Likewise v = EV is called the synthesis equation, because it reconstructs the vector v from its expansion coefficients, see e.g. [10].
104
4 Signal Spaces
A representation of similar simplicity is desirable also for spaces with other types of signals as elements. Such a generalization is discussed next.
4.4.3 General Biorthogonal Signal Spaces The calculation of the expansion coefficients in (4.86) is expressed as the scalar product of the Hilbert space with elements v. This general definition holds for any Hilbert space, irrespective of the particular type of signal. The vector formulation in (4.99) is just one specific example for the Hilbert space of vectors of length N. On the other hand, the reconstruction of an element v from its expansion coefficients in (4.84) does not have the form of a scalar product, not even in the vector form of (4.99). Nevertheless, the synthesis equation v = EV in (4.100) has the simH ple form of a matrix-vector product, just like the analysis equation V = E˜ v which results from the scalar product in (4.99). This observation suggests to look for operations which are similar to scalar products but act on all expansion coefficients simultaneously. This search for more general expressions is conducted first for the vector space from Sect. 4.4.2, then it is generalized to spaces with other types of elements. The step from (4.99) to (4.100) for the analysis equation can be expressed as a vector of scalar products ⎤ ⎡ ⎤ ⎡ ⎢⎢⎢ v, e˜ 1 ⎥⎥⎥ ⎢⎢⎢ e˜ H1 v ⎥⎥⎥ "H ⎢⎢⎢ . ⎥⎥⎥ ⎢⎢⎢ . ⎥⎥⎥ ! H .V = ⎢ ⎢⎢⎢ .. ⎥⎥⎥⎥ = ⎢⎢⎢⎢ .. ⎥⎥⎥⎥ = e˜ 1 . . . e˜ N v = E˜ v . ⎦ ⎣ H ⎦ ⎣
v, e˜ N e˜ N v
(4.105)
Instead of a vector of scalar products, a different operation is now introduced. It overloads the notation v, e˜ μ for a scalar product by admitting also matrices instead of vectors. The result is not a scalar as in a single scalar product but a vector of scalar products, here the vector products of each column of E˜ with the column vector v ˜ H v = v, E˜ . .V = E (4.106) The matrix-vector product for the synthesis equation in (4.100) appears as ⎡ ⎤ ⎢V ⎥ " ⎢⎢⎢⎢ .1 ⎥⎥⎥⎥ .v = Vμ eμ = e1 . . . eN ⎢⎢⎢⎢ .. ⎥⎥⎥⎥ = EV = V, EH . ⎢⎣ ⎥⎦ μ=1 VN N
!
(4.107)
This equation also represents a vector of scalar products, in particular, vector products of each row of E with the column vector V. A different notation has been ( ) adopted for these scalar products , instead of , because—in general—the space of expansion coefficients might be different from the space of signals v. Consequently, also the scalar products must be defined in different ways.
4.5 Signal Transformations
105
Writing the analysis equation and the synthesis equation in the form of ˜ , .V = v, E v = V, EH , (4.108) allows to abstract from the particular signal space for v and V. It is valid for any signal space with suitable scalar products. It remains to generalize the biorthogonality condition from (4.101) and (4.102). To this end, a matrix of scalar products is established and denoted as ⎡ ⎢⎢⎢ e1 , e˜ 1 ⎢⎢⎢ . .⎢ ⎢⎢⎢ .. ⎣
eN , e˜ 1
⎤ . . . e1 , e˜ N ⎥⎥ ⎥ .. .. ⎥⎥⎥⎥ = E, E˜ = I . . . ⎥⎥⎥ ⎦ . . . eN , e˜ N
(4.109)
This notation provides also an elegant way to express the Gramian of the two biorthogonal bases, e.g. for the primal basis ⎡ ⎢⎢⎢ e1 , e1 ⎢⎢⎢ . .G = ⎢ ⎢⎢⎢ .. ⎣
eN , e1
⎤ . . . e1 , eN ⎥⎥ ⎥ .. .. ⎥⎥⎥⎥ = E, E , . . ⎥⎥⎥ ⎦ . . . eN , eN
(4.110)
and similar for the dual basis. In signal spaces other than for vectors of length N, not only a suitable scalar product has to be specified but also a suitable identity operator other than the Kronecker symbol δμν or, in matrix notation, the identity matrix I. Possible candidates are delta impulses as in (3.5). More examples are discussed in Sect. 4.5 on signal transformations. Furthermore, the notation , and , might not only include scalar products, resp. vectors and matrices of scalar products in the classical sense but also in the wider sense as introduced in Sect. 4.2.6.3. Consequently, the idea of orthogonality and biorthogonality has to be extended from (4.109) to expressions with distributions on the right hand side. Then (bi-)orthogonality is given by expressions like (4.109) with scalar products understood in the wider sense.
4.5 Signal Transformations The Fourier transformation introduced in Sect. 3.2 is just one of many different signal representations. There are several other Fourier-type transformations, depending on the nature of the signals in the time domain: continuous or discrete, periodic or non-periodic. The diversity increases further when multidimensional signals are considered in Chap. 5. This section presents a general framework for the definition of signal transformations. This framework includes not only the various classical Fourier-type transformations as special cases, it provides also a solid basis for the construction of new transformations for multidimensional signals.
106
4 Signal Spaces
4.5.1 General Procedure The general procedure for the construction of signal transformations follows a series of steps that are based on the material introduced in Sects. 4.1 and 4.3. These steps are outlined below. Their application to different signal spaces yields the classical Fourier-type transformations as shown in Sects. 4.5.2 to 4.5.5. 1. The first step is the choice of a suitable signal space. Its elements may be continues-time or discrete-time signals which are either periodic or not. 2. Then define a scalar product by either integration of continuous-time signals or summation of discrete-time signals. The ranges of integration or summation may be over one period for periodic signals or from −∞ to ∞ for non-periodic signals. 3. The third step chooses the primal basis functions e. Complex exponential functions or complex exponential sequences are popular candidates but they are not the only possible choice. 4. Then investigate the orthogonality properties of the basis functions. 5. With the results from the previous step, the dual basis functions e˜ are determined. 6. With the scalar product and the dual basis functions, the analysis equation according to Eq. (4.108) is set up. 7. In a similar way follows a second scalar product with the primal basis functions for the synthesis equation, see also Eq. (4.108). Following this procedure establishes the forward and the inverse signal transformation from the analysis and the synthesis equation. Example 4.12 (State Transformation as Signal Transformation). Section 3.4.2 has introduced conventional state space systems with a finite number of states. Further, the similarity transformation of the state vector has been interpreted as a signal transformation in (3.139)–(3.141). This example reviews this interpretation in the light of the above steps of the general procedure. To emphasize the parellelity to Eqs. (4.106) and (4.107), the state vector in the diagonal representation (3.138) is denoted in this example by a bold uppercase letter as x˜ (t) = X(t). 1. The state vectors x(t) and X(t) are elements of the Hilbert space of column vectors with time dependent components. 2. The scalar product is defined as in Eq. (4.40) over the finite range of indices from (3.132). 3. The basis functions are the column vectors of the matrix K , i.e. eμ = kμ or E = K . 4. There are no further assumptions for the matrix A , except that it is square and has single eigenvalues. Therefore the eigenvectors kμ are linear independent but not necessarily orthogonal. 5. For nonorthogonal basis vectors follows the dual basis from (4.102) as K˜ = K −H . 6. The analysis equation is formulated in different ways according to (4.106) and (3.139) as H −1 .X(t) = K x(t) = K˜ x(t) = x(t), K˜ = T {x(t)} . (4.111)
4.5 Signal Transformations
107
7. The synthesis equation follows from (4.107) and (3.140) as x(t) = K X(t) =
M
.
Xμ (t)kμ = X(t), K H = T −1 {X(t)} .
(4.112)
μ=1
The basis vectors are—in general—not orthogonal, therefore the transforma tion T lacks the elegance of the Fourier-type transformations from Sect. 4.5. Unfortunately, the notation in textbooks on functional analysis and on signal processing is not always identical. To switch to the usual notation in signal processing, the elements of signal spaces and their coefficients are no more distinguished by boldface and normal type. Instead, boldface letters denote vectors and matrices while normal type denotes scalar values. The indices for sequences and vectors with N elements run from 0 to N − 1 rather than from 1 to N. Since signal transformations may be applied in the time domain as well as along a spatial dimension, the index in the original domain is generically denoted by n. For simplicity, terms like time and frequency are retained for the original and the transform domain.
4.5.2 Discrete Fourier Transformation Signal spaces with vectors as elements have already been discussed in Sect. 4.4.2. Here they are reconsidered for a special choice of the basis vectors eμ . The elements of the signal space are either expressed as sequences or as vectors of length N, compare (4.98) ! "T .v(n), 0 ≤ n ≤ N − 1, or v = v(0) . . . v(n) . . . v(N − 1) . (4.113) The scalar product is given by (see (4.99)) .
v(n), u(n) =
N−1
u∗ (n)v(n),
v, u = uH v .
(4.114)
n=0
The set of basis functions is chosen as e (n) =
. μ
"T ! eμ = eμ (0) . . . eμ (n) . . . eμ (N − 1) .
1 i 2π μn eN , N
(4.115)
These basis functions are orthogonal since .
eμ (n), eν (n) =
N−1 n=0
e∗ν (n)eμ (n) =
N−1 1 i 2π (μ−ν)n 1 eN = δμν . N N 2 n=0
(4.116)
The dual basis functions e˜ ν (n) are just a scaled version of the primal basis eμ (n) and satisfy the biortogonality condition (4.85)
108
4 Signal Spaces
e˜ (n) = N eν (n) = ei N νn , 2π
eμ (n), e˜ ν (n) = δμν .
. ν
(4.117)
This condition can also be expressed in matrix notation similar to (4.98) with " ! .E = e0 . . . eμ . . . eN−1 , (4.118) [E]nμ = eμ (n), " ! " ! E˜ E˜ = e˜ 0 . . . e˜ ν . . . e˜ N−1 , = e˜ ν (n), (4.119) nν
˜ as (compare (4.101)) as a scalar product of the columns of E and E ! H " ˜ =E ˜ H E = δμν I, . E, E E˜ E = e˜ Hν eμ = eμ , e˜ ν = δμν . μν
(4.120)
The analysis equation follows from (4.99) or (4.106) as Vμ = v(n), e˜ μ (n) =
N−1
.
v(n) e˜ ∗μ (n) =
n=0
N−1
v(n) e−i N μn . 2π
(4.121)
n=0
and the synthesis equation from (4.107) as v(n) = Vμ , eμ (n) =
.
N−1 2π 1 Vμ ei N μn . N μ=0
(4.122)
Note that the scalar product , in the analysis equation is calculated by a summation along the columns of E˜ (see (4.120)), while the scalar product , in the synthesis equation is calculated by a summation along the rows of E. This observation may appear as a mere sidenote at this point, but the difference between both scalar products is more exposed for other signal transformations. The analysis equation (4.121) and the synthesis equation (4.122) constitute the transformation pair of the Discrete Fourier Transformation or DFT. Since the primal basis functions eμ (n) and the dual basis functions e˜ ν (n) differ only by a factor of N, the corresponding matrices are often expressed by a single matrix W as W = E˜ = N E,
.
[W]nμ = w−nμ N ,
wN = e−i N . 2π
(4.123)
Then the analysis and the synthesis equation turn into the familiar form of the forward and inverse Discrete Fourier Transformation [7, 9] V = WH v,
.
Vμ = DFT{v(n)} =
N−1
v(n) wμn N ,
(4.124)
n=0
v=
1 WV, N
v(n) = DFT−1 {Vμ } =
N−1 1 Vμ w−μn . N N μ=0
(4.125)
4.5 Signal Transformations
109
˜ is given by (4.97) from The factor wN becomes w2 = −1 for N = 2. Then W = E Example 4.11. Indeed, the matrix discussed in that example belongs to a Discrete Fourier Transformation of length N = 2.
4.5.3 Fourier Series Consider the Hilbert space of periodic continuous functions with period T 0 , i.e., v(t) = v(t + T 0 ). The scalar product and the basis functions eμ (t) are chosen as
v(t), u(t) =
.
T 0 /2 1 v(t) u∗ (t) dt, T0
eμ (t) = eiω0 μt ,
ω0 =
2π T0
.
(4.126)
−T 0 /2
The basis functions eμ (t) are orthonormal
.
T 0 /2 1 iω0 (μ−ν)t eiω0 μt , eiω0 νt = e dt = δμν , T0
(4.127)
−T 0 /2
therefore they are identical to their dual basis e˜ ν (t) = eν (t). The Fourier series coefficients Vμ are determined by the analysis equation T 0 /2 1 Vμ = v(t), eμ (t) = v(t), eiω0 μt = v(t)e−iω0 μt dt , T0
.
(4.128)
−T 0 /2
and the expansion of v(t) w.r.t. the basis functions eμ (t) is given by the Fourier series ∞
v(t) =
.
Vμ eμ (t) =
μ=−∞
∞
Vμ eiω0 μt = Vμ , e∗μ (t) .
(4.129)
μ=−∞
The Fourier series expansion acts here as synthesis equation, which can be expressed as a scalar product , between the sequence of Fourier series coefficients Vμ and the time dependent basis functions eμ (t). The result is the time dependent scalar function v(t). For square-summable sequences of Fourier coefficients Vμ , the synthesis equation (4.129) is a scalar product in the space of infinite sequences (denoted by double angles). Its meaning can be extended to include the impulse train v(t) = T10 X( Tt0 ) with a non-convergent sequence of Fourier coefficients Vμ =
.
t 1 T0 X T0
T 0 /2 1 1 1 , eμ (t) = δ(t) eμ (t) dt = eμ (0) = . T0 T0 T0
−T 0 /2
Indeed, the Fourier series in the form of (4.129) gives
(4.130)
110
4 Signal Spaces
v(t) =
.
1 T0 ,
* + ∞ 1 t 1 iω0 μt . e = X e∗μ (t) = T 0 μ=−∞ T0 T0
(4.131)
The last equality sign follows from the results in Sect. 3.2.3 for Ω = ω0 t = 2π Tt0 . Thus the calculation of the Fourier series coefficients by (4.128) or (4.130) as well as the expansion of a function v(t) into its Fourier series by (4.129) and (4.131) are expressed by scalar products as v(t) = Vn , e∗n (t) .
Vn = v(t), en (t) ,
.
(4.132)
In the case of distributions, the scalar product , has to be understood in the sense of Sect. 4.2.6.3. An important example is the relation between the sequence of all basis functions at different time instants t and τ + * ∞ t−τ iω0 μt iω0 μτ = . (4.133) . eμ (t), eμ (τ) = e ,e eiω0 μ(t−τ) = X T0 μ=−∞ The result is a periodic repetition of delta impulses, which acts as a unit function in the space of periodic functions. Comparing (4.127) and (4.133) shows that these two scalar products act quite differently. Further, they expose a double nature of the exponential basis functions eμ (t): Integration of a pair of periodic basis functions eiω0 μt , eiω0 νt in (4.127) over one period of the continuous variable t gives a scalar product of two orthonormal basis functions.In contrast, summation of all products of basis functions in (4.133) as required by eiω0 μt , eiω0 μτ w.r.t the discrete index μ gives a periodic sequence of delta impulses.
4.5.4 Discrete-Time Fourier Transformation Here, the general procedure from Sect. 4.5.1 is applied to the signal space of sequences with infinite duration v(n),
.
−∞ < n < ∞.
(4.134)
Sequences of finite duration (4.113) have been considered in Sect. 4.5.2, therefore some similarities can be expected but there are also notable differences. One of these is the fact that operations with finite vectors and matrices are no longer applicable. The scalar product is given by an infinite sum (compare Eq. (4.114)) .
v(n), u(n) =
∞
u∗ (n)v(n) .
n=−∞
The basis functions are chosen as sequences of complex exponentials
(4.135)
4.5 Signal Transformations
111
e (n) =
. Ω
1 inΩ e . 2π
(4.136)
These basis functions are sequences with the integer index n as independent variable and the continuous real quantity Ω as a parameter. They are periodic with period 2π w.r.t. to the parameter Ω but not periodic w.r.t. to the index n. These basis functions are orthogonal since their scalar product is equal to the unit function for periodic functions (see Eq. (4.133)) .
∞ 1 ein(Ω−Ω ) 2 (2π) n=−∞ n=−∞ + * ∞ 1 1 Ω − Ω . = δ(Ω − Ω − 2πν) = X 2π ν=−∞ 2π (2π)2 ∞
eΩ (n), eΩ (n) =
e∗Ω (n)eΩ (n) =
(4.137)
The right hand side of (4.137) consists of distributions, therefore orthogonality is understood in the extended sense of Sect. 4.4.3. Due to the orthogonality of the primal basis functions eΩ (n), the dual basis functions e˜ Ω (n) are a scaled version of the primal ones e˜ Ω (n) = 2π eΩ (n) = einΩ since .
eΩ (n), e˜ Ω (n) =
∞
δ(Ω − Ω − 2πν) =
ν=−∞
* + Ω − Ω 1 X . 2π 2π
(4.138)
The analysis equation is given by the scalar product of the sequence v(n) and the dual basis sequences e˜ Ω (n) V(Ω) = v(n), e˜ Ω (n) =
∞
.
v(n)e−inΩ .
(4.139)
n=−∞
The result is a complex-valued periodic function V(Ω) of the parameter Ω. The period of 2π is determined by the dual basis function for n = 1 as e˜ Ω (1) = e−iΩ . The synthesis equation follows from integration w.r.t. Ω over one period and establishes the scalar product w.r.t to the parameter Ω π
π 1 .v(n) = V(Ω)eΩ (n)dΩ = V(Ω)einΩ dΩ = V(Ω), e∗Ω (n) . 2π −π −π
(4.140)
The analysis equation (4.139) and the synthesis equation (4.140) constitute the transformation pair of the Discrete Time Fourier Transform (DTFT). The parameter Ω acts as the continuous frequency variable w.r.t. to the discrete time variable n V(Ω) =
∞
.
n=−∞
v(n)e−inΩ ,
v(n) =
π 1 V(Ω)einΩ dΩ . 2π −π
(4.141)
The DTFT has been originally introduced for time domain signals which does not restrict its use to any other discrete variable.
112
4 Signal Spaces
As before in Sect. 4.5.4, the two scalar products , and , are of distinct nature. While v(n), e˜ Ω (n) in (4.139) is defined as an infinite sum, V(Ω), e∗Ω (n) is given by an integration w.r.t. to the continuous frequency Ω over one period of length 2π.
4.5.5 Fourier Transformation Similar to Sect. 4.5.3, the space of continuous-time functions v(t) is considered but now without the requirement of periodicity. A suitable scalar product is given by the infinite integration w.r.t. the continuous variable t .
∞
v(t), u(t) =
u∗ (t)v(t) dt .
(4.142)
−∞
Complex exponential functions of continuous time t with the continuous real-valued parameter ω act as basis functions e (t) =
. ω
1 iωt e . 2π
(4.143)
Note that due to the continuous nature of the parameter ω, two different basis functions eω1 (t) and eω2 (t) with different frequencies ω1 and ω2 do not necessarily share a common period. Further, they are orthogonal but not orthonormal since
e . ω (t), eω (t) =
∞
e∗ω (t) eω (t) dt =
−∞
∞ 1 i(ω−ω )t 1 δ(ω − ω ) . (4.144) e dt = 2π (2π)2 −∞
The term orthogonality has again to be understood in the extended sense of Sect. 4.4.3. From (4.144) follows that the dual basis functions e˜ ω (t) are scaled versions of the primal basis functions eω (t) e˜ (t) = 2π eω (t) = eiωt .
(4.145)
. ω
Now the analysis equation follows from the scalar product (4.142) and the dual basis functions e˜ ω (t) as V(ω) = v(t), e˜ ω (t) =
∞
.
−∞
v(t) e˜ ∗ω (t) dt =
∞
v(t) e−iωt dt .
(4.146)
−∞
The result is a complex-valued function V(ω) similar to Eq. (4.139) but in general it is not periodic. It depends on the parameter ω of the dual basis function e˜ ω (t). Integration of V(ω) eω (t) w.r.t. ω yields the synthesis equation
4.5 Signal Transformations
v(t) =
∞
.
113
V(ω)eω (t) dω =
V(ω), e∗ω (t)
−∞
∞ 1 V(ω) eiωt dω . = 2π −∞
(4.147)
The analysis equation (4.146) and the synthesis equation (4.147) provide the transformation pair of the Fourier Transform with the continuous time variable t and the continuous frequency variable ω. Neither v(t) nor V(ω) is required to be periodic. V(ω) =
∞
.
v(t)e−iωt dt,
v(t) =
−∞
∞ 1 V(ω)eiωt dω . 2π −∞
(4.148)
The scalar products v(t), e˜ ω (t) and V(ω), e∗ω (t) are both defined in terms of an infinite integration either w.r.t. time t or frequency ω.
4.5.6 Discrete Cosine Transformation The discrete cosine transform (DCT) is closely related to the discrete Fourier transform DFT from Sect. 4.5.2. It operates in the same signal space, i.e. on sequences of finite length, resp. vectors. In contrast to the DFT, the discrete cosine transform is a real-valued transformation and usually applied to real-valued data ! "T .v(n) ∈ R, 0 ≤ n ≤ N − 1, or v = v(0) . . . v(n) . . . v(N − 1) . (4.149) The scalar product for real-valued data does not require complex conjugate transposition as in (4.114) .
v(n), u(n) =
N−1
u(n)v(n),
v, u = uT v .
(4.150)
n=0
Actually, the discrete cosine transform is a family of signal transformations rather than a single transformation. Different members of this family differ in their symmetry relations which imply different kinds of even or odd extensions to values beyond the interval [0, N − 1] as specified in (4.149). The version introduced here is the so-called DCT-II which is popular in image coding. The basis functions eμ (n) and the dual basis functions e˜ ν (n) are given by ⎧ 1 ⎪ ⎪ μ = 0, ⎨N (4.151) .eμ (n) = ⎪ 1 π 2 ⎪ ⎩ cos (n + )μ μ = 1, . . . , N − 1, N 2 N e˜ ν (n) = cos Nπ (n + 12 )ν μ = 0, . . . , N − 1. (4.152) The biorthogonality of the basis functions and their duals is shown now separately for μ = 0 and for μ = 1, . . . , N − 1 according to the case distinction in (4.151). Equation (4.150) turns for μ = 0 into a summation over a complete period of the cosine function. The result is zero except for ν = 0
114
4 Signal Spaces N−1
eμ (n) e˜ ν (n) =
.
2 N
n=0
N−1
cos
π N (n
+ 12 )ν = δ0ν .
(4.153)
n=0
For μ = 1, . . . , N − 1 holds with similar arguments N−1 .
eμ (n) e˜ ν (n) =
N−1
2 N
n=0
cos
π N (n
+ 12 )μ cos Nπ (n + 12 )ν
n=0
=
1 N
N−1
cos
π N (n
+
1 2 )(μ
+ ν) +
1 N
N−1
cos
π N (n
+ 12 )(μ − ν) = δμν .
(4.154)
n=0
n=0
The first sum is zero since μ + ν 0 and the second sum is equal to N only for μ − ν = 0. Both Eqs. (4.153) and (4.154) constitute the condition for biorthogonality .
eμ (n), e˜ ν (n) =
N−1
eμ (n) e˜ ν (n) = δμν .
(4.155)
n=0
The analysis equation for the DCT follows for μ = 0, . . . , N − 1 as Vμ = v(n), e˜ ν (n) =
N−1
.
v(n) e˜ ν (n) =
n=0
N−1
v(n) cos
π N (n
+ 12 )μ ,
(4.156)
n=0
and the synthesis equation for n = 0, . . . , N − 1 v(n) = Vμ , eμ (n) =
N−1
.
μ=0
Vμ eμ (n) =
1 N V0
+
2 N
N−1 μ=1
Vμ cos
π N (n
+ 12 )μ . (4.157)
The synthesis equation can also be evaluated for arbitrary values of the index n, i.e., beyond n = 0, . . . , N − 1. From (4.157) follows v(n) = v(−(n + 1)) and v(N − n) = V(N + (n − 1)), i.e., even symmetry at the boundary, such that, e.g., v(0) = v(−1) and v(N − 1) = V(N). Even symmetry avoids discontinuities at the boundary, an effect which is beneficial for image coding.
4.5.7 Summary of the Fourier-Type Transformations The four Fourier-type transformations from Sect. 4.5.2 to Sect. 4.5.5 are compiled in Table 4.1. Note that the Fourier series sticks out as the only one with a truely orthonormal basis while the other three bases are orthogonal but not orthonormal. This fact is due to the normalization factor of 1/T 0 present in the definition of the scalar product (4.127). Similar factors of 1/N or 1/2π could have also been inserted into the definition of the scalar products (4.114) for the DFT and (4.135), (4.142) for the DTFT and FT to achieve orthonormal bases as well. The current scalar products
4.5 Signal Transformations
115
have been chosen such that the synthesis and analysis equations of all four signal transformations complies with popular definitions in the literature. The DCT from Sect. 4.5.6 has not been included in order to preserve the compactness of the table.
DFT FS
eμ (n) =
1 i e N
2π μn N
eμ (t) = eiω0 μt
2π
e˜ μ (n) = ei N μn
Vμ =
e˜ μ (t) = eiω0 μt Vμ =
1 T0
−1 N,
v(t)e−iω0 μt dt
−T 0 /2
DTFT eΩ (n) =
1 inΩ e 2π
e˜ Ω (n) = einΩ
V(Ω) =
eω (t) =
1 iωt e 2π
e˜ ω (t) = eiωt
V(ω) =
FT
2π
v(n) e−i N μn
n=0 T0 /2
∞ ,
v(n)e−inΩ
n=−∞ ∞ −∞
v(t)e−iωt dt
v(n) = v(t) = v(n) = v(t) =
1 N
N−1 , μ=0 ∞ ,
μ=−∞
1 2π 1 2π
π
2π
Vμ ei N μn Vμ eiω0 μt
V(Ω)einΩ dΩ
−π ∞
V(ω)eiωt dω
−∞
Table 4.1 Summary of the four Fourier-type transformations from Sects. 4.5.2 to 4.5.5: discrete Fourier transform (DFT), Fourier series (FS), discrete-time Fourier transform (DTFT), Fourier transform (FT)
4.5.8 A Review of the Notation The notation ·, · and ·, · has been used repeatedly to define the various Fouriertype transformations in Sects. 4.5.2–4.5.6 and it is used to further introduce multidimensional transformations in Chap. 6 and the Sturm-Liouville transformation in Chap. 10. Therefore, it is worthwhile to review the different aspects of this unifying notation ·, ·. It denotes • a scalar product in the strict sense resulting in a scalar value, see (4.116) and (4.127), • an operation similar to (4.50) resulting in a generalized function, typically a delta impulse or a train of delta impulses, see (4.144) and (4.137), • an operation similar to (4.106) resulting in a matrix of scalar products, see (4.120). These different aspects apply similarly to the notation ·, ·. This process of overloading the notation of the classical scalar product to include also operations which result in generalized functions or in vectors of scalar products implements the generalization addressed at the end of Sect. 4.2.6 and at the end of Sect. 4.4.3. The advantage of such a general notation lies in the formulation of the Fourier-type transformations in Sect. 4.5.2 to Sect. 4.5.6 (and many other signal processing operations) in a unifying fashion. Note that overloading of mathematical operators is quite common. As an example the nabla operator ∇ results in a vector (the gradient grad v(x) ) when applied to a scalar function v(x) of several variables and it results in a scalar (the divergence div v(x)) when applied to a vector-valued function v(x), see Sect. 6.3.6.
116
4 Signal Spaces
4.5.9 Signal Transformations with Complex Frequency Variables Besides the Fourier-type transformations covered in Sects. 4.5.2 –4.5.6 with real valued frequency variables, there are also signal transformations with complex frequency variable, most notable the Laplace and the z-transformation. The inverse transformations require complex integration with careful construction of the integration path around the singularities in the complex plane. Their explicit calculation is usually avoided in practical applications. Instead, expansions into residues are employed to reduce the inverse transformation to a restricted set of precalculated inverses. Since Laplace- and z-transformation are core topics in introductory courses in signals and systems, they are listed here for completeness and for later reference without derivation and further discussion of their properties. Also the construction of the forward and inverse transformation as analysis and synthesis equation emphasized in Sects. 4.5.2–4.5.6 is not pursued here.
4.5.9.1 Laplace Transformation The one-sided Laplace transformation resembles the Fourier transformation in (4.148) with two main differences: the frequency variable s is complex and the integration range is restricted to the interval 0 ≤ t < ∞. Definition The Laplace transform F(s) of a continuous-time function f (t) is defined as L { f (t)} = F(s) =
∞
.
f (t) e−st dt,
s∈C.
(4.158)
0
Depending on the properties of f (t), the integral in (4.158) may only exist in a certain region of the complex plane, the so-called region of convergence. Differentation Theorem A beneficial property of the one-sided Laplace transformation is its ability to convert linear ordinary differential equations with constant coefficients into algebraic equations which are considerably easier to solve. This property is based on the differentiation theorem of the Laplace transformation which turns the derivative of f (t) in the time domain into a multiplication with s in the frequency domain. In addition, the initial value f (+0) is considered as an additive term d f (t) = sF(s) − f (+0), f (+0) = lim f (t), .L t>0. (4.159) t→0 dt The Laplace transforms of higher derivatives follow by repeated application of (4.159). Relation to the Fourier Transformation The relation to the Fourier transform of a one-sided function is established by considering the Laplace transformation on the imaginary axis of the complex s-plane, i.e. for s = iω, provided that the imaginary
4.5 Signal Transformations
117
ˆ axis belongs to the region of convergence. Then the Fourier transformation F(ω) of ˆ a one-sided function f (t) is equal to the Laplace transform of f (t) on the imaginary axis (compare (4.148) and (4.158)) ⎧ ⎪ ⎪ ⎨ f (t) t ≥ 0, ˆ . fˆ(t) = ⎪ (4.160) F(ω) = F(iω) . ⎪ ⎩0 t < 0, Many more properties are found in textbooks on signals and systems, see the first paragraph of Chap. 3.
4.5.9.2 z-Transformation The one-sided z-transformation is related to the discrete time Fourier transformation (DTFT) in a similar way as the Laplace transformation to the Fourier transformation: There is a complex-valued frequency variable z and a summation range of 0 ≤ n < ∞. Definition The z-transform of F(z) of a discrete-time sequence f [n] is defined as Z { f [n]} = F(z) =
∞
.
f [n] z−n ,
z∈C.
(4.161)
n=0
Depending on the properties of the sequence f [n], the summation in (4.161) may converge only in certain region of the complex plane. Shift Theorem The one-sided z-transformation converts linear difference equations with constant coefficients into algebraic equations. Its shift theorem turns a shift of the sequence f [n] by one index value in the discrete-time domain into a multiplication with z in the frequency domain and considers the initial value f [0] as an additive term Z { f [n + 1]} = z F(z) − z f [0] .
.
(4.162)
The z-transforms of sequences shifted by more than one index value follow by repeated application of (4.162). Relation to the Discrete Time Fourier Transformation The relation between the z-transformation and the discrete time Fourier transformation (DTFT) is established by considering the z-transformation on the unit circle of the complex z-plane, proˆ vided that the z-transformation converges on the unit circle. Then the DTFT F(Ω) of ˆ a one-sided sequence f [n] is equal to the z-transformation of f [n] on the unit circle z = eiΩ (compare (4.141) and (4.161)) ⎧ ⎪ ⎪ ⎨ f [n] n ≥ 0, ˆ . fˆ[n] = ⎪ (4.163) F(Ω) = F(eiΩ ) . ⎪ ⎩0 n < 0,
118
4 Signal Spaces
4.6 Problems 4.1. Calculate the Fourier coefficients fμ of the Fourier series expansion rect
.
t T
1 + fμ cos μπ Tt , 2 μ=1 ∞
=
−T kc ∧ |ky | > kc
7.5. Consider the signal f (x, y) with Fourier transform F(k x , ky ) from Problem 7.4. This time, the signal is sampled with a small offset εY (0 < ε < 1) as shown in Fig. 7.36. y Y −εY εY
X
x
Fig. 7.36 Two-dimensional sampling grid with horizontal offset εY
252
7 Multidimensional Sampling
Determine the matrices A and Ud for the sampling grid in Fig. 7.36, and the matrices B and Vd for the periodic repetition pattern. Determine the sampling density and compare it to the sampling density for a rectangular grid from Problem 7.4. What do you observe? 7.6. If you read a digital version of this chapter then you can check the resolution of your display by inspection of the zone plate in Fig. 7.20. Zoom into this figure until it occupies the whole display. Then zoom out until ghost images of the white disc in the center appear. Zoom out further and observe the number and position of the ghost images.
References 1. Bracewell, R.N.: Fourier Analysis and Imaging. Kluwer Academic/Plenum Publishers, New York (2003) 2. Dudgeon, D.E., Mersereau, R.M.: Multidimensional Digital Signal Processing. Prentice-Hall, Englewood Cliffs, NJ (1984) 3. Jonscher, M., Seiler, J., Lanz, D., Sch¨oberl, M., B¨atz, M., Kaup, A.: Dynamic non-regular sampling sensor using frequency selective reconstruction. IEEE Transactions on Circuits and Systems for Video Technology 29(10), 2859–2872 (2019). https://doi.org/10.1109/TCSVT.2018.2876653 4. MacKay, D.J.: Information Theory, Inference,and Learning Algorithms. Cambridge University Press, Cambridge, UK (2003) 5. Papoulis, A.: Systems and Transforms with Applications in Optics. McGrawHill Book Company, New York (1968) 6. Schroeder, H., Blume, H.: One- and Multidimensional Signal Processing. Algorithms and Applications in Image Processing. Wiley, Chichester (2000) 7. Seiler, J., Jonscher, M., Ussmueller, T., Kaup, A.: Increasing imaging resolution by non-regular sampling and joint sparse deconvolution and extrapolation. IEEE Transactions on Circuits and Systems for Video Technology 29(2), 308– 322 (2019). https://doi.org/10.1109/TCSVT.2018.2796725 8. Smirnov, A.: Processing of Multidimensional Signals. Digital signal processing. Springer, Berlin (1999) 9. Woods, J.: Multidimensional Signal, Image, and Video Processing and Coding, 2 edn. Elsevier, Amsterdam (2012)
Chapter 8
Discrete Multidimensional Systems
Multidimensional systems with discrete variables arise whenever signals with more than one discrete independent variable are processed. These independent variables may be the discrete time variable, discrete space variables, or other variables which evolve in a discrete fashion. An important example for the latter are iterative algorithms which approximate an exact value in a series of steps. The counter for these steps constitutes a discrete variable. Linear discrete multidimensional systems are classified in the same way as onedimensional systems: there are finite impulse response systems (FIR-systems) and infinite impulse response systems (IIR-systems). FIR-systems are widely applied in image processing and in processing of sampled volume data. They process an output signal by convolution of the input signal with an impulse response. Therefore also the relations between the layers of a convolutional neural network constitute 2D FIR-systems. IIR-systems in multiple dimensions are more complex to handle than in one dimension. The reason is that multidimensional IIR-systems cannot be conveniently described by pole-zero diagrams. Nevertheless, there are multidimensional algorithms which qualify as IIR-systems. Sections 8.1 and 8.2 present discrete FIR- and IIR-systems in the fashion of classical filter theory. However, discrete multidimensional systems arise also in the numerical solution of large systems of linear equations. Therefore, Sect. 8.3 describes how these systems result from the discretization of differential equations. Then Sect. 8.4 discusses the iterative solution of large systems of linear equations. From a multidimensional systems perspective, this problem is described by IIR-systems with the iteration count as an additional discrete variable. The result are systems similar to the one presented in Sect. 2.7.3.1. Finally, Sect. 8.5 considers the relations between multidimensional systems and multi-input, multi-output systems (MIMO systems).
© Springer Nature Switzerland AG 2023 R. Rabenstein, M. Sch¨afer, Multidimensional Signals and Systems, https://doi.org/10.1007/978-3-031-26514-3 8
253
254
8 Discrete Multidimensional Systems
8.1 Discrete Finite Impulse Response Systems FIR-systems are defined by a convolution with an impulse response as introduced in Sects. 8.1.1 and 8.1.2. The order of the computations is discussed in Sect. 8.1.3, mainly to prepare for the same issue with IIR-systems. Sections 8.1.4 and 8.1.5 present typical 2D FIR systems along with their transfer functions. Finally, the stability of FIR systems is shortly investigated in Sect. 8.1.6, again as a preparation for IIR systems where stability is a severe issue.
8.1.1 Discrete Convolution Discrete-time one-dimensional systems are determined by a discrete-time convolution as in Eq. (3.4) in Sect. 3.1 or Eq. (5.23) in Sect. 5.2. If the impulse response h(k) is of finite length, say K values, then the system is called a finite impulse response system or an FIR system with input u(k) and output v(k) v(k) =
K−1
.
u(k − κ) h(κ) .
(8.1)
κ=0
In the same way, discrete 2D finite impulse response systems are defined by a 2D convolution similar to Eq. (5.27) in Sect. 5.2.2.2. The convolution partners are here the input signal u(m, n) and the impulse response h(m, n) with finite extension in both indices v(m, n) =
μ2 ν2
.
u(m − μ, n − ν) h(μ, ν) = h(m, n) ∗ ∗ u(m, n) .
(8.2)
μ=μ1 ν=ν1
The definition can be extended to multiple dimensions by using vector notation .v(n) = u(ν)h(n − ν) , (8.3) ν∈I
where n and ν are vectors of arbitrary length and the index set I is a finite set of vectors. For example in (8.2), the dimension is 2 and the corresponding index set is μ . .I = ν= ≤ μ ≤ μ ∧ ν ≤ ν ≤ ν (8.4) μ 2 1 2 ν 1 It is convenient to use index sets with symmetric extension around the origin, i.e., μ
. 1/2
= ±M,
ν1/2 = ±N
such that
−M ≤ μ ≤ M,
−N ≤ ν ≤ N .
(8.5)
The corresponding values of h(μ, ν) can be arranged in a rectangular fashion of size (2M + 1) × (2N + 1) which is called a mask of coefficients, see Fig. 8.1 and
8.1 Discrete Finite Impulse Response Systems
255
Example 5.5 in Sect. 5.2.2.3. Note that symmetry of the shape of the index set does not imply symmetry of the coefficient values. Fig. 8.1 Values h(μ, ν) of a 2D impulse response with an index set according to (8.5) arranged as a mask of coefficients
h(−M, N)
...
.. . h( M,
h(M, N) .. .
N)
. . . h(M,
N)
8.1.2 Definition of 2D FIR Systems An input-output system according to Eq. (2.3) in Sect. 2.7.1 is called a two-dimensional finite impulse response system or 2D FIR system, if its input u(m, n) and its output v(m, n) are related by Eq. (8.2). This definition extends to higher dimensions with the vector formulation (8.3) of the input-output relation. Figure 8.2 shows a graphical rendering of a 2D FIR system. The impulse response is given by a mask of size 3 × 3, i.e., M = N = 1 in (8.5) and in Fig. 8.1. The extension of the mask for a fixed index pair (μ, ν) is indicated by gray color. The mask covers nine values of the input u(m, n). These values are multiplied by the respective values of the impulse response h(m, n) and result in one output value v(m, n). As the indices (μ, ν) vary, the mask slides over the input u(m, n) and produces more output values v(m, n). This process has already been described in detail in Example 5.5 in Sect. 5.2.2.3. n
u(m, n)
n
v(m, n)
∗ ∗ h(m, n)
m
m
Fig. 8.2 2D finite impulse response system. Black dots: positions with known input values or already computed output values. White dots: positions where the output values are not yet computed
256
8 Discrete Multidimensional Systems
8.1.3 Order of Computations The order of the computation of the output values in Fig. 8.2 is indicated by black and white dots. The black dots for v(m, n) denote the output values that are already computed while the white dots denote the values that yet need to be computed. Obviously, the convolution with the impulse response h(m, n) in Fig. 8.2 proceeds in a row-wise fashion, computing on row from left to right and then proceeding to the row above. However, 2D FIR filtering does not require a specific order for the computation of v(m, n). Other possibilities are shown in Fig. 8.3. The horizontal order (row by row) corresponds to Fig. 8.2. Proceeding column by column leads to a vertical order, but also a computation along the diagonals is possible. If FIR filtering is used in conjunction with subsampling then checkerboard ordering (also called red-black ordering) is suitable where only half of the values are computed.
horizontal
vertical
diagonal
checkerboard
Fig. 8.3 Examples of the order of computations for 2D FIR filtering
8.1.4 Typical 2D FIR Systems Due to its simple structure, FIR filtering in two and three dimensions is widely used in image processing and in machine learning. Therefore it is worthwhile to devote some attention to the ensuing numerical expense. As an example, consider FIR filtering of an image with ten megapixels, i.e., the input signal u(m, n) in Fig. 8.2 has 107 scalar values. An output image v(m, n) of the same size requires the evaluation of Eq. (8.2) or (8.3) also for 107 values. If the index set describes a symmetric mask then the total number of multiplications in (8.2) is equal to (2M + 1) × (2N + 1) × 107 , i.e., 9 × 107 for the 3 × 3-mask in Fig. 8.2. This number grows quickly for masks of size 5 × 5 and more. If the index set in (8.3) covers the complete input image then 1014 multiplications were required for one single output image. Since FIR filtering steps occur repeatedly, e.g., for each frame of a video sequence or for each layer in a deep convolutional neural network, the numerical expense for each step must be kept at a minimum. Therefore the mask
8.1 Discrete Finite Impulse Response Systems
257
sizes are usually restricted to typical sizes of 3 × 3 or 5 × 5. Figure 8.4 shows some widely used masks of size 3 × 3 which serve different purposes. 1 16 1 8 1 16
1 8 1 4 1 8
1 16 1 8 1 16
smoothing
1 8 1 − 4 1 − 8 −
0 0 0
1 8 1 4 1 8
differentiation x
1 8
1 4
1 8
0
0
0
−
1 8
−
1 4
−
1 8
differentiation y
0 1 4 0
1 4 −1 1 4
0 1 4 0
Laplace operator
Fig. 8.4 Some popular 3 × 3-masks for different purposes: smoothing, numerical differentiation in horizontal x and vertical y direction, second order numerical differentiation in both directions
Smoothing The mask for smoothing (see left hand side of Fig. 8.4) calculates a weighted mean of the center pixel and the eight surrounding pixels. It can be thought of as a coarse approximation of the Gaussian function from Figs. 6.3 and 6.4. The weights are chosen such that a constant input function u(m, n) = const passes the FIR system unaltered or, in other words, that the DC amplification is unity. Smoothing for a subset of grid points in checkerboard ordering is an efficient way for low pass filtering and subsampling in 2D filterbanks. Differentiation in x-Direction The mask for numerical differentiation in x-direction (see Fig. 8.4) is a combination of differentiation and smoothing. In horizontal direction, it implements an approximation of the partial derivative ∂ u[m + 1, n] − u[m − 1, n] . u(x, y) x = mX ≈ (8.6) .X 2 ∂x y = nY The notation u[m, n] denotes the value of u(x, y) on the grid x = mX, y = nY. It is used whenever a clear distinction between continuous variables and discrete variables is necessary. The difference of two step sizes between u[m−1, n] and u[m+ 1, n] ensures that the resulting approximation of the first derivative in horizontal direction is valid for the center grid point m. In vertical direction, smoothing is applied where again the weights are chosen for a DC amplification of unity. Differentiation in y-Direction The mask for numerical differentiation in y-direction (see Fig. 8.4) implements the approximate differentiation in vertical direction ∂ u[m, n + 1] − u[m, n − 1] , u(x, y) (8.7) .Y ≈ x = mX 2 ∂y y = nY and applies smoothing in horizontal direction. These discrete differentiation masks are called Sobel operators. They are well suited for the detection of edges in images
258
8 Discrete Multidimensional Systems
as already demonstrated in Fig. 5.13. There exist various forms of Sobel operators which may differ by a constant. Discrete Laplace Operator The discrete Laplace operator (see Fig. 8.4) results from the superposition of finite difference approximations to the second order partial derivatives in the continuous Laplace operator (6.129). For equal step sizes X = Y holds 2 ∂ ∂2 2 u(x, y) .X + x = mX ∂x2 ∂y2 y = nX 1
≈ u[m + 1, n] + u[m − 1, n] + u[m, n + 1] + u[m, n − 1] − u[m, n] . (8.8) 4 Besides approximating the continuous Laplace operator, the discrete Laplace operator is used in image processing for finding one-pixel spots. Note that these masks are applied to the input signal by convolution, i.e., sign inversion and shift in both direction applies as explained in Fig. 5.7 in Sect. 5.2.2.3.
8.1.5 Transfer Functions of 2D FIR Systems Transfer functions of 2D FIR systems are formulated as the 2D z-transformation or as the 2D Discrete-Time Fourier Transformation (2D DTFT) of the impulse response h(m, n). From Eq. (6.207) in Sect. 6.7 follows the transfer function of a 2D FIR system as a direct equivalent to the 1D case .
H(z1 , z2 ) = Z2D {h(m, n)} =
∞
∞
−n h(m, n) z−m 1 z2 .
(8.9)
m=−∞ n=−∞
Evaluating the transfer function H(z1 , z2 ) on the unit circles of both the complex z1 and the z2 -domain gives the 2D DTFT of the impulse response z = eiK1 ,
. 1
z2 = eiK2 ,
H(eiK1 , eiK2 ) = DTFT2D {h(m, n)} ,
(8.10)
where K1 and K2 are real-valued frequency variables, see Eq. (6.205) in Sect. 6.7. The calculation of 2D transfer functions is reduced to the calculation of a product of 1D transfer functions if the impulse response is separable. From (6.208) follows that also the transfer function H(z1 , z2 ) is separable into the 1D transfer functions H1 (z1 ) and H2 (z2 ) .
H(z1 , z2 ) = H1 (z1 )H2 (z2 ),
H1 (z1 ) = Z1D {h1 (m)},
H2 (z2 ) = Z1D {h2 (n)} . (8.11)
In this case, also the 2D DTFT is separable into two 1D DTFTs
8.1 Discrete Finite Impulse Response Systems .
H(eiK1 , eiK2 ) = H1 (eiK1 ) H2 (eiK2 ) .
259
(8.12)
The behaviour of the 2D Laplace transform as a function of two complex variables is not very comprehensible. However, the 2D DTFT allows for more intuitive graphical representations, as shown here by Examples 8.1 to 8.3. Example 8.1 (Smoothing Operator). The smoothing operator from Fig. 8.4 is separable h(m, n) = h1 (m)h2 (n) with ⎧ ⎧ 1 1 ⎪ ⎪ ⎪ ⎪ m = 0, n = 0, ⎪ ⎪ 2 ⎪ ⎪ ⎪ ⎪ ⎨ 21 ⎨1 (8.13) .h1 (m) = ⎪ h2 (n) = ⎪ m = ±1, n = ±1, ⎪ ⎪ 4 4 ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎩0 else. ⎩0 else, Inserting (8.13) into (8.9) gives the transfer function of the smoothing operator. However, the matrix notation in Fig. 8.4 suggests a more elegant formulation than the case distinctions in (8.13). The matrix arrangement of the smoothing operator in Fig. 8.4 results as the dyadic product of two vectors where the row vector hT1 contains the values of h1 (m) for horizontal smoothing and the column vector h2 contains the values of h2 (n) for vertical smoothing ⎡ ⎤ ⎡ ⎤ ⎡ ⎡ ⎤ ⎤ ⎢1⎥ ⎢1 ⎥ ⎢1 2 1⎥⎥⎥ ⎢1⎥ 1 ⎢⎢⎢⎢ ⎥⎥⎥⎥ 1 ⎢⎢⎢⎢ ⎥⎥⎥⎥ 1 ⎢⎢⎢⎢ ⎥⎥⎥⎥ 1 ⎢⎢⎢⎢ T ⎢⎢2⎥⎥ 1 2 1 = ⎢⎢2 4 2⎥⎥⎥⎥⎥ . (8.14) ⎢⎢2⎥⎥ , h2 = ⎢⎢⎢2⎥⎥⎥ , H = h2 h1 = .h1 = ⎦ 4⎣ ⎦ 16 ⎢⎣ ⎥⎦ 16 ⎢⎣ 4 ⎢⎣ ⎥⎦ 1 1 121 1 The matrix H contains the values of the smoothing operator and its separability is expressed by the dyadic product of h1 and h2 . The 1D transfer function H1 (z1 ) resp. H1 (eiK1 ) associated with h1 is −1 1 1 . H1 (z1 ) = H1 (eiK1 ) = 12 (1 + cos K1 )) , (8.15) 2 1 + 2 (z1 + z1 ) , and similar for H2 (eiK2 ) such that .
H(eiK1 , eiK2 ) =
1 (1 + cos K1 )(1 + cos K2 ) . 4
(8.16)
Figure 8.5 shows the real-valued 2D transfer function H(eiK1 , eiK2 ) in the range −π ≤ K1 ≤ π and −π ≤ K2 ≤ π. The minimum and maximum values of K1 and K2 correspond to half the sampling frequency in each direction. The transfer function H(eiK1 , eiK2 ) is zero at these frequencies while at K1 = 0 and K2 = 0 the transfer function is H(1, 1) = 1. This behaviour of the 2D transfer function discloses the 2D low-pass characteristic of the smoothing operator. Note also the similarity of Fig. 8.5 with the 2D Fourier transform of the Gaussian impulse in Fig. 6.3. Although not exactly Gaussian, the low-pass characteristic of the smoothing operator is a reasonably well approximation.
260
8 Discrete Multidimensional Systems
1 0.5 0 −1 0 1 −1 −0.5
K1
0.5
0
1
K2
Fig. 8.5 Transfer function H of the smoothing filter from Fig. 8.4 according to Eq. (8.16). The axes for K1 and K2 are labelled in multiples of π. For colorbar information, see Fig. 8.7
1 0 −1 −1 0 K1
1 −1 −0.5
0.5
0
1
K2
Fig. 8.6 Imaginary part of the transfer function H for differentiation in y-direction from Fig. 8.4 according to Eq. (8.18). The axes for K1 and K2 are labelled in multiples of π. For colorbar information, see Fig. 8.7
Example 8.2 (Differentiation in y-Direction). The impulse response for a 2D FIR system performing differentiation in y-direction (see Fig. 8.4) is expressed in the notation from Example 8.1 ⎡ ⎤ ⎡ ⎤ ⎡ ⎡ ⎤ ⎤ ⎢1⎥ ⎢1⎥ ⎢1⎥ 1 ⎢⎢⎢⎢ 1 2 1 ⎥⎥⎥⎥ 1 ⎢⎢⎢⎢ ⎥⎥⎥⎥ 1 ⎢⎢⎢⎢ ⎥⎥⎥⎥ 1 ⎢⎢⎢⎢ ⎥⎥⎥⎥ T ⎢⎢2⎥⎥ , h2 = ⎢⎢⎢ 0 ⎥⎥⎥ , H = h2 h1 = ⎢⎢⎢ 0 ⎥⎥⎥ 1 2 1 = ⎢⎢⎢⎢ 0 0 0 ⎥⎥⎥⎥ . .h1 = ⎦ 2⎣ ⎦ 8⎣ ⎦ 8⎣ 4 ⎢⎣ ⎥⎦ −1 −1 −1 −2 −1 1 The vector h1 represents the same 1D smoothing operator as in (8.14) and its 1D transfer function is given by (8.15). For h2 follows the 1D transfer function which is purely imaginary on the unit circle z2 = eiK2
8.1 Discrete Finite Impulse Response Systems .
H2 (z2 ) =
261
1 (z2 − z−1 2 ), 2
H2 (eiK2 ) = i sin(K2 ) ,
(8.17)
such that the transfer function results in .
1 H(eiK1 , eiK2 ) = i sin(K2 )(1 + cos(K1 )) . 2
(8.18)
The imaginary part of H(eiK1 , eiK2 ) is plotted in Fig. 8.6. The approximative behaviour as a differentiation operator in vertical direction becomes obvious for K1 = 0 and for small values of |K2 | π .
H(1, eiK2 ) = i sin(K2 ) ≈ iK2 ,
(8.19)
which is almost identical to the factor iky in Eq. (6.127) in Sect. 6.3.6. The difference is that here the vertical axis and the frequency variable K2 are normalized w.r.t. the spatial step size. On the other hand, this systems fails to differentiate for 12 π ≤ K2 ≤ π. That would be asking too much for a 3 × 3-mask and requires impulse responses with more non-zero entries. Example 8.3 (Laplace Operator). The mask of the Laplace operator in Fig. 8.4 is not separable. But since the continuous Laplace operator consists of the sum of two partial derivatives, also the discrete Laplace operator is expressed by the sum of differentiation in horizontal and in vertical direction, see (8.8). However, in contrast to the dyadic product h2 hT1 , there is no sum of h2 and hT1 . Both vectors have to be expanded to 3×3-matrices before they can be added. This expansion of a vector into a matrix is accomplished by the Kronecker product ⊗ with the three-dimensional unit vector e2 . It serves to address the central row or column in a 3 × 3-matrix ⎡ ⎤ ⎡ ⎤ ⎢⎢⎢0⎥⎥⎥ ⎢1⎥ 1 ⎢⎢⎢⎢ ⎥⎥⎥⎥ ⎢ ⎥ (8.20) .h1 = h2 = e2 = ⎢⎢⎢⎢1⎥⎥⎥⎥ . ⎢⎢⎢⎣−2⎥⎥⎥⎦ , ⎣ ⎦ 4 0 1 The matrix H with the mask values as elements follows from ⎡ ⎡ ⎡ ⎤ ⎤ ⎤ ⎢⎢⎢0 0 0⎥⎥⎥ ⎢⎢⎢0 1 0⎥⎥⎥ ⎢⎢⎢0 1 0⎥⎥⎥ 1 1 1 T ⎢⎢⎢⎢0 −2 0⎥⎥⎥⎥ + ⎢⎢⎢⎢1 −2 1⎥⎥⎥⎥ = ⎢⎢⎢⎢1 −4 1⎥⎥⎥⎥ . .H = h2 ⊗ e2 + (h1 ⊗ e2 ) = ⎥⎦ 4 ⎢⎣ ⎥⎦ ⎥⎦ 4 ⎢⎣ 4 ⎢⎣ 0 0 0 0 1 0 0 1 0
(8.21)
Nevertheless, the vectors h1 and h2 define two 1D transfer functions H1 (z1 ) and H2 (z2 ) −1 1 . H1 (z1 ) = (8.22) H2 (z2 ) = 14 −2 + (z2 + z−1 4 −2 + (z1 + z1 ) , 2 ) , and the corresponding 2D DTFT .
H(eiK1 , eiK2 ) = H1 (eiK1 ) + H2 (eiK2 ) =
1 2
(−1 + cos K1 ) + 12 (−1 + cos K2 )
= −1 + 12 (cos K1 + cos K2 ) .
(8.23)
262
8 Discrete Multidimensional Systems
How is this result for the discrete Laplace operator related to the continuous Laplace operator in Sect. 6.3.6? The effect of the continuous Laplace operator is ˜ x , ky ) by rewriting Eq. (6.129) or (6.136) as described by a transfer function H(k ˜ x , ky )F(k x , ky ), F2D { f (x, y)} = H(k
˜ x , ky ) = −(k2x + ky2 ) . H(k
.
(8.24)
˜ x , ky ) is not periodic. Thus a Note that H(eiK1 , eiK2 ) is periodic in K1 and K2 while H(k similar behaviour can only be expected for low frequencies, similar to Example 8.2. However for |K1 | π and |K2 | π, the transfer function H(eiK1 , eiK2 ) is well approximated by its truncated 2D Taylor series expansion .
ˆ iK1 , eiK2 ) = − 1 (K12 + K22 ) = X 2 H(k ˜ x , ky ) , H(e 4
(8.25)
see the definition of the DTFT in (6.205) and the Fourier transform of sampled signals in (7.5). Figure 8.7 shows a plot of H(eiK1 , eiK2 ) and its Taylor approximation ˆ iK1 , eiK2 ). Both functions agree well for small values |K1 | π and |K2 | π. H(e
1 0
0.5
−1 −2
0
−3
−0.5
−1
0 K1
1 −1 −0.5
0.5
0
1
−1
K2
Fig. 8.7 Saturated color: transfer function H of the discrete Laplace operator from Fig. 8.4 according to Eq. (8.23). The axes for K1 and K2 are labelled in multiples of π. Faint color: transfer function Hˆ of the Taylor approximation (8.25)
8.1.6 Stability of 2D FIR Systems Stability is a crucial property of one- and multidimensional systems. A prevalent definition of stability requires that a bounded input signal warrants a bounded output signal (so-called BIBO-stability). This condition can be easily checked if the individual values of the impulse response are available, see e.g. [14]. Since 2D FIRsystems are characterized by a finite number of non-zero entries in the impulse response according to (8.2), the sum of their absolute values is always finite
8.2 Discrete Infinite Impulse Response Systems μ2 ν2 .
|h(μ, ν)| < ∞ .
263
(8.26)
μ=μ1 ν=ν1
This result holds also for higher dimensions and non-rectangular masks according to (8.3), as long as the index set I is finite. In short, FIR-systems in one and more dimensions are always stable. This property is extremely useful for practical applications, since the numerical values of the impulse response are not always known beforehand. For example, in convolutional neural networks, the elements of the impulse response in each layer are determined in the course of learning processes. Independent of the success of the learning procedure and the quality of the final result, each convolution leads to a bounded result.
8.2 Discrete Infinite Impulse Response Systems In 1D signal processing, infinite impulse response systems (IIR-systems), also called recursive systems, offer certain advantages over 1D FIR systems. Compared to an FIR-system, the recursion of already computed values may achieve the same effect (e.g. filtering) with fewer numerical operations per output sample. On the downside, the inherent feedback in IIR-systems raises issues of computability and stability. Computability means that output values have to be already computed in previous steps before they can be used for recursion. In other words, the signal flow graph must not contain any delay-free loops. This condition is met for time-domain filtering if only past output values are fed back. The feedback path may also be detrimental to the stability of the system. However, there is a well-defined stability condition and powerful stability checks. The stability condition requires that the poles of the transfer function lie within the unit circle of the complex plane. Available stability checks verify this condition without explicitly computing the complex-valued poles. Both, computability and stability become more involved issues in two and more dimensions. These difficulties are now explored. First the definition of 2D IIR systems is presented. Then computability in two dimensions is addressed by a few examples and finally stability is discussed.
8.2.1 Definition of 2D IIR Systems The definition of a 2D IIR-system in Eq. (8.27) contains two masks a(m, n) and b(m, n) rather than one for FIR-systems in (8.2). However, neither a(m, n) or b(m, n) are equal to the impulse response which turns out to be of infinite extent
264
8 Discrete Multidimensional Systems
v(m, . n) =
μ2 ν2
a(μ, ν) u(m−μ, n−ν) −
μ2 ν2
b(μ, ν) v(m−μ, n−ν) .
(8.27)
μ=μ1 ν=ν1 μm νn
μ=μ1 ν=ν1
The mask a(m, n) acts on the input values u(m, n), much like the impulse response h(m, n) for FIR-systems in (8.2). The mask b(m, n) acts on the output values v(m, n) of the IIR-system and feeds them back to the current output. This second double sum in (8.27) constitutes the recursive property by creating a feedback path. It has to be investigated for computability and stability. Figure 8.8 shows a block diagram of the system defined by (8.27). The current output value v(m, n) is computed from the input u(m, n) by convolution with the mask a(m, n) and from previously computed output values by convolution with the mask b(m, n). Note that the currently computed output value v(m, n) must not be covered by the mask b(m, n). Otherwise a delay-free loop would result, rendering the algorithm not computable. This condition is reflected in (8.27) by excluding the indices μ = m and ν = n from the second summation. It is, however, not enough to exclude just the point (m, n) from the second sum in (8.27). In general, the more restrictions are put on the mask b(m, n), the more freedom is gained for the order of computations. n
u(m, n)
n
∗ ∗ a(m, n)
m
+
−
v(m, n)
∗ ∗ b(m, n)
m
Fig. 8.8 2D infinite impulse response system. Black dots: positions with known input values or already computed output values. White dots: positions where the output values are not yet computed
8.2.2 Order of Computations Section 8.1.3 has shown that 2D FIR filtering does not require a specific order for the computation of the output from the input values, since no previously computed output values are re-used (see Figs. 8.2 and 8.3). IIR-systems enjoy the same freedom only for the mask a(m, n) for the input values. However, the mask b(m, n) has to be carefully designed to ensure computability. The ensuing restrictions are illustrated by instructive figures below, no theoretical discussion of computability is intended.
8.2 Discrete Infinite Impulse Response Systems
265
8.2.2.1 Quarter-Plane and Half-Plane Filters The second summation in Eq. (8.27) performs a discrete convolution of b(m, n) with the already computed output values v(m, n). The assignment b(m − μ, n − ν) v(μ, ν) is calculated as explained in Fig. 5.7 in Sect. 5.2.2.3 for f (m, n) ∗ ∗g(m, n). The sign change of μ and ν mirrors the mask for b(μ, ν) at both axes as b(−μ, −ν) and a shift by m and n positions as b(m − μ, n − ν). The shape of b(m, n) must ensure that not only the current output value v(m, n) is computable but also all further ones for a certain order of the computations. These restrictions are met by two generic shapes of b(m, n) as shown in Fig. 8.9. The quarter-plane filter or one-quadrant filter is characterized by non-zero values of b(m, n) only in the upper right quadrant of the (m.n)-plane, i.e., 0 ≤ m ≤ mmax , 0 ≤ n ≤ nmax , however with b(0, 0) = 0. The maximum values mmax and nmax are restricted in practice to a low number (see Fig. 8.9). The half-plane filter or two-quadrant filter is characterized by non-zero values of b(m, n) only in the upper half of the (m.n)-plane, i.e., mmin ≤ m ≤ mmax , 0 ≤ n ≤ nmax , however with b(m, 0) = 0 for m ≤ 0. The maximum and minimum values are restricted to low numbers. n
n
m quarter-plane filter
m half-plane filter
Fig. 8.9 Left: quarter-plane filter with nonzero values for m ≥ 0, n ≥ 0, and m = n = 0 excluded. Right: half-plane filter with nonzero values for n ≥ 0, and n = 0 for m ≤ 0 excluded
8.2.2.2 Order of Computations for Quarter-Plane Filters Some orders of computation for quarter-plane filters are shown in Fig. 8.10. As in Fig. 8.3, black dots for v(m, n) denote the output values that are already computed while the white dots denote the values that yet need to be computed. The shape of b(m−μ, n−ν) allows to proceed to the right in horizontal direction and then upward to the next row. It is also possible to proceed upward in vertical direction and then to next column on the right. Also working along the diagonals is possible. In all of these cases, the new value of v(m, n) at μ = m and ν = n is computed from known output values. However, it is not possible to proceed horizontally to the left, because then some mask elements are connected to output values that have not yet been computed.
266
8 Discrete Multidimensional Systems
This situation is apparent in Fig. 8.10 through the white dots inside the mask near the tip of the flash. These examples show that quarter-plane filters allow a choice of different processing directions, however with some restrictions.
horizontal right
vertical up
diagonal
horizontal left
Fig. 8.10 Different orders of computation for a quarter-plane filter b(m − μ, n − ν). Black dots: output values that are already computed. White dots: output values that yet need to be computed. Black arrows: direction of the order of computation
8.2.2.3 Order of Computations for Half-Plane Filters Figure 8.11 shows different orders of computation for a half-plane filter. Moving horizontally to the right and then up whenever a row is completed ensures that only output values from previously computed rows and past values from the current row are used. However moving the same mask vertically upward would require yet unknown values, indicated by white dots inside the mask near the tip of the flash. The same situation occurs for a diagonal order or for movement horizontally to the left.
horizontal right
vertical up
diagonal
horizontal left
Fig. 8.11 Different orders of computation for a half-plane filter b(m − μ, n − ν). Dots and arrows as in Fig. 8.10
These examples show that the choice of processing directions is quite restricted for half-plane filters. Nevertheless, it would be possible to redefine the half-plane filter from Fig. 8.9 to occupy the right half-plane. Then a processing order vertical upward would be possible but no more in horizontal direction. Such a interdependence between mask design and processing order is not attractive from a systems design point of view.
8.2 Discrete Infinite Impulse Response Systems
267
8.2.3 Transfer Functions of 2D IIR Systems Similar to 2D FIR-systems in Sect. 8.1.5, also 2D IIR-systems possess transfer function representations. However, Eq. (8.9) cannot be applied directly, unless the— infinitely extended—impulse response is known beforehand. However, the impulse response of an IIR-system does not follow easily from its definition (8.27). Therefore, the transfer function has to be derived by application of the z-transformation to (8.27). This procedure is a direct extension from the 1D case as shown here by Example 8.4. Example 8.4 (Transfer Function of a 2D IIR-System). This example calculates the transfer function of a simple IIR-system already considered in [7]. It is given by the difference equation v(m, n) = u(m, n) + b1 u(m − 1, n) + b2 u(m, n − 2) ,
(8.28)
.
which can be cast into the form of Eq. (8.27) and Fig. 8.9 with ⎧ ⎪ ⎪ ⎨1 .a(m, n) = ⎪ ⎪ ⎩0
m = n = 0, else,
⎧ ⎪ ⎪ b1 ⎪ ⎪ ⎪ ⎨ b(m, n) = ⎪ b 2 ⎪ ⎪ ⎪ ⎪ ⎩0
n m = 1, n = 0, m = 0, n = 2, else.
b2 0 0 0 b1
m
Obviously, the mask b(m, n) for the recursion of the output signal is a special case of a quarter-plane filter. Application of the z-transformation from Sect. 6.7.2 to (8.28) and observing the shift theorems (6.210) and (6.211) term by term gives −2 V(z1 , z2 ) = U(z1 , z2 ) + b1 z−1 1 V(z1 , z2 ) + b2 z2 V(z1 , z2 ) .
.
(8.29)
This algebraic equation is easily solved for the z-transform of the output V(z1 , z2 ) in relation to the z-transform of the input U(z1 , z2 ) and gives the transfer function .
H(z1 , z2 ) =
z1 z22 1 V(z1 , z2 ) = = , U(z1 , z2 ) 1 − b1 z−1 − b2 z−2 z1 z22 − b1 z22 − b2 z1 1 2
which is a function of two complex variables z1 and z2 .
(8.30)
8.2.4 Stability of 2D IIR Systems The investigation of the stability of 1D IIR systems involves the factorization of the denominator polynomial of arbitrary degree N into the product of its zeros
268
8 Discrete Multidimensional Systems
.
D(z) =
N
bn zn = bN
n=0
N
(z − z∞n ) ,
(8.31)
n=1
where multiple zeros are counted according to their multiplicity. Such a decomposition is always possible according to the fundamental theorem of algebra [6]. The factorization of the denominator polynomial allows the expansion of the transfer function into a sum of partial fractions and the term-wise determination of the impulse response. Investigating these individual terms leads to the stability condition that all zeros z∞n of the denominator must lie within or on the unit circle. Several criteria exist, which allow to check this condition without actually calculating the real- and complex-valued roots. Unfortunately, this concept is not applicable to two- or higher dimensional systems, since there are no distinct poles as for one-dimensional systems. Consequently, there is no equivalent for the fundamental theorem of algebra in two and more dimensions, as already noted in Sect. 6.7.2.2. For example consider the denominator polynomial D(z1 , z2 ) from (8.30) .
D(z1 , z2 ) = z1 z22 − b1 z22 − b2 z1 .
(8.32)
Inspite of its simplicity, it cannot be represented as product of terms like (z1 − z1∞ ) and (z2 − z2∞ ). The sad consequence is that the elegant stability theory from 1D systems does not carry over to 2D or 3D. As a result, stability theory in the 2D case is much more involved than for 1D systems. There are various methods for testing the stability of 2D IIR systems, but they are all more or less tedious to perform. Indeed, research on the stability of multidimensional systems is still going on [2], but the fundamental problems remain.
8.2.5 Application of 2D IIR Systems The restrictions of 2D IIR systems regarding the order of the computations discussed in Sect. 8.2.2 and the stability issues from Sect. 8.2.4 are a drawback for the practical application of IIR systems. In fact, the use of 2D IIR systems is less widespread than 2D FIR systems. The reason for the above problems with 2D IIR systems may be found in the brute force generalization from one to two dimensions. The assumption of an unbounded axis is quite reasonable in 1D time domain filtering, since, e.g., all kinds of sensor signals do not have a determined end point. The same assumption in 2D is questionable, since images are spatially restricted and spatially distributed physical processes are often subject to boundary conditions around a finite spatial domain. Insisting on two unbounded spatial axes as, e.g., in Fig. 8.8 may call for more problems than necessary. Indeed, system models which consider boundary conditions lead to multidimensional systems with IIR characteristic that are more well-behaved in terms of com-
8.3 Discretization of Differential Equations
269
putability and stability. A large and important subclass are multidimensional systems which result from the discretization of partial differential equations by finite difference methods. These discretization methods are conceptually quite simple but they lead to large systems of linear or nonlinear equations that are computationally heavy to solve.
8.3 Discretization of Differential Equations Many discrete multi-dimensional systems arise in the numerical solution of ordinary and partial differential equations. This section gives an overview on some approaches for approximating the solution of a linear differential equation by a numerical procedure. As diverse as these approaches may appear, they all reduce the problem to a system of linear equations. In general, the numerical solution of differential equations is a rocky road with many compromises regarding the quality of the results, the required numerical expense, and the robustness of the algorithm. Rather than exploring all possible pitfalls, a simple 1D example is discussed which shows only the essential steps. The following presentation serves various purposes: • It gives a bird’s eye view on various methods for the numerical approximation of differential equations. • It motivates the need for solving large systems of linear equations, which is a topic for discrete multidimensional systems in itself, see Sect. 8.4. • It prepares the ground for the treatment of continuous multidimensional systems in Chap. 9. The example discussed here is taylored to cover the above topics but is not intended to present the discretization of differential equations in any depth. More fundamental introductions are found, e.g., in [4, 16, 20, 25, 26, 33, 36].
8.3.1 Continuous Poisson Equation Consider a flexible string of length which is fixed at the end points. The material supports oscillations described by a spatially 1D wave equation for the deflection y(x, t) as a function of space x and time t .
1 ∂2 ∂2 y(x, t) − y(x, t) = q(x, t), v2 ∂t2 ∂x2
0 < x < ,
0≤t 2) Case. Lecture Notes in Control and Information Sciences. Springer (2001) 14. Girod, B., Rabenstein, R., Stenger, A.: Signals and Systems. Wiley, Chichester (2001) 15. Golub, G.H., Loan, C.F.V.: Matrix Computations, 4 edn. Johns Hopkins University Press, Baltimore, USA (2012) 16. Großmann, C., Roos, H.G.: Numerische Behandlung partieller Differentialgleichungen. Vieweg+Teubner Verlag, Wiesbaden (2005) 17. Hackbush, W.: Multi-Grid Methods and Applications. Springer, Berlin (1985) 18. Hjelmstad, K.D.: Fundamentals of Structural Dynamics. Springer Nature Switzerland (2022) 19. Jazar, R.N.: Perturbation Methods in Science and Engineering. Springer Nature Switzerland (2021) 20. Johnson, C.: Numerical Solution of Partial Differential Equations by the Finite Element Method. Cambridge University Press, Cambridge, UK (1995) 21. Kaczorek, T.: Two-Dimensional Linear Systems. Springer, Berlin (1985) 22. Kahl, K., Kintscher, N.: Automated local Fourier analysis (aLFA). BIT Numerical Mathematics 60(3), 651–686 (2020). https://doi.org/10.1007/ s10543-019-00797-w. https://www.scopus.com/inward/record.uri?eid=2-s2. 0-85078421494&doi=10.1007%2fs10543-019-00797-w&partnerID=40& md5=12ea31c1134446029db07b9c263ac636. All Open Access, Green Open Access, Hybrid Gold Open Access 23. MIT BCS Perceptual Science Group: Three frames of original flower garden sequence (MPEG suite). http://persci.mit.edu/demos/jwang/garden-layer/ orig-seq.html. Accessed on Apr. 25, 2022 ¨ A.: Weighted residual methods for finite elements. In: H. Altenbach, 24. Ochsner, ¨ A. Ochsner (eds.) Encyclopedia of Continuum Mechanics, pp. 2771–2786. Springer Berlin Heidelberg, Berlin, Heidelberg (2020). https://doi.org/10.1007/ 978-3-662-55771-6 20 25. Quarteroni, A.: Numerical Models for Differential Problems. Springer-Verlag Milan (2009) 26. Quarteroni, A., Sacco, R., Saleri, F.: Numerical Mathematics, 2 edn. Springer, Berlin (2007)
References
307
27. Roesser, R.: A discrete state-space model for linear image processing. IEEE Transactions on Automatic Control 20(1), 1–10 (1975). DOI 10.1109/TAC. 1975.1100844 28. Rogers, E., Galkowski, K., Owens, D.H.: Control Systems Theory and Applications for Linear Repetitive Processes, vol. 349. Springer (2007). https://eprints. soton.ac.uk/263634/ 29. Rogers, E., Galkowski, K., Paszke, W., Moore, K.L., Bauer, P.H., Hladowski, L., Dabkowski, P.: Multidimensional control systems: case studies in design and evaluation. Multidimensional Systems and Signal Processing 26(4), 895–939 (2015) 30. Saad, Y.: Iterative Methods for Sparse Linear Systems, 2 edn. Society for Industrial and Applied Mathematic (SIAM), Philadelphia, USA (2003) 31. Sch¨afer, M.: Computational Engineering – Introduction to Numerical Methods, 2 edn. Springer Nature Switzerland (2022) 32. Schroeder, H., Blume, H.: One- and Multidimensional Signal Processing. Algorithms and Applications in Image Processing. Wiley, Chichester (2000) 33. Strauss, W.A.: Partial Differential Equations. John Wiley & Sons (2008) 34. Traub, J.: Iterative Methods for the Solution of Equations, 2 edn. AMS/Chelsea Publication, Providence, RI, USA (1982) 35. Trottenberg, U., Oosterlee, C., Schuller, A.: Multigrid. Academic Press, San Diego (2001) 36. Tveito, A., Winther, R.: Introduction to Partial Differential Equations – A Computational Approach. Springer-Verlag, Berlin Heidelberg (2005) 37. Varga, R.S.: Matrix Iterative Analysis, 2 edn. Springer-Verlag, Berlin (2000) 38. Wang, J., Adelson, E.: Representing moving images with layers. IEEE Transactions on Image Processing 3(5), 625–638 (1994). https://doi.org/10.1109/83. 334981 39. Wienands, R., Joppich, W.: Practical Fourier Analysis for Multigrid Methods. Chapman & Hall/CRC, London, UK (2011) 40. Woods, J.: Multidimensional Signal, Image, and Video Processing and Coding, 2 edn. Elsevier, Amsterdam (2012)
Chapter 9
Continuous Multidimensional Systems
Multidimensional systems with continuous variables result from the description of natural or technical systems by differential equations. Whenever not only the time evolution but also the spatial extension has to be considered, then the mathematical description comprises partial derivatives w.r.t. time and also partial derivatives w.r.t. space. An example has been given already in Sect. 2.7.3.2, where Eqs. (2.14) to (2.16) represent a system which depends on time and on one space variable or, in short, a (1+1)D system. A further example is provided by Eqs. (2.17) to (2.19), representing a system depending on time and three space variables or a (3+1)D system. For example, also time- and space dependent systems with time delays are multidimensional systems. More abstract, any system with more than one continuous variable is a multidimensional system with continuous variables or, more concisely, a continuous multidimensional system. Nevertheless, the focus is here on systems described by partial differential equations with time and space derivatives, since they are an important subclass in technical applications. Section 9.1 reviews the concept of distributed parameter systems as opposed to lumped parameter systems. A rigorous mathematical description for distributed parameter systems is provided by partial differential equations. They are reviewed for the linear case and a scalar variable in Sect. 9.2 along with initial and boundary conditions. Section 9.3 discusses vector-valued variables and Sect. 9.4 compares scalar and vector-valued representations. An approach for a general solution is envisioned in Sect. 9.5 to be explored further in Chap. 10.
9.1 Distributed Parameter Systems A useful model of physical systems which depend on time and space are the socalled distributed parameter systems. They are introduced here by a simple technical example, which leads to a mathematical description in the form of a partial differential equation. © Springer Nature Switzerland AG 2023 R. Rabenstein, M. Sch¨afer, Multidimensional Signals and Systems, https://doi.org/10.1007/978-3-031-26514-3 9
309
310
9 Continuous Multidimensional Systems
9.1.1 Lumped and Distributed Parameter Systems Any natural or technical system exhibits a time-evolution and a spatial extension. The time-evolution is also called the dynamics of the system. Measurement values of its parameters and variables may differ at the same location at different points in time or at the same time at different locations. The system parameters are thus distributed over space, e.g., the mass of a ball is distributed over its volume or the resistance of a wire is distributed over its length. However, such a detailed analysis is not always necessary. Consider, for example, an electrical circuit for processing audio signals. Its output, when reproduced by a loudspeaker, is perceived by human listeners up to frequencies of about f0 = 20 kHz. Therefore the duration of the shortest audible period T 0 = f10 = 50 μs may be regarded as a lower limit for the time scale of interest in audio applications. Now assume that the circuit is small enough such that the propagation time of electrical pulses over its spatial extension is considerably less than T 0 . Then all electrical effects inside the circuit happen approximately at the same time in relation to T 0 . As a consequence the spatial extension of the circuit can be neglected. This is the usual assumption in conventional circuit analysis. These qualitative considerations are complemented by a quantitative investigation in Example 9.1 in Sect. 9.1.4. In a similar way, the analysis of mechanical constructions involving rigid bodies is often based on the assumption that the mass of a body is concentrated in its center of gravity as a so-called point mass. This assumption is just another way of neglecting spatial extension. Summarizing these deliberations, one may consider the spatial extension of a system and regard its parameters as distributed over space or, under suitable assumptions, neglect the spatial extension and consider its parameters as concentrated or lumped at certain positions. These cases are distinguished by the designations distributed parameter systems or systems with lumped parameters.
9.1.2 Electrical Transmission Line As a frequently used example [11], consider the electrical two-wire transmission line in Fig. 9.1. It shows two wires running in parallel, the terminals of each wire being separated by a distance . As a suitable coordinate system, the x-axis is aligned with the wires and the terminals are at x = 0 and x = . The two cylinder symbols in Fig. 9.1 indicate that each unit length of the wires carries a series inductance and a series resistance along the wires, as well as a parallel capacitance and a parallel conductance between the wires. These parameters, their symbols and their physical units are listed in Table 9.1. The physical units tell that they are distributed parameters, i.e., distributed along the length of the transmission line.
9.1 Distributed Parameter Systems
311
Table 9.1 Transmission line parameters for the distributed parameter system from Fig. 9.1 and the lumped parameter system from Fig. 9.2 Distributed Parameters Symbol Unit Lumped parameters Symbol Unit Series inductance l H m−1 Inductance L = l H Series resistance r Ω m−1 Resistance R = r Ω Parallel capacitance c F m−1 Capacitance C = c F Parallel conductance g S m−1 Conductance G = g S
For a short line, the lumped parameter approximation may be permissible. It follows from the distributed parameter model by concentrating each of the line parameters into one inductance, resistance, capacitance, or conductance, respectively. If the line parameters are constant along the line, then the lumped parameters follow from the distributed ones by multiplication with the length . Note here the notation l for the series inductance and for the length. The lumped parameters are also listed in Table 9.1. The voltage u(x, t) between the two wires varies along the line with the values u(0, t) on the left hand side and u(, t) on the right hand side. In the same way the current i(x, t) in the top wire varies between i(0, t) and i(, t). Figure 9.2 shows a resulting lumped parameter system. It is not unique, since the series and the parallel components could also be arranged in a different way. The terminals of the lumped parameters L, R, C, and G do not indicate any distance, however their connections define the topology of the circuit. Since there is no space dependency in a lumped parameter system, the voltage u0 (t) and the current i(t) at the left port and the voltage u (t) at the right port do not depend on a space variable, only the subscripts 0 and refer to the original transmission line. i(x, t) u(0, t) r, l, c, g
0
u(x, t)
u(ℓ, t)
r, l, c, g
ℓ
x
Fig. 9.1 Distributed parameter model of an electrical two-wire transmission line specified by its distributed line parameters r, l, c, g and described by the telegraphers equation
Although the lumped parameter model in Fig. 9.2 may only be a crude approximation to the actual behaviour of the distributed parameter system in Fig. 9.1, it serves well to derive the differential equations for the transmission line.
312
9 Continuous Multidimensional Systems
L
i(t)
R
u0 (t)
uℓ(t)
G
C
Fig. 9.2 Lumped parameter model of an electrical two-wire transmission line with concentrated circuit elements R, L, C, G from Table 9.1
9.1.3 Telegraph Equation The behaviour of the transmission line is ultimately governed by Maxwell’s equations. An easier access is provided by the lumped parameter model from Fig. 9.2. Although the approximation for a transmission line of an arbitrary length may be inaccurate, the quality of the approximation improves for shorter segments of length Δx . The idea is to represent the distributed parameter transmission line by an infinite sequence of arbitrary small segments of lumped parameter models. A small segment of the transmission line from Fig. 9.1 is shown in Fig. 9.3. The lumped parameter values correspond to Table 9.1 with the total length replaced by the length Δx of one segment. i(x, t) u(x, t)
l∆x M
r∆x c∆x
N
i(x + ∆x, t) g∆x
x
u(x + ∆x, t)
x + ∆x
Fig. 9.3 Small segment Δx of a transmission line
The analysis of the circuit from Fig. 9.3 follows the same steps already applied to the RC-circuit from Fig. 3.6. The mesh equation similar to Eq. (3.95) results in a balance equation for all the voltages in the mesh M u(x + Δx, t) − u(x, t) = −i(x, t) rΔx −
.
∂ i(x, t) lΔx , ∂t
(9.1)
and the node equation like Eq. (3.96) adds all currents into the node N up to zero i(x + Δx, t) − i(x, t) = −u(x, t) gΔx −
.
∂ u(x, t) cΔx . ∂t
(9.2)
9.1 Distributed Parameter Systems
313
Dividing both equations by Δx and performing the limit process Δx → 0 turns the left hand sides into the partial derivatives w.r.t x .
∂i(x, t) ∂u(x, t) , = −ri(x, t) − l ∂t ∂x ∂u(x, t) ∂i(x, t) = −gu(x, t) − c . ∂x ∂t
(9.3) (9.4)
Both equations may be combined into one second order equation for u(x, t) by eliminating the current i(x, t). To this end, partial differentiation w.r.t. x is applied to Eq. (9.3) and partial differentiation w.r.t. t to Eq. (9.4). Then (9.4) and its time derivative are inserted into the spatial derivative of (9.3) lc
.
∂2 u(x, t) ∂2 u(x, t) ∂u(x, t) + rg u(x, t) = . + (lg + rc) ∂t ∂x2 ∂t2
(9.5)
A systematic method for this process is introduced in Sect. 9.4.1. This classic partial differential equation (9.5) is known as telegraph equation or telegrapher’s equation. Also the pair of Eqs. (9.3) and (9.4) is called the transmission line equations. Applications are discussed in [1, Chap. 2]. For the moment it is of interest to investigate two special cases at the opposite ends of the frequency spectrum: The direct current or DC case and the high frequency case.
9.1.4 Special Cases For ideal direct current there is no temporal variation, i.e., u(x, t) = uc (x) and the time derivatives in the telegraph equation (9.5) are zero, leaving only the second order derivative for the remaining space coordinate rg uc (x) =
.
d2 uc (x) . dx2
(9.6)
For the high frequency case, the effect of the inductance and of the capacitance exceeds the resistance and the conductance. Thus neglecting r and g in the telegraph equation (9.5) leads to the wave equation lc
.
∂2 u(x, t) ∂2 u(x, t) . = ∂x2 ∂t2
(9.7)
It supports solutions of the form u(x, t) = f (x ± vt)
.
with
1 v= √ , lc
(9.8)
which is easily shown by calculating the second order space and time derivatives
314
9 Continuous Multidimensional Systems .
∂2 u(x, t) = f (x ± vt) , ∂x2
∂2 u(x, t) = v2 f (x ± vt) . ∂t2
(9.9)
The function f is an arbitrary twice differentiable function of a single variable and f denotes its second order derivative. Solutions of the form (9.8) are called waves and the function f is the wave form, hence the name wave equation for (9.7). It describes here a lossless transmission line, since r = 0 and g = 0 have been assumed. However, waves of different kinds occur also in many other media, not just transmission lines. The waveform f (x±vt) does not change its shape, it just moves from one location x1 to another one x2 at two different points in time t1 and t2 as .
f (x1 ± vt1 ) = f (x2 ± vt2 ),
x2 = x1 + Δx,
t2 = t1 + Δt,
v=
Δx . Δt
(9.10)
Here Δx is the distance travelled by the wave and Δt is the travel time. Thus, the quantity v is the propagation speed of the wave. The sign ambiguity in (9.8) and (9.10) indicates that waves may travel into positive and into negative x-direction. A general solution of the wave equation (9.7) consists of different waveforms f1 and f2 travelling into different directions u(x, t) = f1 (x − vt) + f2 (x + vt) .
(9.11)
.
According to (9.8), the propagation speed on a transmission line is determined by its per length inductance and capacitance. Example 9.1 (Propagation Speed on a Transmission Line). Consider a commercial power cable (XLPE/PVC-sheathed cable approved according to KEMA KEUR) where 2 wires of 1.5 mm2 cross section area are used for power distribution. Its series inductance l and parallel capacitance c have been determined by measurements [16], resulting in the corresponding propagation speed according to (9.8) l = 0.48
.
μH , m
c = 63
pF , m
v = 180 · 106
m . s
(9.12)
Although designed to deliver electrical power, such cables may also carry higher frequencies, e.g., for powerline communication up to 100 MHz [12]. Should a power installation in a residential home based on such a cable be considered as distributed or as a lumped parameter system? To answer this question, consider two different frequencies: a frequency f1 = 50 Hz of the AC power supply and a frequency f2 = 100 MHz of potential high frequency communication. Similar to the consideration in Sect. 9.1.1, the distance Δxn travelled during one period Δtn of the frequency fn is compared to the size of the circuit for each of the two frequencies f1 and f2 . From (9.10) follows Δx1 = vΔt1 =
.
v = 3600 km, f1
Δx2 = vΔt2 =
v = 1.8 m . f2
(9.13)
9.1 Distributed Parameter Systems
315
Comparing these distances to typical cable lengths in a residential home gives two different results: For the purpose of power engineering, even a circuit of the size of a house may be savely regarded as a lumped parameter system, since the travel times are tiny fractions of one period of the mains frequency. However, for the purpose of communications, cable lengths of a few meters have to be considered as distributed parameter systems. Example 9.1 shows that the distinction between distributed parameter systems and lumped parameter systems depends on the frequency range of interest. The same physical system may be modelled by distributed parameters or by lumped parameters depending on the application.
9.1.5 Initial and Boundary Conditions The distributed parameter description of the transmission line from Fig. 9.1 is not complete. The derivation in Sect. 9.1.3 contracted the small segment from Fig. 9.3 to a single point x. Therefore Eqs. (9.3)–(9.5) hold for all values of the space variable x without any further specification. To make these equations compatible with the transmission line in Fig. 9.1, the range of the space variable has to be restricted to the length of the line, i.e., the range 0 ≤ x ≤ and the conditions at the ports at x = 0 and x = have to be prescribed. One way of doing so is to connect both ports to possibly time varying voltage sources u0 (t) and u (t) such that u(0, t) = u0 (t),
.
u(, t) = u (t) .
(9.14)
A condition of this form is called a boundary condition and the values u0 (t) and u (t) are the boundary values. Also the time interval needs to be specified, e.g., from a certain starting point to infinity, say 0 ≤ t < ∞. At t = 0 the line may already hold electrical or magnetic energy that define certain initial values (compare Eq. (3.104) in Sect. 3.4.1) in the form of initial conditions. Typical initial conditions for the wave equation are conditions for the value of u(x, t) and for the value of its time derivative, both at t = 0 u(x, 0) = ui,0 (x),
.
∂ u(x, t)|t=0 = ui,1 (x) . ∂t
(9.15)
The two equations are the initial conditions and the space-dependent functions ui,0 (x) and ui,1 (x) are the initial values.
316
9 Continuous Multidimensional Systems
9.1.6 Summary This section has used a two-wire transmission line to introduce distributed and lumped parameter systems. For this simple example, it has been shown how to determine which of these models is the more useful for a certain purpose. It has also been shown how to derive the partial differential equation for the transmission line, the telegraph equation. This process required to consider a few topics which are shortly reviewed here. Definition of a Spatio-Temporal Domain The analysis of a distributed parameter system requires to specify suitable domains in space and time. The definition of a spatial domain is straightforward in one dimension but it may be more difficult in two or three dimensions. The definition of the domain for the time-evolution starts at a certain point in time, e.g., when an electrical system is switched on. For timeinvariant systems, the time axis is usually shifted to a starting point at t = 0. Partial Differential Equation The mathematical description starts from a small segment (see Fig. 9.3), a small area or a small volume and establishes the applicable balance equations, preferable from first principles of physics, like Newton’s laws, Maxwell’s equations, conservation of matter, energy, or alike. The formulation of these balances for infinitely small spatial volumes leads to differential equations with partial derivatives w.r.t. time and space or, in short, to partial differential equations. This modelling process requires fundamental knowledge of physics or chemistry and is not within the focus of this book. However, some examples are given in [1]. Initial Conditions At the start of the time axis, e.g., at t = 0, the system under consideration may already contain stored energy, e.g., in the form electric, magnetic, kinetic, or potential energy. It has to be considered by assigning suitable initial conditions to one or more of the physical variables of the partial differential equation. Boundary Conditions Systems exchange energy or matter with their environment. Consequently, environmental conditions determine the behaviour of a system. If the environment is defined as the exterior of the spatial domain established above, then the influence of the environment has to be considered by conditions at the boundary of the spatial domain, the so-called boundary conditions. Single Variables and Multiple Variables The wave equation has been presented in two different ways in Sect. 9.1.3. The analysis in terms of a mesh and a node equation has led to two coupled first-order differential equations for the voltage u and the current i, see (9.3) and (9.4). On the other hand, the elimination of the current led to a single second-order differential equation for the voltage u only, see (9.5). Both representations have their advantages and disadvantages. Equations (9.3) and (9.4) are of first order only, but there are two equations for two variables, while (9.5) is a single equation for a single variable but it is of second order. Eqs. (9.3) and (9.4) require the voltage and the current to be differentiable at least once, while (9.5)
9.2 Scalar Linear Partial Differential Equations
317
requires a voltage which is at least twice differentiable. This requirement may be relieved for so-called weak solutions [6, 15, 21]. The topics considered in this summary are presented in more detail in Sect. 9.2 for scalar linear partial differential equations and in Sect. 9.3 for vector-valued partial differential equations.
9.2 Scalar Linear Partial Differential Equations The investigation of the electrical transmission line in Sect. 9.1 has shown how the mathematical description of distributed parameter systems leads quite naturally to partial differential equations with initial and boundary conditions. This section introduces the required fundamentals of partial differential equations and the associated notation. No complete coverage is intended, since it is assumed that readers already have a basic understanding of ordinary and partial differential equations. Section 9.2.1 introduces the spatio-temporal domain on which partial differential equations are defined. The influence of initial values and boundary values is presented in Sect. 9.2.2 for scalar variables and in Sect. 9.3 for vector-valued variables. The relations between scalar and vector-valued partial differential equations are discussed in Sect. 9.4. Finally a very general form of the solution of linear partial differential equations with constant coefficients is shortly presented in Sect. 9.5 as outlook to the introduction of the Sturm-Liouville transformation in Chap. 10.
9.2.1 Spatio-Temporal Domain The spatial domain for the derivation of a partial differential equation is a subset of one-, two-, or three-dimensional space Rn , n = 1, 2, 3. The temporal domain may encompass the complete time axis −∞ < t < ∞, or only the positive half axis t0 < t < ∞ or a finite interval t0 < t < t1 . It is assumed, that this time interval is independent of space, and that the spatial domain is constant over time. This assumption allows to define initial- and boundary conditions separately (see Sects. 9.2.2 and 9.3) and it simplifies the application of integral transformations in time and space. The spatio-temporal domain is formally written as Ψ × T where Ψ denotes the spatial and T the temporal domain. The spatial domain Ψ may be an arbitrary spatial shape of dimension n or, more tangible, an interval L on a spatial axis in 1D, an area A in a plane in 2D, or a volume V in 3D. Table 9.2 compiles these spatial subsets. Different kinds of spatial domains have already been discussed in Sec 2.7.3.2 and a detailed specification of a circular disc is given in Example 9.2. Note that the volume Ψ may also be infinite and consist of the entire space Rn .
318
9 Continuous Multidimensional Systems
The boundary of the spatial domain acts as an interface to the environment and is of special importance for the definition of boundary conditions. It is denoted by ∂Ψ , i.e., by ∂V for n = 3, ∂A for n = 2, and ∂L for n = 1, respectively. Spatial domains Ψ of dimension n have boundaries ∂Ψ of dimension n − 1. In particular, the boundary ∂V is a surface around the volume V, ∂A is a curve around the area A, and ∂L consists of the endpoints of a curve L, see Table 9.2. If the volume Ψ encompasses the complete n-dimensional space, then the boundary ∂Ψ at infinity is only of theoretical interest. The independent spatial variables can be defined by the vector x = [x1 , . . . , xn ]T , defined in a suitable coordinate system. Table 9.2 Spatial domains in 1, 2, and 3 dimensions n Ψ 3 V ⊆ R3 Volume 2 A ⊆ R2 Area 1 L⊆R Curve
∂Ψ ∂V ∂A ∂L
Surface of a volume Curve around an area Endpoints of an interval
Example 9.2 (Two-Dimensional Circular Disk). Figure 9.4 shows a 2 dimensional circular disk of radius r0 that is an example of a spatial shape of particular practical interest, e.g., for the mathematical description of membranes [2, 17]. Fitting the geometrical shape, the vector x of independent variables should be defined in polar coordinates (see Sect. 5.1.4.2). The spatial region Ψ = A ⊂ R2 of the disk and its boundary ∂A are described as .
A {x = [r, ϕ]T | 0 ≤ r < r0 , −π < ϕ ≤ π},
∂A {x = [r, ϕ]T | r = r0 , −π < ϕ ≤ π}. This relation and Fig. 9.4 show the connection between the spatial area A and its boundary ∂A. Particularly, the boundary ∂A of a two dimensional region A ⊂ R is given by the curve ∂A ⊂ R. Ψ = A⊂
∂Ψ = ∂A
2
r0
A
Fig. 9.4 Left: spatial area Ψ = A ⊂ R2 . Right: boundary ∂Ψ = ∂A of the spatial area A
While the spatial behavior of a system is defined on a spatial region Ψ , the temporal behavior of a system is defined on the one-dimensional temporal domain
9.2 Scalar Linear Partial Differential Equations
319
T = [t0 ; ∞[, where t0 defines the starting time. For time-invariant systems, the axis is usually shifted such that the starting point is zero t0 = 0. The independent temporal variable is denoted by t. The initial assumption that spatial domain Ψ and temporal domain T are independent of each other holds for the applications discussed in [1]. Nevertheless, this independence is not always given since there are systems which change their spatial shape over time, e.g., a melting ice cube. They are called moving boundary systems and are not considered here.
9.2.2 Initial-Boundary-Value Problems for a Scalar Variable As discussed in Sect. 9.2.1, a partial differential equation is defined on a bounded or unbounded spatial domain Ψ and a temporal domain T. While a PDE itself describes the spatio-temporal properties of a system, the behavior of physical quantities at the boundaries of the spatial domain ∂Ψ and at the starting point t0 of the temporal domain has to be defined separately. Therefore, in addition to the partial differential equation, a set of initial conditions (ICs) defines the behavior of the system at the starting point t0 and a set of boundary conditions (BCs) defines the behavior at the spatial boundaries ∂Ψ of the system. A differential equation, together with a set of initial conditions and boundary conditions is known as an initial-boundary value problem (IBVP). Figure 9.5 illustrates the different components of an initial-boundary value problem. It shows the spatio-temporal progression of a 2D diffusion process, where particles move from an initial distribution to a final one. The spatial domain Ψ is a circular disk (see Fig. 9.4). The very left plot of Fig. 9.5 shows the initial distribution of the particle concentration in the disk, which is described by an initial condition at t = t0 . The spatio-temporal dynamics of the diffusion process for t > t0 are described by a partial differential equation. The solution at t1 with t0 < t1 < ∞ and the final state for t → ∞ are shown in the center and the right hand plot of Fig. 9.5. The effect of the boundary conditions is visible in the right hand plot of Fig. 9.5. The particle concentration for t → ∞ does not tend to zero, but to a nonzero final value of a uniform particle distribution. This effect is caused by the boundary conditions which specify here that diffusing particles are reflected at the boundary and do not leave the disk.
320
9 Continuous Multidimensional Systems 2 1.5 1 0.5 0
2 1.5 1 0.5 0
t = t0
t = t1
2 1.5 1 0.5 0
t→∞
Fig. 9.5 Spatio-temporal progression of a 2D diffusion process to clarify the different parts of an initial-boundary value problem. Left: initial state of the system, described by initial conditions (ICs). Center: progression of the system dynamics, described by partial differential equations (PDEs). Right: concentration saturates for t → ∞ as no particle can leave the system. This situation is described by boundary conditions (BCs) [18]
9.2.3 Partial Differential Equations Compared to ordinary differential equations (see Sects. 2.7.3.2 and 3.4), partial differential equations (PDEs) describe the variation of a quantity w.r.t. several independent variables, e.g., time and up to three spatial dimensions. Therefore, partial differential equations serve as a mathematical description of distributed parameter systems. A (n + 1)-dimensional physical system on the domain Ψ × T is described by the dynamics of a dependent quantity represented by the scalar function y(x, t). Its spatial and temporal dependency is expressed by the spatial independent variables in x = [x1 , . . . , xn ]T , x ∈ Ψ ⊆ Rn and by the time t0 ≤ t < ∞. Most problems discussed in this book are described by linear partial differential equations with constant coefficients. They form a restricted class within all partial differential equations, however they are sufficient to represent a rich set of physical and technical applications. The general form for some second order (2+1)D problems to be discussed here is ∂2 ∂ y(x, t) + a1 y(x, t)+ ∂t ∂t2 ∂2 ∂2 ∂2 b11 2 y(x, t) + b22 2 y(x, t) + b12 y(x, t)+ ∂x1 x2 ∂x2 ∂x1 ∂ ∂ c1 y(x, t) + c2 y(x, t) + d y(x, t) = fe (x, t), x ∈ Ψ, ∂x1 ∂x2 a
. 2
t0 ≤ t < ∞ . (9.16)
The partial derivative ∂x∂ ν y(x, t) denotes the derivative of y(x, t) with respect to the component xν of the spatial variable x and ∂t∂ y(x, t) a derivative with respect to time t. The space and time dependent function fe (x, t) denotes a source term which excites the system. For a zero excitation function ( fe = 0) the PDE (9.16) is called a homogeneous PDE and for non-zero ( fe 0) an inhomogeneous PDE [7].
9.2 Scalar Linear Partial Differential Equations
321
The corresponing form for three spatial variables x1 , x2 , and x3 follows by adding two terms for the first and second order derivative w.r.t. x3 and two more terms for the mixed derivatives w.r.t. x1 and x3 as well as x2 and x3 . However, for problems of practical importance, some of these terms are zero or there exist strong relations between their nonzero coefficients. More general forms of second order (2+1)D problems than (9.16) involve also mixed derivatives between time and space coordinates, however, they are not needed here. Partial differential equations like (9.16) specify the general behaviour of a physical process, but they do not have a unique solution until initial and boundary conditions are specified.
9.2.4 Initial Conditions Two initial conditions (ICs) define the temporal initial state of the physical quantities in the partial differential equation (9.16) y(x, t)|t=t0 = y(x, t0 ) = yi,1 (x),
.
∂ y(x, t)|t=t0 = yi,2 (x), ∂t
x∈Ψ .
(9.17)
The initial values yi,1 (x) and yi,2 (x) are arbitrary spatial functions, representing the initial value of y(x, t) and its first time derivative, respectively. If a2 = 0 then only the first condition in (9.17) applies. This is the case for the diffusion problem in Fig. 9.5, where the left hand plot shows the initial value yi,1 (x). The starting point t0 can be regarded as the boundary of the temporal domain T = [t0 ; ∞[. In this case, the conditions (9.17) are called Cauchy boundary conditions.1
9.2.5 Boundary Conditions In addition to the differential equation (9.16) and its initial condition (9.17) a set of boundary conditions (BCs) has to be defined, which specifies the dynamics of the physical quantities at the spatial boundaries ∂Ψ . Boundary conditions are an essential part of an initial-boundary-value problem and are required to guarantee the uniqueness. Often, the general solution of a partial differential equation includes several undetermined parameters that can be determined from the boundary conditions. The most frequent boundary conditions for scalar partial differential equations are classified into three different kinds where each one is associated with the name of a mathematician, see Table 9.3. Boundary Conditions of the First Kind specify the value of the solution y(x, t) on the boundary ∂Ψ by a given function φ1 (x, t). This function is defined only on the boundary x ∈ ∂Ψ and is called the boundary value. Boundary conditions of the first 1
Augustin-Louis Cauchy (1789–1857).
322
9 Continuous Multidimensional Systems
kind are also called Dirichlet boundary conditions.2 y(x, t) = φ1 (x, t),
.
x ∈ ∂Ψ .
(9.18)
Boundary Conditions of the Second Kind specify the value of the directional derivative of y(x, t) normal to the boundary ∂Ψ by the boundary value φ2 (x, t). Boundary conditions of the second kind are called Neumann boundary conditions2 .
∂ y(x, t) = φ2 (x, t), ∂n
x ∈ ∂Ψ .
(9.19)
Boundary Conditions of the Third Kind specify a linear combination of the value y(x, t) and its normal derivative by the boundary value φ3 (x, t). Boundary conditions of the third kind are called Robin boundary conditions2 c y(x, t) + c1
. 0
∂ y(x, t) = φ3 (x, t), ∂n
x ∈ ∂Ψ .
(9.20)
Directional Derivative The boundary conditions of the second and the third kind make use of the directional derivative in normal direction, or short, the normal derivative [9]. It is defined for a function u(x) which is real-valued and differentiable on the boundary points x ∈ ∂Ψ . The normal derivative is the scalar product of a vector n normal to the boundary and the gradient ∇u(x) ∂ ∂ u(x) = n, ∇u(x) = u(x), nν ∂xν ∂n ν=1 n
.
n ⊥ ∂Ψ,
n, t = 0 .
(9.21)
A vector is normal to the boundary (n ⊥ ∂Ψ ) at a boundary point x ∈ ∂Ψ if the scalar product with the tangent vector t at this boundary point is zero. Figure 9.6 shows a 2D spatial domain Ψ = A with the boundary ∂Ψ = ∂A as well as the tangent vector t and the normal vector n for a selected boundary point x ∈ ∂A. The normal derivative is very convenient for rectangular or cubic domains that are aligned with the coordinate axes of a Cartesian coordinate system, or similarly, for circular or spherical domains aligned with a polar or spherical coordinate system. Figure 9.7 shows the same situation as Fig. 9.6, but for a rectangle where the sides run parallel to the x- and y-axes. The direction of the tangent vector t concides with the respective section of the boundary and the normal vector n is perpendicular to the side of the rectangle. Thus the normal vectors for each side of the rectangle point into the positive or negative direction of the x- or y-axes.
2
Peter Gustav Lejeune Dirichlet (1805–1859), Carl Gottfried Neumann (1832–1925), Victor Gustave Robin (1855–1897).
9.2 Scalar Linear Partial Differential Equations y
323 t
n x
A
∂A
x
Fig. 9.6 Directional derivative at the boundary ∂A of a 2D domain A. At each point of the boundary x ∈ ∂A, the tangent vector t and the normal vector n are perpendicular
y
t n
x A ∂A
x
Fig. 9.7 Rectangular 2D domain A with boundary ∂A. The direction of the normal component n on the boundary is equivalent to the positive x direction of the Cartesian coordinate system
Mixed Boundary Conditions Some problems call for different kinds of boundary conditions on different sections of the boundary. These are called mixed boundary conditions. This case occurs for 1D domains where different conditions are given for the two different endpoints (see Table 9.2). Finally, Table 9.3 compiles the definitions of the boundary conditions of the first, second, and third kind. The functions φk (x, t) in Table 9.3 differ in their physical dimension, since they specify either the value y(x, t), its spatial derivative, or a linear combination with coefficients of arbitrary (but compatible) physical units. This distinction is often omitted, such that the boundary value φ(x, t) indicates a function of appropriate physical dimension. Table 9.3 Definition of the boundary conditions of the first, second, and third kind Kind Name Definition First Dirichlet BC y(x, t) = φ1 (x, t) ∂ y(x, t) = φ2 (x, t) Second Neumann BC n ⊥ ∂Ψ ∂n ∂ y(x, t) = φ3 (x, t) n ⊥ ∂Ψ, c0 , c1 ∈ R Third Robin BC c0 y(x, t) + c1 ∂n
324
9 Continuous Multidimensional Systems
9.2.6 Telegraph Equation The telegraph equation discussed in Sect. 9.1.3 led to a second order partial differential equation (9.5) for the voltage u(x, t). It is a special case of the general second order partial differential equation (9.16) as detailed in Example 9.3. However the telegraph equation (9.5) has been derived from the coupled system of first order partial differential equations (9.3) and (9.4). Since the two representations are closely related it is worthwhile to consider partial differential equations also in the form of a set of coupled first order equations. This investigation is the topic of Sect. 9.3. Example 9.3 (Telegraph Equation). The spatial domain of the transmission line problem discussed in Sect. 9.1 is a 1D interval L = [0, ] and the temporal domain is the positive time axis 0 ≤ t < ∞. The telegraph equation from (9.5) is a special case of the general form (9.16) with x = x1 and the coefficients . 2
a = lc,
a1 = lg + rc,
c1 = 0,
c2 = 0,
b11 = −1, d = rg,
b22 = 0,
b12 = 0,
fe (x, t) = 0 .
(9.22)
The initial conditions are u(x, 0) = ui,1 (x),
.
∂ u(x, t)|t=0 = ui,2 (x), ∂t
(9.23)
where ui,1 (x) and ui,2 (x) are arbitrary functions with x ∈ L. The boundary conditions (9.14) of the first kind correspond to the boundary values in (9.18) as φ (0, t) = u0 (t),
. 1
φ1 (, t) = u (t) .
The boundary ∂L consists of the two points x = 0 and x = .
(9.24)
9.3 Vector-Valued Linear Partial Differential Equations As just mentioned in Sect. 9.2.6, there are two representations for the telegraph equation from Sect. 9.1.3: • A system of two coupled first order partial differential equations for the variables voltage u(x, t) and current i(x, t) in Eqs. (9.3) and (9.4). • A scalar second order partial differential equation for only one variable, e.g., the voltage u(x, t) in Eq. (9.5). The latter case has been generalized to linear second order partial differential equations for a scalar variable in Sect. 9.2.2. This section generalizes the case of coupled first order equations with multiple variables like voltage and current. A useful tool for this purpose is the introduction of vectors for the multiple variables and matrix notation for the differential operators and coefficients.
9.3 Vector-Valued Linear Partial Differential Equations
325
9.3.1 Coupled Partial Differential Equations A systems of m coupled partial differential equations is represented by a vectorvalued partial differential equation of the form ∂ − L y(x, t) = f e (x, t), x ∈ Ψ, t ∈ T. (9.25) . C ∂t The physical quantities of the system are arranged into the m × 1 vector of variables y(x, t). They are defined on the temporal and spatial domain from Sect. 9.2.1. The temporal behavior of the system is reflected by the m × m temporal differential operator C ∂t∂ . It includes the capacitance matrix or mass matrix C of size m × m. The spatial behavior of the system is defined by the spatial differential operator L = A + B∇.
.
(9.26)
The operator L contains the m × m matrix A of physical parameters with the coefficients of the undifferentiated entries of the vector of variables y. The spatial derivatives of a vector-valued partial differential equation are concentrated in the m × m matrix-valued operator B∇. The generic form B∇ comprises different notations depending on the exact form of the partial differential equation. On the one hand, B∇ may contain gradient, divergence and curl operators (see Sect. 6.3.6). On the other hand, B∇ may be split into a sum, where the nonzero entries of the matrices B xν indicate the partial derivatives w.r.t. the individual spatial coordinates xν B∇ =
n
.
ν=1
B xν
∂ . ∂xν
(9.27)
Finally, the vector-valued function f e in (9.25) corresponds to a vector valued version of fe in (9.16). The representation of coupled partial differential equations is now shown for the transmission line equations in Example 9.4 and for the acoustic wave equation in Example 9.5. It is revisited in Sect. 10.2.1 for initial-boundary-value problems. Example 9.4 (Transmission Line Equations). The coupled first order differential equations (9.3) and (9.4) are rearranged here as ∂u(x, t) ∂i(x, t) =0, + ri(x, t) + ∂x ∂t ∂i(x, t) ∂u(x, t) + gu(x, t) + =0. c ∂t ∂x l
.
(9.28) (9.29)
This form allows to combine the voltage u(x, t) and the current i(x, t) into a vector of unknowns and the line parameters l, c, r, and g into matrices of coefficients 0 l ∂ u(x, t) 0 r u(x, t) 1 0 ∂ u(x, t) 0 + + = . . (9.30) c 0 ∂t i(x, t) g 0 i(x, t) 0 1 ∂x i(x, t) 0
326
9 Continuous Multidimensional Systems
The vector-matrix notation in Eq. (9.30) corresponds to the general form in (9.26) and (9.27) for one spatial dimension n = 1 and B = B x1 u(x, t) 0 r 1 0 0 l .y(x, t) = , A=− , B=− = −I, C = . (9.31) i(x, t) g 0 0 1 c 0 Thus the general vector-valued partial differential equation (9.25) represents the transmission line equations (9.28) and (9.29) with the vector of unknowns y(x, t) and the matrices A, B, C from (9.31). Example 9.5 (Acoustic Wave Equation). The acoustic wave equation describes the propagation of sound waves by linking the scalar sound pressure p(x, t) to the vectorvalued particle velocity v(x, t). The parameters are the free-field impedance z0 and the speed of sound c, where the free-field impedance depends on the density ρ as z0 = ρc. Sound pressure and particle velocity depend on time t and on the threedimensional space coordinates x. The spatial differentiation appears as the gradient grad p(x, t) and the divergence div v(x, t) c grad p(x, t) + z0
.
∂ v(x, t) = 0 , ∂t
1 ∂ p(x, t) + c div v(x, t) = 0 . z0 ∂t
(9.32) (9.33)
To arrive at the general vector-matrix notation in (9.25), the sound pressure p(x, t) and the particle velocity v(x, t) are combined into the vector of unknowns y(x, t). The free-field impedance z0 and its inverse show up in the capacitance matrix C, while the gradient and divergence together with the speed of sound c constitute the matrix of spatial derivatives B∇. Since sound pressure p(x, t) and particle velocity v(x, t) appear only as either time or space derivatives, the matrix A is a zero matrix A = 0 0 z I grad 0 p(x, t) .y(x, t) = . (9.34) , C = −1 0T , L = −B∇ = −c 0 div v(x, t) 0 z0 For Cartesian coordinates follows a slightly different representation by expanding the vector-valued particle velocity v(x, t) into its three spatial components vν (x, t), ν = 1, 2, 3. Then the gradient expands into a column vector of partial derivatives and the divergence into a row vector, such that B∇ adopts the form of (9.27) ⎡ ⎤ ⎢⎢⎢ p(x, t) ⎥⎥⎥ ⎢⎢⎢v (x, t)⎥⎥⎥ ⎢⎢ 1 ⎥⎥ , .y(x, t) = ⎢ ⎢⎢⎢⎣v2 (x, t)⎥⎥⎥⎥⎦ v3 (x, t)
⎡ ⎢⎢⎢∂ x1 ⎢⎢⎢∂ B∇ = c ⎢⎢⎢⎢ x2 ⎢⎢⎣∂ x3 0
0 0 0 ∂ x1
0 0 0 ∂ x2
⎤ 0 ⎥⎥ ⎥ 0 ⎥⎥⎥⎥ ⎥ = B x1 ∂ x1 + B x2 ∂ x2 + B x3 ∂ x3 (9.35) 0 ⎥⎥⎥⎥⎦ ∂ x3
with the shorthand notation ∂ xν for the partial derivatives w.r.t. xν and with the matrices B xν for ν = 1, 2, 3
9.3 Vector-Valued Linear Partial Differential Equations
⎡ ⎢⎢⎢1 ⎢⎢⎢⎢0 .B x1 = c ⎢ ⎢⎢⎢0 ⎢⎣ 0
0 0 0 1
0 0 0 0
⎤ 0⎥⎥ ⎥ 0⎥⎥⎥⎥ ⎥, 0⎥⎥⎥⎥⎦ 0
⎡ ⎢⎢⎢0 ⎢⎢⎢1 B x2 = c ⎢⎢⎢⎢ ⎢⎢⎣0 0
0 0 0 0
0 0 0 1
⎤ 0⎥⎥ ⎥ 0⎥⎥⎥⎥ ⎥, 0⎥⎥⎥⎥⎦ 0
327
⎡ ⎢⎢⎢0 ⎢⎢⎢0 B x3 = c ⎢⎢⎢⎢ ⎢⎢⎣1 0
0 0 0 0
0 0 0 0
⎤ 0⎥⎥ ⎥ 0⎥⎥⎥⎥ ⎥. 0⎥⎥⎥⎥⎦ 1
(9.36)
This example shows, that the notation B∇ can be expressed in different ways: either by the efficient differentiation operators gradient and divergence, or by the individual partial derivatives with respect to each single component xν of the vector of spatial variables.
9.3.2 A Note on Analogies of Physical Variables The structure of the matrices A, B, C in the vector-matrix notation (9.25) and (9.26) reveals in a concise fashion how the temporal and the spatial partial derivatives are connected. From a purely structural viewpoint, the variables u and i in (9.30) appear to be interchangeable. However, adopting a physical point of view, voltage u and current i play distinct roles. Voltages are measured as the potential difference between the terminals of lumped parameter circuit elements (resistors, capacitors, inductors), while currents are measured as flowing through the these elements. Therefore, voltages add up to zero along a closed mesh, while currents add up to zero at a node according to Kirchhoff’s circuit laws (see Figs. 3.6 and 9.3). Voltage and current are physical variables from the field of electricity. They have no direct equivalence to variables in other fields, e.g., in mechanics. However their product yields electric power which is equivalent to mechanical power, power radiated by acoustic or electromagnetic waves, thermal power, etc. . Also in these other fields of physics, power is expressed by the product of variables, e.g., power in mechanics is the product of force and velocity. This equivalence becomes apparent by expressing the physical unit of power as W = V A = N m s−1 . Therefore, voltage and current on one hand and force and velocity on the other hand are called pairs of power conjugate variables. These relationships suggest to classify variables from the fields of electricity, mechanics, acoustics, and possibly from other fields of physics in a unifying way. Different approaches have been developed with different advantages and disadvantages, but there is no single best solution for all purposes. These so-called analogies are often applied in modelling of technical systems like electromechanic or electroacoustics systems as networks of lumped parameters. The impedance analogy preserves impedances but converts circuits into their dual ones. The mobility analogy or dynamical analogy preserves the circuit topology but converts impedances into admittances [5]. Since the focus lies here on distributed parameter systems, only the impedance analogy is discussed since impedances of various kinds play a role in imposing boundary conditions.
328
9 Continuous Multidimensional Systems
Table 9.4 Impedance analogy Variables
Mechanical
Electrical
Acoustical
Effort
f (t)
u(t)
p(t)
Flow
v(t)
i(t)
q(t) = v(t)A
Power
f (t) v(t)
u(t) i(t)
p(t) q(t) = p(t) v(t) A
Impedance
F(s) Zmech (s) = V(s)
U(s) Zel (s) = I(s)
Zac (s) =
P(s) 1 P(s) 1 = = Zf (s) Q(s) V(s) A A
The impedance analogy associates the variables in mechanical, electrical, and acoustical systems as shown in Table 9.4. At first, force f (t) in mechanics, voltage u(t) in electrical systems and the sound pressure p(t) in acoustics are considered as analog to each other and called effort variables. Then velocity v(t) in mechanics, current i(t) in electrical systems and the volume velocity q(t) in acoustics are considered as analog and called flow variables. The volume velocity q(t) is related to the particle velocity v(t) by q(t) = v(t) A, where A is the area of an acoustic system like the area of a loudspeaker or microphone membrane or the cross section of a duct. All pairs of effort and flow variables are power conjugate variables, since their product gives power in watts. The variables in the first three lines in Table 9.4 are shown as functions of time such that all products in the third line give instantaneous power. Other forms like root-mean-square (rms) power or complex power can be derived from instantaneous power by assuming appropriate real or complex-valued wave forms. Note, that the mechanical velocity and the particle velocity in acoustics share the same denomination v(t) for simplicity. The last line of Table 9.4 shows the definition of impedances as relation of the Laplace transforms of the respective effort and flow variables. The mechanical impedance Zmech (s) is the relation of the Laplace transforms of force and velocity and the electrical impedance Zel (s) of voltage and current. In acoustics, there are two definitions of impedance, the acoustic impedance Zac (s) and the field impedance Zf (s). The acoustic impedance relies on a representative area A of a transducer or a pipe or duct. The field impedance is defined only by the local variables pressure and particle velocity without reference to a representative area. The assignment of force, voltage, and particle velocity as effort variables is a matter of choice and particular to the impedance analogy. The mobility analogy uses a different assignment. Details on the different analogies and their use in network modelling are given e.g., in [5], where the relation to acoustics in Table 9.4 is addressed as electroacoustic analogy. Analogies of this type are also used in heat transfer, where e.g., multilayer walls are modelled by a series of electric resistances. However, the conventional thermal units temperature and heat flow are not a pair of power complementary variables. Although the main application of these analogies are the cross-domain representation of electromechanical, electroacoustic or thermal systems by lumped parameter electrical circuits, they are also useful for distributed parameter systems. Here,
9.3 Vector-Valued Linear Partial Differential Equations
329
impedances are of importance for the definition of boundary conditions of vectorvalued partial differential equations, see Sect. 9.3.3. The above discussion—by no means exhaustive—is meant to create some awareness of the physical nature of the otherwise abstract entries of the vector y(x, t) in (9.25). Example 9.6 (Field Impedance of the Lossless Transmission Line). The wave equation for voltage and current has already been considered in Sect. 9.1.4 as a special case of the telegraph equation in Sect. 9.1.3. The solution (9.8) for the scalar case is used here as an approach to solving the lossless transmission line in vector form (9.28) and (9.29) for r = 0 and g = 0. Similar to (9.8) the following functions are proposed as solutions u(x, t) = u0 f (x − vt),
.
i(x, t) = i0 f (x − vt) .
(9.37)
Their space and time derivatives are .
∂ u(x, t) = u0 f (x − vt), ∂x ∂ u(x, t) = −vu0 f (x − vt), ∂t
∂ i(x, t) = i0 f (x − vt), ∂x ∂ i(x, t) = −vi0 f (x − vt) . ∂t
(9.38) (9.39)
The notation f denotes here the derivative of the function f w.r.t. its argument. Inserting the space and time derivatives from (9.38) and (9.39) into (9.28) and (9.29) and cancelling the term f (x − vt) gives the simple relation between u0 and i0 u − lv i0 = 0,
. 0
i0 − cv u0 = 0 .
(9.40)
They are simultaneously satisfied for the wave propagation speed from (9.8) and lead to the ratio between u0 and i0 for which (9.40) holds l u0 1 1 = . (9.41) = lv = .v = √ cv c i 0 lc The calculation of the field impedance Zf requires the Laplace transforms of u(x, t) and i(x, t) from (9.37) L {u(x, t)} = U(x, s) = u0 L { f (x − vt)}, L {i(x, t)} = I(x, s) = i0 L { f (x − vt)} .
.
(9.42) (9.43)
The Laplace transform of f (x − vt) is not further specified here, since it cancels anyway in the calculation of the field impedance l U(x, s) u0 = . .Zf (s) = = (9.44) I(x, s) i0 c Thus the field impedance of the lossless transmission line is a real value and frequency independent.
330
9 Continuous Multidimensional Systems
Example 9.7 (Field Impedance of the Acoustic Wave Equation). This example investigates a special solution of the acoustic wave equation in the form of Eqs. (9.32) and (9.33). The determination of the field impedance of a transmission line in Example 9.6 serves as a guideline. Comparing Eqs. (9.3) and (9.4) for the lossless case with r = 0 and g = 0 to Eqs. (9.32) and (9.33) shows indeed an analogy where voltage corresponds to sound pressure and electrical current to acoustic particle velocity. Note however, that Eqs. (9.32) and (9.33) are valid for 3D space. Inspired by Eq. (9.37), the following functions are proposed as a possible solution for Eqs. (9.32) and (9.33) or, equivalently, for the vector formulation (9.34) .
p(x, t) = p0 f1 (x1 − ct),
v(x, t) = v1 (x, t) e1 ,
v1 (x, t) = v0 f1 (x1 − ct) .
(9.45)
The particle velocity v(x, t) has only a nonzero component v1 (x, t) in the direction of x1 as indicated by the unit vector e1 . The propagation speed v from (9.8) and (9.41) has been replaced by the speed of sound c to avoid confusion with the particle velocity. These functions describe a wave travelling into the positive x1 -direction. Note that there is no guarantee that sound pressure and particle velocity should share the same time-space evolution as given by the common function f1 . Indeed, this is not the case for many acoustic sound fields, like e.g., those emanating from spatially concentrated sound sources. Therefore, Eqs. (9.45) only qualify as a solution if they solve Eqs. (9.32) and (9.33). As a first step, note that the spatial derivatives have only components for x1 .
grad p(x, t) =
∂ p(x, t) e1 , ∂x1
div v(x, t) =
∂ v1 (x, t) . ∂x1
(9.46)
Proceeding as in (9.38) and (9.39), the space and time derivatives are calculated from (9.45) .
∂ p(x, t) e1 = p0 f1 (x1 − ct) e1 , ∂x1 ∂ v1 (x, t) = v0 f1 (x1 − ct) . div v(x, t) = ∂x1
grad p(x, t) =
(9.47) (9.48)
Inserting the time derivatives and the spatial derivatives into Eqs. (9.32) and (9.33) and cancelling f1 (x1 − ct) gives two equations which are simultaneously satisfied for the resulting relation between p0 and v0 p − z0 v0 = 0,
. 0
v0 −
1 p0 , = 0 z0
p0 = z0 . v0
(9.49)
The corresponding relation of the Laplace transforms L {p(x, t)} = P(x, s) = p0 L { f1 (x1 − ct)},
(9.50)
L {v1 (x, t)} = V1 (x, s) = v0 L { f1 (x1 − ct)},
(9.51)
.
9.3 Vector-Valued Linear Partial Differential Equations
331
leads to the field impedance Zf as Z (s) =
. f
p0 P(x, s) = = z0 . V1 (x, s) v0
(9.52)
Thus the field impedance Zf has a constant real value at any point x in space and it is equal to the coefficient z0 . This result justifies to call z0 the free field impedance.
9.3.2.1 Initial Conditions The initial conditions for the vector of variables y in (9.25) resemble the scalar definition (9.17) y(x, t)|t=t0 = y(x, t0 ) = yi (x),
.
x ∈ Ψ, t = t0 .
(9.53)
The initial value yi (x) defines the initial state of the vector of variables y(x, t0 ). Since (9.25) contains only a first order time derivative, there is only one initial condition in (9.53).
9.3.3 Boundary Conditions The boundary conditions for scalar differential equations are defined in Sect. 9.2.5 in terms of the scalar value y(x, t) at the boundary, of its derivative, or a superposition of both, see Table 9.3. For the vector formulation (9.25), boundary conditions can be imposed by linear combinations of the individual entries of the vector of variables y(x, t). They are formulated in terms of an m × m boundary operator FHb acting on y(x, t) at the spatial boundary ∂Ψ FHb (x) y(x, t) = y(x, t), Fb (x) = φb (x, t),
.
x ∈ ∂Ψ, t ∈ T.
(9.54)
The vector φb (x, t) with x ∈ ∂Ψ contains the dedicated boundary values. The matrix form of FHb and the notation as conjugate transpose allow a representation of the boundary conditions by a vector of scalar products as introduced in (4.106) in Sect. 4.4. The boundary operator FHb may contain parameters, as well as temporal and spatial derivatives in polynomial form, respectively. Therefore, the formulation in (9.54) is able to realize a wide variety of boundary conditions. Some choices for the transmission line equations along with the definition of the boundary operator FHb and the vector of boundary values φb (x, t) are presented in Example 9.8. Example 9.8 (Boundary Conditions for the Transmission Line Equations). A transmission line according to Fig. 9.1 is terminated at the boundary either by connection to a load circuit or by disconnecting the line from a load. Typical cases are shown in Fig. 9.8.
332
9 Continuous Multidimensional Systems i(ℓ, t) = 0
i(ℓ, t) u(ℓ, t) = 0
u(ℓ, t)
I(ℓ, s) U(ℓ, s)
Z(s)
Fig. 9.8 Typical terminations of a transmission at x = defining different boundary conditions
The situation on the left of Fig. 9.8 shows the boundary at x = terminated by a short circuit. It requires the voltage at the boundary to be zero u(, t) = 0 while the current i(, t) is determined by the state of the transmission line. Thus there is a condition for the voltage but not for the current. This situation is expressed in the general form of (9.54) by 10 u(, t) H H .u(, t) = 0, Fb () y(, t) = 0, Fb () = , y(, t) = . (9.55) 00 i(, t) The center of Fig. 9.8 shows a termination by an open circuit. It requires the current at the boundary to be zero i(, t) = 0 while there is no boundary condition for the voltage u(, t) 01 u(, t) .i(, t) = 0, FHb () y(, t) = 0, FHb () = , y(, t) = . (9.56) 00 i(, t) The right hand side of Fig. 9.8 shows a more general case, the termination by a oneport circuit with impedance Z(s). Here it is convenient to formulate the boundary condition in the frequency domain as 1 −Z(s) U(, s) .U(, s) = Z(s) I(, s), FHb (, s) Y(, s) = (9.57) =0. 0 0 I(, s)
Z(s)
I(0, s) Is (s)
Z(s)
U(0, s)
Us (s)
I(0, s) U(0, s)
Fig. 9.9 Typical terminations of a transmission at x = 0 defining the boundary conditions at x = 0
The matrix FHb () = FHb (, s) now depends on the frequency variable s since it contains the boundary impedance Z(s). It is possible to formulate this boundary condition also in the time domain like the short and the open circuit. However, a time domain version of (9.57) requires either a differential equation at the boundary or a convolution of the current i(t) with the inverse Laplace transform of the impedance
9.3 Vector-Valued Linear Partial Differential Equations
333
z(t) = L −1 {Z(s)}. In either case, the multiplication in the frequency domain is preferable. All boundary conditions presented in Fig. 9.8 are homogeneous boundary conditions, since there are no sources at the boundary. Two cases of inhomogeneous boundary conditions are shown in Fig. 9.9 for the boundary at x = 0. The left hand side of Fig. 9.9 shows a termination by a current source Is (s) with internal impedance Z(s). Since the impedance may be frequency-dependent, all variables are considered in the frequency domain. The relation between the voltage U(0, s) and the current I(0, s) at the boundary and the source current I0 (s) lead to the boundary operator from (9.58). The right hand side of Fig. 9.9 shows the termination by a voltage source Us (s) with internal impedance Z(s) resulting in the boundary operator from (9.59) 0 0 U(0, s) 0 H = Φb (0, s) , .Fb (0, s) Y(0, s) = (9.58) = Z −1 (s) 1 I(0, s) −Is (s) 1 Z(s) U(0, s) Us (s) = Φb (0, s) . FHb (0, s) Y(0, s) = = (9.59) 0 0 0 I(0, s) In both cases, the vector of boundary values Φb (0, s) is nonzero and contains the terminating current or voltage source. Note that in Figs. 9.8 and 9.9 the current I(x, s) for x = 0, runs from left to right in the direction of the implicitly assumed x-axis. This fact accounts for the sign change of Z(s) in Eqs. (9.57) and (9.59). Example 9.8 has demonstrated that source-free, i.e., passive terminations at the boundary lead to homogeneous boundary conditions with φb (x, t) = 0, x ∈ ∂Ψ , while sources at the boundaries lead to inhomogeneous boundary conditions. Frequency dependent passive impedances or admittances are considered by the boundary operator FHb (x, s), x ∈ ∂Ψ , while ideal sources become elements of the nonzero vector of boundary conditions Φb (x, t). Although Example 9.8 has considered a purely electrical system, the general results hold as well for mechanical, acoustical, or thermal systems according to the impedance analogy discussed in Sect. 9.3.2. To this end, the voltage and the current are replaced by the respective effort and flow variables, and the impedances (or their inverses) are defined as shown in Table 9.4. The formulation of a partial differential equation in the general vector-matrix notation (9.25) and of its boundary condition with the boundary operators in (9.54) is not restricted to systems with only one effort and one flow variable. It is easily extended to more complex partial differential equations with more than two physical variables, as demonstrated in [1, Chap. 3]. The definition of the boundary conditions with boundary operators Fb (x, s) in matrix form is redundant, since one row in each Eqs. (9.55)–(9.59) is zero. However, Sect. 10.2.3.3 introduces further boundary operators and their relations are most elegantly formulated in matrix notation.
334
9 Continuous Multidimensional Systems
9.4 Vector-Valued and Scalar Partial Differential Equations Initial-boundary-balue problems for a scalar variable have been discussed in Sect. 9.2.2 and for a vector-valued variable in Sect. 9.3. The equivalence between both representations has been shown shortly in Sect. 9.1.3 for the telegraph equations. There, Eqs. (9.3) and (9.4) for voltage and current were first derived from the mesh and the node in Fig. 9.3 and then converted into the telegraph equation (9.5) for a single scalar variable. This section establishes the relations between scalar and vector formulations by converting the general form of a vector-valued initial-boundary-value problem in Sect. 9.3 into a scalar one.
9.4.1 Converting a Vector Representation into a Scalar Representation A conversion of the vector form of a partial differential equation (9.25) into a scalar form for a vector y(x, t) of arbitrary length and general matrices A, B, C is tedious and rarely required. In most physical problems occurs the spatial differentiation in the form of gradient, divergence or curl (rotation) and thus the spatial differentiation operator L is highly structured and sparse, see e.g., Example 9.5. Therefore the general procedure is shown here for a vector y(x, t) with two entries, e.g., an effort and a flow variable. Example 9.10 in Sect. 9.4.1 demonstrates how this approach generalizes to vectors y(x, t) with more than two entries. At first, the partial differential equation (9.25) is written in an even more concise form y1 (x, t) fe,1 (x, t) , f e (x, t) = . .D xt y(x, t) = f e (x, t), y(x, t) = (9.60) y2 (x, t) fe,2 (x, t) Both the vector of unknowns y(x, t) and the vector of excitation functions f e (x, t) contain two elements each. The operator D xt consists of all coefficients and temporal and spatial derivatives from (9.25). It is characterized by a scalar value det(D xt ) which is calculated by the formal rules for the determinant of a 2 × 2-matrix. However, just like the matrix operator D xt , also the scalar expression det(D xt ) contains temporal and spatial derivatives ∂ D D det(D xt ) = D11 D22 − D12 D21 . − L = 11 12 , (9.61) .D xt = C D21 D22 ∂t Therefore it is not possible to simply invert D xt like a 2 × 2-matrix. In particular, a division by det(D xt ) does not make sense. Nevertheless, the same calculation steps as for Gaussian elimination can be carried out. If a diagonal representation of the matrix operator D xt exists then it has the form
9.4 Vector-Valued and Scalar Partial Differential Equations
.
335
10 D22 −D12 D11 D12 . = det(D xt ) −D21 D11 D21 D22 01
(9.62)
Application of the same elimination steps to the element-wise form of (9.60) fe,1 (x, t) D11 D12 y1 (x, t) = , . (9.63) D21 D22 y2 (x, t) fe,2 (x, t) results in the de-coupled equations .
det(D xt ) y1 (x, t) =
D22 fe,1 (x, t) − D12 fe,2 (x, t),
(9.64)
det(D xt ) y2 (x, t) = −D21 fe,1 (x, t) + D11 fe,2 (x, t) .
(9.65)
Since det(D xt ) contains partial derivatives w.r.t. space and time, the left hand side of each equation constitutes a partial differential equation for the scalar quantity y1 (x, t) or y2 (x, t). The right hand sides consist of differential forms which—in general— involve both components of the vector of excitation functions f e (x, t). The above procedure for diagonalization of an operator matrix without explicit inversion and without division by a determinant can also be extended to quadratic operator matrices of arbitrary size. The required tool from linear algebra is the cofactor expansion det(D xt ) I = AATcof with the matrix of cofactors Acof , see e.g. [20, Chap. 4.4].
9.4.1.1 Examples So far, no assumptions regarding the nature of the elements in the matrix operator D xt have been made. In particular, no conditions for the entries of D xt have been stated such that it is diagonalizable. On the other hand, it is not the user’s choice, how a partial differential equation should look like. In practical problems, the structure and the entries of D xt are dictated by the analysis of the underlying physical problem. Therefore, two extended examples are now discussed. They involve partial differential equations which describe physical effects that have been investigated before: an electrical transmission line and the propagation of acoustic waves. Example 9.9 (Transmission Line Equations). According to Example 9.4, the vectorial form of the transmission line equations is characterized by the matrices ∂ ∂ 0 l ∂ r , C= , =− x (9.66) .L = A + B , ∂x = g ∂x c 0 ∂x ∂x such that the matrix operator D xt is given by l∂t + r ∂x , D xt = C∂t − L = c∂t + g ∂ x
∂t =
∂ . ∂t
(9.67)
336
9 Continuous Multidimensional Systems
The determinant follows from (9.61) .
det(D xt ) = ∂2x − (l∂t + r)(c∂t + g) = −lc ∂2t − (lg + rc) ∂t + ∂2x − rg .
(9.68)
Thus the homogeneous vector-matrix equation (9.30) turns into two separate equations for the voltage u(x, t) and the current i(x, t) ∂2 ∂ u(x, t) + (lg + rc) u(x, t) + rg u(x, t) = 2 ∂t ∂t ∂ ∂2 lc 2 i(x, t) + (lg + rc) i(x, t) + rg i(x, t) = ∂t ∂t
lc
.
∂2 u(x, t), ∂x2 ∂2 i(x, t), ∂x2
(9.69) (9.70)
where (9.69) has already been derived in Sect. 9.1.3 as Eq. (9.5). Next, the effect of the diagonalization on the boundary conditions is investigated. Since only the boundary values are of interest here, the initial values are set to zero. The considered boundary conditions are given as a termination of the transmission line by a voltage source with internal impedance at x = 0, see the right hand side of Fig. 9.9. The boundary condition has been formulated in the frequency domain in (9.59) as Z(s)I(0, s) + U(0, s) = Us (s) .
.
(9.71)
To eliminate either I(0, s) or U(0, s), both sides of Eq. (9.71) are multiplied by the Laplace transforms of the off-diagonal elements of the differential operator D xt in (9.67) Z(s)(sl + r)I(0, s) + (sl + r)U(0, s) = (sl + r)Us (s), Z(s)(sc + g)I(0, s) + (sc + g)U(0, s) = (sc + g)Us (s) . .
(9.72) (9.73)
Now the terms (sl + r)I(0, s) and (sc + g)U(0, s) are replaced by the Laplace transforms of the space derivatives in Eqs. (9.28) and (9.29). Further (9.72) is divided by Z(s) (sl + r) ∂ (sl + r) U(x, s) − U(0, s) = − Us (s), x=0 ∂x Z(s) Z(s) ∂ I(x, s) − Z(s)(sc + g) I(0, s) = −(sc + g) Us (s) . x=0 ∂x .
(9.74) (9.75)
The resulting Eqs. (9.74) and (9.74) are the boundary conditions for the scalar partial differential equations (9.69) and (9.70), respectively. For an ideal voltage source at the boundary with Z(s) = 0 follows U(0, s) = Us (s), ∂ I(x, s) = −(sc + g) Us (s) . x=0 ∂x .
(9.76) (9.77)
9.4 Vector-Valued and Scalar Partial Differential Equations
337
These calculations show, that a termination by a voltage source with internal impedance turns into boundary conditions of the third kind for either the voltage or the current at the boundary. Boundary conditions of the third kind are listed in Table 9.3, where the normal derivative reduces here to a spatial derivative in negative x-direction for x = 0, see Fig. 9.1. The boundary condition for an ideal voltage source is either of the first kind for (9.69) or of the second kind for (9.70). In the transmission line equations, both the effort variable (voltage) and the flow variable (current) are scalars. Often one or even both of these variables have components in all space directions and are thus vector valued. The next example shows how to deal with the acoustic wave equation where the particle velocity is a vector with three spatial components. Example 9.10 (Acoustic Wave Equation). The vector of unknowns y(x, t) of the acoustic wave equation in three spatial dimensions has four scalar entries, the sound pressure p(x, t) and the three components of the particle velocity v(x, t), see (9.35). Thus the matrices C and L are of size 4 × 4. It is possible to apply the elimination procedure from (9.62) also to the resulting 4 × 4-matrix operator. However, the structure of the matrix operator B∇ in (9.35) imposed by the differentiation operators gradient and divergence allows for a shortened procedure. At first, the matrix operator D xt from (9.61) follows with (9.34) and (9.35) ⎤ ⎡ 0 ⎥⎥ ⎢⎢⎢ c∂ x1 z0 ∂t 0 ⎥ ⎢⎢⎢⎢ c∂ x2 0 z0 ∂t 0 ⎥⎥⎥⎥⎥ c grad z0 ∂t I ⎢ .D xt = ⎢ , ⎥ = −1 ⎢⎢⎢ c∂ x 0 0 z0 ∂t ⎥⎥⎥⎥ z0 ∂t c div ⎢⎣ −1 3 ⎦ z0 ∂t c∂ x1 c∂ x2 c∂ x3
3×1 3×3 . 1×1 1×3
(9.78)
The 4 × 4-matrix operator D xt is decomposed into a block matrix of temporal and spatial differentiation operators where the sizes of the submatrices are given in the matrix-like notation on the right hand. Then the pair of coupled differential equations (9.32) and (9.33) is decoupled by a multiplication similar to (9.62) c div −z0 ∂t z0 ∂t v(x, t) + c grad p(x, t) 0 = . (9.79) . −1 0 −z−1 ∂ I c grad ∂ p(x, t) + c div v(x, t) z t t 0 0 with the result .
0 c2 div grad p(x, t) − ∂2t p(x, t) = . c2 grad div v(x, t) − ∂2t v(x, t) 0
(9.80)
The first row of this equation turns into the familiar form of the acoustic wave equation for the sound pressure with the Laplace operator p = div grad p c2 p(x, t) =
.
∂2 p(x, t) . ∂t2
(9.81)
The second row is a bit more tricky. From a mathematical viewpoint, it is the result of the above diagonalization process and cannot be simplified any further. However,
338
9 Continuous Multidimensional Systems
when restricting the consideration to sound pressure levels suitable for human hearing, then it is save to assume that sound waves in air do not the create vortices. Thus for the particle velocity of sound waves in air holds rot v = 0 (resp. curl v = 0). Further information is found in textbooks on acoustics under the topic vector potential [13]. In this case follows from the definition of the Laplace operator for a vector field [7] .
v = grad div v − rot rot v,
(9.82)
that grad div v = v. Now the acoustic wave equation for the particle velocity v takes the same concise form as for the sound pressure in (9.81) c2 v(x, t) =
.
∂2 v(x, t), ∂t2
c2 vν (x, t) =
∂2 vν (x, t), ∂t2
ν = 1, 2, 3 .
(9.83)
Even more, the diagonalization extends also to the individual scalar components vν of the vector v.
9.4.2 Converting a Scalar Representation into a Vector Representation Vector representations of partial differential equations follow from the physical derivation in a natural way. Section 9.1.3 showed this process for the transmission line equations. The derivation of other differential equations of mathematical physics or of other fields is considerably more difficult. However, almost all cases start from balance equations for effort and flow variables and obtain a coupled set of partial differential equations. In many cases of practical importance, they can be brought into the vectorial form of Eq. (9.25). Following first principles ensures the existence of a solution and that the variables in the vector of unknowns have a sound physical meaning and that they are well-behaved in a mathematical sense, e.g., differentiable. This procedure is preferable, since the analysis of processes in physics, as well as in chemistry, biology, economy and other fields is well documented in the respective literature. If, however, only a scalar partial differential equation is available without any background on its derivation, then there is no proven and established procedure to convert the scalar representation into a vectorial one. For those readers, who feel tempted to try it nevertheless, the following discussion highlights some of the pitfalls to watch out for. Assume a scalar partial differential equation with one space variable of the form a ∂2t y(x, t) + a1 ∂t y(x, t) + b ∂2x y(x, t) + dy(x, t) = 0 .
. 2
(9.84)
The designation of the coefficients is adapted from the more general case in (9.16). This equation contains second order derivatives w.r.t. time and space. Therefore, it
9.4 Vector-Valued and Scalar Partial Differential Equations
339
appears feasible to convert it into a system of two equations with first order equations each ˆ xt y(x, t) = m11 ∂ x + n11 ∂t + p11 m12 ∂ x + n12 ∂t + p12 y1 (x, t) = 0 . (9.85) .D m21 ∂ x + n21 ∂t + p21 m22 ∂ x + n22 ∂t + p22 y2 (x, t) 0 ˆ xt are unknown as well as the The coefficients m, n, and p in the operator matrix D physical nature of the two dependent variables y1 and y2 in the vector y(x, t). The connection to (9.84) is easily established through the determinant of the operator ˆ xt as in (9.62) matrix D .
ˆ xt ) = (m11 ∂ x + n11 ∂t + p11 )(m22 ∂ x + n22 ∂t + p22 ) det(D − (m12 ∂ x + n12 ∂t + p12 )(m21 ∂ x + n21 ∂t + p21 ) = a2 ∂2t + a1 ∂t + b ∂2x + d . (9.86)
ˆ xt with the coefficients of the given Matching the coefficients of the determinant of D Eq. (9.84) yields a system of six coupled nonlinear equations for twelve unknowns m11 m22 − m12 m21 = b,
.
n11 n22 − n12 n21 = a2 , p11 p22 − p12 p21 = d,
m11 n22 − m12 n21 + n11 m22 − n12 m21 = 0, m11 p22 − m12 p21 + p11 m22 − p12 m21 = 0, (9.87) n11 p22 − n12 p21 + p11 n22 − p12 n21 = a1 .
It is possible to reduce the number of unknowns down to ten by proper scaling of each equation in (9.85), but anyway • the existence of a solution is not guaranteed and • if a solution exists, it is not unique. The proof of the existence of a solution and possibly the determination of the set of solutions for general coupled nonlinear equations requires a considerable mathematical machinery, e.g., Buchberger’s algorithm for the determination of a Gr¨obner basis [8]. In practice, the use of a suitable computer algebra software is recommended. Instead of delving further into the general case, a familiar partial differential equation is considered in Example 9.11. Example 9.11 (Vector Representation of the Telegraph Equation). A special case of (9.84) is the telegraph equation (9.5) with the determinant (9.68) and the coefficients a = −lc,
. 2
a1 = −(lg + rc),
b = 1,
d = −rg .
(9.88)
A possible set of solutions of the nonlinear equations (9.87) depending on a parameter σ is given by (9.89), as can be shown by inserting (9.88) and (9.89) into (9.87). The solution (9.89) is characterized by a certain structure which becomes apparent in a matrix-like notation for the coefficients of (9.85)
340
9 Continuous Multidimensional Systems .
1 σ−1 σ−1 m11 m12 , = √ m21 m22 2 −σ σ
n11 n12 = n21 n22 p11 p12 = p21 p22
−σ l σ l , −1 −1 2 σ c σ c 1 −σ r σ r . √ −1 −1 2 σ g σ g 1 √
(9.89)
ˆ xt , the system of two coupled first order With these elements of the operator matrix D partial differential equations in (9.85) specializes to −1 σ−1 ∂ x + σ l ∂t + σ r y1 (x, t) σ ∂ x − σ l ∂t − σ r 0 ˆ xt y(x, t) = √1 .D = . −1 −1 −1 −1 + σ c ∂ + σ g σ∂ + σ c ∂ + σ g (x, t) −σ∂ y 0 x t x t 2 2 (9.90) Thus, the task is solved to construct a vector partial differential equation from a scalar one. It remains to answer the obvious question, how the vector representation (9.90) is related to (9.67) which has been found by physical analysis of the transmission ˆ xt line in Sect. 9.1.3. Inspection of (9.90) and grouping the elements of the operator D −1 ˆ w.r.t. σ and σ shows that D xt can be decomposed into the differential operator D xt from (9.67) and a matrix which contains only the parameter σ l ∂t + r 1 σ−1 σ−1 y1 (x, t) 0 ∂x ˆ xt y(x, t) = = .D (9.91) . √ ∂x −σ σ (x, t) y c ∂t + g 0 2 2 Since D xt contains the differential operators for the voltage u and the current i (see Example 9.4), the unknowns y1 and y2 must be related to u and i by 1 σ u − σ−1 i 1 σ−1 σ−1 y1 y1 u or = √ . (9.92) . = √ −1 y2 i 2 −σ σ y2 2 σu + σ i The last equation reveals the physical unit of the parameter σ. Since the √ units in Ω−1 and must be compatible, the unit of σ must be the expressions σ u ± σ−1 i √ √ −1 the units of y1 and y2 are V Ω = A Ω. Obviously, the choice of coefficients from (9.89) has generated two differential equations for unknowns y1 and y2 which are neither voltage nor current. Actually, they describe quantities known as power waves [10, 14]. This designation is motivated by the unit of y21 and y22 which is W. With hindsight, it appears that there would have been a simpler choice for the solution of (9.87) which leads directly to the differential operator D xt from Eq. (9.67) m11 m12 1 0 0 l 0 r n11 n12 p11 p12 = = = . (9.93) , , . m21 m22 n21 n22 p21 p22 0 1 c 0 g 0 However, without physical intuition, this solution does not follow any easier from (9.87) and (9.88) than e.g., the solution presented in (9.89). Although Example 9.11 considered only a simple problem in one space dimension, it illustrates the problems associated with deriving a vector partial differential equation from a scalar one:
9.4 Vector-Valued and Scalar Partial Differential Equations
341
• The coefficients of the vector equation follow from the coefficients of the scalar equation by solution of a system of coupled nonlinear equations. These equations might have no solution at all or the solution is not unique. • If solutions exist, they are hard to compute. Consider, e.g., the problem from Eq. (9.16) in two space dimensions, or even the corresponding 3D problem. • The choice of the particular solution affects also the physical meaning of the unknown variables. To establish their physical nature, an investigation of the physical units is helpful. The following conclusion can be drawn for the formulation of vector-valued partial differential equations: • Starting from fundamental principles of the underlying application field establishes a meaningful system of partial differential equations. • This procedure ensures that the unknowns are meaningful physical variables, e.g., effort and flow variables. • If only a scalar partial differential equation is given and a vectorial one is desired, then effort and flow variables are preferred candidates for the elements of the vector of unknowns. However, this does not mean that only effort and flow variables would be useful to represent physical processes. Also well-designed combinations thereof are of practical importance as discussed in Sect. 9.4.3.
9.4.3 Transformation of the Dependent Variables Effort and flow variables are a good choice for representing a distributed parameter system by a vector-valued partial differential equation. However, this choice is not a strict necessity. This section shows that also other types of variables may be suitable. As an example, consider the variables a and b as a linear combination of voltage u and current i 1 Rp Rp a(x, t) a(x, t) 1 Rp u(x, t) u(x, t) . = , = . (9.94) 1 −Rp i(x, t) b(x, t) i(x, t) 2Rp 1 −1 b(x, t) The quantity Rp is the so-called port resistance with unit Ω. Therefore, the new variables a and b are both voltages, i.e., effort variables, while there is no flow variable. The value of the port resistance is not fixed for the moment. Now the lossless transmission line equations (r = 0 and g = 0) in Example 9.4 in the matrix formulation of Example 9.9 are rewritten in terms of the unknowns a and b rather than u and i. To this end, the vector of unknowns in (9.30) is replaced by the right hand side in (9.94)
342
9 Continuous Multidimensional Systems
⎡ l 1 ∂ x l∂t Rp Rp a 1 ⎢⎢⎢⎢ ∂ x + Rp ∂t . = ⎢⎣ 1 2Rp c∂t ∂ x 1 −1 b 2 Rp (∂ x + Rp c ∂t )
⎤ ⎥⎥⎥ a ∂ x − Rlp ∂t 0 ⎥⎥⎦ = . 1 b 0 (−∂ + R c∂ ) x p t Rp (9.95)
The resulting matrix of differential operators adopts a simple form when the yet unspecified port resistance Rp is chosen as the real-valued field impedance Zf of the lossless transmission line from (9.44). With the propagation speed v from (9.8) follows √ 1 l 1 l = Zf , v= √ , .Rp = (9.96) = Rp c = lc = , c Rp v lc such that the vector differential equation (9.95) is expressed only in terms of field impedance Zf and propagation speed v ⎤ ⎡ 1 ∂ x − 1v ∂t ⎥⎥⎥ a(x, t) 1 ⎢⎢⎢⎢ ∂ x + v ∂t 0 ⎥ . (9.97) = . ⎥ ⎢ 0 2 ⎣ Z1 (∂ x + 1v ∂t ) − Z1 (∂ x − 1v ∂t )⎦ b(x, t) f f This homogeneous partial differential equation is solved by the variables a and b, if they are waves similar to (9.11) a(x, t) = fa (x − vt),
.
b(x, t) = fb (x + vt) .
(9.98)
Then the first equation holds for any pair of differentiable functions fa (x − vt) and fb (x + vt) since
.
∂ x + 1v ∂t fa (x − vt) + ∂ x − 1v ∂t fb (x + vt)
= fa (x − vt) − fa (x − vt) + fb (x + vt) − fb (x + vt) = 0,
(9.99)
and similar for the second equation in (9.97). Due to this property, the variables a and b from (9.94) are called wave variables. In particular, they are voltage waves. Also current waves and power waves are defined in a similar way. Power waves have already been encountered in Example 9.11, see (9.92). The distinct property of wave variables becomes apparent in comparison to the voltage on a lossless transmission line discussed in Sect. 9.1.4. Eq. (9.11) shows a general solution of the wave equation (9.7) which consists of two components: A wave form f1 (x − vt) travelling into the positive x-direction and f2 (x − vt) travelling into the negative x-direction. The general solution for the voltage is a superposition of both components. Similar expressions follow for the current on a lossless transmission line, e.g., from Eq. (9.70) for r = 0 and g = 0. However, Eq. (9.98) shows that the wave variable a(x, t) is a wave-like solution of (9.97) which travels only into the positive x-direction, while b(x, t) represents waves travelling only into the negative x-direction. Thus the wave variables a and b nicely separate the travel directions while both directions are present in the voltage u and in the current i. This superposition is also reflected in the right hand side of (9.94).
9.5 General Solution
343
Wave variables are useful for the description of all kinds of waveguides either for acoustic waves, microwaves or light waves. Moreover, discrete versions of the wave variables gave rise to the wave digital principle introduced by Fettweis3 with applications in digital filters as wave digital filters, in multidimensional systems, and in modelling of nonlinear circuits [3, 4, 10, 19].
9.5 General Solution This section presents a very general solution of the subclass of vector-valued partial differential equations described in Sect. 9.3. It serves to introduce the basic approach that is pursued in Chap. 10.
9.5.1 Review of One-Dimensional Systems Initial-value problems for one-dimensional problems have been discussed in Sect. 3.4. In particular, first order differential equations and higher order differential equations in state space representation have been presented in Sects. 3.4.1 and 3.4.2, respectively. Their solutions have been found in Eqs. (3.109) and (3.115) for the first order case and in (3.141) and (3.157) for the state space representation. The analysis of the one-dimensional case in Sect. 3.4.1 is relatively simple. The state space representation in Sect. 3.4.2 is more involved but there are helpful tools: Laplace transformation deals with time differentiation and initial conditions, and a similarity transformation with the matrix K diagonalizes the matrix exponential. Sections 3.4.2 has interpreted the similarity transformation as a signal transformation T with basis vectors that are not necessarily orthogonal. The effect of the signal transformation T is reviewed here once more for the homogeneous state equation (3.119). It turns out that the transformation T carries the nucleus from which a different and more powerful transformation for the spatial variables of continuous multidimensional systems emerges. The homogeneous state equation follows from (3.116) with zero input u(t) = 0 .
d x(t) = A x(t), dt
x(0) = xi .
(9.100)
The eigenvectors of the state matrix A are collected in the matrix K which not only diagonalizes A according to (3.133) but also establishes the matrix exponential in (3.135) K −1 , A = K DK
.
3
Alfred Fettweis, 1926–2015.
eA t = K eDtK −1 .
(9.101)
344
9 Continuous Multidimensional Systems
The solution of the initial-value problem (9.100) adopts the concise form x(t) = P(t){xi } = K eDtK −1 xi = T −1 {eDt T {xi }} .
.
(9.102)
Note that the diagonal matrix D contains the eigenvalues of the matrix A , while the transformation T is defined in terms of the eigenvectors of A .
9.5.2 Solution of Multidimensional Systems The short compilation of the solution of the homogeneous state equation (9.100) highlights a possible extension from the one-dimensional state space representation to continuous multidimensional systems. This section envisions a similar transformation for the initial-boundary-value problems introduced in Sect. 9.3. The homogeneous form of the system of equations (9.25) with the initial condition (9.53) shows some similarities with the homogeneous state equation (9.100) C
.
∂ y(x, t) = L y(x, t), ∂t
y(x, 0) = yi (x) .
(9.103)
There is an additional weighting matrix C, the capacitance or mass matrix, and the state matrix A is replaced by the spatial differential operator L from (9.26). In parallel to (9.102), imagine now a transformation T , which solves (9.103) as y(x, t) = P(t){yi (x)} = T −1 {eDt T {yi (x)}} .
.
(9.104)
While the matrix A in (9.100) acts on the state vector x(t), the spatial differentiation operator L acts on the function y(x, t). Therefore, the eigenvalue problem for L calls for a decomposition into eigenfunctions rather than eigenvectors. Pursuing the parallelity to the state space problem (9.100) further, one might conjecture that the matrix D consists of the eigenvalues of the spatial differentiation operator L and that the transformation T is defined in terms of the associated eigenfunctions. Setting up the corresponding eigenvalue problem for L would also have to consider the boundary conditions and the weighting matrix C. Note that the existence of a solution in the form of (9.104) is purely speculative at this point. The investigation of the eigenvalue problem for L and its usefulness for the definition of a suitable signal transformation T are explored in Chap. 10.
9.6 Problems 9.1. Show that the system of first-order partial differential equations .
y˙ 1 (x, t) + py1 (x, t) y˙ 2 (x, t)
+ vy2 (x, t) = 0, + vy1 (x, t) = 0,
References
345
is equivalent to the second-order partial differential equation y¨ (x, t) + p˙y(x, t) − v2 y (x, t) = 0
.
for either y(x, t) = y1 (x, t) or y(x, t) = y2 (x, t). Show this result by direct elimination and, alternatively, by setting up the matrix operator D xt . 9.2. Write the system of first-order differential equations from Problem 9.1 in the form of Eqs. (9.25) and (9.26) and indicate the vector y(x, t), the matrices A, B, C, and the spatial differentiation operator L. Establish the boundary operator Fb (x) at the boundary points x = 0 and x = for the boundary conditions y(0, t) = y(, t) = 0. Establish the vector yi (x) for general inhomogeneous initial conditions. Various aspects of this initial-boundary-value problem (IBVP) are considered in further problems. 9.3. Consider the partial differential equation (8.33) for the flexible string with q(x, t) = 0. Convert this scalar equation into a vectorial partial differential equation. Verify the outcome by converting the resulting matrix differential operator D xt back into the scalar form. Hint: To narrow down the multitude of possible solutions, use Eq. (9.93) in Example 9.11 as a guideline. 9.4. Determine sμ such that y(x, t) = e sμ t sin μ π x with μ ∈ N is a solution of y¨ (x, t) + p˙y(x, t) − v2 y (x, t) = 0,
.
for
0 < p < 2π
π .
9.5. Consider the equations for the diffusion of particles in isotropic media in one spatial dimension with the diffusion flux q(x, t), the particle concentration c(x, t) and the constant diffusivity Ddif q(x, t) = −Ddif
.
∂ c(x, t), ∂x
∂ ∂ c(x, t) = − q(x, t) . ∂t ∂x
• Set up the vector of unknowns y(x, t), the matrix C and the spatial differentiation operator L according to (9.25) for f e (x, t) = 0. • Convert the two equations into one scalar equation for the concentration c(x, t) and determine the coefficients of the general form (9.84).
References 1. Rabenstein, R., Sch¨afer, M.: Multidimensional Signals and Systems: Applications. Springer Nature, Heidelberg, Berlin (to appear) 2. Avanzini, F., Marogna, R.: A modular physically based approach to the sound synthesis of membrane percussion instruments. IEEE Transactions on Audio, Speech, and Language Processing 18(4), 891–902 (2010). https://doi.org/10. 1109/TASL.2009.2036903
346
9 Continuous Multidimensional Systems
3. Bernardini, A., Werner, K.J., Smith, J.O., Sarti, A.: Generalized wave digital filter realizations of arbitrary reciprocal connection networks. IEEE Transactions on Circuits and Systems I: Regular Papers 66(2), 694–707 (2019) 4. Bilbao, S.: Wave and Scattering Methods for Numerical Simulation. John Wiley & Sons, Chichester, UK (2004) 5. Blauert, J., Xiang, N.: Acoustics for Engineers. Springer-Verlag, Berlin (2009) 6. Brezis, H.: Functional Analysis, Sobolev Spaces and Partial Differential Equations. Springer (2011) 7. Bronshtein, I., Semendyayev, K., Musiol, G., M¨uhlig, H.: Handbook of Mathematics. Springer-Verlag, Berlin (2015) 8. Buchberger, B., Kauers, M.: Groebner basis. Scholarpedia 5(10), 7763 (2010). https://doi.org/10.4249/scholarpedia.7763. Revision #128998 9. Debnath, L.: Nonlinear Partial Differential Equations for Scientists and Engineers. Birkh¨auser, Basel (2012) 10. Fettweis, A.: Wave digital filters: Theory and practice. Proceedings of the IEEE 74(2), 270–327 (1986) 11. Franke, D.: Systeme mit o¨ rtlich verteilten Parametern. Eine Einf¨uhrung in die Modellbildung, Analyse und Regelung. Hochschultext. Springer, Berlin u.a. (1987) 12. IEEE Std 1901-2010, Standard for broadband over power line networks: Medium access control and physical layer specifications (2010) 13. Kim, Y.H.: Sound Propagation. John Wiley & Sons (Asia), Singapore (2010) 14. Kurokawa, K.: Power waves and the scattering matrix. IEEE Transactions on Microwave Theory and Techniques 13(2), 194–202 (1965) 15. Sauvigny, F.: Partial Differential Equations 2. Springer (2012) 16. Sch¨afer, M., Rabenstein, R., Strobl, C.: A multidimensional transfer function model for frequency dependent transmission lines. In: 2017 IEEE International Symposium on Circuits and Systems (ISCAS), pp. 1–4 (2017) 17. Sch¨afer, M., Schlecht, S.J., Rabenstein, R.: Feedback structures for a transfer function model of a circular vibrating membrane. In: Proc. IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 1–5. New Paltz, NY (2019) 18. Sch¨afer, M., Wicke, W., Rabenstein, R., Schober, R.: An nd model for a cylindrical diffusion-advection problem with an orthogonal force component. In: 2018 IEEE 23rd International Conference on Digital Signal Processing (DSP), pp. 1–5 (2018) 19. Schwerdtfeger, T., Kummert, A.: Nonlinear circuit simulation by means of Alfred Fettweis’ wave digital principles. IEEE Circuits and Systems Magazine 19(1), 55–C3 (2019) 20. Strang, G.: Linear Algebra and its Applications, 4 edn. Thomson, Brooks/Cole (2006) 21. Strauss, W.A.: Partial Differential Equations. John Wiley & Sons (2008)
Chapter 10
Sturm-Liouville Transformation
This chapter revisits the topic of multidimensional transformations that has been investigated first in Chap. 6. Further, Sect. 9.5 has reviewed the diagonalization of the state space representation by a similarity transformation and its interpretation as a signal transformation originally introduced in Sect. 3.4.2. In addition, an extension of these ideas to the solution of initial-boundary-value problems has been envisioned. Here, these lines of thought are continued by applying signal transformations to initial-boundary-value problems. In particular, a transformation T is established for the class of initial-boundary-value problems discussed in Sect. 9.3 and its relations to the propagator P(t) (see Sect. 3.4.2) and to Green’s function are explored. Some basic notions are first discussed by an introductory example in Sect. 10.1. Then the spatial differentiation operators involved in initial-boundary-value problems are investigated further in Sect. 10.2 as a prerequisite to the introduction of the spatial transformation in Sect. 10.3. The corresponding operations in the space domain are described by Green’s functions in Sect. 10.4 which lead to the corresponding propagator in Sect. 10.5. Section 10.6 provides a review of the procedure developed so far. Although derived for initial-boundary-value problems with constant coefficients, the presented method works also for space-dependent coefficients as shown in Sect. 10.7. Indeed, this is the form of the classical Sturm-Liouville problems reviewed finally in Sect. 10.8.
10.1 Introductory Example The flexible string from Sect. 8.3.1 is reconsidered here. Its dynamic behavior is described by the wave equation (8.33) which has been quickly restricted to the static case in (8.34). Here the dynamic case is reexamined with an additional damping term to account for the loss of kinetic energy.
© Springer Nature Switzerland AG 2023 R. Rabenstein, M. Sch¨afer, Multidimensional Signals and Systems, https://doi.org/10.1007/978-3-031-26514-3 10
347
348
10 Sturm-Liouville Transformation
10.1.1 Physical Problem The physical problem of a flexible string in motion is described by the following initial-boundary-value problem y¨ (x, t) + dy˙ (x, t) − v2 y (x, t) = 0,
.
y(x, 0) = yi (x), y˙ (x, 0) = 0, y(0, t) = 0, y(, t) = 0,
x ∈ L,
(10.1)
x ∈ ∂L.
(10.2)
Time and space cordinates are t and x and for ease of notation, the first and second order time derivatives are denoted by one or two dots and the second order space derivative is denoted by a double prime .
∂2 y(x, t) = y¨ (x, t), ∂t2
∂ y(x, t) = y˙ (x, t), ∂t
∂2 y(x, t) = y (x, t). ∂x2
(10.3)
The term dy˙ (x, t) denotes the influence of damping caused by friction and v is the propagation velocity for the lossless case, compare Eqs. (9.7) and (9.8). The initial-boundary-value problem is completed by initial conditions for t = 0 and by boundary conditions for both ends of the string at x = 0 and at x = . The initial conditions require an initial deflection yi (x) and an initial velocity y˙ (x, 0) of zero. The partial differential equation and the initial values in (10.1) are defined on the interval L = (0, ), while the boundary ∂L consists here of the endpoints of the interval [0, ], see Table 9.2 for n = 1. y(x, t0 )
x=0
x x=ℓ
Fig. 10.1 Oscillation y(x, t0 ) of a flexible string of length as function of the space coordinate x at an arbitrary point in time t = t0
An arbitrary sample solution to the initial-boundary-value problem (10.1) and (10.2) is shown in Fig. 10.1. The boundary conditions are satisfied by the fixing at both ends. In between, the deflection of the string exhibits a shape determined by the initial condition, the excitation and the partial differential equation.
10.1 Introductory Example
349
10.1.2 Laplace Transformation For linear differential equations with constant coefficients, the Laplace transformation is a suitable tool not only to remove the time derivatives but also to consider the initial conditions, see Sect. 4.5.9.1. These effects apply also to the partial differential equation (10.1). The first and second order time derivatives turn into multiplications with the complex frequency variable s, while the respective differentiation theorems consider the initial deflection y(x, 0) and the initial velocity y˙ (x, 0). With the initial values from (10.1) follows .
s2 Y(x, s) + sd Y(x, s) − v2 Y (x, s) = (s + d) yi (x), Y(0, s) = 0, Y(, s) = 0,
x ∈ L, x ∈ ∂L.
(10.4) (10.5)
The boundary conditions (10.2) are now expressed in terms of the Laplace transforms Y(0, s) and Y(, s) at the boundary. Application of the Laplace transformation has simplified the initial-boundaryvalue problem (10.1) and (10.2) considerably: The partial differential equation in (10.1) with derivatives w.r.t. time and space has been turned into a boundaryvalue problem with space derivatives only. The initial values in (10.1) now show up as additive terms on the right hand side of (10.4). What is left as a separate restriction are the boundary conditions (10.5). It is worthwhile to investigate why the Laplace transformation is the method of choice when it comes to dealing with initial conditions like in (10.1). It turns out that the integration range of the one-sided Laplace transformation from 0 to ∞ matches the interval [0, ∞) of the time axis, where the initial-boundary-value problem is defined. As a consequence, the differentiation theorem of the Laplace transformation includes the initial value at t = 0 as an additive term. This beneficial effect of the Laplace transformation suggests to look for a transformation w.r.t. the space variable with similar properties: It should remove the space derivative and include the boundary conditions as additive terms.
10.1.3 Finite Fourier-Sine Transformation To find a suitable transformation for the spatial coordinate, the Laplace transformation may serve as a guideline. The idea of matching the integration range to the time interval of an initial-value problem is easily transferred to the spatial coordinate: A transformation for the spatial variable should have an integration range that corresponds to the spatial interval L where the initial-boundary-value problem (10.1) is defined, i.e. the integration ranges from 0 to . Having fixed the integration range of a spatial integral transformation, it remains to specify the transformation kernel. A suitable kernel can be found by physical intuition: From the acoustics of musical instruments is known that the deflection of a string is a superposition of modes: The lowest mode determines the fundamental
350
10 Sturm-Liouville Transformation
frequency and thus the pitch, while the higher modes determine the timbre of the sound. The wavelengths of the modes are determined by the length of the string. The lowest mode has zeros only at the end points and the higher modes have an increasing number of zeros in between. Figure 10.2 shows the sinusoidal functions sin μπ x for μ = 1, 2, 3, where μ = 1 is the fundamental mode and μ = 2, 3 the first two harmonics. Note that each sinusoid satisfies the boundary conditions (10.2) and so does any superposition of these sinusoids. sin π ℓx
sin 2π ℓx
1 0
sin 3π ℓx
1 0
x
ℓ
0
1 0
ℓ
x
0
0
ℓ
x
Fig. 10.2 Basic shapes of the fundamental mode (left) and the first two harmonics (center and right) of string vibration in Fig. 10.1
Definition The integration range from 0 to and the sine functions as transformation kernel lead directly to the definition of the finite Fourier sine transformation and its inverse x Y(x, s) sin μπ dx, 0 ∞ ¯ s)} = Y(x, s) = 2 ¯ s) sin μπ x . T −1 {Y(μ, Y(μ, μ=1 ¯ s) = T {Y(x, s)} = Y(μ,
.
μ ∈ N,
(10.6) (10.7)
¯ s) for the The forward transformation (10.6) delivers the expansion coefficients Y(μ, series expansion of the inverse transformation (10.7). The designation finite transformation refers to the finite integration range and sine transformation refers to the transformation kernel. The name Fourier emphasizes the relation to the Fourier-type transformations in Sect. 4.5, in particular to the Fourier series (Sect. 4.5.3). Indeed, for spatial functions with period 2 and odd symmetry within each period holds (the dependence on s is omitted for the moment) .
0
x x 1 V(x) e−iμπ dx, Y(x) sin μπ dx = 2 − ∞
x x 2¯ ¯ Y(μ) sin μπ = V(μ) eiμπ , μ=1 μ=−∞ ∞
for the periodically extended functions
(10.8)
(10.9)
10.1 Introductory Example
⎧ ⎪ ⎪ i Y(x) 0 < x < , ⎪ ⎪ ⎪ ⎨ .V(x) = ⎪−V(−x) − < x < 0, ⎪ ⎪ ⎪ ⎪ ⎩V(x + 2) else,
351
⎧ 1 ¯ ⎪ ⎪ 0 < μ < ∞, ⎪ i Y(μ) ⎪ ⎪ ⎨ ¯ V(μ) =⎪ 0 μ = 0, ⎪ ⎪ ⎪ ⎪ ⎩−V(−μ) ¯ −∞ < μ < 0 .
(10.10)
Due to the similarity with the Fourier series, also the finite Fourier sine transformation (10.6) has a discrete spectrum, denoted by the discrete frequency variable μ ∈ N. The lowest value μ = 1 corresponds to the half-wave on the left hand side of Fig. 10.2. The next possible sine-wave to satisfy the boundary conditions results for μ = 2 with one full period in the interval [0, ]. Then follow three half-waves for μ = 3 and so on. Any noninteger values for μ would describe sine-waves which violate the boundary condition (10.5) and do not contribute to a solution. Differentiation Theorem Application of the finite Fourier sine transformation to the partial differential equation (10.4) requires to express the transform of the term Y (x, s) with the second order spatial differentiation by the transform T {Y(x, s)} = ¯ s). Such a relation is known from the Laplace transformation as a differentiation Y(μ, theorem. For the finite Fourier sine transformation, the differentiation theorem is obtained by twofold application of integration by parts (see also Sect. 10.2.2.2) .
0
π 2 x x Y (x, s) sin μπ dx = − μ Y(x, s) sin μπ dx (10.11) 0 x
x π + Y (x, s) sin μπ − μ Y(x, s) cos μπ
. 0
Integration by parts removes the spatial differentiation from the term Y (x, s) and shifts it to the transformation kernel. This way the sine function turns into a negative sine function such that the second integral in (10.11) evolves as T {Y(x, s)} times a negative multiplier. The first term in brackets vanishes since the sine functions are zero at the boundary and the second term vanishes as well since Y(x, s) satisfies the boundary conditions (10.5). The result is the theorem for second order differentiation π 2 π 2 ¯ s) . .T {Y (x, s)} = − μ (10.12) T {Y(x, s)} = − μ Y(μ, It is useful for turning the differential equation (10.4) into an algebraic one.
10.1.4 Transfer Function Description Application of the finite Fourier sine transform (10.6) to the differential equation (10.4) removes the spatial derivatives by virtue of the differentiation theorem (10.11). On the right hand side, the initial condition turns into T {yi (x)} = y¯ i (μ). The result is a purely algebraic equation
352
10 Sturm-Liouville Transformation .
¯ s) + ds Y(μ, ¯ s) + ω ¯ s) = (s + d) y¯ i (μ), s2 Y(μ, ˆ 2μ Y(μ,
ω ˆ μ = μπ
v .
(10.13)
The frequency variable ω ˆ μ results from applying the differentiation theorem as T {v2 Y (x)}, where v is the propagation velocity for the lossless case. Note that ω ˆ μ has the physical unit of a temporal frequency due to the units of the velocity v and the length . ¯ s) expresses the solution in the Solving the algebraic equation (10.13) for Y(μ, ¯ s) and the spatemporal and spatial frequency domain by the transfer function H(μ, tial transform y¯ i (μ) of the initial condition ¯ s) = H(μ, ¯ s) y¯ i (μ), Y(μ,
.
¯ s) = H(μ,
s − (sμ + s∗μ ) s+d . = s2 + ds + ω ˆ 2μ (s − sμ )(s − s∗μ )
(10.14)
The transfer function depends either on the physical parameters d for damping and ω ˆ μ from (10.13) or on the pair of complex conjugate poles sμ and s∗μ , where s = σ + iωμ ,
. μ
σ=−
d , 2
ωμ =
ω ˆ 2μ − σ2 ,
ω ˆ μ > |σ| .
(10.15)
The condition for the existence of a complex conjugate pole pair is revisited in Sect. 10.1.7. Equation (10.14) provides the key to the solution of the initialboundary-value problem (10.1) and (10.2), however concealed in the temporal and ¯ s). The solution y(x, t) in the space-time domain spatial frequency domain as Y(μ, requires two more steps: Inverse Fourier sine transformation into the space domain and inverse Laplace transformation into the time domain.
10.1.5 Inverse Fourier Sine Transformation The transformation from the spatial frequency domain back into the space domain is accomplished by applying the inverse Fourier sine transformation from (10.7) to (10.14). Summing all contributions for μ ∈ N gives ¯ s) y¯ i (μ)} = 2 ¯ s) y¯ i (μ) sin μπ x . Y(x, s) = T −1 {H(μ, H(μ, μ=1 ∞
.
(10.16)
The result is a superposition of the kernel functions from Fig. 10.2 weighted with ¯ s) from (10.14). the expansion coefficients Y(μ,
10.1.6 Inverse Laplace Transformation The only dependency on the temporal frequency variable s in the solution Y(x, s) ¯ s). It is therefore sufficient to apply an inverse occurs in the transfer function H(μ,
10.1 Introductory Example
353
¯ s). A reasonable first step is an expansion of (10.14) Laplace transformation to H(μ, into two partial fractions. Termwise inverse Laplace transformation and collecting ¯ t), one for each mode μ the results gives a set of impulse responses h(μ, ⎧ ⎪ ⎪ ∗ 1 ⎨hμ eσt cos(ωμ t + ϕμ ) t ≤ 0, −1 s t ∗ s t −1 ¯ ¯ t) = i sμ e μ + isμ e μ = ⎪ .L {H(μ, s)} = h(μ, ⎪ ⎩0 2ωμ t < 0. (10.17) The amplitude hμ and the phase ϕμ are given by h =
. μ
σ 2 ωμ
+ 1,
σ
.
(10.18)
μ = 1, . . . , ∞ .
(10.19)
ϕμ = arctan
ωμ
Applying the inverse Laplace transformation to (10.14) results in ¯ t) y¯i (μ), y¯ (μ, t) = h(μ,
.
¯ 0) = 1, ∀μ such that y¯ (μ, 0) = y¯i (μ), as expected from (10.1). FigNote that h(μ, ¯ t) as functions of time. ure 10.3 shows the first few modal impulse responses h(μ, ¯ t) h(1,
1
0 0
¯ t) h(2,
1
T1
t
t
0 0
¯ t) h(3,
1
T1
t
0 0
T1
¯ t) from Eq. (10.17) for μ = 1, 2, 3. The time axis is labelled Fig. 10.3 Modal impulse responses h(μ, by T 1 from Eq. (10.23)
10.1.7 Solution in the Space-Time Domain The last step to the solution in the space-time domain is the inverse Laplace trans¯ t) have already been deterformation of (10.16). The modal impulse responses h(μ, ¯ s) in the series expansion (10.16) mined in (10.17) and adopt the place of H(μ, y(x, t) = L −1 {Y(x, s)} =
.
2 ¯ t) sin μπ x y¯i (μ)h(μ, μ=1 ∞
x 2 . y¯i (μ)hμ eσt cos(ωμ t − ϕμ ) sin μπ μ=1 ∞
=
(10.20)
354
10 Sturm-Liouville Transformation [ht]
yi (x)
1 ℓ
0
x0
0 y¯i (μ) =
y¯i (1)
t
T1
t
0 0
x ℓ
0
T1
y¯(3, t)
sin 2π ℓx
ℓ
t
0
T1
sin 3π ℓx
1 0
¯ t) h(3,
y¯(2, t)
1 0
y¯i (3)
1
y¯(1, t)
sin
{yi (x)}
¯ t) h(2,
1
0 0
x
y¯i (2)
¯ t) h(1,
1
ℓ
x
0
1 0
ℓ
x
0
0
ℓ
x
2 ℓ y(x, t0 )
x=0
x=ℓ
x
Fig. 10.4 From top down: decomposition of an initial condition yi (x) into expansion coefficients y¯ i (μ) according to (10.6) and reconstruction of the oscillation y(x, t) from modal impulse re¯ t) and transformation kernels sin μπ x according to (10.20). The process is shown sponses h(μ, here for the first three modes μ = 1, 2, 3
Figure 10.4 shows the complete process for obtaining a solution to the initialboundary-value problem (10.1) and (10.2). The shape of the initial value yi (x) on
10.1 Introductory Example
355
top resembles an initial deflection at a single arbitrary point x0 . The first step is the calculation of the coefficients y¯ i (μ) of the Fourier sine transformation from (10.6). Their values for this particular shape are given in Example 10.1. The further steps proceed as shown in the first sum in (10.20): Each coeffi¯ t) as a function of time cient y¯ i (μ) is multiplied by the modal impulse response h(μ, from (10.17) and Fig. 10.3. Then the result is multiplied by the transformation kernels shown in Fig. 10.2 as functions of space. It remains to sum all terms and scale by 2 according to (10.20). Example 10.1 (Fourier Sine Transform of the Initial Condition). The initial condition yi (x) from the top of Fig. 10.4 is described by (10.1) and ⎧ 1 x ⎪ ⎪ 0 ≤ x ≤ x0 , ⎪ ⎨ x0 .yi (x) = ⎪ (10.21) ⎪ ⎪ ⎩ 1 −x x0 ≤ x ≤ . −x0
The coefficients y¯ i (μ) of the Fourier sine transformation follow from (10.6) y¯ (μ) =
. i
1 1 1 sin(μπξ0 ), 2 2 μ π ξ0 (1 − ξ0 )
ξ0 =
x0 ,
μ≥1,
(10.22)
with standard integration techniques.
It remains to investigate the meaning of the relation ω ˆ μ > |σ| in (10.15). To simplify the presentation, a stronger requirement is assumed: For the fundamental ˆ 1 . Then express σ by the ˆ 1 d2 such that ω1 ≈ ω frequency ω1 with μ = 1 let ω time constant T σ of the exponential decay in (10.20) and ω1 by the period T 1 of the fundamental frequency σ=−
.
1 d =− , 2 Tσ
ω1 =
2π v ≈π . T1
(10.23)
Now the relation between ω1 and the absolute value of σ turns into .
Tσ ω1 v = 2π = 2π 1, d T1 |σ|
(10.24)
i.e., the time constant T σ of the exponential decay of the amplitude is much larger than the period T 1 of the fundamental frequency. In short: The assumption in (10.15) means that the flexible string vibrates for many periods before it fades out.
10.1.8 Review This introductory example shows that the initial-boundary-value problem (10.1) and (10.2) is effectively solved by two suitable functional transformations:
356
10 Sturm-Liouville Transformation
• The one-sided Laplace transformation converts the time derivatives into multiplications with the frequency variable s and takes care of the initial conditions. • The finite Fourier sine transformation converts the space derivatives into multiplications with a term containing the index μ of the spatial modes (see (10.12)) and considers the initial conditions. The resulting transformation process is clearly structured as shown in Fig. 10.4. Indeed, the decomposition of the initial condition yi (x) into its coefficients y¯ i (μ) by the Fourier sine transformation T according to (10.6) represents an analysis equation in the sense of Sect. 4.4.3. This interpretation is underlined by the close relation between the finite Fourier sine transformation to the Fourier series (see (10.8) and (10.9) and Sect. 4.5.3). The lower part of Fig. 10.4 generates the solution y(x, t) from the expansion coefficients y¯ i (μ). It realizes the inverse Fourier sine transformation T −1 according to (10.20) and represents the synthesis equation from Sect. 4.4.3. The fact that the problem from Sect. 10.1.1 is solved by following the general principle from Sect. 4.4.3, suggests that also more general initial-boundary-value problems can be solved by suitable functional transformations. However, the choice of this simple example has concealed some difficulties associated with a general application to more complex problems: • The physical problem from Sect. 10.1.1 is reduced to one spatial dimension. Initial-boundary-value problems in two and three spatial dimensions on irregular domains Ψ call for spatial transformations T different from the Fourier sine transformation. • The boundary conditions (10.2) are of a particular simple form, namely homogeneous Dirichlet boundary conditions according to Table 9.3. Therefore the choice of the transformation kernel in Sect. 10.1.3 has been based on an educated guess. More complex boundary conditions in two or three dimensions require a systematic way to derive the transformation kernel of the spatial transformation. • The key feature of the Fourier sine transformation is the differentiation theorem (10.11) derived through integration by parts. Corresponding relations for two or three spatial dimensions must be derived for transformation kernels that are not known a priori. • The initial-value-problem (10.1) is driven only by the initial value yi (x). There is no excitation function fe (x, t) and no non-zero boundary values φb (t). Thus the solution (10.20) is equivalent only to the zero-input case from (9.100). The equivalence to the response to an input function as in Sect. 3.4.2.9 needs to be established. • The general form of a solution to multidimensional systems as conjectured in (9.104) contains a matrix exponential with the diagonal matrix D. However, there is no obvious matrix exponential in the solution (10.20). To overcome these restrictions, a more general approach is presented in Sects. 10.2 and 10.3.
10.2 Spatial Differentiation Operators
357
10.2 Spatial Differentiation Operators This section follows the tracks laid out in the introductory example in Sect. 10.1. However, instead of the special physical problem from Sect. 10.1.1, it discusses the more general class of initial-boundary-value problems for vector-valued variables introduced in Sect. 9.3. This way, the restrictions listed in Sect. 10.1.8 are addressed. Of particular interest is the choice of the transformation kernel for the spatial transformation. Note, that even for the simple problem in Sect. 10.1.1, there is no one-size-fits-all transformation kernel. The sine function in (10.6) depends on the length of the flexible string and thus on the spatial domain. In two or three spatial dimensions, the variability of the spatial domains increases and thus the variability of the corresponding transformation kernels. Educated guesses for a suitable kernel as in Sect. 10.1.3 will no longer suffice and a general method to the construction of transformation kernels is needed. The approach followed here uses elements of the Sturm-Liouville1 theory [2, 24, 27], which has been developed for the solution of certain boundary-value problems, so-called Sturm-Liouville problems. A historical account is found e.g. in [3]. Similar approaches are known in control theory [7, 14, 19, 20].
10.2.1 Initial-Boundary-Value Problem The vector formulation of the initial-boundary-value problem from Sect. 9.3 is compiled here in the time domain and in the temporal frequency domain. This compilation serves to formulate objectives for a suitable spatial transformation.
10.2.1.1 Time-Domain Formulation The general form of an initial-boundary-value problem for vector-valued variables in the time domain has been discussed in Sect. 9.3. The partial differential equation from (9.25), the initial conditions from (9.53) for t0 = 0, and the boundary conditions from (9.54) are compiled here ∂ − L y(x, t) = f e (x, t), . C x ∈ Ψ, t > 0, (10.25) ∂t y(x, 0) = yi (x), Fb (x) y(x, t) = φb (x, t), H
x ∈ Ψ, x ∈ ∂Ψ,
t = 0, t > 0.
The spatial differentiation operator L is given by (9.26) and (9.27) as
1
Jacques Charles Franc¸ois Sturm (1803–1855), Joseph Liouville (1809–1882)
(10.26) (10.27)
358
10 Sturm-Liouville Transformation
L = A + B∇,
.
B∇ =
n
B xν
ν=1
∂ . ∂xν
(10.28)
The boundary operator Fb (x) is introduced in Sect. 9.3.3. The variables at the right hand sides of (10.25)–(10.27) are the excitation function f e (x, t), the initial value yi (x), and the boundary value φb (x, t). Note that (10.25) reflects the structure of the ordinary differential equation (3.99) in Sect. 3.4.1.
10.2.1.2 Frequency-Domain Formulation The initial-boundary value problem (10.25)–(10.27) turns into a boundary-value problem by application of the Laplace transformation. Its differentiation theorem replaces the time derivative in (10.25) by a multiplication with the temporal frequency variable s and includes the initial value yi (x) as an additive term on the right hand side of the partial differential equation (compare Sect. 10.1.2). The boundary condition (10.27) is formulated in the temporal frequency domain which does not affect the boundary operator Fb (x). .
(sC − L) Y(x, s) = Fe (x, s) + Cyi (x), Fb (x) Y(x, s) = Φb (x, s), H
x ∈ Ψ, x ∈ ∂Ψ.
(10.29) (10.30)
The result is a boundary-value problem which depends on the complex frequency variable s.
10.2.1.3 Objectives for a Spatial Transformation The Laplace transformation is known to turn linear ordinary differential equations with constant coefficients into algebraic equations, see Sect. 3.4.2.8. Its effect on partial differential equations is demonstrated by (10.29) and (10.30), where Laplace transformation removes temporal derivatives and includes initial values. However, the spatial derivatives in (10.29) and the boundary conditions (10.30) remain. The introductory example in Sect. 10.1 shows that a suitable spatial transformation (the finite Fourier sine transformation) has a similar effect on the spatial derivatives: It removes the spatial derivatives and considers the boundary conditions as in (10.13). Since the boundary conditions (10.2) are homogeneous, no boundary value term shows up on the right hand side of (10.13). However, the transformation T in (10.6) is customized to the problem described in Sect. 10.1.1 since the differentiation theorem (10.12) is taylored to the second order spatial derivative in (10.4) and the Dirichlet boundary conditions (10.5). Nevertheless, the finite Fourier sine transformation is helpful in specifying the objectives for a spatial transformation which can handle the general problem specified by (10.25)–(10.27): The aim is to design a spatial transformation T which • removes the spatial derivation operator L in (10.29) and
10.2 Spatial Differentiation Operators
359
• includes the boundary value Φb (x, s) as an additive term. To achieve this aim, a differentiation theorem for T {LY(x, s)} is required which • replaces the operation LY(x, s) by a multiplication of Y(x, s) with a factor and • evaluates the boundary condition (10.30) by integration on x ∈ ∂Ψ . A possible approach to find such a transformation is anticipated in Sect. 3.4.2.5 for systems of ordinary differential equations in state space representation. A multiplication by the state matrix A in (3.132)—a linear operation—is replaced by the multiplication by a factor—an eigenvalue. The operator L from (10.28) contains also a matrix and, in addition, spatial differentiations. A promising approach is therefore to investigate the eigendecomposition of the spatial differentiation operator L.
10.2.2 Spatial Differentiation Operator and its Adjoint Operator As just formulated in Sect. 10.2.1.3, a spatial transformation T shall be designed which deals with the spatial operator L according to (10.28). To this end, this section ˜ explores the properties of L and introduces the corresponding adjoint operator L. Then the eigenvalue problems of both operators are investigated in Sect. 10.2.3.
10.2.2.1 Scalar Product In Sect. 10.1.3 it has been found useful to adapt the integration range of a spatial transformation to the spatial domain for the partial differentiation under inspection. In Sect. 10.2.1 this spatial domain is Ψ with boundary ∂Ψ according to Table 9.2. A reasonable starting point is to define a scalar product by integration over the spatial domain Ψ for any two functions f(x) and g(x) from the Hilbert space of squareintegrable vector-valued functions in the sense of Sect. 4.2.6.3 . f, g = gH (x) f(x) dx . (10.31) Ψ
˜ Now two functions K(x) and K(x) are considered. They are vectors of the same size as Y(x, s) in (10.29). Then the spatial differential operator L from (10.28) is applied to K(x) such that ˜ = AK, K ˜ + B∇K, K ˜ . . LK, K (10.32) The problem to be explored here is whether the vector K(x) can be freed from ˜ the operator L, possibly by introducing an additional operator L˜ for K(x), such ˜ = K, L˜ K ˜ holds. Since L is a differential operator, some form of that LK, K integration by parts might be useful.
360
10 Sturm-Liouville Transformation
10.2.2.2 Integration by Parts in One Dimension Integration by parts has already been used for the differentiation theorem (10.11). Its derivation for one spatial dimension is repeated here as a blueprint for the multidimensional case. The starting point is the product rule for differentiation of the ˜ product of two scalar functions K(x)K(x) where the prime denotes differentiation .
d ˜ = K (x)K(x) ˜ K(x)K(x) + K(x)K˜ (x) . dx
(10.33)
Integration with respect to an arbitrary interval x0 < x < x1 gives .
x1 x1 x1 x1 d ˜ dx = K(x)K(x) ˜
= K (x)K(x) ˜ dx + K(x)K˜ (x) dx . K(x)K(x) x0 dx x x x 0
0
0
(10.34) ˜ It is thus possible to express the integral of K (x)K(x) by the integral of K(x)K˜ (x) ˜ and vice versa provided that the values of K(x) and K(x) at the endpoints x0 and x1 are known. Note that x0 and x1 constitute the boundary ∂L of the 1D domain L = (x0 , x1 ) according to Table 9.2. This idea is now extended to more dimensions.
10.2.2.3 Integration by Parts in n Dimensions ˜ in (10.32) where the Integration by parts is applied to the scalar product B∇K, K derivatives are represented by the operator B∇ from (10.28) .
n ˜ = ˜ H (x) B∇K(x) dx = ˜ H (x)B xν ∂ K(x) dx . B∇K, K K K Ψ Ψ ∂xν ν=1
(10.35)
A suitable integration by parts relation is found from the equivalent to the derivative of a product in 1D (10.33). For the partial derivative w.r.t. a component xν of the vector x holds ∂ H ∂ ∂ ˜H H ˜ ˜ K (x)B xν K(x) = K (x) B xν K(x) + K (x)B xν K(x) . ∂xν ∂xν ∂xν H ∂ ˜ ∂ H H ˜ K(x) K(x) + K (x) B xν K(x) . (10.36) = B xν ∂xν ∂xν The sum of all partial derivatives for ν = 1, . . . , n is expressed with (10.28) as .
n H ∂ ˜H ˜ ˜ H (x) B∇K(x) , K (x)B xν K(x) = BH ∇K(x) K(x) + K ∂xν ν=1
(10.37)
where left hand side is the divergence of a vector U(x) with the elements Uν (x)
10.2 Spatial Differentiation Operators
361 n ∂ ˜H K (x)B xν K(x) = div U(x) . ∂xν ν=1
H
˜ (x)B xν K(x), Uν (x) = K
.
(10.38)
This representation allows to express the integration in (10.35) by the Gauss integral theorem in a similar way as in the 1D counterpart (10.34) . div U(x) dx = U(x) · dA . (10.39) Ψ
∂Ψ
Thus the integration of the domain Ψ is reduced by one dimension to an integration over the boundary ∂Ψ . The scalar argument of the surface integral in (10.38) uses here (and only here) the conventional notation with a dot product, i.e. the scalar product of the vector U(x) and the vector dA which describes an oriented area element of the boundary ∂Ψ . The vector dA = n(x) dA is given by the unit length vector n(x), x ∈ ∂Ψ with the components nν (x) normal to the boundary Ψ (see Fig. 9.6) and by the infinitesimal scalar surface patch dA. Now the argument U(x) · dA turns with (10.38) into U(x) · dA =
n
.
H
˜ (x) Bn (x) K(x) dA, Uν (x) nν (x) dA = K
ν=1
Bn (x) =
n
B xν nν (x) .
ν=1
(10.40) The newly introduced matrix Bn (x) is the sum of the matrices B xν in (10.28) weighted with the components nν (x) of the normal vector at the boundary location x ∈ ∂Ψ . In a concise notation, the surface integral in (10.39) is expressed by a ˜ on the boundary x ∈ ∂Ψ function Φn which depends on the values of K(x) and K(x) ˜. ˜ H (x)Bn (x)K(x) dA = Φn K, K K . U(x) · dA = (10.41) ∂Ψ
∂Ψ
The Gaussian integral theorem (10.39) is now expressed in terms of integrals with (10.37) and (10.38) or in short form with (10.35) and (10.41) H ˜ H (x)Bn (x)K(x) dA = ˜ H (x) (B∇K(x)) dx, ˜ BH ∇K(x) K K . K(x) dx + ∂Ψ Ψ Ψ ˜ ˜ + B∇K, K ˜ . = K, BH ∇K (10.42) Φn K, K Note that Eq. (10.42) corresponds to (10.34) and thus represents the extension of integration by parts to higher dimensions.
10.2.2.4 Adjoint Spatial Operator ˜ can Returning to Eq. (10.32) shows that the derivatives B∇ in the term B∇K, K ˜ since ˜ by application of (10.42). The same holds for AK, K now be moved to K H ˜ = K ˜ H (x) AK(x) dx = ˜ ˜ . (10.43) AH K(x) . AK, K K(x) dx = K, AH K Ψ
Ψ
362
10 Sturm-Liouville Transformation
This result follows also directly from the definition of the scalar product in Sect. 4.2.2 using (4.10) and (4.12). Equations (10.42) and (10.43) are now inserted into (10.32) ˜ , (10.44) ˜ − K, BH ∇K ˜ = AK, K ˜ + B∇K, K ˜ = K, AH K ˜ + Φn K, K LK, . K where the two scalar products on the right hand side are combined into ˜ ˜ ˜ = K(x), L˜ K(x) + Φn K, K , L˜ = AH − BH ∇, . LK(x), K(x)
(10.45)
˜ Thus when shifting the operator L away from K using intewith the operator L. ˜ If in addition, the gration by parts, it reappears in the form of L˜ as operator for K. ˜ = 0 then the relation ˜ satisfy the condition Φn K, K functions K and K ˜ = K, L˜ K ˜ . LK, K (10.46) holds, which has been conjectured after (10.32). To distinguish both operators, L is called the primal operator and L˜ is called the adjoint operator or simply the adjoint. Also the adjoint operator possesses an adjoint. Indeed, from (10.45) follows that the adjoint of the adjoint operator L˜ is the primal operator L. Furthermore, if the adjoint ˜ = K, LK ˜ then operator L˜ is identical to the primal operator L such that LK, K the problem in (10.25)–(10.27) is called a self-adjoint problem. Note that the adjoint is not only defined by (10.45) within the domain Ψ but ˜ = 0 on the boundary ∂Ψ . This dependence on the also by the condition Φn K, K boundary suggests that the adjoint operator L˜ might be useful for the consideration of the boundary conditions (10.30). To explore this point further, the eigenvalue problems for both differential operators are studied in Sect. 10.2.3.
10.2.3 Eigenfunctions of the Spatial Operators The primal operator and the adjoint operator have been introduced in Sect. 10.2.2 for ˜ arbitrary vectors K(x) and K(x) for which the integral in the scalar product (10.31) exists. This variety is now severely restricted to a much smaller set of vectors. The ˜ reward for this restriction is a close relationship between K(x) and K(x) and the ˜ primal and adjoint operators L and L.
10.2.3.1 Eigenvalue Problems This section investigates eigenvalue problems for the primal operator L and the ˜ with K(x) and K(x) ˜ adjoint operator L as the corresponding eigenfunctions. The starting point is the generalized eigenvalue problem for L LK(x, μ) = sμ CK(x, μ) = sμ CKμ (x) ,
.
(10.47)
10.2 Spatial Differentiation Operators
363
with the eigenvalue sμ and the eigenfunction K(x, μ). Since more than one eigenvalue can be expected, eigenvalues and eigenfunctions are indexed by μ where μ ∈ N, μ ∈ N0 or μ ∈ Z, depending on the situation. If L is a matrix like in (10.28), then K(x, μ) is a vector-valued function of x. Sometimes it is preferable to separate the space variable x and the index μ by the alternate notation Kμ (x) = K(x, μ). Equation (10.47) is a generalized eigenvalue problem because it contains the weighting matrix C from (10.25). Including the matrix C and the operator L from the spatio-temporal differentiation operator C ∂t∂ − L in (10.25) ensures a close connection between the eigenfunctions of (10.47) and the solution of the initialboundary-value problem in (10.25)–(10.27). To explore this close connection, insert (10.47) into (10.46) to obtain ˜ ν = Kμ , L˜ K ˜ ν = sμ CKμ , K ˜ ν = Kμ , s∗μ CH K ˜ν . . LKμ , K (10.48) The second equality in (10.48) follows similar to (10.43) with A replaced by sμ C. The third equality follows from (10.46). Comparing both sides of the third equality reveals another generalized eigenvalue problem, this time for the adjoint operator L˜ ˜ ν) = s˜ν CH K(x, ˜ ν) = s˜ν CH K ˜ ν (x), L˜ K(x,
.
(10.49)
where the indexing of the eigenvalues s˜ν and sμ can be aligned such that s˜ν = s∗μ [17]. The result are two generalized eigenvalue problems for the primal operator L and the ˜ with the eigenfunctions K(x, μ) and K(x, ˜ μ) and the eigenvalues adjoint operator L, ∗ sμ and sμ LK(x, μ) = sμ CK(x, μ),
.
˜ μ), ˜ μ) = s∗μ CH K(x, L˜ K(x,
(10.50)
However, the coexistence of the primal and the adjoint eigenvalue problem depends ˜ ν ) = 0, see (10.45). on the validity of (10.46) and thus on the condition Φn (Kμ , K This condition is explored further after some useful orthogonality properties have been established.
10.2.3.2 Orthogonality of the Eigenfunctions ˜ μ) from (10.50) possess some very useful propThe eigenfunctions K(x, μ) and K(x, erties. They are derived here under the assumption that the eigenvalues are of mul˜ ν ) = 0 holds. tiplicity one and that the condition Φn (Kμ , K Biorthogonality To establish the biorthogonality of the two sets of eigenfunctions, insert (10.50) for two different indices μ and ν into the condition for the adjoint operator (10.46) ˜ ν) , ˜ ν) = K(x, μ), s∗ν CH K(x, . sμ CK(x, μ), K(x, (10.51) and extract the eigenvalues sμ and s∗ν from the scalar products
364
10 Sturm-Liouville Transformation
s
. μ
˜ ν) = sν CK(x, μ), K(x, ˜ ν) . CK(x, μ), K(x,
(10.52)
The scalar products on both sides are now identical and can be combined into ˜ ν) = 0 . .(sμ − sν ) CK(x, μ), K(x, (10.53) Since single eigenvalues have been assumed, any two eigenvalues with different index cannot be the same, i.e. sμ sν for μ ν. Thus there are two possibilities for (10.53) to be satisfied: Either the parenthesis is zero for μ = ν or the scalar product must be zero for μ ν. This argument establishes the biorthogonality of the two sets of eigenfunctions from (10.50) ˜ ν) = δμν Nμ , ˜ μ) 0 . . CK(x, μ), K(x, Nμ = CK(x, μ), K(x, (10.54) ˜ μ) are sets of biorthogonal eigenfunctions in the sense Therefore K(x, μ) and K(x, of Sect. 4.4.1 w.r.t. the scalar product (10.31) defined by spatial integration. The property Nμ 0 is not shown formally here but it is motivated by the calculation methods for Nμ in Sect. 11.2.2. ˜ μ) enjoy also a second Sum Orthogonality The eigenfunctions K(x, μ) and K(x, type of orthogonality w.r.t. to summation over their indices. This orthogonality is revealed by using the biorthogonality (10.54) to express K(x, μ) as K(x, μ) =
∞
.
K(x, ν)δμν =
ν=0
∞ 1 ˜ ν) K(x, ν) CK(ξ, μ), K(ξ, Nν ν=0
∞ 1 ˜ H (ξ, ν) CK(ξ, μ) dξ . K K(x, ν) = Ψ N ν ν=0
(10.55)
The space variable inside the scalar product does not appear outside of the integral and has been renamed to ξ to distinguish it from the argument of K(x, μ). Rearranging integration and summation gives the identity K(x, μ) =
.
∞ 1 ˜ H (ξ, ν) C K(ξ, μ) dξ , K(x, ν)K V Nν ν=0
(10.56)
which requires that the term in parenthesis is a diagonal matrix of delta impulses ∞ 1 ˜ H (ξ, ν) C = δ(ξ − x) I . K(x, ν)K . N ν ν=0
(10.57)
This relation establishes a type of sum orthogonality as already encountered in Sect. 4.2.6.2. Note that orthogonality (10.54) is defined by spatial integration while the sum orthogonality (10.57) results from summation.
10.2 Spatial Differentiation Operators
365
10.2.3.3 Boundary Conditions for the Eigenfunctions The above results on orthogonality have been derived under the tacit assumption that the condition for adjointness (10.46) holds which requires that the boundary ˜ from (10.41) is zero. This section establishes sufficient conditions term Φn (K, K) und expresses them in terms of the boundary operator Fb (x) of the initial-boundaryvalue problem (10.25)–(10.27). ˜ is zero under the sufficient condition that the inThe boundary term Φn (K, K) H ˜ tegrand K (x, ν)Bn (x)K(x, μ) in (10.41) vanishes on the boundary. However, there is so far no connection to the boundary condition (10.27) or (10.30). Therefore, the boundary operator Fb (x) is inserted into (10.41) using the obvious identity FHb (x) + I − FHb (x) = I ˜ H (x, ν)Bn (x)K(x, μ) = K ˜ H (x, ν)Bn (x) FH (x)K(x, μ) + K ˜ H (x, ν)Bn (x) I − FH (x) K(x, μ) . K b b
.
(10.58)
Now two new boundary operators are introduced which incorporate the matrix Bn (x) from (10.40) H .FK (x) = Fb (x)Bn (x), F˜ K (x) = Bn (x) I − FHb (x) . (10.59) The index K is chosen w.r.t. Eqs. (10.61) and (10.62). Rearranging (10.58) to accommodate FK (x) and F˜ K (x) gives an expression ˜ H (x, ν)Bn (x)K(x, μ) = K
.
H ˜ H (x, ν) FHK (x)K(x, μ) + F˜ HK (x)K(x, ˜ ν) K(x, μ) . K
(10.60)
which is zero if the eigenfunctions satisfy the boundary conditions FHK (x) K(x, μ) = 0, H ˜ ν) = 0, F˜ K (x) K(x,
.
∀μ,
x ∈ ∂Ψ,
(10.61)
∀ν,
x ∈ ∂Ψ .
(10.62)
Thus the orthogonality relations (10.54) and (10.56) hold if the eigenfunctions ˜ μ) are determined from the eigenvalue problems (10.50) subject K(x, μ) and K(x, to the boundary conditions (10.61) and (10.62). The boundary operator F˜ K (x) is called the adjoint boundary operator w.r.t. to FK (x).
10.2.4 Recapitulation of the Eigenvalue Problems Starting from the initial-boundary-value problem in Sect. 10.2.1, some prerequisites for a possible spatial transformation have been compiled, the adjoint operator in
366
10 Sturm-Liouville Transformation
Sect. 10.2.2 and the associated eigenvalue problems in Sect. 10.2.3. The main results are recapitulated here as they appear in hindsight. Initial-Boundary-Value Problem The first step from the time-space representation into the frequency domain is the Laplace transformation w.r.t. to time. The result is the boundary-value problem from Sect. 10.2.1.2 .
(sC − L) Y(x, s) = Fe (x, s) + Cyi (x),
x ∈ Ψ, x ∈ ∂Ψ.
Fb (x) Y(x, s) = Φb (x, s), H
(10.63) (10.64)
Differential Operators The differential operator L in (10.63) and its adjoint operator L˜ are defined in the spatial domain Ψ and adopt the form L = A + B∇,
L˜ = AH − BH ∇,
.
B∇ =
n ν=1
B xν
∂ , ∂xν
x∈Ψ .
(10.65)
The nonzero entries into the matrices B xν , ν = 1, 2, 3 indicate the location of a partial ˜ derivative in the matrix-valued spatial differentiation operators L and L. Boundary Operators The sum of the matrices B xν weighted with the components nν (x) of the normal vector n(x) at the surface x ∈ ∂Ψ defines the matrix Bn (x) Bn (x) =
n
.
B xν nν (x),
x ∈ ∂Ψ .
(10.66)
ν=1
The matrix Bn (x) and the boundary operator Fb (x) define two further boundary operators FK (x) and F˜ K (x) H .FK (x) = Fb (x)Bn (x), F˜ K (x) = Bn (x) I − FHb (x) , x ∈ ∂Ψ . (10.67) Eigenvalue Problems The spatial differentiation operator L and the boundary operator FK (x) define a generalized eigenvalue problem for L with the eigenvalues sμ and the eigenfunctions K(x, μ) .LK(x, μ) = sμ CK(x, μ), FHK (x) K(x, μ) = 0,
x ∈ Ψ, x ∈ ∂Ψ,
(10.68) (10.69)
In the same way, the adjoint operator L˜ and the adjoint boundary operator F˜ K (x) define another generalized eigenvalue problem for L˜ with the eigenvalues s∗μ and ˜ μ) the eigenfunctions K(x, ˜ K(x, ˜ μ) = s∗μ CH K(x, ˜ μ), L
. H
˜ μ) = 0, F˜ K (x) K(x,
x ∈ Ψ,
(10.70)
x ∈ ∂Ψ .
(10.71)
10.3 Spatial Transformation
367
Eqs. (10.68) and (10.69) are called the primal eigenvalue problem while Eqs. (10.70) and (10.71) are the adjoint eigenvalue problem. Both eigenvalue problems are boundary-value problems with homogeneous boundary conditions. Properties of Eigenvalues und Eigenfunctions The eigenvalues and the eigenfunctions are linked by several properties. For every eigenvalue sμ of the primal eigenvalue problem, its complex conjugate s∗μ is an eigenvalue of the adjoint ˜ ν) constitute a system of eigenvalue problem. The eigenfunctions K(x, μ) and K(x, biorthogonal functions with the normalization factor Nμ ˜ ν) = δμν Nμ , ˜ μ) 0 . . CK(x, μ), K(x, Nμ = CK(x, μ), K(x, (10.72) Further, the sum orthogonality applies .
∞ 1 ˜ H (ξ, ν) C = δ(ξ − x) I . K(x, ν) K N ν ν=0
(10.73)
These properties prove to be very useful for the definition of a suitable spatial transformation.
10.3 Spatial Transformation The introduction of the primal and adjoint differential operators and their eigenvalue problems in Sect. 10.2 is now followed up by the formulation of a customized spatial transformation for the underlying initial-boundary-value problem. The procedure follows the same basic steps as for the application of the Laplace transformation to initial-value problems with constant coefficients: formulation of the forward and inverse transformation, differentiation theorem, application to the initial-boundaryvalue problem, and finally transfer function description of the problem. The general form of signal transformations in terms on an analysis and a synthesis equation has been introduced in Sect. 4.4.3 by Eq. (4.108). This general principle has been specialized to well-known Fourier-type transformations in Sect. 4.5. Further Sect. 3.4.2.6 has interpreted the state transformation in a state space representation as pair of forward and inverse transformations in Eqs. (3.139) and (3.140). The interpretation as analysis and synthesis equation is given in Example 4.12 in Sect. 4.5.1. Finally, Sect. 9.5 has envisioned that the solution of the vector-valued partial differential equation (9.103) may also be expressed in terms of a pair of forward and inverse transformations. The corresponding analysis and synthesis equations are established now with the help of the Sturm-Liouville transformation introduced in Sect. 10.3.2. As a prerequisite, the suitability of eigenfunctions as basisfunctions has to be discussed in Sect. 10.3.1.
368
10 Sturm-Liouville Transformation
10.3.1 Eigenfunctions and Basisfunctions Now that the eigenfunctions of the primal and the adjoint spatial differentiation operator have been determined (see Sect. 10.2.4), it is tempting to use these eigenfunctions to define an analysis and a synthesis equation as introduced in Sect. 4.4.3. Furthermore, Sect. 3.4.2.6 has shown that a similarity transformation with the eigenvectors of a matrix can be described as a signal transformation. However, this result for a finite dimensional system has to be observed with care. While a full set of eigenvectors of an n×n matrix with n < ∞ constitutes a basis in a finite-dimensional vector space, it is not guaranteed, that the eigenfunctions K(x, μ) of an operator like L from (10.28) also constitute a basis for an infinite dimensional function space. To explore the conditions under which eigenfunctions may also act as basis functions, it is worthwhile to observe the concept of Riesz bases [9, 10, 13, 23]. Consider the Hilbert space of functions y(x) that can be represented by a series expansion with the functions K(x, μ), μ ∈ N0 and the arbitrary scalar values y¯ (μ) y(x) =
∞
.
y¯ (μ)K(x, μ) .
(10.74)
μ=0
The scalar product in this Hilbert space is denoted by y1 , y2 . The set of functions K(x, μ) is a Riesz basis if there exist positive constants C1 and C2 with 0 < C1 ≤ y(μ)|2 < ∞ holds C2 < ∞ such that for all sequences {¯y(μ)} with ∞ μ=0 |¯ C1
∞
.
|¯y(μ)|2 ≤ y, y ≤ C2
μ=0
∞
|¯y(μ)|2 .
(10.75)
μ=0
Now, consider an operator L from (10.28) defined on the above Hilbert space with eigenvalues sμ and eigenfunctions K(x, μ) as well as the adjoint operator L˜ with ˜ μ). Assume that the operator L eigenvalues s∗μ and eigenfunctions K(x, • has simple eigenvalues sμ , μ ∈ N0 and • the eigenvectors K(x, μ) form a Riesz basis. Then the functions y(x) are uniquely represented by (see e.g. [13, Definition 2.3.1 and Lemma 2.3.2]) y(x) =
∞
.
y¯ (μ)K(x, μ),
˜ μ) . y¯ (μ) = y(x), K(x,
(10.76)
μ=0
The above results are shortly recapitulated as follows: If the primal operator L acting on the functions y(x) has simple eigenvalues and if its eigenfunctions do neither tend to zero nor to infinity then the functions y(x) can be represented as ˜ μ) . .y(x) = y ¯ (μ), KH (x, μ) , y¯ (μ) = y(x), K(x, (10.77)
10.3 Spatial Transformation
369
These equations resemble the analysis and the synthesis equation from (4.108) with the scalar product ·, ·
given by (4.107), see Sect. 4.4.3. Indeed, there are conditions, under which the eigenvectors of the primal and of the adjoint operator serve as basis vectors. Example 10.2 investigates these conditions for the example from Sect. 10.1. Example 10.2 (Finite Fourier Sine Transformation). The introductory example in Sect. 10.1 uses in Eqs. (10.6) and (10.7) a finite Fourier sine expansion of the form (10.77) y¯ (x) = y(x), K˜ H (μ, x) = y(x)K˜ H (μ, x) dx,
(10.78)
y(x) = y¯ (μ), K(μ, x) =
(10.79)
.
0 ∞
y¯ (μ)K(μ, x),
μ=1
˜ x) with the basis functions K(μ, x) and the dual basis functions K(μ, x 2 ˜ x) = sin μπ x . K(μ, sin μπ , . K(μ, x) =
(10.80)
Note that both bases are orthogonal and the basis functions could easily be scaled to be equal. Similar cases have already been discussed in Sect. 4.4.1, see Example 4.11. Further, since y(x) and the basis functions are real scalars no transposition or complex conjugation is required. The scalar product ·, · induces the norm (see Eq. (4.41) in Sect. 4.2.5) ||y(x)||2 = y(x), y(x) ,
.
(10.81)
from which the center term in (10.75) is calculated using first (10.79) and then (10.78) M M M
2
y¯ (μ)K(μ, x)
= y¯ (μ)K(μ, x), y¯ (ν)K(ν, x)
.
μ=1
μ=1
=
M M μ=1 ν=1
ν=1
y¯ (μ)¯y(ν)
0
K(μ, x)K(ν, x) dx .
(10.82)
Evaluating the integral expression confirms the orthogonality of the basis functions .
0
K(μ, x) K(ν, x) dx =
2 x x 2 2 sin μπ sin νπ dx = δμν , 0
(10.83)
such that the norm becomes M M
2 2 |¯y(μ)|2 . y(x), y(x) =
y¯ (μ)K(μ, x)
= μ=1 μ=1
.
(10.84)
370
10 Sturm-Liouville Transformation
Thus the condition (10.75) is satisfied with tight bounds C1 = C2 = 2 .
What is the practical relevance of the conditions for eigenfunctions to be suitable as basis functions? The condition (10.75) states that the eigenfunctions of the operator L should be roughly in the same order of magnitude. Since eigenfunctions are only determined up to a constant factor, such a condition can be fulfilled by proper scaling. From a practical viewpoint, it is only reasonable to do so, since keeping the terms in a sum in about the same order of magnitude prevents over- or underflow in the numerical evaluation. ˜ What about the condition that the eigenvalues of the operator L (and thus of L) should be simple, i.e. have a multiplicity of one? The spectrum of an infinitedimensional operator is more complex than the spectrum of a finite-size matrix. Only for self-adjoint operators exists some parallelity to matrix theory (see [13, A.4.2], [22]). But even then there is no general way to exclude multiple eigenvalues, because the eigenvalues depend not only on the form of the operator L but also on the conditions at the boundary. This fact is reflected by the eigenvalue problems in Sect. 10.2.4 which contain also boundary conditions. The approach taken here is a practical one. The evaluation of the eigenvalue problems for all further initialboundary-value problems considered here reveals the nature and multiplicity of the eigenvalues. Indeed, also problems with multiple eigenvalues can be treated along these lines, as has been shown in [14–17]. Finally, it is observed that the condition (10.75) resembles the definition of frames in wavelet theory. The relations between frames and Riesz bases are discussed in [8, 10] and [23, Sec. 5.1].
10.3.2 Definition of the Sturm-Liouville Transformation The results from Sect. 10.3.1 confirm that—under suitable conditions—the eigenfunctions of the primal operator L and its adjoint operator L˜ from Eqs. (10.68) – (10.71) can be used as basis functions for a representation like (10.77). However, the definition of a signal transformation for the initial-boundary-value problem in the time domain (10.25)—(10.27) or in the frequency domain (10.29) and (10.30) has to consider also the capacitance matrix C. Furthermore, the time variable t resp. the temporal frequency variable s have to be included as a parameter, although they are not affected by the transformation. Definition The spatial transformation T is formulated in terms of a scalar product, resp. in terms of an integral transformation (see (10.31)), where the adjoint eigenfunctions from (10.70) act as transformation kernels and the capacitance matrix C as weighting matrix. The inverse transformation T −1 is defined similar to the expansion (10.76) with the normalization factor Nμ
10.3 Spatial Transformation
371
˜ μ) = T {y(x, t)} = y¯ (μ, t) = y(x, t), CH K(x,
.
T −1 {¯y(μ, t)} = y(x, t) = y¯ (μ, t), KH (x, μ) =
Ψ ∞ μ=0
˜ H (x, μ) C y(x, t) dx , (10.85) K 1 y¯ (μ, t) K(x, μ) . Nμ
(10.86)
Designation. Due to its close relation to Sturm-Liouville theory, the transformation T is called Sturm-Liouville transformation or SL-transformation. This designation dates back to [11, 18], where it has been introduced for scalar variables and for self-adjoint operators as a counterpart to the Laplace transformation for initial value problems. Extensions to non-selfadjoint problems are found in [6]. The initial formulation for scalar variables has been extended to vector-valued variables e.g. in [4, 21, 26]. The values y¯ (μ, t) are the scalar representations of the vector of variables y(x, t) in the spatial transform domain, i.e., the vector of variables y(x, t) after a SturmLiouville transformation. Functions after a Sturm-Liouville transformation are denoted by an over-bar. Modal Expansion As the idea of the SLT is based on a modal expansion into eigenfunctions, the values y¯ (μ, t) can be regarded as expansion coefficients (counted by μ). Thus, the expansion coefficients y¯ (μ, t) are a measure for “how much” of the ˜ μ) is present in y(x, t) over time t. There is a dedicated eigenvalue eigenfunction K(x, ˜ μ), K(x, μ) for each expansion coefficient y¯ (μ, t). sμ and eigenfunctions K(x, The number of eigenvalues and eigenfunctions is infinite but countable. The actual enumeration scheme (here μ = 0, . . . , ∞) is arbitrary and can be adapted to the problem at hand. For the introductory example in Sect. 10.1, the set μ = 1, . . . , ∞ proved to be suitable. For systems with complex eigenvalues a summation with μ = −∞, . . . , ∞ may be of advantage. These general hints have to suffice since there is no general enumeration scheme for non-selfadjoint systems [27, Chap. 3.9, Comment (2)]. Transform Pair The forward transformation T represents an analysis equation and the inverse transformation T −1 represents a synthesis equation in the sense of Sect. 4.4.3. Their nature as transform pair is now shown in both directions by observing either biorthogonality (10.72) or the sum orthogonality (10.73). Biorthogonality To show that T {T −1 {¯y(μ, t)}} = y¯ (μ, t), insert the inverse transformation (10.86) into the forward transformation (10.85) ⎛∞ ⎞ ⎜⎜⎜ 1 ⎟⎟ H ˜ ⎜ K (x, μ)C ⎜⎝ .T {y(x, t)} = y¯ (ν, t)K(x, ν)⎟⎟⎟⎠ dx Ψ Nν ν=0
=
∞ ∞ 1 1 ˜ H (x, μ)CK(x, ν) dx = K y¯ (ν, t) y¯ (ν, t) δμν Nμ = y¯ (μ, t) . Ψ Nν Nν ν=0 ν=0
(10.87)
This derivation shows the validity of the inverse transformation in (10.86) w.r.t. the forward transformation in (10.85) and also the necessity of the scaling factor Nμ . It can be written in a more abstract way as
372
10 Sturm-Liouville Transformation
˜ μ) T {y(x, t)} = y¯ (ν, t), KH (x, ν)
, CH K(x, (10.88) H ˜ = y¯ (ν, t), K(x, ν), C K(x, μ)
= y¯ (μ, t), δμν Nμ = y¯ (μ, t) .
.
Sum Orthogonality The relation T −1 {T {y(x, t)}} = y(x, t) follows in the same way for an interchanged sequence of inverse and forward transformation T −1 {¯y(μ, t)} =
.
=
∞ 1 ˜ H (ξ, ν) C y(ξ, t) dξ K K(x, μ) Ψ N μ μ=0
(10.89)
∞ 1 ˜ H (ξ, ν) C y(ξ, t) dξ = K(x, μ) K δ(ξ − x) y(ξ, t) dξ = y(x, t), Ψ Ψ Nμ μ=0
or more concisely ˜ μ) , KH (x, μ)
T −1 {¯y(μ, t)} = y(ξ, t), CH K(ξ, (10.90) H ˜ (ξ, μ)C, KH (x, μ)
= y(ξ, t), δ(ξ−x)I = y(x, t). = y(ξ, t), K
.
Application For the scalar product in (10.85) holds (see (4.10) and (4.12) in Sect. 4.2.2) ˜ μ) = Cy(x, t), K(x, ˜ μ) . .y ¯ (μ, t) = y(x, t), CH K(x, (10.91) Therefore the SL-transformation (10.85) is suitable for the first term on the left hand side of (10.25) ∂ ∂ − L y(x, t) = C y(x, t) − Ly(x, t), . C (10.92) ∂t ∂t since the partial derivative w.r.t. time is not affected by the spatial SL-transformation. However the second term contains the spatial differentiation operator L and requires a differentiation theorem like in (10.12).
10.3.3 Differentiation Theorem The derivation of a differentiation theorem for the SL-transformation starts from the introduction of the adjoint operator in (10.45) with K(x) replaced by Y(x, s) ˜ μ), ˜ μ) = Y(x, s), L˜ K(x, ˜ μ) + Φn Y(x, s), K(x, LY(x, s), K(x,
.
(10.93)
The representation in the temporal frequency domain has been chosen here for convenience. There are two important differences to the relation (10.45) for the primal ˜ and the adjoint operator K and K:
10.3 Spatial Transformation
373
• The concept of biorthogonality applies to two sets of functions like K(x, μ) and ˜ ν) but not to a single function like Y(x, s). Nevertheless, Y(x, s) can be exK(x, panded into a set of basisfunctions by the inverse SL transform (10.86). • The solution Y(x, s) does not satisfy homogeneous boundary conditions in general. Instead, the boundary condition (10.30) holds for Y(x, s). For the first term on the right hand side in (10.93) follows with the eigenvalue ˜ μ) and the definition (10.85) of the SL-transformation problem (10.70) for K(x, ˜ K(x, ˜ μ) = Y(x, s), s∗μ CH K(x, ˜ μ) . Y(x, s), L ˜ μ) = sμ Y(μ, ¯ s) . (10.94) = sμ Y(x, s), CH K(x, The second term on the right hand side in (10.93) is given by (10.41) ˜ μ) = ˜ H (x)Bn (x)Y(x, s) dA, K .Φn Y(x, s), K(x, ∂V
(10.95)
where the integrand can be expressed by (10.60) with K replaced by Y and with FK (x) from (10.59) ˜ H (x, ν)Bn (x)Y(x, s) K
.
H ˜ H (x, ν) Fb (x)BHn (x)H Y(x, s) + F˜ HK (x)K(x, ˜ ν) Y(x, s) . =K
(10.96)
˜ ν) satisfies the homogeneous boundary condiThe second term is zero since K(x, tions (10.62) and the first term contains the boundary condition (10.30) for Y(x, s) ˜ H (x, ν)Bn (x)Y(x, s) = K ˜ H (x, ν)Bn (x) FH (x)Y(x, s) = K ˜ H (x, ν)Bn (x)Φb (x, s) . .K b (10.97) Now the integrand in (10.95) contains only known values: The eigenfunctions ˜ μ) of the adjoint operator L, ˜ the matrix Bn (x) from (10.40) and the boundK(x, ary value Φb (x, s) from the boundary condition (10.30). Thus Φn is calculated by ¯ b (μ, s) integration on the boundary and the result is abbreviated as Φ ˜ μ) = ˜ H (x, μ)Bn (x)Φb (x, s) dx = Φ ¯ b (μ, s), x ∈ ∂Ψ. (10.98) K .Φn Y(x, s), K(x, ∂Ψ
Inserting (10.94) and (10.98) into (10.93) gives finally the differentiation theorem of the Sturm-Liouville transformation ˜ μ) = sμ Y(μ, ¯ s) + Φ ¯ b (μ, s) . . LY(x, s), K(x, (10.99) It has the same features as the well-known differentiation theorem of the Laplace transformation: The differentiation operator L causes a multiplication by the eigenvalue sμ and the values Φb (x, s) on the boundary x ∈ ∂Ψ turn from an additional ¯ b (μ, s). boundary condition (10.30) into an additive term Φ
374
10 Sturm-Liouville Transformation
The differentiation theorem for the finite Fourier sine transformation is closely related to the differentiation theorem of the SL-transformation, except that the former has been derived for the second order space derivative. The differentiation theorems (10.12) and (10.99) correspond to each other for sμ = i μ π . The boundary term in (10.12) is zero due to the homogeneous boundary conditions (10.2).
10.3.4 Application to the Initial-Boundary-Value Problem The SL-transformation has been defined to include the weighting matrix C in (10.85) and a differentiation theorem for the spatial operator L has been derived in (10.99). Now the SL-transformation is ready for application to the boundary value problem in the frequency domain formulation in Sect. 10.2.1.2 and repeated here as .
sCY(x, s) − LY(x, s) = Fe (x, s) + Cyi (x), FHb (x) Y(x, s) = Φb (x, s),
x ∈ Ψ, x ∈ ∂Ψ.
(10.100) (10.101)
˜ μ) to (10.100), where • is a placeholder for Applying the scalar product •, K(x, the individual terms gives ˜ μ) − LY(x, s), K(x, ˜ μ) = . sCY(x, s), K(x, ˜ μ) + Cyi (x), K(x, ˜ μ) . Fe (x, s), K(x, (10.102) Exploiting the definition of the SL-transformation in (10.85) and the differentiation theorem in (10.99) results in .
¯ s) − sμ Y(μ, ¯ b (μ, s). ¯ s) = F¯ e (μ, s) + y¯ i (μ) + Φ sY(μ,
(10.103)
The values F¯ e (μ, s) and y¯ i (μ) denote the transform domain representations of the vector of excitation functions and the initial values, respectively ˜ μ) , ¯ e (μ, s) = Fe (x, s), K(x, .F (10.104) ˜ μ) = T {yi (x)}. y¯ i (μ) = Cyi (x), K(x, (10.105) At this point it is worthwhile to review the objectives for the spatial transformation listed in Sect. 10.2.1.3. Indeed, the SL-transformation exhibits properties w.r.t. to the space coordinates which parallel the properties of the Laplace transformation w.r.t. to time: The SL-transformation T removes the spatial differentiation operator L and replaces it by a multiplication with an eigenvalue sμ . Fur¯ b (μ, s) resulting from ther, the boundary value Φb is included as an additive term Φ integration on the boundary x ∈ ∂Ψ . The representation (10.103) shows that the combination of Laplace transformation and Sturm-Liouville transformation turn the initial-boundary-value problem from Sect. 10.2.1.1 into an algebraic equation. Thus
10.3 Spatial Transformation
375
parallels to one-dimensional signals and systems can be expected, where transfer functions and impulse responses play an important role.
10.3.5 Transfer Function Description The solution of the algebraic equation (10.103) leads to a description by multidimensional transfer functions which resemble their one-dimensional counterparts.
10.3.5.1 Solution of the Algebraic Equation ¯ s) of the The algebraic equation (10.103) is easily solved for the transform Y(μ, solution in the temporal and spatial frequency domain ¯ b (μ, s) , ¯ s) = H(μ, ¯ s) F¯ e (μ, s) + y¯ i (μ) + Φ .Y(μ, (10.106) where the factor (s − sμ )−1 acts as a transfer function with a complex temporal frequency s. The single pole sμ is an eigenvalue of the operator L .
¯ s) = H(μ,
1 , s − sμ
μ = 0, . . . , ∞ .
(10.107)
This first order transfer function appears to be overly simple, however the required complexity to solve the general initial-boundary-value from Sect. 10.2.1.1 is intro¯ s), one for each duced by the existence of infinitely many transfer functions H(μ, eigenvalue sμ for μ = 0, . . . , ∞. It may appear strange that the temporal frequency variable s and the eigenvalue sμ of the spatial operator share the same physical unit. Note, however, that also the terms sC and L in (10.29) are compatible. The inclusion of the matrix C into the generalized eigenvalue problem (10.68) preserves this compatibility of phy¯ s). sical units also in the transfer function H(μ, ¯ The transfer function H(μ, s) is addressed here as a multidimensional transfer function since it depends on two variables: the temporal frequency variable s and the index μ of the eigenvalues of the spatial operator. In the same way as s translates into the time variable t by inverse Laplace transformation, so does μ translate into the space variable x by inverse SL-transformation (10.86). ¯ s) is its inverse Laplace transform Closely related to the transfer function H(μ, ⎧ ⎪ ⎪e sμ t 0 ≤ t, −1 ¯ t) = L {H(μ, ¯ s)} = ⎨ .h(μ, (10.108) ⎪ ⎪ ⎩0 t 0,
(10.157)
0 < x < , x ∈ {0, },
t = 0, t > 0,
(10.158) (10.159)
0 0 while N0 adopts a form similar to (11.142) ˜ H ˜ H 1 + Zμ−2 0 1 , N0 = A Nμ = 2 1 + A2 C A1 + A2 = . 0 1 + Zμ2 The normalization factor Nμ follows from the matrix Nμ through (11.129) with the (11.4) boundary conditions from Problem ⎧ ˆ ∗ 1 ⎪ ˜ ⎪ K K ⎨ μ μ 2 (1 + Zμ2 ) μ > 0, Nμ = ⎪ ⎪ ⎩K0 Kˆ˜ ∗ μ = 0. 0 11.7 The characteristic polynomial q(γ) of the matrix Q, its derivative, the roots of q(γ), the squared roots and the squared matrix Q2 are ⎡ ⎤ γ1 = −1, γ12 = 1, ⎢⎢⎢0 −1 1⎥⎥⎥ q(γ) = γ3 + γ2 + γ + 1, ⎢ ⎥ γ2 = i, γ22 = −1, Q2 = ⎢⎢⎢⎢0 −1 0⎥⎥⎥⎥ . 2 ⎣ ⎦ q (γ) = 3γ + 2γ + 1, 1 −1 0 γ3 = −i, γ32 = −1, −1 T ˆ The matrix product ⎡ HV ⎤D⎡ is given⎤ ⎡by ⎡ ⎤ ⎤ ⎢⎢2 1 − i 1 + i ⎥⎥⎥ 0 ⎥⎥⎥ ⎢⎢⎢⎢1 1 1⎥⎥⎥⎥ ⎢⎢⎢⎢ 1 1 1 ⎥⎥⎥⎥ ⎢⎢⎢⎢2 0 ⎢ 1 1 −1 ⎥ ˆ = ⎢⎢⎢1 1 0⎥⎥⎥ ⎢⎢⎢γ1 γ2 γ3 ⎥⎥⎥ ⎢⎢⎢0 −1 − i 0 ⎥⎥⎥⎥ = ⎢⎢⎢⎢0 −2i 2i ⎥⎥⎥⎥ . W = HVT D ⎣⎢ ⎦⎥ ⎣⎢ 2 2 2 ⎦⎥ ⎣⎢ ⎦ ⎦⎥ 4 4 ⎣⎢ 1 0 0 γ1 γ2 γ3 0 2 −1 − i −1 + i −1 + i Since γ3 = γ2∗ , also A3 = A∗2 and the matrix exponential is given by eQx = A1 eγ1 x + A2 eγ2 x + A3 eγ3 x = A1 eγ1 x + 2 {A2 eγ2 x }, with A1 = w11⎡ I + w21 Q⎤ + w31 Q2 and A2 = w12 I + w22 Q + w32 Q2 ⎢1 −1 1⎥⎥⎥ 1 ⎢⎢⎢ ⎥ A1 eγ1 x = ⎢⎢⎢⎢0 0 0⎥⎥⎥⎥ e−x , ⎦ 2⎣ 1 −1 1 ⎤ ⎡ ⎡ ⎤ π π i(x+ π4 ) −ei(x− 4 ) ⎥⎥⎥ e√ ei(x− 4 ) ⎢⎢⎢ 1 − i 1 + i −1 + i⎥⎥⎥ ⎢⎢⎢ √ √ 1 ⎢ 1⎢ π π ⎥ ⎥ 2 2i ⎥⎥⎥⎥ eix = √ ⎢⎢⎢⎢− 2 ei(x+ 2 ) 2eix 2 ei(x+ 2 ) ⎥⎥⎥⎥ . A2 eγ2 x = ⎢⎢⎢⎢ −2i ⎦ ⎣ ⎦ π π π 4⎣ 2 2 ) i(x+ ) i(x+ ) i(x− 4 −1 − i 1 − i 1 + i e 4 e −e 4 Some cosines in the real part of the complex exponential functions are converted to sines by cos(x + π2 ) = − sin x and cos(x + π4 ) = − sin(x − π4 ) such that ⎡ π ⎤ sin(x − π4 ) − cos(x ⎢cos(x − π4 ) −√ √ − 4 )⎥⎥⎥⎥ 1 ⎢⎢⎢⎢ √ γ2 x 2 {A2 e } = √ ⎢⎢⎢ 2 sin x 2 cos x − 2 sin x ⎥⎥⎥ . 2 ⎣ sin(x − π ) cos(x − π ) − sin(x − π ) ⎥⎦ 4 4 4 d Qx e = QeQx is now shown separately for A1 e−x and for 2 {A2 eix }: The identity dx
498
A Solutions to the Problems
⎡ ⎤ ⎢⎢⎢−1 1 −1⎥⎥⎥ d 1 ⎢ ⎥ A1 e−x = Q A1 e−x = ⎢⎢⎢⎢ 0 0 0 ⎥⎥⎥⎥ e−x , ⎦ 2⎣ dx −1 1 −1 ⎤ ⎡ − π4 ) sin(x − π4 ) ⎥⎥⎥ sin(x − π4 ) − cos(x ⎢⎢⎢−√ √ √ d 1 ⎥ ⎢ 2 {A2 eix } = Q 2 {A2 eix } = √ ⎢⎢⎢⎢ 2 cos x 2 sin x − 2 cos x ⎥⎥⎥⎥ . ⎣ dx 2 cos(x − π ) − sin(x − π ) − cos(x − π )⎦ 4 4 4 The following trigonometric identities have been used in the calculation of the product Q 2 {A2 eix } √ √ cos(x − π4 ) + sin(x − π4 ) = 2 sin x , cos(x − π4 ) − sin(x − π4 ) = 2 cos x . 11.8 Equation (11.97) gives the same result for the matrix exponential as in Solution 3.8. 11.9 Start with the left hand side of Eq. (11.1) with kμ = μ π and set Kˆ μ = 1
∂ r −i Zw (μ) sin(kμ x) − r − ikμ Zw (μ) cos(kμ x) LK(x, μ) = − x = . kμ + g Zw (μ) sin(kμ x) g ∂x cos(kμ x) From (11.10) follows γ(μ) = g + sμ c, γ(μ) Zw (μ) = r + sμ l, Zw (μ) such that with γ(μ) = ikμ − r − ikμ Zw (μ) = − r − γ(μ) Zw (μ) = sμ l, γ(μ) kμ + g Zw (μ) = iZw (μ) g − = −isμ c Zw (μ). Zw (μ) Expanding into
gives
a matrix and a vector 0 l −i Zw (μ) sin(kμ x) sμ l cos(kμ x) = sμ = sμ CK(x, μ). LK(x, μ) = −isμ c Zw (μ) sin(kμ x) cos(kμ x) c 0 12.1 Partial fraction expansion of H c (s) gives the weighted difference of two first order systems 1 1 1 1 c . H (s) = = − s − s∞2 (s − s∞1 )(s − s∞2 ) s∞1 − s∞2 s − s∞1 The corresponding discrete-time transfer function by bilinear transformation is z + 1 z + 1 1 H d (z) = c∞1 , − c∞2 s∞1 − s∞2 z − z∞1 z − z∞2 1 + s∞1/2 , such that with c∞1/2 = T4 (z + z∞1/2 ) and z∞1/2 = 1 − s∞1/2 (z + 1)2 T z∞1 − z∞2 H d (z) = . 4 s∞1 − s∞2 (z − z∞1 )(z − z∞2 ) 12.2 Application of Eq. (12.56) to H c (s) gives (z + 1)2 T z∞1 − z∞2 1 , H d (z) = = 2 z−1 4 s∞1 − s∞2 (z − z∞1 )(z − z∞2 ) − s∞1 2 z−1 − s∞2 T z+1
with
T z+1
z∞1 − z∞2 T = . T s∞1 − s∞2 1 − s∞1 2 1 − s∞2 T2
12.3 Decomposition of H c (s) as in Problem 12.1 and of Eq. (12.77) give
A Solutions to the Problems d Hiit (z) =
499
z z T z z∞1 − z∞2 − =T s∞1 − s∞2 z − z∞1 z − z∞2 s∞1 − s∞2 (z − z∞1 ) (z − z∞2 )
with z∞1/2 = e s∞1/2 T . 12.4 The state equation sY(s) = A Y(s) + X(s) contains the two scalar equations Y1 (s) = s−1 s∞1 Y1 (s) + X1 (s) , Y2 (s) = s−1 Y1 (s) + s∞2 Y2 (s) + X2 (s) and corresponds to the signal flow diagram s∞1 s−1
X1 (s)
Y1 (s)
1 s−1
X2 (s)
Y2 (s)
s∞2
I − A) H (s) = (sI c
−1
s − s∞1 0 = −1 s − s∞2
−1
⎡ ⎢⎢ = ⎢⎢⎢⎣
⎤ 1 0 ⎥⎥⎥ s−s∞1 ⎥⎥⎦ . 1 1 (s−s∞1 )(s−s∞2 ) s−s∞1
12.5 The eigenvalues of the matrix A are the zeros of its characteristic polynomial I − A } = (s − s∞1 )(s − s∞2 ) = 0. det{sI From Eq. (12.92) and further from Eq. (11.97) follows 1 e s∞1 t − e s∞2 t A − s∞2 e s∞1 t − s∞1 e s∞2 t I hc (t) = eA t = s∞1 − s∞2
e s∞1 t 0 . = es∞1 t −es∞2 t e s∞2 t s∞1 −s∞2 The matrix of discrete-time impulse responses is ⎤ ⎡ k 0 ⎥⎥ ⎢⎢⎢ z∞1 ⎥⎥⎥ d c s T h [k] = h (kT ) = ⎢⎢⎢⎣ zk −zk ⎥ , z∞1/2 = e ∞1/2 , k ⎦ ∞1 ∞2 z∞2 s∞1 −s∞2 and finally the matrix of discrete-time transfer functions
z 0 d z−z∞1 . H (z) = T z∞1 −z∞2 z z s∞1 −s∞2 (z−z∞1 )(z−z∞2 )
z−z∞2
12.6 From Eq. (12.97) follows
−1 0 1 − z∞1 z−1 d A T −1 −1 Hiit (z) = T I − e z =T −1 ∞1 −z∞2 1 − z∞2 z−1 − sz∞1 −s∞2 z
z 0 . = T z∞1 −z∞2 z−z∞1 z z s∞1 −s∞2 (z−z∞1 )(z−z∞2 )
12.7 The scalar transfer function
Y2 (s) X1 (s)
is the element (2, 1) of the matrix Hc (s) from
Problem 12.4. Similarly, the scalar transfer function d
z−z∞2
Y2d (z) X1d (z)
is the element (2, 1) of the
matrix H (z) from Problem 12.5 or 12.6. These transfer functions Y2d (z) 1 z Y2 (s) z∞1 − z∞2 = , =T d X1 (s) (s − s∞1 )(s − s∞2 ) s∞1 − s∞2 (z − z∞1 ) (z − z∞2 ) X1 (z) correspond to Problem 12.1 and 12.3.
Index
A acoustic wave equation,209, 330, 337, 397 additivity,23, 26 adjoint boundary operator, 365 eigenvalue problem, 365, 384, 423, 432 operator, 359, 362, 365, 367, 370, 372, 378, 387, 423 self-, 391, 393 affine mapping, 179–181, 183 theorem, 184, 197 algebraic structure, 73 aliasing, 231, 236, 237, 240, 466 analogy dynamical, 327 electroacoustic, 328 impedance, 327 mobility, 327 analysis equation, 103, 107, 164, 201, 209, 356, 371 angle, 76 angular expansion, 202 associated Legendre function, 396 axial symmetry, 123 B back-projection, 192 base band spectrum, 232 basis, 73 dual, 98 function, 271, 368 functions global, 273 functions local, 273 functions modal, 273 functions nodal, 273 primal, 98
Riesz, 368 vector, 88, 90, 93 Bessel equation, 397 function, 203, 386, 397 spherical B. function, 397 bilinear transformation, 455 biorthogonal, 97, 102, 104, 363, 367, 371, 388, 393, 440, 443 block matrix, 284, 302, 337 boundary condition, 16, 19, 315, 319, 321, 331 Dirichlet, 322, 323 mixed, 323 Neumann, 323 Robin, 323 boundary value, 20, 315, 319, 321, 322, 331, 373, 379, 389, 408, 417, 433, 445, 447, 448, 475, 479 problem, 270, 357, 390, 433 C capacitance, 310, 314 matrix, 325, 326, 370 Cauchy sequence, 81 Cayley-Hamilton theorem, 409, 425 characteristic equation, 405, 409, 412 frequency, 460 impedance, 403, 406 polynomial, 403, 409, 412, 419, 420 chirp function, 236 closure relation, 86 collocation method, 273 companion matrix, 411 completeness, 81 computability, 263 condition
© Springer Nature Switzerland AG 2023 R. Rabenstein, M. Sch¨afer, Multidimensional Signals and Systems, https://doi.org/10.1007/978-3-031-26514-3
501
502 boundary, 19, 319, 321 initial, 19, 319, 321 conductance, 310 continuous-discrete conversion, 455 convolution, 23, 26, 36, 44, 45, 135, 150, 197, 379 by inspection, 136 continuous-time, 35, 36, 38 discrete-time, 26, 38 two-dimensional, 143, 156 coordinates Cartesian, 15, 161, 163, 183 polar, 151, 157, 161, 183, 193, 199, 397 spherical, 15, 158, 161, 206 cosine, law of, 80 curl, 195 current waves, 342 D DCT, see discrete cosine transformation defect, 271, 286, 293 del, 196 delta impulse, 27, 84, 85, 168, 273, 364, 381, 461 comb, 33, 34 integration, 29 scaling, 31, 32, 34 two-dimensional, 140, 160 weight, 29, 34 delta sequence, 25 density, see sampling density derivative directional, 323 normal, 323 partial, 194 determinant, 412 Jacobi, 158 Jacobian, 182 DFT, see discrete Fourier transformation difference equation, 19 differential equation, 20, 48–51 differentiation theorem, 116, 372, 389 dimension one-dimensional, 7 two-dimensional, 8 three-dimensional, 11 four-dimensional, 14 dimension, 73 Dirac impulse, see delta impulse directional derivative, 323 Dirichlet boundary condition, 322, 323 discrete cosine transformation, 113 discrete Fourier transformation, 107, 108 discrete Poisson equation, 289
Index discrete-time Fourier transformation, 110, 210 dispersion, 406 distance, 76, 79 distributed parameter system, 310 distribution, 27 normal, 127 support of, 140 divergence, 195 dot product, 76 dual basis, 98 lattice, 224 space, 97 dynamical analogy, 327 E effort variable, 328 eigen function, 451 value, 451 eigenfunction, 362, 368, 406, 432, 440 eigenvalue, 405, 406, 432 eigenvalue problem, 279, 365, 388, 423, 432 electroacoustic analogy, 328 energy, 79 enumeration, 451 equation analysis, 103, 107, 164, 201, 209, 356, 371 Bessel, 397 characteristic, 405, 409, 412 difference, 19 differential, 20, 48–51, 313, 348 general Legendre, 396 input, 52, 445 Legendre, 394 output, 52, 445 Poisson, 269, 283, 289 state, 52, 445 synthesis, 103, 107, 164, 201, 209, 356, 371 telegraph, 313, 324, 426, 428, 430 transmission line, 313 wave, 20 error, 286, 293 expansion angular, 202 Jacobi-Anger, 203 F FDM, see finite difference method FEM, see finite element method filter half-plane, 265 one-quadrant, 265 quarter-plane, 265
Index two-quadrant, 265 finite difference method, 271, 279 finite element method, 271, 279 finite Fourier sine transformation, 369, 374 finite impulse response, 254 finite impulse response system, 254, 255 FIR, see finite impulse response flow variable, 328 Fourier-Bessel transformation, 204 Fourier series, 109 Fourier sine transformation, 350 Fourier transformation, 39, 112, 163, 164 convolution, 43 differentiation, 44 linear time scaling, 43 multiplication, 44 similarity, 43 time shift, 43 free field impedance, 331 frequency characteristic, 460 fundamental, 350 instantaneous, 236 Nyquist, 236 sampling, 236, 466 frequency division multiplex, 66 frequency warping, 460 function associated Legendre, 396 basis, 271, 368 Bessel, 203, 386, 397 chirp, 236 eigen-, 362, 368, 432, 440 Gaussian, 127, 168, 257 generalized, 27, 84 generating, 394 Green’s, 376, 449 homogeneous, 26 plenoptic, 15 si-, 41 spherical Bessel, 397 step, 29 sweep, 236 test, 271 transfer, 44, 51, 58, 389 trial, 271 weighting, 271 function space, 66 fundamental frequency, 350 mode, 350 sequence, 81
503 G Gaussian elimination, 334 function, 127, 168, 257 Gauss-Seidel iteration, 288, 290 generalized function, 27, 84 general Legendre equation, 396 generating function, 394 global basis functions, 273 gradient, 194, 322, 326, 337 Gramian matrix, 90 Gram-Schmidt orthogonalization, 93 Green’s function, 376, 449 grid, 173, 218, 270, 273, 279 H half-plane filter, 265 Hankel matrix, 415 transformation, 204 Hermite polynomial, 386 Hessian normal form, 87, 145, 181, 223 hexagonal sampling, 247 Hilbert space, 82 homogeneity, 26 homogeneous function, 26 partial differential equation, 320 I impedance characteristic, 403, 406 field, 403 free field, 331 impedance analogy, 327 impulse delta, 27 line, 140, 144, 150, 174, 187, 189 point, 140, 143, 157, 160 unit, 27 impulse comb, 33 impulse grid, 155 impulse invariant transformation, 461, 468 impulse response, 23, 25, 34, 44, 59 finite, 254 inductance, 310, 314 inequality Schwarz, 75, 78 triangle, 77 inhomogeneous partial differential equation, 320 initial condition, 15, 16, 19, 315, 319, 321, 355 value, 315, 389, 476
504 initial-boundary-value problem, 21, 319, 324 initial-value problem, 19, 50, 52 inner product, 76 inner product space, 76, 78 input equation, 52, 445 instantaneous frequency, 236 iteration, 293 Gauss-Seidel, 288, 290 Jacobi, 288, 289 J Jacobian determinant, 158, 182 Jacobi-Anger expansion, 203 Jacobian matrix, 182 Jacobi iteration, 288, 289 K Kronecker product, 261 Kronecker symbol, 89 L Laguerre polynomial, 386 Laplace operator, 194–196, 258, 337 Laplace transformation, 116 lattice, 222 dual, 224 primal, 224 law of cosines, 80 Legendre associated L. function, 396 equation, 394 general L. equation, 396 polynomial, 96, 386, 394 linear, 17, 26, 45, 142 linearity, 26 linear space, 73 linear time-invariant system, 18, 26 line impulse, 140, 144, 174, 187, 189 local basis functions, 273 local Fourier analysis, 295 LTI system, see linear time-invariant system lumped parameter system, 310 M mapping, 179–181, 183 mask, 255 mass matrix, 325 matrix block, 284, 302, 337 capacitance, 325, 370 companion, 411 Gramian, 90 Hankel, 415
Index Jacobian, 182 mass, 325 repetition, 221, 240 sampling, 221, 240 sparse, 284 state space, 52, 445 Toeplitz, 284, 288, 302 Vandermonde, 412 matrix exponential, 53, 407, 468 mesh, 49, 279 method collocation, 273 multigrid, 297 relaxation, 288 metric, 80 metric space, 80 MIMO-system, see multiple-input multipleoutput system mixed boundary condition, 323 mixing console, 68 mobility analogy, 327 modal basis functions, 273 mode, 350 multigrid cycle, 298 multigrid method, 297 multiple-input multiple-output system, 299 N nabla, 196 Neumann boundary condition, 323 nodal basis functions, 273 node, 49 norm, 76 normal derivative, 323 normal distribution, 127 Nyquist frequency, 236 O one-quadrant filter, 265 operator adjoint, 359, 362, 365, 367, 370, 372, 378, 423 adjoint boundary, 365 Laplace, 194–196, 258, 337 nabla, 196 primal, 362, 387 resolvent, 58, 443 Sobel, 258 orthogonal, 86, 88, 363, 364, 393 orthogonality, 86 orthogonalization, 93, 94 orthonormal, 88 output equation, 52, 445
Index P partial derivative, 194 partial differential equation, 20, 313, 320, 348 periodic repetition, 37 periodization, 37 plenoptic function, 15 point impulse, 140, 143, 157, 160 Poisson equation continuous, 269 discrete, 283, 289 polynomial characteristic, 409, 412, 419, 420 Hermite, 386 Laguerre, 386 Legendre, 386, 394 port resistance, 341 power conjugate variables, 327 power waves, 340, 342 pre-warping, 460 primal basis, 98 eigenvalue problem, 365 lattice, 224 operator, 362, 387 problem boundary value, 270, 390 eigenvalue, 279, 365, 384, 388 initial-boundary-value, 21, 319, 324 initial-value, 19, 50, 52 self-adjoint, 362, 391 product dot, 76 inner, 76 Kronecker, 261 profile, 149 projection, 93 back-, 192 slice theorem, 189, 197 propagator, 50, 53, 382, 408, 451 Q quantization, 16 quarter-plane filter, 265 R radial symmetry, 123 Radon transformation, 191 ray, 189 relaxation method, 288 parameter, 291 successive over-, 288, 291 repetition matrix, 221, 240 residual, 271
505 resistance, 310, 341 resolvent operator, 58, 443 response impulse, 23 Riesz basis, 368 RMS, see root-mean-square Robin boundary condition, 323 Rodrigues formula, 394 root-mean-square, 78 rotation, 195, 197 S sampling, 16, 45, 240 density, 242 frequency, 236, 466 grid, 225 hexagonal, 247 lattice, 225 matrix, 221, 240 pattern, 225 scalar Product, 74 scaling, 45, 142 Schwarz inequality, 75, 78 screen, 189 self-adjoint, 362, 391–393 separability, 123, 130, 135, 176, 447, 479 separation of variables, 126 sequence Cauchy, 81 delta, 25 fundamental, 81 shear, 197 shift, 45, 197 shift theorem, 117 si-function, 41 signal analog, 16 complex-valued, 16 continuous-space, 16 continuous-time, 16 deterministic, 16 digital, 16 discrete-space, 16 discrete-time, 16 random, 16 real-valued, 16 separable, 123 signal space, 65, 66, 72, 443 signal transformation, 105 slice, 189 SL-transformation, SLT, see Sturm-Liouville transformation Sobel operator, 258 space
506 basis of, 73 dimension of, 73 dual, 97 function-, 66 Hilbert, 82 inner product, 76, 78 linear, 73 metric, 80 signal, 65 signal-, 66, 72 vector-, 66 sparse matrix, 284 spectral image, 232 spectrum, 232 sphere packing, 250 stability, 262 state equation, 52, 445 state space matrix, 52, 445 state space representation, 50, 52, 445 Sturm-Liouville problem, 357, 390 theory, 357 transformation, 370, 371, 389 successive over-relaxation, 288, 291 sum orthogonality, 86, 364 superposition principle, 26 support, 140 sweep function, 236 symmetry, 130, 143 axial, 123 radial, 123 synthesis equation, 103, 107, 164, 201, 209, 356, 371 system autonomous, 17 distributed parameter, 310 finite impulse response, 254, 255 input-output-, 17 linear, 17, 26 linear time-invariant, 18, 26 lumped parameter, 310 multiple-input multiple-output, 299 shift-invariant, 18 time-invariant, 18 T telegraph equation, 313, 324, 426, 428, 430 theorem affine, 184, 197 Cayley-Hamilton, 409, 425 differentiation, 116, 372, 389 projection slice, 189, 197 shift, 117 time division multiplex, 66
Index time-invariant, 18, 26 Toeplitz matrix, 284, 288, 302 transfer function, 44, 51, 58, 389 transformation z-, 117 2D z-, 211 transformation 2D Fourier, 164 bilinear, 455 discrete cosine, 113 discrete Fourier, 107, 108 discrete-time Fourier, 110, 210 finite Fourier sine, 369, 374 forward, 440 Fourier, 39, 112, 163 Fourier-Bessel, 204 Fourier sine, 350 Hankel, 204 impulse invariant, 461, 468 Laplace, 116 Radon, 191 signal, 105 Sturm-Liouville, 370, 371, 389 transmission line, 329, 331 transmission line equations, 313 triangle inequality, 77 tridiagonal, 284 Tustin’s method, see bilinear transformation two-quadrant filter, 265 U uncertainty principle, 236 unit impulse, 27 unit step function, 29 V Vandermonde matrix, 412 variable effort, 328 flow, 328 power conjugate, 327 wave, 342 vector potential, 338 space, 66 voltage waves, 342 W warping, see frequency warping wave current, 342 power, 340, 342 voltage, 342 wave equation, 20
Index wave variable, 342 weight, 29, 143 weighted residual, 271 weighting function, 271
507 Z z-transformation, 117 2D z-transformation, 211 zone plate, 238